Abstract

Video games have undoubtedly left a mark on pop culture around the world. Since the 1970s, games like “Pac-Man”, “Super Mario”, “The Legend of Zelda”, “Final Fantasy”, and “Call of Duty” have attracted billions of fans and paved the way for gaming to become mainstream. In recent years, the emergence of social networks, smartphones, and tablets has introduced new categories such as mobile and social games. As of 2020, the global video game market had estimated annual revenues of US$159 billion across hardware, software, and services, three times the size of the 2019 global music industry and four times that of the 2019 film industry.

In this report, we introduce three of the biggest publishers of the past decade that appear in our dataset. Our online research indicated that the four biggest publishers in the video game industry were Sony, Tencent, Nintendo, and Microsoft, but because the dataset contains hardly any Tencent games, we selected the other three. We identify each publisher’s most popular video games and compare attributes across the three publishers.

Finally, we are interested in how the popularity of video game platforms changes over time and in what makes a video game high-grossing. We also examine several word clouds to identify the words most commonly used in video game titles.

Introduction to the Data Set

This dataset is titled Video Games Sales 2019. The dataset is from Kaggle and contains 55,792 video games released from 1970 to 2019. You can find the specific dataset here. Each row corresponds to a game that generated sales in 2019. There are 16 columns in the dataset.

Categorical variables include: Rank, Name, basename, Genre, ESRB-Rating, Platform, Publisher, Developer, Last_Update, status

Quantitative variables include: Critic_Score, User_Score, Total_Shipped, Global_Sales, NA_Sales, PAL_Sales, JP_Sales, Other_Sales

Other variables include: VGChartz_Score, Year, url, Vgchartzscore, img_url

How have video games changed over time?

What does the popularity of gaming platforms look like over time?

The time series graph plots the five most popular platforms of all time, based on the number of games released per platform. We can see the rise and fall of platforms such as the PlayStation as they are replaced by newer consoles. For instance, the fall of the PS (PlayStation) is partly due to the rise of its replacement, the PS2: the number of games created for the PS decreases as the number of games for the PS2 increases. We also see a fairly steady rise of the PC.
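The counts behind a plot like this can be sketched in base R. This is only an illustration under stated assumptions: the `games` data frame below is made-up toy data that borrows the `Platform` and `Year` column names from the dataset, and the names `platform_totals`, `top_platforms`, and `releases` are hypothetical.

```r
# Toy stand-in for the dataset (hypothetical rows); only Platform and Year are used.
games <- data.frame(
  Platform = c("PS", "PS", "PS", "PS2", "PS2", "PS2", "PS2",
               "PC", "PC", "PC", "DS", "DS", "Wii", "Wii", "GB"),
  Year     = c(1998, 1999, 2000, 2001, 2002, 2002, 2003,
               1999, 2002, 2005, 2006, 2007, 2008, 2009, 1992),
  stringsAsFactors = FALSE
)

# Rank platforms by the total number of games released and keep the top five.
platform_totals <- sort(table(games$Platform), decreasing = TRUE)
top_platforms <- names(platform_totals)[1:5]

# Count games per year for each of those platforms.
top_games <- games[games$Platform %in% top_platforms, ]
releases <- as.data.frame(
  table(Year = top_games$Year, Platform = top_games$Platform),
  stringsAsFactors = FALSE
)
releases$Year <- as.integer(releases$Year)
```

Each row of `releases` holds one (Year, Platform) pair and its count in `Freq`, which is the shape a line plot with one line per platform would need.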

How do the Global_Sales of video games in different genres compare over the years?

Next, we explore the genres of the video games in the dataset.

The Global_Sales of video games follow a general trend of increasing, reaching a peak, and then decreasing. For this dataset, we looked at the five genres that generated the most sales: Sports, Action, Shooter, Racing, and Role-Playing. Misc was also among the top genres by Global_Sales, but we excluded it because it does not identify a specific genre; we believe the next-highest genre gives more insight. The genres peak around 2008 to 2011 and then slowly decline in later years. Action games had the highest sales in 2011, followed by Sports games in 2008. Note that many values in the Global_Sales column are n/a.
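The selection described above (drop missing sales, exclude Misc, keep the five best-selling genres, then total by year) can be sketched as follows. The `sales` data frame is invented toy data using the dataset's `Genre`, `Year`, and `Global_Sales` column names; the other names are hypothetical.

```r
# Toy stand-in for the dataset (hypothetical rows and values).
sales <- data.frame(
  Genre = c("Sports", "Action", "Misc", "Shooter", "Racing",
            "Role-Playing", "Puzzle", "Action", "Sports"),
  Year = c(2008, 2011, 2009, 2010, 2008, 2010, 2009, 2008, 2011),
  Global_Sales = c(9.0, 8.0, 7.5, 6.0, 5.0, 4.0, 1.0, NA, 2.0),
  stringsAsFactors = FALSE
)

# Drop rows with missing sales and the uninformative "Misc" genre.
usable <- sales[!is.na(sales$Global_Sales) & sales$Genre != "Misc", ]

# Total sales per genre, largest first; keep the top five.
genre_totals <- sort(tapply(usable$Global_Sales, usable$Genre, sum),
                     decreasing = TRUE)
top_genres <- names(genre_totals)[1:5]

# Yearly totals for the top genres (the series a time plot would show).
yearly <- aggregate(Global_Sales ~ Year + Genre,
                    data = usable[usable$Genre %in% top_genres, ],
                    FUN = sum)
```

Filtering the n/a values explicitly before summing keeps a single missing entry from turning a whole genre total into `NA`.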

We then ran a statistical test to see if there is a statistically significant difference in Global_Sales among the different genres.

## 
##  Bartlett test of homogeneity of variances
## 
## data:  sales by Genre
## Bartlett's K-squared = 51.511, df = 4, p-value = 1.746e-10
## 
##  One-way analysis of means (not assuming equal variances)
## 
## data:  sales and Genre
## F = 5.4447, num df = 4.00, denom df = 107.53, p-value = 0.0004965

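Bartlett’s test rejects equal variances across genres (p-value = 1.746e-10), so we rely on the one-way test that does not assume equal variances. Since its p-value of 0.0004965 is less than alpha = 0.05, we reject the null hypothesis that all genres have the same mean sales. We conclude that there is a statistically significant difference in sales among the genres; that is, at least one genre’s mean Global_Sales differs from the others.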
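For readers who want to reproduce this kind of output, the two tests correspond to base R's `bartlett.test` and `oneway.test`. The sketch below runs them on simulated toy data, so its numbers will not match the output above; the `toy` data frame, its group means and standard deviations, and the variable names are all assumptions made for illustration.

```r
set.seed(1)  # reproducible toy data

# Simulated sales for five genres with deliberately unequal spread.
genre_names <- c("Sports", "Action", "Shooter", "Racing", "Role-Playing")
toy <- data.frame(
  Genre = factor(rep(genre_names, each = 30)),
  sales = c(rnorm(30, 60, 25), rnorm(30, 55, 30), rnorm(30, 35, 15),
            rnorm(30, 30, 10), rnorm(30, 40, 20))
)

# Step 1: test whether the group variances are equal.
var_check <- bartlett.test(sales ~ Genre, data = toy)

# Step 2: compare group means. var.equal = FALSE requests the Welch
# version, which does not assume equal variances.
mean_check <- oneway.test(sales ~ Genre, data = toy, var.equal = FALSE)

alpha <- 0.05
reject_equal_means <- mean_check$p.value < alpha
```

With five groups, both tests report 4 degrees of freedom for the between-group comparison, which matches the `df = 4` and `num df = 4.00` lines in the output.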

We then wanted to look more closely at the distribution of genres for the top three publishers: Microsoft, Nintendo, and Sony. We looked at the Global_Sales of each genre for the three publishers and plotted the 10 genres that generated the most combined sales.

Note that the labels on the x-axis are ordered from the highest-selling genre to the lowest-selling genre across all three companies combined. The genre with the highest combined Global_Sales is puzzle games, followed by action. Of the top 10 genres, simulation generates the least sales. From the graph we can see that puzzle games generate high sales for Microsoft and Nintendo but relatively low sales for Sony.

For Microsoft, out of the top 10 genres, action games generate the most global sales, followed by shooter and puzzle games. Adventure games generate the least sales for Microsoft. For Nintendo, puzzle games generate the most sales, followed by platform games. Shooter, strategy, and simulation games generate the least sales. For Sony, sports games generate the most global sales, while strategy and simulation games generate the least.

How do the Global_Sales of games relate to their Critic_Score?

From the plot, the yellow trend line suggests that the relationship between Critic_Score and Global_Sales is not very strong, but games with higher scores tend to have higher sales. For example, the Action and Shooter games with the highest and lowest scores also have correspondingly high and low sales. Most data points cluster around Critic_Score = 7 and Global_Sales = 1. The plot shows the relationship between Critic_Score and Global_Sales and also distinguishes between the different genres.
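One way to put a number on the strength of such a trend is a correlation coefficient together with a fitted line. The sketch below assumes a straight-line fit, which may differ from however the yellow line in the plot was produced; the `scores` values are invented toy data that reuse only the `Critic_Score` and `Global_Sales` column names.

```r
# Toy stand-in for the dataset (hypothetical values, one missing score).
scores <- data.frame(
  Critic_Score = c(5.5, 6.0, 7.0, 7.2, 8.0, 9.1, NA, 9.5),
  Global_Sales = c(0.3, 0.8, 1.0, 0.9, 2.5, 4.0, 1.2, 3.6)
)

# Pearson correlation, ignoring rows with a missing score or sales figure.
r <- cor(scores$Critic_Score, scores$Global_Sales, use = "complete.obs")

# Least-squares line; the slope is the expected change in sales per score point.
# lm() drops rows with missing values by default.
fit <- lm(Global_Sales ~ Critic_Score, data = scores)
slope <- unname(coef(fit)["Critic_Score"])
```

A positive `slope` with an `r` well below 1 would correspond to the pattern described above: higher scores tend to go with higher sales, but with considerable scatter.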

We also want to explore the difference between critic and user scores for the top five sales-generating genres: Action, Racing, Role-Playing, Shooter, and Sports.

What does the distribution of sold copies for games by 3 of the biggest publishers/companies of the last decade look like?

To answer this question, we use a stacked bar plot with x = Publisher and y = Total_Shipped. Each bar is stacked by platform, since each platform falls under a single publisher; the DS and the Switch fall under Nintendo, for example.
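The table behind such a stacked bar plot can be sketched with base R's `xtabs`. The `shipped` data frame, its numbers, and the platform codes (for example `NS` for the Switch and `XBL` for Xbox Live) are assumptions made up for this illustration; only the column names come from the dataset.

```r
# Toy stand-in for the dataset (hypothetical rows, values, and platform codes).
shipped <- data.frame(
  Publisher = c("Nintendo", "Nintendo", "Nintendo", "Sony", "Sony",
                "Microsoft", "Microsoft"),
  Platform = c("DS", "NS", "DS", "PS4", "PS3", "XBL", "XOne"),
  Year = c(2010, 2017, 2005, 2014, 2011, 2012, 2015),
  Total_Shipped = c(5, 10, 99, 8, 4, 6, 1),
  stringsAsFactors = FALSE
)

# Keep only games from the past decade (2010 to 2019).
decade <- shipped[shipped$Year >= 2010 & shipped$Year <= 2019, ]

# Rows are platforms, columns are publishers; each cell is copies shipped.
by_platform <- xtabs(Total_Shipped ~ Platform + Publisher, data = decade)

# Column sums give the total height of each publisher's bar.
publisher_totals <- colSums(by_platform)

# barplot() stacks the rows of a matrix within each column by default.
# Drawn only in an interactive session so the script has no side effects.
if (interactive()) {
  barplot(by_platform, legend.text = TRUE,
          xlab = "Publisher", ylab = "Total_Shipped")
}
```

Putting platforms in the rows and publishers in the columns is what makes each bar a publisher and each stacked segment a platform.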

In the past decade, Sony Computer Entertainment has sold more copies of video games than either Microsoft or Nintendo. Nintendo has sold games across more platforms than Microsoft or Sony, which shows how diverse its platform lineup was from 2010 to 2019. We can also observe that Microsoft’s Xbox Live platform accounts for most of the copies Microsoft sold in the past decade.

Conclusion

Through our research questions, plots, and statistical analyses, we noticed several interesting points about the games in our dataset.

Firstly, the popularity of a platform rises and then falls as newer consoles replace it. We also see a fairly steady rise in the popularity of the PC. The top five genres by sales are Sports, Action, Shooter, Racing, and Role-Playing, and there is a statistically significant difference in global sales among those five genres. Their sales follow a similar trend of rising to a peak and then declining. Across Microsoft, Nintendo, and Sony combined, puzzle games generated the most global sales.

We also found that the trend between critic score and global sales is not very strong, but games with higher scores do tend to have higher sales. In addition, critics and users rated games differently. This matches what we expected, because video games are at their core about the player’s experience, and each person’s experience of a game can be different.

Finally, the most popular words that we found in video game titles from the original dataset were “game”, “world”, “adventure”, “super”, “star”, and “edition”. For each of the three companies, most of the common words in our word clouds came from the titles of their respective popular video game franchises. There were no clear similarities among the three companies. We did observe that both Nintendo and Sony Computer Entertainment released games on more platforms over the last decade than Microsoft did. However, most of the video game copies Microsoft sold came from Xbox Live.

Throughout the years, video games have progressed, and we look forward to seeing how they continue to transform.