Dataset

We wanted to explore how different aspects of video games relate to their sales. Our dataset is from Kaggle. It has 11,563 rows and 16 columns, and each row corresponds to an individual video game title. The variables include the game's name, publisher, release date, platform (e.g. PS4, Wii), and genre, along with global sales and regional sales for North America, the EU, and Japan. There are also critic scores (out of 100) and user scores (out of 10) from Metacritic for each game, together with the number of critic ratings and the number of user ratings.

Main Research Questions of the Project

We have a few research questions we wanted to explore in this project:

  • Question: What kinds of video games are played in different regions of the world?

  • Question: How do these factors affect video game sales?

    • Video game release dates
    • Critic scores / user scores
    • Video game platforms
  • Question: Do users and critics agree on how good games are?

Question 1: What kinds of video games are played in different regions of the world?

We evaluated the popularity of different game genres based on the sales of the games. From this bar plot, we can tell that Action games are by far the most popular genre worldwide and in the North America, EU, and Other regions, followed by Sports games. Interestingly, this pattern does not hold in Japan, where Role-playing games sell the most. Globally, Strategy is the least popular genre. To check whether a genre's total sales are inflated simply by the large number of games released in that genre, we also plotted the average sales per game. The second plot shows that while Action games have the highest total sales, Platform games have the highest average sales per game globally. (Platform games are games whose gameplay centers on jumping and climbing to navigate the player’s environment; the Mario games are one example.) Japan still favors Role-playing games the most, on average.
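The difference between total and average sales per genre can be illustrated with a minimal sketch in plain Python. The rows and the key names (`genre`, `global_sales`) are invented for illustration and are not taken from the actual dataset.

```python
from collections import defaultdict
from statistics import mean

# Toy rows invented for illustration; "genre" and "global_sales"
# are assumed key names, not necessarily the dataset's column names.
games = [
    {"genre": "Action", "global_sales": 1.0},
    {"genre": "Action", "global_sales": 2.0},
    {"genre": "Action", "global_sales": 3.0},
    {"genre": "Platform", "global_sales": 5.0},
]

def sales_by_genre(rows, sales_key):
    """Return {genre: (total_sales, average_sales_per_game)}."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["genre"]].append(row[sales_key])
    return {genre: (sum(values), mean(values)) for genre, values in grouped.items()}

summary = sales_by_genre(games, "global_sales")

# A genre with many releases can lead on total sales
# while a genre with few releases leads on average sales.
top_total = max(summary, key=lambda g: summary[g][0])
top_average = max(summary, key=lambda g: summary[g][1])
print(top_total, top_average)
```

Passing a regional sales key instead of `global_sales` would give the per-region version of the same comparison.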

Next, we looked more closely at the kinds of games that developers release most often as a way to study gamer preferences, on the assumption that these creators have the best knowledge of what consumers want. The word cloud below shows the 100 most common words in game names, after removing stopwords and applying stemming.

The most common words that game producers use to name their games are “world” and “game”, followed by “soccer”, “racing”, “NBA”, “NFL”, “war”, “legend”, and “hero”. This suggests that gamers are more attracted to games that involve sports and competition, or games that have fictional elements such as “dragon” and “war”. The word cloud also supports our conclusion above that action and sports games are the two most popular types of games around the world.
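The word counting behind a word cloud like this one can be sketched as follows. The titles, the stopword list, and the `crude_stem` helper are all invented for illustration; `crude_stem` is only a rough stand-in for a real stemming algorithm and is not the stemmer used in the analysis.

```python
import re
from collections import Counter

# Illustrative stopword list and titles, invented for this sketch.
STOPWORDS = {"the", "of", "a", "and", "in"}
titles = ["World of Legends", "Legend of the Dragon", "World Soccer", "Dragon Wars"]

def crude_stem(word):
    """Rough stand-in for a real stemmer: drop a trailing plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def title_word_counts(names):
    """Count stemmed, non-stopword tokens across all game names."""
    counts = Counter()
    for name in names:
        for word in re.findall(r"[a-z0-9]+", name.lower()):
            if word not in STOPWORDS:
                counts[crude_stem(word)] += 1
    return counts

counts = title_word_counts(titles)
print(counts.most_common(3))
```

Calling `counts.most_common(100)` on the full list of names would give the kind of top-100 list a word cloud is drawn from.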



Question 3: How do different video game platforms relate to video game sales?

To learn more about how different video game platforms relate to sales, we plotted total sales for each platform. In doing so, we combined related platforms into a single group; for example, PS2, PS3, and PS4 all fall under the platform PlayStation.
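One way to combine related platforms is a lookup table from platform code to platform family, sketched below. Only the PS2/PS3/PS4 grouping comes from the text; the Xbox codes, the key names, and the sales figures are assumptions made for illustration.

```python
from collections import defaultdict

# PS2/PS3/PS4 -> PlayStation is stated in the text.
# The Xbox codes ("X360", "XOne") are assumed for illustration.
PLATFORM_FAMILY = {
    "PS2": "PlayStation",
    "PS3": "PlayStation",
    "PS4": "PlayStation",
    "X360": "Xbox",
    "XOne": "Xbox",
}

# Toy rows invented for illustration.
games = [
    {"platform": "PS2", "global_sales": 2.0},
    {"platform": "PS4", "global_sales": 1.5},
    {"platform": "X360", "global_sales": 3.0},
    {"platform": "PC", "global_sales": 0.5},
]

def total_sales_by_family(rows):
    """Sum sales per platform family; unmapped platforms keep their own name."""
    totals = defaultdict(float)
    for row in rows:
        family = PLATFORM_FAMILY.get(row["platform"], row["platform"])
        totals[family] += row["global_sales"]
    return dict(totals)

totals = total_sales_by_family(games)
print(totals)
```

Falling back to the platform's own code keeps ungrouped platforms such as PC in the totals rather than silently dropping them.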

Globally, PlayStation platforms account for the largest share of total sales, followed by Xbox, Wii, Nintendo DS, PC, and Gameboy. The same trend appears in the EU, North America, and Other regions. Total sales in Japan are generally among the smallest for each platform, but are also dominated by PlayStation and Wii. From the graph, we can also see that North American total sales are the highest on each platform, followed by the EU, and then the other regions.



Question 4: Do users and critics agree on how good games are?

To see how user scores and critic scores relate, we made a heatmap to show where the density of points was. The heatmap is informative because it shows where the density of the scatterplot is without the messiness of several thousand points. We also added a line (shown in black) to show a one-to-one relationship between user and critic scores, as well as a line to show the linear model between the two variables.

This graph shows that most ratings from both critics and users are high, and that user scores are slightly higher than critic scores, because more points lie above the one-to-one line than below it. Another thing to note is that the linear model line intersects the one-to-one line at approximately (7.5, 7.5) and has a smaller slope. This means that below a score of 7.5, the user score is generally higher than the critic score, and above 7.5 the critic score is generally higher.
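The crossing point of a fitted line with the one-to-one line follows from simple algebra: a line y = a + b·x meets y = x at x = a / (1 − b). The sketch below fits a least-squares line to toy scores and computes that crossing. The data are invented, and it assumes critic scores have been rescaled to the same 0-10 range as user scores, with critic score on the x-axis.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy scores invented for illustration, both on a 0-10 scale
# (assumes critic scores were divided by 10 to match user scores).
critic = [5.0, 7.0, 9.0]
user = [6.25, 7.25, 8.25]

slope, intercept = fit_line(critic, user)

# The fitted line y = intercept + slope * x meets y = x
# where x = intercept / (1 - slope), provided slope != 1.
crossing = intercept / (1 - slope)
print(slope, intercept, crossing)
```

With a slope below 1, the fitted user score sits above the one-to-one line to the left of the crossing and below it to the right, which is the pattern described above.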



Question 5: How do critic scores and/or user scores affect video game sales?

Since we also wanted to know how critic ratings relate to video game sales, we plotted critic scores against average sales for each video game in different areas of the world. However, since the relationship in the resulting graphs for critic scores appeared exponential, we show the same plot using a log scale on the y-axis instead. Interestingly, for user scores, we did not find an exponential trend, so the plot shown uses a linear scale.

All four regions of the world show a positive relationship between log(average sales) and critic score. Each linear trendline shown has a positive slope, which suggests that as critic score increases, sales also tend to increase. In particular, log(average sales) in Japan seems to be more affected by critic score than in the other regions of the world, since its linear trendline has a steeper slope.
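Why a log scale straightens an exponential trend can be seen in a small sketch: if sales grow as c·exp(k·score), then log(sales) is linear in score with slope k. The data below are generated from an invented exponential curve, and the natural logarithm is an assumption, since the text does not say which base was used.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy data invented for illustration: average sales that grow
# exponentially with critic score (out of 100).
critic_scores = [50, 60, 70, 80, 90]
avg_sales = [0.1 * math.exp(0.05 * score) for score in critic_scores]

# Taking the (natural) log turns the exponential trend into a straight line.
log_sales = [math.log(value) for value in avg_sales]
slope, intercept = fit_line(critic_scores, log_sales)

# On the log scale, a slope of k means each extra 10 critic points
# multiplies average sales by exp(10 * k).
multiplier_per_10_points = math.exp(10 * slope)
print(slope, multiplier_per_10_points)
```

Read this way, a steeper trendline on the log plot corresponds to a larger multiplicative change in sales per critic point, not a larger additive one.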

We also see a positive relationship between average sales and user score. The slopes of the linear regression for each region all appear to be similar, which indicates that user score has similar effects on sales globally.

Overall, critic scores seem to have a stronger effect on average sales than user scores, since the slopes of the trendlines for critic scores are steeper. This suggests that critic scores are a more useful metric to look into for generating the most sales.



Main Conclusions and Overall Takeaways

In conclusion, we learned many things about video game sales around the world. We found that action games are the most popular genre around the world, except in Japan, which prefers role-playing games. We also noticed that the most common words in video game titles have fictional, competitive, or sport-related elements, indicating that gamers are drawn to games with these components. Another interesting trend we discovered was that game sales peaked during the Great Recession, which suggests that video game popularity could be influenced by this event. Finally, we demonstrated that users and critics rate games differently, and that their scores relate to sales differently. Critic scores affect sales the most in Japan, compared with other regions of the world; in contrast, user scores tended to affect sales similarly in all regions of the world. We found that the most popular platform was PlayStation, which generated the most sales by far. Overall, we concluded that video game culture in Japan is different from the rest of the world, and that factors such as critic and user scores do matter when it comes to sales.



Future Work

We noted that there was a peak of total video game sales during the time that the Great Recession occurred, but we cannot establish any relation between the Great Recession and video game popularity. Correlation does not imply causation, so we are unable to make any claims about this phenomenon; instead, we see it as an interesting coincidence that could be looked into further.