Data

The video game dataset comes from kaggle.com and contains data on video games that sold more than 100,000 copies. The dataset has 16,598 rows, each corresponding to one video game. Each video game is ranked by overall sales (in millions), and sales figures are available separately for North America, Europe, and Japan. All other sales are grouped in an “other” category, and an additional column gives global sales for each video game. The remaining columns give each video game’s name, platform of release, year of release, genre, and publisher. There are 31 unique platforms in the dataset, including the Wii, PS4, and PC. The year of release is unavailable for 271 video games; for the rest, it ranges from 1980 to 2020. There are 12 unique genres (including a miscellaneous category) and 579 unique publishers, the most common of which is Electronic Arts.
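The counts quoted above can be reproduced with a few lines of code. The sketch below is illustrative only: the column names (`Platform`, `Year`, `Publisher`, and so on) and the `N/A` marker for a missing year are assumptions about the file layout, and the three sample rows are made up rather than taken from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the ASSUMED column layout; the rows are invented.
SAMPLE = """Rank,Name,Platform,Year,Genre,Publisher,NA_Sales,EU_Sales,JP_Sales,Other_Sales,Global_Sales
1,Game A,Wii,2006,Sports,Publisher X,4.0,3.0,1.0,0.5,8.5
2,Game B,PS4,N/A,Action,Publisher Y,2.0,1.5,0.2,0.3,4.0
3,Game C,PC,2010,Strategy,Publisher X,1.0,1.2,0.0,0.1,2.3
"""


def summarize(rows):
    """Return the kind of descriptive counts reported in the Data section."""
    rows = list(rows)
    # Treat any non-numeric year (assumed to be "N/A") as missing.
    years = [int(r["Year"]) for r in rows if r["Year"].isdigit()]
    publishers = Counter(r["Publisher"] for r in rows)
    return {
        "n_rows": len(rows),
        "n_platforms": len({r["Platform"] for r in rows}),
        "n_genres": len({r["Genre"] for r in rows}),
        "n_publishers": len(publishers),
        "top_publisher": publishers.most_common(1)[0][0] if publishers else None,
        "missing_year": len(rows) - len(years),
        "year_range": (min(years), max(years)) if years else None,
    }


summary = summarize(csv.DictReader(io.StringIO(SAMPLE)))
print(summary)
```

To run this on the real file, the `io.StringIO(SAMPLE)` object would be replaced with an open file handle for the downloaded CSV.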

Research Questions

Through our research, we hope to analyze the key factors that make a video game successful and how this success varies throughout the world. To complete our analysis, we will address the following questions:

  • How do specific game attributes such as name, genre, and publisher affect rankings of video games?

  • Which genres are more financially successful in North America? Are these genres consistently successful in different regions?

  • How do sales of video games differ across regions? How have sales changed over time?

Titles

Here we analyze our first research question and study whether the success of a video game is related to the words used in its title. In the graph below, we look at the words used in the titles of the 1000 best-selling video games.

Wordcloud of video game names in the top 1000 ranked video games

The word cloud above shows the most common words in the names of the highest-selling video games. Words such as “adventure”, “war”, and “super” appear in many of these titles. In some cases, a frequent word reflects a popular series; “Mario”, for example, is common because of the many Super Mario video games. Overall, the word cloud gives a sense of which kinds of games are the most popular.
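A word cloud is a visual display of word frequencies, so the underlying counts can be sketched directly. The following is a minimal illustration, not the code used for the figure: the stop-word list is a small hypothetical one, and the three titles are example inputs rather than rows read from the dataset.

```python
import re
from collections import Counter

# Small hypothetical stop-word list; a fuller list would be used in practice.
STOP_WORDS = {"the", "of", "and", "a", "in", "to"}


def title_word_counts(titles):
    """Count lower-cased words across titles, skipping stop words."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z]+", title.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


# Example titles for illustration only.
titles = ["Super Mario Bros.", "Super Mario World", "Call of Duty: World at War"]
counts = title_word_counts(titles)
print(counts.most_common(3))
```

In a word cloud, the size of each word would then be scaled by its count.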

One factor that could play a role in a video game’s success is its title. We run a sentiment analysis on video game titles to see whether popular video games have more positive or more negative words in their titles. We create two word clouds below, one showing the positive words in popular video game titles and the other showing the negative words.

The first word cloud shows that “super” and “hero” appear frequently in video game titles; other positive words include “magic,” “love,” “grand,” and “marvel.” There appear to be more video game titles with negative words, such as “dead,” “monster,” “dark,” “madden,” and “evil.” This could suggest that video game titles containing negative words are more attractive to video game users, or that specific video game genres are more appealing to users and the titles happen to reflect those genres.
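A lexicon-based sentiment split like the one described above can be sketched as follows. This is an illustration under stated assumptions: the two word sets are a tiny hypothetical lexicon built only from the words quoted in the text, not the lexicon used in the analysis, and the titles are example inputs. The sketch also makes one limitation visible: a lexicon matches words without context, so “madden” is counted as negative even where it is a surname in a title.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon, limited to the words quoted in the text.
POSITIVE = {"super", "hero", "magic", "love", "grand", "marvel"}
NEGATIVE = {"dead", "monster", "dark", "madden", "evil"}


def sentiment_counts(titles):
    """Return (positive, negative) word counters for a list of titles."""
    pos, neg = Counter(), Counter()
    for title in titles:
        for word in re.findall(r"[a-z]+", title.lower()):
            if word in POSITIVE:
                pos[word] += 1
            elif word in NEGATIVE:
                neg[word] += 1
    return pos, neg


# Example titles for illustration only.
titles = ["Super Mario Bros.", "Dead Rising", "Madden NFL 2004", "Grand Theft Auto V"]
pos, neg = sentiment_counts(titles)
print("positive:", dict(pos))
print("negative:", dict(neg))
```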

Contributing Factors of Financial Success

Genres

We aim to investigate our second research question: which video game genres are more financially successful in North America, and are these same genres also successful elsewhere in the world?

To do so, we create side-by-side boxplots of North American sales for each genre. The sales are log-transformed because a few video games have sales much larger in magnitude than the others; the transformation makes the overall trend in North American sales by genre easier to see.
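The numbers behind such boxplots are the quartiles of log(sales) within each genre. The sketch below shows one way to compute them. Several choices here are assumptions rather than details given in the text: the natural logarithm is used, games with zero North American sales are dropped because their log is undefined, and the sample records are invented.

```python
import math
import statistics
from collections import defaultdict


def log_sales_quartiles(records):
    """Map genre -> (Q1, median, Q3) of log(sales), skipping zero sales."""
    by_genre = defaultdict(list)
    for genre, sales in records:
        if sales > 0:  # log(0) is undefined; dropping zeros is an assumption
            by_genre[genre].append(math.log(sales))
    summary = {}
    for genre, values in by_genre.items():
        if len(values) >= 2:  # quartiles need at least two observations
            q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
            summary[genre] = (q1, median, q3)
    return summary


# Hypothetical (genre, North American sales in millions) pairs.
records = [
    ("Platform", 1.0), ("Platform", math.e), ("Platform", math.e ** 2),
    ("Strategy", 0.1), ("Strategy", 0.0), ("Strategy", 0.2),
]
summary = log_sales_quartiles(records)
for genre, (q1, median, q3) in summary.items():
    print(f"{genre}: Q1={q1:.2f} median={median:.2f} Q3={q3:.2f}")
```

Comparing the medians across genres corresponds to comparing the center lines of the boxes in the plot.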

The graph above indicates that video games in the Platform genre may be more successful on average than those in other genres, and that the Adventure and Strategy genres are less successful than other genres. Every boxplot has one or more outliers, indicating that each genre has one or more highly successful video games, so video game success is not entirely dependent on the genre trend. Several video games in the Sports genre appear to have been highly successful compared to other video games.
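Boxplots conventionally mark a point as an outlier when it lies more than 1.5 times the interquartile range beyond the box. Whether the plot above used exactly this default is an assumption; the sketch below applies the rule on the upper side to a handful of invented values to show how the highly successful games in a genre would be identified.

```python
import statistics


def upper_outliers(values, k=1.5):
    """Return the values above Q3 + k * IQR (the conventional upper fence)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    fence = q3 + k * (q3 - q1)
    return [v for v in values if v > fence]


# Hypothetical sales values for one genre, with one unusually large game.
values = [0.1, 0.2, 0.2, 0.3, 0.4, 5.0]
outliers = upper_outliers(values)
print(outliers)
```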

Top Genre by Region

We continue our analysis of genre success by looking at whether video game success by genre varies by region. To do this, we create a density plot of sales for games within the Platform genre, which was one of the most successful genres identified in the previous section.

We see that the distribution of log(sales) differs noticeably among the regions and tends to be skewed to the right. Global sales and North American sales are the most similar, which makes sense because North American sales make up the largest share of Global sales. Among the individual regions, North American video game sales are visually most similar to European video game sales, which could suggest that video games that are popular in North America also tend to be popular in Europe.
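The relationship between regional and Global sales can be checked numerically by computing each region’s share of the total. The sketch below is a minimal illustration: the column names are assumed, and the two rows are invented values rather than figures from the dataset.

```python
REGIONS = ("NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales")  # assumed names


def regional_shares(rows):
    """Return each region's fraction of total sales across all rows."""
    totals = {region: sum(row[region] for row in rows) for region in REGIONS}
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {region: totals[region] / grand_total for region in REGIONS}


# Hypothetical Platform-genre rows (sales in millions).
rows = [
    {"NA_Sales": 4.0, "EU_Sales": 3.0, "JP_Sales": 1.0, "Other_Sales": 2.0},
    {"NA_Sales": 6.0, "EU_Sales": 2.0, "JP_Sales": 1.0, "Other_Sales": 1.0},
]
shares = regional_shares(rows)
for region, share in shares.items():
    print(f"{region}: {share:.0%}")
```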

Video Games Sales in Other Regions

Conclusion

From our analysis, we found that many attributes might affect video game rankings. For example, whether the name of a game contains negative words and whether it includes the name of a popular series both appear to be contributing factors. Genre and publisher could also affect rankings: different genres had different amounts of sales, with platform games having the largest, and different publishers had varying amounts of sales, with Nintendo being the highest. In North America, platform games are the most popular, but their sales vary by region, and they are considerably less popular in Japan. When looking at sales over time, we noticed a sharp decline in sales in 2008 across all regions. Most of this decline was in Nintendo’s sales, and more research is needed to determine whether Nintendo’s decline is the reason behind the large drop in global sales or whether some other external factor was responsible. To do this research, we would need more data, such as data about the economy, that could allow us to test whether the 2008 recession caused this decline.