Introduction

Movies are a timeless form of entertainment, deeply ingrained in our culture and society. Titles such as James Bond, Harry Potter, and Star Wars are household names with their own cult-like followings. Families flock to theaters to wait in line and see their favorite stories come to life on the big screen. However, in recent years, with the popularity of streaming services such as Netflix and Disney+, movies are being produced in much larger quantities, whether that means more movies in popular genres or spin-offs of popular series.

In this report, we analyze the quality of movies and how their characteristics have changed throughout the years. To dive deeper into this trend, we compare movies based on the attributes a movie-goer considers when choosing a movie, in order to figure out what constitutes a “good” movie.



Data

For this project, we used a data set available on Kaggle. The data set contains 7668 rows and 15 columns, providing information (scraped from IMDb) about movies released between 1980 and 2020. Each row corresponds to a particular movie, and each column represents a different attribute describing these movies. We briefly describe the variables here:

  • name: Name of the movie.
  • rating: Content rating of the movie.
    • We found a small subset of movies that did not have typical movie ratings. Rather than treat them individually, we consolidated them into a bucket called “Other”. The ratings in this category include: Approved (1), TV-14 (1), X (3), TV-PG (5), TV-MA (9), and NC-17 (23). The number of movies falling under each of these ratings is indicated in parentheses.
    • Additionally, we consolidated movies with no value in the rating column as well as movies with rating equal to Not Rated or Unrated into one category just called “Not Rated”.
    • Thus, we ended up focusing on five ratings: Not Rated, G, PG, PG-13, R.
  • genre: Main genre of the movie.
    • There were 19 genres represented: Comedy (2245), Action (1705), Drama (1518), Crime (551), Biography (443), Adventure (427), Animation (338), Horror (322), Fantasy (44), Mystery (20), Thriller (16), Family (11), Romance (10), Sci-Fi (10), Western (3), Musical (2), History (1), Music (1), Sport (1). The number of movies in each genre is indicated in parentheses.
    • We focused on the top 6 genres (i.e. those with more than 400 movies) for our analyses; a code sketch of this rating and genre consolidation follows the variable list.
  • year: Year of the movie’s release.
    • There are 200 movies from every year between 1985 and 2019.
    • There are fewer counts (ranging from 92 to 168) from years between 1980 and 1984, and just 25 from 2020.
  • released: Release date of the movie.
    • The format is Month D, YYYY (Country).
  • score: IMDb user rating of the movie.
    • Scores are on a 10-point scale.
    • The scores range from 1.9 to 9.3 with a mean of 6.39 and a median of 6.5.
  • votes: Number of user votes for the movie.
    • This ranges from 7 votes to 2,400,000 votes with a mean of 88,108 votes and a median of 33,000 votes.
    • There are 3 NA’s.
  • director: The director of the movie.
  • writer: The writer of the movie.
  • star: The main actor/actress of the movie.
  • country: The country of origin for the movie.
  • budget: The budget of the movie.
    • The budgets range from $3,000 to $356,000,000 with a mean of $35,589,876 and a median of $20,500,000.
    • There are 2171 NA’s.
  • gross: The revenue of the movie.
    • The gross revenues range from $309 to $2,847,246,203 with a mean of $78,500,541 and a median of $20,205,757.
    • There are 189 NA’s.
  • company: The production company of the movie.
  • runtime: The duration of the movie.
    • The runtimes range from 55 minutes to 366 minutes with a mean of 107.3 minutes and a median of 104 minutes.
    • There are 4 NA’s.
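
To make the data preparation concrete, here is a minimal sketch of the rating consolidation and genre filtering described above, assuming the Kaggle CSV has been read into a data frame called movies with the column names listed (the file name and the coding of missing ratings are our assumptions):

library(dplyr)

movies <- read.csv("movies.csv", stringsAsFactors = FALSE)   # file name assumed

# Consolidate ratings: blanks/NA/Not Rated/Unrated -> "Not Rated"; atypical ratings -> "Other"
movies <- movies %>%
  mutate(rating = case_when(
    is.na(rating) | rating %in% c("", "Not Rated", "Unrated") ~ "Not Rated",
    rating %in% c("G", "PG", "PG-13", "R")                    ~ rating,
    TRUE                                                      ~ "Other"   # Approved, TV-14, X, TV-PG, TV-MA, NC-17
  ))

# Top 6 genres (those with more than 400 movies):
# Comedy, Action, Drama, Crime, Biography, Adventure
top_genres <- movies %>%
  count(genre, sort = TRUE) %>%
  filter(n > 400) %>%
  pull(genre)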

As we can see, there is a good mix of categorical and quantitative variables. In particular, the attributes name, rating, genre, director, writer, star, country, and company are categorical, while year, released, score, votes, budget, gross, and runtime are quantitative.



Research Questions

As briefly described above, our main overarching objective is to study how movies have evolved over the past 40 years, with a particular focus on what attributes make a movie successful. We used IMDb score as our main metric for determining the quality of a movie.

With that in mind, we developed the following three main research questions:

  1. How have movie scores trended over time?

  2. Which movie characteristics actually impact score?
    • Genre
    • Rating & Budget
    • Revenue & Budget
    • Country of Origin

  3. How do other characteristics, such as movie genres and ratings, interact with each other?


How have movie scores trended over time?

To set a baseline, we wanted to first examine how movie scores have evolved over the past 40 years. We felt that looking for overall trends in how movies are scored is important for a couple of reasons. First and foremost, we might gain some insight into how the overall quality of movies has changed over time. Beyond this top-level insight, any clear-cut trends provide valuable context for our more in-depth analyses of how different movie characteristics relate to scoring. We first grouped all the movies by month of release (i.e. January 1980, …, November 2020), and then calculated the average movie score for each of these months. Below, we plot these monthly average movie scores.
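
As a rough sketch of this aggregation (using the movies data frame from the Data section; parsing the released column with lubridate as shown is our assumption about its exact format):

library(dplyr)
library(lubridate)

monthly_scores <- movies %>%
  mutate(release_date = mdy(sub("\\s*\\(.*\\)$", "", released))) %>%   # strip the "(Country)" suffix, parse "Month D, YYYY"
  filter(!is.na(release_date), !is.na(score)) %>%
  mutate(release_month = floor_date(release_date, unit = "month")) %>%
  group_by(release_month) %>%
  summarize(avg_score = mean(score), .groups = "drop")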

As you can see, at first glance, the data is pretty noisy. Also note that because we are working with monthly average scores (i.e. grouping by release month and averaging the scores of all the movies in a given month), all the values here lie between roughly 5.0 and 7.5. This is, as would be expected, a much tighter range than that of individual movie scores (1.9 to 9.3).

In order to look beyond the noise, we have plotted the moving average of these scores (using a 2 year moving average window).
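
The smoothing step can be sketched as a centered rolling mean over a 24-month window (here using the zoo package; the exact window alignment we used is an implementation detail):

library(dplyr)
library(zoo)
library(ggplot2)

monthly_scores <- monthly_scores %>%
  arrange(release_month) %>%
  mutate(smoothed = rollmean(avg_score, k = 24, fill = NA, align = "center"))   # 2-year window

ggplot(monthly_scores, aes(x = release_month, y = smoothed)) +
  geom_line() +
  labs(x = "Release month", y = "Average score (2-year moving average)")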

With the noise smoothed out, we now see a pretty clear upward trend in movie scores over the past four decades (at least before what seems to be a decline starting in the mid-to-late 2010s). This tells us that movies have gradually been receiving higher scores as time has passed. Based on what we know so far, we can’t quite tell what exactly is causing this trend, but there are a couple of possible explanations.

The most obvious explanation would be that the quality of movies has actually improved. Given technological advances, more data on viewer tastes, and other improvements made over the past 40 years, this seems to be a pretty intuitive conclusion. However, there are other plausible explanations. For example, one theory would be that, given societal changes and the fact that the people scoring movies are changing, we may just be witnessing a change in the way movies are scored. People scoring newer movies may be using a different set of criteria or just being more lenient than scorers from 40 years ago. Hence the higher scores of this era may just be a symptom of score “inflation” rather than a reflection of improved quality.

One interesting finding is the sharp downward trend in scores over the past half decade or so. While there have been slight dips in the past, none have been as pronounced (and few as prolonged) as the current one, and the overall upward trend has ultimately prevailed. It will be interesting to see over the coming years whether this dip is just another blip or a sign of shifting fortunes. Again, we can’t quite tell what’s causing this shift, but our group’s theory is that the rising popularity of streaming services, and the subsequent focus on mass production of movie content, may be leading to an overall decline in movie quality. While there may be a lot of great new movies coming out (perhaps more than ever), the sheer focus on constant content creation could also be leading to a much higher volume of sub-par movies being released.

Ultimately, stepping away from the more in-depth discussions on causes and recent trend reversal, the key takeaway here is a pretty clear cut increasing trend in movie scores over time.



Which movie characteristics actually impact score?

Now with that baseline study of movie scores in hand, we wanted to dive deeper into how different movie characteristics relate to score.

Genre

First, we home in on movie genres to see if certain genres tend to produce higher or lower scoring movies.

These boxplots showcase the conditional distribution of movie score given genre for each of the 5 decades represented in the data set. As mentioned earlier, given the vast number of genres present, we decided to focus in on the top 6 genres (by count of movies in the data set). These are: Comedy, Action, Drama, Crime, Biography, and Adventure.
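
A sketch of how these boxplots could be produced (assuming the movies data frame and the top_genres vector from the Data section; the decade labeling is our own construction):

library(dplyr)
library(ggplot2)

movies %>%
  filter(genre %in% top_genres) %>%
  mutate(decade = paste0(floor(year / 10) * 10, "s")) %>%   # e.g. 1987 -> "1980s"
  ggplot(aes(x = genre, y = score)) +
  geom_boxplot() +
  facet_wrap(~ decade) +
  labs(x = "Genre", y = "IMDb score")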

At first glance, there are no major differences between the decades that really stand out. Roughly speaking, scores appear to be trending ever so slightly higher, but that trend is not as evident as it was in our earlier discussion using moving averages.

Looking at a more granular level, the biography genre seems to consistently have the highest scores, but it has been on a downward trend since the 1990s and other genres are beginning to catch up. On the other end of the spectrum, action movies seem to consistently score the lowest, but the scores for this genre have been improving: it has gone from being clearly the lowest scoring genre in the 1980s and 1990s (with a center around a score of 6) to being on par with or above comedy in the 2010s (with a center around a score of 6.5), and it has gotten off to a great start in the 2020s as well.

The action genre, along with the comedy and adventure genres, seems to consistently have not just the lowest scores but also the widest range of scores, suggesting these genres contain many movies of differing quality. In that case, the lower center of these genres’ scores may be a case of many bad movies counteracting good ones and pulling the overall genre down. This is in pretty sharp contrast to the biography genre, which, as established, consistently scores highest but also has the smallest range, suggesting the quality of movies in this genre is more consistent.

 

Rating & Budget

Next, we wanted to explore the relationship between movie scores and budget, while taking into account movie content rating as well.

For this, we created a scatter plot of score vs. budget, faceted by rating. As mentioned earlier, the data set contained ratings not normally associated with movies. Because of this, as well as their small sample sizes, we decided not to include them individually in our analysis, but rather as a group called “Other”. Our data set also had movies with no rating, or with a rating of “Not Rated” or “Unrated”; we consolidated all of these into one category called “Not Rated”. Adding in the four main movie ratings - G, PG, PG-13, and R - we ultimately ended up with six rating categories.
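
A minimal sketch of this faceted scatter plot, assuming the cleaned movies data frame from earlier:

library(dplyr)
library(ggplot2)

movies %>%
  filter(!is.na(budget), !is.na(score)) %>%
  ggplot(aes(x = budget, y = score)) +
  geom_point(alpha = 0.3) +
  facet_wrap(~ rating) +
  labs(x = "Budget (USD)", y = "IMDb score")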

Looking at the graph, we do not see any noticeable differences in the relationship between score and budget across ratings. They all follow the same slight positive trend, where score increases as budget increases, except for the “Other” category. We did notice that PG-13 and R rated movies seem to have slightly higher scores than the other two main ratings. Interestingly, G and R movies have not gone above a budget of $200 million, but there are PG and PG-13 movies that have done so. In fact, the PG-13 category has several movies with significantly higher budgets. We think this may be because children and teenagers are viewed as one of the largest target audiences for movies, and so more money is invested in movies catered to them.

## 
##         G Not Rated     Other        PG     PG-13         R 
##       153       412        42      1252      2112      3697

The above table, displaying counts of movies falling under each rating category, supports the prominence of PG-13 movies, but also reveals that R-rated movies make up by far the largest share of movies in the data set over the past four decades. This is interesting because R-rated movies typically have lower budgets than other ratings (the highest density of points lies between $0 and $100,000,000), yet we see no noticeable difference in the scores of these movies. This would seem to suggest that R-rated movie makers may not need to spend as much to create higher scoring movies.

 

Revenue & Budget

Based on the findings we just discussed, we wanted to dive even deeper into the relationship between a movie’s score and its financial success. Therefore, below we created a scatter plot with budget on the x-axis and gross revenue on the y-axis, with the data points colored by score. In order to avoid distortions created by movies with extreme scores, we narrowed our focus to movies with scores between 5.2 and 7.6 (i.e. the middle 80% of all the movies).

Note, we also included a linear regression line on the scatter plot. The corresponding output from the linear model predicting gross revenue from budget is shown below.
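
A sketch of the subsetting, the plot, and the model fit (the 5.2 and 7.6 cutoffs correspond roughly to the 10th and 90th percentiles of score; middle_scores_sub is the data frame name that appears in the model output, while the particular color palette is our assumption):

library(ggplot2)

cutoffs <- quantile(movies$score, probs = c(0.1, 0.9), na.rm = TRUE)   # roughly 5.2 and 7.6
middle_scores_sub <- subset(movies, score >= cutoffs[1] & score <= cutoffs[2])

ggplot(middle_scores_sub, aes(x = budget, y = gross, color = score)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", se = FALSE, color = "black") +          # regression line
  scale_color_viridis_c(direction = -1) +                            # assumed palette: high scores shown as purple
  labs(x = "Budget (USD)", y = "Gross revenue (USD)", color = "IMDb score")

summary(lm(gross ~ budget, data = middle_scores_sub))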

## 
## Call:
## lm(formula = gross ~ budget, data = middle_scores_sub)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -456345283  -40424547   -5918536   18739106 1230964336 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.185e+07  2.129e+06  -5.563  2.8e-08 ***
## budget       3.009e+00  3.929e-02  76.589  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 107400000 on 4496 degrees of freedom
##   (1800 observations deleted due to missingness)
## Multiple R-squared:  0.5661, Adjusted R-squared:  0.566 
## F-statistic:  5866 on 1 and 4496 DF,  p-value: < 2.2e-16

Already from the trend of the points on the scatter plot and the slope of the linear regression line, we can see a pretty clear, strong positive relationship between revenue and budget. This is confirmed by the linear regression output, which shows that the positive slope is indeed statistically significant. Hence, we are seeing that, on average, movies with bigger budgets do tend to bring in more revenue. Although this is the logically expected result, we were admittedly a bit unsure whether the data would clearly show it. We have all heard stories of big budget movies failing to live up to expectations, and we even see a few data points on the scatter plot displaying this phenomenon, but for the most part, bigger budgets tend to produce bigger revenues. While this finding must be music to the ears of big budget movie makers, the variability in the plot above highlights that a big budget alone is no guarantee of success at the box office.

Now let’s step back from that quick digression on the relationship between budget and revenue alone, and bring score back into the picture. Unfortunately, it is not as clear how the budget and revenue of a movie are related to its score. There are many purple points in the lower left corner of the graph, which would counterintuitively indicate several high scoring movies that produce low revenues and/or several low budget movies that still earn high scores. However, one thing we do see is that purple points are more commonly found above the regression line, while yellow points are more frequent below it. This suggests that high scoring movies tend to get more out of their budgets. That is, for two movies with the same budget, we would expect the high scoring one to bring in more revenue than the average movie with that budget (i.e. fall “above the regression line”), while the low scoring one would be expected to generate less (i.e. fall “below the regression line”). This is much more in line with our expectation that high scoring movies tend to outperform their low scoring counterparts.

 

Country of Origin

Finally, wrapping up our discussion of movie score: given that we had information on each movie’s country of origin, as well as a decent number of foreign films in the data set, we were curious about the impact of a movie’s origin on its success. Below, we map the average score of movies from different countries. Note that countries with no movies are left transparent/white.

From a data standpoint, it is important to note that in this analysis we excluded 58 movies listed as being from countries that no longer exist (e.g. Yugoslavia). We also adjusted some listed country names to match present-day country names (e.g. changed “Soviet Union” to “Russia”). We were left with 54 countries having at least one movie.
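
A sketch of the aggregation and choropleth (this assumes the maps package is installed so that map_data() is available, and that the country names have already been reconciled as described above):

library(dplyr)
library(ggplot2)

country_scores <- movies %>%
  filter(!is.na(country), !is.na(score)) %>%
  group_by(country) %>%
  summarize(avg_score = mean(score), n_movies = n(), .groups = "drop")

world <- map_data("world")   # region names must match the values in the country column

ggplot() +
  geom_map(data = world, map = world, aes(map_id = region),
           fill = "white", color = "grey70") +                       # countries with no movies stay white
  geom_map(data = country_scores, map = world,
           aes(map_id = country, fill = avg_score), color = "grey70") +
  expand_limits(x = world$long, y = world$lat) +
  labs(fill = "Average IMDb score")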

Our immediate takeaway is that the results here do not really align with conventional expectations (or at least our own). Movies from major markets, such as the US and China, actually have average scores below the global median. Additionally, we think many would be surprised to see that countries like Lebanon, Russia, and Argentina are producing above-average movies (at least based on IMDb score), considering they do not have a well-known reputation for making great films.

Given the surprising nature of these results, we wanted to take a look at the actual count of movies from each country to provide some extra context. Again, countries with no movies are transparent/white.

Unsurprisingly, the US has the most movies by far with 5472 of the 7610 total movies considered in this analysis. The country with the next highest count of movies is the UK with 816 movies. Additionally, the median count is just 6.5, indicating there are several countries with merely a handful of movies. Given these wide disparities, in order to better visualize the counts without being distorted by the US, we converted the counts to a natural log scale.
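
Continuing the sketch above, the log transform is a one-liner on the per-country counts:

library(dplyr)

country_scores <- country_scores %>%
  mutate(log_count = log(n_movies))   # natural log; tames the gap between the US (5472 movies) and the median country (6.5)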

Moving on to the actual results, we can clearly see that the conventionally big movie markets have much higher counts of movies in the data set than smaller markets do. Comparing the two maps, we see that, broadly speaking, the count of movies seems to be inversely related to the average movie score. That is, for the most part, countries (like the US) that produce a lot of movies tend to have lower average scores, while countries (like Lebanon or Argentina) with relatively few movies have higher average scores. Given the sample size issues just discussed, we view this overall result as very likely a consequence of the limitations of the data rather than a meaningful finding. For example, there is only one movie from Lebanon in the data set. While that movie scored very well, it would be a stretch to say that it is representative of the entire Lebanese movie industry. With so many countries having very small counts, biases in which movies were selected for inclusion in the data set have a disproportionate impact. Hence, we cannot confidently draw representative and definitive conclusions about most of the countries presented here.

We could try to draw conclusions for countries with larger sample sizes, but we still run the same risk given the heavy US bias. At first glance, it seems like the US has one of the lowest average movie scores in the world, even compared to other large markets that fell below the global median. However, that could simply be because the US sample is likely the most representative of its overall movie industry (and thus its average score reflects its typical movie rather than just its best-known exports). We are only guessing without knowing the actual inclusion criteria, but there is a good chance that only relatively popular or well-known movies make it into the data set from outside the US, even from relatively well represented countries. Nonetheless, to present one possibly interesting finding (to be taken with a grain of salt given the aforementioned doubts about validity), among the relatively large and well represented countries, India and Spain seem to have relatively high average movie scores.



How do movie genres and ratings interact with each other?

To finish, we wanted to step away from our focus on movie scores and look more broadly at how movie characteristics themselves have changed over time, particularly genre and rating. Below, we visualize the distribution of year given genre and rating.
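
One way such a plot could be built is with stacked histograms of release year, faceted by rating and filled by genre group (a sketch assuming the movies data frame and top_genres from earlier; the 5-year bin width is an assumption):

library(dplyr)
library(ggplot2)

movies %>%
  mutate(genre_group = ifelse(genre %in% top_genres, genre, "Other")) %>%
  ggplot(aes(x = year, fill = genre_group)) +
  geom_histogram(binwidth = 5, position = "stack") +
  facet_wrap(~ rating) +
  labs(x = "Release year", y = "Count of movies", fill = "Genre")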

Overall, at a high level, we again get confirmation that R-rated movies are the most common, with PG-13 second and PG third. We now also see that these three have remained the most popular (or at least most produced) across the years. However, it is interesting to note some of the trends and changes over time visible here. Specifically, PG movies have decreased in frequency, PG-13 movies have increased in frequency, and movies with R and G ratings increased in frequency between 1980 and 2000 but have declined somewhat since.

Now let’s dive deeper into genres for each rating. For R-rated movies, it looks like the Comedy, Drama, and Action genres are the most common (and roughly equally common) across the years. For PG-13 movies, it looks like the Comedy, Action, and Drama genres are again the most common. Within this PG-13 rating category, however, we see that the Comedy genre has decreased in frequency recently, while the Action genre has increased over that time (the frequency of the Drama genre has stayed relatively stable over the years). This could be an interesting takeaway about the changing nature of PG-13 movies. For PG movies, Comedy used to be the most common genre, but in recent years “Other” genres (i.e. genres outside of Action, Adventure, Biography, Comedy, Crime, or Drama) have begun to dominate. This is another interesting turn. With G rated movies, Other genres have dominated across the 4 decades.

Taken overall, it looks like there are two big trends. One is that comedy movies have really declined in popularity recently among PG and PG-13 movie producers; instead, these producers are churning out more action movies (for PG-13) and less-represented genres (for PG). The other is that PG movies themselves are on the decline, while PG-13 grows in popularity.



Conclusion

For this project, our group was interested in what attributes make a good movie and how they have changed over time. Coming into this project, we naturally had our own expectations, and because of that, many of our findings were very intriguing.

Taking a very big picture view of movie quality, we saw a pretty clear increasing trend in movie scores over time before an intriguing recent reversal in that trend in the past half decade or so (which happens to coincide with the rise in popularity of streaming).

With that baseline set, we tried to examine which movie characteristics impact score, and we were surprised by some of the results with genre. Among the top genres, we found biography movies to score the highest and to be very consistent in quality. On the other hand, more popular genres, like action, comedy, and adventure, scored relatively poorly and produced a wide array of movies of vastly differing quality. Thinking back to our expectations, we definitely did not expect comedy movies to score so low, nor action movies to score so poorly back in the day. In terms of rating, we found R-rated movies to be the most common. More interestingly, we found that R-rated movies tended to be of very similar quality to PG-13 and PG movies despite usually having much smaller budgets. Continuing on the theme of financial success, we saw that a high score or a big budget alone didn’t always lead to big box office success. However, high scoring movies did tend to outperform lower scoring movies with a similar budget, so, reassuringly, quality does have some impact on financial success! Finally, our analysis of movies and their country of origin produced very surprising results: large markets tended to produce lower-scoring movies than smaller markets such as Russia and Argentina. However, after diving deeper, we found sample size issues that threatened the validity of these results.

Lastly, we looked beyond movie scores and at how movie characteristics themselves have evolved. There we saw a declining trend in PG movies and an uptick in PG-13 movies. On the genre side of things, there is an interesting recent shift away from Comedy movies.

Overall, our data set was really intriguing and helped us realize that with large data sets like this, it can be hard to find very apparent trends and differences. However, it was very insightful being able to find slight changes over time on a subject that most people around the world interact with.

In terms of next steps, we would like to work with a more thorough data set. As became apparent in our study of country of origin, there are implicit biases in this data set because only a rather small subset of movies is selected for inclusion. Having a broader set of movies to work with might give us more valid, and hopefully more interesting, results.