Introduction

We are analyzing the Google Play Store Apps dataset, which contains information about 10 thousand apps on the Google Play Store from 2 years ago. We looked at 10 variables from this dataset, 7 of which are categorical and 3 of which are continuous. The 7 categorical variables are the name, category, number of installs sorted into buckets, type, price, content rating, and genre of the app. The 3 continuous variables are the number of reviews, average rating, and size of the app. We wish to use this dataset to explore a few research questions:

Question 1 - What are some of the words that appear frequently for apps with different ratings? What other factors affect the ratings of an app?

Question 2 - What factors contribute to the popularity of an app (number of installs)?

Question 3 - What are the most common words that appear on names of an app?

Question 4 - Do the size of an app and the category of an app have a significant impact on the price?
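Several of these questions involve the number of installs, which the dataset records only as coarse buckets. Assuming the raw file stores these buckets as strings such as "1,000+" (an assumption about the raw format; our cleaned data may differ), a minimal parsing sketch:

```python
def parse_installs(raw: str) -> int:
    """Convert an install-count bucket string like '10,000+' to an int.

    Assumes the raw dataset stores installs as comma-separated digit
    strings with a trailing '+' (e.g. '1,000+', '10,000,000+').
    """
    return int(raw.rstrip("+").replace(",", ""))

buckets = ["1,000+", "100,000+", "10,000,000+"]
print([parse_installs(b) for b in buckets])  # -> [1000, 100000, 10000000]
```

The resulting integers are still bucket lower bounds, not exact install counts, so we treat installs as a categorical variable throughout.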

Plot 1 - Box Plot of Ratings for Each Category and Type

We wanted to learn about Research Question 1, which suggests we should examine Category, Rating, and Type (Free or Paid).

From the box plot above, we see that the mean tends to lie just above Rating = 4, and the distributions seem to be similar across categories. We do notice quite a few possible outliers at low values of Rating for all categories and types, and the paid apps tend to have slightly fewer outliers.

For the Community category, the mean Rating seems to be about the same for both Free and Paid apps, at around 4.2. The box is slightly bigger for Paid apps, which implies a larger interquartile range. For the Fun category, the mean Rating is approximately 4.3 for Free apps and 4.4 for Paid apps. The boxes are almost identical in size, and we observe very similar distributions, with Free apps having a slightly larger variance. For Interest apps, we see a trend very similar to the Fun category, with a mean Rating of 4.3 for Free apps and 4.4 for Paid apps. The boxes are similar in size, which suggests similar interquartile ranges. For Money apps, the mean Rating for Paid apps is lower than for Free apps, which differs from the other categories and is therefore worth noting. The mean Rating lies around 4.3 for Free apps and 4.2 for Paid apps, and the interquartile range is larger for Paid apps, as the box is almost twice the height. For the last category, Utility, the mean Rating is around 4.3 for Free apps and 4.4 for Paid apps. The interquartile ranges and variances seem quite similar, as the boxes are of similar size.

Overall, all the boxes overlap each other, and it appears that the mean Rating is not significantly impacted by Category or Type. It is also important to note that there are not many low-rated apps.

Plot 2 - Word Cloud of words in App names

We wanted to learn about Research Question 3, which suggests we should examine the text of the app names.

Word Cloud of most frequent words in App names

From the word cloud, we see that the most common word is “free”. This makes sense, since promoting that an app is free appeals to many users. Other common words include game, app, mobil, video, photo, live, chat, shop, etc. (some words appear in stemmed form). Many of these are simple capabilities of smartphones. These apps are for smartphones, so it makes sense that many of them relate to smartphone functionalities like photo, video, camera, and chat. As we move further from the center of the cloud, we see some words that are less related to smartphones and technology, like war, zombi, and hero. Still, most of the words are related to the internet and mobile devices. The most common words read almost like categories for apps.
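The word frequencies behind such a cloud can be sketched with a simple counter. This sketch uses plain lowercasing and splitting rather than the stemming our word cloud applied, and the sample names are hypothetical, for illustration only:

```python
from collections import Counter
import re

def word_frequencies(names):
    """Count lowercase word occurrences across a list of app names."""
    counts = Counter()
    for name in names:
        counts.update(re.findall(r"[a-z]+", name.lower()))
    return counts

# Hypothetical sample of app names, for illustration only.
sample = ["Free Photo Editor", "Photo Collage Free", "Live Chat App"]
freq = word_frequencies(sample)
print(freq.most_common(2))  # -> [('free', 2), ('photo', 2)]
```

A word cloud library then simply scales each word's font size by its count in this frequency table.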

Plot 3 - Heatmap of Category vs Installs

We wanted to learn if there is a correlation between how popular an app is and the quality of the app, and if this correlation changes between app categories (Question 2). So, we made a heatmap with the x axis being the app category for the 15 most popular categories and the y axis being the number of installs, a measure of the popularity of the app. Each cell is colored by the average rating of the apps in that cell, a measure of how good those apps are.

For apps with more than 100,000 installs, there does not appear to be a strong correlation between the number of installs an app has and the average rating of that app. However, for cells representing few installs, there are certain cells with very low average ratings and others with very high average ratings. This makes sense, as these apps have fewer reviews on average due to their lower number of installs, so there is much more variance in their ratings. The categories with the lowest overall ratings appear to be medical, communication, and then photography. There does not appear to be a clear category with the best overall ratings.
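The per-cell aggregation behind this heatmap can be sketched as a grouped average. The tuple layout and sample rows below are illustrative assumptions, not the dataset's actual schema:

```python
from collections import defaultdict

def mean_rating_by_cell(rows):
    """Average the rating within each (category, installs-bucket) cell.

    `rows` is an iterable of (category, installs_bucket, rating) tuples;
    these field names are placeholders for illustration.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, installs, rating in rows:
        sums[(category, installs)] += rating
        counts[(category, installs)] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

rows = [("FUN", "10,000+", 4.0), ("FUN", "10,000+", 5.0),
        ("UTILITY", "1,000+", 3.5)]
print(mean_rating_by_cell(rows))
# -> {('FUN', '10,000+'): 4.5, ('UTILITY', '1,000+'): 3.5}
```

Each resulting value is what colors one cell of the category-by-installs grid.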

Plot 4 - Scatterplot of log of Reviews vs Ratings Colored by Content Rating

We wanted to see if the number of reviews was connected to the rating of an app, and if this connection changed with the content rating of the app (Question 1). To determine this, we made a scatter plot with the log of reviews on the x axis (as the range of review counts is extremely large) and the mean rating of an app on the y axis. Additionally, the points in the scatterplot are colored by the content rating of the app.

The most impactful conclusion from this graph is that, on average, the rating of an app increases as its number of reviews increases. Additionally, the variance of the ratings among apps with a given number of reviews decreases as the number of reviews increases. Intuitively this makes sense: more reviews means less variance in an individual app's rating, which results in less variance across all the apps with that number of reviews. Based on the coloring of the graph, we can see that the vast majority of apps are rated for Everyone, and that apps rated Everyone 10+ have slightly higher ratings on average across all numbers of reviews. The relationship between reviews and ratings appears to be essentially the same for all the other content ratings.
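The x-axis transform used here can be sketched in a few lines; base 10 is assumed, though any log base gives the same qualitative compression of the huge review-count range:

```python
import math

def log_reviews(review_counts):
    """Apply a base-10 log to review counts to compress their range.

    Assumes every count is positive; zero-review apps would need to be
    dropped or offset before the transform.
    """
    return [math.log10(n) for n in review_counts]

print(log_reviews([10, 1_000, 1_000_000]))  # -> [1.0, 3.0, 6.0]
```

Without this transform, the handful of apps with tens of millions of reviews would compress the rest of the points into a sliver at the left edge of the plot.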

Plot 5 - Word Cloud of Translated Review

A wordcloud was created to answer the research question on what words appeared most frequently for apps of different ratings (Question 1).

Wordcloud of reviews for apps with different ratings (3, 4, 5); ratings were initially continuous, so they were rounded.

This wordcloud analyzes frequent words for each category of rating. Initially, the Ratings variable was continuous and ranged from 2.6 to 5. These values were categorized by rounding to the nearest whole number, yielding the categories 3, 4, and 5. Interestingly, for Rating = 3 there seem to be many words related to online dating applications, such as date, chat, match, hook. It turns out that while there were more than 10,000 and 50,000 reviews for Rating = 4 and Rating = 5, respectively, there were only 700 reviews for Rating = 3, 200 of which were related to dating apps. This explains why there were so many dating app-related words in the wordcloud. For Rating = 4, the most frequent words are game, play, like. In general, the words seem positive, alongside some words requesting updates and (bug) fixes. For Rating = 5, the most frequent words are game, like, love, good, great. Overall, these are very positive words, which is expected in reviews of an app with high ratings. It was interesting that the most frequent word was game for both Rating = 4 and Rating = 5.
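The rounding step described above can be sketched as follows. Explicit half-up rounding is used here because Python's built-in round() uses banker's rounding (round(3.5) gives 4 but round(4.5) gives 4), which may or may not match what our plotting pipeline actually did:

```python
def rating_bucket(rating: float) -> int:
    """Round a continuous rating (2.6-5.0) to the nearest whole star.

    Uses half-up rounding via int(x + 0.5), so 3.5 maps to 4; Python's
    built-in round() would apply banker's rounding instead.
    """
    return int(rating + 0.5)

print([rating_bucket(r) for r in [2.6, 3.4, 3.5, 4.8]])  # -> [3, 3, 4, 5]
```

Applied over the 2.6-to-5 range of the Ratings variable, this produces exactly the three buckets (3, 4, 5) used for the wordclouds.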

Plot 6 - Histogram of Distribution of Size of Mobile Apps given App Category, Faceted by Number of Installations

A histogram was created to see the distribution of app size and installs, faceted by category (Question 2). The question in mind when creating this histogram was whether the size of an app, which may be correlated with the quality of contents of the app, had a relationship to the number of installations of the app, for each category. The initial hypothesis was that an app that has greater size would have more installs because a larger app size probably meant that the app had more content and better quality.

The histogram shows that this is not necessarily the case. In general, most apps had sizes of less than 50 megabytes. For Category = INTEREST, we see a decrease in the number of installs for Size < 10, somewhat of an increase in the number of installs for Size < 50, and an obvious increase in the number of installs for Size < 100. This seems to correspond with the initial hypothesis. Category = FUN had the fewest apps with size <= 1, possibly because FUN contains categories such as games, which probably have app sizes greater than just 1 megabyte. The increasing trend in installs for Size < 50 and Size < 100 was also observed for Category = FUN. Most of the downloaded apps in Categories MONEY and UTILITY had Size < 50. For MONEY, again we see a decrease in installs for Size < 10 and an increase in installs for Size < 50. These trends seem to support the hypothesis that a larger app size generally leads to a larger number of installs for apps in the INTEREST, FUN, and MONEY categories.
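The size buckets discussed above can be sketched with a threshold lookup. The cut-points (1, 10, 50, 100 MB) follow the values mentioned in the discussion, but the exact edge handling is an illustrative assumption:

```python
import bisect

# Size thresholds in megabytes, matching the cut-points discussed above;
# the exact edge handling (<= vs <) is an illustrative assumption.
EDGES = [1, 10, 50, 100]
LABELS = ["<=1 MB", "<=10 MB", "<=50 MB", "<=100 MB", ">100 MB"]

def size_bucket(size_mb: float) -> str:
    """Assign an app size in megabytes to one of the coarse buckets."""
    return LABELS[bisect.bisect_left(EDGES, size_mb)]

print([size_bucket(s) for s in [0.5, 8, 45, 99, 250]])
# -> ['<=1 MB', '<=10 MB', '<=50 MB', '<=100 MB', '>100 MB']
```

Binning sizes this way, and then faceting the counts by install bucket and category, yields the histogram panels described above.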

Plot 7 - Scatter Plot Comparing Price and Size of Google Play Store Apps by Category

We wanted to learn about Research Question 4, which suggests we should examine Category, Size (KB), and Price (USD).

From this plot, one of the main takeaways is that the majority of apps are under $10 USD and under around 50,000 KB. While we wouldn't say that most ‘UTILITY’ category apps have larger sizes, apps in that category do make up most of the apps above the 50,000 KB line. Furthermore, these and ‘INTEREST’ apps seem to be priced higher than other categories, on average. Also, ‘FUN’ apps tend to be under $20 and 50,000 KB. Overall, however, there does not seem to be a correlation between the size and the price of an app.

Conclusion

In this project, we tried to answer questions regarding what words were most frequently used in app titles and app user reviews, and how different aspects of an app are related to its success, i.e., the number of installations and user ratings of the app. We concluded from the boxplot that the type of an app, whether free or paid, did not seem to affect its ratings. The wordclouds helped visualize the most frequent words for apps with different ratings. As anticipated, apps with higher ratings had positive words such as great and love. Interestingly, apps with lower ratings (3 stars) had mixed reviews, and many of the words seemed to come from reviews of dating apps. The heatmap showed that there was no significant correlation between installs and ratings for apps with a high number of installs, but a high-variance relationship for apps with a lower number of installs. The scatter plot showed that the relationship between reviews and ratings appears to be essentially the same for all content ratings, except for Everyone 10+, which had a slightly higher average rating. Using the faceted histograms, we were able to conclude that app size had a positive relationship with the number of installations, and the scatterplot helped us conclude that there did not seem to be a correlation between the size and the price of an app. Finally, it is worth noting that because the Google Play Store dataset was so large and each feature had many different values, we had to manually coerce similar categories. This coercion may have introduced bias into our analysis, and there is still much to be explored with this data beyond what we have found. In the future, analyses using different coercions and features of the data may shine a light on different aspects of this data.