The data we work with is a dataset of hotel reviews. It consists of roughly 515,000 customer reviews of 1,493 luxury hotels in Europe. To make the data more workable, we use a random subsample of 10,000 reviews, which covers only 1,310 of the hotels. For each review, we have the positive feedback, the negative feedback, characteristics of the hotel (hotel name, hotel address, the longitude and latitude of the hotel, and the average score for the hotel), and characteristics of the review (reviewer nationality, review date, and the reviewer score). For each type of feedback, we are also given word counts. Finally, there are columns for aggregate counts: the total number of reviews, the total number of ratings (without reviews), and the total number of reviews written by the reviewer. The provided text had already been preprocessed by removing Unicode characters and converting all text to lowercase.
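As a rough illustration of the two preparation steps described above (text cleaning and random subsampling), here is a minimal Python sketch. It assumes that "removing Unicode" means dropping non-ASCII characters, which the source does not spell out. The function names `clean_text` and `subsample` and the seed value are hypothetical and are not taken from the original analysis.

```python
import random
import unicodedata


def clean_text(text: str) -> str:
    """Drop non-ASCII characters and lowercase the result (assumed cleaning rule)."""
    # NFKD splits an accented letter into a base letter plus a combining mark,
    # so the base letter survives when the non-ASCII mark is discarded.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return ascii_only.lower()


def subsample(rows, n, seed=0):
    """Return n rows drawn without replacement, reproducibly for a given seed."""
    rows = list(rows)
    if n >= len(rows):
        return rows
    rng = random.Random(seed)  # local generator, so the global RNG state is untouched
    return rng.sample(rows, n)


if __name__ == "__main__":
    # Toy rows, not taken from the dataset.
    reviews = ["Great LOCATION", "Café was très bon", "Room too small"]
    print([clean_text(r) for r in reviews])
    print(subsample(range(100), 5, seed=42))
```

Fixing the seed makes the 10,000-review subsample reproducible, which matters here because the set of hotels covered depends on which reviews are drawn.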
In this analysis, we are interested in the factors that influence hotel reviews. We examine two broad questions:
+ What are the factors associated with differences in average scores for hotels?
+ What are the factors associated with differences in individual reviews?
Within these broader questions, we take deeper dives into narrower subquestions, such as text analysis and geographic analysis.
In this section, we look at the factors associated with differences in average scores. We start with how aggregate features relate to a hotel's average score, for example by comparing the total number of ratings a hotel receives with its average score. Popular hotels may be more likely to receive a high number of reviews and also have good scores.
From the plot of average score against the total number of reviews, together with the fitted LOESS curve, we can see that the curve is very flat apart from a region driven by outliers at very high numbers of reviews. This suggests there is not much of a relationship between the total number of reviews and the average score.
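To make the idea of a locally fitted curve concrete, the following is a simplified sketch of a local linear smoother with tricube weights, in the spirit of LOESS. It is not the routine that produced the plot discussed above: the function name `local_linear_smooth`, the `frac` parameter, and the toy data are assumptions for illustration, and real LOESS implementations add refinements such as robustness iterations.

```python
def local_linear_smooth(xs, ys, frac=0.5):
    """Fit a weighted straight line around each x and return the fitted values."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbours used per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance to the k-th nearest neighbour of x0.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        # Tricube weights: close points count most, points beyond h count zero.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


if __name__ == "__main__":
    # Toy data: review counts (in thousands) and average scores, made up for illustration.
    n_reviews = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0, 9.0]
    avg_score = [8.3, 8.5, 8.4, 8.4, 8.5, 8.3, 8.4, 8.1]
    print([round(v, 2) for v in local_linear_smooth(n_reviews, avg_score)])
```

A nearly constant sequence of fitted values corresponds to the flat curve described above. Sparse regions at the extremes of the x-axis are fitted from few points, which is why the tail of such a curve should be read with caution.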
Another factor that may influence average scores is geography. Perhaps hotels in certain regions consistently receive higher scores than those in others.
From the choropleth, we can see that Austria and Spain are on the higher end for the average score of hotels within their borders, whereas the United Kingdom and the Netherlands are on the lower end. That being said, we likely cannot extract any generalizable patterns. From this map, it is hard to say whether there is any clear geographic clustering that would explain the results.
There is, however, a complementary pattern we can look for. We can look at which countries the most reviewers come from, to understand where this data is being generated. This can give us insight into the possible priorities of the people writing the reviews. The UK completely dominated the initial plot, so we remove it from the final plot.
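The counting step behind such a plot can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `nationality_counts`, the `exclude` parameter, and the toy list of nationalities are hypothetical, and the exact spelling of country names in the dataset is not known from the text.

```python
from collections import Counter


def nationality_counts(nationalities, exclude=()):
    """Count reviewers per nationality, skipping blanks and excluded countries."""
    skip = set(exclude)
    cleaned = (name.strip() for name in nationalities)
    return Counter(name for name in cleaned if name and name not in skip)


if __name__ == "__main__":
    # Toy values, not taken from the dataset.
    sample = [" United Kingdom ", "United Kingdom", "Ireland", "Australia",
              "United Kingdom", "Ireland", ""]
    print(nationality_counts(sample).most_common(3))
    # Excluding the dominant country lets the remaining ones be compared.
    print(nationality_counts(sample, exclude=["United Kingdom"]).most_common(3))
```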
We can see that a lot of the data is driven by traditionally “Western” countries including the UK, the USA, Australia, Ireland, and Switzerland. There is also an interesting core of reviews from the UAE and Saudi Arabia. This makes sense since our data is mostly for luxury hotels in Europe.
Do more reviews mean more people were unhappy with their stay at the hotel? In our personal experience, people mostly leave reviews when they were not satisfied with their stay somewhere, or did not like the service at a restaurant, and so on. To answer this question, we constructed the plot below.
This plot shows the average score of each hotel (which includes scores from guests who did not leave a review) against the percentage of scorers who left a review for that hotel. Separating the hotels by country makes the plot easier to read, and the differing regression lines suggest that the answer to this question depends on the country the hotel is in. It is worth noting that in several countries (such as Spain, France, and Italy), the average score appears to decrease as the percentage of scorers who leave a review increases.
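The quantities in this comparison can be sketched as follows. This is a hedged illustration, not the original computation: it assumes the percentage of scorers leaving a review is the number of reviews divided by the sum of reviews and ratings without reviews, and the names `review_share`, `ols_slope`, and `slope_by_country`, as well as the toy rows, are hypothetical.

```python
from collections import defaultdict


def review_share(n_reviews, n_ratings_only):
    """Percentage of scorers who also wrote a review (assumed definition)."""
    return 100.0 * n_reviews / (n_reviews + n_ratings_only)


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_by_country(hotels):
    """hotels: iterable of (country, n_reviews, n_ratings_only, avg_score)."""
    groups = defaultdict(lambda: ([], []))
    for country, n_reviews, n_ratings_only, avg_score in hotels:
        xs, ys = groups[country]
        xs.append(review_share(n_reviews, n_ratings_only))
        ys.append(avg_score)
    return {country: ols_slope(xs, ys) for country, (xs, ys) in groups.items()}


if __name__ == "__main__":
    # Toy rows for two placeholder countries, made up for illustration.
    toy = [("A", 50, 50, 9.0), ("A", 75, 25, 8.5),
           ("B", 20, 80, 8.0), ("B", 40, 60, 8.4)]
    print(slope_by_country(toy))
```

A negative slope for a country corresponds to the pattern noted above, where a higher share of written reviews goes with a lower average score. The slope alone says nothing about how strong or reliable that pattern is.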
In this section, we are more interested in the characteristics of an individual review and how they are associated with the score given. One factor we can examine is the relationship between the number of positive and negative words in a review, and how that relationship relates to the reviewer score.
From the plot, we can see a clear delineation between reviewer scores based on the number of positive and negative words in the review. The bottom part of the plot, with fewer positive words, contains most of the red points. Similarly, the left side of the plot, the part with fewer negative words, contains most of the blue points, which have the higher reviewer scores. Hence, we can infer that reviewer scores are closely associated with the relative number of positive and negative words in a review.
We can also use the text of the positive feedback and the text of the negative feedback to generate insights about what customers find positive versus negative in their hotel experience.
The comparison word cloud for positive and negative reviews gives us a lot of insight into the factors that matter most to how a hotel is perceived. On the positive side, the most prominent factors are the location, the staff, and the comfort. On the negative side, “nothing” is notably the most common term, which makes sense because these are reviews of luxury hotels that were generally given good scores in the dataset. However, things that did lead to complaints included rooms being small or the stay being expensive.
To better understand which words are actually used most frequently in the positive and negative reviews, we can use natural language processing to extract bigrams (pairs of adjacent words) from both the positive and negative reviews. The distribution of the bigrams helps gauge which pairs of words are most associated with the review scores of the hotels.
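Bigram extraction itself is simple enough to sketch with the Python standard library. The tokenization rule below (lowercase runs of letters and apostrophes) and the function names `bigrams` and `top_bigrams` are assumptions for illustration. The original analysis may have used a dedicated NLP library and may have removed stop words, which this sketch does not do.

```python
import re
from collections import Counter


def bigrams(text):
    """Return the list of adjacent word pairs in a piece of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return list(zip(tokens, tokens[1:]))


def top_bigrams(reviews, k=10):
    """Count bigrams across many reviews and return the k most common."""
    counts = Counter()
    for review in reviews:
        counts.update(bigrams(review))
    return counts.most_common(k)


if __name__ == "__main__":
    # Toy reviews, not taken from the dataset.
    positive = [
        "the bed was very comfortable",
        "very comfortable bed and friendly staff",
        "friendly staff and great location",
    ]
    for pair, count in top_bigrams(positive, k=5):
        print(" ".join(pair), count)
```

Counting within each review, rather than over one concatenated string, avoids creating spurious bigrams that span the end of one review and the start of the next.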
Looking at the distribution of the positive bigrams, we can make out the characteristics that play the biggest role in making a reviewer’s stay at a hotel a good one. The comfort of the beds, the politeness of the staff, and the location of the hotel all play a major role in determining whether a stay is positive. Similarly, the negative bigram distribution shows the factors that annoy people the most during their stays. Some of the top negative factors include bad smells, faulty air conditioners, and extra charges. Using this data, hotels can identify the areas that need immediate attention and improve their chances of getting higher ratings.
Finally, the last factor we examine is time, as there may be seasonal cycles or certain periods associated with higher review scores. We look at a seasonal decomposition of the daily average score, that is, the average score of all reviews on a given day.
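For readers unfamiliar with the technique, here is a minimal sketch of a classical additive decomposition, which splits a series into trend, seasonal, and residual parts. It is not necessarily the method used for the plot discussed here: the function name `decompose_additive`, the choice of an additive model, and the `period` parameter are assumptions, and the toy series is made up.

```python
def decompose_additive(series, period):
    """Split a series into trend, seasonal, and residual components.

    Trend is a centred moving average, so it is None at both ends.
    """
    n = len(series)
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 1:
            trend[t] = sum(window) / period
        else:
            # Even period: half weight on the two end points keeps the average centred.
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    # Seasonal effect: mean detrended value at each position within the period.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    raw = [sum(b) / len(b) for b in buckets]
    centre = sum(raw) / period  # force the seasonal effects to average to zero
    seasonal = [raw[t % period] - centre for t in range(n)]
    resid = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
             for t in range(n)]
    return trend, seasonal, resid


if __name__ == "__main__":
    # Toy series: a constant level plus a repeating three-step pattern.
    pattern = [0.2, -0.1, -0.1]
    scores = [8.0 + pattern[t % 3] for t in range(12)]
    trend, seasonal, resid = decompose_additive(scores, period=3)
    print([round(s, 2) for s in seasonal[:3]])
```

The size of the seasonal component relative to the residual is what indicates whether a seasonal pattern is meaningful. A seasonal swing that is small compared with the day-to-day noise is weak evidence of a real cycle.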
From the seasonal decomposition of the time series, we can see some structure in the seasonal component. The seasonal deviation seems larger in months that typically have more tourism, such as the winter holiday months and the summer months. That being said, the difference is not large.
Overall, based on the various statistical analyses and visualizations we conducted on this data, we were able to come to a few conclusions to answer our questions. We initially wanted to examine the reasons behind differences in average score for hotels as well as the factors influencing differences in individual reviews.
When looking at the differences in average scores for hotels, we see that there is not a strong relationship between the number of reviews and the average score, but there may be a relationship between the percentage of guests leaving reviews and the average score. We also looked at time series plots to see whether the average score differed on a seasonal basis; besides some deviation in the summer and winter months, there was nothing too noticeable. Text analyses, however, showed us that location, staff, and comfort were the most important factors for positive reviews and that “small” and “expensive” were contributing factors to negative reviews, although not as strongly.