Introduction
The dataset that we are exploring contains information about hotel bookings from 2013 until late 2017. It has 32 columns and 119,390 rows, and each observation corresponds to one hotel booking. More specifically, the data covers bookings for a city hotel and a resort hotel. It includes detailed characteristics of each booking, such as arrival date, number of days booked before arrival, and special requests. It also includes some information about the guests, including their demographics.
For our project, we decided to explore different factors that contribute to the average daily rate (ADR) of a hotel booking. Our exploration can add value for people who are looking for ways to book hotels at a lower rate. More specifically, we will focus on these three research questions: (1) What relationship is there between the time of a booking and its rate, and does that relationship change with hotel type? (2) How do travelers’ demographics relate to the rate of a booking? (3) What characteristics of a booking relate to its rate?
Body
First, we want to explore the relationship between the time of a booking and its average daily rate. Thus, we will examine variables including arrival time, lead time, average daily rate, and hotel type.
Interpretation: From June 2015 onward, the average daily rate within each year peaks around April and May, a few months before the peak travel period we found in our other investigations. Another notable takeaway is that resort hotel rates are more volatile than city hotel rates. We found that it is cheapest to book hotels around December and January, so travelers can get an even better deal if they book resort hotels at this time.
The above graph shows a seasonal decomposition of the average daily rate by booking date, split by hotel type. The trend lines show that resort hotels have larger cyclical changes in price: in the middle of the year, their average daily rate is higher than that of city hotels, but at the beginning and end of the year it is lower. The seasonal curve also fluctuates more for resort hotels than for city hotels. This suggests that seasonal changes affect resort hotel prices more than city hotel prices, likely because demand and prices are higher in the warmer seasons (late spring and summer) than in the colder seasons, which creates dramatic price changes.
Graph: Autocorrelation Plots of Resort and City Hotels
We continue to examine how hotel prices change over time by looking at autocorrelation plots.
Using booking date, the above graph shows the autocorrelation plots for average daily rate, split by hotel type. The most notable feature is that the autocorrelations for city hotels persist longer than those for resort hotels: the autocorrelation switches from positive to negative at a later lag for city hotels than for resort hotels. Because a larger share of months have weather that is not well suited to resort stays, resort hotels have longer stretches of lower prices, which show up as negative autocorrelations. Overall, while there seems to be one switch in autocorrelation sign for city hotels, the multiple switches at shorter lag intervals for resort hotels suggest that seasonality plays a larger role in determining the average daily rate for these hotels.
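To make the sign switches discussed above concrete, here is a minimal sketch of how a sample autocorrelation and the first lag at which it turns negative could be computed. It is not the code behind our plots. The series `monthly_adr` is synthetic (a sine wave with a 12-month period), and the function names are hypothetical.

```python
import math
from statistics import fmean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom


def first_negative_lag(series, max_lag):
    """Smallest lag in 1..max_lag with negative autocorrelation, else None."""
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) < 0:
            return lag
    return None


# Synthetic example only: 36 months of a rate with a 12-month cycle.
monthly_adr = [100 + 30 * math.sin(2 * math.pi * m / 12) for m in range(36)]

if __name__ == "__main__":
    for lag in (0, 6, 12):
        print(lag, round(autocorrelation(monthly_adr, lag), 3))
    print("first negative lag:", first_negative_lag(monthly_adr, 12))
```

A series whose autocorrelation turns negative at a short lag and then flips sign repeatedly behaves like the resort hotel curve described above, while a later first sign change corresponds to the city hotel pattern.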
Interpretation: This scatter plot explores the relationship between the number of days booked ahead and the average daily rate of a booking. For city hotels, booking ahead might be a good strategy, because a larger number of days booked ahead is associated with a lower average daily rate. However, it does not seem to be a good strategy for resort hotels, where there is a slightly positive linear relationship.
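The direction of each linear relationship can be checked by fitting a least-squares line per hotel type and reading the sign of the slope. The sketch below is illustrative only: the field names `hotel`, `lead_time`, and `adr` are assumed, and the toy bookings are invented to show the mechanics, not taken from our data.

```python
from collections import defaultdict
from statistics import fmean


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_by_hotel(bookings):
    """Map each hotel type to the slope of ADR against lead time."""
    groups = defaultdict(lambda: ([], []))
    for b in bookings:
        xs, ys = groups[b["hotel"]]
        xs.append(b["lead_time"])
        ys.append(b["adr"])
    return {hotel: ols_slope(xs, ys) for hotel, (xs, ys) in groups.items()}


# Invented toy bookings (assumed field names), for illustration only.
toy_bookings = [
    {"hotel": "City Hotel", "lead_time": 0, "adr": 120.0},
    {"hotel": "City Hotel", "lead_time": 10, "adr": 110.0},
    {"hotel": "City Hotel", "lead_time": 20, "adr": 100.0},
    {"hotel": "Resort Hotel", "lead_time": 0, "adr": 80.0},
    {"hotel": "Resort Hotel", "lead_time": 10, "adr": 85.0},
    {"hotel": "Resort Hotel", "lead_time": 20, "adr": 90.0},
]

toy_slopes = slope_by_hotel(toy_bookings)

if __name__ == "__main__":
    for hotel, slope in toy_slopes.items():
        print(hotel, slope)
```

A negative slope means the rate falls as lead time grows, and a positive slope means it rises. The slope describes an association only, so it does not by itself show that booking earlier causes a lower rate.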
Moving forward, we were interested in whether the traveler’s demographics (the market segment of the reservation and the traveler’s country of origin) are related to the average daily rate the visitor would have to pay.
The choropleth map below shows the mean average daily rate for each country on a world map, so that noteworthy differences between countries stand out. We can see that the mean daily rate differs by country. On average, rates are lowest for African countries.
## [1] "country" "mean_adr" "long" "lat" "group" "order"
## [7] "subregion"
We also want to look for relationships between the market segment and the average daily rate one might expect in each segment. To look into this, we can use boxplots. By plotting one box-and-whisker diagram per segment against the ADR, we can quickly see whether there are notable differences in average daily rate across market segments. The results are shown here. We can see that, on average, bookings in the Complementary segment have the lowest average daily rate.
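Each box in such a plot summarizes one segment by its quartiles and extremes. The sketch below computes that five-number summary per segment as a rough illustration. The field names `market_segment` and `adr` are assumed, the segment labels and values are invented, and the quartile convention used here (the "inclusive" method) may differ slightly from the one a plotting library applies.

```python
from collections import defaultdict
from statistics import quantiles


def five_number_summary(values):
    """Min, quartiles, and max of a list with at least two values."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute quartiles")
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


def summary_by_segment(bookings):
    """Five-number summary of ADR for each market segment."""
    groups = defaultdict(list)
    for b in bookings:
        groups[b["market_segment"]].append(b["adr"])
    return {segment: five_number_summary(rates)
            for segment, rates in groups.items()}


# Invented values with placeholder segment labels, for illustration only.
toy_segments = (
    [{"market_segment": "Segment A", "adr": v} for v in (10, 20, 30, 40, 50)]
    + [{"market_segment": "Segment B", "adr": v} for v in (0, 0, 5, 10, 15)]
)

toy_summary = summary_by_segment(toy_segments)

if __name__ == "__main__":
    for segment, stats in toy_summary.items():
        print(segment, stats)
```

Comparing the medians and the spread between `q1` and `q3` across segments gives the same reading as comparing the boxes by eye.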
Next, we can look at the factors that can affect the price of a hotel booking. First, we will see if the number of special requests made and the meal option chosen for the reservation correlate with the price of the booking.
Based on the boxplot above, there seems to be a slight positive correlation between the average daily rate and the number of special requests made: bookings with more special requests tend to have a higher average daily rate. In the plot, we can also see that bookings with “FB” (“Full-Board”) meals are on average the most expensive, bookings with “HB” (“Half-Board”) meals are the next most expensive, and bookings with “BB” (“Bed and Breakfast”) meals are the cheapest. This aligns with our hypothesis that more special requests and a more complete meal package correlate with a higher average daily rate.
Now, we focus on other, less common factors that could affect the price of a hotel reservation. Here, we examine whether the number of booking changes and whether the guest has stayed at the hotel before affect the booking price at the city hotel and the resort hotel.
Based on the plot, we can see a positive correlation between the number of booking changes and price: overall, the more often a booking is modified, the higher its price tends to be. Additionally, repeated guests overall seem to have lower booking prices than first-time guests. However, for the resort hotel, there is a sharp increase in price as the number of booking changes increases for repeated guests. Because the dataset has few observations in that range, this sudden spike could be attributed to random noise.
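One way to judge whether such a spike rests on too few observations is to report the number of bookings behind each group mean. The sketch below does this under stated assumptions: the field names `hotel`, `is_repeated_guest`, `booking_changes`, and `adr` are assumed, the threshold `MIN_BOOKINGS` is an arbitrary illustrative choice, and the toy rows are invented.

```python
from collections import defaultdict

MIN_BOOKINGS = 30  # arbitrary illustrative threshold, not from our analysis


def group_means_with_counts(bookings, min_count=MIN_BOOKINGS):
    """Mean ADR per (hotel, repeated guest, booking changes) group.

    Each result also carries the group size and a flag marking groups
    with fewer than `min_count` bookings.
    """
    cells = defaultdict(list)
    for b in bookings:
        key = (b["hotel"], b["is_repeated_guest"], b["booking_changes"])
        cells[key].append(b["adr"])
    return {
        key: {
            "mean_adr": sum(rates) / len(rates),
            "n": len(rates),
            "sparse": len(rates) < min_count,
        }
        for key, rates in cells.items()
    }


# Invented toy rows (assumed field names), for illustration only.
toy_rows = (
    [{"hotel": "Resort Hotel", "is_repeated_guest": 0,
      "booking_changes": 0, "adr": 100.0}] * 40
    + [{"hotel": "Resort Hotel", "is_repeated_guest": 1,
        "booking_changes": 5, "adr": 250.0}] * 2
)

toy_groups = group_means_with_counts(toy_rows)

if __name__ == "__main__":
    for key, stats in sorted(toy_groups.items()):
        print(key, stats)
```

A group flagged as sparse can still be plotted, but its mean should be read with caution, since one or two unusual bookings can move it substantially.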
In conclusion, our group has explored relationships between various factors and the average daily rate of a booking. More specifically, we looked closely at the time of booking, the number of days booked ahead of arrival, the travelers’ country of origin, the travelers’ market segment, the number of special requests made, the meal option chosen, the number of booking changes, and whether a traveler is a repeated guest. From our graphs and analysis, each of these factors appears to be related to the average daily rate of a booking. For example, for a traveler who books city hotels, a potentially optimal strategy is to book around December and January, book as early as possible, make as few special requests as possible, be a repeated guest, and not change the booking.
Because there were gaps in the dataset, some irregularities that deviate from our conclusions could be attributed to random noise. With further research into how to distinguish random fluctuations from meaningful patterns in large datasets, we could explain these irregularities better. Overall, meal type, number of special requests, number of booking changes, and whether the guest has booked the hotel before all appear to be related to the price of a booking.