Motivation

The global wine industry is complex and continually evolving, influenced by an array of factors including region, price, variety, and individual taster preferences. Understanding how these elements interact offers valuable insights both for consumers who seek the best value and quality and for producers aiming to refine their strategies and highlight the strengths of their offerings. This project uses a dataset of approximately 130,000 wine reviews from WineEnthusiast, sourced from Kaggle, to examine how wine scores correlate with price, how taster biases and expertise shape quality assessments, and how wine characteristics vary across regions and varieties. By integrating statistical analysis, network visualization, and geospatial mapping, we aim to provide a data-driven narrative that illuminates global wine trends, informs consumer choices, and reveals pathways for industry innovation.

Dataset Description

As mentioned in the Motivation, the dataset is sourced from WineEnthusiast reviews and made available on Kaggle. It contains 129,971 reviews. Each observation (row) corresponds to a single wine review, and each review has 14 attributes (columns) that describe the wine’s origin, characteristics, and the reviewer’s evaluation. The variables are:

id (int): A unique integer identifier for each wine review.
country (chr): The country of origin for the wine (e.g., Italy, US, France).
description (chr): A detailed written review describing the wine’s characteristics, aromas, and flavors.
designation (chr): The specific vineyard or bottling information, if provided.
points (int): The reviewer’s score for the wine, typically on a scale from approximately 80 to 100.
price (int): The approximate retail price of the wine in USD.
province (chr): The specific province or state within the country where the wine is produced.
region_1 (chr): A more specific region within the province, if available (e.g., Napa Valley within California).
region_2 (chr): An even more granular subregion (often not populated).
taster_name (chr): The name of the WineEnthusiast taster/reviewer.
taster_twitter_handle (chr): The reviewer’s Twitter handle, if available.
title (chr): The title of the review, typically including the winery, vintage, and varietal.
variety (chr): The type of grape or wine variety (e.g., Pinot Noir, Chardonnay).
winery (chr): The name of the winery that produced the wine.

The dataset underwent several preprocessing steps to make it suitable for our analyses. First, we addressed missing values by filtering out rows without essential information (e.g., variety or taster_name), ensuring that each observation contained the critical attributes needed for meaningful comparisons. Second, to better highlight overarching patterns, we narrowed our scope to the most frequently reviewed varieties and countries, thereby reducing noise and emphasizing the regions and grapes most relevant to consumer and industry interests. Third, for geospatial analyses, we standardized certain country names (e.g., converting “US” to “United States of America”) to ensure consistency and facilitate mapping. Finally, we derived price categories (Low, Medium, and High) to investigate how descriptive language usage (via TF-IDF) might vary across wines at different price points, offering further insight into potential taster biases and specialization.
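The preprocessing steps described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code behind the report (whose outputs come from R). The price cutoffs, the sample rows, and the helper names `price_category` and `preprocess` are assumptions made for the example; the report does not state the thresholds it used.

```python
# Only the "US" mapping is named in the report; extend as needed.
COUNTRY_ALIASES = {"US": "United States of America"}


def price_category(price, low_max=20.0, high_min=50.0):
    """Bucket a price in USD. The cutoffs are illustrative assumptions."""
    if price is None:
        return None
    if price < low_max:
        return "Low"
    if price < high_min:
        return "Medium"
    return "High"


def preprocess(rows):
    """Drop rows missing variety or taster_name, then add derived fields."""
    cleaned = []
    for row in rows:
        if not row.get("variety") or not row.get("taster_name"):
            continue  # essential attribute missing, skip this review
        out = dict(row)
        country = row.get("country")
        out["country"] = COUNTRY_ALIASES.get(country, country)
        out["price_category"] = price_category(row.get("price"))
        cleaned.append(out)
    return cleaned


# Hypothetical rows for illustration only.
sample_rows = [
    {"country": "US", "variety": "Pinot Noir", "taster_name": "Taster A", "price": 65.0},
    {"country": "Italy", "variety": "Red Blend", "taster_name": None, "price": 15.0},
    {"country": "France", "variety": "Chardonnay", "taster_name": "Taster B", "price": 18.0},
]
cleaned_rows = preprocess(sample_rows)
for row in cleaned_rows:
    print(row["country"], row["price_category"])
```

The second sample row is dropped because it has no taster_name, which mirrors the missing-value filter described in the text.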

Research Questions

1. Do wine scores and prices correlate across countries, and how does the number of reviews influence this relationship?
2. Do tasters specialize in scoring specific wine varieties or regions?
3. Do distinct geographic regions, both globally and within key wine-producing areas, display unique patterns in average wine quality, and how do these insights deepen our understanding of terroir and regional identity in wine?

Research Question 1: Do wine scores and prices correlate across countries, and how does the number of reviews influence this relationship?

The relationship between wine scores, prices, and number of reviews across countries is a relevant inquiry for both consumers and producers. Consumers seek high-quality wine at fair prices, and producers can use this relationship to optimize pricing strategies based on both perceived quality and popularity. Understanding this correlation yields insights into market dynamics and consumer preferences.

First, we use network analysis to highlight patterns of shared tasters among the most reviewed wine varieties. Then, we visualize how average wine scores and prices correlate across the top 10 countries by number of reviews. Finally, we fit a linear regression model to test whether average scores are related to review counts or shared tasters.

## # A tibble: 10 × 2
##    variety                  total_reviews
##    <chr>                            <int>
##  1 Pinot Noir                       13272
##  2 Chardonnay                       11753
##  3 Cabernet Sauvignon                9472
##  4 Red Blend                         8946
##  5 Bordeaux-style Red Blend          6915
##  6 Riesling                          5189
##  7 Sauvignon Blanc                   4967
##  8 Syrah                             4142
##  9 Rosé                              3564
## 10 Merlot                            3102

Interpretation:

This network graph illustrates the relationships among the top 10 wine varieties based on shared tasters’ co-reviews. The node sizes represent the total number of reviews for each wine variety, while the edge widths and colors denote the number of shared tasters between pairs of wine varieties.

Key observations include:

Chardonnay and Pinot Noir are prominent nodes, as indicated by their larger sizes, reflecting their high total review counts.

Stronger connections (thicker and darker edges) are evident between popular wine varieties, such as Red Blend and Cabernet Sauvignon, suggesting a significant overlap in tasters reviewing these varieties.

Weaker connections (lighter edges) between certain varieties, such as Rosé and Bordeaux-style blends, imply fewer shared tasters.

This visualization provides insights into how tasters’ preferences overlap, revealing potential patterns in co-reviews. It may highlight how certain wine varieties are clustered together in tasters’ evaluations, potentially influencing overall scoring patterns or biases.
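The edge weights of such a network can be derived from (variety, taster) pairs. The Python sketch below assumes that "shared tasters" means the number of distinct tasters who reviewed both varieties, which is one plausible reading of the description above; the function name `shared_taster_edges` and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shared_taster_edges(reviews):
    """Map each variety pair to the number of distinct tasters who reviewed both."""
    tasters_by_variety = defaultdict(set)
    for variety, taster in reviews:
        tasters_by_variety[variety].add(taster)

    edges = {}
    for a, b in combinations(sorted(tasters_by_variety), 2):
        shared = tasters_by_variety[a] & tasters_by_variety[b]
        if shared:  # only keep pairs with at least one taster in common
            edges[(a, b)] = len(shared)
    return edges


# Hypothetical (variety, taster) pairs for illustration only.
sample = [
    ("Pinot Noir", "Taster A"),
    ("Chardonnay", "Taster A"),
    ("Chardonnay", "Taster B"),
    ("Pinot Noir", "Taster B"),
    ("Merlot", "Taster B"),
]
edges = shared_taster_edges(sample)
for pair, weight in edges.items():
    print(pair, weight)
```

If the weights were instead meant to count co-reviews rather than distinct tasters, the sets would need to be replaced by per-taster review counts.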

The bubble chart shows the relationship between average wine score, average price, and number of reviews by country. Average price is plotted on the x-axis, average score on the y-axis, and the number of reviews is represented by bubble size. There is a positive relationship: countries with higher average wine prices, like Germany and France, tend to have slightly higher scores. However, this relationship does not appear to be strictly linear, and the influence of review count is unclear. Countries with a large number of reviews, like the US, Italy, and France, have slightly higher scores and prices, which may reflect their reputation for producing quality wines. There are also noteworthy outliers. Austria has the highest average score but not the highest price, and it has very few reviews, suggesting that quality does not always align directly with cost and that other factors are in play. Australia is very similar to the US in average points and average price but has far fewer reviews. Together, these cases suggest that although average price and average score are positively related, the number of reviews may have little effect.
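The three quantities plotted in the bubble chart are per-country aggregates. A minimal Python sketch of that aggregation is shown here, assuming each review is reduced to a (country, points, price) tuple and that missing prices are skipped when averaging; the function name `country_summary` and the sample rows are hypothetical.

```python
from collections import defaultdict


def country_summary(rows, top_n=10):
    """Return review count, mean points, and mean price for the most reviewed countries."""
    totals = defaultdict(lambda: {"n": 0, "points": 0.0, "price_sum": 0.0, "price_n": 0})
    for country, points, price in rows:
        t = totals[country]
        t["n"] += 1
        t["points"] += points
        if price is not None:  # prices can be missing, so track them separately
            t["price_sum"] += price
            t["price_n"] += 1

    summary = []
    for country, t in totals.items():
        avg_price = t["price_sum"] / t["price_n"] if t["price_n"] else None
        summary.append({
            "country": country,
            "reviews": t["n"],                   # bubble size
            "avg_points": t["points"] / t["n"],  # y-axis
            "avg_price": avg_price,              # x-axis
        })
    summary.sort(key=lambda s: s["reviews"], reverse=True)
    return summary[:top_n]


# Hypothetical rows for illustration only.
rows = [("Country A", 90, 30.0), ("Country A", 88, None), ("Country B", 85, 10.0)]
summary = country_summary(rows)
for s in summary:
    print(s)
```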

Statistical Analysis

Why Regression Analysis Was Used

Regression analysis was selected to quantify the relationships between average wine scores (dependent variable) and the predictors total reviews and shared tasters.

Assumptions of the Regression Model

Linearity:

The relationships between the predictors (total reviews and shared tasters) and the dependent variable (average wine scores) are assumed to be linear. Justification: Preliminary scatterplots suggested no extreme non-linear patterns.

Independence:

Observations are independent of one another. Justification: Each wine variety represents an independent data point, and there is no evidence of repeated measures.

Homoscedasticity:

The variance of residuals is constant across all levels of the predictors. Justification: Residual plots show no clear patterns, suggesting homoscedasticity.

Normality of Residuals:

Residuals are assumed to be normally distributed. Justification: The residuals are relatively symmetric and centered around zero.

Absence of Multicollinearity:

The predictors (total reviews and shared tasters) are not highly correlated. Justification: Correlation analysis prior to regression showed no multicollinearity concerns.
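The multicollinearity check can be illustrated with a Pearson correlation between the two predictors and the corresponding variance inflation factor (VIF), which for a model with exactly two predictors reduces to 1 / (1 - r²). In the Python sketch below, the total_reviews values are taken from the variety table earlier in the report, while the shared_tasters values are made up for the example, so the printed numbers say nothing about the actual data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(r):
    """VIF when the model has exactly two predictors with correlation r."""
    return 1.0 / (1.0 - r ** 2)


# total_reviews from the variety table; shared_tasters values are hypothetical.
total_reviews = [13272, 11753, 9472, 8946, 6915, 5189, 4967, 4142, 3564, 3102]
shared_tasters = [160, 150, 170, 140, 120, 150, 130, 140, 110, 125]

r = pearson_r(total_reviews, shared_tasters)
print(f"r = {r:.3f}, VIF = {vif_two_predictors(r):.3f}")
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic collinearity.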

## 
## Call:
## lm(formula = average_score ~ total_reviews + shared_tasters, 
##     data = network_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.1958 -0.4737  0.1158  0.3063  0.9910 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    8.659e+01  7.280e-01 118.944 7.83e-13 ***
## total_reviews  1.382e-04  7.176e-05   1.926   0.0955 .  
## shared_tasters 5.350e-03  2.606e-03   2.053   0.0792 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7626 on 7 degrees of freedom
## Multiple R-squared:  0.5056, Adjusted R-squared:  0.3643 
## F-statistic: 3.579 on 2 and 7 DF,  p-value: 0.08499

Interpretation

Intercept: The intercept of 86.59 represents the baseline average score for a wine variety when both total_reviews and shared_tasters are zero. This value reflects a baseline measure of wine quality independent of the predictors.

Total Reviews: The coefficient of 0.0001382 suggests a small positive relationship between the total number of reviews and average wine scores. For every additional 10,000 reviews, the predicted score increases by approximately 1.38 points. However, this relationship is not statistically significant (p = 0.0955).

Shared Tasters: The coefficient of 0.00535 indicates a weak positive relationship between the number of shared tasters and average scores. For every additional 100 shared tasters, the average score is expected to increase by 0.535 points. This effect is also not statistically significant (p = 0.0792).

Model Fit: The R² value of 0.5056 implies that approximately 50.56% of the variance in average wine scores can be explained by the predictors (total_reviews and shared_tasters). However, the adjusted R² value of 0.3643 shows that the explanatory power of the model is lower once the number of predictors and the small sample size (10 varieties) are taken into account. The overall model is not statistically significant at the 0.05 level (F = 3.579, p = 0.08499), indicating limited evidence that the predictors jointly affect the average wine score.
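The figures quoted in this interpretation can be re-derived from the regression output as a sanity check. The short Python sketch below uses only numbers printed in the summary table; the sample size n = 10 follows from the 7 residual degrees of freedom with 2 predictors plus an intercept.

```python
# Values copied from the regression summary above.
r2 = 0.5056
n, k = 10, 2  # 10 varieties, 2 predictors (residual df = n - k - 1 = 7)

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Overall F-statistic expressed through R-squared.
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))

# t value = estimate / standard error.
t_reviews = 1.382e-04 / 7.176e-05
t_shared = 5.350e-03 / 2.606e-03

# Rescaled effects quoted in the text.
effect_per_10k_reviews = 1.382e-04 * 10_000
effect_per_100_shared = 5.350e-03 * 100

print(round(adj_r2, 4), round(f_stat, 3))
print(round(t_reviews, 3), round(t_shared, 3))
print(round(effect_per_10k_reviews, 2), round(effect_per_100_shared, 3))
```

Small differences in the last digit relative to the R output are possible because the inputs here are already rounded.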

Conclusion: The regression analysis does not provide strong evidence that either total_reviews or shared_tasters significantly affects average wine scores. Both coefficients are small and positive, but neither is statistically significant at the 0.05 level. This is consistent with the bubble plot, where bubble size (total reviews) showed no large effect on average score. In short, wine score and price show a modest positive relationship across countries, but we find no statistically significant effect of total reviews or shared tasters on wine score.

Research Question 2: Do tasters specialize in scoring specific wine varieties or regions?

This research question investigates whether wine tasters exhibit scoring biases or preferences towards specific wine varieties or regions. It is relevant for understanding how the subjectivity of taste might influence wine scores and whether taster specialization affects the perceived quality of wines from certain regions. Insights from this analysis can guide consumers in selecting wines and producers in designing marketing strategies.

To test this, we first run a TF-IDF analysis on the vocabulary of wine reviews to identify how tasters describe wines in different price categories. This shows whether tasters emphasize specific characteristics for each price range. Then, we evaluate score distributions using a jitter plot of the wine scores assigned by the top 10 tasters across the top 5 most reviewed countries. This lets us detect patterns in score variability, regional biases, and taster-specific tendencies (such as consistently giving high scores).

Interpretation:

This graph visualizes the most distinctive words used in wine reviews across three price categories: Low Price, Medium Price, and High Price. The x-axis represents the TF-IDF score, which measures how uniquely a word characterizes a specific price category. Words with higher scores are more distinctive for their category. For example:

Low Price: Words like “prosecco” and “vinho” indicate a focus on specific wine types and casual descriptions.

Medium Price: Terms like “prosecco” and “ciel” reflect a combination of descriptive elements and specific attributes of the wine.

High Price: Words such as “romanée” and “pommard” are names of Burgundy appellations, signifying a focus on specific places of origin and emphasizing sophistication and exclusivity.

This plot provides evidence that tasters describe wines differently based on their price category, reflecting a potential bias or specialization in how wines are scored or reviewed depending on their price range.
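To make the TF-IDF score concrete, the Python sketch below computes it under the assumption that each price category is treated as one "document" and that the standard definition is used: term frequency (count divided by total words in the category) multiplied by the natural log of (number of categories / number of categories containing the word). The tokens are invented for the example, and the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps category -> list of tokens; returns {category: {word: score}}."""
    counts = {cat: Counter(tokens) for cat, tokens in docs.items()}
    n_docs = len(docs)

    # Document frequency: in how many categories does each word appear?
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())

    scores = {}
    for cat, c in counts.items():
        total = sum(c.values())
        scores[cat] = {
            word: (n / total) * math.log(n_docs / doc_freq[word])
            for word, n in c.items()
        }
    return scores


# Invented tokens for illustration only.
docs = {
    "Low": ["fruity", "prosecco", "fruity", "fresh"],
    "Medium": ["prosecco", "fruity", "oak"],
    "High": ["pommard", "fruity", "oak", "pommard"],
}
scores = tf_idf(docs)
for cat, words in scores.items():
    top_word = max(words, key=words.get)
    print(cat, top_word, round(words[top_word], 4))
```

A word that appears in every category, such as "fruity" here, gets a score of zero, which is why only category-specific terms surface as distinctive.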

From the jitter plot, we can see that some tasters, like Kerin O’Keefe and Roger Voss, consistently assign higher scores than others, like Michael Schachner. Furthermore, some reviewers almost always taste wines from specific regions, which could result in biased scores, with wines from certain regions consistently scoring better than others. Six of the tasters shown review almost exclusively US wines (among the top 5 countries), and Kerin O’Keefe reviews only wines from Italy. This could lead to bias, especially since the score distributions of these reviewers differ. Some reviewers, like Sean P. Sullivan and Anna Lee C. Iijima, give a very narrow range of scores, while others give high scores more readily.
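The patterns read off the jitter plot (typical score level, score range, and regional concentration per taster) can also be summarized numerically. The Python sketch below is one way to do so, assuming each review is reduced to a (taster, country, points) tuple; the function name `taster_profiles` and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def taster_profiles(rows):
    """Summarize each taster's score level, spread, and most reviewed country."""
    by_taster = defaultdict(list)
    for taster, country, points in rows:
        by_taster[taster].append((country, points))

    profiles = {}
    for taster, items in by_taster.items():
        points = [p for _, p in items]
        top_country, top_n = Counter(c for c, _ in items).most_common(1)[0]
        profiles[taster] = {
            "mean": mean(points),
            "spread": pstdev(points),  # population standard deviation
            "min": min(points),
            "max": max(points),
            "top_country": top_country,
            "top_share": top_n / len(items),  # share of reviews from that country
        }
    return profiles


# Hypothetical rows for illustration only.
rows = [
    ("Taster A", "Italy", 90), ("Taster A", "Italy", 92), ("Taster A", "Italy", 94),
    ("Taster B", "US", 85), ("Taster B", "US", 86),
    ("Taster B", "France", 87), ("Taster B", "US", 86),
]
profiles = taster_profiles(rows)
for taster, profile in profiles.items():
    print(taster, profile)
```

A top_share close to 1 would flag a taster who reviews almost exclusively one country, and a small spread would flag a narrow scoring range.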

Statistical Analysis

Setup

The goal is to evaluate whether the distributions of TF-IDF scores differ significantly across the three wine price categories: Low Price, Medium Price, and High Price. TF-IDF values quantify the importance of words in wine reviews, and understanding how these scores vary across price categories can provide insights into how tasters describe wines differently.

Justification for Using the Kruskal-Wallis Test

The Kruskal-Wallis test is chosen because:

Non-Normal Data: TF-IDF values are often skewed due to the nature of word distributions, where a few words dominate.

Continuous Data: TF-IDF scores are continuous and can be ranked for analysis.

Multiple Groups: The test compares TF-IDF scores across three independent groups (Low, Medium, High Price categories).

Robustness: It is robust to outliers and unequal sample sizes, which are common in TF-IDF datasets.

Assumptions of the Kruskal-Wallis Test

Independent Observations: TF-IDF scores within and across price categories are independent, as each word is treated separately in the analysis.

Ordinal or Continuous Data: The dependent variable (TF-IDF scores) is continuous, satisfying this assumption.

Independent Groups: The three price categories (Low, Medium, High) are mutually exclusive, with no overlap between groups.

Similar Distribution Shapes: While the test does not assume normality, it assumes that the shapes of the distributions within each group are similar. Large differences in variance could affect the validity of the results.
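For readers unfamiliar with the test, the Python sketch below shows how the Kruskal-Wallis H statistic is built from pooled ranks, with the usual correction for ties. It is a from-scratch illustration on made-up numbers, not the routine that produced the report's output. The closed-form p-value helper is valid only for three groups (2 degrees of freedom), where the chi-squared survival function reduces to exp(-H / 2); both function names are hypothetical.

```python
import math


def kruskal_wallis(groups):
    """Return (H, df) with tie correction. Assumes the values are not all identical."""
    pooled = sorted((value, gi) for gi, group in enumerate(groups) for value in group)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0

    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the ranks i+1 .. j shared by tied values
        t = j - i
        tie_term += t ** 3 - t
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j

    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    return h / correction, len(groups) - 1


def p_value_df2(h):
    """Chi-squared upper-tail probability for exactly 2 degrees of freedom."""
    return math.exp(-h / 2.0)


# Made-up scores for three groups, for illustration only.
h, df = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 3), df, round(p_value_df2(h), 4))
```

Applying `p_value_df2` to a statistic of 17.887 gives approximately 0.0001306.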

## 
##  Kruskal-Wallis rank sum test
## 
## data:  tf_idf by price_category
## Kruskal-Wallis chi-squared = 17.887, df = 2, p-value = 0.0001306

Interpretation of Results

Test Statistic:

The Kruskal-Wallis chi-squared statistic is 17.887, with 2 degrees of freedom.

P-Value:

The p-value is 0.0001306, which is highly significant (p < 0.05).

Conclusion:

Since the p-value is less than 0.05, we reject the null hypothesis that the distributions of TF-IDF scores are identical across the three price categories.

This indicates that there are significant differences in the distributions of word importance (TF-IDF scores) between Low, Medium, and High Price categories.

Furthermore, the jitter plot suggests some form of scoring bias based on the taster. Different tasters have different scoring ranges, and some review wines mainly from a single region, which could introduce regional preferences.

This means that yes, scoring differs depending on the taster involved. Tasters not only describe wines differently based on price, but also assign scores differently, a difference that may be amplified by the regions whose wines they mainly review.

Implications:

The result supports the hypothesis that tasters use distinct vocabularies to describe wines in different price categories. For instance, high-priced wines are associated with words like “pommard” and “romanée,” while low-priced wines are characterized by words like “prosecco” and “vinho.” Scoring distributions also differ between reviewers, meaning that tasters not only describe wines differently but also score them differently.

Research Question 3: Do distinct geographic regions—both globally and within key wine-producing areas—display unique patterns in average wine quality, and how do these insights deepen our understanding of terroir and regional identity in wine?

The wine industry is not only influenced by price, variety, and individual taster biases, but also by the geographical and environmental contexts that give rise to the notion of terroir—how soil, climate, and local tradition affect wine quality. Examining average wine scores across different countries and zooming further into specific regions, such as those within states, can help us understand if certain locales are consistently producing higher-quality wines. Such insights are valuable for consumers seeking wines from reputable areas, producers aiming to highlight regional strengths, and policymakers and trade organizations interested in promoting local wine industries.

We will use a global choropleth map of average wine scores by country to identify broad geographic patterns in wine quality. Next, boxplots showing the distribution of points by country will reveal whether certain nations, on average, yield higher-quality wines or more variable ratings. Finally, a U.S. state-level map and a detailed map of California will allow us to investigate variations in quality at a more granular level within a single country and a single wine-producing hotspot. By integrating these visual analyses, we can identify global and regional patterns, highlight potential quality clusters, and gain a richer understanding of how location impacts the sensory and qualitative dimensions of wine.

Interpretation:

These visualizations guide us through a geographic hierarchy of wine quality, starting on a global scale and moving down to finer levels of detail. By examining averages and distributions of wine ratings at each stage, we gain a richer understanding of how place influences perception, quality, and ultimately, market dynamics.

The world choropleth map reveals that average wine ratings vary notably by country. Established wine-producing regions—such as parts of Europe, North and South America, and Oceania—tend to cluster around higher average scores. However, the global view alone does not explain the variability within countries or how localized factors might shape these averages.

The boxplot of wine points by country adds depth to the global picture. It shows that while some countries yield consistently high-quality wines (narrow distributions with higher medians), others present a broader range of scores, indicating diversity in quality or differences in winemaking practices, grape varieties, or regional terroir. This highlights that country-level averages mask underlying complexity.

Drilling down to the U.S. state-level choropleth map, we see that even within a single country, wine quality is not uniform. States known for their wine industries, such as California, Oregon, and Washington, often average higher scores compared to states with less established reputations. This suggests that regional factors—climate, soil composition, viticultural practices, and local expertise—play a significant role in shaping wine quality.

Finally, the California map zooms in further, illustrating differences in average ratings at a much finer scale. Renowned counties like Napa and Sonoma emerge as top performers, reinforcing the idea that terroir—an interplay of soil, climate, and tradition—can strongly influence wine quality, even within a single state. By examining micro-level differences, we gain insight into how localized conditions and regional specialization contribute to a wine’s overall character and reputation.

From a global comparison of countries down to the nuances of specific Californian regions, these analyses reveal that wine quality is influenced by geography at multiple scales. Broad patterns observed internationally become more textured and intricate as we focus on narrower locales. These visualizations support the narrative that both macro-level factors (historical prestige, international market presence) and micro-level elements (local terroir, regional practices, and taster biases) shape our understanding and appreciation of wine quality worldwide.

Statistical Analysis

Setup

Building on our visual findings, which suggest that wine quality varies considerably across different geographical regions, we now seek a more rigorous statistical confirmation of these patterns. While choropleth maps and boxplots highlight apparent differences in average wine ratings among countries, they do not tell us whether these differences are statistically meaningful. To address this, we will apply a one-way Analysis of Variance (ANOVA) to test if the observed variations in average scores across multiple countries represent true distinctions rather than random fluctuations. By doing so, we aim to solidify our understanding of how geography and regional factors shape wine quality, thereby strengthening the narrative that place-based influences play a critical role in the global wine landscape.

Justification

The ANOVA test is well-suited for comparing the mean wine ratings of multiple countries simultaneously. It complements the global and country-level boxplots and maps by providing a formal statistical framework to determine if observed differences are more than just random variation. While our earlier tests (Kruskal-Wallis for TF-IDF distributions and linear regression of average scores on review counts and shared tasters) focused on language patterns and predictors of score, the ANOVA directly addresses differences in average ratings by geographic unit, reinforcing our narrative about terroir and regional identity.

##                 Df  Sum Sq Mean Sq F value Pr(>F)    
## country          9   55604    6178   693.4 <2e-16 ***
## Residuals   124574 1110028       9                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interpretation of Results

The one-way ANOVA results show a highly significant difference in average wine ratings across the examined countries (F=693.4, p<2e-16). This confirms that not all countries share the same mean wine rating, providing statistical support for the patterns suggested by our maps and boxplots. In other words, these differences are not merely due to random variation but reflect genuine geographic distinctions. Thus, our analysis reinforces the idea that location—influenced by factors such as terroir, cultural practices, and historical reputation—plays a substantive role in determining a country’s overall wine quality profile.
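The entries of the ANOVA table can be checked against each other, and the same sums of squares give a simple effect size, eta squared, which the table itself does not report. The Python sketch below uses only numbers printed in the table; reading eta squared as the share of score variance associated with country is a standard interpretation, not a result stated in the report.

```python
# Values copied from the ANOVA table above.
ss_country, df_country = 55604, 9
ss_resid, df_resid = 1110028, 124574

# Mean squares are sums of squares divided by their degrees of freedom.
ms_country = ss_country / df_country
ms_resid = ss_resid / df_resid

# F is the ratio of between-group to within-group mean squares.
f_stat = ms_country / ms_resid

# Eta squared: proportion of total variation attributable to country.
eta_squared = ss_country / (ss_country + ss_resid)

print(round(ms_country), round(ms_resid))
print(round(f_stat, 1), round(eta_squared, 4))
```

With roughly 124,000 reviews, even modest differences between country means produce a very large F value, so an eta squared of about 0.048 (country accounting for roughly 5% of the variance in individual scores) is a useful complement when judging practical importance.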

Conclusion:

By integrating spatial visualizations with a formal statistical test, our analysis for this research question strongly suggests that geography and terroir significantly influence wine quality. The choropleth maps and boxplots provided an initial, intuitive understanding of how average scores vary at global, national, and regional scales. The subsequent one-way ANOVA confirmed that these observed differences are statistically meaningful rather than random fluctuations. Together, these findings reinforce the notion that location plays a critical role in shaping the sensory profiles, reputations, and perceived value of wines. As a result, consumers, producers, and stakeholders can better appreciate the importance of terroir and regional identity when selecting, marketing, or investing in wines from around the world.

Implications:

These findings have direct implications for a range of stakeholders in the wine industry. For consumers, recognizing that geographic factors drive quality can guide more informed purchasing decisions, encouraging them to explore regions consistently associated with higher average ratings. Producers may find value in emphasizing their regional identity and terroir, leveraging geographic reputations as a unique selling point. Marketers and policymakers, in turn, can highlight regional distinctions to promote tourism, bolster local wine economies, and support growers who uphold regional standards of excellence. Ultimately, acknowledging the statistical significance of terroir and location elevates our collective understanding of wine, encouraging both appreciation of global diversity and further exploration of emerging, yet promising, wine-producing areas.

Limitations

While our analyses shed light on the complex factors influencing wine quality, several limitations warrant consideration. First, the dataset is composed solely of WineEnthusiast reviews, potentially reflecting the preferences, biases, and tasting methodologies of a specific set of professional reviewers rather than representing a universal standard. Additionally, regional and varietal coverage may be uneven; some countries, states, or varieties are far more represented than others, which could skew results and obscure underrepresented regions’ true quality profiles. Our analyses also rely on price as a proxy for market value, yet price can be influenced by marketing strategies, scarcity, and other external factors that do not necessarily correlate with intrinsic quality. Furthermore, the terroir concept—key to understanding regional differences—is multifaceted and not fully captured by the data. Soil composition, microclimates, vineyard management techniques, and cultural practices are not directly quantified here. Finally, while statistical methods like ANOVA and Kruskal-Wallis help verify that differences are not random, they cannot fully explain the causal mechanisms behind these variations. Future research might incorporate more granular environmental data, consumer-level preferences, and long-term trends to better contextualize and refine the insights drawn from these analyses.

Conclusions and Future Work

In this project, we leveraged a comprehensive dataset of wine reviews from WineEnthusiast to explore how geographic factors, taster biases, and pricing interact to shape our understanding of wine quality. Our analyses revealed several key insights. First, while there is some correlation between wine scores and price across countries, the influence of the number of reviews is not clearly significant. Second, tasters appear to specialize to some extent, using distinctive vocabularies and exhibiting scoring patterns that may be tied to both price tiers and specific regions. Third, and most notably, our geospatial analyses—supported by one-way ANOVA—demonstrated that geography plays a critical role in determining wine quality. Patterns observed at global, national, and regional scales underscore the concept of terroir and the influence of local factors on a wine’s character and reputation.

Future work could deepen these insights by incorporating additional data sources and more granular environmental information. Analyzing detailed soil composition, microclimate metrics, and vineyard management practices would bring us closer to understanding the causal mechanisms behind terroir-driven quality differences. Expanding the dataset to include consumer-level preferences, as well as reviews from different publications and cultural perspectives, could mitigate potential biases and broaden the applicability of our findings. Longitudinal studies might track how climate change, evolving market conditions, and shifting consumer tastes influence wine quality over time. Ultimately, such extensions would enrich our understanding of the complex tapestry of factors that define the global wine landscape, guiding producers, consumers, and policymakers toward more informed and sustainable decisions.
