Data Description

This dataset covers the Economic Freedom of the World index and includes around 4,600 data points from 1970 to 2021 with around 88 variables. It includes categorical variables such as the country, the region, and the World Bank income classification, as well as quantitative variables such as the overall ranking and the economic indicators used to construct the overall Economic Freedom ranking.

Here are the variables that we are focusing on:

To explore this dataset, we were interested in the following questions:

  1. What is the relationship between economic freedom and social variables?
  2. What is the relationship between the economic components and the economic freedom summary index?
  3. How does economic freedom vary between countries over time?

Exploratory Data Analysis

As initial exploratory data analysis, we ran a pairs plot comparing the overall economic freedom score with the five key areas that comprise it, to understand which area is most strongly correlated with the overall score. In the pairs plot, the most correlated areas are Sound Money, Free Trade, and Regulation, while Government Size is the least correlated with the overall economic freedom score. This pairs plot is meant to provide direction for the research questions that follow, particularly in understanding more specific relationships between the overall economic freedom score and the other economic and social variables measured in the Fraser Economic Freedom Dataset.

Question 1: What is the relationship between economic freedom and social variables?

To investigate the relationship between economic freedom and different social variables, we first look at how economic freedom relates to gender disparity. We wanted to examine the social factors that may be strongly related to the economic freedom index, and this dataset includes a gender disparity index. For the economic freedom index, a higher value indicates more economic freedom, or a greater ability for individuals to make their own economic decisions. For the gender disparity index, a lower value indicates a greater level of gender disparity. We initially wanted to color the points by country, but that would require a separate color for each point. The dataset already includes a quartile variable, which splits the countries by their economic freedom index ranking, so we colored the points by quartile instead.

We had expected that the gender disparity index would be highly correlated with the economic freedom index. This is somewhat supported by the plot, as countries with higher economic freedom scores also generally have a higher gender disparity index. Coloring by quartile helps illustrate that some countries, especially in the fourth quartile, have a high gender disparity index score but still have low economic freedom. For the countries in the first quartile, the variation in gender disparity is much smaller and tends to be concentrated around 1, while the variation is much wider for the second, third, and fourth quartiles. The slopes of the regression lines for the second and fourth quartiles are fairly low, with the slope of the fourth quartile near zero and the slope of the second quartile slightly larger than zero. However, the slopes of the regression lines for the first and third quartiles are much steeper, with the slope for the first quartile slightly larger than the slope of the third quartile. This plot is particularly informative for exploring the relationship between economic freedom and gender disparity because it shows both the overall trend and how that trend differs by quartile. Specifically, coloring by quartile helps display the large variation in gender disparity in the lower quartiles.

We further investigate the relationship between economic freedom and social variables through Principal Component Analysis. We first plot the elbow plot of principal components from the quantitative social variables: judicial independence, impartial courts, property rights, military interference, legal integrity, contracts, real property, police and crime, and gender disparity.

Based on the elbow plot, most of the variation is accounted for by the first principal component: after the first component the elbow plot flattens, and the rule-of-thumb line of 1/p is below the first principal component but above the second. However, the 1/p line is only slightly higher than the second and third principal components, which justifies our use of the second principal component in our PCA biplot.
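The 1/p rule of thumb used above can be sketched as follows. The data here is synthetic (one shared latent factor plus noise across p = 9 variables), so the numbers only illustrate the mechanics, not the report's actual elbow plot.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 9  # p = 9 matches the number of social variables listed above

# Synthetic stand-in data: one shared latent factor plus independent noise.
latent = rng.normal(size=(n, 1))
X = latent @ rng.uniform(0.5, 1.0, size=(1, p)) + rng.normal(scale=0.8, size=(n, p))

# Standardize each column, then take eigenvalues of the covariance of the
# standardized data (equivalently, the correlation matrix), largest first.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals = np.linalg.eigvalsh(np.cov(Z, rowvar=False))[::-1]

# Proportion of variance explained by each component, compared with 1/p.
explained = eigvals / eigvals.sum()
threshold = 1.0 / p
keep = int(np.sum(explained > threshold))

for i, share in enumerate(explained, start=1):
    flag = "above 1/p" if share > threshold else "below 1/p"
    print(f"PC{i}: {share:.3f} ({flag})")
print(f"Components above the 1/p line: {keep}")
```

A component that explains more than 1/p of the variance carries more than an "average" variable's share, which is the reasoning behind the reference line on the elbow plot.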

In our PCA biplot, we plotted the first and second principal components with the points colored by quartile. Looking at the arrows that represent each of the social variables, we see that the first quartile is associated with judicial independence, impartial courts, property rights, military interference, legal integrity, contracts, real property, and police and crime. However, the first quartile is more weakly associated with the gender disparity index.

The associations for the other three quartiles are less obvious than that of the first quartile. Most of the second quartile is weakly associated with all of the social variables, though some of the second quartile is negatively associated with the social variables, as indicated by the many second quartile points in the two right quadrants but also the cluster of points in the top left quadrant. Most of the third quartile is fairly negatively associated with social variables, but some of the third quartile is positively associated with the social variables, as most of the third quartile points are in the top left quadrant. The fourth quartile is negatively associated with all of the social variables because most of the fourth quartile points are in the two left quadrants. This plot is particularly informative for answering the above question of determining the relationship of economic freedom and social variables because it illustrates the differing relationships between each of the social variables and the quartiles of the economic freedom index.
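For readers less familiar with biplots, the two ingredients described above (point coordinates and variable arrows) can be sketched with a singular value decomposition. The three variable names and the random data are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins for three of the social variables; not real data.
names = ["judicial_independence", "impartial_courts", "gender_disparity"]
X = rng.normal(size=(50, 3))
X[:, 1] += X[:, 0]  # make the first two variables correlated

# Standardize, then decompose: Z = U * diag(s) * Vt.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

scores = U[:, :2] * s[:2]  # coordinates of each observation on PC1 and PC2
loadings = Vt[:2].T        # one row per variable: the direction of its arrow

for name, (pc1, pc2) in zip(names, loadings):
    print(f"{name:22s} PC1 = {pc1:+.3f}, PC2 = {pc2:+.3f}")
```

Points that lie in the direction of a variable's arrow tend to have above-average values of that variable, which is how the quartile-by-quadrant readings above are made. The sign of each component is arbitrary, so a mirrored plot carries the same information.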

Question 2: What is the relationship between the economic components and the economic freedom score?

As mentioned in our EDA, the economic freedom score is made up of five components. The index measures the degree of economic freedom present in five major areas: [1] Size of Government, [2] Legal System and Property Rights, [3] Sound Money, [4] Freedom to Trade Internationally, and [5] Regulation. Within the five major areas, there are a total of 25 components in the index. Each component and subcomponent is placed on a scale from 0 to 10 that reflects the distribution of the underlying data. The component ratings within each area are then averaged to derive ratings for each of the five areas, and in turn, the five area ratings are averaged to derive the summary rating for each country. While we know that there is a relationship between the five indicators and the economic freedom score, we are interested in understanding the weights of those indicators.

To quantify the weights, we decided to run a linear regression of a country’s economic freedom score on the five economic factors. Since every variable is on the same scale (0-10), the coefficients are directly comparable in terms of their impact on the dependent variable. This means that the magnitude of a coefficient indicates the strength of the relationship between the variables, and a larger magnitude suggests a stronger influence. The summary of the model is displayed in the table below.

                   Estimate    Standard Error   t-value   P-value
Intercept         -0.034326    0.022646          -1.516   0.13
Free Trade         0.223198    0.003534          63.155   <2e-16 ***
Regulation         0.282458    0.003772          74.877   <2e-16 ***
Sound Money        0.202155    0.002644          76.463   <2e-16 ***
Government Size    0.167598    0.002356          71.138   <2e-16 ***
Property Rights    0.103281    0.002149          48.050   <2e-16 ***

From the table, we can see the different weights for the economic freedom score. From largest to smallest, the weights are: Regulation, Free Trade, Sound Money, Government Size, and Property Rights. We found it interesting that there is a 0.18 difference in weights between Regulation and Property Rights. In the context of economic freedom, researchers may adopt different frameworks and criteria for assessing economic freedom. A possible explanation may be that regulations such as credit market regulation, business regulation, and labor market regulation play a larger role in a country’s economic factors than factors such as judicial independence, military interference, and police and crime, which are components of Regulation and Property Rights, respectively.
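The regression setup described above can be sketched with ordinary least squares. This is a minimal illustration on synthetic data, not a reproduction of the reported model: the column names are hypothetical, and the toy summary score is constructed as an equal-weight average of the five areas plus a little noise, so the fit recovers weights near 0.2 by construction.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical column names for the five area ratings.
names = ["free_trade", "regulation", "sound_money", "government_size", "property_rights"]

# Synthetic ratings on the 0-10 scale; the toy summary score is their
# equal-weight average plus small noise (an assumption of this sketch).
X = rng.uniform(0, 10, size=(500, len(names)))
y = X.mean(axis=1) + rng.normal(scale=0.05, size=500)

# Ordinary least squares with an intercept column.
design = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept = coef[0]
weights = dict(zip(names, coef[1:]))

print(f"intercept = {intercept:+.4f}")
for name, w in weights.items():
    print(f"{name:16s} weight = {w:.4f}")
```

Because all predictors share one scale, the fitted coefficients can be compared directly, which is the property the weight ranking above relies on.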

After determining the weights, we were interested to know whether the relationship differs across quartiles. We acknowledge that the relationship between these indicators and the economic freedom score may vary across different levels of economic development or policy environments. To first explore this, we decided to create a scatterplot of regulation against the economic freedom score.

We expected the scatterplot to show an overall strong positive relationship between regulation and a country’s economic freedom score. We can also see that this holds within each quartile, as each of the regression lines has a positive slope. The spread also differs by quartile: countries’ scores sit closer together in the higher-freedom quartiles, while there seems to be more spread among countries in Quartile 3 and Quartile 4.

To further understand the clustering, we decided to create dendrograms for each economic indicator. We decided to do this because the dendrogram can visually show how observations within each quartile are grouped and we can identify clusters of similar observations.

In these dendrograms, the red cluster is Quartile 1, green is Quartile 2, blue is Quartile 3, and purple is Quartile 4. While we are unable to read the individual leaves because of the size of the data, we can observe the branching for each factor. Branches that join lower in the dendrogram represent groups of observations that are more similar to each other, while higher joins indicate less similarity. Across all indicators except for Sound Money, we see that Quartile 2 has wide branches with numerous data points. This implies that there may be low similarity among data points within that cluster and that the points may have more diverse characteristics. For Sound Money, the same is true, but for Quartile 4.
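How merge height reflects similarity can be sketched with a small agglomerative clustering routine. The report does not state which linkage method was used, so complete linkage here is an assumption, and the two groups of values are invented for illustration.

```python
def agglomerate(values):
    """Complete-linkage clustering of 1-D values; returns merge heights in order."""
    clusters = [[v] for v in values]
    heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the farthest pair of members.
                d = max(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        heights.append(d)  # height at which these two branches join
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return heights


# Hypothetical indicator ratings for a tight group and a spread-out group.
tight = [7.9, 8.0, 8.1, 8.2]
spread = [4.0, 5.5, 7.0, 8.5]

tight_heights = agglomerate(tight)
spread_heights = agglomerate(spread)
print("tight group merge heights: ", [round(h, 2) for h in tight_heights])
print("spread group merge heights:", [round(h, 2) for h in spread_heights])
```

The spread-out group joins at much greater heights than the tight group, which is the pattern read from the wide branches above. This brute-force version is only practical for small inputs.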

Question 3: How does economic freedom vary between countries over time?

The choropleth maps present the economic freedom score of each country in 1970 and in 2021 (countries shown in grey were not measured or scored that year). The median freedom score for each choropleth map is 6, and from the difference in coloring between the two maps, it is noticeable that from 1970 to 2021, many countries have improved their economic freedom score. In 1970, many countries either were not measured, which itself can be interpreted as a reflection of their overall openness and freedom, or were measured to have low freedom scores. Particularly notable is that in 1970, almost all of South America, Africa, and the countries measured in Asia had economic freedom scores below the median score. A notable outlier in 1970 was Venezuela, which was significantly above the median freedom score. Countries in North America, Europe, and Australia were well above the median score in 1970.

In 2021, the choropleth map highlights a significant improvement in freedom scores worldwide. Much of Asia is now measured and above the median score, with countries such as India moving from below to above the median score. North America, Europe, and Australia remain above the median score, while many countries in South America and Africa have also made significant improvements. An interesting observation is that Venezuela, which had a score well above the median freedom score in 1970, is well below the median score in 2021, marking it as one of the few countries to deteriorate in economic freedom over time.

Now, we want to see how economic freedom varies between countries over time. In order to investigate this question, we decided to plot a time series plot with a moving average line for each region of the world, with the region defined as “World Bank Region”. The economic systems in the world are complex, so different regions of the world may develop differently and have different economic freedom trends. As such, we plotted moving average plots for each region to capture some of the complexity and compare changes over time for similar regions of the world.

The black lines represent the actual yearly observations, while the blue line represents a moving average, taking the mean of the past seven observations. While there are unusual dips in the actual data that can be explained by one-time events, the moving average line ensures that we do not misinterpret those blips as a trend. This plot is particularly informative because it shows how economic freedom changes over time, while also making us aware of the true trend, without outlier events.
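The smoothing step can be sketched as a trailing mean. The description above leaves open whether the current year counts toward the seven observations; this sketch assumes it does, and the yearly values are invented to show how a one-time dip is damped.

```python
def trailing_mean(values, window=7):
    """Mean of each observation and the ones before it, `window` values in total.

    Returns None for positions where fewer than `window` observations exist.
    """
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)  # not enough history yet
        else:
            chunk = values[i + 1 - window : i + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed


# Hypothetical yearly scores for one region with a single one-year dip.
series = [6.0] * 6 + [4.6] + [6.0] * 7
smoothed = trailing_mean(series, window=7)

for year_index, (raw, avg) in enumerate(zip(series, smoothed)):
    shown = "n/a" if avg is None else f"{avg:.2f}"
    print(f"t={year_index:2d}  raw={raw:.1f}  moving average={shown}")
```

The raw series drops by 1.4 points in the dip year, while the moving average drops by only 0.2 and returns to 6.0 once the dip leaves the window, which is why the smoothed line is less likely to be mistaken for a change in trend.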

Most regions of the world show a positive change from 1970-2021, with the only region showing a negative change being North America. South Asia and Sub-Saharan Africa showed the greatest change according to the rolling average line, while East Asia & Pacific and Europe & Central Asia showed modest growth in economic freedom. Interestingly, we see positive change in the Middle East & North Africa until 2005, before a sharp decline that continues into 2021.

As of 2021, the Middle East & North Africa has the lowest economic freedom summary index at 5.92 out of 10, while North America has the highest economic freedom summary index at 8 out of 10.

Conclusion

Overall, we believe these questions were helpful to explore the relationships of economic freedom with social, economic, and temporal factors.

By comparing the relationship between the economic freedom index and the different social variables, it is evident that countries in the first quartile that have the highest economic freedom indices also tend to have higher gender disparity scores and have positive associations with all of the social variables: judicial independence, impartial courts, property rights, military interference, legal integrity, contracts, real property, and police and crime. However, this relationship becomes weaker for countries in the second and third quartiles, as they have larger variation in the gender disparity index and progressively weaker or negative associations with all of the social variables. Out of all of the quartiles, the fourth quartile has the greatest variation in the gender disparity index and negative associations with all of the social variables. This is what we had expected to see, as it indicates that countries with lower economic freedom are also worse off in other social aspects. Thus, future analyses could study whether improving a country socially can lead to an increase in economic freedom.

In our exploration of the relationship between economic indicators and the economic freedom score, we gained valuable insights into the weights of different components and their varying impacts. The linear regression analysis revealed that Regulation holds the highest weight, followed by Free Trade, Sound Money, Government Size, and Property Rights. The examination of relationships across quartiles uncovered nuances in how these indicators interact in different economic contexts. The scatterplot of Regulation against the economic freedom score affirmed a strong positive relationship but indicated a potential convergence of economic factors in these groups. To further explore the clustering of observations within each quartile, dendrograms were employed for each economic indicator. The dendrograms visually highlighted the grouping of observations, revealing distinctive clusters for each quartile. Notably, Quartile 2 demonstrated lower similarity and exhibited greater diversity across all indicators except Sound Money, where Quartile 4 showed this pattern instead.

With different regions of the world showing different changes in economic freedom, it is clear that there are different forces at work within each region. North America was the only region that showed a decline during the 1970-2021 time period, yet it still remains the region with the highest economic freedom. This indicates that there is a relatively high level of economic freedom in this region, but that it may not remain this way in the future. South Asia and Sub-Saharan Africa showed the greatest growth, suggesting that there was room for significant economic development in these regions, especially since their 1970 economic freedom index was among the lowest of all regions. However, the economic freedom indexes in these regions are still lagging behind North America and Europe & Central Asia, the two most developed regions in 2021. The economic freedom reversal in the Middle East & North Africa is fascinating, as it indicates that there were recent challenges in growing economic freedom in that region.

Limitations

In the investigation of gender disparity and other social variables, we focused our visualizations on coloring by quartile. However, the distribution of the economic freedom index is left skewed, which means that there is a long left tail. This may affect our visualizations, especially for the fourth quartile: the large variation in gender disparity seen for the fourth quartile in the plot of the economic freedom index against the gender disparity index could be partly accounted for by this skew. Additionally, simply breaking up all the countries into four quartiles may be a bit simplistic, as it means that countries with largely differing economic freedom indices are grouped together. To more accurately represent the relationship between the economic freedom index and other social variables, it may be better to work with more groups than just the four quartiles given by the dataset.
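One way to try the finer grouping suggested above is to recompute quantile groups directly from the scores. This is a sketch on invented scores; the function name is hypothetical, and the labeling follows the convention used in this report, where group 1 holds the highest economic freedom.

```python
from bisect import bisect_right
from statistics import quantiles

# Hypothetical economic freedom scores; not real data.
scores = [4.1, 5.0, 5.6, 6.0, 6.3, 6.6, 6.9, 7.1, 7.4, 7.8, 8.0, 8.3]


def assign_groups(values, n_groups):
    """Label each value 1..n_groups, where 1 is the highest-scoring group."""
    cuts = quantiles(values, n=n_groups)  # n_groups - 1 cut points
    return [n_groups - bisect_right(cuts, v) for v in values]


quartile_labels = assign_groups(scores, 4)
sextile_labels = assign_groups(scores, 6)

for score, q, s in zip(scores, quartile_labels, sextile_labels):
    print(f"score={score:.1f}  quartile={q}  sextile={s}")
```

With more groups, the long left tail is split across several bins instead of being pooled into a single fourth quartile, at the cost of fewer countries per group.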

There were some limitations to take into consideration for the economic indicator analysis. First, our reliance on quartiles as a grouping mechanism may mask heterogeneity within each quartile, potentially overlooking variations in economic contexts. Moreover, the dendrograms, while providing a visual representation of clustering, lack precise quantitative interpretations and may be sensitive to the chosen clustering method. In this analysis, we focused on one specific year. In the future, this analysis could be done using several years to capture potential changes in economic dynamics over time.

There are certainly limitations to the time series analysis we performed on each region’s economic freedom index. The most pressing one we would like to address in future studies is that our analysis does not explain why North America experienced negative change or why the Middle East & North Africa experienced a reversal in its growth trajectory. A loss in economic freedom is concerning, and we have identified regions of the world that have experienced just that. As such, we would like to investigate why this pattern occurred in these two regions, and what can be done to counter it.