Decoding Political Stability Through World Bank Data

Abstract

Given the rising perception of political instability around the world, we sought to better understand the different aspects of political stability by analyzing the World Bank’s World Development Indicators dataset. We explored how four indicators (internet access, legal rights, tax revenue, and fertility) relate to political stability. Our analysis suggests a positive trend between political stability and internet access for low-income countries but not necessarily for other income groups. Similarly, the association between legal rights and political stability varies with the country’s geographic region. In contrast, tax revenue appears to be positively correlated with political stability regardless of income group. Finally, there seems to be a negative relationship between fertility and political stability. We note that our analysis was limited to correlational relationships and to incomplete data. Future research could study the many other available indicators and examine causal relationships.

Introduction

As the 2024 U.S. presidential election approaches, heightened campaign activities reflect the increasing political divisions within the country. This perceived decline in political stability, however, is not exclusive to the United States. Globally, there’s a growing sense of instability, marked by escalating social unrest and conflicts. To understand this phenomenon better, we turn to the World Bank’s World Development Indicators dataset, which provides a comprehensive overview of political stability across various nations.

Our analysis begins with a comparative visualization of global political stability in two distinct years, 2013 and 2021. This map, presented below, reveals subtle yet significant shifts in the political landscape over this eight-year period, indicating a general trend towards decreased stability.

Digging deeper, we explore how socioeconomic indicators interact and vary across different income groups. Using a PCA biplot analysis, we examine the relationships between various indicators (such as Internet access, GDP per capita, and urbanization) and how they cluster by income group. This visualization provides insights into the underlying factors that may contribute to a country’s political stability or lack thereof.

Our analysis often revealed differences by how wealthy a country is and by region of the world. With that in mind, we felt it would be useful to first visualize how a country’s income group intersects with its region.

Our investigation is guided by four key research questions, each aimed at uncovering different aspects of political stability:

How does internet usage correlate with political stability?
What is the impact of strong legal rights on political stability?
How are a nation’s tax revenue and its political stability interconnected?
Finally, what is the relationship between fertility rates and political stability?

Through these inquiries, we aim to paint a clearer picture of the dynamics of political stability, exploring its multifaceted nature and the various socioeconomic factors that influence it.

Dataset

The dataset was sourced from the CMU S&DS Data Repository, which hosts various datasets for educational purposes. The specific dataset we used, World Development Indicators, features 46 variables for 266 countries and regions across the years 2013 to 2022. Each row holds all observations for one country in one year. After removing the year 2022 because of missing data for many countries, removing sub-national territories, and adding a few columns with certain variables calculated per capita, we ended up with 1773 observations of 54 variables.
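The cleaning steps described above can be sketched in base R. This is only an illustration under stated assumptions: the data frame `wdi_raw` and the columns `Country`, `Year`, `Population`, and `Mobile` are hypothetical stand-ins, since the report does not show the actual code or raw column names.

```r
# Hypothetical toy stand-in for the raw data; the real column names may differ.
wdi_raw <- data.frame(
  Country    = rep(c("A", "B"), each = 3),
  Year       = rep(2020:2022, times = 2),
  Population = c(100, 110, 120, 200, 210, 220),
  Mobile     = c(80, 99, NA, 150, 168, NA)
)

# Drop 2022, the year with missing values for many countries.
wdi <- subset(wdi_raw, Year != 2022)

# Add a per-capita version of a count variable.
wdi$MobilePerCapita <- wdi$Mobile / wdi$Population

nrow(wdi)
```

Filtering the year before deriving new columns keeps the incomplete 2022 rows from producing missing per-capita values.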

Although the World Development Indicators include data stretching back to 1960, we chose to focus on the ten years in the CMU Data Repository. We felt that drawing conclusions over such a long stretch of time could introduce inaccuracies: the methodology for collecting the data has likely changed over time, and many variables that are now present were not measured initially. Even within these ten years, many observations were missing for certain variables, which led us to set aside some variables that seemed potentially interesting and to focus on those with better coverage. We felt that an analysis requiring us to leave out that many rows of data had the potential to produce flawed conclusions.

Variables

The primary variables we focused on in our analysis were:

PoliticalStability: Political stability and absence of violence/terrorism score (normalized as a z score, ranging from about -2.5 to about 2.5). It measures perceptions of the likelihood of political instability or political violence, including terrorism, as reported by surveys and NGOs.

LegalRights: Strength of legal rights index (0-12), specifically referring to the degree to which collateral and bankruptcy laws protect the rights of borrowers and lenders and thus facilitate lending, with higher scores indicating more access to credit.

Fertility: Fertility rate, total (births per woman)

GDP: Gross domestic product (measured in current US$)

TaxRevenue: Tax revenue (as a percentage of GDP)

Internet: Individuals using the Internet (% of population). Measured as the percentage of individuals who have used the internet from any location in the last three months.

Region: Geographic region the country is located in.

Income.group: Country’s income group as of 2023 (high-income, upper-middle-income, lower-middle-income, or low-income), based on its gross national income (GNI), as defined by the World Bank.

Other variables that we used include:

MobilePerCapita: Mobile cellular subscriptions per capita

PM2.5: PM2.5 air pollution, mean annual exposure (micrograms per cubic meter)

Urban: Urban population (as a percent of total population)

CO2Emissions: CO2 emissions (metric tons per capita)

Electricity: Access to electricity (% of population)

GenderEquality: CPIA gender equality rating, measuring whether the government has programs and institutions to promote gender equality in education, health, the economy, and the law (1=low to 6=high).

TelephonePerCapita: Telephone subscriptions per capita

Research Questions

How Does Internet Usage Correlate with Political Stability?

In the modern era, Internet access has become a crucial factor in global connectivity and socio-economic development. This analysis investigates the correlation between Internet access and political stability, particularly focusing on the period from 2013 to 2021. During this time, many countries have experienced significant gains in Internet penetration, a transformation that has potential implications for political stability.

Visualizing Internet Access Over Time

To contextualize these changes, we visualized Internet access in 2013 and 2021:

Analyzing the Correlation with Political Stability

We fit a simple linear regression of political stability on Internet penetration, pooling all income groups:

## 
## Call:
## lm(formula = PoliticalStability ~ Internet, data = world_bank_internet)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4647 -0.4447  0.1148  0.5422  2.1415 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.0742229  0.0405780  -26.47   <2e-16 ***
## Internet     0.0188422  0.0006601   28.54   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7867 on 1674 degrees of freedom
## Multiple R-squared:  0.3274, Adjusted R-squared:  0.327 
## F-statistic: 814.7 on 1 and 1674 DF,  p-value: < 2.2e-16

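The results indicate a statistically significant positive association between Internet access and political stability (p < 2e-16): each additional percentage point of Internet use is associated with an increase of about 0.019 in the stability score. With an R-squared of 0.327, Internet access alone accounts for roughly a third of the variation in political stability. This suggests that increased Internet access is generally associated with higher levels of political stability.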
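To make the size of this association concrete, the fitted line can be evaluated at a few Internet-use levels. The sketch below uses only the two coefficients printed in the summary above; the helper name `predict_stability` is ours and is not part of the original analysis.

```r
# Coefficients copied from the regression summary above.
intercept <- -1.0742229
slope     <-  0.0188422

# Fitted stability score at a given Internet share (% of population).
predict_stability <- function(internet_pct) intercept + slope * internet_pct

# Fitted scores at 10%, 50% and 90% Internet use.
round(predict_stability(c(10, 50, 90)), 2)

# Internet share at which the fitted score crosses zero.
round(-intercept / slope, 1)
```

These are points on the pooled regression line, not predictions for any particular country, and they describe an association rather than a causal effect.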

Dissecting the Relationship by Income Group

However, the relationship between Internet access and political stability is not uniform across income groups:

Our analysis reveals that while low-income countries tend to show increased political stability with greater Internet access, this trend either plateaus or reverses in higher-income groups. This differentiation suggests that the impact of Internet access on political stability varies, influenced by the economic context of each country.

What Is the Impact of Strong Legal Rights on Political Stability?

Next, we explored the relationship between the strength of a country’s legal rights and its political stability. We expected countries with stronger laws protecting the rights of borrowers and lenders to have greater economic stability and, thus, greater political stability. This hypothesis is consistent with the correlation coefficient between LegalRights and PoliticalStability, 0.26, which suggests a weak positive relationship between the two variables: countries with more robust legal rights appear to have higher political stability. However, we also wanted to consider that this relationship may differ depending on other variables, like a country’s income group or geographical region. When we accounted only for Income.group, we did not observe any difference in the relationship between legal rights and political stability. But we did observe an interesting trend when we took Region into account.

The scatterplot above shows PoliticalStability against LegalRights. Each point represents a country in a particular year, from 2013 to 2021. The trend lines are linear regressions of PoliticalStability on LegalRights for each Region. For most geographical regions, we observed the same trend as before: political stability appeared to increase with stronger legal rights. However, three regions deviated: North America, South Asia, and Europe & Central Asia. In these regions, political stability appeared to decrease as legal rights strengthened. This reversal may be due to noise. For example, there are only two countries in the North American region, which is too few to be a representative sample, and the small sample size may have produced the negative relationship. However, the sample sizes for South Asia and Europe & Central Asia are larger, so this explanation is less likely there. The alternative explanation is that the relationship between political stability and the strength of legal rights genuinely varies by region. Perhaps in some regions, greater access to credit leads to more misuse, economic repercussions, and political instability.

To determine whether the relationship between political stability and legal rights depends on region, we tested whether the interaction terms between LegalRights and Region are significant in a linear regression model. The regression model summary, shown below, reveals that all the interaction terms except the one for the Middle East & North Africa are statistically significant at the $\alpha$ = 0.05 level, with East Asia & Pacific as the base region. Because these interaction terms are significant, we reject the null hypothesis that there is no interaction between legal rights and geographical region. Overall, this suggests that while political stability is associated with the strength of legal rights within a country, the association differs by geographical region.

## 
## Call:
## lm(formula = PoliticalStability ~ LegalRights * Region, data = .)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.24458 -0.51548  0.07777  0.50222  2.17946 
## 
## Coefficients:
##                                              Estimate Std. Error t value
## (Intercept)                                  -0.48529    0.13146  -3.691
## LegalRights                                   0.13255    0.01754   7.557
## RegionEurope & Central Asia                   1.18783    0.16925   7.018
## RegionLatin America & Caribbean               0.50777    0.16409   3.095
## RegionMiddle East & North Africa             -0.57798    0.15965  -3.620
## RegionNorth America                           4.63782    2.14733   2.160
## RegionSouth Asia                              0.37733    0.25015   1.508
## RegionSub-Saharan Africa                     -0.30995    0.16681  -1.858
## LegalRights:RegionEurope & Central Asia      -0.19240    0.02415  -7.966
## LegalRights:RegionLatin America & Caribbean  -0.11101    0.02417  -4.594
## LegalRights:RegionMiddle East & North Africa -0.03619    0.03771  -0.960
## LegalRights:RegionNorth America              -0.46910    0.21399  -2.192
## LegalRights:RegionSouth Asia                 -0.28389    0.04153  -6.836
## LegalRights:RegionSub-Saharan Africa         -0.09611    0.02559  -3.756
##                                              Pr(>|t|)    
## (Intercept)                                  0.000232 ***
## LegalRights                                  7.84e-14 ***
## RegionEurope & Central Asia                  3.65e-12 ***
## RegionLatin America & Caribbean              0.002014 ** 
## RegionMiddle East & North Africa             0.000306 ***
## RegionNorth America                          0.030973 *  
## RegionSouth Asia                             0.131700    
## RegionSub-Saharan Africa                     0.063378 .  
## LegalRights:RegionEurope & Central Asia      3.60e-15 ***
## LegalRights:RegionLatin America & Caribbean  4.79e-06 ***
## LegalRights:RegionMiddle East & North Africa 0.337476    
## LegalRights:RegionNorth America              0.028547 *  
## LegalRights:RegionSouth Asia                 1.26e-11 ***
## LegalRights:RegionSub-Saharan Africa         0.000181 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.798 on 1274 degrees of freedom
## Multiple R-squared:  0.3362, Adjusted R-squared:  0.3295 
## F-statistic: 49.64 on 13 and 1274 DF,  p-value: < 2.2e-16
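The region-specific slopes implied by the interaction model above can be recovered by adding each interaction coefficient to the base slope. The following is a minimal sketch in R that uses only the estimates printed in the summary; the variable names are ours.

```r
# Coefficients copied from the interaction model summary above.
base_slope <- 0.13255  # LegalRights slope in the base region, East Asia & Pacific
interaction <- c(
  "Europe & Central Asia"      = -0.19240,
  "Latin America & Caribbean"  = -0.11101,
  "Middle East & North Africa" = -0.03619,
  "North America"              = -0.46910,
  "South Asia"                 = -0.28389,
  "Sub-Saharan Africa"         = -0.09611
)

# Slope within a region = base slope + that region's interaction term.
region_slopes <- c("East Asia & Pacific" = base_slope, base_slope + interaction)
round(region_slopes, 3)

# Regions where the fitted slope is negative.
names(region_slopes)[region_slopes < 0]
```

The three regions with negative fitted slopes are the same three whose trend lines slope downward in the scatterplot discussion: Europe & Central Asia, North America, and South Asia.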

How Are a Nation’s Tax Revenue and Its Political Stability Interconnected?

## 
## Call:
## lm(formula = PoliticalStability ~ TaxRevenue + Income.group + 
##     TaxRevenue * Income.group, data = world_bank_gdptax)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.17968 -0.36071  0.02409  0.36949  1.73128 
## 
## Coefficients:
##                                             Estimate Std. Error t value
## (Intercept)                                 0.496660   0.093984   5.285
## TaxRevenue                                  0.011809   0.004468   2.643
## Income.groupUpper middle income            -1.521310   0.141316 -10.765
## Income.groupLower middle income            -1.274920   0.124106 -10.273
## Income.groupLow income                     -2.662334   0.188168 -14.149
## TaxRevenue:Income.groupUpper middle income  0.043258   0.007543   5.735
## TaxRevenue:Income.groupLower middle income  0.015602   0.006410   2.434
## TaxRevenue:Income.groupLow income           0.082158   0.013101   6.271
##                                            Pr(>|t|)    
## (Intercept)                                1.51e-07 ***
## TaxRevenue                                  0.00832 ** 
## Income.groupUpper middle income             < 2e-16 ***
## Income.groupLower middle income             < 2e-16 ***
## Income.groupLow income                      < 2e-16 ***
## TaxRevenue:Income.groupUpper middle income 1.25e-08 ***
## TaxRevenue:Income.groupLower middle income  0.01509 *  
## TaxRevenue:Income.groupLow income          5.06e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6282 on 1149 degrees of freedom
## Multiple R-squared:  0.4965, Adjusted R-squared:  0.4934 
## F-statistic: 161.9 on 7 and 1149 DF,  p-value: < 2.2e-16

We also wanted to examine how political stability may be related to other economic indicators like tax revenue and GDP. The scatterplot and linear regression above show a positive correlation between tax revenue and political stability. The slope of the regression line is greater for the lower-income groups, which are generally less politically stable, while high-income and generally more politically stable countries see a less dramatic increase in stability with increases in tax revenue. All coefficients in this regression are statistically significant at the 0.05 level, most with very low p-values, and the $R^{2}$ of the regression is 0.497 (adjusted 0.493). Overall, an increase in tax revenue appears to offer diminishing returns for the highest-income, most stable countries in comparison to lower-income countries.
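The slope for each income group follows from the summary above by adding the group’s interaction term to the TaxRevenue coefficient. High income does not appear among the coefficients, so it is the reference level. A short sketch using only the printed estimates, with variable names of our own choosing:

```r
# Coefficients copied from the tax revenue model summary above.
tax_base_slope <- 0.011809  # TaxRevenue slope for the reference group, High income
tax_interaction <- c(
  "Upper middle income" = 0.043258,
  "Lower middle income" = 0.015602,
  "Low income"          = 0.082158
)

# Slope within a group = reference slope + that group's interaction term.
income_slopes <- c("High income" = tax_base_slope, tax_base_slope + tax_interaction)
round(income_slopes, 3)

# Every group has a positive slope; this names the steepest one.
names(which.max(income_slopes))
```

All four slopes are positive, with the high-income slope the smallest and the low-income slope the largest, although the ordering between the two middle-income groups is not monotonic.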

We also took a look at how GDP is associated with political stability. We believed that countries with higher GDPs would generally be more politically stable, but were curious how this relationship might compare to tax revenue, and whether or not countries with high GDPs also have proportionally high tax revenue. To examine the relationship between political stability, tax revenue, and GDP, we created a biplot. A biplot allows us to make comparisons between multiple variables at the same time (here, three), in a way that linear regressions do not. Interestingly, the biplot shows that there is virtually no correlation between GDP and political stability as the vectors associated with these two variables nearly form a 90 degree angle. GDP also appears to form an angle close to 90 degrees with tax revenue.

This graph also confirms that tax revenue is positively correlated with political stability as the vectors associated with these two variables form a very small angle. Again, it is fairly easy to see the stratification of countries by income level here, with the highest-income countries generally having high tax revenue, high raw GDP, and high political stability, with all three generally decreasing as the country’s income level decreases.
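The way biplot angles are read here can be illustrated on synthetic data. The sketch below does not use the report’s data: it builds a toy data frame in which TaxRevenue is constructed to correlate with PoliticalStability while GDP is generated independently, then measures the angle between the loading arrows in the plane of the first two principal components. The helper `arrow_angle` is ours.

```r
set.seed(1)
n <- 200

# Synthetic data: tax revenue tracks stability, GDP is independent noise.
stability <- rnorm(n)
toy <- data.frame(
  PoliticalStability = stability,
  TaxRevenue         = 0.8 * stability + rnorm(n, sd = 0.6),
  GDP                = rnorm(n)
)

# PCA on standardized variables, as is usual before drawing a biplot.
pca <- prcomp(toy, center = TRUE, scale. = TRUE)
# biplot(pca)  # uncomment to draw the biplot itself

# Angle in degrees between two variables' arrows in the PC1-PC2 plane.
arrow_angle <- function(pca, a, b) {
  va <- pca$rotation[a, 1:2]
  vb <- pca$rotation[b, 1:2]
  acos(sum(va * vb) / sqrt(sum(va^2) * sum(vb^2))) * 180 / pi
}

arrow_angle(pca, "PoliticalStability", "TaxRevenue")  # small angle
arrow_angle(pca, "PoliticalStability", "GDP")         # close to 90 degrees
```

Reading correlation from these angles is an approximation that works best when the first two components capture most of the variance, so the pairwise correlations themselves are worth checking alongside the plot.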

What Is the Relationship Between Fertility Rates and Political Stability?

In this section, we try to answer the question: How is fertility related to political stability based on income groups?

The contour plot has one major mode, where political stability is high and fertility is low. This means that the densest cluster of observations has high political stability (around 1) and low fertility (around 2). Looking at the income groups, this is where the majority of the high-income group lies. The contours do not extend to high fertility values but do reach into a lower political stability range (down to -1). That range is mainly made up of the lower- and upper-middle-income groups, which have similar fertility rates and tend to experience lower political stability. The contour plot also has a minor mode, where both political stability and fertility are moderate; this cluster consists mostly of low-income and lower-middle-income countries. The higher income groups dominate the major mode, as most of the data points there are green or yellow.

## 
## Call:
## lm(formula = Fertility ~ PoliticalStability, data = world_bank_fertility)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.7793 -0.7328 -0.2747  0.7192  3.7494 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         2.64000    0.02730   96.70   <2e-16 ***
## PoliticalStability -0.77032    0.02772  -27.79   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.132 on 1737 degrees of freedom
## Multiple R-squared:  0.3078, Adjusted R-squared:  0.3074 
## F-statistic: 772.4 on 1 and 1737 DF,  p-value: < 2.2e-16

Based on the linear regression of fertility on political stability, both the intercept and the slope are significant (their p-values are less than 2e-16). The intercept is 2.64 and the slope is -0.77, meaning that each one-unit increase in political stability is associated with a drop of 0.77 births per woman in fertility. This matches the general downward trend between political stability and fertility seen in the contour plot.

## # A tibble: 4 × 5
## # Groups:   Income.group [4]
##   Income.group        estimate std.error statistic  p.value
##   <fct>                  <dbl>     <dbl>     <dbl>    <dbl>
## 1 High income          -0.316     0.0376    -8.41  3.77e-16
## 2 Upper middle income  -0.0822    0.0437    -1.88  6.08e- 2
## 3 Lower middle income  -0.0388    0.0594    -0.654 5.14e- 1
## 4 Low income           -0.320     0.0743    -4.31  2.34e- 5

However, when we fit a separate linear regression between political stability and fertility for each income group, not all of the slopes are significant. The slopes for the high-income and low-income groups are significant at the 0.05 level, but those for the upper-middle-income and lower-middle-income groups are not. For the two middle-income groups we therefore cannot rule out a slope of 0, although all four estimates are negative. For the high-income and low-income groups, the evidence points to a negative slope: as political stability increases, fertility decreases.
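A non-significant slope is easier to interpret alongside its confidence interval. The sketch below fits one regression per income group in base R and reports the slope, a 95% interval, and the p-value. It runs on synthetic data, and it assumes the per-group models have the same form as the pooled model (Fertility regressed on PoliticalStability), which the report does not state explicitly.

```r
set.seed(42)
groups <- c("High income", "Upper middle income", "Lower middle income", "Low income")

# Synthetic data: a negative slope in two groups, no slope in the other two.
toy <- do.call(rbind, lapply(groups, function(g) {
  stability  <- rnorm(100)
  true_slope <- if (g %in% c("High income", "Low income")) -0.3 else 0
  data.frame(
    Income.group       = g,
    PoliticalStability = stability,
    Fertility          = 2.5 + true_slope * stability + rnorm(100, sd = 0.5)
  )
}))

# One regression per group; collect slope, 95% interval and p-value.
by_group <- do.call(rbind, lapply(split(toy, toy$Income.group), function(d) {
  fit <- lm(Fertility ~ PoliticalStability, data = d)
  ci  <- confint(fit)["PoliticalStability", ]
  data.frame(
    Income.group = d$Income.group[1],
    estimate     = coef(fit)[["PoliticalStability"]],
    conf.low     = ci[[1]],
    conf.high    = ci[[2]],
    p.value      = summary(fit)$coefficients["PoliticalStability", "Pr(>|t|)"]
  )
}))
by_group
```

An interval that spans zero shows the range of slopes still compatible with the data, which is more informative than treating the slope as exactly zero.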

Conclusion

The correlation graph below shows some of the main variables that we considered. The wider and darker a line, the more strongly the two variables it connects are correlated. Red lines indicate a negative correlation, while blue lines indicate a positive correlation. All the trends that we found were supported by this correlation graph as well. However, this graph does not account for confounding variables, meaning that the associations we see may be skewed by other, unmeasured factors.

To summarize our findings, we noticed a positive correlation between internet access and political stability, which matches our initial assumptions, because internet access allows for the free flow of information as well as reliable communication. Additionally, high political stability is a good indicator of economic stability, meaning that there is more opportunity for investment in infrastructure and, more specifically, in internet access. This trend was clearest for low-income countries and plateaued or reversed in higher-income groups. With legal rights, we conclude that the region a country is in affects how legal rights relate to political stability. There was a mixture of positive and negative correlations between legal rights and political stability across regions, which may be explained by differing economic conditions. For tax revenue, there appears to be a positive trend: as tax revenue increases, so does political stability. Separating by income group, we see the same positive trend. This is likely because higher-income countries are associated with higher political stability, and with higher income comes higher tax revenue. Regarding fertility, there appears to be a negative relationship: as political stability increases, fertility rates decrease. Within income groups, the estimated slope is negative for every group, but it is statistically significant only for the high-income and low-income groups. Many of the trends that we found are rooted in social outcomes, which our tests helped reveal.

Future Work

Establishing Causal Links

Our study has highlighted key correlations between socio-economic indicators and political stability. The next step involves investigating causality. Due to the complexities and ethical concerns in conducting randomized controlled trials in this context, we suggest utilizing advanced statistical methods to mitigate confounding factors. This approach, albeit challenging, is crucial given the myriad variables influencing political stability.

Investigating Under-Analyzed Variables

Our analysis was limited by data availability, particularly for variables like GenderEquality and GenderEducation. These factors are presumed to significantly impact political stability. Future research should aim to incorporate more comprehensive data sets to explore these dynamics in greater depth.

Exploring the Role of Air Pollution

An unexpected finding of our study is the negative correlation between air pollution (PM2.5) and political stability. This, coupled with the weak link between CO2Emissions and PM2.5, suggests distinct environmental policies or technological advancements across countries. Further research could unpack these relationships, offering new perspectives on environmental impact and governance.

Internet Access versus Economic Wealth

Our findings indicate a stronger correlation of Internet with positive development indicators than GDPPerCapita. This suggests that, in today’s digital age, internet accessibility may play a more pivotal role in socio-economic development than traditional economic measures. Investigating this relationship could yield insights into the changing drivers of global development.

Mobile versus Telephone Usage Dynamics

We observed a surprising lack of correlation between MobilePerCapita and TelephonePerCapita. Additionally, MobilePerCapita correlates with GenderEducation but not with GDPPerCapita, whereas TelephonePerCapita shows the opposite trend. This may reflect a leapfrogging phenomenon in technology adoption, particularly in developing countries. Investigating this trend can shed light on how advancements in communication technology influence development patterns.