Suicide Rate: Visualization and Interpretation of Factors That Play Into Suicide Rates Across the World



Loading the Dataset

Description of Dataset and Variables

The suicide rate dataset captures 12 variables for 27,821 observations. We found this dataset on Kaggle, where it was compiled from four other sources: UN Development Program (2018), World Bank (2018), Szamil (2017), and WHO (2018). It was built to find signals correlated with increased suicide rates among different cohorts globally and across the socioeconomic spectrum.

Each observation corresponds to one group of suicides that share the same country, year, sex, age group, and generation. Of the 12 variables, 6 are categorical:

  • country: country of the suicide (nominal)
  • year: year of the suicide (ordinal)
  • sex: male/female (nominal)
  • age: 6 age groups (ordinal)
  • country-year: combination of the previous variables country and year
  • generation: 6 generation groups based on age grouping average (ordinal)

The other 6 are quantitative:

  • suicideno: number of suicides for the group
  • population: population of the group that shares country, year, sex, age
  • suiciderate: suicides per 100k population
  • HDI: Human Development Index measure for the year
  • GDP: Gross domestic product measure ($)
  • GDPcap: Gross domestic product per person
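The relationship between suicideno, population, and suiciderate listed above can be written as a one-line formula. The following is a minimal Python sketch, not the code used in this project; the function name and the example numbers are hypothetical and chosen only for illustration.

```python
def rate_per_100k(suicides: int, population: int) -> float:
    """Suicides per 100,000 people for one country-year-sex-age group."""
    if population <= 0:
        raise ValueError("population must be positive")
    return suicides / population * 100_000


# Hypothetical group: 21 suicides in a population of 312,900.
example_rate = round(rate_per_100k(21, 312_900), 2)
print(example_rate)
```

Because the rate is normalized by population, groups of very different sizes can be compared directly, which is why the analysis focuses on suiciderate rather than suicideno.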

We mainly want to focus on suicide rate and the variables associated with higher or lower rates. Since suicide is a very sensitive topic and an issue that has plagued every society, visualizing general trends can inform suicide prevention, specifically where resources should be allocated.

Research Questions

Some research questions we have are:

  • What is the relationship between year and suicide rate? Is this relationship dependent on the generation an individual is from?
  • Does suicide rate depend on age for each sex? If so, how?
  • Does the rate of suicide change depending on the country’s socioeconomic status? Do individual attributes contribute to this as well? In other words, what is the relationship between suicide rate and GDP per capita given age and sex?
  • For each generation, what is the general trend of suicide rate? Is there a pattern within or between groups?
  • Globally, is there a relationship between country and suicide rate? Which areas have the highest or lowest suicide rates?

Graph 1: Time Series Plot of Suicide Rate

To start, we visualized the relationship between suicide rate and year. Using a time series plot, we were able to measure the rate of suicide over the years from 1985 to 2016.

This time series plot displays the overall trend of suicide rate from 1985 to 2016. The x-axis represents the year and the y-axis represents the suicide rate. We can see that the rate generally increases from 1985 to 1995, decreases from 1995 to 2015, then returns to an increasing trend after 2015. The rate peaks in 1995 at just above 15.5, while the lowest rates appear to be in 2011 and 2014 at around 11. It is interesting that the line follows a trend without significant outliers. We could further look at other variables, such as generation, to see what may have contributed to the increase and decrease of suicide rates, and why the rate was so high in 1995.
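A yearly rate like the one plotted here can be aggregated in more than one way, and the choice affects the values on the y-axis. The sketch below is an illustration under stated assumptions rather than the project's actual code: it compares the unweighted mean of group rates with a pooled rate (total suicides over total population). The rows and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (year, suicideno, population) for individual groups.
rows = [
    (1995, 30, 200_000),
    (1995, 10, 800_000),
    (2011, 12, 400_000),
    (2011, 8, 600_000),
]


def yearly_rates(rows):
    """Return {year: (mean_of_group_rates, pooled_rate)}, both per 100k."""
    group_rates = defaultdict(list)
    totals = defaultdict(lambda: [0, 0])  # year -> [suicides, population]
    for year, suicides, population in rows:
        group_rates[year].append(suicides / population * 100_000)
        totals[year][0] += suicides
        totals[year][1] += population
    return {
        year: (
            sum(group_rates[year]) / len(group_rates[year]),
            totals[year][0] / totals[year][1] * 100_000,
        )
        for year in sorted(group_rates)
    }


rates = yearly_rates(rows)
for year, (mean_rate, pooled_rate) in rates.items():
    print(year, round(mean_rate, 3), round(pooled_rate, 3))
```

The unweighted mean gives every group equal influence regardless of its size, while the pooled rate weights each group by its population, so small groups with extreme rates pull the two measures apart.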

Graph 2: Facetted Scatterplot of Suicide Rate and Year by Generation

To better understand the increase and decrease of suicide rate over time, we further dissected the rate of suicide by generations. By isolating each generation, we are able to see the general trends within each group.

From this graph, we can see that G.I. Gen seems to have the highest suicide rate and Gen Z seems to have the lowest. A worrisome trend within all generations is that suicide rates seem to increase over time. Between generations, however, suicide rates tend to decrease, with each generation having overall lower suicide rates than the one before it. This may be attributed to the de-stigmatization of mental health issues and an increase in resources dedicated to suicide prevention.

Graph 3: Pairs Plot of Suicide Rate, Population, Generation

Next, we explored the relationship between suicide rate, generation, and population. We created a pairs plot to visualize all 3 variables, which gives us a general overview of whether the rate of suicide depends on the size and generation of societal groups.

The marginal distributions of suiciderate and population are both right-skewed. Both resemble an exponential decay: the distribution drops drastically to values very close to 0. From the stacked bar plot at the bottom left, we see that each generation contains equal counts of male and female groups. However, in the side-by-side boxplot at the top right, we see that males have a higher median than females in each generation. Males also have a larger interquartile range and a wider overall range. Also, 50 to 70% of males have higher suicide rates. For both genders, the distribution is skewed left. Further, we see that the correlation between population and suicide rate is very low, even within each sex. This is not surprising: the size of a population does not provide meaningful information about the rate of suicide, since countries with smaller populations are as likely as countries with larger populations to be plagued by suicide.
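The correlation between population and suicide rate reported in the pairs plot is a Pearson correlation coefficient. As a hedged illustration of how such a value is computed, here is a small standard-library Python sketch; the data points are hypothetical and are not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Hypothetical group populations and suicide rates per 100k.
populations = [50_000, 200_000, 1_000_000, 5_000_000, 20_000_000]
suicide_rates = [12.0, 3.5, 15.2, 8.1, 10.4]

r = pearson(populations, suicide_rates)
print(round(r, 3))
```

A value near 0 means that knowing a group's population says little about its suicide rate in a linear sense; it does not rule out non-linear relationships.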

Graph 4: Stacked Histogram of Suicide Rate by Age, Facetted by Sex

We now move on to visualize the relationship between individual attributes and the rate of suicide. To start, we created a stacked histogram of suicide rate by age, facetted by sex. This lets us analyze suicide rate across age and sex and ask whether an individual’s identity is a potential factor in higher or lower suicide rates.

For both females and males, the most common suicide rate is 0, which is good news. A suicide rate of 0, however, is much more frequent for females than for males: the bar for females almost reaches a count of 8000, while that for males is less than 4000. While the suicide rates of the age groups are generally similar, the proportion of suicide rates at 0 is the largest for the 5-14 years age group for both sexes, and this age group is almost nonexistent at any higher rate. This is reasonable since this age group is typically too young to be committing or considering suicide.
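The claim about the share of zero rates can be checked with a simple tally per sex and age group. The following Python sketch assumes hypothetical rows of (sex, age group, suicide rate); it is meant only to show the computation, not to reproduce the figure.

```python
from collections import Counter

# Hypothetical rows: (sex, age group, suicide rate per 100k).
rows = [
    ("female", "5-14 years", 0.0),
    ("female", "5-14 years", 0.0),
    ("female", "5-14 years", 0.4),
    ("female", "35-54 years", 0.0),
    ("female", "35-54 years", 5.2),
    ("male", "5-14 years", 0.0),
    ("male", "5-14 years", 0.9),
    ("male", "35-54 years", 18.3),
    ("male", "35-54 years", 24.1),
]


def zero_share(rows):
    """Fraction of groups with a suicide rate of exactly 0, per (sex, age)."""
    totals = Counter()
    zeros = Counter()
    for sex, age, rate in rows:
        key = (sex, age)
        totals[key] += 1
        if rate == 0:
            zeros[key] += 1
    return {key: zeros[key] / totals[key] for key in totals}


shares = zero_share(rows)
for key, share in sorted(shares.items()):
    print(key, round(share, 2))
```

Comparing shares rather than raw counts keeps the comparison fair when the groups contain different numbers of observations.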

Graph 5: Scatterplot of HDI and Suicide Rate Colored by GDP per Capita and Facetted by Sex

To measure suicide rates across countries, we start by visualizing HDI and GDP per capita. We also wanted to see if there was a difference between the sex groups in order to better understand how suicide rate varies across societies.

We wanted to examine the relationship between the rate of suicide and the Human Development Index (HDI), a statistical tool used to measure a country’s overall achievement in social and economic dimensions. We used sex and GDP per capita to identify which groups stood out the most. We see that as HDI increases, GDP per capita also increases, and we see an overall higher suicide rate in males. Up to an HDI of about 0.8, the highest suicide rates occur at a GDP per capita of around 75,000. Towards an HDI of 0.9, most of the data points come from groups whose country’s GDP per capita is around 100,000 or higher, and these groups show lower suicide rates than groups from countries with lower GDP per capita. Overall, females have a lower suicide rate throughout the HDI range, and the groups with the highest suicide rates are male, with a GDP per capita between 25,000 and 75,000 and an HDI between 0.6 and 0.85.

Graph 6: Mean International Suicide Rate per Year Colored by Sex

Now, we look at how the rate of suicide varies globally across year and sex. The stacked histogram, though similar to our time series plot, further explores the question of how suicide rate is affected by individual attributes across time.

This plot gives us a very general, big-picture perspective of how suicide rates (on an international scale) compare between the sexes and how they change over time. We can see that 1995 and 2016 are both years with higher-than-normal suicide rates. After 1995, there seems to be a gradual decline in suicide rates. We can also see that, overall, there are significantly more male suicides than female suicides internationally.

Graph 7: Mean Suicide Rate per Country in 2000

Next, we aim to visualize how the mean suicide rate compares across different countries. For the purposes of this graph, we will use only one year, so that averages are representative of contemporary values and allow for clearer comparisons between countries. We will arbitrarily select 2000.

The map shows the mean suicide rate of each country, encoded by the size and color of the points. It’s important to note that we are missing data from many countries in Asia and Africa, which explains the higher volume of points in the Americas and Europe. In Europe especially, many high suicide rates are present and the points overlap each other significantly, which may make interpretation more challenging. However, it’s still rather clear that the greatest suicide rates all seem to occur there. Additionally, we can see that Japan and India have rather high suicide rates. Otherwise, most countries seem to have middle-to-lower suicide rates.
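The per-country values behind a map like this come from filtering to a single year and averaging the group rates within each country. The sketch below shows that step in standard-library Python, assuming hypothetical rows and placeholder country names; it is not the code that produced the map.

```python
from collections import defaultdict

# Hypothetical rows: (country, year, suicide rate per 100k for one group).
rows = [
    ("Country A", 2000, 10.0),
    ("Country A", 2000, 30.0),
    ("Country A", 1999, 99.0),
    ("Country B", 2000, 5.0),
    ("Country B", 2000, 7.0),
    ("Country C", 2001, 50.0),
]


def mean_rate_by_country(rows, year):
    """Mean group rate per country for one year, highest first."""
    by_country = defaultdict(list)
    for country, row_year, rate in rows:
        if row_year == year:
            by_country[country].append(rate)
    means = {c: sum(v) / len(v) for c, v in by_country.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


ranking = mean_rate_by_country(rows, 2000)
for country, mean_rate in ranking:
    print(country, mean_rate)
```

Countries with no rows for the selected year simply drop out of the result, which mirrors the missing points on the map.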

Conclusion

Our research produced some findings that we had not expected. From our graphs above, we saw higher suicide rates in males than in females. It was surprising that across roughly three decades of data (1985 to 2016), containing nearly 28,000 entries, there was such a significant disparity in suicide rate between the sexes. We expected to see some trend in age, yet we saw an even distribution among the age groups, except for ages 5-14, a group that, as discussed earlier, seems too young to consider suicide. Another finding was that those who live in countries with lower GDP per capita had higher suicide rates, again mostly males. For the year 2000, we saw a concentration of high suicide rates in Europe, though that may be attributed to more data being available for European countries. We can assume that living conditions such as health, education, and standard of living are associated with the rate of suicide.

In further investigation, it may be useful to also look at 1990, 1995, 2005, 2010, etc. to allow for comparison of how suicide rates for different countries and regions change over time.

Overall, suicide prevention is an important topic that, although sensitive, must be discussed. We see that suicide affects people across the globe, and even if suicide rates are lower for some societal groups than others, the ideal is an average suicide rate of 0. We hope that our work here has shed light on where resources should generally be focused. Further work could include collecting more fine-grained data for countries and societal groups, text analysis of reasons for suicide, and open discussions of mental health and its effect on suicide rates.