315 Final Project

Nitin Sivamurugan, Sean Hough, Sayak Bagchi

Data Description

The data we study come from the all-ages.csv file in the college-majors dataset. We primarily study how employment outcomes relate to major. The data include the following variables: major code, major, major category, total number of people in a particular major, employed count, employed full-time year-round count, unemployed count, unemployment rate, median earnings of full-time workers, 25th percentile of earnings, and 75th percentile of earnings.

Research Questions

  1. Which majors and industries are the most popular, and how does popularity relate to other variables?
  2. Do different majors and major categories have different earnings variability?
  3. How much conceptual overlap is there between the set of majors in the Law & Public Policy category and the Social Sciences category?

Research Question 1

Our first research question focuses on the most popular majors and industries and on those with the highest number of employed graduates, so the variables we mainly use are major, major category, total number of graduates, and unemployment rate.

The most popular majors are mainly business fields. Unemployment rates vary widely among the top 10: some popular majors have notably high unemployment rates, while others have very low ones. Nursing, for instance, has an unemployment rate of only 2.79%, whereas psychology has a rate of nearly 7%, which is surprisingly high given that the overall unemployment rate in America was 4.2% as of November 2021, according to the Bureau of Labor Statistics. Being a highly popular major does not seem to be associated with unemployment rate, so let us also look at majors taken by far fewer students.

We can see here that among the least popular majors, unemployment rates still vary widely. Some majors have almost no unemployment, while others have fairly high rates. Geological and geophysical engineering, for instance, has an unemployment rate of 0, while military technologies has an astonishingly high unemployment rate of 10.12%. This is especially surprising considering that workers with less than a high school diploma had an unemployment rate of 5.5% as of January 2020.

This third graph displays the unemployment rate for different majors on its y-axis and the total number of people in those major categories on its x-axis. It also shows which majors belong to which major categories. From this graph we can see that certain major categories have a high average unemployment rate, such as Psychology & Social Work, Arts, and Industrial Arts & Consumer Services. However, other major categories, especially those with a large number of students such as Business, display a lower average unemployment rate with less variability. Their majors sit near the middle of the overall range, at around 5-6%.

The graph therefore highlights the industries with the highest and lowest unemployment rates, and suggests that the most popular majors are unlikely to be chosen solely for the prospect of entering an industry with a low unemployment rate. For instance, business, the most popular field, still has an unemployment rate near the middle of the range, while the major with the lowest unemployment rate is very unpopular among graduates.
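The category-level comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used for the report: the function name `summarize_by_category`, the dictionary keys, and the example rows are all hypothetical placeholders, and the numbers are made up rather than taken from all-ages.csv.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_by_category(rows):
    """Group majors by category and summarize size and unemployment rate."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["category"]].append(row)

    summary = {}
    for category, members in grouped.items():
        rates = [m["unemployment_rate"] for m in members]
        summary[category] = {
            # Total number of people across all majors in the category.
            "total": sum(m["total"] for m in members),
            # Unweighted average unemployment rate across the majors.
            "mean_rate": mean(rates),
            # Population standard deviation as a simple measure of variability.
            "rate_spread": pstdev(rates),
        }
    return summary


# Placeholder rows with made-up values (NOT real values from the dataset).
example_rows = [
    {"major": "Major A", "category": "Category X", "total": 1000, "unemployment_rate": 0.04},
    {"major": "Major B", "category": "Category X", "total": 3000, "unemployment_rate": 0.06},
    {"major": "Major C", "category": "Category Y", "total": 500, "unemployment_rate": 0.10},
]

summary = summarize_by_category(example_rows)
for category, stats in sorted(summary.items()):
    print(category, stats)
```

The average here is unweighted, so a small major counts as much as a large one. Weighting each rate by the major's total would be a reasonable alternative if the goal were a per-person rate for the category.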

Research Question 2

For our second research question, we want to know which majors have a wide distribution of income and which have a narrow one. To measure this with the data provided, we used the Quartile Coefficient of Dispersion, which is the difference between the 75th and 25th percentiles divided by their sum, that is, (Q3 - Q1) / (Q3 + Q1). Low values represent narrow distributions, and high values represent wide distributions.
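As a minimal sketch of the Quartile Coefficient of Dispersion defined above, the following Python computes the coefficient from a 25th and a 75th percentile and ranks majors by it. The function name, the dictionary keys, and the two example rows are hypothetical, and the earnings figures are made up for illustration rather than taken from the dataset.

```python
def quartile_coefficient_of_dispersion(p25, p75):
    """Return (Q3 - Q1) / (Q3 + Q1) for the given quartiles."""
    denominator = p75 + p25
    if denominator == 0:
        raise ValueError("The sum of the quartiles must be nonzero.")
    return (p75 - p25) / denominator


# Placeholder rows with made-up earnings (NOT real values from the dataset).
example_majors = [
    {"major": "Hypothetical Major A", "p25": 30000, "p75": 90000},
    {"major": "Hypothetical Major B", "p25": 40000, "p75": 56000},
]

# Attach the coefficient to each row, then sort from widest to narrowest.
for row in example_majors:
    row["qcd"] = quartile_coefficient_of_dispersion(row["p25"], row["p75"])

ranked = sorted(example_majors, key=lambda row: row["qcd"], reverse=True)
for row in ranked:
    print(f'{row["major"]}: {row["qcd"]:.3f}')
```

Because the coefficient divides by the sum of the quartiles, it is unitless, which makes it possible to compare the spread of majors whose typical earnings differ a great deal.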

Based on the above graphs, we can conclude that the majors in the first graph, including Pharmacology, Cognitive Science, and Petroleum Engineering, have a high Quartile Coefficient of Dispersion for income. On the other hand, School Student Counseling, Special Needs Education, and Pharmaceutical Sciences have the lowest. One interesting takeaway is that 9 of the 10 majors with the highest dispersion are STEM fields, while only around 3 of the 10 majors with the lowest dispersion are STEM fields. We can also see that the difference in the dispersion coefficient between majors is quite large: Pharmacology is around 0.5, while School Student Counseling is less than 0.2.

The faceted distribution above consists of density curves modeling the 75th percentile of earnings across all 16 major categories. Since fewer than two values were provided for Interdisciplinary majors, that density curve was omitted. This graphic reveals that, across all industries, STEM and business had the highest 75th percentile of earnings, while Psychology & Social Work and Education had the lowest. This indicates high earnings variability across major categories. However, it appears that major popularity is independent of high earning potential: while the most popular majors are high-earning business majors, the least popular are still mainly high-earning math, science, and engineering majors.

Research Question 3

From this comparison cloud, we can conclude that while words like administration, public policy, and reporting are of primary importance to Law & Public Policy, they are also common to Social Sciences majors. We can also see that while they are primarily associated with Social Sciences, the topics of criminology, economics, geography, and sociology are also important to Law & Public Policy. So, for the conceptual overlap of the two categories, we can say that the ideas of economics, geography, sociology, administration, and public policy are common to both.
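One simple way to check this kind of overlap numerically is to count the words that appear in the major names of each category and intersect the two vocabularies. The sketch below does that in Python. It is not the comparison cloud itself, which emphasizes words that distinguish the categories, and the function names and example major names are hypothetical placeholders rather than the dataset's actual listing.

```python
from collections import Counter


def word_counts(major_names):
    """Count lowercase words across a list of major names."""
    counts = Counter()
    for name in major_names:
        # Treat "&" as a separator so it does not count as a word.
        for word in name.lower().replace("&", " ").split():
            counts[word] += 1
    return counts


def shared_words(names_a, names_b):
    """Return the sorted words that occur in both lists of major names."""
    counts_a = word_counts(names_a)
    counts_b = word_counts(names_b)
    return sorted(set(counts_a) & set(counts_b))


# Placeholder major names for illustration only (NOT the dataset's listing).
category_a = ["Public Policy", "Public Administration"]
category_b = ["Economics", "Public Economics"]

overlap = shared_words(category_a, category_b)
print(overlap)
```

A word-level intersection like this only captures shared vocabulary in the major names, so it is a rough proxy for conceptual overlap rather than a measure of it.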

Conclusion

Through our research questions, we gained valuable insight into how college majors relate to a graduate's employment and salary. Through several visualizations, we noted that unemployment rates varied widely regardless of how popular a major was, unless the major was extremely popular, especially within the Business category. Our analysis also found that majors with low unemployment rates or high earning potential were not necessarily the most popular majors in the dataset.

We saw that certain majors, largely STEM majors such as petroleum engineering, had a large dispersion in their salaries. Other majors, mainly non-STEM ones such as school student counseling, had lower dispersion. We also took a more specific look at the Law & Public Policy and Social Sciences major categories. Here we found that the fields share large conceptual overlaps, with shared terminology ranging from administration and public policy to criminology and reporting.