This dataset, titled “Nightly sleep time and GPA in first-years,” was taken from the CMU S&DS Data Repository. It originates from a study investigating the impact of sleep patterns on the academic performance of first-year college students across three universities. A total of 634 participants wore Fitbit devices that monitored their sleep for a month during their spring term. The data include detailed sleep episodes, demographic information, and GPA records from university registrars. Variables such as bedtime variability, total sleep time, and GPA are analyzed to explore the relationships between sleep habits and academic success.
Within this dataset we are interested in the following variables:
cohort: Codename of the cohort that the subject belongs to
demo_race: Binary label for underrepresented and non-underrepresented students (underrepresented = 0, non-underrepresented = 1)
demo_gender: Gender of the subject (male = 0, female = 1), as reported by their institution
demo_firstgen: First-generation status (non-first gen = 0, first-gen = 1). Students were considered first-generation if neither parent completed any college (i.e., high school diploma or less)
bedtime_mssd: Mean successive squared difference of bedtime. This measures bedtime variability, and is calculated as the average of the squared difference of bedtime on consecutive nights.
TotalSleepTime: Average time in bed (the difference between wake time and bedtime) minus the length of total awake/restlessness in the main sleep episode, in minutes
midpoint_sleep: Average midpoint of bedtime and wake time, in minutes after 11 pm (for example, 364 is 5:04 am)
daytime_sleep: Average sleep time outside of the range of the main sleep episode, including short naps or sleep that occurred during the daytime, in minutes
cum_gpa: Cumulative GPA (out of 4.0), for semesters before the one being studied. (Since these students are first-years during their spring semester, this is usually just their fall GPA. UW has quarters, so this includes both fall and winter quarters at UW.)
term_gpa: End-of-term GPA (out of 4.0) for the semester being studied
term_units: Number of course units carried in the term
Zterm_units_ZofZ: Because each university counts units differently, each student’s units were Z-scored relative to the mean and standard deviation of all students in their study cohort. This score represented the student’s load relative to the average amount of units. The Z-scores for the cohorts were then combined and Z-scored again. 0 represents an average load, while positive values are above-average loads and negative values are below average.
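Two of the derived variables above, bedtime_mssd and Zterm_units_ZofZ, follow directly from their definitions. The block below is a minimal Python sketch of both, not the code used to build the dataset. The function names, the toy numbers, the units, and the use of the sample standard deviation are all assumptions made for illustration.

```python
from statistics import mean, stdev


def mssd(values):
    """Mean of the squared differences between consecutive values."""
    diffs = [later - earlier for earlier, later in zip(values, values[1:])]
    return mean(d * d for d in diffs)


def z_scores(values):
    """Z-score each value against the mean and sample SD of the list."""
    center, spread = mean(values), stdev(values)
    return [(v - center) / spread for v in values]


def z_of_z(units_by_cohort):
    """Z-score units within each cohort, pool the results, then Z-score again."""
    pooled = []
    for units in units_by_cohort.values():
        pooled.extend(z_scores(units))
    return z_scores(pooled)


# Hypothetical bedtimes on three consecutive nights (arbitrary minute scale).
toy_bedtimes = [0, 60, 30]
toy_mssd = mssd(toy_bedtimes)

# Hypothetical unit counts for two cohorts that count units differently.
toy_units = {"cohort_a": [12, 15, 18], "cohort_b": [36, 45, 54, 48]}
toy_load = z_of_z(toy_units)
print(toy_mssd, toy_load)
```

In this sketch the two cohorts use very different unit scales, but after the double Z-scoring every student's load is expressed on one common scale centered at 0.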
The research questions that we will be exploring are:
How are the quantitative and categorical variables related to total sleep time?
How does student bedtime affect their academic performance? Do students who go to sleep earlier have higher GPAs?
How does total sleep time differ across universities (CMU, Notre Dame, University of Washington)?
We wanted to learn whether the time at which students fall asleep has an effect on their academic performance. Specifically, we wanted to see whether students who go to bed earlier have higher academic performance. We theorize that this is true, because students who go to sleep earlier would typically be more likely to attend their early morning lectures, and thus get better grades.
To explore this question, we used the midpoint_sleep variable (average midpoint of bedtime and wake time) together with the TotalSleepTime variable (average time asleep during the main sleep episode) to calculate the average bedtime of each student. Additionally, we used the cum_gpa variable (cumulative GPA out of 4.0) as a metric for academic performance.
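The exact formula for the bedtime calculation is not shown in this report. One plausible reconstruction, sketched below in Python, subtracts half of TotalSleepTime from midpoint_sleep. This is an assumption: TotalSleepTime excludes time spent awake, so the estimate lands slightly later than the true bedtime. The function names and the 420-minute example are hypothetical.

```python
def minutes_to_clock(minutes_after_11pm):
    """Convert minutes after 11 pm into a 12-hour clock string."""
    total = (23 * 60 + round(minutes_after_11pm)) % (24 * 60)
    hour24, minute = divmod(total, 60)
    hour12 = hour24 % 12 or 12
    suffix = "am" if hour24 < 12 else "pm"
    return f"{hour12}:{minute:02d} {suffix}"


def estimate_bedtime(midpoint_sleep, total_sleep_time):
    """Assumed estimate: bedtime = midpoint minus half the sleep time (minutes)."""
    return midpoint_sleep - total_sleep_time / 2


# The dataset's own example: a midpoint of 364 corresponds to 5:04 am.
print(minutes_to_clock(364))

# Hypothetical student with a 364-minute midpoint and 420 minutes of sleep.
bedtime = estimate_bedtime(364, 420)
print(bedtime, minutes_to_clock(bedtime))
```

For the hypothetical student, the estimated bedtime is 154 minutes after 11 pm, which is 1:34 am.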
This graph reveals that students in the study went to bed between 11pm and 7am, with the vast majority going to bed between 12am and 4am. The distribution is skewed right because of the few students who go to bed between 4am and 7am, and very few students go to bed as early as 11pm. The mean bedtime appears to be slightly before 2:30am.
To get further insights into the distributions of students who go to sleep at different time segments throughout the night, we grouped the students by time segments and graphed their distributions through violin plots as shown below.
The main takeaway from this graph is that the earlier students tend to go to sleep, the fewer low-GPA outliers there are. The distribution for students who go to bed within 11pm-1am is bimodal, with a large majority of those students having high GPAs between 3.0 and 4.0, and there are no outliers with GPAs lower than 2.5. For students who go to bed between 1am-4am, the distribution is unimodal and skewed, with some outliers at lower GPAs. For students who go to bed between 4am-7am, the distribution is still unimodal but heavily skewed towards lower GPAs. This indicates that among the students who sleep this late, a large number of outliers receive very low GPAs, as low as 1.0.
Because this graph implies that sleeping earlier is associated with higher cumulative GPA, we wanted to test that using a Pearson correlation test.
##
## Pearson's product-moment correlation
##
## data: cmu_sleep$hour and cmu_sleep$cum_gpa
## t = -5.0685, df = 632, p-value = 5.272e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2713297 -0.1216434
## sample estimates:
## cor
## -0.1976384
These are the hypotheses for the test:
Null Hypothesis (\(H_0\)): There is no linear correlation between bedtime and cumulative GPA.
Alternative Hypothesis (\(H_a\)): There is a linear correlation between bedtime and cumulative GPA.
The p-value of 5.272e-07 indicates that the observed correlation is statistically significant: if there were truly no linear correlation between bedtime and cumulative GPA, a sample correlation at least this far from zero would be very unlikely to arise by chance. Because the p-value is less than 0.05, we reject the null hypothesis and conclude that there is a linear correlation between bedtime and cumulative GPA. This supports our earlier reading of the graphs, where we visually noticed an association between bedtime and cumulative GPA.
To understand the extent to which bedtime is associated with GPA, we can also look at how strong this correlation is. The correlation coefficient of -0.1976 indicates a weak negative relationship between bedtime and cumulative GPA. Additionally, the 95% confidence interval of [-0.2713, -0.1216] doesn’t include 0, which supports the conclusion that the correlation is significant and negative.
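As a sanity check, the t statistic and confidence interval in the test output can be recovered from the reported correlation and degrees of freedom alone. The Python sketch below assumes the interval comes from the usual Fisher z-transformation. The variable names are ours, and the only inputs are numbers already printed above.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist

r = -0.1976384   # sample correlation from the test output
df = 632         # degrees of freedom from the test output
n = df + 2       # number of students with both variables observed

# t statistic for testing that the true correlation is zero.
t_stat = r * sqrt(df / (1 - r ** 2))

# 95% confidence interval via the Fisher z-transformation.
z = atanh(r)
standard_error = 1 / sqrt(n - 3)
z_crit = NormalDist().inv_cdf(0.975)
ci_low = tanh(z - z_crit * standard_error)
ci_high = tanh(z + z_crit * standard_error)

print(round(t_stat, 4), round(ci_low, 4), round(ci_high, 4))
```

The recomputed values agree with the reported t = -5.0685 and the interval [-0.2713, -0.1216] to the precision shown.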
Because the correlation is negative, it implies that as bedtime ‘increases’ (in other words, the later people go to bed), then cumulative GPA decreases. This supports what we originally saw in the violin plots. All of this evidence supports the idea that students who sleep earlier tend to have higher academic performance, compared to those who sleep later.
The analysis of total sleep time across universities reveals interesting patterns in sleep habits among students. Carnegie Mellon University exhibits the lowest average total sleep time, while the University of Washington shows the highest. Despite these differences, the average total sleep time at every university falls within a relatively narrow range of 6 to 7 hours. This finding suggests that students across these institutions tend to have similar overall sleep durations despite attending different universities. The differences observed between Carnegie Mellon University and the University of Washington could be influenced by various factors such as academic workload, campus culture, or individual lifestyle choices. Understanding these variations in sleep patterns can inform strategies for promoting healthier sleep habits among students, ultimately contributing to their well-being and academic success.
The violin plot visualizes the distribution of daytime sleep duration among first-year students across CMU, UW, and Notre Dame. Each violin represents the distribution of daytime sleep, in minutes, for a specific university. The shape and spread of the violins vary across the universities, though all three distributions are skewed. CMU exhibits a narrower violin that is more strongly skewed than those of the other two schools, indicating that the CMU students in the study generally took shorter naps. UW and Notre Dame have widely spread violins, suggesting more variability in daytime sleep. Notre Dame in particular has the greatest spread, with more of its students taking longer naps.
The density plot of the midpoint of sleep across universities offers insights into the sleep patterns of first-year students. Each university has a unimodal density distribution. Notre Dame has the most prominent peak, occurring around 5:15 AM, so a significant proportion of Notre Dame students have their midpoint of sleep around this time, which could indicate a consistent sleep pattern or a university culture that influences sleep schedules. The peaks for CMU and UW occur at different times, with CMU’s peak closer to 6:00 AM and UW’s peak closer to 4:30 AM. This discrepancy in peak times suggests variations in sleep schedules among students at these universities. UW’s density plot is skewed right, indicating a higher concentration of students with earlier midpoint sleep times. Notre Dame’s plot shows a more concentrated distribution. CMU’s plot indicates a broader distribution with more students having later midpoint sleep times.
The mosaic plot visually depicts the relationship between bedtime category and university for first-year students. For bedtime category, very late is defined as a bedtime after 4 am, late night as between 1-4 am, and early night as before 1 am. For most combinations of the two variables, the mosaic plot shows frequencies close to those expected under independence. However, CMU students exhibit a higher frequency of very late bedtimes compared to other universities, showing that there is a significant difference in the sleep habits of students at the university. Additionally, CMU students are less likely to have a late-night bedtime in comparison to UW and Notre Dame.
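The bedtime categories defined above can be expressed as a small binning function. The Python sketch below assumes bedtime is measured in minutes after 11 pm, as midpoint_sleep is, and assigns the exact 1 am and 4 am boundaries to the later category. Both choices are assumptions, and the student rows are hypothetical.

```python
from collections import Counter


def bedtime_category(minutes_after_11pm):
    """Bin a bedtime (minutes after 11 pm) into the three categories."""
    if minutes_after_11pm < 120:   # before 1 am
        return "Early Night"
    if minutes_after_11pm < 300:   # 1 am up to 4 am
        return "Late Night"
    return "Very Late"             # 4 am or later


# Hypothetical (university, bedtime) rows, not taken from the dataset.
students = [("CMU", 330), ("CMU", 95), ("UW", 150),
            ("Notre Dame", 200), ("UW", 110)]

# Cell counts of the bedtime category by university contingency table.
counts = Counter((bedtime_category(m), school) for school, m in students)
for cell, count in sorted(counts.items()):
    print(cell, count)
```

A table of counts like this one is what the mosaic plot draws and what a test of independence takes as input.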
##
## Pearson's Chi-squared test
##
## data: table(cmu_sleep_copy$bedtime_category, cmu_sleep_copy$study)
## X-squared = 38.796, df = 4, p-value = 7.677e-08
##                    CMU         UW Notre Dame
## Early Night -0.5321812  1.6664102  -1.241085
## Late Night  -2.6499636  0.6795417   1.254738
## Very Late    4.4517376 -1.5787044  -1.681273
The results of the Pearson’s Chi-squared test indicate a statistically significant association between bedtime category and university (X-squared = 38.796, df = 4, p-value < 0.001). The Pearson residuals show which cells of the contingency table deviate most from their expected frequencies. The largest residuals are in the CMU column, 4.45 for very late bedtimes and -2.65 for late-night bedtimes, which matches the pattern seen in the mosaic plot.
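The residual table and the test statistic are linked: each Pearson residual is (observed - expected) / sqrt(expected), and the chi-squared statistic is the sum of the squared residuals. The Python sketch below checks this using only the residuals printed above. The helper name is ours, and the closed-form p-value it computes applies only when the degrees of freedom are even, as they are here.

```python
from math import exp, factorial

# Pearson residuals copied from the output (columns: CMU, UW, Notre Dame).
residuals = {
    "Early Night": [-0.5321812, 1.6664102, -1.241085],
    "Late Night": [-2.6499636, 0.6795417, 1.254738],
    "Very Late": [4.4517376, -1.5787044, -1.681273],
}

# The chi-squared statistic is the sum of squared Pearson residuals.
chi_squared = sum(r * r for row in residuals.values() for r in row)

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
n_rows, n_cols = len(residuals), 3
df = (n_rows - 1) * (n_cols - 1)


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-squared probability; this closed form needs even df."""
    half = x / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


p_value = chi2_upper_tail_even_df(chi_squared, df)
print(round(chi_squared, 3), df, p_value)
```

The sum of squares reproduces X-squared = 38.796 with df = 4. The CMU "Very Late" cell alone contributes about 19.8 of that total, roughly half of the statistic.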
Total sleep time is correlated with GPA and with other sleep measures such as midpoint and daytime sleep, suggesting that academic performance and sleep habits are related. However, normalized course load appears to have minimal influence on total sleep time. Gender and race do not demonstrate significant associations with total sleep time, implying that these demographic factors may not directly impact sleep duration among college students in this dataset. Lastly, the first Principal Component can be used to distinguish students who are in the bottom third of Total Sleep Time from other students.
We sought to examine the relationship between students’ bedtime and their academic performance, theorizing that earlier bedtimes could lead to higher GPAs. The analysis, supported by violin plots and a Pearson correlation test, indicated a significant but weak negative correlation between bedtime and cumulative GPA: later bedtimes are associated with lower cumulative GPAs. This supports the belief that students who go to bed earlier have better academic performance.
The analysis of sleep patterns across Carnegie Mellon University, the University of Washington, and the University of Notre Dame reveals distinct sleep behaviors among first-year students at these institutions. While CMU students tend to have shorter total and daytime sleep compared to UW and Notre Dame, all universities show similar overall sleep durations. This indicates that there are similarities in the sleep habits of first-years despite university differences.
In future work, we can expand the analysis to include students across their academic careers and see how these sleep habits influence long-term academic success. In our dataset, we only had student IDs for unique identifiers, but did not have any information about which year the students were in. Additionally, we would aim to incorporate more international universities to compare the sleep habits of students in different countries. We would also hope to introduce more qualitative variables like interviews and sleep journals. Students could report how they felt in the morning, and we could use text analysis to estimate the quality of sleep students are receiving.