We examine public-use data from the National Longitudinal Study of Adolescent to Adult Health, more commonly known as Add Health. The dataset contains responses from 4,834 to 6,504 participants (depending on the year), with data collected across hundreds of variables. In 1995, the average age of participants was 16; in 2008, it was 29. For our final project, we examine variables pertaining to health and depression. These variables are listed below:
As health is a very broad term, we operationalize it with a single-item question (i.e., “In general, how is your health?”), which is answered with “Poor”, “Fair”, “Good”, “Very good”, or “Excellent”. We recognize that this definition may not capture an individual’s physical well-being most accurately, but because our other main variable of interest, depression, is a psychological condition, we believe the connection between psychological and physical feeling is an especially interesting relationship to explore.
We conduct initial exploratory data analysis across all 4 waves to examine how depression and health change across the lifespan, but to explore what factors may influence the relationship between depression and health, we examine data from just Wave 4. To accomplish this goal, we examine three main research questions:
Before we examine what factors affect both depression and health, it is important to know how they both change over the lifespan, as well as how they actually relate to each other. First, we will be looking at how health changes from the average age of 16 to the average age of 29 by sex.
The overall trend across both sexes is that fewer people rate themselves as “Excellent” and more rate themselves lower, such as “Good” or “Fair.” Interestingly, there is a stark change between 2002 and 2008: the share of both male and female participants in “Excellent” health drops off sharply. This period corresponds to ages 23 to 29 on average, which may imply that something occurs at this stage of life that leads people to rate their health as worse.
Next, we will be examining how depression changes over the course of the lifespan.
The overall trend for both sexes is an increase in depression, but there is a significant difference at every time point such that female participants rate themselves as more depressed than male participants. While this is a gradual increase until 2002, it is a very sharp increase from 2002 to 2008, which lines up with where we saw the most drastic change in health.
To put both these factors together, this next graph and statistical analysis will examine the relation between depression and health in 2008.
##              Df Sum Sq Mean Sq F value Pr(>F)
## H4GH1         4    9.3  2.3352   33.05 <2e-16 ***
## Residuals  5100  360.4  0.0707
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
##
## Fit: aov(formula = H4FSD ~ H4GH1, data = add4)
##
## $H4GH1
##                            diff         lwr          upr     p adj
## Fair-Poor           -0.19395759 -0.29778842 -0.090126761 0.0000035
## Good-Poor           -0.28718512 -0.38657710 -0.187793133 0.0000000
## Very good-Poor      -0.30952063 -0.40868893 -0.210352326 0.0000000
## Excellent-Poor      -0.29549456 -0.39601466 -0.194974454 0.0000000
## Good-Fair           -0.09322753 -0.13231759 -0.054137463 0.0000000
## Very good-Fair      -0.11556304 -0.15408081 -0.077045269 0.0000000
## Excellent-Fair      -0.10153697 -0.14341247 -0.059661465 0.0000000
## Very good-Good      -0.02233551 -0.04645074  0.001779715 0.0847790
## Excellent-Good      -0.00830944 -0.03748963  0.020870746 0.9372812
## Excellent-Very good  0.01402607 -0.01438288  0.042435024 0.6615341
From this boxplot, we can see a clear trend that better health is related to less depression. A one-way ANOVA confirms that depression differs across the health ratings, F(4, 5100) = 33.05, p < .001. However, the plot alone does not show which ratings differ from one another, so we conducted Tukey’s multiple comparisons of means. “Poor” differs significantly from “Fair” (p < .001), and “Poor” and “Fair” each differ from “Good”, “Very good”, and “Excellent” (all ps < .001). The three highest ratings do not differ significantly from one another (all ps > .08).
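The ANOVA and Tukey output above follows the usual `aov()` then `TukeyHSD()` pattern in R. As a minimal sketch of that pattern, the block below runs it on simulated data. The data frame `add4_sim`, the group sizes, and the group means are hypothetical stand-ins, not Add Health values; only the variable names `H4FSD` and `H4GH1` are borrowed from the output.

```r
# Simulated stand-in for the Wave 4 data (assumption: the real `add4` is not shown here).
set.seed(1)
levels_gh <- c("Poor", "Fair", "Good", "Very good", "Excellent")
n_per_group <- 200

add4_sim <- data.frame(
  H4GH1 = factor(rep(levels_gh, each = n_per_group), levels = levels_gh)
)

# Hypothetical group means: depression falls as health improves, then levels off.
group_means <- c("Poor" = 0.60, "Fair" = 0.40, "Good" = 0.30,
                 "Very good" = 0.28, "Excellent" = 0.29)
add4_sim$H4FSD <- rnorm(nrow(add4_sim),
                        mean = group_means[as.character(add4_sim$H4GH1)],
                        sd = 0.25)

# One-way ANOVA: does mean depression differ across health ratings?
fit <- aov(H4FSD ~ H4GH1, data = add4_sim)
print(summary(fit))

# Tukey's HSD: which pairs of ratings differ, with family-wise error control.
tukey <- TukeyHSD(fit)
print(tukey$H4GH1)
```

Each row of `tukey$H4GH1` is one pairwise comparison, named as the second level minus the first (for example `Fair-Poor`), so a negative `diff` means the first-named rating has the lower mean.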
With this main relationship explored, we can now examine what factors may contribute to the strength of this relationship.
The main health services that Wave 4 asked about were routine doctor checkups, dental services, and mental health counseling. While we ran analyses for all three, the only one significantly associated with general health was dental visits. Among people who did not visit the dentist in the past year, female participants have lower counts than expected for poorer health, and both sexes have higher counts than expected for better general health. Among people who did visit the dentist in the past year, both male and female participants have higher counts than expected for poorer health, and male participants have lower counts than expected for better general health.
This could be for several reasons. Perhaps people at this time and age (~30 years old in 2008) only visited the dentist when there were obvious health issues. Perhaps people are not aware of any oral health issues until they see the dentist. Unfortunately, there was not enough data on oral health issues to check for these possibilities.
Lack of insurance is a major barrier to health services, so we created a mosaic plot of insurance status against general health ratings. Surprisingly, male participants with above-average health were overrepresented in the no-insurance group, while female participants with below-average health were underrepresented. In contrast, in the insured group, both female and male participants have higher counts than expected for poorer health, and only male participants have lower counts than expected for better general health.
Intuitively, it doesn’t make sense that people with less access to health services would have better health than those with more. This suggests that there could be a confounding factor. One possibility is that people without insurance do not have ways to check for problems in the first place.
As the shading shows, people without insurance are much more likely to have used 0 types of health services and much less likely to have used 2 or more, compared to people with insurance. This is not surprising, considering how much more limited their access is. It lends credence to the idea that people without insurance have fewer opportunities to check for health problems and so may not be aware of them, leading to higher than expected self-ratings of general health.
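The statements about cells being over- or underrepresented rest on comparing observed counts with the counts expected if the two variables were independent. A minimal sketch of that comparison is below, using a made-up table: the counts are hypothetical and are not taken from Add Health.

```r
# Hypothetical counts (NOT Add Health data): rows are insurance status,
# columns are the number of service types used in the past year.
counts <- matrix(
  c(120,  60,  20,
    150, 300, 350),
  nrow = 2, byrow = TRUE,
  dimnames = list(insurance = c("No", "Yes"),
                  services  = c("0", "1", "2+"))
)

test <- chisq.test(counts)

# Counts expected under independence: row total * column total / grand total.
print(test$expected)

# Pearson residuals: (observed - expected) / sqrt(expected).
# Positive values mark overrepresented cells, negative values underrepresented ones.
print(round(test$residuals, 2))
```

Passing the same table to `mosaicplot(counts, shade = TRUE)` draws a mosaic plot shaded by these Pearson residuals, which is one way to produce the kind of shading described above.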
We wanted to learn about the relationship between an individual’s diet and their depression level, so we examine frequency of fast-food consumption (H4GH8) and depression level (the average of all depression-relevant variables in the dataset, H4MH18 to H4MH27).
On the graph, each dot represents an observed data point: its position along the x-axis indicates the frequency of fast-food consumption over 7 days, and its position along the y-axis indicates the level of depression. The red line is the smooth trend fitted by a generalized additive model. The direction and steepness of this line indicate the direction and strength of the relationship between the two variables.
Overall, the graph suggests a slightly positive relationship: a higher frequency of fast-food consumption could be associated with a higher level of depression. However, given the nearly flat trend line and the wide confidence interval, the relationship between fast-food consumption and depression is likely to be very weak.
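Averaging the ten items H4MH18 to H4MH27 into one score can be sketched as follows. The responses are simulated on a hypothetical 0 to 3 scale, and the handling of missing answers (`na.rm = TRUE`) is an assumption, since the treatment of missing data and any reverse-coded items is not described here.

```r
set.seed(2)
item_names <- paste0("H4MH", 18:27)   # H4MH18 ... H4MH27

# Simulated responses for 6 people on a hypothetical 0-3 scale.
sim <- as.data.frame(
  matrix(sample(0:3, length(item_names) * 6, replace = TRUE),
         nrow = 6, dimnames = list(NULL, item_names))
)
sim$H4MH20[2] <- NA                   # one missing answer, for illustration

# Depression score = mean of the available items for each person.
# Assumption: missing items are skipped rather than dropping the person.
sim$depression <- rowMeans(sim[, item_names], na.rm = TRUE)
print(sim$depression)
```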
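For readers who want to see how such a trend line and its confidence band can be obtained, here is a minimal sketch using `gam()` from the `mgcv` package. This is an assumption about tooling, since the code behind the graph is not shown. The data are simulated with a deliberately weak positive trend, and the column names `fast_food` and `depression` are hypothetical.

```r
library(mgcv)  # recommended package bundled with most R installations

set.seed(3)
# Simulated data (hypothetical values, not Add Health).
d <- data.frame(fast_food = sample(0:14, 500, replace = TRUE))
d$depression <- 0.3 + 0.005 * d$fast_food + rnorm(500, sd = 0.3)

# s() requests a smooth term; k caps its flexibility (basis dimension).
fit <- gam(depression ~ s(fast_food, k = 5), data = d)

# Fitted trend and approximate 95% pointwise band on a grid of x values.
grid <- data.frame(fast_food = 0:14)
pred <- predict(fit, newdata = grid, se.fit = TRUE)
trend <- data.frame(
  grid,
  fit   = as.numeric(pred$fit),
  lower = as.numeric(pred$fit - 1.96 * pred$se.fit),
  upper = as.numeric(pred$fit + 1.96 * pred$se.fit)
)
print(head(trend))
```

A band that is wide relative to the total rise of `fit` across the grid is what a very weak relationship looks like in this kind of plot.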
We wanted to look at fast-food consumption with depression because we assumed fast-food consumption would be related to poor health, but it’s necessary to actually verify whether or not this assumption is correct.
##              Df Sum Sq Mean Sq F value   Pr(>F)
## H4GH1         4    268   67.07   9.464 1.28e-07 ***
## Residuals  5062  35872    7.09
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
##
## Fit: aov(formula = as.integer(H4GH8) ~ H4GH1, data = add4)
##
## $H4GH1
##                            diff        lwr         upr     p adj
## Fair-Poor            0.05548222 -0.9851865  1.09615092 0.9998995
## Good-Poor           -0.30353838 -1.2990337  0.69195694 0.9207359
## Very good-Poor      -0.49275809 -1.4859930  0.50047680 0.6573742
## Excellent-Poor      -0.78677686 -1.7937156  0.22016187 0.2065432
## Good-Fair           -0.35902060 -0.7529436  0.03490236 0.0936913
## Very good-Fair      -0.54824031 -0.9364154 -0.16006519 0.0011126
## Excellent-Fair      -0.84225908 -1.2642662 -0.42025200 0.0000005
## Very good-Good      -0.18921971 -0.4314996  0.05306014 0.2069482
## Excellent-Good      -0.48323848 -0.7766828 -0.18979414 0.0000701
## Excellent-Very good -0.29401877 -0.5797008 -0.00833677 0.0400378
With health on the x-axis and fast-food consumption in the past 7 days on the y-axis, we can see the surprising result that it isn’t as simple as people with poor health eating more fast food. Fast-food consumption does differ across health ratings overall, F(4, 5062) = 9.46, p < .001, but as the follow-up analyses show, the only pairs that differ are “Fair” and “Excellent” (p < .001), “Fair” and “Very good” (p = .001), “Good” and “Excellent” (p < .001), and “Very good” and “Excellent” (p = .040). “Poor” does not differ significantly from any other rating.
One possible explanation for this deviation from our expectations is that participants whose health is so bad that they categorized themselves as “Poor” may be unable to leave the house to get fast food at all. Because this wave is from 2008, there was no easy way to have fast food delivered, so for participants in very poor health, fast food may not even have been an option. The result that participants with “Excellent” health ate the least fast food, however, falls within our expectations. Considering this is an American dataset and fast food can be part of American culture, it is not too surprising that people may eat large amounts of fast food and still consider their health to be “Good”. Another possibility is that the mean age of people in this wave was relatively young, and the effects of fast food on health may not be felt until older age.
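The model call in the output above wraps the fast-food variable in `as.integer()`. Whether `H4GH8` is stored as a number or as a factor is not stated here. If it were a factor, `as.integer()` would return the internal level codes rather than the recorded counts. The sketch below uses a made-up factor `x` to show the difference and a safe conversion.

```r
# Hypothetical fast-food counts stored as a factor (e.g., after import).
x <- factor(c("0", "2", "7", "2", "10"))

# Level codes: positions in levels(x), which are sorted as text, not as numbers.
codes <- as.integer(x)
print(levels(x))
print(codes)

# Safe conversion: go through the labels first.
values <- as.integer(as.character(x))
print(values)   # 0 2 7 2 10
```

If the variable is already numeric, `as.integer()` is harmless, so checking `class()` of the column before modelling is a cheap safeguard.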
In our effort to identify various factors that can influence people’s general health and depression, we look to relationships as a potential candidate. In particular, we will examine if a person’s relationship with their parents is associated with general health. We can assess a person’s relationship with their parents by the self-rated level of closeness they feel to their parents.
As indicated by the pink bars in the diverging bar chart above, a larger proportion of people with poor health have weaker relationships with their mothers than people with good health do. Furthermore, the blue bars are longer for people who have very good health than for people who have fair or poor health, indicating that people with close relationships to their mothers tend to also have better general health.
We also want to examine if there is a relationship between how close people are to their parents and how depressed they are. We are curious if there are any potential gender differences in the effects of those relationships so we will examine people’s self-rated closeness with their fathers and their depression level to see if there is any association.
From the above graph, on the left, we see that for female participants, the median depression level is about the same across all levels of self-rated closeness to father. On the right, it appears that male participants who feel quite or very close to their fathers tend to have lower levels of depression than those who don’t feel as close. This suggests that closeness to one’s father is associated with depression for men, but the same pattern is not observed for women. Connecting this to the consistently lower levels of depression in men we saw earlier, it may be the case that a strong relationship with one’s father helps mitigate depression for men but not for women.
By examining this data from both a longitudinal and a cross-sectional approach, we were able to learn how depression and health change over the lifespan, as well as what factors may contribute to the relation between depression and health in adulthood. From our longitudinal analysis, we found that people rate themselves as less healthy and more depressed as they age, and that female participants rated themselves as more depressed across all waves. From our cross-sectional analysis, we learned primarily that depression and health are related: those in the “Poor” and “Fair” health classifications were more depressed than those in the better health classifications.
To explore this relation, we found that participants with no health insurance were more likely than expected to rate themselves as healthier, with the qualification that those without insurance also used health services less, so they may not have a proper grasp of their health. For fast-food consumption, we found a very slight positive relation between depression and the number of times fast food was consumed in the past 7 days. Perhaps more interestingly, we found that those with the worst health did not eat significantly more fast food than those with the best health, whereas those who had “Fair”, “Good”, or “Very good” health did eat significantly more fast food than those who had “Excellent” health. Finally, we found that a closer relationship with one’s mother was related to better health, and that a closer relationship with one’s father was related to lower depression among men.
From these findings, we posit that relationships may be the strongest explanation for the link between health and depression. This suggestion makes logical sense as well, seeing as those with close relationships may be less likely to be socially isolated, which can lead to depression. Additionally, if one has people close to them that care about them, they may be more likely to both be told to take care of themselves and also actually take care of themselves. However, there are still many questions we are left with from these findings.
Although we examined many of these questions by sex, none of these factors adequately explained the overall difference in depression between male and female participants. Future research should identify what factors contribute to this stark difference so that clinicians are able to treat female patients effectively. In addition to this sex difference, more work should examine the unexpected difference in health between those who do and do not have health insurance. If it’s true that those who don’t have health insurance are healthier, then should we all stop paying for health insurance? Surely this finding can be expanded upon to find some reason why those without health insurance see themselves as healthier, other than our suggestion that they are blissfully ignorant. Further research should also examine whether those with the worst health have begun eating more fast food with the advent of delivery apps, which make ordering fast food easy for those who would be unable to leave the house. These delivery apps may have increased total fast-food consumption since 2008, which is another area of research to examine. Finally, in relation to our finding that those who have closer relationships to their parents have better health and lower depression, researchers should examine what kinds of relationships have these effects. Is there something unique about the parental relationship, or would the effect be even stronger in a romantic relationship? Would it also be seen in close relationships with friends? Although we believe we conducted very thorough analyses, this preliminary work has left us with more questions than answers.