Preventable hospital stays can be viewed as a comprehensive measure that encompasses factors related to both the quality of and access to healthcare. Preventable Hospital Stays (PHS) measures the number of hospital stays for ambulatory-care sensitive conditions per 100,000 Medicare enrollees. Ambulatory-care sensitive conditions include diabetes, asthma, urinary tract infections, hypertension, and chronic obstructive pulmonary disease (COPD). Hospitalization due to these conditions can be prevented with timely, high-quality outpatient care. A high PHS rate can also reflect a tendency to overuse emergency rooms and urgent care as a main source of care. (County Health Data, 2023)
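As a minimal sketch of the arithmetic behind a "per 100,000 enrollees" rate, the plain-Python function below scales a raw count of stays by the enrolled population. The function name and the county figures are hypothetical, and the sketch ignores any age or risk adjustment that the published measure may apply.

```python
def stays_per_100k(stays: int, enrollees: int) -> float:
    """Hospital stays per 100,000 Medicare enrollees (unadjusted)."""
    if enrollees <= 0:
        raise ValueError("enrollees must be positive")
    # Multiply before dividing so round numbers stay exact in floating point.
    return stays * 100_000 / enrollees


# Hypothetical county: 120 qualifying stays among 4,000 enrollees.
rate = stays_per_100k(120, 4_000)
print(rate)  # 3000.0
```

Expressing the count as a rate is what makes counties with very different Medicare populations comparable.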
Hospital care is one of the largest healthcare expenditures, so reducing preventable hospital stays is financially important for policymakers, insurance companies, and consumers. The purpose of this research is to identify the factors contributing to preventable hospitalizations so that targeted interventions can be developed to reduce unnecessary hospital admissions.
Our research found that insufficient sleep and factors that contribute to insufficient sleep, like one-parent households and food insecurity, play a role in the occurrence of preventable hospital stays.
The data were obtained from countyhealthrankings.org. This dataset allowed us to analyze health at the county level. Preventable hospital stays can be used as a measure of both the quality of and access to primary healthcare. Our research question was: do income inequality, unemployment, and high school completion rates affect the number of preventable hospital stays of certain racial groups at the county level?
By researching this, we may uncover patterns in quality of life, access to care, and socio-economic factors that contribute to preventable hospital stays. By highlighting health behavior and healthcare infrastructure patterns, interventions can be made that can improve quality of life, reduce racial disparities, and cut healthcare costs.
The variables High School Completion, Unemployment, and Income Inequality were compared across all the racial groups (White, Black, Hispanic, Asian, American Indian/Alaska Native). We then extended our investigation to include Insufficient Sleep in conjunction with Frequent Mental and Physical Distress.
Socioeconomic Variables
Quality of Life Variables
Access to Care Variables
Quality of Care Variables
Based on these graphs, we see the same 3 clusters in the racial data that were seen when plotting the socioeconomic status variables. Black and AIAN populations have higher preventable hospital stays, while Asians once again have lower rates. This indicates that race does have an impact on preventable hospital stay rates. There is less variation in the AIAN group than in the other racial groups; however, that could be due in part to a smaller sample size. Additionally, it is important to consider sleep when looking into preventable hospital stays because studies have shown that poor sleep contributes to chronic health issues and poor health outcomes (Kulpatcharapong, 2020).
For both Frequent Mental and Physical Distress, the counties with higher rates of insufficient sleep also have higher rates of distress. The data are clustered into 3 groups, and each column within the facet-wrapped grid represents a cluster.
Sleep deprivation, physical distress, and mental distress are positively correlated pairwise across all racial and ethnic groups. The relationships for distress are similar across all groups except Asians. Asians typically exhibit lower physical and mental distress compared to the other groups, but they are not necessarily less sleep-deprived on average. The distribution of insufficient sleep is roughly symmetric and normal across all the racial groups. The distribution of frequent mental distress has a similar shape, but its tail is shorter than that of insufficient sleep, which indicates that the data are less varied.
The bivariate choropleth maps the relationship between PHS and Insufficient Sleep, with each variable trichotomized. The legend is broken into 9 cells, and the x and y axes are each broken into 3 parts that correspond to values falling in the 0 - 33rd percentile, 34th - 66th percentile, and 67th - 100th percentile. Thus, a county in the darkest color has both PHS and Insufficient Sleep values in the 67th - 100th percentile. Here, the darkest color shows up primarily in the Southern regions and Appalachia. Areas in the Northwest experience low rates of insufficient sleep and PHS, falling into the 0 - 33rd percentile for both variables.
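The 3 x 3 legend can be illustrated with a short sketch in plain Python. It assigns each county a tercile for each variable and pairs the two labels, so "3-3" corresponds to the darkest cell. The nearest-rank cutoffs, the function names, and the six county values are all assumptions made for illustration. They are not taken from the data set or from the mapping code actually used.

```python
import math


def tercile(values):
    """Map each value to 1 (lowest third), 2 (middle third) or 3 (highest third)."""
    ordered = sorted(values)
    n = len(ordered)
    low_cut = ordered[math.ceil(n / 3) - 1]       # nearest-rank 33rd percentile
    high_cut = ordered[math.ceil(2 * n / 3) - 1]  # nearest-rank 67th percentile
    return [1 if v <= low_cut else 2 if v <= high_cut else 3 for v in values]


def bivariate_class(phs, sleep):
    """Pair the two tercile labels; '3-3' is high PHS and high insufficient sleep."""
    return [f"{p}-{s}" for p, s in zip(tercile(phs), tercile(sleep))]


# Hypothetical values for six counties.
phs = [1500, 2800, 4200, 5100, 3300, 2100]
sleep = [28.0, 33.5, 38.1, 41.0, 30.2, 35.0]
print(bivariate_class(phs, sleep))
# ['1-1', '2-2', '3-3', '3-3', '2-1', '1-2']
```

Each of the nine possible labels would then be mapped to one color in the legend grid.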
A majority of the regression lines for the access to care variables are
flat, thus those variables will not be able to predict preventable
hospital stays very well. Other variables with steeper regression lines
will have to be considered to find good predictor variables.
The quality of care variables do not show a strong relationship with preventable hospital stays. Even if flu vaccine and mammography rates increase, the number of preventable hospital stays is not really affected.
We built a saturated gradient boosted decision tree model. We dropped
all variables with 30% or more missing values, which left 81 variables
for the predictive model. The model was fit with LightGBM. We tuned two
hyperparameters (the number of randomly selected predictors and the
number of trees) using 5-fold cross-validation with stratified sampling
on preventable hospital stays. The predictive performance on the test
set, measured by RMSE, MAE, and Huber loss, is shown below.
## # A tibble: 1 × 3
##    mtry trees .config
##   <int> <int> <chr>
## 1    55    92 Preprocessor1_Model23
## # A tibble: 3 × 3
##   .metric    .estimator .estimate
##   <chr>      <chr>          <dbl>
## 1 rmse       standard       2085.
## 2 mae        standard       1070.
## 3 huber_loss standard       1069.
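To make the three metrics concrete, the sketch below defines them in plain Python. The output above appears to come from R, so this is an illustration only and not the code used in the analysis. The Huber threshold `delta = 1` is an assumption (we believe it is the default in the yardstick package), and the observed and predicted values are hypothetical.

```python
import math


def rmse(truth, pred):
    """Root mean squared error: penalizes large errors quadratically."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))


def mae(truth, pred):
    """Mean absolute error: every unit of error counts equally."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


def huber_loss(truth, pred, delta=1.0):
    """Quadratic for |error| <= delta, linear beyond it (delta is assumed)."""
    total = 0.0
    for t, p in zip(truth, pred):
        a = abs(t - p)
        total += 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)
    return total / len(truth)


# Hypothetical observed and predicted PHS rates for four counties.
truth = [3000.0, 4500.0, 2500.0, 6000.0]
pred = [3200.0, 4000.0, 2600.0, 9000.0]
print(mae(truth, pred))         # 950.0
print(huber_loss(truth, pred))  # 949.5
print(rmse(truth, pred) > mae(truth, pred))  # True
```

When every error is much larger than `delta = 1`, the Huber loss reduces to the MAE minus 0.5, which would be consistent with the nearly identical MAE and Huber estimates reported above. An RMSE roughly twice the MAE suggests that a small number of counties have very large prediction errors, as the fourth county does in this toy example.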
Additionally, we calculated the SHAP values to see what variables were important in contributing to the overall predictive model.
We calculated all the SHAP values and displayed the 10 most important variables that contributed to the model. Race is very important when predicting preventable hospital stays. This demonstrates the importance of accounting for race when creating methods to address preventable hospital stays. Additionally, we graphed the 5 most important variables. Looking at the % of Non-Hispanic Black residents within a county, the graph demonstrates that when the % of people identifying as Black is low, the SHAP values are negative and the model predicts lower preventable hospital stays. However, as the % increases, the SHAP value increases as well. Furthermore, when looking at the SHAP graph broken down by race, we see the same groups for predicted preventable hospital stays. Asians have the lowest predicted values, then Hispanics and Whites, and finally AIANs and Blacks have the highest predicted preventable hospital stays. This pattern was also observed in our exploratory graphs.
The partial SHAP values contributed by the races match the exploratory data graphs. They are shown in ascending order of predicted preventable hospital stays: Asian, White & Hispanic, AIAN & Black. Asians have a high density of negative SHAP values, which indicates that the model predicts lower preventable hospital stays for this group, whereas Blacks and AIANs have positive SHAP values, which indicate that the model predicts higher preventable hospital stays. Also note that the curve is wider for Blacks and AIANs, meaning that there is more variation in SHAP values for these groups.
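To show what the sign of a SHAP value means, the sketch below computes exact Shapley values for a tiny made-up model by brute-force enumeration in plain Python. This is not the method used for the boosted model, where SHAP values are normally computed with a tree-specific algorithm. The model, its coefficients, the feature values, and the baseline row are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one row `x`; 'absent' features take baseline values."""
    n = len(x)

    def value(subset):
        row = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(row)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contrib += weight * (value(s | {i}) - value(s))
        phi.append(contrib)
    return phi


def toy_model(row):
    """Made-up model with one interaction term; not fitted to any data."""
    a, b, c = row
    return 100 + 2 * a - 3 * b + a * c


x = [10, 5, 4]         # hypothetical county
baseline = [2, 5, 1]   # hypothetical reference county
phi = shapley_values(toy_model, x, baseline)
print([round(p, 6) for p in phi])  # [36.0, 0.0, 18.0]
```

The values sum to the difference between the prediction for `x` and the prediction for the baseline (145 - 91 = 54). A positive value pushes the prediction above the baseline, a negative value pushes it below, and a feature that does not differ from the baseline receives zero.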
The County Health Rankings data set is aggregated at the county level, which can conceal variation within racial groups and therefore may not capture racial disparities and inequities accurately. Furthermore, using aggregated data can lead to misrepresented sample distributions, especially in smaller counties, where outliers have a much greater influence than in larger counties.
The data set had many missing values. To address this, we used only variables with fewer than 30% missing observations. We imputed the missing values using the median for numeric columns and the mode for nominal columns. Imputation replaces each missing value with an estimate, which can artificially reduce the variability in the data and adds uncertainty to the results.
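The filtering and imputation steps described above can be sketched in plain Python as follows. The column names and values are hypothetical, missing values are represented by `None`, and this is an illustration of the rule rather than the preprocessing code actually used.

```python
from statistics import median, mode, pstdev


def drop_sparse_columns(table, max_missing=0.30):
    """Keep columns whose share of missing (None) values is below `max_missing`."""
    return {
        name: col
        for name, col in table.items()
        if sum(v is None for v in col) / len(col) < max_missing
    }


def impute(col):
    """Fill None with the median (numeric column) or the mode (nominal column)."""
    observed = [v for v in col if v is not None]
    numeric = all(isinstance(v, (int, float)) for v in observed)
    fill = median(observed) if numeric else mode(observed)
    return [fill if v is None else v for v in col]


# Hypothetical table with five counties.
table = {
    "insufficient_sleep": [30.0, None, 36.0, 33.0, 40.0],  # 20% missing: kept
    "region": ["South", "West", None, "South", "South"],   # 20% missing: kept
    "sparse_measure": [None, None, 1.2, None, 3.4],        # 60% missing: dropped
}
cleaned = {name: impute(col) for name, col in drop_sparse_columns(table).items()}
print(cleaned["insufficient_sleep"])  # [30.0, 34.5, 36.0, 33.0, 40.0]
print(cleaned["region"])              # ['South', 'West', 'South', 'South', 'South']

# The filled-in column is less spread out than the observed values alone.
observed = [v for v in table["insufficient_sleep"] if v is not None]
print(pstdev(cleaned["insufficient_sleep"]) < pstdev(observed))  # True
```

The last comparison illustrates the limitation noted above: filling gaps with a central value pulls the distribution toward its middle and shrinks its spread.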
Due to unforeseen events, we built only one model, a gradient boosted decision tree. With 81 variables in the model, there is a high chance of collinearity and overfitting, so this model would most likely not generalize well to another data set.
We chose to focus on the quality of life in a given county based on sleep, physical health, and mental health. Future work could investigate how physical environment and quality of care affect preventable hospital stays. The variables used to measure quality of care were primarily rates of preventative measures like mammogram screenings and vaccinations. Other possible measures of quality of care include the percentage of physicians who are board-certified and patient-to-staff ratios.
Based on our research, we see that there are distinct racial disparities. Interventions specifically focused on the Black and AIAN populations are therefore needed, because these groups are disproportionately affected by preventable hospital stays. More research can be done on other factors, such as stressors that might lead someone to have frequent mental or physical distress and insufficient sleep. Furthermore, it is important to consider whether the level of trust in healthcare providers affects the use of outpatient care. Due to historical trauma and inequitable treatment, these racial groups might distrust going to the doctor, which creates an issue of access.
While racial disparities exist in the prevalence of certain health conditions, the impact of inadequate sleep on preventable hospitalizations remains a consistent factor. Targeted health support and healthcare measures could play a pivotal role in reducing preventable hospital stays and narrowing health disparities among racial populations. Healthcare providers, public health advocates, and small racial communities should prioritize sleep health as a critical component of efforts to improve overall mental and physical health outcomes and to reduce preventable hospitalizations across all racial groups.
Preventable hospital stays. (n.d.). Retrieved from https://www.countyhealthrankings.org/explore-health-rankings/county-health-rankings-model/health-factors/clinical-care/quality-of-care/preventable-hospital-stays?year=2023
Kulpatcharapong, S., Chewcharat, P., Ruxrungtham, K., Gonlachanvit, S., Patcharatrakul, T., Chaitusaney, B., Muntham, D., Reutrakul, S., & Chirakalwasan, N. (2020). Sleep Quality of Hospitalized Patients, Contributing Factors, and Prevalence of Associated Disorders. Sleep Disorders, 2020, 1–7. https://doi.org/10.1155/2020/8518396
Socioeconomic Factors:
We examined the rate of preventable hospital stays against socioeconomic factors: unemployment rate, high school completion rate, and income inequality, measured as the ratio of household income at the 80th percentile to household income at the 20th percentile.
High School Completion Rate
Unemployment
Income Inequality
Unemployment, High School Completion, and Income Inequality did not appear to contribute much to Preventable Hospital Stays. What we gathered from performing EDA on these variables is that there is a racial discrepancy, evident in the 3 clusters of data. In the 1st cluster, Asians have the lowest rates of PHS. In the 2nd cluster, Hispanics and Whites have similar PHS rates. In the 3rd cluster, Blacks and AIANs have the highest rates of PHS. Originally, we conjectured that counties with low unemployment rates, low income inequality, and high HS completion rates would have lower levels of PHS. However, that is not demonstrated in the scatterplots above. The majority of the points fall within a relatively small range on the x axis, which suggests that even if a county has a highly educated and employed population, this does not noticeably affect PHS. Thus, we decided to move on to variables related to quality of life, like insufficient sleep and factors that lead to insufficient sleep.