PHIGHT COVID RESEARCH PROJECT
MSP Team: Cheyenne Ehman, Shirley Luo, Zi Yang, Ziyan Zhu
Seema Lakdawala
Valerie Ventura

Yixuan: Objectives change:
  * How do statewide school policies affect transmission at the county level (Ohio data)? Are children an important factor?
  * How do we measure xmit?
  * Does teaching method affect case/death rates?
  * What covariates matter?

Zi: Data sets: 2 data sets:
  - Ohio county-level cases: 36K obs, 17 vars
  - Ohio K12 data (demographics and policies)

Ziyan: data cleaning and wrangling
  omit some variables
  manually correct wrong and NA values
  some rule for dropping when too much is missing
  (aggregating to county level I think)
  sl 10: Bracketed data to reopen data and 10 days before xmas break
  deaths increase per population

Cheyenne: Methods
  EDA to find relevant covariates
  ANOVA (Dunnett, Scheffé tests, etc. for multiple comparisons)
  Time series

Ziyan: Bracketing helps focus and gets significant results for

Yixuan: roadblocks
  many important covariates inexact at best
  sl 17: online only has a mask vs no mask difference (is this a policy variation?)
QA partic

--------------------------
HCI -- Skill prerequisites
MSP Team: Smeet Poladia, Kelsie Lu, Elaine Xu

Elaine Zhou - the data (lots of verbal description but that's where they are)
  describing simple conditional prob estimation as EDA for 0/1 response
  EDA for assistance score (need to log the assistance score and/or omit weird super extreme outliers)
  assistance scores def need to be logged

Smeet - methods
  gaussian graphical models
  look at clusters, try to understand prereq relats from the GGMs

Elaine - learning curves (prelim illustr)

Smeet - next steps
  improve assistance score to a "more robust metric"
  working on prereq relats from associational data
  more predictors for learning curves/AFM model

QA Elaine Zhou

--------------------------
IFDA Talent PROJECT
MSP Team: Lanyi Xu, Malik Khan, Xiaofan Zhu, Yanxi Zhou, Echo Luan

Malik: introduction
  narrowed focus from recruitment + retention to just recruitment
  looking at data and WIP for Apr 9
  - sl 2: how much data are you waiting for
      BLS 2019
      limitation: 2 of 5 clients have given data; waiting on other 3
  - sl 3: what they got from each of the two companies
      qualitative responses from structured interviews
      less model building and more feeding back what clients know and don't know; recommendations for further data collection and analysis

Xiaofan - deeper into public data
  BLS: good EDA of BLS data but now connect to IFDA questions
  sl 7 left - purple most relevant?

Yanxi: client data:
  sl 8: really interesting that most terminations are for new hires!!! (is this a recruitment or retention thing? or both?)

Lanyi: turnover rate = # turnover / average head count (2018 and 2019 data)
  sl 10: what is "turnover"
  interesting regional effects - maybe also some time series effects?

Echo: hourly wages
  sl 12: can we compare these wages with industry in general and/or with employers outside IFDA (e.g. Amazon)
  sl 13: why is termination so much higher for semi delivery and warehouse shipping?
  I didn't understand sl 14

Malik: part of our goal is to help clients know what data to collect and complete
  next steps: recommendation to clients: normalize data between clients! (same data from each client)
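
Side sketch for the sl 10 question (what is "turnover"): Lanyi's definition, turnover rate = # turnover / average head count, written out in Python. This is only an illustration, not the team's code. The function name `turnover_rate`, the choice to approximate average head count as the mean of the start-of-period and end-of-period head counts, and the numbers in the usage line are all assumptions; the notes do not say how the team computes the average or what counts as a turnover event.

```python
def turnover_rate(n_turnover: int, headcount_start: int, headcount_end: int) -> float:
    """Turnover rate = # turnover / average head count.

    ASSUMPTION: average head count is taken as the mean of the head counts
    at the start and end of the period; the notes do not specify this.
    """
    avg_headcount = (headcount_start + headcount_end) / 2
    if avg_headcount <= 0:
        raise ValueError("average head count must be positive")
    return n_turnover / avg_headcount


# Hypothetical client-year: 30 terminations, head count going from 110 to 90.
rate = turnover_rate(30, 110, 90)
print(f"turnover rate: {rate:.2f}")
```

If every client reports the same three inputs (termination count, start head count, end head count) per year, the rates are directly comparable across clients, which is the point of the "normalize data between clients" recommendation.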