class: center, middle, inverse, title-slide

# Welcome to CMSACamp
## Background and overview
### June 1st, 2021

---

## Meet the instructors

- Teaching Assistant: __Beomjo Park__
  - started [@CMU_Stats](https://twitter.com/CMU_Stats) PhD in '18
  - Previously Korea University, and research assistant at NCSoft (owner of the NC Dinos!)
  - Research: Robust Statistical Inference, Model misspecification, Bayesian Nonparametrics

- Teaching Assistant: __Nicholas Kissel__
  - started [@CMU_Stats](https://twitter.com/CMU_Stats) PhD in '19
  - Previously University of Pittsburgh '19, MS in statistics and BS in math & statistics
  - Research: Creating inferential procedures for machine learning modeling methods

--

.pull-left[

- Instructor: __Ron Yurko__ ([@Stat_Ron](https://twitter.com/Stat_Ron))
  - [@CMU_Stats](https://twitter.com/CMU_Stats) '15, started PhD in '17
  - Pittsburgh Pirates analytics intern '14
  - Part-time Data Scientist [@ZelusAnalytics](https://zelusanalytics.com/)
  - Research: statistical genetics, _statistics in sports_ / _sports analytics_, and variable selection for model-based clustering

]

.pull-right[

.center[]

]

---

## Statistics in sports research?

You might think statistics in sports or sports analytics research is relatively new...

--

.pull-left[

Professors [Brad Efron](http://statweb.stanford.edu/~ckirby/brad/) and [Carl Morris](https://statistics.fas.harvard.edu/people/carl-n-morris) disagree

- "Data analysis using Stein's estimator and its generalizations"
  - _Journal of the American Statistical Association_ (__1975__)
  - Introduction of __Empirical Bayes__ to sports
  - Improve accuracy by pooling information from other players

]

.pull-right[

.center[]

]

---

## Sports analytics research __starts with the data__

.center[]

Cervone et al.
["A multiresolution stochastic process model for predicting basketball possession outcomes."](https://arxiv.org/pdf/1408.0777.pdf) _Journal of the American Statistical Association_ (2016)

---

## Sports analytics research __starts with the data__

.center[]

Cervone et al. ["A multiresolution stochastic process model for predicting basketball possession outcomes."](https://arxiv.org/pdf/1408.0777.pdf) _Journal of the American Statistical Association_ (2016)

---

## NFL Big Data Bowl tracking data example

<img src="http://www.stat.cmu.edu/cmsac/sure/materials/img/rfcde_gif_td_run_update.gif" width="150%" style="display: block; margin: auto;" />

Yurko et al. ["Going deep: models for continuous-time within-play valuation of game outcomes in American football with tracking data."](https://arxiv.org/pdf/1906.01760.pdf) _Journal of Quantitative Analysis in Sports_ (2020)

---

### General outline and key dates (subject to change, all times in ET)

__Lectures__: Monday through Friday, 12 to 1:30 PM (Ron's office hours are Mondays 4:30 to 5:30 PM)

__Labs__: Monday through Thursday, 2:30 to 4 PM (Beomjo's office hours are Thursday 4 to 5 PM, Nick's are Friday 2 to 3 PM)

- Will begin with mini projects & practice presentations before shifting focus to main projects

__CMSAConvo speaker series__: 3 to 4:30 PM every Friday

--

.pull-left[

- First two weeks, June 1-11:
  - EDA, data visualization, clustering
  - Presentations from project advisors
  - __June 13: Project preference deadline__

- June 14-25:
  - Linear models and model assessment
  - Regularization and dimension reduction
  - __June 21-23: Resume check / career conversations with Professor Nugent__
  - __June 24-25: EDA presentations__

]

--

.pull-right[

- June 28 - July 9:
  - Flexible models, machine learning
  - Labs will shift focus to main projects
  - __July 8-9: Modeling presentations__

- July 12 - 30:
  - Special topics (e.g. text analysis)
  - Focus on projects!
  - __July 30: Final project presentations__

Plus other guest speakers (__check your email!__)

]

---

## Goals for the summer

.pull-left[

- Develop fundamental research skills: data wrangling, visualization, modeling, communication
- Become familiar with `R`, `tidyverse`, `ggplot2`, `markdown`, GitHub
- Complete a statistical learning bootcamp
- Create a portfolio of projects with GitHub and practice reproducible research
- __All presentations will be made using `R` Markdown with [xaringan](https://bookdown.org/yihui/rmarkdown/xaringan.html)!__
- Network with academic researchers and industry professionals

]

--

.pull-right[

__Ask questions, learn, and grow__

.center[]

Senior Academic Advisor Samantha Nielsen ([samanthn@andrew.cmu.edu](mailto:samanthn@andrew.cmu.edu))

]

---

## Resources to remember!

- CMSACamp website: [http://www.stat.cmu.edu/cmsac/sure/2021/materials/](http://www.stat.cmu.edu/cmsac/sure/2021/materials/)
  - Check out the [References](http://stat.cmu.edu/cmsac/sure/2021/materials/references.html) tab for links to online textbooks and other useful references
  - [Data Sources](http://stat.cmu.edu/cmsac/sure/2021/materials/data_sources.html) tab for links to various public datasets

- We will also use Slack to communicate and to share interesting articles and materials throughout the summer
  - See previous email from Professor Nugent with the workspace invitation link

--

.center[]

---

## CMSACamp alumni

.pull-left[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Just realized I haven’t announced this on twitter yet, but I have officially committed to earn my PhD in statistics at <a href="https://twitter.com/virginia_tech?ref_src=twsrc%5Etfw">@virginia_tech</a> !!
<a href="https://t.co/TKDR3uXRVj">pic.twitter.com/TKDR3uXRVj</a></p>&mdash; Danielle Sebring (@DSebring17) <a href="https://twitter.com/DSebring17/status/1242977905203769346?ref_src=twsrc%5Etfw">March 26, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

.pull-right[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Happy and humbled to announce that I'll be returning to the one and only <a href="https://twitter.com/CMU_Stats?ref_src=twsrc%5Etfw">@CMU_Stats</a> as a PhD student this fall!<br>See you in Pittsburgh!</p>&mdash; Thea Sukianto (@stats_sukianto) <a href="https://twitter.com/stats_sukianto/status/1373010973733257217?ref_src=twsrc%5Etfw">March 19, 2021</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

---

## CMSACamp alumni

.pull-left[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Super excited and honored to be chosen as one of the winners of this year's <a href="https://twitter.com/hashtag/BigDataBowl?src=hash&ref_src=twsrc%5Etfw">#BigDataBowl</a>! Big shoutouts to <a href="https://twitter.com/sarahrunbailey?ref_src=twsrc%5Etfw">@sarahrunbailey</a> for being a fantastic mentor and <a href="https://twitter.com/StatsbyLopez?ref_src=twsrc%5Etfw">@StatsbyLopez</a> and crew for putting this amazing competition together! <a href="https://t.co/Z5VSq6xpwJ">https://t.co/Z5VSq6xpwJ</a></p>&mdash; Jill Reiner (@jillhreiner) <a href="https://twitter.com/jillhreiner/status/1357732578309009411?ref_src=twsrc%5Etfw">February 5, 2021</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

.pull-right[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Thank you <a href="https://twitter.com/SloanSportsConf?ref_src=twsrc%5Etfw">@SloanSportsConf</a> for an outstanding conference.
<a href="https://twitter.com/j_bosch10?ref_src=twsrc%5Etfw">@j_bosch10</a> <a href="https://t.co/3ekyk0wvHC">https://t.co/3ekyk0wvHC</a></p>&mdash; Sam Kalman (@sam_kalman_) <a href="https://twitter.com/sam_kalman_/status/1236435614452592643?ref_src=twsrc%5Etfw">March 7, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

---

## CMSACamp alumni

.pull-left[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Getting to work in the NHL has been a dream come true💛🖤! Super excited to continue working with Sam, Nick, and others in the Penguins organizations moving forward! <a href="https://twitter.com/hashtag/LetsGoPens?src=hash&ref_src=twsrc%5Etfw">#LetsGoPens</a> <a href="https://t.co/h509BMkgOn">https://t.co/h509BMkgOn</a></p>&mdash; Katerina Wu (@kattaqueue) <a href="https://twitter.com/kattaqueue/status/1369366870009208833?ref_src=twsrc%5Etfw">March 9, 2021</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

.pull-right[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">It’s unusual time but I am happy to share that I am graduating. First one to go to college and to graduate from both sides of my family <a href="https://twitter.com/hashtag/firstgen?src=hash&ref_src=twsrc%5Etfw">#firstgen</a> and <a href="https://twitter.com/hashtag/rstats?src=hash&ref_src=twsrc%5Etfw">#rstats</a> graduate. <a href="https://t.co/ZDZU5M3SnY">pic.twitter.com/ZDZU5M3SnY</a></p>&mdash; Kapil.Khanal (@almost_kapil) <a href="https://twitter.com/almost_kapil/status/1258874572365082629?ref_src=twsrc%5Etfw">May 8, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

---

class: center, middle

# And now it's your turn...

--

# (but we're here to help!)

.center[]