class: center, middle, inverse, title-slide

# Welcome to CMSACamp
## Background and overview
### 2020/06/01

---

## Meet the instructors

- Teaching Assistant: __Pratik Patil__
  - started [@CMU_Stats](https://twitter.com/CMU_Stats) & [Machine Learning](https://www.ml.cmu.edu/) PhD in '17
  - Previously Indian Institute of Technology and University of Toronto
  - Research: Statistical Machine Learning, Optimization, Information and Coding theory
  - Particularly enjoys sports involving bouncy spheres and rackets and bats

- Teaching Assistant: __Beomjo Park__
  - started [@CMU_Stats](https://twitter.com/CMU_Stats) PhD in '18
  - Previously Korea University, and research assistant at NCSoft (owner of the NC Dinos!)
  - Research: Bayesian Nonparametrics, Variational Inference, Robust Statistical Inference, Model misspecification

--

.pull-left[

- Instructor: __Ron Yurko__ ([@Stat_Ron](https://twitter.com/Stat_Ron))
  - [@CMU_Stats](https://twitter.com/CMU_Stats) '15, started PhD in '17
  - Pittsburgh Pirates analytics intern '14
  - Research: [_selective inference_](https://sites.google.com/view/selective-inference-seminar) in statistical genetics and genomics
  - Actively publishes _statistics in sports_ / _sports analytics_ research (more on that later...)

]

.pull-right[

.center[![](https://media1.tenor.com/images/e04caa040893f7d3fc11c24f6fab2de4/tenor.gif?itemid=8079093)]

]

---

## Statistics in sports research?

You might think statistics in sports or sports analytics research is relatively new...
--

.pull-left[

Professors [Brad Efron](http://statweb.stanford.edu/~ckirby/brad/) and [Carl Morris](https://statistics.fas.harvard.edu/people/carl-n-morris) disagree

- "Data analysis using Stein's estimator and its generalizations"

- _Journal of the American Statistical Association_ (__1975__)

- Introduction of __Empirical Bayes__ to sports

- _We'll revisit this in a few weeks!_

]

.pull-right[

.center[![](http://www.stat.cmu.edu/cmsac/sure/materials/img/stein-baseball1.png)]

]

---

## Growth in modern sports analytics research __starts with the data__

.center[![](https://d3i71xaburhd42.cloudfront.net/e64e2935a7d2565f8d29250fec9f039ed0767cde/4-Figure1-1.png)]

Cervone et al. ["A multiresolution stochastic process model for predicting basketball possession outcomes."](https://arxiv.org/pdf/1408.0777.pdf) _Journal of the American Statistical Association_ (2016)

---

## Growth in modern sports analytics research __starts with the data__

.center[![](https://d3i71xaburhd42.cloudfront.net/e64e2935a7d2565f8d29250fec9f039ed0767cde/5-Figure2-1.png)]

Cervone et al. ["A multiresolution stochastic process model for predicting basketball possession outcomes."](https://arxiv.org/pdf/1408.0777.pdf) _Journal of the American Statistical Association_ (2016)

---

## NFL Big Data Bowl tracking data example

<img src="http://www.stat.cmu.edu/cmsac/sure/materials/img/rfcde_gif_td_run_update.gif" width="150%" style="display: block; margin: auto;" />

Yurko et al.
["Going deep: models for continuous-time within-play valuation of game outcomes in American football with tracking data."](https://arxiv.org/pdf/1906.01760.pdf) _Journal of Quantitative Analysis in Sports_ (2020)

---

### Outline of summer

[http://www.stat.cmu.edu/cmsac/sure/materials/](http://www.stat.cmu.edu/cmsac/sure/materials/) (subject to change)

--

- First two weeks, June 1-12:
  - Getting started with `R` and `tidyverse`, GitHub, EDA, data visualization with `ggplot2`
  - Clustering and Gaussian mixture models
  - __EDA project presentations June 12th__

--

- June 15-26:
  - Linear models, GLMs, regularization, bias-variance tradeoff
  - Dimension reduction, principal component analysis (PCA)
  - __Regression project presentations June 26th__

--

- June 29 - July 3: TBA!

--

- July 6 - 17:
  - Smoothing regression, generalized additive models (GAMs)
  - Empirical Bayes, multilevel modeling
  - Decision trees, ensemble models, neural networks

--

- July 20 - 29:
  - Focus on final projects!
  - __Final project presentations July 27 - 29__

---

## CMSACamp 2019 alumni

.pull-left[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Just realized I haven’t announced this on twitter yet, but I have officially committed to earn my PhD in statistics at <a href="https://twitter.com/virginia_tech?ref_src=twsrc%5Etfw">@virginia_tech</a> !! <a href="https://t.co/TKDR3uXRVj">pic.twitter.com/TKDR3uXRVj</a></p>— Danielle Sebring (@DSebring17) <a href="https://twitter.com/DSebring17/status/1242977905203769346?ref_src=twsrc%5Etfw">March 26, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

.pull-right[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Thank you <a href="https://twitter.com/SloanSportsConf?ref_src=twsrc%5Etfw">@SloanSportsConf</a> for an outstanding conference.
<a href="https://twitter.com/j_bosch10?ref_src=twsrc%5Etfw">@j_bosch10</a> <a href="https://t.co/3ekyk0wvHC">https://t.co/3ekyk0wvHC</a></p>— Sam Kalman (@sam_kalman_) <a href="https://twitter.com/sam_kalman_/status/1236435614452592643?ref_src=twsrc%5Etfw">March 7, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

---

## CMSACamp 2019 alumni

.pull-left[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Honored to be a part of this! <a href="https://t.co/lsEQ8sBpPt">https://t.co/lsEQ8sBpPt</a></p>— Audrey Bertin (@ambertin99) <a href="https://twitter.com/ambertin99/status/1234869526409490433?ref_src=twsrc%5Etfw">March 3, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

.pull-right[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Thank you for all the positive responses and feedback on my presentation! I had a wonderful time at <a href="https://twitter.com/hashtag/CBJHAC?src=hash&ref_src=twsrc%5Etfw">#CBJHAC</a> and learned so much this past weekend. For those who want another look, my slides can be found here: <a href="https://t.co/fgKQ0R65Ep">https://t.co/fgKQ0R65Ep</a></p>— kat! (@kattaqueue) <a href="https://twitter.com/kattaqueue/status/1226998268527181824?ref_src=twsrc%5Etfw">February 10, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

---

## CMSACamp 2019 alumni

.pull-left[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Check out my boxing rankings for each division at <a href="https://t.co/dRckV99gCx">https://t.co/dRckV99gCx</a> Going to be adding some more interesting stuff soon.
<a href="https://twitter.com/hashtag/boxing?src=hash&ref_src=twsrc%5Etfw">#boxing</a> <a href="https://twitter.com/hashtag/sports?src=hash&ref_src=twsrc%5Etfw">#sports</a> <a href="https://twitter.com/hashtag/canelo?src=hash&ref_src=twsrc%5Etfw">#canelo</a> <a href="https://twitter.com/hashtag/ggg?src=hash&ref_src=twsrc%5Etfw">#ggg</a> <a href="https://twitter.com/hashtag/fridaymorning?src=hash&ref_src=twsrc%5Etfw">#fridaymorning</a> <a href="https://twitter.com/hashtag/tysonfury?src=hash&ref_src=twsrc%5Etfw">#tysonfury</a></p>— Jeremy Sanchez (@_jsanchez1) <a href="https://twitter.com/_jsanchez1/status/1261342706107772928?ref_src=twsrc%5Etfw">May 15, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

.pull-right[

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">It’s unusual time but I am happy to share that I am graduating. First one to go to college and to graduate from both sides of my family <a href="https://twitter.com/hashtag/firstgen?src=hash&ref_src=twsrc%5Etfw">#firstgen</a> and <a href="https://twitter.com/hashtag/rstats?src=hash&ref_src=twsrc%5Etfw">#rstats</a> graduate. <a href="https://t.co/ZDZU5M3SnY">pic.twitter.com/ZDZU5M3SnY</a></p>— Kapil.Khanal (@almost_kapil) <a href="https://twitter.com/almost_kapil/status/1258874572365082629?ref_src=twsrc%5Etfw">May 8, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

]

---

class: center, middle

# And now it's your turn