Aaditya Ramdas is an assistant professor at Carnegie Mellon University, in the Departments of Statistics and Machine Learning. He was one of the inaugural inductees of the COPSS Leadership Academy, and a recipient of the Bernoulli New Researcher Award. His work is supported by an NSF CAREER Award, an Adobe Faculty Research Award, an ARL Grant on Safe Reinforcement Learning, and the Block Center Grant for Technology and Society, among several others. Aaditya's main theoretical and methodological research interests include selective and simultaneous inference (interactive, structured, post-hoc control of false decision rates, etc.), sequential uncertainty quantification (confidence sequences, always-valid p-values, bias in bandits, etc.), and distribution-free black-box predictive inference (conformal prediction, calibration, etc.). His areas of applied interest include neuroscience, genetics, and voting. He is one of the organizers of the amazing and diverse StatML Group at CMU. Outside of work, here are some easy topics for conversation: travel/outdoors (hiking, scuba, etc.), trash-free living, completing the Ironman triathlon, and long-distance bicycle rides.
About The Conference
Now in its fifth year, the Carnegie Mellon Sports Analytics Conference is dedicated to highlighting the latest sports research from the statistics and data science community.
Interested in presenting your research at CMSAC? Then submit an abstract using the form below to enter our fourth annual Reproducible Research Competition!
Stay tuned for more information about the upcoming #CMSAC21. Check out our 2017, 2018, 2019, and 2020 conferences.
Registration
Registration is FREE for everyone. Registering indicates agreement to abide by the Code of Conduct.
Schedule Details
All times displayed are in EDT.
- 11:00 AM: Welcome and Opening Remarks (Rebecca Nugent)
- 11:05 AM: Keynote Address: Analytics at the USTA
- 12:00 PM: Break
- 12:55 PM: Break
- 1:05 PM: Run vs. Pass Play Prediction: Incorporating NFL Tracking Data (CMSACamp: Tej Seth and Nicole Tucker)
- 1:25 PM: An Interpretable Method of Learning Stochastic Game Dynamics (CMSACamp: Nicholas Ho, Sifan Tao, and Adhvaith Vijay)
- 1:45 PM: Break
- 1:55 PM: Maximizing wOBA with Launch Angle and Exit Velocity (CMSACamp: Brooke Coneeny, Erin Franke, and Sarah Sult)
- 2:15 PM: Tired of Misattribution, Modeling Player Fatigue in the NBA (CMSACamp: Grace Fain, Austin Stephen, and Matthew Yep)
- 2:35 PM: Break
- 2:45 PM: A Web Application for Manually Tracking Locational Event Data in Ice Hockey (Data + Software Student-Track Winner: An Nguyen)
- 3:00 PM: The SportsDataverse: An Open Source Initiative (Data + Software Open-Track Winner: Saiem Gilani)
- 3:15 PM: Break
- 3:20 PM: A Spatial Framework for Analyzing NFL Offensive Line Play (Methods Student-Track Finalist: Paul Ibrahim)
- 3:45 PM: Quantifying Hitter Plate Discipline in Major League Baseball (Methods Student-Track Finalist: Joshua Mould)
- 4:10 PM: Applying Hierarchical Bayesian Models to ATP Data (Methods Student-Track Finalist: Horace Shew)
- 4:35 PM: Break for voting
- 4:40 PM: An Analysis of Quarterback Ability to Hit Receivers In-Stride Using NFL Player Tracking Data (Methods Open-Track Finalist: Dani Treisman)
- 5:05 PM: An Examination of Sport Climbing's Competition Format and Scoring System (Methods Open-Track Finalist: Quang Nguyen)
- 5:30 PM: Break for voting
- 5:35 PM: Winners announced and closing remarks
Conference Speakers
Aaditya Ramdas (view paper)
Carnegie Mellon University
Comparing sequential forecasters: is 538 any better than Vegas-Odds?
Consider two or more forecasters, each making a probabilistic prediction for different events over time; for example Forecaster A might claim that Team A has an 80% chance of beating Team B, while Forecaster B thinks it is only 70%. We ask a relatively basic question: how might we compare these forecasters over time, without making any assumptions whatsoever on how the forecasts were generated (we cannot read minds!), or on the outcomes of the games? Is there a grounded way to say that one forecaster is "better" than the other in a "statistically significant" way? We present a novel and rigorous answer to this question, producing confidence intervals for differences in forecast quality that can be continuously monitored to yield valid comparisons at any time. Our coverage guarantees are also distribution-free, in the sense that they make no distributional assumptions whatsoever on the forecasts or outcomes. We demonstrate how they can be applied to Major League Baseball (MLB) games, comparing betting websites like Vegas-Odds with Nate Silver's fivethirtyeight. This is joint work with a PhD student at CMU, YJ Choe.
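To make the idea of a continuously monitored comparison concrete, here is a minimal Python sketch. It is not the method from the talk: it scores each forecaster with the Brier score (an assumed choice of scoring rule) and wraps the running mean score difference in a crude interval built from an Azuma-Hoeffding bound with a union bound over time. The function names and the default `alpha` are illustrative assumptions, and the resulting intervals are far wider than those a purpose-built confidence sequence would give.

```python
import math


def brier(p: float, y: int) -> float:
    """Brier score of forecast probability p for a binary outcome y (lower is better)."""
    return (p - y) ** 2


def running_comparison(forecasts_a, forecasts_b, outcomes, alpha=0.05):
    """Return (t, mean_diff, lower, upper) after each round t.

    mean_diff is the average of brier(A) - brier(B) so far, so negative
    values favor forecaster A. The interval is a crude, assumption-light
    bound: each score difference lies in [-1, 1], and the error budget
    alpha / (t * (t + 1)) sums to alpha over all rounds, so the intervals
    hold simultaneously over time with probability at least 1 - alpha.
    """
    results = []
    total = 0.0
    rounds = zip(forecasts_a, forecasts_b, outcomes)
    for t, (p, q, y) in enumerate(rounds, start=1):
        total += brier(p, y) - brier(q, y)
        alpha_t = alpha / (t * (t + 1))
        radius = math.sqrt(2.0 * math.log(2.0 / alpha_t) / t)
        mean = total / t
        results.append((t, mean, mean - radius, mean + radius))
    return results


# Hypothetical forecasts for two games: A says 80% both times, B says 70%.
history = running_comparison([0.8, 0.8], [0.7, 0.7], [1, 0])
for t, mean, lower, upper in history:
    print(t, round(mean, 3), round(lower, 3), round(upper, 3))
```

One forecaster could be declared better at the first round where the interval excludes zero; with only a handful of games the interval is wider than the full range of possible differences, which is the price of making no assumptions.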
CMSACamp 2021 Student Speakers
Brooke Coneeny (view slides)
Swarthmore College
Brooke Coneeny is a junior at Swarthmore College studying Mathematics (with an emphasis in Statistics) and Computer Science. She is a member of Swarthmore's varsity softball team and hopes to pursue a career in sports analytics, specifically softball or baseball.
Maximizing wOBA with Launch Angle and Exit Velocity
In the game of baseball, each batter’s goal is to be a high-production player who gets on base and drives in runs. However, not all players have the same swing or physical power, and they therefore have different production capabilities. Players that can consistently hit with a high exit velocity can lift the ball and put it over the fence, but players that naturally have lower exit velocities would routinely pop out if they used the same launch angle as these higher exit velocity hitters. With this intuition, we created a model that recommends a swing plane (attack angle) based on a batter’s power profile. We began by taking the player’s balls that were hit into play and created a linear model that predicted how the balls would leave the bat should their attack angle change. We then took these new launch angles and used a generalized additive model (GAM) to predict the quality of this new hit, as measured by wOBA. Unfortunately, this model did not take into account that when a player changes his attack angle, the pitches he receives and makes contact with will change. We attacked this problem by taking all balls a player swung at in a year and creating GAMs to predict which balls they would make contact with, and then which of these balls they would hit fair. These new pitches comprised the set of pitches we then ran through the original model, which allowed us to advise an optimal attack angle for the specific player.
Grace Fain
University of Oklahoma
Grace Fain is a senior at The University of Oklahoma, where she is studying Management Information Systems and Management with a concentration in Sports Management. After graduation, she plans to pursue a career in data analytics with interests in sports and business intelligence.
Tired of Misattribution, Modeling Player Fatigue in the NBA
The prevailing belief propagated by NBA league observers is that the workload of the NBA season dramatically influences a player's performance. We offer an analysis of cross-game player fatigue that calls into question the empirical validity of these claims.
Erin Franke (view slides)
Macalester College
Erin is a junior at Macalester College in Saint Paul, MN and is majoring in Statistics and Economics. She is working toward a career as an analyst and is especially passionate about analytics in the field of baseball and social good. She will be joining Macalester’s varsity track and cross country team this winter and also enjoys hiking and watching/playing sports.
Maximizing wOBA with Launch Angle and Exit Velocity
Nicholas Ho (view slides)
Arizona State University
My name is Nicholas Ho and I'm a 3rd year at Arizona State University studying Computer Science and Mathematics. My current interests are in modeling biophysical dynamical systems using deep learning techniques. I am a researcher in the Structural Systems Biology lab at the Biodesign Institute.
An Interpretable Method of Learning Stochastic Game Dynamics
In soccer, modeling expected goals in a match is difficult without the use of player tracking data. Many models that attempt to make score predictions depend almost exclusively on the outcome of previous matches, and hence tend to do a poor job of capturing high score differentials (as in when a team wins by a substantial margin over another team). Relying on the tracking data of the ball alone, we wanted to encapsulate the complex movement and forces acting on the ball into a much simpler object. This object is the potential function, which is simply an equation used to model underlying forces (e.g. gravitational potential functions). In a 2007 paper, David R. Brillinger described how to learn a potential function given a trajectory. The crux of this paper is that potential functions can be approximated using basis functions. In our case, these basis functions are a set of gravitational points, whose coefficients are based on the offensive and defensive movements of our teams, creating a potential function landscape unique to each team pairing. Through this “potential function landscape,” we are able to simulate games over time and create an averaged score prediction when two teams are pitted against each other. What we found is that predictions formed using this methodology capture high score differentials more reliably, and reduce both the MSE and residual variance compared to a Poisson regression model. We believe that this new method of modeling expected goals could also be used to determine player impacts in a game and provide a real-time game evaluation in the future.
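As a rough guide to what "approximating a potential function with basis functions" means, the setup can be written as below. The notation is an assumption made for illustration and is not taken from the talk: r(t) is the ball position, H the potential, phi_k the basis functions (here, the gravitational points), beta_k their coefficients, and sigma dB(t) a random perturbation.

```latex
\begin{align}
  d\mathbf{r}(t) &= -\nabla H\bigl(\mathbf{r}(t)\bigr)\,dt + \sigma\,d\mathbf{B}(t), \\
  H(\mathbf{r}) &\approx \sum_{k=1}^{K} \beta_k\,\phi_k(\mathbf{r}).
\end{align}
```

Because H is linear in the coefficients, so is its gradient, which means the coefficients can in principle be estimated by regressing observed position increments on the gradients of the basis functions.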
Tej Seth (view slides)
University of Michigan
Tej is currently a Junior at the University of Michigan majoring in Information Analysis and minoring in Applied Statistics. He enjoys working on sports analytics related projects, listening to true crime podcasts and spending time with family and friends.
Run vs. Pass Play Prediction: Incorporating NFL Tracking Data
In football, it is advantageous for the defense to be able to identify and gain insight into whether the offense is running a pass play or a run play. Models have been created in the past using situational factors of the play to predict play type. We sought to recreate these previous models but also incorporate NFL tracking data to improve these models. Incorporating tracking data from weeks 1-6 of the 2017 NFL season, we used various types of machine learning models to predict whether a play would be run or pass based on a multitude of situational factors and generated positional factors. Predictability of play type was compared across all 32 NFL teams. In an extension of our project, we created a model that updates play type predictions every tenth of a second from before the snap until 2.5 seconds after the snap.
Austin Stephen
University of Wyoming
I’m a third-year undergraduate at the University of Wyoming getting degrees in Computer Science and Statistics. I'm motivated by learning how data can offer clever insights into all kinds of challenging problems. Exploring this interest, I've worked on a range of projects including automated selection, modeling stock market volatility, and sports analytics. Beyond academia, I enjoy backpacking and trail running, taking advantage of the nature in Wyoming and Colorado.
Tired of Misattribution, Modeling Player Fatigue in the NBA
Sarah Sult (view slides)
Washington University in St. Louis
Sarah is a senior at Washington University in St. Louis majoring in Economics and Computer Science. Originally from Houston, Sarah grew up watching the Houston Astros which is where her love of baseball began. Her hobbies include equestrian showjumping and Krav Maga martial arts. After graduation she hopes to work as an analyst in baseball or business.
Maximizing wOBA with Launch Angle and Exit Velocity
Sifan Tao (view slides)
University of Virginia
My name is Sifan Tao and I'm a 3rd-year student at the University of Virginia studying statistics and mathematics. I'm currently interested in causal inference and statistical machine learning. During my free time, I love playing tennis and watching sports competitions.
An Interpretable Method of Learning Stochastic Game Dynamics
Nicole Tucker (view slides)
University of Evansville
Nicole Tucker is a junior at the University of Evansville where she is majoring in Mathematics and Data Science/Statistics with a minor in Communication. She is an avid football fan, and her main research interests lie in football analytics. Her passion lies in modern fan engagement, which has prompted her to find ways to use her statistics and communication background to connect fans with current sports analytics concepts. At the University of Evansville, Nicole is the president of the Statistics/Data Science Club and the Vice President of Academic Development for her chapter of Alpha Omicron Pi. Nicole currently works for Purple Aces Sports Properties (a division of Learfield) as an Associate Coordinator where she plays a role in University of Evansville game day operations and fulfillment of promotions.
Run vs. Pass Play Prediction: Incorporating NFL Tracking Data
Adhvaith Vijay (view slides)
UCLA
My name is Adhvaith Vijay and I am a 4th-year student at UCLA studying Statistics. I currently work for the Digital Humanities department as an undergraduate researcher and aspire to pursue a career in machine learning and customer analytics. Outside of academics, I play Badminton and Table Tennis.
An Interpretable Method of Learning Stochastic Game Dynamics
Matthew Yep
UC Berkeley
Hey, I'm Matt Yep. I'm a fourth year at UC Berkeley double majoring in Economics and Data Science, and I think that Jordan Poole is going to win Most Improved Player. I like to fish in the SF bay, read books about behavioural economics, and hoop on the practice squad with the Cal Women's basketball team.
Tired of Misattribution, Modeling Player Fatigue in the NBA
Reproducible Research Competition
Data + Software Student-Track Winner
An Nguyen
Harvey Mudd College
An is currently a senior at Harvey Mudd College (Claremont, CA) majoring in Computer Science. She enjoys British panel shows, fiddly needlepoint crafts, and watching various sports she wishes were played professionally in her hometown of Houston.
A Web Application for Manually Tracking Locational Event Data in Ice Hockey
Sometimes, sports data must be tracked by hand, as desired data may not be available through existing sources, if it exists at all. But manual tracking is often tedious and hard to translate into a useful form for analysis. In ice hockey, manual tracking is especially necessary in women’s and youth competitions, where data is often sparser and more difficult to access. To aid the process of manual tracking in ice hockey and encourage an increase in the breadth and depth of data available, this work describes an open-source web application designed to reduce the hardships of manually tracking locational event data in ice hockey. The web application distinguishes itself from similar applications by its user-friendliness and high level of customizability. By clicking on a location on a rink, the corresponding coordinates of that event are logged as a table row. Details in additional columns in the table are recorded using a details panel beside the rink. The data is downloadable in .csv format for further exploration and analysis. New details can be created with corresponding widgets in the details panel and columns in the table. The application also provides the ability to record one or two sets of coordinates per event, for tracking events where start and end locations are desired. These custom setups of the application with new details and options can be saved and later loaded into the web application, saving the time of recreating the environment.
Data + Software Open-Track Winner
Saiem Gilani (view paper and SportsDataverse)
Saiem Gilani is by trade a machine learning engineer specializing in computer vision with some full-stack capabilities. During the pandemic, Saiem started working on a concept called the SportsDataverse, essentially a data science and engineering for social good project. The intention is to make the sports data and analytics community more diverse, inclusive, and accessible for the everyday user by flattening the learning curve required to get started. The community of developers organizing around the SportsDataverse make this a reality by building easy-to-use packages and open-source data repositories. These days Saiem spends most of his time building production ML models and end-to-end systems engineering at Deloitte. Early career practices focused on quantitative modeling and finance from four years in healthcare consulting and medical malpractice in actuarial roles. More recent roles have included lead data scientist for a successful startup in the freight shipping industry and as data science course developer. Author of 13+ packages in Python, R, and Node.js. 2020 MIT Sloan Sports Analytics Conference Hackathon winning team member using college basketball spatio-temporal dataset. M.Sc. Analytics from Georgia Tech, B.Sc. Mathematics from Florida State University.
The SportsDataverse: An Open Source Initiative
The availability of open data streamlines the process for reproducible research and creates more accessible opportunities for research and development. One of the most impactful sports analytics papers of the last several years, "nflWAR: a reproducible method for offensive player evaluation in football", allowed for the reproducibility of the results of the paper via the nflscrapR package. Utilizing a similar principle, the SportsDataverse seeks to provide a more cohesive set of open-source sports data packages with emphasis on improving testing standards, promoting reproducible data analysis, and creating easily searchable documentation websites. The intention is for sports analytics projects to build on top of these packages to efficiently develop research concepts and analysis and merge relevant portions of their ideas into the data processing pipeline and packages. The SportsDataverse developer group has created a collection of software packages for sports data with modules written in Python, R, and Node.js. Collectively, the SportsDataverse packages cover 18 sports leagues' worth of data, including 11+ men's sports and 7+ women's sports, with plans for expansion. The initiative's first aim is to broaden the number of sports covered by the packages through existing data sources. Several of the packages written are directly modeled on the data engineering efforts of Sebastian Carl and the nflfastR team, including cfbfastR(py), hoopR(py), and wehoop(py). These packages allow for the loading of play-by-play and box score data for 16+ seasons of NBA, WNBA, men's and women's college basketball, and college football with parallel processing and progress updates. This saves the user countless hours of building data infrastructure that can instead be spent building their project.
Methods Student-Track Finalists
Paul Ibrahim (view paper)
University of Chicago
Paul Ibrahim is a college first year at the University of Chicago. His areas of interest are game theory, information theory, and the application of tracking data across sports.
A Spatial Framework for Analyzing NFL Offensive Line Play
In the NFL, the offensive line plays a crucial role in any offense. On passing plays inside the pocket (the scope of this paper), an offensive line’s task is generally two-fold: (A) provide defender-free space for the quarterback and (B) sustain this operable space for as long as possible. However, current measures attempting to assess offensive line play fall short of providing a holistic reflection of a line’s true effectiveness. Counting stats such as sacks and quarterback pressures have been shown to be unreliable, and more advanced metrics such as Pass Block Win Rate (PBWR) that are constructed around discrete “win-loss” outcomes do not provide a continuous measure that reflects the spatial fluidity inherent to offensive line play. In this paper, we seek to redefine the notion of pressure from a spatial perspective. Using publicly available tracking data from the 2020 NFL regular season, we design an analytical framework centered around spatial analysis within the pocket, using dynamic features of the quarterback’s spatiotemporal setting to offer more meaningful information about the quality of an offensive line’s play. The methodology outlined offers a more comprehensive understanding of what individual players on the offensive line, segments of the line, and the line as a whole contribute to the quarterback’s protection.
Joshua Mould (view paper)
Villanova University
I am a Junior Computer Science and Statistics double major at Villanova University with a passion for baseball and baseball analytics. This motivated me to found the Sports Analytics Club at Villanova my freshman year and pursue multiple research opportunities. I am pursuing a career in baseball analytics and hope to work for an MLB team. In my free time I enjoy playing card games and practicing card magic and cardistry, which I picked up as a hobby during the COVID-19 shutdown.
Quantifying Hitter Plate Discipline in Major League Baseball
We use Statcast data from the 2017-2021 Major League Baseball seasons to quantify the ability of players to make correct decisions about whether or not to swing at each pitch. Using player hitting ability and pitch characteristics, we use an xgboost model to predict the likelihood of each possible outcome from a swing, and from not swinging. From each outcome, we calculate the expected runs scored in the inning. Combining these, we get the expected runs if the player swings, and the expected runs if the player takes the pitch. We quantify a hitter's plate discipline by averaging the total increase in expected runs from each pitch thrown to the hitter. We show that this metric is stable over time, explains current year hitting performance (as measured by batting average/OBP/Whatever), and can improve predictions of future hitting performance.
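The arithmetic behind "expected runs if the player swings" versus "expected runs if the player takes" can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the outcome names, the probabilities, the run values, and the reading of "increase in expected runs" as the chosen action's expected runs minus the alternative's. In the paper the probabilities come from an xgboost model and the run values from the expected runs scored in the inning.

```python
def expected_runs(outcome_probs, run_values):
    """Probability-weighted average of run values over the possible outcomes."""
    return sum(prob * run_values[outcome] for outcome, prob in outcome_probs.items())


def decision_value(swing_probs, take_probs, run_values, swung):
    """Expected runs of the action the hitter chose minus those of the alternative."""
    swing = expected_runs(swing_probs, run_values)
    take = expected_runs(take_probs, run_values)
    return swing - take if swung else take - swing


def plate_discipline(pitches, run_values):
    """Average decision value over all pitches thrown to a hitter."""
    values = [
        decision_value(p["swing"], p["take"], run_values, p["swung"])
        for p in pitches
    ]
    return sum(values) / len(values)


# Made-up run values and outcome probabilities, for illustration only.
run_values = {"whiff": -0.05, "in_play": 0.10, "ball": 0.04, "called_strike": -0.05}
pitches = [
    {  # a pitch off the plate that the hitter chased
        "swing": {"whiff": 0.5, "in_play": 0.5},
        "take": {"ball": 0.9, "called_strike": 0.1},
        "swung": True,
    },
    {  # a hittable pitch in the zone that the hitter swung at
        "swing": {"whiff": 0.2, "in_play": 0.8},
        "take": {"ball": 0.1, "called_strike": 0.9},
        "swung": True,
    },
]
print(plate_discipline(pitches, run_values))
```

Under this reading, chasing the first pitch costs a little expected value and swinging at the second gains more, so the hitter's average comes out positive.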
Horace Shew (view paper)
Swarthmore College
Horace Shew is a senior at Swarthmore College majoring in mathematics with an emphasis in statistics. He also competes for the varsity swim team.
Applying Hierarchical Bayesian Models to ATP Data
The ATP tour is a tennis tour for professional men's tennis players managed by the Association of Tennis Professionals. The tour consists of tournaments hosted annually, held all across the world on different surfaces, and worth different amounts of ranking points, i.e. ATP Masters 1000, ATP 500, ATP 250. In addition, the Grand Slam tournaments also count for ranking points. The tournaments differ in size of the draw, surface, location, and when during the year the tournament takes place. Along with the tournament itself, the attributes of a player and their opponent in a match also factor into his result at a tournament. The current ranking system has remained in place since 2009 and measures player performance over the long term, but we also want to capture short-term fluctuations in player performance. In this paper, we develop a hierarchical Bayesian model to predict player performance that takes these types of factors into account, using public data made available by Jeff Sackmann and Tennis Abstract (Sackmann, 2021). We examine the effects of tournament and player attributes on performance on the tour from 2009-2020. These models provide insight into how match-level, tournament-level, and player-level effects interact to influence player performance.
Methods Open-Track Finalists
Dani Treisman (view paper)
DePaul University
Dani Treisman is a healthcare data analyst living in Chicago, IL. He has a B.S. in Statistics from Loyola University Chicago and is currently pursuing a Master's in Data Science at DePaul University. His newfound passion for sports analytics has led him to research various aspects of American sports through participation in several sports analytics competitions. His work can be found here.
An Analysis of Quarterback Ability to Hit Receivers In-Stride Using NFL Player Tracking Data
This report is an investigation into the ability of NFL quarterbacks to hit receivers in-stride. This ‘ability’ is defined here as the quarterback’s skill in preventing extreme orientation, acceleration, and speed changes for the receiver (i.e. back-shoulder catches), from the moment of the throw to the moment the pass arrives at the receiver. This work provides a starting point for more specific methods of quarterback evaluation using frame-level player tracking data and provides evidence that quarterbacks do have some ability to hit receivers in-stride. The main findings in this paper are: 1) the aggregated (by QB) observed-expected values for receiver Acceleration Difference Over Expected correlate moderately with average Completion Probability Over Expected (CPOE), 2) average (by QB) receiver Speed Difference Over Expected and Yards After Catch difference from pass forward to pass arrival (xYAC Difference) are highly correlated, and 3) average Acceleration and Speed differences Over Expected are stable for QBs within-season.
Quang Nguyen (view paper)
Loyola University Chicago
Quang Nguyen is a graduate student at Loyola University Chicago, pursuing a Master of Science degree in Applied Statistics. His interests include statistical applications in sports, data visualization, and reproducibility in data science. Quang previously completed his undergraduate degree in Mathematics and Data Science at Wittenberg University in Springfield, Ohio. Quang is originally from Nha Trang, Khanh Hoa, which is located in the South Central Coast of Vietnam. In his free time, you may find Quang watching sports, listening to podcasts, and cooking. He is a die-hard supporter of Manchester United F.C. of the English Premier League.
An Examination of Sport Climbing's Competition Format and Scoring System
Sport climbing, which made its Olympic debut at the 2020 Summer Games, generally consists of three separate disciplines: speed climbing, bouldering, and lead climbing. However, the International Olympic Committee (IOC) only allowed one set of medals per gender for sport climbing. As a result, the governing body of sport climbing, rather than choosing only one of the three disciplines to include in the Olympics, decided to create a competition combining all three disciplines. In order to determine a winner, a combined scoring system was created using the product of the ranks across the three disciplines to determine an overall score for each climber. In this work, the rank-product scoring system of sport climbing is evaluated through simulation to investigate its general features, specifically, the advancement probabilities for climbers given certain placements. Additionally, analyses of historical climbing contest results are presented and real examples of violations of the independence of irrelevant alternatives are illustrated. Finally, this work finds evidence that the current competition format is putting speed climbers at a disadvantage.
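As a rough illustration of the rank-product idea described in this abstract, the Python sketch below multiplies each climber's rank across the three disciplines, with the lowest product placing first. The climbers "A", "B", and "C" and their finishing orders are made up for this example and are not taken from the paper; the function names are likewise hypothetical, and ties are ignored. The toy data is chosen so that dropping one climber flips the order of the other two, which is the kind of independence-of-irrelevant-alternatives violation the abstract mentions.

```python
from math import prod


def rank_product_scores(orders):
    """Return each climber's product of ranks (1 = best) across disciplines.

    `orders` maps a discipline name to a list of climbers, best first.
    Ties are not handled in this sketch.
    """
    climbers = next(iter(orders.values()))
    return {
        c: prod(order.index(c) + 1 for order in orders.values())
        for c in climbers
    }


def without(orders, climber):
    """Re-rank every discipline as if `climber` had not competed."""
    return {d: [c for c in order if c != climber] for d, order in orders.items()}


# Hypothetical finishing orders (assumption, not real competition data).
orders = {
    "speed": ["B", "C", "A"],
    "bouldering": ["C", "A", "B"],
    "lead": ["C", "A", "B"],
}

full = rank_product_scores(orders)                   # all three climbers
reduced = rank_product_scores(without(orders, "C"))  # same results, C removed

print(full)     # B places ahead of A (9 vs. 12) while C is in the field
print(reduced)  # A places ahead of B (2 vs. 4) once C is removed
```

In this toy field, B's speed win counts for more while C sits between B and A in that discipline; removing C shrinks that gap and A's two head-to-head wins decide the order instead.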
TBA
Contact Us
The Carnegie Mellon Sports Analytics Conference is proudly hosted by the Department of Statistics & Data Science
and the Carnegie Mellon Sports Analytics club.
CMSAC Program Committee:
Carnegie Mellon Sports Analytics Club Executives
- Connor O’Keefe
- Zach Smolar
- Eric Hoeffel
Questions can be directed to cmsac@stat.cmu.edu.
CMSAC Activities Conduct Policy
(modeled on the ASA Activities Conduct Policy approved November 30, 2018 by American Statistical Association Board of Directors)
The Carnegie Mellon Sports Analytics Conference (CMSAC) is committed to providing an atmosphere in which personal respect and intellectual growth are valued and the free expression and exchange of ideas are encouraged. Consistent with this commitment, it is CMSAC policy that all participants in CMSAC activities enjoy a welcoming environment free from unlawful discrimination, harassment, and retaliation. We strive to be a community that welcomes and supports people of all backgrounds and identities. This includes, but is not limited to, members of any race, ethnicity, culture, national origin, color, immigration status, social and economic class, educational level, sex, sexual orientation, gender identity and expression, age, size, family status, political belief, religion, and mental and physical ability.
All CMSAC participants (including, but not limited to, attendees, statisticians, data scientists, sports analysts, students, registered guests, staff, contractors, sponsors, exhibitors, and volunteers) in the conference or any other related activity, whether official or unofficial, agree to comply with all rules and conditions of the activities. Your registration for or attendance at the 2020 Carnegie Mellon Sports Analytics Conference indicates your agreement to abide by this policy and its terms.
Expected Behavior
- Model and support the norms of professional respect necessary to promote the conditions for healthy exchange of scientific ideas.
- Speak and conduct yourself professionally; do not insult or disparage other participants.
- Be conscious of hierarchical structures in the sports analytics and/or broader statistics/data science community, specifically the existence of stark power differentials among students, junior analysts/statisticians, and senior analysts/statisticians—noting that fear of retaliation from those in senior-level positions can make it difficult for students or those in junior level positions to express discomfort, rebuff unwelcome advances, and report violations of the conduct policy.
- Be sensitive to body language and other non-verbal signals and respond respectfully.
Unacceptable Behavior
- Violent threats or language directed against another person
- Discriminatory jokes and language
- Inclusion of unnecessary sexually explicit, violent, or otherwise sensitive materials in presentations
- Posting (or threatening to post), without permission, other people’s personally identifying information online, including on social networking sites
- Personal insults including, but not limited to, those using racist, sexist, homophobic, or xenophobic terms
- Unwelcome solicitation of emotional or physical intimacy such as sexual advances; propositions; sexual flirtations; sexually-related touching; and graphic gestures or comments about sex or another person’s dress, body, or sexual activities
- Advocating for, encouraging, or dismissing the severity of any of the above behaviors.
Consequences of Unacceptable Behavior
At the sole discretion of the CMSAC Program Committee, unacceptable behavior may result in removal from or denial of access to meeting facilities or activities, without refund of any applicable registration fees or costs. In addition, the CMSAC reserves the right to report violations to an individual’s employer or institution or to a law-enforcement agency. Those engaging in unacceptable behavior may also be banned from future CMSAC activities or face additional penalties.
What to Do if You Witness or Are Subject to Unacceptable Behavior
If you are being harassed, notice that someone else is being harassed, or have any other concerns relating to harassment, please contact a member of the CMSAC program committee either in person or at cmsac@stat.cmu.edu. If you witness potential harm to a conference participant, be proactive in helping to mitigate or avoid that harm; if you see or hear something that concerns you, please say something.
Process for Adjudicating Reports of Misconduct
The CMSAC will contract with an independent entity to manage and adjudicate reported violations of the conduct policy.
Note: This Code of Conduct may be revised at any time by the Carnegie Mellon Sports Analytics Conference. Questions, concerns, or comments should be directed to cmsac@stat.cmu.edu.