Carnegie Mellon's Undergraduate Statistics & Data Science

Carnegie Mellon Sports Analytics Club


are proud to present the inaugural

Carnegie Mellon Sports Analytics Conference
with NFL Tartan Data Science Cup:

October 28-29, 2017


Open to the community, including high school students.


Learn more about the CMU Sports Analytics Club and Conference

October 28, 2017



The Tartan Data Science Cup is a series of Kaggle-like data analysis competitions exclusively for CMU undergraduates and local high school students. Each competition will have a different theme, research scenario, goals, and solutions. The problem description, research question, and data sets will not be released until the specified dates and times. Students will submit their answers by a given deadline; selected finalists will present in front of a panel of judges.

Winning teams will receive cash prizes, (temporary ownership of) the Tartan Data Science Cup, and glory. So much glory.


Episode IV: Return of the Kickoff Timeline


The problem, data, and rules for Episode IV can be found here.

The submission site for the report is here.

The submission site for the slides/code is here.
Put files in a zip/compressed folder or submit them one at a time.



Registration Deadline: Saturday October 28th, 2017, 11:59pm

All currently enrolled high school students and Carnegie Mellon University undergraduate students on the Pittsburgh campus are eligible to participate. Teams can consist of 1-3 students; students can only participate on one team. All student names and Andrew IDs/email addresses must be included when registering. Registration must also include a (non-identifying) team name.

To register, click here.


Release the Data! Friday October 27th, 6pm

The data set and variable descriptions will be available on Friday evening but without details about the specific competition questions. Participants should try to do some exploratory data analysis prior to the competition in order to focus their efforts on Sunday.

THE DATA ARE LIVE!! Click here


Demo Session with the Data: Saturday October 28th, 5:15pm, Baker Hall A51 (Giant Eagle Auditorium)

There will be a demo session using the data sets during the Carnegie Mellon Sports Analytics Conference at 5:15pm in Baker Hall A51 (Giant Eagle Auditorium). Topics will include loading the data sets, variable discussion, different visualizations of the features, and brainstorming interesting metrics. Students in the Tartan Data Science Cup are welcome to attend this Demo Session even if not attending the entire Carnegie Mellon Sports Analytics Conference.


Competition Begins! Sunday October 29th, 9am, Baker Hall A51 (Giant Eagle Auditorium)

The research problem and competition question(s) will be released on this website at 9am. Students are welcome to work anywhere, but Giant Eagle Auditorium will be open all day as the TDSC Homebase. TDSC organizers will also be available during the day to answer questions.

Lunch will be provided for participants in the TDSC Homebase.


Time's Up! Sunday October 29th, 5pm

Reports are due at 5pm. Each team should submit a 2-3 page report describing the key results and methods used to analyze the data (made as or converted to a .pdf file).

DUE AT 6PM: up to 3 slides for a 5-minute research presentation (made as or converted to a .pdf file)

DUE AT 6PM: all (well-documented!) code used to analyze the data, obtain results, create graphics, etc (any programming language/software is acceptable)

Submission links will be open closer to the deadline.

Submission constitutes permission to post winning team entries online (under non-identifying team name).


Presentations and Final Results: Sunday October 29th, 7pm, Giant Eagle Auditorium (Baker Hall A51)

A panel of judges from Statistics & Data Science and sports analytics will review the reports from 5-7pm and then watch the slide presentations at 7pm. Students are encouraged to practice their presentations over the 5-7pm dinner break.

The top 6-8 teams will be given five minutes to present their methods and results to the judges, the other teams, and anyone else who wishes to attend. Teams can have up to three slides, but be careful: you will be cut off after exactly five minutes! Teams outside of the top 8 are still eligible to win other prizes and are encouraged to stay and watch the final presentations.

The judging criteria include:

  • Report: Does the submitted report describe the problem, methods, and results in a clear and concise manner?
  • Presentation: Did the team communicate their problem, methods, and results to an uninformed audience effectively?
  • Results: Do the team's results sufficiently and effectively answer the research problems presented?


(Typical) Prizes

1st Place Team: $500
2nd Place Team: $300
3rd Place Team: $200

Additionally, the 1st place team will receive the Tartan Data Science Cup. After each competition, the Cup is presented to the winning team, who are allowed to keep the cup and gloat for a short period of time. Members of the winning team will have their names engraved onto the Cup.

TDSC Organizer Contact Information:

Rebecca Nugent (rnugent@stat.cmu.edu), Christopher Peter Makris.