Undergraduate Research Showcase Showdown

Here we highlight and celebrate the 2019-2020 Carnegie Mellon Statistics & Data Science Undergraduate Research and Capstones! In May 2020, we had five divisions, each tackling different kinds of statistics & data science challenges. Click on each division to learn more about the program and the semester's projects.

36-315 Statistical Graphics & Visualization

For their final project, teams of students build and use interactive statistical graphics & visualizations as part of student-driven research studies. Feel free to take a look at this spring's projects below.

Spring 2020: Professor Zach Branson

Projects
Exploring Popular IMDb Movies from 2010 to 2016
Pavi Bhatter, Olivia Deng, Jessica Li, Jiwoo Yoo
Project Feedback
Uh-Oh Avocado!
Anusha Agarwal, Saniya Agarwal, Gayathri Manchella
Project Feedback
A Visual Analysis of Top Spotify Song Data
Kyra Balenzano, Evan Feder, David Yuan
Project Feedback
The NCAA: If You Want to Compete, Stay in School
Justin Britton, Jessica Chau, Megan Christy, Samuel Hawke
Project Feedback
Analyzing New York City Airbnb Data
Yuhao Chen, Tianyang Fan, Qiao Shen
Project Feedback
Analysis of the NBA RAPTOR, WAR, and PREDATOR
Brian Choi, Jonathan Fung, Sophia Lee
Project Feedback
Price and Availability of Airbnb
Grace Cui, Iris Pei, Akshara Ramakrishnan
Project Feedback
Did the Orlando Nightclub Shooting Affect the Florida State Patrol's Stopping Behavior? A Pre-Causal Analysis
Carlo Duffy, Ryan Labriola, Yedin Liu
Project Feedback
Olympic Medals
Marc Edwards, Sean Price, Caleb Yoder
Project Feedback
Hotel Booking Demand and Customer Preferences
Kailai Han, June Kim, Anna Tan, Tina Zhang
Project Feedback
Loans and Borrowers of the Lending Club
Aaron Lam, David Liu, Thomas Rhee, Abigail Stevenson
Project Feedback
An Analysis of American Higher Education Institutions
Jeremy Leung, Ashwath Vijayakumar, Raymond Yang, Andrew Ye
Project Feedback
Global Terrorism
Yurong Li, Xinyu Ma, Xinzhe Qi, Jingyan Xu
Project Feedback
Spotify JAMS
Jay Liu, Shreya Nandi, Myat Sint, Alice Yang
Project Feedback
Examining Car Accidents in the United States
Erica Oh, Avinaash Padmanabhan, Eric Tu
Project Feedback
A Close Look at Meteorites
Katelyn Bloomquist, Eunice Cheng, Jacky Liu, Stephanie You
Project Feedback
What Factors Contribute to the Daily Rate of Hotel Bookings?
Nalin Aiyar, Nila Ramaswamy, Casey Rodriguez, Tracy Wang
Project Feedback
Hate Crimes in the United States
Adam Behnke, Tobias Junker, Alana Mittleman, Sara Petrie
Project Feedback
The Avocado-pocalypse
Hannah Douglas, Sweta Kotha, Jonathan Wang, Stuart Wilkins
Project Feedback
Characteristics of Successful Video Games
Rashmi Anil, Cheyenne Ehman, Ishaan Gupta, Julia Kim
Project Feedback
Netflix Movies and TV Shows
Jae Hyung Kang, Chileshe Otieno, Joohyung Shin
Project Feedback
How External Factors Affect Park Visitation in Yellowstone National Park
Taylor Cammarata, Serena Gillian, Julia Kim, Alden Pritchard
Project Feedback
Exploring Trending US YouTube Videos
Brandon Dubner, Luke Jin, Jiyoung Kim
Project Feedback
Offense and Defense with the NBA RAPTOR
Mantek Chadha, Lu Liu, Qimeng Xiao
Project Feedback

36-290/490 Undergraduate Research

At Carnegie Mellon Statistics & Data Science, there are multiple opportunities to engage in team research projects. 36-290 Early Undergraduate Research is a fall course aimed at sophomores, who do semester-long projects in small groups, concentrating on learning the research process with an introduction to statistical machine learning methodology. 36-490 Undergraduate Research is an advanced research course offered both semesters for juniors and seniors. Groups of students collaborate with researchers and scientists in other disciplines and use advanced statistical methodology to tackle real-world challenges. Both courses heavily emphasize professional skills development, including collaboration and both written and oral communication.

Feel free to explore the projects below.


Fall 2019 36-290: Peter Freeman

Projects
So You're a Star -- But How Hot Are You?
Leon Lu, Kat Phelps, Tara Prakash, Aramya Trivedi
Poster 5-min Presentation Feedback Zoom
Predicting Galaxy Mass from Sky Location and Brightness
Canzhou Qu, Serena Wang, Peter Wu, Ginny Zhao
Poster 5-min Presentation Feedback Zoom
Classifying Kepler Objects of Interest
Andrew Furlong, James Lederman, Lajja Pancholy, Ananya Vasudev
Poster 5-min Presentation Feedback Zoom
Predicting Quasar Redshifts given Brightness
Michael Chen, Pauline Qin, May Wang
Poster 5-min Presentation Feedback Zoom
Spring 2020 36-490: Peter Freeman, Zach Branson

Projects
Text Analysis of U.S. Congressional Records (with Dani Nedal)
Adam Behnke, James Mahler, Parvathi Meyyappan, Youna Song
Poster App 5-min Presentation Feedback Zoom
How Children Inspect Visually Available Information during Language Comprehension (with Catarina Vales)
Kaili Chen, Gloria Kwakye, Raymond Yang, Qiuyi Yin
Poster 5-min Presentation Feedback Zoom
Capturing Individual Differences in Children's Knowledge Organization (with Catarina Vales)
Taeuk Kim, Jaeyeon Lee, Yizhi Zhang
Poster 5-min Presentation Feedback Zoom
Simple Neural Network Exploration of the Mechanisms of Human Speech Adaptation (with Charles Wu)
Woo June Cha, Yitian Hu, Madhuri Raman, David Xu
Poster 5-min Presentation Feedback Zoom
Investigating Speech Adaptation with Convolutional Neural Networks (with Charles Wu)
Esther Ahn, Shlok Goyal, Yuk Yeung Lam, Lesley Lyu
Poster 5-min Presentation Feedback Zoom

36-497 Corporate Capstone

At Carnegie Mellon Statistics & Data Science, students can apply to participate in our data science experiential learning program: 36-497 Corporate Capstone. In this course, we closely collaborate with both commercial and non-profit partners on real data science problems through educational project agreements. These projects can vary in scope but most commonly center on data integration, visualizations, statistical machine learning algorithms, data analysis and modeling, and proof-of-concept prototypes. Professional development skills such as collaboration and written/oral communication are heavily emphasized.

To learn more about partnering opportunities with Carnegie Mellon and Statistics & Data Science, please feel free to contact Rebecca Nugent (rnugent@stat.cmu.edu), Michael Harding (michaelharding@cmu.edu), and/or Adam Causgrove (causgrove@cmu.edu).

Feel free to explore the projects below.


Spring 2020 36-497: Rebecca Nugent, Peter Freeman

Projects
The NPD Group: Natural Language Processing for Receipt Classification
Richard Chun, Andrew Gu, Malik Khan, Taewan Kim, Eva Zhong (with Yufei Yi)
Poster 5-min Presentation Feedback Zoom
Giant Eagle: Predicting Customer Segmentation from Transactional Data
Rahul Ahuja, Hanyue Chai, Aayush Jain, Chenxiang Zhang (with Nil-Jana Akpinar)
Poster 5-min Presentation Feedback Zoom
Koppers: Zero Harm Initiative Project
Diane Hu, Richard Kang, Elaine Ouyang, Wenna Qin, Lune Zhang (with Lorenzo Tomaselli)
Poster 5-min Presentation Feedback Zoom
Chain of Demand: A Pipeline for Characterizing Fashion Items' Online Popularity
Lisa Han, Anwen Huang, Lexie Li, Joyce Moon, Dasson Tan (with Xiaoyi Yang)
Poster 5-min Presentation Feedback Zoom
Fall 2019 Partners:
  • The NPD Group
  • The Principal Financial Group
  • Giant Eagle
  • Penguin Random House
  • Pack Up + Go
  • IKOS
Previous Partners: C.H. Robinson Worldwide, Inc, Black & Veatch, The NPD Group, The Principal Financial Group, CivicScience, TruMedia, Steady (App)

36-493 Sports Analytics

Carnegie Mellon and the Department of Statistics & Data Science are actively involved in sports analytics, from cutting-edge research to conferences, student clubs, summer programs, and outreach initiatives. In 36-493: Sports Analytics, we partnered with the Carnegie Mellon Athletics Department on a set of groundbreaking projects that integrate previously unlinked, disparate data sets to build interactive applications and statistical models that coaches and staff can use to better understand and predict student-athlete performance. To learn more about our general Carnegie Mellon Sports Analytics work, please visit the CMSAC site.

Feel free to explore the projects below.

Spring 2020 36-493: Rebecca Nugent

Projects
Golf Performance (with CMU Men's and Women's Golf)
Marc Edwards, Yedin Liu, Xinzhe Qi
Poster App 5-min Presentation Feedback Zoom
Carnegie Mellon Softball Pitcher Efficiency (with CMU Softball)
Gautam Goel, Scott Steinberg
Poster App 5-min Presentation Feedback Zoom
Swimming Top Time Trajectories (with CMU Women's Swimming)
Megan Christy, Julia Miraglia, Omkar Sakhawalkar, Shwetha Venkatesh
Poster 5-min Presentation Feedback Zoom
Using Text Analysis to Evaluate Softball Run Expectancy (with CMU Women's Softball)
Sean Jin, Zachary Siegel, Anna Tan
Poster App 5-min Presentation Feedback Zoom
Basketball Logistics and Performance Indicator Analysis (with CMU Men's Basketball)
Violet Dong, Ryan Mahtab, Shurui Zeng
Poster 5-min Presentation Feedback Zoom

Senior Honors Thesis and Independent Study

Qualified Statistics & Data Science seniors can apply for the Dietrich College Senior Honors Thesis Program; these year-long projects are supervised by a faculty member and often involve methodological development in a real-world application context.

Independent Studies can happen at any level but are most common for juniors and seniors. They can last one or more semesters and typically involve exploring a research topic through advanced statistical modeling and data analysis. Students find a project through conversations with faculty, who often have expertise in the area of interest.

Feel free to explore the projects below.

2019-2020:

Senior Honors Theses
Does Redevelopment Affect Crime Rates? A Case Study on St. Louis
Mary Bollinger (with Peter Freeman)
Slides 20-min Presentation Feedback Zoom
Augmenting Tennis Point Stochastic Modeling Utilizing Spatiotemporal Shot Data
Corey Emery (with Peter Freeman and the USTA)
Slides 15-min Presentation Feedback Zoom
Data Analysis for Human Incarceration
Ben Klingensmith (with Robin Mejia and Jay Aronson)
Poster 15-min Presentation Feedback Zoom
Identifying Subpopulations of Neurons in Six Visual Areas in the Mouse
Hannah Douglas (with Rob Kass)
Spring 2020:

Projects
Extraction & Classification of Stellar Spectra from CTIO Plates
Lajja Pancholy (with Peter Freeman)
Poster 5-min Presentation Feedback Zoom
A Data Driven Approach to Finding an Edge in the NBA Betting Markets
Reed Peterson (with John Lehoczky)
Poster 5-min Presentation Feedback Zoom