The goals of our education component are to increase the number of students who are well-versed in the statistical issues facing the U.S. Census and able to develop and implement appropriate analyses, to support research projects during both the summer and the full academic year, and to incorporate our research and related statistical problems into both undergraduate and graduate classes.
Our node uses a vertical integration structure in all educational projects. Undergraduates, graduate students, junior faculty, and senior faculty all work together in regular meetings. Projects are presented with slides and posters. All students are expected and trained to describe and defend their statistical analyses to an interdisciplinary audience.
Use the sidebar menus to view recent work, divided into undergraduate and master's-level research projects, classes, and classroom modules/implementations. The classes and classroom modules are presented so that their structure can be duplicated or implemented elsewhere.
Summer 2015
Robin Dunn*, Department of Mathematics, Kenyon College Supervisor: Eddy
Kelli Marquardt, Department of Mathematics, University of Dayton Supervisor: Eddy
Brandon Sherman*, Department of Statistics, University of Pittsburgh Supervisor: Eddy
Summer 2014
Kevin Eng, Department of Statistics Supervisor: Eddy
Nick Ettlinger, Department of Statistics Supervisor: Eddy
Summer 2013
Kairavi Chahal, Department of Statistics Supervisor: Steorts
Nicole Crimi, Department of Statistics Supervisor: Eddy
Emily Furnish*, Science and Humanities Scholars (International Relations & Politics) Supervisor: Fienberg, Steorts
Dahiana Jimenez, Department of Statistics Supervisor: Steorts
Summer 2012
Brittany Buggs*, Department of Statistics, Heinz College Public Policy Supervisor: Eddy
Nicole Crimi, Department of Statistics Top Coding in the US Census and American Community Survey[.pdf] Supervisor: Eddy
Sepideh Mosaferi*, Dietrich Humanities & Social Sciences Supervisor: Fienberg
Jaime Trujillo, Department of Statistics Errors in Census and American Community Survey Data File[.pdf] Supervisor: Eddy
George Volichenko*, Department of Statistics Capture-Recapture Interactive Simulation Package[.pdf] Supervisor: Eddy
Abbas Zaidi*, Department of Statistics Supervisor: Eddy
Olga Zubashko, Department of Statistics Supervisor: Eddy
* denotes a student who continued to graduate or professional school
These research projects are typically one or two semesters long. This list does not include Senior Honors Thesis work (see menu at left).
AY 2015-2016
Ernest Chiew*, Science and Humanities Scholar, Departments of Physics, Statistics Supervisors: Eddy, Ventura
AY 2014-2015
Nick Ettlinger, Department of Statistics Supervisor: Eddy
AY 2013-2014
Nick Ettlinger, Department of Statistics Supervisor: Eddy
Shannon Gallagher*, Department of Mathematical Sciences Supervisors: Fienberg, Eddy
AY 2012-2013
Nicole Crimi, Department of Statistics Top Coding in the US Census and American Community Survey[.pdf] Supervisor: Eddy
Michael Pane*, Department of Statistics Struggles in Small Area Estimation: Benchmarking and Weighting Supervisor: Steorts
Olga Zubashko, Department of Statistics Supervisor: Eddy
* denotes an undergrad who continued to graduate school
Dietrich College of Humanities & Social Science Honors Thesis Program
Academically qualified students are invited by the Dietrich College Dean's Office to participate in the Honors Thesis Program. Students then align their research interests with a department and supervisor. Although a large group of students may be invited on the basis of their GPA, in practice the Statistics Department chooses only 3-5 top students a year. Below are students whose theses are grounded in the research problems of this node.
This course is an introduction to "data matching", the field of improving data quality through record linkage techniques including merging lists, identifying duplicates, and unique entity resolution. We discuss working with text and its similarity metrics, imputation models for missing data, statistical models for predicting matches, and real-life case studies of record linkage applications. The course is designed for a mini (about 7 weeks) for advanced undergraduates and master's level students.
Instructors: Steve Fienberg, Rebecca Nugent, Department of Statistics (both Node members)
TA: Sam Ventura, Department of Statistics (Node member)
13 students
Textbooks: Data Quality and Record Linkage Techniques by Herzog, Scheuren, Winkler (Springer, 2007); Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection by Peter Christen (Springer, 2012)
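To illustrate the kind of text similarity comparison the course description mentions, here is a minimal sketch of flagging candidate duplicates in a list of names. It is not taken from the course materials: the function names, the 0.85 threshold, and the sample records are all assumptions made for illustration, and the similarity score comes from Python's standard-library `difflib.SequenceMatcher` rather than from any specific metric covered in the course or its textbooks.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] after light normalization.

    Normalization here is only lowercasing and collapsing whitespace;
    real record linkage pipelines usually do much more.
    """
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def candidate_duplicates(records, threshold=0.85):
    """Return (i, j, score) for every pair scoring at or above threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((i, j, round(score, 3)))
    return pairs


# Hypothetical records, invented for illustration only.
records = ["Jon Smith", "John Smith", "Maria Garcia", "john  smith"]
matches = candidate_duplicates(records)
for i, j, score in matches:
    print(f"{records[i]!r} ~ {records[j]!r}: {score}")
```

Comparing every pair scales quadratically with the number of records, so this all-pairs loop is only practical for small lists. A fixed threshold is also a simplification of the statistical models for predicting matches that the course discusses.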
This course is an introduction to privacy and confidentiality and is designed as a short course for professional master's students. The course was given at the University of Paris Dauphine. All materials are authored by Rebecca Steorts.
Statistical Graphics & Visualization is a 300-level elective class for sophomores, juniors, and seniors in Statistics, Information Systems, Computer Science, Math, and other majors interested in visualizing data. This course also satisfies the Creating General Education requirement for Dietrich College and is highly oversubscribed with a long waitlist every year.
Students learn how to extract, summarize, and visualize structure and features for low- and high-dimensional data. The class is taught in R, but other languages could be substituted. For the final project, teams of students develop hypotheses and create statistical graphics to look for supporting or contradictory evidence. Projects are presented in poster or website form.
Spring 2017
Instructor: Sam Ventura, Department of Statistics (Node member)
TAs: none were node members this year
99 students
This version included modules on working with geo-spatial and census data.
Fall 2016
Instructor: Sam Ventura, Department of Statistics (Node member)
TAs: included Kayla Frisoli (Node member), Department of Statistics
65 students
This version included modules on working with geo-spatial and census data.
Spring 2016
Instructor: Sam Ventura, Department of Statistics (Node member)
TAs: included Kayla Frisoli (Node member), Department of Statistics
106 students
This version included modules on working with geo-spatial and census data.
Spring 2013
Instructor: Andrew Thomas, Department of Statistics
TAs: Sam Ventura and Mikhail Popov, Department of Statistics (both Node members)
55 students
Project Objective
Visualize the distributions of population characteristics in the metropolitan area(s) of interest using data sets from the US Census.
Project Instructions/Handout: pdf
(Presented at NISS, 2013)
Summary of Module: poster
Working with Shiny and R: poster
Shiny applications are hosted on the Department's server using RStudio's Shiny Server software.
Undergraduate Research
Undergraduate Research is a small invitation-only statistical research class for top juniors and seniors in statistics, machine learning, mathematics, computer science, etc. Interested students apply in the fall; between 12 and 15 students are selected. Small groups of three or four students are paired with a faculty client, often in another discipline, whose project requires statistical expertise and methodology. Students work all semester on this project with feedback and guidance from both the course professor and the faculty client. The goal is to expose students to the research process. Students present their work in both report and poster format, with the latter being at CMU's Meeting of the Minds Undergraduate Research Symposium. Below are projects aligned with the research of this node.
Spring 2014
Instructor: Cosma Shalizi, Department of Statistics
TA: Beau Dabbs, Department of Statistics
Deduplication of Civil War Killings in El Salvador Daniel In, Karn Mishra, Joseph Pane
Faculty client: Steorts
Sampling, Survey, and Society
Sampling, Survey, and Society is an elective for Statistics and Economics majors and other students in the Dietrich College of Humanities and Social Sciences, drawing students with a very wide variety of backgrounds. This course revolves around the role of sampling and sample surveys in the context of U.S. society and its institutions. We examine the evolution of survey taking in the United States in the context of its economic, social, and political uses, eventually leading to discussions about the accuracy and relevance of survey responses, especially in light of various kinds of sampling errors. Students are required to design, implement, and analyze a survey sample and then write a scientific report summarizing their survey and its results.
Spring 2017
Instructor: Jared Murray (Node member), Department of Statistics
TAs: included Nicolas Kim (Node member), Department of Statistics
71 students
Spring 2016
Instructor: Jared Murray (Node member), Department of Statistics
TAs: included Nicolas Kim (Node member), Department of Statistics
61 students
Spring 2015
Instructor: Jared Murray (Node member), Department of Statistics
TAs: Maria Cuellar, Nicolas Kim (both Node members), Department of Statistics
44 students