Department of Statistics
Dietrich College of Humanities and Social Sciences


Cross-Disciplinary Research

Our Department has a strong focus on interdisciplinary research, with deep participation in collaborations involving scientists from a range of domains. These collaborations are now primary sources of “Big Data,” and hence our work has turned towards inference problems of increasing complexity.

Cognitive Neuroscience

Cognitive neuroscience attempts to understand the great mystery of how the brain gives rise to the mind. The field is relatively young, yet it is among the fastest-growing of all intellectual disciplines, in large part due to enormous technological advances in data acquisition. Together with colleagues from the Center for the Neural Basis of Cognition (CNBC), faculty and students have developed analytical techniques for neuroimaging, including functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG), which produce high-dimensional spatial time series data with complex structure, and diffusion imaging, which produces diffusion maps that can be used to form networks representing anatomical connectivity. Another major thrust of our research concerns the firing patterns of individual neurons and groups of neurons, recorded from the brains of animals while they perform a task. One application is to brain-machine interfaces, where neural signals are used to guide a prosthetic robot arm. Faculty Contact: Valerie Ventura

Social and Behavioral Sciences

Statistical methods are now primary tools for the collection and analysis of data to inform education, policy, and the social sciences. From questionnaire development to the selection of probability samples to the design of social experiments, statisticians at Carnegie Mellon collaborate in the collection of social science data. Faculty and students regularly work with others to develop new methods for analyzing these data, and they apply up-to-date methods for drawing inferences from diverse social science data sources, ranging from large-scale sample surveys to social networks to educational experiments. For example, Department faculty participate in the Living Analytics Research Center (LARC), a partnership of Singapore Management University and CMU. LARC brings together data scientists, social and behavioral scientists, and management scientists to pursue computational social science research in new applications that will benefit consumers, businesses, and the public sector. Several Department faculty are also conducting related research on record linkage and disambiguation in conjunction with the Heinz College and the Department of Engineering and Public Policy. Although these projects focus primarily on record linkage and associated privacy problems as they relate to the United States Census (over 300 million records), the work has extensive methodological applications. Faculty Contact: Bill Eddy

Bioinformatics

Bioinformatics is the name given to statistical and computational approaches used to glean understanding from large data sets in molecular biology. Recent developments in genomic and molecular research technologies, combined with developments in information technologies, have produced a tremendous amount of excitement in the research community. Major research efforts include genome-wide association studies, the study of copy number variation, gene finding, protein structure prediction, prediction of gene expression and protein-protein interactions, and the modeling of evolution. Statistics faculty interested in computational biology often work in collaboration with the Lane Center for Computational Biology or with faculty at the University of Pittsburgh Medical Center. Faculty Contact: Kathryn Roeder

Astronomy and Cosmology

Using data from telescopes and satellites, Department faculty and students study questions about the origin, evolution, and fate of the universe. In the last decade, there has been a deluge of valuable data, and statisticians play an important role in analyzing these data. Within the Department, several faculty, post-docs, and graduate students are members of the collaboration. Other active members are drawn from other departments at Carnegie Mellon, the University of Pittsburgh, and several international institutions. The Department works closely with the McWilliams Center for Cosmology at Carnegie Mellon as well as with the Department of Physics and Astronomy at the University of Pittsburgh. Recent projects include analysis of the cosmic microwave background, estimating the dark energy equation of state, analysis of galaxy spectra, detecting galaxy clusters, identifying filaments, and estimating density functions with truncated data. A common theme in this work is the goal of detecting subtle, nonlinear signals in noisy, high-dimensional data. Our primary focus is on using state-of-the-art data and analytical methods to advance cosmology. Faculty Contact: Chad Schafer

Data Privacy and Security

Data privacy is a fundamental problem of the modern information infrastructure. Increasing volumes of personal and sensitive data are collected and archived by health networks, government agencies, search engines, social networking websites, and other organizations. The social benefits of analyzing these databases are significant. At the same time, the release of information from sensitive data repositories can be devastating to the privacy of individuals and organizations. The challenge is to discover and release analytically useful extracts of these databases without compromising the privacy of the individuals and organizations they describe. Together with colleagues and students in CyLab, the Heinz College, and the School of Computer Science, our research focuses on the trade-off between disclosure risk and utility associated with the release of statistical databases. We are also working towards understanding the practical potential of the techniques we develop by applying them to social science data sets. Faculty Contact: Jing Lei