Rebecca W Doerge Mellon College of Science, Biology, Statistics and Data Science, Carnegie Mellon University
Talk Title: Wanted: Large Complex 'Omic Data Seeking Quantitative Reasoning and Analysis
This is an exciting and influential time for the field of Statistics in science. Technological
advances in genetic, genomic, and the other 'omic sciences are providing large amounts of complex data
that are presenting a number of challenges for the biological community. Many of these challenges are
deeply rooted in statistical issues that involve relatively small sample sizes with a large number of
parameters (e.g., single cells, genes, exons, base pairs). Although there are many different computational
tools for processing these data, there are still issues of data bias. Furthermore, there are a limited number
of appropriate statistical methods, and even fewer that acknowledge the unique nature of these data (i.e.,
high-dimensional discrete counts). Following a discussion about data and its structure, experiments that
employ next-generation sequencing technologies are considered with focus on dependence of high-
dimensional data in both bulk and single cell applications. This talk will be accessible to a broad
scientific audience; an in-depth understanding of statistics, biology and/or computing is not required.
Bio:
Rebecca Doerge is the Glen de Vries Dean of the Mellon College of Science at Carnegie Mellon University.
Prior to joining both the Department of Statistics and the Department of Biology at Carnegie Mellon
University she was the Trent and Judith Anderson Distinguished Professor of Statistics at Purdue University.
Dean Doerge joined Purdue University in 1995 and held a joint appointment between the Colleges of
Agriculture (Department of Agronomy) and Science (Department of Statistics) until her departure from
Purdue University. Professor Doerge's research program is focused on Statistical Bioinformatics, a
component of bioinformatics that brings together many scientific disciplines into one arena to ask, answer,
and disseminate biologically interesting information in the quest to understand the ultimate function of DNA
and epigenomic associations. Rebecca is the recipient of the Teaching for Tomorrow Award, Purdue University,
1996; University Scholar Award, Purdue University, 2001-06; and the Provost's Award for Outstanding
Graduate Faculty Mentor, Purdue University, 2010. She is an elected Fellow of the American Statistical
Association (2007), an elected Fellow of the American Association for the Advancement of Science (2007),
and a Fellow of the Committee on Institutional Cooperation (CIC; 2009). She is the Chair-Elect of the AAAS
Section U. Dean Doerge has published over 130 scientific articles, published two books, and graduated 25
PhD students.
Rebecca was born and raised in upstate New York. As a first generation student, she studied theoretical
Mathematics at the University of Utah; it was there that she gained interest and experience in both
computing and Human Genetics. Rebecca obtained her PhD in Statistics from North Carolina State University
under the direction of Bruce Weir, and was a postdoctoral fellow with Gary Churchill, Biometry and Plant
Breeding at Cornell University.
Dean Doerge is a past member of the Board of Trustees for both the National Institute of Statistical Sciences
and the Mathematical Biosciences Institute. She is a member of the Engineering External Review Committee
at Lawrence Livermore National Laboratory, and a member of the Global Open-Source Breeding Informatics
Initiative (GOBII) Advisory Board.
Anita Woolley Tepper School of Business, Carnegie Mellon University
Talk Title: Collective Intelligence and Team Gender Composition
Recent research demonstrates that teams exhibit a characteristic level of collective intelligence (CI) which reflects their ability to work together to solve a range of different problems. There is also a growing body of evidence that group composition is an important foundation for CI, and that under many conditions having more women in a group leads to higher CI. In this study, we are looking at the conditions under which having more men vs more women on a team enhances collective intelligence. In doing so we also vary group structure, another important input to CI, specifically by varying whether or not the team has a hierarchy and whether the hierarchy is stable. What we find is that group structure alters the communication patterns in teams; lack of stable hierarchy leads to more competitive communication patterns and higher CI when a team has more men, whereas stable hierarchy leads to cooperative communication patterns and higher CI when a team has more women. Our research has important implications for thinking about how organizational environments may be systematically more hospitable or hostile to men or women as a result of the structure and communication environment, and changes that may enhance collective intelligence.
Bio:
Anita Williams Woolley is an Associate Professor of Organizational Behavior and Theory at Carnegie Mellon University's Tepper School of Business. She has a PhD in Organizational Behavior from Harvard University, where she also earned Bachelor’s and Master’s degrees. At the Tepper School of Business, she teaches MBA and executive education courses on managing people and teams in organizations.
Prof. Woolley’s research includes seminal work on team collective intelligence, which was first published in Science in 2010 and has been featured in over 1000 publications and media outlets since, including Forbes Magazine, the New York Times, and multiple appearances on NPR. She was named one of the 30 most influential industrial/organizational psychologists alive in 2014 by Human Resources MBA Magazine. Professor Woolley’s research has been published in Science, Academy of Management Review, Organization Science, Organizational Behavior and Human Decision Processes, Journal of Organizational Behavior, and Small Group Research, among others. Her research has been funded by grants from the National Science Foundation, the U.S. Army Research Institute, DARPA, as well as private corporations. She has won awards for her research and her teaching.
Professor Woolley is a Senior Editor at Organization Science and on the editorial boards for Academy of Management Discoveries, Human Computation, and Small Group Research, and is a member of the Academy of Management, the Interdisciplinary Network for Group Research, and the Association for Psychological Science.
Suyin Wang Principal Financial Group
Talk Title: The Spectrum of Data Science in Asset Management
Today’s financial institutions have been compelled to deploy data-driven capabilities to increase growth and profitability, to lower cost and improve efficiencies, to drive digital transformation, and to support risk and regulatory priorities. Given the characteristics of the asset management industry, I will share the perspective of understanding data science as a system, and the spectrum of data science from fundamental information to cognitive analysis, with the lessons we have learned through work at each stage.
Bio:
Suyin Wang is a Data and Operations Research Scientist at Principal Financial Group. She is currently managing the development of a quantitative strategy using machine learning for equity investment with $2.3 Billion AUM. She also initiated projects on generating economic insights from Principal’s retirement and insurance data, portfolio optimization of Principal International and research on asset management of the company’s general account. Prior to joining Principal, she was a PhD student at Carnegie Mellon University majoring in Business Technologies. Her research interest was in financial disclosure behaviors of companies using natural language processing and machine learning. She received an M.S. in Business Technologies from Carnegie Mellon University in 2018, an M.A. in Economics from Columbia University in 2016, and an M.E. in Financial Information Service from Peking University in 2015.
Seema S Lakdawala Microbiology and Molecular Genetics, University of Pittsburgh School of Medicine
Talk Title: The Puzzle of Packaging: Predicting influenza virus genomic assembly order from evolutionary history
Influenza virus epidemics cause hundreds of thousands of hospitalizations in the United States every year.
Influenza A viruses contain eight genomic viral RNA segments. Reassortment, or genetic exchange of
viral RNA segments between two distinct influenza viruses within a coinfected cell, can produce novel
pandemic strains. Prediction of emerging pandemic threats relies upon predicting reassortment events.
Reassortment is likely constrained by viral RNA interactions that facilitate genomic packaging. We are
using available sequencing data to extrapolate the relationship between all eight viral RNA segments to
provide a ‘set of rules’ that can be used to predict reassortment potential between influenza virus strains.
Bio:
Dr. Lakdawala trained as a molecular virologist at the Salk Institute in San Diego, CA on
viral ubiquitin ligases. In 2009, she began a post-doctoral fellowship with Dr. Subbarao at
the NIH to study influenza virus assembly dynamics in live cells using light sheet
microscopy and airborne transmission. Seema started an independent laboratory at the
University of Pittsburgh School of Medicine in 2015 studying influenza virus assembly and
pathogenesis. The Lakdawala Lab has published multiple papers on influenza assembly and
recently described a new architecture of the viral genome. Their research has been
featured on WESA, Gizmodo, and This Week in Virology. In addition, Dr. Lakdawala has co-
authored an article on non-pharmaceutical strategies to limit influenza virus transmission
that was published in the Washington Post. The Lakdawala lab is currently funded by the NIH
(R01, CEIRS), American Lung Association, and Charles E. Kaufman Foundation.
Rhiannon Weaver Ads Integrity, Google
Talk Title: Data Science Roles at Google
"Data Science" has come to be an umbrella term encompassing the aspects of statistics, machine learning, and systems engineering that are all necessary for the integration of data analysis with production engineering and business decision-making in large scale online services. At Google, we have many roles that espouse different flavors of data science. Whether they are classified officially as "Data Scientist", "Machine Learning Researcher," "Business Analyst" or "Software Engineer," they can't solve problems in isolation from one another. In this talk I will give you my take on the broad aspects that these roles focus on within Google, with particular attention paid to the branch of Data Science that emerged from Quantitative Analysis and Statistics, and some examples of how it comes into play in the "nuts and bolts" day to day systems that Google relies on for its global platforms.
Bio:
Rhiannon joined Google in October 2016 and is a Data Scientist with the Ads Integrity team at Google Pittsburgh. Her current focus is on devising methods to improve the statistical bias and uncertainty in metrics and ground truth data used to evaluate the machine learning models that implement Google Ads policies at scale. Prior to joining Google, Rhiannon worked as a statistician and network security analyst for 9 years at CMU's CERT division of the Software Engineering Institute, and briefly at a robotics company where she primarily extracted metrics from point cloud data to use in predictive models for fluid slosh in tanker trailers.
Rhiannon has a BS in mathematics (honors) and a BS in computer science from Penn State University, and an MS and PhD in statistics from Carnegie Mellon University.
Aarti Singh Machine Learning Department, Carnegie Mellon University
Talk Title: Towards an era of intelligent interactive algorithms
Classical machine learning algorithms focus on the setting where the algorithm has access to a fixed dataset obtained prior to any analysis. In most applications, however, we have control over the data collection process such as which image labels to obtain, which drug-gene interactions to record, which network routes to probe, which movies to rate, etc. Furthermore, most applications face budget limitations on the amount and type of labels, data or features that can be collected. Decisions about which data to collect are typically taken by humans in an ad-hoc manner. Thus, there is a need to develop intelligent algorithms that can make principled and automated decisions to interact with the data generating mechanism and collect data that is most relevant for the learning task. In this talk, I will present an overview of challenges and progress in designing such interactive data analysis algorithms.
Bio:
Aarti Singh received her Ph.D. degree in Electrical Engineering from the University of Wisconsin-Madison in 2008 and was a Postdoctoral Research Associate at the Program in Applied and Computational Mathematics at Princeton University from 2008 to 2009, before joining the Machine Learning Department in the School of Computer Science at Carnegie Mellon in 2009, where she is currently an Associate Professor. Her work is recognized by an NSF CAREER Award, a United States Air Force Young Investigator Award, the A. Nico Habermann Junior Faculty Chair Award, the Harold A. Peterson Best Dissertation Award, and multiple best student paper awards. She is a member of the National Academy of Sciences (NAS) committee on Applied and Theoretical Statistics, an Associate Editor of the IEEE Transactions on Information Theory and IEEE Transactions on Signal and Information Processing over Networks, and the Program Chair for the International Conference on Machine Learning (ICML) 2020 as well as the Artificial Intelligence and Statistics (AISTATS) 2017 conference.
Katharina Best Engineering and Applied Sciences Department, RAND
Talk Title: Doing Policy Research with Data Science
RAND’s mission is to improve policy and decisionmaking through research and analysis. Increasingly, “research and analysis” includes a component of data analysis, data analytics, and data science. This talk provides a brief introduction to recent RAND projects in the areas of national and homeland security, education, and health that applied data-driven techniques to help inform policy decision-making. Data-driven analysis, combined with other quantitative and qualitative methods, has changed how policymakers make regulations, monitor programs, and spend money. We also discuss some current challenges related to the application of data science techniques to policy questions.
Bio:
Katharina Best is an Operations Researcher and Associate Research Department Director of the Engineering and Applied Sciences Department at the RAND Corporation. Her research interests focus on applications of operations research and financial engineering methods in the areas of national security, strategic planning, risk management, and education. Best's national security work covers military force development and readiness, force planning and resource allocation, and military manpower and workforce development issues. She is also interested in higher education finance, financial decision-making, and education technology, data management, and privacy. Prior to coming to RAND, Best studied the dynamics of the college education market in the United States as well as the effect of loans and credit on decisions made by students, parents, lenders, and institutions of higher education. She worked as a corporate risk consultant at Oliver Wyman Financial Services. Best received her Ph.D. and M.S. in industrial engineering from the University of Michigan, and her B.S. in systems engineering from the University of Virginia.
Molly Steenson College of Fine Arts, School of Design, Carnegie Mellon University
Talk Title: What Are We Really Talking About When We're Talking About Ethics?
Since 2018, the word "ethics" has seen an explosion, especially where AI and computational technologies are concerned. But why now? And is it ethics that we're looking at, or is it something different? This talk explores the moment we're in, and takes a look at the role of design in making better, more responsible computational technologies.
Curren Katz Director of Data Science R&D, Enterprise Analytics, Highmark Health
Talk Title: Artificial Intelligence in healthcare from an IDFS perspective
Artificial intelligence presents unique opportunities in healthcare. At Highmark Health, we are developing ways to use the diverse data available to an integrated delivery and financing system (IDFS) to improve the lives of the people we serve. This talk provides an introduction to AI research at Highmark Health and building a bridge between academic research and the healthcare industry.
Bio:
Curren Katz is the Director of Data Science R&D at Highmark Health. She received her PhD in Cognitive Neuroscience at Humboldt-Universität of Berlin, where her research focused on using fMRI to measure task-dependent cerebellar-cerebral connectivity changes and parietal activity patterns underlying numerical information processing. Her research included machine learning and predictive models using diverse data sources. She received her master’s degree from Harvard University in Mind, Brain and Behavior, and her undergraduate degree from NYU. She was a researcher at Mount Sinai Beth Israel Medical Center in New York and a postdoctoral researcher at the Center for the Neural Basis of Cognition at the University of Pittsburgh. At Highmark, she currently leads a diverse group of Research Data Scientists focused on using machine learning and AI to change the delivery and financing of healthcare. As a research-focused team, they publish findings in peer-reviewed journals and collaborate with academic and industry researchers. Prior to starting the Data Science R&D group, she was a Data Scientist at Highmark Blue Cross Blue Shield and Senior Data Scientist at Highmark Health Enterprise Analytics.