Data privacy is a fundamental problem of the modern information infrastructure. Increasing volumes of personal and sensitive data are collected and archived by health networks, government agencies, search engines, social networking websites, and other organizations. The social benefits of analyzing these databases are significant. At the same time, releasing information from sensitive data repositories can be devastating to the privacy of individuals and organizations. The challenge is to discover and release analytically useful extracts of these databases without compromising the privacy of the individuals and organizations they describe. Together with colleagues and students in CyLab, the Heinz College, and the School of Computer Science, we study the trade-off between disclosure risk and utility associated with the release of statistical databases. We are also working to understand the practical potential of the resulting techniques by applying them to social science data sets.

