Aleksandra Slavkovic
Penn State University

Differential privacy for statistical hypothesis testing

Data privacy is an overarching concern in modern society, as government and non-government agencies alike collect, archive, and release increasing amounts of sensitive personal data. The concept of differential privacy as a rigorous definition of privacy has emerged from the cryptographic community. However, further careful evaluation is needed before we can apply these theoretical results to privacy preservation in everyday data mining and statistical analysis. For example, funding agencies and ethics boards frequently request that a power analysis be completed before a study is conducted, or before a study's results are published. In this talk we demonstrate a simple way of integrating a differential privacy framework with classical statistical hypothesis testing in domains, such as clinical trials, where personal information is sensitive. We develop concrete methodology that researchers can use. We derive rules for sample size adjustment under which both statistical efficiency and differential privacy can be achieved for specific tests on binomial and normal random variables and on contingency tables.