Aleksandra Slavkovic
Penn State University
Differential privacy for statistical hypothesis testing
Data privacy is an overarching concern in modern society, as government
and non-government agencies alike collect, archive, and release
increasing amounts of sensitive personal data. The concept of
differential privacy as a rigorous definition of privacy has emerged
from the cryptographic community. However, further careful evaluation is
needed before we can apply these theoretical results to privacy
preservation in everyday data mining and statistical analysis. For
example, many funding agencies and ethics boards frequently request that
a power analysis be completed before a study is conducted, or before a
study's results are published. In this talk we demonstrate a simple way
of integrating a differential privacy framework with classical
statistical hypothesis testing in domains such as clinical trials, where
personal information is sensitive. We develop concrete methodology that
researchers can use. We derive sample size adjustment rules under which
both statistical efficiency and differential privacy can be achieved for
specific tests on binomial and normal random variables and in
contingency tables.
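
To give a rough sense of what a sample size adjustment for privacy can look like, the sketch below works through one binomial case. It is an illustration under stated assumptions, not the rules derived in the talk: the count of successes is assumed to be released through the Laplace mechanism with scale 1/eps (a count has sensitivity 1), the test is a one-sided z-test of H0: p = p0 against H1: p = p1 > p0, and a normal approximation is applied to the noisy proportion. The function name required_n and all of its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def required_n(p0, p1, eps, alpha=0.05, power=0.8, n_max=10_000_000):
    """Smallest n at which a one-sided z-test of H0: p = p0 vs H1: p = p1 > p0
    reaches the requested power when the success count is released with
    Laplace(1/eps) noise. Uses a normal approximation (an assumption)."""
    if not (0 < p0 < p1 < 1):
        raise ValueError("need 0 < p0 < p1 < 1")
    if eps <= 0:
        raise ValueError("eps must be positive (use math.inf for no privacy noise)")

    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_power = NormalDist().inv_cdf(power)

    def sd(p, n):
        # Binomial sampling variance of the proportion.
        sampling_var = p * (1 - p) / n
        # Laplace noise with scale b = 1/eps has variance 2 * b**2;
        # dividing the noisy count by n scales that variance by 1 / n**2.
        noise_var = 2.0 / (n * n * eps * eps)
        return math.sqrt(sampling_var + noise_var)

    # Both standard deviations shrink as n grows, so the first n that
    # satisfies the power condition is the smallest one.
    for n in range(1, n_max + 1):
        if p1 - p0 >= z_alpha * sd(p0, n) + z_power * sd(p1, n):
            return n
    raise ValueError("no sample size up to n_max reaches the requested power")


if __name__ == "__main__":
    # eps = math.inf corresponds to the usual non-private calculation.
    for eps in (math.inf, 1.0, 0.1):
        print(eps, required_n(0.5, 0.6, eps))
```

Under these assumptions, passing eps = math.inf recovers the usual non-private calculation, and smaller values of eps (stronger privacy) push the required sample size upward.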