Simultaneous Inference for Multiple Testing and Clustering
via Dirichlet Process Mixture Models

David Dahl
Statistics Department
Texas A&M University


We propose a Bayesian nonparametric model that exploits clustering to increase sensitivity in multiple hypothesis testing. We build on Dahl and Newton (2007), who showed that this approach is feasible: they modeled the dependence among objects through clustering and then estimated hypothesis-testing parameters averaged over clustering uncertainty. We propose several improvements. First, we separate the clustering of the regression coefficients from the accommodation of heteroscedasticity. Second, our model accommodates more general experimental designs: it permits covariates and does not require independent sampling. Third, we provide a more satisfactory treatment of nuisance parameters and hyperparameters. Finally, we do not require the designation of a reference treatment. We compare the proposed method with ANOVA and with the BEMMA method of Dahl and Newton (2007) in a simulation study.
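
As a rough illustration of what "averaged over clustering uncertainty" can mean in practice, the sketch below averages each object's parameter over a set of sampled clusterings. It is not the model described in this abstract or the BEMMA implementation: the function name `average_over_clusterings`, the input layout, and all numbers are hypothetical, and the samples are assumed to come from some posterior sampler that is not shown.

```python
from statistics import mean


def average_over_clusterings(draws):
    """Average each object's parameter over sampled clusterings.

    `draws` is a list of (labels, cluster_values) pairs, one per
    hypothetical posterior sample: labels[i] is the cluster of object i,
    and cluster_values[c] is the parameter shared by every object in
    cluster c within that sample.
    """
    n_objects = len(draws[0][0])
    return [
        mean(cluster_values[labels[i]] for labels, cluster_values in draws)
        for i in range(n_objects)
    ]


# Three made-up samples for four objects (illustrative numbers only).
draws = [
    ([0, 0, 1, 1], {0: 0.0, 1: 2.0}),
    ([0, 0, 0, 1], {0: 0.5, 1: 3.0}),
    ([0, 1, 1, 1], {0: 0.0, 1: 1.5}),
]
estimates = average_over_clusterings(draws)
```

Objects that are often grouped together borrow strength from one another, yet no single clustering has to be chosen: each estimate reflects every sampled partition in proportion to how often it appears.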

This is joint work with Marina Vannucci, Michael Newton, and Qianxing Mo.