A Mixture Model Approach to Estimating the Number of True Null Hypotheses and Adaptive Control of FDR

Ajit C. Tamhane
Departments of IE/MS and Statistics, Northwestern University

(Joint work with Jiaxiao Shi)

Many methods have been proposed for estimating the number, m0 (or the proportion, π0), of true null hypotheses in order to adaptively control a type I error rate (e.g., the false discovery rate or FDR) using a multiple test procedure. Most of these methods use a formal test (see, e.g., Storey 2002) or a graphical technique (see, e.g., Spjøtvoll and Schweder 1982) to eliminate “significantly” non-null p-values. Then m0 is estimated from the remaining p-values by assuming that they follow the U[0, 1] distribution. Because these methods ignore the possibility that some “nonsignificant” p-values come from alternative hypotheses (type II errors), they tend to overestimate m0, and hence lead to more conservative control of FDR. In this paper we propose to use all p-values to estimate m0 by modelling them with a parametric mixture distribution, following up on the finding by Wu, Guan and Zhao (2006) that nonparametric approaches are too conservative. Two different mixture distribution models are considered. The normal model assumes that the test statistics from the true null hypotheses are i.i.d. N(0, 1) and those from the alternative hypotheses are i.i.d. N(δ, 1) with δ ≠ 0, with π0 and 1 − π0 as the mixing proportions. The beta model assumes that the p-values from the null hypotheses are i.i.d. U[0, 1] and those from the alternative hypotheses are i.i.d. Beta(a, b) with a < b. Three methods of estimating π0 and the associated mixture distribution parameters are developed for each model. The methods are compared via simulation with each other and with Storey’s method in terms of the bias and mean square error of the estimators of π0 and the achieved FDR. Robustness of the estimators to model violations is also studied by generating data from other models. The EM algorithm (Dempster, Laird and Rubin 1977) estimator performs best overall when the assumed model holds, but it is not very robust to substantial model violations. An example is given to illustrate the methods.
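
As a rough illustration of the normal model, the following Python sketch fits the two-component mixture π0 N(0, 1) + (1 − π0) N(δ, 1) by a textbook EM iteration. It is not the authors' implementation: the function names, starting values, stopping rule and simulation settings (m = 5000, π0 = 0.8, δ = 2.5) are assumptions chosen only for this example.

```python
import numpy as np


def norm_pdf(x):
    """Standard normal density."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)


def em_normal_mixture(z, pi0_init=0.8, delta_init=1.0, max_iter=500, tol=1e-8):
    """Textbook EM for the mixture pi0 * N(0, 1) + (1 - pi0) * N(delta, 1).

    Starting values and the stopping rule are illustrative assumptions.
    Returns the estimates (pi0_hat, delta_hat).
    """
    z = np.asarray(z, dtype=float)
    pi0, delta = pi0_init, delta_init
    for _ in range(max_iter):
        # E-step: posterior probability that each statistic is a true null.
        f0 = pi0 * norm_pdf(z)
        f1 = (1.0 - pi0) * norm_pdf(z - delta)
        w = f0 / (f0 + f1)
        # M-step: update the mixing proportion and the alternative mean.
        new_pi0 = w.mean()
        new_delta = np.sum((1.0 - w) * z) / np.sum(1.0 - w)
        converged = abs(new_pi0 - pi0) + abs(new_delta - delta) < tol
        pi0, delta = new_pi0, new_delta
        if converged:
            break
    return pi0, delta


# Simulated test statistics under the assumed normal model (hypothetical settings).
rng = np.random.default_rng(0)
m, true_pi0, true_delta = 5000, 0.8, 2.5
is_null = rng.random(m) < true_pi0
z = rng.normal(loc=np.where(is_null, 0.0, true_delta), scale=1.0)

pi0_hat, delta_hat = em_normal_mixture(z)
m0_hat = m * pi0_hat  # estimated number of true null hypotheses
print(f"pi0_hat = {pi0_hat:.3f}, delta_hat = {delta_hat:.3f}, m0_hat = {m0_hat:.0f}")
```

The resulting estimate of m0 could then be plugged into an adaptive FDR-controlling procedure. Because the sketch relies on the assumed model being correct, its estimates should not be expected to behave well when the data come from a different model.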