Bayesian mixture models for complex
high-dimensional count data

Yuan Ji
The University of Texas M.D. Anderson Cancer Center

Abstract

Phage display is a widely used method for studying the behavior of
a very large number of peptides and proteins on the surface of a
small bacterial virus called a phage. The resulting count data from
phage display experiments usually possess high dimensionality and a
complex correlation structure. Statistical modeling of these data is
therefore challenging. The main issues involve multiple comparisons
and, more importantly, modeling the complex correlation structure in
the data, which are of major interest. We develop a class of
Bayesian mixture models for such complex high-dimensional count data
and propose a selection methodology for identifying peptides with
distinct ascending display patterns in their counts. We construct
Bayesian hierarchical priors for the parameters that are specifically
designed for this type of data. Our simulation results indicate
that the proposed mixture models and priors are well suited to
such count data. We present a detailed case study to demonstrate
the proposed methodology.