Statistical issues arising in the study of transcription regulation

Sunduz Keles
University of Wisconsin, Madison
Department of Statistics and Department of Biostatistics and Medical Informatics


        The availability of commercial and flexible microarray platforms enables the genome-wide study of transcription regulation at high resolution.  Among the most powerful experiments are chromatin immunoprecipitation-on-array (ChIP-chip) and Cognate Site Identification array (CSI-array) experiments, which allow full characterization of the DNA recognition profiles of proteins and small DNA molecules.  We propose conditional mixture models for the analysis of high-throughput data from ChIP-chip experiments and study the consistency of an EM-based estimation procedure for fitting these models.  The data generated by CSI-arrays resemble data from designed experiments, with the additional complication of a sequence alignment problem.  We show how this alignment problem can be bypassed with a missing-data formulation and illustrate the utility of regression trees in characterizing structural properties of the DNA recognition profiles.

Joint work with Aseem Ansari, Heejung Shim, and Christopher Warren.