Seminars at Rice meet *on Fridays from 2:00 pm to 3:00 pm* at Rayzor Hall (RH 305)

Exceptions are noted in the schedule below.

· *May 7th, 2pm Room: RH305*

Abstract: The presentation concerns the problem of fitting mathematical models of cell signaling pathways. Such models frequently take the form of a set of ordinary differential equations. While the model is continuous in time, the quadratic performance index involves measurements taken only at discrete time points. Adjoint sensitivity analysis is a tool for finding the gradient of a performance index with respect to the parameters of the model. A structural formulation of sensitivity methods will be presented, with special attention paid to hybrid continuous-discrete time systems. Numerical examples of fitting a mathematical model of the NF-kappaB regulatory module will be presented.
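For orientation, one common way to write such a fitting problem is sketched below. The notation is an assumption made here for illustration and is not taken from the talk: x is the state, p the parameter vector, h an output map, and y_i the measurement taken at time t_i.

```latex
\dot{x}(t) = f\bigl(x(t), p\bigr), \qquad x(0) = x_0,
\qquad
J(p) = \frac{1}{2} \sum_{i=1}^{N} \bigl\| h\bigl(x(t_i; p)\bigr) - y_i \bigr\|^2 .
```

The state evolves continuously, while J(p) depends on it only at the discrete times t_1, ..., t_N. The gradient of J with respect to p is the quantity that adjoint sensitivity analysis is used to compute.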

· *March 16th, noon till 1pm, Room: DH1064*

Abstract: Although sexual reproduction is dominant in eukaryotic organisms, many organisms of major medical or economic importance are known to reproduce mainly or strictly clonally. A better understanding of the reproductive system of such organisms might be crucial for planning successful long-term drug administration or vaccination programs. I will present both analytical and stochastic simulation results for the population genetics of clonally and partially clonally reproducing populations. High rates of clonal reproduction positively affect heterozygosity. As a consequence, nearly twice as many alleles per locus can be maintained. In contrast, genotypic diversity decreases smoothly with increasing rates of clonal reproduction. Asexual populations thus maintain higher genetic diversity at each single locus but a lower number of different genotypes. For all quantities investigated except genotypic diversity, mixed clonal/sexual reproduction is nearly indistinguishable from strict sexual reproduction as long as clonal reproduction is not strongly predominant.

· *February 27th, 2pm Room: RH305*

Abstract: Families are usually assumed to be unrelated when family data are used to locate susceptibility loci. This is problematic because polygenic diseases are by definition complex, and many factors are likely to contribute to some families but not to others. Wherever factors are present in a subset of families, the families for which the factor is not important or not present will contribute background noise that may well mask the signal from those families where it does play a role. This talk will introduce methods of estimating relatedness between families and a randomization-based method that uses population structure in gene mapping.

· *February 13th, 2pm Room: RH305*

Abstract: The stochastic two-stage carcinogenesis model has been widely used to model the mechanism of tumor development for a variety of cancers, and this approach has revealed some interesting results. In this talk, I will introduce the traditional two-stage (MVK) model and discuss the identifiability of the model using incidence data. Our research applies this model to study the impact of environmental exposure (namely cigarette smoking) and genetic susceptibility on the initiation and promotion of lung cancer. Experimental data measuring cigarette metabolism capacity and DNA repair capacity enable us to explore the role of an individual's genetic susceptibility in the development of lung cancer. Through simulation results, we will show that by incorporating smoking history and survival data we can make inferences about the influence of several risk factors and their interactions in lung carcinogenesis.

· *December 3rd, 10am Room: DH2014*

Abstract: Tiled CGH microarrays provide interesting data for studying chromosomal change. I develop an HMM-based inference method which utilizes the known BAC adjacency information and outperforms independent analysis.

· *December 3rd, 1pm Room: DH1044*

Abstract: In the simplest possible case, genetic susceptibility factors to a particular disease may be sought by looking for direct linkage between genetic markers and affected individuals. Each family is analysed independently and a dominant mode of action is assumed. The other extreme might be seen as a search that takes into account a host of other factors, including relatedness between families, interactions between multiple factors, environmental effects and alternative modes of inheritance. In practice, many studies lie closer to the former than to the latter. In this seminar I will present some ideas that have emerged from studying disease susceptibility in natural populations of animals, and how these have been used to try to move towards developing more flexible methods of mapping genes in humans.

· *October 29th, 4pm DH 1075*

Abstract: The past size of biological populations is an important question in conservation biology, epidemiology and anthropology. The recent availability of more and better genetic data provides new opportunities to investigate this question. I will present a method which infers population size history as a function of time. This method is unique in that it estimates population size history simultaneously with phylogeny and that the population size function is not parametrically constrained. The method can be applied to data from an arbitrary number of unlinked loci. The theoretical bases of this method are coalescent theory, which relates population size to genealogy, and a model of molecular evolution, which allows the inference of a genealogy from sequence data. Computationally, the method is an implementation of the Metropolis-Hastings-Green algorithm, and it uses Metropolis-coupling to escape local maxima. I will present results from simulated data as well as an analysis of a world-wide sample of human mitochondrial DNA.

· *September 24th*

Abstract: Mapping the genes for a complex disease, such as Diabetes or Rheumatoid Arthritis, involves finding multiple genetic loci that may contribute to onset of the disease. Pairwise testing of the loci leads to the problem of multiple testing. To avoid multiple tests, we can look at haplotypes, or linear sets of loci; but this results in a contingency table with sparse counts, especially when using marker loci with multiple alleles. Using case-parent triad data, we develop a hierarchical Bayesian model, using a conditional logistic regression likelihood to model the probability of disease given genotype. We extend the Bayesian model developed by Thomas, et al. [(1995) Genetic Epidemiology 12:455-466] by developing prior distributions on the allele main effects that model the genetic dependencies present in the HLA region of Chromosome 6. We also added a hierarchical level to allow locus and allele selection. Thus we cast the problem of identifying genetic loci relevant to the disease into a problem of Bayesian variable selection. We evaluate the performance of the procedure with some simulated examples.

· *September 17th*

Abstract: Evolution and coevolution of proteins is a very interesting subject. This talk will briefly outline methods for determining phylogenies of proteins, including the reciprocal BLAST approach and multiple sequence alignment with ClustalW, and apply these to several protein families. Particular attention will be given to the NFkB, IkB, and IKK families involved in the NFkB pathway, which is currently being modeled at the Statistics Department at Rice. In the course of the talk, homologues to each of the proteins in the aforementioned families will be presented, as well as several inferences regarding the evolution of the proteins in these families.

· *July 11th*

Abstract: Bacteria contain short DNA sequence motifs that are dispersed throughout the genome, known as repetitive elements. The number and location of repetitive elements are distinctive for each bacterial strain. Rep-PCR is a process where PCR amplification proceeds from primers bound to repetitive elements. Analysis of the number and length of DNA fragments amplified through Rep-PCR allows typing of bacteria at the level of the strain. I will discuss some issues involved in the statistical analysis of Rep-PCR amplicon spectra, including normalization, clustering, and visualization of high-dimensional data.

· *June 27th*

Abstract: The goal of many studies on proteins and their complexes lies in determining their function. If the typical methods, such as sequence or structure comparisons, cannot suggest a function, it may be possible to discern it by comparing functionally important local patterns. This presentation will focus on the paper "A model for statistical significance of local similarities in structure" by Stark et al., in which the authors derive a formula to calculate the statistical significance of the root-mean-square deviation commonly used to express structural similarity. This would allow true structural or functional patterns to be distinguished from noise.
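As a reminder of the quantity whose significance is being assessed, here is a minimal sketch of the root-mean-square deviation between two matched sets of atomic coordinates. It assumes the two structures have already been superimposed and that atoms are paired by index; the function name `rmsd` and the example coordinates are illustrative and do not come from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of points.

    Assumes the point sets are already superimposed and paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    # Sum of squared distances between corresponding points.
    total = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(total / len(coords_a))


# Two made-up point sets, the second shifted by one unit along z.
example = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
```

A small RMSD alone does not say whether a local match is meaningful, which is why a significance model of the kind described above is needed.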

· *May 23rd*

Abstract: We examine a mathematical model of a population of cells distributed over a linear or tubular structure. Growth of cells is regulated by a growth factor, which can diffuse over the structure. Aside from this, production of cells and of the growth factor is governed by a pair of ordinary differential equations. We find conditions under which diffusion causes destabilization of the spatially homogeneous steady state, leading to exponential growth and apparently chaotic spatial patterns, following a period of near constancy. This phenomenon may serve as a mathematical explanation of "unexpected" rapid growth and invasion of temporarily stable structures composed of cancer cells.

· *May 16th*

Abstract: The aim of this work is to show under which conditions a receptor-based model can produce and regulate patterns. Such a model is applied to pattern formation and regulation in a fresh water polyp, hydra. The model is based on the idea that both head and foot formation could be controlled by receptor-ligand binding. Positional value is determined by the density of bound receptors. The model is defined in the form of reaction-diffusion equations coupled with ordinary differential equations. The objective is to check what minimal processes are sufficient to produce patterns in the framework of a diffusion-driven (Turing-type) instability. Three-variable (describing the dynamics of ligands, free and bound receptors) and four-variable models (including also an enzyme cleaving the ligand) are analysed and compared. The minimal three-variable model takes into consideration the density of free receptors, bound receptors and ligands. In such a model, patterns can evolve only if self-enhancement of free receptors, i.e. a positive feedback loop between the production of new free receptors and their present density, is assumed. The final pattern strongly depends on initial conditions. In the four-variable model a diffusion-driven instability occurs without the assumption that free receptors stimulate their own synthesis. It is shown that a gradient in the density of bound receptors occurs if there is also a second diffusible substance, an enzyme, which degrades ligands. Numerical simulations are done to illustrate the analysis. The four-variable model is able to capture some results from cutting experiments and reflects *de novo* pattern formation from dissociated cells. The results of grafting experiments suggest that the model should involve a memory-based relation. It is shown that the model is able to capture results from experiments if the dynamics of production of ligands and enzyme are described by a system of ordinary differential equations showing hysteresis. To explain cutting experiments within our model we propose another interpretation of the positional value.

· *May 9th*

Abstract: The inference of population size from sequence data is a well-studied problem in population genetics. Most methods, however, impose parametric constraints on the population size history. Usually, the population is treated as constant, exponentially growing, or having undergone a step-wise expansion. I will present a method that efficiently presents the information available in the data while imposing no such parametric constraints. The method is similar to Bayesian inference of a nonconstant Poisson intensity. In particular, I will discuss trans-dimensional MCMC sampling.

· *April 25th*

Abstract: There are a number of settings in which the usual (deterministic) differential equation approach to chemical kinetics is not adequate. For example, in the ODE approach, one considers concentrations to vary continuously. This is particularly unrealistic when there are only a few molecules of one of the reactant species (and this is the case in many reactions that occur in a biological cell). In such cases, there are reasons to believe that stochastic models are more realistic representations of the molecular dynamics. The chemical master equation is one of the standard approaches to describing chemical kinetics via a stochastic process, and the Gillespie algorithm is the most widely used method of simulating solutions to the chemical master equation. In this talk we will briefly introduce the chemical master equation, and prove that it is equivalent to Gillespie's algorithm.
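To make the simulation side concrete, the following is a minimal sketch of Gillespie's direct method in plain Python. The function name `gillespie`, its arguments, and the single-reaction degradation example (A decaying at a made-up rate of 0.5 per molecule) are assumptions chosen for illustration, not material from the talk.

```python
import math
import random


def gillespie(x0, propensities, stoichiometry, t_end, rng):
    """Simulate one trajectory with Gillespie's direct method.

    x0            : list of initial molecule counts
    propensities  : function mapping a state to a list of reaction rates
    stoichiometry : list of state-change vectors, one per reaction
    """
    t, x = 0.0, list(x0)
    times, states = [t], [tuple(x)]
    while True:
        rates = propensities(x)
        total = sum(rates)
        if total <= 0.0:  # no reaction can fire any more
            break
        # Waiting time to the next reaction is exponential with rate `total`.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose which reaction fires, with probability rates[j] / total.
        threshold = rng.random() * total
        cumulative = 0.0
        for j, rate in enumerate(rates):
            cumulative += rate
            if threshold < cumulative:
                break
        x = [xi + d for xi, d in zip(x, stoichiometry[j])]
        times.append(t)
        states.append(tuple(x))
    return times, states


# Hypothetical example: 20 molecules of A, each degrading at rate 0.5.
rng = random.Random(0)
times, states = gillespie([20], lambda x: [0.5 * x[0]], [[-1]], 1000.0, rng)
```

Molecule counts change in integer jumps at random times, in contrast to the continuously varying concentrations of the ODE description.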

· *Canceled*

Abstract: Microarrays hold great promise as a tool for understanding disease. However, the large data sets present challenges to researchers after performing the experiments. After collecting the raw data, the goal is to convert it into meaningful biological information. GeneSifter.Net is a Web-based system to help individual researchers accomplish this task. During the talk I will review public data sources that are available and how these can be used within GSDN to understand microarray data in a biological context. Along the way I will informally describe the analyses within GSDN and challenges for future improvements.

· *April 18th*

Abstract: A mechanism connecting the local untwisting and opening of the DNA double helix is proposed. The presented thermodynamical approach is based on two models: the Peyrard-Bishop model, which describes the denaturation of DNA due to thermal fluctuations, and the model developed by the author (Phys. Rev. E 60 p.7253 (1999)) describing solitary torsional waves which propagate along the DNA molecule, forced by the advancing RNA polymerase. The torsional wave implies that the DNA untwists locally, causing a local decrease in the stacking interaction between adjacent base pairs. Molecular dynamics simulations have shown that thermal fluctuations (which are too small at physiological temperatures to denature the twisted DNA) may lead to the formation of a denaturation bubble in the untwisted region. The local DNA denaturation is needed to let one of the DNA strands serve as a template for the synthesis of mRNA.

· *March 7th*

Probabilistic Boolean networks are stochastic models for genetic regulatory networks. This talk gives an introduction to the field and some basic properties and results.
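As a small illustration of the idea, the sketch below simulates a probabilistic Boolean network in which each gene has several candidate Boolean update rules and one rule per gene is drawn at random at every synchronous step. The two-gene network, its rules, and the selection probabilities are invented for this example; the helper name `pbn_step` is likewise hypothetical.

```python
import random


def pbn_step(state, predictors, rng):
    """Advance a probabilistic Boolean network by one synchronous step.

    predictors[i] is a list of (probability, function) pairs for gene i;
    one function per gene is drawn at random and applied to the current state.
    """
    new_state = []
    for candidates in predictors:
        weights = [p for p, _ in candidates]
        functions = [f for _, f in candidates]
        chosen = rng.choices(functions, weights=weights, k=1)[0]
        new_state.append(int(chosen(state)))
    return tuple(new_state)


# Hypothetical two-gene network, invented for illustration only.
predictors = [
    [(0.7, lambda s: s[0] and s[1]), (0.3, lambda s: s[1])],  # gene 0
    [(1.0, lambda s: not s[0])],                              # gene 1
]

rng = random.Random(1)
state = (1, 0)
trajectory = [state]
for _ in range(10):
    state = pbn_step(state, predictors, rng)
    trajectory.append(state)
```

Because the rule used for gene 0 is redrawn at every step, repeated runs from the same initial state can follow different trajectories, which is what makes the network a stochastic model rather than an ordinary Boolean network.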

· *February 28th*

Ivan P. Gorlov^{1}, Olga Gorlova^{2}, Marsha L. Frazier^{2}, & Chris Amos^{2}

^{1}Department of Biochemistry, The University of Texas M.D. Anderson Cancer Center

^{2}Department of Epidemiology, University of Texas M.D. Anderson Cancer Center

We studied the association of Exonic Splicing Enhancer (ESE) motifs with missense mutations in the tumor suppressor gene TP53 using the International Agency for Research on Cancer (IARC) database. The idea behind the study was as follows: if some sequence functions as a splicing enhancer, then nucleotide substitutions in the site will disturb normal splicing, abrogate p53 function and finally lead to cancer. There should therefore be a positive association between nucleotide substitutions in the ESE and disease. We found that missense mutations in TP53 strongly co-localize with ESEs, but only a small fraction of the potential ESE motifs contributes to the association. Usually there are one or two ESE sites per exon showing significant association with missense mutations, the so-called significant ESE sites. Consensus sequences of significant ESE motifs are slightly different from known ESE motifs. Significant ESE sites that are recognized by the essential splicing factor SRp55 show lower variation in individual ESE sequences than non-significant ESE sites. These findings suggest that association with missense mutations can provide a useful tag for identification of potentially functional ESE sites.

· *February 21st*

This will be a journal club talk presenting some concepts and problems related to haplotype blocks in human DNA. The presentation will be based on the following papers:

[1] N. Patil et al., (2001), *Blocks of Limited Haplotype Diversity Revealed by High-Resolution Scanning of Human Chromosome 21*, Science, vol. 294, pp. 1719-1723.

[2] Wang et al., (2002), *Distribution of Recombination Crossovers and the Origin of Haplotype Blocks: The Interplay of Population History, Recombination and Mutation*, Am. J. Hum. Genet., vol. 71, pp. 1227-1234.

[3] K. Zhang et al., (2002), *A Dynamic Programming Algorithm for Haplotype Block Partitioning*, PNAS, vol. 99, pp. 7335-7339.

We will review possible definitions of haplotype blocks and discuss which genetic forces contributed to the formation of haplotype blocks as seen in the human genome. We will also present algorithms to define boundaries between blocks.

· *February 7th*

I will report on the paper:

D. Battogtokh, D. K. Asch, M. E. Case, J. Arnold and H.-B. Schuettler: Proc. Natl. Acad. Sci. USA, Vol. 99, Issue 26, 16904-16909, December 24, 2002

A chemical reaction network for the regulation of the quinic acid (qa) gene cluster of Neurospora crassa is proposed. An efficient Monte Carlo method for walking through the parameter space of possible chemical reaction networks is developed to identify an ensemble of deterministic kinetics models with rate constants consistent with RNA and protein profiling data. This method was successful in identifying a model ensemble fitting available RNA profiling data on the qa gene cluster.

· *January 31st*

Biochemical reactions are continually taking place in all living organisms, and most of them involve proteins called enzymes, which act as remarkably efficient catalysts. Enzymes react selectively with definite compounds called substrates.

Understanding these types of reactions is crucial in modeling and simulating the metabolic processes which take place within living cells. Although such cell pathways are usually highly complicated, there have been attempts to model interactions between proteins and genes on the basis of enzymatic reactions.

In this talk, the mechanism and the model of the basic enzymatic reaction proposed by Michaelis and Menten will be discussed, followed by its application in pathway modeling.
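For reference, the classical Michaelis-Menten scheme is E + S ⇌ ES → E + P, and under the usual quasi-steady-state assumption the reaction velocity is v = Vmax [S] / (Km + [S]). The short sketch below evaluates this rate law; the function name and the parameter values are illustrative assumptions, not values from the talk.

```python
def michaelis_menten_rate(s, vmax, km):
    """Reaction velocity v = vmax * s / (km + s) for substrate concentration s."""
    if s < 0 or km <= 0:
        raise ValueError("s must be non-negative and km must be positive")
    return vmax * s / (km + s)


# Made-up parameters: at s == km the velocity is half of vmax.
half_max = michaelis_menten_rate(s=2.0, vmax=10.0, km=2.0)
```

At low substrate concentration the rate is approximately linear in [S], and at high concentration it saturates at Vmax, which is why this form is often used as a building block for rate terms in pathway models.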

· *January 24th*