Fall 2001 Mathematical Biology Seminar

Seminars at Rice meet from 2pm to 3pm in Duncan Hall 1075.
Exceptions are noted in the schedule below.

Parking at Rice is available in visitor spots in Lot C (Abercrombie lot), Entrance 16 from Rice Blvd.

7 September
J. Polanska, P. Jarosz-Chobot, and A. Polanski,
Evaluation of Familial Risk Factors in Type 1 Diabetes

ABSTRACT: We analyse mathematical models for evaluating risk factors related to Type 1 Diabetes and discuss the hypotheses needed to apply the different approaches. We prove that the most general model leads to the problem of moments. The method is being applied to the analysis of risk related to mother's age, using data from epidemiological research conducted from 1989 to 1996 in the Upper Silesia region, Poland.

5 October
J. Siefert
Horizontal Gene Transfer

ABSTRACT: Unlike eukaryotes, which evolve principally through the modification of existing genetic information, bacteria have obtained a significant proportion of their genetic diversity through the acquisition of sequences from distantly related organisms. Horizontal gene transfer produces extremely dynamic genomes in which substantial amounts of DNA are introduced into and deleted from the chromosome. These lateral transfers have effectively changed the ecological and pathogenic character of bacterial species.

PAPER TO BE USED: Ochman H, Lawrence JG, Groisman EA. (2000): Lateral gene transfer and the nature of bacterial innovation. Nature 405:299-304

Presentation slides (.ppt format)

12 October
A. Renwick
Coalescence with Recombination

ABSTRACT: The coalescent process is a tool for modeling genetic diversity in a population. I investigate the effect of relaxing some common simplifying assumptions by Monte Carlo simulation. Specifically, I will discuss the fact that allowing recombination among chromosomes will, on average, increase the number of haplotypes observed while leaving the number of segregating sites unchanged. I follow the approach of Wiuf and Hein (1999) in the simulation.

PAPERS TO BE USED: The following publications are attached as background material.
Carsten Wiuf and Jotun Hein (1999): The Ancestry of a Sample of Sequences Subject to Recombination. Genetics 151: 1217-1228.
Carsten Wiuf and Jotun Hein (1999): Recombination as a Point Process along Sequences. Theoretical Population Biology 55: 248-259.
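
As background for the talk's claim about segregating sites, here is a minimal sketch of the baseline it is compared against: the standard coalescent without recombination, under the infinite-sites mutation model. It is not the Wiuf and Hein algorithm and not the speaker's simulation. The function names, the time scaling (coalescence rate k(k-1)/2 while k lineages remain) and the mutation rate theta/2 per lineage are assumptions chosen for illustration.

```python
import random

def poisson(rng, mean):
    """Draw a Poisson variate by counting unit-rate arrivals before time `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def segregating_sites(n, theta, rng):
    """Simulate the number of segregating sites in a sample of n sequences.

    While k lineages remain, the waiting time to the next coalescence is
    exponential with rate k(k-1)/2, and the tree grows by k times that time.
    Mutations fall on the tree as a Poisson process with rate theta/2.
    """
    total_length = 0.0
    for k in range(n, 1, -1):
        total_length += k * rng.expovariate(k * (k - 1) / 2)
    return poisson(rng, theta / 2 * total_length)

def expected_segregating_sites(n, theta):
    """Theoretical mean: theta * (1 + 1/2 + ... + 1/(n-1))."""
    return theta * sum(1.0 / i for i in range(1, n))

rng = random.Random(1)
samples = [segregating_sites(10, 5.0, rng) for _ in range(2000)]
mean_sites = sum(samples) / len(samples)
```

Comparing `mean_sites` with `expected_segregating_sites(10, 5.0)` gives the no-recombination reference value. A simulation with recombination would be checked against the same mean.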

19 October
Ming Ying Leung
Poisson Approximations for Palindrome Distributions in a Random DNA Sequence

ABSTRACT: The association between palindrome clusters and replication origins in herpesvirus genomes has been reported in several studies. In pursuit of a reliable statistical criterion to pick out nonrandom clusters of palindromes from DNA sequences, Poisson type approximations have played an important role. In this talk the basic ideas of the Chen-Stein technique of Poisson approximation will be presented. To illustrate the technique, I shall prove that in a long nucleotide sequence generated as i.i.d. random variables, the number of palindromes above a given length has an approximate Poisson distribution.

REFERENCES:

Barbour, A.D., Holst, L., and Janson, S. (1992): Poisson Approximation, Chapter 1. Oxford University Press.
Leung, M.Y., Schachtel, G.A., and Yu, H.S. (1994): Scan statistics and DNA sequence analysis: the search for an origin of replication in a virus, Nonlinear World, 1, 445-471.
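
The counting problem in the abstract can be explored numerically. The sketch below is an illustration under stated assumptions, not the Chen-Stein argument and not the speaker's code. It assumes the four bases are equally likely, that a palindrome is a stretch equal to its own reverse complement, and that palindromes are counted by their center. The function names are hypothetical. Under these assumptions each of the half_len base pairs around a center matches with probability 1/4, so the expected count is (n - 2*half_len + 1) * (1/4)^half_len.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def count_palindromes(seq, half_len):
    """Count centers whose flanking half_len bases form a reverse-complement palindrome."""
    count = 0
    for center in range(half_len, len(seq) - half_len + 1):
        # Compare the k-th base left of the center with the complement
        # of the k-th base right of the center.
        if all(seq[center - 1 - k] == COMPLEMENT[seq[center + k]]
               for k in range(half_len)):
            count += 1
    return count

def expected_count(n, half_len):
    """Mean number of such centers in an i.i.d. uniform sequence of length n."""
    return (n - 2 * half_len + 1) * 0.25 ** half_len

def simulate(n, half_len, trials, seed=0):
    """Palindrome counts in `trials` independent random sequences of length n."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        seq = "".join(rng.choice("ACGT") for _ in range(n))
        counts.append(count_palindromes(seq, half_len))
    return counts

counts = simulate(2000, 4, 200)
sample_mean = sum(counts) / len(counts)
sample_var = sum((c - sample_mean) ** 2 for c in counts) / (len(counts) - 1)
```

For a Poisson distribution the mean and variance coincide, so comparing `sample_mean`, `sample_var` and `expected_count(2000, 4)` gives a rough check of the approximation.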

24 October
K. Fujarewicz
Selection and Classification of Microarray Gene Expression Data Using Support Vector Machines.

ABSTRACT: Microarrays are a new technique for measuring gene expression that has attracted a great deal of research interest in recent years. It has been suggested that gene expression data from microarrays (biochips) can be used in many biomedical areas, for example in cancer classification. Although several classification methods, both new and existing, have been tested, selecting a proper (optimal) set of genes whose expression is used for classification is still an open problem. In this presentation, a new heuristic method of choosing a suboptimal set of genes using support vector machines (SVMs) will be formulated. The resulting set of genes optimizes the leave-one-out cross-validation error. The method is tested on microarray gene expression data from samples of two cancer types: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). The results show that classification with the selected set of genes is much better than with sets obtained using other feature selection methods.

REFERENCES:

Cristianini N., Shawe-Taylor J. (2000): An introduction to support vector machines and other kernel-based learning methods, Cambridge Univ. Press.

Golub T. R. et al. (1999): Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science, Vol. 286, pp. 531–537.

Fujarewicz K., Rzeszowska-Wolny J.: Neural network approach to cancer classification based on gene expression levels, Proc. IASTED Int. Conf. Modelling Identification and Control, pp. 564–568, Innsbruck, 2001.

Presentation slides (MSWord format)

2 November
H. Spratt
A Comparison of Three Methods Used To Determine the Functionally Important Residues of a Protein Sequence

ABSTRACT: I will discuss three methods used to determine which sites along a protein sequence are functionally important. These methods are the evolutionary trace, the method of maximum likelihood trees, and the method of hidden Markov models using maximum likelihood trees. The evolutionary trace procedure determines which sites are important based on a protein evolutionary tree and the conservation of residues at various branches of the tree. The maximum likelihood tree method uses a different procedure to draw the evolutionary trees and is able to distinguish different categories of protein evolution. The hidden Markov method is a further adaptation of the maximum likelihood method. It adds a hidden Markov chain to the process to determine which sites evolve at which rate (both of which are unknown to the observer). Results of the three analyses on several different proteins will be presented, as well as an analysis using simulated protein sequences.

9 November
G.E. Fox
Comparison of Cyanobacterial Genomes

ABSTRACT: The genome of the cyanobacterium Synechocystis PCC 6803 has been fully sequenced and annotated in detail. Complete genome sequences are also available for five additional cyanobacteria. We have undertaken a comparison of these six genomes, and the initial results of this comparison will be described and discussed. A core set of genes has been found that defines the cyanobacterial phenotype. A number of these are unique to cyanobacteria and are specific signatures of the cyanobacterial phenotype. In addition, a number of genes have been laterally transferred into the cyanobacteria, and efforts to identify these genes will be discussed.

16 November
K. Simek
Gene Expression Data Modeling using Singular Value Decomposition.

ABSTRACT: The singular value decomposition (SVD) is a standard and straightforward procedure from linear algebra. It is now often used to simplify the analysis and modeling of gene expression data. Using SVD, gene expression profiles can be represented by a small number of characteristic modes that capture the temporal patterns of gene expression change. I present an application of SVD to dynamical modeling of time series of gene expression data, as described in [2]. The main goal of the approach is to find the time translation matrix, which describes how the characteristic modes influence one another over time. The analysis leads to the conclusion that a model of low dimensionality is sufficient to reconstruct the expression patterns with reasonable fidelity.

REFERENCES:

[1] Neal S. Holter, Madhusmita Mitra, Amos Maritan, Marek Cieplak, Jayanth R. Banavar, and Nina V. Fedoroff (2000): Fundamental patterns underlying gene expression profiles: Simplicity from complexity. PNAS 97: 8409-8414.

[2] Neal S. Holter, Amos Maritan, Marek Cieplak, Nina V. Fedoroff, and Jayanth R. Banavar (2001): Dynamic modeling of gene expression data. PNAS 98: 1693-1698
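
The two steps named in the abstract, extracting characteristic modes with SVD and fitting a time translation matrix, can be sketched as follows. This is an illustration on synthetic data, loosely in the spirit of [2], and not the procedure of that paper. It assumes NumPy is available. The function names, the least-squares fit and the two-mode test data are all hypothetical choices.

```python
import numpy as np

def characteristic_modes(expr, n_modes):
    """Return the leading temporal modes and all singular values.

    expr is a (genes x time points) array; each row of the returned
    mode array is one characteristic mode sampled at every time point.
    """
    _, singular_values, vt = np.linalg.svd(expr, full_matrices=False)
    return vt[:n_modes], singular_values

def fit_time_translation(modes):
    """Least-squares matrix M with modes[:, t + 1] ~ M @ modes[:, t]."""
    past, future = modes[:, :-1], modes[:, 1:]
    # M @ past = future is solved in transposed form: past.T @ M.T = future.T
    m_transposed, *_ = np.linalg.lstsq(past.T, future.T, rcond=None)
    return m_transposed.T

def synthetic_expression(n_genes=50, n_times=12, seed=0):
    """Hypothetical test data: every gene mixes two damped oscillating modes."""
    rng = np.random.default_rng(seed)
    angle = 0.6
    true_m = 0.95 * np.array([[np.cos(angle), -np.sin(angle)],
                              [np.sin(angle), np.cos(angle)]])
    state = np.array([1.0, 0.0])
    states = []
    for _ in range(n_times):
        states.append(state)
        state = true_m @ state
    loadings = rng.normal(size=(n_genes, 2))
    return loadings @ np.array(states).T

expr = synthetic_expression()
modes, singular_values = characteristic_modes(expr, n_modes=2)
m = fit_time_translation(modes)
variance_captured = (singular_values[:2] ** 2).sum() / (singular_values ** 2).sum()
```

Because the synthetic data are built from exactly two modes, `variance_captured` is essentially 1 and the fitted matrix `m` reproduces each mode from the previous time point. Real expression data are noisy, so the fit would only be approximate.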

30 November
C. Shaw
Dicty microarrays.

ABSTRACT: Dictyostelium discoideum is a unicellular organism that exhibits social behavior and cellular differentiation when starved. The transition to multicellularity is thought to incorporate many cell-signaling and communication strategies that are relevant to, if not conserved among, the higher eukaryotes. Because Dictyostelium are easily grown, maintained, and mutagenized in the laboratory, they provide an excellent model system for study with cDNA microarrays.

We have performed and analyzed over 1000 Dictyostelium microarray experiments using custom arrays we fabricate at our facility. Our experiments include analyses of both wild-type and mutant Dictyostelium development. We have been able to characterize wild-type Dictyostelium development, and the analysis of mutants has shown our ability to recover known epistatic relationships between genes in a functional pathway.

We have found 5 major expression modalities exhibited during wild-type Dictyostelium development. We have further identified cell-type-specific patterns of gene expression from homogeneous samples of 4 cell type populations. We have also shown the impact of nutritional history on the wild-type developmental time course.

We have analyzed an extensive set of Dictyostelium mutants in the PKA pathway with microarrays. Using the microarray developmental time course as a phenotype, we have sought to recover the genetic relationships in this already well-known genetic pathway. We show that the microarray phenotype of development is sufficient to reproduce the known epistatic relationships between the mutagenized genes.

The ongoing Dictyostelium genome sequencing project provides plentiful sequence data for comparison with microarray expression patterns. We have begun the process of associating microarray expression data with the upstream sequence of coordinately expressed genes.
