Spring 2002

Mathematical Biology Seminar

Seminars at Rice meet from 2:30 pm to 3:30 pm in Duncan Hall 1075.
Exceptions are noted in the schedule below.

Parking at Rice is available in visitor spots in Lot C (Abercrombie lot), Entrance 16 from Rice Blvd.

· 25 January

Adam Bobrowski

STAR-LIKE COALESCENCE

Abstract: Coalescence, or genetic drift viewed backwards, is a major force in population genetics. It introduces a very particular kind of dependence between individuals in a genetic population, caused by the fact that any two or more individuals in this population might have shared a common ancestor some random time ago. The strength of coalescence is inversely proportional to the effective population size, because in large populations the time to the common ancestor increases. A star-like coalescence is a situation where the population is virtually infinite, and thus the ancestor lived a virtually infinite time ago. Contrary to an apparently common belief, such populations bear little or no resemblance to populations of independent individuals, unless some very restrictive conditions concerning the speed of population growth are met.

We will discuss these conditions and exhibit examples showing that models with star-like coalescence may radically differ from those that do not involve this force. To this end, we will consider Fisher-Wright-Moran models with various mutation patterns and various population growth scenarios. These models do not involve recombination or selection.

· 1 February

Michael Swartz

Loglinear models for gene mapping

ABSTRACT: Loglinear models are commonly used to find dependencies among categorical variables, so it is reasonable to extend this method of analysis to the human genome when dealing with mapping qualitative traits, such as certain diseases. In 1995, an article in Genetic Epidemiology by Duncan Thomas and others described a loglinear model used to investigate which haplotypes contributed to the risk of developing Childhood Insulin-Dependent Diabetes in the Finnish Population. They used Gibbs Sampling with an Empirical Bayes approach to estimate the model coefficients. The talk will focus on the methodology used for their analysis. Some background will be reviewed for motivation.

References:

Thomas, D., et al. (1995). Variation in HLA-Associated Risks of Childhood Insulin-Dependent Diabetes in the Finnish Population: II. Haplotype Effects. Genetic Epidemiology, vol. 12, 455-466.

· 8 February

Nathan Xu

Detecting the Signature of Natural Selection with Microsatellites

Abstract: Detecting the signature of natural selection has been a central theme in evolutionary genetics. The aim of this work is to detect the signature of natural selection based on microsatellite loci. Some tests for this exist, but they are mostly based on the Infinite Allele Model or the Infinite Site Model and may not be applicable to microsatellite loci. Complications can arise when the population under study is undergoing demographic changes, which can confound departures from neutral expectation. Through coalescence-based simulation, samples from populations undergoing these changes can be simulated and their effects on neutrality tests will be studied. The methodologies developed should be useful for interpreting the functional significance of genomic data as well as for studying the genetic basis of complex traits.

· 22 February

Andrzej Polanski

Genealogical tree probabilities in the infinitely-many-sites model

ABSTRACT: This is a Journal Club talk which will discuss results in the paper [1]. This paper is devoted to the development of sampling theory for the infinitely-many-sites process. The infinitely-many-sites (IMS) model is commonly used to describe variability observed in samples of DNA sequences. The tree structure underlying the IMS model is shown. Definitions of, and relations between, labeled, unlabeled, ordered, unordered, rooted and unrooted trees are explained. A Markov chain whose states correspond to genealogical trees is presented and used to derive expressions for probabilities of genealogical trees under the stationarity hypothesis. Recursions for the probability of the configuration of sequences (genealogical trees) are given. For samples of large size, where probabilities cannot be obtained by recursions, a method based on Monte Carlo iterations is developed.

References:

[1] R. C. Griffiths, S. Tavaré, Unrooted genealogical tree probabilities in the infinitely many sites model, Math. Biosci., vol. 127, pp. 77-98, 1995.

· 1 March

Stacie Gardener

Mechanical forces influence gene expression of human umbilical vein endothelial cells

Abstract: DNA microarrays provide a technique for acquiring a large amount of data on gene expression of cultured cells subjected to various stimuli. Because of the quantity of data obtained, a major question in using microarrays is how to analyze the data in a reasonable way. We are currently employing microarray technology to study the gene expression of vascular cells subjected to two different mechanical forces, shear stress and cyclic strain, which are thought to contribute to cardiovascular disease. Human umbilical vein endothelial cells (HUVECs) were subjected to shear stress at 25 dyn/cm² for 6 and 24 hours in a parallel plate flow chamber. Employing mRNA from sheared and static cells, differential expression of over 4000 genes was evaluated simultaneously with Research Genetics' GeneFilters® GF211 microarrays. Next, HUVECs were subjected to cyclic deformation at 10% strain for 6 and 24 hours, and GF211 microarrays were again used for the strained, motion control and static cells. Three microarrays were used for each condition in three separate experiments with each of the conditions. The resulting data were subjected to normalization, and genes that met a fold-threshold of 2.0 in either direction were considered differentially expressed. Shear stress appears to result in more significant changes in both the quantity and magnitude of gene expression than cyclic strain. Additionally, the differentially regulated genes of each condition were different. However, with both mechanical forces, genes not previously thought to be regulated by these stimuli were discovered to be differentially expressed. With further analysis of these data, we hope to determine how genes act together to react to the forces imposed by blood flow and pressure.
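The fold-threshold rule mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the speakers' analysis pipeline: the function name, the gene labels and the intensity values are hypothetical, and the sketch assumes the intensities have already been normalized and are strictly positive.

```python
def differentially_expressed(treated, control, threshold=2.0):
    """Return {gene: ratio} for genes whose treated/control ratio is
    at least `threshold` or at most 1/threshold (either direction).

    `treated` and `control` map gene labels to normalized intensities.
    Genes missing from `control`, or with non-positive values, are skipped.
    """
    hits = {}
    for gene, t in treated.items():
        c = control.get(gene)
        if c is None or c <= 0 or t <= 0:
            continue  # a ratio is not meaningful for these entries
        ratio = t / c
        if ratio >= threshold or ratio <= 1.0 / threshold:
            hits[gene] = ratio
    return hits


# Hypothetical normalized intensities for three made-up genes.
sheared = {"geneA": 400.0, "geneB": 120.0, "geneC": 50.0}
static = {"geneA": 100.0, "geneB": 100.0, "geneC": 200.0}
result = differentially_expressed(sheared, static)
```

With these made-up values, geneA (ratio 4.0) and geneC (ratio 0.25) pass the threshold, while geneB (ratio 1.2) does not.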

References:

[1] McCormick, et al., PNAS, 2001, 98(16), 8955-60

[2] Frye, et al., Annals of Biomedical Engineering, 2001, 29(Supplement 1), S-61, 5.3.4

· 25 March

Stephanie Portet, Ovide Arino

Characterization and Modeling of the cytoskeleton morphofunctional organization during mechanotransduction processes

ABSTRACT: The aim of this work is to study the involvement of the intermediate filaments (components of the cytoskeleton) in mechanical functions of the cell. More specifically, we consider the mechanotransduction processes, acting via a reorganization of the network architecture in response to extracellular mechanical changes. A first aspect of interest is the characterization of the cytoskeletal network architecture using image analysis procedures. This characterization is defined as a simplified representation of the networks, a segmentation of cytoskeletal networks, followed by a quantitation of their architecture. This methodology is made up of three approaches resulting from a multi-scale observation of networks. It is used to characterize on the one hand the intracellular variability of cytoskeletal network architecture, and on the other hand the network architectural variations of cells submitted to different mechanical conditions. In particular, the work deals with the analysis of the architecture of cells submitted to microgravity conditions. A second aspect of the work is modeling. We developed an integro-differential model of the establishment of the cytokeratin network, under the hypothesis that the structural organization of the cytoskeletal network depends on its biological function. The model is designed in order to check the hypothesis, expressed and observed in the characterization step, of the involvement of intermediate filament networks in mechanotransduction by architectural variations. From the mathematical model of the structural organization of the cytokeratin network induced by the mechanical environment, a three-dimensional simulation model is derived. The simulation model makes it possible to obtain examples of cytokeratin network architecture for fixed mechanical conditions, and to study the behavior of the mathematical model quantitatively.

References:

[1] Portet, S. and Vassy, J. and Beil, M. and Millot, G. and Hebbache, A. and Rigaut, J.P. and Schoevaert, D. (1999): Quantitative Analysis of Cytokeratin Network Topology in the MCF7 Cell Line. Cytometry, 35: 203-213.

[2] Herrmann, H. and Aebi, U. (2000): Intermediate Filaments and their Associates: Multi-Talented Structural Elements Specifying Cytoarchitecture and Cytodynamics. Curr. Opin. Cell Biol, 12: 79-90.

[3] Dallon, J.C. and Sherratt, J. and Maini, P. (1999): Mathematical Modelling of Extracellular Matrix Dynamics Using Discrete Cells: Fibers Orientation and Tissue Regeneration. J. Theor. Biol, 199: 449-471.

· 12 April

Olivier Lichtarge

Evolution-Directed Characterization and Engineering of Protein Functional Surfaces

ABSTRACT: Protein functional surfaces control the interaction between proteins and other macromolecules and thus play an essential role in all aspects of cellular life. We aim to identify and predictably modify the amino acids that mediate these interactions. Our computational approach, the Evolutionary Trace (ET), ranks the relative functional importance of amino acids in a protein by correlating evolutionary sequence variations with phylogenetic tree divergences. This is similar to mutational scanning experiments in the laboratory, but we use the abundant mutations and assays already tested through evolution. Controls and bona fide predictions validate ET. For example, an interface that controls the binding of G proteins, RGS and the PDE visual effector was predicted and then validated mutationally and crystallographically. We are now generalizing ET to other proteins with a known structure, in order to combine the sequence and structure data on a proteomic scale and extract the molecular determinants of functional sites on a large scale. Such knowledge may then be used for drug targeting and design, for protein modeling and engineering, and to elucidate the function of new genes and new protein structures.

References:

Lichtarge, O., Sowa, M.E. (2002): Evolutionary Predictions of Binding Surfaces and Interactions. Current Opinion in Structural Biology 12: 21-27.

Madabushi, S., Yao, H., Marsh, M., Philippi, A., Kristiansen, D., Sowa, M.E., Lichtarge, O. (2002): Structural Clusters of Evolutionary Trace Residues are Statistically Significant and Widespread in Proteins. Journal of Molecular Biology. 316: 139-153.

Lichtarge, O., Sowa, M.E., Philippi, A. (2002): Evolutionary Trace studies of protein functional surfaces involved in G protein signaling. Methods in Enzymology, 344:536-556.

Sowa, M.E., Wei He, Slep, K.C., Kercher, M.A., Lichtarge, O., Wensel, T.G. (2001): Prediction and Confirmation of an allosteric pathway for regulation of RGS domain activity. Nature Structural Biology 8: 234-237.

Sowa, M.E., Wei He, Wensel, T.G. and Lichtarge, O. (2000): Identification of a General RGS-Effector Interface. Proc. Natl. Acad. Sci. U.S.A. 97:1483-1488.

Lichtarge O., Bourne H.R., Cohen F.E. (1996). The Evolutionary Trace Method Defines the Binding Surfaces Common to a Protein Family. Journal of Molecular Biology 257:342-358.

· 19 April

Bruce Luxon

The Bioinformatics Revolution

ABSTRACT: High-throughput, large-scale genomics (microarrays) and proteomics (2D gels) techniques have revolutionized molecular biology in the last few years. The result is a deluge of data that is accelerating as these technologies are embraced by more and more biomedical investigators. Bioinformatics uses modern data management and analysis techniques to organize vast amounts of data into a hopefully coherent body of knowledge. Bioinformatics seeks to synthesize these very large data fields into a rational and internally consistent picture of the biological organism as a complex system of proteins and cells with a genetic blueprint. I will show several examples from actual research of how these techniques can be used to discover information within data.

Impact of Bioinformatics at the NIH: As computational capabilities and resources continue to develop, the use of computer science and technology by the biomedical community is increasing. The fusion of biomedicine and computer technology offers substantial benefits to all NIH institutes and centers in support of their general mission of improving the quality of the nation's health by increasing biological knowledge. (see this and more information on computation at NIH: http://grants.nih.gov/grants/bistic ).

· 26 April

Krzysztof Simek

Nonlinear framework for the analysis of gene interactions using microarray data

ABSTRACT: The presentation is planned as a Journal Club talk. I will present the approach from [1], [2], in which the authors propose a general statistical framework for finding associations in gene expression data using the coefficient of determination. This coefficient measures the degree to which the transcriptional levels of an observed gene set can be used to improve the prediction of the transcriptional state of a target gene relative to the best possible prediction in the absence of observations. As the predictor they use the ternary perceptron, which is a single-layer neural network with a ternary threshold as the activation function. The method is applied to a set of genes undergoing genotoxic stress, and is validated according to the manner in which it points toward previously known and unknown relationships.
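The coefficient of determination described above compares the error of a predictor that uses observations with the error of the best prediction made without them. The Python sketch below illustrates that ratio only. It is not the procedure of [1], [2]: it assumes expression states coded as -1, 0, 1, mean-squared error as the error measure, and the best constant ternary value as the no-observation baseline, and all names and data are hypothetical.

```python
def coefficient_of_determination(target, predicted, levels=(-1, 0, 1)):
    """Return (eps0 - eps) / eps0, where eps is the mean-squared error of
    `predicted` against `target`, and eps0 is the error of the best
    constant predictor chosen from `levels` (the assumed baseline)."""
    n = len(target)

    def mse(pred):
        return sum((t - p) ** 2 for t, p in zip(target, pred)) / n

    eps0 = min(mse([c] * n) for c in levels)
    if eps0 == 0:
        return 0.0  # target is constant: observations cannot improve on it
    return (eps0 - mse(predicted)) / eps0


# Hypothetical ternary states of a target gene across six samples,
# and the output of some predictor built from other genes.
target_states = [1, 1, 0, -1, 1, 0]
predictor_output = [1, 1, 0, 0, 1, 0]
cod = coefficient_of_determination(target_states, predictor_output)
```

In this made-up example the best constant guess (0) has error 4/6 and the predictor has error 1/6, so the coefficient is 0.75. A value near 1 means the observed genes remove most of the baseline error; a value near 0 means they add little.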

References:

[1] S. Kim, E.R. Dougherty, M.L. Bittner, Y. Chen, K.L. Sivakumar, P.S. Meltzer, J.M. Trent (2000): General nonlinear framework for the analysis of gene interaction via multivariate expression arrays, Journal of Biomedical Optics, 5(4), p.411-424

[2] S. Kim, E. R. Dougherty, M. L. Bittner, Y. Chen, K. Sivakumar, P. Meltzer, and J. M. Trent (2001): Multivariate measurement of gene expression relationships, Genomics, 67, p. 201–209

· 24 May

Ming-Ying Leung

Poisson Process Approximation for Palindrome Occurrences in Random Nucleotide Sequences

ABSTRACT: Palindromes are symmetrical words of DNA in the sense that they read exactly the same as their reverse complementary sequences. Representing the occurrences of palindromes in a long DNA molecule as points on the unit interval, the scan statistics can be used to identify regions of unusually high concentration of palindromes. These regions have been demonstrated to be associated with replication origins on some herpesviruses. However, the use of scan statistics requires the assumption that the points representing the palindromes are independently and uniformly distributed on the unit interval. In this paper, we provide a mathematical basis for making this assumption by showing that in randomly generated DNA sequences, the occurrences of palindromes can be approximated by a Poisson process. An easily computable upper bound on the Wasserstein distance between the palindrome process and the Poisson process is obtained. This bound is then used as a guide to choose an optimal palindrome length in the analysis of a number of herpesvirus genomes. Regions harboring significant palindrome clusters are identified and compared to known locations of replication origins. This analysis brings out a few interesting extensions of the scan statistics that can help formulate an algorithm for accurate prediction of replication origins.
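The notion of a DNA palindrome used above, a word equal to its own reverse complement, and the mapping of occurrences onto the unit interval can be sketched in a few lines of Python. This is an illustrative sketch only, not the speaker's code: the function names and the short example sequence are hypothetical, and only windows of one fixed length over the alphabet A, C, G, T are examined.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def palindrome_positions(seq, length):
    """Start indices of all windows of `length` bases that read the same
    as their reverse complement."""
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - length + 1)
        if seq[i:i + length] == reverse_complement(seq[i:i + length])
    ]


def to_unit_interval(positions, seq_len):
    """Rescale start indices to points in [0, 1)."""
    return [p / seq_len for p in positions]


# Hypothetical 10-base sequence containing the palindrome GAATTC.
example = "AAGAATTCAA"
starts = palindrome_positions(example, 6)
points = to_unit_interval(starts, len(example))
```

Here GAATTC starts at index 2, so the single point on the unit interval is 0.2. The points produced this way are the kind of input a scan statistic would then examine for clusters.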

· 5 June

Yuriy Fofanov

Modeling of genetic regulatory dynamics: Local Invariants vs. systems of differential equations

ABSTRACT: Attempts to describe genetic temporal dynamics through the use of systems of differential equations suffer from a series of basic flaws. In the present work, using the analysis of gene expression time profiles from various experiments as an example, we show how the method of Local Invariants makes it possible to avoid some of these difficulties, though at the price of not being able to construct a model of the whole system.
