Seminars
at Rice meet from 2.30 pm to 3.30 pm in Duncan Hall 1075.
Exceptions are noted in the schedule below.
Parking at Rice is available in visitor spots in Lot C (Abercrombie lot),
Entrance 16 from Rice Blvd.
·
25 January
Abstract: Coalescence, or genetic
drift viewed backwards, is a major force in population genetics. It
introduces a very particular kind of dependence between individuals in a
genetic population caused by the fact that any two or more individuals in this
population might have shared a common ancestor some random time ago. The
strength of coalescence is inversely proportional to the effective population
size, because in large populations the time to the common ancestor increases. A
star-like coalescence is a situation where the population is virtually infinite,
and thus the ancestor lived a virtually infinite time ago. Contrary to an
apparently common belief, such populations bear little or no resemblance to
populations of independent individuals, unless some very restrictive conditions
concerning the speed of population growth are met.
We will discuss these conditions and exhibit
examples showing that models with star-like coalescence may radically differ
from those that do not involve this force. To this end, we will consider
Fisher-Wright-Moran models with various mutation patterns and various population
growth scenarios. These models do not involve recombination or selection.
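The link between population size and time to the common ancestor can be illustrated with a small simulation. This is only a sketch under assumptions not stated in the abstract: a haploid Wright-Fisher population of constant size, a sample of two lineages, and hypothetical function names. Going back one generation, the two lineages pick the same parent with probability 1/N, so the mean waiting time is about N generations.

```python
import random


def pairwise_coalescence_time(pop_size, rng):
    """Generations back until two sampled lineages share a parent.

    Assumes a haploid Wright-Fisher population of constant size: in each
    generation back, the two lineages choose the same parent with
    probability 1 / pop_size.
    """
    generations = 1
    while rng.random() >= 1.0 / pop_size:
        generations += 1
    return generations


def mean_coalescence_time(pop_size, replicates=5000, seed=1):
    """Average pairwise coalescence time over independent replicates."""
    rng = random.Random(seed)
    total = sum(pairwise_coalescence_time(pop_size, rng) for _ in range(replicates))
    return total / replicates


if __name__ == "__main__":
    # The mean grows roughly in proportion to the population size.
    for size in (10, 50, 200):
        print(size, round(mean_coalescence_time(size), 1))
```

Growing populations, which the talk is concerned with, would need a population size that changes from generation to generation; the constant-size case is shown only because it is the simplest.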
·
1 February
ABSTRACT: Loglinear models are commonly used to find dependencies among categorical variables, so it is reasonable to extend this method of analysis to the human genome when dealing with mapping qualitative traits, such as certain diseases. In 1995, an article in Genetic Epidemiology by Duncan Thomas and others described a loglinear model used to investigate which haplotypes contributed to the risk of developing Childhood Insulin-Dependent Diabetes in the Finnish Population. They used Gibbs Sampling with an Empirical Bayes approach to estimate the model coefficients. The talk will focus on the methodology used for their analysis. Some background will be reviewed for motivation.
References:
Thomas, D., et al. (1995). Variation in HLA-Associated Risks of Childhood Insulin-Dependent Diabetes in the Finnish Population: II. Haplotype Effects. Genetic Epidemiology, vol. 12, 455-466.
·
8 February
Abstract: Detecting the signature of natural selection has been a central theme in evolutionary genetics. The aim of this work is to detect the signature of natural selection based on microsatellite loci. Some tests exist, but they are mostly based on the Infinite Allele Model or the Infinite Site Model and may not be applicable to microsatellite loci. Complications can arise when the population under study is undergoing demographic changes, which can also cause departures from neutral expectation and so confound the test. Through coalescence-based simulation, samples from populations under these changes can be simulated and their effects on neutrality tests will be studied. The methodologies developed should be useful for interpreting the functional significance of genomic data as well as for studying the genetic basis of complex traits.
·
22 February
ABSTRACT: This is a Journal Club talk which will discuss
results in the paper [1]. This paper is devoted to the development of sampling
theory for the infinitely-many-sites process. The infinitely-many-sites (IMS)
model is commonly used to describe variability observed in samples of DNA sequences. The tree structure underlying
the IMS model is shown. Definitions and relations between labeled, unlabeled,
ordered, unordered, rooted and unrooted trees are explained. A Markov chain
whose states correspond to genealogical trees is presented and used to derive
expressions for probabilities of genealogical trees under the stationarity
hypothesis. Recursions for the probability of the configuration of sequences
(genealogical trees) are given. For samples of large size, where probabilities
cannot be obtained by recursions, a method based on Monte Carlo iterations is
developed.
References:
[1] R. C. Griffiths, S. Tavaré, Unrooted genealogical tree probabilities in the infinitely many sites model, Math. Biosci., vol. 127, pp. 77-98, 1995.
·
1 March
Abstract: DNA microarrays provide a technique for acquiring a
large amount of data on gene expression of cultured cells subjected to various
stimuli. Because of the quantity
of data obtained, a major question in using microarrays is how to analyze the
data in a reasonable way. We are
currently employing microarray technology to study the gene expression of
vascular cells subjected to two different mechanical forces, shear stress and
cyclic strain, which are thought to contribute to cardiovascular disease. Human umbilical vein endothelial cells
(HUVECs) were subjected to shear stress at 25 dyn/cm² for 6 and 24 hours in a
parallel plate flow chamber. Employing mRNA from sheared and static cells,
differential expression of over 4000 genes was evaluated simultaneously with
Research Genetics’ GeneFilters® GF211 microarrays. Next,
HUVECs were subjected to cyclic deformation at 10% strain for 6 and 24 hours,
and GF211 microarrays were again used for the strained, motion control and
static cells. Three microarrays
were used for each condition in three separate experiments with each of the
conditions. The resulting data was
subjected to normalization, and genes that met a fold-threshold of 2.0 in either
direction were considered differentially expressed. Shear stress appears to result in more significant changes
in both the quantity and magnitude of gene expression than cyclic strain. Additionally, the differentially
regulated genes of each condition were different. However, with both mechanical forces, genes not previously
thought to be regulated by these stimuli were discovered to be differentially
expressed. With further analysis
of this data, we hope to determine how genes act together to react to the
forces imposed by blood flow and pressure.
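The fold-threshold rule mentioned above can be sketched in a few lines. This is an illustration only, not the authors' analysis pipeline: it assumes the intensities are already normalized and strictly positive, and the function name and gene labels are hypothetical.

```python
def classify_by_fold_change(treated, control, threshold=2.0):
    """Label each gene 'up', 'down', or 'unchanged' by its treated/control ratio.

    treated, control: dicts mapping a gene label to a normalized,
    strictly positive expression value (an assumption of this sketch).
    A gene meets the threshold when its ratio is at least `threshold`
    or at most 1 / `threshold`.
    """
    labels = {}
    for gene, treated_value in treated.items():
        ratio = treated_value / control[gene]
        if ratio >= threshold:
            labels[gene] = "up"
        elif ratio <= 1.0 / threshold:
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the three outcomes.
    sheared = {"gene_a": 400.0, "gene_b": 90.0, "gene_c": 150.0}
    static = {"gene_a": 100.0, "gene_b": 200.0, "gene_c": 120.0}
    print(classify_by_fold_change(sheared, static))
```

With replicate arrays, as in the experiments described, one would typically combine the replicates (for example by averaging) before applying the rule; how the replicates were combined is not stated in the abstract.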
References:
[1] McCormick, et al., PNAS, 2001, 98(16), 8955-60
[2] Frye, et al., Annals of Biomedical Engineering, 2001, 29(Supplement 1), S-61, 5.3.4
· 25 March
ABSTRACT: The aim of this work is to study the involvement of the intermediate filaments (components of the cytoskeleton) in mechanical functions of the cell. More specifically, we consider the mechanotransduction processes, acting via a reorganization of the network architecture in response to extracellular mechanical changes. A first aspect of interest is the characterization of the cytoskeletal network architecture using image analysis procedures. This characterization is defined as a simplified representation of the networks, a segmentation of cytoskeletal networks, followed by a quantitation of their architecture. This methodology is made up of three approaches resulting from a multi-scale observation of networks. It is used to characterize on the one hand the intracellular variability of cytoskeletal network architecture, and on the other hand the network architectural variations of cells subjected to different mechanical conditions. In particular, the work deals with the analysis of the architecture of cells subjected to microgravity conditions. A second aspect of the work is modeling. We developed an integro-differential model of the establishment of the cytokeratin network, under the hypothesis that the structural organization of the cytoskeletal network depends on its biological function. The model is designed to check the hypothesis, expressed and observed in the characterization step, that intermediate filament networks are involved in mechanotransduction through architectural variations. From the mathematical model of the structural organization of the cytokeratin network induced by the mechanical environment, a three-dimensional simulation model is derived. The simulation model makes it possible to obtain examples of cytokeratin network architecture for fixed mechanical conditions and to study the behavior of the mathematical model quantitatively.
References:
[1] Portet, S. and Vassy, J. and Beil, M. and Millot, G. and Hebbache, A. and Rigaut, J.P. and Schoevaert, D. (1999): Quantitative Analysis of Cytokeratin Network Topology in the MCF7 Cell Line. Cytometry, 35: 203-213.
[2] Herrmann, H. and Aebi, U. (2000): Intermediate Filaments and their Associates: Multi-Talented Structural Elements Specifying Cytoarchitecture and Cytodynamics. Curr. Opin. Cell Biol, 12: 79-90.
[3] Dallon, J.C. and Sherratt, J. and Maini, P. (1999): Mathematical Modelling of Extracellular Matrix Dynamics Using Discrete Cells: Fibers Orientation and Tissue Regeneration. J. Theor. Biol, 199: 449-471.
· 12 April
Evolution-Directed Characterization and Engineering of Protein Functional Surfaces
ABSTRACT: Protein functional surfaces control the
interaction between proteins and other macromolecules and thus play an essential
role in all aspects of cellular life. We aim to identify and predictably modify
the amino acids that mediate these interactions. Our computational approach,
the Evolutionary Trace (ET), ranks the relative functional importance of amino
acids in a protein by correlating evolutionary sequence variations with
phylogenetic tree divergences.
This is similar to mutational scanning experiments in the laboratory,
but we use the abundant mutations and assays already tested through
evolution. Controls and bona fide
predictions validate ET. For example, an interface that controls the binding of
G proteins, RGS and the PDE visual effector was predicted and then validated
mutationally and crystallographically.
We are now generalizing ET to other proteins with a known structure, in
order to combine the sequence and structure data on a proteomic scale and
extract the molecular determinants of functional sites on a large scale. Such knowledge may then be used for
drug targeting and design, for protein modeling and engineering, and to
elucidate the function of new genes and new protein structures.
References:
Lichtarge, O., Sowa, M.E. (2002): Evolutionary Predictions of Binding Surfaces and Interactions. Current Opinion in Structural Biology 12: 21-27.
Madabushi, S., Yao, H., Marsh, M., Philippi, A., Kristiansen, D., Sowa, M.E., Lichtarge, O. (2002): Structural Clusters of Evolutionary Trace Residues are Statistically Significant and Widespread in Proteins. Journal of Molecular Biology 316: 139-153.
Lichtarge, O., Sowa, M.E., Philippi, A. (2002): Evolutionary Trace studies of protein functional surfaces involved in G protein signaling. Methods in Enzymology 344: 536-556.
Sowa, M.E., Wei He, Slep, K.C., Kercher, M.A., Lichtarge, O., Wensel, T.G. (2001): Prediction and Confirmation of an allosteric pathway for regulation of RGS domain activity. Nature Structural Biology 8: 234-237.
Sowa, M.E., Wei He, Wensel, T.G. and Lichtarge, O. (2000): Identification of a General RGS-Effector Interface. Proc. Natl. Acad. Sci. U.S.A. 97: 1483-1488.
Lichtarge, O., Bourne, H.R., Cohen, F.E. (1996): The Evolutionary Trace Method Defines the Binding Surfaces Common to a Protein Family. Journal of Molecular Biology 257: 342-358.
·
19 April
The Bioinformatics Revolution
ABSTRACT: High-throughput, large-scale genomics
(microarrays) and proteomics (2D gels) techniques have revolutionized molecular
biology in the last few years. The
result is a deluge of data that is accelerating as these
technologies are embraced by more and more biomedical investigators. Bioinformatics uses modern data
management and analysis techniques to organize vast amounts of data into a hopefully
coherent body of knowledge.
Bioinformatics seeks to synthesize these very large data fields into a
rational and internally consistent picture of the biological organism as a
complex system of proteins and cells with a genetic blueprint. I will show several examples from
actual research of how these techniques can be used to discover information within
data.
Impact of Bioinformatics at the NIH: As computational capabilities and
resources continue to develop, the use of computer science and technology by
the biomedical community is increasing. The fusion of biomedicine and computer
technology offers substantial benefits to all NIH institutes and centers in
support of their general mission of improving the quality of the nation's
health by increasing biological knowledge. (See this and more information on computation at NIH: http://grants.nih.gov/grants/bistic)
·
26 April
ABSTRACT: The presentation
is planned as a Journal Club talk. I will present the approach from [1], [2],
where the authors proposed a general statistical framework for finding associations
in gene expression data using the coefficient of determination. This
coefficient measures the degree to which the transcriptional levels of an
observed gene set can be used to improve the prediction of the transcriptional
state of a target gene relative to the best possible prediction in the absence
of observations. As the predictor they used the ternary perceptron, which is a
single-layer neural network with a ternary threshold as its activation function.
The method is applied to a set of genes undergoing genotoxic stress for
validation according to the manner in which it points toward previously known
and unknown relationships.
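The coefficient of determination described above is a relative error reduction: (e0 - e1) / e0, where e0 is the error of the best prediction without observations and e1 is the error when the observed genes are used. The sketch below illustrates that ratio under several assumptions that are not taken from [1] or [2]: expression levels are coded as -1, 0, 1; error is mean squared error; and, in place of the ternary perceptron, a plain lookup table picks the best ternary guess for each observed pattern. Errors are computed on the same data used to build the table, so the value is optimistic.

```python
from collections import defaultdict

TERNARY = (-1, 0, 1)


def best_constant_error(values):
    """Smallest mean squared error achievable by one fixed ternary guess."""
    return min(sum((v - c) ** 2 for v in values) / len(values) for c in TERNARY)


def coefficient_of_determination(observed, target):
    """Relative error reduction from using `observed` to predict `target`.

    observed: list of tuples of ternary values, one tuple per sample
    target:   list of ternary values for the target gene, same order
    """
    baseline = best_constant_error(target)
    if baseline == 0:
        return 0.0  # the target never varies, so nothing is left to explain
    groups = defaultdict(list)
    for pattern, value in zip(observed, target):
        groups[pattern].append(value)
    # Error of the lookup-table predictor: best ternary guess per pattern,
    # weighted by how many samples show that pattern.
    with_observations = sum(
        best_constant_error(vals) * len(vals) for vals in groups.values()
    ) / len(target)
    return (baseline - with_observations) / baseline


if __name__ == "__main__":
    # Hypothetical data: one observed gene, four samples.
    print(coefficient_of_determination([(1,), (1,), (0,), (0,)], [1, 0, -1, -1]))
```

A value near 1 means the observed genes remove almost all of the prediction error; a value near 0 means they add nothing beyond the best constant guess.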
References:
[1] S. Kim, E.R. Dougherty, M.L. Bittner, Y. Chen, K.L. Sivakumar, P.S. Meltzer, J.M. Trent (2000): General nonlinear framework for the analysis of gene interaction via multivariate expression arrays, Biomedical Optics, 5(4), p. 411-424
[2] S. Kim, E. R. Dougherty, M. L. Bittner, Y. Chen, K. Sivakumar, P. Meltzer, and J. M. Trent (2001): Multivariate measurement of gene expression relationships, Genomics, 67, p. 201-209
·
24 May
ABSTRACT: Palindromes
are symmetrical words of DNA in the sense that they read exactly the same as
their reverse complementary sequences.
When the occurrences of palindromes in a long DNA molecule are represented
as points on the unit interval, scan statistics can be used to identify
regions of unusually high concentration of palindromes. These regions have been
demonstrated to be associated with replication origins on some herpesviruses.
However, the use of scan statistics requires the assumption that the points
representing the palindromes are independently and uniformly distributed on the
unit interval. In this paper, we provide a mathematical basis for making this
assumption by showing that in randomly generated DNA sequences, the occurrences
of palindromes can be approximated by a Poisson process. An easily computable upper bound on the
Wasserstein distance between the palindrome process and the Poisson process is
obtained. This bound is then used as a guide to choose an optimal palindrome
length in the analysis of a number of herpesvirus genomes. Regions harboring
significant palindrome clusters are identified and compared to known locations
of replication origins. This analysis brings out a few interesting extensions
of the scan statistics that can help formulate an algorithm for accurate
prediction of replication origins.
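The two basic ingredients of this analysis, finding reverse-complement palindromes and counting them in a sliding window, can be sketched as follows. This is a minimal illustration, not the method of the talk: the function names are hypothetical, the sequence is assumed to contain only A, C, G and T, and the window count is only the raw quantity on which a scan statistic is based, with no significance calculation.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(word):
    """Complement each base and reverse the order."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


def is_palindrome(word):
    """True when a DNA word equals its own reverse complement."""
    return word == reverse_complement(word)


def palindrome_positions(sequence, length):
    """Start positions of palindromes of the given length, scaled to [0, 1).

    Only even lengths can match, because the middle base of an odd-length
    word would have to be its own complement.
    """
    n = len(sequence)
    return [
        i / n
        for i in range(n - length + 1)
        if is_palindrome(sequence[i:i + length])
    ]


def max_window_count(points, window):
    """Largest number of points in any interval [p, p + window) starting at a point."""
    ordered = sorted(points)
    best = 0
    for i, start in enumerate(ordered):
        count = sum(1 for p in ordered[i:] if p < start + window)
        best = max(best, count)
    return best


if __name__ == "__main__":
    # Hypothetical toy sequence; GAATTC is its own reverse complement.
    points = palindrome_positions("GAATTCAAAAGAATTC", 6)
    print(points, max_window_count(points, 0.5))
```

Under the Poisson approximation discussed in the abstract, an unusually large window count relative to the overall density of palindromes would flag a candidate cluster.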
·
05 June