Seminars at Rice meet from 2pm to 3pm in Duncan Hall 1075.
Exceptions are noted in the schedule below.
Parking at Rice is available in visitor spots in Lot C (Abercrombie lot),
Entrance 16 from Rice Blvd.
Presentation slides (.ppt format)
Barbour, A.D., Holst, L., and Janson, S. (1992):
Poisson Approximation, Chapter 1. Oxford University Press.
Leung, M.Y., Schachtel, G.A., and Yu, H.S. (1994): Scan statistics and DNA
sequence analysis: the search for an origin of replication in a virus. Nonlinear
World, 1, 445-471.
ABSTRACT: Microarrays are a new technique for measuring gene
expression that has attracted a great deal of research interest in
recent years. It has been suggested that gene expression data from microarrays
(biochips) can be utilized in many biomedical areas, for example in cancer
classification. Whereas several classification methods, both new and existing,
have been tested, selecting a proper (optimal) set of genes whose expression
is used for classification is still an open problem. In the presentation, a
new heuristic method for choosing a suboptimal set of genes using support
vector machines (SVMs) will be formulated. The resulting set of genes optimizes
the leave-one-out cross-validation error. The method is tested on microarray gene
expression data from samples of two cancer types: acute myeloid leukemia (AML)
and acute lymphoblastic leukemia (ALL). The results show that the classification
quality of the selected set of genes is much better than that of sets obtained
using other feature selection methods.
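The abstract does not spell out the heuristic, so the following is only a sketch of one generic possibility: greedy forward selection of genes, scored by leave-one-out error. To keep it dependency-light, a nearest-centroid classifier stands in for the SVM (it is not the speaker's method), and all function names and the toy data are hypothetical.

```python
import numpy as np


def nearest_centroid_predict(X_train, y_train, x_test):
    """Stand-in classifier (NOT an SVM): pick the class whose mean is closest."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    distances = np.linalg.norm(centroids - x_test, axis=1)
    return classes[np.argmin(distances)]


def loo_error(X, y, genes, predict=nearest_centroid_predict):
    """Leave-one-out error rate using only the columns (genes) listed in `genes`."""
    Xs = X[:, genes]
    n = len(y)
    mistakes = 0
    for i in range(n):
        keep = np.arange(n) != i  # train on every sample except sample i
        if predict(Xs[keep], y[keep], Xs[i]) != y[i]:
            mistakes += 1
    return mistakes / n


def greedy_gene_selection(X, y, n_genes, predict=nearest_centroid_predict):
    """Forward selection: at each step add the gene giving the lowest LOO error."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(n_genes):
        errors = [loo_error(X, y, selected + [g], predict) for g in remaining]
        best = remaining[int(np.argmin(errors))]
        selected.append(best)
        remaining.remove(best)
    return selected, loo_error(X, y, selected, predict)


# Made-up toy data: 6 samples (rows) x 3 genes (columns); only gene 1
# separates the two classes cleanly.
X = np.array([
    [1.0, 0.1, 3.0],
    [5.0, 0.2, 1.0],
    [2.0, 0.0, 4.0],
    [4.0, 5.1, 1.0],
    [3.0, 5.0, 5.0],
    [6.0, 4.9, 9.0],
])
y = np.array([0, 0, 0, 1, 1, 1])

selected, error = greedy_gene_selection(X, y, n_genes=1)
print("selected genes:", selected, "LOO error:", error)
```

An SVM could be substituted by passing a different `predict` function with the same three arguments. The greedy search yields a suboptimal gene set in the sense used in the abstract: it is not guaranteed to find the best subset overall.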
REFERENCES:
Cristianini N., Shawe-Taylor J. (2000): An introduction to support vector machines and other
kernel-based learning methods, Cambridge Univ. Press.
Fujarewicz K., Rzeszowska-Wolny J.: Neural network approach to cancer classification based on gene expression levels, Proc. IASTED Int. Conf. Modelling Identification and Control, pp. 564–568, Innsbruck, 2001.
Presentation slides (MSWord format)
2 November
H. Spratt
A Comparison of Three Methods
Used To Determine the Functionally Important Residues of a Protein Sequence
ABSTRACT: I will discuss three methods used to determine which sites along a protein sequence are functionally important: the evolutionary trace, the method of maximum likelihood trees, and the method of hidden Markov models using maximum likelihood trees. The evolutionary trace procedure determines which sites are important based on a protein evolutionary tree and on the conservation of residues at various branches of the tree. The maximum likelihood tree method uses a different procedure to draw the evolutionary trees and is able to determine different categories of protein evolution. The hidden Markov method is a further adaptation of the maximum likelihood method: it adds a hidden Markov chain to the process to determine which sites evolve at which rate (both of which are unknown to the observer). Results of the three analyses on several different proteins will be presented, as well as results of the analysis on simulated protein sequences.
ABSTRACT:
The singular value decomposition (SVD) is a standard and straightforward
procedure from linear algebra. It is now often used to simplify the analysis
and modeling of gene expression data.
Using SVD, gene expression profiles can be represented by a small number
of characteristic modes that capture the temporal patterns of gene expression
change. I present an application of SVD to the dynamical modeling of time series
of gene expression data, as described in [2].
The main goal of the approach is to find the time translation matrix, which
describes how the characteristic modes influence one another over time.
The analysis leads to the conclusion that only a low-dimensional
model is needed to reconstruct the expression patterns with reasonable
fidelity.
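As a rough illustration only, the sketch below extracts characteristic modes with NumPy's SVD and fits a time translation matrix M by ordinary least squares, assuming the modes obey Y(t+1) ≈ M Y(t). The fitting procedure in [2] may differ; the function names and the synthetic data are hypothetical.

```python
import numpy as np


def characteristic_modes(expression, n_modes):
    """Split a genes x time matrix into gene loadings and temporal modes.

    Returns (U_r, Y), where Y = diag(s_r) @ Vt_r holds one mode per row and
    U_r @ Y is the best rank-`n_modes` approximation of `expression`.
    """
    U, s, Vt = np.linalg.svd(expression, full_matrices=False)
    return U[:, :n_modes], s[:n_modes, None] * Vt[:n_modes]


def time_translation_matrix(modes):
    """Least-squares M such that modes[:, t + 1] is approximately M @ modes[:, t]."""
    past, future = modes[:, :-1], modes[:, 1:]
    return future @ np.linalg.pinv(past)


# Synthetic example (made-up numbers): 5 genes driven by a 2-dimensional
# damped rotation, observed at 8 time points.
theta = 0.4
M_true = 0.9 * np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])
states = [np.array([1.0, 0.0])]
for _ in range(7):
    states.append(M_true @ states[-1])
latent = np.array(states).T                      # shape (2, 8)
loadings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
                     [2.0, -1.0], [0.5, 3.0]])   # shape (5, 2)
expression = loadings @ latent                   # shape (5, 8)

U_r, Y = characteristic_modes(expression, n_modes=2)
M = time_translation_matrix(Y)

# One-step-ahead reconstruction of the expression profiles from two modes.
predicted = U_r @ (M @ Y[:, :-1])
print("max one-step error:", np.abs(predicted - expression[:, 1:]).max())
```

Because the synthetic data are exactly rank 2, two modes are enough here. With real expression data the number of modes to keep would have to be chosen, for example from the decay of the singular values.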
REFERENCES
30 November
C. Shaw
Dicty microarrays.
ABSTRACT:
Dictyostelium discoideum is a unicellular organism that exhibits social behavior
and cellular differentiation when starved. The transition to multicellularity is
thought to incorporate many cell-signaling and communication strategies relevant
to, if not conserved among, the higher eukaryotes. Because Dictyostelium are
easily grown, maintained, and mutagenized in the laboratory, they provide an
excellent model system for study with cDNA microarrays.
We have performed and analyzed over 1000 Dictyostelium microarray experiments
using custom arrays we fabricate at our facility. Our experiments include
analyses of both wild-type and mutant Dictyostelium development. We have been
able to characterize wild-type Dictyostelium development, and the analysis of
mutants has shown our ability to recover known epistatic relationships between
genes in a functional pathway.
We have found 5 major expression modalities exhibited during wild-type
Dictyostelium development. We have further identified cell-type-specific
patterns of gene expression from homogeneous samples of 4 cell-type populations.
We have also shown the impact of nutritional history on the wild-type
developmental time course.
We have analyzed an extensive set of Dictyostelium mutants in the PKA pathway
with microarrays. Using the microarray developmental time course as a phenotype,
we have sought to recover the genetic relationships in this already well-known
genetic pathway. We show that the microarray phenotype of development is
sufficient to reproduce the known epistatic relationships between the
mutagenized genes.
The ongoing Dictyostelium genome sequencing project provides plentiful sequence
data for comparison with microarray expression patterns. We have begun the
process of associating microarray expression data with the upstream sequence of
coordinately expressed genes.