Genevera Allen
Baylor College of Medicine & Rice University
A Generalized Least Squares Matrix Decomposition
Variables in high-dimensional data sets common in neuroimaging and
genomics often exhibit complex dependencies. These dependencies, which
can arise from spatio-temporal processes, network structures, or latent
variables, are often ignored by conventional multivariate analysis
techniques. We propose a generalization of the singular value
decomposition (SVD)
that is appropriate for transposable matrix data, or data in which
neither the rows nor the columns can be considered independent
instances. Our decomposition, called the Generalized least squares
Matrix Decomposition (GMD),
finds the best low-rank approximation to the data with respect to a
transposable quadratic norm. Adding penalties to the
factors yields the Generalized Penalized Matrix Factorization
(GPMF). We show that the GMD can be used for generalized PCA (GPCA) and the
GPMF for sparse GPCA and functional GPCA. We also outline extensions
of our methodology for statistical applications such as canonical
correlation analysis, non-negative matrix factorization, and linear
discriminant analysis.
Through simulations and examples, we
demonstrate the utility of the GMD and GPMF for dimension reduction,
sparse and functional signal recovery, and feature selection with
high-dimensional
transposable data, including real-data examples from functional MRI.
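
To make the idea of a low-rank approximation under a transposable quadratic norm concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the row and column weight matrices, here named Q and R, are symmetric positive definite, in which case the problem can be reduced to an ordinary SVD of Q^(1/2) X R^(1/2). The function names (sym_sqrt, gmd, qr_norm_sq) and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def sym_sqrt(A):
    """Return the symmetric square root and inverse square root of A.

    Assumes A is symmetric positive definite (an assumption of this sketch).
    """
    w, E = np.linalg.eigh(A)
    root = (E * np.sqrt(w)) @ E.T
    inv_root = (E / np.sqrt(w)) @ E.T
    return root, inv_root

def gmd(X, Q, R, k):
    """Rank-k factors U, d, V with U.T @ Q @ U = I and V.T @ R @ V = I."""
    Qh, Qih = sym_sqrt(Q)
    Rh, Rih = sym_sqrt(R)
    # Ordinary SVD of the transformed data matrix.
    Ut, d, Vh = np.linalg.svd(Qh @ X @ Rh, full_matrices=False)
    # Map the singular vectors back to the original coordinates.
    U = Qih @ Ut[:, :k]
    V = Rih @ Vh[:k].T
    return U, d[:k], V

def qr_norm_sq(A, Q, R):
    """Squared Q,R-norm of A: trace(Q A R A^T)."""
    return np.trace(Q @ A @ R @ A.T)

# Toy data: random X with arbitrary positive-definite row/column weights.
rng = np.random.default_rng(0)
n, p, k = 8, 6, 2
X = rng.standard_normal((n, p))
A = rng.standard_normal((n, n))
B = rng.standard_normal((p, p))
Q = A @ A.T + n * np.eye(n)
R = B @ B.T + p * np.eye(p)

U, d, V = gmd(X, Q, R, k)
X_hat = U @ np.diag(d) @ V.T              # rank-k approximation of X
err = qr_norm_sq(X - X_hat, Q, R)         # approximation error in the Q,R-norm
```

With Q and R set to identity matrices, this sketch reduces to the ordinary truncated SVD, which is the sense in which the decomposition generalizes the SVD. The eigendecomposition-based square roots are used here for clarity and would not scale to the very large Q and R that arise in neuroimaging applications.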