
biometric.soc.01


Sponsoring Section/Society: ENAR

Session Slot: 8:30-10:20 Tuesday

Estimated Audience Size: 125-175

AudioVisual Request: Two Overheads


Session Title: Advances and Applications of Model Averaging and Mixtures


The use of mixtures (of normals, for example) and model averaging forms an important class of tools for analyzing complex data sets and accounting for model uncertainty. This session will describe novel applications of and recent advances in mixture modeling and Bayesian model averaging in biometrics.

Theme Session: No

Applied Session: Yes


Session Organizer: Carroll, Raymond J. Texas A&M University


Address: Department of Statistics Texas A&M University College Station TX 77843-3143

Phone: 409-845-3141

Fax: 409-845-3144

Email: carroll@stat.tamu.edu


Session Timing: 110 minutes total (Sorry about format):

Opening Remarks by Chair - 0 minutes
First Speaker - 25 minutes
Second Speaker - 25 minutes
Third Speaker - 25 minutes
Fourth Speaker - 25 minutes
Floor Discussion - 10 minutes


Session Chair: Berry, Scott Texas A&M University


Address: Department of Statistics Texas A&M University College Station TX 77843-3143

Phone: 409-845-3141

Fax: 409-845-3144

Email: berry@stat.tamu.edu


1. Bayesian Model Averaging and Other Strategies in Case-Control Studies

Raftery, Adrian E.,   University of Washington


Address: Dept of Statistics University of Washington B313 Padelford Hall Box 354322 Seattle WA 98195-4322

Phone: 206-543-4505

Fax: 206-685-7419

Email: raftery@stat.washington.edu

Viallefont, Valerie, INSERM, France

Richardson, Sylvia, INSERM, France

Abstract: Hundreds of observational case-control studies are reported each year in the epidemiological and medical literatures. A key issue in the analysis of such studies is the choice of control variables. Typically, many possible control variables are considered (on the order of several dozen). The most commonly used strategies have been stepwise logistic regression and the two-stage method of Mickey and Greenland (1989). With these strategies, inference about treatment effects (estimates, confidence intervals, tests and P values) is made conditionally on the selected set of control variables. We argue that this may be misleading because it ignores model uncertainty, which can be considerable.

We propose instead the use of Bayesian model averaging (BMA), using a prior distribution for the parameters that is based on a sample of recent studies from the American Journal of Epidemiology. The software GLIB (http://lib.stat.cmu.edu/S/glib) is used (Raftery, 1996, Biometrika). We report on a simulation study designed to mimic the features of commonly encountered studies in the literature. We find that conditional P values from both the stepwise and two-stage methods greatly overstate the evidence for an effect, while BMA posterior probabilities reflect it reasonably accurately. Also, the BMA point estimates have lower mean squared errors than those from either stepwise or two-stage variable selection. We conclude that any method that selects a single model can be misleading, and that it is important to take account of model uncertainty when making inference from case-control studies.


2. Does Particulate Matter Particularly Matter?

Clyde, Merlise,   Duke University


Address: Institute of Statistics and Decision Science Duke University Box 90251 Durham NC 27708-0251

Phone: 919-681-8440

Fax: 919-681-8594

Email: clyde@isds.duke.edu

Abstract: The effect of particulate matter on mortality is currently an important issue, as the U.S. Environmental Protection Agency has proposed new standards. Many of the epidemiological models are based on Poisson regression models for the mortality counts, with independent variables including various daily and lagged meteorological variables and measures of particulate matter. The high correlations among many of the explanatory variables make traditional model selection difficult. The statistical significance of PM10 (particulate matter of aerodynamic diameter less than 10 micrometers), however, depends on which meteorological variables and which PM10 variables are selected. Issues of variable choice and model uncertainty due to variable selection thus can play an important role in making decisions. We use Bayesian model averaging to assess the effect of PM10 on mortality, taking into account model uncertainty about which meteorological and PM10 variables should be included. Because over two hundred candidate variables yield a very large number of possible models, we introduce an approximation to the posterior distribution that allows for efficient Markov chain Monte Carlo sampling of models with high posterior probability from high-dimensional model spaces. We present posterior distributions under model averaging of the effect of PM10 on mortality and of the relative risk, and contrast these results with previous analyses. The methods are applicable to a wide range of generalized linear models where prediction and model uncertainty are important issues.
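
Illustrative note (not part of the abstract): the idea of sampling models from a large model space can be sketched with a generic Metropolis sampler over variable-inclusion indicators. This is not the approximation the abstract refers to, which is not described here. The function names, the one-variable-flip proposal, and the toy log score are all assumptions made for illustration.

```python
import math
import random

def sample_models(log_score, n_vars, n_iter, seed=0):
    """Metropolis sampler over 0/1 variable-inclusion vectors.

    log_score is a hypothetical user-supplied function returning an
    unnormalized log posterior probability for a tuple of indicators.
    Returns a dict mapping each visited model to its visit count.
    """
    rng = random.Random(seed)
    current = tuple([0] * n_vars)          # start from the empty model
    current_score = log_score(current)
    visits = {}
    for _ in range(n_iter):
        # Propose flipping the inclusion of one randomly chosen variable.
        j = rng.randrange(n_vars)
        proposal = current[:j] + (1 - current[j],) + current[j + 1:]
        proposal_score = log_score(proposal)
        # The proposal is symmetric, so accept with probability min(1, ratio).
        accept_prob = math.exp(min(0.0, proposal_score - current_score))
        if rng.random() < accept_prob:
            current, current_score = proposal, proposal_score
        visits[current] = visits.get(current, 0) + 1
    return visits

def inclusion_probabilities(visits, n_vars):
    """Estimate each variable's posterior inclusion probability from visit counts."""
    total = sum(visits.values())
    return [sum(c for m, c in visits.items() if m[j]) / total
            for j in range(n_vars)]

def toy_log_score(model):
    """Toy score (an assumption): favors variable 0, penalizes the others."""
    return 4.0 * model[0] - 1.0 * sum(model[1:])

visits = sample_models(toy_log_score, n_vars=5, n_iter=20000)
print(inclusion_probabilities(visits, 5))
```

In a real analysis the log score would come from the fitted models (for example, an approximate log marginal likelihood plus a log prior), and the visit frequencies would serve as estimates of posterior model probabilities.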


3. Flexible Bayesian and Frequentist Inference Using Mixtures

Wasserman, Larry,   Carnegie Mellon University


Address: Department of Statistics Baker Hall, 229I Carnegie Mellon University Pittsburgh PA 15213

Phone: 412-268-8727

Fax: 412-268-7828

Email: larry@stat.cmu.edu

Carroll, Raymond J., Texas A&M University

Roeder, Kathryn, Carnegie Mellon University

Abstract: Mixtures of Normals models provide a simple, yet flexible family of distributions. First we'll discuss these general issues: problems with improper priors, selecting the number of components, and computational considerations. Then we'll describe a few applications including: density estimation, measurement models and combining studies. Time permitting, we'll also discuss a comparison of sieves (where the number of components grows with sample size) versus infinite mixtures (such as Dirichlet process mixtures).
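
Illustrative note (not part of the abstract): a finite mixture of normals has density f(x) = sum over k of w_k N(x; mu_k, sigma_k^2), with weights w_k that sum to one. A minimal sketch of evaluating and sampling from such a density follows. The function names and the two-component example are assumptions chosen for illustration.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, sds):
    """Density of a finite mixture of normals; weights are assumed to sum to one."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))

def sample_mixture(n, weights, means, sds, seed=0):
    """Draw n values: pick a component by its weight, then draw from that normal."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        draws.append(rng.gauss(means[k], sds[k]))
    return draws

# A bimodal example with two equally weighted components (illustrative values).
weights, means, sds = [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0]
print(mixture_pdf(0.0, weights, means, sds))
print(sample_mixture(3, weights, means, sds))
```

Even two components produce a bimodal shape that no single normal can match, which is the flexibility the abstract points to. Choosing the number of components is the harder question, and the sketch does not address it.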


4. Analyzing the Effects of Haplotype Variation on a Continuous Variable Using Bayesian Model Averaging

Iturria, Stephen,   Texas A&M University


Address: Department of Statistics Texas A&M University College Station TX 77843-3143

Phone: 409-845-3141

Fax: 409-845-3144

Email: iturria@stat.tamu.edu

Abstract: Conventional approaches to model fitting make inferences on model characteristics under the assumption that the chosen model is correct. There can be much uncertainty, however, in the correctness of the selected model. The idea behind Bayesian model averaging (BMA) is that when possible, one should account for this uncertainty when making inferences. When inference on a quantity, Z, common to several competing models is desired, BMA attempts to account for model uncertainty by characterizing our knowledge about Z with a composite of posterior distributions for Z, each computed under a different model. The composite is a weighted average, with weights taken as the posterior probabilities for some subset of the models being considered. In this presentation we describe how one could use BMA to analyze the effect of variation in haplotypes of the X gene on observed levels in Y.
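
Illustrative note (not part of the abstract): the weighted-average composite described above can be written as p(Z | data) = sum over k of p(Z | M_k, data) p(M_k | data). The sketch below computes posterior model probabilities from log marginal likelihoods and then the mean and variance of Z under the composite. It assumes each model's posterior for Z is summarized by a mean and a variance. The function names and inputs are hypothetical, not the speaker's implementation.

```python
import math

def posterior_model_probs(log_marginal_liks, prior_probs=None):
    """Posterior model probabilities, proportional to prior times marginal likelihood.

    If no prior is given, all models are assumed equally likely a priori.
    """
    k = len(log_marginal_liks)
    if prior_probs is None:
        prior_probs = [1.0 / k] * k
    log_unnorm = [l + math.log(p) for l, p in zip(log_marginal_liks, prior_probs)]
    shift = max(log_unnorm)                 # subtract the max for numerical stability
    unnorm = [math.exp(v - shift) for v in log_unnorm]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def bma_mean_and_variance(weights, means, variances):
    """Mean and variance of Z under the model-averaged posterior.

    means[k] and variances[k] are the posterior mean and variance of Z under
    model k; weights[k] is that model's posterior probability.
    """
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (v + m * m)
                        for w, m, v in zip(weights, means, variances))
    return mean, second_moment - mean * mean

# Two hypothetical models whose marginal likelihoods differ by a factor of three.
probs = posterior_model_probs([math.log(3.0), 0.0])
print(probs)
print(bma_mean_and_variance(probs, means=[0.0, 2.0], variances=[1.0, 1.0]))
```

The averaged variance contains a between-model term in addition to the within-model variances, so disagreement among models about Z widens the composite posterior. This is the uncertainty that conditioning on a single selected model ignores.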

List of speakers who are nonmembers: None


David Scott
6/1/1998