Sponsoring Section/Society: WNAR

Session Slot: 2:00-3:50 Tuesday

Estimated Audience Size: xx-xxx

AudioVisual Request: xxx

*Session Title: Advances in Inference Based on Randomization*

Theme Session: No

Applied Session: No

Session Organizer: **Zerbe, Gary**
University of Colorado

Address: University of Colorado Health Sciences Center Campus Box B119 4200 East 9th Ave Denver, CO 80262

Phone: (303) 315-7608

Fax: (303) 315-3183

Email: Gary.Zerbe@UCHSC.edu

Session Timing: 110 minutes total:

Opening Remarks by Chair - 0 minutes
First Speaker - 30 minutes
Second Speaker - 30 minutes
Third Speaker - 30 minutes
Discussant - 10 minutes
Floor Discussion - 10 minutes

Session Chair: **Zerbe, Gary**
University of Colorado

Address: University of Colorado Health Sciences Center Campus Box B119 4200 East 9th Ave Denver, CO 80262

Phone: (303) 315-7608

Fax: (303) 315-3183

Email: Gary.Zerbe@UCHSC.edu

*1. Comparing Sample Means And Variances*

**Manly, Bryan**,
University of Otago

Address: Mathematics and Statistics Department University of Otago, Dunedin, New Zealand

Phone: 64-3-479-7775

Fax: 64-3-479-8427

Email: bmanly@maths.otago.ac.nz

**Francis, Chris**,
National Institute of Water and Atmospheric Research,
New Zealand

Abstract: A common problem is that two or more samples need to be compared in terms of differences in mean and variation. What is desirable is a test to compare mean values that allows the samples to be from sources with different amounts of variation, and a test to compare the variation in the samples that allows the samples to be from sources with different means. Such tests, based on randomization inference, are available and have good properties for samples from distributions that are not too far from normal (e.g., normal distributions themselves, uniform distributions, and exponential distributions), but their behavior deteriorates badly for the extreme distributions that often occur in biological settings (e.g., with a high proportion of zero values, some moderate values, and a few extremely large values). For these extreme distributions, simulation shows that it is sometimes not possible to separate mean differences from variation differences.

Given this unfortunate situation, one approach is to define a condition number for a set of data, based on the sample skewness and kurtosis, such that for data with a low condition number the tests for means and variation are reliable, while for data with a high condition number these tests are not reliable. For data with a high condition number it is still possible to test for evidence of some sample differences, without being able to specify the exact nature of these differences. In a court of law, for example, this would at least be better than not knowing whether the tests on means and variation are reliable.

In this talk these ideas will be discussed in terms of the definition of the condition number and the various tests that should be used. The results of a simulation study will be presented to justify the conclusions reached.
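As a rough illustration of the kind of randomization tests the abstract refers to, the sketch below compares two samples by reallocating observations at random. It is not the authors' procedure: the test statistics (absolute difference in means, and a Levene-type difference in mean absolute deviations from each sample's own median), the function names, and the number of random allocations are all assumptions made for the example. The plain mean test shown here assumes exchangeable observations and so does not by itself allow for unequal variation, and the condition number discussed in the talk is not defined in the abstract and is not attempted.

```python
import random
from statistics import mean, median


def abs_mean_difference(a, b):
    """Absolute difference between two sample means."""
    return abs(mean(a) - mean(b))


def randomization_p_value(a, b, statistic=abs_mean_difference,
                          n_random=4999, seed=0):
    """Monte Carlo randomization p-value for a two-sample statistic.

    Observations are reallocated to the two samples at random, keeping
    the original sample sizes. The observed allocation is counted as one
    member of the reference set (hypothetical helper, not from the talk).
    """
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    observed = statistic(a, b)
    count = 1  # the observed allocation itself
    for _ in range(n_random):
        rng.shuffle(pooled)
        value = statistic(pooled[:len(a)], pooled[len(a):])
        if value >= observed - 1e-12:  # small tolerance for float ties
            count += 1
    return count / (n_random + 1)


def abs_deviations_from_median(sample):
    """Absolute deviations of each value from its own sample median."""
    centre = median(sample)
    return [abs(x - centre) for x in sample]


def variation_p_value(a, b, **kwargs):
    """Levene-type randomization test for a difference in variation.

    Each sample is centred on its own median first, so a difference in
    location between the samples does not drive the result.
    """
    return randomization_p_value(abs_deviations_from_median(a),
                                 abs_deviations_from_median(b),
                                 **kwargs)
```

For example, with `low = [1, 2, 3, 4, 5]` and `high = [101, 102, 103, 104, 105]`, `randomization_p_value(low, high)` is small because the means differ greatly, while `variation_p_value(low, high)` returns 1.0 because the two samples have identical spread about their medians.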

*2. Theory of Randomization Tests*

**Edgington, Eugene**,
University of Calgary

Address: Department of Psychology University of Calgary Calgary, Alberta T2N 1N4 Canada

Phone: 403-234-9607

Fax: 403-282-8249

Email: eedgingt@acs.ucalgary.ca

Abstract: Randomization tests are permutation tests for testing hypotheses about experimental treatment effects and require experimental randomization (random assignment), whereas permutation tests for testing hypotheses about populations in nonexperimental research are based on random sampling. Despite the difference in the type of hypothesis tested, the same computational procedures, and therefore the same computer programs, may be used for determining significance in both applications. It is not surprising, therefore, that much of randomization test theory is general permutation test theory that is relevant to both experimental and nonexperimental applications.

The validity of widely differing methods of significance determination, which may yield somewhat different p-values, can be demonstrated by showing that the different reference sets for determining significance each have a property called "stochastic closure." Stochastic closure ensures the validity of a reference set whether that set is produced systematically, randomly, or semi-randomly. That is true of permutation tests based on random sampling as well as randomization tests, which are based on random assignment. The choice of one method over another may be made on the basis of which is simpler, faster, or otherwise better for computer programming. Alternatively, preference could be based on relative power or, say, on whether a random reference set is the only one possible for the experimental design.

It can be shown that subjects (or other experimental units) may be selected in any systematic (nonrandom) manner to fit the needs of the experimenter without threatening the validity of randomization tests, a fact of considerable importance. Equally important is the flexibility permitted in assigning those units: randomization tests can be developed to accommodate any experimental design in which there is some form of random assignment of experimental units. The development of new experimental designs is thereby facilitated, the value of which has been well illustrated in recent years by the success of single-subject randomized clinical trials.
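The distinction between a systematically produced and a randomly produced reference set can be sketched in a few lines. The example below is only an illustration under stated assumptions: a completely randomized two-treatment design, a one-sided difference in means as the test statistic, and hypothetical function names. It does not attempt to demonstrate stochastic closure itself.

```python
import random
from itertools import combinations


def mean_diff(values, treated_idx):
    """Mean of treated units minus mean of control units."""
    treated = [v for i, v in enumerate(values) if i in treated_idx]
    control = [v for i, v in enumerate(values) if i not in treated_idx]
    return sum(treated) / len(treated) - sum(control) / len(control)


def systematic_p_value(values, treated_idx):
    """One-sided p-value from the full reference set.

    Every possible assignment of the same number of units to treatment
    is enumerated, including the assignment actually used.
    """
    observed = mean_diff(values, set(treated_idx))
    units = range(len(values))
    stats = [mean_diff(values, set(c))
             for c in combinations(units, len(treated_idx))]
    return sum(s >= observed - 1e-12 for s in stats) / len(stats)


def random_p_value(values, treated_idx, n_random=999, seed=0):
    """One-sided p-value from a randomly generated reference set.

    The observed assignment is counted as one member of the set, so the
    p-value can never be smaller than 1 / (n_random + 1).
    """
    rng = random.Random(seed)
    observed = mean_diff(values, set(treated_idx))
    units = range(len(values))
    count = 1  # the observed assignment itself
    for _ in range(n_random):
        idx = set(rng.sample(units, len(treated_idx)))
        if mean_diff(values, idx) >= observed - 1e-12:
            count += 1
    return count / (n_random + 1)
```

With six units measured as `[12, 15, 11, 19, 22, 25]` and the last three assigned to treatment, `systematic_p_value` enumerates all 20 assignments and returns 1/20 = 0.05, since the observed assignment gives the largest difference. `random_p_value` approximates the same quantity and may differ slightly, which matches the abstract's remark that different methods can yield somewhat different p-values.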

*3. An Empirical Comparison of Permutation Methods for Tests of Partial Regression Coefficients in a Linear Model*

**Legendre, Pierre**,
University of Montreal

Address: Department of Biological Sciences University of Montreal C.P. 6128, succ. Centre-Ville Montreal, Quebec H3C 3J7 Canada

Phone: 514-343-7591

Fax: 514-343-2293

Email: legendre@ERE.UMontreal.CA

**Anderson, Marti**,
University of Montreal

Abstract: This study uses simulations to compare the empirical type I error and power of different permutation techniques for a test of significance of a single partial regression coefficient in a multiple regression model. The methods compared were permutation of raw data values, permutation of residuals under a null (reduced) model, and permutation of residuals under the full model. We investigated the effects of changes in 1) sample size, 2) degree of collinearity between the predictor variables, 3) size of the parameter not being tested, and 4) distribution of the added random error term. We obtained several replicate sets of simulations so that statistical comparisons of error and power could be made. For permutation of raw data, the type I error was maintained at the chosen level of significance (α = 0.05) only when the parameter not being tested was equal to zero. Two methods that had been identified as equivalent formulations of permutation under the null model were found to be different, with one formulation giving consistently inflated type I error, while the other gave type I error close to nominal significance levels. Permutation of residuals under the null model had type I error closer to α than permutation under the full model at very small sample sizes (n = 9). We detected slightly better power using permutation under the full model, compared to permutation under the null model, with increases in the size of the parameter being tested. All methods suffered reductions in power with increases in collinearity of the predictor variables. In general, the use of permutation of raw data for tests of partial regression coefficients cannot be recommended. The two model-based permutation methods behaved similarly, generally maintaining quite good level accuracy and comparable power.
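The three permutation strategies named in the abstract can be sketched for the simplest case of one tested predictor `x` and one covariate `z`. This is a minimal illustration, not the authors' simulation code. The function names are hypothetical, the test statistic is the absolute partial correlation (a monotone function of the squared t-statistic for the coefficient of `x`), and only one formulation of permutation under the null model is shown, namely adding permuted reduced-model residuals back to the reduced-model fitted values.

```python
import random


def fit_simple(y, z):
    """Least-squares fit of y on z with intercept: (fitted, residuals)."""
    n = len(y)
    ybar, zbar = sum(y) / n, sum(z) / n
    zc = [v - zbar for v in z]
    slope = sum(a * (b - ybar) for a, b in zip(zc, y)) / sum(a * a for a in zc)
    fitted = [ybar + slope * a for a in zc]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, resid


def abs_partial_corr(y, x, z):
    """|partial correlation| of y and x after removing z from both."""
    _, ry = fit_simple(y, z)
    _, rx = fit_simple(x, z)
    num = sum(a * b for a, b in zip(ry, rx))
    den = (sum(a * a for a in ry) * sum(b * b for b in rx)) ** 0.5
    return abs(num / den)


def permutation_p_value(y, x, z, method="reduced", n_perm=999, seed=0):
    """Permutation p-value for the coefficient of x in y ~ x + z."""
    rng = random.Random(seed)
    observed = abs_partial_corr(y, x, z)
    if method == "raw":
        # Permute the raw response values.
        base, pool = [0.0] * len(y), list(y)
    elif method == "reduced":
        # Permute residuals of y ~ z and add them to its fitted values.
        base, pool = fit_simple(y, z)
    elif method == "full":
        # Permute residuals of the full model y ~ x + z. Testing the
        # permuted coefficient against the estimated one is equivalent
        # to using the permuted residuals alone as the new response.
        _, ry = fit_simple(y, z)
        _, rx = fit_simple(x, z)
        b = sum(a * c for a, c in zip(ry, rx)) / sum(c * c for c in rx)
        base, pool = [0.0] * len(y), [a - b * c for a, c in zip(ry, rx)]
    else:
        raise ValueError("method must be 'raw', 'reduced' or 'full'")
    count = 1  # the observed data count as one member of the reference set
    for _ in range(n_perm):
        rng.shuffle(pool)
        y_star = [f + e for f, e in zip(base, pool)]
        if abs_partial_corr(y_star, x, z) >= observed - 1e-12:
            count += 1
    return count / (n_perm + 1)
```

Calling `permutation_p_value(y, x, z, method=m)` for each of `"raw"`, `"reduced"` and `"full"` on the same data gives three p-values that can be compared directly. Repeating this over many simulated data sets is the kind of comparison of type I error and power the abstract describes.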

List of speakers who are nonmembers: Bryan Manly