Data Modeling, Quantile/Quartile Functions, Confidence Intervals, Introductory Statistics Reform

Emanuel Parzen
Texas A&M University

  The history of statistics is alive to me as I fondly recall my
interactions with Erich Lehmann since receiving my Ph.D. at Berkeley
in 1953. The 2003 Economics Nobel Prize (awarded for fundamental
research in statistical time series analysis) reminds me of my
joking complaint to diverse applied researchers: "Why do you call
it theory if I know it and applied research when you practice
it?" I have continued to learn a lot about Quantiles and
Nonparametric Data Modeling since my 1979 JASA paper. New methods
have been developed (some applied researchers consider them a
gold mine). Quantile data modeling is not practiced by most
statisticians, who limit themselves to the sample median Q2, the
interquartile range IQR, and Q-Q probability plots. To estimate and test a
parameter µ one starts with the natural estimator µ^ and defines a
statistic T(µ,µ^) that is an increasing function of µ and whose
distribution (when µ is the true parameter) equals the distribution of a
random variable T (usually Normal(0,1), Student, or inverse average
chi-square). To test H_0: µ=µ_0 one computes or bounds
P-value(µ_0)=F_T(observed T(µ_0,µ^)); as a function of µ_0 it is a
distribution function (whose probability density one could derive).
Define its inverse µ^(u) by Q_T(u)=T(µ^(u),µ^); µ^(u), called the
parameter with P-value u, is a quantile function which has a
pseudo-Bayesian interpretation as the conditional quantile of µ given
the data. The conventional confidence interval with confidence level 1-a
can be shown to be µ^(a/2)<µ<µ^(1-(a/2)). A table of µ^(u) for many
conventional statistical problems (for u or 1-u equal to .05, .025, .01,
.005) can teach introductory statistics in fewer words and closer to the
frontier of modern statistical thinking (including quantile-based Bayesian credible
intervals and the bootstrap). For a random variable X with quantile
function Q(u), define (and plot on the same graph as the exponential and
normal) the informative quantile/quartile function
Q/Q(u)=(Q(u)-midquartile)/(2 IQR).
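
The quantile/quartile function just defined can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
the midquartile is taken to be (Q(.25)+Q(.75))/2, the IQR to be
Q(.75)-Q(.25), and the denominator is read as 2 times IQR. The function
names, the interpolation rule for sample quantiles, and the small data
set are hypothetical and are not taken from the talk.

```python
import math
from statistics import NormalDist


def qq_function(Q, u):
    """Quantile/quartile function Q/Q(u) = (Q(u) - midquartile) / (2 * IQR)."""
    q1, q3 = Q(0.25), Q(0.75)
    midquartile = 0.5 * (q1 + q3)  # assumed: average of the two quartiles
    iqr = q3 - q1                  # interquartile range
    return (Q(u) - midquartile) / (2.0 * iqr)


def normal_Q(u):
    """Quantile function of Normal(0,1)."""
    return NormalDist().inv_cdf(u)


def exponential_Q(u):
    """Quantile function of the exponential distribution with mean 1."""
    return -math.log(1.0 - u)


def sample_Q(data):
    """Sample quantile function by linear interpolation of order statistics.

    This is one of several possible definitions of a sample quantile and is
    an assumption of this sketch.
    """
    xs = sorted(data)
    n = len(xs)

    def Q(u):
        h = (n - 1) * u
        lo = int(math.floor(h))
        hi = min(lo + 1, n - 1)
        return xs[lo] + (h - lo) * (xs[hi] - xs[lo])

    return Q


if __name__ == "__main__":
    data = [0.3, 0.9, 1.1, 1.8, 2.4, 3.7, 5.2, 8.6]  # made-up sample
    Q_data = sample_Q(data)
    print("   u    data  exponential  normal")
    for u in (0.05, 0.25, 0.5, 0.75, 0.95):
        print(f"{u:4.2f} {qq_function(Q_data, u):7.3f} "
              f"{qq_function(exponential_Q, u):12.3f} "
              f"{qq_function(normal_Q, u):7.3f}")
```

By construction Q/Q(.25)=-.25 and Q/Q(.75)=.25 for every distribution,
so curves plotted on the same graph pass through these two points and
differ mainly in the middle and the tails.
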
The talk could also discuss confidence Q-Q plots, conditional quantiles,
comparison distributions, the mid-distribution, and the definitions of
sample quantiles, linear rank statistics, and the sample variance.
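
The parameter quantile function µ^(u) described in the abstract can be
illustrated for the simplest case. The sketch below assumes a normal
mean with known standard deviation sigma and sample size n, so that
T(µ,µ^)=sqrt(n)(µ-µ^)/sigma is increasing in µ and is Normal(0,1) when µ
is the true parameter. The function names and the numbers in the usage
lines are hypothetical; they are not from the talk.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # reference distribution of T: Normal(0,1)


def T(mu, mu_hat, sigma, n):
    """Statistic T(mu, mu_hat) = sqrt(n) * (mu - mu_hat) / sigma, increasing in mu."""
    return math.sqrt(n) * (mu - mu_hat) / sigma


def p_value(mu0, mu_hat, sigma, n):
    """P-value(mu0) = F_T(observed T(mu0, mu_hat)); a distribution function of mu0."""
    return Z.cdf(T(mu0, mu_hat, sigma, n))


def parameter_quantile(u, mu_hat, sigma, n):
    """mu_hat(u), the parameter with P-value u: solves Q_T(u) = T(mu_hat(u), mu_hat)."""
    return mu_hat + sigma * Z.inv_cdf(u) / math.sqrt(n)


def confidence_interval(a, mu_hat, sigma, n):
    """Level 1-a interval mu_hat(a/2) < mu < mu_hat(1 - a/2)."""
    return (parameter_quantile(a / 2, mu_hat, sigma, n),
            parameter_quantile(1 - a / 2, mu_hat, sigma, n))


if __name__ == "__main__":
    mu_hat, sigma, n = 10.0, 2.0, 16  # made-up illustrative values
    for u in (0.005, 0.01, 0.025, 0.05, 0.95, 0.975, 0.99, 0.995):
        print(f"u={u:5.3f}  mu_hat(u)={parameter_quantile(u, mu_hat, sigma, n):7.3f}")
    print("95% interval:", confidence_interval(0.05, mu_hat, sigma, n))
```

In this case µ^(u) is µ^ plus sigma/sqrt(n) times the Normal(0,1)
quantile Q_T(u), and the interval returned for a=.05 coincides with the
familiar z-interval for a mean.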