We give an example of ``model selection'' in multiple
regression. Here, ``model selection'' means choosing,
from a relatively large collection of independent variables,
which ones to keep (with nonzero coefficients) in the
regression equation. The subject is discussed in the
text (Section 11.10, pp. 531--539). We use both
methods discussed there (Stepwise and
) and
introduce a third method: Cross-Validation.
Model selection is necessary because including too many
variables increases the uncertainty (variance) of
the estimated coefficients. This is especially true
when new predictors are computed from given predictors
(e.g., when adding squared, cubed, or higher powers of
given predictor variables so as to fit polynomial terms),
since such derived predictors are often highly correlated
with the predictors they were computed from.
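
The effect of extra polynomial terms on coefficient uncertainty can be
illustrated with a small simulation. The following is only a sketch
under stated assumptions: the data are simulated (not taken from the
text), the true relationship is assumed to be linear, and the helper
name coef_std_errors, the sample size, the noise level, and the
polynomial degree are all hypothetical choices made for illustration.
It fits ordinary least squares twice, once with x alone and once with
powers of x up to degree 5, and compares the estimated standard error
of the coefficient on x.

```python
import numpy as np

def coef_std_errors(X, y):
    """Fit OLS and return (coefficients, estimated standard errors)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)          # unbiased error variance estimate
    se = np.sqrt(sigma2 * np.diag(XtX_inv))   # standard error of each coefficient
    return beta, se

# Simulated data (an assumption for illustration): y is truly linear in x.
rng = np.random.default_rng(0)
n = 50
x = rng.uniform(0.0, 1.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

# Design matrices: intercept + x, versus intercept + x + x^2 + ... + x^5.
X_small = np.column_stack([np.ones(n), x])
X_large = np.column_stack([x**k for k in range(6)])

_, se_small = coef_std_errors(X_small, y)
_, se_large = coef_std_errors(X_large, y)

print("SE of coefficient on x, linear model:  ", se_small[1])
print("SE of coefficient on x, degree-5 model:", se_large[1])
```

Because the powers of x are strongly correlated with one another, the
standard error of the coefficient on x should come out much larger in
the degree-5 fit, even though the extra terms are not needed to
describe these data.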