Finding Outliers in Regression Data
Abstract:
Robust regression algorithms are effective at
reducing the effects of outliers in multivariate
regression. M-estimators in particular rely on
an influence function, such as Tukey's biweight,
to downweight large residuals. Choosing an
appropriate value for the robust scale in the influence
function is critical to successful application.
Alternatively, the influence function can be
specified indirectly by (1) assuming a specific
parametric form for the distribution of the residuals
(normal with zero mean, for example) and (2) using a
minimum-distance fitting criterion in place of
least squares. We examine the "natural"
influence functions that result and consider
several regression problems with outliers.
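
As a rough illustration of the two routes described above, the sketch below
places Tukey's biweight influence function next to the influence-type function
implied by a minimum-distance criterion. It assumes the criterion is the
integrated square error between a N(0, sigma^2) residual density and the data,
which the abstract does not state explicitly. The function names, the tuning
constant c = 4.685 (a conventional choice), and the sample residuals are
illustrative assumptions, not taken from the paper.

```python
import math


def tukey_biweight_psi(r, c=4.685):
    """Tukey's biweight influence function.

    r is a residual already divided by a robust scale estimate;
    c is a tuning constant (4.685 is a conventional choice, assumed here).
    """
    if abs(r) >= c:
        return 0.0  # residuals beyond c have no influence on the fit
    u = r / c
    return r * (1.0 - u * u) ** 2


def l2e_criterion(residuals, sigma):
    """Integrated-square-error criterion for a N(0, sigma^2) residual model.

    Up to a constant that does not depend on the fit, this is
    integral(f^2) - (2/n) * sum f(r_i), with f the normal density.
    """
    n = len(residuals)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    mean_density = sum(
        norm * math.exp(-0.5 * (r / sigma) ** 2) for r in residuals
    ) / n
    return 1.0 / (2.0 * sigma * math.sqrt(math.pi)) - 2.0 * mean_density


def l2e_psi(r, sigma):
    """Influence-type function implied by the criterion above.

    Differentiating the criterion with respect to the regression
    coefficients weights each residual by r * exp(-r^2 / (2 sigma^2)),
    up to a positive constant, so large residuals are downweighted
    without a separately chosen cutoff.
    """
    return r * math.exp(-0.5 * (r / sigma) ** 2)


if __name__ == "__main__":
    # Hypothetical residuals: three small ones and one gross outlier.
    residuals = [0.1, -0.2, 0.05, 50.0]
    print(l2e_criterion(residuals, sigma=1.0))
    for r in (0.5, 1.0, 3.0, 10.0):
        print(r, tukey_biweight_psi(r), l2e_psi(r, sigma=1.0))
```

Both functions redescend toward zero for large residuals. In this sketch the
biweight reaches exactly zero at the cutoff c, while the function implied by the
normal assumption decays smoothly at a rate set by sigma.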
Key Words: Minimum distance estimation
M-estimation
Influence function
Integrated square error