Aircraft manufacturers are understandably interested in the
durability of their products. As one part of a larger investigation,
they looked at how metal joint fasteners would hold up under stress.
At each of ten load levels, a batch of fasteners was tested, and
the number failing out of the number tested was recorded. We wish
to model the probability of failure as a function of load using logistic
regression.
First, the data:
Load Level    Number Tested    Number Failing
   2500             50               10
   2700             70               17
   2900            100               30
   3100             60               21
   3300             40               18
   3500             85               43
   3700             90               54
   3900             50               33
   4100             80               60
   4300             65               51
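As a quick check on the table, the sample proportions (number failing divided by number tested) can be computed directly. This is a small illustrative sketch in Python rather than S-PLUS; the variable names are made up for this example.

```python
# Data transcribed from the table above.
load = [2500, 2700, 2900, 3100, 3300, 3500, 3700, 3900, 4100, 4300]
tested = [50, 70, 100, 60, 40, 85, 90, 50, 80, 65]
failing = [10, 17, 30, 21, 18, 43, 54, 33, 60, 51]

# Sample proportion failing at each load level.
props = [f / t for f, t in zip(failing, tested)]

# Print one row per load level: load, tested, failing, proportion.
for x, t, f, p in zip(load, tested, failing, props):
    print(f"{x:>6} {t:>6} {f:>6} {p:8.4f}")
```

The proportions increase steadily with load, which is what the fitted curves should track.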
For these data, we wish to fit an appropriate binary regression model.
You should try a few different link functions: logit, probit,
cloglog, and inverse cloglog. Plot the resulting fitted proportion curves
against the sample proportions (I'm picturing a 2 by 2 grid of plots
here, one curve per plot, with the raw proportions indicated by dots
or stars or something). You should supply the fitted beta values for
each model, and the deviance for each model. Are the betas significant?
Which type of link function achieves the best fit?
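One possible way to organize the fit is iteratively reweighted least squares (IRLS), written out by hand. The sketch below is in Python rather than S-PLUS and is only an illustration under stated assumptions: the names LINKS, deviance, and fit are hypothetical, "inverse cloglog" is read here as the log-log link p = exp(-exp(-eta)), and starting values come from mapping the shrunken proportions (y + 0.5)/(n + 1) to the link scale. It reports the betas, their approximate standard errors, Wald z statistics, and the deviance for each link.

```python
import math
from statistics import NormalDist

# Data from the table above.
load = [2500, 2700, 2900, 3100, 3300, 3500, 3700, 3900, 4100, 4300]
n = [50, 70, 100, 60, 40, 85, 90, 50, 80, 65]
y = [10, 17, 30, 21, 18, 43, 54, 33, 60, 51]

_norm = NormalDist()

# Each link: (g(p) = eta, inverse h(eta) = p, derivative dp/deta).
LINKS = {
    "logit": (
        lambda p: math.log(p / (1 - p)),
        lambda e: 1 / (1 + math.exp(-e)),
        lambda e: math.exp(-e) / (1 + math.exp(-e)) ** 2,
    ),
    "probit": (_norm.inv_cdf, _norm.cdf, _norm.pdf),
    "cloglog": (
        lambda p: math.log(-math.log(1 - p)),
        lambda e: 1 - math.exp(-math.exp(e)),
        lambda e: math.exp(e - math.exp(e)),
    ),
    # Assumption: "inverse cloglog" is interpreted as the log-log link.
    "loglog": (
        lambda p: -math.log(-math.log(p)),
        lambda e: math.exp(-math.exp(-e)),
        lambda e: math.exp(-e - math.exp(-e)),
    ),
}


def deviance(p):
    """Binomial deviance of fitted probabilities p against (y, n)."""
    d = 0.0
    for ni, yi, pi in zip(n, y, p):
        if yi > 0:
            d += yi * math.log(yi / (ni * pi))
        if yi < ni:
            d += (ni - yi) * math.log((ni - yi) / (ni * (1 - pi)))
    return 2 * d


def fit(link, max_iter=50, tol=1e-10):
    g, h, dh = LINKS[link]
    # Start on the link scale from shrunken sample proportions; the first
    # weighted regression below then produces the starting betas.
    eta = [g((yi + 0.5) / (ni + 1)) for ni, yi in zip(n, y)]
    dev_old = float("inf")
    for _ in range(max_iter):
        p = [h(e) for e in eta]
        d = [dh(e) for e in eta]
        # IRLS weights and working response (ystar).
        w = [ni * di ** 2 / (pi * (1 - pi)) for ni, di, pi in zip(n, d, p)]
        z = [e + (yi / ni - pi) / di
             for e, yi, ni, pi, di in zip(eta, y, n, p, d)]
        # Weighted least squares of z on (1, load): 2x2 normal equations.
        s0 = sum(w)
        s1 = sum(wi * x for wi, x in zip(w, load))
        s2 = sum(wi * x * x for wi, x in zip(w, load))
        t0 = sum(wi * zi for wi, zi in zip(w, z))
        t1 = sum(wi * x * zi for wi, x, zi in zip(w, load, z))
        det = s0 * s2 - s1 * s1
        b0 = (s2 * t0 - s1 * t1) / det
        b1 = (s0 * t1 - s1 * t0) / det
        eta = [b0 + b1 * x for x in load]
        dev = deviance([h(e) for e in eta])
        if abs(dev - dev_old) < tol * (abs(dev) + 0.1):
            break
        dev_old = dev
    # Standard errors from the diagonal of (X'WX)^{-1}, using the weights
    # of the final iteration.
    se = (math.sqrt(s2 / det), math.sqrt(s0 / det))
    return {"beta": (b0, b1), "se": se, "deviance": dev}


for name in LINKS:
    r = fit(name)
    wald = [b / s for b, s in zip(r["beta"], r["se"])]
    print(name, r["beta"], r["se"], wald, round(r["deviance"], 3))
```

A Wald z statistic larger than about 1.96 in absolute value suggests significance at the 5% level, and the link with the smallest deviance fits best among the four, since all four models have the same number of parameters.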
I would like to see commented code here. You are not allowed to
use Splus' GLM routines for binary data as your code, but you may
use them to check your answers.
Indicate how you are getting your starting beta values,
and give the algebraic form of your W and ystar entries.
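For reference, here is one standard way to write the IRLS quantities for grouped binomial data. The notation is an assumption of this sketch, not something fixed by the assignment: y_i failures out of n_i at load x_i, linear predictor eta_i = beta_0 + beta_1 x_i, fitted probability p_i = h(eta_i) with h the inverse link, and "log-log" standing in for the inverse cloglog link.

```latex
% General form of the diagonal weights and the working response
W_{ii} = \frac{n_i \,[h'(\eta_i)]^2}{p_i (1 - p_i)},
\qquad
y^*_i = \eta_i + \frac{y_i / n_i - p_i}{h'(\eta_i)},
\qquad
\beta^{\text{new}} = (X^T W X)^{-1} X^T W y^*

% Derivative h'(\eta) for each link
\text{logit:}   \quad h'(\eta) = p(1 - p)
\quad\Rightarrow\quad W_{ii} = n_i p_i (1 - p_i), \;\;
y^*_i = \eta_i + \frac{y_i - n_i p_i}{n_i p_i (1 - p_i)}

\text{probit:}  \quad h'(\eta) = \phi(\eta)

\text{cloglog:} \quad h'(\eta) = (1 - p)\, e^{\eta}

\text{log-log:} \quad h'(\eta) = p \, e^{-\eta}
```

A common choice of starting point is to set p_i to a slightly shrunken sample proportion such as (y_i + 0.5)/(n_i + 1), compute eta_i from the link, and let the first weighted regression produce the initial betas.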