
Best Subsets

The best subsets regression results using the original origin variable are:

Best Subsets Regression


Response is mpg

                                      c d           
                                      y i           
                                      l s   w     o 
                                      i p   e     r 
                                      n l   i   y i 
                                      d a   g a e g 
              Adj.                    e c h h c a i 
Vars   R-Sq   R-Sq    C-p         s   r e p t c r n 

   1   69.3   69.2  273.2    4.3327         X       
   1   64.8   64.7  368.7    4.6351     X           
   2   80.8   80.7   26.6    3.4272         X   X   
   2   74.1   74.0  171.4    3.9835     X       X   
   3   81.7   81.6    8.7    3.3476         X   X X 
   3   80.9   80.7   27.7    3.4276         X X X   
   4   81.8   81.6    9.3    3.3460     X   X   X X 
   4   81.8   81.6    9.3    3.3461         X X X X 
   5   82.0   81.8    7.1    3.3325     X X X   X X 
   5   82.0   81.8    7.4    3.3339     X   X X X X 
   6   82.1   81.8    6.7    3.3262   X X X X   X X 
   6   82.1   81.8    7.5    3.3299   X X   X X X X 
   7   82.1   81.8    8.0    3.3277   X X X X X X X
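
For reading the columns, it may help to recall the usual textbook definitions, which we assume (without having checked) are the ones the software uses. Here $n$ is the number of observations, $p$ the number of fitted coefficients including the intercept, $SSE_p$ the residual sum of squares of the subset model, $SST$ the total sum of squares, and $MSE_{full}$ the mean squared error of the model with all variables:

```latex
\begin{align*}
R^2           &= 1 - \frac{SSE_p}{SST}, \\
R^2_{adj}     &= 1 - \frac{SSE_p/(n-p)}{SST/(n-1)}, \\
C_p           &= \frac{SSE_p}{MSE_{full}} - (n - 2p), \\
s             &= \sqrt{\frac{SSE_p}{n-p}}.
\end{align*}
```

Under these definitions the model with all variables has $C_p = p$ exactly, which agrees with the last row above (7 variables plus an intercept, C-p of 8.0).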

After recoding the origin variable, the results are:

Best Subsets Regression


Response is mpg

                                      c d             
                                      y i             
                                      l s   w         
                                      i p   e       j 
                                      n l   i   y e a 
                                      d a   g a e u p 
              Adj.                    e c h h c a r a 
Vars   R-Sq   R-Sq    C-p         s   r e p t c r o n 

   1   69.3   69.2  281.6    4.3327         X         
   1   64.8   64.7  378.4    4.6351     X             
   2   80.8   80.7   31.9    3.4272         X   X     
   2   74.1   74.0  178.6    3.9835     X       X     
   3   81.2   81.1   25.1    3.3952         X   X   X 
   3   81.1   80.9   28.8    3.4106         X   X X   
   4   81.9   81.7   12.3    3.3374         X   X X X 
   4   81.3   81.1   25.4    3.3926         X X X   X 
   5   82.1   81.8   10.7    3.3268     X   X   X X X 
   5   81.9   81.7   13.4    3.3380         X X X X X 
   6   82.3   82.0    8.1    3.3113     X X X   X X X 
   6   82.3   82.0    8.6    3.3136     X   X X X X X 
   7   82.4   82.1    7.6    3.3050   X X X X   X X X 
   7   82.3   82.0    8.8    3.3098   X X   X X X X X 
   8   82.4   82.1    9.0    3.3065   X X X X X X X X
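
The recoding that produces the euro and japan columns can be sketched as follows. This is only an illustration: the numeric coding of origin (1 = USA, 2 = Europe, 3 = Japan) is an assumption that is not stated in the text, and the function name `recode_origin` is hypothetical.

```python
# Assumed coding of origin (not given in the text): 1 = USA, 2 = Europe, 3 = Japan.
EURO_CODE = 2
JAPAN_CODE = 3


def recode_origin(origin):
    """Return (euro, japan) lists of 0/1 indicators for a list of origin codes.

    The remaining category (code 1 under the assumed coding) is the baseline
    and gets 0 in both indicators.
    """
    euro = [1 if code == EURO_CODE else 0 for code in origin]
    japan = [1 if code == JAPAN_CODE else 0 for code in origin]
    return euro, japan


# Made-up origin codes, for illustration only.
origin = [1, 3, 2, 1, 2, 3]
euro, japan = recode_origin(origin)
print(euro)   # [0, 0, 1, 0, 1, 0]
print(japan)  # [0, 1, 0, 0, 0, 1]
```

Using two indicators instead of one three-valued numeric variable lets each region have its own shift in mpg, rather than forcing the regions onto an evenly spaced scale.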

Best subsets reports the two best variable subsets of each size; for the model with all variables there is only one subset. Note how, in the first four rows, weight and displacement replace each other, with weight the better of the two. Heavier cars tend to have larger engines, but car weight is the better predictor. year and the origin variables enter next as the best predictors, with the remaining variables coming in later. The best value of $C_p$ is $7.6$, for the seven-variable model that, as tabulated, omits only acc (acceleration); the best six-variable model, with $C_p$ of $8.1$, omits both cylinder (number of cylinders) and acc. Both of these may well be good predictors on their own (I would certainly expect an 8 cylinder engine to be less fuel efficient than a 4 cylinder engine), but they are presumably left out because other variables that are highly correlated with them (e.g. displacement) do a better job of predicting.
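
To make the procedure concrete, here is a minimal sketch of a best subsets search ranked by residual sum of squares, with Mallows' $C_p$ computed for each subset. It is not the routine that produced the tables above: it assumes the textbook formula $C_p = SSE_p / MSE_{full} - (n - 2p)$, uses NumPy for the least-squares fits, and runs on synthetic data in which a `displace` column is deliberately correlated with `weight`. All function and variable names are hypothetical.

```python
from itertools import combinations

import numpy as np


def sse(X, y):
    """Residual sum of squares of a least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def best_subsets(X, y, names, keep=2):
    """Return (size, C_p, variable names) for the `keep` best subsets of each size."""
    n, k = X.shape
    mse_full = sse(X, y) / (n - k - 1)  # error variance estimate from the full model
    results = []
    for size in range(1, k + 1):
        fits = []
        for cols in combinations(range(k), size):
            s = sse(X[:, list(cols)], y)
            p = size + 1  # coefficients including the intercept
            cp = s / mse_full - (n - 2 * p)
            fits.append((s, cp, [names[c] for c in cols]))
        fits.sort(key=lambda f: f[0])  # smallest SSE first within each size
        results.extend((size, cp, chosen) for _, cp, chosen in fits[:keep])
    return results


# Synthetic data for illustration only; not the mpg data set.
rng = np.random.default_rng(0)
n = 200
weight = rng.normal(size=n)
displace = 0.9 * weight + 0.3 * rng.normal(size=n)  # correlated with weight
year = rng.normal(size=n)
noise = rng.normal(size=n)                           # unrelated to the response
y = -3.0 * weight + 1.5 * year + rng.normal(size=n)

X = np.column_stack([weight, displace, year, noise])
names = ["weight", "displace", "year", "noise"]
results = best_subsets(X, y, names)
for size, cp, chosen in results:
    print(size, round(cp, 1), chosen)
```

The search is exhaustive, so the number of fits doubles with every added variable; that is manageable for the eight predictors here but not for very large candidate sets.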

When we dropped the variables cylinder and acc and removed the test data, the Best Subsets results were essentially unchanged:

Best Subsets Regression


Response is mpg

                                      d           
                                      i           
                                      s   w       
                                      p   e     j 
                                      l   i y e a 
                                      a   g e u p 
              Adj.                    c h h a r a 
Vars   R-Sq   R-Sq    C-p         s   e p t r o n 

   1   69.4   69.3  233.9    4.3909       X       
   1   64.2   64.1  328.4    4.7475   X           
   2   80.8   80.7   26.7    3.4783       X X     
   2   73.4   73.2  162.9    4.1004   X     X     
   3   81.1   81.0   23.1    3.4559       X X   X 
   3   81.1   80.9   23.6    3.4584       X X X   
   4   81.8   81.6   12.8    3.3988       X X X X 
   4   81.2   81.0   23.5    3.4531   X   X X X   
   5   82.1   81.8    9.3    3.3759   X   X X X X 
   5   81.8   81.6   14.3    3.4013     X X X X X 
   6   82.4   82.0    7.0    3.3586   X X X X X X


Dennis Cox 2002-12-01