556 CHAPTER 10 Correlation and Regression

The following example illustrates that common sense and critical thinking are essential tools for effective use of methods of statistics.

EXAMPLE 2 Predicting Height from Footprint Evidence

Data Set 9 “Foot and Height” in Appendix B includes the age, foot length, shoe print length, shoe size, and height for each of 40 different subjects. Using those sample data, find the regression equation that is best for predicting height. Is the “best” regression equation a good equation for predicting height?

SOLUTION

Using the response variable of height and possible predictor variables of age, foot length, shoe print length, and shoe size, there are 15 different possible combinations of predictor variables. Table 10-5 includes key results from five of those combinations. Blind and thoughtless application of regression methods would suggest that the best regression equation uses all four of the predictor variables, because that combination yields the highest adjusted R² value of 0.7585. However, given the objective of using evidence to estimate the height of a suspect, we use critical thinking as follows.

1. Delete the variable of age, because criminals rarely leave evidence identifying their ages.
2. Delete the variable of shoe size, because it is really a rounded form of foot length.
3. For the remaining variables of foot length and shoe print length, use only foot length because its adjusted R² value of 0.7014 is greater than 0.6520 for shoe print length, and it is not very much less than the adjusted R² value of 0.7484 for both foot length and shoe print length. In this case, it is better to use one predictor variable instead of two.
4. Although it appears that the use of the single variable of foot length is best, we also note that criminals usually wear shoes, so shoe print lengths are more likely to be found than foot lengths.

TABLE 10-5 Select Key Results from Data Set 9 “Foot and Height” in Appendix B

Predictor Variables                              Adjusted R²   P-Value
Age                                              0.1772        0.004     ← Not best: Adjusted R² is far less than 0.7014 for Foot Length.
Foot Length                                      0.7014        0.000     ← Best: High adjusted R² and lowest P-value.
Shoe Print Length                                0.6520        0.000     ← Not best: Adjusted R² is less than 0.7014 for Foot Length.
Foot Length/Shoe Print Length                    0.7484        0.000     ← Not best: The adjusted R² value is not very much higher than 0.7014 for the single variable of Foot Length.
Age/Foot Length/Shoe Print Length/Shoe Size      0.7585        0.000     ← Not best: There are other cases using fewer variables with adjusted R² that are not too much smaller.

INTERPRETATION

Blind use of regression methods suggests that when estimating the height of a subject, we should use all of the available data by including all four predictor variables of age, foot length, shoe print length, and shoe size, but practical considerations suggest that it is best to use the single predictor variable of foot length. So the best regression equation appears to be:

Height = 64.1 + 4.29 (Foot Length)
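The arithmetic behind Example 2 can be checked with a short Python sketch. It is an illustration only, not part of the text's solution: it counts the non-empty combinations of the four predictor variables, uses the adjusted R² values copied from Table 10-5 to show how little is gained by adding variables to foot length, and evaluates the regression equation Height = 64.1 + 4.29 (Foot Length). The function names and the sample foot length of 25.0 are hypothetical choices made for this sketch; the measurement units are those of Data Set 9 and are not restated here.

```python
from itertools import combinations

PREDICTORS = ["Age", "Foot Length", "Shoe Print Length", "Shoe Size"]


def predictor_subsets(predictors):
    """Return every non-empty combination of the given predictor names."""
    return [
        combo
        for size in range(1, len(predictors) + 1)
        for combo in combinations(predictors, size)
    ]


# Adjusted R^2 values copied from Table 10-5 (five of the possible combinations).
ADJUSTED_R2 = {
    ("Age",): 0.1772,
    ("Foot Length",): 0.7014,
    ("Shoe Print Length",): 0.6520,
    ("Foot Length", "Shoe Print Length"): 0.7484,
    ("Age", "Foot Length", "Shoe Print Length", "Shoe Size"): 0.7585,
}


def gain_over_foot_length(combo):
    """Adjusted R^2 of `combo` minus the adjusted R^2 for Foot Length alone."""
    return ADJUSTED_R2[combo] - ADJUSTED_R2[("Foot Length",)]


def predict_height(foot_length):
    """Evaluate Height = 64.1 + 4.29 (Foot Length) from Example 2."""
    return 64.1 + 4.29 * foot_length


if __name__ == "__main__":
    # Number of non-empty subsets of four predictors: 2**4 - 1.
    print(len(predictor_subsets(PREDICTORS)))

    # How much adjusted R^2 changes relative to Foot Length alone.
    for combo in ADJUSTED_R2:
        print("/".join(combo), round(gain_over_foot_length(combo), 4))

    # Hypothetical foot length of 25.0, used only to exercise the equation.
    print(round(predict_height(25.0), 2))
```

The count confirms the 15 combinations stated in the solution, and the gains of 0.0470 (two variables) and 0.0571 (all four variables) over foot length alone are the "not very much higher" differences noted in Table 10-5.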