10-4 Multiple Regression 557

However, given that criminals usually wear shoes, it is best to use the single predictor variable of shoe print length, so the best practical regression equation appears to be this: Height = 80.9 + 3.22 (Shoe Print Length). The P-value of 0.000 suggests that the regression equation yields a good model for estimating height. Because the results of this example are based on sample data from only 40 subjects, estimates of heights will not be very accurate. As is usually the case, better results could be obtained by using larger samples.

YOUR TURN. Do Exercise 13 “Predicting Car Fuel Consumption.”

Tests of Regression Coefficients

The preceding guidelines for finding the best multiple regression equation are based on the adjusted R² and the P-value, but we could also conduct individual hypothesis tests based on values of the regression coefficients. Consider the regression coefficient β1. A test of the null hypothesis β1 = 0 can tell us whether the corresponding predictor variable should be included in the regression equation. Rejection of β1 = 0 suggests that β1 has a nonzero value and that the corresponding predictor variable is therefore helpful for predicting the value of the response variable. Procedures for such tests are described in Exercise 17.

Predictions With Multiple Regression

When we discussed regression in Section 10-2, we listed (on page 533) four points to consider when using regression equations to make predictions. These same points should be considered when using multiple regression equations.

PART 2 Dummy Variables and Logistic Regression

So far in this chapter, all variables have represented continuous data, but many situations involve a variable with only two possible qualitative values (such as male/female or dead/alive or cured/not cured). To obtain regression equations that include such variables, we must somehow assign numbers to the two different categories.
A common procedure is to represent the two possible values by 0 and 1, where 0 represents a “failure” and 1 represents a “success.” For disease outcomes, 1 is often used to represent the event of the disease or death, and 0 is used to represent the nonevent.

DEFINITION
A dummy variable is a variable having only the values 0 and 1, which are used to represent the two different categories of a qualitative variable.

A dummy variable is sometimes called a dichotomous variable. The word “dummy” is used because the variable does not actually have any quantitative value, but we use it as a substitute to represent the different categories of the qualitative variable.

Dummy Variable as a Predictor Variable

Procedures of regression analysis differ dramatically, depending on whether the dummy variable is a predictor (x) variable or the response (y) variable. If we include a dummy variable as another predictor (x) variable, we can use the same methods of Part 1 in this section, as illustrated in Example 3.
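
To make the idea of a 0/1 predictor concrete, here is a minimal Python sketch of an ordinary least-squares fit with one continuous predictor and one dummy predictor. It is an illustration under stated assumptions, not the procedure of any particular statistics package. The data are made up (they are not taken from any data set in this book), and the function names solve_linear_system and fit_least_squares are hypothetical helpers written for this sketch. The response values are generated exactly from the equation y = 80 + 3(length) + 5(dummy), so the fit should recover those three coefficients.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], so row operations act on both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Return [b0, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    # Prepend a column of 1s so that b0 is the intercept.
    design = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up illustrative values (an assumption of this sketch, not real data).
shoe_print_cm = [25.0, 27.0, 29.0, 31.0, 26.0, 28.0, 30.0, 32.0]
is_male = [0, 0, 0, 0, 1, 1, 1, 1]  # dummy variable: 0 = female, 1 = male

# Responses generated exactly from y = 80 + 3*length + 5*dummy (no noise).
height_cm = [80.0 + 3.0 * length + 5.0 * d
             for length, d in zip(shoe_print_cm, is_male)]

b0, b1, b2 = fit_least_squares(list(zip(shoe_print_cm, is_male)), height_cm)
print(f"Height = {b0:.2f} + {b1:.2f}(Shoe Print Length) + {b2:.2f}(Sex)")
```

In this sketch the coefficient of the dummy variable is the amount added to the predicted value when the dummy equals 1 rather than 0, with the other predictor held fixed. The dummy variable needs no special handling: it enters the design matrix as one more column of numbers, which is why the methods of Part 1 carry over unchanged.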