554 CHAPTER 10 Correlation and Regression

If a multiple regression equation fits the sample data well, it can be used for predictions. For example, if we determine that the multiple regression equation in Example 1 is suitable for predictions, we can use the height and waist circumference of a male to predict his weight. But how do we determine whether the multiple regression equation fits the sample data well? Two very helpful tools are the values of adjusted R² and the P-value.

R² and Adjusted R²

R² denotes the multiple coefficient of determination, which is a measure of how well the multiple regression equation fits the sample data. A perfect fit would result in R² = 1, and a very good fit results in a value near 1. A very poor fit results in a value of R² close to 0. The value of R² = 0.878 (“Coeff of Det, R^2”) in the Statdisk display for Example 1 indicates that 87.8% of the variation in weights of males can be explained by their heights and waist circumferences. However, the multiple coefficient of determination R² has a serious flaw: As more variables are included, R² increases. (R² could remain the same, but it usually increases.) The largest R² is obtained by simply including all of the available variables, but the best multiple regression equation does not necessarily use all of the available variables. Because of that flaw, it is better to use the adjusted coefficient of determination, which is R² adjusted for the number of variables and the sample size.

YOUR TURN. Do Exercise 13 “Predicting Car Fuel Consumption.”

DEFINITION The adjusted coefficient of determination is the multiple coefficient of determination R² modified to account for the number of variables and the sample size. It is calculated by using Formula 10-8.

FORMULA 10-8

$$\text{Adjusted } R^2 = 1 - \frac{(n - 1)}{[\,n - (k + 1)\,]}\,(1 - R^2)$$

where
n = sample size
k = number of predictor (x) variables
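
As a rough illustration of Formula 10-8, the short Python sketch below computes adjusted R² from R², the sample size n, and the number of predictor variables k. The function name `adjusted_r2` is our own, and the sample size n = 40 is a hypothetical value chosen only for the arithmetic; only R² = 0.878 and k = 2 (height and waist circumference) come from Example 1.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 per Formula 10-8: 1 - [(n - 1) / (n - (k + 1))] * (1 - R^2).

    r2: multiple coefficient of determination
    n:  sample size
    k:  number of predictor (x) variables
    """
    if n <= k + 1:
        # The denominator n - (k + 1) must be positive for the formula to apply.
        raise ValueError("sample size n must exceed k + 1")
    return 1 - ((n - 1) / (n - (k + 1))) * (1 - r2)


# R^2 = 0.878 and k = 2 are taken from Example 1; n = 40 is a hypothetical
# sample size used only to illustrate the calculation.
two_predictors = adjusted_r2(0.878, n=40, k=2)

# Hypothetical comparison: a third predictor that leaves R^2 unchanged.
three_predictors = adjusted_r2(0.878, n=40, k=3)

print(round(two_predictors, 3), round(three_predictors, 3))
```

Under these assumed inputs, adding a third predictor without improving R² lowers the adjusted value. That is the point of the adjustment: unlike R², adjusted R² does not automatically reward including more variables.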
