544 CHAPTER 10 Correlation and Regression

10-3 Prediction Intervals and Variation

Key Concept In Section 10-2 we presented a method for using a regression equation to find a predicted value of y, but it would be great to have a way of determining the accuracy of such predictions. In this section we introduce the prediction interval, which is an interval estimate of a predicted value of y. See the following definitions for the distinction between confidence interval and prediction interval.

DEFINITIONS
A prediction interval is a range of values used to estimate a variable (such as a predicted value of y in a regression equation).
A confidence interval is a range of values used to estimate a population parameter (such as p or μ or σ).

In Example 4(a) from the preceding section, we showed that when using the 9 pairs of jackpot/tickets data from Table 10-1, the regression equation is $\hat{y} = -10.9 + 0.174x$, and for a jackpot of x = 625 million dollars, the predicted value of y is 97.9 million tickets (which is found by substituting x = 625 in the regression equation). For x = 625, the "best" predicted value of y is 97.9, but we have no sense of the accuracy of that estimate, so we need an interval estimate. A prediction interval estimate of a predicted value $\hat{y}$ can be found using the components in the following Key Elements box. Given the nature of the calculations, the use of technology is strongly recommended. Minitab and StatCrunch can be used to automatically generate 95% prediction intervals.

KEY ELEMENTS
Prediction Intervals

Objective
Find a prediction interval, which is an interval estimate of a predicted value of y.

Requirement
For each fixed value of x, the corresponding sample values of y are normally distributed about the regression line, and those normal distributions have the same variance.

Formulas for Creating a Prediction Interval
Given a fixed and known value $x_0$, the prediction interval for an individual y value is

$$\hat{y} - E < y < \hat{y} + E$$

where the margin of error is

$$E = t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n(\Sigma x^2) - (\Sigma x)^2}}$$

and $x_0$ is a given value of x, $t_{\alpha/2}$ has n - 2 degrees of freedom, and $s_e$ is the standard error of estimate found from Formula 10-5 or Formula 10-6. (The standard error of estimate $s_e$ is a measure of variation of the residuals, which are the differences between the observed sample y values and the predicted values $\hat{y}$ that are found from the regression equation.)
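The margin of error formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a replacement for Minitab or StatCrunch. The function name `prediction_interval` is hypothetical, and the five (x, y) pairs are made-up values, not the jackpot/tickets data from Table 10-1. The critical value $t_{\alpha/2}$ is passed in from a t table (3.182 for 95% confidence with n - 2 = 3 degrees of freedom). The standard error of estimate is computed as the square root of the sum of squared residuals divided by n - 2, which is assumed here to agree with Formula 10-5.

```python
import math

def prediction_interval(xs, ys, x0, t_crit):
    """Return (lower, y_hat, upper) for an individual y value at x = x0.

    t_crit is the critical value t_(alpha/2) with n - 2 degrees of freedom,
    looked up by the caller (for example, from a t table).
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Least-squares slope (b1) and intercept (b0) of the regression line.
    denom = n * sum_x2 - sum_x ** 2
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    b0 = (sum_y - b1 * sum_x) / n

    # Best predicted value of y at x0.
    y_hat = b0 + b1 * x0

    # Standard error of estimate: variation of the residuals y - y_hat.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))

    # Margin of error E from the Key Elements box.
    x_bar = sum_x / n
    margin = t_crit * s_e * math.sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / denom)

    return y_hat - margin, y_hat, y_hat + margin

# Made-up illustrative data (NOT Table 10-1); n = 5, so df = 3 and t = 3.182.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
lower, y_hat, upper = prediction_interval(xs, ys, x0=3, t_crit=3.182)
print(f"{lower:.3f} < y < {upper:.3f} (best predicted value {y_hat:.3f})")
```

Passing the critical value as an argument keeps the sketch free of third-party libraries. With a statistics package available, the same value could instead be obtained from its t distribution functions.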
