37. Finding Critical r Values Table A-6 lists critical values of r for selected values of n and α. More generally, critical r values can be found by using the formula

$r = \dfrac{t}{\sqrt{t^2 + n - 2}}$

where the t value is found from the table of critical t values (Table A-3) assuming a two-tailed case with n - 2 degrees of freedom. Use the formula for r given here and Table A-3 (with n - 2 degrees of freedom) to find the critical r values corresponding to $H_1: \rho \neq 0$, α = 0.05, and n = 703 as in Exercises 29–32.

10-2 Regression

Key Concept This section presents methods for finding the equation of the straight line that best fits the points in a scatterplot of paired sample data. That best-fitting straight line is called the regression line, and its equation is called the regression equation. We can use the regression equation to make predictions for the value of one of the variables, given some specific value of the other variable. In Part 2 of this section we discuss marginal change, influential points, and residual plots as tools for analyzing correlation and regression results.

PART 1 Basic Concepts of Regression

In some cases, two variables are related in a deterministic way, meaning that given a value for one variable, the value of the other variable is exactly determined without any error, as in the equation y = 2.54x for converting a distance x from inches to centimeters. Such equations are considered in algebra courses, but statistics courses focus on probabilistic models, which are equations with a variable that is not determined completely by the other variable. For example, the height of a child cannot be determined completely by the height of the father and/or mother. Sir Francis Galton (1822–1911) studied the phenomenon of heredity and showed that when tall or short couples have children, the heights of those children tend to regress, or revert to the more typical mean height for people of the same gender.
We continue to use Galton’s “regression” terminology, even though our data do not involve the same height phenomena studied by Galton.

DEFINITIONS

Given a collection of paired sample data, the regression line (or line of best fit, or least-squares line) is the straight line that “best” fits the scatterplot of the data. (The specific criterion for the “best-fitting” straight line is the “least-squares” property described later.)

The regression equation $\hat{y} = b_0 + b_1 x$ algebraically describes the regression line. The regression equation expresses a relationship between x (called the explanatory variable, or predictor variable, or independent variable) and $\hat{y}$ (called the response variable or dependent variable).

The preceding definition shows that in statistics, the typical equation of a straight line y = mx + b is expressed in the form $\hat{y} = b_0 + b_1 x$, where $b_0$ is the y-intercept and $b_1$ is the slope. The values of the slope $b_1$ and y-intercept $b_0$ can be easily found by using any one of the many computer programs and calculators designed to provide those values, as illustrated in Example 1. The values of $b_1$ and $b_0$ can also be found with manual calculations, as shown in Example 2.
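
To make the slope and y-intercept concrete, the following is a minimal Python sketch of one standard way to compute them from paired data: the slope is the sum of products of deviations divided by the sum of squared x deviations, and the intercept then follows from the two sample means. The function names `least_squares_line` and `predict` and the five data pairs are hypothetical illustrations, not values or notation taken from the text, and the text's own examples may organize the calculation differently.

```python
from statistics import mean


def least_squares_line(xs, ys):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1 * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    x_bar = mean(xs)
    y_bar = mean(ys)

    # Sum of squared deviations of x, and sum of products of deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # y-intercept
    return b0, b1


def predict(b0, b1, x):
    """Predicted value y-hat for a given x."""
    return b0 + b1 * x


# Hypothetical paired sample data (assumed for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(xs, ys)
print(f"y-hat = {b0:.2f} + {b1:.2f}x")  # prints "y-hat = 2.20 + 0.60x"
```

Applied to a deterministic relationship such as y = 2.54x, the same function should return a slope of about 2.54 and an intercept of about 0, since every point then lies exactly on the line.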