Survey of Mathematics

CHAPTER 12 Statistics

Thus, $\Sigma x = 17$, $\Sigma y = 106$, $\Sigma x^2 = 75$, $\Sigma y^2 = 2202$, and $\Sigma xy = 387$. In the formula for $r$, we use both $(\Sigma x)^2$ and $\Sigma x^2$. Note that $(\Sigma x)^2 = (17)^2 = 289$ and that $\Sigma x^2 = 75$. Similarly, $(\Sigma y)^2 = (106)^2 = 11{,}236$ and $\Sigma y^2 = 2202$. The $n$ in the formula represents the number of pieces of bivariate data. Here $n = 6$. Now let’s determine $r$.

$$
\begin{aligned}
r &= \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{\sqrt{n(\Sigma x^2) - (\Sigma x)^2}\,\sqrt{n(\Sigma y^2) - (\Sigma y)^2}} \\
  &= \frac{6(387) - (17)(106)}{\sqrt{6(75) - (17)^2}\,\sqrt{6(2202) - (106)^2}} \\
  &= \frac{2322 - 1802}{\sqrt{6(75) - 289}\,\sqrt{6(2202) - 11{,}236}} \\
  &= \frac{520}{\sqrt{450 - 289}\,\sqrt{13{,}212 - 11{,}236}} \\
  &= \frac{520}{\sqrt{161}\,\sqrt{1976}} \approx 0.922
\end{aligned}
$$

Since the maximum possible value for $r$ is 1.00, a correlation coefficient of 0.922 indicates a strong positive correlation. This result implies that, generally, the more assembly line workers absent, the more defective parts produced.

Now try Exercise 11

In Example 1, had we determined $r$ to be a value greater than 1 or less than $-1$, it would have indicated that we had made an error. Also, from the scatter diagram, we should realize that $r$ should be a positive value and not negative.

In Example 1, there appears to be a cause–effect relationship. That is, the more assembly line workers who are absent, the more defective parts are produced. However, a correlation does not necessarily indicate a cause–effect relationship. For example, there is a positive correlation between teachers’ salaries and the cost of medical insurance over the past 10 years (both have increased), but that does not mean that the increase in teachers’ salaries caused the increase in the cost of medical insurance.

Suppose in Example 1 that $r$ had been 0.53. Would this value have indicated a correlation? What is the minimum value of $r$ needed to assume that a correlation exists between the variables? To answer this question, we introduce the term level of significance. The level of significance, denoted $\alpha$ (alpha), is used to identify the cutoff between results attributed to chance and results attributed to an actual relationship between the two variables.

Table 12.9 gives critical values (or cutoff values) that are sometimes used for determining whether two variables are related. The table indicates two different levels of significance: $\alpha = 0.05$ and $\alpha = 0.01$. A level of significance of 5%, written $\alpha = 0.05$, means that there is a 5% chance that, when you say the variables are related, they actually are not related. Similarly, a level of significance of 1%, or $\alpha = 0.01$, means that there is a 1% chance that, when you say the variables are related, they actually are not related. More complete critical value tables are available in statistics books.

To explain the use of the table, we use absolute value, symbolized $|\ |$. The absolute value of a nonzero number is the positive value of the number, and the absolute value of 0 is 0. Therefore, $|3| = 3$, $|-3| = 3$, $|5| = 5$, $|-5| = 5$, and $|0| = 0$.
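The arithmetic for $r$ in Example 1 can be checked with a short Python sketch. It takes only the summary sums stated in the example ($n = 6$, $\Sigma x = 17$, $\Sigma y = 106$, $\Sigma x^2 = 75$, $\Sigma y^2 = 2202$, $\Sigma xy = 387$); the function name `pearson_r_from_sums` and its parameter names are illustrative choices, not part of the text.

```python
import math


def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return the correlation coefficient r computed from summary sums.

    Hypothetical helper: it applies the formula for r from Example 1
    directly to the sums rather than to the raw data pairs.
    """
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Summary sums taken from Example 1.
r = pearson_r_from_sums(n=6, sum_x=17, sum_y=106,
                        sum_x2=75, sum_y2=2202, sum_xy=387)

# A value outside [-1, 1] would signal an arithmetic error.
assert -1 <= r <= 1

print(round(r, 3))  # 0.922
```

Working from the sums mirrors the hand calculation step by step, so each intermediate quantity (520, 161, 1976) can be compared with the worked solution if the final value disagrees.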
