520 CHAPTER 10 Correlation and Regression

jackpot/ticket data values have been converted to z scores. Figure 10-4 is essentially the same scatterplot as Figure 10-1, except that Figure 10-4 uses different scales. The red lines in Figure 10-4 form the same coordinate axes that we have all come to know and love from earlier mathematics courses. Those red lines partition Figure 10-4 into four quadrants.

FIGURE 10-4 Scatterplot of z Scores from the Data in Table 10-1

If the points of the scatterplot approximate an uphill line (as in Figure 10-4), individual values of the product z_x · z_y tend to be positive (because most of the points are found in the first and third quadrants, where the values of z_x and z_y are either both positive or both negative), so Σ(z_x z_y) tends to be positive. If the points of the scatterplot approximate a downhill line, most of the points are in the second and fourth quadrants, where z_x and z_y are opposite in sign, so Σ(z_x z_y) tends to be negative. Points that follow no linear pattern tend to be scattered among the four quadrants, so the value of Σ(z_x z_y) tends to be close to 0.

Using Σ(z_x z_y) as a measure of how the points are configured among the four quadrants, we get the following:

■ Positive Correlation: A large positive value of Σ(z_x z_y) suggests that the points are predominantly in the first and third quadrants (corresponding to a positive linear correlation).
■ Negative Correlation: A large negative value of Σ(z_x z_y) suggests that the points are predominantly in the second and fourth quadrants (corresponding to a negative linear correlation).
■ No Correlation: A value of Σ(z_x z_y) near 0 suggests that the points are scattered among the four quadrants (with no linear correlation).

We divide Σ(z_x z_y) by n − 1 to get an average instead of a statistic that becomes larger simply because there are more data values. (The reasons for dividing by n − 1 instead of n are essentially the same reasons that relate to the standard deviation.)
The end result is Formula 10-2, which can be algebraically manipulated into any of the other expressions for r.

Go Figure
Smallest country in the world: Vatican City with an area of 0.2 square miles.

PART 3 Randomization Test

When listing the requirements for an analysis of correlation, we noted that alternatives to the method presented in Part 1 are to use rank correlation (Section 13-6) or the resampling method of randomization. The randomization method is based on the principle that when assuming a null hypothesis of no correlation, we can resample by holding the x data values fixed while randomly shuffling the order of the y data values. We do the shuffling without replacement, so we are working with a randomization test that can be used to test the assumption (null hypothesis) of no correlation.
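
The two ideas on this page can be tried out together in a short Python sketch. It computes r as Σ(z_x z_y) divided by n − 1, then runs a randomization test by holding x fixed and shuffling y. The function names, the number of shuffles, the seed, the two-sided counting rule, and the small data set are all assumptions made for illustration; the data are not the values from Table 10-1.

```python
import random
import statistics


def z_scores(values):
    """Convert values to z scores using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return [(v - mean) / sd for v in values]


def r_from_z(zx, zy):
    """Sum of the products z_x * z_y, divided by n - 1."""
    return sum(a * b for a, b in zip(zx, zy)) / (len(zx) - 1)


def correlation_r(x, y):
    """Linear correlation coefficient computed from z scores."""
    return r_from_z(z_scores(x), z_scores(y))


def randomization_test(x, y, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p-value under the null hypothesis of no correlation.

    The x values stay fixed while the y values are shuffled without
    replacement. Shuffling y is the same as shuffling its z scores,
    because the mean and standard deviation do not depend on order.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    zx, zy = z_scores(x), z_scores(y)
    observed = r_from_z(zx, zy)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(zy)  # reorders in place; no value is repeated or dropped
        if abs(r_from_z(zx, zy)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_shuffles


# Hypothetical illustrative data (not the Table 10-1 values).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.4, 7.8]
r, p_value = randomization_test(x, y)
print(f"r = {r:.3f}, estimated p-value = {p_value:.4f}")
```

A small estimated p-value means that shuffled pairings rarely produce a value of r at least as extreme as the one observed, which is evidence against the null hypothesis of no correlation. Because the shuffles are random, the estimate varies slightly with the seed and with the number of shuffles.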