13-6 Rank Correlation

EXAMPLE 2 Large Sample Case

Example 1 used a sample size of n = 10, but if we use 33 smartphones, we get the following costs (dollars) that correspond to quality ranks of 1 through 33, respectively. Use a 0.05 significance level to test the claim that there is a correlation between the quality rank and the costs of these smartphones.

1000 1100 900 1000 750 1000 900 700 750 600 550
700 600 470 900 850 800 400 400 800 490 470
230 850 500 255 800 400 330 800 550 850 120

SOLUTION

REQUIREMENT CHECK The data are a simple random sample and can be converted to ranks.

Test Statistic The value of the rank correlation coefficient is r_s = -0.572, which can be found by using technology.

Critical Values Because there are 33 pairs of data, we have n = 33. Because n exceeds 30, we find the critical values from Formula 13-1 instead of Table A-9. With α = 0.05 in two tails, we let z = 1.96 to get the critical values of -0.346 and 0.346, as shown below.

$$r_s = \frac{\pm z}{\sqrt{n - 1}} = \frac{\pm 1.96}{\sqrt{33 - 1}} = \pm 0.346$$

The test statistic of r_s = -0.572 falls outside the range between the critical values of -0.346 and 0.346, so we reject the null hypothesis of ρ_s = 0. There is sufficient evidence to support the claim that there is a correlation between the costs of smartphones and their quality.

YOUR TURN. Do Exercise 13 “Taxis.”

Detecting Nonlinear Patterns

Rank correlation methods sometimes allow us to detect relationships that we cannot detect with the linear correlation methods of Chapter 10. See the scatterplot on the following page, which shows an S-shaped pattern of points suggesting that there is a correlation between x and y. The methods of Chapter 10 result in the linear correlation coefficient of r = 0.627 and critical values of ±0.632, suggesting that there is not sufficient evidence to support the claim of a linear correlation between x and y.
If we use rank correlation and the methods of this section, we get r_s = 0.997 and critical values of ±0.648, suggesting that there is sufficient evidence to support the claim of a correlation between x and y. Linear correlation missed it, but rank correlation recognized it. With rank correlation, we can sometimes detect relationships that are not linear.

Direct Link Between Smoking and Cancer

When we find a statistical correlation between two variables, we must be extremely careful to avoid the mistake of concluding that there is a cause-effect link. The tobacco industry has consistently emphasized that correlation does not imply causality as it denied that tobacco products cause cancer. However, Dr. David Sidransky of Johns Hopkins University and other researchers found a direct physical link that involves mutations of a specific gene among smokers. Molecular analysis of genetic changes allows researchers to determine whether cigarette smoking is the cause of a cancer. (See “Association Between Cigarette Smoking and Mutation of the p53 Gene in Squamous-Cell Carcinoma of the Head and Neck,” by Brennan, Boyle, et al., New England Journal of Medicine, Vol. 332, No. 11.) Although statistical methods cannot prove that smoking causes cancer, they can be used to identify an association, and researchers can then seek physical proof of causation.
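
Returning to the large-sample test of Example 2, the computation can be sketched in a few lines of Python. This is only an illustrative sketch, not the technology the text refers to: the helper names (`average_ranks`, `pearson`, `rank_correlation`, `large_sample_critical_value`) are hypothetical, and it assumes that tied costs receive the mean of the ranks they occupy and that r_s is computed as the linear correlation coefficient of the two sets of ranks.

```python
import math

def average_ranks(values):
    """Rank values from smallest (1) to largest; ties get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # sorted positions are 0-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_correlation(x, y):
    """Rank correlation: linear correlation applied to the ranks (assumed tie handling)."""
    return pearson(average_ranks(x), average_ranks(y))

def large_sample_critical_value(n, z):
    """Positive critical value z / sqrt(n - 1), as in Formula 13-1 (used when n > 30)."""
    return z / math.sqrt(n - 1)

# Costs (dollars) from Example 2, listed in order of quality rank 1 through 33.
costs = [1000, 1100, 900, 1000, 750, 1000, 900, 700, 750, 600, 550,
         700, 600, 470, 900, 850, 800, 400, 400, 800, 490, 470,
         230, 850, 500, 255, 800, 400, 330, 800, 550, 850, 120]
quality_ranks = list(range(1, len(costs) + 1))

n = len(costs)
r_s = rank_correlation(quality_ranks, costs)
critical = large_sample_critical_value(n, z=1.96)  # z for 0.05 in two tails
reject_null = abs(r_s) > critical

print(f"n = {n}, r_s = {r_s:.3f}, critical values = ±{critical:.3f}")
print("Reject the null hypothesis" if reject_null else "Fail to reject the null hypothesis")
```

Under these assumptions the sketch should agree with the values reported in Example 2 (r_s = -0.572 and critical values of ±0.346). Applying the linear correlation coefficient to ranks, rather than using a shortcut formula based on rank differences, keeps the sketch valid when there are ties, as there are among these costs.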