CHAPTER 8

$P(x) = \frac{3x^2}{1000}$ defined on [0, 10]

[Figure: Normal Quantile Plot of 20 Sample Values]

a. Mean  Test the claim that the 20 given sample values are from a population having a mean equal to 7.5, which is the known population mean. Because the sample is not from a normally distributed population and because n = 20 does not satisfy the requirement of $n > 30$, we should not use the methods of Section 8-3. Instead, test the claim by using (a) the confidence interval method based on a bootstrap sample of size 1000 (see Section 7-4), and (b) the randomization method. Use a 0.05 significance level. What do the resampling methods of bootstrapping and randomization suggest about the claim that the sample is from a population with a mean of 7.5?

b. Standard Deviation  Test the claim that the 20 sample values are from a population with a standard deviation equal to 1.93649, which is the known population standard deviation. Use the confidence interval method based on a bootstrap sample of size 1000. (See Section 7-4.) Use a 0.05 significance level. Does the bootstrap confidence interval contain the known population standard deviation of 1.93649? Is the bootstrap method effective for this test? What happens if we conduct the test by throwing all caution to the wind and constructing the 95% confidence interval by using the $\chi^2$ distribution as described in Section 7-3?

FROM DATA TO DECISION

Critical Thinking: Did the Official Cheat?

A county clerk in Essex County, New Jersey, had the responsibility of selecting the order in which candidates’ names appeared on election ballots. Being placed first on the ballot is generally recognized as an advantage. The clerk was supposed to use random selection for those ballot placements. Here are the clerk’s results: Democrats were selected first in 40 of 41 ballots. Republicans claimed that instead of using randomness, the clerk cheated by using a method that favored Democrats.

Analyzing the Results

a. In testing the claim made by the Republicans, choose between a significance level of 0.01 or 0.05. Consider the seriousness of the two types of errors. Explain your choice.

b. Analyze the result of Democrats getting the top line in 40 of 41 ballots. Describe the method used and the conclusions reached.

c. Assume that you are preparing an argument that will be presented in court. Is there strong evidence to support the claim that the clerk cheated? This is serious stuff, so write a thorough description of your analysis.

Big (or Very Large) Data Project

Refer to Data Set 45 “Births in New York” in Appendix B. Use all 465,506 birth weights to test the claim that babies have a mean birth weight less than 3255 grams. Use a 0.05 significance level.
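For the resampling exercise on the mean and standard deviation (parts a and b above the From Data to Decision feature), the following is a minimal Python sketch of one common way to build a percentile bootstrap confidence interval and to run a randomization test for a claimed mean. The 20 sample values are not reproduced on this page, so the sketch generates a stand-in sample from the density $P(x) = 3x^2/1000$ on [0, 10] by inverse-transform sampling; that stand-in, the seed, and the function names `bootstrap_ci` and `randomization_p_value` are all assumptions made for illustration, and results from the actual 20 values will differ.

```python
import random
import statistics


def bootstrap_ci(sample, stat, n_boot=1000, alpha=0.05, rng=None):
    """Percentile bootstrap confidence interval for stat(sample)."""
    if rng is None:
        rng = random.Random()
    n = len(sample)
    # Resample with replacement n_boot times and sort the statistics.
    stats = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    # Approximate the alpha/2 and 1 - alpha/2 percentiles by position.
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def randomization_p_value(sample, mu0, n_resamples=1000, rng=None):
    """Two-sided randomization P-value for the claim that the mean is mu0."""
    if rng is None:
        rng = random.Random()
    n = len(sample)
    xbar = statistics.fmean(sample)
    # Shift the data so its mean equals the claimed mean, then resample.
    shifted = [x - xbar + mu0 for x in sample]
    observed = abs(xbar - mu0)
    extreme = sum(
        abs(statistics.fmean(rng.choices(shifted, k=n)) - mu0) >= observed
        for _ in range(n_resamples)
    )
    return extreme / n_resamples


# ASSUMPTION: stand-in data, NOT the textbook's 20 values.
# The CDF of P(x) = 3x^2/1000 is x^3/1000, so x = 10 * u^(1/3).
rng = random.Random(8)
sample = [10 * rng.random() ** (1 / 3) for _ in range(20)]

mean_ci = bootstrap_ci(sample, statistics.fmean, rng=rng)
sd_ci = bootstrap_ci(sample, statistics.stdev, rng=rng)
p_value = randomization_p_value(sample, 7.5, rng=rng)

print("95% bootstrap CI for the mean:", mean_ci)
print("95% bootstrap CI for the standard deviation:", sd_ci)
print("Randomization P-value for mean = 7.5:", p_value)
```

With the confidence interval method, the claim is rejected at the 0.05 level when the claimed value (7.5 for the mean, 1.93649 for the standard deviation) falls outside the 95% interval; with the randomization method, it is rejected when the P-value is at most 0.05.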
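For the ballot-order question in From Data to Decision, one possible analysis is an exact binomial calculation. The sketch below assumes that under genuinely random selection each of the 41 ballots independently puts a Democrat first with probability 0.5; that model is an assumption of this sketch, not something stated in the exercise, and the helper name `binomial_tail_at_least` is hypothetical.

```python
from math import comb


def binomial_tail_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_ballots = 41
democrats_first = 40

# ASSUMPTION: fair, independent 50/50 selection on every ballot.
p_right_tail = binomial_tail_at_least(democrats_first, n_ballots)

# A two-sided P-value doubles the tail (capped at 1).
p_two_sided = min(1.0, 2 * p_right_tail)

print("P(at least 40 of 41 | p = 0.5):", p_right_tail)
print("Two-sided P-value:", p_two_sided)
```

Under this model the right-tail probability is $(41 + 1)/2^{41} = 42/2^{41}$, roughly $1.9 \times 10^{-11}$, which is far below both 0.01 and 0.05. Whether that amounts to strong evidence of cheating still depends on whether the 50/50 independence model fits how the ballots were actually drawn.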
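For the Big (or Very Large) Data Project, a left-tailed test of a mean can be sketched as follows. With 465,506 values the t distribution is practically indistinguishable from the standard normal, so this sketch uses a normal approximation; that substitution is named here as an assumption rather than the method of Section 8-3. The list `placeholder_weights` is made up purely to show the call and is not taken from Data Set 45, and a sample this small would not justify the normal approximation in practice.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev


def left_tailed_mean_test(values, mu0, alpha=0.05):
    """Test H0: mu = mu0 against H1: mu < mu0 with a large-sample z statistic."""
    n = len(values)
    xbar = fmean(values)
    s = stdev(values)  # sample standard deviation
    z = (xbar - mu0) / (s / sqrt(n))
    p_value = NormalDist().cdf(z)  # area to the left of z
    return z, p_value, p_value < alpha


# ASSUMPTION: hypothetical placeholder values in grams, NOT Data Set 45.
placeholder_weights = [3100, 3400, 2950, 3300, 3200, 3050, 3500, 3150]

z, p_value, reject = left_tailed_mean_test(placeholder_weights, mu0=3255)
print("z =", z, "P-value =", p_value, "reject H0:", reject)
```

To run the project itself, the placeholder list would be replaced by all 465,506 birth weights loaded from the data set. With a sample that large, even a very small difference from 3255 grams can be statistically significant, so the size of the difference deserves attention alongside the P-value.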
