78 CHAPTER 2 Exploring Data with Tables and Graphs

EXAMPLE 5 Correlation Between Shoe Print Lengths and Heights?

Example 4 used only five pairs of data from Data Set 9 “Foot and Height” in Appendix B. If we use the shoe print lengths and heights from all of the 40 subjects listed in Data Set 9 in Appendix B, we get the scatterplot shown in Figure 2-18 and we get the Minitab results shown in the accompanying display. The scatterplot does show a distinct pattern instead of having points scattered about willy-nilly. Also, we see that the value of the linear correlation coefficient is r = 0.813, and the P-value is 0.000 when rounded to three decimal places. Because the P-value of 0.000 is small, we have sufficient evidence to conclude that there is a linear correlation between shoe print lengths and heights. In Example 4 with only five pairs of data, we did not have enough evidence to conclude that there is a linear correlation, but in this example with 40 pairs of data, we have sufficient evidence to conclude that there is a linear correlation between shoe print lengths and heights.

YOUR TURN. Do Exercise 13 “P-Values.”

FIGURE 2-18 Scatterplot of 40 Pairs of Data

Minitab

PART 3 Regression

When we do conclude that there appears to be a linear correlation between two variables (as in Example 5), we can find the equation of the straight line that best fits the sample data, and that equation can be used to predict the value of one variable when given a specific value of the other variable. Based on the results from Example 5, we can predict someone’s height given the length of their shoe print (which may have been found at a crime scene). Instead of using the straight-line equation format of y = mx + b that we have all come to know and love from prior math courses, we use the format that follows.
DEFINITION Given a collection of paired sample data, the regression line (or line of best fit, or least-squares line) is the straight line that “best” fits the scatterplot of the data. (The specific criterion for the “best”-fitting straight line is the “least squares” property described in Section 10-2.)

Police Deaths in Car Chases

USA Today investigated the annual reporting of the numbers of police who were killed during car chases. It was found that the Federal Bureau of Investigation (FBI) counted 24 deaths in the past 35 years, but other records show that there were 371 deaths during that time period. USA Today reporter Thomas Frank wrote that “the undercount is one of the most extreme examples of the federal government’s inability to accurately track violent deaths and has led the FBI to minimize the danger of police chasing motorists.” Apparently, the FBI was categorizing these deaths as automobile accidents instead of designating them as police deaths that occurred during a car chase.
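Returning to the regression line defined above, the following is a minimal Python sketch of how a linear correlation coefficient and a least-squares line might be computed from paired data, and how the line might then be used for prediction. The six data pairs are hypothetical values invented for illustration; they are not taken from Data Set 9, so the results will not match r = 0.813 from Example 5. The function names (pearson_r, least_squares_line, predict) and the intercept-plus-slope form of the line are also assumptions made for this sketch, not the notation or the software used in the text.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted y value on the fitted line for a given x."""
    return intercept + slope * x


# Hypothetical shoe print lengths (cm) and heights (cm), for illustration only.
shoe_print_cm = [27.0, 28.5, 29.7, 30.2, 31.4, 32.0]
height_cm = [165.0, 170.2, 175.3, 174.0, 181.1, 183.5]

r = pearson_r(shoe_print_cm, height_cm)
intercept, slope = least_squares_line(shoe_print_cm, height_cm)

print(f"r = {r:.3f}")
print(f"fitted line: height = {intercept:.1f} + {slope:.2f} * (shoe print length)")
print(f"predicted height for a 29.0 cm shoe print: {predict(intercept, slope, 29.0):.1f} cm")
```

The sketch does not compute a P-value. As in Example 5, a prediction from the fitted line is reasonable only after the data give sufficient evidence of a linear correlation, and a sample as small as the one used here may not provide that evidence.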
