584 CHAPTER 11 Goodness-of-Fit and Contingency Tables

Step 6: Table 11-5 shows the calculations of the components of the χ² test statistic for the leading digits of 1 and 2. If we include all nine leading digits, we get the test statistic of χ² = 11.2792, as shown in the accompanying TI-84 Plus CE calculator display. The critical value is χ² = 15.507 (found in Table A-4, with α = 0.05 in the right tail and degrees of freedom equal to k − 1 = 8). The TI-84 Plus CE calculator display shows the value of the test statistic as well as the P-value of 0.186. (The entire bottom row of the display can be viewed by scrolling to the right. CNTRB is an abbreviated form of “contribution,” and the values are the individual contributions to the total value of the χ² test statistic.)

[Calculator display: TI-84 Plus CE]

TABLE 11-5 Calculating the χ² Test Statistic for Leading Digits in Table 11-4

Leading   Observed      Expected Frequency
Digit     Frequency O   E = np                   O − E       (O − E)²    (O − E)²/E
1         69            271 · 0.301 = 81.5710    −12.5710    158.0300    1.9373
2         40            271 · 0.176 = 47.6960     −7.6960     59.2284    1.2418

Step 7: The P-value of 0.186 is greater than the significance level of 0.05, so there is not sufficient evidence to reject the null hypothesis. (Also, the test statistic of χ² = 11.2792 does not fall in the critical region bounded by the critical value of 15.507, so there is not sufficient evidence to reject the null hypothesis.)

Step 8: There is not sufficient evidence to warrant rejection of the claim that the 271 leading digits fit the distribution given by Benford’s law.

INTERPRETATION
The sample of leading digits does not provide enough evidence to conclude that the Benford’s law distribution is not being followed. There is not sufficient evidence to support a conclusion that the leading digits are from interarrival times that are not from normal traffic, so there is not sufficient evidence to conclude that a cyberattack has occurred.

YOUR TURN.
Do Exercise 21 “Detecting Fraud.”

FIGURE 11-3 Interarrival Times: Observed Proportions and Proportions Expected with Benford’s Law

In Figure 11-3 we use a green line to graph the expected proportions given by Benford’s law (as in Table 11-4) along with a red line for the observed proportions from Table 11-4. Figure 11-3 allows us to visualize the “goodness-of-fit” between the distribution given by Benford’s law and the frequencies that were observed. In Figure 11-3, the green and red lines agree reasonably well, so it appears that the observed data fit the expected values reasonably well.

Mendel’s Data Falsified?

Because some of Mendel’s data from his famous genetics experiments seemed too perfect to be true, statistician R. A. Fisher concluded that the data were probably falsified. He used a chi-square distribution to show that when a test statistic is extremely far to the left and results in a P-value very close to 1, the sample data fit the claimed distribution almost perfectly, and this is evidence that the sample data have not been randomly selected. It has been suggested that Mendel’s gardener knew what results Mendel’s theory predicted, and subsequently adjusted results to fit that theory.

Ira Pilgrim wrote in The Journal of Heredity that this use of the chi-square distribution is not appropriate. He notes that the question is not about goodness-of-fit with a particular distribution, but whether the data are from a sample that is truly random. Pilgrim used the binomial probability formula to find the probabilities of the results obtained in Mendel’s experiments. Based on his results, Pilgrim concludes that “there is no reason whatever to question Mendel’s honesty.” It appears that Mendel’s results are not too good to be true, and they could have been obtained from a truly random process.
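The arithmetic in Steps 6 and 7 can be checked with a short Python sketch. It is only an illustration, not the textbook's or the calculator's procedure: the function names chi_square_components and chi_square_right_tail are hypothetical, and the right-tail area uses a closed-form expression that holds only when the number of degrees of freedom is even (here 8). Only the two rows shown in Table 11-5 are recomputed, because the observed frequencies for the other seven leading digits are not listed in this section; the full test statistic of 11.2792 is taken from the text as given.

```python
import math


def chi_square_components(observed, n, proportions):
    """Return (O - E)^2 / E for each category, with E = n * p."""
    components = []
    for o, p in zip(observed, proportions):
        e = n * p  # expected frequency E = np
        components.append((o - e) ** 2 / e)
    return components


def chi_square_right_tail(x, df):
    """Right-tail area of the chi-square distribution.

    Assumption: df is a positive even integer, which allows the closed form
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles positive even df only")
    half = x / 2.0
    total = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * total


# Rows for leading digits 1 and 2 from Table 11-5 (n = 271).
n = 271
components = chi_square_components([69, 40], n, [0.301, 0.176])

# Values quoted in Step 6; the statistic includes all nine leading digits.
test_statistic = 11.2792
critical_value = 15.507
df = 8  # k - 1 with k = 9 leading digits

p_value = chi_square_right_tail(test_statistic, df)
reject_null = p_value < 0.05  # significance level from Step 6

print("Components for digits 1 and 2:", [round(c, 4) for c in components])
print("P-value:", round(p_value, 3))
print("Reject null hypothesis:", reject_null)
```

The two components should agree with the last column of Table 11-5, and the P-value should agree with the 0.186 reported in Step 7. Evaluating chi_square_right_tail(15.507, 8) gives an area of about 0.05, which is consistent with the critical value found in Table A-4.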
