Finding Expected Frequencies

Conducting a goodness-of-fit test requires that we identify the observed frequencies, denoted by O, and then find the frequencies expected with the claimed distribution, denoted by E. There are two different approaches for finding expected frequencies E:

■ Equal Expected Frequencies: If the expected frequencies are all equal, the expected frequency for each category (or cell) is E = n/k.
■ Unequal Expected Frequencies: If the expected frequencies are not all equal, find the expected frequency for each individual category (or cell) by evaluating E = np (where n is the total sample size and p is the probability for the individual category).

As good as these two preceding formulas for E might be, it is better to use an informal approach by simply asking, “How can the observed frequencies be split up among the different categories so that there is perfect agreement with the claimed distribution?”

Note: The observed frequencies must all be whole numbers because they represent actual counts, but the expected frequencies need not be whole numbers.

Examples:

a. Equally Likely
A single die is rolled 45 times with the following results. Assuming that the die is fair and all outcomes are equally likely, find the expected frequency E for each empty cell.

Outcome                   1     2     3     4     5     6
Observed Frequency O     13     6    12     9     3     2
Expected Frequency E

With n = 45 outcomes and k = 6 categories, the expected frequency for each cell is the same: E = n/k = 45/6 = 7.5. If the die is fair and the outcomes are all equally likely, we expect that each outcome should occur about 7.5 times.

b. Not Equally Likely
Using the same results from part (a), suppose that we claim that instead of being fair, the die is loaded so that the outcome of 1 occurs 50% of the time and the other five outcomes each occur 10% of the time. The probabilities are listed in the second row below.
Using n = 45 and the probabilities listed below, we find that for the first cell, E = np = (45)(0.5) = 22.5. Each of the other five cells will have the expected value of E = np = (45)(0.1) = 4.5.

Outcome                   1     2     3     4     5     6
Probability             0.5   0.1   0.1   0.1   0.1   0.1
Observed Frequency O     13     6    12     9     3     2
Expected Frequency E   22.5   4.5   4.5   4.5   4.5   4.5

Measuring Disagreement with the Claimed Distribution

We know that sample frequencies typically differ somewhat from the values we theoretically expect, so we consider the key question: Are the differences between the actual observed frequencies O and the theoretically expected frequencies E significant? To measure the discrepancy between the O and E values, we use the test statistic given in the preceding Key Elements box. (Later we will explain how this test statistic was developed, but it has differences of O - E as a key component.)

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
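
The calculations in examples (a) and (b) and the test statistic can be checked with a short Python sketch. It is only an illustration under stated assumptions: the function names expected_equal, expected_unequal, and chi_square_statistic are hypothetical helpers invented for this example, not part of any library, and the sketch computes only the value of the test statistic, not a critical value or P-value.

```python
from math import isclose


def expected_equal(n, k):
    """Equal expected frequencies: E = n / k for each of k categories."""
    return [n / k] * k


def expected_unequal(n, probabilities):
    """Unequal expected frequencies: E = n * p for each category."""
    if not isclose(sum(probabilities), 1.0):
        raise ValueError("category probabilities should sum to 1")
    return [n * p for p in probabilities]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed die outcomes from the examples (outcomes 1 through 6).
observed = [13, 6, 12, 9, 3, 2]
n = sum(observed)  # total sample size, 45

# Example (a): fair die, so every cell has E = 45 / 6 = 7.5.
fair = expected_equal(n, len(observed))

# Example (b): loaded die with the claimed probabilities.
loaded = expected_unequal(n, [0.5, 0.1, 0.1, 0.1, 0.1, 0.1])

chi2_fair = chi_square_statistic(observed, fair)
chi2_loaded = chi_square_statistic(observed, loaded)

print("Expected (fair):  ", fair)
print("Expected (loaded):", loaded)
print("Test statistic (fair claim):  ", round(chi2_fair, 3))
print("Test statistic (loaded claim):", round(chi2_loaded, 3))
```

With the observed frequencies above, the sum of the (O - E)^2 / E terms works out to about 14.067 under the fair-die claim and 23.400 under the loaded-die claim. Whether either value indicates significant disagreement depends on the critical value or P-value, which this sketch does not compute.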
