592 CHAPTER 11 Goodness-of-Fit and Contingency Tables

Caution
■ If the P-value is greater than the significance level α, do not accept independence. Instead, conclude that there is not sufficient evidence to reject independence.
■ If the P-value is less than or equal to the significance level α, do not conclude that one of the variables is the direct cause of the other variable.

The distribution of the test statistic χ² can be approximated by the chi-square distribution, provided that all cells have expected frequencies that are at least 5. The number of degrees of freedom (r - 1)(c - 1) reflects the fact that because we know the total of all frequencies in a contingency table, we can freely assign frequencies to only r - 1 rows and c - 1 columns before the frequency for every cell is determined. However, we cannot have negative frequencies or frequencies so large that any row (or column) sum exceeds the total of the observed frequencies for that row (or column).

Observed and Expected Frequencies

The test statistic allows us to measure the amount of disagreement between the frequencies actually observed and those that we would theoretically expect when the two variables are independent. Large values of the χ² test statistic are in the rightmost region of the chi-square distribution, and they reflect significant differences between observed and expected frequencies. As in Section 11-1, if observed and expected frequencies are close, the χ² test statistic will be small and the P-value will be large. If observed and expected frequencies are far apart, the χ² test statistic will be large and the P-value will be small. These relationships are summarized and illustrated in Figure 11-4.

Compare the observed O values to the corresponding expected E values.
■ Os and Es are close: small χ² value, large P-value. Fail to reject independence.
■ Os and Es are far apart: large χ² value, small P-value. Reject independence.
“If the P is low, independence must go.”

FIGURE 11-4 Relationships Among Key Components in a Test of Independence

Finding Expected Values E

An individual expected frequency E for a cell can be found by multiplying the total of the row frequencies by the total of the column frequencies, then dividing by the grand total of all frequencies, as shown in Example 1.

E = (row total)(column total) / (grand total)
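
The expected-frequency formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 2 × 3 table of counts is made up, the function names are hypothetical, and the test statistic is assumed to take the usual form χ² = Σ (O - E)² / E, which this page does not restate. The sketch also computes the degrees of freedom (r - 1)(c - 1) and checks the requirement that every expected frequency is at least 5.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over every cell (assumed form of the statistic)."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """(r - 1)(c - 1) for a table with r rows and c columns."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Hypothetical observed counts, invented purely for illustration.
observed = [[20, 30, 50],
            [30, 20, 50]]

expected = expected_frequencies(observed)
chi2 = chi_square_statistic(observed, expected)
df = degrees_of_freedom(observed)

# Requirement for the chi-square approximation: every E is at least 5.
all_expected_at_least_5 = all(e >= 5 for row in expected for e in row)

print(expected)                          # [[25.0, 25.0, 50.0], [25.0, 25.0, 50.0]]
print(chi2, df, all_expected_at_least_5)  # 4.0 2 True
```

For the first cell of this made-up table, E = (100)(50) / 200 = 25. The observed counts 20 and 30 differ from 25 in four cells, and the third column matches its expected value exactly, so only those four cells contribute to the statistic.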