8-1 Basics of Hypothesis Testing 383

“Accept” Is Misleading
We should say that we “fail to reject the null hypothesis” instead of saying that we “accept the null hypothesis.” The term accept is misleading, because it implies incorrectly that the null hypothesis has been proved or is somehow being supported, but we can never prove a null hypothesis. The phrase fail to reject says more correctly that the available evidence isn’t strong enough to warrant rejection of the null hypothesis.

Multiple Negatives
Final conclusions can include as many as three negative terms. (Example: “There is not sufficient evidence to warrant rejection of the claim of no difference between 0.5 and the population proportion.”) For such confusing conclusions, it is better to restate them to be understandable. Instead of saying that “there is not sufficient evidence to warrant rejection of the claim of no difference between 0.5 and the population proportion,” a better statement would be this: “Until stronger evidence is obtained, continue to assume that the population proportion is equal to 0.5.” It is important to effectively communicate the correct conclusion.

CAUTION Never conclude a hypothesis test with a statement of “reject the null hypothesis” or “fail to reject the null hypothesis.” Always make sense of the conclusion with a statement that uses simple nontechnical wording that addresses the original claim.

Confidence Intervals for Hypothesis Tests
In this section we have described the individual components used in a hypothesis test, but the following sections will combine those components in comprehensive procedures. We can test claims about population parameters by using the P-value method or the critical value method summarized in Figure 8-1, or we can use confidence intervals.

A confidence interval estimate of a population parameter contains the likely values of that parameter. If a confidence interval does not include a claimed value of a population parameter, reject that claim.
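The confidence interval criterion just stated can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data (16 successes in 100 trials), the claimed proportion of 0.10, the 95% confidence level paired with a two-tailed test at the 0.05 significance level, and the function names are all hypothetical and do not come from the text. For comparison, the sketch also computes the two-tailed P-value, which uses the claimed proportion in its standard error, whereas the interval uses the sample proportion.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = x / n
    # Critical z value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def ci_rejects_claim(interval, claimed_value):
    """Criterion: reject the claim if the claimed value lies outside the interval."""
    lower, upper = interval
    return not (lower <= claimed_value <= upper)


def two_tailed_p_value(x, n, p0):
    """Two-tailed P-value from the z test statistic for a claimed proportion p0."""
    z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical data: 16 successes in 100 trials, claimed proportion 0.10.
interval = proportion_ci(16, 100, confidence=0.95)
print(interval)                             # roughly (0.088, 0.232)
print(ci_rejects_claim(interval, 0.10))     # False: 0.10 lies inside the interval
print(two_tailed_p_value(16, 100, 0.10))    # roughly 0.0455
```

With these hypothetical numbers the interval contains 0.10, so the interval criterion does not reject the claim, while the P-value falls just below 0.05. The two calculations rely on different standard errors, so for a proportion they need not reach the same conclusion.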
For two-tailed hypothesis tests, construct a confidence interval with a confidence level of 1 - α, but for a one-tailed hypothesis test with significance level α, construct a confidence interval with a confidence level of 1 - 2α. (See Table 8-1 on page 376 for common cases.) After constructing the confidence interval, use this criterion:

A confidence interval estimate of a population parameter contains the likely values of that parameter. We should therefore reject a claim that the population parameter has a value that is not included in the confidence interval.

Equivalent Methods
CAUTION In some cases, a conclusion based on a confidence interval may be different from a conclusion based on a hypothesis test.

The P-value method and critical value method are equivalent in the sense that they always lead to the same conclusion. The following table shows that for the methods included in this chapter, a confidence interval estimate of a proportion might lead to a conclusion different from that of a hypothesis test.

Is a confidence interval equivalent to a hypothesis test in the sense that they always lead to the same conclusion?

Parameter                         Equivalent?
Proportion                        No
Mean                              Yes
Standard deviation or variance    Yes

Aspirin Not Helpful for Geminis and Libras
Physician Richard Peto submitted an article to Lancet, a British medical journal. The article showed that patients had a better chance of surviving a heart attack if they were treated with aspirin within a few hours of their heart attacks. Lancet editors asked Peto to break down his results into subgroups to see if recovery worked better or worse for different groups, such as males or females. Peto believed that he was being asked to use too many subgroups, but the editors insisted.
Peto then agreed, but he supported his objections by showing that when his patients were categorized by signs of the zodiac, aspirin was useless for Gemini and Libra heart attack patients, but it was a lifesaver for those born under any other sign. This shows that when conducting multiple hypothesis tests with many different subgroups, there is a very large chance that some of the results will be wrong.
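The size of that chance can be illustrated with a short calculation. The sketch below rests on assumptions that are not in the text: each subgroup test uses a 0.05 significance level, the tests are independent, every null hypothesis is actually true, and there are 12 subgroups (one per zodiac sign). Under those assumptions, the probability of at least one wrong rejection among k tests is 1 - (1 - α)^k. The function name is hypothetical.

```python
def prob_at_least_one_false_rejection(num_tests, alpha=0.05):
    """Chance of one or more false rejections among independent tests.

    Assumes every null hypothesis is true and the tests are independent,
    so each test wrongly rejects with probability alpha.
    """
    return 1 - (1 - alpha) ** num_tests


# One subgroup test per zodiac sign (12 tests), each at the 0.05 level.
print(prob_at_least_one_false_rejection(12))   # roughly 0.46
```

Under these assumptions, a single test has a 5% chance of a wrong rejection, but with 12 subgroup tests the chance of at least one wrong rejection rises to roughly 46%.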
