5-1 Probability Distributions

TABLE 5-3 Software Piracy

Country          Proportion of Unlicensed Software
United States    0.17
China            0.70
India            0.58
Russia           0.64
Total            2.09

Notation for 0+

In tables such as Table 5-2 or the binomial probabilities listed in Table A-1 in Appendix A, we sometimes use 0+ to represent a probability value that is positive but very small, such as 0.000000123. (When rounding a probability value for inclusion in such a table, rounding to 0 would be misleading because it would incorrectly suggest that the event is impossible, so we use 0+ instead.)

Probability Histogram: Graph of a Probability Distribution

There are various ways to graph a probability distribution, but for now we will consider only the probability histogram. Figure 5-2 is a probability histogram corresponding to Table 5-2. Notice that it is similar to a relative frequency histogram (described in Section 2-2), but the vertical scale shows probabilities instead of relative frequencies based on actual sample results.

FIGURE 5-2 Probability Histogram for Number of Females in Two Births

Looking Ahead

In Figure 5-2, we see that the values of 0, 1, and 2 along the horizontal axis are located at the centers of the rectangles. This implies that the rectangles are each 1 unit wide, so the areas of the rectangles are 0.25, 0.50, and 0.25. The areas of these rectangles are the same as the probabilities in Table 5-2. We will see in Chapter 6 and future chapters that such a correspondence between areas and probabilities is very useful.

Probability Formula

Example 1 involves a table, but a probability distribution could also be in the form of a formula. Consider the formula

$$P(x) = \frac{1}{2(2 - x)!\,x!}$$

(where x can be 0, 1, or 2). Using that formula, we find that P(0) = 0.25, P(1) = 0.50, and P(2) = 0.25. The probabilities found using this formula are the same as those in Table 5-2. This formula does describe a probability distribution because the three requirements are satisfied, as shown in Example 1.
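The formula can also be checked numerically. The following is a minimal Python sketch, not part of the text itself: the names p, values, and probabilities are chosen here for illustration, and the third requirement is assumed to be that every probability lies between 0 and 1 inclusive.

```python
from math import factorial, isclose


def p(x: int) -> float:
    """Evaluate P(x) = 1 / (2 (2 - x)! x!) for x = 0, 1, or 2."""
    return 1 / (2 * factorial(2 - x) * factorial(x))


# The values the random variable x can take (number of females in two births).
values = [0, 1, 2]
probabilities = {x: p(x) for x in values}

# Requirement 1: x is a numerical random variable.
is_numeric = all(isinstance(x, (int, float)) for x in values)

# Requirement 2: the probabilities sum to 1 (allowing for rounding error).
sums_to_one = isclose(sum(probabilities.values()), 1.0)

# Requirement 3 (assumed wording): each probability is between 0 and 1.
in_range = all(0 <= prob <= 1 for prob in probabilities.values())

print(probabilities)
print(is_numeric, sums_to_one, in_range)
```

The printed probabilities agree with the values P(0) = 0.25, P(1) = 0.50, and P(2) = 0.25 found above, and all three checks pass.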
EXAMPLE 2 Software Piracy

Table 5-3 lists countries along with the proportion of unlicensed software in each country (based on data from Business Software Alliance). Does Table 5-3 describe a probability distribution?

SOLUTION

Table 5-3 violates the first requirement because x is not a numerical random variable. Instead, the “values” of x are categorical data, not numbers. Table 5-3 also violates the second requirement because the sum of the “probabilities” is 2.09, but that sum should be 1. Because the three requirements are not all satisfied, we conclude that Table 5-3 does not describe a probability distribution.

YOUR TURN. Do Exercise 11 “Cell Phone Use.”

Not at Home

Pollsters cannot simply ignore those who were not at home when they were called the first time. One solution is to make repeated callback attempts until the person can be reached. Alfred Politz and Willard Simmons describe a way to compensate for those missed calls without making repeated callbacks. They suggest weighting results based on how often people are not at home. For example, a person at home only two days out of six will have a 2/6 or 1/3 probability of being at home when called the first time. When such a person is reached the first time, his or her results are weighted to count three times as much as someone who is always home. This weighting is a compensation for the other similar people who are home two days out of six and were not at home when called the first time. This clever solution was first presented in 1949.
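The weighting idea can be illustrated with a short Python sketch. This is only an illustration of the arithmetic described above, not the published Politz and Simmons procedure: the function name at_home_weight and the respondent data are hypothetical, and the weight is assumed to be the reciprocal of the probability of being at home on the first call.

```python
from fractions import Fraction


def at_home_weight(days_home: int, days_asked: int = 6) -> Fraction:
    """Return 1 / P(at home on the first call) as an exact fraction."""
    if not 0 < days_home <= days_asked:
        raise ValueError("days_home must be between 1 and days_asked")
    return Fraction(days_asked, days_home)


# Hypothetical respondents reached on the first call:
# (days at home out of six, answer coded 1 = yes, 0 = no).
respondents = [(6, 1), (6, 0), (3, 1), (2, 0)]

# Someone home two days out of six gets weight 3; someone always home gets 1.
weights = [at_home_weight(days) for days, _ in respondents]

# Weighted proportion of "yes" answers.
weighted_yes = sum(w * answer for w, (_, answer) in zip(weights, respondents))
weighted_proportion = weighted_yes / sum(weights)

print(weights)
print(weighted_proportion)
```

With these made-up respondents, the person at home two days out of six counts three times as much as each person who is always home, matching the example in the passage.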