2-3 Graphs That Enlighten and Graphs That Deceive

Jitter
Some dotplots and other graphs don’t work well when many data values are the same. One way to improve these graphs is to add jitter, which randomly nudges points so that they don’t overlap. Instead of identical data values being plotted on top of each other, adding jitter allows us to see them as separate points that are close together.

Stemplots
A stemplot (or stem-and-leaf plot) represents quantitative data by separating each value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit). Better stemplots are often obtained by first rounding the original data values. Also, stemplots can be expanded to include more rows and can be condensed to include fewer rows, as in Exercise 21.

Features of a Stemplot
■ Shows the shape of the distribution of the data.
■ Retains the original data values.
■ The sample data are sorted (arranged in order).

EXAMPLE 2 Stemplot of Male Pulse Rates
The following stemplot displays the pulse rates of the males in Data Set 1 “Body Data” in Appendix B. (The fictional outlier of 10 beats per minute included in Example 1 is not included in this example.) The lowest pulse rate of 40 is separated into the stem of 4 and the leaf of 0. The stems and leaves are arranged in increasing order, not the order in which they occur in the original list. If you turn the stemplot on its side, you can see the distribution of the pulse rates in the same way you would see it in a histogram or dotplot. Note also that in the stemplot, the data are now sorted (arranged in order). It is easy to see that the lowest value is 40 beats per minute and the highest value is 104 beats per minute. It is also easy to see that the middle value of the sorted data is around 68 beats per minute.

[Stemplot figure not reproduced. Its callouts read: “Pulse rates are 40 and 42” and “Pulse rates are 90, 92, 94, 96, 96”.]

YOUR TURN.
Do Exercise 7 “Pulse Rates.”

Time-Series Graph
A time-series graph is a graph of time-series data, which are quantitative data that have been collected at different points in time, such as monthly or yearly.

Feature of a Time-Series Graph
■ Reveals information about trends over time

The Texas Sharpshooter Fallacy
The Texas Sharpshooter fallacy got its name from someone who supposedly shot randomly at the side of a barn and then painted a bullseye around the bullet holes that appeared to be close together. He then claimed to be a sharpshooter based on the number of holes within the bullseye. This fallacy arises in statistics when a large amount of data is available, but only a small, selectively chosen collection of the data is used. When we hear that “baseball player Aaron Judge got a hit in 7 of his last 14 at-bats,” we should know that this cluster has been carefully selected to include disproportionately many hits, as can be seen by the choice of 14 as the number of at-bats to include. Disease clusters can be misleading in the same way when a boundary is drawn around the disease occurrences, similar to the Texas “sharpshooter” drawing a bullseye around a cluster of random gunshots.
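Returning to the jitter and stemplot descriptions in this section, the two ideas can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`stemplot`, `format_stemplot`, `add_jitter`), the jitter width of 0.3, and the sample values are all hypothetical. The sample is made up for the demonstration and is not Data Set 1. The stemplot sketch also assumes non-negative whole-number data, with the ones digit as the leaf.

```python
import random
from collections import defaultdict


def stemplot(values):
    """Return (stem, leaves) rows, using the ones digit of each value as the leaf.

    Assumes a non-empty list of non-negative whole numbers.
    """
    rows = defaultdict(list)
    for v in sorted(values):          # sorting puts the leaves in increasing order
        stem, leaf = divmod(v, 10)    # e.g. 40 -> stem 4, leaf 0
        rows[stem].append(leaf)
    # Include stems with no leaves so gaps in the distribution stay visible.
    return [(s, rows.get(s, [])) for s in range(min(rows), max(rows) + 1)]


def format_stemplot(values):
    """Render the stemplot as text, one row per stem."""
    lines = []
    for stem, leaves in stemplot(values):
        leaf_text = "".join(str(leaf) for leaf in leaves)
        lines.append(f"{stem:>2} | {leaf_text}")
    return "\n".join(lines)


def add_jitter(values, width=0.3, seed=None):
    """Nudge each value by a random amount in [-width, width].

    The width of 0.3 is an arbitrary choice for illustration.
    """
    rng = random.Random(seed)
    return [v + rng.uniform(-width, width) for v in values]


if __name__ == "__main__":
    # Hypothetical pulse rates, invented for this sketch (not Data Set 1).
    sample = [40, 42, 56, 68, 68, 72, 90, 92, 94, 96, 96, 104]
    print(format_stemplot(sample))
    # Identical values become distinct, nearby plotting positions.
    print(add_jitter([68, 68, 68], seed=1))
```

Jitter is meant only to change where points are drawn; the jittered numbers should not replace the original data values in any calculation.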
