CHAPTER 3 Describing, Exploring, and Comparing Data

Outliers

When analyzing data, it is important to identify and consider outliers because they can strongly affect values of some important statistics (such as the mean and standard deviation), and they can also strongly affect important methods discussed later in this book. In Chapter 2 we described outliers as sample values that lie very far away from the vast majority of the other values in a set of data, but that description is vague and it does not provide specific objective criteria. Part 2 of this section includes a description of modified boxplots along with a more precise definition of outliers used in the context of creating modified boxplots.

CAUTION When analyzing data, always identify outliers and consider their effects, which can be substantial.

PART 2 Outliers and Modified Boxplots

We noted that the description of outliers is somewhat vague, but for the purposes of constructing modified boxplots, we can consider outliers to be data values meeting specific criteria based on quartiles and the interquartile range. (The interquartile range is often denoted by IQR, where IQR = Q3 - Q1.)

Identifying Outliers for Modified Boxplots

1. Find the quartiles Q1, Q2, and Q3.
2. Find the interquartile range (IQR), where IQR = Q3 - Q1.
3. Evaluate 1.5 * IQR.
4. In a modified boxplot, a data value is an outlier if it is above Q3 by an amount greater than 1.5 * IQR, or below Q1 by an amount greater than 1.5 * IQR.

Modified Boxplots

The boxplots described earlier are called skeletal (or regular) boxplots, but some statistical software packages provide modified boxplots, which represent outliers as special points. A modified boxplot is a regular boxplot constructed with these modifications:

1. A special symbol (such as an asterisk or point) is used to identify outliers as defined above.
2. The solid horizontal line extends only as far as the minimum data value that is not an outlier and the maximum data value that is not an outlier.

(Note: Exercises involving modified boxplots are found in the “Beyond the Basics” exercises only.)
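
The four identification steps and the rule for where the solid line ends can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure of any particular software package. The function name `modified_boxplot_summary`, the names `lower_limit` and `upper_limit`, and the sample data are hypothetical. The quartiles are computed with `statistics.quantiles` using its "inclusive" method, which is an assumption: quartile conventions differ, so values of Q1 and Q3 found by the procedure in this book may differ slightly for some data sets.

```python
import statistics


def modified_boxplot_summary(values):
    """Apply the 1.5 * IQR criterion to a list of numbers.

    Assumption: quartiles come from statistics.quantiles with the
    "inclusive" method; other quartile conventions can give
    slightly different results.
    """
    data = sorted(values)

    # Step 1: find the quartiles Q1, Q2, and Q3.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

    # Step 2: find the interquartile range.
    iqr = q3 - q1

    # Step 3: evaluate 1.5 * IQR.
    step = 1.5 * iqr

    # Step 4: a value is an outlier if it is above Q3 by more than
    # 1.5 * IQR, or below Q1 by more than 1.5 * IQR.
    lower_limit = q1 - step  # hypothetical name
    upper_limit = q3 + step  # hypothetical name
    outliers = [v for v in data if v < lower_limit or v > upper_limit]

    # The solid line of a modified boxplot ends at the smallest and
    # largest data values that are not outliers.
    non_outliers = [v for v in data if lower_limit <= v <= upper_limit]

    return {
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "iqr": iqr,
        "outliers": outliers,
        "line_min": min(non_outliers),
        "line_max": max(non_outliers),
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [2, 4, 5, 6, 7, 8, 9, 10, 40]
    print(modified_boxplot_summary(sample))
```

For this made-up sample, the sketch gives Q1 = 5 and Q3 = 9, so IQR = 4 and 1.5 * IQR = 6. The value 40 is more than 6 above Q3, so it would be plotted as a special point, and the solid line would extend from 2 to 10.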