20 CHAPTER 1 Introduction to Statistics

EXAMPLE 8 Distinguishing Between the Ratio Level and Interval Level

For each of the following, determine whether the data are at the ratio level of measurement or the interval level of measurement:
a. Times (minutes) it takes students to complete a statistics test.
b. Body temperatures (Celsius) of statistics students.

SOLUTION
a. Apply the “ratio test” described in the preceding hint. If one student completes the test in 40 minutes and another student completes the test in 20 minutes, does it make sense to say that the first student used twice as much time? Yes! So the times are at the ratio level of measurement. We could also apply the “true zero” test. A time of 0 minutes does represent “no time,” so the value of 0 is a true zero indicating that no time was used.
b. Apply the “ratio test” described in the preceding hint. If one student has a body temperature of 40°C and another student has a body temperature of 20°C, does it make sense to say that the first student is twice as hot as the second student? (Ignore subjective amounts of attractiveness and consider only science.) No! So the body temperatures are not at the ratio level of measurement. Because the difference between 40°C and 20°C is the same as the difference between 90°C and 70°C, the differences are meaningful, but because ratios do not make sense, the body temperatures are at the interval level of measurement. Also, the temperature of 0°C does not represent “no heat” so the value of 0 is not a true zero indicating that no heat is present.

YOUR TURN. Do Exercise 28 “Body Temperatures.”

PART 2 Big Data and Missing Data: Too Much and Not Enough

When working with data, we might encounter some data sets that are ginormous, and we might also encounter some data sets with individual elements missing. Here in Part 2 we briefly discuss both cases.

Big Data

UPS delivers 20 million packages every day.
UPS analyzes massive amounts of data in order to optimize routes and plan maintenance for its truck and aircraft fleets. Data analysis and optimization efforts to date have enabled UPS to save 40 million gallons of fuel and shorten travel distances by 370 million miles. The need to analyze large data sets has led to the birth of data science. There is not universal agreement on the following definitions, and various other definitions can be easily found elsewhere.

DEFINITIONS
Big data refers to data sets so large and so complex that their analysis is beyond the capabilities of traditional software tools. Analysis of big data may require software simultaneously running in parallel on many different computers.
Data science involves applications of statistics, computer science, and software engineering, along with some other relevant fields (such as sociology or finance).

Big Data Instead of a Clinical Trial

Nicholas Tatonetti of Columbia University searched Food and Drug Administration databases for adverse reactions in patients that resulted from different pairings of drugs. He discovered that the Paxil (paroxetine) drug for depression and the pravastatin drug for high cholesterol interacted to create increases in glucose (blood sugar) levels. When taken separately by patients, neither drug raised glucose levels, but the increase in glucose levels occurred when the two drugs were taken together. This finding resulted from a general database search of interactions from many pairings of drugs, not from a clinical trial involving patients using Paxil and pravastatin.
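
The definition of big data given earlier notes that analysis may require software running in parallel on many different computers. As a rough illustration only, the Python sketch below splits a data set into chunks, summarizes each chunk separately, and then combines the summaries into a single mean. The function names, the default chunk count, and the use of threads on one machine as a stand-in for separate computers are all assumptions made for this sketch; they do not come from the text or from any particular big data tool.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_and_count(chunk):
    """Summarize one chunk of data as a (sum, count) pair."""
    return sum(chunk), len(chunk)


def parallel_mean(values, n_chunks=4):
    """Compute a mean by combining per-chunk summaries.

    Hypothetical helper: threads here stand in for the separate
    computers a real big data system might use.
    """
    if not values:
        raise ValueError("values must not be empty")

    # Ceiling division so every value lands in some chunk.
    size = -(-len(values) // n_chunks)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]

    # Each worker summarizes its own chunk independently.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        summaries = list(pool.map(partial_sum_and_count, chunks))

    # Only the small summaries are combined, not the raw data.
    total = sum(s for s, _ in summaries)
    count = sum(c for _, c in summaries)
    return total / count
```

For example, parallel_mean(list(range(1, 101))) gives 50.5, the same as the ordinary mean of the values 1 through 100. The point of the design is that only the per-chunk sums and counts need to be brought together at the end.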