CHAPTER 10 Big (or Very Large) Data Project 573

5. Wireless Earbuds In a survey of n = 2016 adults, the respondents were asked if they had wireless earbuds and 30% of them said “yes” (based on data from the Consumer Technology Association).

6. Height of Stephen Curry Heights of adult males are normally distributed with a mean of 174.12 cm and a standard deviation of 7.10 cm (based on Data Set 1 “Body Data” in Appendix B). Professional basketball player Stephen Curry has a height of 191 cm.

7. Cans of Motor Oil Automobile motor oil is commonly sold in plastic containers labelled as containing 16 ounces. A new motor oil filling device is being tested and listed below are the amounts (ounces) that the device poured in a sample of 16 containers.

15.7 19.2 16.0 17.8 15.4 18.4 17.7 16.5 17.3 12.6 16.1 19.0 14.2 15.1

8. Drug Screening The company Drug Test Success provides a “1-Panel-THC” test for marijuana usage. Among 300 tested subjects, results from 27 subjects were wrong (either a false positive or a false negative).

Technology Project

Queues Data Set 30 “Queues” in Appendix B includes waiting times (seconds) from drivers in two lines (or queues) at a Delaware Department of Motor Vehicle emissions testing facility. Two-line wait times are actual times cars spent waiting in a line. Single-line wait times are modeled by assuming that there was a single line feeding the two service bays. Interarrival times are times (sec) since the previous car entered a line. Service times (sec) are times starting with a car entering a service bay and ending when it leaves the bay.

a. Using the actual two-line wait times and the modeled single-line wait times, generate a scatterplot and test for a correlation.

b. Using the independent variable of the two-line wait times and the dependent variable of the single-line wait times, find the equation of the regression line. How well does the regression line fit the data?

c. Using the independent variables of the interarrival times and the two-line wait times, and using the dependent variable of the single-line wait times, find the best multiple regression equation. How well does the multiple regression equation fit the sample data?

d. Is the multiple regression equation from Part (c) better than the regression equation found in Part (b)? Why or why not?

Big (or Very Large) Data Project

Refer to Data Set 45 “Births in New York” in Appendix B, which contains records from 465,506 births.

a. Test for a correlation between birth weight and length of stay.

b. Find the equation of the regression line using length of stay as the independent x variable and using birth weight as the dependent y variable.

c. Can the regression equation from Part (b) be used to predict birth weight based on length of stay? Why or why not?