74 CHAPTER 3 Section 3.1 Solutions 3 ... - Montgomery...


Page 1: 74 CHAPTER 3 Section 3.1 Solutions 3 ... - Montgomery …faculty.montgomerycollege.edu/maronne/Math117A-LOCK-STATISTICS... · 74 CHAPTER 3 Section 3.1 ... 3.2 This correlation is

74 CHAPTER 3

Section 3.1 Solutions

3.1 This mean is a population parameter; notation is µ.

3.2 This correlation is a population parameter; notation is ρ.

3.3 This proportion is a sample statistic; notation is p̂.

3.4 This proportion is a population parameter; notation is p.

3.5 This mean is a sample statistic; notation is x̄.

3.6 This is a population parameter for a proportion, so the correct notation is p. We have p = 170,000/78,000,000 ≈ 0.00218.

3.7 This is a population parameter for a mean, so the correct notation is µ. We have µ = 30,795/95 = 324.2 students as the average enrollment per charter school.

3.8 This is a sample statistic for a proportion, so the correct notation is p̂. We have p̂ = 0.82.

3.9 This is a sample statistic from a sample of size n = 200 for a correlation, so the correct notation is r. We have r = 0.037.

3.10 This is a sample statistic for a mean, so the correct notation is x̄. We have x̄ = 13.10 phone calls a day.

3.11 This is a population parameter for a correlation, so the correct notation is ρ. We use technology to see that ρ = −0.131.

3.12 We expect the sampling distribution to be centered at the value of the population proportion, so we estimate that the population parameter is p = 0.30. The standard error is the standard deviation of the distribution of sample proportions. The middle 95% of the distribution goes from about 0.16 to 0.44, about 0.14 on either side of p = 0.30. By the 95% rule, we estimate that SE ≈ 0.14/2 = 0.07. (Answers may vary slightly.)

3.13 We expect the sampling distribution to be centered at the value of the population mean, so we estimate that the population parameter is µ = 85. The standard error is the standard deviation of the distribution of sample means. The middle 95% of the distribution goes from about 45 to 125, about 40 on either side of µ = 85. By the 95% rule, we estimate that SE ≈ 40/2 = 20. (Answers may vary slightly.)

3.14 We expect the sampling distribution to be centered at the value of the population mean, so we estimate that the population parameter is µ = 300. The standard error is the standard deviation of the distribution of sample means. The middle 95% of the distribution goes from about 290 to 310, about 10 on either side of µ = 300. By the 95% rule, we estimate that SE ≈ 10/2 = 5. (Answers may vary slightly.)

3.15 We expect the sampling distribution to be centered at the value of the population proportion, so we estimate that the population parameter is p = 0.80. The standard error is the standard deviation of the distribution of sample proportions. The middle 95% of the distribution goes from about 0.74 to 0.86, about 0.06 on either side of p = 0.80. By the 95% rule, we estimate that SE ≈ 0.06/2 = 0.03. (Answers may vary slightly.)
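The 95%-rule arithmetic used in 3.12 through 3.15 can be sketched in Python. This is only an illustration: the function name `estimate_center_and_se` is an assumption, and the endpoints are the approximate values read from the dotplots in the solutions above.

```python
def estimate_center_and_se(low, high):
    """Estimate the center and standard error from the middle-95% range."""
    center = (low + high) / 2
    margin = (high - low) / 2  # distance from the center to either endpoint
    se = margin / 2            # 95% rule: the margin is about 2 standard errors
    return center, se


# Approximate middle-95% ranges quoted in solutions 3.12 to 3.15
ranges = [("3.12", 0.16, 0.44), ("3.13", 45, 125),
          ("3.14", 290, 310), ("3.15", 0.74, 0.86)]

for label, low, high in ranges:
    center, se = estimate_center_and_se(low, high)
    print(f"{label}: center = {center:.2f}, SE = {se:.2f}")
```

Because the endpoints are read off a plot by eye, the resulting SE is a rough estimate, which is why the solutions note that answers may vary slightly.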


3.16 (a) We see in the sampling distribution that a sample proportion of p̂ = 0.1 is rare for a sample of this size but similar sample proportions occurred several times in this sampling distribution. This value is (ii): unusual but might occur occasionally.

(b) We see in the sampling distribution that a sample proportion of p̂ = 0.35 is not at all unusual with samples of this size, so this value is (i): reasonably likely to occur.

(c) We see in the sampling distribution that there are no sample proportions even close to p̂ = 0.6, so this sample proportion is (iii): extremely unlikely to ever occur using samples of this size.

3.17 (a) We see in the sampling distribution that a sample mean of x̄ = 70 is not unusual for samples of this size, so this value is (i): reasonably likely to occur.

(b) We see in the sampling distribution that a sample mean of x̄ = 100 is not unusual for samples of this size, so this value is (i): reasonably likely to occur.

(c) We see in the sampling distribution that a sample mean of x̄ = 140 is rare for a sample of this size but similar sample means occurred several times in this sampling distribution. This value is (ii): unusual but might occur occasionally.

3.18 (a) We see in the sampling distribution that there are no sample means even close to x̄ = 250, so this sample mean is (iii): extremely unlikely to ever occur using samples of this size.

(b) We see in the sampling distribution that a sample mean of x̄ = 305 is not unusual for samples of this size, so this value is (i): reasonably likely to occur.

(c) We see in the sampling distribution that a sample mean of x̄ = 315 is rare for a sample of this size but similar sample means occurred several times in this sampling distribution. This value is (ii): unusual but might occur occasionally.

3.19 (a) We see in the sampling distribution that a sample proportion of p̂ = 0.72 is rare for a sample of this size but similar sample proportions occurred several times in this sampling distribution. This value is (ii): unusual but might occur occasionally.

(b) We see in the sampling distribution that a sample proportion of p̂ = 0.88 is rare for a sample of this size but similar sample proportions occurred several times in this sampling distribution. This value is (ii): unusual but might occur occasionally.

(c) We see in the sampling distribution that there are no sample proportions even close to p̂ = 0.95, so this sample proportion is (iii): extremely unlikely to ever occur using samples of this size.

3.20 The population is all internet users in the US. The population parameter of interest is p, the proportion of internet users who have customized their home page. For this sample, p̂ = 469/1675 = 0.28. Unless we have additional information, the best point estimate of the population parameter p is p̂ = 0.28. To find p exactly, we would have to obtain information about the home page of every internet user in the US, which is unrealistic.

3.21 We are estimating p, the proportion of all US adults who own a laptop computer. The quantity that gives the best estimate is p̂, the proportion of our sample who own a laptop computer. The best estimate is p̂ = 1238/2252 = 0.55. Since the true proportion is unknown, our best estimate for the proportion comes from our sample. We estimate that 55% of all US adults own a laptop computer.

3.22 (a) We are estimating ρ, the correlation between pH and mercury levels of fish for all the lakes in Florida. The quantity that gives the best estimate is our sample correlation r = −0.575. We estimate that the correlation between pH levels and levels of mercury in fish in all Florida lakes is −0.575.


(b) We use an estimate because it would be very difficult and costly to find the exact population correlation. We would need to measure the pH level and the mercury in fish level for all the lakes in Florida, and there are over 7700 of them.

3.23 (a) The value 30 is a population parameter and the notation is µ = 30. The value 27.90 is a sample statistic and the notation is x̄ = 27.90.

(b) The distribution will be bell-shaped and the center will be at the population mean of 30. The sample mean 27.90 would represent one point on the dotplot.

(c) The dotplot will have 1000 dots and each dot will represent the mean for a sample of 75 co-payments.

3.24 (a) The two distributions centered at the population average, distributions A and D, are probably unbiased. The two distributions not centered at the population average (µ = 2.61), dotplots B and C, are biased. The sampling for Distribution B gives an average too high, and has large households over-represented. The sampling for Distribution C gives an average too low and may have been done in an area with many people living alone.

(b) The larger the sample size the lower the variability, so distribution A goes with samples of size 100, and distribution D goes with samples of size 500.

3.25 (a) As the sample size goes up, the accuracy improves, which means the spread goes down. We see that distribution A goes with sample size n = 20, distribution B goes with n = 100, and distribution C goes with n = 500.

(b) We see in dotplot A that quite a few of the sample proportions (when n = 20) are less than 0.25 or greater than 0.45, so being off by more than 0.10 would not be too surprising. While it is possible to be that far away in dotplot B (when n = 100), such points are much rarer, so it would be somewhat surprising for a sample of size n = 100 to miss by that much. None of the points in dotplot C are more than 0.10 away from p = 0.35, so it would be extremely unlikely to be that far off when n = 500.

(c) Many of the points in dotplot A fall outside of the interval from 0.30 to 0.40, so it is not at all surprising for a sample proportion based on n = 20 to be more than 0.05 from the population proportion. Even dotplot B has quite a few values below 0.30 or above 0.40, so being off by more than 0.05 when n = 100 is not too surprising. Such points are rare, but not impossible, in dotplot C, so a sample of size n = 500 might possibly give an estimate that is off by more than 0.05, but it would be pretty surprising.

(d) As the sample size goes up, the accuracy of the estimate tends to increase.

3.26 The quantity we are trying to estimate is µm − µo, where µm represents the average grade for all fourth-grade students who study mixed problems and µo represents the average grade for all fourth-grade students who study problems one type at a time. The quantity that gives the best estimate is x̄m − x̄o, where x̄m represents the average grade for the fourth-grade students in the sample who studied mixed problems and x̄o represents the average grade for the fourth-grade students in the sample who studied problems one type at a time. The best estimate for the difference in the average grade based on study method is x̄m − x̄o = 77 − 38 = 39.

3.27 The quantity we are trying to estimate is pa − pt, where pa represents the proportion of adult cell phone users who text message and pt represents the proportion of teen cell phone users who text message. The quantity that gives the best estimate is p̂a − p̂t, where p̂a represents the proportion of the adult cell phone users in the sample of 2,252 who text message and p̂t represents the proportion of teen cell phone users in the sample of 800 who text message. The best estimate for the difference in the proportion who text is p̂a − p̂t = 0.72 − 0.87 = −0.15.


3.28 (a) We expect means of samples of size 30 to be much less spread out than values of budgets of individual movies. This leads us to conclude that Boxplot A represents the sampling distribution and Boxplot B represents the values in a single sample. We can also consider the shapes. Boxplot A appears to be symmetric and Boxplot B appears to be right skewed. Since we expect a sampling distribution to be symmetric and bell-shaped, Boxplot A is the sampling distribution and the skewed Boxplot B shows values in a single sample.

(b) Boxplot B shows the data from one sample of size 30. Each data value represents the budget, in millions of dollars, for one Hollywood movie made in 2011. There are 30 values included in the sample. The budgets range from about 1 million to 145 million for this sample. We see in the boxplot that the median is about 30 million dollars. Since the data are right skewed, we expect the mean to be higher. We estimate the mean to be about 40 million or 45 million. This is the mean of a sample, so we have x̄ ≈ 45 million dollars. (Answers may vary.)

(c) Boxplot A shows the data from a sampling distribution using samples of size 30. Each data value represents the mean of one of these samples. There are 1000 means included in the distribution. They range from about 27 to 79 million dollars. The center of the distribution is a good estimate of the population parameter, and the center appears to be about µ ≈ 53 million dollars, where µ represents the mean budget, in millions of dollars, for all movies coming out of Hollywood in 2011. (Answers may vary.)

3.29 (a) Both distributions are centered at the population parameter, p = 0.05.

(b) The proportions for samples of size n = 100 go from about 0 to 0.12. The proportions for samples of size n = 1000 go from about 0.025 to 0.07.

(c) The standard error for samples of size n = 100 is about 0.02 (since it appears that about 95% of the data are between 0.01 and 0.09). The standard error for samples of size n = 1000 is about 0.005 (since it appears that about 95% of the data are between 0.04 and 0.06).

(d) A sample proportion of 0.08 is relatively likely from a sample of 100, but extremely unlikely with a sample size of 1,000.

3.30 (a) It is not unlikely to get a sample mean more than 2 screws on either side of 50. It is, however, very unlikely to see a mean below 45 or above 55, so it is unlikely for the sample mean to be more than 5 or 10 screws away.

(b) The distribution shows that finding a mean number of screws equal to 42 from a sample of 10 boxes is very unlikely if the company’s claim is accurate, so, yes, it would be reasonable to conclude that the company’s claim is likely to be incorrect.

(c) The sampling distribution shows us that a mean of 42 screws is very unlikely, but this does not imply that one box containing 42 screws is very unlikely. So a box of 42 screws does not give us information one way or another about the company’s claim.

3.31 (a) Answers will vary. Here is one possible set of randomly selected Points values. Points: 26, 18, 3, 16, 57; x̄ = 24.0

(b) Answers will vary. Here is another possible set of randomly selected Points values. Points: 48, 34, 13, 18, 26; x̄ = 27.8

(c) The mean number of points for all 24 players is µ = 26.46 points for the season. Most sample means found in parts (a) and (b) will be somewhat close to this but not exactly the same.

(d) The distribution will be roughly symmetric with a peak at the center of 26.46. See the figure.
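As a quick check of the arithmetic in 3.31, the two example samples can be averaged in Python and compared with the population mean. This is a minimal sketch; the helper name `sample_mean` is an assumption made for illustration, and the data are the example Points values quoted in parts (a) and (b).

```python
def sample_mean(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


MU = 26.46  # population mean for all 24 players, from part (c)

sample_a = [26, 18, 3, 16, 57]   # example sample from part (a)
sample_b = [48, 34, 13, 18, 26]  # example sample from part (b)

for name, sample in [("(a)", sample_a), ("(b)", sample_b)]:
    x_bar = sample_mean(sample)
    print(f"{name}: x-bar = {x_bar:.1f}, differs from mu by {x_bar - MU:+.2f}")
```

Each sample mean is one dot in the sampling distribution described in part (d); repeating the calculation for many random samples would fill in the rest of the dotplot.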


3.32 (a) Answers will vary. Here is one sample. Minutes: 140.42, 151.72, 127.27, 141.85, 144.32, 161.13, 140.38, 138.25, 137.70, 149.47; x̄ = 143.3

(b) Answers will vary. Here is another sample. Minutes: 145.05, 135.00, 140.42, 159.02, 161.13, 146.92, 137.93, 137.83, 143.78, 143.15; x̄ = 145.0

(c) The mean of all times of the 76 finishers is µ = 141.1 minutes, or about 2 hours 21 minutes. The sample means found in parts (a) and (b) were probably close to this but not exactly the same.

(d) The distribution will be roughly symmetric with a peak at the center of 141.1. See the figure.

3.33 Answers will vary, but a typical distribution is shown below. The smallest mean is just below 10 and the largest is just below 50 (but answers will vary). The standard deviation of these 1000 sample means is about 7.2.


3.34 Answers will vary, but a typical distribution is shown below. The smallest mean is about 134 minutes and the largest is about 148 minutes. The standard deviation of these sample means is about 2.2.

3.35 (a) This is a population proportion so the correct notation is p. We have p = 41/273 = 0.150.

(b) We expect it to be symmetric and bell-shaped and centered at the population proportion of 0.150.

3.36 (a) This is a population proportion so the correct notation is p. We have p = 181/273 = 0.663.

(b) We expect it to be symmetric and bell-shaped and centered at the population proportion of 0.663.

3.37 (a) The standard error is the standard deviation of the sampling distribution (given in the upper right corner of the sampling distribution box of StatKey) and is likely to be about 0.11. Answers will vary, but the sample proportions should go from 0 to about 0.5 (as in the dotplot below). In that case, the farthest sample proportion from p = 0.15 is p̂ ≈ 0.5, and it is 0.5 − 0.15 = 0.35 off from the correct population value. In other simulations the maximum proportion might be as high as 0.6 or even 0.7.

(b) The standard error is the standard deviation of the sampling distribution and is likely to be about 0.08. Answers will vary, but the sample proportions should go from 0 to about 0.4 (as shown in the dotplot below). In that case, the farthest sample proportion from p = 0.15 is p̂ ≈ 0.4, and it is 0.4 − 0.15 = 0.25 off from the correct population value. Some simulations might produce even larger discrepancies.

(c) The standard error is the standard deviation of the sampling distribution and is likely to be about 0.05. Answers will vary, but the sample proportions should go from near 0 to about 0.3 (as shown in the dotplot below). In that case, the farthest sample proportion from p = 0.15 is p̂ ≈ 0.3, and it is 0.3 − 0.15 = 0.15 off from the correct population value. Some simulations might have even larger discrepancies.

(d) Accuracy improves as the sample size increases. The standard error gets smaller, the range of values gets smaller, and values tend to be closer to the population value of p = 0.150.
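The pattern in 3.37 (a smaller standard error as the sample size grows) can be reproduced with a small simulation. The sketch below is not StatKey and makes several assumptions that are not in the solutions: the function name `simulate_standard_error`, the sample sizes 10, 20, and 50, the 2000 simulated samples, and the simplification of drawing each observation independently with probability p rather than sampling from the actual dataset.

```python
import random
from statistics import stdev


def simulate_standard_error(p, n, num_samples=2000, seed=0):
    """Simulate sample proportions and return their standard deviation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        # Count how many of the n simulated observations are "successes"
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return stdev(proportions)


P = 0.15  # population proportion from 3.35(a)

for n in (10, 20, 50):  # assumed sample sizes, chosen only for illustration
    se = simulate_standard_error(P, n)
    print(f"n = {n:2d}: simulated SE = {se:.3f}")
```

As with StatKey, the exact values change from one simulation to the next, but the standard error should shrink steadily as n increases.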


3.38 (a) The standard error is the standard deviation of the sampling distribution (given in the upper right corner of the sampling distribution box in StatKey) and is likely to be about 0.15. Answers will vary, but the sample proportions should go from about 0.2 to about 1.0 (as shown in the dotplot below). In that case, the farthest sample proportion from p = 0.663 is p̂ = 0.2, and it is 0.663 − 0.2 = 0.463 off from the correct population value.

(b) The standard error is the standard deviation of the sampling distribution and is likely to be about 0.11. Answers will vary, but the sample proportions should go from about 0.35 to about 0.95 (as shown in the dotplot below). In that case, the farthest sample proportion from p = 0.663 is p̂ = 0.35, and it is 0.663 − 0.35 = 0.313 off from the correct population value.

(c) The standard error is the standard deviation of the sampling distribution and is likely to be about 0.06. Answers will vary, but the sample proportions should go from about 0.44 to about 0.84 (as shown in the dotplot below). In that case, the farthest sample proportion from p = 0.663 is p̂ = 0.44, and it is 0.663 − 0.44 = 0.223 off from the correct population value.

(d) Accuracy improves as the sample size increases. The standard error gets smaller, the range of values gets smaller, and values tend to be closer to the population value of 0.663.


Section 3.2 Solutions

3.39 Using ME to represent the margin of error, an interval estimate for µ is x̄ ± ME = 25 ± 3, so an interval estimate of plausible values for the population mean µ is 22 to 28.

3.40 Using ME to represent the margin of error, an interval estimate for p is p̂ ± ME = 0.37 ± 0.02, so an interval estimate of plausible values for the population proportion p is 0.35 to 0.39.

3.41 Using ME to represent the margin of error, an interval estimate for ρ is r ± ME = 0.62 ± 0.05, so an interval estimate of plausible values for the population correlation ρ is 0.57 to 0.67.

3.42 Using ME to represent the margin of error, an interval estimate for µ1 − µ2 is (x̄1 − x̄2) ± ME = 5 ± 8, so an interval estimate of plausible values for the difference in population means is −3 to 13.
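Solutions 3.39 through 3.42 all apply the same rule: statistic plus or minus margin of error. A minimal Python sketch of that rule follows; the function name `interval_estimate` is an assumption made for illustration, and the inputs are the statistics and margins of error quoted above.

```python
def interval_estimate(statistic, margin_of_error):
    """Return (low, high) for statistic +/- margin of error."""
    return statistic - margin_of_error, statistic + margin_of_error


# (exercise, sample statistic, margin of error) as quoted in the solutions
cases = [("3.39", 25, 3), ("3.40", 0.37, 0.02),
         ("3.41", 0.62, 0.05), ("3.42", 5, 8)]

for label, statistic, me in cases:
    low, high = interval_estimate(statistic, me)
    print(f"{label}: {low:.2f} to {high:.2f}")
```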

3.43 (a) Yes, plausible values of µ are values in the interval.

(b) Yes, plausible values of µ are values in the interval.

(c) No. Since 105.3 is not in the interval estimate, it is a possible value of µ but is not a very plausible one.

3.44 (a) No. Since 0.85 is not in the interval estimate, it is a possible value of p but is not a very plausible one.

(b) Yes, plausible values of p are values in the interval.

(c) No. Since 0.07 is so far out of the interval estimate, it is an extremely unlikely value of the population parameter p.

3.45 The 95% confidence interval estimate is p̂ ± 2 · SE = 0.32 ± 2(0.04) = 0.32 ± 0.08, so the interval is 0.24 to 0.40. We are 95% confident that the true value of the population proportion p is between 0.24 and 0.40.

3.46 The 95% confidence interval estimate is x̄ ± 2 · SE = 55 ± 2(1.5) = 55 ± 3, so the interval is 52 to 58. We are 95% confident that the true value of the population mean µ is between 52 and 58.

3.47 The 95% confidence interval estimate is r ± 2 · SE = 0.34 ± 2(0.02) = 0.34 ± 0.04, so the interval is 0.30 to 0.38. We are 95% confident that the true value of the population correlation ρ is between 0.30 and 0.38.

3.48 The interval estimate is r ± margin of error = −0.46 ± 0.05, so the interval is −0.51 to −0.41. We are 95% confident that the true value of the population correlation ρ is between −0.51 and −0.41.

3.49 The 95% confidence interval estimate is (x̄1 − x̄2) ± margin of error = 3.0 ± 1.2, so the interval is 1.8 to 4.2. We are 95% confident that the true difference in the population means µ1 − µ2 is between 1.8 and 4.2 (which means we believe that the mean of population 1 is between 1.8 and 4.2 units larger than the mean of population 2).

3.50 The interval estimate is (p̂1 − p̂2) ± margin of error = 0.08 ± 0.03, so the interval is 0.05 to 0.11. We are 95% confident that the true difference in population proportions p1 − p2 is between 0.05 and 0.11 (which means we believe that the proportion for population 1 is between 0.05 and 0.11 larger than the proportion for population 2).

3.51 (a) The information is from a sample, so it is a statistic. It is a proportion, so the correct notation is p̂ = 0.30.


(b) The parameter we are estimating is the proportion, p, of all young people in the US who have been arrested by the age of 23. Using the information in the sample, we estimate that p ≈ 0.30.

(c) If the margin of error is 0.01, the interval estimate is 0.30 ± 0.01, which gives 0.29 to 0.31. Plausible values for the proportion p range from 0.29 to 0.31.

(d) Since the plausible values for the true proportion are those between 0.29 and 0.31, it is very unlikely that the actual proportion is less than 0.25.

3.52 (a) The population is all people ages 18 and older living in the US. The sample is the 147,291 people who were actually contacted and asked whether or not they got health insurance from an employer. The parameter of interest is p, the proportion of the entire population of US adults who get health insurance from an employer. The relevant statistic is p̂ = 0.45, the proportion of people in the sample who get health insurance from an employer.

(b) An interval estimate is found by taking the best estimate (p̂ = 0.45) and adding and subtracting the margin of error (±0.01). We are relatively confident that the population proportion is between 0.44 and 0.46, or that the percent of the entire population that receive health insurance from an employer is between 44% and 46%.

3.53 We are 95% confident that the proportion of all adults in the US who think a car is a necessity is between 0.83 and 0.89.

3.54 (a) The population is all cell phone users age 18 and older in the US. The population parameter of interest is µ, the mean number of text messages sent and received per day. The best point estimate for µ is the sample mean, x̄ = 41.5.

(b) The point estimate is x̄, so a 95% confidence interval is given by:

x̄ ± 2 · SE

41.5 ± 2(6.1)

41.5 ± 12.2

29.3 to 53.7.

We are 95% confident that the mean number of text messages a day for all cell phone users in the US is between 29.3 and 53.7.

3.55 We are estimating p, the proportion of all US adults who agree with the statement that each person has one true love. The best point estimate is p̂ = 735/2625 = 0.28. We find the confidence interval using:

p̂ ± 2 · SE

0.28 ± 2(0.009)

0.28 ± 0.018

0.262 to 0.298.

The margin of error for our estimate is 0.018 or 1.8%. We are 95% sure that the proportion of all US adults who agree with the statement on one true love is between 0.262 and 0.298.
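The "statistic ± 2 · SE" calculation in 3.54 and 3.55 can be written as a short Python function. This is a sketch under the same rough 95% rule the solutions use; the name `ci_95` is an assumption made for illustration, and the standard errors are the ones given in the solutions.

```python
def ci_95(statistic, se):
    """Approximate 95% confidence interval: statistic +/- 2 standard errors."""
    margin = 2 * se
    return statistic - margin, statistic + margin


# 3.54: mean number of text messages, x-bar = 41.5 with SE = 6.1
low, high = ci_95(41.5, 6.1)
print(f"3.54: {low:.1f} to {high:.1f}")

# 3.55: proportion agreeing, p-hat = 735/2625 with SE = 0.009
p_hat = 735 / 2625
low, high = ci_95(p_hat, 0.009)
print(f"3.55: p-hat = {p_hat:.2f}, interval {low:.3f} to {high:.3f}")
```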

3.56 We are estimating pM − pF, the difference in proportions between males and females. For males, we have p̂M = 372/1213 = 0.31 and for females, we have p̂F = 363/1412 = 0.26. The best point estimate for the difference in proportions is p̂M − p̂F = 0.31 − 0.26 = 0.05. We find the confidence interval using:

(p̂M − p̂F) ± 2 · SE

(0.31 − 0.26) ± 2(0.018)

0.05 ± 0.036

0.014 to 0.086.

We are 95% confident that the difference in proportion agreeing that we have only one true love between males and females is between 0.014 and 0.086. Since zero is not in this interval, it is not one of the plausible values for the difference. We are fairly sure that the difference in these proportions is positive; thus men are more likely than women to agree with the statement on one true love.

3.57 (a) We are 95% confident that the mean response time for game players minus the mean response time for non-players is between −1.8 and −1.2. In other words, mean response time for game players is less than the mean response time for non-players by between 1.2 and 1.8 seconds.

(b) It is not likely that they are basically the same, since the option of the difference in means being zero is not in the interval. The game players are faster, and we can tell this because the confidence interval for µg − µng has only negative values, so the mean time is smaller for the game players.

(c) We are 95% confident that the mean accuracy score for game players minus the mean accuracy score for non-players is between −4.2 and 5.8.

(d) It is likely that they are basically the same, since the option of the difference in means being zero is in the interval. There is little discernible difference in accuracy between game players and non-game players.
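The reasoning in 3.56 and 3.57 comes down to checking whether zero lies inside a confidence interval for a difference. A minimal Python sketch of that check follows; the helper name `is_plausible` is an assumption made for illustration, and the intervals are the ones stated in the solutions.

```python
def is_plausible(value, interval):
    """Return True if value lies inside the (low, high) interval."""
    low, high = interval
    return low <= value <= high


# Confidence intervals for differences, as stated in the solutions
intervals = {
    "3.56 difference in proportions": (0.014, 0.086),
    "3.57(a) response time": (-1.8, -1.2),
    "3.57(c) accuracy score": (-4.2, 5.8),
}

for label, interval in intervals.items():
    verdict = "plausible" if is_plausible(0, interval) else "not plausible"
    print(f"{label}: a difference of zero is {verdict}")
```

When zero is not plausible, the sign of the interval shows the direction of the difference; when zero is plausible, the data do not show a discernible difference.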

3.58 (a) This is a matched pairs design since all participants participated in both treatments (canned soup for five days and fresh soup for five days). There might be a great deal of variability in people’s BPA concentrations and a matched pairs experiment reduces that variability.

(b) The population is all people, and we are estimating µC − µF, where µC is mean urinary BPA concentration after eating canned soup for five days and µF is mean urinary BPA concentration after eating fresh soup for five days. Since this is a matched pairs design, we could also use µD, where µD is the mean difference in urinary BPA concentration between the two treatments.

(c) We are 95% confident that BPA concentration is, on average, between 19.6 and 25.5 µg/L higher in people who have eaten canned soup for five days than it is in people who have eaten fresh soup for five days.

(d) A larger sample size increases the accuracy, so we would expect the confidence interval to be narrower.

3.59 (a) Using the margin of error, we see that the likely proportion voting for Candidate A ranges from 49% to 59%. Since this interval includes some proportions below 50% as plausible values for the election proportion, we cannot be very confident in the outcome.

(b) Using the margin of error, we see that the likely proportion voting for Candidate A ranges from 51% to 53%. Since all values in this interval are over 50%, we can be relatively confident that Candidate A will win.

(c) Using the margin of error, we see that the likely proportion voting for Candidate A ranges from 51% to 55%. Since all values in this range are over 50%, we can be relatively confident that Candidate A will win.


(d) Using the margin of error, we see that the likely proportion voting for Candidate A ranges from 48%to 68%. Since this interval includes some proportions below 50% as plausible vaues for the electionproportion, we cannot be very confident in the outcome.

3.60 (a) The parameter of interest is µ, the mean effect on weight 2.5 years after a month of overeating and being sedentary.

(b) The only way to find the exact value would be to have all members of a population overeat and be inactive for a month and then measure the effect 2.5 years later. This is not a good idea!

(c) The 95% confidence interval using the standard error is x̄ ± 2 · SE = 6.8 ± 2(1.2) = 6.8 ± 2.4. We are 95% sure that the mean weight gain over 2.5 years by people who overeat for a month is between 4.4 and 9.2 pounds.

(d) The margin of error is ±2.4, which means we are relatively confident that our estimate of 6.8 pounds is within 2.4 pounds of the true mean weight gain for the population.

3.61 Let µ represent the mean time for a golden shiner fish to find the yellow mark. A 95% confidence interval is given by

x̄ ± 2 · SE = 51 ± 2(2.4)
= 51 ± 4.8
= 46.2 to 55.8.

A 95% confidence interval for the mean time for fish to find the mark is 46.2 to 55.8 seconds. We are 95% sure that the mean time it would take fish to find the target for all fish of this breed is between 46.2 seconds and 55.8 seconds. In other words, the plausible values for the population mean µ are those values between 46.2 and 55.8. Therefore, 60 is not a plausible value for the mean time for all fish, but 55 is.
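
The calculation above, and the check of which values are plausible, can be reproduced with a short Python sketch. The helper names two_se_interval and is_plausible are hypothetical, introduced here only for illustration; the numbers 51 and 2.4 are the sample mean and standard error used in the solution.

```python
def two_se_interval(statistic, standard_error):
    """Rough 95% confidence interval: statistic plus or minus two standard errors."""
    margin = 2 * standard_error
    return statistic - margin, statistic + margin


def is_plausible(value, interval):
    """A value is plausible if it lies inside the interval."""
    low, high = interval
    return low <= value <= high


interval = two_se_interval(51, 2.4)  # sample mean and SE from 3.61
print(tuple(round(bound, 1) for bound in interval))
print("60 plausible?", is_plausible(60, interval))
print("55 plausible?", is_plausible(55, interval))
```

The same two helpers apply unchanged to any of the "statistic ± 2 · SE" intervals in this section, whether the statistic is a mean, a proportion, a correlation, or a difference.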

3.62 We are 95% confident that schools of fish in this situation will end up going with the majority over the opinionated minority only between 9% and 26% of the time. It is not plausible that the schools of fish in this situation are equally likely to go for either option since that would indicate a proportion of p = 0.5 for each option, and 0.5 is not in the range of plausible values. The highly opinionated fish are definitely having an effect!

3.63 We are estimating the difference in population proportions p1 − p2, where p1 is the proportion of times a school of fish will pick the majority option if there is an opinionated minority, a less passionate majority, and also some additional members with no preference, and p2 is the proportion of times a school of fish will pick the majority option if there is an opinionated minority and a less passionate majority and no other fish in the group, as described above in Fish Democracies. (We could also have defined the proportions in the other order.) The best point estimate is p̂1 − p̂2 = 0.61 − 0.17 = 0.44. We find a 95% confidence interval as follows:

(p̂1 − p̂2) ± 2 · SE = (0.61 − 0.17) ± 2(0.14)
= 0.44 ± 0.28
= 0.16 to 0.72.

We are 95% sure that the proportion of schools of fish picking the majority option is 0.16 to 0.72 higher if fish with no preference are added to the group. If adding the indifferent fish had no effect, then the population proportions with and without the indifferent fish would be the same, which means the difference in proportions would be zero. Since zero is not a plausible value for the difference in proportions, it is very unlikely that adding indifferent fish has no effect. The indifferent fish are helping the majority carry the day.

3.64 (a) Interval is for the mean, not all students.

(b) Interval is for the population mean, not the sample mean.

(c) The interval is not uncertain, only whether or not it captures the population mean.

(d) Interval is trying to capture the mean, not 95% of individual student pulse rates.

(e) Scope of inference could apply to the mean pulse rate for all students at this college, but the sample was not taken from all U.S. college students.

(f) The population mean pulse rate is a single fixed value.

(g) Interval is for the population mean, not other sample means.

Section 3.3 Solutions

3.65 (a) No. The value 12 is not in the original sample.

(b) No. A bootstrap sample has the same sample size as the original sample.

(c) Yes.

(d) No. A bootstrap sample has the same sample size as the original sample.

(e) Yes.

3.66 (a) Yes.

(b) Yes.

(c) No. A bootstrap sample has the same sample size as the original sample.

(d) No. The value 78 is not in the original sample.

(e) Yes.

(f) Yes.
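
The two rules applied in 3.65 and 3.66 (same sample size as the original, and only values that appear in the original) can be written as a small Python check. The original sample below is hypothetical, chosen only so the example runs; it is not the data from the exercises.

```python
def is_possible_bootstrap_sample(original, candidate):
    """A candidate is a possible bootstrap sample if it has the same size as the
    original sample and every value in it appears in the original sample.
    Repeated values are fine because we sample with replacement."""
    same_size = len(candidate) == len(original)
    values_from_original = all(value in original for value in candidate)
    return same_size and values_from_original


# Hypothetical original sample, used for illustration only.
original = [17, 10, 15, 21, 13, 18]

print(is_possible_bootstrap_sample(original, [10, 10, 10, 10, 10, 10]))  # repeats allowed
print(is_possible_bootstrap_sample(original, [17, 10, 15, 21, 13]))      # wrong size
print(is_possible_bootstrap_sample(original, [17, 10, 15, 21, 13, 12]))  # 12 not in original
```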

3.67 The distribution appears to be centered near 0.7 so the point estimate is about 0.7. Using the 95% rule, we estimate that the standard error is about 0.1 (since about 95% of the values appear to be within 0.2 of the center). Thus our interval estimate is

Statistic ± 2 · SE = 0.7 ± 2(0.1)
= 0.7 ± 0.2
= 0.5 to 0.9.

The parameter being estimated is a proportion p, and the interval 0.5 to 0.9 gives plausible values for the population proportion p. Answers may vary.
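
Reading a standard error off a bootstrap dotplot with the 95% rule amounts to dividing the width of the middle 95% of the distribution by four, since that range spans about two standard errors on each side of the center. A minimal sketch, with a hypothetical helper name and the ranges read from the solutions to 3.67 and 3.68:

```python
def se_from_95_rule(low, high):
    """Estimate the standard error when about 95% of the bootstrap
    statistics fall between low and high (roughly center +/- 2 SE)."""
    return (high - low) / 4


se_367 = se_from_95_rule(0.5, 0.9)  # values within 0.2 of 0.7
se_368 = se_from_95_rule(19, 31)    # values within 6 of 25

print(round(se_367, 3), round(se_368, 3))
```

Because the endpoints are read by eye from a plot, the result is only approximate, which is why the solutions note that answers may vary.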

3.68 The distribution appears to be centered near 25 so the point estimate is about 25. Using the 95% rule, we estimate that the standard error is about 3 (since about 95% of the values appear to be within 6 of the center). Thus our interval estimate is

Statistic ± 2 · SE = 25 ± 2(3)
= 25 ± 6
= 19 to 31.

The parameter being estimated is a mean µ, and the interval 19 to 31 gives plausible values for the population mean µ. Answers may vary.

3.69 The distribution appears to be centered near 0.4 so the point estimate is about 0.4. Using the 95% rule, we estimate that the standard error is about 0.05 (since about 95% of the values appear to be within 0.1 of the center). Thus our interval estimate is

Statistic ± 2 · SE = 0.4 ± 2(0.05)
= 0.4 ± 0.1
= 0.3 to 0.5.

The parameter being estimated is a correlation ρ, and the interval 0.3 to 0.5 gives plausible values for the population correlation ρ. Answers may vary.

3.70 The distribution appears to be centered near 6 so the point estimate is about 6. Using the 95% rule, we estimate that the standard error is about 4 (since about 95% of the values appear to be within 8 of the center). Thus our interval estimate is

Statistic ± 2 · SE = 6 ± 2(4)
= 6 ± 8
= −2 to 14.

The parameter being estimated is a difference in means µ1 − µ2, and the interval −2 to 14 gives plausible values for the difference in population means µ1 − µ2. Answers may vary.

3.71 The statistic for the sample is p̂ = 35/100 = 0.35. Using technology, the standard deviation of the sample proportions for 1000 bootstrap samples is about 0.048 (answers may vary slightly), so we estimate the standard error is SE ≈ 0.048. Thus our interval estimate is

Statistic ± 2 · SE = 0.35 ± 2(0.048)
= 0.35 ± 0.096
= 0.254 to 0.446.

Plausible values of the population proportion range from 0.254 to 0.446.

3.72 The statistic for the sample is p̂ = 180/250 = 0.72. Using technology, the standard deviation of the sample proportions for 1000 bootstrap samples is about 0.028 (answers may vary slightly), so we estimate the standard error is SE ≈ 0.028. Thus our interval estimate is

Statistic ± 2 · SE = 0.72 ± 2(0.028)
= 0.72 ± 0.056
= 0.664 to 0.776.

Plausible values of the population proportion range from 0.664 to 0.776.

3.73 The statistic for the sample is p̂ = 112/400 = 0.28. Using technology, the standard deviation of the sample proportions for 1000 bootstrap samples is about 0.022 (answers may vary slightly), so we estimate the standard error is SE ≈ 0.022. Thus our interval estimate is

Statistic ± 2 · SE = 0.28 ± 2(0.022)
= 0.28 ± 0.044
= 0.236 to 0.324.

Plausible values of the population proportion range from 0.236 to 0.324.

3.74 The statistic for the sample is p̂ = 382/1000 = 0.382. Using technology, the standard deviation of the sample proportions for 1000 bootstrap samples is about 0.015 (answers may vary slightly), so we estimate the standard error is SE ≈ 0.015. Thus our interval estimate is

Statistic ± 2 · SE = 0.382 ± 2(0.015)
= 0.382 ± 0.030
= 0.352 to 0.412.

Plausible values of the population proportion range from 0.352 to 0.412.

3.75 (a) The best point estimate is the sample proportion, p̂ = 26/174 = 0.149.

(b) We can estimate the standard error using the 95% rule, or we can find the standard deviation of the bootstrap statistics in the upper right of the figure. We see that the standard error is about 0.028. Answers will vary slightly with other simulations.

(c) We have

p̂ ± 2 · SE = 0.149 ± 2(0.028)
= 0.149 ± 0.056
= 0.093 to 0.205.

We are 95% confident that the percent of all snails of this kind that will live after being eaten by a bird is between 9.3% and 20.5%.

(d) Yes, 20% is within the range of plausible values in the 95% confidence interval.

3.76 (a) We find for the 8 values in the table that x̄ = 34.0 and s = 14.63.

(b) We put the 8 values on the 8 slips of paper and mix them up. Draw one and write down the value and put it back. Mix them up, draw another, and do this 8 times. The resulting 8 numbers form a bootstrap sample, and the mean of those 8 numbers forms one bootstrap statistic.

(c) We expect that the bootstrap distribution will be bell-shaped and centered at approximately 34.

(d) The population parameter of interest is the mean, µ, number of ants on all possible peanut butter sandwich bits set near this ant hill. There are other possible answers for the population; for example, you might decide to limit it to the time of day at which the student conducted the study. The best point estimate is the sample mean x̄ = 34.

(e) We have

x̄ ± 2 · SE = 34.0 ± 2(4.85)
= 34.0 ± 9.7
= 24.3 to 43.7.

We are 95% confident that the mean number of ants to climb on a bit of peanut butter sandwich left near an ant hill is between 24.3 ants and 43.7 ants.

3.77 (a) The mean is x̄ = 67.59 and the standard deviation is s = 50.02.

(b) Select 20 values at random (with replacement) from the original set of skateboard prices and record the mean for those 20 values as the bootstrap statistic.

(c) We expect the bootstrap distribution to be symmetric and bell-shaped and to be centered at the sample mean: 67.59.

(d) We find the 95% confidence interval:

x̄ ± 2 · SE = 67.59 ± 2(10.9)
= 67.59 ± 21.8
= 45.79 to 89.39.

We are 95% confident that the mean price of skateboards for sale online is between $45.79 and $89.39.

3.78 The mean of the five sales numbers is x̄ = 605. Using StatKey or other technology, we obtain a bootstrap distribution for sample means like the one below.

The standard deviation of these means shows the standard error is about 72.24. This will vary for other sets of bootstrap samples. We find a 95% confidence interval by

Statistic ± 2 · SE = 605 ± 2(72.24)
= 605 ± 144.5
= 460.5 to 749.5.

We would tell the CEO that we are 95% confident the average of all monthly sales of Saabs in the US is between about 460.5 and 749.5 cars.

3.79 (a) The best point estimate for the proportion, p, of rats showing empathy is p̂ = 23/30 = 0.767.

(b) On 23 of the slips, we write “yes” (showed empathy) and on the other 7, we write “no”. We then mix up the slips of paper, draw one out and record the result, yes or no. Put the slip of paper back and repeat the process 30 times. This set of yes’s and no’s is our bootstrap sample. The proportion of yes’s in the sample is our bootstrap statistic.

(c) Using technology, we see that the bootstrap distribution is bell-shaped and centered approximately at 0.767. We also see that the standard error is about 0.077.

(d) We have

p̂ ± 2 · SE = 0.767 ± 2(0.077)
= 0.767 ± 0.154
= 0.613 to 0.921.

For all laboratory rats, we are 95% confident that the proportion of rats that will show empathy in this manner is between 61.3% and 92.1%.
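
The slips-of-paper procedure in part (b) and the standard error in part (c) can be simulated with Python's standard library. This is a sketch rather than the StatKey procedure itself: the function name, the seed, and the choice of 5000 bootstrap samples are our own assumptions, and the standard error will vary slightly from run to run with a different seed.

```python
import random
import statistics


def bootstrap_proportions(slips, n_boot, rng):
    """Draw n_boot bootstrap samples (with replacement, same size as the
    original) and return the proportion of "yes" slips in each one."""
    n = len(slips)
    proportions = []
    for _ in range(n_boot):
        resample = rng.choices(slips, k=n)  # draw a slip, record it, put it back
        proportions.append(resample.count("yes") / n)
    return proportions


slips = ["yes"] * 23 + ["no"] * 7          # the 30 rats in the original sample
rng = random.Random(0)                      # fixed seed (arbitrary) for repeatability
boot_props = bootstrap_proportions(slips, 5000, rng)

p_hat = slips.count("yes") / len(slips)     # original sample proportion, 23/30
se = statistics.stdev(boot_props)           # standard error from the bootstrap
low, high = p_hat - 2 * se, p_hat + 2 * se  # rough 95% interval

print(f"p-hat = {p_hat:.3f}, SE = {se:.3f}, interval = {low:.3f} to {high:.3f}")
```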

3.80 The sample proportion of females showing compassion is p̂F = 6/6 = 1.0. The sample proportion of males showing compassion is p̂M = 17/24 = 0.708. The best point estimate for the difference in proportions pF − pM is p̂F − p̂M = 1.0 − 0.708 = 0.292. Using StatKey to create a bootstrap distribution for a difference in proportions using this sample data, we see a standard error of 0.094.

We have

(p̂F − p̂M) ± 2 · SE = (1.0 − 0.708) ± 2(0.094)
= 0.292 ± 0.188
= 0.104 to 0.480.

Based on this interval, the percentage of female rats likely to show compassion is between 10.4 and 48.0 percentage points higher than the percentage of male rats likely to show compassion. Since zero is not in the interval estimate, it is not very plausible that male and female rats are equally compassionate.
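
A bootstrap distribution for a difference in proportions resamples each group separately, keeping each group at its original size. The sketch below uses the counts from this exercise (6 of 6 females, 17 of 24 males); the function names, the seed, and the 5000 bootstrap samples are assumptions made for illustration, not StatKey's implementation.

```python
import random
import statistics


def resampled_proportion(successes, n, rng):
    """Proportion of successes in one bootstrap sample of size n."""
    group = [1] * successes + [0] * (n - successes)
    return sum(rng.choices(group, k=n)) / n


def bootstrap_diff_proportions(s1, n1, s2, n2, n_boot, rng):
    """Bootstrap differences p1 - p2, resampling the two groups independently."""
    return [
        resampled_proportion(s1, n1, rng) - resampled_proportion(s2, n2, rng)
        for _ in range(n_boot)
    ]


rng = random.Random(1)  # fixed seed (arbitrary) for repeatability
diffs = bootstrap_diff_proportions(6, 6, 17, 24, 5000, rng)

point_estimate = 6 / 6 - 17 / 24
se = statistics.stdev(diffs)
print(f"estimate = {point_estimate:.3f}, SE = {se:.3f}")
```

Because all six females in the sample showed compassion, every resample of the female group has proportion 1.0, so all of the variability in this bootstrap distribution comes from the male group.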

3.81 (a) The standard error is about 0.015 since most (roughly 95%) of the bootstrap distribution is between 0.12 and 0.18, which is about two standard deviations on either side of the center at 0.15.

(b) The 95% confidence interval is given by:

(p̂t − p̂a) ± 2 · SE = (0.87 − 0.72) ± 2 · (0.015)
= 0.15 ± 0.03
= 0.12 to 0.18.

We are 95% sure that the proportion of teens who text is between 0.12 and 0.18 higher than the proportion of adults who text.

3.82 Using StatKey or other technology, we create a bootstrap distribution to estimate the difference in means µt − µc, where µt represents the mean immune response for tea drinkers and µc represents the mean immune response for coffee drinkers. In the original sample the means are x̄t = 34.82 and x̄c = 17.70, respectively, so the point estimate for the difference is x̄t − x̄c = 34.82 − 17.70 = 17.12. We see from the bootstrap distribution that the standard error for the differences in bootstrap means is about SE = 7.9. This will vary for other sets of bootstrap differences.

For a 95% confidence interval, we have

(x̄t − x̄c) ± 2 · SE = (34.82 − 17.70) ± 2(7.9)
= 17.12 ± 15.8
= 1.32 to 32.92.

We are 95% sure that the mean immune response is between 1.32 and 32.92 units higher in tea drinkers than it is in coffee drinkers.

3.83 (a) We are estimating µD, the mean difference in delay time for public transportation for all traffic situations in Dresden, Germany.

(b) Put all 24 slips in a container. Pull out one and write down the value and put it back in the container. Mix up the slips, pull out one and repeat that process until there are 24 values written down. Those 24 values form one bootstrap sample.

(c) Record the sample mean for the 24 values in the bootstrap sample.

(d) The distribution will be bell-shaped and centered at 61.

(e) We calculate the standard deviation of the bootstrap statistics.

(f) For a 95% confidence interval, we have

x̄D ± 2 · SE = 61 ± 2(3.1)
= 61 ± 6.2
= 54.8 to 67.2.

We are 95% confident that the average time savings is between 54.8 and 67.2 seconds if the city moves to the new system.

3.84 (a) For the original sample the mean commute distance is 18.16 miles and the standard deviation is 13.8 miles.

(b) One bootstrap distribution of distance means is shown below. It is bell-shaped, centered around 18.2, and shows sample means ranging between about 16.5 and 20.5 miles.

(c) The standard error of the means for this set of 2000 bootstrap samples is 0.61 miles.

(d) A 95% confidence interval is given by

18.16 ± 2(0.61)
= 18.16 ± 1.22
= 16.94 to 19.38.

We are 95% sure that the mean commuting distance for all Atlanta commuters is between 16.94 miles and 19.38 miles.

3.85 (a) We use technology to compute the correlation between commute distances and times, r = 0.807, for the 500 data values.

(b) The distribution of bootstrap correlations (shown below) is fairly bell-shaped (perhaps a slight left skew), centered around 0.81, and ranges between about 0.70 and 0.90.

(c) The standard deviation of the bootstrap correlations for this bootstrap distribution is 0.0355, so the margin of error is 2 · 0.0355 = 0.071. The interval estimate for the correlation between commute distances and time is 0.807 ± 0.071, or between 0.736 and 0.878.

(d) The interval is shown on a dotplot of the bootstrap distribution below. The interval includes roughly 95% of the bootstrap correlations.

3.86 (a) The original sample is right-skewed with outliers at 107, 121, 175, and 190 minutes.

(b) We find that the mean is x̄ = 49.6 with a standard deviation of s = 49.1.

(c) The distribution of bootstrap means is fairly symmetric and centered near 50. It does not show the same skewness as in the sample.

(d) The standard deviation of these bootstrap means is SE = 9.86 (answers will vary for other simulations), which is much smaller than the standard deviation in the sample, s = 49.1.

(e) For a 95% confidence interval, we have

x̄ ± 2 · SE = 49.6 ± 2 · 9.86
= 49.6 ± 19.7
= 29.9 to 69.3 minutes.

(f) The style of play on one team might be more or less aggressive than the league as a whole, so the estimate of mean penalty minutes could be biased.

3.87 The standard deviation for the sample of penalty minutes for n = 24 players is s = 49.1 minutes. For one set of 3000 bootstrap sample standard deviations (shown below), the estimated standard error is SE = 11.0.

Based on this the interval estimate is

s ± 2 · SE = 49.1 ± 2 · 11.0
= 49.1 ± 22.0
= 27.1 to 71.1.

We estimate that the standard deviation in penalty minutes for all NHL players is somewhere between 27.1 and 71.1 minutes.

Section 3.4 Solutions

3.88 (a) We keep the middle 95% of values by chopping off 2.5% from each tail.

(b) We keep the middle 90% of values by chopping off 5% from each tail.

(c) We keep the middle 98% of values by chopping off 1% from each tail.

(d) We keep the middle 99% of values by chopping off 0.5% from each tail.

3.89 (a) We keep the middle 95% of values by chopping off 2.5% from each tail. Since 2.5% of 1000 is 25, we eliminate the 25 highest and the 25 lowest values to create the 95% confidence interval.

(b) We keep the middle 90% of values by chopping off 5% from each tail. Since 5% of 1000 is 50, we eliminate the 50 highest and the 50 lowest values to create the 90% confidence interval.

(c) We keep the middle 98% of values by chopping off 1% from each tail. Since 1% of 1000 is 10, we eliminate the 10 highest and the 10 lowest values to create the 98% confidence interval.

(d) We keep the middle 99% of values by chopping off 0.5% from each tail. Since 0.5% of 1000 is 5, we eliminate the 5 highest and the 5 lowest values to create the 99% confidence interval.
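
The arithmetic in parts (a) through (d) follows one pattern: half of the leftover percentage comes off each tail. A minimal Python sketch, using a hypothetical helper name:

```python
def values_to_trim(confidence_level, n_boot):
    """Number of bootstrap statistics to drop from EACH tail so that the
    middle confidence_level fraction of the n_boot values remains."""
    tail_fraction = (1 - confidence_level) / 2
    # round() guards against floating-point noise such as 24.999999999999996
    return round(tail_fraction * n_boot)


trims = {level: values_to_trim(level, 1000) for level in (0.95, 0.90, 0.98, 0.99)}
for level, count in trims.items():
    print(f"{level:.0%} interval: drop the {count} highest and {count} lowest values")
```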

3.90 To find a 99% confidence interval, we go farther out on either side than for a 95% confidence interval, so (A) is the most likely result.

3.91 To find a 90% confidence interval, we go less far out on either side than for a 95% confidence interval, so (C) is the most likely result.

3.92 If the sample size goes up, we get greater accuracy and the spread of the bootstrap distribution decreases, so the confidence interval will be narrower. Thus, (C) is the most likely result.

3.93 If the sample size is smaller, we have less accuracy and the spread of the bootstrap distribution increases, so the confidence interval will be wider. Thus, (A) is the most likely result.

3.94 As long as the number of bootstrap samples is reasonable, the width of the confidence interval does not change much as we take more or fewer bootstrap samples. Thus, (B) is the most likely result.

3.95 As long as the number of bootstrap samples is reasonable, the width of the confidence interval does not change much as we take more or fewer bootstrap samples. Thus, (B) is the most likely result.

3.96 The sample proportion who agree is p̂ = 35/100 = 0.35. One set of 1000 bootstrap proportions is shown in the figure below. For a 95% confidence interval we need to find the 2.5%-tile and 97.5%-tile, leaving 95% of the distribution in the middle. For this distribution those points are at 0.26 and 0.44, so we are 95% sure that the proportion in the population who agree is between 0.26 and 0.44. Answers will vary slightly for different simulations.
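
A percentile interval like the one in 3.96 can be sketched in Python by sorting the bootstrap statistics and trimming the same number of values from each tail. The function name, the seed, and the simple trimming rule are our own assumptions; StatKey and other software may use slightly different percentile conventions, so the endpoints need not match the solution exactly.

```python
import random


def percentile_interval(boot_stats, confidence_level):
    """Keep the middle confidence_level fraction of the bootstrap statistics
    by dropping an equal number of values from each tail."""
    ordered = sorted(boot_stats)
    trim = round((1 - confidence_level) / 2 * len(ordered))
    return ordered[trim], ordered[-trim - 1]


# Original sample from 3.96: 35 of 100 agree (1 = agree, 0 = does not agree).
sample = [1] * 35 + [0] * 65
rng = random.Random(2)  # fixed seed (arbitrary) for repeatability

boot_props = [
    sum(rng.choices(sample, k=len(sample))) / len(sample)
    for _ in range(1000)
]

low, high = percentile_interval(boot_props, 0.95)
print(f"95% percentile interval: {low:.2f} to {high:.2f}")
```

Changing the second argument to 0.90, 0.98, or 0.99 gives the other confidence levels used in this section from the same bootstrap distribution.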

3.97 The sample proportion who agree is p̂ = 180/250 = 0.72. One set of 1000 bootstrap proportions is shown in the figure below. For a 95% confidence interval we need to find the 2.5%-tile and 97.5%-tile, leaving 95% of the distribution in the middle. For this distribution those points are at 0.664 and 0.776, so we are 95% sure that the proportion in the population who agree is between 0.664 and 0.776. Answers will vary slightly for different simulations.

3.98 The sample proportion who agree is p̂ = 112/400 = 0.28. One set of 1000 bootstrap proportions is shown in the figure below. For a 90% confidence interval we need to find the 5%-tile and 95%-tile, leaving 90% of the distribution in the middle. For this distribution those points are at 0.242 and 0.315, so we are 90% sure that the proportion in the population who agree is between 0.242 and 0.315. Answers will vary slightly for different simulations.

3.99 The sample proportion who agree is p̂ = 382/1000 = 0.382. One set of 1000 bootstrap proportions is shown in the figure below. For a 99% confidence interval we need to find the 0.5%-tile and 99.5%-tile, leaving 99% of the distribution in the middle. For this distribution those points are at 0.343 and 0.423, so we are 99% sure that the proportion in the population who agree is between 0.343 and 0.423. Answers will vary slightly for different simulations.

3.100 (a) The bootstrap distribution is centered at about 100, so we estimate that the sample mean of the original IQ scores is x̄ ≈ 100.

(b) Since we are finding a 99% confidence interval, we want to keep the middle 99%. That means we want an interval that includes the middle 990 of the 1000 bootstrap statistics. We need to cut off 5 values on each end, which appears to give an interval from about 88 to 112.

3.101 The 98% confidence interval uses the 1%-tile and 99%-tile from the bootstrap means. We are 98% sure that the mean number of penalty minutes for NHL players in a season is between 29.4 and 76.7 minutes.

3.102 Using StatKey or other technology, we produce a bootstrap distribution such as the figure shown below. For a 90% confidence interval, we find the 5%-tile and 95%-tile points in this distribution to be 0.730 and 0.774. We are 90% confident that the percent of American adults who think exercise is an important part of daily life is between 73.0% and 77.4%.

3.103 Using StatKey or other technology, we produce a bootstrap distribution such as the figure shown below. For a 99% confidence interval, we find the 0.5%-tile and 99.5%-tile points in this distribution to be 0.467 and 0.493. We are 99% confident that the percent of all Europeans (from these nine countries) who can identify arm or shoulder pain as a symptom of a heart attack is between 46.7% and 49.3%. Since every value in this interval is below 50%, we can be 99% confident that the proportion is less than half.

3.104 The dog got p̂B = 33/36 = 0.917 or 91.7% of the breath samples correct and p̂S = 37/38 = 0.974 or 97.4% of the stool samples correct. (A remarkably high percentage in both cases!) We create a bootstrap distribution for the difference in proportions using StatKey or other technology (as in the figure below) and then find the middle 90% of values. Using the figure, the 90% confidence interval for pB − pS is -0.14 to 0.025. We are 90% confident that the difference between the proportion correct for breath samples and the proportion correct for stool samples for all similar tests we might give this dog is between -0.14 and 0.025. Since a difference of zero represents no difference, and zero is in the interval of plausible values, it is plausible that there is no difference in the effectiveness of breath vs stool samples in having this dog detect cancer.

3.105 Using one bootstrap distribution (as shown below), the standard error is SE = 0.19.

The mean tip from the original sample is x̄ = 3.85, so a 95% confidence interval using the standard error is

x̄ ± 2 · SE = 3.85 ± 2(0.19)
= 3.85 ± 0.38
= 3.47 to 4.23.

For this bootstrap distribution, the 95% confidence interval using the 2.5%-tile and 97.5%-tile is 3.47 to 4.23. We see that the results (rounding to two decimal places) are the same. We are 95% confident that the average tip left at this restaurant is between $3.47 and $4.23.

3.106 (a) A 99% confidence interval is wider than a 90% confidence interval, so the 90% interval is A (3.55 to 4.15) and the 99% interval is B (3.35 to 4.35).

(b) We multiply the lower and upper bounds for the average tip by 20 to get the average daily tip revenue (assuming 20 tables per day). With 90% confidence, the interval is 20 · 3.55 = 71 to 20 · 4.15 = 83. With 99% confidence, the interval is 20 · 3.35 = 67 to 20 · 4.35 = 87. We are 90% confident that this waitress will average between 71 and 83 dollars in tip income per day, and we are 99% confident that her mean daily tip income is between 67 and 87 dollars.

3.107 (a) We have p̂m = 27/193 = 0.140 and p̂f = 16/169 = 0.095, so the best point estimate for the difference in population proportions is p̂m − p̂f = 0.140 − 0.095 = 0.045. In this sample, a larger proportion of males smoke.

(b) Using StatKey or other technology, we create a bootstrap distribution and find the boundaries for the middle 99% of values. We see that a 99% confidence interval for pm − pf is the interval from about -0.039 to 0.132. We are 99% confident that the difference between males and females in the proportion that smoke is between -0.039 and 0.132.

3.108 (a) The population of interest is all FA premier league football matches. The specific parameter of interest is the proportion of matches the home team wins.

(b) Our best estimate for the parameter is 70/120 = 0.583.

(c) Using StatKey or other technology, we create a bootstrap distribution as shown below. Taking 5% from each tail, the 90% confidence interval is 0.508 to 0.650. We are 90% sure that the home team wins between 50.8% and 65.0% of all FA premier league football matches.

(d) Using the same bootstrap distribution we see that a 99% confidence interval goes from 0.467 to 0.692. We are 99% sure that the home team wins between 46.7% and 69.2% of all FA premier league football matches.

(e) If the population parameter is 0.50 or less, then no home field advantage is present. With the 90% confidence interval we are 90% confident the population parameter is between 0.508 and 0.650. Since this interval does not contain 0.50, we are 90% confident that there is a home field advantage. However, the 99% confidence interval does contain 0.50, so we are not 99% confident that there is a home field advantage.

3.109 (a) We have x̄t − x̄c = 34.82 − 17.70 = 17.12, where x̄t represents the sample mean immune response for tea drinkers and x̄c represents the sample mean immune response for coffee drinkers.

(b) We are estimating µt − µc, where µt represents the mean immune response for all tea drinkers and µc represents the mean immune response for all coffee drinkers.

(c) Using StatKey or other technology, we obtain a bootstrap distribution of sample differences in means as shown below. We see that a 90% confidence interval for the difference in means is about 4.17 to 29.70. We are 90% confident that tea drinkers have a mean immune response between 4.17 and 29.70 higher than the mean immune response for coffee drinkers. Answers may vary for other sets of bootstrap differences in means.

(d) Using the same bootstrap distribution, we see that a 99% confidence interval for the difference in means is about -3.30 to 37.04. We are 99% confident that the difference in mean immune response is between -3.30 and 37.04.

(e) We are 90% confident that tea drinkers have a stronger mean immune response, since all values in the 90% confidence interval are positive, but we are not 99% confident, since some plausible values for the difference in means in that interval are negative.

3.110 (a) For one set of 1000 bootstrap sample standard deviations shown below, the 2.5%-tile and 97.5%-tile are 21.5 and 66.9, respectively. Thus we can say with 95% confidence that the standard deviation of the number of penalty minutes awarded to all NHL players in a season is between 21.5 and 66.9 minutes.

(b) The midpoint of the interval in part (a) is (21.5 + 66.9)/2 = 44.2, which is less than the standard deviation of the original sample, s = 49.1. In general, an interval based on bootstrap percentiles does not need to be centered at the point estimate.

3.111 The mean area for the sample of ten countries is x̄ = 111.3 thousand square kilometers. Using technology we obtain a bootstrap distribution as shown below. From this distribution the 99% confidence interval is (30.1, 228.3). (Answers will vary.) We are 99% confident that the average country size for all 213 countries is between 30,100 and 228,300 square kilometers.

3.112 (a) We compute the regression line to be

PctRural = 29.0 + 0.079 · Area.

The slope of the line for this sample is 0.079.

(b) Using technology to produce the bootstrap distribution below for the sample slopes, we get a 95% confidence interval for the slope from 0.008 to 0.149. Answers will vary; for a sample this small with strongly skewed data, the bootstrap slopes might contain some very extreme values. We are 95% confident that the slope of the regression line for all countries to predict percent rural from land area is between 0.008 and 0.149.

(c) The 95% confidence interval from part (b) is (0.008, 0.149), so we don’t quite successfully capture the true population slope of 0. The lower bound is very close to zero, so this answer may vary, depending on the results of the simulation from part (b).
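As a rough sketch of how the bootstrap slopes in 3.112 might be produced, the code below resamples whole (Area, PctRural) pairs with replacement and refits the least-squares slope each time. The ten `countries` pairs are invented for illustration (they are not the textbook sample), and `slope`, `bootstrap_slopes`, and `percentile_interval` are hypothetical helper names.

```python
import random

def slope(pairs):
    """Least-squares slope for (x, y) pairs; None if all x values are equal."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return None if sxx == 0 else sxy / sxx

def bootstrap_slopes(pairs, n_boot=1000, seed=0):
    """Sorted slopes from n_boot resamples of the pairs (drawn with replacement)."""
    rng = random.Random(seed)
    slopes = []
    while len(slopes) < n_boot:
        b = slope(rng.choices(pairs, k=len(pairs)))
        if b is not None:  # skip degenerate resamples with a single x value
            slopes.append(b)
    return sorted(slopes)

def percentile_interval(sorted_stats, level=95):
    """Cut an equal number of values from each tail of the sorted statistics."""
    n = len(sorted_stats)
    cut = n * (100 - level) // 200
    return sorted_stats[cut], sorted_stats[n - 1 - cut]

# Hypothetical (area in thousand sq km, percent rural) pairs (assumption).
countries = [(0.4, 55.0), (10.8, 13.0), (27.4, 46.0), (43.1, 27.0), (65.6, 33.0),
             (108.9, 50.0), (181.0, 79.0), (323.8, 22.0), (527.9, 68.0), (1246.7, 42.0)]

slopes = bootstrap_slopes(countries)
lo, hi = percentile_interval(slopes, 95)
print(f"sample slope = {slope(countries):.3f}, 95% interval = ({lo:.3f}, {hi:.3f})")
print(f"most extreme bootstrap slopes: {slopes[0]:.3f} and {slopes[-1]:.3f}")
```

Printing the most extreme slopes makes it easy to see the effect noted in part (b): with a small, strongly skewed sample, a few resamples can give slopes far from the rest.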

3.113 (a) We see that both cities have a significant number of outliers, with very long commute times. The quartiles and median are all bigger for Atlanta than for St. Louis, so we expect that the mean commute time is larger for Atlanta.

(b) We are estimating the difference between the cities in mean commute time for all commuters, µ_atl − µ_stl. We get a point estimate for the difference in mean commute times between the two cities with the difference in the sample means, x̄_atl − x̄_stl = 29.11 − 21.97 = 7.14 minutes.

(c) Since the two samples were taken independently in different cities, for each bootstrap statistic we take 500 Atlanta times with replacement from the original Atlanta data and 500 St. Louis times with replacement from the original St. Louis sample, compute the mean within each sample, and take the difference. This constitutes one bootstrap statistic.

(d) A bootstrap distribution for the difference in means with 2000 bootstrap samples is shown in the figure.

The standard error for x̄_atl − x̄_stl, found in the upper corner of the figure, is SE = 1.125. We find an interval estimate for the difference in the population means with

7.14 ± 2 · 1.125 = 7.14 ± 2.25 = (4.89, 9.39)

We are 95% confident that the average commuting time for commuters in Atlanta is somewhere between 4.89 and 9.39 minutes more than the average commuting time for commuters in St. Louis.
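The two-sample procedure in 3.113 can be sketched as follows. The first part reproduces the interval arithmetic from the values given above; the second part runs the resampling on simulated commute times, which are an assumption made for illustration and are not the Atlanta and St. Louis data. The function names are hypothetical.

```python
import random
import statistics

def se_interval(estimate, se):
    """95% interval by the standard error method: estimate ± 2 · SE."""
    return estimate - 2 * se, estimate + 2 * se

def bootstrap_diff_means(sample_a, sample_b, n_boot=2000, seed=0):
    """Bootstrap differences in means for two independent samples."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each city separately, keeping the original sample sizes.
        boot_a = rng.choices(sample_a, k=len(sample_a))
        boot_b = rng.choices(sample_b, k=len(sample_b))
        diffs.append(statistics.fmean(boot_a) - statistics.fmean(boot_b))
    return diffs

# Arithmetic from the solution: 7.14 ± 2 · 1.125.
lo, hi = se_interval(7.14, 1.125)
print(f"({lo:.2f}, {hi:.2f})")  # → (4.89, 9.39)

# Simulated right-skewed commute times, 500 per city (assumption).
data_rng = random.Random(1)
atlanta = [data_rng.expovariate(1 / 29) for _ in range(500)]
st_louis = [data_rng.expovariate(1 / 22) for _ in range(500)]

diffs = bootstrap_diff_means(atlanta, st_louis)
point_estimate = statistics.fmean(atlanta) - statistics.fmean(st_louis)
se = statistics.stdev(diffs)  # SE = standard deviation of the bootstrap statistics
sim_lo, sim_hi = se_interval(point_estimate, se)
print(f"simulated: estimate = {point_estimate:.2f}, SE = {se:.2f}, "
      f"interval = ({sim_lo:.2f}, {sim_hi:.2f})")
```

Resampling each city on its own, rather than pooling the 1000 times, mirrors the fact that the two samples were collected independently.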

3.114 (a) The parameter of interest is ρ, the correlation between weight gain during a month of overeating and inactivity and weight gain over the next 2.5 years, for those adults who spend one month (possibly during December) overeating and being sedentary. The best point estimate for this parameter is r = 0.21.

(b) To create the bootstrap sample, we sample from the original sample with replacement. In this case, we randomly select one of the 18 ordered pairs, write down the values, and return them to the pile. Then we randomly select one of the 18 ordered pairs (possibly the same one), and write down those values as our second pair. We do this until we have 18 ordered pairs, and that dataset is our bootstrap sample.

(c) For each bootstrap sample, we record the correlation between the one month and 2.5 year weight gains of the 18 ordered pairs.

(d) We find the standard error by finding the standard deviation of the 1000 bootstrap correlations.

(e) The interval estimate is r ± 2 · SE = 0.21 ± 2(0.14) = 0.21 ± 0.28, so a 95% confidence interval for the population correlation ρ is −0.07 to 0.49.

(f) There is a reasonable possibility that there is no correlation at all between the amount of weight gained during the one month intervention and how much weight is gained over the long-term. We know that this is a reasonable possibility because 0 is inside the interval estimate so ρ = 0 is included as one of the plausible values of the population correlation.

(g) A 90% confidence interval needs to include only the middle 90% of the values in a bootstrap distribution, so it will be narrower than a 95% confidence interval.
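The steps in 3.114 (resample the 18 ordered pairs, record each correlation, take the standard deviation as the SE) can be sketched as below. The 18 pairs are simulated for illustration only and are not the study data; `correlation`, `bootstrap_correlations`, and `percentile_interval` are hypothetical helper names. The last lines compare the middle 90% and middle 95% of the same bootstrap distribution, as in part (g).

```python
import random
import statistics

def correlation(pairs):
    """Pearson correlation of (x, y) pairs; None if either variable is constant."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / (sxx * syy) ** 0.5

def bootstrap_correlations(pairs, n_boot=1000, seed=0):
    """Sorted correlations from resampling whole pairs with replacement."""
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        r = correlation(rng.choices(pairs, k=len(pairs)))
        if r is not None:  # skip degenerate resamples
            stats.append(r)
    return sorted(stats)

def percentile_interval(sorted_stats, level):
    """Cut an equal number of values from each tail of the sorted statistics."""
    n = len(sorted_stats)
    cut = n * (100 - level) // 200
    return sorted_stats[cut], sorted_stats[n - 1 - cut]

# Arithmetic from part (e): r ± 2 · SE with r = 0.21 and SE = 0.14.
lo_se, hi_se = 0.21 - 2 * 0.14, 0.21 + 2 * 0.14
print(f"({lo_se:.2f}, {hi_se:.2f})")  # → (-0.07, 0.49)

# Simulated (one-month gain, 2.5-year gain) pairs, n = 18 (assumption).
data_rng = random.Random(2)
pairs = [(data_rng.gauss(6.0, 1.5), data_rng.gauss(1.5, 2.5)) for _ in range(18)]

boot = bootstrap_correlations(pairs)
se = statistics.stdev(boot)
lo90, hi90 = percentile_interval(boot, 90)
lo95, hi95 = percentile_interval(boot, 95)
print(f"SE = {se:.2f}; 90% width = {hi90 - lo90:.2f}; 95% width = {hi95 - lo95:.2f}")
```

Because the middle 90% of the sorted statistics sits inside the middle 95%, the 90% interval can never be wider than the 95% interval from the same bootstrap distribution.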

3.115 (a) We see that the bootstrap distribution is relatively symmetric and bell-shaped, so it is reasonable to use the distribution to estimate a 95% confidence interval for the standard deviation of prices of all used Mustang cars. Using either the standard error method or the percentile method (estimating values that include the middle 95%), we estimate a 95% confidence interval to be about 7 to 14. We are 95% confident that the standard deviation of all prices of used Mustangs is between 7 thousand dollars and 14 thousand dollars.

(b) This bootstrap distribution is not symmetric and is not bell-shaped. It would not be appropriate to use this distribution to find a 95% confidence interval. The sample size is so small (at only n = 5) that the distribution ends up looking a bit bizarre. It is important to always look at the graph of the distribution. These methods apply only when the bootstrap distribution is reasonably symmetric and bell-shaped.

3.116 The bootstrap distribution for the standard deviations (shown below) has at least four completely separate clusters of dots. It is not at all symmetric and bell-shaped so it would not be appropriate to use this bootstrap distribution to find a confidence interval for the standard deviation. The clusters of dots represent the number of times the outlier is included in the bootstrap sample (with the cluster on the left containing statistics from samples in which the outlier was not included, the next one containing statistics from samples that included the outlier once, the next one containing statistics from samples that included the outlier twice, and so on).
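The clustering described in 3.116 can be reproduced with a small experiment. The sample below is hypothetical (six similar values plus one extreme outlier) and is not the data from the exercise; the sketch groups each bootstrap standard deviation by how many times the outlier was drawn.

```python
import random
import statistics
from collections import defaultdict

# Hypothetical small sample with one extreme outlier (assumption).
sample = [10, 12, 11, 13, 12, 11, 100]
OUTLIER = 100

rng = random.Random(0)
sds_by_count = defaultdict(list)
for _ in range(1000):
    boot = rng.choices(sample, k=len(sample))
    # Key each bootstrap SD by the number of copies of the outlier it contains.
    sds_by_count[boot.count(OUTLIER)].append(statistics.stdev(boot))

for count in sorted(sds_by_count):
    sds = sds_by_count[count]
    print(f"outlier x{count}: {len(sds):4d} samples, "
          f"SD from {min(sds):.1f} to {max(sds):.1f}")
```

Each printed row corresponds to one cluster in the dotplot: the ranges of standard deviations for zero copies and one copy of the outlier do not overlap, which is why the distribution breaks into separate groups.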
