Chapter 12 Significance Tests in Practice AP Statistics Hamilton/Mann.

Date posted: 01-Apr-2015

  • Slide 1

Chapter 12 Significance Tests in Practice AP Statistics Hamilton/Mann

  • Slide 2

Introduction

In Chapter 11, we made the unrealistic assumption that we knew the population standard deviation σ. In this chapter, we stop making that assumption. Just as with confidence intervals, when we do not know the population standard deviation σ, we must use a t distribution to carry out a significance test. Remember the use and abuse of tests from Section 11.3. Also remember that we could be making a Type I or Type II error whenever we use a significance test. Think before you calculate!

  • Slide 3

CHAPTER 12 SECTION 1
Tests about a Population Mean
HW: 12.2, 12.3, 12.4, 12.5, 12.6, 12.7, 12.9, 12.10, 12.11

  • Slide 4

Guinness Brewery

William S. Gosset ran experiments and used statistics to understand data for the Guinness brewery in Dublin, Ireland. He was trying to determine the best varieties of barley and hops for brewing. He ran into the problem of not knowing the population standard deviation σ. He observed that replacing σ by s and calling the result roughly Normal wasn't accurate enough. After much work, Gosset developed what we now call the t distribution. Guinness allowed Gosset to publish his discoveries, but not under his own name. He used the name "Student," and, as a result, it is sometimes referred to as Student's t distribution.

  • Slide 5

The t statistic has the same interpretation as any standardized statistic: it says how far x̄ is from its mean μ in standard deviation units. We are now going to learn to find P-values for a significance test about μ using the t table.

  • Slide 6

Determining P-values

Suppose we carry out a one-sided (upper-tail) significance test about μ based on a sample of size 20 and obtain t = 1.81. Since there were 20 observations, we have df = 19, so we look along the row of the t table with df = 19. Our t statistic falls between 1.729 and 2.093, which correspond to upper-tail probabilities of 0.05 and 0.025.
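As a cross-check on the table lookup for t = 1.81 with df = 19, the upper-tail area can also be computed numerically. The following is a minimal sketch in plain Python and is not part of the original slides: the helper names `t_pdf` and `t_upper_tail` are illustrative, and the tail area is approximated with Simpson's rule rather than taken from a statistics package.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|] and symmetry.

    steps must be even for Simpson's rule.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 <= T <= |t|)
    tail = 0.5 - area             # P(T >= |t|)
    return tail if t >= 0 else 1.0 - tail


# One-sided (upper-tail) test with t = 1.81 and df = 19, as on this slide.
p_one_sided = t_upper_tail(1.81, 19)
print(f"upper-tail area for t = 1.81, df = 19: {p_one_sided:.4f}")
```

The computed area should land inside the bracket that the table gives, since 1.81 lies between the tabled critical values 1.729 and 2.093.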
We can conclude that our P-value is between 0.025 and 0.05. Because the test is one-sided in the upper direction, the P-value is the area to the right of t, which is exactly the upper-tail probability given in the table.

  • Slide 7

Determining P-values

Suppose we carry out a two-sided significance test about μ based on a sample of size 37 and obtain t = -3.17. Since there were 37 observations, we have df = 36. Since there is no row for df = 36 in the table, we use the next smaller value and look along the row with df = 30. The absolute value of our t statistic, 3.17, falls between 3.030 and 3.385, which correspond to upper-tail probabilities of 0.0025 and 0.001. Since it is a two-sided test, we have to find the probability of a value less than -3.17 or greater than 3.17. By symmetry, the lower-tail probability is the same as the upper-tail probability, so we double both bounds. We can conclude that our P-value is between 0.002 and 0.005.

  • Slide 8

The One-Sample t-test

In significance tests, as in confidence intervals, we allow for unknown σ by using the standard error s/√n and replacing z by t.

  • Slide 9

Testing

Now we can do a realistic analysis of data produced to test a claim about an unknown population mean. Again, we need to follow the steps in our Inference Toolbox.

1. Hypotheses: What is the population of interest, and what are the hypotheses we are testing?
2. Conditions: SRS, Normality, and Independence
3. Calculations: Find the P-value
4. Interpretation: Connection, Conclusion, Context

  • Slide 10

Sweet Cola

Diet colas use artificial sweeteners to avoid sugar. These sweeteners gradually lose their sweetness over time. Manufacturers therefore test new colas for loss of sweetness before marketing them. Trained tasters (wouldn't you love this job?) sip the cola along with drinks of standard sweetness and score the cola on a sweetness scale of 1 to 10. The cola is then stored for a month at high temperature to imitate the effect of storing it at room temperature for four months. Each taster then scores the cola again after storage.
Our data are the differences (score before storage minus score after storage) in the tasters' scores. The bigger these differences, the bigger the loss of sweetness.

  • Slide 11

Sweet Cola

Here are the sweetness losses for a new cola, as measured by 10 different tasters. Notice that most are positive, indicating a loss of sweetness. Is there good evidence that the cola lost sweetness in storage?

2.0   0.4   0.7   2.0   -0.4   2.2   -1.3   1.2   1.1   2.3

Step 1 Hypotheses
We are interested in μ, the mean difference between the before-storage and after-storage sweetness scores. We test H0: μ = 0 (no loss of sweetness) against Ha: μ > 0 (the cola lost sweetness).

  • Slide 12

Sweet Cola

Step 2 Conditions
Since we do not know the standard deviation of sweetness loss in the population of tasters, we must use a one-sample t-test.

SRS: We must be willing to treat our 10 tasters as an SRS from the population of tasters if we want to draw conclusions about tasters in general. The tasters all have the same training, so even though we don't have an actual SRS, we are willing to act as if we did.

Normality: The sample is too small to check Normality effectively. The stemplot and boxplot show left skewness, but no gaps or outliers.

Independence: It is reasonable that different tasters would have results independent of each other.

  • Slide 13

Sweet Cola

Step 3 Calculations
Since our t-value is 2.70 and we have 10 observations (df = 9), the table puts our P-value between 0.01 and 0.02. Using the calculator, we get the more precise value 0.0123.

Step 4 Interpretation
A P-value between 0.01 and 0.02 is quite small and gives good evidence against H0. Therefore, we reject H0 and conclude that the cola has lost sweetness during storage.

  • Slide 14

T-Procedures

Because t-procedures are so common, all statistical software packages will do the calculations for you. The next two slides show the calculations as performed by four different statistical software packages:

1. DataDesk
2. Fathom
3. Minitab
4. CrunchIt!
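In the same spirit as the software packages listed above, the sweetness-loss test can be reproduced in a few lines of Python. This is a sketch rather than the output of any of those packages: it uses only the standard library, the helper names `t_pdf` and `t_upper_tail` are illustrative, the tail area is approximated with Simpson's rule, and the null value of 0 is an assumption taken from the question of whether the cola lost sweetness.

```python
import math
import statistics

# Sweetness losses (before minus after) for the 10 tasters.
losses = [2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3]
mu0 = 0.0  # assumed null value: no mean loss of sweetness


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|] and symmetry."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3
    return tail if t >= 0 else 1.0 - tail


n = len(losses)
x_bar = statistics.mean(losses)           # sample mean
s = statistics.stdev(losses)              # sample standard deviation (n - 1)
std_err = s / math.sqrt(n)                # standard error of the mean
t_stat = (x_bar - mu0) / std_err          # one-sample t statistic
p_value = t_upper_tail(t_stat, n - 1)     # one-sided, upper tail, df = n - 1

print(f"n = {n}, mean = {x_bar:.3f}, s = {s:.3f}")
print(f"t = {t_stat:.2f}, one-sided P-value = {p_value:.4f}")
```

The printed t statistic and P-value can be compared with the values reported in Step 3 (t = 2.70, P between 0.01 and 0.02).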
  • Slide 15

[Software output for the t-test; image not included in the transcript]

  • Slide 16

[Software output for the t-test; image not included in the transcript]

  • Slide 17

Two-Tailed Example

An investor with a stock portfolio worth several hundred thousand dollars sued his broker because lack of diversification in his portfolio led to poor performance. The table below gives the rates of return for the 39 months the account was managed by the broker. An arbitration panel compared these returns with the average of the Standard & Poor's 500 stock index for the same period. Consider the 39 monthly returns as a random sample from the monthly returns the broker would generate if he managed the account forever. Are these returns compatible with a population mean equal to the S&P 500 average?

 -8.36    1.63   -2.27   -2.93   -2.70   -2.93   -9.14   -2.64    6.82   -2.35   -3.58    6.13    7.00
-15.25   -8.66   -1.03   -9.16   -1.25   -1.22  -10.27   -5.11   -0.80   -1.44    1.28   -0.65    4.34
 12.22   -7.21   -0.09    7.34    5.04   -7.24   -2.14   -1.01   -1.41   12.03   -2.56    4.33    2.35

  • Slide 18

Two-Tailed Example

Step 1 Hypotheses
The null hypothesis is that μ equals the S&P 500 average for the period, and the alternative is two-sided (μ differs from that average), where μ is the mean return for all possible months that the broker could manage this account.

Step 2 Conditions
Since we don't know σ, we must use t-procedures.

SRS: We were told to assume it was a random sample.

Normality: Since we have 39 observations, the Central Limit Theorem says that the sampling distribution of the sample mean is approximately Normal. A boxplot and a histogram verify no outliers and no strong skewness.

Independence: This is a matter of judgment. Would these 39 months represent independent observations?

  • Slide 19

Two-Tailed Example

Step 3 Calculations
Since our t-value is -2.14 and we have 39 observations (df = 38, so we use the row with df = 30), we get an upper-tail probability between 0.02 and 0.025. Since it is a two-tailed test, our P-value is between 0.04 and 0.05.

Step 4 Interpretation
The mean monthly return for this client's account differs significantly from the S&P 500 for the same period (t = -2.14, P