Simultaneous inference

Uploaded by chelseayates
Simultaneous inference
Estimating (or testing) more than one thing at a time (such as β0 and β1) and
feeling confident about it …
Simultaneous inference we’ll be concerned about …
• Estimating β0 and β1 jointly.
• Estimating more than one mean response, E(Y), at a time.
• Predicting more than one new observation at a time.
Why simultaneous inference is important
• A 95% confidence interval implies a 95% chance that the interval contains β0.
• A 95% confidence interval implies a 95% chance that the interval contains β1.
• If the intervals were independent, there would be only a (0.95 × 0.95) × 100 = 90.25% chance that both intervals are correct.
• (The intervals are not actually independent, but the point stands.)
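The arithmetic behind the 90.25% figure can be sketched in a few lines of Python. The function name is hypothetical, and the calculation assumes independent intervals, which the slide notes is not actually the case here; it only illustrates how quickly joint coverage drops as the family grows.

```python
def family_confidence_if_independent(statement_level: float, g: int) -> float:
    """Chance that all g intervals are correct, ASSUMING they are independent."""
    return statement_level ** g

# Joint coverage of g separate 95% intervals under the independence assumption.
for g in (1, 2, 3, 5):
    level = family_confidence_if_independent(0.95, g)
    print(f"g = {g}: {level * 100:.2f}%")
```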
Terminology
• Family of estimates (or tests): a set of estimates (or tests) that you want to be simultaneously correct.
• Statement confidence level: the confidence level, as you know it, that is, for just one parameter.
• Family confidence level: the confidence level of the whole family of interval estimates (or tests).
Examples
• A 95% confidence interval for β0 – the 95% is a statement confidence level.
• A 95% confidence interval for β1 – the 95% is a statement confidence level.
• Consider the family of interval estimates for β0 and β1. If there is a 90.25% chance that both intervals are simultaneously correct, then 90.25% is the family confidence level.
Bonferroni joint confidence intervals for β0 and β1
• GOAL: To formulate joint confidence intervals for β0 and β1 with a specified family confidence level.
• BASIC IDEA:
– Make the statement confidence level for β0 higher.
– Make the statement confidence level for β1 higher.
– So that the family confidence level for (β0, β1) is at least (1 − α) × 100%.
Recall: Original confidence intervals
For β0: $b_0 \pm t(1-\alpha/2;\, n-2)\, s(b_0)$
For β1: $b_1 \pm t(1-\alpha/2;\, n-2)\, s(b_1)$
The goal is to adjust the t-multiples so that the family confidence coefficient is 1 − α.
That is, we need to find the α* to put into the above formulas in place of α to achieve the desired family confidence coefficient of 1 − α.
A little derivation
• Let A1 = the event that the first confidence interval does not contain β0 (i.e., is incorrect).
• So A1^C = the event that the first confidence interval contains β0 (i.e., is correct).
• P(A1) = α and P(A1^C) = 1 − α
A little derivation (cont’d)
• Let A2 = the event that the second confidence interval does not contain β1 (i.e., is incorrect).
• So A2^C = the event that the second confidence interval contains β1 (i.e., is correct).
• P(A2) = α and P(A2^C) = 1 − α
Becoming a not so little derivation…
[Venn diagram: events A1 and A2. The region "A1 or A2" is their union; everything outside it is "A1^C and A2^C".]
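We want P(A1^C and A2^C) to be at least 1 − α.
$$
\begin{aligned}
P(A_1^C \text{ and } A_2^C) &= 1 - P(A_1 \text{ or } A_2) \\
&= 1 - [P(A_1) + P(A_2) - P(A_1 \text{ and } A_2)] \\
&= 1 - P(A_1) - P(A_2) + P(A_1 \text{ and } A_2) \\
&\ge 1 - P(A_1) - P(A_2) \\
&= 1 - \alpha - \alpha \\
&= 1 - 2\alpha
\end{aligned}
$$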
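So if each interval is built with a statement error rate of α* in place of α, the family confidence level is at least 1 − 2α*. To guarantee a family level of 1 − α, we need α* to be set to α/2.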
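The bound from the derivation can be written as a small Python helper. This is only a sketch with hypothetical function names; it applies the same inequality to any list of statement-level error rates, of which the two-interval case above is the simplest instance.

```python
def bonferroni_family_lower_bound(statement_alphas):
    """Lower bound on P(all intervals correct): 1 minus the sum of the error rates."""
    return 1.0 - sum(statement_alphas)

def statement_alpha(family_alpha, g=2):
    """Equal split of the family error rate across g intervals (alpha* = alpha / g)."""
    return family_alpha / g

# Two ordinary 95% intervals: the family level is at least 90%.
print(bonferroni_family_lower_bound([0.05, 0.05]))

# For a 90% family level with two intervals, each interval gets alpha* = 0.05.
alpha_star = statement_alpha(0.10, g=2)
print(alpha_star, bonferroni_family_lower_bound([alpha_star, alpha_star]))
```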
Bonferroni joint confidence intervals
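Original intervals (statement confidence level 1 − α each):
$b_0 \pm t(1-\alpha/2;\, n-2)\, s(b_0)$
$b_1 \pm t(1-\alpha/2;\, n-2)\, s(b_1)$
Bonferroni intervals (family confidence level at least 1 − α):
$b_0 \pm t(1-\alpha/4;\, n-2)\, s(b_0)$
$b_1 \pm t(1-\alpha/4;\, n-2)\, s(b_1)$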
Typically, the t-multiple in this setting is called the Bonferroni multiple and is denoted by the letter B.
Example: 90% family confidence interval
The regression equation is
punt = 14.9 + 0.903 leg

Predictor   Coef     SE Coef   T      P
Constant    14.91    31.37     0.48   0.644
leg         0.9027   0.2101    4.30   0.001

n = 13 punters, t(0.975, 11) = 2.201
β0: $14.9 \pm 2.201(31.37) = (-54.1,\ 83.9)$
β1: $0.90 \pm 2.201(0.21) = (0.44,\ 1.36)$
We are 90% confident that β0 is between −54.1 and 83.9 and β1 is between 0.44 and 1.36.
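The two intervals in this example can be reproduced with a few lines of Python. This is a sketch using the rounded estimates, standard errors, and t-multiple quoted on the slide; the helper name `interval` is hypothetical. Note that the lower limit for β0 comes out negative.

```python
def interval(estimate, multiple, std_error):
    """Return (lower, upper) for estimate ± multiple × std_error."""
    half_width = multiple * std_error
    return (estimate - half_width, estimate + half_width)

B = 2.201  # t(0.975, 11), as quoted on the slide

b0_interval = interval(14.9, B, 31.37)  # intercept
b1_interval = interval(0.90, B, 0.21)   # slope for leg strength

print("beta0:", [round(x, 1) for x in b0_interval])
print("beta1:", [round(x, 2) for x in b1_interval])
```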
A couple of more points about Bonferroni intervals
• Bonferroni intervals are most useful when there are only a few interval estimates in the family (otherwise, the intervals get too wide).
• You can specify different statement confidence levels for the individual intervals and still get the desired family confidence level.
• The Bonferroni technique extends easily to g interval estimates. Set each statement confidence level at 1 − (α/g), so the t-multiple to look up is t(1 − α/(2g); n − 2).
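A minimal sketch of looking up the Bonferroni multiple for g intervals in Python, assuming SciPy is available (the slides themselves rely on Minitab and t-tables). `bonferroni_multiple` is a hypothetical helper name; `scipy.stats.t.ppf` returns a quantile of the t distribution.

```python
from scipy import stats

def bonferroni_multiple(family_alpha: float, g: int, n: int) -> float:
    """B = t(1 - alpha/(2g); n - 2) for g two-sided intervals in simple linear regression."""
    return stats.t.ppf(1 - family_alpha / (2 * g), df=n - 2)

# Two intervals (beta0 and beta1), 90% family level, n = 13 punters.
print(round(bonferroni_multiple(0.10, g=2, n=13), 3))

# Three intervals, 94% family level, n = 13 punters.
print(round(bonferroni_multiple(0.06, g=3, n=13), 3))
```

For a fixed family level, increasing g pushes the quantile further into the tail, which is why the intervals widen as the family grows.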
Bonferroni intervals for more than one mean response at a time
To estimate the mean response E(Yh) for g different Xh values with family confidence coefficient 1 − α:
$\hat{Y}_h \pm B\, s(\hat{Y}_h)$
where:
$B = t(1-\alpha/(2g);\, n-2)$
g is the number of confidence intervals in the family
Example: Mean punting distance for leg strengths of 140, 150, 160 lbs.
Predicted Values for New Observations
New   Fit      SE Fit   95.0% CI            95.0% PI
140   141.28   4.88     (130.55, 152.01)    (103.23, 179.33)
150   150.31   4.63     (140.13, 160.49)    (112.41, 188.20)
160   159.33   5.28     (147.72, 170.95)    (121.03, 197.64)

n = 13 punters, t(0.99, 11) = 2.718
$141.28 \pm 2.718(4.88) = (128.0,\ 154.5)$
$150.31 \pm 2.718(4.63) = (137.7,\ 162.9)$
$159.33 \pm 2.718(5.28) = (145.0,\ 173.7)$
We are 94% confident that the mean responses for leg strengths of 140, 150, 160 pounds are …
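The 94% figure follows from the multiple that was used: t(0.99, 11) leaves 0.01 in the upper tail, so α/(2g) = 0.01 with g = 3, giving α = 0.06. The sketch below, which uses only the Fit and SE Fit values from the Minitab output (variable names are illustrative), reproduces both the family level and the three intervals.

```python
# Fit and SE Fit values copied from the Minitab output above, keyed by leg strength.
fits = {140: (141.28, 4.88), 150: (150.31, 4.63), 160: (159.33, 5.28)}

B = 2.718          # t(0.99, 11), as quoted on the slide
upper_tail = 0.01  # 1 - 0.99, the tail area behind that multiple
g = len(fits)

# Bonferroni: alpha / (2g) = upper_tail, so the family level is 1 - 2 * g * upper_tail.
family_level = 1 - 2 * g * upper_tail

intervals = {
    x: (fit - B * se_fit, fit + B * se_fit)
    for x, (fit, se_fit) in fits.items()
}

print(f"family confidence level: {family_level:.0%}")
for x, (lo, hi) in intervals.items():
    print(f"leg = {x}: ({lo:.1f}, {hi:.1f})")
```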
Two procedures for predicting g new observations simultaneously
• Bonferroni procedure
• Scheffé procedure
• Use the procedure that gives the narrower prediction limits.
Bonferroni intervals for predicting more than one new obs’n at a time
To predict g new observations Yh for g different Xh values with family confidence coefficient 1 − α:
$\hat{Y}_h \pm B\, s(\text{pred})$
where:
$B = t(1-\alpha/(2g);\, n-2)$
g is the number of prediction intervals in the family
$s^2(\text{pred}) = MSE + s^2(\hat{Y}_h) = MSE + (\text{SE Fit})^2$
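The relationship between s(pred), MSE, and SE Fit can be checked numerically. The slides do not report MSE, so this sketch backs out an implied value from one pair of numbers quoted later in the deck (s(pred) = 17.28 with SE Fit = 4.88 at leg = 140) and then tests it against the other pair (SE Fit = 4.63, s(pred) = 17.21 at leg = 150). The implied MSE is an inferred quantity, not a figure from the source, and `s_pred` is a hypothetical helper name.

```python
import math

def s_pred(mse: float, se_fit: float) -> float:
    """Standard error of prediction: sqrt(MSE + SE_fit^2)."""
    return math.sqrt(mse + se_fit ** 2)

# MSE is not given on the slides; infer it from the leg = 140 values.
implied_mse = 17.28 ** 2 - 4.88 ** 2

# Consistency check at leg = 150, where the slides report s(pred) = 17.21.
print(round(s_pred(implied_mse, 4.63), 2))
```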
Scheffé intervals for predicting more than one new obs’n at a time
To predict g new observations Yh for g different Xh values with family confidence coefficient 1 − α:
$\hat{Y}_h \pm S\, s(\text{pred})$
where:
$S^2 = g\, F(1-\alpha;\, g,\, n-2)$
g is the number of prediction intervals in the family
$s^2(\text{pred}) = MSE + s^2(\hat{Y}_h) = MSE + (\text{SE Fit})^2$
Example: Punting distance for leg strengths of 140 and 150 lbs.
n = 13 punters
Suppose we want a 90% family confidence level.
Bonferroni multiple: $B = t(1 - 0.10/(2 \cdot 2);\, 13-2) = t(0.975;\, 11) = 2.201$
Scheffé multiple: $S = \sqrt{2\,F(1-0.10;\, 2,\, 11)} = \sqrt{2(2.86)} = 2.39$
Since B is smaller than S, the Bonferroni prediction intervals will be narrower, so use them here instead of the Scheffé intervals.
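The comparison of the two multiples can be sketched in Python, assuming SciPy is available (the slides use table look-ups). The helper names are hypothetical; `scipy.stats.t.ppf` and `scipy.stats.f.ppf` return quantiles of the t and F distributions.

```python
import math
from scipy import stats

def bonferroni_multiple(family_alpha, g, n):
    """B = t(1 - alpha/(2g); n - 2)."""
    return stats.t.ppf(1 - family_alpha / (2 * g), df=n - 2)

def scheffe_multiple(family_alpha, g, n):
    """S = sqrt(g * F(1 - alpha; g, n - 2))."""
    return math.sqrt(g * stats.f.ppf(1 - family_alpha, dfn=g, dfd=n - 2))

# Two new observations, 90% family level, n = 13 punters.
B = bonferroni_multiple(0.10, g=2, n=13)
S = scheffe_multiple(0.10, g=2, n=13)

# Pick whichever procedure gives the smaller multiple, hence narrower limits.
procedure = "Bonferroni" if B < S else "Scheffé"
print(f"B = {B:.3f}, S = {S:.3f}, use {procedure}")
```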
Example: Punting distance for leg strengths of 140 and 150 lbs. (cont’d)
Predicted Values for New Observations
New   Fit      SE Fit   95.0% CI            95.0% PI
140   141.28   4.88     (130.55, 152.01)    (103.23, 179.33)
150   150.31   4.63     (140.13, 160.49)    (112.41, 188.20)

n = 13 punters
s(pred(140)) = 17.28
s(pred(150)) = 17.21
$141.28 \pm 2.201(17.28) = (103.2,\ 179.3)$
$150.31 \pm 2.201(17.21) = (112.4,\ 188.2)$
There is a 90% chance that the punting distances for leg strengths of 140 and 150 pounds will be…
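The two Bonferroni prediction intervals can be reproduced from the fitted values and s(pred) values reported on the slides; the variable names below are illustrative. With g = 2 and a 90% family level, the Bonferroni multiple is t(0.975, 11), the same multiple used for a single 95% interval with 11 degrees of freedom, which is why these limits agree with the 95.0% PI column in the Minitab output.

```python
B = 2.201  # t(0.975, 11), the Bonferroni multiple for g = 2 and a 90% family level

# (fit, s(pred)) for each leg strength, as reported on the slides.
new_obs = {140: (141.28, 17.28), 150: (150.31, 17.21)}

prediction_intervals = {
    x: (fit - B * s_pred, fit + B * s_pred)
    for x, (fit, s_pred) in new_obs.items()
}

for x, (lo, hi) in prediction_intervals.items():
    print(f"leg = {x}: ({lo:.1f}, {hi:.1f})")
```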
Simultaneous prediction in Minitab
• Stat >> Regression >> Regression …
• Specify predictor and response.
• Under Options …, in the “Prediction intervals for new observations” box, specify a column name containing multiple X values. Specify confidence level.
• Click on OK. Click on OK.
• Results appear in session window.