Chapter 11 11/16/2009
Basic Biostat 1
November 09 1
Chapter 11: Inference About a Mean

In Chapter 11:
11.1 Estimated Standard Error of the Mean
11.2 Student’s t Distribution
11.3 One-Sample t Test
11.4 Confidence Interval for μ
11.5 Paired Samples
11.6 Conditions for Inference
11.7 Sample Size and Power

σ is not known
• The prior chapter used z procedures to help infer µ
• z procedures require that we know the population standard deviation σ ahead of time
• When this is not the case, we calculate the sample standard deviation s and use it as an estimate of σ
• The sample standard deviation s is then used to calculate the estimated standard error of the mean:
$SE_{\bar{x}} = s / \sqrt{n}$
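The standard error formula above can be sketched in a few lines of Python. This is only an illustration: the function name `standard_error` and the five data values are made up and do not come from the chapter.

```python
import math
import statistics


def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


# Hypothetical measurements, used only to exercise the function
sample = [2.0, 4.0, 4.0, 6.0, 9.0]
print(round(standard_error(sample), 4))
```

Note that `statistics.stdev` uses the n − 1 denominator, which matches the sample standard deviation s used here.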

Additional Uncertainty
• Using s instead of σ in the standard error formula adds uncertainty to inferences
• This precludes use of the Normal distribution and z procedures
• Instead, we rely on Student's t procedures
$SE_{\bar{x}} = s / \sqrt{n}$

Student’s t distributions
• A family of densities
• Family members are identified by degrees of freedom (df)
• Similar to z but with broader tails
• As df increases → tails get skinnier → t becomes like z
A t distribution with infinite degrees of freedom is the Standard Normal z
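The convergence of t toward z can be checked numerically. The sketch below evaluates the standard t density formula at a point in the tail for several df values and compares it with the Standard Normal density. The function names `t_pdf` and `z_pdf` and the choice of x = 3 are assumptions made for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # Work on the log scale so large df values do not overflow the gamma function
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def z_pdf(x):
    """Density of the Standard Normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Density in the tail (x = 3) shrinks toward the Normal value as df grows
for df in (1, 5, 30, 1000):
    print(df, round(t_pdf(3.0, df), 5))
print("z", round(z_pdf(3.0), 5))
```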

Table C (t table)
• Entries ⇒ t critical values
• Rows ⇒ df
• Columns ⇒ probabilities

One-Sample t Test
Purpose of t test: to seek evidence that the true population mean µ is different from some stated value µ0
Conditions for the procedure:
• Simple Random Sample
• Normal population or large sample
• Population µ not known
• Population σ not known

Hypothesis Statements
• Null hypothesis (no difference between population mean µ and stated value µ0)
  H0: µ = µ0
• Alternative hypotheses
  Ha: µ < µ0 (one-sided)
  Ha: µ > µ0 (one-sided)
  Ha: µ ≠ µ0 (two-sided)

One-Sample t Test Statistic
$t_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{SE_{\bar{x}}}$
where $\bar{x} \equiv$ the sample mean, $\mu_0 \equiv$ the mean under the null hypothesis, and $SE_{\bar{x}} = s / \sqrt{n}$
This statistic has n − 1 degrees of freedom

P-value
• The one-tailed P-value is the area under the curve in the tail beyond the tstat
• Look up in Table C or use a computer applet
• Two-tailed P-value = 2 × one-tailed P-value
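As a rough stand-in for the computer applet, the tail area can be approximated by numerically integrating the t density. The sketch below uses Simpson's rule. The function names `t_pdf` and `upper_tail_p` and the step count are assumptions for illustration, not the method used by any particular program. The check value 2.262 is the Table C critical value for 9 df and an upper-tail probability of 0.025.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def upper_tail_p(t_stat, df, steps=2000):
    """One-tailed P-value: area under the t curve beyond |t_stat|.

    Integrates the density over [0, |t_stat|] with Simpson's rule and
    subtracts the result from 0.5, the area of half the symmetric curve.
    """
    b = abs(t_stat)
    if b == 0:
        return 0.5
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


one_tailed = upper_tail_p(2.262, 9)
two_tailed = 2 * one_tailed
print(round(one_tailed, 4), round(two_tailed, 4))
```

Because the t curve is symmetric, the same function serves for negative test statistics, and doubling its result gives the two-tailed P-value.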

Example
• Statement of problem: Do SIDS babies have lower birth weight on average? Prior research suggested µ = 3300 grams in the general (non-SIDS) population
• Statistical hypotheses: H0: µ = 3300 vs. Ha: µ < 3300 (one-sided) or Ha: µ ≠ 3300 (two-sided)

Example
Data: An SRS of n = 10 SIDS babies shows x-bar = 2890.5 grams and s = 720.0 grams
$t_{\text{stat}} = \dfrac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \dfrac{2890.5 - 3300}{720 / \sqrt{10}} = -1.80$
This t statistic has df = n − 1 = 10 − 1 = 9
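The arithmetic in this example can be reproduced directly from the summary statistics. The variable names below are chosen for illustration; the numbers are the ones given in the example.

```python
import math

n = 10          # sample size
x_bar = 2890.5  # sample mean (grams)
s = 720.0       # sample standard deviation (grams)
mu_0 = 3300.0   # mean under the null hypothesis (grams)

se = s / math.sqrt(n)         # estimated standard error of the mean
t_stat = (x_bar - mu_0) / se  # one-sample t statistic
df = n - 1                    # degrees of freedom

print(f"SE = {se:.2f}, t_stat = {t_stat:.2f}, df = {df}")
```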
P-value via StaTable.exe

P-value via Table C
• Bracket |tstat| between t critical values
• tstat = −1.80 w/ 9 df
Table C.
Upper-tail P   0.25    0.20    0.15    0.10    0.05    0.025
df = 9         0.703   0.883   1.100   1.383   1.833   2.262
|tstat| = 1.80 falls between 1.383 and 1.833
Thus ⇒ One-tailed: 0.05 < P < 0.10; Two-tailed: 0.10 < P < 0.20
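The bracketing step can also be written as a small lookup. The sketch below hard-codes the df = 9 row of Table C shown above. The names `TABLE_C_DF9` and `bracket_one_tailed_p` are hypothetical.

```python
# Table C row for df = 9: (upper-tail probability, t critical value)
TABLE_C_DF9 = [(0.25, 0.703), (0.20, 0.883), (0.15, 1.100),
               (0.10, 1.383), (0.05, 1.833), (0.025, 2.262)]


def bracket_one_tailed_p(t_stat, row):
    """Return (low, high) bounds on the one-tailed P-value.

    A bound is None when |t_stat| falls off that end of the row.
    """
    t_abs = abs(t_stat)
    low, high = None, None
    for p, t_crit in row:  # row is ordered by increasing critical value
        if t_crit <= t_abs:
            high = p  # P is at most this tail probability
        else:
            low = p   # first critical value above |t_stat|
            break
    return low, high


low, high = bracket_one_tailed_p(-1.80, TABLE_C_DF9)
print(f"One-tailed: {low} < P < {high}")
print(f"Two-tailed: {2 * low} < P < {2 * high}")
```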

Interpretation
• The two-tailed P-value was 0.11
• Interpret in context of H0: µ = 3300 vs. Ha: µ ≠ 3300
• Non-significant evidence against H0
• Data provide little reason to doubt H0
• Retain H0 at α = .10 and below

(1 − α)100% CI for µ
$\bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot SE_{\bar{x}}$
where $SE_{\bar{x}} = s / \sqrt{n}$ and $t_{n-1,\,1-\alpha/2} \equiv$ the t value with n − 1 df and cumulative probability of 1 − α/2 (from Table C)

Example
Same data as before: x-bar = 2890.5, s = 720.0, n = 10
$SE_{\bar{x}} = s / \sqrt{n} = 720 / \sqrt{10} = 227.68$
For 95% confidence (α = .05) use $t_{10-1,\,1-.05/2} = t_{9,\,.975} = 2.262$ from Table C
95% CI for µ $= \bar{x} \pm t_{9,\,.975} \cdot SE_{\bar{x}} = 2890.5 \pm 2.262 \cdot 227.68 = 2890.5 \pm 515.1$ = (2375.4 to 3405.6) grams
Interpretation addresses true population mean µ
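A minimal sketch of the interval calculation follows, assuming the critical value is read from Table C and passed in by hand. The function name `t_confidence_interval` is made up for illustration. Because 2.262 is itself a rounded value, the last printed digit may differ slightly from the limits quoted above.

```python
import math


def t_confidence_interval(x_bar, s, n, t_crit):
    """CI for the mean: x_bar ± t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Summary statistics from the example; 2.262 is t(9, .975) from Table C
low, high = t_confidence_interval(2890.5, 720.0, 10, 2.262)
print(f"95% CI for mu: ({low:.1f} to {high:.1f}) grams")
```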

§11.5 Paired Samples
• Two samples
• Each data point in one sample is uniquely paired to a data point in the other sample
• Examples
– Sequential samples within individuals
– “Pre-test/post-test”
– Cross-over trials
– Pair-matching
Example: Statement of Problem
• Does oat bran reduce LDL cholesterol?
• Half of subjects start on CORNFLK, half on OATBRAN
• Two weeks on diet 1 ⇒ LDL cholesterol
• Washout period
• Cross-over to other diet
• Two weeks on diet 2 ⇒ LDL cholesterol

Oat bran data
Subject  CORNFLK  OATBRAN
-------  -------  -------
1        4.61     3.84
2        6.42     5.57
3        5.40     5.85
4        4.54     4.80
5        3.98     3.68
6        3.82     2.96
7        5.01     4.41
8        4.34     3.72
9        3.80     3.49
10       4.56     3.84
11       5.35     5.26
12       3.89     3.73

Within Pair Difference “DELTA”
• Let DELTA = CORNFLK − OATBRAN
• First three observations in OATBRAN data:
ID    CORNFLK  OATBRAN  DELTA
----  -------  -------  -----
1     4.61     3.84      0.77
2     6.42     5.57      0.85
3     5.40     5.85     -0.45
etc.
Explore and describe DELTA
Stemplot w/ quint. split
-0f | 4
-0t | 2
-0* |
+0* | 01
+0t | 33
+0f |
+0s | 6677
+0. | 88
×1
DELTA values: 0.77, 0.85, −0.45, −0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16
$n = 12$, $\bar{x}_d = 0.3808$, $s_d = 0.4335$
subscript d denotes statistic is for DELTA variable (optional)
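The DELTA values and their summary statistics can be recomputed from the oat bran data table. The sketch below uses Python's standard `statistics` module. The variable names are chosen for illustration.

```python
import statistics

# LDL cholesterol by subject (1 to 12) from the oat bran data table
cornflk = [4.61, 6.42, 5.40, 4.54, 3.98, 3.82, 5.01, 4.34, 3.80, 4.56, 5.35, 3.89]
oatbran = [3.84, 5.57, 5.85, 4.80, 3.68, 2.96, 4.41, 3.72, 3.49, 3.84, 5.26, 3.73]

# Within-pair difference for each subject, rounded to the data's two decimals
delta = [round(c - o, 2) for c, o in zip(cornflk, oatbran)]

n = len(delta)
mean_d = statistics.mean(delta)
sd_d = statistics.stdev(delta)  # sample standard deviation (n - 1 denominator)

print(delta)
print(f"n = {n}, mean = {mean_d:.4f}, s = {sd_d:.4f}")
```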

(1 – α)100% CI for µd
$\bar{x}_d \pm t_{n-1,\,1-\alpha/2} \cdot SE_{\bar{x}_d}$
where $SE_{\bar{x}_d} = s_d / \sqrt{n}$
t procedure directed toward DELTA
Example1251.0201.20.3808 for CI %95 ⋅±=dμ
0.43350.3808
12:Data
===
d
d
sxn
1251.12
4335.0==
dxSE
C) (Table 201.2 use .05)( confidence 95%For
975,.111112 205 ==
=
−− tt .,
α
)656.0 to105.0(=
2754.00.3808 ±=
Interpretation addresses true mean difference µd

Paired t test: Example
• Statement of problem: Is the oat bran diet associated with a decline (one-sided) or change (two-sided) in LDL cholesterol?
• H0: µd = 0
• Ha: µd > 0 (one-sided)
• Ha: µd ≠ 0 (two-sided)
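
Paired t statistic
$t_{\text{stat}} = \dfrac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}}$ where df = n − 1
Recall current data: $n = 12$, $\bar{x}_d = 0.3808$, $s_d = 0.4335$
Under H0, µ0 = 0
$t_{\text{stat}} = \dfrac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}} = \dfrac{0.3808 - 0}{0.4335 / \sqrt{12}} = 3.043$
df = n − 1 = 12 − 1 = 11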
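The paired t statistic is a one-sample t statistic applied to the within-pair differences. The sketch below shows this starting from the raw oat bran data. The function name `paired_t` and its `mu_0` parameter are assumptions for illustration.

```python
import math
import statistics


def paired_t(first, second, mu_0=0.0):
    """Paired t statistic: a one-sample t test on the within-pair differences."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    t_stat = (statistics.mean(diffs) - mu_0) / se
    return t_stat, n - 1  # statistic and its degrees of freedom


# LDL cholesterol by subject (1 to 12) from the oat bran data table
cornflk = [4.61, 6.42, 5.40, 4.54, 3.98, 3.82, 5.01, 4.34, 3.80, 4.56, 5.35, 3.89]
oatbran = [3.84, 5.57, 5.85, 4.80, 3.68, 2.96, 4.41, 3.72, 3.49, 3.84, 5.26, 3.73]

t_stat, df = paired_t(cornflk, oatbran)
print(f"t_stat = {t_stat:.3f}, df = {df}")
```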
P-value via StaTable

P-value via Table C
• Bracket |tstat| between t critical values
• tstat = 3.043 w/ 11 df
Table C.
Upper-tail P   .01     .005    .0025
df = 11        2.718   3.106   3.497
|tstat| = 3.043 falls between 2.718 and 3.106
Thus ⇒ One-tailed: 0.005 < P < 0.01; Two-tailed: 0.01 < P < 0.02

Interpretation
• The two-tailed P-value was 0.011
• Interpret in context of H0: µd = 0 vs. Ha: µd ≠ 0
• Significant evidence against H0
• Data provide good reason to doubt H0
• Reject H0 at α = .05 but retain at α = .01
SPSS Output: “Oat Bran”

The Normality Condition
• t procedures require a Normal population or large samples
• How do we assess this condition?
• It is OK to use t procedures when:
– Population is Normal
– Population is symmetrical and n ≥ 10
– Population is skewed and n ≥ 30 to 60 (depending on the severity of skew)

Can t procedures be used?
Dataset is skewed and small ⇒ avoid t procedures

Can t procedures be used?
Dataset B has a mild skew and is moderate in size ⇒ OK to use t procedures

Can t procedures be used?
Data are highly skewed and moderate in size ⇒ avoid t procedures