σis not known Additional Uncertainty - San Jose State ... · PDF file11.2...


Chapter 11 11/16/2009

Basic Biostat 1

November 09 1

Chapter 11: Inference About a Mean

November 09 2

In Chapter 11:

11.1 Estimated Standard Error of the Mean
11.2 Student’s t Distribution
11.3 One-Sample t Test
11.4 Confidence Interval for μ
11.5 Paired Samples
11.6 Conditions for Inference
11.7 Sample Size and Power

November 09 3

σ is not known
• Prior chapter used z procedures to help infer µ
• z procedures required that we know the population standard deviation σ ahead of time
• When this is not the case, we calculate the sample standard deviation s and use it as an estimate of σ
• The sample standard deviation s is then used to calculate the estimated standard error of the mean:

$SE_{\bar{x}} = \frac{s}{\sqrt{n}}$
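The estimated standard error formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `estimated_se` is an assumption, and the input values (s = 720.0 grams, n = 10) are borrowed from the SIDS birth weight example used later in this chapter.

```python
import math


def estimated_se(s: float, n: int) -> float:
    """Estimated standard error of the mean: s / sqrt(n).

    `estimated_se` is an illustrative helper name, not from the slides.
    """
    if n < 2:
        raise ValueError("need at least two observations to estimate s")
    return s / math.sqrt(n)


# Values from the chapter's SIDS example: s = 720.0 grams, n = 10
se = estimated_se(720.0, 10)
print(round(se, 2))  # 227.68
```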

November 09 4

Additional Uncertainty

• Using s instead of σ in the standard error formula adds uncertainty to inferences

• This precludes use of the Normal distribution & z procedures

• Instead, we rely on Student t procedures

$SE_{\bar{x}} = \frac{s}{\sqrt{n}}$

November 09 5

Student’s t distributions
• A family of densities
• Family members identified by degrees of freedom (df)
• Similar to “z” but with broader tails
• As df increases → tails get skinnier → t becomes like z

A t distribution with infinite degrees of freedom is the Standard Normal z
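To see the "broader tails" numerically, the sketch below evaluates the Student t density and the Standard Normal density at the same point for several df. It uses the standard closed-form t density; the helper names `t_pdf` and `z_pdf` and the evaluation point 3.0 are illustrative choices, not from the slides.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t with `df` degrees of freedom at x."""
    # lgamma keeps the gamma-function ratio stable for large df
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def z_pdf(x: float) -> float:
    """Density of the Standard Normal at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Height of each curve far out in the tail (x = 3.0 is an arbitrary choice)
for df in (2, 9, 30, 100):
    print(df, t_pdf(3.0, df), z_pdf(3.0))
```

The t density at x = 3 should shrink toward the Normal value as df grows, which is the "tails get skinnier" behavior described above.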

November 09 6

Table C (t table)

Entries ⇒ t critical values
Rows ⇒ df
Columns ⇒ probabilities


November 09 7

One-Sample t Test

Purpose of t test: to seek evidence that the true population mean µ is different from some stated value µ0

Conditions for the procedure:
• Simple Random Sample
• Normal population or large sample
• Population µ not known
• Population σ not known

November 09 8

Hypothesis Statements
• Null hypothesis (no difference between population mean µ and stated value µ0)
  H0: µ = µ0
• Alternative hypotheses
  Ha: µ < µ0 (one-sided)
  Ha: µ > µ0 (one-sided)
  Ha: µ ≠ µ0 (two-sided)

November 09 9

One-Sample t Test Statistic

$t_{stat} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}}$

where $\bar{x} \equiv$ the sample mean, $\mu_0 \equiv$ the mean under the null hypothesis, and $SE_{\bar{x}} = \frac{s}{\sqrt{n}}$

This statistic has n – 1 degrees of freedom

November 09 10

P-value
• The one-tailed P-value is the area under the curve in the tail beyond the tstat
• Look up in Table C or use a computer applet
• Two-tailed P-value = 2 × one-tailed P-value

November 09 11

Example
• Statement of problem: Do SIDS babies have lower birth weight on average? Prior research suggested μ = 3300 grams in the general (non-SIDS) population
• Statistical hypotheses: H0: µ = 3300 vs. Ha: µ < 3300 (one-sided) or Ha: µ ≠ 3300 (two-sided)

November 09 12

Example

Data: An SRS of n = 10 SIDS babies shows x-bar = 2890.5 grams and s = 720.0 grams

$t_{stat} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{2890.5 - 3300}{720 / \sqrt{10}} = -1.80$

This t statistic has df = n − 1 = 10 − 1 = 9
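The calculation on this slide can be checked with a short Python sketch. The function name `one_sample_t` is an illustrative assumption; the inputs are the summary statistics given for the SIDS sample.

```python
import math


def one_sample_t(xbar: float, mu0: float, s: float, n: int):
    """Return (t_stat, df) for a one-sample t test from summary statistics."""
    se = s / math.sqrt(n)          # estimated standard error of the mean
    return (xbar - mu0) / se, n - 1


# SIDS example: x-bar = 2890.5 g, s = 720.0 g, n = 10, H0: mu = 3300 g
t_stat, df = one_sample_t(xbar=2890.5, mu0=3300.0, s=720.0, n=10)
print(round(t_stat, 2), df)  # -1.8 9
```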


November 09 13

P-value via StaTable.exe
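The StaTable screenshot is not part of this transcript. As a rough stand-in, the sketch below computes the tail area numerically by integrating the t density with Simpson's rule, using only the Python standard library. It does not reproduce StaTable itself; the function names and the step count are assumptions made for illustration.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t with `df` degrees of freedom at x."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def tail_area_beyond(t_stat: float, df: float, steps: int = 2000) -> float:
    """One-tailed P: area in the single tail beyond |t_stat|.

    Integrates the density over [0, |t_stat|] with Simpson's rule
    (`steps` must be even) and subtracts the result from 0.5,
    relying on the symmetry of the t distribution about 0.
    """
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# SIDS example: tstat = -1.80 with 9 df
p_one = tail_area_beyond(-1.80, 9)
p_two = 2 * p_one
print(p_one, p_two)
```

For the SIDS example the two-tailed result should land close to the 0.11 quoted on the Interpretation slide.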

November 09 14

P-value via Table C
• Bracket |tstat| between t critical values
• tstat = −1.80 w/ 9 df

Table C.

Upper-tail P   0.25    0.20    0.15    0.10    0.05    0.025
df = 9         0.703   0.883   1.100   1.383   1.833   2.262

|tstat| = 1.80 falls between 1.383 and 1.833

Thus ⇒ One-tailed: 0.05 < P < 0.10
       Two-tailed: 0.10 < P < 0.20
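The bracketing step can also be written as a small lookup. The sketch below hard-codes only the df = 9 row of Table C as transcribed in these slides; the function name `bracket_p` and the tuple layout are illustrative assumptions.

```python
# Table C row for df = 9 as given in the slides: (upper-tail P, t critical value),
# ordered by increasing critical value
TABLE_C_DF9 = [
    (0.25, 0.703), (0.20, 0.883), (0.15, 1.100),
    (0.10, 1.383), (0.05, 1.833), (0.025, 2.262),
]


def bracket_p(t_stat, row):
    """Return (lower, upper) bounds on the one-tailed P-value.

    A bound of None means |t_stat| lies outside the tabled range on that side.
    """
    t_abs = abs(t_stat)
    lower_p, upper_p = None, None
    for p, crit in row:
        if crit <= t_abs:
            upper_p = p      # |t| reaches this critical value, so P <= p
        else:
            lower_p = p      # |t| falls short of this one, so P > p
            break
    return lower_p, upper_p


low, high = bracket_p(-1.80, TABLE_C_DF9)
print(low, high)            # 0.05 0.1  (one-tailed bounds)
print(2 * low, 2 * high)    # 0.1 0.2   (two-tailed bounds)
```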

November 09 15

Interpretation
• The two-tailed P-value was 0.11
• Interpret in context of H0: µ = 3300 vs. Ha: µ ≠ 3300
• Non-significant evidence against H0
• Data provide little reason to doubt H0
• Retain H0 at α = .10 and below

November 09 16

(1− α)100% CI for µ

$\bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot SE_{\bar{x}}$

where $SE_{\bar{x}} = \frac{s}{\sqrt{n}}$ and $t_{n-1,\,1-\alpha/2} \equiv$ the t value with n − 1 df and cumulative probability of 1 − α/2 (from Table C)

November 09 17

Example

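Same data as before: $\bar{x} = 2890.5$, $s = 720.0$, $n = 10$

$SE_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{720}{\sqrt{10}} = 227.68$

For 95% confidence (α = .05) use: $t_{10-1,\,1-.05/2} = t_{9,\,.975} = 2.262$ from Table C

95% CI for µ $= \bar{x} \pm t_{9,\,.975} \cdot SE_{\bar{x}} = 2890.5 \pm 2.262 \cdot 227.68 = 2890.5 \pm 515.1$ = (2375.4 to 3405.6) grams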

Interpretation addresses true population mean µ
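A sketch of the same interval in Python follows. The critical value 2.262 is taken from Table C as quoted in this example rather than computed, and `t_confidence_interval` is an illustrative name. Because the code carries full precision through the multiplication, its endpoints may differ from the slide's hand-rounded figures by about 0.1 gram.

```python
import math


def t_confidence_interval(xbar: float, s: float, n: int, t_crit: float):
    """CI for the mean: xbar +/- t_crit * s / sqrt(n).

    `t_crit` must be looked up for n - 1 df and the desired confidence level.
    """
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# SIDS example, 95% confidence: t(9, .975) = 2.262 from Table C
lo, hi = t_confidence_interval(2890.5, 720.0, 10, t_crit=2.262)
print(round(lo, 1), round(hi, 1))
```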

November 09 18

§11.5 Paired Samples
• Two samples
• Each data point in one sample uniquely paired to a data point in the other sample
• Examples
  – Sequential samples within individuals
  – “Pre-test/post-test”
  – Cross-over trials
  – Pair-matching


November 09 19

Example: Statement of Problem

• Does oat bran reduce LDL cholesterol?
• Half subjects start on CORNFLK, half on OATBRAN
• Two weeks on diet 1 ⇒ LDL cholesterol
• Washout period
• Cross-over to other diet
• Two weeks on diet 2 ⇒ LDL cholesterol

November 09 20

Oat bran data

Subject  CORNFLK  OATBRAN
-------  -------  -------
 1        4.61     3.84
 2        6.42     5.57
 3        5.40     5.85
 4        4.54     4.80
 5        3.98     3.68
 6        3.82     2.96
 7        5.01     4.41
 8        4.34     3.72
 9        3.80     3.49
10        4.56     3.84
11        5.35     5.26
12        3.89     3.73

November 09 21

Within Pair Difference “DELTA”

• Let DELTA = CORNFLK - OATBRAN
• First three observations in OATBRAN data:

ID    CORNFLK  OATBRAN  DELTA
----  -------  -------  -----
1     4.61     3.84      0.77
2     6.42     5.57      0.85
3     5.40     5.85     -0.45

etc.

November 09 22

Explore and describe DELTA

Stemplot w/ quint. split

-0f|4
-0*|2
+0*|01
+0t|33
+0f|
+0s|6677
+0.|88
×1

DELTA values: 0.77, 0.85, −0.45, −0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16

$n = 12$, $\bar{x}_d = 0.3808$, $s_d = 0.4335$

subscript d denotes statistic is for DELTA variable (optional)
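The DELTA column and its summary statistics can be reproduced from the oat bran data with Python's standard `statistics` module. The variable names below are illustrative; the data are the twelve CORNFLK and OATBRAN values listed on the "Oat bran data" slide.

```python
import statistics

# LDL cholesterol on each diet, subjects 1 to 12 (from the "Oat bran data" slide)
cornflk = [4.61, 6.42, 5.40, 4.54, 3.98, 3.82, 5.01, 4.34, 3.80, 4.56, 5.35, 3.89]
oatbran = [3.84, 5.57, 5.85, 4.80, 3.68, 2.96, 4.41, 3.72, 3.49, 3.84, 5.26, 3.73]

# DELTA = CORNFLK - OATBRAN, rounded to 2 decimals to remove floating-point noise
delta = [round(c - o, 2) for c, o in zip(cornflk, oatbran)]

n = len(delta)
mean_d = statistics.mean(delta)   # x-bar_d
sd_d = statistics.stdev(delta)    # s_d (sample standard deviation, n - 1 divisor)

print(delta)
print(n, round(mean_d, 4), round(sd_d, 4))  # 12 0.3808 0.4335
```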

November 09 23

(1 – α)100% CI for µd

$\bar{x}_d \pm t_{n-1,\,1-\alpha/2} \cdot SE_{\bar{x}_d}$

t procedure directed toward DELTA

where $SE_{\bar{x}_d} = \frac{s_d}{\sqrt{n}}$

November 09 24

Example

Data: $n = 12$, $\bar{x}_d = 0.3808$, $s_d = 0.4335$

$SE_{\bar{x}_d} = \frac{0.4335}{\sqrt{12}} = 0.1251$

For 95% confidence (α = .05) use $t_{12-1,\,1-.05/2} = t_{11,\,.975} = 2.201$ (Table C)

95% CI for µd $= 0.3808 \pm 2.201 \cdot 0.1251 = 0.3808 \pm 0.2754$ = (0.105 to 0.656)

Interpretation addresses true mean difference µd
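For the paired case the interval is built the same way, only on the DELTA summary statistics. A minimal sketch, assuming the Table C critical value 2.201 quoted in this example; the variable names are illustrative.

```python
import math

# Summary statistics for DELTA and the Table C critical value t(11, .975)
n, mean_d, sd_d = 12, 0.3808, 0.4335
t_crit = 2.201

se_d = sd_d / math.sqrt(n)   # standard error of the mean difference
margin = t_crit * se_d       # margin of error
ci_low, ci_high = mean_d - margin, mean_d + margin

print(round(se_d, 4), round(margin, 4))
print(round(ci_low, 3), round(ci_high, 3))  # 0.105 0.656
```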


November 09 25

Paired t test: Example

• Statement of problem: Is the oat bran diet associated with a decline (one-sided) or change (two-sided) in LDL cholesterol?
• H0: µd = 0
• Ha: µd > 0 (one-sided)
• Ha: µd ≠ 0 (two-sided)

November 09 26

Paired t statistic

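$t_{stat} = \frac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}}$ where $df = n - 1$

Recall current data: $n = 12$, $\bar{x}_d = 0.3808$, $s_d = 0.4335$

Under H0, µ0 = 0

$t_{stat} = \frac{\bar{x}_d - \mu_0}{s_d / \sqrt{n}} = \frac{0.38083 - 0}{0.4335 / \sqrt{12}} = 3.043$

$df = n - 1 = 12 - 1 = 11$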
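Because the paired t test is a one-sample t test applied to the within-pair differences, it can be sketched directly from the list of DELTA values. The function name `paired_t` is an illustrative assumption; the data are the twelve DELTA values listed earlier in the chapter.

```python
import math
import statistics

# Within-pair differences (CORNFLK - OATBRAN) from the DELTA slide
delta = [0.77, 0.85, -0.45, -0.26, 0.30, 0.86, 0.60, 0.62, 0.31, 0.72, 0.09, 0.16]


def paired_t(differences, mu0=0.0):
    """Return (t_stat, df) for a paired t test on within-pair differences."""
    n = len(differences)
    se = statistics.stdev(differences) / math.sqrt(n)
    return (statistics.mean(differences) - mu0) / se, n - 1


t_stat, df = paired_t(delta)
print(round(t_stat, 3), df)  # 3.043 11
```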

November 09 27

P-value via StaTable

November 09 28

P-value via Table C
• Bracket |tstat| between t critical values
• tstat = 3.043 w/ 11 df

Table C.

Upper-tail P   .01     .005    .0025
df = 11        2.718   3.106   3.497

|tstat| = 3.043 falls between 2.718 and 3.106

Thus ⇒ One-tailed: 0.005 < P < 0.01
       Two-tailed: 0.01 < P < 0.02

November 09 29

Interpretation
• The two-tailed P-value was 0.011
• Interpret in context of H0: µd = 0 vs. Ha: µd ≠ 0
• Significant evidence against H0
• Data provide good reason to doubt H0
• Reject H0 at α = .05 but retain at α = .01
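The last bullet follows from comparing the P-value with each α. A minimal sketch, assuming the common convention that H0 is rejected when P ≤ α; the function name `decide` is illustrative.

```python
def decide(p_value: float, alpha: float) -> str:
    """Fixed-level decision, assuming the convention: reject H0 when P <= alpha."""
    return "reject H0" if p_value <= alpha else "retain H0"


# Oat bran example: two-tailed P = 0.011
print(decide(0.011, 0.05))  # reject H0
print(decide(0.011, 0.01))  # retain H0
```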

November 09 30

SPSS Output: “Oat Bran”


November 09 31

The Normality Condition
• t procedures require Normal population or large samples
• How do we assess this condition?
• It is OK to use t procedures when:
  – Population is Normal
  – Population is symmetrical and n ≥ 10
  – Population is skewed and n ≥ 30 to 60 (depending on the severity of skew)
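These rules of thumb can be written as a small checker. This is only a sketch: the shape labels, the function name `t_procedures_ok`, and the `skew_min_n` parameter are assumptions, and judging the shape of the distribution still requires looking at a plot of the data. The default of 30 is the lower end of the 30 to 60 range given above; a more severe skew calls for a larger value.

```python
def t_procedures_ok(shape: str, n: int, skew_min_n: int = 30) -> bool:
    """Rule-of-thumb check for using t procedures.

    shape: "normal", "symmetric", or "skewed" (labels are illustrative).
    skew_min_n: minimum n for skewed data; the slides suggest 30 to 60
    depending on the severity of the skew.
    """
    if shape == "normal":
        return True
    if shape == "symmetric":
        return n >= 10
    if shape == "skewed":
        return n >= skew_min_n
    raise ValueError(f"unknown shape: {shape!r}")


print(t_procedures_ok("symmetric", 12))               # True
print(t_procedures_ok("skewed", 12))                  # False
print(t_procedures_ok("skewed", 40, skew_min_n=60))   # False
```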

November 09 32

Can t procedures be used?

Dataset skewed and small ⇒ avoid t procedures

November 09 33

Can t procedures be used?

Dataset B has a mild skew and is moderate in size: OK to use t procedures

November 09 34

Can t procedures be used?

Data highly skewed and moderate in size: avoid t procedures