
Study Design

Sample Size Calculation & Power Analysis

RCMAR/CHIME

April 21, 2014 Honghu Liu, PhD

Professor, University of California, Los Angeles

Contents

1. Background
2. Common Designs
3. Examples
4. Computer Software
5. Summary & Discussion

Flow of Study Design & Its Process

(Flow diagram: Big topic → Specific goals → Study Aims → Research Question → Theories/Speculation about aims → Hypotheses → Target Population → Select sample and gather data (Sample Data Collection) → Analysis → Obtain results → Conclusion → Inference back to the target population)

Background

Sample Size Calculation & Power Analysis

Determine the sample size

Background

Power of a statistical test
– The probability that it will yield statistically significant results

Sample size
– The minimum sample size required to detect a certain difference between parameters

Sample size and statistical power are linked with study aims and hypotheses
– Need to collect sample data to test the hypothesis
– Hypotheses are tested with certain power

Background (cont.): Key Concepts

(a) Hypothesis
Null: H0: µ1 = µ2
Alternative: Ha: µ1 ≠ µ2 (µ1 > µ2 or µ1 < µ2)

(b) Type I error (α): reject a null hypothesis when it is true:
α = Prob(rejecting H0 | H0 is true)

Background (cont.)

(c) Type II error (β): accept a null hypothesis when it is false:
β = Prob(accepting H0 | H0 is false)

(d) Power: the probability of rejecting a null hypothesis when it is false:
Power = Prob(rejecting H0 | H0 is false) = 1 − β

Background (cont.)

(e) Effect size
– The difference between the parameters to be tested (e.g., ∆ = µ1 − µ2).
– Can be expressed per standard deviation (e.g., ES = ∆/σ = (µ1 − µ2)/σ).

(f) Critical value
– The deviate of a distribution that reaches statistical significance under the null hypothesis for a given type I error (e.g., for α = 0.05, z(1−α) = 1.645 and z(1−α/2) = 1.96).

Background (cont.)

(g) One-sided vs. two-sided test
– One-sided test: the null hypothesis can be rejected in only one direction (directional test), e.g., reject if z > z(1−α).
– Two-sided test: the null hypothesis can be rejected in either direction, e.g., reject if |z| > z(1−α/2).

Background (cont.)

(h) Acceptance region & rejection region
– Acceptance region: the null hypothesis is accepted for all values that fall into this region (e.g., −z(1−α/2) ≤ z ≤ z(1−α/2)).
– Rejection region: the null hypothesis is rejected for all values that fall into this region (e.g., z < −z(1−α/2) or z > z(1−α/2)).

(Figure: normal distribution with a one-sided test and type I error 0.05, showing the acceptance region, the rejection region, and the power of the test.)

Five Key Factors

1. Sample size
2. Effect size
3. Significance level
4. Power of the test
5. Variability

Common Designs

I. Continuous measure

a) One sample normal
H0: µ = µ0; H1: µ = µ1

n = ((z(1−β) + z(1−α/2)) · s / ∆)²

Where ∆ = µ1 − µ0 is the difference to be detected; s is the standard deviation; z(1−β) and z(1−α/2) are the normal deviates for the desired power and significance level.
Note: this is also the sample size for the case of paired observations.
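As an illustration, the one-sample formula above can be computed with only the Python standard library (the function name, default values, and the rounding up to a whole subject are illustrative choices, not from the slides):

```python
import math
from statistics import NormalDist

def one_sample_normal_n(delta, s, alpha=0.05, power=0.80):
    """n = ((z_{1-beta} + z_{1-alpha/2}) * s / delta)^2, rounded up.

    delta: difference to detect (mu1 - mu0); s: standard deviation.
    Also applies to paired observations (use the SD of the differences).
    """
    z = NormalDist()
    z_beta = z.inv_cdf(power)            # z_{1-beta}
    z_alpha = z.inv_cdf(1 - alpha / 2)   # z_{1-alpha/2}, two-sided test
    return math.ceil(((z_beta + z_alpha) * s / delta) ** 2)

print(one_sample_normal_n(delta=5, s=22))  # 152
```

Raising the requested power raises n, as expected: the same inputs at 90% power require 204 subjects.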

b) Two sample normal
H0: µ1 = µ2; H1: µ1 ≠ µ2

n1 = (1 + 1/r) · ((z(1−β) + z(1−α/2)) · s / ∆)²

Where n2 = r · n1 with 0 < r ≤ 1; ∆ is the difference to be detected; s is the common standard deviation; z(1−β) and z(1−α/2) are the normal deviates for the desired power and significance level.
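A sketch of the two-sample case in Python (standard library only; the function name, defaults, and ceiling rounding are illustrative), returning both group sizes under the allocation n2 = r · n1:

```python
import math
from statistics import NormalDist

def two_sample_normal_n(delta, s, alpha=0.05, power=0.80, r=1.0):
    """Group sizes (n1, n2) with n2 = r * n1, 0 < r <= 1.

    Implements n1 = (1 + 1/r) * ((z_{1-beta} + z_{1-alpha/2}) * s / delta)^2.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(power) + z.inv_cdf(1 - alpha / 2)
    n1 = (1 + 1 / r) * (z_sum * s / delta) ** 2
    return math.ceil(n1), math.ceil(r * n1)

print(two_sample_normal_n(delta=5, s=10))  # (63, 63)
```

With r = 1 the factor (1 + 1/r) = 2 recovers the familiar equal-allocation formula.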

Common Designs (cont.)

c) Two group repeated measures (time-averaged means) (Diggle, et al, 1994)

:0H 21 µµ = , :1H 21 µµ ≠

222 /})1(1{)(2 ndnzzm ρσβα −++= 22 /})1(1{)(2 ∆−++= nnzz ρβα Where αz and βz are the normal deviates;

2σ is the common variance; ),( ikij yyCorr=ρ is the intra-patient correlation;

d is the difference between the average response of two groups. Note: Unbalanced design: Liu, et al. 2005. Journal of Modern Applied Statistics; PASS 2008, NCSS.
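The time-averaged-means formula can be sketched the same way (illustrative names; zα is taken as z(1−α/2), i.e., a two-sided test is assumed):

```python
import math
from statistics import NormalDist

def time_averaged_means_n(d, sigma, n_obs, rho, alpha=0.05, power=0.80):
    """Subjects per group: m = 2 (z_a + z_b)^2 sigma^2 {1 + (n-1) rho} / (n d^2).

    n_obs: repeated observations per subject; rho: intra-patient correlation.
    Uses z_a = z_{1-alpha/2} (two-sided) and z_b = z_{1-beta}.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return math.ceil(2 * z_sum ** 2 * sigma ** 2 * (1 + (n_obs - 1) * rho)
                     / (n_obs * d ** 2))

print(time_averaged_means_n(d=5, sigma=10, n_obs=4, rho=0.5))  # 40
```

With n_obs = 1 the inflation factor {1 + (n−1)ρ} is 1 and the result collapses to the ordinary two-sample answer; extra repeats per subject help less as ρ grows.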

II. Binomial distribution

a) One sample binomial
H0: p = p0; H1: p = p1

n = [z(1−α/2) · sqrt(p0(1−p0)) + z(1−β) · sqrt(p1(1−p1))]² / (p1 − p0)²

Where p0 is the null value of the probability; p1 is the alternative value of the probability; z(1−β) and z(1−α/2) are the normal deviates for the desired power and significance level.
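A minimal sketch of the one-sample binomial formula (illustrative function name and defaults; this is the usual normal approximation, so it is rough for small n or extreme p):

```python
import math
from statistics import NormalDist

def one_sample_binomial_n(p0, p1, alpha=0.05, power=0.80):
    """Normal-approximation sample size for H0: p = p0 vs H1: p = p1.

    n = (z_{1-alpha/2} sqrt(p0 q0) + z_{1-beta} sqrt(p1 q1))^2 / (p1 - p0)^2
    """
    z = NormalDist()
    za = z.inv_cdf(1 - alpha / 2)
    zb = z.inv_cdf(power)
    num = za * math.sqrt(p0 * (1 - p0)) + zb * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / (p1 - p0)) ** 2)

print(one_sample_binomial_n(p0=0.50, p1=0.65))  # 85
```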

b) Two sample binomial
H0: p1 = p2; H1: p1 ≠ p2

n1 = {z(1−α/2) · sqrt[(1 + 1/r) · p(1−p)] + z(1−β) · sqrt[p1(1−p1) + p2(1−p2)/r]}² / (p1 − p2)²

Where n2 = r · n1 with 0 < r ≤ 1; p = (p1 + p2)/2; p1 and p2 are the probabilities of groups 1 and 2; z(1−β) and z(1−α/2) are the normal deviates for the desired power and significance level.
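The two-sample binomial formula, sketched with the same conventions (illustrative names; pooled proportion under H0, allocation n2 = r · n1):

```python
import math
from statistics import NormalDist

def two_sample_binomial_n(p1, p2, alpha=0.05, power=0.80, r=1.0):
    """Group sizes (n1, n2), n2 = r * n1, for H0: p1 = p2 vs H1: p1 != p2."""
    z = NormalDist()
    za = z.inv_cdf(1 - alpha / 2)
    zb = z.inv_cdf(power)
    pbar = (p1 + p2) / 2                     # pooled proportion under H0
    num = (za * math.sqrt((1 + 1 / r) * pbar * (1 - pbar))
           + zb * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2) / r))
    n1 = (num / (p1 - p2)) ** 2
    return math.ceil(n1), math.ceil(r * n1)

print(two_sample_binomial_n(0.75, 0.60))  # (152, 152)
```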

c) Two sample binomial repeated measures (time-averaged proportions) (Diggle, et al., 1994)
H0: p1 = p2; H1: p1 ≠ p2

m = [zα · {2pq(1 + (n−1)ρ)}^(1/2) + zβ · {(1 + (n−1)ρ)(p1q1 + p2q2)}^(1/2)]² / (n d²)

Where m is the number of subjects per group; n is the number of repeated observations per subject; p = (p1 + p2)/2 and q = 1 − p; qi = 1 − pi; ρ = Corr(y_ij, y_ik) is the intra-patient correlation; and d = p1 − p2 is the difference between the average responses for the two groups.
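A sketch of the time-averaged-proportions formula (illustrative names; zα taken as z(1−α/2) for a two-sided test, matching the continuous repeated-measures sketch):

```python
import math
from statistics import NormalDist

def time_averaged_props_n(p1, p2, n_obs, rho, alpha=0.05, power=0.80):
    """Subjects per group for comparing time-averaged proportions."""
    z = NormalDist()
    za = z.inv_cdf(1 - alpha / 2)
    zb = z.inv_cdf(power)
    pbar = (p1 + p2) / 2
    c = 1 + (n_obs - 1) * rho                 # within-subject inflation factor
    num = (za * math.sqrt(2 * pbar * (1 - pbar) * c)
           + zb * math.sqrt(c * (p1 * (1 - p1) + p2 * (1 - p2))))
    return math.ceil((num / (p1 - p2)) ** 2 / n_obs)

print(time_averaged_props_n(0.75, 0.60, n_obs=3, rho=0.5))  # 102
```

With n_obs = 1 the result reduces to the two-sample binomial answer for the same p1, p2, α, and power.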

III. Other Designs

Matched case-control (McNemar's test)
Analysis of variance (ANOVA)
Correlation coefficient
Logistic regression
Multiple regression

III. Other Designs (cont.)

Survey Design
– Methods
» Stratification
» Clustering
» Complex survey (stratification and clustering)
– Estimate the design effect: deff = Var(survey design) / Var(simple random sample)
– Sample size = deff × (traditional sample size calculation formula)
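A minimal sketch of the design-effect adjustment; the deff value and SRS sample size below are assumed placeholders, not figures from the slides:

```python
import math

deff = 1.8    # assumed Var(complex survey design) / Var(simple random sample)
n_srs = 348   # sample size from a traditional (SRS) formula
n_survey = math.ceil(deff * n_srs)  # inflate the SRS size by the design effect
print(n_survey)  # 627
```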

Examples

Study A

Study Aim: To study the impact of a new physical therapy on the quality of life of patients with chronic back pain.
Hypothesis: The new physical therapy can significantly improve the quality of life of patients with chronic back pain.

Step 1 – Outcome measure: SF-12 physical component summary (PCS) score
Step 2 – Design: two-group comparison between treatment and control
Step 3 – Statistical model: two-group comparison with a continuous outcome measure

Study A (cont.)

Step 4 – Obtain the required statistics for the statistical model:
1) Type I error: 0.05
2) Type II error: 0.15 (power 85%)
3) Mean of PCS: 44
4) SD of PCS: 22
5) Effect size: 5 (minimally clinically meaningful difference)
Step 5 – Find sample size calculation software, plug in the statistics, and get the result: n = 348 for each arm.
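The Study A result can be checked directly with the two-sample normal formula from the Common Designs section (a sketch using the Python standard library):

```python
import math
from statistics import NormalDist

# Study A inputs: two-sided alpha = 0.05, power = 0.85, SD = 22, difference = 5
z = NormalDist()
z_sum = z.inv_cdf(0.85) + z.inv_cdf(1 - 0.05 / 2)  # z_{1-beta} + z_{1-alpha/2}
n_per_arm = math.ceil(2 * (z_sum * 22 / 5) ** 2)   # equal allocation (r = 1)
print(n_per_arm)  # 348, matching the slide
```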

Study B

Study Aim: To study the difference in the rate of participation in a novel clinical trial of gene therapy/stem cell research between the Innovative Health Research Intervention (IHRI) and the Standard HIV Attention Control (AC).
Hypothesis: IHRI has a higher participation rate than AC.

Step 1 – Outcome measure: willingness to participate (binary yes/no variable)
Step 2 – Design: randomized two-group comparison between the IHRI and AC arms
Step 3 – Statistical model: two-group comparison with a binomial distribution

Study B (cont.)

Step 4 – Obtain the required statistics for the statistical model:
1) Type I error: 0.05
2) Type II error: 0.20 (power 80%)
3) Estimated rate of participation: 60% for AC
4) Sample size capacity: 180 in each arm
5) Effect size: to be determined
Step 5 – Find sample size and power analysis software, plug in the statistics, and get the detectable effect size: 15%.
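As a sanity check on Study B (a sketch; this is the standard normal-approximation power formula, not necessarily the software the slides used), 180 subjects per arm give more than 80% power to detect a 15-point lift over a 60% control rate:

```python
import math
from statistics import NormalDist

def power_two_prop(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided two-sample proportion test, n per arm."""
    z = NormalDist()
    za = z.inv_cdf(1 - alpha / 2)
    pbar = (p1 + p2) / 2
    se0 = math.sqrt(2 * pbar * (1 - pbar) / n)            # SE under H0
    se1 = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)  # SE under H1
    return z.cdf((abs(p1 - p2) - za * se0) / se1)

# Study B: AC rate 0.60, IHRI rate 0.75 (a 15% effect), 180 per arm
print(round(power_two_prop(0.75, 0.60, 180), 2))  # 0.86
```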

Sample Size Calculation & Power Analyses Software

General-purpose statistical software (e.g., STATA, SPSS, SAS, GLIM, SigmaStat and XLISP-STAT)
Special-purpose statistical software (e.g., Epi Info)
Stand-alone sample size & power analysis software (e.g., NCSS-PASS, nQuery and SYSTAT Design)
Stand-alone sample size & power analysis software for specialized applications (e.g., PRECISION for survival studies)
Software on the Internet (e.g., http://calculators.stat.ucla.edu/)

Summary & Discussion

Related key factors

Min-Max rule
– Minimum required sample size for each main hypothesis
– Maximum sample size among the multiple minimums

Practical factors that influence sample size determination
– Budget/sample limitation
– Backward estimation

Summary & Discussion (cont.)

Find the necessary and right statistics (e.g., mean, SD & ES)

Get multiple solutions and select the best design

References

Jacob Cohen (1988). Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates, Hillsdale, New Jersey.
Diggle PJ, Liang KY, Zeger SL (1996). Analysis of Longitudinal Data. Oxford University Press, New York.
R. Barker Bausell and Yu-Fang Li (2002). Power Analysis for Experimental Research. Cambridge University Press.
PASS 2008 Power Analysis and Sample Size for Windows. NCSS, Kaysville, Utah.
Liu HH and Wu TT (2005). Sample size calculation and power analysis for time-averaged difference. Journal of Modern Applied Statistical Methods, 4(2):434-445.
Helena Chmura Kraemer and Sue Thiemann (1987). How Many Subjects? Sage Publications, London.
Liu HH and Wu TT. Sample size calculation and power analysis of changes in mean response over time. Communications in Statistics (in press).
Sharon L. Lohr (1999). Sampling: Design and Analysis. Duxbury Press.

Questions?

Honghu Liu, PhD

[email protected]