Study Design
Sample Size Calculation & Power Analysis
RCMAR/CHIME
April 21, 2014 Honghu Liu, PhD
Professor University of California Los Angeles
Flow of Study Design & Its Process
Big topic → Background → Theories/speculation about aims → Research question → Specific goals → Study aims → Hypotheses → Target population → Select sample and gather data → Sample data collection → Analysis → Obtain results → Conclusion → Inference back to the target population
Sample Size Calculation & Power Analysis
Determine the sample size
Background
– Power of a statistical test: the probability that it will yield statistically significant results.
– Sample size: the minimum sample size required to detect a given difference between parameters.
– Sample size and statistical power are linked with the study aims and hypotheses:
  – Sample data must be collected to test the hypotheses.
  – Hypotheses are tested with a certain power.
Background (cont.) Key Concepts
(a) Hypothesis
Null: H0: μ1 = μ2
Alternative: Ha: μ1 ≠ μ2 (μ1 > μ2 or μ1 < μ2)
(b) Type I error (α): rejecting a null hypothesis when it is true:
α = Prob(rejecting H0 | H0 is true)
Background (cont.)
(c) Type II error (β): accepting a null hypothesis when it is false:
β = Prob(accepting H0 | H0 is false)
(d) Power: the probability of rejecting a null hypothesis when it is false:
Power = Prob(rejecting H0 | H0 is false) = 1 − β
Background (cont.)
(e) Effect size
– The difference between the parameters to be tested (e.g., Δ = μ1 − μ2).
– Can be expressed per standard deviation (e.g., ES = Δ/σ = (μ1 − μ2)/σ).
(f) Critical value
– The deviate of a distribution that reaches statistical significance under the null hypothesis for a given type I error (e.g., for α = 0.05, z_{1−α} = 1.645 and z_{1−α/2} = 1.96).
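For reference, these deviates can be reproduced with the Python standard library (a quick check, not part of the original slides):

```python
from statistics import NormalDist  # standard library, Python 3.8+

z = NormalDist().inv_cdf  # inverse CDF (quantile function) of N(0, 1)

print(round(z(0.95), 3))   # z_{1-alpha} for alpha = 0.05 (one-sided): 1.645
print(round(z(0.975), 3))  # z_{1-alpha/2} for alpha = 0.05 (two-sided): 1.96
```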
Background (cont.)
(g) One-sided vs. two-sided test
– One-sided (directional) test: the null hypothesis can be rejected in only one direction (e.g., reject if z > z_{1−α}).
– Two-sided test: the null hypothesis can be rejected in either direction (e.g., reject if |z| > z_{1−α/2}).
Background (cont.)
(h) Acceptance region & rejection region
– Acceptance region: the null hypothesis is accepted for all values that fall in this region (e.g., −z_{1−α/2} ≤ z ≤ z_{1−α/2}).
– Rejection region: the null hypothesis is rejected for all values that fall in this region (e.g., z < −z_{1−α/2} or z > z_{1−α/2}).
[Figure: normal distribution with a one-sided test and type I error 0.05, showing the acceptance region, the rejection region, and the power.]
Five Key Factors
1. Sample size
2. Effect size
3. Significance level
4. Power of the test
5. Variability
Common Designs
I. Continuous Measure
a) One-sample normal
H0: μ = μ0   H1: μ = μ1
n = ((z_{1−β} + z_{1−α/2}) · s / Δ)²
where Δ = μ1 − μ0 is the difference to be detected; s is the standard deviation; z_{1−β} and z_{1−α/2} are the normal deviates for the desired power and significance level.
Note: this is also the sample size for the case of paired observations.
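The one-sample formula translates directly into code. A minimal sketch in Python; the inputs in the example (a 5-point shift against an SD of 22) are illustrative assumptions, not values from the slides:

```python
import math
from statistics import NormalDist

def n_one_sample(delta, s, alpha=0.05, power=0.80):
    """Minimum n for a one-sample (or paired) normal test:
    n = ((z_{1-beta} + z_{1-alpha/2}) * s / delta)**2."""
    z = NormalDist().inv_cdf
    n = ((z(power) + z(1 - alpha / 2)) * s / delta) ** 2
    return math.ceil(n)  # round up to the next whole subject

# Illustrative (assumed) inputs: detect a 5-point shift, SD 22, 85% power
print(n_one_sample(delta=5, s=22, alpha=0.05, power=0.85))
```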
b) Two-sample normal
H0: μ1 = μ2   H1: μ1 ≠ μ2
n2 = ((z_{1−β} + z_{1−α/2}) · s / Δ)² · (1 + 1/r)
where n1 = r·n2 with 0 < r ≤ 1; Δ is the difference to be detected; s is the common standard deviation; z_{1−β} and z_{1−α/2} are the normal deviates for the desired power and significance level.
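The two-sample formula can be coded the same way; the `r` argument handles unequal allocation, and the inputs in the example are illustrative assumptions:

```python
import math
from statistics import NormalDist

def n_two_sample(delta, s, alpha=0.05, power=0.80, r=1.0):
    """Larger-group size n2 for a two-sample normal test with n1 = r * n2:
    n2 = ((z_{1-beta} + z_{1-alpha/2}) * s / delta)**2 * (1 + 1/r)."""
    z = NormalDist().inv_cdf
    n2 = ((z(power) + z(1 - alpha / 2)) * s / delta) ** 2 * (1 + 1 / r)
    return math.ceil(n2)

# Illustrative (assumed) inputs: detect a 10-point difference, common SD 20,
# 90% power, equal allocation (r = 1, so both groups get the same n)
print(n_two_sample(delta=10, s=20, alpha=0.05, power=0.90))
```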
Common Designs (cont.)
c) Two-group repeated measures (time-averaged means) (Diggle et al., 1994)
H0: μ1 = μ2   H1: μ1 ≠ μ2
m = 2(z_α + z_β)² σ² {1 + (n − 1)ρ} / (n d²) = 2(z_α + z_β)² {1 + (n − 1)ρ} / (n Δ²)
where z_α and z_β are the normal deviates; σ² is the common variance; ρ = Corr(y_ij, y_ik) is the intra-patient correlation; n is the number of repeated measurements per subject; m is the number of subjects per group; d is the difference between the average responses of the two groups; and Δ = d/σ.
Note: for unbalanced designs, see Liu et al. (2005), Journal of Modern Applied Statistical Methods; PASS 2008, NCSS.
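A sketch of the time-averaged-means formula in code, taking z_α as the two-sided deviate z_{1−α/2}; the number of visits and the intra-patient correlation in the example are assumed values for illustration:

```python
import math
from statistics import NormalDist

def m_time_averaged(d, sigma, n, rho, alpha=0.05, power=0.80):
    """Subjects per group for comparing time-averaged means:
    m = 2 * (z_a + z_b)**2 * sigma**2 * (1 + (n-1)*rho) / (n * d**2)."""
    z = NormalDist().inv_cdf
    za, zb = z(1 - alpha / 2), z(power)  # z_alpha read as the two-sided deviate
    m = 2 * (za + zb) ** 2 * sigma ** 2 * (1 + (n - 1) * rho) / (n * d ** 2)
    return math.ceil(m)

# Illustrative (assumed) inputs: 4 visits per subject, correlation 0.5
print(m_time_averaged(d=5, sigma=22, n=4, rho=0.5, alpha=0.05, power=0.85))
```

With a single measurement per subject (n = 1, ρ = 0) the formula reduces to the ordinary two-sample normal case, which is a useful sanity check.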
II. Binomial Distribution
a) One-sample binomial
H0: p = p0   H1: p = p1
n = {z_{1−β} + z_{1−α/2} · sqrt[p0(1 − p0)/(p1(1 − p1))]}² · p1(1 − p1) / (p1 − p0)²
where p0 is the null value of the probability; p1 is the alternative value of the probability; z_{1−β} and z_{1−α/2} are the normal deviates for the desired power and significance level.
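The one-sample binomial formula as code; the null and alternative rates in the example are assumptions for illustration, not slide values:

```python
import math
from statistics import NormalDist

def n_one_sample_binomial(p0, p1, alpha=0.05, power=0.80):
    """n for testing H0: p = p0 vs H1: p = p1 (normal approximation):
    n = (z_{1-b} + z_{1-a/2}*sqrt(p0*q0/(p1*q1)))**2 * p1*q1 / (p1-p0)**2."""
    z = NormalDist().inv_cdf
    q0, q1 = 1 - p0, 1 - p1
    core = z(power) + z(1 - alpha / 2) * math.sqrt(p0 * q0 / (p1 * q1))
    return math.ceil(core ** 2 * p1 * q1 / (p1 - p0) ** 2)

# Illustrative (assumed) inputs: null rate 50%, alternative rate 65%
print(n_one_sample_binomial(0.50, 0.65, alpha=0.05, power=0.80))
```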
b) Two-sample binomial
H0: p1 = p2   H1: p1 ≠ p2
n2 = {z_{1−α/2} · sqrt[p̄(1 − p̄)(1 + 1/r)] + z_{1−β} · sqrt[p1(1 − p1)/r + p2(1 − p2)]}² / (p1 − p2)²
where n1 = r·n2 with 0 < r ≤ 1; p̄ = (p1 + p2)/2; p1 and p2 are the probabilities for groups 1 and 2; z_{1−β} and z_{1−α/2} are the normal deviates for the desired power and significance level.
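The same sketch for the two-sample binomial case; the example rates are illustrative assumptions:

```python
import math
from statistics import NormalDist

def n_two_sample_binomial(p1, p2, alpha=0.05, power=0.80, r=1.0):
    """Larger-group size n2 for H0: p1 = p2 with n1 = r * n2
    (normal approximation with pooled proportion pbar)."""
    z = NormalDist().inv_cdf
    pbar = (p1 + p2) / 2
    term0 = z(1 - alpha / 2) * math.sqrt(pbar * (1 - pbar) * (1 + 1 / r))
    term1 = z(power) * math.sqrt(p1 * (1 - p1) / r + p2 * (1 - p2))
    return math.ceil((term0 + term1) ** 2 / (p1 - p2) ** 2)

# Illustrative (assumed) inputs: detect 30% vs 45% with equal arms
print(n_two_sample_binomial(0.30, 0.45, alpha=0.05, power=0.80))
```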
c) Two-sample binomial repeated measures (time-averaged proportions) (Diggle et al., 1994)
H0: p1 = p2   H1: p1 ≠ p2
m = [z_α · sqrt{2 p̄ q̄ (1 + (n − 1)ρ)} + z_β · sqrt{(1 + (n − 1)ρ)(p1 q1 + p2 q2)}]² / (n d²)
where p̄ = (p1 + p2)/2; q̄ = 1 − p̄; q1 = 1 − p1; q2 = 1 − p2; ρ = Corr(y_ij, y_ik) is the intra-patient correlation; n is the number of repeated measurements per subject; m is the number of subjects per group; and d is the difference between the average responses of the two groups.
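And the repeated-measures binomial case; the number of visits and the correlation in the example are assumed for illustration. With n = 1 and ρ = 0 it reduces to the ordinary two-sample binomial formula, a useful sanity check:

```python
import math
from statistics import NormalDist

def m_time_averaged_props(p1, p2, n, rho, alpha=0.05, power=0.80):
    """Subjects per group for comparing time-averaged proportions:
    m = [z_a*sqrt(2*pbar*qbar*c) + z_b*sqrt(c*(p1*q1 + p2*q2))]**2 / (n*d**2),
    where c = 1 + (n-1)*rho and d = p1 - p2."""
    z = NormalDist().inv_cdf
    pbar = (p1 + p2) / 2
    c = 1 + (n - 1) * rho
    num = (z(1 - alpha / 2) * math.sqrt(2 * pbar * (1 - pbar) * c)
           + z(power) * math.sqrt(c * (p1 * (1 - p1) + p2 * (1 - p2)))) ** 2
    return math.ceil(num / (n * (p1 - p2) ** 2))

# Illustrative (assumed) inputs: 3 visits per subject, correlation 0.4
print(m_time_averaged_props(0.60, 0.75, n=3, rho=0.4, alpha=0.05, power=0.80))
```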
III. Other Designs
– Matched case-control (McNemar's test)
– Analysis of variance (ANOVA)
– Correlation coefficient
– Logistic regression
– Multiple regression
III. Other Designs (cont.)
Survey Design
– Methods
  » Stratification
  » Clustering
  » Complex survey (stratification and clustering)
– Estimate the design effect:
  deff = Var(survey design) / Var(simple random sampling)
– Sample size = deff × (traditional sample size calculation formula)
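The design-effect adjustment is a one-line calculation; the deff value in the example is an assumed illustration:

```python
import math

def survey_n(srs_n, deff):
    """Sample size = deff * (traditional sample-size formula result),
    where deff = Var(survey design) / Var(simple random sampling)."""
    return math.ceil(deff * srs_n)

# Illustrative (assumed) inputs: SRS calculation gives 348; clustering
# inflates the variance by a design effect of 1.5
print(survey_n(348, 1.5))  # 522
```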
Examples
Study A
Study aim: To study the impact of a new physical therapy on the quality of life of patients with chronic back pain.
Hypothesis: The new physical therapy can significantly improve the quality of life of patients with chronic back pain.
Step 1 – Outcome measure: SF-12 physical component summary (PCS) score
Step 2 – Design: two-group comparison between treatment and control
Step 3 – Statistical model: two-group comparison with a continuous outcome measure
Study A (cont.)
Step 4 – Obtain the required statistics for the statistical model:
1) Type I error: 0.05
2) Type II error: 0.15 (power 85%)
3) Mean of PCS: 44
4) SD of PCS: 22
5) Effect size: 5 (minimally clinically meaningful difference)
Step 5 – Find sample size calculation software, plug in the statistics, and get the result: n = 348 for each arm.
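Study A's Step 5 result can be checked against the two-sample normal formula from earlier (a sketch, not the commercial software the slides have in mind):

```python
import math
from statistics import NormalDist

z = NormalDist().inv_cdf

alpha, power = 0.05, 0.85  # type I error 0.05, type II error 0.15
s, delta = 22, 5           # SD of PCS and the clinically meaningful difference

# Two-sample normal with equal allocation (r = 1): factor (1 + 1/r) = 2
n_per_arm = math.ceil(((z(power) + z(1 - alpha / 2)) * s / delta) ** 2 * 2)
print(n_per_arm)  # 348, matching the slide
```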
Study B
Study aim: To study the difference in the rate of participation in a novel clinical trial of gene therapy/stem cell research between the Innovative Health Research Intervention (IHRI) and the Standard HIV Attention Control (AC).
Hypothesis: IHRI has a higher participation rate than AC.
Step 1 – Outcome measure: willingness to participate (binary yes/no variable)
Step 2 – Design: randomized two-group comparison between IHRI and AC arms
Step 3 – Statistical model: two-group comparison with a binomial distribution
Study B (cont.)
Step 4 – Obtain the required statistics for the statistical model:
1) Type I error: 0.05
2) Type II error: 0.20 (power 80%)
3) Estimated rate of participation: 60% for AC
4) Sample size capacity: 180 in each arm
5) Effect size: to be determined
Step 5 – Find sample size and power analysis software, plug in the statistics, and solve for the detectable effect size: 15%.
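Study B works the calculation backwards, fixing the sample size and solving for the effect size. A sketch (assuming the 15% effect size means raising participation from 60% to 75%) confirms that the required per-arm n fits within the capacity of 180:

```python
import math
from statistics import NormalDist

z = NormalDist().inv_cdf

def n_needed(p1, p2, alpha=0.05, power=0.80):
    """Per-arm n for a two-sample binomial test, equal allocation."""
    pbar = (p1 + p2) / 2
    a = z(1 - alpha / 2) * math.sqrt(2 * pbar * (1 - pbar))
    b = z(power) * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil(((a + b) / (p1 - p2)) ** 2)

# AC at 60%, IHRI at 75% (assumed reading of the 15% effect size)
print(n_needed(0.60, 0.75))         # required per-arm n
print(n_needed(0.60, 0.75) <= 180)  # True: within the capacity of 180
```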
Sample Size Calculation & Power Analysis Software
– General-purpose statistical software (e.g., STATA, SPSS, SAS, GLIM, SigmaStat, and XLISP-STAT)
– Special-purpose statistical software (e.g., Epi Info)
– Stand-alone sample size & power analysis software (e.g., NCSS-PASS, nQuery, and SYSTAT Design)
– Stand-alone sample size & power analysis software for specialized applications (e.g., PRECISION for survival studies)
– Software on the Internet (e.g., http://calculators.stat.ucla.edu/)
Summary & Discussion
– Related key factors
– Min-max rule
  – Minimum required sample size for each main hypothesis
  – Maximum sample size among the multiple minimums
– Practical factors that influence sample size determination
  – Budget/sample limitation
  – Backward estimation
Summary & Discussion (cont.)
– Find the necessary and right statistics (e.g., mean, SD & ES)
– Get multiple solutions and select the best design
References
– Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates.
– Diggle, PJ, Liang, KY, and Zeger, SL (1996). Analysis of Longitudinal Data. New York: Oxford University Press.
– Bausell, RB, and Li, Y-F (2002). Power Analysis for Experimental Research. Cambridge University Press.
– PASS 2008 Power Analysis and Sample Size for Windows. NCSS, Kaysville, Utah.
– Liu, HH, and Wu, TT (2005). Sample size calculation and power analysis for time-averaged difference. Journal of Modern Applied Statistical Methods, 4(2), 434–445.
– Kraemer, HC, and Thiemann, S (1987). How Many Subjects? London: Sage Publications.
– Liu, HH, and Wu, TT. Sample size calculation and power analysis of changes in mean response over time. Communications in Statistics (in press).
– Lohr, SL (1999). Sampling: Design and Analysis. Duxbury Press.