Power and Sample Size
Boulder 2004Benjamin NealeShaun Purcell
To Be Accomplished
• Introduce Concept of Power via Correlation Coefficient (ρ) Example
• Identify Relevant Factors Contributing to Power
• Practical:• Power Analysis for Univariate Twin Model• How to use Mx for Power
Simple example
Investigate the linear relationship (between two random variables X and Y: =0 vs. 0 (correlation coefficient).
• draw a sample, measure X,Y
• calculate the measure of association (Pearson product moment corr. coeff.)• test whether 0.
How to Test 0• assumed the data are normally
distributed
• defined a null-hypothesis ( = 0)
• chosen level (usually .05)
• utilized the (null) distribution of the test statistic associated with =0
• t= [(N-2)/(1-2)]
How to Test 0• Sample N=40
• r=.303, t=1.867, df=38, p=.06 α=.05
• As p > α, we fail to reject = 0
have we drawn the correct conclusion?
= type I error rate probability of deciding 0
(while in truth =0)
is often chosen to equal .05...why?
DOGMA
N=40, r=0, nrep=1000 – central t(38), =0.05 (critical value 2.04)
-5 -4 -3 -2 -1 0 1 2 3 4 50
200
400
600
800
-5 -4 -3 -2 -1 0 1 2 3 4 50
0.1
0.2
0.3
0.4
NREP=1000
2.5% 2.5%
central t(38)
4.6% sign.
Observed non-null distribution (=.2) and null distribution
-5 -4 -3 -2 -1 0 1 2 3 4 50
20
40
60
80
100
-5 -4 -3 -2 -1 0 1 2 3 4 50
0.1
0.2
0.3
0.4
0.5
rho=.20N=40Nrep=1000
abs(t)>2.04 in23%
null distribution t(38)
In 23% of tests of =0, |t|>2.024 (=0.05), and thus draw the correct conclusion that of rejecting 0.
The probability of rejecting the null-hypothesis (=0) correctly is 1-, or the power, when a true effect exists
Hypothesis Testing
• Correlation Coefficient hypotheses:– ho (null hypothesis) is ρ=0
– ha (alternative hypothesis) is ρ ≠ 0• Two-sided test, where ρ > 0 or ρ < 0 are one-sided
• Null hypothesis usually assumes no effect
• Alternative hypothesis is the idea being tested
Summary of Possible Results
H-0 true H-0 false
accept H-0 1-reject H-0 1-
=type 1 error rate
=type 2 error rate
1-=statistical power
Rejection of H0 Non-rejection of H0
H0 true
HA true
Nonsignificant result(1- )
Type II error at rate
Significant result(1-)
Type I error at rate
Power• The probability of rejection of a
false null-hypothesis depends on: –the significance criterion ()–the sample size (N) –the effect size (NCP)
“The probability of detecting a given effect size in a population from a sample of size N, using significance criterion ”
P(T)
T
alpha 0.05
Sampling distribution if HA were true
Sampling distribution if H0 were true
POWER = 1 -
Standard Case
Effect Size (NCP)
P(T)
T
alpha 0.1
Sampling distribution if HA were true
Sampling distribution if H0 were true
POWER = 1 -
Impact of Less Cons. alpha
P(T)
T
alpha 0.01
Sampling distribution if HA were true
Sampling distribution if H0 were true
POWER = 1 -
Impact of More Cons. alpha
P(T)
T
alpha 0.05
Increased Sample Size
Sampling distribution if HA were true
Sampling distribution if H0 were true
POWER = 1 -
P(T)
T
alpha 0.05
Sampling distribution if HA were true
Sampling distribution if H0 were true
POWER = 1 -
Increase in Effect Size
Effect Size (NCP)↑
Effects on Power Recap
• Larger Effect Size
• Larger Sample Size
• Alpha Level shifts <Beware the False Positive!!!>
• Type of Data:– Binary, Ordinal, Continuous
When To Do Power Calcs?
• Generally study planning stages of study
• Occasionally with negative result
• No need if significance is achieved
• Computed to determine chances of success
Power Calculations Empirical
• Attempt to Grasp the NCP from Null
• Simulate Data under theorized model
• Calculate Statistics and Perform Test
• Given α, how many tests p < α
• Power = (#hits)/(#tests)
Practical: Empirical Power 1
• We will Simulate Data under a model online
• We will run an ACE model, and test for C
• We will then submit our data and Shaun will collate it for us
• While he’s collating, we’ll talk about theoretical power calculations
Practical: Empirical Power 2
• First get F:\ben\2004\ace.mx and put it into your directory
• We will paste our simulated data into this script, so open it now in preparation, and note both places where we must paste in the data
• Note that you will have to fit the ACE model and then fit the AE submodel
Practical: Empirical Power 3
• Simulation Conditions– 30% A2 20% C2 50% E2
– Input:– A 0.5477 C of 0.4472 E of 0.7071– 350 MZ 350 DZ– Simulate and Space Delimited at– http://statgen.iop.kcl.ac.uk/workshop/unisim.html or
click here in slide show mode– Click submit after filling in the fields and you will get a
page of data
Practical: Empirical Power 4
• With the data page, use control-a to select the data, control-c to copy, and in Mx control-v to paste in both the MZ and DZ groups.
• Run the ace.mx script with the data pasted in and modify it to run the AE model.
• Report the A, C, and E estimates of the first model, and the A and E estimates of the second model as well as both the-2log-likelihoods on the webpage http://statgen.iop.kcl.ac.uk/workshop/ or click here in slide show mode
Practical: Empirical Power 5
• Once all of you have submitted your results we will take a look at the theoretical power calculation, using Mx.
• Once we have finished with the theory Shaun will show us the empirical distribution that we generated today
Theoretical Power Calculations
• Based on Stats, rather than Simulations• Can be calculated by hand sometimes, but
Mx does it for us• Note that sample size and alpha-level are
the only things we can change, but can assume different effect sizes
• Mx gives us the relative power levels at the alpha specified for different sample sizes
Theoretical Power Calculations
• We will use the power.mx script to look at the sample size necessary for different power levels
• In Mx, power calculations can be computed in 2 ways:– Using Covariance Matrices (We Do This One)– Requiring an initial dataset to generate a
likelihood so that we can use a chi-square test
Power.mx 1! Simulate the data! 30% additive genetic! 20% common environment! 50% nonshared environment
#NGroups 3
G1: model parametersCalculationBegin Matrices;
X lower 1 1 fixedY lower 1 1 fixedZ lower 1 1 fixed
End Matrices;
Matrix X 0.5477Matrix Y 0.4472Matrix Z 0.7071
Begin Algebra;A = X*X' ; C = Y*Y' ;E = Z*Z' ;
End Algebra;End
Power.mx 2G2: MZ twin pairsCalculationMatrices = Group 1Covariances A+C+E | A+C _
A+C | A+C+E /Options MX%E=mzsim.covEnd
G3: DZ twin pairsCalculationMatrices = Group 1 H Full 1 1 Covariances A+C+E | H@A+C _
H@A+C | A+C+E /Matrix H 0.5Options MX%E=dzsim.covEnd
Power.mx 3! Second part of script! Fit the wrong model to the simulated data ! to calculate power
#NGroups 3G1 : model parametersCalculationBegin Matrices;
X lower 1 1 freeY lower 1 1 fixedZ lower 1 1 free
End Matrices;
Begin Algebra;A = X*X' ;C = Y*Y' ; E = Z*Z' ;
End Algebra;End
Power.mx 4G2 : MZ twinsData NInput_vars=2 NObservations=350CMatrix Full File=mzsim.covMatrices= Group 1Covariances A+C+E | A+C _
A+C | A+C+E /Option RSidualsEnd
G3 : DZ twinsData NInput_vars=2 NObservations=350CMatrix Full File=dzsim.covMatrices= Group 1 H Full 1 1Covariances A+C+E | H@A+C _
H@A+C | A+C+E /Matix H 0.5Option RSiduals
! Power for alpha = 0.05 and 1 dfOption Power= 0.05,1 End
Top Related