Power and Sample Size Boulder 2004 Benjamin Neale Shaun Purcell.

Post on 04-Jan-2016


Power and Sample Size

Boulder 2004

Benjamin Neale

Shaun Purcell

To Be Accomplished

• Introduce Concept of Power via Correlation Coefficient (ρ) Example

• Identify Relevant Factors Contributing to Power

• Practical:
– Power Analysis for Univariate Twin Model
– How to use Mx for Power

Simple example

Investigate the linear relationship (ρ) between two random variables X and Y: ρ = 0 vs. ρ ≠ 0 (ρ is the correlation coefficient).

• draw a sample, measure X,Y

• calculate the measure of association ρ (Pearson product moment corr. coeff.)

• test whether ρ ≠ 0.

How to Test ρ ≠ 0

• assumed the data are normally distributed

• defined a null-hypothesis (ρ = 0)

• chosen α level (usually .05)

• utilized the (null) distribution of the test statistic associated with ρ = 0

• t = r √[(N-2)/(1-r²)]

How to Test ρ ≠ 0

• Sample N=40

• r=.303, t=1.960, df=38, p=.06, α=.05

• As p > α, we fail to reject ρ = 0
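The arithmetic of this test can be checked numerically. The following is a minimal sketch in Python (standard library only), assuming the usual statistic t = r·sqrt((N-2)/(1-r²)) for a Pearson correlation and taking the two-sided critical value 2.024 for t(38) from the slides; the function name `correlation_t` is illustrative and not part of the original material.

```python
import math

def correlation_t(r: float, n: int) -> float:
    """t statistic for testing rho = 0, given a sample correlation r and sample size n."""
    return r * math.sqrt((n - 2) / (1 - r * r))

n, r = 40, 0.303
t_stat = correlation_t(r, n)

# Two-sided critical value of t(38) at alpha = .05, as quoted in the slides (assumption:
# it is hard-coded here because the standard library has no t distribution).
t_crit = 2.024
reject_null = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f}, df = {n - 2}, reject rho = 0: {reject_null}")
```

Comparing |t| with the critical value is equivalent to comparing p with α, so the sketch reaches the same decision as the slide without computing a p-value.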

have we drawn the correct conclusion?

α = type I error rate: the probability of deciding ρ ≠ 0 (while in truth ρ = 0)

α is often chosen to equal .05...why?

DOGMA

N=40, ρ=0, nrep=1000: central t(38), α=0.05 (critical value 2.024)

[Figure: histogram of the t statistics from NREP=1000 simulated samples with the central t(38) density overlaid; 2.5% rejection region in each tail; 4.6% of the replicates were significant]

Observed non-null distribution (ρ=.2) and null distribution

[Figure: histogram of the t statistics from Nrep=1000 simulated samples with rho=.20 and N=40, overlaid with the null distribution t(38); abs(t) > 2.024 in 23% of the replicates]

In 23% of the tests of ρ=0, |t|>2.024 (α=0.05), and in those cases we draw the correct conclusion, that of rejecting ρ=0.

The probability of correctly rejecting the null-hypothesis (ρ=0) when a true effect exists is 1-β, or the power.

Hypothesis Testing

• Correlation Coefficient hypotheses:
– h0 (null hypothesis) is ρ = 0
– ha (alternative hypothesis) is ρ ≠ 0
– ρ ≠ 0 gives a two-sided test, whereas ρ > 0 or ρ < 0 are one-sided

• Null hypothesis usually assumes no effect

• Alternative hypothesis is the idea being tested

Summary of Possible Results

              H-0 true    H-0 false
accept H-0    1-α         β
reject H-0    α           1-β

α = type 1 error rate

β = type 2 error rate

1-β = statistical power

            Rejection of H0               Non-rejection of H0
H0 true     Type I error at rate α        Nonsignificant result (1-α)
HA true     Significant result (1-β)      Type II error at rate β

Power

• The probability of rejection of a false null-hypothesis depends on:
– the significance criterion (α)
– the sample size (N)
– the effect size (NCP)

“The probability of detecting a given effect size in a population from a sample of size N, using significance criterion α”

Standard Case

[Figure: sampling distributions P(T) of the test statistic T if H0 were true and if HA were true, with alpha 0.05 marked; the separation between the two distributions is the effect size (NCP); POWER = 1 - β]

Impact of Less Cons. alpha

[Figure: the same two sampling distributions with alpha 0.1; POWER = 1 - β]

Impact of More Cons. alpha

[Figure: the same two sampling distributions with alpha 0.01; POWER = 1 - β]

Increased Sample Size

[Figure: sampling distributions if H0 were true and if HA were true for a larger sample, alpha 0.05; POWER = 1 - β]

Increase in Effect Size

[Figure: sampling distributions if H0 were true and if HA were true with a larger effect size (NCP)↑, alpha 0.05; POWER = 1 - β]

Effects on Power Recap

• Larger Effect Size

• Larger Sample Size

• Alpha Level shifts <Beware the False Positive!!!>

• Type of Data:
– Binary, Ordinal, Continuous
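The first three points of the recap can be illustrated with the correlation example. The sketch below uses the Fisher z approximation to the power of the two-sided test of ρ = 0; this approximation is an assumption brought in for illustration and is not the method used in the slides, and the function name `approx_power` is hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(rho: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power for testing rho = 0 (Fisher z approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic when the true correlation is rho.
    shift = math.atanh(rho) * math.sqrt(n - 3)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

baseline = approx_power(0.2, 40, 0.05)
print(f"baseline (rho=.2, N=40, alpha=.05): {baseline:.3f}")
print(f"larger effect size (rho=.4):        {approx_power(0.4, 40, 0.05):.3f}")
print(f"larger sample size (N=100):         {approx_power(0.2, 100, 0.05):.3f}")
print(f"less conservative alpha (.10):      {approx_power(0.2, 40, 0.10):.3f}")
print(f"more conservative alpha (.01):      {approx_power(0.2, 40, 0.01):.3f}")
```

The baseline should land close to the 23% seen in the simulated example, and raising alpha buys power only at the cost of more false positives.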

When To Do Power Calcs?

• Generally at the planning stages of a study

• Occasionally with negative result

• No need if significance is achieved

• Computed to determine chances of success

Power Calculations Empirical

• Attempt to Grasp the NCP from Null

• Simulate Data under theorized model

• Calculate Statistics and Perform Test

• Given α, count how many tests give p < α

• Power = (#hits)/(#tests)
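The steps above can be sketched for the correlation example from earlier in the talk (N=40, critical value 2.024 for t(38)). This is a minimal illustration in standard-library Python, not the workshop's simulation code; the function names, the seeds, and the choice of 2000 replicates are assumptions. Counting |t| above the critical value is used in place of counting p < α, which gives the same hits.

```python
import math
import random

def sample_t(rho: float, n: int, rng: random.Random) -> float:
    """Simulate n bivariate-normal pairs with correlation rho; return the t statistic."""
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1 - rho * rho) * z2)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1 - r * r))

def empirical_power(rho: float, n: int, nrep: int, t_crit: float, seed: int = 0) -> float:
    """Power = (#hits)/(#tests): fraction of replicates with |t| > t_crit."""
    rng = random.Random(seed)
    hits = sum(abs(sample_t(rho, n, rng)) > t_crit for _ in range(nrep))
    return hits / nrep

null_rate = empirical_power(0.0, 40, 2000, 2.024, seed=1)  # type I error rate under rho = 0
power = empirical_power(0.2, 40, 2000, 2.024, seed=2)      # power under rho = .2
print(f"rejection rate under rho = 0:  {null_rate:.3f}")
print(f"rejection rate under rho = .2: {power:.3f}")
```

With rho = 0 the rejection rate estimates the type I error rate and should sit near α; with rho = .2 it estimates the power.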

Practical: Empirical Power 1

• We will Simulate Data under a model online

• We will run an ACE model, and test for C

• We will then submit our data and Shaun will collate it for us

• While he’s collating, we’ll talk about theoretical power calculations

Practical: Empirical Power 2

• First get F:\ben\2004\ace.mx and put it into your directory

• We will paste our simulated data into this script, so open it now in preparation, and note both places where we must paste in the data

• Note that you will have to fit the ACE model and then fit the AE submodel

Practical: Empirical Power 3

• Simulation Conditions
– 30% A², 20% C², 50% E²
– Input: A of 0.5477, C of 0.4472, E of 0.7071
– 350 MZ, 350 DZ
– Simulate and Space Delimited at http://statgen.iop.kcl.ac.uk/workshop/unisim.html or click here in slide show mode
– Click submit after filling in the fields and you will get a page of data
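The input values are the square roots of the variance components, since each path coefficient is squared to give its share of the variance. A short Python sketch of that arithmetic, together with the twin covariances the ACE model implies (MZ pairs share all of A and C, DZ pairs share half of A and all of C); the variable names are illustrative.

```python
import math

# Variance components from the simulation conditions (proportions of total variance).
variance_components = {"A": 0.30, "C": 0.20, "E": 0.50}

# Path coefficients entered on the simulation page: square roots of the components.
paths = {name: math.sqrt(value) for name, value in variance_components.items()}
for name, value in paths.items():
    print(f"{name} path = {value:.4f}")

a2, c2, e2 = (variance_components[k] for k in ("A", "C", "E"))
total_variance = a2 + c2 + e2   # phenotypic variance
mz_cov = a2 + c2                # expected covariance between MZ twins
dz_cov = 0.5 * a2 + c2          # expected covariance between DZ twins
print(f"variance = {total_variance:.2f}, MZ cov = {mz_cov:.2f}, DZ cov = {dz_cov:.2f}")
```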

Practical: Empirical Power 4

• With the data page, use control-a to select the data, control-c to copy, and in Mx control-v to paste in both the MZ and DZ groups.

• Run the ace.mx script with the data pasted in and modify it to run the AE model.

• Report the A, C, and E estimates of the first model, and the A and E estimates of the second model, as well as both -2 log-likelihoods, on the webpage http://statgen.iop.kcl.ac.uk/workshop/ or click here in slide show mode

Practical: Empirical Power 5

• Once all of you have submitted your results we will take a look at the theoretical power calculation, using Mx.

• Once we have finished with the theory Shaun will show us the empirical distribution that we generated today

Theoretical Power Calculations

• Based on Stats, rather than Simulations

• Can be calculated by hand sometimes, but Mx does it for us

• Note that sample size and alpha-level are the only things we can change, but we can assume different effect sizes

• Mx gives us the relative power levels at the alpha specified for different sample sizes
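For a test with 1 df, the step from a non-centrality parameter (NCP) to power, and from there to a required sample size, can be written down directly. The sketch below is a hedged Python illustration, not Mx's own code: it uses the fact that a non-central chi-square with 1 df is the square of a shifted standard normal, and it assumes the NCP grows in proportion to the sample size. The NCP of 4.0 from 700 pairs used in the example is a hypothetical input, and all function names are illustrative.

```python
import math
from statistics import NormalDist

_Z = NormalDist()

def power_1df(ncp: float, alpha: float = 0.05) -> float:
    """Power of a 1-df chi-square test with non-centrality parameter ncp."""
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    shift = math.sqrt(ncp)
    return _Z.cdf(shift - z_crit) + _Z.cdf(-shift - z_crit)

def ncp_for_power(target: float, alpha: float = 0.05) -> float:
    """NCP needed to reach the target power, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if power_1df(mid, alpha) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def required_n(ncp_observed: float, n_observed: int, target: float, alpha: float = 0.05) -> int:
    """Sample size for the target power, assuming the NCP scales linearly with N."""
    return math.ceil(n_observed * ncp_for_power(target, alpha) / ncp_observed)

# Hypothetical example: an NCP of 4.0 obtained with 700 twin pairs in total.
for target in (0.5, 0.8, 0.9):
    print(f"power {target:.1f}: about {required_n(4.0, 700, target)} pairs")
```

Only alpha and the sample size are under the investigator's control here; the effect size enters through the NCP that is fed in.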

Theoretical Power Calculations

• We will use the power.mx script to look at the sample size necessary for different power levels

• In Mx, power calculations can be computed in 2 ways:
– Using Covariance Matrices (We Do This One)
– Requiring an initial dataset to generate a likelihood so that we can use a chi-square test

Power.mx 1

! Simulate the data
! 30% additive genetic
! 20% common environment
! 50% nonshared environment

#NGroups 3

G1: model parameters
Calculation
Begin Matrices;
X lower 1 1 fixed
Y lower 1 1 fixed
Z lower 1 1 fixed
End Matrices;

Matrix X 0.5477
Matrix Y 0.4472
Matrix Z 0.7071

Begin Algebra;
A = X*X' ;
C = Y*Y' ;
E = Z*Z' ;
End Algebra;
End

Power.mx 2

G2: MZ twin pairs
Calculation
Matrices = Group 1
Covariances A+C+E | A+C _
            A+C | A+C+E /
Options MX%E=mzsim.cov
End

G3: DZ twin pairs
Calculation
Matrices = Group 1
H Full 1 1
Covariances A+C+E | H@A+C _
            H@A+C | A+C+E /
Matrix H 0.5
Options MX%E=dzsim.cov
End

Power.mx 3

! Second part of script
! Fit the wrong model to the simulated data
! to calculate power

#NGroups 3

G1 : model parameters
Calculation
Begin Matrices;
X lower 1 1 free
Y lower 1 1 fixed
Z lower 1 1 free
End Matrices;

Begin Algebra;
A = X*X' ;
C = Y*Y' ;
E = Z*Z' ;
End Algebra;
End

Power.mx 4

G2 : MZ twins
Data NInput_vars=2 NObservations=350
CMatrix Full File=mzsim.cov
Matrices= Group 1
Covariances A+C+E | A+C _
            A+C | A+C+E /
Option RSiduals
End

G3 : DZ twins
Data NInput_vars=2 NObservations=350
CMatrix Full File=dzsim.cov
Matrices= Group 1
H Full 1 1
Covariances A+C+E | H@A+C _
            H@A+C | A+C+E /
Matrix H 0.5
Option RSiduals
! Power for alpha = 0.05 and 1 df
Option Power= 0.05,1
End
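The logic of the two-part script can be mirrored outside Mx. The following Python sketch is an independent illustration, not a translation of Mx internals: it builds the expected MZ and DZ covariance matrices under the ACE values, fits the AE model (C fixed at zero) to them, and treats the resulting chi-square as the NCP of a 1-df test. It assumes the maximum-likelihood fit function for covariance matrices weighted by N-1 per group, and it uses a simple compass search in place of Mx's optimizer; all names are hypothetical.

```python
import math
from statistics import NormalDist

N_PAIRS = 350                   # pairs per zygosity group, as in the script
A2, C2, E2 = 0.30, 0.20, 0.50   # variance components used to generate the "data"

def group_fit(s_var, s_cov, m_var, m_cov, n):
    """(n-1) times the ML discrepancy between observed and model 2x2 matrices
    of the form [[var, cov], [cov, var]]."""
    det_s = s_var ** 2 - s_cov ** 2
    det_m = m_var ** 2 - m_cov ** 2
    if m_var <= 0 or det_m <= 0:
        return math.inf
    trace = 2 * (s_var * m_var - s_cov * m_cov) / det_m   # tr(S * inverse(M))
    return (n - 1) * (math.log(det_m / det_s) + trace - 2)

def ae_chi_square(a2, e2):
    """Misfit of the AE model (no C) to the covariances expected under ACE."""
    if a2 < 0 or e2 <= 0:
        return math.inf
    total = A2 + C2 + E2
    mz = group_fit(total, A2 + C2, a2 + e2, a2, N_PAIRS)
    dz = group_fit(total, 0.5 * A2 + C2, a2 + e2, 0.5 * a2, N_PAIRS)
    return mz + dz

def compass_search(f, start, step=0.1, tol=1e-6):
    """Minimise f by trying +/- step along each axis, halving step when stuck."""
    point, best = list(start), f(*start)
    while step > tol:
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = point.copy()
                trial[i] += delta
                value = f(*trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2
    return point, best

(a2_hat, e2_hat), ncp = compass_search(ae_chi_square, (0.5, 0.5))

# Power of the 1-df test of C at alpha = .05, given this NCP.
z = NormalDist()
z_crit = z.inv_cdf(0.975)
power = z.cdf(math.sqrt(ncp) - z_crit) + z.cdf(-math.sqrt(ncp) - z_crit)
print(f"AE estimates: A = {a2_hat:.3f}, E = {e2_hat:.3f}")
print(f"NCP = {ncp:.2f}, power to detect C = {power:.2f}")
```

Because the ACE model reproduces its own expected covariance matrices exactly, the chi-square of the AE model is the whole likelihood-ratio difference, which is why it can be read as the NCP.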