Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05)...

70
09/27/15 1 Experimental Design Statistical Tests Fundamentals of a statistical test

Transcript of Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05)...

Page 1: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 1

Experimental Design Statistical Tests

Fundamentals of a statistical test

Page 2: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 2

Experimental Design Statistical Tests

● Conceive a hypothesis (HA)● Define the null hypothesis (H0)● Choose the level of rejection of the null hypothesis (α)● Choose an adequate statistical test● Obtain a statistic (actually, a number!)● Compute the “probability” (P) of the statistic

P ≥ α: H0 retained (HA rejected)

P < α: HA retained (H0 rejected)

Procedure of a statistical test

Page 3: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 3

Experimental Design Statistical Tests

Hypothesis testing in action

Actinia equina and Actinia fragacea

Page 4: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 4

Experimental Design Statistical Tests

After some observations you come up with an hypothesis

(HA) The number of tentacles of Actinia fragacea differs from that of Actinia equina

Page 5: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 5

Experimental Design Statistical Tests

Make a trip to Berlengas (Islands)

Catch some anemones and count their tentacles...

Page 6: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 6

Experimental Design Statistical Tests

Two samples obtained at Berlengasequina fragacea

126 150

138 156

132 150

144 126

120 156

132 150

132 148

n 6 6

WAIT!

By using these two samples, are we actually testing the hypothesis that the number of tentacles in A. fragacea is different from that of A. equina?

Page 7: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 7

Experimental Design Statistical Tests

Two samples obtained at Berlengas

Distribution of Actinia equina + fragacea

REPRESEN

TATI

VENESS

Page 8: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 8

Experimental Design Statistical Tests

Rephrase the hypothesis

(HA) The number of tentacles of Actinia fragacea differs from that of Actinia equina in Portugal

or

(HA) The number of tentacles of Actinia fragacea differs from that of Actinia equina at Berlengas

Page 9: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 9

Experimental Design Statistical Tests

Formalize the hypothesis

(HA) The average number of tentacles of A. fragacea (μ

1) differs from that of A. equina

(μ2) at Berlengas

HA: µ

1≠µ

2 H0:µ

1=µ

2

Set the rejection level at α=0.05 (more on that later)

Page 10: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 10

Experimental Design Statistical Tests

● Conceive a hypothesis (HA)● Define the null hypothesis (H0)● Choose the level of rejection of the null hypothesis (α)● Choose an adequate statistical test● Obtain a statistic (actually, a number!)● Compute the “probability” (P) of the statistic

P ≥ α: H0 retained (HA rejected)

P < α: HA retained (H0 rejected)

Procedure of a statistical test

Page 11: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 11

Experimental Design Statistical Tests

Let’s invent a statistical test...

Blah statistic x1−x2

HA: µ

1≠µ

2 H0:µ

1=µ

2

Blah = 132-148

Blah = -16

What do we know about Blah?

NOTHING!

equina fragacea

126 150

138 156

132 150

144 126

120 156

132 150

132 148

n 6 6

Page 12: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 12

Experimental Design Statistical Tests

Student’s t-test

Page 13: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 13

Experimental Design Statistical Tests

● Conceive a hypothesis (HA)● Define the null hypothesis (H0)● Choose the level of rejection of the null hypothesis (α)● Choose an adequate statistical test● Obtain a statistic (actually, a number!)● Compute the “probability” (P) of the statistic

P ≥ α: H0 retained (HA rejected)

P < α: HA retained (H0 rejected)

Procedure of a statistical test

Page 14: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 14

Experimental Design Statistical Tests

We need to compute sample variances for the t-test

Variance of a sample

Page 15: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 15

Experimental Design Statistical Tests

equina fragacea

126 150

138 156

132 150

144 126

120 156

132 150

132 148

792 888

104904 132048

n 6 6

X̄∑ X

∑ X2

Page 16: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 16

Experimental Design Statistical Tests

equina fragacea

126 150

138 156

132 150

144 126

120 156

132 150

132 148

s² 72 124.8

n 6 6

Page 17: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 17

Experimental Design Statistical Tests

equina fragacea

126 150

138 156

132 150

144 126

120 156

132 150

132 148

s² 72 124.8

n 6 6

= 10 degrees of freedom

Page 18: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 18

Experimental Design Statistical Tests

t = -2.79372

ν = 10

(ignore the signal)

Page 19: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 19

Experimental Design Statistical Tests

● Conceive a hypothesis (HA)● Define the null hypothesis (H0)● Choose the level of rejection of the null hypothesis (α)● Choose an adequate statistical test● Obtain a statistic (actually, a number!)● Compute the “probability” (P) of the statistic

P ≥ α: H0 retained (HA rejected)

P < α: HA retained (H0 rejected)

Procedure of a statistical test

Page 20: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 20

Experimental Design Statistical Tests

α = 0.05 n = 6ν = 10

tcrit

= 2.228 t = 2.79372

t > tcrit

p < 0.05

Reject H0: samples came from different populations!

Page 21: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 21

Experimental Design Statistical Tests

● Conceive a hypothesis (HA)● Define the null hypothesis (H0)● Choose the level of rejection of the null hypothesis (α)● Choose an adequate statistical test● Obtain a statistic (actually, a number!)● Compute the “probability” (P) of the statistic

P ≥ α: H0 retained (HA rejected)

P < α: HA retained (H0 rejected)

Procedure of a statistical test

Page 22: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 22

Experimental Design Statistical Tests

The logic behind Student’s t-test

There is a model...

Page 23: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 23

Experimental Design Statistical Tests

Assumptions● Assume that the two samples come from the same population (that is, H0 is true)

● Assume that observations are independent (that is they do not influence each other)

● Assume that variances of samples are homogeneous

● Assume that the variable of interest (number of tentacles) has a normal distribution

Page 24: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 24

Experimental Design Statistical Tests

Assumptions● Assume that the two samples come from the same population (that is, H0 is true)

● Assume that observations are independent (that is they do not influence each other)

● Assume that variances of samples are homogeneous

● Assume that the variable of interest (number of tentacles) has a normal distribution

Page 25: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 25

Experimental Design Statistical Tests

The normal distributionProbability density function (PDF) of the normal distribution

Page 26: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 26

Experimental Design Statistical Tests

The normal distribution (in excel)

Drag cell A4 down to cell A204Drag cell B4 down to cell B204

Make a bar-chart graph from range B4:B204

A B C D E F

1 mu 20

2 stdev 20

3

4 1 =1/($B$2*SQRT(2*PI()))*EXP(-1*POWER(A4-$B$1,2)/(2*POWER($B$2,2)))

5 2 ...

6 3 ...

Page 27: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 27

Experimental Design Statistical Tests

The normal distribution (in excel, but the easy way: use function NORM.DIST)

Drag cell A4 down to cell A204Drag cell B4 down to cell B204

Make a bar-chart graph from range B4:B204

A B C D E F

1 mu 20

2 stdev 20

3

4 1 =NORM.DIST(A4,$B$1,$B$2,0)

5 2 ...

6 3 ...

Page 28: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 28

Experimental Design Statistical Tests

Assumptions● Assume that the two samples come from the same population (that is, H0 is true)

● Assume that observations are independent (that is they do not influence each other)

● Assume that variances of samples are homogeneous

● Assume that the variable of interest (number of tentacles) has a normal distribution

Page 29: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 29

Experimental Design Statistical Tests

sample

sample

sample

sample

sample

sample

t test

t test

t test

Sample N times the normal distribution taking at each step two samples with n replicates and performing a t-test

...

Page 30: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 30

Experimental Design Statistical Tests

A big dilemma: how do you sample from the normal distribution?

Think this way: how do yo generate a sample of size n=6 from a normally distributed population with an average of μ=20 and a variance of σ²=6?

Page 31: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 31

Experimental Design Statistical Tests

Recall that the normal distribution depends two parameters: an average (μ) and a variance (σ²)

But the variance depends on the average!

Page 32: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 32

Experimental Design Statistical Tests

Sampling the normal distribution

f (X )=12 [1+erf ( X−μ

√2σ2 )]

Cumulative distribution function [CDF] of the normal distribution

From the graph

f(2.0) with μ=0 and σ²=5.0 ≈ 0.8 Probability

What does this probability mean?

Page 33: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 33

Experimental Design Statistical Tests

Probability of obtaining values less than or equal to 2

80% of the area under curve

Page 34: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 34

Experimental Design Statistical Tests

We need the inverse of the CDF of the normal distribution!

That is, we need a function that takes a probability and returns a value X from a normal distribution with mean=μ and variance=σ²

Page 35: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 35

Experimental Design Statistical Tests

This function exists and is called

Inverse CDF of the normal distribution

Page 36: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 36

Experimental Design Statistical Tests

Sampling the normal distribution in excel

A B C D E F

1 mu 20

2 stdev 10

3

4 =RAND() =NORM.INV(A4,$B$1,$B$2)

5

6

Drag cells A4:B4 down to Row 9

Compute AVERAGE(B4:B9)Compute STDEV(B4:B9)

Page 37: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 37

Experimental Design Statistical Tests

Get back to the assumptions● Assume that the two samples come from the same population (that is, H0 is true)

● Assume that observations are independent (that is they do not influence each other)

● Assume that variances of samples are homogeneous

● Assume that the variable of interest (number of tentacles) has a normal distribution

Page 38: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 38

Experimental Design Statistical Tests

If we draw two samples (n=6 each) many times, preferably indefinitely, from a normal distribution with parameters μ=10 and σ²=5, and preform a t-test each time, we will end up with a good approximation to the distribution of t

Do that in the simulator!

Page 39: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 39

Experimental Design Statistical Tests

Generate 1000 t-tests using samples of n=6 and a population mean of μ=10 and variance of σ²=5

What is the expected most common value of t?

Sort the t-values (ascending) and find out the ones corresponding to row 25 and row 975 (these are the 2.5% upper and lower tails of the distribution = 0.05)

Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with intervals of 0.1)

Repeat for 10000 samples!

Page 40: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 40

Experimental Design Statistical Tests

Simulated distribution of t

Page 41: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 41

Experimental Design Statistical Tests

OK, but we used μ=10 and σ²=5! What if we have used a different mean or variance or both?

Should we know, a priori, the mean and variance of an hypothetical population?

Repeat with μ=5 and σ²=10

Page 42: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 42

Experimental Design Statistical Tests

Wait! This distribution was already derived!

know as the probability density function (PDF) of t

Page 43: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 43

Experimental Design Statistical Tests

Derived by WS Gosset (1908)

Popularized by RA Fisher as Student’s t-distribution

Page 44: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 44

Experimental Design Statistical Tests

It has one parameter

ν = degrees of freedom

ν=10

ν=6

ν=2It does not depend neither on the mean nor on the variance of a population!

Page 45: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 45

Experimental Design Statistical Tests

t = 2.79372

ν = 10

Recall that for the anemone tentacles’ data

Page 46: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 46

Experimental Design Statistical Tests

2.79372

Now what?

OK, so we know the PDF of t for 10 degrees of freedom and we have got a t statistic of 2.79372

Page 47: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 47

Experimental Design Statistical Tests

Decision theory

Page 48: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 48

Experimental Design Statistical Tests

● We’ve got a test (Student’s t-test)

● We’ve got the data

● We’ve got a statistic (t=2.79372, with ν=10)

● We know the distribution of t for ν=10, when the null hypothesis is true

Page 49: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 49

Experimental Design Statistical Tests

Get back to the assumptions● Assume that the two samples come from the same population (that is, H0 is true)

● Assume that observations are independent (that is they do not influence each other)

● Assume that variances of samples are homogeneous

● Assume that the variable of interest (number of tentacles) has a normal distribution

Page 50: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 50

Experimental Design Statistical Tests

When should we reject the null hypothesis?

α

Type I Error

Probability of rejecting H

0 when it is true

H0:µ

1=µ

2

= 0,05

Probability distribution function of t for ν=10

Probability of detecting false differences between two samples that came from the same population

0.0250.025

2.228-2.228Bilateral test

Critical value of t for α=0.05

Rejection area (red)

Page 51: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 51

Experimental Design Statistical Tests

When should we reject the null hypothesis?

● If tobserved

> tcritical

→ Reject H0

● If tobserved

< tcritical

→ Accept H0

● If tobserved

falls inside rejection area → Reject H0

● If tobserved

falls outside rejection area → Accept H0

Page 52: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 52

Experimental Design Statistical Tests

tobserved

> tcritical

(2.79372 > 2.228)

tobserved

falls in rejection area

tobs

= 2.79372

Reject H0

0.0250.025

2.228-2.228

There are differences between the number of tentacles of A. equina and A. fragacea at Berlengas Islands

Page 53: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 53

Experimental Design Statistical Tests

Unilateral test (for a more specific hypothesis)H

A:µ

1>µ

2

Probability distribution function of t for ν=10

0.050

1.812

Critical value of t for α=0.05

Rejection area (red)

H0:µ

1≤µ

2

Reject H0 if

tobs

> tcrit

tobs

falls in rejection area

Page 54: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 54

Experimental Design Statistical Tests

A few points to notice● Rejection/Acceptance of the null hypothesis is an all-or-nothing process and is determined by the rejection level of the null hypothesis (α) which must be set a priori

● We will never know if the two samples come actually from different populations. What we know is that there is a 5% chance (for α=0.05) of being wrong when we reject H0!

● Computer packages will usually give you an exact probability. If it is smaller than α reject H0

Page 55: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 55

Experimental Design Statistical Tests

A few points to notice

● In the present case, the exact probability of the statistic is 0.018 (that is, the probability of obtaining a t > 2.79372 or a t < -2.79372)

● What would happen if the rejection level of the null hypothesis had been set to α=0.01 instead of α=0.05?

● We would have retained H0! (no differences)

Page 56: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 56

Experimental Design Statistical Tests

Spot the error?

Page 57: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 57

Experimental Design Statistical Tests

We control Type I Error (the probability of rejecting H0 when it is true)

What about the probability of accepting H0 when it is false?

Type II Error

β= the probability of rejecting HA when it is true

Page 58: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 58

Experimental Design Statistical Tests

Type I Error(α) ✓

✓ Type II Error(β)

H0 True

H0 False

H0 Rejected H

0 Accepted

Page 59: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 59

Experimental Design Statistical Tests

Power of a statistical test

Power = 1 - β● If β is the probability of accepting H0 when it is false, it is also the probability of rejecting HA when it is true.

● Hence, 1-β is the probability of accepting HA when it is true. This is called the Power of a statistical test.

Page 60: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 60

Experimental Design Statistical Tests

Can we estimate β?

It is not easy...● β is inversely related with α (the higher the value of α, the lower the value of β)

● β depends on the magnitude of the effect between samples when H0 is false

● β depends on the number of replicates n (the higher the n, the lower the β)

Page 61: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 61

Experimental Design Statistical Tests

Power and the magnitude of the effect

● Assume that you are comparing two putatively different populations (which are actually different, but you don’t know yet)

● Therefore, H0 is false● In the case of sea anemones, assume that on average A. fragacea has more 30 tentacles than A. equina (unstandardized magnitude of the effect = 30)

(the latter is also known as effect size)

Page 62: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 62

Experimental Design Statistical Tests

Power and the magnitude of the effect

Distribution of t when thenull hypothesis is true

Effect size = +30

Page 63: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 63

Experimental Design Statistical Tests

Power and the magnitude of the effect

Distribution of t when thenull hypothesis is true

Distribution of t when theMagnitude of the effect is +30

β=0.001

Power = 1-β = 1-0.001 = 0.999

Effect size = +30

Page 64: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 64

Experimental Design Statistical Tests

Power and the magnitude of the effect

Distribution of t when thenull hypothesis is true

Effect size = +20Distribution of t when theMagnitude of the effect is +20

β=0.15

Power = 1-β = 1-0.15 = 0.85

Page 65: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 65

Experimental Design Statistical Tests

Power and the magnitude of the effect

Distribution of t when thenull hypothesis is true

Effect size = +10Distribution of t when theMagnitude of the effect is +10β=0.60

Power = 1-β = 1-0.60 = 0.40

Page 66: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 66

Experimental Design Statistical Tests

How to increase the power?

Distribution of t when thenull hypothesis is true

β=0.60

Power = 1-β = 1-0.60 = 0.40

Page 67: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 67

Experimental Design Statistical Tests

How to increase the power?

Distribution of t when thenull hypothesis is true

β=0.401) Increase α

Power = 1-β = 1-0.40 = 0.60

Page 68: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 68

Experimental Design Statistical Tests

How to increase the power?

Distribution of t when thenull hypothesis is true

β=0.402) Use unilateral tests

Power = 1-β = 1-0.40 = 0.60

(HA:μ

1>μ

2)

Page 69: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 69

Experimental Design Statistical Tests

How to increase the power?

Distribution of t when thenull hypothesis is true

β=0.203) Increase n

Power = 1-β = 1-0.20 = 0.80

Page 70: Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05) Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with

09/27/15 70

Experimental Design Statistical Tests

Power Analysis

Peterman RM, 1990. Statistical Power Analysis can improve Fisheries Research and Management. Can. J. Fish. Aquat. Sci., 47: 2-15.