Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05)...

09/27/15 1

Experimental Design Statistical Tests

Fundamentals of a statistical test

09/27/15 2


● Conceive a hypothesis (HA)● Define the null hypothesis (H0)● Choose the level of rejection of the null hypothesis (α)● Choose an adequate statistical test● Obtain a statistic (actually, a number!)● Compute the “probability” (P) of the statistic

P ≥ α: H0 retained (HA rejected)

P < α: HA retained (H0 rejected)

Procedure of a statistical test

09/27/15 3


Hypothesis testing in action

Actinia equina and Actinia fragacea

09/27/15 4


After some observations you come up with an hypothesis

(HA) The number of tentacles of Actinia fragacea differs from that of Actinia equina

09/27/15 5


Make a trip to Berlengas (Islands)

Catch some anemones and count their tentacles...

09/27/15 6


Two samples obtained at Berlengasequina fragacea

126 150

138 156

132 150

144 126

120 156

132 150

132 148

n 6 6

X̄

WAIT!

By using these two samples, are we actually testing the hypothesis that the number of tentacles in A. fragacea is different from that of A. equina?

09/27/15 7


Two samples obtained at Berlengas

Distribution of Actinia equina + fragacea

REPRESEN

TATI

VENESS

09/27/15 8


Rephrase the hypothesis

(HA) The number of tentacles of Actinia fragacea differs from that of Actinia equina in Portugal

or

(HA) The number of tentacles of Actinia fragacea differs from that of Actinia equina at Berlengas

09/27/15 9


Formalize the hypothesis

(HA) The average number of tentacles of A. fragacea (μ

1) differs from that of A. equina

(μ2) at Berlengas

HA: µ

1≠µ

2 H0:µ

1=µ

2

Set the rejection level at α=0.05 (more on that later)

09/27/15 10






09/27/15 11


Let’s invent a statistical test...

Blah statistic x1−x2

HA: µ

1≠µ

2 H0:µ

1=µ

2

Blah = 132-148

Blah = -16

What do we know about Blah?

NOTHING!

equina fragacea

126 150

138 156

132 150

144 126

120 156

132 150

132 148

n 6 6

X̄

09/27/15 12


Student’s t-test

09/27/15 13






09/27/15 14


We need to compute sample variances for the t-test

Variance of a sample

09/27/15 15


equina fragacea

126 150

138 156

132 150

144 126

120 156

132 150

132 148

792 888

104904 132048

n 6 6

X̄∑ X

∑ X2

09/27/15 16


equina fragacea

126 150

138 156

132 150

144 126

120 156

132 150

132 148

s² 72 124.8

n 6 6

X̄

09/27/15 17


equina fragacea

126 150

138 156

132 150

144 126

120 156

132 150

132 148

s² 72 124.8

n 6 6

X̄

= 10 degrees of freedom

09/27/15 18


t = -2.79372

ν = 10

(ignore the signal)

09/27/15 19






09/27/15 20


α = 0.05 n = 6ν = 10

tcrit

= 2.228 t = 2.79372

t > tcrit

p < 0.05

Reject H0: samples came from different populations!

09/27/15 21






09/27/15 22


The logic behind Student’s t-test

There is a model...

09/27/15 23


Assumptions● Assume that the two samples come from the same population (that is, H0 is true)

● Assume that observations are independent (that is they do not influence each other)

● Assume that variances of samples are homogeneous

● Assume that the variable of interest (number of tentacles) has a normal distribution

09/27/15 24






09/27/15 25


The normal distributionProbability density function (PDF) of the normal distribution

09/27/15 26


The normal distribution (in excel)

Drag cell A4 down to cell A204Drag cell B4 down to cell B204

Make a bar-chart graph from range B4:B204

A B C D E F

1 mu 20

2 stdev 20

3

4 1 =1/($B$2*SQRT(2*PI()))*EXP(-1*POWER(A4-$B$1,2)/(2*POWER($B$2,2)))

5 2 ...

6 3 ...

09/27/15 27


The normal distribution (in excel, but the easy way: use function NORM.DIST)

Drag cell A4 down to cell A204Drag cell B4 down to cell B204

Make a bar-chart graph from range B4:B204

A B C D E F

1 mu 20

2 stdev 20

3

4 1 =NORM.DIST(A4,$B$1,$B$2,0)

5 2 ...

6 3 ...

09/27/15 28






09/27/15 29


sample

sample

sample

sample

sample

sample

t test

t test

t test

Sample N times the normal distribution taking at each step two samples with n replicates and performing a t-test

...

09/27/15 30


A big dilemma: how do you sample from the normal distribution?

Think this way: how do yo generate a sample of size n=6 from a normally distributed population with an average of μ=20 and a variance of σ²=6?

09/27/15 31


Recall that the normal distribution depends two parameters: an average (μ) and a variance (σ²)

But the variance depends on the average!

09/27/15 32


Sampling the normal distribution

f (X )=12 [1+erf ( X−μ

√2σ2 )]

Cumulative distribution function [CDF] of the normal distribution

From the graph

f(2.0) with μ=0 and σ²=5.0 ≈ 0.8 Probability

What does this probability mean?

09/27/15 33


Probability of obtaining values less than or equal to 2

80% of the area under curve

09/27/15 34


We need the inverse of the CDF of the normal distribution!

That is, we need a function that takes a probability and returns a value X from a normal distribution with mean=μ and variance=σ²

09/27/15 35


This function exists and is called

Inverse CDF of the normal distribution

09/27/15 36


Sampling the normal distribution in excel

A B C D E F

1 mu 20

2 stdev 10

3

4 =RAND() =NORM.INV(A4,$B$1,$B$2)

5

6

Drag cells A4:B4 down to Row 9

Compute AVERAGE(B4:B9)Compute STDEV(B4:B9)

09/27/15 37


Get back to the assumptions● Assume that the two samples come from the same population (that is, H0 is true)




09/27/15 38


If we draw two samples (n=6 each) many times, preferably indefinitely, from a normal distribution with parameters μ=10 and σ²=5, and preform a t-test each time, we will end up with a good approximation to the distribution of t

Do that in the simulator!

09/27/15 39


Generate 1000 t-tests using samples of n=6 and a population mean of μ=10 and variance of σ²=5

What is the expected most common value of t?

Sort the t-values (ascending) and find out the ones corresponding to row 25 and row 975 (these are the 2.5% upper and lower tails of the distribution = 0.05)

Build an histogram of frequencies (using a scale ranging from -3.5 to 3.5 with intervals of 0.1)

Repeat for 10000 samples!

09/27/15 40


Simulated distribution of t

09/27/15 41


OK, but we used μ=10 and σ²=5! What if we have used a different mean or variance or both?

Should we know, a priori, the mean and variance of an hypothetical population?

Repeat with μ=5 and σ²=10

09/27/15 42


Wait! This distribution was already derived!

know as the probability density function (PDF) of t

09/27/15 43


Derived by WS Gosset (1908)

Popularized by RA Fisher as Student’s t-distribution

09/27/15 44


It has one parameter

ν = degrees of freedom

ν=10

ν=6

ν=2It does not depend neither on the mean nor on the variance of a population!

09/27/15 45


t = 2.79372

ν = 10

Recall that for the anemone tentacles’ data

09/27/15 46


2.79372

Now what?

OK, so we know the PDF of t for 10 degrees of freedom and we have got a t statistic of 2.79372

09/27/15 47


Decision theory

09/27/15 48


● We’ve got a test (Student’s t-test)

● We’ve got the data

● We’ve got a statistic (t=2.79372, with ν=10)

● We know the distribution of t for ν=10, when the null hypothesis is true

09/27/15 49


Get back to the assumptions● Assume that the two samples come from the same population (that is, H0 is true)




09/27/15 50


When should we reject the null hypothesis?

α

Type I Error

Probability of rejecting H

0 when it is true

H0:µ

1=µ

2

= 0,05

Probability distribution function of t for ν=10

Probability of detecting false differences between two samples that came from the same population

0.0250.025

2.228-2.228Bilateral test

Critical value of t for α=0.05

Rejection area (red)

09/27/15 51


When should we reject the null hypothesis?

● If tobserved

> tcritical

→ Reject H0

● If tobserved

< tcritical

→ Accept H0

● If tobserved

falls inside rejection area → Reject H0

● If tobserved

falls outside rejection area → Accept H0

09/27/15 52


tobserved

> tcritical

(2.79372 > 2.228)

tobserved

falls in rejection area

tobs

= 2.79372

Reject H0

0.0250.025

2.228-2.228

There are differences between the number of tentacles of A. equina and A. fragacea at Berlengas Islands

09/27/15 53


Unilateral test (for a more specific hypothesis)H

A:µ

1>µ

2

Probability distribution function of t for ν=10

0.050

1.812

Critical value of t for α=0.05

Rejection area (red)

H0:µ

1≤µ

2

Reject H0 if

tobs

> tcrit

tobs

falls in rejection area

09/27/15 54


A few points to notice● Rejection/Acceptance of the null hypothesis is an all-or-nothing process and is determined by the rejection level of the null hypothesis (α) which must be set a priori

● We will never know if the two samples come actually from different populations. What we know is that there is a 5% chance (for α=0.05) of being wrong when we reject H0!

● Computer packages will usually give you an exact probability. If it is smaller than α reject H0

09/27/15 55


A few points to notice

● In the present case, the exact probability of the statistic is 0.018 (that is, the probability of obtaining a t > 2.79372 or a t < -2.79372)

● What would happen if the rejection level of the null hypothesis had been set to α=0.01 instead of α=0.05?

● We would have retained H0! (no differences)

09/27/15 56


Spot the error?

09/27/15 57


We control Type I Error (the probability of rejecting H0 when it is true)

What about the probability of accepting H0 when it is false?

Type II Error

β= the probability of rejecting HA when it is true

09/27/15 58


Type I Error(α) ✓

✓ Type II Error(β)

H0 True

H0 False

H0 Rejected H

0 Accepted

09/27/15 59


Power of a statistical test

Power = 1 - β● If β is the probability of accepting H0 when it is false, it is also the probability of rejecting HA when it is true.

● Hence, 1-β is the probability of accepting HA when it is true. This is called the Power of a statistical test.

09/27/15 60


Can we estimate β?

It is not easy...● β is inversely related with α (the higher the value of α, the lower the value of β)

● β depends on the magnitude of the effect between samples when H0 is false

● β depends on the number of replicates n (the higher the n, the lower the β)

09/27/15 61


Power and the magnitude of the effect

● Assume that you are comparing two putatively different populations (which are actually different, but you don’t know yet)

● Therefore, H0 is false● In the case of sea anemones, assume that on average A. fragacea has more 30 tentacles than A. equina (unstandardized magnitude of the effect = 30)

(the latter is also known as effect size)

09/27/15 62



Distribution of t when thenull hypothesis is true

Effect size = +30

09/27/15 63




Distribution of t when theMagnitude of the effect is +30

β=0.001

Power = 1-β = 1-0.001 = 0.999

Effect size = +30

09/27/15 64




Effect size = +20Distribution of t when theMagnitude of the effect is +20

β=0.15

Power = 1-β = 1-0.15 = 0.85

09/27/15 65




Effect size = +10Distribution of t when theMagnitude of the effect is +10β=0.60

Power = 1-β = 1-0.60 = 0.40

09/27/15 66


How to increase the power?


β=0.60

Power = 1-β = 1-0.60 = 0.40

09/27/15 67




β=0.401) Increase α

Power = 1-β = 1-0.40 = 0.60

09/27/15 68




β=0.402) Use unilateral tests

Power = 1-β = 1-0.40 = 0.60

(HA:μ

1>μ

2)

09/27/15 69




β=0.203) Increase n

Power = 1-β = 1-0.20 = 0.80

09/27/15 70


Power Analysis

Peterman RM, 1990. Statistical Power Analysis can improve Fisheries Research and Management. Can. J. Fish. Aquat. Sci., 47: 2-15.

Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05)...

Documents

Transcript of Fundamentals of a statistical test - FCUP · 2.5% upper and lower tails of the distribution = 0.05)...