Fundamentals of a statistical test - FCUP
09/27/15 1
Experimental Design Statistical Tests
Fundamentals of a statistical test
Procedure of a statistical test
● Conceive a hypothesis (HA)
● Define the null hypothesis (H0)
● Choose the level of rejection of the null hypothesis (α)
● Choose an adequate statistical test
● Obtain a statistic (actually, a number!)
● Compute the “probability” (P) of the statistic
P ≥ α: H0 retained (HA rejected)
P < α: HA retained (H0 rejected)
Hypothesis testing in action
Actinia equina and Actinia fragacea
After some observations you come up with a hypothesis
(HA) The number of tentacles of Actinia fragacea differs from that of Actinia equina
Make a trip to Berlengas (Islands)
Catch some anemones and count their tentacles...
Two samples obtained at Berlengas

        equina   fragacea
        126      150
        138      156
        132      150
        144      126
        120      156
        132      150
n       6        6
X̄       132      148

WAIT!
By using these two samples, are we actually testing the hypothesis that the number of tentacles in A. fragacea is different from that of A. equina?
Two samples obtained at Berlengas
[Figure: distribution of Actinia equina + fragacea, labelled REPRESENTATIVENESS]
Rephrase the hypothesis
(HA) The number of tentacles of Actinia fragacea differs from that of Actinia equina in Portugal
or
(HA) The number of tentacles of Actinia fragacea differs from that of Actinia equina at Berlengas
Formalize the hypothesis
(HA) The average number of tentacles of A. fragacea (μ1) differs from that of A. equina (μ2) at Berlengas
HA: μ1 ≠ μ2
H0: μ1 = μ2
Set the rejection level at α=0.05 (more on that later)
Let’s invent a statistical test...
Blah statistic = X̄1 − X̄2
HA: μ1 ≠ μ2
H0: μ1 = μ2
Blah = 132 - 148 = -16
What do we know about Blah?
NOTHING!
Student’s t-test
We need to compute sample variances for the t-test
Variance of a sample
Sums needed for the variance of each sample

        equina   fragacea
n       6        6
X̄       132      148
∑X      792      888
∑X²     104904   132048

Sample variances

        equina   fragacea
n       6        6
X̄       132      148
s²      72       124.8

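The variances above can be checked from the sums alone. The sketch below assumes the computational formula s² = (∑X² − (∑X)²/n) / (n − 1), which is consistent with the ∑X and ∑X² rows shown earlier; the function name sample_variance is ours and is not part of the slides.

```python
import statistics

equina = [126, 138, 132, 144, 120, 132]
fragacea = [150, 156, 150, 126, 156, 150]


def sample_variance(xs):
    """Sample variance from the sums: (sum(X^2) - sum(X)^2 / n) / (n - 1)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


for name, xs in (("equina", equina), ("fragacea", fragacea)):
    # columns: sum of X, sum of X squared, variance from the sums, library variance
    print(name, sum(xs), sum(x * x for x in xs),
          sample_variance(xs), statistics.variance(xs))
```

The last two columns should agree, since statistics.variance also divides by n − 1.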
ν = n1 + n2 - 2 = 6 + 6 - 2 = 10 degrees of freedom
t = -2.79372
ν = 10
(ignore the sign)
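The value of t can be reproduced from the two samples. This is a minimal sketch that assumes the usual pooled-variance form of Student's t for two independent samples (the formula itself is not in the transcript, but it is the one that matches the value above); pooled_t is a name chosen here for illustration.

```python
import math
import statistics

equina = [126, 138, 132, 144, 120, 132]
fragacea = [150, 156, 150, 126, 156, 150]


def pooled_t(sample1, sample2):
    """Two-sample Student's t with pooled variance; returns (t, degrees of freedom)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # weighted average of the two sample variances
    pooled_var = ((n1 - 1) * statistics.variance(sample1)
                  + (n2 - 1) * statistics.variance(sample2)) / df
    # standard error of the difference between the two means
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / std_err
    return t, df


t, df = pooled_t(equina, fragacea)
print(round(t, 5), df)
```

Swapping the two arguments only changes the sign of t, which is why the sign can be ignored in a bilateral test.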
α = 0.05, n = 6, ν = 10
tcrit = 2.228
t = 2.79372
t > tcrit
p < 0.05
Reject H0: samples came from different populations!
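The critical value is normally read from a table, but it can also be approximated numerically. The sketch below is one possible way, not the method used in the slides: it integrates the density of t with Simpson's rule and searches by bisection for the t that leaves α split between the two tails. The function names are ours.

```python
import math


def t_pdf(x, nu):
    """Probability density of Student's t with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))


def central_area(t, nu, steps=2000):
    """P(-t < T < t): Simpson's rule on [0, t], doubled by symmetry (steps is even)."""
    h = t / steps
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 2 * total * h / 3


def two_tailed_critical_t(alpha, nu):
    """Bisection for the t whose two tails together hold an area of alpha."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - central_area(mid, nu) > alpha:
            lo = mid  # tails still hold too much area: move outwards
        else:
            hi = mid
    return (lo + hi) / 2


t_crit = two_tailed_critical_t(0.05, 10)
t_obs = 2.79372
print(round(t_crit, 3), "reject H0" if abs(t_obs) > t_crit else "retain H0")
```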
The logic behind Student’s t-test
There is a model...
Assumptions
● Assume that the two samples come from the same population (that is, H0 is true)
● Assume that observations are independent (that is, they do not influence each other)
● Assume that variances of samples are homogeneous
● Assume that the variable of interest (number of tentacles) has a normal distribution
The normal distribution
Probability density function (PDF) of the normal distribution
The normal distribution (in Excel)

     A      B
1    mu     20
2    stdev  20
3
4    1      =1/($B$2*SQRT(2*PI()))*EXP(-1*POWER(A4-$B$1,2)/(2*POWER($B$2,2)))
5    2      ...
6    3      ...

Drag cell A4 down to cell A204
Drag cell B4 down to cell B204
Make a bar-chart graph from range B4:B204
The normal distribution (in Excel, but the easy way: use function NORM.DIST)

     A      B
1    mu     20
2    stdev  20
3
4    1      =NORM.DIST(A4,$B$1,$B$2,0)
5    2      ...
6    3      ...

Drag cell A4 down to cell A204
Drag cell B4 down to cell B204
Make a bar-chart graph from range B4:B204
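For readers without a spreadsheet, the same two columns can be sketched in Python. The hand-written function mirrors the long formula in cell B4, and NormalDist.pdf from the standard library plays the role of NORM.DIST with its last argument set to 0. The names MU, STDEV and normal_pdf are ours.

```python
import math
from statistics import NormalDist

MU = 20.0     # cell B1 in the spreadsheet
STDEV = 20.0  # cell B2 in the spreadsheet


def normal_pdf(x, mu, stdev):
    """Same expression as the spreadsheet formula typed in cell B4."""
    return (1 / (stdev * math.sqrt(2 * math.pi))
            * math.exp(-((x - mu) ** 2) / (2 * stdev ** 2)))


xs = range(1, 202)  # the values 1..201 held in cells A4:A204
density = [normal_pdf(x, MU, STDEV) for x in xs]

# Library version, the counterpart of NORM.DIST(x, mu, stdev, 0)
library = [NormalDist(MU, STDEV).pdf(x) for x in xs]

# Largest disagreement between the two columns (should be rounding noise only)
print(max(abs(a - b) for a, b in zip(density, library)))
```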
Sample the normal distribution N times, taking at each step two samples with n replicates and performing a t-test
[Diagram: each pair of samples feeds one t test; the step is repeated N times]
A big dilemma: how do you sample from the normal distribution?
Think of it this way: how do you generate a sample of size n=6 from a normally distributed population with an average of μ=20 and a variance of σ²=6?
Recall that the normal distribution depends on two parameters: an average (μ) and a variance (σ²)
But the variance is computed from deviations around the average!
Sampling the normal distribution
Cumulative distribution function [CDF] of the normal distribution:
$$F(X) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{X-\mu}{\sqrt{2\sigma^{2}}}\right)\right]$$
From the graph: F(2.0) with μ=0 and σ²=5.0 ≈ 0.8 (a probability)
What does this probability mean?
Probability of obtaining values less than or equal to 2
80% of the area under the curve
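The value read from the graph can be checked directly with the error function. A minimal sketch, with normal_cdf as an illustrative name of ours; note that it takes the variance σ², as in the slide, whereas the library class NormalDist takes the standard deviation.

```python
import math
from statistics import NormalDist


def normal_cdf(x, mu, variance):
    """CDF of the normal distribution written with the error function."""
    return 0.5 * (1 + math.erf((x - mu) / math.sqrt(2 * variance)))


p = normal_cdf(2.0, mu=0.0, variance=5.0)
print(round(p, 2))  # area under the curve to the left of X = 2

# Cross-check against the standard library (sigma = sqrt(5))
print(abs(p - NormalDist(0.0, math.sqrt(5.0)).cdf(2.0)))
```

The result is a little above 0.8, in line with the reading from the graph.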
We need the inverse of the CDF of the normal distribution!
That is, we need a function that takes a probability and returns a value X from a normal distribution with mean=μ and variance=σ²
This function exists and is called the inverse CDF of the normal distribution
Sampling the normal distribution in Excel

     A        B
1    mu       20
2    stdev    10
3
4    =RAND()  =NORM.INV(A4,$B$1,$B$2)
5
6

Drag cells A4:B4 down to Row 9
Compute AVERAGE(B4:B9)
Compute STDEV(B4:B9)
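The same recipe (a uniform random number pushed through the inverse CDF) can be sketched in Python with the standard library, where NormalDist.inv_cdf is the counterpart of NORM.INV. The fixed seed and the variable names are our own choices, made only so that the run is repeatable.

```python
import random
from statistics import NormalDist, mean, stdev

MU, STDEV, N = 20.0, 10.0, 6  # cells B1 and B2, and the six rows 4..9

rng = random.Random(1)  # fixed seed (an arbitrary choice) for a repeatable run
population = NormalDist(MU, STDEV)

# inv_cdf is only defined on the open interval (0, 1), so an exact 0.0 is redrawn
uniforms = []
while len(uniforms) < N:
    u = rng.random()
    if u > 0.0:
        uniforms.append(u)

# Column B: NORM.INV(RAND(), mu, stdev)
sample = [population.inv_cdf(u) for u in uniforms]

print(sample)
print(mean(sample), stdev(sample))  # AVERAGE(B4:B9) and STDEV(B4:B9)
```

With only six values the sample mean and standard deviation will usually differ noticeably from 20 and 10.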
Get back to the assumptions
● Assume that the two samples come from the same population (that is, H0 is true)
● Assume that observations are independent (that is, they do not influence each other)
● Assume that variances of samples are homogeneous
● Assume that the variable of interest (number of tentacles) has a normal distribution
If we draw two samples (n=6 each) many times, ideally an unlimited number of times, from a normal distribution with parameters μ=10 and σ²=5, and perform a t-test each time, we will end up with a good approximation to the distribution of t
Do that in the simulator!
Generate 1000 t-tests using samples of n=6 and a population mean of μ=10 and variance of σ²=5
What is the expected most common value of t?
Sort the t-values (ascending) and find the ones in row 25 and row 975 (these cut off the lower and upper 2.5% tails of the distribution, 0.05 in total)
Build a histogram of frequencies (using a scale ranging from -3.5 to 3.5 with intervals of 0.1)
Repeat for 10000 samples!
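The exercise can also be tried outside the simulator. The following is a sketch under our own assumptions: it draws the samples with random.gauss rather than through the inverse CDF, uses the pooled-variance t statistic, and fixes a seed so that the run is repeatable. The names pooled_t and simulate_t are illustrative.

```python
import math
import random
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t for two independent samples, using the pooled variance."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def simulate_t(n_tests, n, mu, var, seed=0):
    """Run n_tests t-tests on pairs of samples from ONE normal population; return sorted t."""
    rng = random.Random(seed)
    sd = math.sqrt(var)
    ts = []
    for _ in range(n_tests):
        a = [rng.gauss(mu, sd) for _ in range(n)]
        b = [rng.gauss(mu, sd) for _ in range(n)]
        ts.append(pooled_t(a, b))
    return sorted(ts)


ts = simulate_t(1000, n=6, mu=10.0, var=5.0)
# Rows 25 and 975 of the sorted list (1-based), i.e. the 2.5% tail cut-offs
print(ts[24], ts[974])
# Middle of the sorted list, where the most common values are expected
print(ts[499])
```

With 1000 tests the two cut-offs are still noisy; they should settle down as the number of tests grows.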
Simulated distribution of t
OK, but we used μ=10 and σ²=5! What if we had used a different mean or variance or both?
Should we know, a priori, the mean and variance of a hypothetical population?
Repeat with μ=5 and σ²=10
Wait! This distribution was already derived!
It is known as the probability density function (PDF) of t
Derived by WS Gosset (1908)
Popularized by RA Fisher as Student’s t-distribution
It has one parameter: ν = degrees of freedom
[Figure: PDF of t for ν=10, ν=6 and ν=2]
It depends neither on the mean nor on the variance of a population!
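To make the single parameter concrete, here is a short sketch of the density of t written with the gamma function (the standard textbook form, which is not spelled out in the transcript). Only ν appears in it. The crude area check and the name t_pdf are ours.

```python
import math


def t_pdf(t, nu):
    """Density of Student's t; nu (degrees of freedom) is its only parameter."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)


step = 0.01
for nu in (2, 6, 10):
    # crude Riemann sum over [-60, 60]: the area should come out close to 1
    area = sum(t_pdf(-60 + i * step, nu) for i in range(12001)) * step
    # columns: nu, height of the curve at t = 0, approximate total area
    print(nu, round(t_pdf(0.0, nu), 4), round(area, 3))
```

The peak at t = 0 gets higher, and the tails thinner, as ν increases.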
Recall that for the anemone tentacles’ data:
t = 2.79372
ν = 10
OK, so we know the PDF of t for 10 degrees of freedom and we have got a t statistic of 2.79372
Now what?
Decision theory
● We’ve got a test (Student’s t-test)
● We’ve got the data
● We’ve got a statistic (t=2.79372, with ν=10)
● We know the distribution of t for ν=10, when the null hypothesis is true
When should we reject the null hypothesis?
H0: μ1 = μ2
α = 0.05
Type I Error: probability of rejecting H0 when it is true, that is, the probability of detecting false differences between two samples that came from the same population
[Figure: probability density function of t for ν=10; bilateral test with a rejection area (red) of 0.025 in each tail, beyond the critical values of t for α=0.05, -2.228 and 2.228]
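The meaning of α as a Type I error rate can be illustrated by simulation. In this sketch both samples always come from the same normal population, so every rejection is a false difference. The population parameters (μ=10, σ²=5), the seed and the function names are our own assumptions; the critical value is the one quoted above.

```python
import math
import random
from statistics import mean, variance

T_CRIT = 2.228  # critical value of t for alpha = 0.05 and nu = 10 (bilateral)


def pooled_t(a, b):
    """Student's t for two independent samples, using the pooled variance."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def type_i_error_rate(n_tests, n=6, mu=10.0, var=5.0, seed=0):
    """Fraction of t-tests that reject H0 although H0 is true by construction."""
    rng = random.Random(seed)
    sd = math.sqrt(var)
    rejections = 0
    for _ in range(n_tests):
        a = [rng.gauss(mu, sd) for _ in range(n)]
        b = [rng.gauss(mu, sd) for _ in range(n)]
        if abs(pooled_t(a, b)) > T_CRIT:
            rejections += 1
    return rejections / n_tests


rate = type_i_error_rate(10000)
print(rate)  # expected to be close to alpha = 0.05
```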
When should we reject the null hypothesis?
● If tobserved > tcritical → Reject H0
● If tobserved < tcritical → Accept H0
● If tobserved falls inside rejection area → Reject H0
● If tobserved falls outside rejection area → Accept H0
tobserved > tcritical (2.79372 > 2.228)
tobserved falls in rejection area
Reject H0
[Figure: PDF of t for ν=10 with rejection areas of 0.025 beyond -2.228 and 2.228; tobs = 2.79372 is marked in the upper rejection area]
There are differences between the number of tentacles of A. equina and A. fragacea at Berlengas Islands
Unilateral test (for a more specific hypothesis)
HA: μ1 > μ2
H0: μ1 ≤ μ2
Reject H0 if tobs > tcrit (tobs falls in rejection area)
[Figure: probability density function of t for ν=10; rejection area (red) of 0.050 in the upper tail, beyond the critical value of t for α=0.05, 1.812]
A few points to notice
● Rejection/Acceptance of the null hypothesis is an all-or-nothing process and is determined by the rejection level of the null hypothesis (α), which must be set a priori
● We will never know if the two samples actually come from different populations. What we know is that, if H0 is true, there is a 5% chance (for α=0.05) of wrongly rejecting it!
● Computer packages will usually give you an exact probability. If it is smaller than α, reject H0
A few points to notice
● In the present case, the exact probability of the statistic is about 0.019 (that is, the probability of obtaining a t > 2.79372 or a t < -2.79372)
● What would happen if the rejection level of the null hypothesis had been set to α=0.01 instead of α=0.05?
● We would have retained H0! (no differences)
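The exact probability that a statistics package reports can be approximated by integrating the density of t. This is a rough sketch using Simpson's rule, not the algorithm of any particular package; t_pdf and two_tailed_p are names chosen here. It should return a probability just under 0.02, which lies between the two rejection levels discussed above.

```python
import math


def t_pdf(x, nu):
    """Density of Student's t with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + x * x / nu) ** (-(nu + 1) / 2)


def two_tailed_p(t_obs, nu, steps=2000):
    """P(T > |t_obs|) + P(T < -|t_obs|), from the area between -|t_obs| and |t_obs|."""
    t = abs(t_obs)
    h = t / steps
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)  # Simpson weights
    central = 2 * total * h / 3  # area on [0, t], doubled by symmetry
    return 1 - central


p = two_tailed_p(-2.79372, 10)
for alpha in (0.05, 0.01):
    print(alpha, round(p, 3), "reject H0" if p < alpha else "retain H0")
```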
Spot the error?
We control Type I Error (the probability of rejecting H0 when it is true)
What about the probability of accepting H0 when it is false?
Type II Error
β= the probability of rejecting HA when it is true

            H0 Rejected         H0 Accepted
H0 True     Type I Error (α)    ✓
H0 False    ✓                   Type II Error (β)

Power of a statistical test
Power = 1 - β
● If β is the probability of accepting H0 when it is false, it is also the probability of rejecting HA when it is true.
● Hence, 1-β is the probability of accepting HA when it is true. This is called the Power of a statistical test.
Can we estimate β?
It is not easy...
● β is inversely related to α (the higher the value of α, the lower the value of β)
● β depends on the magnitude of the effect between samples when H0 is false
● β depends on the number of replicates n (the higher the n, the lower the β)
Power and the magnitude of the effect
● Assume that you are comparing two putatively different populations (which are actually different, but you don’t know yet)
● Therefore, H0 is false
● In the case of sea anemones, assume that on average A. fragacea has 30 more tentacles than A. equina (unstandardized magnitude of the effect = 30)
(the latter is also known as effect size)
Power and the magnitude of the effect
Effect size = +30
[Figure: distribution of t when the null hypothesis is true]
Power and the magnitude of the effect
Effect size = +30
[Figure: distribution of t when the null hypothesis is true and distribution of t when the magnitude of the effect is +30]
β=0.001
Power = 1-β = 1-0.001 = 0.999
Power and the magnitude of the effect
Effect size = +20
[Figure: distribution of t when the null hypothesis is true and distribution of t when the magnitude of the effect is +20]
β=0.15
Power = 1-β = 1-0.15 = 0.85
Power and the magnitude of the effect
Effect size = +10
[Figure: distribution of t when the null hypothesis is true and distribution of t when the magnitude of the effect is +10]
β=0.60
Power = 1-β = 1-0.60 = 0.40
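How power falls as the effect shrinks can be explored by simulation. The slides do not state which population variance lies behind their β values, so the sketch below ASSUMES the pooled variance of the two anemone samples, (72 + 124.8) / 2 = 98.4, and an arbitrary baseline mean of 132. Its output is therefore not expected to reproduce the β values above exactly, only the trend. All names are ours.

```python
import math
import random
from statistics import mean, variance

T_CRIT = 2.228      # bilateral critical value for alpha = 0.05 and nu = 10
ASSUMED_VAR = 98.4  # ASSUMPTION: pooled variance of the two anemone samples
BASE_MEAN = 132.0   # ASSUMPTION: arbitrary baseline mean (does not affect power)


def pooled_t(a, b):
    """Student's t for two independent samples, using the pooled variance."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def simulated_power(effect, n=6, n_tests=4000, seed=0):
    """Fraction of t-tests that reject H0 when the true means differ by `effect`."""
    rng = random.Random(seed)
    sd = math.sqrt(ASSUMED_VAR)
    hits = 0
    for _ in range(n_tests):
        a = [rng.gauss(BASE_MEAN, sd) for _ in range(n)]
        b = [rng.gauss(BASE_MEAN + effect, sd) for _ in range(n)]
        if abs(pooled_t(a, b)) > T_CRIT:
            hits += 1
    return hits / n_tests


for effect in (30, 20, 10):
    power = simulated_power(effect)
    # columns: effect size, estimated power, estimated beta
    print(effect, round(power, 2), round(1 - power, 2))
```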
How to increase the power?
[Figure: distribution of t when the null hypothesis is true]
β=0.60
Power = 1-β = 1-0.60 = 0.40
How to increase the power?
1) Increase α
[Figure: distribution of t when the null hypothesis is true]
β=0.40
Power = 1-β = 1-0.40 = 0.60
How to increase the power?
2) Use unilateral tests (HA: μ1 > μ2)
[Figure: distribution of t when the null hypothesis is true]
β=0.40
Power = 1-β = 1-0.40 = 0.60
How to increase the power?
3) Increase n
[Figure: distribution of t when the null hypothesis is true]
β=0.20
Power = 1-β = 1-0.20 = 0.80
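The three options can be compared side by side in a small simulation. As before, this is only a sketch: the population variance (98.4, the pooled variance of the anemone samples) and the effect of +10 are our assumptions, and the critical values are typed in by hand. 2.228 and 1.812 appear in the slides for ν=10; 2.074 (α=0.05, bilateral, ν=22) is taken from a standard t table and is not in the slides. The estimated powers are not expected to match the β values above exactly.

```python
import math
import random
from statistics import mean, variance

ASSUMED_VAR = 98.4  # ASSUMPTION: pooled variance of the two anemone samples
EFFECT = 10.0       # true difference between the two population means


def pooled_t(a, b):
    """Student's t for two independent samples, using the pooled variance."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def simulated_power(n, t_crit, one_sided=False, n_tests=4000, seed=0):
    """Fraction of rejections when the second population mean is EFFECT higher."""
    rng = random.Random(seed)
    sd = math.sqrt(ASSUMED_VAR)
    hits = 0
    for _ in range(n_tests):
        low = [rng.gauss(0.0, sd) for _ in range(n)]
        high = [rng.gauss(EFFECT, sd) for _ in range(n)]
        t = pooled_t(high, low)  # positive when the effect is in the expected direction
        if (t if one_sided else abs(t)) > t_crit:
            hits += 1
    return hits / n_tests


scenarios = {
    "baseline: alpha=0.05, bilateral, n=6": simulated_power(6, 2.228),
    "1) alpha=0.10, bilateral, n=6": simulated_power(6, 1.812),
    "2) alpha=0.05, unilateral, n=6": simulated_power(6, 1.812, one_sided=True),
    "3) alpha=0.05, bilateral, n=12": simulated_power(12, 2.074),
}
for label, power in scenarios.items():
    print(label, round(power, 2))
```

Each of the three changes should raise the estimated power above the baseline, at a different cost: a higher Type I error rate, a narrower hypothesis, or more sampling effort.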
Power Analysis
Peterman RM, 1990. Statistical Power Analysis can improve Fisheries Research and Management. Can. J. Fish. Aquat. Sci., 47: 2-15.