Website
Uploaded by diana-santry
Website: http://www.mun.ca/biology/quant/
Welcome to Biology 4605 / 7220
Model Based Statistics in Biology
Cookie Experiment
Was there a preference? Chocolate chip vs. cinnamon rolls
Are they different? Use statistics: a binomial test!
Chocolate chip = ___   Cinnamon rolls = ___
χ² = ___   p-value = ___
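The class counts are not recorded in this transcript, so the following is only a minimal sketch of an exact two-sided binomial test using made-up counts. The function name `binomial_two_sided_p` and the figures 14 and 20 are assumptions for illustration, not the experiment's results.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than
    the observed one under the null hypothesis.
    """
    probs = [
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = probs[successes]
    # A small relative tolerance guards against floating-point ties.
    total = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts (NOT the class data): 14 of 20 people chose chocolate chip.
chocolate_chip, total_choices = 14, 20
p_value = binomial_two_sided_p(chocolate_chip, total_choices)
print(f"p-value = {p_value:.4f}")
```

Under the null hypothesis of no preference, each choice is a coin flip (p = 0.5), so a large p-value means the observed split is unremarkable.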
Are we feeding you a bunch of lies?
Leonard Henry Courtney (1832-1918)
• Do statisticians use a bunch of fancy tests to bolster weak arguments?
• Are stats misused and misinterpreted?
"There are three kinds of lies: lies, damned lies and statistics."
- Journal of the Royal Statistical Society, No. 59 (1896)
• Problems:
  - Rare events
  - Zero-inflated
  - Mean is inappropriate
• Hypothetical example: Less than one endangered species was observed per transect (mean: 0.57 ind./transect). Proceed with development!
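To see why the mean can mislead for rare, zero-inflated counts, here is a minimal sketch. The transect counts are invented so that they reproduce a mean of 0.57 individuals per transect; they are not taken from any survey.

```python
from statistics import mean, median

# Hypothetical counts of an endangered species on 7 transects (illustrative only).
counts = [0, 0, 0, 0, 0, 0, 4]

avg = mean(counts)
share_zero = counts.count(0) / len(counts)

print(f"mean   = {avg:.2f} ind./transect")
print(f"median = {median(counts)}")
print(f"transects with no sightings = {share_zero:.0%}")
print(f"largest single count = {max(counts)}")
```

The mean of "less than one" hides the fact that every individual sits on a single transect, which is exactly the place a development decision should not ignore.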
Statistics are Balderdash!
Ernest Rutherford (1871-1937)
"If your experiment needs statistics, you ought to have done a better experiment"
• Fair enough...
• Balance is important
• What about field studies?
No! Hypothesis testing is inevitable
"Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis"
- R.A. Fisher (1890-1962)
Hypothesis testing is statistical flotsam
"Everyone will have his own pet assortment of flotsam; mine include most of the theory of significance testing, including multiple comparison tests, and nonparametric statistics."
- John Nelder (1924-2010)
The trouble with significance testing
"Elementary statistics courses for biologists tend to lead to the use of a stereotyped set of tests:
1. Without critical attention to the underlying model involved;
2. Without due regard to the precise distribution of sampling errors;
3. With little concern for the scale of measurement;
4. Careless of dimensional homogeneity;
5. Without considering the ideal transformation;
6. Without any attempt at model simplification;
7. With too much emphasis on hypothesis testing and too little emphasis on parameter estimation."
- M.J. Crawley 1993
So how should we analyse our data?!
1. Use Model Based Statistics
2. Don't let significance testing do the thinking for you
"You are always better off thinking about why a model could generate your data and then testing that model"
- L. Wilkinson et al. 1992
[Figure: data and fitted model; axes: plant height, time in sunlight]
Classic approach
• Identify a test by name.
• Check its assumptions.
• Use automated routines provided in a package.
• Sort through the output for a p-value.
• Report whether p was less than 5%.
Model Based approach
• What is the response variable?
• What are the explanatory variables?
• Write the model.
• Check the residuals. Model appropriate? Error structure correct?
• Take corrective action.
• Report the model, parameter values, and standard errors.
In short: write the model* and discard the search for tests
[Figure: plant height against time in sunlight]
Data = Model + Residual
Y = mX + b + Residual (Regression)
*Don't panic... writing a model is easy
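As a minimal sketch of "Data = Model + Residual" for the regression case, the code below fits Y = mX + b by least squares, then reports the parameters, their standard errors, and the residuals. The sunlight and height values are invented for illustration, and the variable names are assumptions rather than anything used in the course.

```python
from math import sqrt

# Hypothetical data (illustrative only): hours in sunlight and plant height in cm.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products around the means.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates of the slope m and intercept b.
m = sxy / sxx
b = y_bar - m * x_bar

# Residual = Data - Model, one value per observation.
residuals = [yi - (m * xi + b) for xi, yi in zip(x, y)]

# Residual variance uses n - 2 degrees of freedom (two fitted parameters).
s2 = sum(r**2 for r in residuals) / (n - 2)
se_m = sqrt(s2 / sxx)
se_b = sqrt(s2 * (1 / n + x_bar**2 / sxx))

print(f"m = {m:.3f} (SE {se_m:.3f})")
print(f"b = {b:.3f} (SE {se_b:.3f})")
print("residuals:", [round(r, 3) for r in residuals])
```

Looking at the residuals, rather than hunting for a p-value, is what shows whether a straight line and the assumed error structure are appropriate.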
How to conceptualise a model: quick example
Data | Verbal | Graphical | Formal
R M
1 0
1 25
2 0
2 50
3 25
4 0
4 25
4 50
5 0
5 25
5 75
5 100
5 150
5 175
5 200
6 25
6 50
6 75
6 125
6 150
6 175
7 0
7 25
8 0
8 50
9 25
10 0
10 25
Continued…
M = Catch of scallops (kg)
R = Seabed roughness (acoustic values)
Grab samples:
1-4 = Sand
5-6 = Gravel
7-10 = Cobble
Verbal
Catch is higher in gravel than in finer (sand) or coarser (cobble) substrates
Graphical
• No obvious linear trend
• Simplify: two means model (gravel vs. other)
Formal
Two means model:
M = K1 if R = 5 or 6 (gravel)
M = K2 if R is not 5 or 6
Data = Model + Residual
M = [K1, K2] + Residuals
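The two means model can be sketched directly from the (R, M) pairs listed in the table. The table is marked "Continued", so the sketch uses only the rows shown and the estimates would change with the full data set; the names `GRAVEL`, `k1`, and `k2` are chosen here for illustration.

```python
# (R, M) pairs copied from the rows shown in the table (the table is marked
# "Continued", so this is only part of the data set).
data = [
    (1, 0), (1, 25), (2, 0), (2, 50), (3, 25), (4, 0), (4, 25), (4, 50),
    (5, 0), (5, 25), (5, 75), (5, 100), (5, 150), (5, 175), (5, 200),
    (6, 25), (6, 50), (6, 75), (6, 125), (6, 150), (6, 175),
    (7, 0), (7, 25), (8, 0), (8, 50), (9, 25), (10, 0), (10, 25),
]

GRAVEL = {5, 6}  # roughness values classed as gravel by the grab samples

gravel_catch = [m for r, m in data if r in GRAVEL]
other_catch = [m for r, m in data if r not in GRAVEL]

# Model: one mean for gravel (K1), one mean for everything else (K2).
k1 = sum(gravel_catch) / len(gravel_catch)
k2 = sum(other_catch) / len(other_catch)

# Residual = Data - Model for every observation.
residuals = [m - (k1 if r in GRAVEL else k2) for r, m in data]

print(f"K1 (gravel) = {k1:.2f} kg from {len(gravel_catch)} hauls")
print(f"K2 (other)  = {k2:.2f} kg from {len(other_catch)} hauls")
```

With the rows shown, this gives K1 of about 101.9 kg and K2 = 20 kg, which matches the verbal model that catch is higher in gravel.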
The General Linear Model
Data = Model + Normal Residual
Data = [Two means] + Normal residual } t-test
Data = [Several means] + Normal residual } One-way ANOVA
Data = [Two factors] + Normal residual } Two-way ANOVA
Data = [Line] + Normal residual } Regression
Data = [Line + factors] + Normal residual } ANCOVA
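To illustrate the first row of this list, the sketch below computes a two-group comparison two ways: as the classic pooled-variance t statistic, and as the slope of a linear model fitted to a 0/1 group indicator. The measurements are invented for illustration, and the helper `mean_of` is a name chosen here.

```python
from math import sqrt

# Hypothetical measurements for two groups (illustrative only).
group_a = [4.1, 5.0, 5.6, 4.8, 5.3]
group_b = [6.2, 6.9, 5.8, 7.1, 6.5]


def mean_of(values):
    return sum(values) / len(values)


# Route 1: classic pooled-variance two-sample t statistic.
n_a, n_b = len(group_a), len(group_b)
mean_a, mean_b = mean_of(group_a), mean_of(group_b)
ss_a = sum((v - mean_a) ** 2 for v in group_a)
ss_b = sum((v - mean_b) ** 2 for v in group_b)
pooled_var = (ss_a + ss_b) / (n_a + n_b - 2)
t_classic = (mean_b - mean_a) / sqrt(pooled_var * (1 / n_a + 1 / n_b))

# Route 2: the same data as a linear model, y = intercept + slope * x + residual,
# where x is 0 for group A and 1 for group B.
x = [0.0] * n_a + [1.0] * n_b
y = group_a + group_b
n = len(y)
x_bar, y_bar = mean_of(x), mean_of(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar
rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
t_model = slope / sqrt(rss / (n - 2) / sxx)

print(f"t (classic t-test)  = {t_classic:.4f}")
print(f"t (linear model)    = {t_model:.4f}")
```

The two statistics agree because the intercept estimates the mean of group A and the slope estimates the difference between the group means, so the named test is one special case of the general model.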
Reasons for the model based approach
1. Statistics is modelling
2. Carryover between biological models and statistics
3. Model approach leads to learning of concepts and principles
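Testing models: let computers do the work
[Table: packages compared against the criteria below; the entries for each package are not recoverable]
Packages: Excel, Minitab, SPSS, SAS, R
Criteria:
• Spreadsheet visible
• Pull-down menus
• Easily graph data
• Basic stats functions
• Randomise data
• General Linear Model
• Residual analysis
• Logistic regression
• Generalized Linear Model
• Easy to learn
• FREE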
Course Goals
1. Introduce you to effective ways of thinking quantitatively about biological phenomena
2. Increase your skill and confidence in the application of quantitative methods
3. Develop your critical capacity, both for your own work and that of others