Correlation Hal Whitehead BIOL4062/5062. The correlation coefficient Tests Non-parametric...


Page 1: Correlation Hal Whitehead BIOL4062/5062. The correlation coefficient Tests Non-parametric correlations Partial correlation Multiple correlation Autocorrelation.

Correlation

Hal Whitehead

BIOL4062/5062


• The correlation coefficient

• Tests

• Non-parametric correlations

• Partial correlation

• Multiple correlation

• Autocorrelation

• Many correlation coefficients


The correlation coefficient


Linked observations: x1,x2,...,xn y1,y2,...,yn

Mean: x̄ = Σ xi / n ȳ = Σ yi / n

Variance: S²(x) = Σ(xi-x̄)²/(n-1) S²(y) = Σ(yi-ȳ)²/(n-1)

Standard Deviation: S(x) = √S²(x) S(y) = √S²(y)

Covariance: S²(x,y) = Σ(xi-x̄) ∙ (yi-ȳ) / (n-1)

Correlation coefficient (“Pearson” or “product-moment”):

r = {Σ(xi-x̄) ∙ (yi-ȳ) / (n-1)} / {S(x) ∙ S(y)}

r = S²(x,y) / {S(x) ∙ S(y)}


-1 ≤ r ≤ +1

If no linear relationship: r = 0

r²: proportion of variance accounted for by linear regression
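The definitions above translate directly into code. The following is a minimal sketch in plain Python, using made-up data that are not from the lecture; the function names are illustrative. It computes r from the covariance and standard deviations, then checks that r² matches the proportion of variance accounted for by a least-squares line.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Sample covariance S²(x,y), using the n-1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def pearson_r(x, y):
    """r = S²(x,y) / {S(x) ∙ S(y)}."""
    sx = math.sqrt(covariance(x, x))  # S(x): the variance is the covariance of x with itself
    sy = math.sqrt(covariance(y, y))  # S(y)
    return covariance(x, y) / (sx * sy)

# Made-up illustrative data (an assumption, not taken from the slides)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)

# Proportion of variance accounted for by the least-squares line y = a + b x
slope = covariance(x, y) / covariance(x, x)
intercept = mean(y) - slope * mean(x)
ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
ss_tot = sum((b - mean(y)) ** 2 for b in y)
explained = 1 - ss_res / ss_tot

print(round(r, 4), round(r * r, 4), round(explained, 4))
```

The last two printed numbers agree, which is the sense in which r² is the proportion of variance accounted for by linear regression.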

[Figures: a series of example plots, each followed by its correlation coefficient: r = -0.01, 0.38, -0.31, 0.95, 0.04, 0.64, -0.46, 0.99, -0.0]


Tests on Correlation Coefficients

• Assume:

– Independence

– Bivariate Normality

• Then:

z = Ln [(1+r)/(1-r)]/2 is approximately normally distributed

with variance 1/(n-3)

And, if ρ (true population value of r) = 0 :

r ∙ √(n-2) / √(1-r²) is distributed as Student's t with n-2 degrees of freedom
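As a rough illustration of the t statistic above, here is a sketch in plain Python. It uses r = 0.75, the value reported in the whale example a few slides later. The sample size n = 21 is an assumption: the slides do not state n, and 21 is used only because it is consistent with the reported standard error and interval. The tail probability is obtained by simple numerical integration of the t density, since the standard library has no t distribution; the helper names are illustrative.

```python
import math

def t_statistic(r, n):
    """t = r ∙ √(n-2) / √(1-r²), referred to Student's t with n-2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=4000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

r = 0.75
n = 21  # ASSUMPTION: n is not given in the slides
t = t_statistic(r, n)
p_one_sided = t_upper_tail(abs(t), n - 2)  # test of r > 0
p_two_sided = 2 * p_one_sided              # test of r ≠ 0

print(round(t, 2), p_one_sided, p_two_sided)
```

The one-sided probability is half the two-sided one, matching the pattern of the two P-values quoted for the whale example.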


We can test:

a) r ≠ 0

b) r > 0 or r < 0

c) r = constant

d) r(x,y) = r(z,w)

Also confidence intervals for r
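Tests (c) and (d) and the confidence interval can all be built on the z transformation. The sketch below is one way to do this in plain Python, under stated assumptions: the function names are illustrative, the two-sample comparison assumes the two correlations come from independent samples, and n = 21 for the whale example is an assumption (the slides do not give n).

```python
import math
from statistics import NormalDist

def fisher_z(r):
    """z = Ln[(1+r)/(1-r)]/2."""
    return 0.5 * math.log((1 + r) / (1 - r))

def confidence_interval(r, n, level=0.95):
    """Interval for the population correlation: build it on the z scale, back-transform with tanh."""
    z = fisher_z(r)
    se = 1 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

def p_value_vs_constant(r, n, rho0):
    """Two-sided P for the hypothesis that the population correlation equals rho0."""
    stat = (fisher_z(r) - fisher_z(rho0)) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(stat)))

def p_value_two_independent(r1, n1, r2, n2):
    """Two-sided P for equal population correlations in two INDEPENDENT samples."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    stat = (fisher_z(r1) - fisher_z(r2)) / se
    return 2 * (1 - NormalDist().cdf(abs(stat)))

low, high = confidence_interval(0.75, 21)  # n = 21 is an ASSUMPTION
print(round(low, 2), round(high, 2))
print(p_value_vs_constant(0.75, 21, 0.5))
print(p_value_two_independent(0.75, 21, 0.40, 30))  # second sample is made up
```

When the two correlations share a variable and come from the same sample, the independent-samples comparison is not appropriate and a dependent-correlation test is needed instead.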

Are Whales Battering Rams? (Carrier et al. J. Exp. Biol. 2002)

[Figure: scatterplot of Relative Melon Area (y-axis) against Sexual Size Dimorphism (x-axis)]

r = 0.75

(SE = 0.15)

(95% C.I. 0.47-0.89)

Tests:

r ≠ 0 : P = 0.0001

r > 0 : P = 0.00005

More sexually dimorphic species have relatively larger melons

Why do Large Animals have Large Brains? (Schoenemann Brain Behav. Evol. 2004)

• Correlations among mammals

– Log brain size with

• Log muscle mass r=0.984

• Log fat mass r=0.942

• Are these significantly different?

t=5.50; df=36; P<0.01 (Hotelling-Williams test)

• Brain mass is more closely related to muscle than fat

[Figure: Brain mass (g) against Fat/Muscle mass (g), logarithmic axes, with separate series for Muscle and Fat]
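The two correlations compared here share a variable (brain size) and come from the same species, so they are dependent. The following sketch shows the commonly cited form of Williams' t statistic for that situation, with n-3 degrees of freedom. Two inputs are assumptions and are labelled as such: n = 39 (implied if the slide's df = 36 equals n-3) and the muscle-fat correlation of 0.95, which is purely hypothetical because the slide does not report it. The sketch therefore does not try to reproduce t = 5.50.

```python
import math

def williams_t(r12, r13, r23, n):
    """Williams' t for comparing dependent correlations r12 and r13 that share variable 1.

    r23 is the correlation between the two non-shared variables.
    Returns (t, degrees of freedom), with df = n - 3.
    """
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23  # determinant of the 3x3 correlation matrix
    rbar = (r12 + r13) / 2
    denom = 2 * ((n - 1) / (n - 3)) * det + rbar**2 * (1 - r23) ** 3
    t = (r12 - r13) * math.sqrt((n - 1) * (1 + r23) / denom)
    return t, n - 3

r_brain_muscle = 0.984  # from the slide
r_brain_fat = 0.942     # from the slide
r_muscle_fat = 0.95     # HYPOTHETICAL: not reported in the slides
n = 39                  # ASSUMPTION: inferred from df = 36 = n - 3

t, df = williams_t(r_brain_muscle, r_brain_fat, r_muscle_fat, n)
print(round(t, 2), df)
```

The statistic is sensitive to the third correlation, so the printed t will differ from the slide's value unless the true muscle-fat correlation is supplied.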


Non-Parametric Correlation

• If one variable normally distributed

– can test r=0 as before.

• If neither normally distributed:

– Spearman's rS rank correlation coefficient (replace values by ranks)

or:

– Kendall's τ correlation coefficient

• Use Spearman's when there is less certainty about the close rankings
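To make the two rank-based coefficients concrete, here is a small sketch in plain Python with made-up data. Spearman's rS is computed as the Pearson coefficient of the ranks (ties get average ranks); Kendall's τ is computed in its simplest form, often called tau-a, which assumes no ties. Function names are illustrative.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks 1..n; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rs(x, y):
    """Spearman's rS: replace values by ranks, then compute Pearson's r."""
    return pearson_r(average_ranks(x), average_ranks(y))

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / all pairs; assumes no ties."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)

# Made-up data: perfectly monotone but far from linear
x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 100]
print(round(pearson_r(x, y), 3), spearman_rs(x, y), kendall_tau(x, y))
```

Both rank coefficients reach their maximum for any strictly increasing relationship, while Pearson's r is pulled below 1 by the non-linearity.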

Are Whales Battering Rams? (Carrier et al. J. Exp. Biol. 2002)

r = 0.75

rS = 0.62

τ = 0.47

[Figure: scatterplot of Relative Melon Area (y-axis) against Sexual Size Dimorphism (x-axis)]


Partial Correlation

• Correlation between X and Y controlling for Z

r(X,Y|Z) = {r(X,Y) - r(X,Z)∙r(Y,Z)} / √{(1 - r(X,Z)²)∙(1 - r(Y,Z)²)}

• Correlation between X and Y controlling for W,Z

r(X,Y|W,Z) = {r(X,Y|W) - r(X,Z|W)∙r(Y,Z|W)} / √{(1 - r(X,Z|W)²)∙(1 - r(Y,Z|W)²)}

n-2-c degrees of freedom

(c is number of control variables)
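The partial correlation formulas can be applied directly to a set of pairwise correlations. Below is a minimal sketch in plain Python; the function names and all numerical values are illustrative assumptions, not results from the lecture. The two-control version simply applies the one-control formula three times and then once more, as in the recursive expression above.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """r(X,Y|Z) from the three pairwise correlations."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

def partial_r_two_controls(r_xy, r_xw, r_yw, r_xz, r_yz, r_wz):
    """r(X,Y|W,Z): first control everything for W, then control for Z."""
    r_xy_w = partial_r(r_xy, r_xw, r_yw)
    r_xz_w = partial_r(r_xz, r_xw, r_wz)
    r_yz_w = partial_r(r_yz, r_yw, r_wz)
    return partial_r(r_xy_w, r_xz_w, r_yz_w)

def degrees_of_freedom(n, c):
    """n-2-c, where c is the number of control variables."""
    return n - 2 - c

# Made-up example: X and Y are correlated only through Z
print(partial_r(0.48, 0.8, 0.6))  # essentially zero

# Made-up correlations among X, Y, W, Z
value = partial_r_two_controls(0.6, 0.3, 0.4, 0.5, 0.2, 0.1)
print(round(value, 3), degrees_of_freedom(40, 2))
```

The order in which the control variables are removed does not matter: swapping the roles of W and Z gives the same result.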

Why do Large Animals have Large Brains? (Schoenemann Brain Behav. Evol. 2004)

• Correlations among mammals

– Log brain size with

• Log muscle mass, controlling for Log body mass: r=0.466

• Log fat mass, controlling for Log body mass: r=-0.299

• Fatter species have relatively smaller brains and more muscular species relatively larger brains


Semi-partial Correlation Coefficient

• Correlation between X & Y controlling Y for Z

r(X,(Y|Z)) = {r(X,Y) - r(X,Z)∙r(Y,Z)} / √(1 - r(Y,Z)²)
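The semi-partial coefficient differs from the partial coefficient only in its denominator. A short sketch in plain Python, with made-up correlations and illustrative function names, shows the relationship between the two.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """r(X,Y|Z): Z is removed from both X and Y."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

def semipartial_r(r_xy, r_xz, r_yz):
    """r(X,(Y|Z)): Z is removed from Y only."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_yz**2)

# Made-up pairwise correlations (assumptions for illustration)
r_xy, r_xz, r_yz = 0.6, 0.3, 0.4
partial = partial_r(r_xy, r_xz, r_yz)
semi = semipartial_r(r_xy, r_xz, r_yz)
print(round(partial, 3), round(semi, 3))
```

Because the two formulas share a numerator, the semi-partial equals the partial multiplied by √(1 - r(X,Z)²), so its magnitude can never exceed that of the partial.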

Are Whales Battering Rams? (Carrier et al. J. Exp. Biol. 2002)

Correlation

r = 0.75

Partial Correlation

r(SSD,MA|L) = 0.73

Semi-partial Correlations

r(SSD,(MA|L)) = 0.69

r((SSD|L),MA) = 0.71

[Figure: pairwise plots of MELAREA, SSD and LENGTH]


Multiple Correlation


Multiple Correlation Coefficient

• Correlation between one dependent variable and its best estimate from a regression on several independent variables:

r(Y∙X1,X2,X3,...)

• Square of multiple correlation coefficient is:

– proportion of variance accounted for by multiple regression
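For the special case of two independent variables, the multiple correlation can be written in terms of the three pairwise correlations. The sketch below uses that standard two-predictor identity, which is not given in the slides; the function name and the numbers are illustrative assumptions.

```python
import math

def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of Y with X1 and X2 from pairwise correlations.

    R² = (r_y1² + r_y2² - 2∙r_y1∙r_y2∙r_12) / (1 - r_12²)
    """
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return math.sqrt(r_squared)

# Made-up correlations (assumptions for illustration)
big_r = multiple_r_two_predictors(0.6, 0.5, 0.3)
print(round(big_r, 3), round(big_r**2, 3))  # R and the proportion of variance accounted for
```

With uncorrelated predictors (r_12 = 0) the squared multiple correlation reduces to the sum of the two squared pairwise correlations, and R is never smaller than the larger of the two pairwise magnitudes.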


Multiple Partial Correlation Coefficient

!


Autocorrelation

• Purposes

– Examine time series

– Look at (serial) independence


Data

(e.g. Feeding rate on consecutive days,

plankton biomass at each station on a transect):

1.5 1.7 4.3 5.4 5.7 6.2 3.9 4.4 5.2 4.8 3.9 3.7 3.6

Autocorrelation of lag=1 is correlation between:

1.5 1.7 4.3 5.4 5.7 6.2 3.9 4.4 5.2 4.8 3.9 3.7

1.7 4.3 5.4 5.7 6.2 3.9 4.4 5.2 4.8 3.9 3.7 3.6

r = 0.508

Autocorrelation of lag=2 is correlation between:

1.5 1.7 4.3 5.4 5.7 6.2 3.9 4.4 5.2 4.8 3.9

4.3 5.4 5.7 6.2 3.9 4.4 5.2 4.8 3.9 3.7 3.6

r = -0.053

…….
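The lagged pairing can be checked numerically with the series above. The sketch below, in plain Python with illustrative function names, computes two versions. One is the plain Pearson coefficient between the two lagged columns. The other is the usual time-series estimator, which uses the mean and sum of squares of the whole series. It is an observation rather than something stated in the slides that the second version is the one that reproduces the quoted values of 0.508 and -0.053; the plain Pearson coefficient for lag 1 comes out somewhat higher, near 0.62.

```python
import math

series = [1.5, 1.7, 4.3, 5.4, 5.7, 6.2, 3.9, 4.4, 5.2, 4.8, 3.9, 3.7, 3.6]

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_pearson(values, lag):
    """Pearson's r between the series and itself shifted by `lag`."""
    return pearson_r(values[:-lag], values[lag:])

def autocorrelation(values, lag):
    """Usual estimator: lagged cross-products about the overall mean,
    divided by the total sum of squares of the whole series."""
    n = len(values)
    m = sum(values) / n
    total_ss = sum((v - m) ** 2 for v in values)
    cross = sum((values[t] - m) * (values[t + lag] - m) for t in range(n - lag))
    return cross / total_ss

for lag in (1, 2):
    print(lag, round(autocorrelation(series, lag), 3), round(lagged_pearson(series, lag), 3))
```

Plotting `autocorrelation(series, lag)` against lag for a range of lags gives the correlogram.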

Autocorrelation Plot (Correlogram)

[Figure: Correlation (y-axis, -1.0 to 1.0) against Lag (x-axis, 0 to 15)]


Many Correlation Coefficients

Many Correlation Coefficients: [Behaviour of Sperm Whale Groups]

         NGR25L  SST     SHITR   LSPEED  APROP   SOCV    SHR2    LFMECS  LAERR
NGR25L   1.00
SST      0.12    1.00
SHITR   -0.21   -0.33*   1.00
LSPEED   0.10   -0.28+   0.06    1.00
APROP   -0.15   -0.34*   0.07    0.18    1.00
SOCV    -0.05    0.08   -0.16   -0.01   -0.33*   1.00
SHR2    -0.18   -0.12    0.01   -0.20    0.19   -0.03    1.00
LFMECS   0.08    0.14   -0.13   -0.12   -0.22    0.29+  -0.18    1.00
LAERR   -0.10    0.03   -0.21   -0.24   -0.02    0.24   -0.08    0.23    1.00

Listwise deletion, n=40; + P<0.10; * P<0.05; uncorrected

Expected no. with P<0.10 = 3.6; with P<0.05 = 1.8
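The expected counts follow from the number of distinct coefficients in the matrix. A minimal sketch in plain Python, assuming all true correlations are zero and treating the tests as if each had the nominal error rate; the function names are illustrative.

```python
def number_of_coefficients(k):
    """Distinct off-diagonal correlations among k variables."""
    return k * (k - 1) // 2

def bonferroni(p, m):
    """Bonferroni-corrected P-value for one of m tests, capped at 1."""
    return min(1.0, p * m)

k = 9  # variables in the matrix
m = number_of_coefficients(k)

expected_at_010 = 0.10 * m  # expected number with P<0.10 by chance alone
expected_at_005 = 0.05 * m  # expected number with P<0.05 by chance alone
print(m, round(expected_at_010, 1), round(expected_at_005, 1))

# Any uncorrected P above 1/m is capped at 1.0 after correction
print(round(1 / m, 3), bonferroni(0.03, m))
```

With 36 coefficients, the per-test threshold for an overall 0.05 level is 0.05/36, so a coefficient needs a very small uncorrected P-value to survive correction.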

Many Correlation Coefficients: [Behaviour of Sperm Whale Groups]

(Same listwise-deletion matrix, n=40, with Bonferroni-corrected P-values: no coefficient is flagged at P<0.10 or P<0.05)

P=1.0 for all coefficients

Many Correlation Coefficients: [Behaviour of Sperm Whale Groups]

Listwise deletion, n=40; + P<0.10; * P<0.05; uncorrected: matrix as on the earlier slide.

Pairwise deletion, n=59-118; + P<0.10; * P<0.05; uncorrected

         NGR25L  SST     SHITR   LSPEED  APROP   SOCV    SHR2    LFMECS  LAERR
NGR25L   1.00
SST      0.11    1.00
SHITR   -0.17+  -0.46*   1.00
LSPEED   0.05   -0.17    0.05    1.00
APROP   -0.05   -0.20+   0.04    0.31*   1.00
SOCV    -0.00   -0.05   -0.06   -0.02   -0.25*   1.00
SHR2    -0.15   -0.13    0.07   -0.14    0.05    0.01    1.00
LFMECS   0.01    0.07   -0.02   -0.14   -0.25*   0.43*  -0.26+   1.00
LAERR   -0.06    0.06    0.09   -0.27*  -0.20+   0.06   -0.06    0.21+   1.00


Many Correlation Coefficients

• Missing values:

– Listwise deletion (comparability), or

– Pairwise deletion (power)

• P-values:

– Uncorrected: type 1 errors

– Bonferroni, etc.: type 2 errors
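The difference between the two ways of handling missing values is easiest to see in terms of which rows each coefficient is based on. The sketch below uses a tiny made-up data set in plain Python, with `None` marking a missing value; the names are illustrative.

```python
# Made-up data: each row is one unit, columns are three variables; None = missing
rows = [
    (1.0, 2.0, 3.0),
    (2.0, None, 1.0),
    (3.0, 4.0, None),
    (4.0, 3.0, 2.0),
    (5.0, 6.0, 5.0),
    (None, 5.0, 4.0),
]

def listwise_rows(data):
    """Keep only rows complete on every variable: all coefficients share one sample."""
    return [row for row in data if all(value is not None for value in row)]

def pairwise_columns(data, i, j):
    """Keep every row complete on variables i and j: sample size varies by pair."""
    kept = [(row[i], row[j]) for row in data if row[i] is not None and row[j] is not None]
    return [a for a, _ in kept], [b for _, b in kept]

n_listwise = len(listwise_rows(rows))
n_pairwise = {
    (i, j): len(pairwise_columns(rows, i, j)[0])
    for i in range(3)
    for j in range(i + 1, 3)
}
print(n_listwise, n_pairwise)
```

Listwise deletion gives one common, smaller n for every coefficient, which keeps them comparable; pairwise deletion keeps more data per coefficient, which gives more power but a different n for each pair.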

Beware!

Correlation ≠ Causation

[Figure: path diagrams linking variables Y1 to Y6]