Sample Correlation - Mathematics 47: Lecture...

Source: math.furman.edu/~dcs/courses/math47/lectures/lecture-5.pdf

Sample Correlation

Mathematics 47: Lecture 5

Dan Sloughter

Furman University

March 10, 2006

Dan Sloughter (Furman University) Sample Correlation March 10, 2006 1 / 8

Definition

If $X$ and $Y$ are random variables with means $\mu_X$ and $\mu_Y$ and variances $\sigma_X^2$ and $\sigma_Y^2$, respectively, then we call

$$\operatorname{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$

the covariance of $X$ and $Y$.
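
Expanding the product inside the expectation gives a form that is often easier to compute. This is a standard identity rather than something stated on the slide; it uses only linearity of expectation together with $E[X] = \mu_X$ and $E[Y] = \mu_Y$:

$$
\begin{aligned}
\operatorname{cov}(X, Y) &= E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] \\
&= E[XY] - \mu_X E[Y] - \mu_Y E[X] + \mu_X \mu_Y \\
&= E[XY] - \mu_X \mu_Y.
\end{aligned}
$$

Taking $Y = X$ recovers the variance: $\operatorname{cov}(X, X) = E[(X - \mu_X)^2] = \sigma_X^2$.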

Theorem (Cauchy-Schwarz Inequality)

If $X$ and $Y$ are random variables for which $E[X^2]$ and $E[Y^2]$ both exist, then

$$(E[XY])^2 \le E[X^2]\,E[Y^2].$$

Proof.

- Let $f(t) = E[(X + tY)^2] = E[X^2] + 2tE[XY] + t^2 E[Y^2]$.

- Then $f$ is a quadratic polynomial in $t$ with $f(t) \ge 0$ for all $t$.

- Hence, by the quadratic formula, $4(E[XY])^2 - 4E[X^2]E[Y^2] \le 0$.

- Hence $(E[XY])^2 \le E[X^2]E[Y^2]$.
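
The quadratic-formula step can be spelled out as follows. The coefficient names $a$, $b$, $c$ are introduced here only for illustration. Write $f(t) = at^2 + bt + c$ with

$$a = E[Y^2], \qquad b = 2E[XY], \qquad c = E[X^2].$$

If $a > 0$, a quadratic with $f(t) \ge 0$ for all $t$ cannot have two distinct real roots, so its discriminant is nonpositive:

$$b^2 - 4ac = 4(E[XY])^2 - 4E[X^2]E[Y^2] \le 0.$$

If $a = 0$, then $f(t) = c + bt$ is linear, and $f(t) \ge 0$ for all $t$ forces $b = 0$, so the inequality holds with both sides equal to $0$.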

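Correlation coefficient

- Applying the Cauchy-Schwarz inequality to the definition of covariance, we have

$$|\operatorname{cov}(X, Y)| \le \sqrt{E[(X - \mu_X)^2]}\,\sqrt{E[(Y - \mu_Y)^2]} = \sigma_X \sigma_Y.$$

- If we let

$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y},$$

then $-1 \le \rho_{X,Y} \le 1$ (assuming $\sigma_X > 0$ and $\sigma_Y > 0$).

- Moreover, $|\rho_{X,Y}| = 1$ if and only if $Y = aX + b$ (with probability 1) for some real numbers $a \ne 0$ and $b$.

Definition

We call $\rho_{X,Y}$ the correlation coefficient of $X$ and $Y$.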
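
As a small check of the bound and of the linear case, the sketch below computes $\rho_{X,Y}$ exactly for a toy finite distribution. The values `xs`, the probabilities `p`, and the helper names `E` and `rho` are assumptions made for illustration; none of them come from the lecture.

```r
# Assumed toy distribution: X takes the values xs with probabilities p.
xs <- c(0, 1, 2, 3)
p  <- c(0.1, 0.2, 0.3, 0.4)

# Expectation of a function of X under this finite distribution.
E <- function(values) sum(values * p)

# Correlation coefficient of X and Y when Y is a function of X,
# computed directly from the definitions of cov, sigma_X and sigma_Y.
rho <- function(ys) {
  mu_x <- E(xs)
  mu_y <- E(ys)
  cov_xy  <- E((xs - mu_x) * (ys - mu_y))
  sigma_x <- sqrt(E((xs - mu_x)^2))
  sigma_y <- sqrt(E((ys - mu_y)^2))
  cov_xy / (sigma_x * sigma_y)
}

rho_linear_up   <- rho(3 * xs + 2)   # Y = 3X + 2, positive slope
rho_linear_down <- rho(-2 * xs + 5)  # Y = -2X + 5, negative slope
rho_nonlinear   <- rho(xs^2)         # Y = X^2, not linear in X

print(c(rho_linear_up, rho_linear_down, rho_nonlinear))
```

Up to floating-point rounding, the two linear cases give $1$ and $-1$, while the quadratic case stays strictly inside the interval.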

Correlation and independence

- Note: if $X$ and $Y$ are independent, then $\operatorname{cov}(X, Y) = 0$ (and hence $\rho_{X,Y} = 0$).

- If $\operatorname{cov}(X, Y) = 0$, we say $X$ and $Y$ are uncorrelated.

- However, uncorrelated does not necessarily imply independence, although it does if $(X, Y)$ has a bivariate normal distribution.
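
A standard counterexample, not taken from the lecture, makes the last point concrete: let $X$ be uniform on $\{-1, 0, 1\}$ and $Y = X^2$. The sketch below computes the covariance exactly and compares one joint probability with the product of the marginals. All variable names are illustrative.

```r
# Assumed toy distribution: X is uniform on {-1, 0, 1} and Y = X^2.
xs <- c(-1, 0, 1)
p  <- rep(1 / 3, 3)
ys <- xs^2

# Expectation under this finite distribution.
E <- function(values) sum(values * p)

# cov(X, Y) = E[XY] - E[X] E[Y]; here E[XY] = E[X^3] = 0 and E[X] = 0.
cov_xy <- E(xs * ys) - E(xs) * E(ys)

# Independence would require P(X = 0, Y = 0) = P(X = 0) * P(Y = 0).
p_joint   <- sum(p[xs == 0 & ys == 0])
p_product <- sum(p[xs == 0]) * sum(p[ys == 0])

print(c(cov_xy, p_joint, p_product))
```

The covariance is $0$, yet the joint probability is $1/3$ while the product of the marginals is $1/9$, so $X$ and $Y$ are uncorrelated but not independent.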

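Sample correlation

- Now suppose $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ are independent identically distributed pairs of random variables (that is, a random sample from a bivariate distribution).

- Let

$$
\begin{aligned}
R &= \frac{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\frac{1}{n}\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \\
&= \frac{\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}}{\sqrt{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}\,\sqrt{\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2}} \\
&= \frac{\sum_{i=1}^{n} X_i Y_i - \frac{\sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{n}}{\sqrt{\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}}\,\sqrt{\sum_{i=1}^{n} Y_i^2 - \frac{\left(\sum_{i=1}^{n} Y_i\right)^2}{n}}}.
\end{aligned}
$$

Definition

We call $R$ the sample correlation coefficient.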
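
The equality of the three forms of $R$ rests on one expansion, sketched here with $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$ denoting the sample means (the slide uses these symbols without defining them). First, the factors of $\frac{1}{n}$ cancel, since $\frac{1}{n} \big/ \left(\sqrt{\tfrac{1}{n}}\sqrt{\tfrac{1}{n}}\right) = 1$. Then

$$
\begin{aligned}
\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) &= \sum_{i=1}^{n} X_i Y_i - \bar{Y}\sum_{i=1}^{n} X_i - \bar{X}\sum_{i=1}^{n} Y_i + n\bar{X}\bar{Y} \\
&= \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y} - n\bar{X}\bar{Y} + n\bar{X}\bar{Y} \\
&= \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}.
\end{aligned}
$$

Setting $Y_i = X_i$ gives $\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2$, and likewise for the $Y_i$. Substituting $n\bar{X}\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i$ and $n\bar{X}^2 = \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right)^2$ then yields the third form.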

Example

- An experiment to measure the yield of wheat for seven different levels of nitrogen gave the following observations:

Nitrogen/acre (x)      40    60    80    100   120   140   160
Yield (cwt/acre) (y)   15.9  18.8  21.6  25.2  28.7  30.4  30.7

- If we let $x_i$ and $y_i$, $i = 1, 2, \ldots, 7$, represent the nitrogen levels and wheat yields, respectively, then

$$\sum_{i=1}^{7} x_i = 700, \quad \sum_{i=1}^{7} y_i = 171.3, \quad \sum_{i=1}^{7} x_i y_i = 18{,}624, \quad \sum_{i=1}^{7} x_i^2 = 81{,}200, \quad \text{and} \quad \sum_{i=1}^{7} y_i^2 = 4398.19.$$

- So

$$r = \frac{18{,}624 - \frac{(700)(171.3)}{7}}{\sqrt{81{,}200 - \frac{(700)^2}{7}}\,\sqrt{4398.19 - \frac{(171.3)^2}{7}}} \approx 0.9830.$$
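
For readers checking the arithmetic by hand, the intermediate values work out as follows (decimals rounded to four places):

$$
\begin{aligned}
18{,}624 - \frac{(700)(171.3)}{7} &= 18{,}624 - 17{,}130 = 1494, \\
81{,}200 - \frac{(700)^2}{7} &= 81{,}200 - 70{,}000 = 11{,}200, \\
4398.19 - \frac{(171.3)^2}{7} &= 4398.19 - \frac{29{,}343.69}{7} \approx 206.2343, \\
r &\approx \frac{1494}{\sqrt{11{,}200}\,\sqrt{206.2343}} \approx \frac{1494}{1519.81} \approx 0.9830.
\end{aligned}
$$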

Example (cont'd)

- If the nitrogen levels are in a vector x and the wheat yields are in a vector y, then the R command > cor(x, y) returns r, in this case 0.9830173.
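
A minimal R session for this example might look like the sketch below. The data are the seven observations from the example; the intermediate names (`sum_xy`, `r_by_hand`, and so on) are illustrative and not from the lecture. It evaluates the sum-based formula for the sample correlation and compares the result with the built-in `cor`.

```r
# Nitrogen levels (x) and wheat yields (y) from the example.
x <- c(40, 60, 80, 100, 120, 140, 160)
y <- c(15.9, 18.8, 21.6, 25.2, 28.7, 30.4, 30.7)
n <- length(x)

# The five sums used in the hand computation.
sum_x  <- sum(x)
sum_y  <- sum(y)
sum_xy <- sum(x * y)
sum_xx <- sum(x^2)
sum_yy <- sum(y^2)

# Sample correlation from the sum-based formula.
r_by_hand <- (sum_xy - sum_x * sum_y / n) /
  (sqrt(sum_xx - sum_x^2 / n) * sqrt(sum_yy - sum_y^2 / n))

# Built-in Pearson correlation, the default method of cor().
r_builtin <- cor(x, y)

print(c(r_by_hand, r_builtin))
```

Both values should agree with the 0.9830173 quoted above, up to floating-point rounding.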
