Source: people.reed.edu/~jones/Courses/P07.pdf
Math 141
Lecture 7: Variance, Covariance, and Sums
Albyn Jones
Library [email protected]
www.people.reed.edu/~jones/courses/141
Albyn Jones Math 141
Last Time
Variance: expected squared deviation from the mean:
$$\mathrm{Var}(X) = E\big[(X - \mu_x)^2\big] = \sigma^2$$
Standard Deviation:
$$\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} = \sigma$$
Properties:
$$\mathrm{Var}(a + bX) = b^2\,\mathrm{Var}(X)$$
$$\mathrm{SD}(a + bX) = |b|\,\mathrm{SD}(X)$$
Interpretation: For a symmetric distribution, the standard deviation may be considered a typical deviation from the mean. For a strongly asymmetrical distribution, it is better to work with quantiles (percentiles of the distribution).
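The scaling properties above can be checked exactly on a small example. The sketch below is only an illustration: the fair die, the constants a = 10 and b = -2, and the helper name var_of are assumptions, not part of the lecture. With a negative b, the SD is multiplied by |b|, since an SD cannot be negative.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical RV for illustration: X is the roll of a fair six-sided die.
outcomes = range(1, 7)
prob = Fraction(1, 6)

def var_of(transform):
    """Exact variance of transform(X), where X is a fair die roll."""
    mean = sum(prob * transform(x) for x in outcomes)
    return sum(prob * (transform(x) - mean) ** 2 for x in outcomes)

a, b = 10, -2  # arbitrary constants; b is negative on purpose
var_x = var_of(lambda x: x)
var_lin = var_of(lambda x: a + b * x)

print("Var(X)      =", var_x)
print("Var(a + bX) =", var_lin, " b^2 Var(X) =", b**2 * var_x)
print("SD(a + bX)  =", sqrt(var_lin), " |b| SD(X) =", abs(b) * sqrt(var_x))
```

The shift a drops out entirely, and only the magnitude of b affects the spread.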
Covariance and Correlation
Definition: Covariance
Let X and Y be two RV’s with means $\mu_x$ and $\mu_y$, respectively. Covariance is the expected value of the product of deviations from the means:
$$\mathrm{Cov}(X, Y) = E\big[(X - \mu_x)(Y - \mu_y)\big]$$
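Definition: Correlation
Correlation is just scale-free covariance:
$$\mathrm{Cor}(X, Y) = \rho_{xy} = \frac{\mathrm{Cov}(X, Y)}{\sigma_x \sigma_y} = \mathrm{Cov}\!\left(\frac{X}{\sigma_x}, \frac{Y}{\sigma_y}\right)$$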
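To see what "scale free" means in practice, here is a minimal sketch that rescales both variables (a change of units, say) and recomputes both quantities. The joint distribution, the scale factors 100 and 3, and the helper name cov_and_cor are all made up for illustration.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint distribution of (X, Y), chosen only for illustration.
joint = {(0, 0): Fraction(4, 10), (0, 1): Fraction(1, 10),
         (1, 0): Fraction(1, 10), (1, 1): Fraction(4, 10)}

def cov_and_cor(scale_x=1, scale_y=1):
    """Covariance (exact) and correlation (float) of (scale_x * X, scale_y * Y)."""
    pts = [(scale_x * x, scale_y * y, p) for (x, y), p in joint.items()]
    mu_x = sum(p * x for x, y, p in pts)
    mu_y = sum(p * y for x, y, p in pts)
    cov = sum(p * (x - mu_x) * (y - mu_y) for x, y, p in pts)
    var_x = sum(p * (x - mu_x) ** 2 for x, y, p in pts)
    var_y = sum(p * (y - mu_y) ** 2 for x, y, p in pts)
    return cov, float(cov) / (sqrt(var_x) * sqrt(var_y))

cov1, cor1 = cov_and_cor()
cov2, cor2 = cov_and_cor(scale_x=100, scale_y=3)

print("original units: Cov =", cov1, " Cor =", cor1)
print("rescaled units: Cov =", cov2, " Cor =", cor2)
```

The covariance is multiplied by the product of the (positive) scale factors, while the correlation stays the same.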
Covariance and Correlation
Extreme Cases: Covariance
If $X = Y$, i.e. the two RV’s are copies of each other, then
$$\mathrm{Cov}(X, Y) = \mathrm{Cov}(X, X) = E\big[(X - \mu_x)(X - \mu_x)\big] = \mathrm{Var}(X)$$
If $X = -Y$, then
$$\mathrm{Cov}(X, Y) = \mathrm{Cov}(X, -X) = -\mathrm{Var}(X)$$
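Extreme Cases: Correlation
If $X = Y$, i.e. the two RV’s are copies of each other, then
$$\mathrm{Cor}(X, Y) = \frac{\mathrm{Cov}(X, X)}{\sigma_x \sigma_x} = \frac{\mathrm{Var}(X)}{\sigma_x^2} = 1$$
If $X = -Y$, then
$$\mathrm{Cor}(X, Y) = \frac{\mathrm{Cov}(X, -X)}{\sigma_x^2} = \frac{-\mathrm{Var}(X)}{\sigma_x^2} = -1$$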
Interpretation
Linear Association
Covariance and Correlation are measures of the degree of linear association.
Independent RV’s have correlation 0, but correlation 0 does not guarantee independence!
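A small exact example of that last warning, offered as a sketch rather than as part of the original slides: take X uniform on {-1, 0, 1} and let Y = X^2 (both choices are assumptions made for this illustration). Y is completely determined by X, yet the covariance is exactly 0.

```python
from fractions import Fraction

# Hypothetical example: X is uniform on {-1, 0, 1} and Y = X**2.
# Y is a function of X, so the two are clearly dependent.
support = [-1, 0, 1]
prob = Fraction(1, 3)

def expect(f):
    """Expected value of f(X) under the uniform distribution on `support`."""
    return sum(prob * f(x) for x in support)

mu_x = expect(lambda x: x)        # E(X)
mu_y = expect(lambda x: x**2)     # E(Y) = E(X^2)
cov_xy = expect(lambda x: (x - mu_x) * (x**2 - mu_y))

# Dependence check: compare P(X = 0, Y = 0) with P(X = 0) * P(Y = 0).
p_joint = prob                    # X = 0 forces Y = 0
p_product = prob * prob           # P(X = 0) = 1/3 and P(Y = 0) = 1/3

print("Cov(X, Y) =", cov_xy)
print("P(X=0, Y=0) =", p_joint, "but P(X=0) P(Y=0) =", p_product)
```

The relationship between X and Y here is a parabola, which has no linear component for covariance to detect.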
Example
Positive Covariance: linear association with positive slope.
$$\mathrm{Cov}(X, Y) > 0$$
Example
Negative Covariance: linear association with negative slope.
$$\mathrm{Cov}(X, Y) < 0$$
Example
Zero Covariance: no linear association!
$$\mathrm{Cov}(X, Y) = 0$$
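The three cases above can be mimicked with simulated data. The sketch below is an illustration under assumed inputs, not a reconstruction of any original plot: the standard normal draws, the slopes of +1 and -1, the seed, and the helper name sample_cov are all choices made here.

```python
import random

def sample_cov(xs, ys):
    """Sample analogue of Cov(X, Y): average product of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

random.seed(141)  # arbitrary seed, used only so the run is reproducible
n = 2000
xs = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]

y_pos = [x + e for x, e in zip(xs, noise)]    # linear trend with positive slope
y_neg = [-x + e for x, e in zip(xs, noise)]   # linear trend with negative slope
y_zero = noise                                # generated independently of xs

cov_pos = sample_cov(xs, y_pos)
cov_neg = sample_cov(xs, y_neg)
cov_zero = sample_cov(xs, y_zero)

print("positive slope:", cov_pos)
print("negative slope:", cov_neg)
print("independent:   ", cov_zero)
```

The sample covariance in the independent case will be close to 0 but not exactly 0, because it is computed from a finite sample.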
The variance of a sum
The expected value of a sum is the sum of the expected values: $E(X + Y) = E(X) + E(Y)$. What about variances?
$$\mathrm{Var}(X + Y) = E\big[\big((X + Y) - (\mu_x + \mu_y)\big)^2\big]$$
Collect the terms involving X and the terms involving Y:
$$E\big[\big((X - \mu_x) + (Y - \mu_y)\big)^2\big]$$
Recalling that $(a + b)^2 = a^2 + 2ab + b^2$, we have
$$E\big[(X - \mu_x)^2 + (Y - \mu_y)^2 + 2(X - \mu_x)(Y - \mu_y)\big]$$
Finally, the expected value of a sum is the sum of the expected values:
$$\mathrm{Var}(X + Y) = E\big[(X - \mu_x)^2\big] + E\big[(Y - \mu_y)^2\big] + 2E\big[(X - \mu_x)(Y - \mu_y)\big]$$
The variance of a sum, part two
We have produced an expression for the variance of a sum:
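$$\mathrm{Var}(X + Y) = E\big[(X - \mu_x)^2\big] + E\big[(Y - \mu_y)^2\big] + 2E\big[(X - \mu_x)(Y - \mu_y)\big]$$
The first two terms are $\mathrm{Var}(X)$ and $\mathrm{Var}(Y)$, and the third is $2\,\mathrm{Cov}(X, Y)$. Thus
$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$$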
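The identity can be verified exactly on any small joint distribution. Here is a minimal sketch, assuming a made-up joint distribution for a dependent pair (X, Y); the probabilities and the helper name expect are illustrative only.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y), chosen so that X and Y are dependent.
joint = {
    (0, 0): Fraction(3, 10),
    (0, 1): Fraction(2, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(4, 10),
}

def expect(f):
    """E[f(X, Y)] under the joint distribution above."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov_xy = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Variance of the sum, computed directly from its definition.
mu_sum = mu_x + mu_y
var_sum = expect(lambda x, y: (x + y - mu_sum) ** 2)

print("Var(X + Y)                    =", var_sum)
print("Var(X) + Var(Y) + 2 Cov(X, Y) =", var_x + var_y + 2 * cov_xy)
```

Using Fraction keeps the arithmetic exact, so the two sides agree without rounding error. Dropping the covariance term would understate the variance here, because the covariance is positive.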
The variance of a sum, part three
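$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$$
If $\mathrm{Cov}(X, Y) > 0$, then
$$\mathrm{Var}(X + Y) > \mathrm{Var}(X) + \mathrm{Var}(Y)$$
If $\mathrm{Cov}(X, Y) < 0$, then
$$\mathrm{Var}(X + Y) < \mathrm{Var}(X) + \mathrm{Var}(Y)$$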
The variance of a sum: Independence
Fact: If two RV’s are independent, we can’t predict one using the other, so there is no linear association, and their covariance is 0.
Theorem: If X and Y are independent, then
$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$$
In other words, if the random variables are independent, then the variance of their sum is the sum of their variances!
Remember Pythagoras? For independent RV’s, the SD of a sum works like Euclidean distance for right triangles! If $Z = X + Y$, then
$$\sigma_Z = \sqrt{\sigma_x^2 + \sigma_y^2}$$
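A 3-4-5 triangle makes the point concrete. In this sketch, X and Y are hypothetical independent RV's taking the values ±3 and ±4 with equal probability, so their SDs are 3 and 4. The SD of the sum is 5, not 7.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Hypothetical independent RVs: X is +3 or -3, Y is +4 or -4, each with probability 1/2.
x_values = [-3, 3]
y_values = [-4, 4]
half = Fraction(1, 2)
quarter = half * half  # independence: P(X = x, Y = y) = P(X = x) * P(Y = y)

def variance(values_with_probs):
    """Exact variance of a discrete RV given as a list of (value, probability) pairs."""
    mean = sum(pr * v for v, pr in values_with_probs)
    return sum(pr * (v - mean) ** 2 for v, pr in values_with_probs)

var_x = variance([(x, half) for x in x_values])
var_y = variance([(y, half) for y in y_values])
var_z = variance([(x + y, quarter) for x, y in product(x_values, y_values)])

sd_x, sd_y, sd_z = sqrt(var_x), sqrt(var_y), sqrt(var_z)
print(sd_x, sd_y, sd_z)  # 3.0 4.0 5.0
```

Variances add (9 + 16 = 25); standard deviations do not (3 + 4 is not 5).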
Pythagoras
[Figure: right triangle with legs $\sigma_X$ and $\sigma_Y$ and hypotenuse $\sqrt{\sigma_X^2 + \sigma_Y^2}$]
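For the mathematically inclined, covariance is an inner product on the infinite-dimensional vector space of random variables with finite variance (strictly, once random variables that differ by a constant are identified, since constants have variance 0).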
Example: Variance of a Binomial RV
Let X be a Binomial(n, p) RV. We know $E(X) = np$.
We also know that X is the sum of n independent Bernoulli(p) trials $Y_i$, each with $E(Y_i) = p$ and $\mathrm{Var}(Y_i) = pq$, where $q = 1 - p$, so
$$\mathrm{Var}(X) = \mathrm{Var}(Y_1 + Y_2 + \dots + Y_n)$$
The Y’s are independent, so their variances add:
$$\mathrm{Var}(X) = \mathrm{Var}(Y_1) + \mathrm{Var}(Y_2) + \dots + \mathrm{Var}(Y_n) = npq$$
Hence we have shown
$$\mathrm{Var}(X) = \sum_{k=0}^{n} (k - np)^2 \binom{n}{k} p^k q^{n-k} = npq$$
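The final identity can be spot-checked by evaluating the sum directly. This is a sketch for one arbitrary choice of parameters (n = 10, p = 3/10); the function name binomial_variance_by_definition is hypothetical.

```python
from fractions import Fraction
from math import comb

def binomial_variance_by_definition(n, p):
    """Sum (k - np)^2 * C(n, k) * p^k * q^(n-k) over k = 0..n, in exact arithmetic."""
    q = 1 - p
    mean = n * p
    return sum(
        (k - mean) ** 2 * comb(n, k) * p**k * q ** (n - k)
        for k in range(n + 1)
    )

n, p = 10, Fraction(3, 10)  # arbitrary illustrative values
direct = binomial_variance_by_definition(n, p)
shortcut = n * p * (1 - p)  # npq

print(direct, shortcut)  # 21/10 21/10
```

A check at one pair (n, p) is not a proof, of course; the independence argument above is what establishes the result in general.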
Example: 100 coin tosses
Toss a fair coin independently 100 times, and let X be the number of Heads we get.
What is the appropriate probability model? $X \sim \mathrm{Binomial}(100, 1/2)$
What is the expected number of heads?
What is $\sigma_x^2$, the variance of X?
What is $\sigma_x$?
What are typical outcomes of this experiment?
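One way to work through these questions numerically is sketched below, using the formulas $E(X) = np$ and $\mathrm{Var}(X) = npq$ from the Binomial example. Reading "typical" as "within one SD of the mean" is an interpretation made here, not something the slide specifies.

```python
from math import comb, sqrt

n, p = 100, 0.5
mean = n * p            # expected number of heads, np
var = n * p * (1 - p)   # variance, npq
sd = sqrt(var)

# Exact probability that X lands within one SD of its mean, from the Binomial pmf.
# For a fair coin every sequence has probability 1 / 2**n, so we count sequences.
low, high = round(mean - sd), round(mean + sd)
prob_within_1sd = sum(comb(n, k) for k in range(low, high + 1)) / 2**n

print(mean, var, sd)  # 50.0 25.0 5.0
print(f"P({low} <= X <= {high}) = {prob_within_1sd:.4f}")
```

So outcomes between roughly 45 and 55 heads are unsurprising, while a count far outside that range would be unusual.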
Example: 1000 coin tosses
Toss a fair coin independently 1000 times, and let X be the number of Heads we get.
$X \sim \mathrm{Binomial}(1000, 1/2)$
What is the expected number of heads?
What is $\sigma_x^2$, the variance of X?
What is $\sigma_x$?
What are typical outcomes of this experiment?
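Comparing 100 and 1000 tosses side by side shows how the spread scales with n. The sketch below assumes the formulas np and npq; the helper name binomial_summary is hypothetical. The last printed column, SD/n, is the SD of the proportion of heads, which follows from the scaling rule for SD.

```python
from math import sqrt

def binomial_summary(n, p):
    """Return (mean, variance, SD) of a Binomial(n, p) count, using np and npq."""
    mean = n * p
    var = n * p * (1 - p)
    return mean, var, sqrt(var)

for n in (100, 1000):
    mean, var, sd = binomial_summary(n, 0.5)
    # sd / n is the SD of the proportion of heads, X / n
    print(n, mean, var, round(sd, 2), round(sd / n, 4))
```

Ten times as many tosses gives ten times the variance, but the SD grows only by a factor of $\sqrt{10}$, so the proportion of heads becomes less variable as n grows.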
Variance of a Poisson
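Suppose $X \sim \mathrm{Poisson}(\mu)$. What are $E(X)$ and $\mathrm{Var}(X)$? One can compute them directly from the definition, by summing the infinite series
$$E(X) = \sum_{k=0}^{\infty} k \cdot \frac{\mu^k e^{-\mu}}{k!} = \mu$$
and then
$$\mathrm{Var}(X) = \sum_{k=0}^{\infty} (k - \mu)^2 \cdot \frac{\mu^k e^{-\mu}}{k!}$$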
Summing those series makes use of ideas from Math 112.
Can we guess the answer without using Math 112?
Variance of a Poisson, Heuristically
Let X be a Binomial(n, p) RV, where n is large and p is small. We know $E(X) = np$ and $\mathrm{Var}(X) = np(1 - p)$. If p is small, $(1 - p)$ is close to 1, and so $\mathrm{Var}(X) \approx np = E(X)$.
Since that Binomial X is approximately a Poisson with parameter $\mu = np$, we should guess for $X \sim \mathrm{Poisson}(\mu)$:
$$E(X) = \mu$$
and
$$\mathrm{Var}(X) = \mu$$
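The guess can be checked numerically without summing the series by hand. The sketch below truncates the Poisson series after 200 terms and also shows the Binomial variance np(1 - p) approaching µ as n grows with np held fixed. The value µ = 3.5, the truncation point, and the function name poisson_mean_var are arbitrary choices for illustration.

```python
from math import exp

def poisson_mean_var(mu, terms=200):
    """Approximate E(X) and Var(X) for X ~ Poisson(mu) by truncating the series."""
    pmf = exp(-mu)  # P(X = 0)
    mean = 0.0
    var = 0.0
    for k in range(terms):
        mean += k * pmf
        var += (k - mu) ** 2 * pmf
        pmf *= mu / (k + 1)  # P(X = k + 1) obtained from P(X = k)
    return mean, var

mu = 3.5  # arbitrary illustrative value
mean, var = poisson_mean_var(mu)
print("truncated series: E(X) ~", mean, " Var(X) ~", var)

# Binomial(n, mu / n): the variance np(1 - p) approaches mu as n grows.
for n in (10, 100, 10_000):
    p = mu / n
    print(n, n * p * (1 - p))
```

Both truncated sums come out very close to µ, which agrees with the heuristic guess.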
Summary
Variance of a sum of two RV’s: $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$
Variance of a sum of independent RV’s: $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$
Pythagoras and the Standard Deviation of a sum of independent RV’s: $\mathrm{SD}(X + Y) = \sqrt{\sigma_x^2 + \sigma_y^2}$
Variance of $X \sim \mathrm{Binomial}(n, p)$: $\mathrm{Var}(X) = npq = np(1 - p)$