MATH2715: Statistical Methods
Exercises V (based on lectures 9-10, work week 6, hand in lecture Mon 6 Nov)
ALL questions count towards the continuous assessment for this module.
Q1. If X ∼ gamma(α, λ), write down the moment generating function of X. Using moment
generating functions, deduce the distribution of Y = bX where b is a constant.
Q2. If X1,X2, . . . ,Xn are independent random variables satisfying Xi ∼ gamma(αi, λ), use
moment generating functions to obtain the distribution of Sn = ∑_{i=1}^{n} Xi.
Use moment generating functions to obtain the distribution of X̄n where X̄n = Sn/n.
Q3. Suppose U is a non-negative random variable with probability density function fU (u) and
b is a positive constant. Prove the Markov inequality,
E[U] ≥ b pr{U ≥ b}.
Hint: Write E[U ] and pr{U ≥ b} as integrals in terms of u and fU(u). Look at the proof of
Chebyshev’s inequality. Where would you split the region of integration?
Q4. Suppose X has a uniform distribution over the interval (0, 1) with mean µ and variance σ2.
Calculate pr{|X − µ| ≥ kσ} for a number of different values of k. Compare the exact numerical
results with the bound given by Chebyshev’s inequality.
Hint: Recall that pr{|X − µ| ≥ c} = pr{X ≤ (µ − c)} + pr{X ≥ (µ + c)}. You could use R to
evaluate the probabilities. For example
k=c(0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0) # Assign some k values.
chebyshev=1/k^2 # Chebyshev’s bound.
prob=punif(k,0,1) # If X is uniform(0,1), prob = pr(X < k).
# prob gives the cdf F(x) evaluated at x=k.
# prob does NOT give pr(|X-mu| > k*sigma).
# What SHOULD you type here?
Q5. For the discrete random variable X satisfying
pr{X = 0} = 1 − 1/k²,  pr{X = −k} = 1/(2k²),  pr{X = +k} = 1/(2k²),  k ≥ 1,
show that Chebyshev’s inequality becomes an equality for this particular value of k.
Hint: First obtain the mean µ and the variance σ2 of X.
Background Notes: Lecture 10. Weak law of large numbers
Example
Let
Zn = (X̄n − µ)/(σ/√n).
Figure 26 shows the cumulative distribution function of Z1 and Z4 together with the standard
normal cumulative distribution function in the two cases Xi ind∼ exponential(λ = 1) and
Xi ind∼ uniform(0, 1). In both cases the cumulative distribution function of Zn tends to the
standard normal cumulative distribution function as n → ∞.
[Two panels: the cdfs of Z1 and Z4 (labelled n=1, n=4) plotted against z ∈ (−2, 2) together with the N(0,1) cdf.]
Figure 26: cumulative distribution function of Z1 and Z4 for (left) Xi ind∼ exponential(λ = 1),
(right) Xi ind∼ uniform(0, 1).
Chebyshev’s Inequality
For a random variable X with mean µ and finite variance σ² < ∞, Chebyshev's⁵⁸ inequality states
pr{|X − µ| ≥ kσ} ≤ 1/k².
Further reading for lectures 9 and 10
Rice, J.A. (1995) Mathematical Statistics and Data Analysis (2nd edition), sections 4.2, 4.5, 5.2.
Hogg, R.V., McKean, J.W. and Craig, A.T. (2005) Introduction to Mathematical Statistics (6th edition), sections 1.9, 1.10, 4.2.
⁵⁸Pafnuti Chebyshev (1821–1894) was a Russian mathematician. Many variants of the spelling of his name exist,
Pafnut-i/y (T/Ts)-cheb-i/y-sh/ch/sch-e-v/ff/w.
Uses of Chebyshev's (1867) inequality include constructing confidence intervals, so that
pr{|X − µ| < 4.472σX} > 0.95.
If X is known to be a normal random variable, we have the tighter result
pr{|X − µ| < 1.96σX} = 0.95.
The general applicability of Chebyshev's inequality has led to its use in many fields, including digital communication, gene matching, and detecting computer security attacks.
Larsen, R.J. and Marx, M.L. (2010) An Introduction to Mathematical Statistics and its Applications (5th edition), sections 3.12, 5.7.
Miller, I. and Miller, M. (2004) John E. Freund’s Mathematical Statistics with Applications, sec-
tions 4.4, 4.5, 5.4, 8.2.
MATH2715: Statistical Methods – Worked Examples V
Worked Example: If X ∼ N(µ, σ2), use the moment generating function to obtain the distri-
bution of Y = θX where θ is a constant.
Answer: If X ∼ N(µ, σ²), then mX(t) = e^{µt + σ²t²/2}.
The mgf of Y is mY(t) = E[e^{tY}] = E[e^{t(θX)}] = E[e^{(θt)X}] = mX(θt).
Thus mY(t) = e^{µ(θt) + σ²(θt)²/2} = e^{(µθ)t + (σ²θ²)t²/2} = e^{(µθ)t + (σθ)²t²/2}.
By inspection of the mgf it can be seen that Y ∼ N(µθ, σ2θ2).
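As a quick numerical sanity check (not part of the original handout), we can simulate X and verify that Y = θX has the mean and variance read off from the mgf; the parameter values µ = 2, σ = 3, θ = 5 are arbitrary choices.

```r
set.seed(1)                          # For reproducibility.
mu <- 2; sigma <- 3; theta <- 5      # Arbitrary illustrative values.
x <- rnorm(1e6, mean = mu, sd = sigma)
y <- theta * x                       # Y = theta*X.
c(mean(y), mu * theta)               # Empirical mean vs mu*theta = 10.
c(var(y), sigma^2 * theta^2)         # Empirical variance vs sigma^2*theta^2 = 225.
```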
Worked Example: If X has an exponential(λ) distribution, use moment generating functions
to obtain the distribution of U = aX where a (a > 0) is a known constant.
Answer: X ∼ exponential(λ) has mgf mX(t) = λ/(λ − t) for t < λ. Thus
mU(t) = E[e^{tU}] = E[e^{t(aX)}] = E[e^{(at)X}] = mX(at) = λ/(λ − at) = (λ/a)/((λ/a) − t).
By inspection of the mgf it can be seen that U ∼ exponential(λ/a).
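A simulation check (ours, not the handout's) compares the empirical cdf of U = aX against the exponential(λ/a) cdf at a single point; λ = 2 and a = 3 are arbitrary choices.

```r
set.seed(2)
lambda <- 2; a <- 3                          # Arbitrary illustrative values.
u <- a * rexp(1e6, rate = lambda)            # U = a*X with X ~ exponential(lambda).
c(mean(u <= 1), pexp(1, rate = lambda / a))  # Empirical vs exponential(lambda/a) cdf at u = 1.
```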
Worked Example: If X and Y are independent exponential(λ) random variables, use moment
generating functions to obtain the distribution of U = X + Y .
Answer: X has mgf mX(t) = λ/(λ − t) for t < λ. Similarly mY(t) = λ/(λ − t) for t < λ.
Hence mU(t) = E[e^{tU}] = E[e^{t(X+Y)}] = E[e^{tX}e^{tY}] = E[e^{tX}] E[e^{tY}] by independence of X and Y.
Thus
mU(t) = (λ/(λ − t)) × (λ/(λ − t)) = (λ/(λ − t))².
By inspection of the mgf we deduce that U ∼ gamma(2, λ).
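A quick simulation (our addition) supports this: the sum of two independent exponential(λ) variables should have the gamma(2, λ) mean 2/λ and variance 2/λ²; λ = 1.5 is an arbitrary choice.

```r
set.seed(3)
lambda <- 1.5                                        # Arbitrary illustrative value.
u <- rexp(1e6, rate = lambda) + rexp(1e6, rate = lambda)
c(mean(u), 2 / lambda)        # gamma(2, lambda) has mean 2/lambda.
c(var(u),  2 / lambda^2)      # and variance 2/lambda^2.
```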
Worked Example: If X1, X2 and X3 are independent gamma(α, λ) random variables, use
moment generating functions to obtain the distribution of U = X1 + X2 + X3.
Answer: The Xi have mgf mXi(t) = (λ/(λ − t))^α. Hence
mU(t) = E[e^{tU}] = E[e^{t(X1+X2+X3)}] = E[e^{tX1+tX2+tX3}] = E[e^{tX1}e^{tX2}e^{tX3}] = E[e^{tX1}] E[e^{tX2}] E[e^{tX3}]
by independence of the Xi. Thus
mU(t) = (λ/(λ − t))^α × (λ/(λ − t))^α × (λ/(λ − t))^α = (λ/(λ − t))^{3α}.
By inspection of the mgf, U ∼ gamma(3α, λ).
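The same conclusion can be checked by simulation (our addition, with arbitrary non-integer α to emphasise the result is not restricted to integer shape): the empirical cdf of X1 + X2 + X3 should match the gamma(3α, λ) cdf.

```r
set.seed(4)
alpha <- 0.7; lambda <- 2               # Arbitrary illustrative values.
u <- rgamma(1e6, shape = alpha, rate = lambda) +
     rgamma(1e6, shape = alpha, rate = lambda) +
     rgamma(1e6, shape = alpha, rate = lambda)        # U = X1 + X2 + X3.
c(mean(u <= 1), pgamma(1, shape = 3 * alpha, rate = lambda))  # Empirical vs gamma(3*alpha, lambda) cdf.
```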
Worked Example: Suppose X ∼ N(µ, σ2). Calculate pr{|X − µ| ≥ kσ} for a number of different
values of k and compare the exact results with the bound given by Chebyshev’s inequality.
Answer:
pr{|X − µ| ≥ kσ} = pr{X − µ ≥ kσ}+ pr{X − µ ≤ −kσ} = pr{X ≥ µ + kσ}+ pr{X ≤ µ − kσ} .
But for the normal distribution we evaluate cumulative probabilities by standardising. Thus
pr{|X − µ| ≥ kσ} = pr{(X − µ)/σ ≥ k} + pr{(X − µ)/σ ≤ −k} = pr{Z ≥ k} + pr{Z ≤ −k}
where Z = (X − µ)/σ ∼ N(0, 1) with Φ(z) = pr{Z ≤ z}. Thus
pr{|X − µ| ≥ kσ} = (1 − pr{Z ≤ k}) + pr{Z ≤ −k} = (1 − Φ(k)) + Φ(−k)
= (1 − Φ(k)) + (1 − Φ(k)) = 2(1 − Φ(k)).
The probabilities can be generated for different k using the R code below. The probabilities and
Chebyshev bound are shown in table 1.
k=c(0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0) # Assign some k values.
prob=2*(1-pnorm(k)) # Exact probabilities.
chebyshev=1/k^2 # Chebyshev’s bound.
Figure 27(left) shows pr{|X − µ| ≥ 1.5σ} for X ∼ N(µ = 10, σ2 = 4). Figure 27(right) gives the
exact probability and Chebyshev’s bound for a range of values k.
[Left panel: the N(10, 4) density with the two tail areas |X − µ| ≥ 1.5σ shaded; right panel: Chebyshev's bound and the exact probability pr(|X − µ| ≥ kσ) plotted against k ∈ (0, 4).]
Figure 27: (left) pr{|X − µ| ≥ 1.5σ} for X ∼ N(µ = 10, σ² = 4); (right) Chebyshev's bound and
the exact probability pr{|X − µ| ≥ kσ} for X ∼ N(µ, σ²).
Worked Example: Suppose X ∼ exponential(λ) with mean µ and variance σ2. Obtain
pr{|X − µ| ≥ kσ}. Calculate pr{|X − µ| ≥ kσ} for a number of different values of k and com-
pare the exact result with the bound given by Chebyshev’s inequality.
k     pr{|X − µ| ≥ kσ}   Chebyshev bound
0.5   0.61708            1.0000 (⋆)
1.0   0.31731            1.0000
1.5   0.13361            0.4444
2.0   0.04550            0.2500
2.5   0.01242            0.1600
3.0   0.00270            0.1111
3.5   0.00047            0.0816
4.0   0.00006            0.0625
Table 1: Chebyshev's inequality and the exact probability for X ∼ N(µ, σ²). (⋆) For k ≤ 1 the
bound is replaced by unity.
Answer: fX(x) = λe^{−λx} for x > 0 and FX(x) = pr{X ≤ x} = 1 − e^{−λx} for x > 0. Also
µ = E[X] = 1/λ and σ² = Var[X] = 1/λ². Hence
pr{|X − µ| ≥ kσ} = pr{X ≤ µ − kσ} + pr{X ≥ µ + kσ} = pr{X ≤ (1 − k)/λ} + pr{X ≥ (1 + k)/λ}
= FX((1 − k)/λ) + (1 − FX((1 + k)/λ))
= { 1 − e^{−(1−k)} + e^{−(1+k)}  if k < 1,
  { e^{−(1+k)}                   if k ≥ 1.
This can be evaluated for different values of k.
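The piecewise expression can be wrapped in a small helper (our own, not from the handout); note that the final answer does not depend on λ, and the two branches agree at k = 1.

```r
# exact_prob is a hypothetical helper implementing the piecewise formula above.
exact_prob <- function(k) {
  ifelse(k < 1,
         1 - exp(-(1 - k)) + exp(-(1 + k)),   # Case k < 1.
         exp(-(1 + k)))                        # Case k >= 1.
}
exact_prob(c(0.5, 1.0, 2.0))
```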
Figure 28 shows pr{|X − µ| ≥ 0.6σ} and pr{|X − µ| ≥ 1.5σ} for X ∼ exponential(λ = 2).⁵⁹
[Two panels: the exponential(2) density on x ∈ (0, 3) with the relevant tail areas shaded.]
Figure 28: for X ∼ exponential(λ = 2), (left) pr{|X − µ| ≥ 0.6σ} and (right) pr{|X − µ| ≥ 1.5σ}.
Suitable R commands for plotting the exact probability pr{|X − µ| ≥ kσ} and the Chebyshev
bound 1/k² are given below for the case X ∼ exponential(λ = 1). The result is plotted in figure 29.⁶⁰
k=c(1:30)*0.1 # Set k=0.1, 0.2, ..., 3.0.
⁵⁹Thus here µ = 1/2, σ = 1/2, and pr{|X − µ| ≥ 0.6σ} = pr{X ≥ 0.8} + pr{X ≤ 0.2} while pr{|X − µ| ≥ 1.5σ} =
pr{X ≥ 1.25} + pr{X ≤ −0.25}, where here pr{X ≤ −0.25} = 0.
⁶⁰Chebyshev's inequality says that pr{|X − µ| ≥ kσ} ≤ 1/k² for k > 1. If k < 1 the right-hand side of Chebyshev's
inequality should be set equal to one!
prob=pexp(1-k)+(1-pexp(k+1)) # Exact probability.
cheb=1/k^2 # Chebyshev upper bound.
cheb[cheb>1]=1 # Make all values of cheb>1 equal to one.
plot(k,prob,type="l",ylim=c(0,1))
points(k,cheb,type="l",lty=2)
[Plot: Chebyshev's bound (dashed) and the exact probability pr(|X − µ| ≥ kσ) (solid) against k ∈ (0, 4).]
Figure 29: Chebyshev's bound and the exact probability pr{|X − µ| ≥ kσ} in the case where
X ∼ exponential(λ).
Worked Example: In the Markov⁶¹ inequality, a non-negative random variable U satisfies
E[U] ≥ b pr{U ≥ b}. Thus if a random variable X has mean µ and variance σ² and U = g(X) is a
non-negative function of X, then it follows that E[g(X)] ≥ b pr{g(X) ≥ b}.
By considering the special case g(X) = (X − µ)², derive Chebyshev's inequality
pr{|X − µ| ≥ kσ} ≤ 1/k².
Answer: The Markov inequality62 can be written as pr{g(X) ≥ b} ≤ E[g(X)]/b.
Now write g(X) = (X − µ)². Thus pr{(X − µ)² ≥ b} ≤ E[(X − µ)²]/b = Var[X]/b = σ²/b.
Since (X − µ)² = |X − µ|², then pr{(X − µ)² ≥ b} = pr{|X − µ| ≥ √b}.
Writing √b = kσ, so b = k²σ², the bound pr{(X − µ)² ≥ b} ≤ σ²/b gives pr{|X − µ| ≥ kσ} ≤ 1/k² as required.
⁶¹Andrei Andreyevich Markov (1856–1922) was a Russian mathematician and a student of Chebyshev.
⁶²The Markov inequality is still useful for leading to new results. For example, consider Herman Chernoff's (1952)
inequality pr{X ≥ c} ≤ e^{−tc} mX(t), where X has moment generating function mX(t). This result follows because,
for t > 0, pr{X ≥ c} = pr{e^{tX} ≥ e^{tc}}, and putting g(X) = e^{tX} and b = e^{tc} in the Markov inequality gives
pr{e^{tX} ≥ e^{tc}} ≤ E[e^{tX}]/e^{tc}.
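As an illustration of the Chernoff bound in footnote 62 (our addition): for X ∼ N(0, 1) the mgf is mX(t) = e^{t²/2}, so e^{−tc} mX(t) is minimised at t = c, giving pr{X ≥ c} ≤ e^{−c²/2}. A quick numerical comparison:

```r
cc <- 2                          # Threshold c (arbitrary choice).
exact <- 1 - pnorm(cc)           # pr{X >= c} for X ~ N(0,1).
chernoff <- exp(-cc^2 / 2)       # Optimised Chernoff bound e^{-c^2/2}.
c(exact, chernoff)               # The bound holds: 0.0228 <= 0.1353.
```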
Worked Example: Does there exist a random variable X for which Chebyshev’s inequality
becomes an equality for all k ≥ 1?
Answer: There is no distribution which attains the Chebyshev bound for all k ≥ 1. We use a
proof by contradiction and consider the continuous case.
Suppose there exists a random variable X with finite variance σ² satisfying
pr{|X − µ| ≥ kσ} = 1/k²  ∀ k ≥ 1. (†)
Let Z = (X − µ)/σ, with E[Z] = 0 and Var[Z] = 1 (⋆), so that pr{|Z| ≥ k} = 1/k² ∀ k ≥ 1.
Hence pr{|Z| ≥ z} = pr{Z ≤ −z} + pr{Z ≥ z} = 1/z² for z ≥ 1.
If Z has cumulative distribution function FZ(z) = pr{Z ≤ z} and probability density function
fZ(z), then FZ(−z) + {1 − FZ(z)} = 1/z².
Differentiating this with respect to z gives −fZ(−z) − fZ(z) = −2/z³. Clearly fZ(−z) + fZ(z) = 2/z³
for z ≥ 1.
Since the area under the probability density function integrates to unity it follows that
1 = ∫_{−∞}^{∞} fZ(z) dz = ∫_{−∞}^{−1} fZ(z) dz + ∫_{−1}^{1} fZ(z) dz + ∫_{1}^{∞} fZ(z) dz
so that
1 = ∫_{−1}^{1} fZ(z) dz + ∫_{1}^{∞} {fZ(−z) + fZ(z)} dz = ∫_{−1}^{1} fZ(z) dz + ∫_{1}^{∞} (2/z³) dz
and then
∫_{−1}^{1} fZ(z) dz = 1 − [−1/z²]_{1}^{∞} = 0.
As probability density functions are non-negative this can only hold if fZ(z) = 0 for −1 < z < 1.
Since by assumption E[Z] = 0, the variance of Z satisfies Var[Z] = E[Z2] so that
Var[Z] = ∫_{−∞}^{∞} z² fZ(z) dz = ∫_{1}^{∞} z² {fZ(−z) + fZ(z)} dz = ∫_{1}^{∞} (2/z) dz = ∞.
Thus Var[Z] ≠ 1 as assumed at (⋆). Clearly X cannot have finite variance σ². There is thus no
random variable X which satisfies (†).
Worked Example: Let X1,X2,X3, . . . be a sequence of random variables with a common mean
µ, common variance σ², and with covariances, for i ≠ j, given by
cov(Xi, Xj) = { ρσ²  if j = i ± 1,
              { 0    otherwise.
Consider now the sequence of means {X̄n}, n = 1, 2, 3, . . ., where X̄n = (1/n) ∑_{i=1}^{n} Xi. Show that
Var[X̄n] = σ²/n + 2(n − 1)ρσ²/n².
Deduce that Var[X̄n] ≤ ((3n − 2)/n²)σ² and so use Chebyshev's inequality to show that a weak law
of large numbers applies to X̄n.
Answer: Let Sn = ∑_{i=1}^{n} Xi. Then Var[Sn] = Var[∑_{i=1}^{n} Xi] = ∑_{i=1}^{n} ∑_{j=1}^{n} cov(Xi, Xj).
The double sum has n² terms as shown in the table.
        j = 1               j = 2               j = 3               · · ·   j = n
i = 1   cov(X1,X1) = σ²     cov(X1,X2) = ρσ²    cov(X1,X3) = 0      · · ·   cov(X1,Xn) = 0
i = 2   cov(X2,X1) = ρσ²    cov(X2,X2) = σ²     cov(X2,X3) = ρσ²    · · ·   cov(X2,Xn) = 0
...     ...                 ...                 ...                 . . .   ...
i = n   cov(Xn,X1) = 0      cov(Xn,X2) = 0      cov(Xn,X3) = 0      · · ·   cov(Xn,Xn) = σ²
Hence
Var[Sn] = ∑_{i=1}^{n} cov(Xi, Xi) + ∑∑_{i≠j} cov(Xi, Xj) = ∑_{i=1}^{n} Var[Xi] + 2 ∑∑_{i<j} cov(Xi, Xj)
= nσ² + 2 {cov(X1,X2) + cov(X2,X3) + · · · + cov(Xn−1,Xn)} = nσ² + 2(n − 1)ρσ².
This gives
Var[X̄n] = Var[Sn/n] = (1/n²) Var[Sn] = σ²/n + 2(n − 1)ρσ²/n².
Since −1 ≤ ρ ≤ 1 and cov(Xi, Xj) = ρσ², then −σ² ≤ cov(Xi, Xj) ≤ σ². Hence
Var[X̄n] ≤ σ²/n + 2(n − 1)σ²/n² = ((3n − 2)/n²)σ².
In Chebyshev's inequality, pr{|Y − µY| ≥ kσY} ≤ 1/k². Put Y = X̄n with µY = µ and put
ε = kσY so that
pr{|X̄n − µ| ≥ ε} ≤ Var[X̄n]/ε² ≤ (3n − 2)σ²/(n²ε²).
For given η, choose n0 such that (3n0 − 2)σ²/(n0²ε²) < η. Then, for all n > n0, (3n − 2)σ²/(n²ε²) < η
and so, as required, pr{|X̄n − µ| ≥ ε} < η if n > n0. Thus X̄n →p µ as n → ∞.
MATH2715: Solutions to exercises II
Q1.
E[(X − µ)4] = E[X4 − 4µX3 + 6µ2X2 − 4µ3X + µ4]
= E[X4] − 4µE[X3] + 6µ2E[X2] − 4µ3E[X] + µ4
as µ is a constant. Thus
E[(X − µ)⁴] = µ′4 − 4µµ′3 + 6µ²µ′2 − 4µ⁴ + µ⁴ = µ′4 − 4µµ′3 + 6µ²µ′2 − 3µ⁴,
where µ′r = E[X^r] and µ′1 = E[X] = µ.
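A quick check of the formula (ours, not the handout's): for X ∼ exponential(1) the raw moments are µ′r = r!, and the formula recovers the known fourth central moment 9.

```r
mu_r <- factorial(1:4)   # Raw moments of exponential(1): E[X^r] = r!.
mu <- mu_r[1]            # mu = E[X] = 1.
mu_r[4] - 4 * mu * mu_r[3] + 6 * mu^2 * mu_r[2] - 3 * mu^4   # Fourth central moment: 9.
```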
Q2.
E[X³] = ∫ x³ fX(x) dx = ∫_{0}^{∞} x³ · (λ^α x^{α−1} e^{−λx}/Γ(α)) dx
= ∫_{0}^{∞} (λ^α x^{α+2} e^{−λx}/Γ(α)) dx = (λ^α/Γ(α)) ∫_{0}^{∞} x^{α+2} e^{−λx} dx.
Since fX(x) is a probability density function it integrates to one, so that
∫ fX(x) dx = 1 ⇒ ∫_{0}^{∞} (λ^α x^{α−1} e^{−λx}/Γ(α)) dx = 1 ⇒ ∫_{0}^{∞} x^{α−1} e^{−λx} dx = Γ(α)/λ^α.
This must hold for all values of α so that
∫_{0}^{∞} x^{A−1} e^{−λx} dx = Γ(A)/λ^A.
Putting A = α + 3 we have
∫_{0}^{∞} x^{α+2} e^{−λx} dx = Γ(α + 3)/λ^{α+3}.
Thus
E[X³] = (λ^α/Γ(α)) ∫_{0}^{∞} x^{α+2} e^{−λx} dx = (λ^α/Γ(α)) · (Γ(α + 3)/λ^{α+3}) = α(α + 1)(α + 2)/λ³,
as required, since
Γ(α + 3) = (α + 2)Γ(α + 2) = (α + 2)(α + 1)Γ(α + 1) = (α + 2)(α + 1)αΓ(α).
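The closed form can be verified against numerical integration in R (our addition; the parameter values are arbitrary choices).

```r
alpha <- 2.5; lambda <- 1.3   # Arbitrary illustrative values.
closed_form <- alpha * (alpha + 1) * (alpha + 2) / lambda^3
by_integration <- integrate(function(x) x^3 * dgamma(x, shape = alpha, rate = lambda),
                            0, Inf)$value    # Numerical E[X^3].
c(closed_form, by_integration)               # The two agree.
```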
Q3. Since X ∼ Bin(n, θ), E[X] = nθ and Var[X] = nθ(1−θ). For method of moments estimators
equate⁶³ x̄ = E[X] and m2 = Var[X]. Thus x̄ = nθ and m2 = nθ(1 − θ) gives
1 − θ̂ = m2/x̄ ⇒ θ̂ = 1 − m2/x̄
⁶³You could alternatively equate s² = Var[X], where s² = (1/(m − 1)) ∑_{i=1}^{m} (xi − x̄)² is the sample variance. This would
give slightly different estimates for θ and n, namely θ̃ = 1 − s²/x̄ and ñ = x̄²/(x̄ − s²).
and so estimate n by n̂ = x̄/θ̂ = x̄²/(x̄ − m2).
Problem: Do we expect our estimate for n to be an integer? A bigger problem: Is n̂ always
non-negative?⁶⁴
Q4.
E[X] = ∫ x fX(x) dx = ∫_{1}^{∞} x · (θ/x^{θ+1}) dx = ∫_{1}^{∞} θ x^{−θ} dx = [θ x^{−θ+1}/(−θ + 1)]_{1}^{∞}
= −θ/(−θ + 1) = θ/(θ − 1)
since lim_{x→∞} x^{−θ+1} = 0 because θ > 1.⁶⁵
For method of moments estimates equate x̄ = E[X] so that x̄ = θ̂/(θ̂ − 1) ⇒ θ̂ = x̄/(x̄ − 1).
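The mean formula underlying this estimate can be verified numerically (our addition; θ = 3 is an arbitrary choice with θ > 1).

```r
theta <- 3                    # Any theta > 1 (arbitrary choice).
closed_form <- theta / (theta - 1)
by_integration <- integrate(function(x) x * theta / x^(theta + 1), 1, Inf)$value
c(closed_form, by_integration)   # Both equal 1.5 here.
```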
Q5. For these data
x̄ = (1/10) ∑_{i=1}^{10} xi = 4.81,  m′2 = (1/10) ∑_{i=1}^{10} xi² = 28.983,  m2 = m′2 − x̄² = 5.8469.
For a gamma(α, λ) distribution E[X] = α/λ and Var[X] = α/λ².
For method of moments estimates equate x̄ = E[X] and m2 = Var[X] so that
α̂ = x̄²/m2,  λ̂ = x̄/m2.
These give α̂ = 3.956986 and λ̂ = 0.8226582. The best fitting gamma distribution is thus a
gamma(α = 3.957, λ = 0.8227) distribution. Notice that α̂ ≈ 4.
Using R to do the calculations:
x=c(8.7,3.3,5.5,5.8,6.1,3.9,1.2,0.7,7.4,5.5) # Set values into x.
xbar=mean(x) # Mean is 4.81.
mean(x^2) # m2’ is 28.983.
m2=mean(x^2)-xbar^2 # m2 is 5.8469.
alpha=xbar^2/m2 # alpha is 3.956986.
lambda=xbar/m2 # lambda is 0.8226582.
⁶⁴A simple R simulation shows that n̂ can be negative:
nn=numeric(10000) # Initialise nn to be a vector 10000 long.
m=20; n=10; theta=0.3 # Initialise required constants m, n and theta.
for (k in 1:10000){ # Do 10000 simulations.
x=rbinom(m,n,theta) # Simulate m values from Bin(n,theta) and store in x.
m2=mean((x-mean(x))^2) # Obtain m2 value.
nn[k]=mean(x)^2/(mean(x)-m2) # Obtain n estimate.
} # End k loop.
hist(nn,100) # Histogram of n estimates with about 100 breaks.
⁶⁵An alternative approach uses the substitution θ = t + 1 so that
E[X] = ∫_{1}^{∞} (θ/x^θ) dx = ∫_{1}^{∞} ((t + 1)/x^{t+1}) dx = ((t + 1)/t) ∫_{1}^{∞} (t/x^{t+1}) dx
and the area under the Pareto density with parameter t is one, so that E[X] = (t + 1)/t = θ/(θ − 1) as before.
MATH2715: Statistical Methods – Examples Class V, week 6
The questions below will be looked at in Examples Class V held in week 6.
Q1. Suppose that g(x) is a non-negative function⁶⁶ which is symmetric about zero and which
is a non-decreasing function of x for positive x. If X is a continuous random variable prove that,
for any ε > 0,
pr{|X| ≥ ε} ≤ E[g(X)]/g(ε).
(Hint: Split the region of integration (−∞,+∞) for E[g(X)] into suitable parts.)
By writing X = Y − µ, deduce Chebyshev’s inequality as a special case.
Q2. Suppose that Y1, Y2, . . . is a sequence of random variables which satisfies
Yn = { −2^n  with probability (1/2)3^{−n},
     { 0     with probability 1 − 3^{−n},
     { 2^n   with probability (1/2)3^{−n},
for n = 1, 2, . . .. Prove that
plim_{n→∞} Yn = 0.
What are the mean and variance of Yn? What happens as n → ∞?
⁶⁶Examples include g(x) = x² + 1 and g(x) = 1/(1 + e^{−x²}).