MATH2715: Statistical Methods - Mathematics at Leeds (sta6ajb/math2715/hand05-17.pdf)

MATH2715: Statistical Methods

Exercises V (based on lectures 9-10, work week 6, hand in lecture Mon 6 Nov)

ALL questions count towards the continuous assessment for this module.

Q1. If X ∼ gamma(α, λ), write down the moment generating function of X. Using moment

generating functions, deduce the distribution of Y = bX where b is a constant.

Q2. If X1, X2, . . . , Xn are independent random variables satisfying Xi ∼ gamma(αi, λ), use moment generating functions to obtain the distribution of Sn = ∑_{i=1}^n Xi.

Use moment generating functions to obtain the distribution of X̄n where X̄n = Sn/n.

Q3. Suppose U is a non-negative random variable with probability density function fU (u) and

b is a positive constant. Prove the Markov inequality,

E[U] ≥ b pr{U ≥ b}.

Hint: Write E[U ] and pr{U ≥ b} as integrals in terms of u and fU(u). Look at the proof of

Chebyshev’s inequality. Where would you split the region of integration?

Q4. Suppose X has a uniform distribution over the interval (0, 1) with mean µ and variance σ2.

Calculate pr{|X − µ| ≥ kσ} for a number of different values of k. Compare the exact numerical

results with the bound given by Chebyshev’s inequality.

Hint: Recall that pr{|X − µ| ≥ c} = pr{X ≤ (µ − c)} + pr{X ≥ (µ + c)}. You could use R to

evaluate the probabilities. For example

k=c(0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0) # Assign some k values.
chebyshev=1/k^2                      # Chebyshev's bound.
prob=punif(k,0,1)                    # If X is uniform(0,1), prob = pr(X < k).
                                     # prob gives the cdf F(x) evaluated at x=k.
                                     # prob does NOT give pr(|X-mu| > k*sigma).
                                     # What SHOULD you type here?

Q5. For the discrete random variable X satisfying

pr{X = 0} = 1 − 1/k²,  pr{X = −k} = 1/(2k²),  pr{X = +k} = 1/(2k²),  k ≥ 1,

show that Chebyshev's inequality becomes an equality for this particular value of k.

Hint: First obtain the mean µ and the variance σ2 of X.


Background Notes: Lecture 10. Weak law of large numbers

Example

Let Zn = (X̄n − µ)/(σ/√n).

Figure 26 shows the cumulative distribution function of Z1 and Z4 together with the standard normal cumulative distribution function in the two cases of independent Xi ∼ exponential(λ = 1) and independent Xi ∼ uniform(0, 1). As n → ∞ it can be seen that in both cases the cumulative distribution function of Zn tends to that of the standard normal cumulative distribution function.

Figure 26: cumulative distribution function of Z1 and Z4 for (left) independent Xi ∼ exponential(λ = 1) and (right) independent Xi ∼ uniform(0, 1); the N(0, 1) cdf is shown in each panel.

Chebyshev’s Inequality

For a random variable X with mean µ and finite variance σ² < ∞, Chebyshev's inequality states

pr{|X − µ| ≥ kσ} ≤ 1/k².
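As an illustrative numerical check (a Python sketch of my own, not part of the notes; the three-point distribution below is an arbitrary example), the bound can be verified exactly for a small discrete distribution:

```python
# Verify Chebyshev's inequality exactly for a small discrete distribution.
# The distribution below is an arbitrary illustrative example.
vals  = [0, 1, 4]
probs = [0.5, 0.3, 0.2]

mu  = sum(v * p for v, p in zip(vals, probs))              # mean
var = sum((v - mu) ** 2 * p for v, p in zip(vals, probs))  # variance
sd  = var ** 0.5

for k in [1.0, 1.5, 2.0]:
    tail  = sum(p for v, p in zip(vals, probs) if abs(v - mu) >= k * sd)
    bound = 1.0 / k ** 2
    assert tail <= bound + 1e-12   # pr{|X - mu| >= k*sigma} <= 1/k^2
```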

Further reading for lectures 9 and 10

Rice, J.A. (1995) Mathematical Statistics and Data Analysis (2nd edition), sections 4.2, 4.5, 5.2.

Hogg, R.V., McKean, J.W. and Craig, A.T. (2005) Introduction to Mathematical Statistics (6th edition), sections 1.9, 1.10, 4.2.

Larsen, R.J. and Marx, M.L. (2010) An Introduction to Mathematical Statistics and its Applications (5th edition), sections 3.12, 5.7.

Miller, I. and Miller, M. (2004) John E. Freund's Mathematical Statistics with Applications, sections 4.4, 4.5, 5.4, 8.2.

Footnote: Pafnuti Chebyshev (1821-1894) was a Russian mathematician. Many variants of the spelling of his name exist: Pafnut-i/y (T/Ts)-cheb-i/y-sh/ch/sch-e-v/ff/w. Uses of Chebyshev's (1867) inequality include constructing confidence intervals, for instance (taking k = 4.472 so that 1/k² = 0.05)

pr{|X − µ| < 4.472σX} > 0.95.

If X is known to be a normal random variable, we have the tighter bound

pr{|X − µ| < 1.96σX} = 0.95.

The broad applicability of Chebyshev's inequality has led to its use in many fields including digital communication, gene matching, and detecting computer security attacks.


MATH2715: Statistical Methods – Worked Examples V

Worked Example: If X ∼ N(µ, σ²), use the moment generating function to obtain the distribution of Y = θX where θ is a constant.

Answer: If X ∼ N(µ, σ²), then mX(t) = exp(µt + σ²t²/2).

The mgf of Y is mY(t) = E[e^{tY}] = E[e^{t(θX)}] = E[e^{(θt)X}] = mX(θt).

Thus mY(t) = exp(µ(θt) + σ²(θt)²/2) = exp((µθ)t + (σθ)²t²/2).

By inspection of the mgf it can be seen that Y ∼ N(µθ, σ²θ²).
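A minimal Monte Carlo sketch of this result (in Python rather than the handout's R; the values of µ, σ and θ below are arbitrary choices):

```python
import random
import statistics

# If X ~ N(mu, sigma^2) and Y = theta*X, then Y ~ N(mu*theta, sigma^2*theta^2);
# note the standard deviation of Y is |theta|*sigma.
random.seed(1)
mu, sigma, theta = 2.0, 3.0, -1.5

y = [theta * random.gauss(mu, sigma) for _ in range(200_000)]

assert abs(statistics.mean(y) - mu * theta) < 0.05            # mean ~ -3.0
assert abs(statistics.pstdev(y) - abs(theta) * sigma) < 0.05  # sd ~ 4.5
```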

Worked Example: If X has an exponential(λ) distribution, use moment generating functions to obtain the distribution of U = aX where a (a > 0) is a known constant.

Answer: X ∼ exponential(λ) has mgf mX(t) = λ/(λ − t) for t < λ. Thus

mU(t) = E[e^{tU}] = E[e^{t(aX)}] = E[e^{(at)X}] = mX(at) = λ/(λ − at) = (λ/a)/((λ/a) − t).

By inspection of the mgf it can be seen that U ∼ exponential(λ/a).

Worked Example: If X and Y are independent exponential(λ) random variables, use moment generating functions to obtain the distribution of U = X + Y.

Answer: X has mgf mX(t) = λ/(λ − t) for t < λ. Similarly mY(t) = λ/(λ − t) for t < λ.

Hence mU(t) = E[e^{tU}] = E[e^{t(X+Y)}] = E[e^{tX}e^{tY}] = E[e^{tX}] E[e^{tY}] by independence of X and Y. Thus

mU(t) = (λ/(λ − t)) × (λ/(λ − t)) = (λ/(λ − t))².

By inspection of the mgf we deduce that U ∼ gamma(2, λ).
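As a numerical cross-check (my own Python sketch, not part of the handout), the sum of two simulated exponential(λ) variables can be compared with the gamma(2, λ) cdf, which has the closed form F(u) = 1 − e^{−λu}(1 + λu):

```python
import math
import random

# Simulate U = X + Y for independent X, Y ~ exponential(lam) and compare the
# empirical cdf with the gamma(2, lam) cdf F(u) = 1 - exp(-lam*u)*(1 + lam*u).
random.seed(2)
lam = 2.0
n = 200_000
u = [random.expovariate(lam) + random.expovariate(lam) for _ in range(n)]

for q in [0.5, 1.0, 2.0]:
    empirical = sum(1 for v in u if v <= q) / n
    exact = 1.0 - math.exp(-lam * q) * (1.0 + lam * q)
    assert abs(empirical - exact) < 0.005   # agreement to Monte Carlo accuracy
```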

Worked Example: If X1, X2 and X3 are independent gamma(α, λ) random variables, use moment generating functions to obtain the distribution of U = X1 + X2 + X3.

Answer: The Xi have mgf mXi(t) = (λ/(λ − t))^α. Hence

mU(t) = E[e^{tU}] = E[e^{t(X1+X2+X3)}] = E[e^{tX1}e^{tX2}e^{tX3}] = E[e^{tX1}] E[e^{tX2}] E[e^{tX3}]

by independence of the Xi. Thus

mU(t) = (λ/(λ − t))^α × (λ/(λ − t))^α × (λ/(λ − t))^α = (λ/(λ − t))^{3α}.

By inspection of the mgf, U ∼ gamma(3α, λ).


Worked Example: Suppose X ∼ N(µ, σ2). Calculate pr{|X − µ| ≥ kσ} for a number of different

values of k and compare the exact results with the bound given by Chebyshev’s inequality.

Answer:

pr{|X − µ| ≥ kσ} = pr{X − µ ≥ kσ} + pr{X − µ ≤ −kσ} = pr{X ≥ µ + kσ} + pr{X ≤ µ − kσ}.

For the normal distribution we evaluate cumulative probabilities by standardising. Thus

pr{|X − µ| ≥ kσ} = pr{(X − µ)/σ ≥ k} + pr{(X − µ)/σ ≤ −k} = pr{Z ≥ k} + pr{Z ≤ −k}

where Z = (X − µ)/σ ∼ N(0, 1) with Φ(z) = pr{Z ≤ z}. Thus

pr{|X − µ| ≥ kσ} = (1 − pr{Z ≤ k}) + pr{Z ≤ −k} = (1 − Φ(k)) + Φ(−k) = (1 − Φ(k)) + (1 − Φ(k)) = 2(1 − Φ(k)),

using the symmetry property Φ(−k) = 1 − Φ(k).

The probabilities can be generated for different k using the R code below. The probabilities and

Chebyshev bound are shown in table 1.

k=c(0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0) # Assign some k values.
prob=2*(1-pnorm(k))                  # Exact probabilities.
chebyshev=1/k^2                      # Chebyshev's bound.
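The same probabilities can be cross-checked without R; a Python sketch of my own using math.erf (the values quoted in the assertions are standard normal tail probabilities):

```python
import math

# pr{|X - mu| >= k*sigma} = 2*(1 - Phi(k)) for X normal, with Phi computed
# from the error function: Phi(z) = (1 + erf(z/sqrt(2)))/2.
def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

known = {0.5: 0.61708, 1.0: 0.31731, 2.0: 0.04550, 3.5: 0.00047}
for k, expected in known.items():
    prob = 2.0 * (1.0 - phi(k))
    assert abs(prob - expected) < 1e-5
```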

Figure 27(left) shows pr{|X − µ| ≥ 1.5σ} for X ∼ N(µ = 10, σ2 = 4). Figure 27(right) gives the

exact probability and Chebyshev’s bound for a range of values k.

Figure 27: (left) pr{|X − µ| ≥ 1.5σ} for X ∼ N(µ = 10, σ² = 4); (right) Chebyshev's bound and the exact probability pr{|X − µ| ≥ kσ} for X ∼ N(µ, σ²).

Worked Example: Suppose X ∼ exponential(λ) with mean µ and variance σ². Obtain pr{|X − µ| ≥ kσ}. Calculate pr{|X − µ| ≥ kσ} for a number of different values of k and compare the exact result with the bound given by Chebyshev's inequality.


k     pr{|X − µ| ≥ kσ}   Chebyshev bound
0.5   0.61708            1.0000 (⋆)
1.0   0.31731            1.0000
1.5   0.13361            0.4444
2.0   0.04550            0.2500
2.5   0.01242            0.1600
3.0   0.00270            0.1111
3.5   0.00047            0.0816
4.0   0.00006            0.0625

Table 1: Chebyshev's inequality and the exact probability for X ∼ N(µ, σ²). (⋆) For k ≤ 1 the bound is replaced by unity.

Answer: fX(x) = λe^{−λx} for x > 0 and FX(x) = pr{X ≤ x} = 1 − e^{−λx} for x > 0. Also µ = E[X] = 1/λ and σ² = Var[X] = 1/λ². Hence

pr{|X − µ| ≥ kσ} = pr{X ≤ µ − kσ} + pr{X ≥ µ + kσ} = pr{X ≤ (1 − k)/λ} + pr{X ≥ (1 + k)/λ}

= FX((1 − k)/λ) + (1 − FX((1 + k)/λ)) = 1 − e^{−(1−k)} + e^{−(1+k)} if k < 1, and e^{−(1+k)} if k ≥ 1.

This can be evaluated for different values of k.
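A small Python sketch of this piecewise formula (my own check; the k values are arbitrary), verified against a direct cdf calculation and against Chebyshev's bound:

```python
import math

# Piecewise tail probability for X ~ exponential(lam): since mu = sigma = 1/lam,
# pr{|X - mu| >= k*sigma} does not depend on lam.
def tail(k):
    if k < 1.0:
        return 1.0 - math.exp(-(1.0 - k)) + math.exp(-(1.0 + k))
    return math.exp(-(1.0 + k))

lam = 2.0
mu = sd = 1.0 / lam
for k in [0.3, 0.6, 1.5, 2.5]:
    # Direct calculation from F(x) = 1 - exp(-lam*x) for x > 0.
    lower = 1.0 - math.exp(-lam * (mu - k * sd)) if mu - k * sd > 0 else 0.0
    upper = math.exp(-lam * (mu + k * sd))
    assert abs(tail(k) - (lower + upper)) < 1e-12
    assert tail(k) <= min(1.0, 1.0 / k ** 2) + 1e-12   # Chebyshev's bound
```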

Figure 28 shows pr{|X − µ| ≥ 0.6σ} and pr{|X − µ| ≥ 1.5σ} for X ∼ exponential(λ = 2).

Figure 28: for X ∼ exponential(λ = 2), (left) pr{|X − µ| ≥ 0.6σ} and (right) pr{|X − µ| ≥ 1.5σ}.

Suitable R commands for plotting the exact probability pr{|X − µ| ≥ kσ} and the Chebyshev bound 1/k² are given below for the case X ∼ exponential(λ = 1). The result is plotted in figure 29.

k=c(1:30)*0.1                # Set k=0.1, 0.2, ..., 3.0.
prob=pexp(1-k)+(1-pexp(k+1)) # Exact probability.
cheb=1/k^2                   # Chebyshev upper bound.
cheb[cheb>1]=1               # Make all values of cheb>1 equal to one.
plot(k,prob,type="l",ylim=c(0,1))
points(k,cheb,type="l",lty=2)

Footnotes: For λ = 2, µ = 1/2 and σ = 1/2, so pr{|X − µ| ≥ 0.6σ} = pr{X ≥ 0.8} + pr{X ≤ 0.2} while pr{|X − µ| ≥ 1.5σ} = pr{X ≥ 1.25} + pr{X ≤ −0.25}, where pr{X ≤ −0.25} = 0. Chebyshev's inequality says that pr{|X − µ| ≥ kσ} ≤ 1/k² for k > 1. If k < 1 the right-hand side of Chebyshev's inequality should be set equal to one!

Figure 29: Chebyshev's bound and the exact probability pr{|X − µ| ≥ kσ} in the case where X ∼ exponential(λ).

Worked Example: In the Markov inequality, a non-negative random variable U satisfies E[U] ≥ b pr{U ≥ b}. Thus if a random variable X has mean µ and variance σ² and U = g(X) is a non-negative function of X, then it follows that E[g(X)] ≥ b pr{g(X) ≥ b}.

By considering the special case g(X) = (X − µ)², derive Chebyshev's inequality

pr{|X − µ| ≥ kσ} ≤ 1/k².

Answer: The Markov inequality can be written as pr{g(X) ≥ b} ≤ E[g(X)]/b.

Now write g(X) = (X − µ)². Thus pr{(X − µ)² ≥ b} ≤ E[(X − µ)²]/b = Var[X]/b = σ²/b.

Since (X − µ)² = |X − µ|², then pr{(X − µ)² ≥ b} = pr{|X − µ| ≥ √b}.

Writing √b = kσ, so b = k²σ², pr{(X − µ)² ≥ b} ≤ σ²/b gives pr{|X − µ| ≥ kσ} ≤ 1/k² as required.
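A quick numerical illustration of the Markov inequality itself (the data set below is an arbitrary example of my own): for any non-negative sample, the mean dominates b times the fraction of values at least b, since the empirical distribution is itself a probability distribution:

```python
# Markov's inequality E[U] >= b * pr{U >= b} checked on the empirical
# distribution of an arbitrary non-negative sample.
u = [0.2, 0.5, 1.0, 1.3, 2.0, 4.0, 7.5]
mean_u = sum(u) / len(u)

for b in [0.5, 1.0, 2.0, 5.0]:
    frac = sum(1 for v in u if v >= b) / len(u)   # empirical pr{U >= b}
    assert mean_u >= b * frac
```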

Footnotes: Andrei Andreyevich Markov (1856-1922) was a Russian mathematician and a student of Chebyshev. The Markov inequality is still useful for deriving new results. For example, consider Herman Chernoff's (1952) inequality pr{X ≥ c} ≤ e^{−tc} mX(t), where X has moment generating function mX(t). This result follows because, for t > 0, pr{X ≥ c} = pr{e^{tX} ≥ e^{tc}}, and putting g(X) = e^{tX} and b = e^{tc} in the Markov inequality gives

pr{e^{tX} ≥ e^{tc}} ≤ E[e^{tX}]/e^{tc}.


Worked Example: Does there exist a random variable X for which Chebyshev’s inequality

becomes an equality for all k ≥ 1?

Answer: There is no distribution which attains the Chebyshev bound for all k ≥ 1. We use a

proof by contradiction and consider the continuous case.

Suppose there exists a random variable X with finite variance σ² and satisfying

pr{|X − µ| ≥ kσ} = 1/k² for all k ≥ 1. (†)

Let Z = (X − µ)/σ, with E[Z] = 0 and Var[Z] = 1 (⋆), so that pr{|Z| ≥ k} = 1/k² for all k ≥ 1.

Hence pr{|Z| ≥ z} = pr{Z ≤ −z} + pr{Z ≥ z} = 1/z² for z ≥ 1.

If Z has cumulative distribution function FZ(z) = pr{Z ≤ z} and probability density function fZ(z), then FZ(−z) + {1 − FZ(z)} = 1/z².

Differentiating this with respect to z gives −fZ(−z) − fZ(z) = −2/z³. Clearly fZ(−z) + fZ(z) = 2/z³ for z ≥ 1.

Since the area under the probability density function integrates to unity it follows that

1 = ∫_{−∞}^{∞} fZ(z) dz = ∫_{−∞}^{−1} fZ(z) dz + ∫_{−1}^{1} fZ(z) dz + ∫_{1}^{∞} fZ(z) dz

so that

1 = ∫_{−1}^{1} fZ(z) dz + ∫_{1}^{∞} {fZ(−z) + fZ(z)} dz = ∫_{−1}^{1} fZ(z) dz + ∫_{1}^{∞} (2/z³) dz

and then

∫_{−1}^{1} fZ(z) dz = 1 − [−1/z²]_{1}^{∞} = 0.

As probability density functions are non-negative this can only hold if fZ(z) = 0 for −1 < z < 1.

Since by assumption E[Z] = 0, the variance of Z satisfies Var[Z] = E[Z²] so that

Var[Z] = ∫_{−∞}^{∞} z² fZ(z) dz = ∫_{1}^{∞} z² {fZ(−z) + fZ(z)} dz = ∫_{1}^{∞} (2/z) dz = ∞.

Thus Var[Z] ≠ 1 as assumed at (⋆). Clearly X cannot have finite variance σ². There is thus no random variable X which satisfies (†).

Worked Example: Let X1, X2, X3, . . . be a sequence of random variables with a common mean µ, common variance σ², and with covariances, for i ≠ j, given by

cov(Xi, Xj) = ρσ² if j = i ± 1, and 0 otherwise.

Consider now the sequence of means {X̄n}, n = 1, 2, 3, . . ., where X̄n = (1/n) ∑_{i=1}^n Xi. Show that

Var[X̄n] = σ²/n + 2((n − 1)/n²)ρσ².

Deduce that Var[X̄n] ≤ ((3n − 2)/n²)σ² and so use Chebyshev's inequality to show that a weak law of large numbers applies to X̄n.


Answer: Let Sn = ∑_{i=1}^n Xi. Then Var[Sn] = Var[∑_{i=1}^n Xi] = ∑_{i=1}^n ∑_{j=1}^n cov(Xi, Xj).

The double sum has n² terms as shown in the table.

         j = 1              j = 2              j = 3              · · ·  j = n
i = 1    cov(X1,X1) = σ²    cov(X1,X2) = ρσ²   cov(X1,X3) = 0     · · ·  cov(X1,Xn) = 0
i = 2    cov(X2,X1) = ρσ²   cov(X2,X2) = σ²    cov(X2,X3) = ρσ²   · · ·  cov(X2,Xn) = 0
...
i = n    cov(Xn,X1) = 0     cov(Xn,X2) = 0     cov(Xn,X3) = 0     · · ·  cov(Xn,Xn) = σ²

Hence

Var[Sn] = ∑_{i=1}^n cov(Xi, Xi) + ∑∑_{i≠j} cov(Xi, Xj) = ∑_{i=1}^n Var[Xi] + 2 ∑∑_{i<j} cov(Xi, Xj)

= nσ² + 2{cov(X1,X2) + cov(X2,X3) + · · · + cov(Xn−1,Xn)} = nσ² + 2(n − 1)ρσ².

This gives

Var[X̄n] = Var[Sn/n] = (1/n²) Var[Sn] = σ²/n + 2((n − 1)/n²)ρσ².

Since −1 ≤ ρ ≤ 1 and cov(Xi, Xj) = ρσ², then −σ² ≤ cov(Xi, Xj) ≤ σ². Hence

Var[X̄n] ≤ σ²/n + 2((n − 1)/n²)σ² = ((3n − 2)/n²)σ².

In Chebyshev's inequality, pr{|Y − µY| ≥ kσY} ≤ 1/k². Put Y = X̄n with µY = µ and put ε = kσY so that

pr{|X̄n − µ| ≥ ε} ≤ Var[X̄n]/ε² ≤ (3n − 2)σ²/(n²ε²).

For given η, choose n0 such that (3n0 − 2)σ²/(n0²ε²) < η. Then, for all n > n0, (3n − 2)σ²/(n²ε²) < η and so, as required, pr{|X̄n − µ| ≥ ε} < η if n > n0. Thus X̄n →p µ as n → ∞.
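The variance formula at the heart of this argument can be checked exactly by summing the covariance table directly (a Python sketch of my own; the values of σ² and ρ are arbitrary choices):

```python
# Sum the n x n covariance table for the banded structure
# cov(X_i, X_j) = sigma^2 if i == j, rho*sigma^2 if |i - j| == 1, else 0,
# and compare with Var[S_n] = n*sigma^2 + 2*(n-1)*rho*sigma^2.
sigma2, rho = 2.5, 0.4

for n in [2, 5, 10]:
    total = sum(
        sigma2 if i == j else (rho * sigma2 if abs(i - j) == 1 else 0.0)
        for i in range(n) for j in range(n)
    )
    assert abs(total - (n * sigma2 + 2 * (n - 1) * rho * sigma2)) < 1e-9
    # Var[Xbar_n] = Var[S_n]/n^2 and the bound used for the weak law:
    assert total / n ** 2 <= (3 * n - 2) / n ** 2 * sigma2 + 1e-9
```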


MATH2715: Solutions to exercises II

Q1.

E[(X − µ)⁴] = E[X⁴ − 4µX³ + 6µ²X² − 4µ³X + µ⁴] = E[X⁴] − 4µE[X³] + 6µ²E[X²] − 4µ³E[X] + µ⁴

as µ is a constant. Thus

E[(X − µ)⁴] = µ′₄ − 4µµ′₃ + 6µ²µ′₂ − 4µ⁴ + µ⁴ = µ′₄ − 4µµ′₃ + 6µ²µ′₂ − 3µ⁴,

where µ′r = E[X^r] and µ′₁ = E[X] = µ.
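A quick check of this identity (my own Python sketch) using the exponential(1) distribution, whose raw moments are µ′r = r! and whose fourth central moment is known to be 9/λ⁴ = 9:

```python
import math

# Raw moments of exponential(1): E[X^r] = r!.
mu1, mu2, mu3, mu4 = (math.factorial(r) for r in (1, 2, 3, 4))
mu = mu1   # mean = 1

# E[(X - mu)^4] = mu4' - 4*mu*mu3' + 6*mu^2*mu2' - 3*mu^4
central4 = mu4 - 4 * mu * mu3 + 6 * mu ** 2 * mu2 - 3 * mu ** 4
assert central4 == 9   # 24 - 24 + 12 - 3 = 9
```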

Q2.

E[X³] = ∫ x³ fX(x) dx = ∫_{0}^{∞} x³ (λ^α x^{α−1} e^{−λx}/Γ(α)) dx = (λ^α/Γ(α)) ∫_{0}^{∞} x^{α+2} e^{−λx} dx.

Since fX(x) is a probability density function it integrates to one, so that

∫_{0}^{∞} λ^α x^{α−1} e^{−λx}/Γ(α) dx = 1  ⇒  ∫_{0}^{∞} x^{α−1} e^{−λx} dx = Γ(α)/λ^α.

This must hold for all values of α so that

∫_{0}^{∞} x^{A−1} e^{−λx} dx = Γ(A)/λ^A.

Putting A = α + 3 we have

∫_{0}^{∞} x^{α+2} e^{−λx} dx = Γ(α + 3)/λ^{α+3}.

Thus

E[X³] = (λ^α/Γ(α)) Γ(α + 3)/λ^{α+3} = α(α + 1)(α + 2)/λ³

as required, since Γ(α + 3) = (α + 2)Γ(α + 2) = (α + 2)(α + 1)Γ(α + 1) = (α + 2)(α + 1)αΓ(α).
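A crude numerical-integration check of this moment formula (a Python sketch of my own; α and λ are arbitrary choices, and the midpoint rule on (0, 60) is accurate enough here):

```python
import math

# Check E[X^3] = alpha*(alpha+1)*(alpha+2)/lambda^3 for the gamma(alpha, lam)
# density by midpoint-rule integration of x^3 * f_X(x).
alpha, lam = 2.0, 1.5

def density(x):
    return lam ** alpha * x ** (alpha - 1) * math.exp(-lam * x) / math.gamma(alpha)

h = 0.001
integral = sum(((i + 0.5) * h) ** 3 * density((i + 0.5) * h) * h
               for i in range(60_000))

exact = alpha * (alpha + 1) * (alpha + 2) / lam ** 3   # = 24/3.375
assert abs(integral - exact) < 1e-3
```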

Q3. Since X ∼ Bin(n, θ), E[X] = nθ and Var[X] = nθ(1 − θ). For method of moments estimators equate x̄ = E[X] and m2 = Var[X]. Thus x̄ = nθ and m2 = nθ(1 − θ) give

1 − θ = m2/x̄  ⇒  θ = 1 − m2/x̄

and so estimate n by n = x̄/θ = x̄²/(x̄ − m2).

Problem: Do we expect our estimate for n to be an integer? A bigger problem: Is n always non-negative?

Footnote: You could alternatively equate s² = Var[X], where s² = (1/(m − 1)) ∑_{i=1}^{m} (xi − x̄)² is the sample variance. This would give slightly different estimates for θ and n, namely θ̃ = 1 − s²/x̄ and ñ = x̄²/(x̄ − s²).

Q4.

E[X] = ∫ x fX(x) dx = ∫_{1}^{∞} x (θ/x^{θ+1}) dx = ∫_{1}^{∞} θx^{−θ} dx = [θx^{−θ+1}/(−θ + 1)]_{1}^{∞} = −θ/(−θ + 1) = θ/(θ − 1)

since lim_{x→∞} x^{−θ+1} = 0 because θ > 1.

For method of moments estimates equate x̄ = E[X] so that x̄ = θ/(θ − 1) ⇒ θ = x̄/(x̄ − 1).

Q5. For these data

x̄ = (1/10) ∑_{i=1}^{10} xi = 4.81,  m′2 = (1/10) ∑_{i=1}^{10} xi² = 28.983,  m2 = m′2 − x̄² = 5.8469.

For a gamma(α, λ) distribution E[X] = α/λ and Var[X] = α/λ².

For method of moments estimates equate x̄ = E[X] and m2 = Var[X] so that

α = x̄²/m2,  λ = x̄/m2.

These give α = 3.956986 and λ = 0.8226582. The best fitting gamma distribution is thus a gamma(α = 3.957, λ = 0.8227) distribution. Notice that α ≈ 4.

Using R to do the calculations:

x=c(8.7,3.3,5.5,5.8,6.1,3.9,1.2,0.7,7.4,5.5) # Set values into x.
xbar=mean(x)          # Mean is 4.81.
mean(x^2)             # m2' is 28.983.
m2=mean(x^2)-xbar^2   # m2 is 5.8469.
alpha=xbar^2/m2       # alpha is 3.956986.
lambda=xbar/m2        # lambda is 0.8226582.
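The same method-of-moments calculation can be reproduced in Python as a cross-check of the quoted R output, using the same data:

```python
# Method-of-moments fit of a gamma(alpha, lambda) distribution to the data.
x = [8.7, 3.3, 5.5, 5.8, 6.1, 3.9, 1.2, 0.7, 7.4, 5.5]
n = len(x)

xbar = sum(x) / n                           # sample mean
m2 = sum(v * v for v in x) / n - xbar ** 2  # second central sample moment

alpha = xbar ** 2 / m2
lam = xbar / m2

assert abs(xbar - 4.81) < 1e-9
assert abs(m2 - 5.8469) < 1e-9
assert abs(alpha - 3.956986) < 1e-6
assert abs(lam - 0.8226582) < 1e-6
```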

Footnote: A simple R simulation shows that the estimate of n can be negative:

nn=numeric(10000)            # Initialise nn to be a vector 10000 long.
m=20; n=10; theta=0.3        # Initialise required constants m, n and theta.
for (k in 1:10000){          # Do 10000 simulations.
x=rbinom(m,n,theta)          # Simulate m values from Bin(n,theta) and store in x.
m2=mean((x-mean(x))^2)       # Obtain m2 values.
nn[k]=mean(x)^2/(mean(x)-m2) # Obtain n estimate.
}                            # End k loop.
hist(nn,100)                 # Histogram of n values with about 100 breaks.

Footnote: An alternative approach uses the substitution θ = t + 1, so that

E[X] = ∫_{1}^{∞} θ/x^θ dx = ∫_{1}^{∞} (t + 1)/x^{t+1} dx = ((t + 1)/t) ∫_{1}^{∞} t/x^{t+1} dx,

and the area under the Pareto density with parameter t is one, so that E[X] = (t + 1)/t = θ/(θ − 1) as before.


MATH2715: Statistical Methods – Examples Class V, week 6

The questions below will be looked at in Examples Class V held in week 6.

Q1. Suppose that g(x) is a non-negative function which is symmetric about zero and which is a non-decreasing function of x for positive x. If X is a continuous random variable prove that, for any ε > 0,

pr{|X| ≥ ε} ≤ E[g(X)]/g(ε).

(Hint: Split the region of integration (−∞, +∞) for E[g(X)] into suitable parts.)

By writing X = Y − µ, deduce Chebyshev's inequality as a special case.

Q2. Suppose that Y1, Y2, . . . is a sequence of random variables which satisfies

Yn = −2^n with probability (1/2)3^{−n},  0 with probability 1 − 3^{−n},  and 2^n with probability (1/2)3^{−n}

for n = 1, 2, . . .. Prove that

plim_{n→∞} Yn = 0.

What are the mean and variance of Yn? What happens as n → ∞?

Footnote: Examples of such functions g include g(x) = x² + 1 and g(x) = 1/(1 + e^{−x²}).