
Solutions Manual for

Probability and Random Processes for

Electrical and Computer Engineers

John A. Gubner

University of Wisconsin–Madison

File Generated July 13, 2007


CHAPTER 1

Problem Solutions

1. Ω = {1,2,3,4,5,6}.

2. Ω = {0,1,2, . . . ,24,25}.

3. Ω = [0,∞). The event RTT > 10 ms is given by the interval (10,∞).

4. (a) Ω = {(x,y) ∈ IR^2 : x^2 + y^2 ≤ 100}. (b) {(x,y) ∈ IR^2 : 4 ≤ x^2 + y^2 ≤ 25}.

5. (a) [2,3]c = (−∞,2)∪ (3,∞).

(b) (1,3)∪ (2,4) = (1,4).

(c) (1,3)∩ [2,4) = [2,3).

(d) (3,6]\ (5,7) = (3,5].

6. Sketches: the solution consists of sketches of the given sets in the x–y plane; the figures did not survive the transcript extraction and are omitted here.

7. (a) [1,4]∩([0,2]∪[3,5]) = ([1,4]∩[0,2])∪([1,4]∩[3,5]) = [1,2]∪[3,4].

(b) ([0,1]∪[2,3])^c = [0,1]^c ∩ [2,3]^c
    = [(−∞,0)∪(1,∞)] ∩ [(−∞,2)∪(3,∞)]
    = ((−∞,0)∩[(−∞,2)∪(3,∞)]) ∪ ((1,∞)∩[(−∞,2)∪(3,∞)])
    = (−∞,0)∪(1,2)∪(3,∞).

(c) ⋂_{n=1}^∞ (−1/n, 1/n) = {0}.

(d) ⋂_{n=1}^∞ [0, 3 + 1/(2n)) = [0,3].

(e) ⋃_{n=1}^∞ [5, 7 − 1/(3n)] = [5,7).

(f) ⋃_{n=1}^∞ [0,n] = [0,∞).


8. We first let C ⊂ A and show that for all B, (A∩B)∪C = A∩(B∪C). Write
    A∩(B∪C) = (A∩B)∪(A∩C), by the distributive law,
             = (A∩B)∪C, since C ⊂ A ⇒ A∩C = C.
For the second part of the problem, suppose (A∩B)∪C = A∩(B∪C). We must show that C ⊂ A. Let ω ∈ C. Then ω ∈ (A∩B)∪C. But then ω ∈ A∩(B∪C), which implies ω ∈ A.

9. Let I := {ω ∈ Ω : ω ∈ A ⇒ ω ∈ B}. We must show that A∩I = A∩B.
⊂: Let ω ∈ A∩I. Then ω ∈ A and ω ∈ I. Therefore, ω ∈ B, and then ω ∈ A∩B.
⊃: Let ω ∈ A∩B. Then ω ∈ A and ω ∈ B. We must show that ω ∈ I too. In other words, we must show that ω ∈ A ⇒ ω ∈ B. But we already have ω ∈ B.

10. The function f : (−∞,∞) → [0,∞) with f(x) = x^3 is not well defined because not all values of f(x) lie in the claimed co-domain [0,∞).

11. (a) The function will be invertible if Y = [−1,1].
(b) {x : f(x) ≤ 1/2} = [−π/2, π/6].
(c) {x : f(x) < 0} = [−π/2, 0).

12. (a) Since f is not one-to-one, no choice of co-domain Y can make f : [0,π] → Y invertible.
(b) {x : f(x) ≤ 1/2} = [0, π/6] ∪ [5π/6, π].
(c) {x : f(x) < 0} = ∅.

13. For B ⊂ IR,
    f⁻¹(B) = X,  if 0 ∈ B and 1 ∈ B;
             A,  if 1 ∈ B but 0 ∉ B;
             Aᶜ, if 0 ∈ B but 1 ∉ B;
             ∅,  if 0 ∉ B and 1 ∉ B.

14. Let f : X → Y be a function such that f takes only n distinct values, say y1, . . . ,yn. Let B ⊂ Y be such that f⁻¹(B) is nonempty. By definition, each x ∈ f⁻¹(B) has the property that f(x) ∈ B. But f(x) must be one of the values y1, . . . ,yn, say yi. Now f(x) = yi if and only if x ∈ Ai := f⁻¹({yi}). Hence,
    f⁻¹(B) = ⋃_{i : yi ∈ B} Ai.

15. (a) f(x) ∈ Bᶜ ⇔ f(x) ∉ B ⇔ x ∉ f⁻¹(B) ⇔ x ∈ f⁻¹(B)ᶜ.

(b) f(x) ∈ ⋃_{n=1}^∞ Bn if and only if f(x) ∈ Bn for some n; i.e., if and only if x ∈ f⁻¹(Bn) for some n. But this says that x ∈ ⋃_{n=1}^∞ f⁻¹(Bn).

(c) f(x) ∈ ⋂_{n=1}^∞ Bn if and only if f(x) ∈ Bn for all n; i.e., if and only if x ∈ f⁻¹(Bn) for all n. But this says that x ∈ ⋂_{n=1}^∞ f⁻¹(Bn).

16. If B = ⋃_i {b_i} and C = ⋃_i {c_i}, put a_{2i} := b_i and a_{2i−1} := c_i. Then A = ⋃_i {a_i} = B∪C is countable.

17. Since each Ci is countable, we can write Ci = ⋃_j {c_{ij}}. It then follows that
    B := ⋃_{i=1}^∞ Ci = ⋃_{i=1}^∞ ⋃_{j=1}^∞ {c_{ij}}
is a doubly indexed sequence and is therefore countable as shown in the text.

18. Let A = ⋃_m {a_m} be a countable set, and let B ⊂ A. We must show that B is countable. If B = ∅, we're done by definition. Otherwise, there is at least one element of B in A, say a_k. Then put b_n := a_n if a_n ∈ B, and put b_n := a_k if a_n ∉ B. Then ⋃_n {b_n} = B, and we see that B is countable.

19. Let A ⊂ B where A is uncountable. We must show that B is uncountable. We prove

this by contradiction. Suppose that B is countable. Then by the previous problem, A

is countable, contradicting the assumption that A is uncountable.

20. Suppose A is countable and B is uncountable. We must show that A∪B is uncountable. We prove this by contradiction. Suppose that A∪B is countable. Then since B ⊂ A∪B, we would have B countable as well, contradicting the assumption that B is uncountable.

21. MATLAB. OMITTED.

22. MATLAB. Intuitive explanation: Using only the numbers 1,2,3,4,5,6, consider how many ways there are to write the following sums of two dice:

     2 = 1+1                                    1 way,   1/36 = 0.0278
     3 = 1+2 = 2+1                              2 ways,  2/36 = 0.0556
     4 = 1+3 = 2+2 = 3+1                        3 ways,  3/36 = 0.0833
     5 = 1+4 = 2+3 = 3+2 = 4+1                  4 ways,  4/36 = 0.1111
     6 = 1+5 = 2+4 = 3+3 = 4+2 = 5+1            5 ways,  5/36 = 0.1389
     7 = 1+6 = 2+5 = 3+4 = 4+3 = 5+2 = 6+1      6 ways,  6/36 = 0.1667
     8 = 2+6 = 3+5 = 4+4 = 5+3 = 6+2            5 ways,  5/36 = 0.1389
     9 = 3+6 = 4+5 = 5+4 = 6+3                  4 ways,  4/36 = 0.1111
    10 = 4+6 = 5+5 = 6+4                        3 ways,  3/36 = 0.0833
    11 = 5+6 = 6+5                              2 ways,  2/36 = 0.0556
    12 = 6+6                                    1 way,   1/36 = 0.0278
                                               36 ways, 36/36 = 1
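A minimal MATLAB sketch (base MATLAB only; not the script from the text) that estimates these probabilities by simulation and compares them with the exact values k/36:

% Simulate the sum of two fair dice and compare the empirical
% relative frequencies with the exact values (6 - |s-7|)/36.
n = 100000;                          % number of rolls
S = ceil(6*rand(1,n)) + ceil(6*rand(1,n));
for s = 2:12
    exact = (6 - abs(s-7))/36;       % number of ways is 6 - |s-7|
    fprintf('sum %2d: empirical %.4f, exact %.4f\n', s, mean(S==s), exact);
end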

23. Take Ω := {1, . . . ,26} and put
    P(A) := |A|/|Ω| = |A|/26.


The event that a vowel is chosen is V = {1,5,9,15,21}, and P(V) = |V|/26 = 5/26.

24. Let Ω := {(i, j) : 1 ≤ i, j ≤ 26 and i ≠ j}. For A ⊂ Ω, put P(A) := |A|/|Ω|. The event that a vowel is chosen followed by a consonant is
    Bvc = {(i, j) ∈ Ω : i = 1,5,9,15, or 21 and j ∈ {1, . . . ,26} \ {1,5,9,15,21}}.
Similarly, the event that a consonant is followed by a vowel is
    Bcv = {(i, j) ∈ Ω : i ∈ {1, . . . ,26} \ {1,5,9,15,21} and j = 1,5,9,15, or 21}.
We need to compute
    P(Bvc ∪ Bcv) = (|Bvc| + |Bcv|)/|Ω| = (5·(26−5) + (26−5)·5)/650 = 21/65 ≈ 0.323.
The event that two vowels are chosen is
    Bvv = {(i, j) ∈ Ω : i, j ∈ {1,5,9,15,21} with i ≠ j},
and P(Bvv) = |Bvv|/|Ω| = 20/650 = 2/65 ≈ 0.031.

25. MATLAB. The code for simulating the drawing of a face card is

% Simulation of Drawing a Face Card

%

n = 10000; % Number of draws.

X = ceil(52*rand(1,n));

faces = (41 <= X & X <= 52);

nfaces = sum(faces);

fprintf('There were %g face cards in %g draws.\n',nfaces,n)

26. Since 9 pm to 7 am is 10 hours, take Ω := [0,10]. The probability that the baby wakes up during a time interval 0 ≤ t1 < t2 ≤ 10 is
    P([t1, t2]) := ∫_{t1}^{t2} (1/10) dω.
Hence, P([2,10]ᶜ) = P([0,2]) = ∫_0^2 (1/10) dω = 1/5.

27. Starting with the equations
    S_N  = 1 + z + z^2 + ··· + z^{N−2} + z^{N−1}
    zS_N = z + z^2 + ··· + z^{N−2} + z^{N−1} + z^N,
subtract the second line from the first. Canceling common terms leaves
    S_N − zS_N = 1 − z^N, or S_N(1−z) = 1 − z^N.
If z ≠ 1, we can divide both sides by 1−z to get S_N = (1−z^N)/(1−z).
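A quick MATLAB check of the finite geometric-series formula (the values of z and N below are arbitrary choices, not from the problem):

% Verify S_N = (1 - z^N)/(1 - z) for a sample z and N.
z = 0.37; N = 12;
S_direct  = sum(z.^(0:N-1));         % 1 + z + ... + z^(N-1)
S_formula = (1 - z^N)/(1 - z);
fprintf('direct %.10f, formula %.10f\n', S_direct, S_formula)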


28. Let x = p(1). Then p(2) = 2p(1) = 2x, p(3) = 2p(2) = 2^2 x, p(4) = 2p(3) = 2^3 x, p(5) = 2^4 x, and p(6) = 2^5 x. In general, p(ω) = 2^{ω−1} x, and we can write
    1 = ∑_{ω=1}^6 p(ω) = ∑_{ω=1}^6 2^{ω−1} x = x ∑_{ω=0}^5 2^ω = x·(1−2^6)/(1−2) = 63x.
Hence, x = 1/63, and p(ω) = 2^{ω−1}/63 for ω = 1, . . . ,6.

29. (a) By inclusion–exclusion, P(A∪B) = P(A)+P(B)−P(A∩B), which can be rearranged as P(A∩B) = P(A)+P(B)−P(A∪B).

(b) Since P(A) = P(A∩B)+P(A∩Bc),

P(A∩Bc) = P(A)−P(A∩B) = P(A∪B)−P(B), by part (a).

(c) Since B and A∩Bc are disjoint,

P(B∪ (A∩Bc)) = P(B)+P(A∩Bc) = P(A∪B), by part (b).

(d) By De Morgan’s law, P(Ac∩Bc) = P([A∪B]c) = 1−P(A∪B).

30. We must check the four axioms of a probability measure. First,
    P(∅) = λP1(∅) + (1−λ)P2(∅) = λ·0 + (1−λ)·0 = 0.
Second,
    P(A) = λP1(A) + (1−λ)P2(A) ≥ λ·0 + (1−λ)·0 = 0.
Third,
    P(⋃_{n=1}^∞ An) = λP1(⋃_{n=1}^∞ An) + (1−λ)P2(⋃_{n=1}^∞ An)
                    = λ ∑_{n=1}^∞ P1(An) + (1−λ) ∑_{n=1}^∞ P2(An)
                    = ∑_{n=1}^∞ [λP1(An) + (1−λ)P2(An)]
                    = ∑_{n=1}^∞ P(An).
Fourth, P(Ω) = λP1(Ω) + (1−λ)P2(Ω) = λ + (1−λ) = 1.

31. First, since ω0 ∉ ∅, µ(∅) = 0. Second, by definition, µ(A) ≥ 0. Third, for disjoint An, suppose ω0 ∈ ⋃_n An. Then ω0 ∈ Am for some m, and ω0 ∉ An for n ≠ m. Then µ(Am) = 1 and µ(An) = 0 for n ≠ m. Hence, µ(⋃_n An) = 1 and ∑_n µ(An) = µ(Am) = 1. A similar analysis shows that if ω0 ∉ ⋃_n An, then µ(⋃_n An) and ∑_n µ(An) are both zero. Finally, since ω0 ∈ Ω, µ(Ω) = 1.


32. Starting with the assumption that for any two disjoint events A and B, P(A∪B) = P(A) + P(B), we have that for N = 2,
    P(⋃_{n=1}^N An) = ∑_{n=1}^N P(An).   (∗)
Now we must show that if (∗) holds for any N ≥ 2, then (∗) holds for N+1. Write
    P(⋃_{n=1}^{N+1} An) = P([⋃_{n=1}^N An] ∪ A_{N+1})
                        = P(⋃_{n=1}^N An) + P(A_{N+1}), additivity for two events,
                        = ∑_{n=1}^N P(An) + P(A_{N+1}), by (∗),
                        = ∑_{n=1}^{N+1} P(An).

33. Since An := Fn ∩ Fᶜ_{n−1} ∩ ··· ∩ Fᶜ_1 ⊂ Fn, it is easy to see that ⋃_{n=1}^N An ⊂ ⋃_{n=1}^N Fn. The hard part is to show the reverse inclusion ⊃. Suppose ω ∈ ⋃_{n=1}^N Fn. Then ω ∈ Fn for some n in the range 1, . . . ,N. However, ω may belong to Fn for several values of n since the Fn may not be disjoint. Let
    k := min{n : ω ∈ Fn and 1 ≤ n ≤ N}.
In other words, 1 ≤ k ≤ N and ω ∈ Fk, but ω ∉ Fn for n < k; in symbols,
    ω ∈ Fk ∩ Fᶜ_{k−1} ∩ ··· ∩ Fᶜ_1 =: Ak.
Hence, ω ∈ Ak ⊂ ⋃_{n=1}^N An. The proof that ⋃_{n=1}^∞ Fn ⊂ ⋃_{n=1}^∞ An is similar except that k := min{n : ω ∈ Fn and n ≥ 1}.

34. For arbitrary events Fn, let An be as in the preceding problem. We can then write

    P(⋃_{n=1}^∞ Fn) = P(⋃_{n=1}^∞ An) = ∑_{n=1}^∞ P(An), since the An are disjoint,
                    = lim_{N→∞} ∑_{n=1}^N P(An), by def. of infinite sum,
                    = lim_{N→∞} P(⋃_{n=1}^N An)
                    = lim_{N→∞} P(⋃_{n=1}^N Fn).


35. For arbitrary events Gn, put Fn := Gₙᶜ. Then
    P(⋂_{n=1}^∞ Gn) = 1 − P(⋃_{n=1}^∞ Fn), by De Morgan's law,
                    = 1 − lim_{N→∞} P(⋃_{n=1}^N Fn), by the preceding problem,
                    = 1 − lim_{N→∞} [1 − P(⋂_{n=1}^N Gn)], by De Morgan's law,
                    = lim_{N→∞} P(⋂_{n=1}^N Gn).

36. By the inclusion–exclusion formula,
    P(A∪B) = P(A) + P(B) − P(A∩B) ≤ P(A) + P(B).
This establishes the union bound for N = 2. Now suppose the union bound holds for some N ≥ 2. We must show it holds for N+1. Write
    P(⋃_{n=1}^{N+1} Fn) = P([⋃_{n=1}^N Fn] ∪ F_{N+1})
                        ≤ P(⋃_{n=1}^N Fn) + P(F_{N+1}), by the union bound for two events,
                        ≤ ∑_{n=1}^N P(Fn) + P(F_{N+1}), by the union bound for N events,
                        = ∑_{n=1}^{N+1} P(Fn).

37. To establish the union bound for a countable sequence of events, we proceed as follows. Let An := Fn ∩ Fᶜ_{n−1} ∩ ··· ∩ Fᶜ_1 ⊂ Fn be disjoint with ⋃_{n=1}^∞ An = ⋃_{n=1}^∞ Fn. Then
    P(⋃_{n=1}^∞ Fn) = P(⋃_{n=1}^∞ An)
                    = ∑_{n=1}^∞ P(An), since the An are disjoint,
                    ≤ ∑_{n=1}^∞ P(Fn), since An ⊂ Fn.

38. Following the hint, we put Gn := ⋃_{k=n}^∞ Bk so that we can write
    P(⋂_{n=1}^∞ ⋃_{k=n}^∞ Bk) = P(⋂_{n=1}^∞ Gn) = lim_{N→∞} P(⋂_{n=1}^N Gn), limit property of P,
                              = lim_{N→∞} P(G_N), since Gn ⊃ G_{n+1},
                              = lim_{N→∞} P(⋃_{k=N}^∞ Bk), definition of G_N,
                              ≤ lim_{N→∞} ∑_{k=N}^∞ P(Bk), union bound.
This last limit must be zero since ∑_{k=1}^∞ P(Bk) < ∞.

39. In this problem, the probability of an interval is its length.

(a) P(A0) = 1, P(A1) = 2/3, P(A2) = 4/9 = (2/3)^2, and P(A3) = 8/27 = (2/3)^3.

(b) P(An) = (2/3)^n.

(c) Write
    P(A) = P(⋂_{n=1}^∞ An) = lim_{N→∞} P(⋂_{n=1}^N An), limit property of P,
         = lim_{N→∞} P(A_N), since An ⊃ A_{n+1},
         = lim_{N→∞} (2/3)^N = 0.

40. Consider the collection consisting of the empty set along with all unions of the form ⋃_i A_{k_i} for some finite subsequence of distinct elements k_i from {1, . . . ,n}. We first show that this collection is a σ-field. First, it contains ∅ by definition. Second, since A1, . . . ,An is a partition,
    (⋃_i A_{k_i})ᶜ = ⋃_i A_{m_i},
where {m_i} is the subsequence {1, . . . ,n} \ {k_i}. Hence, the collection is closed under complementation. Third,
    ⋃_{n=1}^∞ (⋃_i A_{k_{n,i}}) = ⋃_j A_{m_j},
where an integer l ∈ {1, . . . ,n} is in {m_j} if and only if k_{n,i} = l for some n and some i. This shows that the collection is a σ-field. Finally, since every element in our collection must be contained in every σ-field that contains A1, . . . ,An, our collection must be σ(A1, . . . ,An).

41. We claim that A is not a σ-field. Our proof is by contradiction: We assume A is a σ-field and derive a contradiction. Consider the set
    ⋂_{n=1}^∞ [0, 1/2^n) = {0}.
Since [0, 1/2^n) ∈ Cn ⊂ An ⊂ A, the intersection must be in A since we are assuming A is a σ-field. Hence, {0} ∈ A. Now, any set in A must belong to some An. By the preceding problem, every set in An must be a finite union of sets from Cn. However, the singleton set {0} cannot be expressed as a finite union of sets from any Cn. Hence, {0} ∉ A.

42. Let Ai := X⁻¹({xi}) for i = 1, . . . ,n. By the problems mentioned in the hint, for any subset B, if X⁻¹(B) ≠ ∅, then
    X⁻¹(B) = ⋃_{i : xi ∈ B} Ai ∈ σ(A1, . . . ,An).
It follows that the smallest σ-field containing all the X⁻¹(B) is σ(A1, . . . ,An).

43. (a) F = {∅, A, B, {3}, {1,2}, {4,5}, {1,2,4,5}, Ω}.
(b) The corresponding probabilities are 0, 5/8, 7/8, 1/2, 1/8, 3/8, 1/2, 1.
(c) Since {1} ∉ F, P({1}) is not defined.

44. Suppose that a σ-field A contains an infinite sequence Fn of sets. If the sequence is not disjoint, we can construct a new sequence An that is disjoint with each An ∈ A. Let a = a1,a2, . . . be an infinite sequence of zeros and ones. Then A contains each union of the form
    ⋃_{i : ai = 1} Ai.
Furthermore, since the Ai are disjoint, each sequence a gives a different union, and we know from the text that the number of infinite sequences a is uncountably infinite.

45. (a) First, since ∅ is in each Aα, ∅ ∈ ⋂_α Aα. Second, if A ∈ ⋂_α Aα, then A ∈ Aα for each α, and so Aᶜ ∈ Aα for each α. Hence, Aᶜ ∈ ⋂_α Aα. Third, if An ∈ ⋂_α Aα for all n, then for each n and each α, An ∈ Aα. Then ⋃_n An ∈ Aα for each α, and so ⋃_n An ∈ ⋂_α Aα.

(b) We first note that
    A1 = {∅, {1}, {2}, {1,2}, {3,4}, {1,3,4}, {2,3,4}, Ω}
and
    A2 = {∅, {1}, {3}, {1,3}, {2,4}, {1,2,4}, {2,3,4}, Ω}.
It is then easy to see that
    A1 ∩ A2 = {∅, {1}, {2,3,4}, Ω}.

(c) First note that by part (a), ⋂_{A : C ⊂ A} A is a σ-field, and since C ⊂ A for each A, the σ-field ⋂_{A : C ⊂ A} A contains C. Finally, if D is any σ-field that contains C, then D is one of the A's in the intersection. Hence,
    C ⊂ ⋂_{A : C ⊂ A} A ⊂ D.
Thus ⋂_{A : C ⊂ A} A is the smallest σ-field that contains C.


46. The union of two σ-fields is not always a σ-field. Here is an example. Let Ω := {1,2,3,4}, and put
    F := {∅, {1,2}, {3,4}, Ω} and G := {∅, {1,3}, {2,4}, Ω}.
Then
    F ∪ G = {∅, {1,2}, {3,4}, {1,3}, {2,4}, Ω}
is not a σ-field since it does not contain {1,2} ∩ {1,3} = {1}.

47. Let Ω denote the positive integers, and let A denote the collection of subsets A such that either A or Aᶜ is finite.

(a) Let E denote the subset of even integers. Then E does not belong to A since neither E nor Eᶜ (the odd integers) is a finite set.

(b) To show that A is closed under finite unions, we consider two cases. First suppose that A1, . . . ,An are all finite. Then
    |⋃_{i=1}^n Ai| ≤ ∑_{i=1}^n |Ai| < ∞,
and so ⋃_{i=1}^n Ai ∈ A. In the second case, suppose that some Aⱼᶜ is finite. Then
    (⋃_{i=1}^n Ai)ᶜ = ⋂_{i=1}^n Aᵢᶜ ⊂ Aⱼᶜ.
Hence, the complement of ⋃_{i=1}^n Ai is finite, and so the union belongs to A.

(c) A is not a σ-field. To see this, put Ai := {2i} for i = 1,2, . . . . Then ⋃_{i=1}^∞ Ai = E ∉ A by part (a).

48. Let Ω be an uncountable set. Let A denote the collection of all subsets A such that either A is countable or Aᶜ is countable. We show that A is a σ-field. First, the empty set is countable. Second, if A ∈ A, we must show that Aᶜ ∈ A. There are two cases. If A is countable, then the complement of Aᶜ is A, and so Aᶜ ∈ A. If Aᶜ is countable, then Aᶜ ∈ A. Third, let A1,A2, . . . belong to A. There are two cases to consider. If all An are countable, then ⋃_n An is also countable by an earlier problem. Otherwise, if some A_mᶜ is countable, then write
    (⋃_{n=1}^∞ An)ᶜ = ⋂_{n=1}^∞ Aₙᶜ ⊂ A_mᶜ.
Since a subset of a countable set is countable, we see that the complement of ⋃_{n=1}^∞ An is countable, and thus the union belongs to A.

49. (a) Since (a,b] = ⋂_{n=1}^∞ (a, b + 1/n), and since each (a, b + 1/n) ∈ B, (a,b] ∈ B.

(b) Since {a} = ⋂_{n=1}^∞ (a − 1/n, a + 1/n), and since each (a − 1/n, a + 1/n) ∈ B, the singleton {a} ∈ B.


(c) Since by part (b), singleton sets are Borel sets, and since A is a countable union

of Borel sets, A ∈B; i.e., A is a Borel set.

(d) Using part (a), write
    λ((a,b]) = λ(⋂_{n=1}^∞ (a, b + 1/n))
             = lim_{N→∞} λ(⋂_{n=1}^N (a, b + 1/n)), limit property of probability,
             = lim_{N→∞} λ((a, b + 1/N)), decreasing sets,
             = lim_{N→∞} (b + 1/N) − a, characterization of λ,
             = b − a.
Similarly, using part (b), we can write
    λ({a}) = λ(⋂_{n=1}^∞ (a − 1/n, a + 1/n))
           = lim_{N→∞} λ(⋂_{n=1}^N (a − 1/n, a + 1/n)), limit property of probability,
           = lim_{N→∞} λ((a − 1/N, a + 1/N)), decreasing sets,
           = lim_{N→∞} 2/N, characterization of λ,
           = 0.

50. Let I denote the collection of open intervals, and let O denote the collection of open sets. We need to show that σ(I) = σ(O). Since I ⊂ O, every σ-field containing O also contains I. Hence, the smallest σ-field containing O contains I; i.e., I ⊂ σ(O). By the definition of the smallest σ-field containing I, it follows that σ(I) ⊂ σ(O). Now, if we can show that O ⊂ σ(I), then it will similarly follow that σ(O) ⊂ σ(I). Recall that in the problem statement, it was shown that every open set U can be written as a countable union of open intervals. This means U ∈ σ(I). This proves that O ⊂ σ(I) as required.

51. MATLAB. Chips from S1 are 80% reliable; chips from S2 are 70% reliable.

52. Observe that
    N(Ow,S1) = N(OS1) − N(Od,S1) = N(OS1)(1 − N(Od,S1)/N(OS1))
and
    N(Ow,S2) = N(OS2) − N(Od,S2) = N(OS2)(1 − N(Od,S2)/N(OS2)).


53. First write
    P(A|B∩C)P(B|C) = [P(A∩[B∩C])/P(B∩C)] · [P(B∩C)/P(C)] = P([A∩B]∩C)/P(C) = P(A∩B|C).
From this formula, we can isolate the equation
    P(A|B∩C)P(B|C) = P([A∩B]∩C)/P(C).
Multiplying through by P(C) yields P(A|B∩C)P(B|C)P(C) = P(A∩B∩C).

54. (a) P(MM) = 140/(140+60) = 140/200 = 7/10 = 0.7. Then P(HT) = 1 − P(MM) = 0.3.

(b) Let D denote the event that a workstation is defective. Then
    P(D) = P(D|MM)P(MM) + P(D|HT)P(HT) = (.1)(.7) + (.2)(.3) = .07 + .06 = 0.13.

(c) Write
    P(MM|D) = P(D|MM)P(MM)/P(D) = .07/.13 = 7/13.

55. Let O denote the event that a cell is overloaded, and let B denote the event that a call is blocked. The problem statement tells us that
    P(O) = 1/3, P(B|O) = 3/10, and P(B|Oᶜ) = 1/10.
To find P(O|B), first write
    P(O|B) = P(B|O)P(O)/P(B) = (3/10 · 1/3)/P(B).
Next compute
    P(B) = P(B|O)P(O) + P(B|Oᶜ)P(Oᶜ) = (3/10)(1/3) + (1/10)(2/3) = 5/30 = 1/6.
We conclude that
    P(O|B) = (1/10)/(1/6) = 6/10 = 3/5 = 0.6.

56. The problem statement tells us that P(R1|T0) = ε and P(R0|T1) = δ. We also know that
    1 = P(Ω) = P(T0 ∪ T1) = P(T0) + P(T1).
The problem statement tells us that these last two probabilities are the same; hence they are both equal to 1/2. To find P(T1|R1), we begin by writing
    P(T1|R1) = P(R1|T1)P(T1)/P(R1).
Next, we note that P(R1|T1) = 1 − P(R0|T1) = 1 − δ. By the law of total probability,
    P(R1) = P(R1|T1)P(T1) + P(R1|T0)P(T0) = (1−δ)(1/2) + ε(1/2) = (1−δ+ε)/2.
So,
    P(T1|R1) = (1−δ)(1/2) / [(1−δ+ε)/2] = (1−δ)/(1−δ+ε).

57. Let H denote the event that a student does the homework, and let E denote the event that a student passes the exam. Then the problem statement tells us that
    P(E|H) = .8, P(E|Hᶜ) = .1, and P(H) = .6.
We need to compute P(E) and P(H|E). To begin, write
    P(E) = P(E|H)P(H) + P(E|Hᶜ)P(Hᶜ) = (.8)(.6) + (.1)(1−.6) = .48 + .04 = .52.
Next,
    P(H|E) = P(E|H)P(H)/P(E) = .48/.52 = 12/13.

58. The problem statement tells us that
    P(AF|CF) = 1/3, P(AF|CFᶜ) = 1/10, and P(CF) = 1/4.
We must compute
    P(CF|AF) = P(AF|CF)P(CF)/P(AF) = (1/3)(1/4)/P(AF) = (1/12)/P(AF).
To compute the denominator, write
    P(AF) = P(AF|CF)P(CF) + P(AF|CFᶜ)P(CFᶜ) = (1/3)(1/4) + (1/10)(1−1/4) = 1/12 + 3/40 = 19/120.
It then follows that
    P(CF|AF) = (1/12)·(120/19) = 10/19.

59. Let F denote the event that a patient receives a flu shot. Let S, M, and R denote the events that Sue, Minnie, or Robin sees the patient. The problem tells us that
    P(S) = .2, P(M) = .4, P(R) = .4, P(F|S) = .6, P(F|M) = .3, and P(F|R) = .1.
We must compute
    P(S|F) = P(F|S)P(S)/P(F) = (.6)(.2)/P(F) = .12/P(F).


Next,
    P(F) = P(F|S)P(S) + P(F|M)P(M) + P(F|R)P(R) = (.6)(.2) + (.3)(.4) + (.1)(.4) = .12 + .12 + .04 = 0.28.
Thus,
    P(S|F) = (12/100)·(100/28) = 3/7.

60. (a) Let Ω = {1,2,3,4,5} with P(A) := |A|/|Ω|. Without loss of generality, let 1 and 2 correspond to the two defective chips. Then D := {1,2} is the event that a defective chip is tested. Hence, P(D) = |D|/5 = 2/5.

(b) Your friend's information tells you that of the three chips you may test, one is defective and two are not. Hence, the conditional probability that the chip you test is defective is 1/3.

(c) Yes, your intuition is correct. To prove this, we construct a sample space and probability measure and compute the desired conditional probability. Let
    Ω := {(i, j,k) : i < j and k ≠ i, k ≠ j},
where i, j,k ∈ {1,2,3,4,5}. Here i and j are the chips taken by the friend, and k is the chip that you test. We again take 1 and 2 to be the defective chips. The 10 possibilities for i and j are

    12 13 14 15
    23 24 25
    34 35
    45

For each pair in the above table, there are three possible values of k:

    345 245 235 234
    145 135 134
    125 124
    123

Hence, there are 30 triples in Ω. For the probability measure we take P(A) := |A|/|Ω|. Now let Fi j denote the event that the friend takes chips i and j with i < j. For example, if the friend takes chips 1 and 2, then from the second table, k has to be 3 or 4 or 5; i.e.,
    F12 = {(1,2,3), (1,2,4), (1,2,5)}.
The event that the friend takes two chips is then
    T := F12 ∪ F13 ∪ F14 ∪ F15 ∪ F23 ∪ F24 ∪ F25 ∪ F34 ∪ F35 ∪ F45.
Now the event that you test a defective chip is
    D := {(i, j,k) : k = 1 or 2 and i < j with i, j ≠ k}.


We can now compute
    P(D|T) = P(D∩T)/P(T).
Since the Fi j that make up T are disjoint, |T| = 10·3 = 30 and P(T) = |T|/|Ω| = 1. We next observe that
    D∩T = ∅ ∪ [D∩F13] ∪ [D∩F14] ∪ [D∩F15] ∪ [D∩F23] ∪ [D∩F24] ∪ [D∩F25] ∪ [D∩F34] ∪ [D∩F35] ∪ [D∩F45].
Of the above intersections, the first six are singleton sets, and the last three are pairs. Hence, |D∩T| = 6·1 + 3·2 = 12, and so P(D∩T) = 12/30 = 2/5. We conclude that P(D|T) = P(D∩T)/P(T) = (2/5)/1 = 2/5, which is the answer in part (a).

Remark. The model in part (c) can be used to solve part (b) by observing that the probability in part (b) is
    P(D|F12 ∪ F13 ∪ F14 ∪ F15 ∪ F23 ∪ F24 ∪ F25),
which can be similarly evaluated.

61. (a) If two sets A and B are disjoint, then by definition, A∩B= ∅.

(b) If two events A and B are independent, then by definition, P(A∩B) = P(A)P(B).

(c) If two events A and B are disjoint, then P(A∩B) = P(∅) = 0. In order for

them to be independent, we must have P(A)P(B) = 0; i.e., at least one of the

two events must have zero probability. If two disjoint events both have positive

probability, then they cannot be independent.

62. Let W denote the event that the decoder outputs the wrong message. Of course, Wᶜ is the event that the decoder outputs the correct message. We must find P(W) = 1 − P(Wᶜ). Now, Wᶜ occurs if only the first bit is flipped, or only the second bit is flipped, or only the third bit is flipped, or if no bits are flipped. Denote these disjoint events by F100, F010, F001, and F000, respectively. Then
    P(Wᶜ) = P(F100 ∪ F010 ∪ F001 ∪ F000)
          = P(F100) + P(F010) + P(F001) + P(F000)
          = p(1−p)^2 + (1−p)p(1−p) + (1−p)^2 p + (1−p)^3
          = 3p(1−p)^2 + (1−p)^3.
Hence,
    P(W) = 1 − 3p(1−p)^2 − (1−p)^3 = 3p^2 − 2p^3.
If p = 0.1, then P(W) = 0.03 − 0.002 = 0.028.
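A minimal MATLAB simulation of this three-bit repetition code (majority vote, with independent bit flips of probability p assumed, as in the problem) to check P(W) = 3p^2 − 2p^3:

% Estimate the probability that majority decoding of a 3-bit
% repetition code fails, and compare with 3p^2 - 2p^3.
p = 0.1; n = 100000;                 % flip probability, number of trials
flips = rand(n,3) < p;               % 1 where a bit is flipped
wrong = sum(flips,2) >= 2;           % decoding fails if 2 or 3 flips
fprintf('empirical %.4f, exact %.4f\n', mean(wrong), 3*p^2 - 2*p^3)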

63. Let Ai denote the event that your phone selects channel i, i = 1, . . . ,10, and let Bi denote the event that your neighbor's phone selects channel i. Let P(Ai) = P(Bi) = 1/10, and assume the Ai and Bi are independent. Then the probability that both phones select the same channel is
    P(⋃_{i=1}^{10} [Ai ∩ Bi]) = ∑_{i=1}^{10} P(Ai ∩ Bi) = ∑_{i=1}^{10} P(Ai)P(Bi) = ∑_{i=1}^{10} (1/10)·(1/10) = 0.1.

64. Let L denote the event that the left airbag works properly, and let R denote the event that the right airbag works properly. Assume L and R are independent with P(Lᶜ) = P(Rᶜ) = p. The probability that at least one airbag works properly is
    P(L∪R) = 1 − P(Lᶜ∩Rᶜ) = 1 − P(Lᶜ)P(Rᶜ) = 1 − p^2.

65. The probability that the dart never lands within 2 cm of the center is
    P(⋂_{n=1}^∞ Aₙᶜ) = lim_{N→∞} P(⋂_{n=1}^N Aₙᶜ), limit property of P,
                     = lim_{N→∞} ∏_{n=1}^N P(Aₙᶜ), independence,
                     = lim_{N→∞} ∏_{n=1}^N (1−p)
                     = lim_{N→∞} (1−p)^N = 0.

66. Let Wi denote the event that you win on your ith play of the lottery. The probability that you win at least once in n plays is
    P(⋃_{i=1}^n Wi) = 1 − P(⋂_{i=1}^n Wᵢᶜ) = 1 − ∏_{i=1}^n P(Wᵢᶜ), by independence,
                    = 1 − ∏_{i=1}^n (1−p) = 1 − (1−p)^n.
We need to choose n so that 1 − (1−p)^n > 1/2, which happens if and only if
    1/2 > (1−p)^n, or −ln 2 > n ln(1−p), or −ln 2/ln(1−p) < n,
where the last step uses the fact that ln(1−p) is negative. For p = 10^{−6}, we need n > 693147.
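The threshold on n is easy to evaluate in MATLAB; this sketch just restates the formula above:

% Smallest n with 1 - (1-p)^n > 1/2, i.e., n > -ln(2)/ln(1-p).
p = 1e-6;
fprintf('need n > %.2f, so n = %d plays\n', ...
        -log(2)/log(1-p), ceil(-log(2)/log(1-p)))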

67. Let A denote the event that Anne catches no fish, and let B denote the event that Betty catches no fish. Assume A and B are independent with P(A) = P(B) = p. We must compute
    P(A|A∪B) = P(A∩[A∪B])/P(A∪B) = P(A)/P(A∪B),
where the last step uses the fact that A ⊂ A∪B. To compute the denominator, write
    P(A∪B) = 1 − P(Aᶜ∩Bᶜ) = 1 − P(Aᶜ)P(Bᶜ) = 1 − (1−p)^2 = 2p − p^2 = p(2−p).
Then
    P(A|A∪B) = p/(p(2−p)) = 1/(2−p).

68. We show that A and B\C are independent as follows. First, since C ⊂ B,

P(B) = P(C)+P(B\C).

Next, since A and B are independent and since A and C are independent,

P(A∩B) = P(A)P(B) = P(A)[P(C)+P(B\C)] = P(A∩C)+P(A)P(B\C).

Again using the fact that C ⊂ B, we now write

P(A∩B) = P(A∩ [C∪B\C]) = P(A∩C)+P(A∩B\C).

It follows that P(A∩B\C) = P(A)P(B\C), which establishes the claimed independence.

69. We show that A, B, and C are mutually independent. To begin, note that P(A) =P(B) = P(C) = 1/2. Next, we need to identify the events

A∩B = [0,1/4)

A∩C = [0,1/8)∪ [1/4,3/8)

B∩C = [0,1/8)∪ [1/2,5/8)

A∩B∩C = [0,1/8)

so that we can compute

P(A∩B) = P(A∩C) = P(B∩C) = 1/4 and P(A∩B∩C) = 1/8.

We find that

P(A∩B) = P(A)P(B), P(A∩C) = P(A)P(C), P(B∩C) = P(B)P(C),

and

P(A∩B∩C) = P(A)P(B)P(C).

70. From a previous problem we have that P(A∩C|B) = P(A|B∩C)P(C|B). Hence,

P(A∩C|B) = P(A|B)P(C|B) if and only if P(A|B∩C) = P(A|B).

71. We show that the probability of the complementary event is zero. By the union bound,
    P(⋃_{n=1}^∞ ⋂_{k=n}^∞ Bₖᶜ) ≤ ∑_{n=1}^∞ P(⋂_{k=n}^∞ Bₖᶜ).
We show that every term on the right is zero. Write
    P(⋂_{k=n}^∞ Bₖᶜ) = lim_{N→∞} P(⋂_{k=n}^N Bₖᶜ), limit property of P,
                     = lim_{N→∞} ∏_{k=n}^N P(Bₖᶜ), independence,
                     = lim_{N→∞} ∏_{k=n}^N [1 − P(Bk)]
                     ≤ lim_{N→∞} ∏_{k=n}^N exp[−P(Bk)], the hint,
                     = lim_{N→∞} exp[−∑_{k=n}^N P(Bk)]
                     = exp[−lim_{N→∞} ∑_{k=n}^N P(Bk)], since exp is continuous,
                     = exp[−∑_{k=n}^∞ P(Bk)] = e^{−∞} = 0,
where the second-to-last step uses the fact that ∑_{k=1}^∞ P(Bk) = ∞ ⇒ ∑_{k=n}^∞ P(Bk) = ∞.

72. There are 3·5·7 = 105 possible systems.

73. There are 2^n n-bit numbers.

74. There are 100! different orderings of the 100 message packets. In order that the first header packet to be received is the 10th packet to arrive, the first 9 packets to be received must come from the 96 data packets, the 10th packet must come from the 4 header packets, and the remaining 90 packets can be in any order. More specifically, there are 96 possibilities for the first packet, 95 for the second, . . . , 88 for the ninth, 4 for the tenth, and 90! for the remaining 90 packets. Hence, the desired probability is
    (96···88 · 4 · 90!)/100! = (96···88 · 4)/(100···91) = (90·89·88·4)/(100·99·98·97) = 0.02996.

75. C(5,2) = 10 pictures are needed.

76. Suppose the player chooses distinct digits wxyz. The player wins if any of the 4! = 24 permutations of wxyz occurs. Since each permutation has probability 1/10000 of occurring, the probability of winning is 24/10000 = 0.0024.

77. There are C(8,3) = 56 8-bit words with 3 ones (and 5 zeros).

78. The probability that a random byte has 4 ones and 4 zeros is C(8,4)/2^8 = 70/256 = 0.2734.

79. In the first case, since the prizes are different, order is important. Hence, there are 41·40·39 = 63960 outcomes. In the second case, since the prizes are the same, order is not important. Hence, there are C(41,3) = 10660 outcomes.

80. There are C(52,14) possible hands. Since the deck contains 13 spades, 13 hearts, 13 diamonds, and 13 clubs, there are C(13,2)C(13,3)C(13,4)C(13,5) hands with 2 spades, 3 hearts, 4 diamonds, and 5 clubs. The probability of such a hand is
    C(13,2)C(13,3)C(13,4)C(13,5) / C(52,14) = 0.0116.

81. All five cards are of the same suit if and only if they are all spades or all hearts or all diamonds or all clubs. These are four disjoint events. Hence, the answer is four times the probability of getting all spades:
    4·C(13,5)/C(52,5) = 4·1287/2598960 = 0.00198.
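These binomial coefficients are easy to check in MATLAB with nchoosek:

% Probability that all five cards come from a single suit.
p = 4*nchoosek(13,5)/nchoosek(52,5);
fprintf('P(all one suit) = %.5f\n', p)   % about 0.00198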

82. There are C(n; k1, . . . ,km) such partitions (a multinomial coefficient).

83. The general result is
    C(n; k0, . . . ,k_{m−1}) / m^n.
When n = 4 and m = 10 and a player chooses xxyz, we compute C(4; 2,1,1)/10000 = 0.0012. For xxyy, we compute C(4; 2,2)/10000 = 0.0006. For xxxy, we compute C(4; 3,1)/10000 = 0.0004.

84. Two apples and three carrots corresponds to (0,0,1,1,0,0,0). Five apples corre-

sponds to (0,0,0,0,0,1,1).


CHAPTER 2

Problem Solutions

1. (a) {ω : X(ω) ≤ 3} = {1,2,3}.
(b) {ω : X(ω) > 4} = {5,6}.
(c) P(X ≤ 3) = P(X = 1) + P(X = 2) + P(X = 3) = 3(2/15) = 2/5, and
    P(X > 4) = P(X = 5) + P(X = 6) = 2/15 + 1/3 = 7/15.

2. (a) {ω : X(ω) = 2} = {1,2,3,4}.
(b) {ω : X(ω) = 1} = {41,42, . . . ,52}.
(c) P(X = 1 or X = 2) = P({1,2,3,4} ∪ {41,42, . . . ,52}). Since these are disjoint events, the probability of their union is 4/52 + 12/52 = 16/52 = 4/13.

3. (a) {ω ∈ [0,∞) : X(ω) ≤ 1} = [0,1].
(b) {ω ∈ [0,∞) : X(ω) ≤ 3} = [0,3].
(c) P(X ≤ 1) = ∫_0^1 e^{−ω} dω = 1 − e^{−1}. P(X ≤ 3) = 1 − e^{−3}, and P(1 < X ≤ 3) = P(X ≤ 3) − P(X ≤ 1) = e^{−1} − e^{−3}.

4. First, since X⁻¹(∅) = ∅, µ(∅) = P(X⁻¹(∅)) = P(∅) = 0. Second, µ(B) = P(X⁻¹(B)) ≥ 0. Third, for disjoint Bn,
    µ(⋃_{n=1}^∞ Bn) = P(X⁻¹(⋃_{n=1}^∞ Bn)) = P(⋃_{n=1}^∞ X⁻¹(Bn)) = ∑_{n=1}^∞ P(X⁻¹(Bn)) = ∑_{n=1}^∞ µ(Bn).
Fourth, µ(IR) = P(X⁻¹(IR)) = P(Ω) = 1.

5. Since
    P(Y > n−1) = ∑_{k=n}^∞ P(Y = k) = P(Y = n) + ∑_{k=n+1}^∞ P(Y = k) = P(Y = n) + P(Y > n),
it follows that P(Y = n) = P(Y > n−1) − P(Y > n).

6. P(Y = 0) = P({TTT,THH,HTH,HHT}) = 4/8 = 1/2, and P(Y = 1) = P({TTH,THT,HTT,HHH}) = 4/8 = 1/2.

7. P(X = 1) = P(X = 2) = P(X = 3) = P(X = 4) = P(X = 5) = 2/15 and P(X = 6) = 1/3.

8. P(X = 2) = P({1,2,3,4}) = 4/52 = 1/13. P(X = 1) = P({41,42, . . . ,52}) = 12/52 = 3/13. P(X = 0) = 1 − P(X = 2) − P(X = 1) = 9/13.

9. The possible values of X are 0,1,4,9,16. We have P(X = 0) = P({0}) = 1/7, P(X = 1) = P({−1,1}) = 2/7, P(X = 4) = P({−2,2}) = 2/7, P(X = 9) = P({3}) = 1/7, and P(X = 16) = P({4}) = 1/7.



10. We have
    P(X > 1) = 1 − P(X ≤ 1) = 1 − [P(X = 0) + P(X = 1)] = 1 − [e^{−λ} + λe^{−λ}] = 1 − e^{−λ}(1+λ).
When λ = 1, P(X > 1) = 1 − e^{−1}(2) = 1 − 2/e = 0.264.

11. The probability that the sensor fails to activate is
    P(X < 4) = P(X ≤ 3) = P(X = 0) + ··· + P(X = 3) = e^{−λ}(1 + λ + λ^2/2! + λ^3/3!).
If λ = 2, P(X < 4) = e^{−2}(1 + 2 + 2 + 4/3) = e^{−2}(19/3) = 0.857. The probability that the sensor activates is 1 − P(X < 4) = 0.143.

12. Let Xk = 1 correspond to the event that the kth student gets an A. This event has probability P(Xk = 1) = p. Now, the event that only the kth student gets an A is
    {Xk = 1 and Xl = 0 for l ≠ k}.
Hence, the probability that exactly one student gets an A is
    P(⋃_{k=1}^{15} {Xk = 1 and Xl = 0 for l ≠ k}) = ∑_{k=1}^{15} P(Xk = 1 and Xl = 0 for l ≠ k)
                                                 = ∑_{k=1}^{15} p(1−p)^{14} = 15(.1)(.9)^{14} = 0.3432.

13. Let X1,X2,X3 be the random digits of the drawing. Then P(Xi = k) = 1/10 for k = 0, . . . ,9 since each digit has probability 1/10 of being chosen. Then if the player chooses d1d2d3, the probability of winning is
    P({X1 = d1,X2 = d2,X3 = d3} ∪ {X1 = d1,X2 = d2,X3 ≠ d3} ∪ {X1 = d1,X2 ≠ d2,X3 = d3} ∪ {X1 ≠ d1,X2 = d2,X3 = d3}),
which is equal to .1^3 + 3[.1^2(.9)] = 0.028 since the union is disjoint and since X1,X2,X3 are independent.

14.
    P(⋃_{k=1}^m {Xk < 2}) = 1 − P(⋂_{k=1}^m {Xk ≥ 2}) = 1 − ∏_{k=1}^m P(Xk ≥ 2) = 1 − ∏_{k=1}^m [1 − P(Xk ≤ 1)]
                          = 1 − [1 − (e^{−λ} + λe^{−λ})]^m = 1 − [1 − e^{−λ}(1+λ)]^m.


15. (a) P(⋃_{i=1}^n {Xi ≥ 2}) = 1 − P(⋂_{i=1}^n {Xi ≤ 1}) = 1 − ∏_{i=1}^n P(Xi ≤ 1)
        = 1 − ∏_{i=1}^n [P(Xi = 0) + P(Xi = 1)] = 1 − [e^{−λ} + λe^{−λ}]^n = 1 − e^{−nλ}(1+λ)^n.

(b) P(⋂_{i=1}^n {Xi ≥ 1}) = ∏_{i=1}^n P(Xi ≥ 1) = ∏_{i=1}^n [1 − P(Xi = 0)] = (1 − e^{−λ})^n.

(c) P(⋂_{i=1}^n {Xi = 1}) = ∏_{i=1}^n P(Xi = 1) = (λe^{−λ})^n = λ^n e^{−nλ}.

16. For the geometric0 pmf, write
    ∑_{k=0}^∞ (1−p)p^k = (1−p) ∑_{k=0}^∞ p^k = (1−p)·1/(1−p) = 1.
For the geometric1 pmf, write
    ∑_{k=1}^∞ (1−p)p^{k−1} = (1−p) ∑_{k=1}^∞ p^{k−1} = (1−p) ∑_{n=0}^∞ p^n = (1−p)·1/(1−p) = 1.

17. Let Xi be the price of stock i, which is a geometric0(p) random variable. Then
    P(⋃_{i=1}^{29} {Xi > 10}) = 1 − P(⋂_{i=1}^{29} {Xi ≤ 10}) = 1 − ∏_{i=1}^{29} [(1−p)(1 + p + ··· + p^{10})]
                              = 1 − ((1−p)·(1−p^{11})/(1−p))^{29} = 1 − (1−p^{11})^{29}.
Substituting p = .7, we have 1 − (1 − .7^{11})^{29} = 1 − (.98)^{29} = 1 − .560 = 0.44.

18. For the first problem, we have
    P(min(X1, . . . ,Xn) > ℓ) = P(⋂_{k=1}^n {Xk > ℓ}) = ∏_{k=1}^n P(Xk > ℓ) = ∏_{k=1}^n p^ℓ = p^{nℓ}.
Similarly,
    P(max(X1, . . . ,Xn) ≤ ℓ) = P(⋂_{k=1}^n {Xk ≤ ℓ}) = ∏_{k=1}^n P(Xk ≤ ℓ) = ∏_{k=1}^n (1 − p^ℓ) = (1 − p^ℓ)^n.

19. Let Xk denote the number of coins in the pocket of the kth student. Then the Xk are independent and uniformly distributed from 0 to 20; i.e., P(Xk = i) = 1/21 for i = 0, . . . ,20.


(a) P(⋂_{k=1}^{25} {Xk ≥ 5}) = ∏_{k=1}^{25} 16/21 = (16/21)^{25} = 1.12×10^{−3}.

(b) P(⋃_{k=1}^{25} {Xk ≥ 19}) = 1 − P(⋂_{k=1}^{25} {Xk ≤ 18}) = 1 − (1 − 2/21)^{25} = 0.918.

(c) The probability that only student k has 19 coins in his or her pocket is
    P({Xk = 19} ∩ ⋂_{l≠k} {Xl ≠ 19}) = (1/21)(20/21)^{24} = 0.01477.
Hence, the probability that exactly one student has 19 coins is
    P(⋃_{k=1}^{25} [{Xk = 19} ∩ ⋂_{l≠k} {Xl ≠ 19}]) = 25(0.01477) = 0.369.

20. Let Xi = 1 if block i is good. Then P(Xi = 1) = p and
    P(Y = k) = P({X1 = 1} ∩ ··· ∩ {X_{k−1} = 1} ∩ {Xk = 0}) = p^{k−1}(1−p), k = 1,2, . . . .
Hence, Y ∼ geometric1(p).

21. (a) Write
    P(X > n) = ∑_{k=n+1}^∞ (1−p)p^{k−1} = ∑_{ℓ=0}^∞ (1−p)p^{ℓ+n} = (1−p)p^n ∑_{ℓ=0}^∞ p^ℓ = (1−p)p^n·1/(1−p) = p^n.

(b) Write
    P(X > n+k | X > n) = P(X > n+k, X > n)/P(X > n) = P(X > n+k)/P(X > n) = p^{n+k}/p^n = p^k.

22. Since P(Y > k) = P(Y > n+k | Y > n), we can write
    P(Y > k) = P(Y > n+k | Y > n) = P(Y > n+k, Y > n)/P(Y > n) = P(Y > n+k)/P(Y > n).
Let p := P(Y > 1). Taking k = 1 above yields P(Y > n+1) = P(Y > n)p. Then with n = 1 we have P(Y > 2) = P(Y > 1)p = p^2. With n = 2 we have P(Y > 3) = P(Y > 2)P(Y > 1) = p^3. In general then P(Y > n) = p^n. Finally, P(Y = n) = P(Y > n−1) − P(Y > n) = p^{n−1} − p^n = p^{n−1}(1−p), which is the geometric1(p) pmf.

23. (a) To compute pX(i), we sum row i of the matrix. This yields pX(1) = pX(3) = 1/4 and pX(2) = 1/2. To compute pY(j), we sum column j to get pY(1) = pY(3) = 1/4 and pY(2) = 1/2.


(b) To compute P(X < Y ), we sum pXY (i, j) over i and j such that i< j. We have

P(X < Y ) = pXY (1,2)+ pXY (1,3)+ pXY (2,3) = 0+1/8+0 = 1/8.

(c) We claim that X and Y are not independent. For example, pXY (1,2) = 0 is not

equal to pX (1)pY (2) = 1/8.

24. (a) To compute pX (i), we sum row i of the matrix. This yields pX (1) = pX (3) = 1/4and pX (2) = 1/2. To compute pY ( j), we sum column j to get pY (1) = pY (3) =1/6 and pY (2) = 2/3.

(b) To compute P(X < Y ), we sum pXY (i, j) over i and j such that i< j. We have

P(X <Y ) = pXY (1,2)+ pXY (1,3)+ pXY (2,3) = 1/6+1/24+1/12 = 7/24.

(c) Using the results of part (a), it is easy to verify that pX (i)pY ( j) = pXY (i, j) fori, j = 1,2,3. Hence, X and Y are independent.

25. To compute the marginal of X, write
    pX(1) = (e^{−3}/3) ∑_{j=0}^∞ 3^j/j! = (e^{−3}/3)e^3 = 1/3.
Similarly,
    pX(2) = (4e^{−6}/6) ∑_{j=0}^∞ 6^j/j! = (4e^{−6}/6)e^6 = 2/3.
Alternatively, pX(2) = 1 − pX(1) = 1 − 1/3 = 2/3. Of course pX(i) = 0 for i ≠ 1,2. We clearly have pY(j) = 0 for j < 0 and
    pY(j) = 3^{j−1}e^{−3}/j! + 4·6^{j−1}e^{−6}/j!, j ≥ 0.
Since pX(1)pY(j) ≠ pXY(1, j), X and Y are not independent.

26. (a) For k ≥ 1,
    pX(k) = ∑_{n=0}^∞ (1−p)p^{k−1} k^n e^{−k}/n! = (1−p)p^{k−1}e^{−k} ∑_{n=0}^∞ k^n/n! = (1−p)p^{k−1}e^{−k}e^k = (1−p)p^{k−1},
which we recognize as the geometric1(p) pmf.

(b) Next,
    pY(0) = ∑_{k=1}^∞ (1−p)p^{k−1}e^{−k} = ((1−p)/e) ∑_{k=1}^∞ (p/e)^{k−1} = ((1−p)/e) ∑_{m=0}^∞ (p/e)^m = ((1−p)/e)·1/(1−p/e) = (1−p)/(e−p).


(c) Since pX(1)pY(0) = (1−p)^2/(e−p) is not equal to pXY(1,0) = (1−p)/e, X and Y are not independent.

27. MATLAB. Here is a script:

p = ones(1,51)/51;

k=[0:50];

i = find(g(k) >= -16);

fprintf('The answer is %g\n',sum(p(i)))

where

function y = g(x)

y = 5*x.*(x-10).*(x-20).*(x-30).*(x-40).*(x-50)/1e6;

28. MATLAB. If you modified your program for the preceding problem only by the way you compute P(X = k), then you may get only 0.5001 = P(g(X) ≥ −16 and X ≤ 50). Note that g(x) > 0 for x > 50. Hence, you also have to add P(X ≥ 51) = 0.0731 to 0.5001 to get 0.5732.

29. MATLAB. OMITTED.

30. MATLAB. OMITTED.

31. MATLAB. OMITTED.

32. E[X] = 2(1/3) + 5(2/3) = 12/3 = 4.

33. E[I_(2,6)(X)] = ∑_{k=3}^5 P(X = k) = (1−p)[p^3 + p^4 + p^5]. For p = 1/2, we get E[I_(2,6)(X)] = 7/64 = 0.109375.

34. Write
    E[1/(X+1)] = ∑_{n=0}^∞ (1/(n+1))·λ^n e^{−λ}/n! = (e^{−λ}/λ) ∑_{n=0}^∞ λ^{n+1}/(n+1)! = (e^{−λ}/λ) ∑_{n=1}^∞ λ^n/n!
               = (e^{−λ}/λ)[∑_{n=0}^∞ λ^n/n! − 1] = (e^{−λ}/λ)[e^λ − 1] = (1 − e^{−λ})/λ.

35. Since var(X) = E[X^2] − (E[X])^2, E[X^2] = var(X) + (E[X])^2. Hence, E[X^2] = 7 + 2^2 = 7 + 4 = 11.

36. Since Y = cX, E[Y] = E[cX] = cm. Hence,
    var(Y) = E[(Y − cm)^2] = E[(cX − cm)^2] = E[c^2(X−m)^2] = c^2 E[(X−m)^2] = c^2 var(X) = c^2 σ^2.

37. We begin with
    E[(X+Y)^3] = E[X^3 + 3X^2 Y + 3XY^2 + Y^3]
               = E[X^3] + 3E[X^2 Y] + 3E[XY^2] + E[Y^3]
               = E[X^3] + 3E[X^2]E[Y] + 3E[X]E[Y^2] + E[Y^3], by independence.
Now, as noted in the text, for a Bernoulli(p) random variable, X^n = X, and so E[X^n] = E[X] = p. Similarly E[Y^n] = q. Thus,
    E[(X+Y)^3] = p + 3pq + 3pq + q = p + 6pq + q.

38. The straightforward approach is to put
    f(c) := E[(X−c)^2] = E[X^2] − 2mc + c^2
and differentiate with respect to c to get f′(c) = −2m + 2c. Solving f′(c) = 0 results in c = m. An alternative approach is to write
    E[(X−c)^2] = E[((X−m) + (m−c))^2] = σ^2 + (m−c)^2.
From this expression, it is obvious that c = m minimizes the expectation.

39. The two sketches, showing the indicator I_[a,∞)(x) bounded above by the curves used in the Markov-inequality argument, did not survive the transcript extraction and are omitted here.

40. The right-hand side is easy: E[X]/2 = (3/4)/2 = 3/8 = 0.375. The left-hand side is more work:
    P(X ≥ 2) = 1 − P(X ≤ 1) = 1 − [P(X = 0) + P(X = 1)] = 1 − e^{−λ}(1+λ).
For λ = 3/4, P(X ≥ 2) = 0.1734. So the bound is a little more than twice the value of the probability.

41. The Chebyshev bound is (λ + λ^2)/4. For λ = 3/4, the bound is 0.3281, which is a little better than the Markov inequality bound in the preceding problem. The true probability is 0.1734.

42. Comparing the definitions of ρXY and cov(X,Y), we find ρXY = cov(X,Y)/(σXσY). Hence, cov(X,Y) = σXσYρXY. Since cov(X,Y) := E[(X−mX)(Y−mY)], if Y = X, we see that cov(X,X) = E[(X−mX)^2] =: var(X).

43. Put
    f(a) := E[(X−aY)^2] = E[X^2] − 2aE[XY] + a^2 E[Y^2] = σX^2 − 2aρσXσY + a^2 σY^2.
Then
    f′(a) = −2ρσXσY + 2aσY^2.
Setting this equal to zero and solving for a yields a = ρ(σX/σY).


44. Since P(X = ±1) = P(X = ±2) = 1/4, E[X] = 0. Similarly, since P(XY = ±1) = 1/4 and P(XY = ±4) = 1/4, E[XY] = 0. Thus, E[XY] = 0 = E[X]E[Y], and we see that X and Y are uncorrelated. Next, since X = 1 implies Y = 1, P(X = 1, Y = 1) = P(X = 1) = 1/4, while P(Y = 1) = P(X = 1 or X = −1) = 1/2. Thus,
    P(X = 1)P(Y = 1) = (1/4)(1/2) = 1/8, but P(X = 1, Y = 1) = 1/4.

45. As discussed in the text, for uncorrelated random variables, the variance of the sum is the sum of the variances. Since independent random variables are uncorrelated, the same result holds for them too. Hence, for Y = X1 + ··· + XM,
    var(Y) = ∑_{k=1}^M var(Xk).
We also have E[Y] = ∑_{k=1}^M E[Xk]. Next, since the Xk are i.i.d. geometric1(p), E[Xk] = 1/(1−p) and var(Xk) = p/(1−p)^2. It follows that var(Y) = Mp/(1−p)^2 and E[Y] = M/(1−p). We conclude by writing
    E[Y^2] = var(Y) + (E[Y])^2 = Mp/(1−p)^2 + M^2/(1−p)^2 = M(p+M)/(1−p)^2.

46. From E[Y] = E[dX − s(1−X)] = dp − s(1−p) = 0, we find that d/s = (1−p)/p.

47. (a) p = 1/1000.
(b) Since (1−p)/p = (999/1000)/(1/1000) = 999, the fair odds against are 999 : 1.
(c) Since the fair odds of 999 : 1 are not equal to the offered odds of 500 : 1, the game is not fair. To make the game fair, the lottery should pay $900 instead of $500.

48. First note that
    ∫_1^∞ 1/t^p dt = (1/(1−p))·t^{1−p} |_1^∞ for p ≠ 1, and ln t |_1^∞ for p = 1.
For p > 1, the integral is equal to 1/(p−1). For p ≤ 1, the integral is infinite.

For 0 < p ≤ 1, write
    ∑_{k=1}^∞ 1/k^p ≥ ∑_{k=1}^∞ ∫_k^{k+1} 1/t^p dt = ∫_1^∞ 1/t^p dt = ∞.
For p > 1, it suffices to show that ∑_{k=2}^∞ 1/k^p < ∞. To this end, write
    ∑_{k=2}^∞ 1/k^p = ∑_{k=1}^∞ 1/(k+1)^p ≤ ∑_{k=1}^∞ ∫_k^{k+1} 1/t^p dt = ∫_1^∞ 1/t^p dt < ∞.

49. First write
    E[X^n] = ∑_{k=1}^∞ k^n C_p^{−1} k^{−p} = C_p^{−1} ∑_{k=1}^∞ 1/k^{p−n}.


By the preceding problem this last sum is finite for p−n > 1, or equivalently, n < p−1. Otherwise the sum is infinite; the case 1 ≥ p−n > 0 is handled by the preceding problem, and the case 0 ≥ p−n is obvious.

50. If all outcomes are equally likely,
    H(X) = ∑_{i=1}^n p_i log(1/p_i) = (1/n) ∑_{i=1}^n log n = log n.
If X is a constant random variable with p_i = 0 for i ≠ j, then
    H(X) = ∑_{i=1}^n p_i log(1/p_i) = p_j log(1/p_j) = 1·log 1 = 0.

51. Let P(X = x_i) = p_i for i = 1, . . . ,n. Then
    E[g(X)] = ∑_{i=1}^n g(x_i)p_i and g(E[X]) = g(∑_{i=1}^n x_i p_i).
For n = 2, Jensen's inequality says that
    p1 g(x1) + p2 g(x2) ≥ g(p1 x1 + p2 x2).
If we put λ = p1, then 1−λ = p2 and the above inequality becomes
    λ g(x1) + (1−λ) g(x2) ≥ g(λ x1 + (1−λ) x2),
which is just the definition of a convex function. Hence, if g is convex, Jensen's inequality holds for n = 2. Now suppose Jensen's inequality holds for some n ≥ 2. We must show it holds for n+1. The case of n is
    ∑_{i=1}^n g(x_i)p_i ≥ g(∑_{i=1}^n x_i p_i), if p1 + ··· + pn = 1.
Now suppose that p1 + ··· + p_{n+1} = 1, and write
    ∑_{i=1}^{n+1} g(x_i)p_i = (1 − p_{n+1})[∑_{i=1}^n g(x_i) p_i/(1 − p_{n+1})] + p_{n+1} g(x_{n+1}).
Let us focus on the quantity in brackets. Since
    ∑_{i=1}^n p_i/(1 − p_{n+1}) = (p1 + ··· + pn)/(1 − p_{n+1}) = (1 − p_{n+1})/(1 − p_{n+1}) = 1,
Jensen's inequality for n terms yields
    ∑_{i=1}^n g(x_i) p_i/(1 − p_{n+1}) ≥ g(∑_{i=1}^n x_i p_i/(1 − p_{n+1})).
Hence,
    ∑_{i=1}^{n+1} g(x_i)p_i ≥ (1 − p_{n+1}) g(∑_{i=1}^n x_i p_i/(1 − p_{n+1})) + p_{n+1} g(x_{n+1}).
Now apply the two-term Jensen inequality to get
    ∑_{i=1}^{n+1} g(x_i)p_i ≥ g((1 − p_{n+1})[∑_{i=1}^n x_i p_i/(1 − p_{n+1})] + p_{n+1} x_{n+1}) = g(∑_{i=1}^{n+1} p_i x_i).

52. With X = |Z|^α and g(x) = x^{β/α}, we have
    E[g(X)] = E[X^{β/α}] = E[(|Z|^α)^{β/α}] = E[|Z|^β] and E[X] = E[|Z|^α].
Then Jensen's inequality tells us that
    E[|Z|^β] ≥ (E[|Z|^α])^{β/α}.
Raising both sides to the 1/β power yields Lyapunov's inequality.

53. (a) For all discrete random variables, we have ∑_i P(X = x_i) = 1. For a nonnegative random variable, if x_k < 0, we have
    1 = ∑_i P(X = x_i) ≥ ∑_i I_[0,∞)(x_i)P(X = x_i) + P(X = x_k) = 1 + P(X = x_k).
From this it follows that 0 ≥ P(X = x_k) ≥ 0, and so P(X = x_k) = 0.

(b) Write
    E[X] = ∑_i x_i P(X = x_i) = ∑_{i : x_i ≥ 0} x_i P(X = x_i) + ∑_{k : x_k < 0} x_k P(X = x_k).
By part (a), the last sum is zero. The remaining sum is obviously nonnegative.

(c) By part (b), 0 ≤ E[X−Y] = E[X] − E[Y]. Hence, E[Y] ≤ E[X].


CHAPTER 3

Problem Solutions

1. First, E[X] = G′X(1) = (1/6 + (4/3)z)|_{z=1} = 1/6 + 4/3 = 9/6 = 3/2. Second, E[X(X−1)] = G″X(1) = 4/3. Third, from E[X(X−1)] = E[X^2] − E[X], we have E[X^2] = 4/3 + 3/2 = 17/6. Finally, var(X) = E[X^2] − (E[X])^2 = 17/6 − 9/4 = (34−27)/12 = 7/12.

2. pX(0) = GX(0) = (1/6 + (1/6)z + (2/3)z^2)|_{z=0} = 1/6, pX(1) = G′X(0) = (1/6 + (4/3)z)|_{z=0} = 1/6, and pX(2) = G″X(0)/2 = (4/3)/2 = 2/3.

3. To begin, note that G′X(z) = 5((2+z)/3)^4/3 and G″X(z) = 20((2+z)/3)^3/9. Hence, E[X] = G′X(1) = 5/3 and E[X(X−1)] = 20/9. Since E[X(X−1)] = E[X^2] − E[X], E[X^2] = 20/9 + 5/3 = 35/9. Finally, var(X) = E[X^2] − (E[X])^2 = 35/9 − 25/9 = 10/9.

4. For X ∼ geometric0(p),
    GX(z) = ∑_{n=0}^∞ z^n P(X = n) = ∑_{n=0}^∞ z^n (1−p)p^n = (1−p) ∑_{n=0}^∞ (zp)^n = (1−p)·1/(1−pz).
Then
    E[X] = G′X(1) = [(1−p)p/(1−pz)^2]|_{z=1} = (1−p)p/(1−p)^2 = p/(1−p).
Next,
    E[X^2] − E[X] = E[X(X−1)] = G″X(1) = 2(1−p)p^2/(1−p)^3 = 2p^2/(1−p)^2.
This implies that
    E[X^2] = 2p^2/(1−p)^2 + p/(1−p) = (2p^2 + p(1−p))/(1−p)^2 = (p+p^2)/(1−p)^2.
Finally,
    var(X) = E[X^2] − (E[X])^2 = (p+p^2)/(1−p)^2 − p^2/(1−p)^2 = p/(1−p)^2.
For X ∼ geometric1(p),
    GX(z) = ∑_{n=1}^∞ z^n P(X = n) = ∑_{n=1}^∞ z^n (1−p)p^{n−1} = ((1−p)/p) ∑_{n=1}^∞ (zp)^n = ((1−p)/p)·pz/(1−pz) = (1−p)z/(1−pz).



Now
    G′X(z) = [(1−pz)(1−p) + (1−p)pz]/(1−pz)^2 = (1−p)/(1−pz)^2.
Hence,
    E[X] = G′X(1) = 1/(1−p).
Next,
    G″X(z) = 2p(1−p)/(1−pz)^3.
We then have
    E[X^2] − E[X] = E[X(X−1)] = G″X(1) = 2p(1−p)/(1−p)^3 = 2p/(1−p)^2,
and
    E[X^2] = 2p/(1−p)^2 + 1/(1−p) = (2p + 1 − p)/(1−p)^2 = (1+p)/(1−p)^2.
Finally,
    var(X) = E[X^2] − (E[X])^2 = (1+p)/(1−p)^2 − 1/(1−p)^2 = p/(1−p)^2.

5. Since the Xi are independent Poisson(λi), we use probability generating functions to find GY(z), which turns out to be Poisson(λ) with λ := λ1 + ··· + λn. It then follows that P(Y = 2) = λ^2 e^{−λ}/2. It remains to write
    GY(z) = E[z^Y] = E[z^{X1+···+Xn}] = E[z^{X1} ··· z^{Xn}] = ∏_{i=1}^n E[z^{Xi}] = ∏_{i=1}^n e^{λi(z−1)} = exp[∑_{i=1}^n λi(z−1)] = e^{λ(z−1)},
which is the Poisson(λ) pgf. Hence, Y ∼ Poisson(λ) as claimed.

6. If GX(z) = ∑_{k=0}^∞ z^k P(X = k), then GX(1) = ∑_{k=0}^∞ P(X = k) = 1. For the particular formula in the problem, we must have 1 = GX(1) = (a0 + a1 + a2 + ··· + an)^m/D, or D = (a0 + a1 + a2 + ··· + an)^m.

7. From the table on the inside of the front cover of the text, E[Xi] = 1/(1−p). Thus,
    E[Y] = ∑_{i=1}^n E[Xi] = n/(1−p).
Second, since the variance of the sum of uncorrelated random variables is the sum of the variances, and since var(Xi) = p/(1−p)^2, we have
    var(Y) = ∑_{i=1}^n var(Xi) = np/(1−p)^2.


It then easily follows that
    E[Y^2] = var(Y) + (E[Y])^2 = np/(1−p)^2 + n^2/(1−p)^2 = n(p+n)/(1−p)^2.
Since Y is the sum of i.i.d. geometric1(p) random variables, the pgf of Y is the product of the individual pgfs. Thus,
    GY(z) = ∏_{i=1}^n (1−p)z/(1−pz) = [(1−p)z/(1−pz)]^n.

8. Starting with GY(z) = [(1−p) + pz]^n, we have
    E[Y] = G′Y(1) = n[(1−p) + pz]^{n−1} p |_{z=1} = np.
Next,
    E[Y(Y−1)] = G″Y(1) = n(n−1)[(1−p) + pz]^{n−2} p^2 |_{z=1} = n(n−1)p^2.
It then follows that
    E[Y^2] = E[Y(Y−1)] + E[Y] = n(n−1)p^2 + np = n^2 p^2 − np^2 + np.
Finally,
    var(Y) = E[Y^2] − (E[Y])^2 = n^2 p^2 − np^2 + np − (np)^2 = np(1−p).

9. For the binomial(n, p) random variable,
    GY(z) = ∑_{k=0}^n P(Y = k)z^k = ∑_{k=0}^n C(n,k) p^k (1−p)^{n−k} z^k.
Hence,
    1 = GY(1) = ∑_{k=0}^n C(n,k) p^k (1−p)^{n−k}.

10. Starting from
    ∑_{k=0}^n C(n,k) p^k (1−p)^{n−k} = 1,
we follow the hint and replace p with a/(a+b). Note that 1−p = b/(a+b). With these substitutions, the above equation becomes
    ∑_{k=0}^n C(n,k) (a/(a+b))^k (b/(a+b))^{n−k} = 1, or ∑_{k=0}^n C(n,k) a^k b^{n−k}/((a+b)^k (a+b)^{n−k}) = 1.
Multiplying through by (a+b)^n we have
    ∑_{k=0}^n C(n,k) a^k b^{n−k} = (a+b)^n.


11. Let Xi = 1 if bit i is in error, Xi = 0 otherwise. Then Yn := X1 + ··· + Xn is the number of errors in n bits. Assume the Xi are independent. Then
    GYn(z) = E[z^{Yn}] = E[z^{X1+···+Xn}] = E[z^{X1} ··· z^{Xn}] = E[z^{X1}] ··· E[z^{Xn}], by independence,
           = [(1−p) + pz]^n, since the Xi are i.i.d. Bernoulli(p).
We recognize this last expression as the binomial(n, p) probability generating function. Thus, P(Yn = k) = C(n,k) p^k (1−p)^{n−k}.

12. Let Xi ∼ binomial(ni, p) denote the number of students in the ith room. Then Y = X1 + ··· + XM is the total number of students in the school. Next,
    GY(z) = E[z^Y] = E[z^{X1+···+XM}] = E[z^{X1}] ··· E[z^{XM}], by independence,
          = ∏_{i=1}^M [(1−p) + pz]^{ni}, since Xi ∼ binomial(ni, p),
          = [(1−p) + pz]^{n1+···+nM}.
Setting n := n1 + ··· + nM, we see that Y ∼ binomial(n, p). Hence, P(Y = k) = C(n,k) p^k (1−p)^{n−k}.

13. Let Y = X1 + ··· + Xn, where the Xi are i.i.d. with P(Xi = 1) = 1−p and P(Xi = 2) = p. Observe that Xi − 1 ∼ Bernoulli(p). Hence,
    Y = n + ∑_{i=1}^n (Xi − 1) =: n + Z.
Since Z is the sum of i.i.d. Bernoulli(p) random variables, Z ∼ binomial(n, p). Hence,
    P(Y = k) = P(n + Z = k) = P(Z = k−n) = C(n, k−n) p^{k−n} (1−p)^{2n−k}, k = n, . . . ,2n.

14. Let Xi be i.i.d. Bernoulli(p), where Xi = 1 means bit i is flipped. Then Y := X1 + ··· + Xn (n = 10) is the number of bits flipped. A codeword cannot be decoded if Y > 2. We need to find P(Y > 2). Observe that
    GY(z) = E[z^Y] = E[z^{X1+···+Xn}] = E[z^{X1} ··· z^{Xn}] = ∏_{i=1}^n E[z^{Xi}] = [(1−p) + pz]^n.
This is the pgf of a binomial(n, p) random variable. Hence,
    P(Y > 2) = 1 − P(Y ≤ 2) = 1 − [P(Y = 0) + P(Y = 1) + P(Y = 2)]
             = 1 − [C(10,0)(1−p)^{10} + C(10,1)p(1−p)^9 + C(10,2)p^2(1−p)^8]
             = 1 − (1−p)^8[(1−p)^2 + 10p(1−p) + 45p^2].


15. For n = 150 and p = 1/100, we have

    k   P(binomial(n,p) = k)   P(Poisson(np) = k)
    0         0.2215                 0.2231
    1         0.3355                 0.3347
    2         0.2525                 0.2510
    3         0.1258                 0.1255
    4         0.0467                 0.0471
    5         0.0138                 0.0141
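A short MATLAB sketch that reproduces the table (base MATLAB only; nchoosek and factorial are built in):

% Compare binomial(n,p) and Poisson(np) pmfs for k = 0..5.
n = 150; p = 1/100; lam = n*p;
for k = 0:5
    pb = nchoosek(n,k)*p^k*(1-p)^(n-k);  % binomial pmf
    pp = lam^k*exp(-lam)/factorial(k);   % Poisson pmf
    fprintf('%d  %.4f  %.4f\n', k, pb, pp)
end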

16. If the Xi are i.i.d. with mean m, then
    E[Mn] = E[(1/n) ∑_{i=1}^n Xi] = (1/n) ∑_{i=1}^n E[Xi] = (1/n) ∑_{i=1}^n m = nm/n = m.
If X is any random variable with mean m, then E[cX] = cE[X] = cm, and
    var(cX) = E[(cX − cm)^2] = E[c^2(X−m)^2] = c^2 E[(X−m)^2] = c^2 var(X).

17. In general, we have
    P(|Mn − m| < ε) ≥ 0.9 ⇔ P(|Mn − m| ≥ ε) < 0.1.
By Chebyshev's inequality,
    P(|Mn − m| ≥ ε) ≤ σ^2/(nε^2) < 0.1
if n > σ^2/(.1 ε^2). For σ^2 = 1 and ε = 0.25, we require n > 1/((.1)(.25)^2) = 1/.00625 = 160 students. If instead ε = 1, we require n > 1/.1 = 10 students.
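A tiny MATLAB sketch of this sample-size calculation (σ^2 = 1, with both tolerance values from the problem):

% Chebyshev sample size: need sigma^2/(n*tol^2) < 0.1.
sigma2 = 1;
for tol = [0.25 1]
    nmin = sigma2/(0.1*tol^2);
    fprintf('tolerance %.2f: need n > %g students\n', tol, nmin)
end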

18. (a) E[Xi] = E[IB(Zi)] = P(Zi ∈ B). Setting p := P(Zi ∈ B), we see that Xi = IB(Zi)∼Bernoulli(p). Hence, var(Xi) = p(1− p).

(b) In fact, the Xi are independent. Hence, they are uncorrelated.

19. Mn = 0 if and only if all the Xi are zero. Hence,
    P(Mn = 0) = P(⋂_{i=1}^n {Xi = 0}) = ∏_{i=1}^n P(Xi = 0) = (1−p)^n.
In particular, if p = 1/1000, then P(M100 = 0) = (1−p)^{100} = 0.999^{100} = 0.905. Hence, the chances are more than 90% that when we run a simulation, M100 = 0 and we learn nothing!

20. If Xi = Z ∼ Bernoulli(1/2) for all i, then
    Mn = (1/n) ∑_{i=1}^n Xi = (1/n) ∑_{i=1}^n Z = Z,
and m = E[Xi] = E[Z] = 1/2. So,
    P(|Mn − m| ≥ 1/4) = P(|Z − 1/2| ≥ 1/4).
Now, Z − 1/2 = ±1/2, and |Z − 1/2| = 1/2 with probability one. Thus,
    P(|Z − 1/2| ≥ 1/4) = P(1/2 ≥ 1/4) = 1, which does not tend to 0.

21. From the discussion of the weak law in the text, we have
    P(|Mn − m| ≥ εn) ≤ σ^2/(nεn^2).
If nεn^2 → ∞ as n → ∞, then the probability on the left will go to zero as n → ∞.

22. We have from the example that with p := λ/(λ+µ), pX|Z(i|j) = C(j,i) p^i (1−p)^{j−i} for i = 0, . . . , j. In other words, as a function of i, pX|Z(i|j) is a binomial(j, p) pmf. Hence,
    E[X|Z = j] = ∑_{i=0}^j i pX|Z(i|j)
is just the mean of a binomial(j, p) pmf. The mean of such a pmf is jp. Hence, E[X|Z = j] = jp = jλ/(λ+µ).

23. The problem is telling us that P(Y = k|X = i) = C(n,k) p_i^k (1−p_i)^{n−k}. Hence
    P(Y < 2|X = i) = P(Y = 0|X = i) + P(Y = 1|X = i) = (1−p_i)^n + np_i(1−p_i)^{n−1}.

24. The problem is telling us that P(X = k|Y = j) = λ_j^k e^{−λ_j}/k!. Hence,
    P(X > 2|Y = j) = 1 − P(X ≤ 2|Y = j)
                   = 1 − [P(X = 0|Y = j) + P(X = 1|Y = j) + P(X = 2|Y = j)]
                   = 1 − [e^{−λ_j} + λ_j e^{−λ_j} + λ_j^2 e^{−λ_j}/2]
                   = 1 − e^{−λ_j}[1 + λ_j + λ_j^2/2].

25. For the first formula, write
    pX|Y(xi|yj) := P(X = xi, Y = yj)/P(Y = yj) = P(X = xi)P(Y = yj)/P(Y = yj) = P(X = xi) = pX(xi).
Similarly, for the other formula,
    pY|X(yj|xi) := P(Y = yj, X = xi)/P(X = xi) = P(Y = yj)P(X = xi)/P(X = xi) = P(Y = yj) = pY(yj).


26. We use the law of total probability to write
    P(T = n) = P(X−Y = n) = ∑_{k=0}^∞ P(X−Y = n|Y = k)P(Y = k)
             = ∑_{k=0}^∞ P(X−k = n|Y = k)P(Y = k), by the substitution law,
             = ∑_{k=0}^∞ P(X = n+k|Y = k)P(Y = k)
             = ∑_{k=0}^∞ P(X = n+k)P(Y = k), by independence,
             = ∑_{k=0}^∞ (1−p)p^{n+k}·(1−q)q^k
             = (1−p)(1−q)p^n ∑_{k=0}^∞ (pq)^k = (1−p)(1−q)p^n/(1−pq).

27. The problem is telling us that
    P(Y = n|X = 1) = µ^n e^{−µ}/n! and P(Y = n|X = 2) = ν^n e^{−ν}/n!.
The problem also tells us that P(X = 1) = P(X = 2) = 1/2. We can now write
    P(X = 1|Y = 2) = P(X = 1, Y = 2)/P(Y = 2) = P(Y = 2|X = 1)P(X = 1)/P(Y = 2) = (µ^2 e^{−µ}/2)(1/2)/P(Y = 2).
It remains to use the law of total probability to compute
    P(Y = 2) = ∑_{i=1}^2 P(Y = 2|X = i)P(X = i) = [P(Y = 2|X = 1) + P(Y = 2|X = 2)]/2 = [µ^2 e^{−µ}/2 + ν^2 e^{−ν}/2]/2 = [µ^2 e^{−µ} + ν^2 e^{−ν}]/4.
We conclude by writing
    P(X = 1|Y = 2) = (µ^2 e^{−µ}/4)/([µ^2 e^{−µ} + ν^2 e^{−ν}]/4) = 1/(1 + (ν/µ)^2 e^{µ−ν}).

28. Let X = 0 or X = 1 according to whether message zero or message one is sent. The problem tells us that P(X = 0) = P(X = 1) = 1/2 and that
    P(Y = k|X = 0) = (1−p)p^k and P(Y = k|X = 1) = (1−q)q^k,
where q ≠ p. We need to compute
    P(X = 1|Y = k) = P(Y = k|X = 1)P(X = 1)/P(Y = k) = (1−q)q^k(1/2)/P(Y = k).


We next use the law of total probability to compute
    P(Y = k) = [P(Y = k|X = 0) + P(Y = k|X = 1)]/2 = [(1−p)p^k + (1−q)q^k]/2.
We can now compute
    P(X = 1|Y = k) = (1−q)q^k(1/2)/([(1−p)p^k + (1−q)q^k]/2) = 1/(1 + (1−p)p^k/((1−q)q^k)).

29. Let R denote the number of red apples in a crate, and let G denote the number of green apples in a crate. The problem is telling us that R ∼ Poisson(ρ) and G ∼ Poisson(γ) are independent. If T = R+G is the total number of apples in the crate, we must compute
    P(G = 0|T = k) = P(T = k|G = 0)P(G = 0)/P(T = k).
We first use the law of total probability, substitution, and independence to write
    P(T = k|G = 0) = P(R+G = k|G = 0) = P(R = k|G = 0) = P(R = k) = ρ^k e^{−ρ}/k!.
We also note from the text that the sum of two independent Poisson random variables is a Poisson random variable whose parameter is the sum of the individual parameters. Hence, P(T = k) = (ρ+γ)^k e^{−(ρ+γ)}/k!. We can now write
    P(G = 0|T = k) = (ρ^k e^{−ρ}/k!)·e^{−γ}/((ρ+γ)^k e^{−(ρ+γ)}/k!) = (ρ/(ρ+γ))^k.

30. We begin with
    P(X = n|Y = 1) = P(Y = 1|X = n)P(X = n)/P(Y = 1) = (1/(n+1))·(λ^n e^{−λ}/n!)/P(Y = 1).
Next, we compute
    P(Y = 1) = ∑_{n=0}^∞ P(Y = 1|X = n)P(X = n) = ∑_{n=0}^∞ (1/(n+1))·λ^n e^{−λ}/n!
             = (e^{−λ}/λ) ∑_{n=0}^∞ λ^{n+1}/(n+1)! = (e^{−λ}/λ) ∑_{k=1}^∞ λ^k/k! = (e^{−λ}/λ)[e^λ − 1] = (1−e^{−λ})/λ.
We conclude with
    P(X = n|Y = 1) = [(1/(n+1))·λ^n e^{−λ}/n!]/[(1−e^{−λ})/λ] = λ^{n+1}/((e^λ − 1)(n+1)!).


31. We begin with
    P(X = n|Y = k) = P(Y = k|X = n)P(X = n)/P(Y = k) = [C(n,k)p^k(1−p)^{n−k}·λ^n e^{−λ}/n!]/P(Y = k).
Next,
    P(Y = k) = ∑_{n=0}^∞ P(Y = k|X = n)P(X = n) = ∑_{n=k}^∞ C(n,k)p^k(1−p)^{n−k} λ^n e^{−λ}/n!
             = (p^k λ^k e^{−λ}/k!) ∑_{n=k}^∞ [(1−p)λ]^{n−k}/(n−k)! = (p^k λ^k e^{−λ}/k!) ∑_{m=0}^∞ [(1−p)λ]^m/m!
             = (p^k λ^k e^{−λ}/k!) e^{(1−p)λ} = (pλ)^k e^{−pλ}/k!.
Note that Y ∼ Poisson(pλ). We continue with
    P(X = n|Y = k) = [C(n,k)p^k(1−p)^{n−k}·λ^n e^{−λ}/n!]/[(pλ)^k e^{−pλ}/k!] = [(1−p)λ]^{n−k} e^{−(1−p)λ}/(n−k)!.

32. First write
    P(X > k | max(X,Y) > k) = P({X > k} ∩ {max(X,Y) > k})/P(max(X,Y) > k) = P(X > k)/P(max(X,Y) > k),
since {X > k} ⊂ {max(X,Y) > k}. We next compute
    P(max(X,Y) > k) = 1 − P(max(X,Y) ≤ k) = 1 − P(X ≤ k)P(Y ≤ k).
If we put θk := P(X ≤ k) and use the fact that X and Y have the same pmf, then
    P(X > k | max(X,Y) > k) = (1−θk)/(1−θk^2) = (1−θk)/((1−θk)(1+θk)) = 1/(1+θk).
With n = 100 and p = .01, we compute
    θ1 = P(X ≤ 1) = P(X = 0) + P(X = 1) = .99^{100} + .99^{99} = .366 + .370 = .736.
It follows that the desired probability is 1/(1+θ1) = 1/1.736 = 0.576.

33. (a) Observe that
    P(XY = 4) = P(X = 1, Y = 4) + P(X = 2, Y = 2) + P(X = 4, Y = 1) = (1−p)(1−q)[pq^4 + p^2 q^2 + p^4 q].


(b) Write
    pZ(j) = ∑_i pY(j−i)pX(i)
          = ∑_{i=0}^∞ pY(j−i)pX(i), since pX(i) = 0 for i < 0,
          = ∑_{i=0}^j pY(j−i)pX(i), since pY(k) = 0 for k < 0,
          = (1−p)(1−q) ∑_{i=0}^j p^i q^{j−i} = (1−p)(1−q)q^j ∑_{i=0}^j (p/q)^i.
Now, if p = q,
    pZ(j) = (1−p)^2 p^j (j+1).
If p ≠ q,
    pZ(j) = (1−p)(1−q)q^j (1−(p/q)^{j+1})/(1−p/q) = (1−p)(1−q)(q^{j+1} − p^{j+1})/(q−p).

34. For j = 0,1,2,3,
    pZ(j) = ∑_{i=0}^j (1/16) = (j+1)/16.
For j = 4,5,6,
    pZ(j) = ∑_{i=j−3}^3 (1/16) = (7−j)/16.
For other values of j, pZ(j) = 0.

35. We first write
    P(Y = j|X = 1)/P(Y = j|X = 0) ≥ P(X = 0)/P(X = 1)
as
    (λ1^j e^{−λ1}/j!)/(λ0^j e^{−λ0}/j!) ≥ (1−p)/p.
We can further simplify this to
    (λ1/λ0)^j ≥ ((1−p)/p)e^{λ1−λ0}.
Taking logarithms and rearranging, we obtain
    j ≥ [λ1 − λ0 + ln((1−p)/p)] / ln(λ1/λ0).
Observe that the right-hand side is just a number (threshold) that is computable from the problem data. If we observe Y = j, we compare j to the threshold. If j is greater than or equal to this number, we decide X = 1; otherwise, we decide X = 0.


36. We first write
    P(Y = j|X = 1)/P(Y = j|X = 0) ≥ P(X = 0)/P(X = 1)
as
    ((1−q1)q1^j)/((1−q0)q0^j) ≥ (1−p)/p.
We can further simplify this to
    (q1/q0)^j ≥ ((1−p)(1−q0))/(p(1−q1)).
Taking logarithms and rearranging, we obtain
    j ≤ [ln(((1−p)(1−q0))/(p(1−q1)))] / ln(q1/q0),
since q1 < q0 implies ln(q1/q0) < 0.

37. Starting with P(X = xi|Y = y j) = h(xi), we have

P(X = xi,Y = y j) = P(X = xi|Y = y j)P(Y = y j) = h(xi)pY (y j).

If we can show that h(xi) = pX (xi), then it will follow that X and Y are independent.

Now observe that the sum over j of the left-hand side reduces to P(X = xi) = pX (xi).The sum over j of the right-hand side reduces to h(xi). Hence, pX (xi) = h(xi) asdesired.

38. First write

pXY (1, j) = (1/3)3 je−3/ j! and pXY (2, j) = (4/6)6 je−6/ j!

Notice that 3 je−3/ j! is a Poisson(3) pmf and 6 je−6/ j! is a Poisson(6) pmf. Hence,pX (1) = ∑∞

j=0 pXY (1, j) = 1/3 and pX (2) = ∑∞j=0 pXY (2, j) = 2/3. It then follows

that pY |X ( j|1) is Poisson(3) and pY |X ( j|2) is Poisson(6). With these observations, it

is clear that

E[Y |X = 1] = 3 and E[Y |X = 2] = 6,

and

E[Y ] = E[Y |X = 1](1/3)+E[Y |X = 2](2/3) = 3(1/3)+6(2/3) = 1+4 = 5.

To obtain E[X |Y = j], we first compute

pY ( j) = pXY (1, j)+ pXY (2, j) = (1/3)3 je−3/ j!+(2/3)6 je−6/ j!

and

pX |Y (1| j) =(1/3)3 je−3/ j!

(1/3)3 je−3/ j!+(2/3)6 je−6/ j!=

1

1+2 j+1e−3

and

pX |Y (2| j) =(2/3)6 je−6/ j!

(1/3)3 je−3/ j!+(2/3)6 je−6/ j!=

1

1+2−( j+1)e3.

Page 43: Gubner

42 Chapter 3 Problem Solutions

We now have

E[X |Y = j] = 1 · 1

1+2 j+1e−3+2 · 1

1+2−( j+1)e3

=1

1+2 j+1e−3+2 · 2 j+1e−3

1+2 j+1e−3=

1+2 j+2e−3

1+2 j+1e−3.

39. Since Y is conditionally Poisson(k) given X = k, E[Y |X = k] = k. Hence,

E[Y ] =∞

∑k=1

E[Y |X = k]P(X = k) =∞

∑k=1

kP(X = k) = E[X ] =1

1− p ,

since X ∼ geometric1(p). Next

E[XY ] =∞

∑k=1

E[XY |X = k]P(X = k) =∞

∑k=1

E[kY |X = k]P(X = k), by substitution,

=∞

∑k=1

kE[Y |X = k]P(X = k) =∞

∑k=1

k2P(X = k) = E[X2]

= var(X)+(E[X ])2 =p

(1− p)2 +1

(1− p)2 =1+ p

(1− p)2 .

Since E[Y 2|X = k] = k+ k2,

E[Y 2] =∞

∑k=1

E[Y 2|X = k]P(X = k) =∞

∑k=1

(k+ k2)P(X = k) = E[X ]+E[X2]

=1

1− p +1+ p

(1− p)2 =2

(1− p)2 .

Finally, we can compute

var(Y ) = E[Y 2]− (E[Y ])2 =2

(1− p)2 −1

(1− p)2 =1

(1− p)2 .

40. From the solution of the example, it is immediate that E[Y |X = 1] = λ and E[Y |X =0] = λ/2. Next,

E[Y ] = E[Y |X = 0](1− p)+E[Y |X = 1]p = (1− p)λ/2+ pλ .

Similarly,

E[Y 2] = E[Y 2|X = 0](1− p)+E[Y 2|X = 1]p

= (λ/2+λ 2/4)(1− p)+(λ +λ 2)p.

To conclude, we have

var(Y ) = E[Y 2]− (E[Y ])2 = (λ/2+λ 2/4)(1− p)+(λ +λ 2)p− [(1− p)λ/2+ pλ ]2.

Page 44: Gubner

Chapter 3 Problem Solutions 43

41. Write

E[(X+1)Y 2] =1

∑i=0

E[(X+1)Y 2|X = i]P(X = i) =1

∑i=0

E[(i+1)Y 2|X = i]P(X = i)

=1

∑i=0

(i+1)E[Y 2|X = i]P(X = i).

Now, since given X = i, Y is conditionally Poisson(3(i+1)),

E[Y 2|X = i] = (λ +λ 2)∣∣∣λ=3(i+1)

= 3(i+1)+9(i+1)2.

It now follows that

E[(X+1)Y 2] =1

∑i=0

(i+1)[3(i+1)+9(i+1)2]P(X = i)

=1

∑i=0

(i+1)2[3+9(i+1)]P(X = i)

= 12(1/3)+84(2/3) = 4+56 = 60.

42. Write

E[XY ] =∞

∑n=0

E[XY |X = n]P(X = n) =∞

∑n=0

E[nY |X = n]P(X = n)

=∞

∑n=0

nE[Y |X = n]P(X = n) =∞

∑n=0

n1

n+1

λ ne−λ

n!

= E

[X

X+1

]= E

[X+1−1

X+1

]= 1−E

[1

X+1

].

By a problem in the previous chapter, this last expectation is equal to (1− e−λ )/λ .Hence,

E[XY ] = 1− 1− e−λ

λ.

43. Write

E[XY ] =∞

∑n=1

E[XY |X = n]P(X = n) =∞

∑n=1

E[nY |X = n]P(X = n)

=∞

∑n=1

nE[Y |X = n]P(X = n) =∞

∑n=1

nn

1−qP(X = n)

=1

1−qE[X2] =1

1−q[var(X)+(E[X ])2

]

=1

1−q

[p

(1− p)2 +1

(1− p)2]

=1+ p

(1−q)(1− p)2 .

Page 45: Gubner

44 Chapter 3 Problem Solutions

44. Write

E[X2] =∞

∑k=1

E[X2|Y = k]P(Y = k) =∞

∑n=1

(k+ k2)P(Y = k) = E[Y +Y 2]

= E[Y ]+E[Y 2] = m+(r+m2) = m+m2 + r.

45. Using probability generating functions, we see that

GV (z) = E[zX+Y ] = E[zX zY ] = E[zX ]E[zY ]

= [(1− p)+ pz]n[(1− p)+ pz]m = [(1− p)+ pz]n+m.

Thus, V ∼ binomial(n+m, p). We next compute

P(V = 10|X = 4) = P(X+Y = 10|X = 4) = P(4+Y = 10|X = 4)

= P(Y = 6|X = 4) = P(Y = 6) =

(m

6

)p6(1− p)m−6.

46. Write

GY (z) = E[zY ] =∞

∑k=1

E[zY |X = k]P(X = k) =∞

∑k=1

ek(z−1)P(X = k)

=∞

∑k=1

(ez−1)kP(X = k) = GX (ez−1) =(1− p)ez−11− pez−1 .

Page 46: Gubner

CHAPTER 4

Problem Solutions

1. Let Vi denote the input voltage at the ith sampling time. The problem tells us that the

Vi are independent and uniformly distributed on [0,7]. The alarm sounds if Vi > 5 for

i= 1,2,3. The probability of this is

P

( 3⋂

i=1

Vi > 5)

=3

∏i=1

P(Vi > 5).

Now, P(Vi > 5) =∫ 75 (1/7)dt = 2/7. Hence, the desired probability is (2/7)3 =

8/343 = 0.0233.

2. We must solve∫ ∞t f (x)dx= 1/2 for t. Now,

∫ ∞

t2x−3 dx = − 1

x2

∣∣∣∣∞

t

=1

t2.

Solving 1/t2 = 1/2, we find that t =√2.

3. To find c, we solve∫ 10 cx

−1/2 dx= 1. The left-hand side of this equation is 2cx1/2|10 =

2c. Solving 2c= 1 yields c= 1/2. For the median, we must solve∫ 1t (1/2)x−1/2 dx=

1/2 or x1/2|1t = 1/2. We find that t = 1/4.

4. (a) For t ≥ 0, P(X > t) =∫ ∞t λe−λx dx=−e−λx|∞t = e−λ t .

(b) First, P(X > t+∆t|X > t) = P(X > t+∆t,X > t)/P(X > t). Next, observe that

X > t+∆t∩X > t = X > t+∆t,and so P(X > t + ∆t|X > t) = P(X > t + ∆t)/P(X > t) = e−λ (t+∆t)/e−λ t =e−λ∆t .

5. Let Xi denote the voltage output by regulator i. Then the Xi are i.i.d. exp(λ ) randomvariables. Now put

Y :=10

∑i=1

I(v,∞)(Xi)

so that Y counts the number of regulators that output more than v volts. We must

compute P(Y = 3). Now, the I(v,∞)(Xi) are i.i.d. Bernoulli(p) random variables, where

p = P(Xi > v) =∫ ∞

vλe−λx dx = −e−λx

∣∣∣∣∞

v

= e−λv.

Next, we now from the previous chapter that a sum of n i.i.d. Bernoulli(p) randomvariables is a binomial(n, p). Thus,

P(Y = 3) =

(n

3

)p3(1− p)n−3 =

(10

3

)e−3λv(1− e−λv)7 = 120e−3λv(1− e−λv)7.

45

Page 47: Gubner

46 Chapter 4 Problem Solutions

6. First note that P(Xi > 2) =∫ ∞2 λe−λx dx=−e−λx|∞2 = e−2λ .

(a) P(min(X1, . . . ,Xn) > 2

)= P

(⋂ni=1Xi > 2

)= ∏n

i=1P(Xi > 2) = e−2nλ .

(b) Write

P(max(X1, . . . ,Xn) > 2

)= 1−P

(max(X1, . . . ,Xn)≤ 2

)

= 1−P

( n⋂

i=1

Xi ≤ 2)

= 1−n

∏i=1

P(Xi ≤ 2) = 1− [1− e−2λ ]n.

7. (a) P(Y ≤ 2) =∫ 20 µe−µy dy= 1− e−2µ .

(b) P(X ≤ 12,Y ≤ 12) = P(X ≤ 12)P(Y ≤ 12) = (1− e−12λ )(1− e−12µ).

(c) Write

P(X ≤ 12∪Y ≤ 12) = 1−P(X > 12,Y > 12)

= 1−P(X > 12)P(Y > 12)

= 1− e−12λ e−12µ = 1− e−12(λ+µ).

8. (a) Make the change of variable y= λxp, dy= λ pxp−1 dx to get

∫ ∞

0λ pxp−1e−λxp dx =

∫ ∞

0e−y dy = −e−y

∣∣∣∣∞

0

= 1.

(b) The same change of variables also yields

P(X > t) =∫ ∞

tλ pxp−1e−λxp dx =

∫ ∞

λ t pe−y dy = e−λ t p .

(c) The probability that none of the Xi exceeds 3 is

P

( n⋂

i=1

Xi ≤ 3)

=n

∏i=1

P(Xi ≤ 3) = [1−P(X1 > 3)]n = [1− e−λ3p ]n.

The probability that at least one of them exceeds 3 is

P

( n⋃

i=1

Xi > 3)

= 1−P

( n⋂

i=1

Xi ≤ 3)

= 1− [1− e−λ3p ]n.

9. (a) Since f ′(x) =−xe−x2/2/√2π , f ′(x) < 0 for x> 0 and f ′(x) > 0 for x< 0.

(b) Since f ′′(x)= (x2−1)e−x2/2/√2π , we see that f ′′(x)> 0 for |x|> 1 and f ′′(x)<

0 for |x|< 1.

(c) Rearrange ex2/2 ≥ x2/2 to get e−x2/2 ≤ 2/x2→ 0 as |x| → ∞.

Page 48: Gubner

Chapter 4 Problem Solutions 47

10. Following the hint, write f (x) = ϕ((x−m)/σ)/σ , where ϕ is the standard normal

density. Observe that f ′(x) = ϕ ′((x−m)/σ)/σ2 and f ′′(x) = ϕ ′′((x−m)/σ)/σ3.

(a) Since the argument of ϕ ′ is positive for x > m and negative for x < m, f (x) isdecreasing for x>m and increasing for x<m. Hence, f has a global maximum

at x= m.

(b) Since the absolute value of the argument of ϕ ′′ is greater than one if and only if|x−m|> σ , f (x) is concave for |x−m|< σ and convex for |x−m|> σ .

11. Since ϕ is bounded, limσ→∞ ϕ((x−m)/σ)/σ = 0. Hence, limσ→∞ f (x) = 0. For

x 6= m, we have

f (x) =exp

[−(x−m

σ

)2/2]

√2π σ

≤ 2√2π σ

(x−m

σ

)2 =2σ√

2π(x−m)2→ 0

as σ → 0. Otherwise, since f (m) = [√2π σ ]−1, limσ→0 f (m) = ∞.

12. (a) f (x) = ∑n pn fn(x) is obviously nonnegative. Also,∫ ∞

−∞f (x)dx =

∫ ∞

−∞∑n

pn fn(x)dx = ∑n

pn

∫ ∞

−∞fn(x)dx = ∑

n

pn ·1 = ∑n

pn = 1.

(b)

x1 2 30

3/4

1/2

1/4

0

(c)

x1 2 30

3/4

1/2

1/4

0

13. Clearly, (g∗h)(x) =∫ ∞−∞ g(y)h(x− y)dy≥ 0 since g and h are nonnegative. Next,

∫ ∞

−∞(g∗h)(x)dx =

∫ ∞

−∞

(∫ ∞

−∞g(y)h(x− y)dy

)dx

=

∫ ∞

−∞g(y)

(∫ ∞

−∞h(x− y)dx

)dy

=∫ ∞

−∞g(y)

(∫ ∞

−∞h(θ)dθ

)

︸ ︷︷ ︸= 1

dy =∫ ∞

−∞g(y)dy = 1.

14. (a) Let p> 1. On Γ(p) =∫ ∞0 x

p−1e−x dx, use integration by parts with u= xp−1 anddv= e−x dx. Then du= (p−1)xp−2dx, v=−e−x, and

Γ(p) = −xp−1e−x∣∣∣∞

0︸ ︷︷ ︸= 0

+(p−1)

∫ ∞

0x(p−1)−1e−x dx = (p−1)Γ(p−1).

Page 49: Gubner

48 Chapter 4 Problem Solutions

(b) On Γ(1/2) =∫ ∞0 x

−1/2e−x dx, make the change of variable x= y2/2 or y=√2x.

Then dx= ydy and x−1/2 =√2/y. Hence,

Γ(1/2) =∫ ∞

0

√2

ye−y

2/2ydy =√2

∫ ∞

0e−y

2/2 dy =√2√2π

∫ ∞

0

e−y2/2

√2π

dy

=√2√2π · 1

2=√

π.

(c) By repeatedly using the recursion formula in part (a), we have

Γ

(2n+1

2

)=

2n−1

(2n−1

2

)=

2n−1

2· 2n−3

(2n−3

2

)

...

=2n−1

2· 2n−3

2· · · 5

2· 32· 12

Γ(1/2)

=2n−1

2· 2n−3

2· · · 5

2· 32· 12·√

π

=(2n−1)!!

2n

√π.

(d) First note that gp(y) = 0 for y ≤ 0, and similarly for gq(y). Hence, in order tohave gp(y)gq(x−y) > 0, we need y> 0 and x−y> 0, or equivalently, x> y> 0.

Of course, if x ≤ 0 this does not happen. Thus, (gp ∗gq)(x) = 0 for x ≤ 0. For

x> 0, we follow the hint and write

(gp ∗gq)(x) =∫ ∞

−∞gp(y)gq(x− y)dy

=∫ x

0gp(y)gq(x− y)dy

=1

Γ(p)Γ(q)

∫ x

0yp−1e−y · (x− y)q−1e−(x−y) dy

=xq−1e−x

Γ(p)Γ(q)

∫ x

0yp−1(1− y/x)q−1 dy

=xqe−x

Γ(p)Γ(q)

∫ 1

0(xθ)p−1(1−θ)q−1 dθ , ch. of var. θ = y/x,

=xp+q−1e−x

Γ(p)Γ(q)

∫ 1

0θ p−1(1−θ)q−1 dθ . (∗)

Now, the left-hand side is a convolution of densities, and is therefore a density

by Problem 13. In particular, this means that the left-hand side integrates to

one. On the right-hand side, note that∫ ∞0 x

p+q−1e−x dx = Γ(p+ q). Hence,

integrating the above equation with respect to x from zero to infinity yields

1 =Γ(p+q)

Γ(p)Γ(q)

∫ 1

0θ p−1(1−θ)q−1 dθ .

Solving for the above integral and substituting the result into (∗), we find that(gp ∗gq)(x) = gp+q(x).

Page 50: Gubner

Chapter 4 Problem Solutions 49

15. (a) In∫ ∞−∞ fλ (x)dx =

∫ ∞−∞ λ f (λx)dx, make the change of variable y = λx, dy =

λ dx to get ∫ ∞

−∞fλ (x)dx =

∫ ∞

−∞f (y)dy = 1.

(b) Observe that

g1,λ (x) = λ(λx)0 e−λx

0!= λe−λx,

which we recognize as the exp(λ ) density.

(c) The desired probability is

Pm(t) :=∫ ∞

(λx)m−1 e−λx

(m−1)!dx.

Note that P1(t) =∫ ∞t λe−λx dx = e−λ t . For m > 1, apply integration by parts

with u= (λx)m−1/(m−1)! and dv= λe−λx dx. Then

Pm(t) =(λ t)m−1 e−λ t

(m−1)!+Pm−1(t).

Applying this result recursively, we find that

Pm(t) =(λ t)m−1 e−λ t

(m−1)!+

(λ t)m−2 e−λ t

(m−2)!+ · · ·+ e−λ t .

(d) We have

g 2m+12 , 12

(x) = 12

( 12x)m−1/2 e−x/2

Γ((2m+1)/2)=

(1/2)m(1/2)1/2xm−1/2e−x/2

(2m−1)!!2m

√π

=xm−1/2e−x/2

(2m−1) · · ·5 ·3 ·1 ·√2π

.

16. (a) We see that b1,1(x) = 1 is the uniform(0,1) density, b2,2(x) = 6x(1− x), andb1/2,1(x) = 1/(2

√x).

0 0.25 0.5 0.75 10

0.5

1

1.5

2

b1,1(x)

b2,2(x)

b1/2,1

(x)

x

Page 51: Gubner

50 Chapter 4 Problem Solutions

(b) From Problem 14(d) and its hint, we have

gp+q(x) = (gp ∗gq)(x) =xp+q−1e−x

Γ(p)Γ(q)

∫ 1

0θ p−1(1−θ)q−1dθ .

Integrating the left and right-hand sides with respect to x from zero to infinity

yields

1 =Γ(p+q)

Γ(p)Γ(q)

∫ 1

0θ p−1(1−θ)q−1dθ ,

which says that the beta density integrates to one.

17. Starting with

Γ(p)Γ(q) = Γ(p+q)∫ 1

0up−1(1−u)q−1 du,

make the change of variable u= sin2 θ , du= 2sinθ cosθ dθ . We obtain

Γ(p)Γ(q) = Γ(p+q)

∫ 1

0up−1(1−u)q−1 du

= Γ(p+q)∫ π/2

0(sin2 θ)p−1(1− sin2 θ)q−1 ·2sinθ cosθ dθ

= 2Γ(p+q)∫ π/2

0(sinθ)2p−1(cosθ)2q−1 dθ .

Setting p= q= 1/2 on both sides yields

Γ(1/2)2 = 2

∫ π/2

01dθ = π,

and it follows that Γ(1/2) =√

π .

18. Starting with

Γ(p)Γ(q) = Γ(p+q)

∫ 1

0up−1(1−u)q−1 du,

make the change of variable u= sin2 θ , du= 2sinθ cosθ dθ . We obtain

Γ(p)Γ(q)

Γ(p+q)=

∫ 1

0up−1(1−u)q−1 du

=∫ π/2

0(sin2 θ)p−1(1− sin2 θ)q−1 ·2sinθ cosθ dθ

= 2

∫ π/2

0(sinθ)2p−1(cosθ)2q−1 dθ .

Setting p= (n+1)/2 and q= 1/2 on both sides yields

Γ

(n+1

2

)√π

Γ

(n+2

2

) = 2

∫ π/2

0sinn θ dθ ,

and the desired result follows.

Page 52: Gubner

Chapter 4 Problem Solutions 51

19. Starting with the integral definition of B(p,q), make the change of variable u = 1−e−θ , which implies both du= e−θ dθ and 1−u= e−θ . Hence,

B(p,q) =∫ 1

0up−1(1−u)q−1 du =

∫ ∞

0(1− e−θ )p−1(e−θ )q−1e−θ dθ

=∫ ∞

0(1− e−θ )p−1e−qθ dθ .

20. We first use the fact that the density is even and then make the change of variable

eθ = 1+ x2/ν , which implies both eθdθ = 2x/ν dx and x=√

ν(eθ −1). Thus,

∫ ∞

−∞

(1+

x2

ν

)−(ν+1)/2dx = 2

∫ ∞

0

(1+

x2

ν

)−(ν+1)/2dx

= 2

∫ ∞

0(eθ )−(ν+1)/2 · ν

2 eθ 1√

ν(eθ −1)dθ

=√

ν∫ ∞

0(eθ )−ν/2(eθ )1/2/

√eθ −1dθ

=√

ν∫ ∞

0(eθ )−ν/2(1− e−θ )−1/2 dθ

=√

ν∫ ∞

0(1− e−θ )1/2−1e−θν/2 dθ .

By the preceding problem, this is equal to√

ν B(1/2,ν/2), and we see that Student’s tdensity integrates to one.

21. (a) Using Stirling’s formula,

Γ(1+ν

2

)

√ν Γ

2

) ≈

√2π

(1+ν

2

)(1+ν)/2−1/2e−(1+ν)/2

√ν√2π

2

)ν/2−1/2e−ν/2

=

(1+ν

2

)ν/2e−1/2

√ν(ν

2

)ν/2−1/2

=(1+ν

ν

)ν/2 (ν/2)1/2√ν

e−1/2 = [(1+1/ν)ν ]1/21√2e1/2

→ (e1)1/21√2e1/2

=1√2.

(b) First write

(1+

x2

ν

)(ν+1)/2=

[(1+

x2

ν

)ν]1/2(

1+x2

ν

)1/2→ [ex

2]1/211/2 = ex

2/2.

It then follows that

fν(x) =

(1+ x2

ν

)−(ν+1)/2

√νB( 12 ,

ν2 )

=Γ(1+ν

2

)

√ν Γ

2

) · 1

√π(1+ x2

ν

)(ν+1)/2

→ 1√2π ex2/2

=e−x

2/2

√2π

.

Page 53: Gubner

52 Chapter 4 Problem Solutions

22. Making the change of variable t = 1/(1+ z) as suggested in the hint, note that it isequivalent to 1+ z= t−1, which implies dz=−t−2dt. Thus,

∫ ∞

0

zp−1

(1+ z)p+qdz =

∫ 1

0

(1t−1

)p−1t p+q

dt

t2=

∫ 1

0

(1− tt

)p−1t p+q−2 dt

=∫ 1

0(1− t)p−1tq−1 dt = B(q, p) = B(p,q).

Hence, fZ(z) integrates to one.

23. E[X ] =∫ ∞

1x · 2x3dx =

∫ ∞

12x−2 dx =

−2x

∣∣∣∣∞

1

= 2.

24. If the input-output relation has n levels, then the distance from−Vmax to+Vmax shouldbe n∆; i.e., n∆ = 2Vmax, or ∆ = 2Vmax/n. Next, we have from the example in the text

that the performance is ∆2/12, and we need ∆2/12< ε , or

1

12

(2Vmaxn

)2< ε.

Solving this for n yields Vmax/√3ε < n= 2b. Taking natural logarithms, we have

b > ln(Vmax/

√3ε

)/ln2.

25. We use the change of variable x= z−m as follows:

E[Z] =∫ ∞

−∞z fZ(z)dz =

∫ ∞

−∞z f (z−m)dz =

∫ ∞

−∞(x+m) f (x)dx

=∫ ∞

−∞x f (x)dx+m

∫ ∞

−∞f (x)dx = E[X ]+m = 0+m = m.

26. E[X2] =∫ ∞

1x2 · 2

x3dx =

∫ ∞

1

2

xdx = 2lnx

∣∣∣∣∞

1

= 2(∞−0) = ∞.

27. First note that since Student’s t density is even, E[|X |k] =∫ ∞−∞ |x|k fν(x)dx is propor-

tional to∫ ∞

0

xk

(1+ x2/ν)(ν+1)/2dx=

∫ 1

0

xk

(1+ x2/ν)(ν+1)/2dx+

∫ ∞

1

xk

(1+ x2/ν)(ν+1)/2dx

With regard to this last integral, observe that

∫ ∞

1

xk

(1+ x2/ν)(ν+1)/2dx ≤

∫ ∞

1

xk

(x2/ν)(ν+1)/2dx = ν(ν+1)/2

∫ ∞

1

dx

xν+1−k ,

which is finite if ν + 1− k > 1, or k < ν . Next, instead of breaking the range of

integration at one, we break it at the solution of x2/ν = 1, or x=√

ν . Then

∫ ∞

√ν

xk

(1+ x2/ν)(ν+1)/2dx≥

∫ ∞

√ν

xk

(x2/ν + x2/ν)(ν+1)/2dx=

(ν2

)(ν+1)/2∫ ∞

√ν

dx

xν+1−k ,

which is infinite if ν +1− k ≤ 1, or k ≥ ν .

Page 54: Gubner

Chapter 4 Problem Solutions 53

28. Begin with E[Y 4] = E[(Z+n)4] = E[Z4 +4Z3n+6Z2n2 +4Zn3 +n4]. The momentsof the standard normal were computed in an example in this chapter. Hence E[Y 4] =3+4 ·0 ·n+6 ·1 ·n2+4 ·0 ·n3+n4 = 3+6n2 +n4.

29. E[Xn] =∫ ∞

0xnxp−1e−x

Γ(p)dx =

1

Γ(p)

∫ ∞

0x(n+p)−1e−x dx =

Γ(n+ p)

Γ(p).

30. (a) First write

E[X ] =∫ ∞

0x · xe−x2/2 dx =

1

2

∫ ∞

−∞x2e−x

2/2 dx =

√2π

2

∫ ∞

−∞x2e−x

2/2

√2π

dx,

where the last integral is the second moment of a standard normal density, which

is one. Hence, E[X ] =

√2π

2=

√π/2.

(b) For higher-order moments, first write

E[Xn] =∫ ∞

0xn · xe−x2/2 dx =

∫ ∞

0xn+1e−x

2/2 dx.

Now make the change of variable t = x2/2, which implies x =√2t, or dx =

dt/√2t. Hence,

E[Xn] =∫ ∞

0[(2t)1/2]n+1e−t

dt

21/2t1/2

= 2n/2∫ ∞

0t [(n/2)+1]−1e−t dt = 2n/2Γ(1+n/2).

31. Let Xi denote the flow on link i, and put Yi := I(β ,∞)(Xi) so that Yi = 1 if the flow

on link i is greater than β . Put Z := ∑ni=1Yi so that Z counts the number of links

with flows greater than β . The buffer overflows if Z > 2. Since the Xi are i.i.d.,

so are the Yi. Furthermore, the Yi are Bernoulli(p), where p = P(Xi > β ). Hence,Z ∼ binomial(n, p). Thus,

P(Z > 2) = 1−P(Z ≤ 2)

= 1−[(n

0

)(1− p)n+

(n

1

)p(1− p)n−1+

(n

2

)p2(1− p)n−2

]

= 1− (1− p)n−2[(1− p)2+np(1− p)+ 12n(n−1)p2].

In remains to compute

p = P(Xi > β ) =∫ ∞

βxe−x

2/2 dx = −ex2/2∣∣∣∣∞

β

= e−β 2/2.

32. The key is to use the change of variable θ = λxp, which implies both dθ = λ pxp−1 dxand x= (θ/λ )1/p. Hence,

E[Xn] =

∫ ∞

0xn ·λ pxp−1e−λxp dx =

∫ ∞

0[(θ/λ )1/p]ne−θ dθ

= (1/λ )n/p∫ ∞

0θ [(n/p)+1]−1e−θ dθ = Γ(1+n/p)

/λ n/p.

Page 55: Gubner

54 Chapter 4 Problem Solutions

33. Write

E[Y ] = E[(X1/4)2] = E[X1/2] =∫ ∞

0x1/2e−xdx =

∫ ∞

0x3/2−1e−xdx

= Γ(3/2) = (1/2)Γ(1/2) =√

π/2.

34. We have

P

( n⋃

i=1

Xi < µ/2)

= 1−P

( n⋂

i=1

Xi ≥ µ/2)

= 1−n

∏i=1

P(Xi ≥ µ/2)

= 1−(∫ ∞

µ/2λe−λx dx

)n, with λ := 1/µ ,

= 1− (e−λ µ/2)n = 1− e−n/2.

35. Let Xi ∼ exp(λ ) be i.i.d., where λ = 1/20. We must compute

P

( 5⋃

i=1

Xi > 25)

= 1−P

( 5⋂

i=1

Xi ≤ 25)

= 1−5

∏i=1

P(Xi ≤ 25)

= 1−(∫ 25

0λe−λx dx

)5

= 1− (1− e−25λ )5 = 1− (1− e−5/4)5 = 0.815.

36. The first two calculations are

h(X) =∫ 2

0(1/2) log2dx= log2 and h(X) =

∫ 1/2

02log(1/2)dx= log(1/2).

For the third calculation, note that − ln f (x) = 12 [(x−m)/σ ]2+ 1

2 ln2πσ2. Then

h(X) =1

2

∫ ∞

−∞f (x)

([(x−m)/σ ]2+ ln2πσ2

)dx

=1

2

E

[(X−m

σ

)2]+ ln2πσ2

= 1

21+ ln2πσ2 = 12 ln2πσ2e.

37. The main difficulty is to compute

∫ ∞

−∞x2n(1+ x2/ν)−(ν+1)/2 dx.

Page 56: Gubner

Chapter 4 Problem Solutions 55

First use the fact that the integrand is even and then make the change of variable

eθ = 1+ x2/ν , which implies both eθdθ = 2x/ν dx and x=√

ν(eθ −1). Thus,

∫ ∞

−∞x2n(1+ x2/ν)−(ν+1)/2 dx = ν

∫ ∞

0x2n−1(1+ x2/ν)−(ν+1)/2 2x

νdx

= ν∫ ∞

0(√

ν(eθ −1) )2n−1(eθ )−(ν+1)/2eθ dθ

= νn+1/2∫ ∞

0(eθ −1)n−1/2e−θ(ν+1)/2eθ dθ

= νn+1/2∫ ∞

0(1− e−θ )n−1/2e−θ(ν−2n)/2 dθ

= νn+1/2∫ ∞

0(1− e−θ )(n+1/2)−1e−θ(ν−2n)/2 dθ

= νn+1/2B(n+1/2,(ν−2n)/2

), by Problem 19.

Hence,

E[X2n] = νn+1/2B(n+1/2,(ν−2n)/2

)· 1√

νB( 12 ,ν2 )

= νn+1/2Γ( 2n+12 )Γ( ν−2n

2 )

Γ( ν+12 )

· Γ( ν+12 )

√ν Γ( 12 )Γ( ν

2 )= νn

Γ( 2n+12 )Γ( ν−2n2 )

Γ( 12 )Γ( ν2 )

.

38. From MX (s) = eσ2s2/2, we have M′X (s) =MX (s)σ2s and then

M′′X (s) = MX (s)σ4s2 +MX (s)σ2.

Since MX (0) = 1, we have M′′X (1) = σ2.

39. LetM(s) := es2/2 denote the moment generating function of the standard normal ran-

dom variable. For the N(m,σ2) moment generating function, we use the change ofvariable y= (x−m)/σ , dy= dx/σ to write

∫ ∞

−∞esx

exp[− 12 (x−m

σ )2]√2π σ

dx =∫ ∞

−∞es(σy+m) e

−y2/2√2π

dy = esm∫ ∞

−∞esσy

e−y2/2

√2π

dy

= esmM(sσ) = esm+σ2s2/2.

40. E[esY ] = E[es ln(1/X)] = E[elnX−s

] = E[X−s] =∫ 1

0x−s dx =

x1−s

1− s

∣∣∣∣1

0

=1

1− s .

41. First note that

|x| =

x, x≥ 0,−x, x< 0.

Then the Laplace(λ ) mgf is

E[esX ] =∫ ∞

−∞esx · λ

2 e−λ |x| dx

Page 57: Gubner

56 Chapter 4 Problem Solutions

2

[∫ ∞

0esxe−λx dx+

∫ 0

−∞esxeλx dx

]

2

[∫ ∞

0e−x(λ−s) dx+

∫ 0

−∞ex(λ+s) dx

].

Of these last two integrals, the one on the left is finite if λ > Res, while the second is

finite if Res>−λ . For both of them to be finite, we need −λ < Res< λ . For suchs both integrals are easy to evaluate. We get

MX (s) := E[esX ] =λ

2

[1

λ − s +1

λ + s

]=

λ

2· 2λ

λ 2− s2 =λ 2

λ 2− s2 .

Now, M′X (s) = 2sλ 2/(λ 2− s2)2, and so the mean isM′X (0) = 0. We continue with

M′′X (s) = 2λ 2 (λ 2− s2)2 +4s2(λ 2− s2)(λ 2− s2)4.

Hence, the second moment is M′′X (0) = 2/λ 2. Since the mean is zero, the second

moment is also the variance.

42. Since X is a nonnegative random variable, for s≤ 0, sX ≤ 0 and esX ≤ 1. Hence, for

s≤ 0, MX (s) = E[esX ]≤ E[1] = 1 < ∞. For s> 0, we show that MX (s) = ∞. We use

the fact that for z> 0,

ez =∞

∑n=0

zn

n!≥ z3

3!.

Then for s> 0, sX > 0, and we can write

MX (s) = E[esX ] ≥ E

[(sX)3

3!

]=

s3

3!E[X3] =

2s3

3!

∫ ∞

1

x3

x3dx = ∞.

43. We apply integration by parts with u = xp−1/Γ(p) and dv = e−x(1−s) dx. Then du =xp−2/Γ(p−1)dx and v=−e−x(1−s)/(1− s). Hence,

Mp(s) =∫ ∞

0esxxp−1e−x

Γ(p)dx =

∫ ∞

0

xp−1

Γ(p)e−x(1−s) dx

= − xp−1

Γ(p)· e−x(1−s)

1− s

∣∣∣∣∞

0

+1

1− s

∫ ∞

0

xp−2

Γ(p−1)e−x(1−s) dx.

The last term is Mp−1(s)/(1− s). The other term is zero if p> 1 and Res< 1.

44. (a) In this case, we use the change of variable t = x(1− s), which implies x =t/(1− s) and dx= dt/(1− s). Hence,

Mp(s) =∫ ∞

0esxxp−1e−x

Γ(p)dx =

1

Γ(p)

∫ ∞

0xp−1e−x(1−s) dx

=1

Γ(p)

∫ ∞

0

( t

1− s)p−1

e−tdt

1− s

=( 1

1− s)p· 1

Γ(p)

∫ ∞

0t p−1e−t dt

︸ ︷︷ ︸= 1

=( 1

1− s)p

.

Page 58: Gubner

Chapter 4 Problem Solutions 57

(b) FromMX (s) = (1−s)−p, we findM′X (s) = p(1−s)−p−1,M′′X (s) = p(p+1)(1−s)−p−2, and so on. The general result is

M(n)X (s) = p(p+1) · · ·(p+[n−1])(1− s)−p−n =

Γ(n+ p)

Γ(p)(1− s)−p−n.

Hence, the Taylor series is

MX (s) =∞

∑n=0

sn

n!M

(n)X (0) =

∑n=0

sn

n!· Γ(n+ p)

Γ(p).

45. (a) Make the change of variable t = λx or x= t/λ , dx= dt/λ . Thus,

E[esX ] =∫ ∞

0esx

λ (λx)p−1e−λx

Γ(p)dx =

∫ ∞

0e(s/λ )t t

p−1e−t

Γ(p)dt,

which is the moment generating function of gp evaluated at s/λ . Hence,

E[esX ] =( 1

1− s/λ

)p=

( λ

λ − s)p

,

and the characteristic function is

E[e jνX ] =( 1

1− jν/λ

)p=

( λ

λ − jν

)p.

(b) The Erlang(m,λ ) mgf is( λ

λ − s)m

, and the chf is( λ

λ − jν

)m.

(c) The chi-squared with k degrees of freedom mgf is( 1

1−2s

)k/2, and the chf is

( 1

1−2 jν

)k/2.

46. First write

MY (s) = E[esY ] = E[esX2] =

∫ ∞

−∞esx

2 e−x2/2

√2π

dx =

∫ ∞

−∞

e−x2(1−2s)/2√2π

dx.

If we let (1−2s) = 1/σ2; i.e., σ = 1/√1−2s, then

MY (s) = σ∫ ∞

−∞

e−(x/σ)2/2

√2π σ

dx = σ =1√1−2s

.

47. First observe that

esx2e−(x−m)2/2 = e−(x2−2xm+m2−2sx2)/2 = e−[x2(1−2s)−2xm]/2e−m

2/2

= e−(1−2s)x2−2xm/(1−2s)+[m/(1−2s)]2−[m/(1−2s)]2/2e−m2/2

= e−(1−2s)x−[m/(1−2s)]2/2em2/[2(1−2s)]e−m

2/2

= e−(1−2s)x−[m/(1−2s)]2/2esm2/(1−2s).

Page 59: Gubner

58 Chapter 4 Problem Solutions

If we now let 1−2s= 1/σ2, or σ = 1/√1−2s, and µ = m/(1−2s), then

E[esY ] = E[esX2] =

∫ ∞

−∞esx

2 e−(x−m)2/2

√2π

dx = esm2/(1−2s)σ

∫ ∞

−∞

e−[(x−µ)/σ ]2/2

√2π σ

dx

= esm2/(1−2s)σ =

esm2/(1−2s)

√1−2s

.

48. ϕY (ν) = E[e jνY ] = E[e jν(aX+b)] = E[e j(νa)X ]e jνb = ϕX (aν)e jνb.

49. The key observation is that

|ν | =

ν , ν ≥ 0,−ν , ν < 0.

It then follows that

fX (x) =1

∫ ∞

−∞e−λ |ν |e− jνx dν

=1

[∫ ∞

0e−λνe− jνx dν +

1

∫ 0

−∞eλνe− jνx dν

]

=1

[∫ ∞

0e−ν(λ+ jx) dν +

∫ 0

−∞eν(λ− jx) dν

]

=1

[ −1λ + jx

e−ν(λ+ jx)

∣∣∣∣∞

0

+1

λ − jxeν(λ− jx)

∣∣∣∣0

−∞

]

=1

[1

λ + jx+

1

λ − jx

]=

1

[2λ

λ 2 + x2

]=

λ/π

λ 2 + x2.

50. (a)d

dx

(e−x

2/2

√2π

)= −xe

−x2/2√2π

= −x f (x).

(b) ϕ ′X (ν) =d

∫ ∞

−∞e jνx f (x)dx = j

∫ ∞

−∞e jνxx f (x)dx = − j

∫ ∞

−∞e jνx f ′(x)dx.

(c) In this last integral, let u = e jνx and dv = f ′(x)dx. Then du = jνe jνx dx, v =f (x), and the last integral is equal to

e jνx f (x)

∣∣∣∣∞

−∞︸ ︷︷ ︸= 0

− jν∫ ∞

−∞e jνx f (x)dx = − jνϕX (ν).

(d) Combining (b) and (c), we have ϕ ′X (ν) =− j[− jνϕX (ν)] =−νϕX (ν).

(e) If K(ν) := ϕX (ν)eν2/2, then

K′(ν) = ϕ ′X (ν)eν2/2 +ϕX (ν) ·νeν2/2 =−νϕX (ν)eν2/2 +ϕX (ν) ·νeν2/2 = 0.

Page 60: Gubner

Chapter 4 Problem Solutions 59

(f) By the mean-value theorem of calculus, for every ν , there is a ν0 between 0 andν such that K(ν)−K(0) = K′(ν0)(ν−0). Since the derivative is zero, we have

K(ν) = K(0) = ϕX (0) = 1. It then follows that ϕX (ν) = e−ν2/2.

51. Following the hints, we first write

d

dxxgp(x) =

d

dx

xpe−x

Γ(p)=

pxp−1e−x− xpe−xΓ(p)

= pgp(x)− xgp(x) = (p− x)gp(x).

In

ϕ ′X (ν) =d

∫ ∞

0e jνxgp(x)dx = j

∫ ∞

0e jνxxgp(x)dx,

apply integration by parts with u= xgp(x) and dv= e jνx dx. Then du is given above,

v= e jνx/( jν), and

ϕ ′X (ν) = j

[xgp(x)e

jνx

∣∣∣∣∞

0︸ ︷︷ ︸= 0

− 1

∫ ∞

0e jνx(p− x)gp(x)dx

]

= − 1

ν

[p

∫ ∞

0e jνxgp(x)dx−

1

j

∫ ∞

0e jνx( jx)gp(x)dx

]

= − 1

ν

[pϕX (ν)− 1

jϕ ′X (ν)

]= −(p/ν)ϕX (ν)+(1/ jν)ϕ ′X(ν).

Rearrange this to get

ϕ ′X (ν)(1−1/ jν) = −(p/ν)ϕX (ν),

and multiply through by − jν to get

ϕ ′X (ν)(− jν +1) = jpϕX (ν).

Armed with this, the derivative of K(ν) := ϕX (ν)(1− jν)p is

K′(ν) = ϕ ′X (ν)(1− jν)p+ϕX (ν)p(1− jν)p−1(− j)= (1− jν)p−1[ϕ ′X (ν)(1− jν)− jpϕX(ν)] = 0.

By the mean-value theorem of calculus, for every ν , there is a ν0 between 0 and

ν such that K(ν)−K(0) = K′(ν0)(ν − 0). Since the derivative is zero, we have

K(ν) = K(0) = ϕX (0) = 1. It then follows that ϕX (ν) = 1/(1− jν)p.

52. We use the formula cov(X ,Z) = E[XZ]−E[X ]E[Z]. The mean of an exp(λ ) randomvariable is 1/λ . Hence, E[X ] = 1. Since Z := X +Y , E[Z] = E[X ] + E[Y ]. Since

the Laplace random variable has zero mean, E[Y ] = 0. Hence, E[Z] = E[X ] = 1.

Next, E[XZ] = E[X(X +Y )] = E[X2]+E[XY ] = E[X2]+E[X ]E[Y ] by independence.Since E[Y ] = 0, E[XZ] = E[X2] = var(X)+(E[X ])2 = 1+12 = 2, where we have used

the fact that the variance of an exp(λ ) random variable is 1/λ 2. We can now write

cov(X ,Z) = 2−1= 1. Since Z is the sum of independent, and therefore uncorrelated,

random variables, var(Z) = var(X +Y ) = var(X) + var(Y ) = 1+ 2 = 3, where we

have used the fact that the variance of a Laplace(λ ) random variable is 2/λ 2.

Page 61: Gubner

60 Chapter 4 Problem Solutions

53. Since Z = X+Y , where X ∼ N(0,1) and Y ∼ Laplace(1) are independent, we have

var(Z) = var(X+Y ) = var(X)+ var(Y ) = 1+2 = 3.

54. Write

MZ(s) = E[esZ ] = E[es(X−Y )] = E[esXe−sY ] = E[esX ]E[e−sY ] = MX (s)MY (−s).

IfMX (s) =MY (s) = λ/(λ − s), then

MZ(s) =λ

λ − s ·λ

λ − (−s) =λ

λ − s ·λ

λ + s=

λ 2

λ 2− s2 ,

which is the Laplace(λ ) mgf.

55. Because the Xi are independent, we can write

MYn(s) := E[esYn ] = E[es(X1+···+Xn)] = E

[n

∏i=1

esXi]

=n

∏i=1

E[esXi ] =n

∏i=1

MXi(s). (∗)

(a) For Xi ∼ N(mi,σ2i ),MXi(s) = esmi+σ2

i s2/2. Hence,

MYn(s) =n

∏i=1

esmi+σ2i s

2/2 = exp

[sn

∑i=1

mi+

(n

∑i=1

σ2i

)s2/2

]= esm+σ2s2/2

︸ ︷︷ ︸N(m,σ2) mgf

,

provided we put m := m1 + · · ·+mn and σ2 := σ21 + · · ·+σ2

n .

(b) For Cauchy random variables, we must observe that the moment generating

function exists only for s = jν . Equivalently, we must use characteristic func-tions. In this case, (∗) becomes

ϕYn(ν) := E[e jνYn ] =n

∏i=1

ϕXi(ν).

Now, the Cauchy(λi) chf is ϕXi(ν) = e−λi|ν |. Hence,

ϕYn(ν) =n

∏i=1

e−λi|ν | = exp

[−(

n

∑i=1

λi

)|ν |

]= e−λ |ν |

︸ ︷︷ ︸Cauchy(λ ) chf

,

provided we put λ := λ1 + · · ·+λn.

(c) For Xi ∼ gamma(pi,λ ), the mgf isMXi(s) = [λ/(λ − s)]pi . Hence,

MYn(s) =n

∏i=1

λ − s

)pi

=

λ − s

)p1+···+pn=

λ − s

)p

︸ ︷︷ ︸gamma(p,λ ) mgf

,

provided we put p := p1 + · · ·+ pn.

Page 62: Gubner

Chapter 4 Problem Solutions 61

56. From part (c) of the preceding problem, Y ∼ gamma(rp,λ ). The table inside the backcover of the text gives the nth moment of a gamma random variable. Hence,

E[Y n] =Γ(n+ rp)

λ nΓ(rp).

57. Let Ti denote the time to transmit packet i. Then the time to transmit n packets is T :=T1 + · · ·+Tn. We need to find the density of T . Since the Ti are exponential, we can

apply the remark in the statement of Problem 55(c) to conclude that T ∼Erlang(n,λ ).Hence,

fT (t) =λ (λ t)n−1e−λ t

(n−1)!, t ≥ 0.

58. Observe that Y = ln

(n

∏i=1

1

Xi

)=

n

∑i=1

ln1

Xi. By Problem 40, each term is an exp(1)

random variable. Hence, by the remark in the statement of Problem 55(c), Y ∼Erlang(n,1); i.e.,

fY (y) =yn−1e−y

(n−1)!, y≥ 0.

59. Consider the characteristic function,

ϕY (ν) = E[e jνY ] = E[e jν(β1X1+···+βnXn)] = E

[n

∏i=1

e j(νβi)Xi

]=

n

∏i=1

E[e j(νβi)Xi ]

=n

∏i=1

e−λ |νβi| =n

∏i=1

e−λβi|ν | = exp

[−λ

(n

∑i=1

βi

)|ν |

].

This is the chf of a Cauchy random variable with parameter λ ∑ni=1βi. Hence,

fY (y) =λπ ∑n

i=1βi(

λ ∑ni=1βi

)2+ y2

.

60. We need to compute P(|X−Y | ≤ 2). If we put Z := X−Y , then we need to computeP(|Z| ≤ 2). We first find the density of Z using characteristic functions. Write

ϕZ(ν) = E[e jν(X−Y )] = E[e jνXe− jνY ] = E[e jνX ]E[e j(−ν)Y ] = e−|ν |e−|−ν | = e−2|ν |,

which is the chf of a Cauchy(2) random variable. Since the Cauchy density is even,

P(|Z| ≤ 2) = 2

∫ 2

0fZ(z)dz= 2

[1

πtan−1

( z2

)+1

2

]∣∣∣∣2

0

=2

πtan−1(1) =

2

π· π

4=1

2.

61. Let X :=U+V +W be the sum of the three voltages. The alarm sounds if X > x. To

find P(X > x), we need the density of X . SinceU , V , andW are i.i.d. exp(λ ) randomvariables, by the remark in the statement of Problem 55(c), X ∼ Erlang(3,λ ). By

Problem 15(c),

P(X > x) =2

∑k=0

(λx)ke−λx

k!.

Page 63: Gubner

62 Chapter 4 Problem Solutions

62. Let Xi ∼ Cauchy(λ ) be the i.i.d. line loads. Let Y := X1 + · · ·+Xn be the total load.

The substation shuts down if Y > `. To find P(Y > `), we need to find the density ofY . By Problem 55(b), Y ∼ Cauchy(nλ ), and so

P(Y > `) =∫ ∞

`fY (y)dy =

[1

πtan−1

( y

)+1

2

]∣∣∣∣∞

`

= 1− 1

π

[tan−1

( `

)+1

2

]=

1

2− 1

πtan−1

( `

).

63. Let the Ui ∼ uniform[0,1] be the i.i.d. efficiencies of the extractors. Let Xi = 1 if

extractor i operates with efficiency less than 0.25; in symbols, Xi = I[0,0.25)(Ui), whichis Bernoulli(p) with p = 0.25. Then Y := X1 + · · ·+X13 is the number of extractors

operating at less than 0.25 efficiency. The outpost operates normally if Y < 3. We

must compute P(Y < 3). Since Y is the sum of i.i.d. Bernoulli(p) random variables,

Y ∼ binomial(13, p). Thus,

P(Y < 3) = P(Y = 0)+P(Y = 1)+P(Y = 2)

=

(13

0

)p0(1− p)13+

(13

1

)p(1− p)12 +

(13

2

)p2(1− p)11

= (1− p)11[(1− p)2+13p(1− p)+78p2] = 0.3326.

64. By the remark in the statement of Problem 55(c), R = T + A is chi-squared with

k = 2 degrees of freedom. Since the number of degrees of freedom is even, R is

Erlang(k/2,1/2) = Erlang(1,1/2) = exp(1/2). Hence,

P(R> r) =∫ ∞

r(1/2)e−x/2 dx = e−r/2.

65. (a) Since c2n+k is a density, it integrates to one. So,

∫ ∞

0ck,λ 2(x)dx =

∫ ∞

0

∑n=0

(λ 2/2)ne−λ 2/2

n!c2n+k(x)dx

=∞

∑n=0

(λ 2/2)ne−λ 2/2

n!

∫ ∞

0c2n+k(x)dx

︸ ︷︷ ︸= 1

=∞

∑n=0

(λ 2/2)ne−λ 2/2

n!︸ ︷︷ ︸Poisson(λ 2/2) pmf

= 1.

(b) The mgf is

Mk,λ 2(s) =∫ ∞

0esxck,λ 2(x)dx

=∫ ∞

0esx

[ ∞

∑n=0

(λ 2/2)ne−λ 2/2

n!c2n+k(x)

]dx

Page 64: Gubner

Chapter 4 Problem Solutions 63

=∞

∑n=0

(λ 2/2)ne−λ 2/2

n!

∫ ∞

0esxc2n+k(x)dx

=∞

∑n=0

(λ 2/2)ne−λ 2/2

n!

(1

1−2s

)(2n+k)/2

=e−λ 2/2

(1−2s)k/2

∑n=0

1

n!

(λ 2/2

1−2s

)n

=e−λ 2/2

(1−2s)k/2e(λ

2/2)/(1−2s) =exp

[(λ 2/2)

(1

1−2s −1)]

(1−2s)k/2

=exp

[(λ 2/2)

(2s

1−2s

)]

(1−2s)k/2=

exp[sλ 2/(1−2s)]

(1−2s)k/2.

(c) If we first note that

d

ds

(sλ 2

1−2s

)∣∣∣∣s=0

=(1−2s)λ 2− sλ 2(−2)

1−2s

∣∣∣∣s=0

= λ 2,

then it is easy to show that M′k,λ 2(s) has the general form

α(s)+β (s)

(1−2s)k, where

α(0) = λ 2 and β (0) = k. Hence, E[X ] =M′k,λ 2(0) = λ 2 + k.

(d) The usual mgf argument gives

MY (s) = E[esY ] = E[es(X1+···+Xn)] =n

∏i=1

Mki,λ2i(s)

=n

∏i=1

exp[sλ 2i /(1−2s)]

(1−2s)ki/2

=exp[s(λ 2

1 + · · ·+λ 2n )/(1−2s)]

(1−2s)(k1+···+kn)/2.

If we put k := k1+ · · ·+ kn and λ 2 := λ 21 + · · ·+λ 2

n , we see that Y is noncentral

chi-squared with k degrees of freedom and noncentrality parameter λ 2.

(e) We first consider

eλ√x+ e−λ

√x

2=

1

2

∑n=0

(λ√x)n

n![1+(−1)n] =

∑n=0

λ 2nxn

(2n)!

=∞

∑n=0

λ 2nxn

1 ·3 ·5 · · ·(2n−1) ·2n ·n!

=∞

∑n=0

2n(λ 2/2)n(x/2)n

1 ·3 ·5 · · ·(2n−1) ·n!

=∞

∑n=0

√π(λ 2/2)n(x/2)n

Γ( 2n+12 ) ·n!, by Problem 14(c).

Page 65: Gubner

64 Chapter 4 Problem Solutions

We can now write

e−(x+λ 2)/2

√2πx

· eλ√x+ e−λ

√x

2=

e−(x+λ 2)/2

√2πx

∑n=0

√π(λ 2/2)n(x/2)n

Γ( 2n+12 ) ·n!

=∞

∑n=0

(λ 2/2)ne−λ 2/2

n!· (1/2)(x/2)

n−1/2e−x/2

Γ( 2n+12 )

=∞

∑n=0

(λ 2/2)ne−λ 2/2

n!c2n+1(x) = c1,λ 2(x).

66. First, P(X ≥ a) =∫ ∞a 2x−3 dx = 1/a2, while, using the result of Problem 23, the

Markov bound is E[X ]/a = 2/a. Thus, the true probability is 1/a2, but the boundis 2/a, which decays much more slowly for large a.

67. We begin by noting that P(X ≥ a) = e−a, E[X ] = 1, and E[X2] = 2. Hence, the Markov

bound is 1/a, and the Chebyshev bound is 2/a2. To find the Chernoff bound, we mustminimize h(s) := e−saMX (s) = e−sa/(1− s) over 0< s< 1. Now,

h′(s) =(1− s)(−a)e−sa+ e−sa

(1− s)2 .

Solving h′(s) = 0, we find s= (a−1)/a, which is positive only for a> 1. Hence, the

Chernoff bound is valid only for a> 1. For a> 1, the Chernoff bound is

h((a−1)/a) =e−a·(a−1)/a

1− (a−1)/a= ae1−a.

(a) It is easy to see that the Markov bound is smaller than the Chebyshev bound

for 0 < a < 2. However, note that the Markov bound is greater than one for

0< a< 1, and the Chebyshev bound is greater than one for 0< a< 2.

(b) MATLAB.

1 2 3 4 5 60

0.5

1

1.5

2

P(X > a)

Markov 1/a

Chebyshev 2/a2

Chernoff ae1−a

a6 8 10 12 14 16 18 20

10−8

10−6

10−4

10−2

100

P(X > a)

Markov 1/a

Chebyshev 2/a2

Chernoff ae1−a

a

The Markov bound is the smallest on [1,2]. The Chebyshev bound is the small-est from a= 2 to a bit more than a= 5. Beyond that, the Chernoff bound is the

smallest.

Page 66: Gubner

CHAPTER 5

Problem Solutions

1. For x≥ 0,

F(x) =∫ x

0λe−λ t dt = −e−λ t

∣∣∣∣x

0

= 1− e−λx.

For x< 0, F(x) = 0.

2. For x≥ 0,

F(x) =∫ x

0

t

λ 2e−(t/λ )2/2 dt = −e−(t/λ )2/2

∣∣∣∣x

0

= 1− e−(x/λ )2/2.

For x< 0, F(x) = 0.

3. For x≥ 0,

F(x) =∫ x

0λ pt p−1e−λ t p dt = −e−λ t p

∣∣∣∣x

0

= 1− e−λxp .

For x< 0, F(x) = 0.

4. For x≥ 0, first write

F(x) =∫ x

0

√2

π

t2

λ 3e−(t/λ )2/2 dt =

∫ x/λ

0

√2

πθ 2e−θ2/2 dθ ,

where we have used the change of variable θ = t/λ . Next use integration by parts

with u= θ and dv= θe−θ2/2 dθ . Then

F(x) =

√2

π

−θe−θ2/2

∣∣∣∣x/λ

0

+∫ x/λ

0e−θ2/2 dθ

= −√

2

π

x

λe−(x/λ )2/2 +2

∫ x/λ

0

e−θ2/2

√2π

= −√

2

π

x

λe−(x/λ )2/2 +2[Φ(x/λ )−1/2]

= 2Φ(x/λ )−1−√

2

π

x

λe−(x/λ )2/2.

For x< 0, F(x) = 0.

5. For y > 0, F(y) = P(Y ≤ y) = P(eZ ≤ y) = P(Z ≤ lny) = FZ(lny). Then fY (y) =fZ(lny)/y for y> 0. Since Y := eZ > 0, fY (y) = 0 for y≤ 0.

65

Page 67: Gubner

66 Chapter 5 Problem Solutions

6. To begin, write FY (y) = P(Y ≤ y) = P(1−X ≤ y) = P(1− y ≤ X) = 1−FX (1− y).Thus, fY (y) =− fX (1− y) · (−1) = fX (1− y). In the case of X ∼ uniform(0,1),

fY (y) = fX (1− y) = I(0,1)(1− y) = I(0,1)(y),

since 0< 1− y< 1 if and only if 0< y< 1.

7. For y> 0,

FY (y) = P(Y ≤ y) = P(ln(1/X)≤ y) = P(1/X ≤ ey) = P(X ≥ e−y) = 1−FX (e−y).

Thus, fY (y) = − fX (e−y) · (−e−y) = e−y, since fX (e−y) = I(0,1)(e−y) = 1 for y > 0.

Since Y := ln(1/X) > 0, fY (y) = 0 for y≤ 0.

8. For y≥ 0,

FY (y) = P(λXP ≤ y) = P(X p ≤ y/λ ) = P(X ≤ (y/λ )1/p) = FX ((y/λ )1/p).

Thus, fY (y) = fX ((y/λ )1/p) · 1p(y/λ )(1/p)−1/λ . Using the formula for the Weibull

density, we find that fY (y) = e−y for y≥ 0. Since Y := λX p ≥ 0, fY (y) = 0 for y< 0.

Thus, Y ∼ exp(1).

9. For y≥ 0, write FY (y) = P(√X ≤ y) = P(X ≤ y2) = FX (y2) = 1− e−y2 . Thus,

fY (y) = −e−y2 · (−2y) =y

(1/√2)2

e−(y/(1/√2))2/2,

which is the Rayleigh(1/√2 ) density.

10. Recall that the moment generating function of X ∼ N(m,σ2) is MX (s) = E[esX ] =

esm+s2σ2/2. Thus,

E[Y n] = E[(eX )n] = E[enX ] = MX (n) = enm+n2σ2/2.

11. For y> 0, FY (y) = P(X2 ≤ y) = P(−√y≤ X ≤√y) = FX (√y)−FX (−√y). Thus,

fY (y) = fX (√y)( 1

2y−1/2)− fX (−√y)(− 1

2y−1/2).

Since fX is even,

fY (y) = y−1/2 fX (√y) =

e−y/2√2πy

, y> 0.

12. For y> 0,

FY (y) = P((X+m)2 ≤ y)= P(−√y≤ X+m≤√y)= P(−√y−m≤ X ≤√y−m)

= FX (√y−m)−FX (−√y−m).

Page 68: Gubner

Chapter 5 Problem Solutions 67

Thus,

fY (y) = fX (√y−m)( 1

2y−1/2)− fX (−√y−m)(− 1

2y−1/2)

=1√2πy

[e−(

√y−m)2/2 + e−(−

√2−m)2/2

]/2

=1√2πy

[e−(y−2√ym+m2)/2 + e−(y+2

√ym+m2)/2

]/2

=e−(y+m2)/2

√2πy

[em√y+ e−m

√y

2

], y> 0.

13. Using the example mentioned in the hint, we have

FXmax(z) =n

∏k=1

FXk(z) = F(z)n and FXmin(z) = 1−n

∏k=1

[1−FXk(z)] = 1−[1−F(z)]n.

14. Let Z := max(X ,Y ). Since X and Y are i.i.d., we have from the preceding problem

that FZ(z) = FX (z)2. Hence,

fZ(z) = 2FX (z) fX (z) = 2(1− e−λ z) ·λe−λ z, z≥ 0.

Next,

E[Z] =∫ ∞

0z fZ(z)dz = 2

∫ ∞

0λ ze−λ z(1− e−λ z)dz

= 2

∫ ∞

0z ·λe−λ z dz−

∫ ∞

0z · (2λ )e−(2λ )z dz

= 21

λ− 1

2λ=

3

2λ.

15. Use the laws of total probability and substitution and the fact that conditioned on

X = m, Y ∼ Erlang(m,λ ). In particular, E[Y |X = m] = m/λ . We can now write

E[XY ] =∞

∑m=0

E[XY |X = m]P(X = m) =∞

∑m=0

E[mY |X = m]P(X = m)

=∞

∑m=0

mE[Y |X = m]P(X = m) =∞

∑m=0

m(m/λ )P(X = m)

=1

λE[X2] =

µ + µ2

λ, since X ∼ Poisson(µ).

16. The problem statement tells us that P(Y > y|X = n) = e−ny. Using the law of total

probability and the pgf of X ∼ Poisson(λ ), we have

P(Y > y) =∞

∑n=0

P(Y > y|X = n)P(X = n) =∞

∑n=0

e−nyP(X = n)

= E[e−yX ] = GX (e−y) = eλ (e−y−1).

Page 69: Gubner

68 Chapter 5 Problem Solutions

17. (a) Using the law of substitution, independence, and the fact that Y ∼ N(0,1), write

FZ|X (z|i) = P(Z ≤ z|X = i) = P(X+Y ≤ z|X = i) = P(Y ≤ z− i|X = i)

= P(Y ≤ z− i) = Φ(z− i).

Next,

FZ(z) =1

∑i=0

FZ|X (z|i)P(X = i) = (1− p)Φ(z)+ pΦ(z−1),

and so

fZ(z) =(1− p)e−z2/2 + pe−(z−1)2/2

√2π

.

(b) From part (a), it is easy to see that fZ|X (z|i) = exp[−(z− i)2/2]/√2π . Hence,

fZ|X (z|1)fZ|X (z|0) ≥

P(X = 0)

P(X = 1)becomes

exp[−(z−1)2/2]

exp[−z2/2] ≥ 1− pp

,

or ez−1/2 ≥ (1− p)/p. Taking logarithms, we can further simplify this to

z ≥ 1

2+ ln

1− pp

.

18. Use substitution and independence to write

FZ|A,X (z|a, i) = P(Z ≤ z|A= a,X = i) = P(X/A+Y ≤ z|A= a,X = i)

= P(Y ≤ z− i/a|A= a,X = i) = P(Y ≤ z− i/a) = Φ(z− i/a).

19. (a) Write

FZn(z) = P(Zn ≤ z) = P(√Yn ≤ z) = P(Yn ≤ z2) = FYn(z

2).

For future reference, note that

fZn(z) = fYn(z2) · (2z).

Since Yn is chi-squared with n degrees of freedom, i.e., gamma(n/2,1/2),

fYn(y) = 12

(y/2)n/2−1e−y/2

Γ(n/2), y> 0,

and so

fZn(z) = z(z2/2)n/2−1e−z

2/2

Γ(n/2), z> 0.

(b) When n= 1, we obtain the folded normal density,

fZ1(z) = z(z2/2)−1/2e−z

2/2

Γ(1/2)= 2

e−z2/2

√2π

, z> 0.

Page 70: Gubner

Chapter 5 Problem Solutions 69

(c) When n= 2, we obtain the Rayleigh(1) density,

fZ2(z) = z(z2/2)0e−z

2/2

Γ(1)= ze−z

2/2, z> 0.

(d) When n= 3, we obtain the Maxwell density,

fZ3(z) = z(z2/2)1/2e−z

2/2

Γ(3/2)=

z2√2· e−z2/2

12Γ( 1

2)

=

√2

πz2e−z

2/2, z> 0.

(e) When n= 2m, we obtain the Nakagami-m density,

fZ2m(z) = z(z2/2)m−1e−z

2/2

Γ(m)=

2

2mΓ(m)z2m−1e−z

2/2, z> 0.

20. Let Z := X1 + · · ·+Xn. By Problem 55(a) in Chapter 4, Z ∼ N(0,n), and it followsthat V := Z/

√n ∼ N(0,1). We can now write Y = Z2 = nV 2. By Problem 11, V 2 is

chi-squared with one degree of freedom. Hence,

FY (y) = P(nV 2 ≤ y) = P(V 2 ≤ y/n),

and

fY (y) = fV 2(y/n)/n =e−(y/n)/2

√2πy/n

· 1n

=e−(y/n)/2

√2πny

, y> 0.

21. (a) Since FY (y) = P(Y ≤ y) = P(X1/q ≤ y) = P(X ≤ yq) = FX (yq),

fY (y) = fX (yq) · (qyq−1) = qyq−1 · (yq)p−1e−y

q

Γ(p)=

qyqp−1e−yq

Γ(p), y> 0.

(b) Since q > 0, as y→ 0, yq → 0, and e−yq → 1. Hence, the behavior of fY (y)

as y→ 0 is determined by the behavior of yp−1. For p > 1, p− 1 > 0, and

so yp−1 → 0. For p = 1, yp−1 = y0 = 1. For 0 < p < 1, p− 1 < 0 and so

yp−1 = 1/y1−p→ ∞. Thus,

limy→0

fY (y) =

0, p> 1,q/Γ(1/q), p= 1,

∞, 0 < p< 1.

(c) We begin with the given formula

fY (y) =λq(λy)p−1e−(λy)q

Γ(p/q), y> 0.

(i) Taking q= p and replacing λ with λ 1/p yields

fY (y) = λ 1/pp(λ 1/py)p−1e−(λ 1/py)p = λ 1/ppλ 1−1/pyp−1e−λyp

= λ pyp−1e−λyp ,

which is the Weibull(p,λ ) density.

Page 71: Gubner

70 Chapter 5 Problem Solutions

(ii) Taking p= q= 2 and replacing λ with 1/(√2λ ) yields

fY (y) = 2/(√2λ )[y/(

√2λ )]e−[y/(

√2λ )]2 = (y/λ 2)e−(y/λ )2/2,

which is the required Rayleigh density.

(iii) Taking p= 3, q= 2, and replacing λ with 1/(√2λ ) yields

fY (y) =2/(√2λ )[y/(

√2λ )]2e−[y/(

√2λ )]2

Γ(3/2)=

(y2/λ 3)e−(y/λ )2/2

√2 · 1

2Γ( 1

2)

=

√2

π

y2

λ 3e−(y/λ )2 ,

which is the required Maxwell density.

(d) In

E[Y n] =∫ ∞

0yn

λq(λy)p−1e−(λy)q

Γ(p/q)dy,

make the change of variable t = (λy)q, dt = q(λy)q−1λ dy. Then

E[Y n] =1

Γ(p/q)λ n

∫ ∞

0(λy)n(λy)p−qe−(λy)qλq(λy)q−1 dy

=1

Γ(p/q)λ n

∫ ∞

0(t1/q)n+p−qe−t dt

=1

Γ(p/q)λ n

∫ ∞

0t(n+p)/q−1e−t dt =

Γ((n+ p)/q)

Γ(p/q)λ n.

(e) We use the same change of variable as in part (d) to write

FY (y) =∫ y

0

λq(λθ)p−1e−(λθ)q

Γ(p/q)dθ =

1

Γ(p/q)

∫ (λy)q

0(t1/q)p−qe−t dt

=1

Γ(p/q)

∫ (λy)q

0t(p/q)−1e−t dt = Gp/q((λy)

q).

22. Following the hint, let u = t−1 and dv = te−t2/2 dt so that du = −1/t2 dt and v =

−e−t2/2. Then∫ ∞

xe−t

2/2 dt = −e−t2/2

t

∣∣∣∣∞

x

−∫ ∞

x

e−t2/2

t2dt

=e−x

2/2

x−

∫ ∞

x

e−t2/2

t2dt (∗)

<e−x

2/2

x.

The next step is to write the integral in (∗) as∫ ∞

x

e−t2/2

t2dt =

∫ ∞

x

1

t3· te−t2/2 dt

Page 72: Gubner

Chapter 5 Problem Solutions 71

and apply integration by parts with u= t−3 and dv= te−t2/2 dt so that du=−3/t4 dt

and v=−e−t2/2. Then∫ ∞

x

e−t2/2

t2dt = −e

−t2/2

t3

∣∣∣∣∞

x

−3

∫ ∞

x

e−t2/2

t4dt

=e−x

2/2

x3−3

∫ ∞

x

e−t2/2

t4dt.

Substituting this into (∗), we find that∫ ∞

xe−t

2/2 dt =e−x

2/2

x− e

−x2/2

x3+3

∫ ∞

x

e−t2/2

t4dt

≥ e−x2/2

x− e

−x2/2

x3.

23. (a) Write

E[FcX (Z)] =∫ ∞

−∞FcX (z) fZ(z)dz

=∫ ∞

−∞

[∫ ∞

zfX (x)dx

]fZ(z)dz

=∫ ∞

−∞

[∫ ∞

−∞I(z,∞)(x) fX (x)dx

]fZ(z)dz

=∫ ∞

−∞fX (x)

[∫ ∞

−∞I(z,∞)(x) fZ(z)dz

]dx

=∫ ∞

−∞fX (x)

[∫ ∞

−∞I(−∞,x)(z) fZ(z)dz

]dx

=∫ ∞

−∞fX (x)

[∫ x

−∞fZ(z)dz

]dx

=∫ ∞

−∞fX (x)FZ(x)dx = E[FZ(X)].

(b) From Problem 15(c) in Chapter 4, we have

FZ(z) = 1−m−1∑k=0

(λ z)ke−λ z

k!, for z≥ 0,

and FZ(z) = 0 for z< 0. Hence,

E[FZ(X)] = E

[I[0,∞)(X)

(1−

m−1∑k=0

(λX)ke−λX

k!

)]

= P(X ≥ 0)−m−1∑k=0

λ k

k!E[Xke−λX I[0,∞)(X)].

Page 73: Gubner

72 Chapter 5 Problem Solutions

(c) Let X ∼ N(0,1) so that E[Q(Z)] = E[FcX (Z)] = E[FZ(X)]. Next, since Z ∼exp(λ ) = Erlang(1,λ ), we can use the result of part (b) to write

E[FZ(X)] = P(X ≥ 0)−E[e−λX I[0,∞)(X)]

=1

2−

∫ ∞

0e−λx e

−x2/2√2π

dx

=1

2− e

λ 2/2

√2π

∫ ∞

0e−(x2+2xλ+λ 2)/2 dx

=1

2− e

λ 2/2

√2π

∫ ∞

0e−(x+λ )2/2 dx

=1

2− e

λ 2/2

√2π

∫ ∞

λe−t

2/2 dt =1

2− eλ 2/2Q(λ ).

(d) Put Z := σ√Y and make the following observations. First, for z≥ 0,

FZ(z) = P(σ√Y ≤ z) = P(σ2Y ≤ z2) = P(Y ≤ (z/σ)2) = FY ((z/σ)2),

and FZ(z) = 0 for z < 0. Second, since Y is chi-squared with 2m-degrees of

freedom, Y ∼ Erlang(m,1/2). Hence,

FZ(z) = 1−m−1∑k=0

((z/σ)2/2)ke−(z/σ)2/2

k!, for z≥ 0.

Third, with X ∼ N(0,1),

E[Q(σ√Y )] = E[Q(Z)] = E[FcX (Z)] = E[FZ(X)]

is equal to

E

[I[0,∞)(X)

(1−

m−1∑k=0

((X/σ)2/2)ke−(X/σ)2/2

k!

)],

which simplifies to

P(X ≥ 0)−m−1∑k=0

1

σ2k2kk!E[X2ke−(X/σ)2/2I[0,∞)(X)].

Now, P(X ≥ 0) = 1/2, and with σ2 := (1+1/σ2)−1, we have

E[X2ke−(X/σ)2/2I[0,∞)(X)] =

∫ ∞

0x2ke−(x/σ)2/2 e

−x2/2√2π

dx

=σ√2π σ

∫ ∞

0x2ke−(x/σ)2/2 dx

2· 1√

2π σ

∫ ∞

−∞x2ke−(x/σ)2/2 dx

2·1 ·3 ·5 · · ·(2k−1) · σ2k.

Putting this all together, the desired result follows.

Page 74: Gubner

Chapter 5 Problem Solutions 73

(e) If m= 1 in part (d), then Z defined in solution of part (d) is Rayleigh(σ) by thesame argument as in the solution of Problem 19(b). Hence, the desired result

follows by taking m= 1 in the result of part (d).

(f) Since the Vi are independent exp(λi), the mgf of Y :=V1 + · · ·+Vm is

MY (s) =m

∏k=1

λkλk− s

=m

∑k=1

ckλk

λk− s,

where the ck are the result of expansion by partial fractions. Inverse transform-

ing term by term, we find that

fY (y) =m

∑k=1

ck ·λke−λky, y≥ 0.

Using this, we can write

FY (y) =m

∑k=1

ck(1− e−λky), y≥ 0.

Next, put Z :=√Y , and note that for z≥ 0,

FZ(z) = P(√Y ≤ z) = P(Y ≤ z2) = FY (z2).

Hence,

FZ(z) =m

∑k=1

ck(1− e−λkz2), z≥ 0.

We can now write that E[Q(Z)] = E[FcX (Z)] = E[FZ(X)] is equal to

E

[I[0,∞)(X)

(m

∑k=1

ck(1− e−λkX2)

)].

Now observe that with σ2k := (1+2λk)

−1,

E[I[0,∞)(X)e−λkX2] =

∫ ∞

0e−λkx

2 e−x2/2

√2π

dx =1

2

∫ ∞

−∞e−λkx

2 e−x2/2

√2π

dx

=1

2

∫ ∞

−∞

e−(1+2λk)x2/2

√2π

dx =σk2

∫ ∞

−∞

e−(x/σk)2/2

√2π σk

dx

=σk2

=1

2√1+2λk

.

Putting this all together, the desired result follows.

24. Recall that the noncentral chi-squared density with k degrees of freedom and noncen-

trality parameter λ 2 is given by

ck,λ 2(t) :=∞

∑n=0

(λ 2/2)ne−λ 2/2

n!c2n+k(t), t > 0,

Page 75: Gubner

74 Chapter 5 Problem Solutions

where c2n+k denotes the central chi-squared density with 2n+ k degrees of freedom.Hence,

Ck,λ 2(x) =∫ x

0ck,λ 2(t)dt =

∫ x

0

∑n=0

(λ 2/2)ne−λ 2/2

n!c2n+k(t)dt

=∞

∑n=0

(λ 2/2)ne−λ 2/2

n!

∫ x

0c2n+k(t)dt =

∑n=0

(λ 2/2)ne−λ 2/2

n!C2n+k(x).

25. (a) Begin by writing

FZn(z) = P(√Yn ≤ z) = P(Yn ≤ z2) = FYn(z

2).

Then fZn(z) = fYn(z2) ·2z. Next observe that

In/2−1(mz) =∞

∑`=0

(mz/2)2`+n/2−1

`!Γ(`+(n/2)−1+1)=

∑`=0

(mz/2)2`+n/2−1

`!Γ(`+n/2).

From Problem 65 in Chapter 4,

fYn(y) :=∞

∑`=0

(m2/2)`e−m2/2

`!c2`+n(y)

=∞

∑`=0

(m2/2)`e−m2/2

`!· 12

(y/2)(2`+n)/2−1e−y/2

Γ((2`+n)/2).

So,

fZn(z) = fYn(z2) ·2z = 2ze−(m2+z2)/2 · 1

2

∑`=0

(m2/2)`(z2/2)`+n/2−1

`!Γ(`+n/2)

= ze−(m2+z2)/2∞

∑`=0

(mz)2`+n/2−1m−n/2+1(1/2)2`+n/2−1zn/2−1

`!Γ(`+n/2)

=zn/2

mn/2−1e−(m2+z2)In/2−1(mz).

(b) Obvious.

(c) Begin by writing

FYn(y) = P(Z2n ≤ y) = P(Zn ≤√y) = FZn(y

1/2).

Then

fYn(y) = fZn(y1/2) · y−1/2/2

= 12

(y1/2)n/2

mn/2−1e−(m2+y)/2In/2−1(m

√y)y−1/2

= 12

(√ym

)n/2−1e−(m2+y)/2In/2−1(m

√y).

Page 76: Gubner

Chapter 5 Problem Solutions 75

(d) First write

FcZn(z) =∫ ∞

z

tn/2

mn/2−1e−(m2+t2)/2In/2−1(mt)dt

=∫ ∞

z

(mt)n/2

mn−1e−(m2+t2)/2In/2−1(mt)dt.

Now apply integration by parts with

u = (mt)n/2−1In/2−1(mt) and dv = mte−t2/2e−m

2/2/mn−1 dt.

Then

v=−me−m2/2e−t2/2/mn−1, and by the hint, du= (mt)n/2−1In/2−2(mt) ·mdt.

Thus,

FcZn(z) = −(mt)n/2−1In/2−1(mt)e−(m2+t2)/2

mn−2

∣∣∣∣∞

z

+∫ ∞

z

(mt)n/2−1

mn−3e−(m2+t2)/2In/2−2(mt)dt

=( zm

)n/2−1e−(m2+z2)/2In/2−1(mz)+FZn−2(z).

(e) Using induction, this is immediate from part (d).

(f) It is easy to see that Q(0,z) = Q(0,z) = e−z2/2. We then turn to

∂mQ(m,z) =

∂me−(m2+z2)/2

∑k=0

(m/z)kIk(mz)

= e−(m2+z2)/2∞

∑k=0

−m(m/z)kIk(mz)+ z−2k(mz)kIk−1(mz)z

= e−(m2+z2)/2∞

∑k=0

−m(m/z)kIk(mz)+(m/z)kIk−1(mz)z

= ze−(m2+z2)/2∞

∑k=0

(m/z)kIk−1(mz)− (m/z)Ik(mz)

= ze−(m2+z2)/2∞

∑k=0

(m/z)kIk−1(mz)− (m/z)k+1Ik(mz)

= ze−(m2+z2)/2I−1(mz) = ze−(m2+z2)/2I1(mz).

To conclude, we compute

∂Q

∂m=

∂m

∫ ∞

zte−(m2+t2)/2I0(mt)dt

=∫ ∞

z−mte−(m2+t2)/2I0(mt)dt+

∫ ∞

zte−(m2+t2)/2I−1(mt) · t dt.

Page 77: Gubner

76 Chapter 5 Problem Solutions

Write this last integral as

∫ ∞

z(mt)I1(mt) · (t/m)e−(m2+t2)/2 dt.

Now apply integration by parts with u= (mt)I1(mt) and dv= te−(t2+m2)/2/mdt.

Then du = (mt)I0(mt)mdt and v = −e−(m2+t2)/2/m, and the above integral isequal to

ze−(m2+z2)/2I1(mz)+∫ ∞

zmte−(m2+t2)/2I0(mt)dt.

Putting this all together, we find that ∂Q/∂m= ze−(m2+z2)/2I1(mz).

26. Recall that Iν(x) :=∞

∑`=0

(x/2)2`+ν

`!Γ(`+ν +1).

(a) Write

Iν(x)

(x/2)ν=

1

Γ(ν +1)+

∑`=1

(x/2)2`

`!Γ(`+ν +1)→ 1

Γ(ν +1)as x→ 0.

Now write

fZn(z) =zn/2

mn/2−1e−(m2+z2)/2In/2−1(mz)

=zn/2

mn/2−1e−(m2+z2)/2 In/2−1(mz)

(mz/2)n/2−1(mz/2)n/2−1

=zn−1

2n/2−1e−(m2+z2)/2 In/2−1(mz)

(mz/2)n/2−1.

Thus,

limz→0

fZn(z) =e−m

2/2

2n/2−1Γ(n/2)limz→0

zn−1 =

0, n> 1,

e−m2/2

√2/π, n= 1,

∞, 0 < n< 1.

(b) First note that

Iν−1(x) =∞

∑`=0

(x/2)2`+ν−1

`!Γ(`+ν)=

∑`=0

(`+ν)(x/2)2`+ν−1

`!Γ(`+ν +1).

Second, with the change of index ` = k+1,

Iν+1(x) =∞

∑k=0

(x/2)2k+ν+1

k!Γ(k+ν +2)=

∑`=1

(x/2)2`+ν−1

(`−1)!Γ(`+ν +1)

=∞

∑`=1

`(x/2)2`+ν−1

`!Γ(`+ν +1)=

∑`=0

`(x/2)2`+ν−1

`!Γ(`+ν +1).

Page 78: Gubner

Chapter 5 Problem Solutions 77

It is now easy to see that

Iν−1(x)+ Iν+1(x) =∞

∑`=0

(2`+ν)(x/2)2`+ν−1

`!Γ(`+ν +1)= 2I′ν(x).

It is similarly easy to see that

Iν−1(x)− Iν+1(x) =∞

∑`=0

ν(x/2)2`+ν−1

`!Γ(`+ν +1)= 2(ν/x)Iν(x).

(c) To the integral In(x) := (2π)−1∫ π−π e

xcosθ cos(nθ)dθ , apply integration by parts

with u = excosθ and dv = cosnθ dθ . Then du = excosθ (−xsinθ)dθ and v =sin(nθ)/n. We find that

In(x) =1

[excosθ sinnθ

n

∣∣∣∣π

−π︸ ︷︷ ︸= 0

+x

n

∫ π

−πexcosθ sinnθ sinθ dθ

].

We next use the identity sinAsinB= 12[cos(A−B)− cos(A+B)] to get

In(x) =x

2n[In−1(x)− In+1(x)].

(d) Since I0(x) := (1/π)∫ π0 e

xcosθ dθ , make the change of variable t = θ − π/2.Then

I0(x) =1

π

∫ π/2

−π/2excos(t+π/2) dt =

1

π

∫ π/2

−π/2e−xsin t dt

=1

π

∫ π/2

−π/2

∑k=0

(−xsin t)kk!

dt =1

π

∑k=0

(−x)kk!

∫ π/2

−π/2sink t dt

︸ ︷︷ ︸= 0 for k odd

=1

π

∑`=0

x2`

(2`)!

∫ π/2

−π/2sin2` t dt =

2

π

∑`=0

x2`

(2`)!

∫ π/2

0sin2` t dt

=2

π

∑`=0

x2`

(2`)!·

Γ

(2`+1

2

)√π

(2`+2

2

) , by Problem 18 in Ch. 4.

Now,

(2`)! = 1 ·3 ·5 · · ·(2`−1) ·2 ·4 ·6 · · ·2` = 1 ·3 ·5 · · ·(2`−1) ·2``!,

and from Problem 14(c) in Chapter 4,

Γ

(2`+1

2

)=

1 ·3 ·5 · · ·(2`−1)

2`

√π.

Hence,

I0(x) =∞

∑`=0

(x/2)2`

`!Γ(`+1)=: I0(x).

Page 79: Gubner

78 Chapter 5 Problem Solutions

(e) Begin by writing

In±1(x) =1

∫ π

−πexcosθ cos([n±1]θ)dθ

=1

∫ π

−πexcosθ [cosnθ cosθ ∓ sinnθ sinθ ]dθ .

Then12[In−1(x)+ In+1(x)] =

1

∫ π

−πexcosθ cosnθ cosθ dθ .

Since

I′n(x) =∂

∂x

[1

∫ π

−πexcosθ cos(nθ)dθ

]=

1

∫ π

−πexcosθ cos(nθ)cosθ dθ ,

we see that 12[In−1(x)+ In+1(x)] = I′n(x).

27. MATLAB.

0 1 2 3 40

0.1

0.2

0.3

0.4

0.5

28. See previous problem solution for graph. The probabilities are:

k P(X = k)0 0.0039

1 0.0469

2 0.2109

3 0.4219

4 0.3164

29. (a) Sketch of f (t):

0 1

1

1/2

1/3

1/6

Page 80: Gubner

Chapter 5 Problem Solutions 79

(b) P(X = 0) =∫0 f (t)dt = 1/2, and P(X = 1) =

∫1 f (t)dt = 1/6.

(c) We have

P(0< X < 1) =∫ 1−

0+f (t)dt =

∫ 1−

0+

13e−tu(t)+ 1

2δ (t)+ 1

6δ (t−1)dt

=∫ 1−

0+

13e−t dt =

∫ 1

0

13e−t dt = − 1

3e−t

∣∣∣1

0=

1− e−13

and

P(X > 1) =∫ ∞

1+f (t)dt =

∫ ∞

1

13e−t dt = − 1

3e−t

∣∣∣∞

1= e−1/3.

(d) Write

P(0≤ X ≤ 1) = P(X = 0)+P(0< X < 1)+P(X = 1)

=1

2+1− e−1

3+1

6= 1− e−1/3

and

P(X > 1) = P(X = 1)+P(X > 1) = 1/6+ e−1/3 =1+2e−1

6.

(e) Write

E[X ] =

∫ ∞

−∞t f (t)dt =

∫ ∞

0

t3e−t dt+0 ·P(X = 0)+1 ·P(X = 1)

= 13E[exp RV w/λ = 1]+1/6 = 1/3+1/6 = 1/2.

30. For the first part of the problem, we have E[eX ] =∫ ∞−∞ e

x · 12[δ (x) + I(0,1](x)]dx =

12e0 + 1

2

∫ 10 e

x dx = 1/2+ ex/2∣∣∣1

0= 1/2+(e− 1)/2 = e/2. For the second part, first

write

P(X = 0|X ≤ 1/2) =P(X = 0∩X ≤ 1/2)

P(X ≤ 1/2)=

P(X = 0)

P(X ≤ 1/2).

Since P(X = 0) = 1/2 and

P(X ≤ 1/2) =∫ 1/2

−∞

12[δ (x)+ I(0,1](x)]dx= 1

2+ 1

2

∫ 1/2

0dx= 1

2+ 1

4= 3

4,

we have P(X = 0|X ≤ 1/2) = 1/23/4 = 2/3.

31. The approach is to find the density and then compute E[X ] =∫ ∞−∞ x fX (x)dx. The catch

is that the cdf has a jump at x= 1/2, and so the density has an impulse there. Put

fX (x) :=

2x, 0 < x< 1/2,1, 1/2 < x< 1,0, otherwise.

Page 81: Gubner

80 Chapter 5 Problem Solutions

Then the density is fX (x) = fX (x)+ 14δ (x−1/2). Hence,

E[X ] =

∫ ∞

−∞x fX (x)dx =

∫ 1/2

0x ·2xdx+

∫ 1

1/2x ·1dx+ 1

4· 12

= 23

(12

)3+ 1

2[1−

(12

)2]+ 1

8

= 112

+ 38+ 1

8= 11

24+ 1

8= 7/12.

32. The approach is to find the density and then compute E[√X ] =

∫ ∞−∞

√x fX (x)dx. The

catch is that the cdf has a jump at x= 4, and so the density has an impulse there. Put

fX (x) :=

x−1/2/8, 0< x< 4,1/20, 4< x< 9,0, otherwise.

Then the density is fX (x) = fX (x)+ 14δ (x−4). To begin, we compute

∫ ∞

−∞x1/2 fX (x)dx =

∫ 4

01/8dx+

∫ 9

4x1/2/20dx = 1/2+ x3/2/30

∣∣∣9

4

= 1/2+(27−8)/30 = 1/2+19/30 = 17/15.

The complete answer is

E[√X ] = 17/15+

√4/4 = 17/15+1/2 = 34/30+15/30 = 49/30.

33. First note that for y< 0,∫ y

−∞e−|t| dt =

∫ y

−∞et dt = ey,

and for y≥ 0,

∫ y

−∞e−|t| dt =

∫ 0

−∞et dt+

∫ y

0e−t dt = 1+(−e−t)

∣∣∣y

0= 1+1− e−y = 2− e−y.

Hence,

FY (y) =

ey/4, y< 0,(2− e−y)/4+1/3, 0≤ y< 7,(2− e−y)/4+1/3+1/6, y≥ 7,

=

ey/4, y< 0,5/6− e−y/4, 0≤ y< 7,1− e−y/4, y≥ 7.

−2 0 2 4 6 80

0.25

0.5

0.75

1

Page 82: Gubner

Chapter 5 Problem Solutions 81

34. Let Y = 0 = loose connection, and Y = 1 = Y = 0c. Then P(Y = 0) =P(Y = 1) = 1/2. Using the law of total probability,

FX (x) = P(X ≤ x) =1

∑i=0

P(X ≤ x|Y = i)P(Y = i)

= 12

[P(X ≤ x|Y = 0)+

∫ x

−∞I(0,1](t)dt

].

Since P(X = 0|Y = 0) = 1, we see that

FX (x) =

1, x≥ 1,12(1+ x), 0≤ x< 1,0, x< 0.

Since there is a jump at x= 0, we must be careful in computing the density. It is

fX (x) = 12[I(0,1)(x)+δ (x)].

0 1

1

1/2

0 1

1

1/2

cdf density

35. (a) Recall that as x varies from +1 to −1, cos−1 x varies from 0 to π . Hence,

FX (x) = P(cosΘ≤ x) = P(Θ∈ [−π,−θx]∪ [θx,π]

), where θx := cos−1 x. Since

Θ∼ uniform[−π,π],

FX (x) =π−θx2π

+−θx− (−π)

2π= 2

π−θx2π

= 1− cos−1 xπ

.

(b) Recall that as y varies from +1 to −1, sin−1 y varies from π/2 to −π/2. Fory≥ 0,

FY (y) = P(sinΘ≤ y) = P(Θ ∈ [−π,θy]∪ [π−θy,π]

)

=θy2π

+θy− (−π)

2π=

π +2θy2π

=1

2+sin−1 y

π,

and for y< 0,

FY (y) = P(sinΘ≤ y) = P(Θ ∈ [−π−θy,θy])

=2θy+π

2π=

1

2+sin−1 y

π.

Page 83: Gubner

82 Chapter 5 Problem Solutions

(c) Write

fX (x) = − 1

π· −1√

1− x2=

1/π√1− x2

,

and

fY (y) =1

π· 1√

1− y2=

1/π√1− y2

.

(d) First write

FZ(z) = P(Z ≤ z) = P

(Y +1

2≤ z

)= P(Y ≤ 2z−1) = FY (2z−1).

Then differentiate to get

fZ(z) = fY (2z−1) ·2 =2

π√1− (2z−1)2

=1

√π√

π√z(1− z)

=Γ( 1

2+ 1

2)

Γ( 12)Γ( 1

2)z1/2−1(1− z)1/2−1,

which is the beta density with p= q= 1/2.

36. The cdf is

FY (y) =

1, y≥ 3,∫ −1+√1+y

−1−√1+y1/4dx, −1≤ y< 3,

0, y<−1,

=

1, y≥ 3,12

√1+ y, −1≤ y< 3,0, y<−1,

and the density is

fY (y) =

1

4√1+ y

, −1< y< 3,

0, otherwise.

37. The cdf is

FY (y) =

1, y≥ 2,

2(3−√2/y)

6, 1/2≤ y< 2,

1/3, 0≤ y< 1/2,0, y< 0,

and the density is

fY (y) =

√2

6y−3/2I(1/2,2)(y)+

1

3δ (y)+

1

3δ (y−2).

Page 84: Gubner

Chapter 5 Problem Solutions 83

38. For 0≤ y≤ 1, we first compute the cdf

FY (y) = P(Y ≤ y) = P(√1−R2 ≤ y) = P(1−R2 ≤ y2) = P(1− y2 ≤ R2)

= P(√1− y2 ≤ R) =

√2−

√1− y2√2

= 1− 1√2(1− y2)1/2.

We then differentiate to get the density

fY (y) =

y√2(1− y2)

, 0< y< 1,

0, otherwise.

39. The first thing to note is that for 0 ≤ R ≤√2, 0 ≤ R2 ≤ 2. It is then easy to see that

the minimum value of Z = [R2(1−R2/4)]−1 occurs when R2 = 2 or R=√2. Hence,

the random variable Z takes values in the range [1,∞). So, for z≥ 1, we write

FZ(z) = P

(1

R2(1−R2/4) ≤ z)

= P

(R2(1−R2/4)≥ 1/z

)

= P

((R2/4)(1−R2/4)≥ 1/(4z)

).

Put y := 1/(4z) and observe that x(1− x) ≥ y if and only if 0 ≥ x2 − x+ y, with

equality if and only if

x =1±√1−4y

2.

Since we will have x= R2/4≤ 1/2, we need the negative root. Thus,

FZ(z) = P

((R2/4)≥ 1−√1−4y

2

)= P

(R2 ≥ 2[1−

√1−1/z ]

)

= P

(R≥

√2[1−

√1−1/z ]

)=

√2−

√2[1−

√1−1/z ]

√2

= 1−√1−

√1−1/z.

Differentiating, we obtain

fZ(z) = −1

2

(1−

√1−1/z

)−1/2 ddz

[1−√1−1/z ]

= −1

2

(1−

√1−1/z

)−1/2 ·−1

2(1−1/z)−1/2 · 1

z2

=1

4z2[(1−

√1−1/z)(1−1/z)]−1/2.

40. First note that as R varies from 0 to√2, T varies from π to

√2π . For π ≤ t ≤

√2π ,

write

FT (t) = P(T ≤ t) = P

(π√

1−R2/4≤ t

)= P

(π2

t2≤ 1−R2/4

)

Page 85: Gubner

84 Chapter 5 Problem Solutions

= P(R2/4≤ 1−π2/t2) = P(R2 ≤ 4[1−π2/t2])

= P

(R≤ 2

√1−π2/t2

)=√2(1−π2/t2)1/2.

Differentiating, we find that

fT (t) =π2√2

t3(1−π2/t2)−1/2 =

π2√2

t2√t2−π2

.

For the second part of the problem, observe that as R varies between 0 and√2, M

varies between 1 and e−π . For m in this range, note that lnm< 0, and write

FM(m) = P(M ≤ m) = P(e−π(R/2)/√

1−R2/4 ≤ m)

= P

(π(R/2)

/√1−R2/4≥− lnm

)

= P

(π2(R2/4)

/(1−R2/4)≥ (− lnm)2

)

= P

((R2/4)≥ 1/[1+π/(− lnm)2]

)

= P

(R≥ 2[1+π/(− lnm)2]−1/2

)

= 1−√2[1+π/(− lnm)2]−1/2,

Differentiating, we find that

fM(m) =

√2π2

m(− lnm)3[1+π/(− lnm)2]−3/2 =

√2π2

m[(− lnm)2 +π2]3/2.

41. (a) For X ∼ uniform[−1,1],

FY (y) =

1, y≥ 1,(y+

√y)/2, 0≤ y< 1,

0, y< 0,

and

fY (y) =

12[1+1/(2

√y)], 0< y< 1,

0, otherwise.

(b) For X ∼ uniform[−1,2],

FY (y) =

1, y≥ 2,(y+1)/3, 1≤ y< 2,

(y+√y)/3, 0≤ y< 1,

0, y< 0,

and

fY (y) =

1/3, 1 < y< 213[1+1/(2

√y)], 0 < y< 1,

0, otherwise.

Page 86: Gubner

Chapter 5 Problem Solutions 85

(c) For X ∼ uniform[−2,3],

FY (y) =

1, y≥ 2,(y+2)/5, 1≤ y< 2,

(y+√y)/5, 0≤ y< 1,

0, y< 0,

and

fY (y) =1

5[I(1,2)(y)+1+1/(2

√y)I(0,1)(y)+δ (y−1)+δ (y−2)].

(d) For X ∼ exp(λ ),

FY (y) =

1, y≥ 2,P(X ≤ y)+P(X ≥ 3), 0≤ y< 2,

0, y< 0,

=

1, y≥ 2,

(1− e−λy)+ e−3λ , 0≤ y< 2,0, y< 0,

and

fY (y) = λe−λyI(0,2)(y)+ e−3λ δ (y)+(e−2λ − e−3λ )δ (y−2).

42. (a) If X ∼ uniform[−1,1], then Y = g(X) = 0, and so

FY (y) = u(y) (the unit step function), and fY (y) = δ (y).

(b) If X ∼ uniform[−2,2],

FY (y) =

1, y≥ 1,12(y+1), 0≤ y< 1,0, y< 0,

and

fY (y) =1

2[I(0,1)(y)+δ (y)].

(c) We have

FY (y) =

1, y≥ 1,(y+1)/3, 0≤ y< 1,

0, y< 0,

and

fY (y) = 13[I(0,1)(y)+δ (y)+δ (y−1)].

(d) If X ∼ Laplace(λ ),

FY (y) =

1, y≥ 1,

2∫ y+10

λ2e−λx dx, 0≤ y< 1,

0, y< 0,

=

1, y≥ 1,

1− e−λ (y+1), 0≤ y< 1,0, y< 0,

Page 87: Gubner

86 Chapter 5 Problem Solutions

and

fY (y) = λe−λ (y+1)I(0,1)(y)+(1− e−λ )δ (y)+ e−2λ δ (y−1).

43. (a) If X ∼ uniform[−3,2],

FY(y) =
    1,                            y ≥ 1,
    (1/5)(y^{1/3} + y + 2),       0 ≤ y < 1,
    (1/5)[y + 2 − (−y)^{1/2}],    −1 ≤ y < 0,
    0,                            y < −1,

and

fY(y) = (1/5)[1 + 1/(3y^{2/3})] I(0,1)(y) + (1/5)[1 + 1/(2√(−y))] I(−1,0)(y) + (1/5)δ(y−1).

(b) If X ∼ uniform[−3,1],

FY(y) =
    1,                            y ≥ 1,
    (1/4)(y^{1/3} + y + 2),       0 ≤ y < 1,
    (1/4)[y + 2 − (−y)^{1/2}],    −1 ≤ y < 0,
    0,                            y < −1,

and

fY(y) = (1/4)[1 + 1/(3y^{2/3})] I(0,1)(y) + (1/4)[1 + 1/(2√(−y))] I(−1,0)(y).

(c) If X ∼ uniform[−1,1],

FY(y) =
    1,                        y ≥ 1,
    (1/2)(y^{1/3} + 1),       0 ≤ y < 1,
    (1/2)[1 − (−y)^{1/2}],    −1 ≤ y < 0,
    0,                        y < −1,

and

fY(y) = [1/(6y^{2/3})] I(0,1)(y) + [1/(4√(−y))] I(−1,0)(y).

44. We have

FY(y) =
    0,                        y < −1,
    (1/6)[y + 1 + √(y+1)],    −1 ≤ y < 1,
    (1/6)[3 + √(y+1)],        1 ≤ y < 8,
    1,                        y ≥ 8,

and

fY(y) = (1/6)δ(y−1) + [1/6 + (1/12)(y+1)^{−1/2}] I(−1,1)(y) + (1/12)(y+1)^{−1/2} I(1,8)(y).

45. We have

FY(y) =
    0,                       y < 0,
    (y² + √y + y + 1)/4,     0 ≤ y < 1,
    1,                       y ≥ 1,

and

fY(y) = (1/4)δ(y) + (1/4)[2y + 1/(2√y) + 1] I(0,1)(y).


46. We have

FY(y) =
    1,                      y ≥ 1,
    (1/6)[4 + 3y − y²],     0 ≤ y < 1,
    (1/6)[3 + 2y − y²],     −1 ≤ y < 0,
    0,                      y < −1,

and

fY(y) = [1/2 − y/3] I(0,1)(y) + (1/3)[1 − y] I(−1,0)(y) + (1/6)δ(y).

47. We have

FY(y) =
    1,                             y ≥ 1,
    (1/3)[1 + y + √(y/(2−y))],     0 ≤ y < 1,
    0,                             y < 0,

and

fY(y) = (1/3)[ δ(y) + ( 1 + (y/(2−y))^{−1/2}/(2−y)² ) I(0,1)(y) ].

48. Observe that g is a periodic sawtooth function with period one. Also note that since
0 ≤ g(x) < 1, Y = g(X) takes values in [0,1).

(a) We begin by writing

FY(y) = P(Y ≤ y) = P(g(X) ≤ y) = ∑_{k=0}^∞ P(k ≤ X ≤ k+y) = ∑_{k=0}^∞ [FX(k+y) − FX(k)].

When X ∼ exp(1), we obtain, for 0 ≤ y < 1,

FY(y) = ∑_{k=0}^∞ [(1 − e^{−k}e^{−y}) − (1 − e^{−k})] = ∑_{k=0}^∞ e^{−k}(1 − e^{−y}) = (1−e^{−y})/(1−e^{−1}).

Differentiating, we get

fY(y) = e^{−y}/(1−e^{−1}), 0 ≤ y < 1.

We say that Y has a “truncated” exponential density.
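Though not part of the printed solution, a small MATLAB sketch (in the spirit of the MATLAB problems elsewhere in this manual) can check part (a) numerically; the sample size n and the evaluation points y below are arbitrary choices, not from the problem.

n = 1e6;
X = -log(rand(n,1));                     % X ~ exp(1) by inverse transform
Y = X - floor(X);                        % Y = g(X), the fractional part of X
y = 0.25:0.25:0.75;                      % a few points in [0,1)
Fhat = arrayfun(@(t) mean(Y <= t), y);   % empirical cdf of Y
F    = (1 - exp(-y))/(1 - exp(-1));      % truncated exponential cdf derived above
disp([y; Fhat; F])                       % rows: y, empirical, theoretical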

(b) When X ∼ uniform[0,1), we obtain Y = X ∼ uniform[0,1).

(c) Suppose X ∼ uniform[ν, ν+1), where ν = m + δ for some integer m ≥ 0 and
some 0 < δ < 1. For 0 ≤ y < δ,

FY(y) = (m+1+y) − (m+1) = y,

and for δ ≤ y < 1,

FY(y) = [(m+1+δ) − (m+1)] + [(m+y) − (m+δ)] = y.

Since FY(y) = y for 0 ≤ y < 1, Y ∼ uniform[0,1).


49. In the derivation of Property (vii) of cdfs, we started with the formula

(−∞, x0) = ⋃_{n=1}^∞ (−∞, x0 − 1/n].

However, we can also write

(−∞, x0) = ⋃_{n=1}^∞ (−∞, x0 − 1/n).

Hence,

G(x0) = P(X < x0) = P( ⋃_{n=1}^∞ {X < x0 − 1/n} ) = lim_{N→∞} P(X < x0 − 1/N) = lim_{N→∞} G(x0 − 1/N).

Thus, G is left continuous. In a similar way, we can adapt the derivation of Property (vi)
of cdfs to write

P(X ≤ x0) = P( ⋂_{n=1}^∞ {X < x0 + 1/n} ) = lim_{N→∞} P(X < x0 + 1/N) = lim_{N→∞} G(x0 + 1/N) = G(x0+).

To conclude, write

P(X = x0) = P(X ≤ x0) − P(X < x0) = G(x0+) − G(x0).

50. First note that since FY(t) is right continuous, so is 1 − FY(t) = P(Y > t). Next, we use
the assumption P(Y > t+∆t | Y > t) = P(Y > ∆t) to show that with h(t) := ln P(Y > t),
h(t+∆t) = h(t) + h(∆t). To this end, write

h(∆t) = ln P(Y > ∆t) = ln P(Y > t+∆t | Y > t) = ln [ P({Y > t+∆t} ∩ {Y > t})/P(Y > t) ]
      = ln [ P(Y > t+∆t)/P(Y > t) ] = ln P(Y > t+∆t) − ln P(Y > t) = h(t+∆t) − h(t).

Rewrite this result as h(t+∆t) = h(t) + h(∆t). Then with ∆t = t, we have h(2t) =
2h(t). With ∆t = 2t, we have h(3t) = h(t) + h(2t) = h(t) + 2h(t) = 3h(t). In general,
h(nt) = nh(t). In a similar manner we can show that

h(t) = h(t/m + ··· + t/m) = m h(t/m),

and so h(t/m) = h(t)/m. We now have that for rational a = n/m, h(at) = h(n(t/m)) =
n h(t/m) = (n/m)h(t) = a h(t). For general a ≥ 0, let a_k ↓ a with a_k rational. Then by
the right continuity of h,

h(at) = lim_{k→∞} h(a_k t) = lim_{k→∞} a_k h(t) = a h(t).

We can now write

h(t) = h(t·1) = t h(1).

Thus,

t·h(1) = h(t) = ln P(Y > t) = ln(1 − FY(t)),

and we have 1 − FY(t) = e^{h(1)t}, which implies Y ∼ exp(−h(1)). Of course, −h(1) =
−ln P(Y > 1) = −ln[1 − FY(1)].

51. We begin with

E[Yn] = E[ (1/√n) ∑_{i=1}^n (Xi − m)/σ ] = [1/(√n σ)] ∑_{i=1}^n (E[Xi] − m) = 0.

For the variance, we use the fact that since independent random variables are uncorrelated,
the variance of the sum is the sum of the variances. Thus,

var(Yn) = ∑_{i=1}^n var( (Xi−m)/(σ√n) ) = ∑_{i=1}^n var(Xi)/(nσ²) = ∑_{i=1}^n σ²/(nσ²) = 1.

52. Let Xi denote the time to transmit the ith packet, where Xi has mean m and variance
σ². The total time to transmit n packets is Tn := X1 + ··· + Xn. The expected total
time is E[Tn] = nm. Since we do not know the distribution of the Xi, we cannot know
the distribution of Tn. However, we use the central limit theorem to approximate
P(Tn > 2nm). Note that the sample mean Mn = Tn/n. Write

P(Tn > 2nm) = P((1/n)Tn > 2m) = P(Mn > 2m) = P(Mn − m > m)
            = P( (Mn−m)/(σ/√n) > m/(σ/√n) ) = P( Yn > m/(σ/√n) )
            = 1 − FYn(m√n/σ) ≈ 1 − Φ(m√n/σ),

by the central limit theorem.
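As an optional numerical illustration: the packet-time distribution is not specified in the problem, so the MATLAB sketch below assumes exponential packet times purely for concreteness; m, n, and the number of trials are also arbitrary. Since the event is a large deviation of the sample mean, the CLT value is only a rough approximation at moderate n.

m = 2; n = 9; trials = 1e6;
X = -m*log(rand(n,trials));          % assumed exp(mean m) packet times
Tn = sum(X,1);                       % total transmission times
pMC = mean(Tn > 2*n*m);              % Monte Carlo estimate
sigma = m;                           % an exponential with mean m has std m
Phi = @(x) 0.5*erfc(-x/sqrt(2));     % standard normal cdf via erfc
pCLT = 1 - Phi(m*sqrt(n)/sigma);     % the CLT approximation derived above
fprintf('Monte Carlo: %g   CLT: %g\n', pMC, pCLT)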

53. Let Xi = 1 if bit i is in error, and Xi = 0 otherwise. Then P(Xi = 1) = p. Although
the problem does not say so, let us assume that the Xi are independent. Then Mn =
(1/n)∑_{i=1}^n Xi is the fraction of bits in error. We cannot reliably decode if Mn > t. To
approximate the probability that we cannot reliably decode, write

P(Mn > t) = P( (Mn−m)/(σ/√n) > (t−m)/(σ/√n) ) = P( Yn > (t−m)/(σ/√n) )
          = 1 − FYn( (t−m)/(σ/√n) )
          ≈ 1 − Φ( (t−m)/(σ/√n) ) = 1 − Φ( (t−p)/√(p(1−p)/n) ),

since m = E[Xi] = p and σ² = var(Xi) = p(1−p).

54. If the Xi are i.i.d. Poisson(1), then Tn := X1 + ··· + Xn is Poisson(n). Thus,

P(Tn = k) = n^k e^{−n}/k! ≈ [1/√(2π)] exp[ −(1/2)( (k − n·1)/(1·√n) )² ] · 1/(1·√n).

Taking k = n, we obtain n^n e^{−n} ≈ n!/√(2πn), or n! ≈ √(2π) n^{n+1/2} e^{−n}.
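A quick MATLAB look at the Stirling approximation just derived; the values of n checked are an arbitrary choice.

n = [5 10 20 50];
stirling = sqrt(2*pi)*n.^(n+0.5).*exp(-n);
ratio = factorial(n)./stirling;      % should approach 1 as n grows
disp([n; ratio])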


55. Recall that gn is the density of X1 + ··· + Xn. If the Xi are i.i.d. uniform[−1,1], then
gn is the convolution of (1/2)I[−1,1](x) with itself n times. From graphical considerations,
it is clear that gn(x) = 0 for |x| > n; i.e., xmax = n.

56. To begin, write

ϕYn(ν) = E[e^{jν(X1+···+Xn)/√n}] = E[ ∏_{i=1}^n e^{j(ν/√n)Xi} ] = ∏_{i=1}^n [ (1/2)e^{−jν/√n} + (1/2)e^{jν/√n} ]
       = cos^n(ν/√n) ≈ ( 1 − (ν²/2)/n )^n → e^{−ν²/2}.
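A brief MATLAB sketch of this limit; the grid of ν values and the sequence of n are arbitrary choices.

nu = linspace(-3,3,121);
for n = [4 16 64 256]
    plot(nu, cos(nu/sqrt(n)).^n); hold on    % cos^n(nu/sqrt(n)) for growing n
end
plot(nu, exp(-nu.^2/2), 'k--')               % limiting N(0,1) characteristic function
hold off
legend('n=4','n=16','n=64','n=256','limit')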

57. (a) MTTF = E[T] = n (from the Erlang entry in the table).

(b) The desired probability is

R(t) := P(T > t) = ∫_t^∞ [τ^{n−1} e^{−τ}/(n−1)!] dτ.

Let Pn(t) denote the above integral. Then P1(t) = ∫_t^∞ e^{−τ} dτ = e^{−t}. For n > 1,
apply integration by parts with u = τ^{n−1}/(n−1)! and dv = e^{−τ} dτ. Then

Pn(t) = t^{n−1}e^{−t}/(n−1)! + P_{n−1}(t).

Applying this result recursively, we find that

Pn(t) = t^{n−1}e^{−t}/(n−1)! + t^{n−2}e^{−t}/(n−2)! + ··· + e^{−t},

which is the desired result.
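As a sanity check, a few lines of MATLAB can compare the finite sum with a numerical integral of the Erlang(n,1) density; n and t below are arbitrary test values.

n = 4; t = 3;
Rsum = sum(t.^(0:n-1)./factorial(0:n-1))*exp(-t);    % the sum derived above
tau = linspace(t, t+60, 200001);                     % truncate the tail integral
Rint = trapz(tau, tau.^(n-1).*exp(-tau)/factorial(n-1));
fprintf('sum: %.6f   integral: %.6f\n', Rsum, Rint)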

(c) The failure rate is

r(t) = fT(t)/R(t) = [ t^{n−1}e^{−t}/(n−1)! ] / [ ∑_{k=0}^{n−1} (t^k/k!) e^{−t} ]
     = [ t^{n−1}/(n−1)! ] / [ ∑_{k=0}^{n−1} t^k/k! ].

For n = 2, r(t) = t/(1+t), which increases from 0 toward 1 as t runs from 0 to 10.
[Sketch omitted.]

58. (a) Let λ = 1. For p = 1/2, r(t) = 1/(2√t). For p = 1, r(t) = 1. For p = 3/2,
r(t) = 3√t/2. For p = 2, r(t) = 2t. For p = 3, r(t) = 3t².

[Sketch: the five failure-rate curves r(t) on 0 ≤ t ≤ 1, with vertical axis 0 to 2,
labeled p = 1/2, 1, 3/2, 2, 3.]

(b) We have from the text that

R(t) = exp[ −∫_0^t r(τ) dτ ] = exp[ −∫_0^t λpτ^{p−1} dτ ] = e^{−λt^p}.

(c) The MTTF is

E[T] = ∫_0^∞ R(t) dt = ∫_0^∞ e^{−λt^p} dt.

Now make the change of variable θ = λt^p, or t = (θ/λ)^{1/p}. Then

E[T] = ∫_0^∞ e^{−θ} (θ/λ)^{1/p−1} dθ/(λp) = [1/(pλ^{1/p})] ∫_0^∞ θ^{1/p−1}e^{−θ} dθ
     = [1/(pλ^{1/p})] Γ(1/p) = Γ(1/p+1)/λ^{1/p}.

(d) Using the result of part (b),

fT(t) = −R′(t) = λp t^{p−1} e^{−λt^p}, t > 0.
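A numerical sanity check of the Weibull MTTF in part (c); λ and p below are arbitrary test values.

lambda = 2; p = 1.5;
t = linspace(0, 50, 500001);
mttf_int = trapz(t, exp(-lambda*t.^p));        % numerical integral of R(t)
mttf_formula = gamma(1/p + 1)/lambda^(1/p);    % closed form from part (c)
fprintf('integral: %.6f   formula: %.6f\n', mttf_int, mttf_formula)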

59. (a) Write

R(t) = exp[ −∫_0^t r(τ) dτ ] = exp[ −∫_{t0}^t pτ^{−1} dτ ]
     = exp[ −p(ln t − ln t0) ] = exp[ ln(t0/t)^p ] = (t0/t)^p, t ≥ t0.

(b) If t0 = 1 and p = 2, R(t) = 1/t² decays from R(1) = 1 toward 0. [Sketch over
0 ≤ t ≤ 5 omitted.]

(c) For p > 1, the MTTF is

E[T] = ∫_0^∞ R(t) dt = ∫_{t0}^∞ (t0/t)^p dt = t0^p ( t^{1−p}/(1−p) )|_{t0}^∞ = t0/(p−1).

(d) For the density, write

fT(t) = r(t)R(t) = (p/t)·(t0/t)^p = p t0^p/t^{p+1}, t ≥ t0.
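A short MATLAB check of the tail integral appearing in part (c); t0 and p are arbitrary test values, and the infinite upper limit is truncated.

t0 = 1; p = 2;
t = linspace(t0, t0+1e4, 1e6+1);
tail_int = trapz(t, (t0./t).^p);     % integral of (t0/t)^p from t0 onward
fprintf('integral: %.4f   t0/(p-1): %.4f\n', tail_int, t0/(p-1))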

60. (a) r(t) = t² − 2t + 2 for t ≥ 0 is a parabola with minimum value 1 at t = 1; it
decreases from r(0) = 2 down to 1 and then increases without bound. [Sketch
over 0 ≤ t ≤ 3 omitted.]

(b) We first compute

∫_0^t r(τ) dτ = ∫_0^t (τ² − 2τ + 2) dτ = (1/3)t³ − t² + 2t.

Then

fT(t) = r(t) e^{−∫_0^t r(τ)dτ} = [t² − 2t + 2] e^{−((1/3)t³ − t² + 2t)}, t ≥ 0.

61. (a) If T ∼ uniform[1,2], then for 0 ≤ t < 1, R(t) = P(T > t) = 1, and for t ≥ 2,
R(t) = P(T > t) = 0. For 1 ≤ t < 2,

R(t) = P(T > t) = ∫_t^2 1 dτ = 2 − t.

The complete formula is

R(t) =
    1,       0 ≤ t < 1,
    2 − t,   1 ≤ t < 2,
    0,       t ≥ 2,

which is constant at 1 on [0,1) and falls linearly to 0 on [1,2]. [Sketch omitted.]

(b) The failure rate is

r(t) = −(d/dt) ln R(t) = −(d/dt) ln(2−t) = 1/(2−t), 1 < t < 2.

(c) Since T ∼ uniform[1,2], the MTTF is E[T ] = 1.5.

62. Write

R(t) := P(T > t) = P(T1 > t,T2 > t) = P(T1 > t)P(T2 > t) = R1(t)R2(t).

63. Write

R(t) := P(T > t) = P({T1 > t} ∪ {T2 > t}) = 1 − P(T1 ≤ t, T2 ≤ t)
     = 1 − P(T1 ≤ t)P(T2 ≤ t) = 1 − [1−R1(t)][1−R2(t)]
     = 1 − [1 − R1(t) − R2(t) + R1(t)R2(t)]
     = R1(t) + R2(t) − R1(t)R2(t).

64. We follow the hint and write

E[Y^n] = E[T] = ∫_0^∞ P(T > t) dt = ∫_0^∞ P(Y^n > t) dt = ∫_0^∞ P(Y > t^{1/n}) dt.

We then make the change of variable y = t^{1/n}, or t = y^n, dt = ny^{n−1} dy, to get

E[Y^n] = ∫_0^∞ P(Y > y) · ny^{n−1} dy.


CHAPTER 6

Problem Solutions

1. Since the Xi are uncorrelated with common mean m and common variance σ²,

E[S²n] = [1/(n−1)] ( E[ ∑_{i=1}^n Xi² ] − nE[Mn²] )
       = [1/(n−1)] [ n(σ² + m²) − n(var(Mn) + (E[Mn])²) ]
       = [1/(n−1)] [ n(σ² + m²) − n(σ²/n + m²) ]
       = [1/(n−1)] [ (n−1)σ² + nm² − nm² ] = σ².

2. (a) The mean of a Rayleigh(λ) random variable is λ√(π/2). Consider

λ̂n := Mn/√(π/2).

Then

E[λ̂n] = E[Mn/√(π/2)] = E[Mn]/√(π/2) = λ√(π/2)/√(π/2) = λ.

Thus, λ̂n is unbiased. Next,

λ̂n = Mn/√(π/2) → E[Mn]/√(π/2) = λ√(π/2)/√(π/2) = λ,

and we see that λ̂n is strongly consistent.

(b) MATLAB. Add the line of code lambdan=mean(X)/sqrt(pi/2).

(c) MATLAB. Since Mn ≈ λ√(π/2), we solve for π and put

π̂n := 2(Mn/λ)².

Since λ = 3, add the line of code pin=2*(mean(X)/3)ˆ2.
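For readers who want an end-to-end check of part (a), here is a self-contained MATLAB sketch; the Rayleigh samples are generated from two normal components, and λ = 3 and n = 1e5 are arbitrary choices.

lambda = 3; n = 1e5;
X = lambda*sqrt(randn(n,1).^2 + randn(n,1).^2);   % Rayleigh(lambda) samples
lambdan = mean(X)/sqrt(pi/2)                      % should be close to 3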

3. (a) The mean of a gamma(p,λ) random variable is p/λ. We put

p̂n := λMn.

Then E[p̂n] = λE[Mn] = λ·p/λ = p. Also p̂n = λMn → λ(p/λ) = p. Thus, p̂n
is unbiased and strongly consistent.

(b) MATLAB. In this problem λ = 1/2 and p = k/2, or k = 2p. We use k̂n := 2p̂n =
2(λMn) = 2((1/2)Mn) = Mn. We therefore add the line of code kn=mean(X).


4. (a) Since the mean of a noncentral chi-squared random variable with k degrees of
freedom and noncentrality parameter λ² is k + λ², we put

λ̂²n := Mn − k.

Then E[λ̂²n] = E[Mn − k] = E[Mn] − k = (k+λ²) − k = λ², and we see that
λ̂²n is an unbiased estimator of λ². Next, since λ̂²n = Mn − k → E[Mn] − k =
(k+λ²) − k = λ², the estimator is strongly consistent.

(b) MATLAB. Since k = 5, add the line of code lambda2n=mean(X)-5.

5. (a) Since the mean of a gamma(p,λ) random variable is p/λ, we put λ̂n := p/Mn.
Then λ̂n = p/Mn → p/E[Mn] = p/(p/λ) = λ, and we see that λ̂n is a strongly
consistent estimator of λ.

(b) MATLAB. Since p = 3, add the line of code lambdan=3/mean(X).

6. (a) Since the variance of a Laplace(λ) random variable is 2/λ², we put

λ̂n := √(2/S²n).

Since S²n converges to the variance, we have λ̂n → √(2/(2/λ²)) = λ, and we see
that λ̂n is a strongly consistent estimator of λ.

(b) MATLAB. Add the line of code lambdan=sqrt(2/var(X)).

7. (a) The mean of a gamma(p,λ) random variable is p/λ. The second moment is
p(p+1)/λ². Hence, the variance is

p(p+1)/λ² − p²/λ² = p/λ².

Thus, Mn ≈ p/λ and S²n ≈ p/λ². Solving for p and λ suggests that we put

λ̂n := Mn/S²n and p̂n := λ̂n Mn.

Now, Mn → p/λ and S²n → p/λ². It follows that λ̂n → (p/λ)/(p/λ²) = λ and
then p̂n → λ·(p/λ) = p. Hence, λ̂n is a strongly consistent estimator of λ, and
p̂n is a strongly consistent estimator of p.

(b) MATLAB. Add the code

Mn = mean(X)
lambdan = Mn/var(X)
pn = lambdan*Mn
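An optional MATLAB sketch of these moment-matching estimators on simulated data. An integer p is assumed so that gamma(p,λ) samples can be built as sums of p independent exp(λ) variables; p, λ, and n are arbitrary choices.

p = 3; lambda = 2; n = 1e5;
X = -sum(log(rand(p,n)),1)/lambda;   % gamma(p,lambda) via exponential sums
Mn = mean(X);
lambdan = Mn/var(X)                  % should be close to lambda = 2
pn = lambdan*Mn                      % should be close to p = 3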

8. Using results from the problem referenced in the hint, we have

E[X^q] = Γ((q+p)/q)/(Γ(p/q)λ^q) = Γ(1+p/q)/(Γ(p/q)λ^q) = (p/q)/λ^q.

This suggests that we put

λ̂n := ( (p/q) / [ (1/n)∑_{i=1}^n Xi^q ] )^{1/q}.

Then

λ̂n → ( (p/q) / [(p/q)/λ^q] )^{1/q} = λ,

and we see that λ̂n is a strongly consistent estimator of λ.

9. In the preceding problem, E[X^q] = (p/q)/λ^q. Now consider

E[X^{2q}] = Γ((2q+p)/q)/(Γ(p/q)λ^{2q}) = Γ(2+p/q)/(Γ(p/q)λ^{2q}) = (1+p/q)(p/q)/λ^{2q}.

In this equation, replace λ^q by (p/q)/E[X^q] and solve for (p/q). Thus,

p/q = (E[X^q])²/var(X^q).

This suggests that we first put

X̄n^q := (1/n) ∑_{i=1}^n Xi^q

and then

p̂n := q (X̄n^q)² / { [1/(n−1)] [ ( ∑_{i=1}^n (Xi^q)² ) − n(X̄n^q)² ] }

and

λ̂n := ( (p̂n/q)/X̄n^q )^{1/q} = { X̄n^q / ( [1/(n−1)] [ ( ∑_{i=1}^n (Xi^q)² ) − n(X̄n^q)² ] ) }^{1/q}.

10. MATLAB. OMITTED.

11. MATLAB. The required script can be created using the code from Problem 2 followed
by the lines

global lambdan
lambdan = mean(X)/sqrt(pi/2)

followed by the script from Problem 10 modified as follows: The chi-squared statistic
Z can be computed by inserting the lines

p = CDF(b) - CDF(a);
Z = sum((H-n*p).ˆ2./(n*p))

after the creation of the right-edge sequence in the script given in Problem 10, where
CDF is the function

function y = CDF(t) % Rayleigh CDF
global lambdan
y = zeros(size(t));
i = find(t>0);
y(i) = 1-exp(-(t(i)/lambdan).ˆ2/2);

In addition, the line defining y in the script from Problem 10 should be changed to
y=PDF(t), where PDF is the function

function y = PDF(t) % Rayleigh density
global lambdan
y = zeros(size(t));
i = find(t>0);
y(i) = (t(i)/lambdanˆ2).*exp(-(t(i)/lambdan).ˆ2/2);

Finally, the chi-squared statistic Z should be compared with zα = 22.362: since α =
0.05 and since there are m = 15 bins and r = 1 estimated parameter, the degrees-of-freedom
parameter is k = m − 1 − r = 15 − 1 − 1 = 13 in the chi-squared table in the text.

12. MATLAB. Similar to the solution of Problem 11 except that it is easier to use the

MATLAB function chi2cdf or gamcdf to compute the required cdfs for evaluating

the chi-squared statistic Z. For the same reasons as in Problem 11, zα = 22.362.

13. MATLAB. Similar to the solution of Problem 11 except that it is easier to use the
MATLAB function ncx2cdf to compute the required cdfs for evaluating the chi-squared
statistic Z. For the same reasons as in Problem 11, zα = 22.362.

14. MATLAB. Similar to the solution of Problem 11 except that it is easier to use the
MATLAB function gamcdf to compute the required cdfs for evaluating the chi-squared
statistic Z. For the same reasons as in Problem 11, zα = 22.362.

15. MATLAB. Similar to the solution of Problem 11. For the same reasons as in Problem 11,
zα = 22.362.

16. Since

E[Hj] = E[ ∑_{i=1}^n I_{[ej,ej+1)}(Xi) ] = ∑_{i=1}^n P(ej ≤ Xi < ej+1) = ∑_{i=1}^n pj = npj,

we have

E[ (Hj − npj)/√(npj) ] = 0.

Since the Xi are i.i.d., the I_{[ej,ej+1)}(Xi) are i.i.d. Bernoulli(pj). Hence,

E[(Hj − npj)²] = var(Hj) = var( ∑_{i=1}^n I_{[ej,ej+1)}(Xi) ) = ∑_{i=1}^n var( I_{[ej,ej+1)}(Xi) ) = n·pj(1−pj),

and so

E[ ( (Hj − npj)/√(npj) )² ] = 1 − pj.

17. If f is an even density, then

F(−x) = ∫_{−∞}^{−x} f(t) dt = −∫_∞^x f(−θ) dθ = ∫_x^∞ f(θ) dθ = 1 − F(x).

18. The width of any confidence interval is w = 2σy/√n. If σ = 2 and n = 100,

w_{99%} = (2·2·2.576)/10 = 1.03.

To make w_{99%} < 1/4 requires

2σy/√n < 1/4, or n > (8σy)² = (16·2.576)² = 1699.

19. First observe that with Xi = m + Wi, E[Xi] = m, and var(Xi) = var(Wi) = 4. So, σ² = 4.
For the 95% confidence interval, σ y_{α/2}/√n = 2·1.960/10 = 0.392, and so

m = 14.846 ± 0.392 with 95% probability.

The corresponding confidence interval is [14.454, 15.238].
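The arithmetic above is easily reproduced in MATLAB; the numbers are exactly those used in the solution.

Mn = 14.846; sigma = 2; n = 100; y = 1.960;   % y = y_{alpha/2} for 95%
delta = sigma*y/sqrt(n);
ci = [Mn - delta, Mn + delta]                 % [14.454, 15.238]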

20. Write

P(|Mn − m| ≤ δ) = P(−δ ≤ Mn − m ≤ δ) = P( −δ ≤ (1/n)∑_{i=1}^n (m+Wi) − m ≤ δ )
               = P( −δ ≤ (1/n)∑_{i=1}^n Wi ≤ δ ) = P( −nδ ≤ ∑_{i=1}^n Wi ≤ nδ ),

where ∑_{i=1}^n Wi is Cauchy(n). Hence the probability is

(2/π) tan^{−1}(nδ/n) = (2/π) tan^{−1}(δ),

which is equal to 2/3 if and only if tan^{−1}(δ) = π/3, or δ = √3.

21. MATLAB. OMITTED.

22. We use the formula m = Mn ± y_{α/2}Sn/√n = 10.083 ± (1.960)(0.568)/10 to get

m = 10.083 ± 0.111 with 95% probability,

and the confidence interval is [9.972, 10.194].

23. We use the formula m = Mn ± y_{α/2}Sn/√n = 4.422 ± (1.812)(0.957)/10 to get

m = 4.422 ± 0.173 with 93% probability,

and the confidence interval is [4.249, 4.595].


24. We have Mn = (number defective)/n = 10/100 = 0.1. We use the formula m = Mn ±
y_{α/2}Sn/√n = 0.1 ± (1.645)(0.302)/10 to get

m = 0.1 ± 0.0497 with 90% probability.

The number of defectives is 10000m, or

number of defectives = 1000 ± 497 with 90% probability.

Thus, we are 90% sure that the number of defectives is between 503 and 1497 out of
a total of 10000 units.

25. We have Mn = 1559/3000. We use the formula m = Mn ± y_{α/2}Sn/√n = 0.520 ±
(1.645)(0.5)/√3000 to get

m = 0.520 ± 0.015 with 90% probability,

and the confidence interval is [0.505, 0.535]. Hence, the probability is at least 90%
that more than 50.5% of the voters will vote for candidate A. So we are 90% sure that
candidate A will win. The 99% confidence interval is given by

m = 0.520 ± 0.024 with 99% probability,

and the confidence interval is [0.496, 0.544]. Hence, we are not 99% sure that
candidate A will win.
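The two intervals above can be reproduced with a few MATLAB lines; the conservative bound Sn ≤ 1/2 for a {0,1} sample is used, as in the solution.

n = 3000; Mn = 1559/n; Sn = 0.5;
y90 = 1.645; y99 = 2.576;                 % y_{alpha/2} at 90% and 99%
ci90 = Mn + y90*Sn/sqrt(n)*[-1 1]         % approximately [0.505, 0.535]
ci99 = Mn + y99*Sn/sqrt(n)*[-1 1]         % approximately [0.496, 0.544]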

26. We have Mn = (number defective)/n = 48/500 = 0.0960. We use the formula m =
Mn ± y_{α/2}Sn/√n = 0.0960 ± (1.881)(0.295)/√500 to get

m = 0.0960 ± 0.0248 with 94% probability.

The number of defectives is 100000m, or

number of defectives = 9600 ± 2482 with 94% probability.

Thus, we are 94% sure that the number of defectives is between 7118 and 12082 out
of a total of 100000 units.

27. (a) We have Mn = 6/100. We use the formula m = Mn ± y_{α/2}Sn/√n = 0.06 ±
(2.170)(0.239)/10 to get

m = 0.06 ± 0.0519 with 97% probability.

We are thus 97% sure that p = m lies in the interval [0.0081, 0.1119]. Thus, we
are not 97% sure that p < 0.1.

(b) We have Mn = 71/1000. We use the formula m = Mn ± y_{α/2}Sn/√n = 0.071 ±
(2.170)(0.257)/√1000 to get

m = 0.071 ± 0.018 with 97% probability.

We are thus 97% sure that p = m lies in the interval [0.053, 0.089]. Thus, we are
97% sure that p < 0.089 < 0.1.


28. (a) Let Ti denote the time to transmit the ith packet. Then we need to compute

P( ⋃_{i=1}^n {Ti > t} ) = 1 − P( ⋂_{i=1}^n {Ti ≤ t} ) = 1 − FT1(t)^n = 1 − (1−e^{−t/µ})^n.

(b) Using the notation from part (a), T = T1 + ··· + Tn. Since the Ti are i.i.d.
exp(1/µ), T is Erlang(n, 1/µ) by Problem 55(c) in Chapter 4 and the remark
following it. Hence,

fT(t) = (1/µ)(t/µ)^{n−1}e^{−t/µ}/(n−1)!, t ≥ 0.

(c) We have

µ = Mn ± y_{α/2}Sn/√n = 1.994 ± 1.960(1.798)/10 = 1.994 ± 0.352,

and the confidence interval is [1.642, 2.346] with 95% probability.

29. MATLAB. OMITTED.

30. By the hint, ∑_{i=1}^n Xi is Gaussian with mean nm and variance nσ². Since Mn =
(∑_{i=1}^n Xi)/n, it is easy to see that Mn is still Gaussian, and its mean is (nm)/n = m.
Its variance is (nσ²)/n² = σ²/n. Next, Mn − m remains Gaussian but with mean zero
and the same variance σ²/n. Finally, (Mn−m)/√(σ²/n) remains Gaussian with
mean zero, but its variance is (σ²/n)/(√(σ²/n))² = 1.

31. We use the formula m = Mn ± y_{α/2}Sn/√n, where in this Gaussian case, y_{α/2} is taken
from the tables using Student's t distribution with n = 10. Thus,

m = Mn ± y_{α/2}Sn/√n = 14.832 ± (2.262·1.904)/√10 = 14.832 ± 1.362,

and the confidence interval is [13.470, 16.194] with 95% probability.

32. We use [nV²n/u, nV²n/ℓ], where u and ℓ are chosen from the appropriate table. For a
95% confidence interval, ℓ = 74.222 and u = 129.561. Thus,

[ nV²n/u, nV²n/ℓ ] = [ 100(4.413)/129.561, 100(4.413)/74.222 ] = [3.406, 5.946].

33. We use [(n−1)S²n/u, (n−1)S²n/ℓ], where u and ℓ are chosen from the appropriate
table. For a 95% confidence interval, ℓ = 73.361 and u = 128.422. Thus,

[ (n−1)S²n/u, (n−1)S²n/ℓ ] = [ 99(4.736)/128.422, 99(4.736)/73.361 ] = [3.651, 6.391].

34. For the two-sided test at the 0.05 significance level, we compare |Zn| with yα/2 =1.960. Since |Zn| = 1.8 ≤ 1.960 = yα/2, we accept the null hypothesis. For the

one-sided test of m > m0 at the 0.05 significance level, we compare Zn with −yα =−1.645. Since it is not the case that Zn =−1.80 >−1.645 =−yα , we do not accept

the null hypothesis.


35. Suppose Φ(−y) = α. Then by Problem 17, Φ(−y) = 1 − Φ(y), and so 1 − Φ(y) = α, or Φ(y) = 1 − α.

36. (a) Since Zn = 1.50 ≤ yα = 1.555, the Internet service provider accepts the null
hypothesis.

(b) Since Zn = 1.50 > −1.555 = −yα, we accept the null hypothesis; i.e., we reject
the claim of the Internet service provider.

37. The computer vendor would take the null hypothesis to be m ≤ m0. To give the vendor
the benefit of the doubt, the consumer group uses m ≤ m0 as the null hypothesis. To
accept the null hypothesis would require Zn ≤ yα. Only by using the significance level
of 0.10, which has yα = 1.282, can the consumer group give the benefit of the doubt
to the vendor and still reject the vendor's claim.

38. Giving itself the benefit of the doubt, the company uses the null hypothesis m > m0 and
uses a 0.05 significance level. The null hypothesis will be accepted if Zn > −yα =
−1.645. Since Zn = −1.6 > −1.645, the company believes it has justified its claim.

39. Write

e(â) = ∑_{k=1}^n |Yk − (âxk + b̂)|² = ∑_{k=1}^n |Yk − (âxk + [Ȳ − âx̄])|²
     = ∑_{k=1}^n |(Yk − Ȳ) − â(xk − x̄)|²
     = ∑_{k=1}^n [ (Yk − Ȳ)² − 2â(xk − x̄)(Yk − Ȳ) + â²(xk − x̄)² ]
     = SYY − 2âSxY + â²Sxx
     = SYY − 2âSxY + â(SxY/Sxx)Sxx = SYY − âSxY = SYY − S²xY/Sxx.

40. Write

E[Y|X = x] = E[g(X) + W | X = x] = E[g(x) + W | X = x]
           = g(x) + E[W | X = x] = g(x) + E[W] = g(x).

41. MATLAB. OMITTED.

42. MATLAB. OMITTED.

43. MATLAB. If z= c/tq, then lnz= lnc−q ln t. If y= lnz and x= ln t, then y= (−q)x+lnc. If y≈ a(1)x+a(2), then q=−a(1) and c= exp(a(2)). Hence, the two lines ofcode that we need are

qhat = -a(1)

chat = exp(a(2))

44. Obvious.


45. Write

fZ(z) = e^{sz} fZ(z)/MZ(s) = e^{sz} · [e^{−z²/2}/√(2π)] / e^{s²/2} = e^{−(z−s)²/2}/√(2π).

To make E[Z] = t, put s = t. Then Z ∼ N(t,1).

46. Write

fZ(z) = e^{sz} fZ(z)/MZ(s) = e^{sz} · [λ(λz)^{p−1}e^{−λz}/Γ(p)] / ( λ/(λ−s) )^p, z > 0.

Then

E[Z] = ( λ/(λ−s) )^{−p} ∫_0^∞ z [λ(λz)^{p−1}e^{−z(λ−s)}/Γ(p)] dz = ∫_0^∞ z [(λ−s)^p z^{p−1}e^{−z(λ−s)}/Γ(p)] dz,

which is the mean of a gamma(p, λ−s) density. Hence, E[Z] = p/(λ−s). To make
E[Z] = t, we need p/(λ−s) = t, or s = λ − p/t.

47. MATLAB. OMITTED.

48. First,

pZ(zi) = e^{szi} pZ(zi) / [(1−p) + pe^s].

Then

pZ(1) = e^s p/[(1−p) + pe^s] and pZ(0) = (1−p)/[(1−p) + pe^s].


CHAPTER 7

Problem Solutions

1. We have

FZ(z) = P(Z ≤ z) = P(Y − X ≤ z) = P((X,Y) ∈ Az),

where

Az := {(x,y) : y − x ≤ z} = {(x,y) : y ≤ x + z}.

[Sketch: Az is the half-plane on and below the line y = x + z, which has slope 1,
y-intercept z, and x-intercept −z.]

2. We have

FZ(z) = P(Z ≤ z) = P(Y/X ≤ z) = P( {Y/X ≤ z} ∩ [{X < 0} ∪ {X > 0}] )
      = P((X,Y) ∈ D⁻z ∪ D⁺z) = P((X,Y) ∈ Az),

where

Az := D⁻z ∪ D⁺z,

and

D⁻z := {(x,y) : y/x ≤ z and x < 0} = {(x,y) : y ≥ zx and x < 0},

and

D⁺z := {(x,y) : y/x ≤ z and x > 0} = {(x,y) : y ≤ zx and x > 0}.

[Sketch: D⁻z and D⁺z lie on opposite sides of the line of slope z through the origin.]

3. (a) R := (a,b] × (c,d] is the rectangle with corners (a,c) and (b,d). [Sketch omitted.]

(b) A := (−∞,a] × (−∞,d] is the southwest region lying left of x = a and below y = d.

(c) B := (−∞,b] × (−∞,c] is the southwest region lying left of x = b and below y = c.

(d) C := (a,b] × (−∞,c] is the semi-infinite vertical strip a < x ≤ b lying below y = c.

(e) D := (−∞,a] × (c,d] is the semi-infinite horizontal strip c < y ≤ d lying left of x = a.

(f) A∩B = (−∞,a] × (−∞,c].

4. Following the hint and then observing that R and A∪B are disjoint, we have

P( (X,Y) ∈ (−∞,b] × (−∞,d] ) = P( (X,Y) ∈ R ) + P( (X,Y) ∈ A∪B ).   (∗)

Next, by the inclusion–exclusion formula,

P( (X,Y) ∈ A∪B ) = P( (X,Y) ∈ A ) + P( (X,Y) ∈ B ) − P( (X,Y) ∈ A∩B )
                 = FXY(a,d) + FXY(b,c) − FXY(a,c).

Hence, (∗) becomes

FXY(b,d) = P( (X,Y) ∈ R ) + FXY(a,d) + FXY(b,c) − FXY(a,c),

which is easily rearranged to get the rectangle formula,

P( (X,Y) ∈ R ) = FXY(b,d) − FXY(a,d) − FXY(b,c) + FXY(a,c).

5. (a) {(x,y) : |x| ≤ y ≤ 1} is NOT a product set.

(b) {(x,y) : 2 < x ≤ 4, 1 ≤ y < 2} = (2,4] × [1,2).

(c) {(x,y) : 2 < x ≤ 4, y = 1} = (2,4] × {1}.

(d) {(x,y) : 2 < x ≤ 4} = (2,4] × IR.

(e) {(x,y) : y = 1} = IR × {1}.

(f) {(1,1),(2,1),(3,1)} = {1,2,3} × {1}.

(g) The union of {(1,3),(2,3),(3,3)} and the set in (f) is equal to {1,2,3} × {1,3}.

(h) {(1,0),(2,0),(3,0),(0,1),(1,1),(2,1),(3,1)} is NOT a product set.

6. We have

FX(x) =
    1,       x ≥ 2,
    x − 1,   1 ≤ x < 2,
    0,       x < 1,

and

FY(y) =
    1 − (e^{−y} − e^{−2y})/y,   y ≥ 0,
    0,                          y < 0,

where the quotient involving division by y is understood as taking its limiting value
of one when y = 0. Since FX(x)FY(y) ≠ FXY(x,y) when 1 ≤ x ≤ 2 and y > 0, X and Y
are NOT independent.


7. We have

FX(x) =
    1,      x ≥ 3,
    2/7,    2 ≤ x < 3,
    0,      x < 2,

and

FY(y) =
    (1/7)[7 − 2e^{−2y} − 5e^{−3y}],   y ≥ 0,
    0,                                y < 0.

8. Let y > 0. First compute

∂/∂y [ (y + e^{−x(y+1)})/(y+1) ]
    = [ (y+1)(1 + e^{−x(y+1)}(−x)) − (y + e^{−x(y+1)})(1) ] / (y+1)²
    = [ (y+1) − x(y+1)e^{−x(y+1)} − y − e^{−x(y+1)} ] / (y+1)²
    = [ 1 − e^{−x(y+1)}(1 + x(y+1)) ] / (y+1)².

Then compute

∂/∂x { ∂/∂y [ (y + e^{−x(y+1)})/(y+1) ] }
    = [ (1 + x(y+1))e^{−x(y+1)}(y+1) − e^{−x(y+1)}(y+1) ] / (y+1)²
    = x e^{−x(y+1)}(y+1)²/(y+1)² = x e^{−x(y+1)}, x, y > 0.

9. The first step is to recognize that fXY(x,y) factors into

fXY(x,y) = exp[−|y−x| − x²/2]/(2√(2π)) = [e^{−x²/2}/√(2π)] · [e^{−|y−x|}/2].

When integrating this last factor with respect to y, make the change of variable θ =
y − x to get

fX(x) = ∫_{−∞}^∞ fXY(x,y) dy = [e^{−x²/2}/√(2π)] ∫_{−∞}^∞ (1/2)e^{−|y−x|} dy
      = [e^{−x²/2}/√(2π)] ∫_{−∞}^∞ (1/2)e^{−|θ|} dθ
      = [e^{−x²/2}/√(2π)] ∫_0^∞ e^{−θ} dθ = e^{−x²/2}/√(2π).

Thus, X ∼ N(0,1).

10. The first step is to factor fXY(x,y) as

fXY(x,y) = 4e^{−(x−y)²/2}/(y⁵√(2π)) = (4/y⁵) · [e^{−(x−y)²/2}/√(2π)].

Regarding this last factor as a function of x, it is an N(y,1) density. In other words, when
integrated with respect to x, the result is one. In symbols,

fY(y) = ∫_{−∞}^∞ fXY(x,y) dx = (4/y⁵) ∫_{−∞}^∞ [e^{−(x−y)²/2}/√(2π)] dx = 4/y⁵, y ≥ 1.


11. We first analyze U := max(X,Y). Then

FU(u) = P(max(X,Y) ≤ u) = P(X ≤ u and Y ≤ u) = FXY(u,u),

and the density is

fU(u) = ∂FXY(x,y)/∂x |_{x=u,y=u} + ∂FXY(x,y)/∂y |_{x=u,y=u}.

If X and Y are independent, then

FU(u) = FX(u)FY(u), and fU(u) = fX(u)FY(u) + FX(u)fY(u).

If in addition X and Y have the same density, say f (and therefore the same cdf, say
F), then

FU(u) = F(u)², and fU(u) = 2F(u)f(u).

We next analyze V := min(X,Y). Using the inclusion–exclusion formula,

FV(v) = P(min(X,Y) ≤ v) = P(X ≤ v or Y ≤ v)
      = P(X ≤ v) + P(Y ≤ v) − P(X ≤ v and Y ≤ v)
      = FX(v) + FY(v) − FXY(v,v).

The density is

fV(v) = fX(v) + fY(v) − ∂FXY(x,y)/∂x |_{x=v,y=v} − ∂FXY(x,y)/∂y |_{x=v,y=v}.

If X and Y are independent, then

FV(v) = FX(v) + FY(v) − FX(v)FY(v),

and

fV(v) = fX(v) + fY(v) − fX(v)FY(v) − FX(v)fY(v).

If in addition X and Y have the same density f and cdf F, then

FV(v) = 2F(v) − F(v)², and fV(v) = 2[f(v) − F(v)f(v)].

12. Since X ∼ gamma(p,1) and Y ∼ gamma(q,1) are independent, we have from Problem 55(c)
in Chapter 4 that Z ∼ gamma(p+q,1). Since p = q = 1/2, we further have
Z ∼ gamma(1,1) = exp(1). Hence, P(Z > 1) = e^{−1}.

13. We have from Problem 55(b) in Chapter 4 that Z ∼ Cauchy(λ+µ). Since λ = µ =
1/2, Z ∼ Cauchy(1). Thus,

P(Z ≤ 1) = (1/π)tan^{−1}(1) + 1/2 = (1/π)(π/4) + 1/2 = 3/4.


14. First write

FZ(z) = ∫_{−∞}^∞ [ ∫_{−∞}^{z−y} fXY(x,y) dx ] dy.

It then follows that

fZ(z) = (∂/∂z)FZ(z) = ∫_{−∞}^∞ (∂/∂z) ∫_{−∞}^{z−y} fXY(x,y) dx dy = ∫_{−∞}^∞ fXY(z−y, y) dy.

15. First write

FZ(z) = P((X,Y) ∈ Az) = P((X,Y) ∈ B⁺z ∪ B⁻z)
      = ∫∫_{B⁺z} fXY(x,y) dx dy + ∫∫_{B⁻z} fXY(x,y) dx dy
      = ∫_0^∞ [ ∫_{−∞}^{z/y} fXY(x,y) dx ] dy + ∫_{−∞}^0 [ ∫_{z/y}^∞ fXY(x,y) dx ] dy.

Then

fZ(z) = (∂/∂z)FZ(z) = ∫_0^∞ fXY(z/y, y)/y dy − ∫_{−∞}^0 fXY(z/y, y)/y dy
      = ∫_0^∞ fXY(z/y, y)/|y| dy + ∫_{−∞}^0 fXY(z/y, y)/|y| dy
      = ∫_{−∞}^∞ fXY(z/y, y)/|y| dy.

16. For the cdf, write

FZ(z) = ∫_{−∞}^∞ [ ∫_{−∞}^{z+x} fXY(x,y) dy ] dx.

Then

fZ(z) = (∂/∂z)FZ(z) = ∫_{−∞}^∞ (∂/∂z) ∫_{−∞}^{z+x} fXY(x,y) dy dx = ∫_{−∞}^∞ fXY(x, z+x) dx.

17. For the cdf, write

FZ(z) = P(Z ≤ z) = P(Y/X ≤ z)
      = P(Y ≤ Xz, X > 0) + P(Y ≥ Xz, X < 0)
      = ∫_0^∞ [ ∫_{−∞}^{xz} fXY(x,y) dy ] dx + ∫_{−∞}^0 [ ∫_{xz}^∞ fXY(x,y) dy ] dx.

Then

fZ(z) = (∂/∂z)FZ(z) = ∫_0^∞ fXY(x, xz)x dx − ∫_{−∞}^0 fXY(x, xz)x dx = ∫_{−∞}^∞ fXY(x, xz)|x| dx.

18. (a) The region D is the set of points (x,y) with −1 ≤ x ≤ 1 and |x| ≤ y ≤ 1; i.e.,
the triangle with vertices (−1,1), (0,0), and (1,1). [Sketch omitted.]

(b) Since fXY(x,y) = Kx^n y^m for (x,y) ∈ D and since D contains negative values of
x, we must have n even in order that the density be nonnegative. In this case,
the integral of the density over the region D must be one. Hence, K must be
such that

1 = ∫∫_D fXY(x,y) dy dx = ∫_{−1}^1 [ ∫_{|x|}^1 Kx^n y^m dy ] dx = ∫_{−1}^1 Kx^n [ y^{m+1}/(m+1) ]_{|x|}^1 dx
  = ∫_{−1}^1 [Kx^n/(m+1)] [1 − |x|^{m+1}] dx = [K/(m+1)] ∫_{−1}^1 ( x^n − |x|^{n+m+1} ) dx
  = [2K/(m+1)] ∫_0^1 ( x^n − x^{n+m+1} ) dx = [2K/(m+1)] [ 1/(n+1) − 1/(n+m+2) ]
  = 2K/[(n+1)(n+m+2)].

Hence, K = (n+1)(n+m+2)/2.

(c) [Sketch with z = 0.3: the two components of Az, bounded by the two branches
of the hyperbola xy = z, one in the first quadrant and one in the third.]

(d) [Sketch of Az ∩ D with z = 0.3 omitted.]

(e) P((X,Y) ∈ Az) = ∫_z^{√z} [ ∫_{z/x}^1 Kx^n y^m dy ] dx + ∫_{√z}^1 [ ∫_x^1 Kx^n y^m dy ] dx
  = ∫_z^{√z} Kx^n [ ∫_{z/x}^1 y^m dy ] dx + ∫_{√z}^1 Kx^n [ ∫_x^1 y^m dy ] dx
  = [K/(m+1)] { ∫_z^{√z} x^n [1 − (z/x)^{m+1}] dx + ∫_{√z}^1 x^n [1 − x^{m+1}] dx }
  = [K/(m+1)] { ∫_z^{√z} ( x^n − z^{m+1}x^{n−m−1} ) dx + ∫_{√z}^1 ( x^n − x^{n+m+1} ) dx }
  = [K/(m+1)] { ∫_z^1 x^n dx − ∫_z^{√z} z^{m+1}x^{n−m−1} dx − ∫_{√z}^1 x^{n+m+1} dx }
  = [K/(m+1)] { (1−z^{n+1})/(n+1) − ∫_z^{√z} z^{m+1}x^{n−m−1} dx − (1−z^{(n+m+2)/2})/(n+m+2) }.

If n ≠ m, the remaining integral is equal to

[ (√z)^{n+m+2} − z^{n+1} ]/(n−m).

Otherwise, the integral is equal to

−(z^{m+1}/2) ln z.

19. Let X ∼ uniform[0,w] and Y ∼ uniform[0,h]. We need to compute P(XY ≥ λwh).
Before proceeding, we make a few observations. First, since X ≥ 0, we can write for
z > 0,

P(XY ≥ z) = ∫_0^∞ ∫_{z/x}^∞ fXY(x,y) dy dx = ∫_0^∞ fX(x) ∫_{z/x}^∞ fY(y) dy dx.

Since Y ∼ uniform[0,h], the inner integral will be zero if z/x > h. Since z/x ≤ h if
and only if x ≥ z/h,

P(XY ≥ z) = ∫_{z/h}^∞ fX(x) ∫_{z/x}^h (1/h) dy dx = ∫_{z/h}^∞ fX(x) ( 1 − z/(xh) ) dx.

We can now write

P(XY ≥ λwh) = ∫_{λwh/h}^∞ fX(x) ( 1 − λwh/(xh) ) dx = ∫_{λw}^∞ fX(x) ( 1 − λw/x ) dx
            = ∫_{λw}^w (1/w) ( 1 − λw/x ) dx = (1−λ) − λ ln(w/(λw)) = (1−λ) + λ ln λ.

20. We first compute

E[XY] = E[cosΘ sinΘ] = (1/2)E[sin 2Θ] = [1/(4π)] ∫_{−π}^π sin 2θ dθ
      = [cos(−2π) − cos(2π)]/(8π) = 0.

Similarly,

E[X] = E[cosΘ] = [1/(2π)] ∫_{−π}^π cos θ dθ = [sin(π) − sin(−π)]/(2π) = (0−0)/(2π) = 0,

and

E[Y] = E[sinΘ] = [1/(2π)] ∫_{−π}^π sin θ dθ = [cos(−π) − cos(π)]/(2π) = [(−1)−(−1)]/(2π) = 0.

To prove that X and Y are not independent, we argue by contradiction. However,
before we begin, observe that since (X,Y) satisfies

X² + Y² = cos²Θ + sin²Θ = 1,

(X,Y) always lies on the unit circle. Now consider the square of side one centered at
the origin,

S := {(x,y) : |x| ≤ 1/2 and |y| ≤ 1/2}.

Since this region lies strictly inside the unit circle, P((X,Y) ∈ S) = 0. Now, to obtain
a contradiction, suppose that X and Y are independent. Then fXY(x,y) = fX(x)fY(y),
where fX and fY are both arcsine densities by Problem 35 in Chapter 5. Hence, for
|x| < 1 and |y| < 1, fXY(x,y) ≥ fXY(0,0) = 1/π². We can now write

P((X,Y) ∈ S) = ∫∫_S fXY(x,y) dx dy ≥ ∫∫_S (1/π²) dx dy = 1/π² > 0,

which is a contradiction.

21. If E[h(X)k(Y)] = E[h(X)]E[k(Y)] for all bounded continuous functions h and k, then
we may specialize this equation to the functions h(x) = e^{jν1x} and k(y) = e^{jν2y} to show
that the joint characteristic function satisfies

ϕXY(ν1,ν2) := E[e^{j(ν1X+ν2Y)}] = E[e^{jν1X}e^{jν2Y}] = E[e^{jν1X}]E[e^{jν2Y}] = ϕX(ν1)ϕY(ν2).

Since the joint characteristic function is the product of the marginal characteristic
functions, X and Y are independent.

22. (a) Following the hint, let D denote the half-plane D := {(x,y) : x > x0}. [Sketch:
along the ray at angle θ from the x-axis, the distance from the origin to the
boundary x = x0 is x0/cos θ.] Write

P(X > x0) = P(X > x0, Y ∈ IR) = P((X,Y) ∈ D),

where X and Y are both N(0,1). Then

P((X,Y) ∈ D) = ∫∫_D [e^{−(x²+y²)/2}/(2π)] dx dy.

Now convert to polar coordinates and write

P((X,Y) ∈ D) = ∫_{−π/2}^{π/2} ∫_{x0/cosθ}^∞ [e^{−r²/2}/(2π)] r dr dθ
             = [1/(2π)] ∫_{−π/2}^{π/2} ( −e^{−r²/2} )|_{x0/cosθ}^∞ dθ
             = [1/(2π)] ∫_{−π/2}^{π/2} exp[ −(x0/cosθ)²/2 ] dθ
             = (1/π) ∫_0^{π/2} exp( −x0²/(2cos²θ) ) dθ.

(b) In the preceding integral, make the change of variable θ = π/2 − t, dθ = −dt.
Then cos θ becomes cos(π/2−t) = sin t, and the preceding integral becomes

(1/π) ∫_0^{π/2} exp( −x0²/(2sin²t) ) dt.

23. Write

∫_{−∞}^∞ fY|X(y|x) dy = ∫_{−∞}^∞ [fXY(x,y)/fX(x)] dy = [1/fX(x)] ∫_{−∞}^∞ fXY(x,y) dy = fX(x)/fX(x) = 1.

24. First write

P((X,Y) ∈ A | x < X ≤ x+∆x)
    = P( {(X,Y) ∈ A} ∩ {x < X ≤ x+∆x} ) / P(x < X ≤ x+∆x)
    = P( {(X,Y) ∈ A} ∩ {(X,Y) ∈ (x, x+∆x] × IR} ) / P(x < X ≤ x+∆x)
    = P( (X,Y) ∈ A ∩ [(x, x+∆x] × IR] ) / P(x < X ≤ x+∆x)
    = [ ∫∫_{A∩[(x,x+∆x]×IR]} fXY(t,y) dy dt ] / [ ∫_x^{x+∆x} fX(τ) dτ ]
    = [ ∫∫ IA(t,y) I_{(x,x+∆x]×IR}(t,y) fXY(t,y) dy dt ] / [ ∫_x^{x+∆x} fX(τ) dτ ]
    = [ ∫_x^{x+∆x} ∫_{−∞}^∞ IA(t,y) fXY(t,y) dy dt ] / [ ∫_x^{x+∆x} fX(τ) dτ ]
    = [ (1/∆x) ∫_x^{x+∆x} ∫_{−∞}^∞ IA(t,y) fXY(t,y) dy dt ] / [ (1/∆x) ∫_x^{x+∆x} fX(τ) dτ ].

It now follows that

lim_{∆x→0} P((X,Y) ∈ A | x < X ≤ x+∆x) = [ ∫_{−∞}^∞ IA(x,y) fXY(x,y) dy ] / fX(x)
    = ∫_{−∞}^∞ IA(x,y) fY|X(y|x) dy.

25. We first compute, for x > 0,

fY|X(y|x) = xe^{−x(y+1)}/e^{−x} = xe^{−xy}, y > 0.

As a function of y, this is an exponential density with parameter x. This is very
different from fY(y) = 1/(y+1)². We next compute, for y > 0,

fX|Y(x|y) = xe^{−x(y+1)}/[1/(y+1)²] = (y+1)² x e^{−(y+1)x}, x > 0.

As a function of x, this is an Erlang(2, y+1) density, which is not the same as fX ∼ exp(1).

26. We first compute, for x > 0,

fY|X(y|x) = xe^{−x(y+1)}/e^{−x} = xe^{−xy}, y > 0.

As a function of y, this is an exponential density with parameter x. Hence,

E[Y|X = x] = ∫_0^∞ y fY|X(y|x) dy = 1/x.

We next compute, for y > 0,

fX|Y(x|y) = xe^{−x(y+1)}/[1/(y+1)²] = (y+1)² x e^{−(y+1)x}, x > 0.

As a function of x, this is an Erlang(2, y+1) density. Hence,

E[X|Y = y] = ∫_0^∞ x fX|Y(x|y) dx = 2/(y+1).


27. Write

∫_{−∞}^∞ P(X ∈ B|Y = y) fY(y) dy = ∫_{−∞}^∞ [ ∫_B fX|Y(x|y) dx ] fY(y) dy
    = ∫_{−∞}^∞ [ ∫_{−∞}^∞ IB(x) fX|Y(x|y) dx ] fY(y) dy
    = ∫_{−∞}^∞ IB(x) [ ∫_{−∞}^∞ fXY(x,y) dy ] dx
    = ∫_B fX(x) dx = P(X ∈ B).

28. For z ≥ 0,

fZ(z) = ∫_{−√z}^{√z} [ e^{−(z−y²)/(2σ²)}/(√(2π)σ√(z−y²)) ] · [ e^{−(y/σ)²/2}/(√(2π)σ) ] dy
      = [ e^{−z/(2σ²)}/(2πσ²) ] ∫_{−√z}^{√z} 1/√(z−y²) dy
      = [ e^{−z/(2σ²)}/(πσ²) ] ∫_0^{√z} 1/√(z−y²) dy
      = [ e^{−z/(2σ²)}/(πσ²) ] ∫_0^1 1/√(1−t²) dt
      = [ e^{−z/(2σ²)}/(πσ²) ] [sin^{−1}(1) − sin^{−1}(0)] = e^{−z/(2σ²)}/(2σ²),

which is an exponential density with parameter 1/(2σ²).

29. For z ≥ 0,

fZ(z) = ∫_0^∞ x·λe^{−λxz}·λe^{−λx} dx + ∫_0^∞ y·λe^{−λyz}·λe^{−λy} dy
      = 2∫_0^∞ x·λe^{−λxz}·λe^{−λx} dx
      = [ 2λ²/(λz+λ) ] ∫_0^∞ x·[λz+λ]e^{−x[λz+λ]} dx.

Now, this last integral is the expectation of an exponential density with parameter
λz + λ. Hence,

fZ(z) = [ 2λ²/(λz+λ) ] · [ 1/(λz+λ) ] = 2/(z+1)², z ≥ 0.

30. Using the law of total probability, substitution, and independence, we have

P(X ≤ Y) = ∫_0^∞ P(X ≤ Y|X = x) fX(x) dx = ∫_0^∞ P(x ≤ Y|X = x) fX(x) dx
         = ∫_0^∞ P(Y ≥ x) fX(x) dx = ∫_0^∞ e^{−µx}·λe^{−λx} dx
         = λ ∫_0^∞ e^{−x(λ+µ)} dx = [λ/(λ+µ)] ( −e^{−(λ+µ)x} )|_0^∞ = λ/(λ+µ).


31. Using the law of total probability, substitution, and independence, we have

P(Y/ln(1+X²) > 1) = ∫_{−∞}^∞ P(Y/ln(1+X²) > 1|X = x) fX(x) dx
    = ∫_1^2 P(Y > ln(1+x²)|X = x)·1 dx
    = ∫_1^2 P(Y > ln(1+x²)) dx = ∫_1^2 e^{−ln(1+x²)} dx
    = ∫_1^2 1/(1+x²) dx = tan^{−1}(2) − tan^{−1}(1).

32. First find the cdf using the law of total probability and substitution. Then differentiate
to obtain the density.

(a) For Z = e^X Y,

FZ(z) = P(Z ≤ z) = P(e^X Y ≤ z) = ∫_{−∞}^∞ P(e^X Y ≤ z|X = x) fX(x) dx
      = ∫_{−∞}^∞ P(Y ≤ ze^{−x}|X = x) fX(x) dx = ∫_{−∞}^∞ FY|X(ze^{−x}|x) fX(x) dx.

Then

fZ(z) = ∫_{−∞}^∞ fY|X(ze^{−x}|x) e^{−x} fX(x) dx,

and so

fZ(z) = ∫_{−∞}^∞ fXY(x, ze^{−x}) e^{−x} dx.

(b) Since Z = |X+Y| ≥ 0, we know that FZ(z) and fZ(z) are zero for z < 0. For
z ≥ 0, write

FZ(z) = P(Z ≤ z) = P(|X+Y| ≤ z) = ∫_{−∞}^∞ P(|X+Y| ≤ z|X = x) fX(x) dx
      = ∫_{−∞}^∞ P(|x+Y| ≤ z|X = x) fX(x) dx
      = ∫_{−∞}^∞ P(−z ≤ x+Y ≤ z|X = x) fX(x) dx
      = ∫_{−∞}^∞ P(−z−x ≤ Y ≤ z−x|X = x) fX(x) dx
      = ∫_{−∞}^∞ [ FY|X(z−x|x) − FY|X(−z−x|x) ] fX(x) dx.

Then

fZ(z) = ∫_{−∞}^∞ [ fY|X(z−x|x) − fY|X(−z−x|x)(−1) ] fX(x) dx
      = ∫_{−∞}^∞ [ fY|X(z−x|x) + fY|X(−z−x|x) ] fX(x) dx
      = ∫_{−∞}^∞ [ fXY(x, z−x) + fXY(x, −z−x) ] dx.


33. (a) First find the cdf of Z using the law of total probability, substitution, and
independence. Then differentiate to obtain the density. Write

FZ(z) = P(Z ≤ z) = P(Y/X ≤ z) = ∫_{−∞}^∞ P(Y/X ≤ z|X = x) fX(x) dx
      = ∫_{−∞}^∞ P(Y/x ≤ z|X = x) fX(x) dx = ∫_{−∞}^∞ P(Y/x ≤ z) fX(x) dx
      = ∫_{−∞}^0 P(Y/x ≤ z) fX(x) dx + ∫_0^∞ P(Y/x ≤ z) fX(x) dx
      = ∫_{−∞}^0 P(Y ≥ zx) fX(x) dx + ∫_0^∞ P(Y ≤ zx) fX(x) dx
      = ∫_{−∞}^0 [1 − FY(zx)] fX(x) dx + ∫_0^∞ FY(zx) fX(x) dx.

Then

fZ(z) = ∫_{−∞}^0 −fY(zx) x fX(x) dx + ∫_0^∞ fY(zx) x fX(x) dx
      = ∫_{−∞}^0 fY(zx)|x| fX(x) dx + ∫_0^∞ fY(zx)|x| fX(x) dx
      = ∫_{−∞}^∞ fY(zx) fX(x)|x| dx.

(b) Using the result of part (a) and the fact that the integrand is even, write

fZ(z) = ∫_{−∞}^∞ [ e^{−|zx|²/(2σ²)}/(√(2π)σ) ] [ e^{−(x/σ)²/2}/(√(2π)σ) ] |x| dx
      = (1/π) ∫_0^∞ (x/σ²) e^{−(x/σ)²[1+z²]/2} dx.

Now make the change of variable θ = (x/σ)√(1+z²) to get

fZ(z) = [ 1/(π(1+z²)) ] ∫_0^∞ θ e^{−θ²/2} dθ = [ 1/(π(1+z²)) ] ( −e^{−θ²/2} )|_0^∞ = (1/π)/(1+z²),

which is the Cauchy(1) density.
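A short MATLAB Monte Carlo sketch of part (b): the ratio of two independent N(0,σ²) variables should be Cauchy(1) regardless of σ. The sample size and σ are arbitrary choices; the Cauchy(1) cdf used for comparison is (1/π)tan^{−1}(z) + 1/2.

n = 1e6; sigma = 3;
Z = (sigma*randn(n,1))./(sigma*randn(n,1));
z = [-2 -1 0 1 2];
Fhat = arrayfun(@(t) mean(Z <= t), z);   % empirical cdf of the ratio
F = atan(z)/pi + 0.5;                    % Cauchy(1) cdf
disp([z; Fhat; F])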

(c) Using the result of part (a) and the fact that the integrand is even, write

fZ(z) = ∫_{−∞}^∞ (λ/2)e^{−λ|zx|} (λ/2)e^{−λ|x|} |x| dx = (λ²/2) ∫_0^∞ x e^{−xλ(|z|+1)} dx
      = [ (λ²/2)/(λ(|z|+1)) ] ∫_0^∞ x·λ(|z|+1)e^{−λ(|z|+1)x} dx,

where this last integral is the mean of an exponential density with parameter
λ(|z|+1). Hence,

fZ(z) = [ (λ²/2)/(λ(|z|+1)) ] · [ 1/(λ(|z|+1)) ] = 1/[2(|z|+1)²].

(d) For z > 0, use the result of part (a) and the fact that the integrand is even to write

fZ(z) = ∫_{−∞}^∞ (1/2)I_{[−1,1]}(zx) [ e^{−x²/2}/√(2π) ] |x| dx = ∫_0^{1/z} x e^{−x²/2}/√(2π) dx
      = [1/√(2π)] ( −e^{−x²/2} )|_0^{1/z} = [1 − e^{−1/(2z²)}]/√(2π).

The same formula holds for z < 0, and it is easy to check that fZ(0) = 1/√(2π).

(e) Since Z ≥ 0, for z ≥ 0, we use the result from part (a) to write

fZ(z) = ∫_0^∞ (zx/λ²)e^{−(zx/λ)²/2} (x/λ²)e^{−(x/λ)²/2} x dx = z ∫_0^∞ (x/λ)³ e^{−(x/λ)²(z²+1)/2} dx/λ
      = z ∫_0^∞ θ³ e^{−θ²(z²+1)/2} dθ = z ∫_0^∞ ( t/√(z²+1) )³ e^{−t²/2} dt/√(z²+1)
      = [ z/(z²+1)² ] ∫_0^∞ t³ e^{−t²/2} dt = [ 2z/(z²+1)² ] ∫_0^∞ s e^{−s} ds.

This last integral is the mean of an exp(1) density, which is one. Hence fZ(z) =
2z/(z²+1)².

34. For the cdf, use the law of total probability, substitution, and independence to write

FZ(z) = P(Z ≤ z) = P(Y/lnX ≤ z) = ∫_0^∞ P(Y/lnX ≤ z|X = x) fX(x) dx
      = ∫_0^∞ P(Y/ln x ≤ z|X = x) fX(x) dx = ∫_0^∞ P(Y/ln x ≤ z) fX(x) dx
      = ∫_0^1 P(Y ≥ z ln x) fX(x) dx + ∫_1^∞ P(Y ≤ z ln x) fX(x) dx.

Then

fZ(z) = ∫_0^1 −fY(z ln x)(ln x) fX(x) dx + ∫_1^∞ fY(z ln x)(ln x) fX(x) dx
      = ∫_0^∞ fX(x) fY(z ln x)|ln x| dx.

35. Use the law of total probability, substitution, and independence to write

E[e^{(X+Z)U}] = ∫_{−1/2}^{1/2} E[e^{(X+Z)U}|U = u] du = ∫_{−1/2}^{1/2} E[e^{(X+Z)u}|U = u] du
    = ∫_{−1/2}^{1/2} E[e^{(X+Z)u}] du = ∫_{−1/2}^{1/2} E[e^{Xu}]E[e^{Zu}] du = ∫_{−1/2}^{1/2} 1/(1−u)² du
    = 1/(1−u) |_{−1/2}^{1/2} = 1/(1−1/2) − 1/(1+1/2) = 2 − 2/3 = 4/3.


36. Use the law of total probability and substitution to write

E[X²Y] = ∫_1^2 E[X²Y|Y = y] dy = ∫_1^2 E[X²y|Y = y] dy = ∫_1^2 y E[X²|Y = y] dy
       = ∫_1^2 y·(2/y²) dy = 2∫_1^2 (1/y) dy = 2 ln 2.

37. Use the law of total probability and substitution to write

E[X^n Y^r] = ∫_0^∞ E[X^n Y^r|Y = y] fY(y) dy = ∫_0^∞ E[X^n y^r|Y = y] fY(y) dy
           = ∫_0^∞ y^r E[X^n|Y = y] fY(y) dy = ∫_0^∞ y^r [Γ(n+p)/(y^n Γ(p))] fY(y) dy
           = [Γ(n+p)/Γ(p)] ∫_0^∞ y^{r−n} fY(y) dy = [Γ(n+p)/Γ(p)] E[Y^{r−n}] = [Γ(n+p)/Γ(p)]·(r−n)!/λ^{r−n}.

38. (a) Use the law of total probability, substitution, and independence to find the cdf.
Write

FY(y) = P(Y ≤ y) = P(e^{VU} ≤ y) = ∫_0^∞ P(e^{VU} ≤ y|V = v) fV(v) dv
      = ∫_0^∞ P(e^{vU} ≤ y|V = v) fV(v) dv = ∫_0^∞ P(e^{vU} ≤ y) fV(v) dv
      = ∫_0^∞ P(U ≤ (1/v) ln y) fV(v) dv.

Then

fY(y) = ∫_0^∞ [1/(vy)] fU((1/v) ln y) fV(v) dv.

To determine when fU((1/v) ln y) is nonzero, we consider the cases y > 1 and y < 1
separately. For y > 1, (1/v) ln y ≥ 0 for all v ≥ 0, and (1/v) ln y ≤ 1/2 for v ≥ 2 ln y.
Thus, fU((1/v) ln y) = 1 for v ≥ 2 ln y, and we can write

fY(y) = ∫_{2 ln y}^∞ [1/(yv)]·ve^{−v} dv = e^{−2 ln y}/y = 1/y³.

For y < 1, fU((1/v) ln y) = 1 for −1/2 ≤ (1/v) ln y, or v ≥ −2 ln y. Thus,

fY(y) = ∫_{−2 ln y}^∞ [1/(yv)]·ve^{−v} dv = (1/y)e^{2 ln y} = y.

Putting this all together, we have

fY(y) =
    1/y³,   y ≥ 1,
    y,      0 ≤ y < 1,
    0,      y < 0.

(b) Using the density of part (a),

E[Y] = ∫_0^1 y² dy + ∫_1^∞ (1/y²) dy = 1/3 + 1 = 4/3.

(c) Using the law of total probability, substitution, and independence, we have

E[e^{VU}] = ∫_{−1/2}^{1/2} E[e^{VU}|U = u] du = ∫_{−1/2}^{1/2} E[e^{Vu}|U = u] du
         = ∫_{−1/2}^{1/2} E[e^{Vu}] du = ∫_{−1/2}^{1/2} 1/(1−u)² du = 1/(1−u)|_{−1/2}^{1/2} = 4/3.

39. (a) This problem is interesting because the answer does not depend on the random
variable X. Assuming X has a density fX(x), first write

E[cos(X+Y)] = ∫_{−∞}^∞ E[cos(X+Y)|X = x] fX(x) dx = ∫_{−∞}^∞ E[cos(x+Y)|X = x] fX(x) dx.

Now use the conditional density of Y given X = x to write

E[cos(x+Y)|X = x] = ∫_{x−π}^{x+π} cos(x+y) dy/(2π) = ∫_{2x−π}^{2x+π} cos θ dθ/(2π) = 0,

since we are integrating cos θ over an interval of length 2π. Thus, E[cos(X+Y)] = 0
as well.

(b) Write

P(Y > y) = ∫_{−∞}^∞ P(Y > y|X = x) fX(x) dx = ∫_1^2 P(Y > y|X = x) dx
         = ∫_1^2 e^{−xy} dx = (e^{−y} − e^{−2y})/y.

(c) Begin in the usual way by writing

E[Xe^Y] = ∫_{−∞}^∞ E[Xe^Y|X = x] fX(x) dx = ∫_{−∞}^∞ E[xe^Y|X = x] fX(x) dx
        = ∫_{−∞}^∞ x E[e^Y|X = x] fX(x) dx.

Now observe that

E[e^Y|X = x] = E[e^{sY}|X = x]|_{s=1} = e^{s²x²/2}|_{s=1} = e^{x²/2}.

Then continue with

E[Xe^Y] = ∫_{−∞}^∞ x e^{x²/2} fX(x) dx = (1/4)∫_3^7 x e^{x²/2} dx = (1/4)( e^{x²/2} )|_3^7 = (e^{49/2} − e^{9/2})/4.

(d) Write

E[cos(XY)] = ∫_{−∞}^∞ E[cos(XY)|X = x] fX(x) dx = ∫_1^2 E[cos(xY)|X = x] dx
           = ∫_1^2 E[Re(e^{jxY})|X = x] dx = Re ∫_1^2 E[e^{jxY}|X = x] dx
           = Re ∫_1^2 e^{−x²(1/x)/2} dx = ∫_1^2 e^{−x/2} dx = 2(e^{−1/2} − e^{−1}).

40. Using the law of total probability, substitution, and independence,

MY(s) = E[e^{sY}] = E[e^{sZX}] = ∫_0^∞ E[e^{sZX}|Z = z] fZ(z) dz
      = ∫_0^∞ E[e^{szX}|Z = z] fZ(z) dz = ∫_0^∞ E[e^{szX}] fZ(z) dz
      = ∫_0^∞ e^{(sz)²σ²/2} fZ(z) dz = ∫_0^∞ e^{(sz)²σ²/2} z e^{−z²/2} dz
      = ∫_0^∞ z e^{−(1−s²σ²)z²/2} dz.

Now make the change of variable t = z√(1−s²σ²) to get

MY(s) = ∫_0^∞ [ t/√(1−s²σ²) ] e^{−t²/2} dt/√(1−s²σ²) = 1/(1−s²σ²) = (1/σ²)/(1/σ² − s²).

Hence, Y ∼ Laplace(1/σ).

41. Using the law of total probability and substitution,

E[X^n Y^m] = ∫_0^∞ E[X^n Y^m|Y = y] fY(y) dy = ∫_0^∞ E[X^n y^m|Y = y] fY(y) dy
           = ∫_0^∞ y^m E[X^n|Y = y] fY(y) dy = ∫_0^∞ y^m 2^{n/2} y^n Γ(1+n/2) fY(y) dy
           = 2^{n/2} Γ(1+n/2) E[Y^{n+m}] = 2^{n/2} Γ(1+n/2) (n+m)!/β^{n+m}.

42. (a) We use the law of total probability, substitution, and independence to write

FZ(z) = P(Z ≤ z) = P(X/Y ≤ z) = ∫_0^∞ P(X/Y ≤ z|Y = y) fY(y) dy
      = ∫_0^∞ P(X/y ≤ z|Y = y) fY(y) dy = ∫_0^∞ P(X ≤ zy|Y = y) fY(y) dy
      = ∫_0^∞ P(X ≤ zy) fY(y) dy = ∫_0^∞ FX(zy) fY(y) dy.

Differentiating, we have

fZ(z) = ∫_0^∞ fX(zy) y fY(y) dy = ∫_0^∞ [ λ(λzy)^{p−1}e^{−λzy}/Γ(p) ]·[ λ(λy)^{q−1}e^{−λy}/Γ(q) ]·y dy.

Making the change of variable w = λy, we obtain

fZ(z) = ∫_0^∞ [ (zw)^{p−1}e^{−zw}/Γ(p) ]·[ w^{q−1}e^{−w}/Γ(q) ]·w dw
      = [ z^{p−1}/(Γ(p)Γ(q)) ] ∫_0^∞ w^{p+q−1} e^{−w(1+z)} dw.

Now make the change of variable θ = w(1+z) so that dθ = (1+z)dw and
w = θ/(1+z). Then

fZ(z) = [ z^{p−1}/(Γ(p)Γ(q)) ] ∫_0^∞ ( θ/(1+z) )^{p+q−1} e^{−θ} dθ/(1+z)
      = [ z^{p−1}/(Γ(p)Γ(q)(1+z)^{p+q}) ] ∫_0^∞ θ^{p+q−1}e^{−θ} dθ
      = z^{p−1} Γ(p+q)/[ Γ(p)Γ(q)(1+z)^{p+q} ] = z^{p−1}/[ B(p,q)(1+z)^{p+q} ].

(b) Starting with V := Z/(1+Z), we first write

FV(v) = P( Z/(1+Z) ≤ v ) = P(Z ≤ v + vZ) = P(Z(1−v) ≤ v) = P(Z ≤ v/(1−v)).

Differentiating, we have

fV(v) = fZ( v/(1−v) ) · [(1−v)+v]/(1−v)² = fZ( v/(1−v) ) · 1/(1−v)².

Now apply the formula derived in part (a) and use the fact that

1 + v/(1−v) = 1/(1−v)

to get

fV(v) = { [v/(1−v)]^{p−1} / [ B(p,q)(1/(1−v))^{p+q} ] } · 1/(1−v)² = v^{p−1}(1−v)^{q−1}/B(p,q),

which is the beta density with parameters p and q.
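Part (b) is easy to spot-check by simulation in MATLAB. Integer p and q are assumed here so that the gamma variables can be built from sums of exponentials; p, q, λ, and n are arbitrary choices.

p = 2; q = 3; lambda = 1; n = 1e5;
X = -sum(log(rand(p,n)),1)/lambda;        % gamma(p,lambda) samples
Y = -sum(log(rand(q,n)),1)/lambda;        % gamma(q,lambda) samples
V = X./(X+Y);                             % should be beta(p,q)
[mean(V), p/(p+q)]                        % beta mean p/(p+q) = 0.4
[var(V), p*q/((p+q)^2*(p+q+1))]           % beta variance = 0.04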

43. Put q := (n−1)p and Zi := ∑_{j≠i} Xj, which is gamma(q,λ) by Problem 55(c) in Chapter 4.
Now observe that Yi = Xi/(Xi+Zi), which has a beta density with parameters p and
q := (n−1)p by Problem 42(b).

44. Using the law of total probability, substitution, and independence,

FZ(z) = P(Z ≤ z) = P( X/√(Y/k) ≤ z ) = P( X ≤ z√(Y/k) )
      = ∫_0^∞ P( X ≤ z√(Y/k) |Y = y) fY(y) dy = ∫_0^∞ P( X ≤ z√(y/k) |Y = y) fY(y) dy
      = ∫_0^∞ P( X ≤ z√(y/k) ) fY(y) dy.

Then

fZ(z) = ∫_0^∞ fX( z√(y/k) ) √(y/k) fY(y) dy
      = ∫_0^∞ [ e^{−(z²y/k)/2}/√(2π) ] √(y/k) [ (1/2)(y/2)^{k/2−1}e^{−y/2}/Γ(k/2) ] dy
      = [ 1/(2√π √k Γ(k/2)) ] ∫_0^∞ (y/2)^{k/2−1/2} e^{−y(1+z²/k)/2} dy.

Now make the change of variable θ = y(1+z²/k)/2, dθ = [(1+z²/k)/2] dy, to get

fZ(z) = [ 1/(√π √k Γ(k/2)) ] ∫_0^∞ ( θ/(1+z²/k) )^{k/2−1/2} e^{−θ} dθ/(1+z²/k)
      = [ 1/( Γ(1/2)√k Γ(k/2)(1+z²/k)^{k/2+1/2} ) ] ∫_0^∞ θ^{k/2+1/2−1} e^{−θ} dθ
      = Γ(k/2+1/2)/[ Γ(1/2)√k Γ(k/2)(1+z²/k)^{k/2+1/2} ] = (1+z²/k)^{−(k+1)/2}/[ √k B(1/2, k/2) ],

which is the required Student's t density with k degrees of freedom.

45. We use the law of total probability, substitution, and independence to write

FZ(z) = P(Z ≤ z) = P(X/Y ≤ z) = ∫_0^∞ P(X/Y ≤ z|Y = y) fY(y) dy
      = ∫_0^∞ P(X/y ≤ z|Y = y) fY(y) dy = ∫_0^∞ P(X ≤ zy|Y = y) fY(y) dy
      = ∫_0^∞ P(X ≤ zy) fY(y) dy = ∫_0^∞ FX(zy) fY(y) dy.

Differentiating, we have

fZ(z) = ∫_0^∞ fX(zy) y fY(y) dy = ∫_0^∞ [ λr(λzy)^{p−1}e^{−(λzy)^r}/Γ(p/r) ]·[ λr(λy)^{q−1}e^{−(λy)^r}/Γ(q/r) ]·y dy.

Making the change of variable w = λy, we obtain

fZ(z) = ∫_0^∞ [ r(zw)^{p−1}e^{−(zw)^r}/Γ(p/r) ]·[ rw^{q−1}e^{−w^r}/Γ(q/r) ]·w dw
      = [ r²z^{p−1}/(Γ(p/r)Γ(q/r)) ] ∫_0^∞ w^{p+q−1} e^{−w^r(1+z^r)} dw.

Now make the change of variable θ = w^r(1+z^r) so that dθ = rw^{r−1}(1+z^r) dw and
w = (θ/[1+z^r])^{1/r}. Then

fZ(z) = [ r²z^{p−1}/(Γ(p/r)Γ(q/r)) ] ∫_0^∞ ( θ/(1+z^r) )^{(p+q−1)/r} e^{−θ} dθ/[ r(1+z^r)( θ/(1+z^r) )^{(r−1)/r} ]
      = [ rz^{p−1}/( Γ(p/r)Γ(q/r)(1+z^r)^{(p+q)/r} ) ] ∫_0^∞ θ^{(p+q)/r−1} e^{−θ} dθ
      = rz^{p−1} Γ((p+q)/r)/[ Γ(p/r)Γ(q/r)(1+z^r)^{(p+q)/r} ] = rz^{p−1}/[ B(p/r, q/r)(1+z^r)^{(p+q)/r} ].

46. For 0 < z ≤ 1,

fZ(z) = (1/4) ∫_0^z y^{−1/2}(z−y)^{−1/2} dy = [1/(4z)] ∫_0^z 1/√( (y/z)(1−(y/z)) ) dy.

Now make the change of variable t² = y/z, 2t dt = dy/z, to get

fZ(z) = (1/2) ∫_0^1 1/√(1−t²) dt = (1/2) sin^{−1} t |_0^1 = π/4.

Next, for 1 < z ≤ 2,

fZ(z) = [1/(4z)] ∫_{z−1}^1 1/√( (y/z)(1−(y/z)) ) dy = (1/2) ∫_{√(1−1/z)}^{1/√z} 1/√(1−t²) dt
      = (1/2) sin^{−1} t |_{√(1−1/z)}^{1/√z} = (1/2)[ sin^{−1}(1/√z) − sin^{−1}(√(1−1/z)) ].

Putting this all together yields

fZ(z) =
    π/4,                                           0 < z ≤ 1,
    (1/2)[ sin^{−1}(1/√z) − sin^{−1}(√(1−1/z)) ],   1 < z ≤ 2,
    0,                                             otherwise.

47. Let ψ denote the N(0,1) density. Using

fUV(u,v) = ψρ(u,v) = ψ(u) · [1/√(1−ρ²)] ψ( (v−ρu)/√(1−ρ²) ),

where the second factor is an N(ρu, 1−ρ²) density in v, we see that

fU(u) = ∫_{−∞}^∞ fUV(u,v) dv = ψ(u) ∫_{−∞}^∞ [1/√(1−ρ²)] ψ( (v−ρu)/√(1−ρ²) ) dv = ψ(u),

since the density in v integrates to one. Similarly, writing

fUV(u,v) = ψρ(u,v) = [1/√(1−ρ²)] ψ( (u−ρv)/√(1−ρ²) ) · ψ(v),

we have

fV(v) = ∫_{−∞}^∞ fUV(u,v) du = ψ(v) ∫_{−∞}^∞ [1/√(1−ρ²)] ψ( (u−ρv)/√(1−ρ²) ) du = ψ(v),

since the density in u integrates to one.

48. Using

ψρ(u,v) = ψ(u) · [1/√(1−ρ²)] ψ( (v−ρu)/√(1−ρ²) ),

we can write

fXY(x,y) = [1/(σXσY)] ψρ( (x−mX)/σX, (y−mY)/σY )
         = [1/(σXσY)] ψ( (x−mX)/σX ) · [1/√(1−ρ²)] ψ( [ (y−mY)/σY − ρ(x−mX)/σX ]/√(1−ρ²) ).   (∗)

Then in ∫_{−∞}^∞ fXY(x,y) dy, make the change of variable v = (y−mY)/σY, dv = dy/σY,
to get

fX(x) = ∫_{−∞}^∞ fXY(x,y) dy
      = (1/σX) ψ( (x−mX)/σX ) ∫_{−∞}^∞ [1/√(1−ρ²)] ψ( [ v − ρ(x−mX)/σX ]/√(1−ρ²) ) dv
      = (1/σX) ψ( (x−mX)/σX ),

since the density in v integrates to one. Thus, fX ∼ N(mX, σ²X). Using this along with (∗),
we obtain

fY|X(y|x) = fXY(x,y)/fX(x) = [1/(σY√(1−ρ²))] ψ( [ (y−mY)/σY − ρ(x−mX)/σX ]/√(1−ρ²) )
          = [1/(σY√(1−ρ²))] ψ( ( y − [mY + (σY/σX)ρ(x−mX)] )/( σY√(1−ρ²) ) ).

Thus, fY|X(·|x) ∼ N( mY + (σY/σX)ρ(x−mX), σ²Y(1−ρ²) ). Proceeding in an analogous
way, using

ψρ(u,v) = [1/√(1−ρ²)] ψ( (u−ρv)/√(1−ρ²) ) · ψ(v),

we can write

fXY(x,y) = [1/(σXσY)] ψρ( (x−mX)/σX, (y−mY)/σY )
         = [1/(σXσY√(1−ρ²))] ψ( [ (x−mX)/σX − ρ(y−mY)/σY ]/√(1−ρ²) ) · ψ( (y−mY)/σY ).   (∗∗)

Then in ∫_{−∞}^∞ fXY(x,y) dx, make the change of variable u = (x−mX)/σX, du = dx/σX,
to get

fY(y) = ∫_{−∞}^∞ fXY(x,y) dx
      = (1/σY) ψ( (y−mY)/σY ) ∫_{−∞}^∞ [1/√(1−ρ²)] ψ( [ u − ρ(y−mY)/σY ]/√(1−ρ²) ) du
      = (1/σY) ψ( (y−mY)/σY ),

since the density in u integrates to one. Thus, fY ∼ N(mY, σ²Y). Using this along with (∗∗),
we obtain

fX|Y(x|y) = fXY(x,y)/fY(y) = [1/(σX√(1−ρ²))] ψ( [ (x−mX)/σX − ρ(y−mY)/σY ]/√(1−ρ²) )
          = [1/(σX√(1−ρ²))] ψ( ( x − [mX + (σX/σY)ρ(y−mY)] )/( σX√(1−ρ²) ) ).

Thus, fX|Y(·|y) ∼ N( mX + (σX/σY)ρ(y−mY), σ²X(1−ρ²) ).

49. From the solution of Problem 48, we have that

fY|X(·|x) ∼ N( mY + (σY/σX)ρ(x−mX), σ²Y(1−ρ²) )

and

fX|Y(·|y) ∼ N( mX + (σX/σY)ρ(y−mY), σ²X(1−ρ²) ).

Hence,

E[Y|X = x] = ∫_{−∞}^∞ y fY|X(y|x) dy = mY + (σY/σX)ρ(x−mX),

and

E[X|Y = y] = ∫_{−∞}^∞ x fX|Y(x|y) dx = mX + (σX/σY)ρ(y−mY).

50. From Problem 48, we know that fX ∼ N(mX, σ²X). Hence, E[X] = mX and E[X²] =
var(X) + mX² = σ²X + mX². To compute cov(X,Y), we use the law of total probability
and substitution to write

cov(X,Y) = E[(X−mX)(Y−mY)] = ∫_{−∞}^∞ E[(X−mX)(Y−mY)|Y = y] fY(y) dy
         = ∫_{−∞}^∞ E[(X−mX)(y−mY)|Y = y] fY(y) dy
         = ∫_{−∞}^∞ (y−mY) E[(X−mX)|Y = y] fY(y) dy
         = ∫_{−∞}^∞ (y−mY) { E[X|Y = y] − mX } fY(y) dy
         = ∫_{−∞}^∞ (y−mY) (σX/σY)ρ(y−mY) fY(y) dy
         = (σX/σY)ρ ∫_{−∞}^∞ (y−mY)² fY(y) dy = (σX/σY)ρ · E[(Y−mY)²]
         = (σX/σY)ρ · σ²Y = σXσYρ.

It then follows that cov(X,Y)/(σXσY) = ρ.

51. (a) Using the results of Problem 47, we have

fU(u) = ∫_{−∞}^∞ fUV(u,v) dv = (1/2) ∫_{−∞}^∞ [ψρ1(u,v) + ψρ2(u,v)] dv = (1/2)[ψ(u) + ψ(u)] = ψ(u),

and

fV(v) = ∫_{−∞}^∞ fUV(u,v) du = (1/2) ∫_{−∞}^∞ [ψρ1(u,v) + ψρ2(u,v)] du = (1/2)[ψ(v) + ψ(v)] = ψ(v).

Thus, fU and fV are N(0,1) densities.

(b) Write

ρ̄ := E[UV] = ∫_{−∞}^∞ ∫_{−∞}^∞ uv fUV(u,v) du dv
   = ∫_{−∞}^∞ ∫_{−∞}^∞ uv · (1/2)[ψρ1(u,v) + ψρ2(u,v)] du dv
   = (1/2)[ ∫∫ uv ψρ1(u,v) du dv + ∫∫ uv ψρ2(u,v) du dv ] = (ρ1 + ρ2)/2.

(c) If indeed

fUV(u,v) = (1/2)[ψρ1(u,v) + ψρ2(u,v)]

is a bivariate normal density, then

fUV(u,v) = exp( −[u² − 2ρ̄uv + v²]/[2(1−ρ̄²)] ) / (2π√(1−ρ̄²)).

In particular then,

fUV(u,u) = (1/2)[ψρ1(u,u) + ψρ2(u,u)],

or

e^{−u²/(1+ρ̄)}/(2π√(1−ρ̄²)) = (1/2)[ e^{−u²/(1+ρ1)}/(2π√(1−ρ1²)) + e^{−u²/(1+ρ2)}/(2π√(1−ρ2²)) ].

Since t := u² ≥ 0 is arbitrary, part (iii) of the hint tells us that

1/(2π√(1−ρ̄²)) = −1/(4π√(1−ρ1²)) = −1/(4π√(1−ρ2²)) = 0,

which is false.

(d) First observe that

fV|U(v|u) = fUV(u,v)/fU(u) = (1/2)( ψρ1(u,v)/ψ(u) + ψρ2(u,v)/ψ(u) ),

where the two terms are N(ρ1u, 1−ρ1²) and N(ρ2u, 1−ρ2²) densities in v. Hence,

∫_{−∞}^∞ v² fV|U(v|u) dv = (1/2)[ (1−ρ1²) + (ρ1u)² + (1−ρ2²) + (ρ2u)² ]
    = (1/2)[ 2 − ρ1² − ρ2² + (ρ1² + ρ2²)u² ],

which depends on u unless ρ1 = ρ2 = 0.

52. For u0, v0 ≥ 0, let D := {(u,v) : u ≥ u0, v ≥ v0}. Then

P(U > u0, V > v0) = ∫∫_D ψρ(u,v) du dv
    = ∫_0^{tan^{−1}(v0/u0)} ∫_{v0/sinθ}^∞ ψρ(r cosθ, r sinθ) r dr dθ     (#)
    + ∫_{tan^{−1}(v0/u0)}^{π/2} ∫_{u0/cosθ}^∞ ψρ(r cosθ, r sinθ) r dr dθ.   (##)

Now, since

ψρ(r cosθ, r sinθ) = exp( −r²[1 − ρ sin 2θ]/[2(1−ρ²)] ) / (2π√(1−ρ²)),

we can express the antiderivative of rψρ(r cosθ, r sinθ) with respect to r in closed
form as

−[ √(1−ρ²)/(2π(1 − ρ sin 2θ)) ] exp( −r²[1 − ρ sin 2θ]/[2(1−ρ²)] ).

Hence, the double integral in (#) reduces to

∫_0^{tan^{−1}(v0/u0)} hρ(v0², θ) dθ.

The double integral in (##) reduces to

∫_{tan^{−1}(v0/u0)}^{π/2} [ √(1−ρ²)/(2π(1 − ρ sin 2θ)) ] exp( −u0²[1 − ρ sin 2θ]/[2(1−ρ²)cos²θ] ) dθ.

Applying the change of variable t = π/2 − θ, dt = −dθ, we obtain

∫_0^{π/2 − tan^{−1}(v0/u0)} hρ(u0², t) dt.

53. If ρ = 0 in Problem 52, then U and V are independent. Taking u0 = v0 = x0,

Q(x0)² = ∫_0^{π/4} h0(x0², θ) dθ + ∫_0^{π/4} h0(x0², θ) dθ
       = 2∫_0^{π/4} exp[ −x0²/(2sin²θ) ]/(2π) dθ = (1/π) ∫_0^{π/4} exp( −x0²/(2sin²θ) ) dθ.

54. Factor

fXYZ(x,y,z) = 2exp[ −|x−y| − (y−z)²/2 ]/(z⁵√(2π)), z ≥ 1,

as

fXYZ(x,y,z) = (4/z⁵) · [ e^{−(y−z)²/2}/√(2π) ] · [ (1/2)e^{−|x−y|} ].

Now, the second factor on the right is an N(z,1) density in the variable y, and the
third factor is a Laplace(1) density that has been shifted to have mean y. Hence, the
integral of the third factor with respect to x yields one, and we have by inspection that

fYZ(y,z) = (4/z⁵) · [ e^{−(y−z)²/2}/√(2π) ], z ≥ 1.

We then easily see that

fX|YZ(x|y,z) := fXYZ(x,y,z)/fYZ(y,z) = (1/2)e^{−|x−y|}, z ≥ 1.

Next, since the right-hand factor in the formula for fYZ(y,z) is an N(z,1) density in y,
if we integrate this factor with respect to y, we get one. Thus,

fZ(z) = 4/z⁵, z ≥ 1.

We can now see that

fY|Z(y|z) := fYZ(y,z)/fZ(z) = e^{−(y−z)²/2}/√(2π), z ≥ 1.

55. To find fXY(x,y), first write

fXYZ(x,y,z) := e^{−(x−y)²/2} e^{−(y−z)²/2} e^{−z²/2}/(2π)^{3/2} = e^{−(x−y)²/2} e^{−(z−y/2)²} e^{−y²/4}/(2π)^{3/2}
             = [ e^{−(x−y)²/2} e^{−(y/√2)²/2}/(2π√2) ] · [ e^{−[(z−y/2)/(1/√2)]²/2}/(√(2π)/√2) ].

Now the right-hand factor is an N(y/2, 1/2) density in the variable z. Hence, its
integral with respect to z is one. We thus have

fXY(x,y) = e^{−(x−y)²/2} e^{−(y/√2)²/2}/(2π√2) = [ e^{−(y/√2)²/2}/(√(2π)√2) ] · [ e^{−(x−y)²/2}/√(2π) ],

which shows that Y ∼ N(0,2), and given Y = y, X is conditionally N(y,1). Thus,

E[Y] = 0 and var(Y) = 2.

Next,

E[X] = ∫_{−∞}^∞ E[X|Y = y] fY(y) dy = ∫_{−∞}^∞ y fY(y) dy = E[Y] = 0,

and

var(X) = E[X²] = ∫_{−∞}^∞ E[X²|Y = y] fY(y) dy = ∫_{−∞}^∞ (1+y²) fY(y) dy
       = 1 + E[Y²] = 1 + var(Y) = 1 + 2 = 3.

Finally,

E[XY] = ∫_{−∞}^∞ E[XY|Y = y] fY(y) dy = ∫_{−∞}^∞ E[Xy|Y = y] fY(y) dy
      = ∫_{−∞}^∞ y E[X|Y = y] fY(y) dy = ∫_{−∞}^∞ y² fY(y) dy = E[Y²] = var(Y) = 2.

56. First write

E[XY] = ∫_{−∞}^∞ ∫_{−∞}^∞ E[XY|Y = y, Z = z] fYZ(y,z) dy dz
      = ∫_{−∞}^∞ ∫_{−∞}^∞ E[Xy|Y = y, Z = z] fYZ(y,z) dy dz
      = ∫_{−∞}^∞ ∫_{−∞}^∞ y E[X|Y = y, Z = z] fYZ(y,z) dy dz.

Since fX|YZ(·|y,z) ∼ N(y,z²), the preceding conditional expectation is just y. Hence,

E[XY] = ∫_{−∞}^∞ ∫_{−∞}^∞ y² fYZ(y,z) dy dz = E[Y²] = ∫_{−∞}^∞ E[Y²|Z = z] fZ(z) dz.

Since fY|Z(·|z) ∼ exp(z), the preceding conditional expectation is just 2/z². Thus,

E[XY] = ∫_{−∞}^∞ (2/z²) fZ(z) dz = ∫_1^2 (2/z²)·(3/7)z² dz = 6/7.

A similar analysis yields

E[YZ] = ∫_{−∞}^∞ E[YZ|Z = z] fZ(z) dz = ∫_{−∞}^∞ E[Yz|Z = z] fZ(z) dz
      = ∫_{−∞}^∞ z E[Y|Z = z] fZ(z) dz = ∫_{−∞}^∞ z(1/z) fZ(z) dz = ∫_{−∞}^∞ fZ(z) dz = 1.


57. Write

E[XYZ] = ∫∫ E[XYZ|Y = y, Z = z] fYZ(y,z) dy dz = ∫∫ E[Xyz|Y = y, Z = z] fYZ(y,z) dy dz
       = ∫∫ yz E[X|Y = y, Z = z] fYZ(y,z) dy dz.

Since fX|YZ(·|y,z) is a shifted Laplace density with mean y, the preceding conditional
expectation is just y. Hence,

E[XYZ] = ∫∫ y²z fYZ(y,z) dy dz = E[Y²Z] = ∫_{−∞}^∞ E[Y²Z|Z = z] fZ(z) dz
       = ∫_{−∞}^∞ E[Y²z|Z = z] fZ(z) dz = ∫_{−∞}^∞ z E[Y²|Z = z] fZ(z) dz.

Since fY|Z(·|z) ∼ N(z,1), the preceding conditional expectation is just 1 + z². Thus,

E[XYZ] = ∫_{−∞}^∞ z(1+z²) fZ(z) dz = ∫_1^∞ [z + z³]·(4/z⁵) dz = 4∫_1^∞ (z^{−4} + z^{−2}) dz
       = 4(1/3 + 1) = 16/3.

58. Write

E[XYZ] = ∫∫ E[XYZ|X = x, Y = y] fXY(x,y) dx dy = ∫∫ E[xyZ|X = x, Y = y] fXY(x,y) dx dy
       = ∫∫ xy E[Z|X = x, Y = y] fXY(x,y) dx dy
       = ∫∫ x²y fXY(x,y) dx dy = E[X²Y] = ∫_{−∞}^∞ E[X²Y|X = x] fX(x) dx
       = ∫_{−∞}^∞ E[x²Y|X = x] fX(x) dx = ∫_{−∞}^∞ x² E[Y|X = x] fX(x) dx
       = ∫_{−∞}^∞ x³ fX(x) dx = ∫_1^2 x³ dx = 15/4.

59. We use the law of total probability, substitution, and independence to write

ϕY(ν) = E[e^{jνY}] = E[e^{jν∑_{i=1}^N Xi}] = ∑_{n=1}^∞ E[e^{jν∑_{i=1}^N Xi}|N = n] P(N = n)
      = ∑_{n=1}^∞ E[e^{jν∑_{i=1}^n Xi}|N = n] P(N = n) = ∑_{n=1}^∞ E[e^{jν∑_{i=1}^n Xi}] P(N = n)
      = ∑_{n=1}^∞ E[ ∏_{i=1}^n e^{jνXi} ] P(N = n) = ∑_{n=1}^∞ [ ∏_{i=1}^n E[e^{jνXi}] ] P(N = n)
      = ∑_{n=1}^∞ ϕX(ν)^n P(N = n) = GN(ϕX(ν)).

Now, if N ∼ geometric1(p), GN(z) = [(1−p)z]/[1−pz], and if X ∼ exp(λ), ϕX(ν) =
λ/(λ−jν). Then

ϕY(ν) = (1−p)ϕX(ν)/[1−pϕX(ν)] = [ (1−p)λ/(λ−jν) ]/[ 1 − pλ/(λ−jν) ] = (1−p)λ/[ (λ−jν) − pλ ]
      = (1−p)λ/[ (1−p)λ − jν ],

which is the exp((1−p)λ) characteristic function. Thus, Y ∼ exp((1−p)λ).
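This result is also easy to see by simulation in MATLAB; p, λ, and the number of trials below are arbitrary choices, and the geometric1(p) variable is generated by inverse transform.

p = 0.5; lambda = 2; trials = 1e5;
Y = zeros(trials,1);
for k = 1:trials
    N = ceil(log(rand)/log(p));          % geometric1(p): P(N=n) = (1-p)p^(n-1)
    Y(k) = -sum(log(rand(N,1)))/lambda;  % sum of N i.i.d. exp(lambda) variables
end
[mean(Y), 1/((1-p)*lambda)]              % both should be about 1 here
[mean(Y > 1), exp(-(1-p)*lambda)]        % tail check at y = 1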


CHAPTER 8

Problem Solutions

1. We have

[10 40; 20 50; 30 60] [7 8 9; 4 5 6] = [230 280 330; 340 410 480; 450 540 630]

and

tr( [230 280 330; 340 410 480; 450 540 630] ) = 1270.
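These two computations can be verified directly in MATLAB:

A = [10 40; 20 50; 30 60];
B = [7 8 9; 4 5 6];
C = A*B          % gives [230 280 330; 340 410 480; 450 540 630]
trace(C)         % gives 1270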

2. MATLAB. See the answer to the previous problem.

3. MATLAB. We have

A′ = [7 4; 8 5; 9 6].

4. Write

tr(AB) = ∑_{i=1}^r (AB)ii = ∑_{i=1}^r ( ∑_{k=1}^n Aik Bki ) = ∑_{k=1}^n ( ∑_{i=1}^r Bki Aik ) = ∑_{k=1}^n (BA)kk = tr(BA).

5. (a) Write

tr(AB′) = ∑_{i=1}^r (AB′)ii = ∑_{i=1}^r ( ∑_{k=1}^n Aik(B′)ki ) = ∑_{i=1}^r ∑_{k=1}^n Aik Bik.

(b) If tr(AB′) = 0 for all B, then in particular, it is true for B = A; i.e.,

0 = tr(AA′) = ∑_{i=1}^r ∑_{k=1}^n Aik²,

which implies Aik = 0 for all i and k. In other words, A is the zero matrix of size r×n.

6. Following the hint, we first write

0 ≤ ‖x−λy‖² = 〈x−λy, x−λy〉 = ‖x‖² − 2λ〈x,y〉 + λ²‖y‖².

Taking λ = 〈x,y〉/‖y‖² yields

0 ≤ ‖x‖² − 2|〈x,y〉|²/‖y‖² + (|〈x,y〉|²/‖y‖⁴)‖y‖² = ‖x‖² − |〈x,y〉|²/‖y‖²,

which can be rearranged to get |〈x,y〉|² ≤ ‖x‖²‖y‖². Conversely, suppose |〈x,y〉|² =
‖x‖²‖y‖². There are two cases to consider. If y ≠ 0, then reversing the above sequence
of observations implies 0 = ‖x−λy‖², which implies x = λy. On the other hand, if
y = 0 and if

|〈x,y〉| = ‖x‖‖y‖,

then we must have ‖x‖ = 0; i.e., x = 0 and y = 0, and in this case x = λy for all λ.

7. Consider the ij component of E[XB]. Since

(E[XB])ij = E[(XB)ij] = E[ ∑_k Xik Bkj ] = ∑_k E[Xik]Bkj = ∑_k (E[X])ik Bkj = (E[X]B)ij

holds for all ij, E[XB] = E[X]B.

8. tr(E[X]) = ∑_i (E[X])ii = ∑_i E[Xii] = E[ ∑_i Xii ] = E[tr(X)].

9. Write

E[‖X−E[X]‖²] = E[(X−E[X])′(X−E[X])], which is a scalar,
             = tr( E[(X−E[X])′(X−E[X])] )
             = E[ tr( (X−E[X])′(X−E[X]) ) ], by Problem 8,
             = E[ tr( (X−E[X])(X−E[X])′ ) ], by Problem 4,
             = tr( E[(X−E[X])(X−E[X])′] ), by Problem 8,
             = tr(C) = ∑_{i=1}^n Cii = ∑_{i=1}^n var(Xi).

10. Since E[[X,Y,Z]′] = [E[X], E[Y], E[Z]]′, and

cov([X,Y,Z]′) = [E[X²] E[XY] E[XZ]; E[YX] E[Y²] E[YZ]; E[ZX] E[ZY] E[Z²]]
              − [E[X]² E[X]E[Y] E[X]E[Z]; E[Y]E[X] E[Y]² E[Y]E[Z]; E[Z]E[X] E[Z]E[Y] E[Z]²],

we begin by computing all the entries of these matrices. For the mean vector,

E[Z] = ∫_1^2 z·(3/7)z² dz = (3/7)∫_1^2 z³ dz = (3/7)·(z⁴/4)|_1^2 = 3·15/28 = 45/28.

Next,

E[Y] = ∫_1^2 E[Y|Z = z] fZ(z) dz = ∫_1^2 (1/z)·(3/7)z² dz = (3/7)∫_1^2 z dz = (3/7)·(z²/2)|_1^2 = 9/14.

Since E[U] = 0 and since U and Z are independent,

E[X] = E[ZU+Y] = E[Z]E[U] + E[Y] = E[Y] = 9/14.

Thus, the desired mean vector is

E[[X,Y,Z]′] = [9/14, 9/14, 45/28]′.

We next compute the correlations. First,

E[YZ] = ∫_1^2 E[YZ|Z = z] fZ(z) dz = ∫_1^2 E[Yz|Z = z] fZ(z) dz = ∫_1^2 z E[Y|Z = z] fZ(z) dz
      = ∫_1^2 z(1/z) fZ(z) dz = ∫_1^2 fZ(z) dz = 1.

Next,

E[Z²] = ∫_1^2 z²·(3/7)z² dz = (3/7)∫_1^2 z⁴ dz = (3/7)·(z⁵/5)|_1^2 = 93/35.

Again using the fact that E[U] = 0 and independence,

E[XZ] = E[(ZU+Y)Z] = E[Z²]E[U] + E[YZ] = E[YZ] = 1.

Now,

E[Y²] = ∫_1^2 E[Y²|Z = z] fZ(z) dz = ∫_1^2 (2/z²)·(3/7)z² dz = 6/7.

We can now compute

E[XY] = E[(ZU+Y)Y] = E[ZY]E[U] + E[Y²] = 6/7,

and

E[X²] = E[(ZU+Y)²] = E[Z²]E[U²] + 2E[U]E[ZY] + E[Y²]
      = E[Z²] + E[Y²] = 93/35 + 6/7 = 123/35.

We now have that

cov([X,Y,Z]′) = [123/35 6/7 1; 6/7 6/7 1; 1 1 93/35]
              − [81/196 81/196 405/392; 81/196 81/196 405/392; 405/392 405/392 2025/784]
              = [3.1010 0.4439 −0.0332; 0.4439 0.4439 −0.0332; −0.0332 −0.0332 0.0742].

11. Since E[[X ,Y,Z]′

]=[E[X ],E[Y ],E[Z]

]′, and

cov([X ,Y,Z]′) =

E[X2] E[XY ] E[XZ]E[YX ] E[Y 2] E[YZ]E[ZX ] E[ZY ] E[Z2]

E[X ]2 E[X ]E[Y ] E[X ]E[Z]E[Y ]E[X ] E[Y ]2 E[Y ]E[Z]E[Z]E[X ] E[Z]E[Y ] E[Z]2

,

we compute all the entries of these matrices. To make this job easier, we first factor

fXYZ(x,y,z) =2exp[−|x− y|− (y− z)2/2]

z5√2π

, z≥ 1,

Page 136: Gubner

Chapter 8 Problem Solutions 135

as fX |YZ(x|y,z) fY |Z(y|z) fZ(z) by writing

fXYZ(x,y,z) = 12e−|x−y| · e

−(y−z)2/2√2π

· 4z5

, z≥ 1.

We then see that as a function of x, fX |YZ(x|y,z) is a shifted Laplace(1) density. Sim-ilarly, as a function of y, fY |Z(y|z) is an N(z,1) density. Thus,

E[X ] =∫ ∞

−∞

∫ ∞

−∞E[X |Y = y,Z = z] fYZ(y,z)dydz =

∫ ∞

−∞

∫ ∞

−∞y fYZ(y,z)dydz

= E[Y ] =∫ ∞

−∞E[Y |Z = z] fZ(z)dz =

∫ ∞

−∞z fZ(z)dz = E[Z]

=∫ ∞

1z · 4z5dz =

∫ ∞

1

4

z4dz = 4/3.

Thus, E[[X ,Y,Z]′

]= [4/3,4/3,4/3]′. We next compute

E[X2] =∫ ∞

−∞

∫ ∞

−∞E[X2|Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞(2+ y2) fYZ(y,z)dydz = 2+E[Y 2],

where

E[Y 2] =∫ ∞

−∞E[Y 2|Z = z] fZ(z)dz =

∫ ∞

−∞(1+ z2) fZ(z)dz = 1+E[Z2].

Now,

E[Z2] =∫ ∞

1z24

z5dz =

∫ ∞

1

4

z3dz = 2.

Thus, E[Y 2] = 3 and E[X2] = 5. We next turn to

E[XY ] =∫ ∞

−∞

∫ ∞

−∞E[XY |Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞E[Xy|Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞yE[X |Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞y2 fYZ(y,z)dydz = E[Y 2] = 3.

We also have

E[XZ] =

∫ ∞

−∞

∫ ∞

−∞E[XZ|Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞E[Xz|Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞yz fYZ(y,z)dydz = E[YZ]

=∫ ∞

−∞E[YZ|Z = z] fZ(z)dz =

∫ ∞

−∞zE[Y |Z = z] fZ(z)dz

=∫ ∞

−∞z2 fZ(z)dz = E[Z2] = 2.

Page 137: Gubner

136 Chapter 8 Problem Solutions

We now have that

cov([X ,Y,Z]′) =

5 3 2

3 3 2

2 2 2

− 16

9

1 1 1

1 1 1

1 1 1

=1

9

29 11 2

11 11 2

2 2 2

=

3.2222 1.2222 0.22221.2222 1.2222 0.22220.2222 0.2222 0.2222

.

12. Since E[[X ,Y,Z]′

]=[E[X ],E[Y ],E[Z]

]′, and

cov([X ,Y,Z]′) =

E[X2] E[XY ] E[XZ]E[YX ] E[Y 2] E[YZ]E[ZX ] E[ZY ] E[Z2]

E[X ]2 E[X ]E[Y ] E[X ]E[Z]E[Y ]E[X ] E[Y ]2 E[Y ]E[Z]E[Z]E[X ] E[Z]E[Y ] E[Z]2

,

we compute all the entries of these matrices. We begin with

E[Z] =∫ ∞

−∞

∫ ∞

−∞E[Z|Y = y,X = x] fXY (x,y)dydx

=∫ ∞

−∞

∫ ∞

−∞x fXY (x,y)dydx = E[X ] = 3/2.

Next,

E[Y ] =∫ ∞

−∞E[Y |X = x] fX (x)dx =

∫ ∞

−∞x fX (x)dx = E[X ] = 3/2.

We now compute

E[XY ] =

∫ ∞

−∞E[XY |X = x] fX (x)dx =

∫ ∞

−∞E[xY |X = x] fX (x)dx

=∫ ∞

−∞xE[Y |X = x] fX (x)dx =

∫ ∞

−∞x2 fX (x)dx = E[X2]

= var(X)+E[X ]2 = 1/12+(3/2)2 = 7/3.

Then

E[XZ] =

∫ ∞

−∞

∫ ∞

−∞E[XZ|Y = y,X = x] fXY (x,y)dydx

=∫ ∞

−∞

∫ ∞

−∞xE[Z|Y = y,X = x] fXY (x,y)dydx

=∫ ∞

−∞

∫ ∞

−∞x2 fXY (x,y)dydx = E[X2] = 7/3,

and

E[YZ] =∫ ∞

−∞

∫ ∞

−∞E[YZ|Y = y,X = x] fXY (x,y)dydx

=∫ ∞

−∞

∫ ∞

−∞yE[Z|Y = y,X = x] fXY (x,y)dydx

=∫ ∞

−∞

∫ ∞

−∞xy fXY (x,y)dydx = E[XY ] = 7/3.

Page 138: Gubner

Chapter 8 Problem Solutions 137

Next,

E[Y 2] =

∫ ∞

−∞E[Y 2|X = x] fX (x)dx =

∫ ∞

−∞2x2 fX (x)dx = 2E[X2] = 14/3,

and

E[Z2] =∫ ∞

−∞

∫ ∞

−∞E[Z2|Y = y,X = x] fXY (x,y)dydx

=∫ ∞

−∞

∫ ∞

−∞(1+ x2) fXY (x,y)dydx = 1+E[X2] = 1+7/3 = 10/3.

We now have that

cov([X ,Y,Z]′) =1

3

7 7 7

7 14 7

7 7 10

(32

)21 1 1

1 1 1

1 1 1

=1

12

1 1 1

1 29 1

1 1 13

=

0.0833 0.0833 0.08330.0833 2.4167 0.08330.0833 0.0833 1.0833

.

13. Since E[[X ,Y,Z]′

]=[E[X ],E[Y ],E[Z]

]′, and

cov([X ,Y,Z]′) =

E[X2] E[XY ] E[XZ]E[YX ] E[Y 2] E[YZ]E[ZX ] E[ZY ] E[Z2]

E[X ]2 E[X ]E[Y ] E[X ]E[Z]E[Y ]E[X ] E[Y ]2 E[Y ]E[Z]E[Z]E[X ] E[Z]E[Y ] E[Z]2

,

we compute all the entries of these matrices. In order to do this, we first note that

Z ∼ N(0,1). Next, as a function of y, fY |Z(y|z) is an N(z,1) density. Similarly, as afunction of x, fX |YZ(x|y,z) is an N(y,1) density. Hence, E[Z] = 0,

E[Y ] =∫ ∞

−∞E[Y |Z = z] fZ(z)dz =

∫ ∞

−∞z fZ(z)dz = E[Z] = 0,

and

E[X ] =∫ ∞

−∞

∫ ∞

−∞E[X |Y = y,Z = z] fYZ(y,z)dydz

=

∫ ∞

−∞

∫ ∞

−∞y fYZ(y,z)dydz = E[Y ] = 0.

We next compute

E[XY ] =∫ ∞

−∞

∫ ∞

−∞E[XY |Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞E[Xy|Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞yE[X |Y = y,Z = z] fYZ(y,z)dydz =

∫ ∞

−∞

∫ ∞

−∞y2 fYZ(y,z)dydz

= E[Y 2] =∫ ∞

−∞E[Y 2|Z = z] fZ(z)dz =

∫ ∞

−∞(1+ z2) fZ(z)dz

= 1+E[Z2] = 2.

Page 139: Gubner

138 Chapter 8 Problem Solutions

Then

E[YZ] =

∫ ∞

−∞E[YZ|Z = z] fZ(z)dz =

∫ ∞

−∞zE[Y |Z = z] fZ(z)dz

=∫ ∞

−∞z2 fZ(z)dz = E[Z2] = 1,

and

E[XZ] =∫ ∞

−∞

∫ ∞

−∞E[XZ|Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞zE[X |Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞yz fYZ(y,z)dydz = E[YZ] = 1.

Next,

E[Y 2] =∫ ∞

−∞E[Y 2|Z = z] fZ(z)dz =

∫ ∞

−∞(1+ z2) fZ(z)dz = 1+E[Z2] = 2,

and

E[X2] =∫ ∞

−∞

∫ ∞

−∞E[X2|Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞(1+ y2) fYZ(y,z)dydz = 1+E[Y 2] = 3.

Since E[Z2] = 1, we now have that

cov([X ,Y,Z]′) =

3 2 1

2 2 1

1 1 1

.

14. We first note that Z ∼ N(0,1). Next, as a function of y, fY |Z(y|z) is an N(z,1) density.Similarly, as a function of x, fX |YZ(x|y,z) is an N(y,1) density. Hence,

E[e j(ν1X+ν2Y+ν3Z)] =

∫ ∞

−∞

∫ ∞

−∞E[e j(ν1X+ν2Y+ν3Z)|Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞E[e j(ν1X+ν2y+ν3z)|Y = y,Z = z] fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞E[e jν1X |Y = y,Z = z]e jν2ye jν3z fYZ(y,z)dydz

=∫ ∞

−∞

∫ ∞

−∞e jν1y−ν21/2e jν2ye jν3z fYZ(y,z)dydz

= e−ν21 /2∫ ∞

−∞

∫ ∞

−∞e j(ν1+ν2)ye jν3z fYZ(y,z)dydz

= e−ν21 /2E[e j(ν1+ν2)Y e jν3Z ]

Page 140: Gubner

Chapter 8 Problem Solutions 139

= e−ν21 /2∫ ∞

−∞E[e j(ν1+ν2)Y e jν3Z |Z = z] fZ(z)dz

= e−ν21 /2∫ ∞

−∞e jν3zE[e j(ν1+ν2)Y |Z = z] fZ(z)dz

= e−ν21 /2∫ ∞

−∞e jν3ze j(ν1+ν2)z−(ν1+ν2)

2/2 fZ(z)dz

= e−ν21 /2e−(ν1+ν2)2/2∫ ∞

−∞e j(ν1+ν2+ν3)z fZ(z)dz

= e−ν21 /2e−(ν1+ν2)2/2

E[e j(ν1+ν2+ν3)z]

= e−ν21 /2e−(ν1+ν2)2/2e−(ν1+ν2+ν3)

2/2

= e−[ν21+(ν1+ν2)2+(ν1+ν2+ν3)

2]/2.

15. RY := E[YY ′] = E[(AX)(AX)′] = E[AXX ′A′] = AE[XX ′]A′ = ARXA′.

16. First, since R = E[XX ′], we see that R′ = E[XX ′]′ = E[(XX ′)′] = E[XX ′] = R. Thus,

R is symmetric. Next, define the scalar Y := c′X . Then

0 ≤ E[Y 2] = E[YY ′] = E[(c′X)(c′X)′] = E[c′XX ′c] = c′E[XX ′]c = c′Rc,

and we see that R is positive semidefinite.

17. Use the Cauchy–Schwarz inequality to write

∣∣(CXY )i j∣∣ =

∣∣E[(Xi−mX ,i)(Yj−mY, j)]∣∣

≤√

E[(Xi−mX ,i)2]E[(Yj−mY, j)2] =√

(CX )ii(CY ) j j.

18. If

P =

[cosθ −sinθsinθ cosθ

],

then[U

V

]:= P′

[X

Y

]=

[cosθ sinθ−sinθ cosθ

][X

Y

]=

[X cosθ +Y sinθ−X sinθ +Y cosθ

].

We now have to find θ such that E[UV ] = 0. Write

E[UV ] = E[(X cosθ +Y sinθ)(−X sinθ +Y cosθ)]

= E[(Y 2−X2)sinθ cosθ +XY (cos2 θ − sin2 θ)]

=σ2Y −σ2

X

2sin2θ +E[XY ]cos2θ ,

which is zero if and only if

tan2θ =2E[XY ]

σ2X −σ2

Y

.

Hence,

θ =1

2tan−1

(2E[XY ]

σ2X −σ2

Y

).

Page 141: Gubner

140 Chapter 8 Problem Solutions

19. (a) Write (eie′i)mn = (ei)m(e′i)n, which equals one if and only if m = i and n = i.

Hence, eie′i must be all zeros except at position i i where it is one.

(b) Write

E ′E =

[e1 e4 e5

] e′1e′4e5

= e1e′1 + e4e

′4 + e5e

′5

= diag(1,0,0,0,0)+diag(0,0,0,1,0)+diag(0,0,0,0,1)

= diag(1,0,0,1,1).

20. (a) We must solve P′CXP= diagonal. With X :=U+V , we have

CX = E[XX ′] = E[(U+V )(U+V )′] = CU +E[UV ′]+E[VU ′]+CV= QMQ′+ I = Q(M+ I)Q′.

Hence, Q′CXQ =M+ I, which is diagonal. The point here is that we may takeP= Q.

(b) We now put Y := P′X = QX = Q(U+V ). Then

CY = E[YY ′] = E[Q(U+V )(U ′+V ′)Q′]

= QCU +E[UV ′]+E[VU ′]+CVQ′ = QCUQ′+QIQ′ = M+ I.

21. Starting with u= x+ y and v= x− y, we have

x =u+ v

2and y =

u− v2

.

We can now write

dH =

[∂x∂u

∂x∂v

∂y∂u

∂y∂v

]=

[1/2 1/2

1/2 −1/2

], detdH = −1/2,

and

|detdH | = 1/2,

and so

fUV (u,v) = fXY

(u+ v

2,u− v2

)· 12.

22. Starting with u = xy and v = y/x, write y = xv. Then u = x2v, and x = (u/v)1/2. We

also have y= (u/v)1/2v= (uv)1/2. Then

∂x∂u = (1/2)/

√uv, ∂x

∂v = (−1/2)√u/v3/2,∂y∂u = (1/2)

√v/u, ∂y

∂v = (1/2)√u/v.

Page 142: Gubner

Chapter 8 Problem Solutions 141

In other words,

dH =

[(1/2)/

√uv (−1/2)√u/v3/2

(1/2)√v/u (1/2)

√u/v

],

and so

|detdH | =

∣∣∣∣1

4v+

1

4v

∣∣∣∣ =1

2|v| .

Thus,

fUV (u,v) = fXY

(√u/v,

√uv) 1

2v, u,v> 0.

For fU , write

fU (u) =∫ ∞

0fUV (u,v)dv =

∫ ∞

0fXY

(√u/v,

√uv) 1

2vdv

Now make the change of variable y=√uv, or y2 = uv. Then 2ydy= udv, and

fU (u) =∫ ∞

0fXY

(uy,y)1ydy.

For fV , write

fV (v) =∫ ∞

0fUV (u,v)du =

∫ ∞

0fXY

(√u/v,

√uv) 1

2vdu.

This time make the change of variable x =√u/v or x2 = u/v. Then 2xdx = du/v,

and

fV (v) =∫ ∞

0fXY (x,vx)xdx.

23. Starting with u= x and v= y/x, we find that y= xv= uv. Hence,

dH =

[∂x∂u

∂x∂v

∂y∂u

∂y∂v

]=

[1 0

v u

], detdH = u, and |detdH | = |u|.

Then

fUV (u,v) = fXY (u,uv)|u| = λ2e−λ |u| · λ

2e−λ |uv| · |u|,

and

fV (v) =∫ ∞

−∞

λ 2

4|u|e−λ (1+|v|)|u| du = λ 2

2

∫ ∞

0ue−λ (1+|v|)u du

2(1+ |v|)

∫ ∞

0u ·λ (1+ |v|)e−λ (1+|v|)u du.

Now, this last integral is the mean of an exponential density with parameter λ (1+ |v|).Hence,

fV (v) =λ

2(1+ |v|) ·1

λ (1+ |v|) =1

2(1+ |v|)2 .

Page 143: Gubner

142 Chapter 8 Problem Solutions

24. Starting with u=√−2lnx cos(2πy) and v=

√−2lnx sin(2πy), we have

u2 + v2 = (−2lnx)[cos2(2πy)+ sin2(2πy)] = −2lnx.

Hence, x= e−(u2+v2)/2. We also have

v

u= tan(2πy) or y =

1

2πtan−1(v/u).

We can now write

∂x∂u =−ue−(u2+v2)/2, ∂x

∂v =−ve−(u2+v2)/2,

∂y∂u = 1

2π1

1+(u/v)2· −vu2

, ∂y∂v = 1

2π1

1+(u/v)2· 1u.

In other words,

dH =

−ue

−(u2+v2)/2 −ve−(u2+v2)/2

12π

11+(u/v)2

· −vu2

12π

11+(u/v)2

· 1u

,

and so

|detdH| =

∣∣∣∣−12π

e−(u2+v2)/2

1+(v/u)2− 1

e−(u2+v2)/2

1+(v/u)2· v

2

u2

∣∣∣∣

=e−(u2+v2)/2

∣∣∣∣1

1+(v/u)2+

(v/u)2

1+(v/u)2

∣∣∣∣

=e−u

2/2

√2π

e−v2/2

√2π

.

We next use the formula fUV (u,v) = fXY (x,y) · |detdH|, where x = e−(u2+v2)/2 and

y = tan−1(v/u)/(2π). Fortunately, since these formulas for x and y lie in (0,1],fXY (x,y) = I(0,1](x)I(0,1](y) = 1, and we see that

fUV (u,v) = 1 · |detdH| =e−u

2/2

√2π

e−v2/2

√2π

.

Integrating out v shows that fU (u) = e−u2/2/

√2π , and integrating out v shows that

fV (v) = e−v2/2/

√2π . It now follows that fUV (u,v) = fU (u) fV (v), and we see thatU

and V are independent.

25. Starting with u= x+y and v= x/(x+y) = x/u, we see that v= x/u, or x= uv. Next,

from y= u− x= u−uv, we get y= u(1− v). We can now write

∂x∂u = v, ∂x

∂v = u,

∂y∂u = 1− v, ∂y

∂v =−u.

In other words,

dH =

[v u

1− v −u

],

Page 144: Gubner

Chapter 8 Problem Solutions 143

and so

|detdH| = |−uv−u(1− v)| = |u|.

We next write

fUV (u,v) = fXY (uv,u(1− v))|u|.

If X and Y are independent gamma RVs, then for u> 0 and 0 < v< 1,

fUV (u,v) =λ (λuv)p−1e−λuv

Γ(p)· λ (λu(1− v))q−1e−λu(1−v)

Γ(q)·u

=λ (λu)p+q−1e−λu

Γ(p+q)· Γ(p+q)

Γ(p)Γ(q)vp−1(1− v)q−1,

which we recognize as the product of a gamma(p+ q,λ ) and a beta(p,q) density.Hence, it is easy to integrate out either u or v and show that fUV (u,v) = fU (u) fV (v),where fU ∼ gamma(p+q,λ ) and fV ∼ beta(p,q). Thus,U and V are independent.

26. Starting with u= x+y and v= x/y, write x= yv and then u= yv+y= y(v+1). Solvefor y= u/(v+1) and x= u− y= u−u/(v+1) = uv/(v+1). Next,

∂x∂u = v/(v+1), ∂x

∂v = u/(v+1)2,

∂y∂u = 1/(v+1), ∂y

∂v =−u/(v+1)2.

In other words,

dH =

[v/(v+1) u/(v+1)2

1/(v+1) −u/(v+1)2

],

and so

|detdH | =

∣∣∣∣−uv

(v+1)3− u

(v+1)3

∣∣∣∣ =

∣∣∣∣−u(v+1)

(v+1)3

∣∣∣∣ =|u|

(v+1)2.

We next write

fUV (u,v) = fXY

(uv

v+1,u

v+1

) |u|(v+1)2

.

When X ∼ gamma(p,λ ) and Y ∼ gamma(q,λ ) are independent, then U and V are

nonnegative, and

fUV (u,v) = λ[λuv/(v+1)]p−1e−λuv/(v+1)

Γ(p)·λ [λu/(v+1)]q−1e−λu/(v+1)

Γ(q)· u

(v+1)2

= λ(λu)p+q−1e−λu

Γ(p+q)· Γ(p+q)

Γ(p)Γ(q)· vp−1

(v+1)p+q

= λ(λu)p+q−1e−λu

Γ(p+q)· vp−1

B(p,q)(v+1)p+q,

which shows thatU and V are independent with the required marginal densities.

Page 145: Gubner

144 Chapter 8 Problem Solutions

27. From solution of the Example at the end of the section,

fR,Θ(r,θ) = fXY (r cosθ ,r sinθ)r.

What is different in this problem is that X and Y are correlated Gaussian random

variables; i.e.,

fXY (x,y) =e−(x2−2ρxy+y2)/[2(1−ρ2)]

2π√1−ρ2

.

Hence,

fR,Θ(r,θ) =re−(r2 cos2 θ−2ρ(r cosθ)(r sinθ)+r2 sin2 θ)/[2(1−ρ2)]

2π√1−ρ2

=re−r

2(1−ρ sin2θ)/[2(1−ρ2)]

2π√1−ρ2

.

To find the density of Θ, we must integrate this with respect to r. Notice that the

integrand is proportional to re−λ r2/2, whose anti-derivative is −e−λ r2/2/λ . Here wehave λ = (1−ρ sin2θ)/(1−ρ2). We can now write

fΘ(θ) =∫ ∞

0fR,Θ(r,θ)dr =

1

2π√1−ρ2

∫ ∞

0re−λ r2/2 dr =

1

2π√1−ρ2

· 1λ

=1−ρ2

2π√1−ρ2(1−ρ sin2θ)

=

√1−ρ2

2π(1−ρ sin2θ).

28. Since X ∼ N(0,1), we have E[X ] = E[X3] = 0, E[X4] = 3, and E[X6] = 15. Since

W ∼ N(0,1), E[W ] = 0 too. Hence, E[Y ] = E[X3 +W ] = 0. It then follows that mY ,

mX and b= mX −Amy are zero. We next compute

CXY = E[XY ] = E[X(X3 +W )] = E[X4]+E[X ]E[W ] = 3+0 = 3,

and

CY = E[Y 2] = E[X6 +2X3W +W 2] = 15+0+1 = 16.

Hence, ACY =CXY implies A= 3/16, and then X = A(Y −mY )+mX = (3/16)Y .

29. First note that since X andW are zero mean, so is Y . Next,

CXY = E[XY ] = E[X(X+W )] = E[X2]+E[XW ] = E[X2]+E[X ]E[W ] = E[X2] = 1,

and

CY = E[Y 2] = E[(X+W )2] = E[X2 +2XW +W 2]

= E[X2]+2E[X ]E[W ]+E[W 2] = 1+0+2/λ 2 = 1+2/λ 2.

Then A=CXY/CY = 1/[1+2/λ 2], and

X =λ 2

2+λ 2Y.

Page 146: Gubner

Chapter 8 Problem Solutions 145

30. We first have E[Y ] = E[GX+W ] = GmX +0= GmX . Next,

CXY = E[(X−mX )(Y −mY )′] = E[(X−mX )(GX+W −GmX )′]

= E[(X−mX )(GX−mX+W )′] = CXG′+CXW = CXG

′,

and

CY = E[(Y −mY )(Y −mY )′] = E[(GX−mX+W )(GX−mX+W )′]

= GCXG′+GCXW +CWXG

′+CW = GCXG′+CW ,

since X andW are uncorrelated. Solving ACY =CXY implies

A=CXG′(GCXG

′+CW )−1 and X =CXG′(GCXG

′+CW )−1(Y −GmX )+mX .

31. We begin with the result of Problem 30 that

A = CXG′(GCXG

′+CW )−1.

Following the hint, we make the identifications α =CW , γ =CX , β =G, and δ =G′.Then

A = CXG′(α +βγδ )−1

= CXG′[C−1W −C−1W G(C−1X +G′C−1W G)−1G′C−1W ]

= CXG′C−1W −CXG′C−1W G(C−1X +G′C−1W G)−1G′C−1W

= [CX −CXG′C−1W G(C−1X +G′C−1W G)−1]G′C−1W= [CX (C−1X +G′C−1W G)−CXG′C−1W G](C−1X +G′C−1W G)−1G′C−1W= [I+CXG

′C−1W G−CXG′C−1W G](C−1X +G′C−1W G)−1G′C−1W= (C−1X +G′C−1W G)−1G′C−1W .

32. We begin with

X = A(Y −mY )+mX , where ACY =CXY .

Next, with Z := BX , we have mZ = BmX and

CZY = E[(Z−mZ)(Y −mY )′] = E[B(X−mX )(Y −mY )′] = BCXY .

We must solve ACY =CZY . Starting with ACY =CXY , multiply this equation by B to

get (BA)CY = BCXY =CZY . We see that A := BA solves the required equation. Hence,

the linear MMSE estimate of X based on Z is

(BA)(Y −mY )+mZ = (BA)(Y −mY )+BmX = BA(Y −mY )+mX = BX .

33. We first show that the orthogonality condition

E[(CY )′(X−AY )] = 0, for all C,

Page 147: Gubner

146 Chapter 8 Problem Solutions

implies A is optimal. Write

E[‖X−BY‖2] = E[‖(X−AY )+(AY −BY )‖2]= E[‖(X−AY )+(A−B)Y )‖2]= E[‖X−AY‖2]+2E[(A−B)Y′(X−AY )]+E[‖(A−B)Y‖2]= E[‖X−AY‖2]+E[‖(A−B)Y‖2]≥ E[‖X−AY‖2],

where the cross terms vanish by taking C = A− B in the orthogonality condition.

Next, rewrite the orthogonality condition as

E[(CY )′(X−AY )] = trE[(CY )′(X−AY )] = E[tr(CY )′(X−AY )]= E[tr(X−AY )(CY )′] = trE[(X−AY )(CY )′]= tr(RXY −ARY )C′.

Now, this expression must be zero for all C, including C = RXY −ARY . However,

since tr(DD′) = 0 implies D= 0, we conclude that the optimal A solves ARY = RXY .

Next, the best constant estimator is easily found by writing

E[‖X−b‖2] = E[‖(X−mX )+(mX −b)‖2] = E[‖X −mX‖2]+‖mX −b‖2.

Hence, the optimal value of b is b= mX .

34. To begin, write

E[(X− X)(X− X)′] = E[(X−mX )−A(Y −mY )(X−mX )−A(Y −mY )′]= CX −ACYX −CXYA′+ACYA′. (∗)

We now use the fact that ACY =CXY . If we multiply ACY =CXY on the right by A′,we obtain

ACYA′ = CXYA

′.

Furthermore, since ACYA′ is symmetric, we can take the transpose of the above ex-

pression and obtain

ACYA′ = ACYX .

By making appropriate substitutions in (∗), we find that the error covariance is alsogiven by

CX −ACYX , CX −CXYA′, and CX −ACYA′.

35. Write

E[‖X− X‖2] = E[(X− X)′(X− X)] = trE[(X− X)′(X− X)]

= E[tr(X− X)′(X− X)] = E[tr(X− X)(X− X)′]= trE[(X− X)(X− X)′] = trCX −ACYX.

Page 148: Gubner

Chapter 8 Problem Solutions 147

36. MATLAB.We found

A =

−0.0622 0.0467 −0.0136 −0.10070.0489 −0.0908 −0.0359 −0.1812−0.0269 0.0070 −0.0166 0.09210.0619 0.0205 −0.0067 0.0403

and MSE= 0.0806.

37. Write X in the form X = [Y ′,Z′]′, where Y := [X1, . . . ,Xm]′. Then

CX = E

[[Y

Z

] [Y ′ Z′

] ]=

[CY CYZCZY CZ

]=

[C1 C2C′2 C3

],

and

CXY = E

[[Y

Z

]Y ′]

=

[CYCZY

]=

[C1C′2

].

Solving ACY =CXY becomes

AC1 =

[C1C′2

], or A =

[I

C′2C−11

].

The linear MMSE estimate of X = [Y ′,Z′]′ is

AY =

[I

C′2C−11

]Y =

[Y

C′2C−11 Y

].

In other words, Y = Y and Z =C′2C−11 Y . Note that the matrix required for the linear

MMSE estimate of Z based on Y is the solution of BCY = CZY or BC1 = C′2; i.e.,B=C′2C

−11 . Next, the error covariance for estimating X based on Y is

E[(X− X)(X− X)′] = CX −ACYX =

[C1 C2C′2 C3

]−[

I

C′2C−11

] [C1 C2

]

=

[C1 C2C′2 C3

]−[C1 C2C′2 C

′2C−11 C2

]

=

[0 0

0 C3−C′2C−11 C2

],

and the MSE is

E[‖X − X‖2] = tr(CX −ACYX ) = tr(C3−C′2C−11 C2).

38. Since P′ decorrelates Y , the covariance matrix CZ of Z := P′Y is diagonal. Writing

ACZ =CXZ in component form, and using the fact that CZ is diagonal,

∑k

Aik(CZ)k j = (CXZ)i j

becomes

Ai j(CZ) j j = (CXZ)i j

Page 149: Gubner

148 Chapter 8 Problem Solutions

If (CZ) j j 6= 0, then Ai j = (CXZ)i j/(CZ) j j. If (CZ) j j = 0, then Ai j(CZ) j j = (CXZ)i jcan be solved only if (CXZ)i j = 0, which we now show to be the case by using the

Cauchy–Schwarz inequality. Write

|(CXZ)i j| =∣∣E[(Xi− (mX )i)(Z j− (mZ) j)]

∣∣

≤√

E[(Xi− (mX )i)2]E[(Z j− (mZ) j)2]

=√

(CX )ii(CZ) j j.

Hence, if (CZ) j j = 0 then (CXZ)i j = 0, and any value of Ai j solves Ai j(CZ) j j =(CXZ)i j. Now that we have shown that we can always solve ACZ = CXZ , observe

that this equation is equivalent to

A(P′CYP) = CXYP or (AP′)CY = CXY .

Thus, A= AP′ solves the original problem.

39. Since X has the form X = [Y ′,Z′]′, if we take G= [I,0] andW ≡ 0, then

GX+W =

[I 0] [

Y

Z

]= Y.

40. Write

E

[1

n

n

∑k=1

X2k

]=

1

n

n

∑k=1

E[X2k ] =

1

n

n

∑k=1

σ2 = σ2.

41. Write

E

[1

n

n

∑k=1

XkX′k

]=

1

n

n

∑k=1

E[XkX′k] =

1

n

n

∑k=1

C = C.

42. MATLAB. Additional code:

Mn = mean(X,2)

MnMAT = kron(ones(1,n),Mn);

Chat = (X-MnMAT)*(X-MnMAT)’/(n-1)

43. We must first find fY |X (y|x). To this end, use substitution and independence to write

P(Y ≤ y|X = x) = P(X+W ≤ y|X = x) = P(x+W ≤ y|X = x) = P(W ≤ y− x).

Then fY |X (y|x) = fW (y− x) = (λ/2)e−λ |y−x|. For fixed y, the maximizing value of xis x= y. Hence, gML(y) = y.

44. By the same argument as in the solution of Problem 43, fY |X (y|x) = (λ/2)e−λ |y−x|.When X ∼ exp(µ) and we maximize over x, we must impose the constraint x ≥ 0.

Hence,

gML(y) = argmaxx≥0

λ2e−λ |y−x| =

y, y≥ 0,0, y< 0.

Page 150: Gubner

Chapter 8 Problem Solutions 149

When X ∼ uniform[0,1],

gML(y) = argmax0≤x≤1

λ2e−λ |y−x| =

y, 0≤ y≤ 1,1, y> 1,0, y< 0.

45. By the same argument as in the solution of Problem 43, fY |X (y|x) = (λ/2)e−λ |y−x|.For the MAP estimator, we must maximize

fY |X (y|x) fX (x) = λ2e−λ |y−x| ·µe−µx, x≥ 0.

By considering separately the cases x≤ y and x> y,

fY |X (y|x) fX (x) =

µλ2e−λye(λ−µ)x, 0≤ x≤ y,

µλ2eλye−(λ+µ)x, x> y, x≥ 0.

When y≥ 0, observe that the two formulas agree at x= y and have the common value

(µλ/2)e−µy; in fact, if λ > µ , the first formula is maximized at x = y, while the

second formula is always maximized at x= y. If y< 0, then only the second formula

is valid, and its region of validity is x≥ 0. This formula is maximized at x= 0. Hence,

for λ > µ ,

gMAP(y) =

y, y≥ 0,0, y< 0.

We now consider the case λ ≤ µ . As before, if y < 0, the maximizing value of x is

zero. If y ≥ 0, then the maximum value of fY |X (y|x) fX (x) for 0 ≤ x ≤ y occurs at

x= 0 with a maximum value of (µλ/2)e−λy. The maximum value of fY |X (y|x) fX (x)for x≥ y occurs at x= y with a maximum value of (µλ/2)e−µy. For λ < µ ,

max(µλ/2)e−λy,(µλ/2)e−µy = (µλ/2)e−λy,

which corresponds to x= 0. Hence, for λ < µ ,

gMAP(y) = 0, −∞ < y< ∞.

46. From the formula

fXY (x,y) = (x/y2)e−(x/y)2/2 ·λe−λy, x, y> 0,

we see that Y ∼ exp(λ ), and that given Y = y, X ∼ Rayleigh(y). Hence, the MMSE

estimator is E[X |Y = y] =√

π/2y. To compute the MAP estimator, we must solve

argmaxx≥0

(x/y2)e−(x/y)2/2.

We do this by differentiating with respect to x and setting the derivative equal to zero.

Write

∂x(x/y2)e−(x/y)2/2 =

e−(x/y)2/2

y2

[1− x

2

y2

].

Hence,

gMAP(y) = y.

Page 151: Gubner

150 Chapter 8 Problem Solutions

47. Suppose that

E[(X−g1(Y ))h(Y )] = 0 and E[(X−g2(Y ))h(Y )] = 0.

Subtracting the second equation from the first yields

E[g2(Y )−g1(Y )h(Y )] = 0. (∗)

Since h is an arbitrary bounded function, put h(y) := sgn[g2(y)−g1(y)], where

sgn(x) :=

1, x> 0,0, x= 0,−1, x< 0.

Note also that x · sgn(x) = |x|. Then (∗) becomes E[|g2(Y )−g1(Y )|] = 0.

Page 152: Gubner

CHAPTER 9

Problem Solutions

1. We first compute

detC = det

[σ21 σ1σ2ρ

σ1σ2ρ σ22

]= σ2

1σ22 − (σ1σ2ρ)2 = σ2

1σ22 (1−ρ2),

and√detC = σ1σ2

√1−ρ2. Next,

C−1 =1

detC

[σ22 −σ1σ2ρ

−σ1σ2ρ σ21

]=

1

σ21 (1−ρ2)

−ρ

σ1σ2(1−ρ2)−ρ

σ1σ2(1−ρ2)

1

σ22 (1−ρ2)

,

and

[x y

]C−1

[x

y

]=[x y

]

x

σ21 (1−ρ2)

− ρy

σ1σ2(1−ρ2)−ρx

σ1σ2(1−ρ2)+

y

σ21 (1−ρ2)

=x2

σ21 (1−ρ2)

− ρxy

σ1σ2(1−ρ2)− ρxy

σ1σ2(1−ρ2)+

y2

σ21 (1−ρ2)

=1

1−ρ2

[( xσ1

)2−2ρ

xy

σ1σ2+( y

σ2

)2],

and the result follows.

2. Here is the plot:

−4 −2 0 2 40

1 n = 1

n = 4

3. (a) First write c1X+c2Y = c1X+c2(3X) = (c1+3c2)X , which is easily seen to beN(0,(c1 +3c2)

2). Thus, X and Y are jointly Gaussian.

(b) Observe that E[XY ] = E[X(3X)] = 3E[X2] = 3 and E[Y 2] = E[(3X)2] = 9E[X2] =9. Since X and Y have zero means,

cov([X ,Y ]′) =

[E[X2] E[XY ]E[YX ] E[Y 2]

]=

[1 3

3 9

].

151

Page 153: Gubner

152 Chapter 9 Problem Solutions

(c) The conditional cdf of Y given X = x is

FY |X (y|x) = P(Y ≤ y|X = x) = P(3X ≤ y|X = x).

By substitution, this last conditional probability is P(3x ≤ y|X = x). The event3x ≤ y is deterministic and therefore independent of X . Hence, we can dropthe conditioning and get

FY |X (y|x) = P(3x≤ y).

If 3x ≤ y, then 3x ≤ y = Ω, and 3x ≤ y = ∅ otherwise. Hence, the above

probability is just u(y−3x).

4. If Y = ∑ni=1 ciXi, its characteristic function is

ϕY (ν) = E[e jν(∑ni=1 ciXi)] = E

[n

∏i=1

e jνciXi]

=n

∏i=1

E[e jνciXi ] =n

∏i=1

ϕXi(νci)

=n

∏i=1

e j(νci)mi−(νci)2σ2i /2 = e jν(∑ni=1 cimi)−ν2(∑ni=1 c

2i σ2i )/2,

which is the characteristic function of an N(∑ni=1 cimi,∑

ni=1 c

2i σ

2i ) random variable.

5. First, E[Y ] = E[AX+b] = AE[X ]+b= Am+b. Second,

E[Y − (Am+b)Y − (Am+b)′] = E[A(X−m)(X−m)′A′] = ACA′.

6. Write out

Y1 = X1

Y2 = X1 +X2

Y3 = X1 +X2 +X3...

In general, Yn = Yn−1 +Xn, or Xn = Yn−Yn−1, which we can write in matrix-vectornotation as

X1X2X3...

Xn−1Xn

=

1 0 · · · 0

−1 1 0 0

0 −1 1 0 0

. . .

0 0 −1 1 0

0 · · · 0 −1 1

︸ ︷︷ ︸=: A

Y1Y2Y3...

Yn−1Yn

.

Since X = AY and Y is Gaussian, so is X .

7. Since X ∼ N(0,C), the scalar Y := ν ′X is also Gaussian and has zero mean. Hence,

E[(ν ′XX ′ν)k] = E[(Y 2)k] = E[Y 2k] = (2k−1) · · ·5 ·3 ·1 · (E[Y 2])k. Now observe that

E[Y 2] = E[ν ′XX ′ν ] = ν ′Cν and the result follows.

Page 154: Gubner

Chapter 9 Problem Solutions 153

8. We first have

E[Yj] = E[X j−X ] = m−E

[1

n

n

∑i=1

Xi

]= m− 1

n

n

∑i=1

m = 0.

Next, since E[XYj] = E[X(X j−X)], we first compute

E[X X ] =1

n2

n

∑i=1

n

∑j=1

E[XiX j] =1

n2

n

∑i=1

(σ2 +m2)+ ∑i 6= j

m2

=1

n2

n(σ2 +m2)+n(n−1)m2

=

1

n2

nσ2 +n2m2

=

σ2

n+m2,

and

E[X X j] =1

n

n

∑i=1

E[XiX j] =1

n

(σ2 +m2)+(n−1)m2

=

σ2

n+m2.

It now follows that E[XYj] = 0.

9. Following the hint, in the expansion of E[(ν ′X)2k], the sum of all the coefficients of

νi1 · · ·νi2k is (2k)!E[Xi1 · · ·Xi2k ]. The corresponding sum of coeficients in the expan-

sion of

(2k−1)(2k−3) · · ·5 ·3 ·1 · (ν ′Cν)k

is

(2k−1)(2k−3) · · ·5 ·3 ·1 ·2kk! ∑j1,..., j2k

C j1 j2 · · ·C j2k−1 j2k ,

where the sum is over all j1, . . . , j2k that are permutations of i1, . . . , i2k and such that

the product C j1 j2 · · ·C j2k−1 j2k is distinct. Since

(2k−1)(2k−3) · · ·5 ·3 ·1 ·2kk! = (2k−1)(2k−3) · · ·5 ·3 ·1 · (2k)(2k−2) · · ·4 ·2= (2k)!,

Wick’s Theorem follows.

10. Write

E[X1X2X3X4] = ∑j1, j2, j3, j4

C j1 j2C j3 j4 = C12C34 +C13C24 +C14C23.

E[X1X23X4] = C13C34 +C14C33.

E[X21X

22 ] = C11C22 +C2

12.

11. Put a := [a1, . . . ,an]′. Then Y = a′X , and

ϕY (η) = E[e jηY ] = E[e jη(a′X)] = E[e j(ηa)′X ] = ϕX (ηa)

= e j(ηa)′m−(ηa)′C(ηa)/2 = e jη(a′m)−(a′Ca)η2/2,

which is the characteristic function of a scalar N(a′m,a′Ca) random variable.

Page 155: Gubner

154 Chapter 9 Problem Solutions

12. With X = [U ′,W ′]′ and ν = [α ′,β ′]′, we have ϕX (ν) = e jν′m−ν ′Cν/2, where

ν ′m =[

α ′ β ′][ mUmW

]= α ′mU +β ′mW ,

and

ν ′Cν =[

α ′ β ′][ S 0

0 T

][αβ

]=[

α ′ β ′][ SαTβ

]= α ′Sα +β ′Tβ .

Thus,

ϕX (ν) = e j(α′mU+β ′mW )−(α ′Sα+β ′Tβ )/2 = e jα

′mU−α ′Sα/2e jβ′mW−β ′Tβ/2,

which has the required form ϕU (α)ϕW (β ) of a product of Gaussian characteristic

functions.

13. (a) Since X is N(0,C) and Y := C−1/2X , Y is also normal. It remains to find the

mean and covariance of Y . We have E[Y ] = E[C−1/2X ] = C−1/2E[X ] = 0 and

E[YY ′] = E[C−1/2XX ′C−1/2] =C−1/2E[XX ′]C−1/2 =C−1/2CC−1/2 = I. Hence,

Y ∼ N(0, I).

(b) Since the covariance matrix of Y is diagonal, the components of Y are uncorre-

lated. Since Y is also Gaussian, the components of Y are independent. Since the

covariance matrix ofY is the indentity, eachYk ∼N(0,1). Hence, eachY 2k is chi-squared with one degree of freedom by Problem 46 in Chapter 4 or Problem 11

in Chapter 5.

(c) By the Remark in Problem 55(c) in Chapter 4, V is chi-squared with n degrees

of freedom.

14. Since

Z := det

[X −YY X

]= X2 +Y 2,

where X and Y are independent N(0,1), observe that X2 and Y 2 are chi-squared with

one degree of freedom. Hence, Z is chi-squared with two degrees of freedom, which

is the same as exp(1/2).

15. Begin with

fX (x) =1

(2π)n

IRne− jν

′xe jν′m−ν ′Cν/2 dν =

1

(2π)n

IRne− j(x−m)′νe−ν ′Cν/2 dν .

Now make the multivariate change of variable ζ =C1/2ν , dζ = detC1/2 dν . Then

fX (x) =1

(2π)n

IRne− j(x−m)′C−1/2ζ e−ζ ′ζ/2 dζ

detC1/2

=1

(2π)n

IRne− jC

−1/2(x−m)′ζ e−ζ ′ζ/2 dζ√detC

.

Page 156: Gubner

Chapter 9 Problem Solutions 155

Put t =C−1/2(x−m) so that

fX (x) =1

(2π)n

IRne− jt

′ζ e−ζ ′ζ/2 dζ√detC

=1√detC

n

∏i=1

(1

∫ ∞

−∞e− jtiζie−ζ 2i /2 dζi

).

Observe that e−ζ 2i /2 is the characteristic function of a scalar N(0,1) random variable.

Hence,

fX (x) =1√detC

n

∏i=1

e−t2i /2

√2π

=1

(2π)n/2√detC

exp[−t ′t/2].

Recalling that t =C−1/2(x−m) yields

fX (x) =exp[− 1

2(x−m)′C−1(x−m)]

(2π)n/2√detC

.

16. First observe that

Z := det

[X Y

U V

]= XV −YU.

Then consider the conditional cumulative distribution function,

FZ|UV (z|u,v) = P(Z ≤ z|U = u,V = v) = P(XV −YU ≤ z|U = u,V = v)

= P(Xv−Yu≤ z|U = u,V = v).

Since [X ,Y ]′ and [U,V ]′ are jointly Gaussian and uncorrelated, they are independent.Hence, we can drop the conditioning and get

FZ|UV (z|u,v) = P(Xv−Yu≤ z).

Next, since X and Y are independent and N(0,1), Xv−Yu∼ N(0,u2 + v2). Hence,

fZ|UV (· |u,v)∼ N(0,u2 + v2).

17. We first use the fact that A solves ACY =CXY to show that (X−mX )−A(Y −mY ) andY are uncorrelated. Write

E[(X−mX )−A(Y −mY )Y ′] = CXY −ACY = 0.

We next show that (X−mX )−A(Y −mY ) and Y are jointly Gaussian by writing them

as an affine transformation of the Gaussian vector [X ′,Y ′]′; i.e.,[

(X−mX )−A(Y −mY )Y

]=

[I −A0 I

][X

Y

]+

[AmY −mX

0

].

It now follows that (X−mX )−A(Y −mY ) and Y are independent. Using the hints on

substitution and independence, we compute the conditional characteristic function,

E[eν ′X |Y = y] = E[e jν

′[(X−mX )−A(Y−mY )] · e jν ′[mX+A(Y−mY )]∣∣Y = y

]

= e jν′[mX+A(y−mY )]

E[e jν

′[(X−mX )−A(Y−mY )]∣∣Y = y

]

= e jν′[mX+A(y−mY )]

E[e jν

′[(X−mX )−A(Y−mY )]].

Page 157: Gubner

156 Chapter 9 Problem Solutions

This last expectation is the characteristic function of the zero-mean Gaussian random

vector (X −mX )−A(Y −mY ). To compute its covariance matrix first observe that

since ACY =CXY , we have ACYA′ =CXYA

′. Then

E[(X−mX )−A(Y −mY )(X−mX )−A(Y −mY )′]= CX −CXYA′−ACYX +ACYA

= CX −ACYX .

We now have

E[eν ′X |Y = y] = e jν′[mX+A(y−mY )]e−ν ′[CX−ACYX ]ν/2.

Thus, given Y = y, X is conditionally N(mX +A(y−mY ),CX −ACYX

).

18. First observe that

Z := det

[X Y

U V

]= XV −YU.

Then consider the conditional cumulative distribution function,

FZ|UV (z|u,v) = P(Z ≤ z|U = u,V = v) = P(XV −YU ≤ z|U = u,V = v)

= P(Xv−Yu≤ z|U = u,V = v).

Since [X ,Y,U,V ]′ is Gaussian, givenU = u and V = v, [X ,Y ]′ is conditionally

N

(A

[u

v

],C[X ,Y ]′ −AC[U,V ]′,[X ,Y ]′

),

where A solves AC[U,V ]′ =C[X ,Y ]′,[U,V ]′ . We now turn to the conditional distribution of

Xv−Yu. Since the conditional distribution of [X ,Y ]′ is Gaussian, so is the conditionaldistribution of the linear combination Xv−Yu. Hence, all we need to find are the

conditional mean and the conditional variance of Xv−Yu; i.e.,

E[Xv−Yu|U = u,V = v] = E

[[v u

][ XY

]∣∣∣∣U = u,V = v

]

=[v u

]E

[[X

Y

]∣∣∣∣U = u,V = v

]

=[v u

]A

[u

v

],

and

E

[[v u

]([ XY

]−A

[u

v

])2∣∣∣∣U = u,V = v

]

= E

[[v u

]([ XY

]−A

[U

V

])2∣∣∣∣U = u,V = v

]

=[v u

]E

[([X

Y

]−A

[U

V

])([X

Y

]−A

[U

V

])′∣∣∣∣U = u,V = v

][v

u

]

=[v u

](C[X ,Y ]′ −AC[U,V ]′,[X ,Y ]′

)[v

u

],

Page 158: Gubner

Chapter 9 Problem Solutions 157

where the last step uses the fact that [X ,Y ]′−A[U,V ]′ is independent of [U,V ]′. If[X ,Y ]′ and [U,V ]′ are uncorrelated, i.e.,C[X ,Y ]′,[U,V ]′ = 0, then A= 0 solves AC[U,V ]′ =0; in this case, the conditional mean is zero, and the conditional variance simplifies to

[v u

]C[X ,Y ]′

[v

u

]=[v u

]I

[v

u

]= v2 +u2.

19. First write

Z−E[Z] = (X+ jY )− (mX + jmY ) = (X−mX )+ j(Y −mY ).

Then

cov(Z) := E[(Z−E[Z])(Z−E[Z])∗]

= E[(X−mX )+ j(Y −mY )(X−mX )+ j(Y −mY )∗]= E[(X−mX )+ j(Y −mY )(X−mX )− j(Y −mY )]= var(X)− j cov(X ,Y )+ j cov(Y,X)+ var(Y ) = var(X)+ var(Y ).

20. (a) Write

K := E[(Z−E[Z])(Z−E[Z])H ]

= E[(X−mX )+ j(Y −mY )(X−mX )+ j(Y −mY )H ]

= E[(X−mX )+ j(Y −mY )(X−mX )H − j(Y −mY )H]= CX − jCXY + jCYX +CY = (CX +CY )+ j(CYX −CXY ).

(b) IfCXY =−CYX , the (CXY )ii =−(CYX )ii implies

E[(Xi− (mX )i)(Yi− (mY )i)] = −E[(Yi− (mY )i)(Xi− (mX )i)]

= −E[(Xi− (mX )i)(Yi− (mY )i)].

Hence, E[(Xi− (mX )i)(Yi− (mY )i)] = 0, and we see that Xi and Yi are uncorre-

lated.

(c) By part (a), if K is real, then CYX = CXY . Circular symmetry implies CYX =−CXY . It follows that CXY =−CXY , and then CXY = 0; i.e., X and Y are uncor-

related.

21. First,

fX (x) =e−x

′(2I)x/2

(2π)n/2(1/2)n/2and fY (y) =

e−y′(2I)y/2

(2π)n/2(1/2)n/2.

Then

fXY (x,y) = fX (x) fY (y) =e−(x ′x+y ′y)

πn=

e−(x+ jy)H (x+ jy)

πn.

22. (a) Immediate from Problem 20(a).

(b) Since ν ′Qν is a scalar, (ν ′Qν)′ = ν ′Qν . Since Q′ = −Q, (ν ′Qν)′ = ν ′Q′ν =−ν ′Qν . Thus, ν ′Qν =−ν ′Qν , and it follows that ν ′Qν = 0.

Page 159: Gubner

158 Chapter 9 Problem Solutions

(c) Begin by observing that if K = R+ jQ and w= ν + jθ , then

Kw = (R+ jQ)(ν + jθ) = Rν + jRθ + jQν−Qθ .

Next,

wH(Kw) = (ν ′− jθ ′)(Rν + jRθ + jQν−Qθ)

= ν ′Rν + jν ′Rθ + jν ′Qν−ν ′Qθ − jθ ′Rν +θ ′Rθ +θ ′Qν− jθ ′Qθ

= ν ′Rν + jν ′Rθ + jν ′Qν +θ ′Qν− jν ′Rθ +θ ′Rθ +θ ′Qν− jθ ′Qθ

= ν ′Rν +θ ′Rθ +2θ ′Qν ,

where we have used the result of parts (a) and (b).

23. First,

AZ = (α + jβ )(X+ jY ) = (αX−βY )+ j(βX+αY ).

Second, [α −ββ α

][X

Y

]=

[αX −βYβX+αY

].

Now assume that circular symmetry holds; i.e., CX = CY and CXY = −CYX . Put

U := αX −βY and V := βX+αY . Assuming zero means to simplify the notation,

CU = E[(αX−βY )(αX−βY )′] = αCXα ′−βCYXα ′−αCXYβ ′+β ′CYβ

= αCXα ′−βCYXα ′+αCYXβ ′+β ′CXβ .

Similarly,

CV = E[(βX+αY )(βX+αY )′] = βCXβ ′+αCYXβ ′+βCXYα ′+αCYα ′

= βCXβ ′+αCYXβ ′−βCYXα ′+αCXα ′.

Hence, CU =CV . It remains to compute

CUV = E[(αX−βY )(βX+αY )′] = αCXβ ′+αCXYα ′−βCYXβ ′−βCYα ′

= αCXβ ′−αCYXα ′−βCYXβ ′−βCXα ′

and

CVU = E[(βX+αY )(αX−βY )′] = βCXα ′−βCXYβ ′+αCYXα ′−αCYβ ′

= βCXα ′+βCYXβ ′+αCYXα ′−αCXβ ′,

which shows that CU =−CV . Thus, if Z is circularly symmetric, so is AZ.24. To begin, note that with R= [X ′,U ′]′ and I = [Y ′,V ′]′,

CR =

[CX CXUCUX CU

], CI =

[CY CYVCVY CV

],

and

CRI =

[CXY CXVCUY CUV

], CIR =

[CYX CYUCVX CVU

].

Also, Θ is circularly symmetric means CR =CI and CRI =−CIR.

Page 160: Gubner

Chapter 9 Problem Solutions 159

(a) We assume zero means to simplify the notation. First,

KZW = E[ZWH ] = E[(X+ jY )(U+ jV )H ] = E[(X+ jY )(UH − jVH)]

= CXU − jCXV + jCYU +CYV

= 2(CXU − jCXV ), since Θ is circularly symmetric.

Second,

CZW

=

[CXU CXVCYU CYV

]=

[CXU CXV−CXV CXU

], since Θ is circularly symmetric.

It is now clear that KZW = 0 if and only ifCZW

= 0.

(b) Assuming zero means again, we compute

KW = E[WWH ] = E[(U+ jV )(U+ jV )H ] = E[(U+ jV )(UH − jVH)]

= CU − jCUV + jCVU +CV = 2(CU − jCUV ).

We now see that AKW = KZW becomes

2(α + jβ )(CU − jCUV ) = 2(CXU − jCXV )

or

(αCU +βCUV )+ j(βCU −αCUV ) = CXU − jCXV . (∗)We also have

CW

=

[CU CUVCVU CV

]

so that ACW

=CZW

becomes

[α −ββ α

][CU CUVCVU CV

]= C

ZW

or [αCU −βCVU αCUV −βCVβCU +αCVU βCUV +αCV

]= C

ZW

or [αCU +βCUV αCUV −βCUβCU −αCUV βCUV +αCU

]=

[CXU CXV−CXV CXU

],

which is equivalent to (∗).(c) If A solves AKW = KZW , then by part (b), A solves AC

W= C

ZW. Hence, by

Problem 17, given W = w,

Z ∼ N(mZ+ A(w−m

W),C

Z− AC

W Z

).

Next, CZ− AC

W Zis equivalent to

[CX CXYCYX CY

]−[

α −ββ α

][CUX CUYCVX CVY

],

Page 161: Gubner

160 Chapter 9 Problem Solutions

which, by the circular symmetry of Θ, becomes[CX CXY−CXY CX

]−[

α −ββ α

][CUX CUY−CUY CUX

],

or [CX CXY−CXY CX

]−[

αCUX +βCUY αCUY −βCUXβCUX −αCUY βCUY +αCUX

],

which is equivalent to

2(CX − jCXY )− (α + jβ ) ·2(CUX − jCUY ),

which is exactly KZ−AKWZ . Thus, givenW = w,

Z ∼ N(mZ +A(w−mW ),KZ−AKWZ

).

25. Let Z = X+ jY with X and Y independent N(0,1/2) as in the text.

(a) Since X and Y are zero mean,

cov(Z) = E[ZZ∗] = E[X2 +Y 2] = 12+ 1

2= 1.

(b) First write 2|Z|2 = 2(X2 +Y 2) = (√2X)2 +(

√2Y )2. Now,

√2X and

√2Y are

both N(0,1). Hence, their squares are chi-squared with one degree of freedomby Problem 46 in Chapter 4 or Problem 11 in Chapter 5. Hence, by Prob-

lem 55(c) in Chapter 4 and the remark following it, 2|Z|2 is chi-squared withtwo degrees of freedom.

26. With X ∼ N(mr,1) and Y ∼ N(mi,1), it follows either from Problem 47 in Chapter 4

or from Problem 12 in Chapter 5 that X2 and Y 2 are noncentral chi-squared with one

degree of freedom and respective noncentrality parameters m2r and m

2i . Since X and Y

are independent, it follows from Problem 65 in Chapter 4 that X2 +Y 2 is noncentralchi-squared with two degrees of freedom and noncentrality parameter m2

r +m2i . It is

now immediate from Problem 26 in Chapter 5 that |Z| =√X2 +Y 2 has the orginal

Rice density.

27. (a) The covariance matrix ofW is

E[WWH ] = E[K−1/2ZZHK−1/2] = K−1/2E[ZZH ]K−1/2 = K−1/2KK−1/2 = I.

Hence,

fW (w) =e−w

Hw

πn=

n

∏k=1

e−|wk|2

π.

(b) By part (a), theWk =Uk+ jVk are i.i.d. N(0,1) with

fWk(w) =e−|wk|

2

π=

e−(u2k+v2k)

(√2π/2

)2 =e−[(uk/(1/

√2 ))2+(vk/(1/

√2 ))2]/2

(√2π/2

)2

=e−[uk/(1/

√2 )]2/2

√2π/2

· e−[vk/(1/

√2 )]2/2

√2π/2

= fUkVk(uk,vk).

Hence,Uk and Vk are independent N(0,1/2).

Page 162: Gubner

Chapter 9 Problem Solutions 161

(c) Write

2‖W‖2 =n

∑k=1

(√2Uk)

2 +(√2Vk)

2.

Since√2Uk and

√2Vk are independent N(0,1), their squares are chi-squared

with one degree of freedom by Problem 46 in Chapter 4 or Problem 11 in Chap-

ter 5. Next, (√2Uk)

2+(√2Vk)

2 is chi-squared with two degrees of freedom by

Problem 55(c) in Chapter 4 and the remark following it. Similarly, since theWkare indepdendent, 2‖W‖2 is chi-squared with 2n degrees of freedom.

28. (a) Write

0 = (u+ v)′M(u+ v) = u′Mu+ v′Mu+u′Mv+ v′Mv = 2v′Mu,

sinceM′ =M. Hence, v′Mu= 0.

(b) By part (a) with v=Mu we have

0 = v′Mu = (Mu)′Mu = ‖Mu‖2.

Hence, Mu= 0 for all u, and it follows that M must be the zero matrix.

29. We have from the text that

[ν ′ θ ′

][ CX CXYCYX CY

][νθ

]

is equal to

ν ′CXν +ν ′CXYθ +θ ′CYXν +θ ′CYθ ,

which, upon noting that ν ′CXYθ is a scalar and therefore equal to its transpose, sim-

plifies to

ν ′CXν +2θ ′CYXν +θ ′CYθ . (∗)We also have from the text (via Problem 22) that

wHKw= ν ′(CX +CY )ν +θ ′(CX +CY )θ +2θ ′(CYX −CXY )ν .

If (∗) is equal to wHKw/2 for all ν and all θ , then in particular, this must hold for allν when θ = 0. This implies

ν ′CXν = ν ′CX +CY

2ν or ν ′

CX −CY2

ν = 0.

Since ν is arbitrary, (CX −CY )/2= 0, orCX =CY . This means that we can now write

wHKw/2 = ν ′CXν +θ ′CYθ +θ ′(CYX −CXY )ν .

Comparing this with (∗) shows that

2θ ′CYXν = θ ′(CYX −CXY )ν or θ ′(CYX +CXY )ν = 0.

Taking θ = ν arbitrary and noting thatCYX +CXY is symmetric, it follows thatCYX +CXY = 0, and so CXY =−CYX .

Page 163: Gubner

162 Chapter 9 Problem Solutions

30. (a) Since Γ is 2n× 2n, det(2Γ) = 22n detΓ. From the hint it follows that detΓ =(detK)2/22n.

(b) Write

VV−1 = (A+BCD)[A−1−A−1B(C−1 +DA−1B)−1DA−1]

= (A+BCD)A−1[I−B(C−1 +DA−1B)−1DA−1]

= (I+BCDA−1)[I−B(C−1 +DA−1B)−1DA−1]

= I+BCDA−1−B(C−1 +DA−1B)−1DA−1

−BCDA−1B(C−1 +DA−1B)−1DA−1

= I+BCDA−1

−B[I+CDA−1B](C−1 +DA−1B)−1DA−1

= I+BCDA−1

−BC[C−1 +DA−1B](C−1 +DA−1B)−1DA−1

= I+BCDA−1−BCDA−1 = I.

(c) To begin, write

ΓΓ−1 =

[CX −CYXCYX CX

][∆−1 C−1X CYX∆−1

−∆−1CYXC−1X ∆−1

]

=

[CX∆−1 +CYX∆−1CYXC

−1X CYX∆−1−CYX∆−1

CYX∆−1−CX∆−1CYXC−1X CYXC

−1X CYX∆−1 +CX∆−1

]

=

[CX∆−1 +CYX∆−1CYXC

−1X 0

CYX∆−1−CX∆−1CYXC−1X (CYXC

−1X CYX +CX )∆−1

]

=

[CX∆−1 +CYX∆−1CYXC

−1X 0

CYX∆−1−CX∆−1CYXC−1X I

].

Using the hint that

∆−1 = C−1X −C−1X CYX∆−1CYXC−1X ,

we easily obtain

CX∆−1 = I−CYX∆−1CYXC−1X ,

from which it follows that

ΓΓ−1 =

[I 0

CYX∆−1−CX∆−1CYXC−1X I

].

To show that the lower-left block is also zero, use the hint to write

CYX∆−1−CX∆−1CYXC−1X

= CYX [C−1X −C−1X CYX∆−1CYXC−1X ]−CX∆−1CYXC

−1X

= CYXC−1X −CYXC−1X CYX∆−1CYXC

−1X −CX∆−1CYXC

−1X

= CYXC−1X − [CYXC

−1X CYX +CX ]∆−1CYXC

−1X

= CYXC−1X −∆∆−1CYXC

−1X = 0.

Page 164: Gubner

Chapter 9 Problem Solutions 163

(d) Write

KK−1 = 2(CX + jCYX )(∆−1− jC−1X CYX∆−1)/2

= CX∆−1 +CYXC−1X CYX∆−1 + j(CYX∆−1−CYX∆−1)

= ∆∆−1 = I.

(e) We begin with

[x ′ y ′

]Γ−1

[x

y

]=[x ′ y ′

][ ∆−1 C−1X CYX∆−1

−∆−1CYXC−1X ∆−1

][x

y

]

=[x ′ y ′

][ ∆−1x+C−1X CYX∆−1y−∆−1CYXC

−1X x+∆−1y

]

= x ′∆−1x+ x ′C−1X CYX∆−1y− y ′∆−1CYXC−1X x+ y ′∆−1y.

Now, since each of the above terms on the third line is a scalar, each term is

equal to its transpose. In particular,

y ′∆−1CYXC−1X x = x ′C−1X CXY∆−1y = −x ′C−1X CYX∆−1y.

Hence,

12

[x ′ y ′

]Γ−1

[x

y

]= 1

2(x ′∆−1x+2x ′C−1X CYX∆−1y+ y ′∆−1y). (∗)

We next compute

zHK−1z = 12(x ′− jy ′)(∆−1− jC−1X CYX∆−1)(x+ jy)

= 12(x ′− jy ′)[(∆−1x+C−1X CYX∆−1y)+ j(∆−1y−C−1X CYX∆−1x)]

= 12(x ′− jy ′)[(∆−1x+C−1X CYX∆−1y)+ j(∆−1y−∆−1CYXC

−1X x)]

by the hint that C−1X CYX∆−1 = ∆−1CYXC−1X . We continue with

zHK−1z = 12[x ′(∆−1x+C−1X CYX∆−1y)+ y ′(∆−1y−∆−1CYXC

−1X x)

+ jx ′(∆−1y−∆−1CYXC−1X x)− y ′(∆−1x+C−1X CYX∆−1y)]

= 12[x ′∆−1x+ x ′C−1X CYX∆−1y+ y ′∆−1y− y ′∆−1CYXC−1X x

+ jx ′∆−1y− x ′∆−1CYXC−1X x− y ′∆−1x− y ′C−1X CYX∆−1y].

We now use the fact that since each of the terms in the last line is a scalar, it is

equal to its transpose. AlsoCXY =−CYX . Hence,

zHK−1z = 12[x ′∆−1x+2x ′C−1X CYX∆−1y+ y ′∆−1y− jx ′∆−1CYXC−1X x+ y ′C−1X CYX∆−1y].

Since

x ′∆−1CYXC−1X x= (x ′∆−1CYXC

−1X x)′ =−x ′C−1X CYX∆−1x=−x ′∆−1CYXC−1X x,

and similarly for y ′C−1X CYX∆−1y, the two imaginary terms above are zero.

Page 165: Gubner

CHAPTER 10

Problem Solutions

1. Write

mX (t) := E[Xt ] = E[g(t,Z)]

= g(t,1)P(Z = 1)+g(t,2)P(Z = 2)+g(t,3)P(Z = 3)

= p1a(t)+ p2b(t)+ p3c(t),

and

RX (t,s) := E[XtXs] = E[g(t,Z)g(s,Z)]

= g(t,1)g(s,1)p1 +g(t,2)g(s,2)p2 +g(t,3)g(s,3)p3

= a(t)a(s)p1 +b(t)b(s)p2 + c(t)c(s)p3.

2. Imitating the derivation of the Cauchy–Schwarz inequality for random variables in

Chapter 2 of the text, write

0 ≤∫ ∞

−∞|g(θ)−λh(θ)|2 dθ

=∫ ∞

−∞|g(θ)|2 dθ −λ

∫ ∞

−∞h(θ)g(θ)∗ dθ

−λ ∗∫ ∞

−∞g(θ)h(θ)∗ dθ + |λ |2

∫ ∞

−∞|h(θ)|2 dθ .

Then put

λ =

∫ ∞−∞ g(θ)h(θ)∗ dθ∫ ∞−∞ |h(θ)|2 dθ

to get

0 ≤∫ ∞

−∞|g(θ)|2 dθ −

∣∣∣∫ ∞−∞ g(θ)h(θ)∗ dθ

∣∣∣2

∫ ∞−∞ |h(θ)|2 dθ

∣∣∣∫ ∞−∞ g(θ)h(θ)∗ dθ

∣∣∣2

∫ ∞−∞ |h(θ)|2 dθ

+

∣∣∣∫ ∞−∞ g(θ)h(θ)∗ dθ

∣∣∣2

(∫ ∞−∞ |h(θ)|2 dθ

)2∫ ∞

−∞|h(θ)|2 dθ

=∫ ∞

−∞|g(θ)|2 dθ −

∣∣∣∫ ∞−∞ g(θ)h(θ)∗ dθ

∣∣∣2

∫ ∞−∞ |h(θ)|2 dθ

.

Rearranging yields the desired result.

164

Page 166: Gubner

Chapter 10 Problem Solutions 165

3. Write

CX (t1, t2) = E[(Xt1 −mX (t1))(Xt2 −mX (t2))]

= E[Xt1Xt2 ]−mX (t1)E[Xt2 ]−E[Xt1 ]mX (t2)+mX (t1)mX (t2)

= E[Xt1Xt2 ]−mX (t1)mX (t2)−mX (t1)mX (t2)+mX (t1)mX (t2)

= RX (t1, t2)−mX (t1)mX (t2).

Similarly,

CXY (t1, t2) = E[(Xt1 −mX (t1))(Yt2 −mY (t2))]

= E[Xt1Yt2 ]−mX (t1)E[Yt2 ]−E[Xt1 ]mY (t2)+mX (t1)mY (t2)

= E[Xt1Xt2 ]−mX (t1)mY (t2)−mX (t1)mY (t2)+mX (t1)mY (t2)

= RXY (t1, t2)−mX (t1)mY (t2).

4. Write

0 ≤ E

[∣∣∣∣n

∑i=1

ciXti

∣∣∣∣2]

= E

[(n

∑i=1

ciXti

)(n

∑k=1

ckXtk

)∗]=

n

∑i=1

n

∑k=1

ciE[XtiXtk ]c∗k

=n

∑i=1

n

∑k=1

ciRX (ti, tk)c∗k .

5. Since Xt has zero mean, var(Xt) = RX (t, t) = t. Thus, Xt ∼ N(0, t), and

fXt (x) =e−x

2/(2t)

√2πt

.

6. First note that by making the change of variable k = n− i, we have

Yn =∞

∑k=−∞

h(k)Xn−k.

(a) Write

mY (n) = E[Yn] = E

[ ∞

∑k=−∞

h(k)Xn−k

]=

∑k=−∞

h(k)E[Xn−k]

=∞

∑k=−∞

h(k)mX (n− k).

(b) Write

E[XnYm] = E

[Xn

( ∞

∑k=−∞

h(k)Xm−k

)]=

∑k=−∞

h(k)E[XnXm−k]

=∞

∑k=−∞

h(k)RX (n,m− k).

Page 167: Gubner

166 Chapter 10 Problem Solutions

(c) Write

E[YnYm] = E

[( ∞

∑l=−∞

h(l)Xn−l

)Ym

]=

∑l=−∞

h(l)E[Xn−lYm]

=∞

∑l=−∞

h(l)RXY (n− l,m) =∞

∑l=−∞

h(l)

[ ∞

∑k=−∞

h(k)RX (n− l,m− k)].

7. Let Xt = cos(2π f t+Θ), where Θ∼ uniform[−π,π].

(a) Consider choices t1 = 0 and t2 =−(π/2)/(2π f ). Then Xt1 = cos(Θ) and Xt2 =sin(Θ), which are not jointly continuous.

(b) Write

E[g(Xt)] = E[g(cos(2π f t+Θ))] =∫ π

−πg(cos(2π f t+θ))

=∫ π+2π f t

−π+2π f tg(cos(τ))

2π=∫ π

−πg(cos(τ))

2π,

since the integrand has period 2π and the range of integration has length 2π .Thus, E[g(Xt)] does not depend on t.

8. (a) Using independence and a trigonometric identity, write

E[Yt1Yt2 ] = E[Xt1Xt2 cos(2π f t1 +Θ)cos(2π f t2 +Θ)]

= E[Xt1Xt2 ]E[cos(2π f t1 +Θ)cos(2π f t2 +Θ)]

= 12RX (t1− t2)E

[cos(2π f [t1− t2])+ cos(2π f [t1 + t2]+2Θ)

]

= 12RX (t1− t2)

cos(2π f [t1− t2])+E

[cos(2π f [t1 + t2]+2Θ)

]︸ ︷︷ ︸

= 0

= 12RX (t1− t2)cos(2π f [t1− t2]).

(b) A similar argument yields

E[Xt1Yt2 ] = E[Xt1Xt2 cos(2π f t2 +Θ)] = E[Xt1Xt2 ]E[cos(2π f t2 +Θ)]︸ ︷︷ ︸= 0

= 0.

(c) It is clear that Yt is zero mean. Together with part (a) it follows that Yt is WSS.

9. By Problem 7(b), FXt (x) = P(Xt ≤ x) = E[I(−∞,x](Xt)] does not depend on t, and sowe can restrict attention to the case t = 0. Since X0 = cos(Θ) has the arcsine densityof Problem 35 in Chapter 5, f (x) = (1/π)/

√1− x2 for |x|< 1.

10. Second-order strict stationarity means that for every two-dimensional set B, for every

t1, t2, and ∆t, P((Xt1+∆t ,Xt2+∆t) ∈ B) does not depend on ∆t. In particular, this is truewhenever B has the form B= A× IR for any one-dimensional set A; i.e.,

P((Xt1+∆t ,Xt2+∆t) ∈ B) = P((Xt1+∆t ,Xt2+∆t) ∈ A× IR) = P(Xt1 ∈ A,Xt2 ∈ IR)

= P(Xt1 ∈ A).

does not depend on ∆t. Hence Xt is first-order strictly stationary.

Page 168: Gubner

Chapter 10 Problem Solutions 167

11. (a) If p1 = p2 = 0 and p3 = 1, then Xt = c(t) = −1 with probability one. Then

E[Xt ] = −1 does not depend on t, and E[Xt1Xt2 ] = (−1)2 = 1 depends on t1and t2 only through their difference t1− t2; in fact the correlation function is aconstant function of t1− t2. Thus, Xt is WSS.

(b) If p1 = 1 and p2 = p3 = 0, then Xt = e−|t| with probability one. Then E[Xt ] =e−|t| depends on t. Hence, Xt is not WSS.

(c) First, the only way to have X0 = 1 is to have Xt = a(t) = e−|t|, which requiresZ = 1. Hence,

P(X0 = 1) = P(Z = 1) = p1.

Second, the only way to have Xt ≤ 0 for 0≤ t ≤ 0.5 is to have Xt = c(t) =−1,which requires Z = 3. Hence,

P(Xt ≤ 0, 0≤ t ≤ 0.5) = P(Z = 3) = p3.

Third, the only way to have Xt ≤ 0 for 0.5≤ t ≤ 1 is to have Xt = b(t) = sin(2πt)or Xt = c(t) =−1. Hence,

P(Xt ≤ 0, 0.5≤ t ≤ 1) = P(Z = 2 or Z = 3) = p2 + p3.

12. With Yk := q(Xk,Xk+1, . . . ,Xk+L−1), write

E[e j(ν1Y1+m+···+νnYn+m)] = E[e jν1q(X1+m,...,Xm+L)+···+νnq(Xn+m,...,Xn+m+L−1)].

The exponential on the right is just a function of X1+m, . . . ,Xn+L−1+m. Since the Xkprocess is strictly stationary, the expectation on the right is unchanged if we replace

X1+m, . . . ,Xn+L−1+m by X1, . . . ,Xn+L−1; i.e., the above right-hand side is equal to

E[e jν1q(X1,...,XL)+···+νnq(Xn,...,Xn+L−1)] = E[e j(ν1Y1+···+νnYn)].

Hence, the Yk process is strictly stationary.

13. We begin with

E[g(X0)] = E[X0I[0,∞)(X0)] =∫ ∞

−∞xI[0,∞)(x) · λ

2e−λ |x| dx = 1

2

∫ ∞

0x ·λe−λx dx,

which is just 1/2 times the expectation of an exp(λ ) random variable. We thus see

that E[g(X0)] = 1/(2λ ). Next, for n 6= 0, we compute

E[g(Xn)] =∫ ∞

−∞xI[0,∞)(x) ·

e−x2/2

√2π

dx =1√2π

∫ ∞

0xe−x

2/2 dx

=1√2π

(−e−x2/2

)∣∣∣∞

0=

1√2π

.

Hence, E[g(X0)] 6= E[g(Xn)] for n 6= 0, and it follows that Xk is not strictly stationary.

Page 169: Gubner

168 Chapter 10 Problem Solutions

14. First consider the mean function,

E[q(t+T )] =1

T0

∫ T0

0q(t+θ)dθ =

1

T0

∫ t+T0

tq(τ)dτ =

1

T0

∫ T0

0q(τ)dτ,

where we have used the fact that since q has period T0, the integral of q over any inter-

val of length T0 yields the same result. The second thing to consider is the correlation

function. Write

E[q(t1 +T )q(t2 +T )] =1

T0

∫ T0

0q(t1 +θ)q(t2 +θ)dθ

=1

T0

∫ t2+T0

t2

q(t1 + τ− t2)q(τ)dτ

=1

T0

∫ T0

0q([t1− t2]+ τ)q(τ)dτ,

where we have used the fact that as a function of τ , the product q([t1− t2]+ τ)q(τ)has period T0. Since the mean function does not depend on t, and since the correlation

function depends on t1 and t2 only through their difference, Xt is WSS.

15. For arbitrary functions h, write

E[h(Xt1+∆t , . . . ,Xtn+∆t)] = E[h(q(t1 +∆t+T ), . . . ,q(tn+∆t+T )

)]

=1

T0

∫ T0

0h(q(t1 +∆t+θ), . . . ,q(tn+∆t+θ)

)dθ

=1

T0

∫ ∆t+T0

∆th(q(t1 + τ), . . . ,q(tn+ τ)

)dτ

=1

T0

∫ T0

0h(q(t1 + τ), . . . ,q(tn+ τ)

)dτ,

where the last step follows because we are integrating a function of period T0 over an

interval of length T0. Hence,

E[h(Xt1+∆t , . . . ,Xtn+∆t)] = E[h(Xt1 , . . . ,Xtn)],

and we see that Xt is strictly stationary.

16. First write E[Yn] = E[Xn−Xn−1] = E[Xn]−E[Xn−1] = 0 since E[Xn] does not dependon n. Next, write

E[YnYm] = E[(Xn−Xn−1)(Xm−Xm−1)]= RX (n−m)−RX ([n−1]−m)−RX(n− [m−1])+RX(n−m)

= 2RX (n−m)−RX (n−m−1)−RX(n−m+1),

which depends on n and m only through their difference. Hence, Yn is WSS.

17. From the Fourier transform table, SX ( f ) =√2π e−(2π f )2/2.

18. From the Fourier transform table, SX ( f ) = πe−2π| f |.

Page 170: Gubner

Chapter 10 Problem Solutions 169

19. (a) Since correlation functions are real and even, we can write

SX ( f ) =

∫ ∞

−∞RX (τ)e− j2π f τ dτ =

∫ ∞

−∞RX (τ)cos(2π f τ)dτ

= 2

∫ ∞

0RX (τ)cos(2π f τ)dτ = 2Re

∫ ∞

0RX (τ)e− j2π f τ dτ.

(b) OMITTED.

(c) The requested plot is at the left; at the right the plot is focused closer to f = 0.

−6 −3 0 3 60

1

2

−1 −0.5 0 0.5 10

1

2

(d) The requested plot is at the left; at the right the plot is focused closer to f = 0.

−6 −3 0 3 60

2

4

−1 −0.5 0 0.5 10

2

4

20. (a) RX (n) = E[Xk+nXk] = E[XkXk+n] = RX (−n).(b) Since RX (n) is real and even, we can write

SX ( f ) =∞

∑n=−∞

RX (n)e− j2π f n =∞

∑n=−∞

RX (n)[cos(2π f n)− j sin(2π f n)]

=∞

∑n=−∞

RX (n)cos(2π f n)− j∞

∑n=−∞

odd function of n︷ ︸︸ ︷RX (n)sin(2π f n)

︸ ︷︷ ︸= 0

=∞

∑n=−∞

RX (n)cos(2π f n),

which is a real and even function of f .

21. (a) Since correlation functions are real and even, we can write

SX ( f ) =∞

∑n=−∞

RX (n)e− j2π f n

Page 171: Gubner

170 Chapter 10 Problem Solutions

= RX (0)+∞

∑n=1

RX (n)e− j2π f n+−1∑

n=−∞

RX (n)e− j2π f n

= RX (0)+∞

∑n=1

RX (n)e− j2π f n+∞

∑k=1

RX (−k)e j2π f k

= RX (0)+2∞

∑n=1

RX (n)cos(2π f n)

= RX (0)+2Re∞

∑n=1

RX (n)e− j2π f n.

(b) OMITTED.

(c) Here is the plot:

−0.5 0 0.50

1

2

(d) Here is the plot:

−0.5 0 0.50

2

4

22. Write∫ ∞

−∞h(−t)e− j2π f t dt =

∫ ∞

−∞h(θ)e− j2π f (−θ) dθ

=∫ ∞

−∞h(θ)e j2π fθ dθ

=

(∫ ∞

−∞h(θ)∗e− j2π fθ dθ

)∗

=

(∫ ∞

−∞h(θ)e− j2π fθ dθ

)∗, since h is real,

= H( f )∗.

23. We begin with SXY ( f ) = H( f )∗SX ( f ) = ( j2π f )∗SX ( f ) = − j2π f SX ( f ). It then fol-lows that

RXY (τ) = − d

dτRX (τ) = − d

dτe−τ2/2 = τe−τ2/2.

Page 172: Gubner

Chapter 10 Problem Solutions 171

Similarly, since SY ( f ) = |H( f )|2SX ( f ) =−( j2π f )( j2π f )SX( f ),

RY (τ) = − d2

dτ2RX (τ) =

d

dτRXY (τ) =

d

dττe−τ2/2 = e−τ2/2(1− τ2).

24. Since RX (τ) = 1/(1+ τ2), we have from the transform table that SX ( f ) = πe−2π| f |.Similarly, since h(t) = 3sin(πt)/(πt), we have from the transform table that H( f ) =3I[−1/2,1/2]( f ). We can now write

SY ( f ) = |H( f )|2SX ( f ) = 9I[−1/2,1/2]( f ) ·πe−2π| f | = 9πe−2π| f |I[−1/2,1/2]( f ).

25. First note that since RX (τ) = e−τ2/2, SX ( f ) =√2πe−(2π f )2/2.

(a) SXY ( f ) = H( f )∗SX ( f ) =[e−(2π f )2/2]∗

√2πe−(2π f )2/2 =

√2π e−(2π f )2 .

(b) Writing

SXY ( f ) =1√2

√2π√2e−(

√2)2(2π f )2/2,

we have from the transform table that

RXY (τ) =1√2e−(τ/

√2)2/2 =

1√2e−τ2/4.

(c) Write

E[Xt1Yt2 ] = RXY (t1− t2) =1√2e−(t1−t2)2/4.

(d) SY ( f ) = |H( f )|2SX ( f ) = e−(2π f )2 ·√2πe−(2π f )2/2 =

√2πe−3(2π f )2/2.

(e) Writing

SY ( f ) =1√3

√2π√3e−(

√3)2(2π f )2/2,

we have from the transform table that

RY (τ) =1√3e−(τ/

√3)2/2 =

1√3e−τ2/6.

26. We have from the transform table that SX ( f ) = [sin(π f )/(π f )]2. The goal is to

choose a filterH( f ) so that RY (τ) = sin(πτ)/(πτ); i.e., so that SY ( f ) = I[−1/2,1/2]( f ).Thus, the formula SY ( f ) = |H( f )|2SX ( f ) becomes

I[−1/2,1/2]( f ) = |H( f )|2[sin(π f )

π f

]2.

We therefore take

H( f ) =

π f

sin(π f ), | f | ≤ 1/2,

0, | f |> 1/2.

Page 173: Gubner

172 Chapter 10 Problem Solutions

27. SinceYt and Zt are responses of LTI systems to aWSS input,Yt and Zt are individually

WSS. If we can show that E[Yt1Zt2 ] depends on t1 and t2 only throught their difference,then Yt and Zt will be J-WSS. We show this to be the case. Write

E[Yt1Zt2 ] = E

[∫ ∞

−∞h(θ)Xt1−θ dθ

∫ ∞

−∞g(τ)Xt2−τ dτ

]

=∫ ∞

−∞

∫ ∞

−∞h(θ)g(τ)E[Xt1−θXt2−τ ]dτ dθ

=

∫ ∞

−∞

∫ ∞

−∞h(θ)g(τ)RX ([t1−θ ]− [t2− τ])dτ dθ ,

which depends on t1− t2 as claimed.28. Observe that if h(t) := δ (t)−δ (t−1), then

∫ ∞−∞ h(τ)Xt−τ dτ = Xt −Xt−1.

(a) Since Yt := Xt−Xt−1 is the response of an LTI system to a WSS input, Xt and Ytis J-WSS.

(b) Since H( f ) = 1− e− j2π f ,

|H( f )|2 = (1− e− j2π f )(1− e j2π f ) = 2− e j2π f − e− j2π f

= 2−2e j2π f + e− j2π f

2= 2[1− cos(2π f )],

we have

SY ( f ) = |H( f )|2SX ( f ) = 2[1− cos(2π f )] · 2

1+(2π f )2=

4[1− cos(2π f )]

1+(2π f )2.

29. In Yt =∫ tt−3Xτ dτ , make the change of variable θ = t− τ , dθ =−dτ to get

Yt =∫ 3

0Xt−θ dθ =

∫ ∞

−∞I[0,3](θ)Xt−θ dθ .

This shows that Yt is the response to Xt of the LTI system with impulse h(t) = I[0,3](t).Hence, Yt is WSS.

30. Apply the formula

H(0) =

(∫ ∞

−∞h(t)e− j2π f t dt

)∣∣∣∣f=0

with h(t) = (1/π)[(sin t)/t]2. Then from the table, H( f ) = (1−π| f |)I[−1/π,1/π]( f ),and we find that

1 =1

π

∫ ∞

−∞

( sin tt

)2dt, which is equivalent to π =

∫ ∞

−∞

( sin tt

)2dt.

31. (a) Following the hint, we have

E[XnYm] =∞

∑k=−∞

h(k)RX (n,m− k) =∞

∑k=−∞

h(k)RX (n−m+ k).

Page 174: Gubner

Chapter 10 Problem Solutions 173

(b) Similarly,

E[YnYm] =∞

∑l=−∞

h(l)

[ ∞

∑k=−∞

h(k)RX (n− l,m− k)]

=∞

∑l=−∞

h(l)

[ ∞

∑k=−∞

h(k)RX (n− l− [m− k])]

=∞

∑l=−∞

h(l)

[ ∞

∑k=−∞

h(k)RX ([n−m]− [l− k])].

(c) By part (a),

RX (n) =∞

∑k=−∞

h(k)RX (n+ k),

and so

SXY ( f ) =∞

∑n=−∞

RX (n)e− j2π f n =∞

∑n=−∞

[ ∞

∑k=−∞

h(k)RX (n+ k)

]e− j2π f n

=∞

∑k=−∞

h(k)

[ ∞

∑n=−∞

RX (n+ k)e− j2π f n

]

=∞

∑k=−∞

h(k)

[ ∞

∑m=−∞

RX (m)e− j2π f (m−k)]

=

[ ∞

∑k=−∞

h(k)e j2π f k

][ ∞

∑m=−∞

RX (m)e− j2π fm

]

=

[ ∞

∑k=−∞

h(k)e− j2π f k

]∗SX ( f ), since h(k) is real,

= H( f )∗SX ( f ).

(d) By part (b),

RY (n) =∞

∑l=−∞

h(l)

[ ∞

∑k=−∞

h(k)RX (n− [l− k])],

and so

SY ( f ) =∞

∑n=−∞

∑l=−∞

h(l)

[ ∞

∑k=−∞

h(k)RX (n− [l− k])]e− j2π f n

=∞

∑l=−∞

h(l)∞

∑k=−∞

h(k)∞

∑n=−∞

RX (n− [l− k])e− j2π f n

=∞

∑l=−∞

h(l)∞

∑k=−∞

h(k)∞

∑m=−∞

RX (m)e− j2π f (m+[l−k])

=∞

∑l=−∞

h(l)e− j2π f l∞

∑k=−∞

h(k)e j2π f k∞

∑m=−∞

RX (m)e− j2π fm

= H( f )H( f )∗SX ( f ) = |H( f )|2SX ( f ).

Page 175: Gubner

174 Chapter 10 Problem Solutions

32. By the hint,1

2T

∫ T

0|x(t)|2 dt =

nE0

2T+

1

2T

∫ τ

0|x(t)|2 dt.

We first observe that since

T0 ≤ T0 +τ

n=

T

n≤ T0 +

T0

n→ T0,

T/n→ T0. It follows that

nE0

2T=

E0

2(T/n)→ E0

2T0.

We next show that the integral on the right goes to zero. Write

1

2T

∫ τ

0|x(t)|2 dt ≤ 1

2T

∫ T0

0|x(t)|2 dt =

E0

2T→ 0.

A similar argument shows that

1

2T

∫ 0

−T|x(t)|2 dt → E0/(2T0).

Putting this all together shows that (1/2T )∫ T−T |x(t)|2 dt→ E0/T0.

33. Write

E

[∫ ∞

−∞X2t dt

]=

∫ ∞

−∞E[X2

t ]dt =∫ ∞

−∞RX (0)dτ = ∞.

34. If the function q(W ) :=∫W0 S1( f )−S2( f )d f is identically zero, then so is its deriva-

tive, q′(W ) = S1( f )−S2( f ). But then S1( f ) = S2( f ) for all f ≥ 0.

35. If h(t) = I[−T,T ](t), and white noise is applied to the corresponding system, the crosspower spectral density of the input and output is

SXY ( f ) = H( f )∗N0/2 = 2Tsin(2πT f )

2πT f· N02

,

which is real and even, but not nonnegative. Similarly, if h(t) = e−tI[0,∞)(t),

SXY ( f ) =N0/2

1+ j2π f,

which is complex valued.

36. First write

SY ( f ) = |H( f )|2N0/2 =

(1− f 2)2N0/2, | f | ≤ 1,

0, | f |> 1.

Then

PY =∫ ∞

−∞SY ( f )d f =

N0

22

∫ 1

0(1−2 f 2 + f 4)d f

= N0

[f − 2

3f 3 + 1

5f 5]∣∣∣1

0= N0[1− 2

3+ 1

5] = 8N0/15.

Page 176: Gubner

Chapter 10 Problem Solutions 175

37. First note that RX (τ) = e−(2πτ)2/2 has power spectral density SX ( f ) = e− f2/2/

√2π .

Using the definition of H( f ) =√| f | I[−1,1]( f ),

E[Y 2t ] = RY (0) =∫ ∞

−∞SY ( f )d f =

∫ ∞

−∞|H( f )|2SX ( f )d f =

∫ 1

−1| f |SX ( f )d f

= 2

∫ 1

0f SX ( f )d f =

√2

π

∫ 1

0f e− f

2/2 d f =

√2

π

(−e− f 2/2

)∣∣∣1

0

= (1− e−1/2)√

2

π.

38. First observe that

SY ( f ) = |H( f )|2SX ( f ) =

(sinπ f

π f

)2

N0/2.

This is the Fourier transform of

RY (τ) = (1−|τ|)I[−1,1](τ)N02

.

Then PY = RY (0) = N0/2.

39. (a) First note that

H( f ) =1/(RC)

1/(RC)+ j2π f=

1

1+ j(2π f )RC.

Then

SXY ( f ) = H( f )∗SX ( f ) =N0/2

1− j(2π f )RC.

(b) The inverse Fourier transform of SXY ( f ) = H( f )∗N0/2 is

RXY (τ) = h(−τ)∗N0/2 = h(−τ)N0/2,

where the last step follows because h is real. Hence,

RXY (τ) =N0

2RCeτ/(RC)u(−τ).

(c) E[Xt1Yt2 ] = RXY (t1− t2) = (N0/(2RC))e(t1−t2)/(RC)u(t2− t1).

(d) SY ( f ) = |H( f )|2SX ( f ) =N0/2

1+(2π f RC)2.

(e) Since

SY ( f ) =N0/2

1+(2π f RC)2=

N0

2(RC)2· 2(1/(RC))(

1RC

)2+(2π f )2

(RC

2

),

we have that

RY (τ) =N0

4RCe−|τ |/(RC).

Page 177: Gubner

176 Chapter 10 Problem Solutions

(f) PY = RY (0) = N0/(4RC).

40. To begin, write E[Yt+1/2Yt ] = RY (1/2). Next, since the input has power spectral den-

sity N0/2 and since h(t) = 1/(1+ t2) has transform H( f ) = πe−2π| f |, we can write

SY ( f ) = |H( f )|2SX ( f ) = |πe−2π| f ||2 N02

= π2e−4π| f | N02

= πN02·πe−2π(2)| f |.

From the transform table, we conclude that

RY (τ) =πN02· 2

4+ τ2=

πN04+ τ2

,

and so E[Yt+1/2Yt ] = RY (1/2) = 4πN0/17.

41. Since H( f ) = sin(πT f )/(πT f ), we can write

SY ( f ) = |H( f )|2N02

= T

(sinπT f

πT f

)2N0

2T.

Then

RY (τ) =N0

2T(1−|τ|/T )I[−T,T ](τ).

42. To begin, write

Yt = e−t∫ t

−∞eθXθ dθ =

∫ t

−∞e−(t−θ)Xθ dθ =

∫ ∞

−∞e−(t−θ)u(t−θ)Xθ dθ ,

where u is the unit-step function. We then see that Yt is the response to Xt of the

LTI system with impulse response h(t) := e−tu(t). Hence, we know from the text

that Xt and Yt are jointly wide-sense stationary. Next, since SX ( f ) = N0/2, RX (τ) =(N0/2)δ (τ). We then compute in the time domain,

RXY (τ) =∫ ∞

−∞h(−α)RX (τ−α)dα =

N0

2

∫ ∞

−∞h(−α)δ (τ−α)dα =

N0

2h(−τ)

=N0

2eτu(−τ).

Next,

SXY ( f ) = H( f )∗SX ( f ) =N0/2

1− j2π f,

and

SY ( f ) = |H( f )|2SX ( f ) =N0/2

1+(2π f )2.

It then follows that RY (τ) = (N0/4)e−|τ |.

43. Consider the impulse response

h(τ) :=∞

∑n=−∞

hnδ (τ−n).

Page 178: Gubner

Chapter 10 Problem Solutions 177

Then

∫ ∞

−∞h(τ)Xt−τ dτ =

∫ ∞

−∞

[ ∞

∑n=−∞

hnδ (τ−n)]Xt−τ dτ

=∞

∑n=−∞

hn

∫ ∞

−∞Xt−τ δ (τ−n)dτ =

∑n=−∞

hnXt−n =: Yt .

(a) Since Yt is the response of the LTI system with impulse response h(t) to the

WSS input Xt , Xt and Yt are J-WSS.

(b) Since Yt is the response of the LTI system with impulse response h(t) to the

WSS input Xt , SY ( f ) = |H( f )|2SX ( f ), where

H( f ) =∫ ∞

−∞h(τ)e− j2π f τ dτ =

∫ ∞

−∞

[ ∞

∑n=−∞

hnδ (τ−n)]e− j2π f τ dτ

=∞

∑n=−∞

hn

∫ ∞

−∞δ (τ−n)e− j2π f τ dτ =

∑n=−∞

hne− j2π f n

has period one. Hence, P( f ) = |H( f )|2 is real, nonnegative, and has period one.

44. When the input power spectral density is SW ( f ) = 3, the output power spectral density

is |H( f )|2 ·3. We are also told that this output power spectral density is equal to e− f2.

Hence, |H( f )|2 ·3 = e− f2, or |H( f )|2 = e− f

2/3. Next, if SX ( f ) = e f

2I[−1,1]( f ), then

SY ( f ) = |H( f )|2SX ( f ) = (e− f2/3) · e f 2 I[−1,1]( f ) = (1/3)I[−1,1]( f ). It then follows

that

RY (τ) =1

3·2sin(2πτ)

2πτ=

2

3· sin(2πτ)

2πτ.

45. Since H( f ) = GI[−B,B]( f ) and Yt is the response to white noise, the output power

spectral density is SY ( f ) = G2I[−B,B]( f ) ·N0/2, and so

RY (τ) =G2N0

2·2B sin(2πBτ)

2πBτ= G2BN0 ·

sin(2πBτ)

2πBτ.

Note that

RY (k∆t) = RY (k/(2B)) = G2BN0 ·sin(2πBk/(2B))

2πBk/(2B)= G2BN0 ·

sin(πk)

πk,

which is G2BN0 for k = 0 and zero otherwise. It is obvious that the Xi are zero mean.

Since E[XiX j] = RY (i− j), and the Xi are uncorrelated with variance E[X2i ] = RY (0) =

G2BN0.

46. (a) First write RX (τ) = E[Xt+τX∗t ]. Then RX (−τ) = E[Xt−τX

∗t ] = (E[XtX

∗t−τ ])

∗ =RX (τ)∗.

(b) Since

SX ( f ) =∫ ∞

−∞RX (τ)e− j2π f τ dτ,

Page 179: Gubner

178 Chapter 10 Problem Solutions

we can write

SX ( f )∗ =∫ ∞

−∞RX (τ)∗e j2π f τ dτ =

∫ ∞

−∞RX (−t)∗e− j2π f t dt

=∫ ∞

−∞RX (t)e− j2π f t dt, by part (a),

= SX ( f ).

Since SX ( f ) is equal to its complex conjugate, SX ( f ) is real.

(c) Write

E[Xt1Y∗t2] = E

[Xt1

(∫ ∞

−∞h(θ)Xt2−θ dθ

)∗]=

∫ ∞

−∞h(θ)∗E[Xt1X

∗t2−θ ]dθ

=∫ ∞

−∞h(θ)∗RX ([t1− t2]+θ)dθ =

∫ ∞

−∞h(−β )∗RX ([t1− t2]−β )dβ .

(d) By part (c),

RXY (τ) =∫ ∞

−∞h(−β )∗RX (τ−β )dβ ,

which is the convolution of h(−·)∗ and RX . Hence, the transform of this equa-

tion is the product of the transform of h(−·)∗ and SX . We just have to observe

that∫ ∞

−∞h(−β )∗e− j2π fβ dβ =

∫ ∞

−∞h(t)∗e j2π f t dt =

[∫ ∞

−∞h(t)e− j2π f t dt

]∗= H( f )∗.

Hence, SXY ( f ) = H( f )∗SX ( f ). Next, since

RY (τ) = E[Yt+τY∗t ] = E

[(∫ ∞

−∞h(θ)Xt+τ−θ dθ

)Y ∗t

]

=

∫ ∞

−∞h(θ)E[Xt+τ−θY

∗t ]dθ =

∫ ∞

−∞h(θ)RXY (τ−θ)dθ ,

is a convolution, its transform is

SY ( f ) = H( f )SXY ( f ) = H( f )H( f )∗SX ( f ) = |H( f )|2SX ( f ).

47. (a) RX (τ) =∫ ∞

−∞SX ( f )e j2π f τ d f =

∫ ∞

−∞δ ( f )e j2π f τ d f = e j2π0τ = 1.

(b) Write

RX (τ) =

∫ ∞

−∞[δ ( f − f0)+δ ( f + f0)]e

j2π f τ d f = e j2π f0τ + e− j2π f0τ

= 2cos(2π f0τ).

(c) First write

SX ( f ) = e− f2/2 =

[e−( 1

2π )2(2π f )2/2 · 12π·√2π

] 2π√2π

=[e−( 1

2π )2(2π f )2/2 · 12π·√2π

]√2π.

Page 180: Gubner

Chapter 10 Problem Solutions 179

From the Fourier transform table with σ = 1/(2π),

RX (τ) =√2πe−(2πτ)2/2.

(d) From the transform table with λ = 1/(2π),

RX (τ) =1

π· 1/(2π)

(1/(2π))2 + τ2=

2

1+(2πτ)2.

48. Write

E[X2t ] = RX (0) =

[∫ ∞

−∞SX ( f )e j2π f τ d f

]∣∣∣∣τ=0

=∫ ∞

−∞SX ( f )d f =

∫ W

−W1d f = 2W.

49. (a) e− f u( f ) is not even.

(b) e− f2cos( f ) is not nonnegative.

(c) (1− f 2)/(1+ f 4) is not nonnegative.

(d) 1/(1+ j f 2) is not real valued.

50. (a) Since sinτ is odd, it is NOT a valid correlation function.

(b) Since the Fourier transform of cosτ is [δ ( f − 1)+ δ ( f + 1)]/2, which is real,even, and nonnegative, cosτ IS a valid correlation function.

(c) Since the Fourier transform of e−τ2/2 is√2π e−(2π f )2/2, which is real, even, and

nonnegative, e−τ2/2 IS a valid correlation function.

(d) Since the Fourier transform of e−|τ | is 2/[1+(2π f )2], which is real, even, andnonnegative, e−|τ | IS a valid correlation function.

(e) Since the value of τ2e−|τ | at τ = 0 is less than the value for other values of τ ,τ2e−|τ | is NOT a valid correlation function.

(f) Since the Fourier transform of I[−T,T ](τ) is (2T )sin(2πT f )/(2πT f ) is not non-negative, I[−T,T ](τ) is NOT a valid correlation function.

51. Since R0(τ) is a correlation function, S0( f ) is real, even, and nonnegative. Since

R(τ) = R0(τ)cos(2π f0τ),

S( f ) = 12[S0( f − f0)+S0( f + f0)],

which is obviously real and nonnegative. It is also even since

S(− f ) = 12[S0(− f − f0)+S0(− f + f0)]

= 12[S0( f + f0)+S0( f − f0)], since S0 is even,

= S( f ).

Since S( f ) is real, even, and nonnegative, R(τ) is a valid correlation function.

Page 181: Gubner

180 Chapter 10 Problem Solutions

52. First observe that the Fourier transform of R(τ) = R(τ − τ0) +R(τ + τ0) is S( f ) =2S( f )cos(2π f τ0). Hence, the answer cannot be (a) because it is possible to have

S( f ) > 0 and cos(2π f τ0) < 0 for some values of f . Let S( f ) = I[−1/(4τ0),1/(4τ0)]( f ),which is real, even, and nonnegative. Hence, its inverse transform, which we denote

by R(τ), is a correlation function. In this case, S( f ) = 2S( f )cos(2π f τ0)≥ 0 for all f ,

and is real and even too. Hence, for this choice of R(τ), R(τ) is a correlation function.Therefore, the answer is (b).

53. To begin, write

R(τ) =∫ ∞

−∞S( f )e j2π f τ d f =

∫ ∞

−∞S( f )[cos(2π f τ)− j sin(2π f τ)]d f .

Since S is real and even, the integral of S( f )sin(2π f τ) is zero, and we have

R(τ) =∫ ∞

−∞S( f )cos(2π f τ)d f ,

which is a real and even function of τ . Finally,

|R(τ)| =

∣∣∣∣∫ ∞

−∞S( f )e j2π f τ d f

∣∣∣∣ ≤∫ ∞

−∞

∣∣S( f )e j2π f τ∣∣d f

=∫ ∞

−∞|S( f )|d f =

∫ ∞

−∞S( f )d f = R(0).

54. Let S0( f ) denote the Fourier transform of R0(τ), and let S( f ) denote the Fouriertransform of R(τ).

(a) The derivation in the text showing that the transform of a correlation function

is real and even uses only the fact that correlation functions are real and even.

Hence, S0( f ) is real and even. Furthermore, since R is the convolution of R0with itself, S( f ) = S0( f )

2, which is real, even, and nonnegative. Hence, R(τ) isa correlation function.

(b) If R0(τ) = I[−T,T ](τ), then

S0( f ) = 2Tsin(2πT f )

2πT fand S( f ) = 2T ·2T

[sin(2πT f )

2πT f

]2.

Hence, R(τ) = 2T ·(1−|τ|/(2T )

)I[−2T,2T ](τ).

55. Taking α = N0/2 as in the text,

h(t) = v(t0− t) = sin(t0− t)I[0,π](t0− t).Then h is causal for t0 ≥ π .

56. Since v(t) = e(t/√2 )2/2, V ( f ) =

√2π√2e−2(2π f )2/2 = 2

√πe−(2π f )2 . Then

H( f ) = αV ( f )∗e− j2π f t0

SX ( f )= α

2√

πe−(2π f )2e− j2π f t0

e−(2π f )2/2

= 2α√

πe−(2π f )2/2e− j2π f t0 =√2α ·

√2πe−(2π f )2/2e− j2π f t0 ,

and it follows that h(t) =√2αe−(t−t0)2/2.

Page 182: Gubner

Chapter 10 Problem Solutions 181

57. Let v0(n) := ∑k h(n−k)v(k) andYn := ∑k h(n−k)Xk. The SNR is v0(n0)2/E[Y 2n0 ]. We

have

E[Y 2n0 ] =∫ 1/2

−1/2SY ( f )d f =

∫ 1/2

−1/2|H( f )|2SX ( f )d f .

Let V ( f ) := ∑k v(k)e− j2π f k. Then

|v0(n0)|2 =

∣∣∣∣∫ 1/2

−1/2H( f )V ( f )e j2π f n0 d f

∣∣∣∣2

=

∣∣∣∣∫ 1/2

−1/2H( f )

√SX ( f ) · V ( f )e j2π f n0

√SX ( f )

d f

∣∣∣∣2

≤∫ 1/2

−1/2|H( f )|2SX ( f )d f

∫ 1/2

−1/2

|V ( f )∗e− j2π f n0 |2SX ( f )

d f ,

with equality if and only if

H( f )√SX ( f ) = α

V ( f )∗e− j2π f n0√SX ( f )

(#)

for some constant α . It is now clear that the SNR is upper bounded by

∫ 1/2

−1/2

|V ( f )∗e− j2π f n0 |2SX ( f )

d f

and that the SNR equals the bound if and only if (#) holds with equality for some

constant α . Hence, the matched filter transfer function is

H( f ) = αV ( f )∗e− j2π f n0

SX ( f ).

58. We begin by writing

E[Ut ] = E[Vt +Xt ] = E[Vt ]+E[Xt ],

which does not depend on t since Vt and Xt are each individually WSS. Next write

E[Ut1Vt2 ] = E[(Vt1 +Xt1)Vt2 ] = RV (t1− t2)+RXV (t1− t2). (∗)

Now write

E[Ut1Ut2 ] = E[Ut1(Vt2 +Xt2 ] = E[Ut1Vt2 ]+E[Ut1Xt2 ].

By (∗), the term E[Ut1Vt2 ] depends on t1 and t2 only through their difference. Since

E[Ut1Xt2 ] = E[(Vt1 +Xt1)Xt2 ] = RVX (t1− t2)+RX (t1− t2),

it follows that E[Ut1Ut2 ] depends on t1 and t2 only through their difference. Hence,Utand Vt are J-WSS.

Page 183: Gubner

182 Chapter 10 Problem Solutions

59. The assumptions in the problem imply thatVt and Xt are J-WSS, and by the preceding

problem, it follows that Ut and Vt are J-WSS. We can therefore apply the formulas

for the Wiener filter derived in the text. It just remains to compute the quantities used

in the formulas. First,

RVU (τ) = E[Vt+τUt ] = E[Vt+τ(Vt +Xt)] = RV (τ)+RXV (τ) = RV (τ),

which implies SVU ( f ) = SV ( f ). Similarly,

RU (τ) = E[Ut+τUt ] = E[(Vt+τ +Xt+τ)Ut ]

= RVU (τ)+E[Xt+τ(Vt +Xt)]

= RV (τ)+RXV (τ)+RX (τ) = RV (τ)+RX (τ),

and so SU ( f ) = SV ( f )+SX ( f ). We then have

H( f ) =SVU ( f )

SU ( f )=

SV ( f )

SV ( f )+SX ( f ).

60. The formula for RV (τ) implies SV ( f ) = (1−| f |)I[−1,1]( f ). We then have

H( f ) =SV ( f )

SV ( f )+SX ( f )=

(1−| f |)I[−1,1]( f )(1−| f |)I[−1,1]( f )+1− I[−1,1]( f )

=(1−| f |)I[−1,1]( f )1−| f |I[−1,1]( f )

= I[−1,1]( f ),

and so

h(t) = 2sin(2πt)

2πt.

61. To begin, write

E[|Vt −Vt |2] = E[(Vt −Vt)(Vt −Vt)] = E[(Vt −Vt)Vt ]−E[(Vt −Vt)Vt ]= E[(Vt −Vt)Vt ], by the orthogonality principle,

= E[V 2t ]−E[VtVt ] = RV (0)−E[VtVt ] =

∫ ∞

−∞SV ( f )d f −E[VtVt ].

Next observe that

E[VtVt ] = E

[(∫ ∞

−∞h(θ)Ut−θ dθ

)Vt

]=

∫ ∞

−∞h(θ)E[VtUt−θ ]dθ

=∫ ∞

−∞h(θ)RVU (θ)dθ =

∫ ∞

−∞h(θ)RVU (θ)∗ dθ , since RVU (θ) is real,

=

∫ ∞

−∞H( f )SVU ( f )∗ d f , by Parseval’s formula,

=∫ ∞

−∞

SVU ( f )

SU ( f )SVU ( f )∗ d f =

∫ ∞

−∞

|SVU ( f )|2SU ( f )

d f .

Putting these two observations together yields

E[|Vt −Vt |2] =∫ ∞

−∞SV ( f )d f −

∫ ∞

−∞

|SVU ( f )|2SU ( f )

d f =∫ ∞

−∞

[SV ( f )− |SVU ( f )|2

SU ( f )

]d f .

Page 184: Gubner

Chapter 10 Problem Solutions 183

62. Denote the optimal estimator by Vn = ∑∞k=−∞ h(k)Un−k, and denote any other estima-

tor by Vn = ∑∞k=−∞ h(k)Un−k. The discrete-time orthogonality principle says that if

E

[(Vn−Vn)

∑k=−∞

h(k)Un−k

]= 0 (∗)

for every h, then h is optimal in that E[|Vn− Vn|2] ≤ E[|Vn− Vn|2] for every h. To es-tablish the orthogonality principle, assume the above equation holds for every choice

of h. Then we can write

E[|Vn−Vn|2] = E[|(Vn−Vn)+(Vn−Vn)|2]= E[|Vn−Vn|2 +2(Vn−Vn)(Vn−Vn)+ |Vn−Vn|2]= E[|Vn−Vn|2]+2E[(Vn−Vn)(Vn−Vn)]+E[|Vn−Vn|2]. (∗∗)

Now observe that

E[(Vn−Vn)(Vn−Vn)] = E

[(Vn−Vn)

( ∞

∑k=−∞

h(k)Un−k−∞

∑k=−∞

h(k)Un−k

)]

= E

[(Vn−Vn)

∑k=−∞

[h(k)− h(k)

]Un−k

]= 0, by (∗).

We can now continue (∗∗) writing

E[|Vn−Vn|2] = E[|Vn−Vn|2]+E[|Vn−Vn|2] ≥ E[|Vn−Vn|2],

and thus h is the filter that minimizes the mean-squared error.

The next task is to characterize the filter h that satisfies the orthogonality condition

for every choice of h. Write the orthogonality condition as

0 = E

[(Vn−Vn)

∑k=−∞

h(k)Ut−k

]= E

[ ∞

∑k=−∞

h(k)(Vn−Vn)Ut−k]

=∞

∑k=−∞

E[h(k)(Vn−Vn)Ut−k] =∞

∑k=−∞

h(k)E[(Vn−Vn)Ut−k]

=∞

∑k=−∞

h(k)[RVU (k)−RVU

(k)].

Since this must hold for all h, take h(k) = RVU (k)−RVU

(k) to get

∑k=−∞

∣∣RVU (k)−RVU

(k)∣∣2 = 0.

Thus, the orthogonality condition holds for all h if and only if RVU (k) = RVU

(k) forall k.

Page 185: Gubner

184 Chapter 10 Problem Solutions

The next task is to analyze RVU

. Recall that Vn is the response of an LTI system to

inputUn. Applying the result of Problem 31(a) with X replaced byU and Y replaced

by V , we have, also using the fact that RU is even,

RVU

(m) = RUV

(−m) =∞

∑k=−∞

h(k)RU (m− k).

Taking discrete-time Fourier transforms of

RVU (m) = RVU

(m) =∞

∑k=−∞

h(k)RU (m− k)

yields

SVU ( f ) = H( f )SU ( f ), and so H( f ) =SVU ( f )

SU ( f ).

63. We have

H( f ) =SV ( f )

SV ( f )+SX ( f )=

2λ/[λ 2 +(2π f )2]

2λ/[λ 2 +(2π f )2]+1=

2λ +λ 2 +(2π f )2

A· 2A

A2 +(2π f )2,

where A :=√2λ +λ 2. Hence, h(t) = (λ/A)e−A|t|.

64. To begin, write

K( f ) =λ + j2π f

A+ j2π f=

λ

A+ j2π f+ j2π f

1

1+ j2π f.

Then

k(t) = λe−Atu(t)+d

dte−Atu(t)

= λe−Atu(t)−Ae−Atu(t)+ e−Atδ (t)

= (λ −A)e−Atu(t)+δ (t),

since e−Atδ (t) = δ (t) for both t = 0 and for t 6= 0. This is a causal impulse response.

65. Let Zt :=Vt+∆t . Then the causal Wiener filter for Zt yields the prediction or smoothing

filter for Vt+∆t . The Wiener–Hopf equation for Zt is

RZU (τ) =∫ ∞

0h∆t(θ)RU (τ−θ)dθ , τ ≥ 0.

Now, RZU (τ) = E[Zt+τUt ] = E[Vt+τ+∆tUt ] = RVU (τ +∆t), and so we must solve

RVU (τ +∆t) =∫ ∞

0h∆t(θ)RU (τ−θ)dθ , τ ≥ 0.

Page 186: Gubner

Chapter 10 Problem Solutions 185

For white noise with RU (τ) = δ (τ), this reduces to

RVU (τ +∆t) =∫ ∞

0h∆t(θ)δ (τ−θ)dθ = h∆t(τ), τ ≥ 0.

If h(t) = RVU (t)u(t) denotes the causal Wiener filter, then for ∆t ≥ 0 (prediction), we

can write

h∆t(τ) = RVU (τ +∆t) = h(τ +∆t), τ ≥ 0.

If ∆t < 0 (smoothing), we can write h∆t(τ) = h(τ + ∆t) only for τ ≥ −∆t. For 0 ≤τ <−∆t, h(τ +∆t) = 0 while h∆t(τ) = RVU (τ +∆t).

66. By the hint, the limit of the double sums is the desired double integral. If we can show

that each of these double sums is nonnegative, then the limit will also be nonnegative.

To this end put Zi := Xtie− j2π f ti∆t i. Then

0 ≤ E

[∣∣∣∣n

∑i=1

Zi

∣∣∣∣2]

= E

[(n

∑i=1

Zi

)(n

∑k=1

Zk

)∗]=

n

∑i=1

n

∑k=1

E[ZiZ∗k ]

=n

∑i=1

n

∑k=1

E[XtiXtk ]e− j2π f tie j2π f tk∆t i∆tk

=n

∑i=1

n

∑k=1

RX (ti− tk)e− j2π f tie j2π f tk∆t i∆tk.

67. (a) The Fourier transform of CY (τ) = e−|τ | is 2/[1+(2π f )2], which is continuousat f = 0. Hence, we have convergence in mean square of 1

2T

∫ T−T Yt dt to E[Yt ].

(b) The Fourier transform of CY (τ) = sin(πτ)/(πτ) is I[−1/2,1/2]( f ), which is con-

tinuous at f = 0. Hence, we have convergence in mean square of 12T

∫ T−T Yt dt to

E[Yt ].

68. We first point out that this is not a question about mean-square convergence. Write

1

2T

∫ T

−Tcos(2πt+Θ)dt =

sin(2πT +Θ)− sin(2π(−T )+Θ)

2T ·2π.

Since |sinx | ≤ 1, we can write∣∣∣∣1

2T

∫ T

−Tcos(2πt+Θ)dt

∣∣∣∣ ≤2

4πT→ 0,

and so the limit in question exists and is equal to zero.

69. As suggested by the hint, put Yt := Xt+τXt . It will be sufficient if Yt is WSS and if the

Fourier transform of the covariance function of Yt is continuous at the origin. First,

since Xt is WSS, the mean of Yt is

E[Yt ] = E[Xt+τXt ] = RX (τ),

which does not depend on t. Before examining the correlation function of Yt , we

assume that Xt is fourth-order strictly stationary so that

E[Yt1Yt2 ] = E[Xt1+τXt1Xt2+τXt2 ]

Page 187: Gubner

186 Chapter 10 Problem Solutions

must be unchanged if on the right-hand side we subtract t2 from every subscript ex-

pression to get

E[Xt1+τ−t2Xt1−t2XτX0].

Since this depends on t1 and t2 only through their difference, we see that Yt is WSS if

Xt is fourth-order strictly stationary. Now, the covariance function of Yt is

C(θ) = E[Xθ+τXθXτX0]−RX (τ)2.

If the Fourier transform of this function of θ is continuous at the origin, then

1

2T

∫ T

−TXt+τXt dt→ RX (τ).

70. As suggested by the hint, put Yt := IB(Xt). It will be sufficient if Yt is WSS and if

the Fourier transform of the covariance function of Yt is continuous at the origin. We

assume at the outset that Xt is second-order strictly stationary. Then the mean of Yt is

E[Yt ] = E[IB(Xt)] = P(Xt ∈ B),

which does not depend on t. Similarly,

E[Yt1Yt2 ] = E[IB(Xt1)IB(Xt2)] = P(Xt1 ∈ B,Xt2 ∈ B)

must be unchanged if on the right-hand side we subtract t2 from every subscript to get

P(Xt1−t2 ∈ B,X0 ∈ B).

Since this depends on t1 and t2 only through their difference, we see that Yt is WSS if

Xt is second-order strictly stationary. Now, the covariance function of Yt is

C(θ) = P(Xθ ∈ B,X0 ∈ B)−P(Xt ∈ B)2.

If the Fourier transform of this function of θ is continuous at the origin, then

1

2T

∫ T

−TIB(Xt)dt→ P(Xt ∈ B).

71. We make the following definition and apply the hints:

RXY (τ) := limT→∞

1

2T

∫ T

−TRXY (τ +θ ,θ)dθ

= limT→∞

1

2T

∫ T

−T

[∫ ∞

−∞h(α)RX (τ +θ ,θ −α)dα

]dθ

=∫ ∞

−∞h(α)

[limT→∞

1

2T

∫ T

−TRX (τ +θ ,θ −α)dθ

]dα

=∫ ∞

−∞h(α)

[limT→∞

1

2T

∫ T−α

−T−αRX (τ +α +β ,β )dβ

]dα

=

∫ ∞

−∞h(α)

[limT→∞

1

2T

∫ T

−TRX (τ +α +β ,β )dβ

︸ ︷︷ ︸= RX (τ+α)

]dα.

Page 188: Gubner

Chapter 10 Problem Solutions 187

72. Write

RY (τ) := limT→∞

1

2T

∫ T

−TRY (τ +θ ,θ)dθ

= limT→∞

1

2T

∫ T

−T

[∫ ∞

−∞h(β )RXY (τ +θ −β ,θ)dβ

]dθ

=∫ ∞

−∞h(β )

[limT→∞

1

2T

∫ T

−TRXY ([τ−β ]+θ ,θ)dθ

︸ ︷︷ ︸= RXY (τ−β )

]dβ .

73. Let SXY ( f ) denote the Fourier transform of RXY (τ), and let SY ( f ) denote the Fouriertransform of RY (τ). Then by the preceding two problems,

SY ( f ) = H( f )SXY ( f ) = H( f )H(− f )SX ( f ) = |H( f )|2SX ( f ),

where, since h is real, H(− f ) = H( f )∗.

74. This is an instance of Problem 32.

Page 189: Gubner

CHAPTER 11

Problem Solutions

1. With λ = 3 and t = 10,

P(Nt = 0) = e−λ t = e−3·10 = e−30 = 9.358×10−14.

2. With λ = 12 per minute and t = 20 seconds, λ t = 4. Thus,

P(Nt > 3) = 1−P(Nt ≤ 3) = 1− e−λ t

(1+λ t+

(λ t)2

2!+

(λ t)3

3!

)

= 1− e−4(1+4+

42

2+43

6

)= 1− e−4(5+8+32/3)

= 1− e−4(39/3+32/3) = 1− e−4(71/3) = 0.5665.

3. (a) P(N5 = 10) =(2 ·5)10e−2·5

10!= 0.125.

(b) We have

P

( 5⋂

i=1

Ni−Ni−1 = 2)

=5

∏i=1

P(Ni−Ni−1 = 2) =5

∏i=1

(22e−2

2!

)

= (2e−2)5 = 32e−10 = 1.453×10−3.

4. Let Nt denote the number of crates sold through time t (in days). Then Ni−Ni−1 isthe number of crates sold on day i, and so

P

( 5⋂

i=1

Ni−Ni−1 ≥ 3)

=5

∏i=1

P(Ni−Ni−1 ≥ 3) =5

∏i=1

[1−P(Ni−Ni−1 ≤ 2)

]

=5

∏i=1

[1− e−λ

(1+λ +λ 2/2!

)]

=[1− e−3

(1+3+9/2

)]5= 0.06385.

5. Let Nt denote the number of fishing rods sold through time t (in days). Then Ni−Ni−1is the number of crates sold on day i, and so

P

( 5⋃

i=1

Ni−Ni−1 ≥ 3)

= 1−P

( 5⋂

i=1

Ni−Ni−1 ≤ 2)

= 1−5

∏i=1

P(Ni−Ni−1 ≤ 2)

188

Page 190: Gubner

Chapter 11 Problem Solutions 189

= 1−5

∏i=1

[e−λ (1+λ +λ 2/2!)

]

= 1−[e−2(1+2+4/2!)

]5= 1− e−10 ·55 = 0.858.

6. Since the average time between hit songs is 7 months, the rate is λ = 1/7 per month.

(a) Since a year is 12 months, we write

P(N12 > 2) = 1−P(N12 ≤ 2) = 1− e−12λ [1+12λ +(12λ )2/2!]

= 1− e−12/7[1+12/7+(12/7)2/2] = 0.247.

(b) Let Tn denote the time of the nth hit song. Since Tn = X1 + · · ·+Xn, E[Tn] =nE[X1] = 7n. For n= 10, E[T10] = 70 months.

7. (a) Since N0 ≡ 0, Nt =Nt−N0. Since (0, t]∩(t, t+∆t] = ∅, Nt−N0 and Nt+∆t−Ntare independent.

(b) Write

P(Nt+∆t = k+ `|Nt = k) = P(Nt+∆t −Nt = `|Nt = k), by substitution,

= P(Nt+∆t −Nt = `|Nt −N0 = k), since N0 ≡ 0,

= P(Nt+∆t −Nt = `), by independent increments.

(c) Write

P(Nt = k|Nt+∆t = k+ `) =P(Nt+∆t = k+ `|Nt = k)P(Nt = k)

P(Nt+∆t = k+ `)

=P(Nt+∆t −Nt = `)P(Nt = k)

P(Nt+∆t = k+ `), by part (b),

=(λ∆t)`e−λ∆t

`! · (λ t)ke−λ t

k!

[λ (t+∆t)]k+`e−λ (t+∆t)

(k+`)!

=

(k+ `

k

)(t

t+∆t

)k( ∆t

t+∆t

)`

.

(d) In part (c), put ` = n− k and put p= t/(t+∆t). Then

P(Nt = k|Nt+∆t = n) =

(n

k

)pk(1− p)n−k, k = 0, . . . ,n.

8. The nth customer arrives at time Tn ∼ Erlang(n,λ ). Hence,

E[Tn] =Γ(1+n)

λΓ(n)=

nΓ(n)

λΓ(n)=

n

λ.

Alternatively, since Tn =X1+ · · ·+Xn, where the Xi are i.i.d. exp(λ ), E[Tn] = nE[Xi] =n/λ .

Page 191: Gubner

190 Chapter 11 Problem Solutions

9. (a) E[Xi] = 1/λ = 0.5 weeks.

(b) P(N2 = 0) = e−λ ·2 = e−4 = 0.0183.

(c) E[N12] = λ ·12= 24 snowstorms.

(d) Write

P

( 12⋃

i=1

Ni−Ni−1 ≥ 5)

= 1−P

( 12⋂

i=1

Ni−Ni−1 ≤ 4)

= 1− [e−λ (1+λ +λ 2/2+λ 3/6+λ 4/24)]12

= 1− [e−2(1+2+2+4/3+2/3)]12

= 1− [7e−2]12 = 1−0.523 = 0.477.

10. First observe that since 1/λ = 2 months, λ = 1/2 per month.

(a) P(N4 = 0) = e−λ ·4 = e−4/2 = e−2 = 0.135.

(b) Write

P

( 4⋃

i=1

Ni−Ni−1 ≥ 2)

= 1−P

( 4⋂

i=1

Ni−Ni−1 ≤ 1)

= 1−4

∏i=1

P(Ni−Ni−1 ≤ 1)

= 1−4

∏i=1

[e−λ ·1 + e−λ ·1(λ ·1)]

= 1− [e−λ (1+λ )]4 = 1− [ 32e−1/2]4

= 1− 8116e−2 = 0.315.

11. We need to find var(Tn) = var(X1 + · · ·+Xn). Since the Xi are independent, they

are uncorrelated, and so the variance of the sum is the sum of the variances. Since

Xi ∼ exp(λ ), var(Tn) = nvar(Xi) = n/λ 2. An alternative approach is to use the fact

that Tn ∼ Erlang(n,λ ). Since the moments of Tn are available,

var(Tn) = E[T 2n ]− (E[Tn])2 =

(1+n)n

λ 2−

( nλ

)2=

n

λ 2.

12. To begin, use the law of total probability, substitution, and independence to write

E[zYt ] = E[zNln(1+tU) ] =∫ 1

0E[zNln(1+tU) |U = u]du =

∫ 1

0E[zNln(1+tu) |U = u]du

=∫ 1

0E[zNln(1+tu) ]du =

∫ 1

0exp[(z−1) ln(1+ tu)]du =

∫ 1

0eln(1+tu)

z−1du

=∫ 1

0(1+ tu)z−1 du.

Page 192: Gubner

Chapter 11 Problem Solutions 191

To compute this integral, we need to treat the cases z = 0 and z 6= 0 separately. We

find that

E[zYt ] =

ln(1+ t)

t, z= 0,

(1+ t)z−1

tz, z 6= 0.

13. Denote the arrival times of Nt by T1,T2, . . . , and let Xk := Tk−Tk−1 denote the inter-arrival times. Similarly, denote the arrival times of Mt by S1,S2, . . . . (As it turns out,we do not need the interarrival times ofMt .) Then for arbitrary k > 1, we use the law

of total probability, substitution, and independence to compute

P(MTk −MTk−1 = m)

=∫ ∞

0

∫ ∞

θP(MTk −MTk−1 = m|Tk = t,Tk−1 = θ) fTkTk−1(t,θ)dt dθ

=∫ ∞

0

∫ ∞

θP(Mt −Mθ = m|Tk = t,Tk−1 = θ) fTkTk−1(t,θ)dt dθ

=∫ ∞

0

∫ ∞

θP(Mt −Mθ = m) fTkTk−1(t,θ)dt dθ

=∫ ∞

0

∫ ∞

θ

[µ(t−θ)]me−µ(t−θ)

m!fTkTk−1(t,θ)dt dθ

= E

[[µ(Tk−Tk−1)]me−µ(Tk−Tk−1)

m!

]

= E

[(µXk)

me−µXk

m!

]=

∫ ∞

0

(µx)me−µx

m!·λe−λx dx

=µmλ

(λ + µ)m!

∫ ∞

0xm · (λ + µ)e−(λ+µ)x dx

︸ ︷︷ ︸mth moment of exp(λ + µ) density

=µmλ

(λ + µ)m!· m!

(λ + µ)m=

λ

λ + µ

λ + µ

)m,

which is a geometric0(µ/(λ + µ)) pmf in m.

14. It suffices to show that the probability generating functionGMt (z) has the form eµ(z−1)

for some µ > 0. We use the law of total probability, substitution, and independence

to write

GMt (z) = E[zMt ] = E[z∑Nti=1Yi ] =

∑n=0

E[z∑Nti=1Yi |Nt = n]P(Nt = n)

=∞

∑n=0

E[z∑ni=1Yi |Nt = n]P(Nt = n) =

∑n=0

E[z∑ni=1Yi ]P(Nt = n)

=∞

∑n=0

(n

∏i=1

E[zYi ]

)P(Nt = n) =

∑n=0

[pz+(1− p)]nP(Nt = n)

= GNt(pz+(1− p)

)= eλ t([pz+(1−p)]−1) = epλ t(z−1).

Thus, Mt ∼ Poisson(pλ t).

Page 193: Gubner

192 Chapter 11 Problem Solutions

15. Let Mt = ∑Nti=1Vi denote the total energy throught time t, with Mt = 0 for Nt = 0. We

use the law of total probability, substitution, and independence to write

E[Mt ] =∞

∑n=0

E[Mt |Nt = n]P(Nt = n) =∞

∑n=1

E

[ Nt

∑i=1

Vi

∣∣∣∣Nt = n

]P(Nt = n)

=∞

∑n=1

E

[n

∑i=1

Vi

∣∣∣∣Nt = n

]P(Nt = n) =

∑n=1

(n

∑i=1

E[Vi]

)P(Nt = n)

=∞

∑n=1

nE[V1]P(Nt = n) = E[V1]∞

∑n=1

nP(Nt = n) = E[V1]E[Nt ] = E[V1](λ t).

The average time between lightning strikes is 1/λ minutes.

16. We have

λ = 5.170± 2.132(1.96)

10= 5.170±0.418 with 95% probability.

In other words, λ ∈ [4.752,5.588] with 95% probability.

17. To begin, observe that

Y =∞

∑i=1

g(Ti) =∞

∑i=1

n

∑k=1

gkI(tk−1,tk](Ti) =n

∑k=1

gk

∑i=1

I(tk−1,tk](Ti) =n

∑k=1

gk(Ntk −Ntk−1)

is a sum of independent random variables. Then

E[Y ] =n

∑k=1

gkE[Ntk −Ntk−1 ] =n

∑k=1

gkλ (tk− tk−1) =∫ ∞

0g(τ)λ dτ,

since g is piecewise constant. Next,

ϕY (ν) = E[e jνY ] = E[e jν ∑nk=1 gk(Ntk−Ntk−1 )] =n

∏k=1

E[e jνgk(Ntk−Ntk−1 )]

=n

∏k=1

exp[λ (tk− tk−1)(e jνgk −1)

]= exp

[n

∑k=1

λ (tk− tk−1)(e jνgk −1)

]

= exp

[∫ ∞

0

(e jνg(τ)−1

)λ dτ

],

since e jνg(τ)− 1 is piecewise constant with values e jνgk − 1 (or the value zero if τdoes not lie in any (tk−1, tk]). We now compute the correlation,

E[YZ] = E

[(n

∑k=1

gk(Ntk −Ntk−1))(

n

∑l=1

hl(Ntl −Ntl−1))]

= ∑k 6=lgkhlE[(Ntk −Ntk−1)(Ntl −Ntl−1)]+∑

k

gkhkE[(Ntk −Ntk−1)2]

= ∑k 6=lgkhlλ

2(tk− tk−1)(tl− tl−1)+∑k

gkhk[λ (tk− tk−1)+λ 2(tk− tk−1)2]

Page 194: Gubner

Chapter 11 Problem Solutions 193

= ∑k,l

gkhlλ2(tk− tk−1)(tl− tl−1)+∑

k

gkhkλ (tk− tk−1)

=

[∑k

gkλ (tk− tk−1)][

∑l

hlλ (tl− tl−1)]+∑

k

gkhkλ (tk− tk−1)

=∫ ∞

0g(τ)λ dτ

∫ ∞

0h(τ)λ dτ +

∫ ∞

0g(τ)h(τ)λ dτ.

Hence,

cov(Y,Z) = E[YZ]−E[Y ]E[Z] =∫ ∞

0g(τ)h(τ)λ dτ.

18. The key is to use g(τ) = h(t − τ) in the preceding problem. It then immediately

follows that

E[Yt ] =∫ ∞

0h(t− τ)λ dτ, ϕYt (ν) = exp

[∫ ∞

0

(e jνh(t−τ)−1

)λ dτ

],

and

cov(Yt ,Ys) =∫ ∞

0h(t− τ)h(s− τ)λ dτ.

19. MATLAB. Replace the line

X = -log(rand(1))/lambda; % Generate exp(lambda) RV

with

X = randn(1)ˆ2; % Generate chi-squared RV

20. If Nt is a Poisson process of rate λ , then FX is the exp(λ ) cdf. Hence,

P(X1 < Y1) =∫ ∞

0FX (y) fY (y)dy =

∫ ∞

0[1− e−λy] fY (y)dy = 1−MY (−λ ).

IfMt is a Poisson process of rate µ , then MY is the exp(µ) mgf, and

P(X1 < Y1) = 1− µ

µ− (−λ )= 1− µ

µ +λ=

λ

λ + µ.

21. Since Tn := X1 + · · ·+Xn is the sum of i.i.d. random variables, var(Tn) = nvar(X1).Since X1 ∼ uniform[0,1], var(X1) = 1/12, and var(Tn) = n/12.

22. In the case of a Poisson process, Tk is Erlang(k,λ ). Hence,

∑k=1

Fk(t) =∞

∑k=1

[1−

k−1∑l=0

(λ t)le−λ t

l!

]=

∑k=1

[ ∞

∑l=k

(λ t)le−λ t

l!

]

=∞

∑k=1

[ ∞

∑l=0

(λ t)le−λ t

l!I[k,∞)(l)

]=

∑l=0

(λ t)le−λ t

l!

[ ∞

∑k=1

I[k,∞)(l)

]

=∞

∑l=0

l(λ t)le−λ t

l!= E[Nt ] = λ t.

Page 195: Gubner

194 Chapter 11 Problem Solutions

23. To begin, write

E[Nt |X1 = x] = E

[ ∞

∑n=1

I[0,t](Tn)

∣∣∣∣X1 = x

]=

∑n=1

E[I[0,t](X1 + · · ·+Xn)

∣∣X1 = x]

=∞

∑n=1

E[I[0,t](x+X2 + · · ·+Xn)

∣∣X1 = x].

(a) If x > t, then x+X2 + · · ·+Xn > t, and so I[0,t](x+X2 + · · ·+Xn) = 0. Thus,

E[Nt |X1 = x] = 0 for x> t.

(b) First, for n= 1 and x ≤ t, I[0,t](x) = 1. Next, for n≥ 2 and x ≤ t, I[0,t](x+X2 +· · ·+Xn) = I[0,t−x](X2 + · · ·+Xn). So,

E[Nt |X1 = x] = 1+∞

∑n=2

E[I[0,t−x](X2 + · · ·+Xn)

]

= 1+∞

∑n=1

E[I[0,t−x](X1 + · · ·+Xn)

]

= 1+E[Nt−x],

where we have used fact that the Xi are i.i.d.

(c) By the law of total probability,

E[Nt ] =∫ ∞

0E[Nt |X1 = x] f (x)dx

=∫ t

0E[Nt |X1 = x] f (x)dx+

∫ ∞

tE[Nt |X1 = x]︸ ︷︷ ︸

= 0 for x>t

f (x)dx

=∫ t

0

(1+E[Nt−x]

)f (x)dx = F(t)+

∫ t

0E[Nt−x] f (x)dx.

24. With the understanding that m(t) = 0 for t < 0, we can write the renewal equation as

m(t) = F(t)+∫ ∞

0m(t− x) f (x)dx,

where the last term is a convolution. Hence, taking the Laplace transform of the

renewal equation yields

M(s) :=∫ ∞

0m(t)est dt =

∫ ∞

0F(t)est dt+M(s)MX (s).

Using integration by parts, we have

∫ ∞

0F(t)est dt =

F(t)est

s

∣∣∣∣∞

0

− 1

s

∫ ∞

0f (t)est dt.

Thus,

M(s) = −1

sMX (s)+M(s)MX(s),

Page 196: Gubner

Chapter 11 Problem Solutions 195

which we can rearrange as

M(s)[1−MX (s)] = −1

sMX (s),

or

M(s) = −1

s· MX (s)

1−MX (s)= −1

s· λ/(λ − s)1−λ/(λ − s) = −1

s· λ

−s =λ

s2.

It follows that m(t) = λ tu(t).

25. For 0≤ s< t < ∞, write

E[VtVs] = E

[∫ t

0Xτ dτ

∫ s

0Xθ dθ

]=

∫ s

0

(∫ t

0E[XτXθ ]dτ

)dθ

=∫ s

0

(∫ t

0RX (τ−θ)dτ

)dθ = σ2

∫ s

0

(∫ t

0δ (τ−θ)dτ

)dθ

= σ2∫ s

0dθ = σ2s.

26. For 0≤ s< t < ∞, write

E[WtWs] = E[(Wt −Ws)Ws]+E[W 2s ]

= E[(Wt −Ws)(Ws−W0)]+σ2s

= E[Wt −Ws]E[Ws−W0]+σ2s = 0 ·0+σ2s = σ2s.

27. For t2 > t1, write

E[Yt1Yt2 ] = E[eWt1 eWt2 ] = E[eWt2−Wt1 e2Wt1 ] = E[eWt2−Wt1 e2(Wt1−W0)]

= E[eWt2−Wt1 ]E[e2(Wt1−W0)] = eσ2(t2−t1)/2e4σ2t1/2 = eσ2(t2+3t1)/2.

28. Since cov(Wti ,Wt j) = σ2min(ti, t j),

cov(X) = σ2

t1 t1 t1 · · · t1t1 t2 t2 · · · t2t1 t2 t3 · · · t3...

....... . .

...

t1 t2 t3 tn

.

29. Let 0≤ t1 < · · ·< tn+1 < ∞ and 0≤ s1 < · · ·< sm+1 < ∞, and suppose that

g(τ) =n

∑i=1

giI(ti,ti+1](τ) and h(τ) =m

∑j=1

h jI(s j ,s j+1](τ).

Denote the distinct points of ti∪s j in increasing order by θ1 < · · ·< θp. Then

g(τ) =p

∑k=1

gkI(θk,θk+1](τ) and h(τ) =p

∑k=1

hkI(θk,θk+1](τ),

Page 197: Gubner

196 Chapter 11 Problem Solutions

where the gk are taken from the gi, and the hk are taken from the h j. We can now

write

∫ ∞

0g(τ)dWτ +

∫ ∞

0h(τ)dWτ =

p

∑k=1

gk(Wθk+1−Wθk)+

p

∑k=1

hk(Wθk+1−Wθk)

=p

∑k=1

[gk+ hk](Wθk+1−Wθk)

=∫ ∞

0[g(τ)+h(τ)]dWτ .

30. Following the hint, we first write

E

[(∫ ∞

0g(τ)dWτ −

∫ ∞

0h(τ)dWτ

)2]= E

[(∫ ∞

0g(τ)dWτ

)2]

−2E

[∫ ∞

0g(τ)dWτ

∫ ∞

0h(τ)dWτ

]

+E

[(∫ ∞

0h(τ)dWτ

)2]

= σ2∫ ∞

0g(τ)2 dτ +σ2

∫ ∞

0h(τ)2 dτ

−2E

[∫ ∞

0g(τ)dWτ

∫ ∞

0h(τ)dWτ

].

Second, we write

E

[(∫ ∞

0g(τ)dWτ −

∫ ∞

0h(τ)dWτ

)2]= E

[(∫ ∞

0[g(τ)−h(τ)]dWτ

)2]

= σ2∫ ∞

0[g(τ)−h(τ)]2 dτ

= σ2∫ ∞

0g(τ)2 dτ +σ2

∫ ∞

0h(τ)2 dτ

−2σ2∫ ∞

0g(τ)h(τ)dτ.

Comparing these two formulas, we see that

E

[∫ ∞

0g(τ)dWτ

∫ ∞

0h(τ)dWτ

]= σ2

∫ ∞

0g(τ)h(τ)dτ.

31. (a) Following the hint,

Yt :=∫ t

0g(τ)dWτ =

∫ ∞

0g(τ)I[0,t](τ)dWτ ,

it follows that

E[Y 2t ] = σ2∫ ∞

0[g(τ)I[0,t](τ)]2 dτ = σ2

∫ t

0g(τ)2 dτ.

Page 198: Gubner

Chapter 11 Problem Solutions 197

(b) For t1, t2 ≥ 0, write

E[Yt1Yt2 ] = E

[∫ ∞

0g(τ)I[0,t1](τ)dWτ

∫ ∞

0g(τ)I[0,t2](τ)dWτ

]

= σ2∫ ∞

0

g(τ)I[0,t1](τ)

g(τ)I[0,t2](τ)

= σ2∫ ∞

0g(τ)2I[0,t1](τ)I[0,t2](τ)dτ = σ2

∫ min(t1,t2)

0g(τ)2 dτ.

32. By independence of V and the Wiener process along with the result of the previous

problem,

RY (t1, t2) = e−λ (t1+t2)E[V 2]+σ2e−λ (t1+t2)∫ min(t1,t2)

0e2λτ dτ.

If t1 ≤ t2, this last integral is equal to∫ t1

0e2λτ dτ =

1

2λ(e2λ t1 −1),

and so

E[Yt1Yt2 ] = e−λ (t1+t2)(q2− σ2

)+

σ2

2λe−λ (t2−t1).

Similarly, if t2 < t1,

E[Yt1Yt2 ] = e−λ (t1+t2)(q2− σ2

)+

σ2

2λe−λ (t1−t2).

In either case, we can write

E[Yt1Yt2 ] = e−λ (t1+t2)(q2− σ2

)+

σ2

2λe−λ |t1−t2|.

33. Write

E[Yt1Yt2 ] =e−λ (t1+t2)

2λE[W

e2λ t1We2λ t2 ] =e−λ (t1+t2)

2λ·σ2min(e2λ t1 ,e2λ t2).

For t1 ≤ t2, this reduces to

e−λ (t1+t2)

2λ·σ2e2λ t1 =

σ2

2λe−λ (t2−t1).

If t2 < t1, we have

e−λ (t1+t2)

2λ·σ2e2λ t2 =

σ2

2λe−λ (t1−t2).

We conclude that

E[Yt1Yt2 ] =σ2

2λe−λ |t1−t2|.

Page 199: Gubner

198 Chapter 11 Problem Solutions

34. (a) P(t) := E[Y 2t ] = E

[(∫ t

0g(τ)dWτ

)2]=

∫ t

0g(τ)2 dτ .

(b) If g(t) is never zero, then for 0≤ t1 < t2 < ∞,

P(t2)−P(t1) =∫ t2

t1

g(τ)2 dτ > 0.

Thus, P(t1) < P(t2).

(c) First,

E[Xt ] = E[YP−1(t)] = E

[∫ P−1(t)

0g(τ)dWτ

]= 0

since Wiener integrals have zero mean. Second,

E[X2t ] = E

[(∫ P−1(t)

0g(τ)dWτ

)2]=

∫ P−1(t)

0g(τ)2 dτ = P(P−1(t)) = t.

35. (a) For t > 0, E[W 2t ] = E[(Wt −W0)

2] = t.

(b) For s< 0, E[W 2s ] = E[(W0−Ws)2] =−s.

(c) From parts (a) and (b) we see that no matter what the sign of t, E[W 2t ] = |t|.

Whether t > s or s< t, we can write

|t− s| = E[(Wt −Ws)2] = E[W 2t ]−2E[WtWs]+E[W 2

s ],

and so

E[WtWs] =E[W 2

t ]+E[W 2s ]−|t− s|

2=|t|+ |s|− |t− s|

2.

36. (a) Write

P(X = xk) = P((X ,Y ) ∈ xk× IR) = ∑i

∫ ∞

−∞Ixk(xi)

= 1︷ ︸︸ ︷IIR(y) fXY (xi,y)dy

=∫ ∞

−∞

[∑i

Ixk(xi) fXY (xi,y)

]dy =

∫ ∞

−∞fXY (xk,y)dy.

(b) Write

P(Y ∈C) = P((X ,Y ) ∈ IR×C

)= ∑

i

∫ ∞

−∞IIR(xi)IC(y) fXY (xi,y)dy

=∫ ∞

−∞IC(y)

[∑i

fXY (xi,y)

]dy =

C

[∑i

fXY (xi,y)

]dy.

(c) First write

P(Y ∈C|X = xk) =P(X = xk,Y ∈C)

P(X = xk)=

P((X ,Y ) ∈ xk×C)

P(X = xk).

Page 200: Gubner

Chapter 11 Problem Solutions 199

Then since

P((X ,Y ) ∈ xk×C) = ∑i

∫ ∞

−∞Ixk(xi)IC(y) fXY (xi,y)dy =

CfXY (xk,y)dy,

we have

P(Y ∈C|X = xk) =

CfXY (xk,y)dy

pX (xk)=

C

[fXY (xk,y)

pX (xk)

]dy.

(d) Write

∫ ∞

−∞P(X ∈ B|Y = y) fY (y)dy =

∫ ∞

−∞

[∑i

IB(xi)pX |Y (xi|y)]fY (y)dy

= ∑i

IB(xi)∫ ∞

−∞

fXY (xi,y)

fY (y)fY (y)dy

= ∑i

IB(xi)∫ ∞

−∞fXY (xi,y)dy

= ∑i

IB(xi)∫ ∞

−∞IIR(y) fXY (xi,y)dy

= ∑i

∫ ∞

−∞IB×IR(xi,y) fXY (xi,y)dy

= P((X ,Y ) ∈ B× IR)

= P(X ∈ B,Y ∈ IR) = P(X ∈ B).

37. The key observations are that since F is nondecreasing,

F−1(U)≤ x ⇒ U ≤ F(x),

and since F−1 is nondecreasing,

U ≤ F(x) ⇒ F−1(U)≤ x.

Hence, F−1(U)≤ x= U ≤ F(x), and with X := F−1(U) we can write

P(X ≤ x) = P(F−1(U)≤ x) = P(U ≤ F(x)) =∫ F(x)

01du = F(x).

38. (a) The cdf is

x

F(x)

1/4

1/2

3/4

1

210

x/2

12

Page 201: Gubner

200 Chapter 11 Problem Solutions

(b) For 1/2≤ u< 1, Bu = [2u,∞), and G(u) = 2u. For 1/4< u< 1/2, Bu = [1,∞),and G(u) = 1. For u = 1/4, Bu = [1/2,∞), and G(u) = 1/2. For 0 ≤ u < 1/4,Bu = [

√u,∞), and G(u) =

√u. Hence,

u

G(u)

1

2

1

0

2u

1/2

12

14

34

39. Suppose G(u) ≤ x. Since F is nondecreasing, F(G(u)) ≤ F(x). By the definition ofG(u), F(G(u))≥ u. Thus, F(x)≥ F(G(u))≥ u. Now suppose u≤ F(x). Then by thedefinition of G(u), G(u)≤ x.

40. With X := G(U), write

P(X ≤ x) = P(G(U)≤ x) = P(U ≤ F(x)) =∫ F(x)

01du = F(x).

41. To compute G(u), we used the MATLAB function

function x = G(u)

i1 = find(u <= .25);

i2 = find(.25 < u & u <= .5);

i3 = find(.5 < u & u <= 1);

x(i1) = sqrt(u(i1));

x(i2) = 1;

x(i3) = 2*u(i3);

We obtained the histogram

0 0.5 1 20

1

2

3

This is explained by noting that the density corresponding to F(x) is impulsive:

Page 202: Gubner

Chapter 11 Problem Solutions 201

x

f (x)

1/4

1/2

3/4

1

210 12

42. For an integer-valued process,

µm,n(B) = ∑jm,..., jn

IB( jm, . . . , jn)pm,n( jm, . . . , jn).

If B= im× · · ·×in, then

µm,n+1(B× IR) = ∑jm,..., jn, j

IB×IR( jm, . . . , jn, j)pm,n+1( jm, . . . , jn, j)

= ∑j

pm,n+1(im, . . . , in, j),

µm−1,n(IR×B) = ∑j, jm,..., jn

IIR×B( j, jm, . . . , jn)pm−1,n( j, jm, . . . , jn)

= ∑j

pm−1,n( j, im, . . . , in),

and µm,n(B) = pm,n(im, . . . , in). Hence, if the conditions

µm,n+1(B× IR) = µm,n(B) and µm−1,n(IR×B) = µm,n(B)

hold for all B and we take B= im× · · ·×in, then we obtain

∑j

pm,n+1(im, . . . , in, j) = pm,n(im, . . . , in)

and

∑j

pm−1,n( j, im, . . . , in) = pm,n(im, . . . , in).

Conversely, suppose these two formulas hold. Fix an arbitrary B ⊂ IRn−m+1. If we

multiply both formulass by IB(im, . . . , in) and sum over all im, . . . , in, we obtain the

original consistency conditions.

43. For the first condition, write

∑j

pm,n+1(im, . . . , in, j) = ∑j

q(im)r(im+1|im) · · ·r(in|in−1)r( j|in)

= q(im)r(im+1|im) · · ·r(in|in−1)︸ ︷︷ ︸= pm,n(im,...,in)

∑j

r( j|in)︸ ︷︷ ︸

= 1

.

Page 203: Gubner

202 Chapter 11 Problem Solutions

For the second condition, write

∑j

pm−1,n( j, im, . . . , in) = ∑j

q( j)r(im| j)r(im+1|im) · · ·r(in|in−1)

=

(∑j

q( j)r(im| j))

︸ ︷︷ ︸= q(im)

r(im+1|im) · · ·r(in|in−1)

︸ ︷︷ ︸= pm,n(im,...,in)

.

44. If

µn(Bn) = µn+1(Bn× IR)

=

∫ ∞

−∞· · ·

∫ ∞

−∞IBn(x1, . . . ,xn)IIR(y) fn+1(x1, . . . ,xn,y)dydxn · · ·dx1

=∫· · ·

Bn

[∫ ∞

−∞fn+1(x1, . . . ,xn,y)dy

]dxn · · ·dx1,

then necessarily the quantity in square brackets is the joint density fn(x1, . . . ,xn).Conversely, if the quantity in square brackets is equal to fn(x1, . . . ,xn), we can write

µn+1(Bn× IR) =∫ ∞

−∞· · ·

∫ ∞

−∞IBn(x1, . . . ,xn)IIR(y) fn+1(x1, . . . ,xn,y)dydxn · · ·dx1

=∫· · ·

Bn

[∫ ∞

−∞fn+1(x1, . . . ,xn,y)dy

]dxn · · ·dx1

=∫· · ·

Bn

fn(x1, . . . ,xn)dxn · · ·dx1 = µn(Bn).

45. The key is to write

µt1,...,tn+1(Bn,k)

=∫ ∞

−∞· · ·

∫ ∞

−∞IBn(x1, . . . ,xk−1,xk+1, . . . ,xn)IIR(xk) ft1,...,tn+1

(x1, . . . ,xn+1)dxn+1 · · ·dx1

=∫· · ·

Bn

[∫ ∞

−∞ft1,...,tn+1

(x1, . . . ,xn+1)dxk

]dxn+1 · · ·dxk+1dxk−1 · · ·dx1.

Then if µt1,...,tn+1(Bn,k) = µn(Bn), we must have

ft1,...,tk−1,tk+1,...,tn(x1, . . . ,xk−1,xk+1, . . . ,xn) =∫ ∞

−∞ft1,...,tn+1

(x1, . . . ,xn+1)dxk

and conversely.

46. By the hint, [Wt1 , . . . ,Wtn ]′ is a linear transformation of a vector of independent Gaus-

sian increments, which is a Gaussian vector. Hence, [Wt1 , . . . ,Wtn ]′ is a Gaussian

vector. Since n and the times t1 < · · ·< tn are arbitrary,Wt is a Gaussian process.

Page 204: Gubner

Chapter 11 Problem Solutions 203

47. Let X = [Wt1 −W0, . . . ,Wtn −Wtn−1 ]′, and let Y = [Wt1 , . . . ,Wtn ]′. Then Y = AX , where

A denotes the matrix given in the statement of Problem 46. Since the components of

X are independent,

fX (x) =n

∏i=1

e−x2i /[2(ti−ti−1)]

√2π(ti− ti−1)

,

where it is understood that t0 := 0. Using the example suggested in the hint, we have

fY (y) =fX (x)

|detA|

∣∣∣∣x=A−1y

.

Fortunately, detA= 1, and by definition, Xi =Wti −Wti−1 . Hence,

ft1,...,tn(w1, . . . ,wn) =n

∏i=1

e−(wi−wi−1)2/[2(ti−ti−1)]√2π(ti− ti−1)

,

where it is understood that w0 := 0.

48. By the hint, it suffices to show that C is positive semidefinite. Write

a′Ca =n

∑i=1

n

∑k=1

aiakCik =n

∑i=1

n

∑k=1

aiakR(ti− tk)

=n

∑i=1

n

∑k=1

aiak

∫ ∞

−∞S( f )e j2π f (ti−tk) d f

=∫ ∞

−∞S( f )

∣∣∣∣n

∑i=1

aiej2π f ti

∣∣∣∣2

d f ≥ 0.

Page 205: Gubner

CHAPTER 12

Problem Solutions

1. If P(A|X = i,Y = j,Z = k) = h(i), then

P(A,X = i,Y = j,Z = k) = P(A|X = i,Y = j,Z = k)P(X = i,Y = j,Z = k)

= h(i)P(X = i,Y = j,Z = k).

Summing over j yields

P(A,X = i,Z = k) = h(i)P(X = i,Z = k), (∗)

and thus h(i) = P(A|X = i,Z = k). Next, summing (∗) over k yields

P(A,X = i) = h(i)P(X = i),

and then h(i) = P(A|X = i) as well.

2. Write

X1 = g(X0,Z1)

X2 = g(X1,Z2) = g(g(X0,Z1),Z2)

X3 = g(X2,Z3) = g(g(g(X0,Z1),Z2),Z3)

...

In general, Xn is a function of X0,Z1, . . . ,Zn, and thus (X0, . . . ,Xn) is a function of

(X0,Z1, . . . ,Zn), which is independent of Zn+1. Now observe that

P(Xn+1 = in+1|Xn = in, . . . ,X0 = i0) = P(g(Xn,Zn+1) = in+1|Xn = in, . . . ,X0 = i0)

= P(g(in,Zn+1) = in+1|Xn = in, . . . ,X0 = i0)

= P(g(in,Zn+1) = in+1),

where we have used substitution and independence. Since

P(Xn+1 = in+1|Xn = in, . . . ,X0 = i0)

does not depend on in−1, . . . , i0, Xn is a Markov chain.

3. Write

P(A∩B|C) =P(A∩B∩C)

P(C)=

P(A∩B∩C)

P(B∩C)· P(B∩C)

P(C)= P(A|B∩C)P(B|C).

204

Page 206: Gubner

Chapter 12 Problem Solutions 205

4. Write

P(X0 = i,X1 = j,X2 = k,X3 = l)

= P(X3 = l|X2 = k,X1 = j,X0 = i)

·P(X2 = k|X1 = j,X0 = i)P(X1 = j|X0 = i)P(X0 = i)

= pkl p jkpi jνi.

5. The first equation we write is

π0 = ∑k

πkpk0 as π0 = π0p00 +π1p10 = π0(1−a)+π1b.

This tells us that π1 = (a/b)π0. Then we use the fact that π0 + π1 = 1 to write π0 +(a/b)π0 = 1, or π0 = 1/(1+a/b) = b/(a+b). We also have π1 = (a/b)π0 = a/(a+b).

6. The state transition diagram is

1/2

1/2

1/2

1/2

1/4

3/4

1 2

0

The stationary distribution is π0 = 5/12, π1 = 1/3, π2 = 1/4.

7. The state transition diagram is

1/2

1/2

1/4

3/4

1 2

0

1/4

3/4

The stationary distribution is π0 = 1/5, π1 = 7/10, π2 = 1/10.

Page 207: Gubner

206 Chapter 12 Problem Solutions

8. The state transition diagram is

1/21/2

1/10

9/10

1/2

1/2

1/10

9/100

23

1

The stationary distribution is π0 = 9/28, π1 = 5/28, π2 = 5/28, π3 = 9/28.

9. The first equation is

π0 = π0p00 +π1p10 = π0(1−a)+π1b,

which tells us that π1 = (a/b)π0. The next equation is

π1 = π0p01 +π1p11 +π2p21 = π0a+π1(1− [a+b])+π2b,

which tells us that

(a+b)π1 = π0a+π2b or (a+b)(a/b)π0 = π0a+π2b,

from which it follows that π2 = (a/b)2π0. Now suppose that πi = (a/b)iπ0 holds fori= 0, . . . , j < N. Then from

π j = π j−1p j−1, j+π jp j j+π j+1p j+1, j = π j−1a+π j(1− [a+b])+π j+1b,

we obtain

(a+b)π j = π j−1a+π j+1b or (a+b)(a/b) jπ0 = (a/b) j−1a+π j+1b,

from which it follows that π j+1 = (a/b) j+1π0. To find π0, we use the fact that π0 +· · ·+πN = 1. Using the finite geometric series formula, we have

1 =N

∑j=0

(a/b) jπ0 = π01− (a/b)N+1

1−a/b .

Hence,

π j =1−a/b

1− (a/b)N+1

(ab

) j, j = 0, . . . ,N, a 6= b.

If a= b, then π j = π0 for j = 0, . . . ,N implies π j = 1/(N+1).

Page 208: Gubner

Chapter 12 Problem Solutions 207

10. Let π and π denote the two solutions in the example. For 0≤ λ ≤ 1, put

π j := λπ j+(1−λ )π j ≥ 0.

Then

∑j

π j = ∑j

[λπ j+(1−λ )π j

]= λ ∑

j

π j+(1−λ )∑j

π j = λ ·1+(1−λ ) ·1= 1.

Hence, π is a valid pmf. Finally, note that

∑i

πipi j = ∑i

[λπi+(1−λ )πi]pi j = λ ∑i

πipi j+(1−λ )∑i

πipi j

= λπ j+(1−λ )π j = π j.

11. MATLAB. OMITTED.

12. Write

E[T1( j)|X0 = i] =∞

∑k=1

kP(T1( j) = k|X0 = i)+∞ ·P(T1( j) = ∞|X0 = i)

=∞

∑k=1

k f(k)i j +∞(1− fi j).

13. (a) First note that T1 = 2= X2 = j,X1 6= j and that

T2 = 5 = X5 = j∩[X4 = j,X3 6= j,X2 6= j,X1 6= j∪X4 6= j,X3 = j,X2 6= j,X1 6= j∪X4 6= j,X3 6= j,X2 = j,X1 6= j∪X4 6= j,X3 6= j,X2 6= j,X1 = j

].

Then

T2 = 5∩T1 = 2 = X5 = j,X4 6= j,X3 6= j,X2 = j,X1 6= j.

It now follows that

P(T2 = 5|T1 = 2,X0 = i) = P(X5 = j,X4 6= j,X3 6= j|X2 = j,X1 6= j,X0 = i).

(b) Write

P(X5 = j,X4 6= j,X3 6= j,X2 = j,X1 6= j,X0 = i)

as

P

(⋃

l 6= j

X5 = j,X4 6= j,X3 6= j,X2 = j,X1 = l,X0 = i)

,

which is equal to

∑l 6= j

P(X5 = j,X4 6= j,X3 6= j|X2 = j,X1 = l,X0 = i)P(X2 = j,X1 = l,X0 = i).

Page 209: Gubner

208 Chapter 12 Problem Solutions

By the Markov property, this simplifies to

∑l 6= j

P(X5 = j,X4 6= j,X3 6= j|X2 = j)P(X2 = j,X1 = l,X0 = i),

which is just

P(X5 = j,X4 6= j,X3 6= j|X2 = j)P(X2 = j,X1 6= j,X0 = i).

It now follows that P(X5 = j,X4 6= j,X3 6= j|X2 = j,X1 6= j,X0 = i) is equal to

P(X5 = j,X4 6= j,X3 6= j|X2 = j).

(c) Write

P(X5 = j,X4 6= j,X3 6= j|X2 = j) = ∑k 6= j,l 6= j

P(X5 = j,X4 = k,X3 = l|X2 = j)

= ∑k 6= j,l 6= j

P(X3 = j,X2 = k,X1 = l|X0 = j)

= P(X3 = j,X2 6= j,X1 6= j|X0 = j).

14. Write

P(V ( j) = ∞|X0 = i) = P

( ∞⋂

L=1

V ( j)≥ L∣∣∣∣X0 = i

)

= limM→∞

P

( M⋂

L=1

V ( j)≥ L∣∣∣∣X0 = i

), limit property of P,

= limM→∞

P(V ( j)≥M|X0 = i), decreasing events.

15. First write

E[V ( j)|X0 = i] = E

[ ∞

∑n=1

I j(Xn)

∣∣∣∣X0 = i

]=

∑n=1

E[I j(Xn)|X0 = i]

=∞

∑n=1

P(Xn = j|X0 = i) =∞

∑n=1

p(n)i j . (∗)

Next,

P(V ( j) = ∞|X0 = i) = limL→∞

P(V ( j)≥ L|X0 = i), from preceding problem,

= limL→∞

fi j( f j j)L−1, from the text.

Now, if j is recurrent ( f j j = 1), the above limit is fi j. If fi j > 0, then V ( j) = ∞ with

positive probability, and so the expectation on the left in (∗) must be infinite, andhence, so is the sum on the far right in (∗). If fi j = 0, then for any n = 1,2, . . . , wecan write

0 = fi j := P(T1( j) < ∞|Xi = 0) ≥ P

( ∞⋃

m=1

Xm = j∣∣∣∣Xi = 0

)

≥ P(Xn = j|X0 = i) = p(n)i j ,

Page 210: Gubner

Chapter 12 Problem Solutions 209

and it follows that ∑∞n=1 p

(n)i j = 0.

In the transient case ( f j j < 1), we have from the text that

P(V ( j)≥ L|X0 = i) = fi j( f j j)L−1.

Using the preceding problem along with f j j < 1, we have

P(V ( j) = ∞|X0 = i) = limL→∞

fi j( f j j)L−1 = 0.

Now, we also have from the text that

P(V ( j) = L|X0 = i) =

fi j( f j j)

L−1(1− f j j), L≥ 1,1− fi j, L= 0,

where we use the fact that the number of visits to state j is zero if and only if we never

visit state j, and this happens if and only if T1( j) = ∞. We can therefore write

E[V ( j)|X0 = i] =∞

∑L=0

L ·P(V ( j) = L|X0 = i) =fi j

1− f j j< ∞,

and it follows that the last sum in (∗) is finite too.16. From mλ −1≤ bmλc ≤ mλ ,

1

mλ≤ 1

bmλc ≤1

mλ −1.

Then1

λ≤ m

bmλc ≤1

λ −1/m→ 1

λ.

17. (a) Write h( j) = IS( j) = ∑nl=1 Isl( j). Then

limm→∞

1

m

m

∑k=1

h(Xk) = limm→∞

1

m

m

∑k=1

[n

∑l=1

Isl(Xk)

]=

n

∑l=1

limm→∞

1

m

m

∑k=1

Isl(Xk)

=n

∑l=1

limm→∞

Vm(sl)/m =n

∑l=1

πsl , by Theorems 3 and 4,

= ∑j

IS( j)π j = ∑j

h( j)π j.

(b) With h( j) = ∑nl=1 clISl ( j), where each Sl is a finite subset of states, we can write

limm→∞

1

m

m

∑k=1

h(Xk) = limm→∞

1

m

m

∑k=1

[n

∑l=1

clISl (Xk)

]=

n

∑l=1

cl limm→∞

1

m

m

∑k=1

ISl (Xk)

=n

∑l=1

cl

[∑j

ISl ( j)π j

], by part (a)

= ∑j

[n

∑l=1

clISl ( j)

]π j = ∑

j

h( j)π j.

Page 211: Gubner

210 Chapter 12 Problem Solutions

(c) If h( j) = 0 for all but finitely many states, then there are at most finitely many

distinct nonzero values that h( j) can take, say c1, . . . ,cn. Put Sl := j : h( j) =cl. By the assumption about h, each Sl is a finite set. Furthermore, we can

write

h( j) =n

∑l=1

clISl ( j).

Hence, the desired result is immediate by part (b).

18. MATLAB. OMITTED.

19. MATLAB. OMITTED.

20. Suppose k ∈ Ai ∩A j. Since k ∈ Ai, i↔ k. Since k ∈ A j, j↔ k. Hence, i↔ j. Now,

if l ∈ Ai, l↔ i↔ j and we see that l ∈ A j. Similarly, if l ∈ A j, l↔ j↔ i and we see

that l ∈ Ai. Thus, Ai = A j.

21. Yes. As pointed out in the discussion at the end of the section, by combining Theo-

rem 8 and Theorem 6, the chain has a unique stationary distribution. Now, by The-

orem 4, all states are positive recurrent, which by definition means that the expected

time to return to a state is finite.

22. Write

gi,i+n = lim∆t↓0

(λ∆t)ne−λ∆t

∆t ·n! =λ

n!lim∆t↓0

(λ∆t)n−1e−λ∆t = 0, for n≥ 2.

23. The state transition diagram is

5

4

1

2

3

1 2

0

The stationary distribution is π0 = 11/15, π1 = 1/5, π2 = 1/15.

24. MATLAB. Change the line A=P-In; to A=P;.

25. The forward equation is

p′i j(t) = ∑k

pik(t)gk j =j+1

∑k= j−1

pikgk j

= pi, j−1(t)λ j−1− pi j(t)[λ j+ µ j]+ pi, j+1(t)µ j+1.

Page 212: Gubner

Chapter 12 Problem Solutions 211

The backward equation is

p′i j(t) = ∑k

gikpk j(t) =i+1

∑k=i−1

gikpi j(t)

= µipi−1, j(t)− (λi+ µi)pii(t)+λipi+1, j(t).

The chain is conservative because gi,i−1+gi,i+1 = µi+λi =−[−(λi+µi)] =−gii < ∞.

26. We use the formula 0 = ∑k πkgk j. For j = 0, we have

0 = ∑k

πkgk0 = π0(−λ0)+π1µ1,

which implies π1 = (λ0/µ1)π0. For j = 1, we have

0 = ∑k

πkgk1 = π0λ0 +π1[−(λ1 + µ1)]+π2µ2

= π0λ0 +(λ0/µ1)π0[−(λ1 + µ1)]+π2µ2 = π0(λ0λ1/µ1)+π2µ2,

which implies π2 = π0λ0λ1/(µ1µ2). Now suppose that πi = π0λ0 · · ·λi−1/(µ1 · · ·µi)for i= 1, . . . , j. Then from

0 = ∑k

πkgk j = π j−1λ j−1−π j(λ j+ µ j)+π j+1µ j+1

=λ0 · · ·λ j−2µ1 · · ·µ j−1

π0λ j−1−λ0 · · ·λ j−1µ1 · · ·µ j

π0(λ j+ µ j)+π j+1µ j+1,

and it follows that

π j+1 =λ0 · · ·λ j

µ1 · · ·µ j+1π0.

Now, let

B :=∞

∑j=1

λ0 · · ·λ j−1µ1 · · ·µ j

< ∞.

From the condition ∑∞j=0π j = 1, we have

1 = π0 +π1 + · · · = π0 +π0∞

∑j=1

λ0 · · ·λ j−1µ1 · · ·µ j

= π0(1+B).

It then follows that π0 = 1/(1+B), and

π j =1

1+B

(λ0 · · ·λ j−1µ1 · · ·µ j

), j ≥ 1.

If λi = λ and µi = µ , then B= ∑∞j=1(λ/µ) j, which is finite if and only if λ < µ . In

in this case,

1+B =∞

∑k=0

µ

)k=

1

1+λ/µ,

and

π j = (1−λ/µ)(λ/µ) j ∼ geometric0(λ/µ).

Page 213: Gubner

212 Chapter 12 Problem Solutions

27. The solution is very similar the that of the preceding problem. In this case, put

BN :=N

∑j=1

λ0 · · ·λ j−1µ1 · · ·µ j

< ∞,

and then π0 = 1/(1+BN), and

π j =1

1+BN

(λ0 · · ·λ j−1µ1 · · ·µ j

), j = 1, . . . ,N.

If λi = λ and µi = µ , then

1+BN =N

∑k=0

(λ/µ)k =1− (λ/µ)N+1

1−λ/µ,

and

π j =1−λ/µ

1− (λ/µ)N+1(λ/µ) j, j = 0, . . . ,N.

If λ = µ , then π j = 1/(N+1).

28. To begin, write

mi(t) := E[Xt |X0 = i] = ∑j

jP(Xt = j|X0 = i) = ∑j

jpi j(t).

Then

m′i(t) = ∑j

jp′i j(t) = ∑j

j

[∑k

pik(t)gk j

]= ∑

k

pik(t)

[∑j

jgk j

]

= ∑k

pik(t)[(k−1)gk,k−1+ kgkk+(k+1)gk,k+1

]

= ∑k

pik(t)[(k−1)(kµ)− k(kλ +α + kµ)+(k+1)(kλ +α)

]

= ∑k

pik(t)[−kµ + kλ +α] = −µmi(t)+λmi(t)+α.

We must solve

m′i(t)+(µ−λ )mi(t) = α, mi(0) = i.

If µ 6= λ , it is readily verified that

mi(t) =

(i− α

µ−λ

)e−(µ−λ )t +

α

µ−λ

solves the equation. If µ = λ , then mi(t) = αt+ i solves m′i(t) = α with mi(0) = i.

29. To begin, write

p′i j(t) = ∑k

pik(t)gk j = pi, j−1(t)g j−1, j+ pi j(t)g j j = λ pi, j−1(t)−λ pi j(t). (∗)

Page 214: Gubner

Chapter 12 Problem Solutions 213

Observe that for j = i, pi,i−1(t) = 0 since the chain can not go from state i to a lower

state. Thus,

p′ii(t) = −λ pii(t).

Also, pii(0) = P(Xt = i|X0 = i) = 1. Thus, pii(t) = e−λ t , and it is easily verified that

pi,i+n(t) =(λ t)ne−λ t

n!, n= 0,1,2, . . . ,

solves (∗). Thus, Xt is a Poisson process of rate λ with X0 = i instead of X0 = 0.

30. (a) The assumption that D>−∞ implies that the πk are not all zero, in which case,the πk would not sum to one and would not be a pmf. Aside from this, it is clear

that πk ≥ 0 for all k and that ∑k πk = D/D= 1. Next, since p j j = 0, write

∑k

πkpk j = ∑k 6= j

[πkgkk/D]gk j

−gkk=−1D

∑k 6= j

πkgk j

=−1D

[(∑k

πkgk j

)

︸ ︷︷ ︸= 0

−π jg j j

]= π jg j j/D = π j.

(b) The assumption that D > −∞ implies that the πk are not all zero. Aside from

this, it is clear that πk ≥ 0 for all k and that ∑k πk = D/D= 1. Next, write

∑k

πkgk j = ∑k

[(πk/gkk)/D]gk j =−1

D∑k

πkgk j

−gkk

=−1

D

[(∑k 6= j

πkgk j

−gkk

)− π j

]

=−1

D

[(∑k

πkpk j

)

︸ ︷︷ ︸= π j

−π j

], since p j j = 0,

= 0.

(c) Since πk and πk are pmfs, they sum to one. Hence, if gii = g for all i, D = g

and D = 1/g. In (a), πk = πkgkk/D = πkg/g = πk. In (b), πk = (πk/gkk)/D =(πk/g)/(1/g) = πk.

31. Following the hints, write

P(T > t+∆t|T > t,X0 = i) = P(Xs = i,0≤ s≤ t+∆t|Xs = i,0≤ s≤ t)= P(Xs = i, t ≤ s≤ t+∆t|Xs = i,0≤ s≤ t)= P(Xs = i, t ≤ s≤ t+∆t|Xt = i)

= P(Xs = i,0≤ s≤ ∆t|X0 = i)

= P(T > ∆t|X0 = i).

Page 215: Gubner

214 Chapter 12 Problem Solutions

The exponential parameter is

lim∆t↓0

1−P(T > ∆t|X0 = i)

∆t= lim

∆t↓01−P(Xs = i,0≤ s≤ ∆t|X0 = i)

∆t

= lim∆t↓0

1−P(X∆t = i|X0 = i)

∆t=: −gii.

32. Using substitution in reverse, write

P(Wt ≤ y|Ws = x,Wsn−1= xn−1, . . . ,Ws0 = x0)

= P(Wt − x≤ y− x|Ws = x,Wsn−1= xn−1, . . . ,Ws0 = x0)

= P(Wt −Ws ≤ y− x|Ws = x,Wsn−1= xn−1, . . . ,Ws0 = x0).

Now, using the fact thatW0 ≡ 0, this last conditional probability is equal to

P(Wt −Ws ≤ y− x|Ws−Wsn−1= x− xn−1, . . . ,Ws1 −Ws0 = x1− x0,Ws0 −W0 = x0).

Since the Wiener process has independent increments that are Gaussian, this last ex-

pression reduces to

P(Wt −Ws ≤ y− x) =∫ y−x

−∞

exp[−θ/[σ√t− s ]2/2]√

2πσ2(t− s)dθ .

Since this depends on x but not on xn−1, . . . ,x0,

P(Wt ≤ y|Ws = x,Wsn−1= xn−1, . . . ,Ws0 = x0) = P(Wt ≤ y|Ws = x).

Hence,Wt is a Markov process.

33. Write

∑y

P(X = x|Y = y,Z = z)P(Y = y|Z = z)

= ∑y

P(X = x,Y = y,Z = z)

P(Y = y,Z = z)· P(Y = y,Z = z)

P(Z = z)

=1

P(Z = z) ∑y

P(X = x,Y = y,Z = z)

=P(X = x,Z = z)

P(Z = z)= P(X = x|Z = z).

34. Using the law of total conditional probability, write

Pt+s(B) = P(Xt+s ∈ B|X0 = x)

=∫ ∞

−∞P(Xt+s ∈ B|Xs = z,X0 = x) fXs|X0

(z|x)dz

=∫ ∞

−∞P(Xt+s ∈ B|Xs = z) fs(x,z)dz

=∫ ∞

−∞P(Xt ∈ B|X0 = z) fs(x,z)dz

=∫ ∞

−∞Pt(z,B) fs(x,z)dz.

Page 216: Gubner

CHAPTER 13

Problem Solutions

1. First observe that E[|Xn|p] = np/2P(U ≤ 1/n) = np/2 · (1/n) = np/2−1, which goes to

zero if and only if p< 2. Thus, 1≤ p< 2.

2. E

[∣∣∣∣Nt

t−λ

∣∣∣∣2]

=1

t2E[|Nt −λ t|2] =

var(Nt)

t2=

λ t

t2=

λ

t→ 0.

3. Given ε > 0, let n≥ N imply |C(k)| ≤ ε/2. Then for such n,

1

n

n−1

∑k=0

C(k) =1

n

N−1

∑k=0

C(k)+1

n

n

∑k=N

C(k)

and ∣∣∣∣1

n

n−1

∑k=0

C(k)

∣∣∣∣ ≤1

n

N−1

∑k=0

|C(k)|+ 1

n

n

∑k=N

ε/2 ≤ 1

n

N−1

∑k=0

|C(k)|+ ε/2.

For large enough n, the right-hand side will be less that ε .

4. Applying the hint followed by the Cauchy–Schwarz inequality shows that

∣∣∣∣1

n

n−1

∑k=0

C(k)

∣∣∣∣ ≤√

E[(X1−m)2]E[(Mn−m)2].

Hence, ifMn converges in mean square to m, the left-hand side goes to zero as n→∞.

5. Starting with the hint, we write

E[ZIA] = E[(Z−Zn)IA]+E[ZnIA]

≤ E[Z−Zn]+nE[IA], since Zn ≤ Z and Zn ≤ n,= E[|Z−Zn|]+nP(A).

Since Zn converges in mean to Z, given ε > 0, we can let n≥ N imply E[|Z−Zn|]≤ε/2. Then in particular we have

E[ZIA] < ε/2+NP(A).

Hence, if 0 < δ < ε/(2N),

P(A) < δ implies E[ZIA] < ε/2+Nε/(2N) = ε.

6. Following the hint, we find that given ε > 0, there exists a δ > 0 such that

P(U ≤ ∆x) = ∆x< δ implies E[ f (x+U)IU≤∆x] =∫ ∆x

0f (x+ t)dt < ε.

215

Page 217: Gubner

216 Chapter 13 Problem Solutions

Now make the change of variable θ = x+ t to get

∫ ∆x

0f (x+ t)dt =

∫ x+∆x

xf (θ)dθ = F(x+∆x)−F(x).

For 0 < −∆x < δ , take A = U > 1 + ∆x and Z = f (x− 1 +U). Then P(U >1+∆x) =−∆x implies

E[ f (x+1+U)IU>1+Dx] =∫ 1

1+∆xf (x−1+ t)dt < ε.

In this last integral make the change of variable θ = x−1+ t to get

∫ x

x+∆xf (θ)dθ = F(x)−F(x+∆x).

Thus, F(x+ ∆x)−F(x) > −ε . We can now write that given ε > 0, there exists a δsuch that |∆x|< δ implies |F(x+∆x)−F(x)|< ε .

7. Following the hint, we can write

exp[ 1

pln(|X |/α)p+

1

qln(|Y |/β )q

]≤ 1

peln(|X |/α)p +

1

qeln(|Y |/β )q

or

exp[ln(|X |/α)+ ln(|Y |/β )

]≤ 1

p

( |X |α

)p+

1

q

( |Y |β

)q

or|XY |αβ

≤ 1

p

|X |pα p

+1

q

|Y |qβ q

.

Hence,E[|XY |]

αβ≤ 1

p

E[|X |p]α p

+1

q

E[|Y |q]β q

=1

p

α p

α p+

1

q

β q

β q= 1.

The hint assumes neither α nor β is zero or infinity. However, if either α or β is zero,

then both sides of the inequality are zero. If neither is zero and one of them is infinity,

then the right-hand side is infinity and the inequality is trivial.

8. Following the hint, with X = |Z|α , Y = 1, and p= β/α , we have

E[|Z|α ] = E[|XY |] ≤ E[|X |p]1/pE[|Y |q]1/q = E[(|Z|α)β/α ]α/β ·1 = E[|Z|β ]α/β .

Raising both sides to the 1/α yields E[|Z|α ]1/α ≤ E[|Z|β ]1/β .

9. By Lyapunov’s inequality, E[|Xn−X |α ]1/α ≤ E[|Xn−X |β ]1/β . Raising both sides to

the α power yields E[|Xn−X |α ]≤ E[|Xn−X |β ]α/β . Hence, if E[|Xn−X |β ]→ 0, then

E[|Xn−X |α ]→ 0 too.

10. Following the hint, write

E[|X+Y |p] = E[|X+Y | |X+Y |p−1]

≤ E[|X | |X+Y |p−1]+E[|Y | |X+Y |p−1]

≤ E[|X |p]1/pE[(|X+Y |p−1)q]1/q+E[|Y |p]1/p

E[(|X+Y |p−1)q]1/q,

Page 218: Gubner

Chapter 13 Problem Solutions 217

where 1/q := 1−1/p. Hence, 1/q = (p−1)/p and q = p/(p−1). Now divide the

above inequality by E[|X+Y |p](p−1)/p to get

E[|X+Y |p]1/p ≤ E[|X |p]1/p+E[|Y |p]1/p.

11. For a Wiener process,Wt−Wt0 ∼N(0,σ2|t− t0|). To simplify the notation, put σ2 :=σ2|t− t0|. Then

E[|Wt −Wt0 |] =∫ ∞

−∞|x|e

−(x/σ)2/2

√2π σ

dx = 2

∫ ∞

0xe−(x/σ)2/2

√2π σ

dx

=2σ√2π

∫ ∞

0te−t

2/2 dt =2σ√2π

(−e−t2/2

)∣∣∣∞

0= 2σ

√|t− t0|

2π,

which goes to zero as |t− t0| → 0.

12. Let t0 > 0 be arbitrary. Since E[|Nt −Nt0 |2] = λ |t− t0|+λ 2|t− t0|2, it is clear that as

t→ t0, E[|Nt −Nt0 |2]→ 0. Hence, Nt is continuous in mean square.

13. First write

R(t,s)−R(τ,θ) = R(t,s)−R(t,θ)+R(t,θ)−R(τ,θ)

= E[Xt(Xs−Xθ )]+E[(Xt −Xτ)Xθ ].

Then by the Cauchy–Schwarz inequality,

|R(t,s)−R(τ,θ)| ≤√

E[X2t ]E[(Xs−Xθ )2]+

√E[(Xt −Xτ)2]E[X2

θ ],

which goes to zero as (t,s)→ (τ,θ). Note that we need the boundedness of E[X2t ] for

t near τ .

14. (a) E[|Xt+T −Xt |2] = R(0)−2R(T )+R(0) = 0.

(b) Write

|R(t+T )−R(t)| =∣∣E[(Xt+T −Xt)X0]

∣∣ ≤√

E[|Xt+T −Xt |2]E[X20 ],

which is zero by part (a). Hence, R(t+T ) = R(t) for all t.

15. Write

‖(Xn+Yn)− (X+Y )‖p = ‖(Xn−X)+(Yn−Y )‖p ≤ ‖Xn−X‖p+‖Yn−Y‖p→ 0.

16. Let t0 be arbitrary Since

‖Zt −Zt0‖p = ‖(Xt +Yt)− (Xt0 +Yt0)‖p= ‖(Xt −Xt0)+(Yt −Yt0)‖p≤ ‖Xt −Xt0‖p+‖Yt −Yt0‖p,

it is clear that if Xt and Yt are continuous in mean of order p, then so is Zt := Xt +Yt .

Page 219: Gubner

218 Chapter 13 Problem Solutions

17. Let ε > 0 be given. Since ‖Xn−X‖p→ 0, there is an N such that for n ≥ N, ‖Xn−X‖p < ε/2. Thus, for n,m≥ N,

‖Xn−Xm‖p = ‖(Xn−X)+(X−Xm)‖p ≤ ‖Xn−X‖p+‖X−Xm‖p < ε/2+ ε/2 = ε.

18. Suppose Xn is Cauchy in Lp. With ε = 1, there is an N such that for all n,m ≥ N,

‖Xn−Xm‖p < 1. In particular, with m= N we have from∣∣‖Xn‖p−‖XN‖p

∣∣ ≤ ‖Xn−XN‖pthat

‖Xn‖p ≤ ‖Xn−XN‖p+‖XN‖p < 1+‖XN‖p, for n≥ N.

To get a bound that also holds for n= 1, . . . ,N−1, write

‖Xn‖p ≤ max(‖X1‖p, . . . ,‖XN−1‖p,1+‖XN‖p).

19. Since Xn converges, it is Cauchy and therefore bounded by the preceding two prob-

lems. Hence, we can write ‖Xn‖p ≤ B < ∞ for some constant B. Given ε > 0, let

n≥ N imply ‖Xn−X‖p < ε/(2‖Y‖q) and ‖Yn−Y‖q < ε/(2B). Then

‖XnYn−XY‖1 = E[|XnYn−XnY +XnY −XY |]≤ E[|Xn(Yn−Y )|]+E[|(Xn−X)Y |]≤ ‖Xn‖p‖Yn−Y‖q+‖Xn−X‖p‖Y‖q, by Holder’s inequality,

< B · ε

2B+

ε

2‖Y‖q‖Y‖q = ε.

20. If Xn converges in mean of order p to both X and Y , write

‖X −Y‖p = ‖(X−Xn)+(Xn−Y )‖p ≤ ‖X−Xn‖p+‖Xn−Y‖p → 0.

Since ‖X−Y‖p = 0, E[|X−Y |p] = 0.

21. For the rightmost inequality, observe that

‖X −Y‖p = ‖X +(−Y )‖p ≤ ‖X‖p+‖−Y‖p = ‖X‖p+‖Y‖p.For the remaining inequality, first write

‖X‖p = ‖(X−Y )+Y‖p ≤ ‖X −Y‖p+‖Y‖p,from which it follows that

‖X‖p−‖Y‖p ≤ ‖X −Y‖p. (∗)Similarly, from

‖Y‖p = ‖(Y −X)+X‖p ≤ ‖Y −X‖p+‖X‖p,it follows that

‖Y‖p−‖X‖p ≤ ‖Y −X‖p = ‖X −Y‖p. (∗∗)From (∗) and (∗∗) it follows that

∣∣‖X‖p−‖Y‖p∣∣ ≤ ‖X −Y‖p.

Page 220: Gubner

Chapter 13 Problem Solutions 219

22. By taking pth roots, we see that

limn→∞

E[|Xn|p] = E[|X |p] (#)

is equivalent to ‖Xn‖p→‖X‖p. Since

∣∣‖X‖p−‖Y‖p∣∣ ≤ ‖X −Y‖p,

we see that convergence in mean of order p implies (#).

23. Write

‖X +Y‖2p+‖X−Y‖2p = 〈X+Y,X+Y 〉+ 〈X−Y,X−Y 〉= 〈X ,X〉+2〈X ,Y 〉+ 〈Y,Y 〉+ 〈X ,X〉−2〈X ,Y 〉+ 〈Y,Y 〉= 2(‖X‖2p+‖Y‖2p).

24. Write

|〈Xn,Yn〉−〈X ,Y 〉| = |〈Xn,Yn〉−〈X ,Yn〉+ 〈X ,Yn〉−〈X ,Y 〉|≤ ‖〈Xn−X ,Y 〉|+ |〈X ,Yn−Y 〉|≤ ‖Xn−X‖2‖Yn‖2 +‖X‖2‖Yn−Y‖2 → 0.

Here we have used the Cauchy–Schwarz inequality and the fact that since Yn con-

verges, it is bounded.

25. As in the example, for $n > m$, we can write
$$\|Y_n - Y_m\|_2^2 \le \sum_{k=m+1}^n\sum_{l=m+1}^n |h_k|\,|h_l|\,|\langle X_k,X_l\rangle|.$$
In this problem, $\langle X_k,X_l\rangle = 0$ for $k \ne l$. Hence,
$$\|Y_n - Y_m\|_2^2 \le \sum_{k=m+1}^n B|h_k|^2 \le B\sum_{k=m+1}^\infty |h_k|^2 \to 0$$
as $n > m \to \infty$ on account of the assumption that $\sum_{k=1}^\infty |h_k|^2 < \infty$.

26. Put $Y_n := \sum_{k=1}^n h_kX_k$. It suffices to show that $Y_n$ is Cauchy in $L^p$. Write, for $n > m$,
$$\|Y_n - Y_m\|_p = \Big\|\sum_{k=m+1}^n h_kX_k\Big\|_p \le \sum_{k=m+1}^n |h_k|\,\|X_k\|_p \le B^{1/p}\sum_{k=m+1}^n |h_k| \to 0$$
as $n > m \to \infty$ on account of the assumption that $\sum_{k=1}^\infty |h_k| < \infty$.
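
A quick numerical illustration of this Cauchy criterion (an informal sketch, not part of the original solution): with absolutely summable weights such as $h_k = 2^{-k}$ and i.i.d. $X_k$ with bounded second moments, the estimated $E[|Y_n - Y_m|^2]$ collapses as $m$ grows.

```python
import numpy as np

# Illustrate that Y_n = sum_{k=1}^n h_k X_k is Cauchy in L^2 when
# sum |h_k| < infinity and the X_k have bounded second moments.
rng = np.random.default_rng(1)
N, trials = 40, 200_000
h = 0.5 ** np.arange(1, N + 1)              # absolutely summable weights
X = rng.standard_normal((trials, N))        # i.i.d. with E[X_k^2] = 1

Y = np.cumsum(h * X, axis=1)                # Y_n for n = 1, ..., N
for n, m in [(10, 5), (20, 10), (40, 20)]:
    # Sample estimate of E[|Y_n - Y_m|^2]; shrinks rapidly with m.
    print(n, m, np.mean((Y[:, n - 1] - Y[:, m - 1]) ** 2))
```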

27. Write
$$E[YZ] = E\Big[\Big(\sum_{i=1}^n X_{\tau_i}(t_i - t_{i-1})\Big)\Big(\sum_{j=1}^\nu X_{\theta_j}(s_j - s_{j-1})\Big)\Big] = \sum_{i=1}^n\sum_{j=1}^\nu R(\tau_i,\theta_j)(t_i - t_{i-1})(s_j - s_{j-1}).$$


28. First note that if
$$Y := \sum_{i=1}^n g(\tau_i)X_{\tau_i}(t_i - t_{i-1}) \quad\text{and}\quad Z := \sum_{j=1}^\nu g(\theta_j)X_{\theta_j}(s_j - s_{j-1}),$$
then
$$E[YZ] = \sum_{i=1}^n\sum_{j=1}^\nu g(\tau_i)R(\tau_i,\theta_j)g(\theta_j)(t_i - t_{i-1})(s_j - s_{j-1}).$$
Hence, given finer and finer partitions, with $Y_m$ defined analogously to $Y$ above, we see that
$$E[|Y_m - Y_k|^2] = E[Y_m^2] - 2E[Y_mY_k] + E[Y_k^2] \to \int_a^b\!\!\int_a^b g(t)R(t,s)g(s)\,dt\,ds - 2\int_a^b\!\!\int_a^b g(t)R(t,s)g(s)\,dt\,ds + \int_a^b\!\!\int_a^b g(t)R(t,s)g(s)\,dt\,ds = 0.$$
Thus, $Y_m$ is Cauchy in $L^2$, and there exists a $Y \in L^2$ with $\|Y_m - Y\|_2 \to 0$. Furthermore, since $Y_m$ converges in mean square to $Y$, $E[Y_m^2] \to E[Y^2]$, and it is clear that
$$E[Y_m^2] \to \int_a^b\!\!\int_a^b g(t)R(t,s)g(s)\,dt\,ds.$$

29. Consider the formula
$$\int_0^T R(t-s)\varphi(s)\,ds = \lambda\varphi(t), \quad 0 \le t \le T. \quad (*)$$
Since $R$ is defined for all $t$, we can extend the definition of $\varphi$ on the right-hand side in the obvious way. Furthermore, since $R$ has period $T$, so will the extended definition of $\varphi$. Hence, both $R$ and $\varphi$ have Fourier series representations, say
$$R(t) = \sum_n r_ne^{j2\pi nt/T} \quad\text{and}\quad \varphi(s) = \sum_n \varphi_ne^{j2\pi ns/T}.$$
Substituting these into $(*)$ yields
$$\sum_n r_ne^{j2\pi nt/T}\underbrace{\int_0^T e^{-j2\pi ns/T}\varphi(s)\,ds}_{=\,T\varphi_n} = \sum_n \lambda\varphi_ne^{j2\pi nt/T}.$$
It follows that $Tr_n\varphi_n = \lambda\varphi_n$. Now, if $\varphi$ is an eigenfunction, $\varphi$ cannot be the zero function. Hence, there is at least one value of $n$ with $\varphi_n \ne 0$. For all $n$ with $\varphi_n \ne 0$, $\lambda = Tr_n$. Thus,
$$\varphi(t) = \sum_{n:\,r_n=\lambda/T}\varphi_ne^{j2\pi nt/T}.$$
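
The conclusion that the eigenvalues of a periodic covariance kernel are $T$ times its Fourier coefficients can be illustrated by discretizing the integral operator. The following is a rough sketch (not part of the original solution) with an arbitrarily chosen short Fourier-series kernel.

```python
import numpy as np

# Discretize (R phi)(t) = int_0^T R(t - s) phi(s) ds for a period-T kernel
# and compare its eigenvalues with T * (Fourier coefficients of R).
T, N = 1.0, 256
t = np.arange(N) * T / N
# Arbitrary positive-definite periodic kernel: a short Fourier series with
# r_0 = 1.0, r_{+-1} = 0.4, r_{+-2} = 0.1.
r = {0: 1.0, 1: 0.4, 2: 0.1}
R = sum(c * np.cos(2 * np.pi * n * t / T) * (2 if n else 1)
        for n, c in r.items())

# Matrix of R(t_i - t_j), scaled by the quadrature weight T/N.
idx = (np.arange(N)[:, None] - np.arange(N)[None, :]) % N
K = R[idx] * (T / N)

print(np.sort(np.linalg.eigvalsh(K))[::-1][:5])   # leading eigenvalues
# Expect T*r_n: 1.0 once, then 0.4 twice and 0.1 twice (each nonzero
# frequency contributes a cosine and a sine eigenfunction).
print([T * c for c in r.values()])
```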


30. If $\int_a^b R(t,s)\varphi(s)\,ds = \lambda\varphi(t)$, then
$$0 \le \int_a^b\!\!\int_a^b R(t,s)\varphi(t)\varphi(s)\,dt\,ds = \int_a^b \varphi(t)\Big[\int_a^b R(t,s)\varphi(s)\,ds\Big]dt = \int_a^b \varphi(t)[\lambda\varphi(t)]\,dt = \lambda\int_a^b \varphi(t)^2\,dt.$$
Hence, $\lambda \ge 0$.

31. To begin, write
$$\lambda_k\int_a^b \varphi_k(t)\varphi_m(t)\,dt = \int_a^b \lambda_k\varphi_k(t)\varphi_m(t)\,dt = \int_a^b\Big[\int_a^b R(t,s)\varphi_k(s)\,ds\Big]\varphi_m(t)\,dt = \int_a^b \varphi_k(s)\Big[\int_a^b R(s,t)\varphi_m(t)\,dt\Big]ds,$$
since $R(t,s) = R(s,t)$, and this equals
$$\int_a^b \varphi_k(s)\cdot\lambda_m\varphi_m(s)\,ds = \lambda_m\int_a^b \varphi_k(s)\varphi_m(s)\,ds.$$
We can now write
$$(\lambda_k - \lambda_m)\int_a^b \varphi_k(t)\varphi_m(t)\,dt = 0.$$
If $\lambda_k \ne \lambda_m$, we must have $\int_a^b \varphi_k(t)\varphi_m(t)\,dt = 0$.

32. (a) Write
$$\int_0^T R(t,s)g(s)\,ds = \int_0^T\Big[\sum_{k=1}^\infty \lambda_k\varphi_k(t)\varphi_k(s)\Big]g(s)\,ds = \sum_{k=1}^\infty \lambda_k\varphi_k(t)\int_0^T g(s)\varphi_k(s)\,ds = \sum_{k=1}^\infty \lambda_kg_k\varphi_k(t).$$

(b) Write
$$\int_0^T R(t,s)\varphi(s)\,ds = \int_0^T R(t,s)\Big[g(s) - \sum_{k=1}^\infty g_k\varphi_k(s)\Big]ds = \int_0^T R(t,s)g(s)\,ds - \sum_{k=1}^\infty g_k\underbrace{\int_0^T R(t,s)\varphi_k(s)\,ds}_{=\,\lambda_k\varphi_k(t)} = \sum_{k=1}^\infty \lambda_kg_k\varphi_k(t) - \sum_{k=1}^\infty \lambda_kg_k\varphi_k(t) = 0.$$

(c) Write
$$E[Z^2] = \int_0^T\!\!\int_0^T \varphi(t)R(t,s)\varphi(s)\,dt\,ds = \int_0^T \varphi(t)\Big[\int_0^T R(t,s)\varphi(s)\,ds\Big]dt = 0.$$


33. Consider the equation
$$\lambda\varphi(t) = \int_0^T e^{-|t-s|}\varphi(s)\,ds \quad (\#)$$
$$= \int_0^t e^{-(t-s)}\varphi(s)\,ds + \int_t^T e^{-(s-t)}\varphi(s)\,ds = e^{-t}\int_0^t e^s\varphi(s)\,ds + e^t\int_t^T e^{-s}\varphi(s)\,ds.$$
Differentiating yields
$$\lambda\varphi'(t) = -e^{-t}\int_0^t e^s\varphi(s)\,ds + e^{-t}e^t\varphi(t) + e^t\int_t^T e^{-s}\varphi(s)\,ds + e^t(-e^{-t}\varphi(t)) = -e^{-t}\int_0^t e^s\varphi(s)\,ds + e^t\int_t^T e^{-s}\varphi(s)\,ds. \quad (\#\#)$$
Differentiating again yields
$$\lambda\varphi''(t) = e^{-t}\int_0^t e^s\varphi(s)\,ds - e^{-t}e^t\varphi(t) + e^t\int_t^T e^{-s}\varphi(s)\,ds + e^t(-e^{-t}\varphi(t)) = \int_0^T e^{-|t-s|}\varphi(s)\,ds - 2\varphi(t) = \lambda\varphi(t) - 2\varphi(t), \text{ by } (\#),$$
which equals $(\lambda-2)\varphi(t)$. We can now write
$$\varphi''(t) = (1 - 2/\lambda)\varphi(t) = -(2/\lambda - 1)\varphi(t).$$
For $0 < \lambda < 2$, put $\mu := \sqrt{2/\lambda - 1}$. Then $\varphi''(t) = -\mu^2\varphi(t)$. Hence, we must have
$$\varphi(t) = A\cos\mu t + B\sin\mu t$$
for some constants $A$ and $B$.
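
As a numerical sanity check (a sketch, not part of the original solution), one can discretize the operator with kernel $e^{-|t-s|}$ on $[0,T]$ and verify that each numerically computed eigenfunction satisfies $\varphi'' \approx -(2/\lambda - 1)\varphi$, i.e., that it is sinusoidal with the predicted frequency.

```python
import numpy as np

# Discretize  lambda*phi(t) = int_0^T exp(-|t-s|) phi(s) ds  and check
# that the eigenfunctions satisfy phi'' = -(2/lambda - 1) phi.
T, N = 2.0, 800
t = (np.arange(N) + 0.5) * T / N
K = np.exp(-np.abs(t[:, None] - t[None, :])) * (T / N)
lam, V = np.linalg.eigh(K)
lam, V = lam[::-1], V[:, ::-1]              # sort eigenvalues descending

k = 0                                       # examine the top eigenfunction
phi = V[:, k]
dt = T / N
phi_dd = (phi[2:] - 2 * phi[1:-1] + phi[:-2]) / dt**2   # second difference
ratio = phi_dd / phi[1:-1]                  # should equal -(2/lambda - 1)
print(np.median(ratio), -(2 / lam[k] - 1))  # the two agree closely
```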

34. The required orthogonality principle is that
$$E\Big[(X_t - \hat X_t)\sum_{i=1}^L c_iA_i\Big] = 0$$
for all constants $c_i$, where
$$\hat X_t = \sum_{j=1}^L c_jA_j.$$
In particular, we must have $E[(X_t - \hat X_t)A_i] = 0$. Now, we know from the text that $E[X_tA_i] = \lambda_i\varphi_i(t)$. We also have
$$E[\hat X_tA_i] = \sum_{j=1}^L c_jE[A_jA_i] = c_i\lambda_i.$$
Hence, $c_i = \varphi_i(t)$, and we have
$$\hat X_t = \sum_{i=1}^L A_i\varphi_i(t).$$

35. Since $\|Y_n - Y\|_2 \to 0$, by the hint, $E[Y_n] \to E[Y]$ and $E[Y_n^2] \to E[Y^2]$. Since $g_n$ is piecewise constant, we know that $E[Y_n] = 0$, and so $E[Y] = 0$ too. Next, an argument analogous to the one in Problem 21 tells us that if $\|g_n - g\| \to 0$, then $\|g_n\| \to \|g\|$. Hence,
$$E[Y^2] = \lim_{n\to\infty} E[Y_n^2] = \lim_{n\to\infty}\sigma^2\int_0^\infty g_n(t)^2\,dt = \sigma^2\int_0^\infty g(t)^2\,dt.$$

36. First write
$$\|Y - \tilde Y\|_2 \le \|Y - Y_n\|_2 + \|Y_n - \tilde Y_n\|_2 + \|\tilde Y_n - \tilde Y\|_2,$$
where the first and last terms on the right go to zero. As for the middle term, write
$$\|Y_n - \tilde Y_n\|_2^2 = E\Big[\Big(\int_0^\infty g_n(t)\,dW_t - \int_0^\infty \tilde g_n(t)\,dW_t\Big)^2\Big] = E\Big[\Big(\int_0^\infty [g_n(t) - \tilde g_n(t)]\,dW_t\Big)^2\Big] = \sigma^2\int_0^\infty [g_n(t) - \tilde g_n(t)]^2\,dt.$$
We can now write
$$\|Y_n - \tilde Y_n\|_2 = \sigma\|g_n - \tilde g_n\| \le \sigma\|g_n - g\| + \sigma\|g - \tilde g_n\| \to 0.$$

37. We know from our earlier work that the Wiener integral is linear on piecewise-constant functions. To analyze the general case, let $Y = \int_0^\infty g(t)\,dW_t$ and $Z = \int_0^\infty h(t)\,dW_t$. We must show that $aY + bZ = \int_0^\infty [ag(t) + bh(t)]\,dW_t$. Let $g_n(t)$ and $h_n(t)$ be piecewise-constant functions such that $\|g_n - g\| \to 0$ and $\|h_n - h\| \to 0$ and such that $Y_n := \int_0^\infty g_n(t)\,dW_t$ and $Z_n := \int_0^\infty h_n(t)\,dW_t$ converge in mean square to $Y$ and $Z$, respectively. Now observe that
$$aY_n + bZ_n = a\int_0^\infty g_n(t)\,dW_t + b\int_0^\infty h_n(t)\,dW_t = \int_0^\infty [ag_n(t) + bh_n(t)]\,dW_t, \quad (*)$$
since $g_n$ and $h_n$ are piecewise constant. Next, since
$$\|(ag_n + bh_n) - (ag + bh)\| \le |a|\,\|g_n - g\| + |b|\,\|h_n - h\| \to 0,$$
it follows that the right-hand side of $(*)$ converges in mean square to $\int_0^\infty [ag(t) + bh(t)]\,dW_t$. Since the left-hand side of $(*)$ converges in mean square to $aY + bZ$, the desired result follows.

38. We must find all values of $\beta$ for which $E[|X_t/t|^2] \to 0$. First compute
$$E[X_t^2] = E\Big[\Big(\int_0^t \tau^\beta\,dW_\tau\Big)^2\Big] = \int_0^t \tau^{2\beta}\,d\tau = \frac{t^{2\beta+1}}{2\beta+1}.$$
Then $t^{2\beta+1}/t^2 = t^{2\beta-1} \to 0$ if and only if $2\beta - 1 < 0$; i.e., $0 < \beta < 1/2$.
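
The variance formula is easy to verify by a Monte Carlo sketch (not part of the original solution; it assumes unit $\sigma^2$, as the computation above does, and the values of $\beta$ and $t$ are arbitrary).

```python
import numpy as np

# Monte Carlo check of E[X_t^2] = t^(2b+1)/(2b+1) for X_t = int_0^t tau^b dW
# with sigma^2 = 1, approximating the Wiener integral by a Riemann-type sum.
rng = np.random.default_rng(2)
b, t, steps, trials = 0.3, 1.5, 2000, 20_000

tau = np.linspace(0.0, t, steps + 1)[:-1]          # left endpoints
dW = rng.normal(0.0, np.sqrt(t / steps), size=(trials, steps))
X_t = (tau ** b * dW).sum(axis=1)                  # sum tau_k^b (W_{k+1}-W_k)

print(np.mean(X_t ** 2), t ** (2 * b + 1) / (2 * b + 1))
```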


39. Using the law of total probability, substitution, and independence, write
$$E[Y_T^2] = E\Big[\Big(\int_0^T \tau^n\,dW_\tau\Big)^2\Big] = \int_0^\infty E\Big[\Big(\int_0^T \tau^n\,dW_\tau\Big)^2\Big|\,T = t\Big]f_T(t)\,dt = \int_0^\infty E\Big[\Big(\int_0^t \tau^n\,dW_\tau\Big)^2\Big|\,T = t\Big]f_T(t)\,dt = \int_0^\infty E\Big[\Big(\int_0^t \tau^n\,dW_\tau\Big)^2\Big]f_T(t)\,dt.$$
Now use properties of the Wiener process to write
$$E[Y_T^2] = \int_0^\infty\Big(\int_0^t \tau^{2n}\,d\tau\Big)f_T(t)\,dt = \int_0^\infty\frac{t^{2n+1}}{2n+1}f_T(t)\,dt = \frac{E[T^{2n+1}]}{2n+1} = \frac{(2n+1)!/\lambda^{2n+1}}{2n+1} = \frac{(2n)!}{\lambda^{2n+1}}.$$

40. (a) Write
$$g(t+\Delta t) - g(t) = E[f(W_{t+\Delta t})] - E[f(W_t)] = E[f(W_{t+\Delta t}) - f(W_t)] \approx E[f'(W_t)(W_{t+\Delta t} - W_t)] + \tfrac12E[f''(W_t)(W_{t+\Delta t} - W_t)^2].$$
Writing $W_t = W_t - W_0$ and using the independence of the increments, this equals
$$E[f'(W_t - W_0)]\cdot 0 + \tfrac12E[f''(W_t - W_0)]\cdot\Delta t.$$
It then follows that $g'(t) = \tfrac12E[f''(W_t)]$.

(b) If $f(x) = e^x$, then $g'(t) = \tfrac12E[e^{W_t}] = \tfrac12g(t)$. In this case, $g(t) = e^{t/2}$, since $g(0) = E[e^{W_0}] = 1$.

(c) We have by direct calculation that $g(t) = E[e^{W_t}] = e^{s^2t/2}\big|_{s=1} = e^{t/2}$.
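
Part (b) is easy to confirm by simulation; here is a minimal sketch (not part of the original solution) that samples $W_t \sim N(0,t)$ directly and compares $E[e^{W_t}]$ with $e^{t/2}$.

```python
import numpy as np

# Check g(t) = E[exp(W_t)] = exp(t/2) for a standard Wiener process.
rng = np.random.default_rng(3)
for t in [0.5, 1.0, 2.0]:
    W_t = rng.normal(0.0, np.sqrt(t), size=2_000_000)   # W_t ~ N(0, t)
    print(t, np.exp(W_t).mean(), np.exp(t / 2))
```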

41. Let $C$ be the ball of radius $r$, $C := \{Y \in L^p : \|Y\|_p \le r\}$. For $X \notin C$, i.e., $\|X\|_p > r$, we show that
$$\hat X = \frac{r}{\|X\|_p}X.$$
To begin, note that the proposed formula for $\hat X$ satisfies $\|\hat X\|_p = r$ so that $\hat X \in C$ as required. Now observe that
$$\|X - \hat X\|_p = \Big\|X - \frac{r}{\|X\|_p}X\Big\|_p = \Big|1 - \frac{r}{\|X\|_p}\Big|\,\|X\|_p = \|X\|_p - r.$$
Next, for any $Y \in C$,
$$\|X - Y\|_p \ge \big|\|X\|_p - \|Y\|_p\big| = \|X\|_p - \|Y\|_p \ge \|X\|_p - r = \|X - \hat X\|_p.$$
Thus, no $Y \in C$ is closer to $X$ than $\hat X$.


42. Suppose that $\hat X$ and $\tilde X$ are both elements of a subspace $M$ and that $\langle X - \hat X,Y\rangle = 0$ for all $Y \in M$ and $\langle X - \tilde X,Y\rangle = 0$ for all $Y \in M$. Then write
$$\|\hat X - \tilde X\|_2^2 = \langle \hat X - \tilde X,\hat X - \tilde X\rangle = \langle(\hat X - X) + (X - \tilde X),\underbrace{\hat X - \tilde X}_{\in M}\rangle = 0 + 0 = 0.$$

43. For $\hat X_N$ to be the projection of $\hat X_M$ onto $N$, it is sufficient that the orthogonality principle be satisfied. In other words, it suffices to show that
$$\langle \hat X_M - \hat X_N,Y\rangle = 0, \quad\text{for all } Y \in N.$$
Observe that
$$\langle \hat X_M - \hat X_N,Y\rangle = \langle(\hat X_M - X) + (X - \hat X_N),Y\rangle = -\langle X - \hat X_M,Y\rangle + \langle X - \hat X_N,Y\rangle.$$
Now, the last term on the right is zero by the orthogonality principle for the projection of $X$ onto $N$, since $Y \in N$. To show that $\langle X - \hat X_M,Y\rangle = 0$, observe that since $N \subset M$, $Y \in N$ implies $Y \in M$. By the orthogonality principle for the projection of $X$ onto $M$, $\langle X - \hat X_M,Y\rangle = 0$ for $Y \in M$.

44. In the diagram, $M$ is the disk and $N$ is the horizontal line segment. One marked point is $\hat X_M$, the projection of $X$ onto the disk $M$; another is $\hat X_N$, the projection of $X$ onto the line segment $N$. The point marked $\times$ is $(\hat X_M)_N$, the projection of $\hat X_M$ onto the line segment $N$. We see that $(\hat X_M)_N \ne \hat X_N$.

[Figure omitted: it marks the four points $X$, $\hat X_M$, $\hat X_N$, and $(\hat X_M)_N$.]

45. Suppose that $g_n(Y) \in M$ and $g_n(Y)$ converges in mean square to some $X$. We must show that $X \in M$. Since $g_n(Y)$ converges, it is Cauchy. Writing
$$\|g_n(Y) - g_m(Y)\|_2^2 = E[|g_n(Y) - g_m(Y)|^2] = \int_{-\infty}^\infty |g_n(y) - g_m(y)|^2f_Y(y)\,dy = \|g_n - g_m\|_Y^2,$$
we see that $g_n$ is Cauchy in $G$, which is complete. Hence, there exists a $g \in G$ with $\|g_n - g\|_Y \to 0$. We claim $X = g(Y)$. Write
$$\|g(Y) - X\|_2 = \|g(Y) - g_n(Y) + g_n(Y) - X\|_2 \le \|g(Y) - g_n(Y)\|_2 + \|g_n(Y) - X\|_2,$$
where the last term goes to zero by assumption. Now observe that
$$\|g(Y) - g_n(Y)\|_2^2 = E[|g(Y) - g_n(Y)|^2] = \int_{-\infty}^\infty |g(y) - g_n(y)|^2f_Y(y)\,dy = \|g - g_n\|_Y^2 \to 0.$$
Thus, $X = g(Y) \in M$ as required.


46. We claim that the required projection is $\int_0^1 f(t)\,dW_t$. Note that this is an element of $M$ since
$$\int_0^1 f(t)^2\,dt \le \int_0^\infty f(t)^2\,dt < \infty.$$
Consider the orthogonality condition
$$E\Big[\Big(\int_0^\infty f(t)\,dW_t - \int_0^1 f(t)\,dW_t\Big)\int_0^1 g(t)\,dW_t\Big] = E\Big[\Big(\int_1^\infty f(t)\,dW_t\Big)\Big(\int_0^1 g(t)\,dW_t\Big)\Big].$$
Now put
$$\tilde f(t) := \begin{cases} f(t), & t > 1,\\ 0, & 0 \le t \le 1,\end{cases} \qquad \tilde g(t) := \begin{cases} 0, & t > 1,\\ g(t), & 0 \le t \le 1,\end{cases}$$
so that the orthogonality condition becomes
$$E\Big[\Big(\int_0^\infty \tilde f(t)\,dW_t\Big)\Big(\int_0^\infty \tilde g(t)\,dW_t\Big)\Big] = \sigma^2\int_0^\infty \tilde f(t)\tilde g(t)\,dt,$$
which is zero since $\tilde f(t)\tilde g(t) = 0$ for all $t \ge 0$.

47. The function $g(t)$ will be optimal if the orthogonality condition
$$E\Big[\Big(X - \int_0^\infty g(\tau)\,dW_\tau\Big)\int_0^\infty \tilde g(\tau)\,dW_\tau\Big] = 0$$
holds for all $\tilde g$ with $\int_0^\infty \tilde g(\tau)^2\,d\tau < \infty$. In particular, this must be true for $\tilde g(\tau) = I_{[0,t]}(\tau)$. In this case, the above expectation reduces to
$$E\Big[X\int_0^\infty I_{[0,t]}(\tau)\,dW_\tau\Big] - E\Big[\int_0^\infty g(\tau)\,dW_\tau\int_0^\infty I_{[0,t]}(\tau)\,dW_\tau\Big].$$
Now, since
$$\int_0^\infty I_{[0,t]}(\tau)\,dW_\tau = \int_0^t dW_\tau = W_t,$$
we have the further simplification
$$E[XW_t] - \sigma^2\int_0^\infty g(\tau)I_{[0,t]}(\tau)\,d\tau = E[XW_t] - \sigma^2\int_0^t g(\tau)\,d\tau.$$
Since this must be equal to zero for all $t \ge 0$, we can differentiate and obtain
$$g(t) = \frac{1}{\sigma^2}\cdot\frac{d}{dt}E[XW_t].$$

48. (a) Using the methods of Chapter 5, it is not too hard to show that
$$F_Y(y) = \begin{cases} 1, & y \ge 1/4,\\ [2 - \sqrt{1-4y}\,]/2, & 0 \le y < 1/4,\\ [3/2 - (1/2)\sqrt{1-4y}\,]/2, & -2 \le y < 0,\\ 0, & y < -2.\end{cases}$$
It then follows that
$$f_Y(y) = \begin{cases} 1/\sqrt{1-4y}, & 0 < y < 1/4,\\ 1/(2\sqrt{1-4y}\,), & -2 < y < 0,\\ 0, & \text{otherwise}.\end{cases}$$

(b) We must find a $\hat g(y)$ such that
$$E[v(X)g(Y)] = E[\hat g(Y)g(Y)], \quad\text{for all bounded } g.$$
For future reference, note that
$$E[v(X)g(Y)] = E[v(X)g(X(1-X))] = \frac12\int_{-1}^1 v(x)g(x(1-x))\,dx.$$
Now, considering the problem of solving $x(1-x) = y$ for $x$ in the two cases $0 \le y \le 1/4$ and $-2 \le y < 0$ suggests that we try
$$\hat g(y) = \begin{cases} \dfrac12v\Big(\dfrac{1+\sqrt{1-4y}}{2}\Big) + \dfrac12v\Big(\dfrac{1-\sqrt{1-4y}}{2}\Big), & 0 \le y \le 1/4,\\[2ex] v\Big(\dfrac{1-\sqrt{1-4y}}{2}\Big), & -2 \le y < 0.\end{cases}$$
To check, we compute
$$E[\hat g(Y)g(Y)] = E[\hat g(X(1-X))g(X(1-X))] = \frac12\Big[\int_{-1}^0 + \int_0^{1/2} + \int_{1/2}^1\Big]\hat g(x(1-x))g(x(1-x))\,dx$$
$$= \frac12\Big[\int_{-1}^0 v(x)g(x(1-x))\,dx + \int_0^{1/2}\frac{v(1-x)+v(x)}{2}g(x(1-x))\,dx + \int_{1/2}^1\frac{v(x)+v(1-x)}{2}g(x(1-x))\,dx\Big]$$
$$= \frac12\Big[\int_{-1}^0 v(x)g(x(1-x))\,dx + \int_0^1 v(x)g(x(1-x))\,dx\Big] = \frac12\int_{-1}^1 v(x)g(x(1-x))\,dx = E[v(X)g(Y)].$$

49. (a) Using the methods of Chapter 5, it is not too hard to show that
$$F_Y(y) = \begin{cases} F_\Theta(\sin^{-1}(y)) + 1 - F_\Theta(\pi - \sin^{-1}(y)), & 0 \le y \le 1,\\ F_\Theta(\sin^{-1}(y)) - F_\Theta(-\pi - \sin^{-1}(y)), & -1 \le y < 0.\end{cases}$$
Hence,
$$f_Y(y) = \begin{cases} \dfrac{f_\Theta(\sin^{-1}(y)) + f_\Theta(\pi - \sin^{-1}(y))}{\sqrt{1-y^2}}, & 0 \le y < 1,\\[2ex] \dfrac{f_\Theta(\sin^{-1}(y)) + f_\Theta(-\pi - \sin^{-1}(y))}{\sqrt{1-y^2}}, & -1 < y < 0.\end{cases}$$

(b) Consider
$$E[v(X)g(Y)] = E[v(\cos\Theta)g(\sin\Theta)] = \int_{-\pi}^\pi v(\cos\theta)g(\sin\theta)f_\Theta(\theta)\,d\theta.$$
Next, write
$$\int_0^{\pi/2} v(\cos\theta)g(\sin\theta)f_\Theta(\theta)\,d\theta = \int_0^1 v(\sqrt{1-y^2}\,)g(y)\frac{f_\Theta(\sin^{-1}(y))}{\sqrt{1-y^2}}\,dy$$
and
$$\int_{-\pi/2}^0 v(\cos\theta)g(\sin\theta)f_\Theta(\theta)\,d\theta = \int_{-1}^0 v(\sqrt{1-y^2}\,)g(y)\frac{f_\Theta(\sin^{-1}(y))}{\sqrt{1-y^2}}\,dy.$$
Similarly, we can write with a bit more work
$$\int_{\pi/2}^\pi v(\cos\theta)g(\sin\theta)f_\Theta(\theta)\,d\theta = \int_0^{\pi/2} v(-\cos t)g(\sin t)f_\Theta(\pi - t)\,dt = \int_0^1 v(-\sqrt{1-y^2}\,)g(y)\frac{f_\Theta(\pi - \sin^{-1}(y))}{\sqrt{1-y^2}}\,dy,$$
using $\theta = \pi - t$, and
$$\int_{-\pi}^{-\pi/2} v(\cos\theta)g(\sin\theta)f_\Theta(\theta)\,d\theta = \int_{-\pi/2}^0 v(-\cos t)g(\sin t)f_\Theta(-\pi - t)\,dt = \int_{-1}^0 v(-\sqrt{1-y^2}\,)g(y)\frac{f_\Theta(-\pi - \sin^{-1}(y))}{\sqrt{1-y^2}}\,dy,$$
using $\theta = -\pi - t$. Putting all this together, we see that
$$E[v(X)g(Y)] = \int_0^1\frac{v(\sqrt{1-y^2}\,)f_\Theta(\sin^{-1}(y)) + v(-\sqrt{1-y^2}\,)f_\Theta(\pi - \sin^{-1}(y))}{f_\Theta(\sin^{-1}(y)) + f_\Theta(\pi - \sin^{-1}(y))}\,g(y)f_Y(y)\,dy + \int_{-1}^0\frac{v(\sqrt{1-y^2}\,)f_\Theta(\sin^{-1}(y)) + v(-\sqrt{1-y^2}\,)f_\Theta(-\pi - \sin^{-1}(y))}{f_\Theta(\sin^{-1}(y)) + f_\Theta(-\pi - \sin^{-1}(y))}\,g(y)f_Y(y)\,dy.$$
We conclude that
$$E[v(X)|Y=y] = \begin{cases} \dfrac{v(\sqrt{1-y^2}\,)f_\Theta(\sin^{-1}(y)) + v(-\sqrt{1-y^2}\,)f_\Theta(\pi - \sin^{-1}(y))}{f_\Theta(\sin^{-1}(y)) + f_\Theta(\pi - \sin^{-1}(y))}, & 0 < y < 1,\\[3ex] \dfrac{v(\sqrt{1-y^2}\,)f_\Theta(\sin^{-1}(y)) + v(-\sqrt{1-y^2}\,)f_\Theta(-\pi - \sin^{-1}(y))}{f_\Theta(\sin^{-1}(y)) + f_\Theta(-\pi - \sin^{-1}(y))}, & -1 < y < 0.\end{cases}$$

50. Since $X \ge 0$ and $g(Y) := I_{(-\infty,-1/n)}(E[X|Y]) \ge 0$, we have $E[Xg(Y)] \ge 0$. On the other hand,
$$E[X|Y]\,I_{(-\infty,-1/n)}(E[X|Y]) \le -\frac1n I_{(-\infty,-1/n)}(E[X|Y]),$$
and so, if $P(E[X|Y] < -1/n) > 0$ for some $n$,
$$0 \le E[Xg(Y)] = E[E[X|Y]g(Y)] \le -\frac1n P(E[X|Y] < -1/n) < 0,$$
which is a contradiction. Hence, $P(E[X|Y] < -1/n) = 0$ for every $n$, and so $P(E[X|Y] < 0) = 0$.

51. Write
$$E[Xg(Y)] = E[(X^+ - X^-)g(Y)] = E[X^+g(Y)] - E[X^-g(Y)] = E\big[E[X^+|Y]g(Y)\big] - E\big[E[X^-|Y]g(Y)\big] = E\big[(E[X^+|Y] - E[X^-|Y])g(Y)\big].$$
By uniqueness, we conclude that $E[X|Y] = E[X^+|Y] - E[X^-|Y]$.

52. Following the hint, we begin with
$$\big|E[X|Y]\big| = \big|E[X^+|Y] - E[X^-|Y]\big| \le \big|E[X^+|Y]\big| + \big|E[X^-|Y]\big| = E[X^+|Y] + E[X^-|Y].$$
Then
$$E\big[\big|E[X|Y]\big|\big] \le E\big[E[X^+|Y]\big] + E\big[E[X^-|Y]\big] = E[X^+] + E[X^-] < \infty.$$

53. To show that $E[h(Y)X|Y] = h(Y)E[X|Y]$, we have to show that the right-hand side satisfies the characterizing equation of the left-hand side. Since the characterizing equation for the left-hand side is
$$E[h(Y)Xg(Y)] = E\big[E[h(Y)X|Y]g(Y)\big], \quad\text{for all bounded } g,$$
we must show that
$$E[h(Y)Xg(Y)] = E\big[h(Y)E[X|Y]g(Y)\big], \quad\text{for all bounded } g. \quad (*)$$
The only other thing we know is the characterizing equation for $E[X|Y]$, which is
$$E[Xg(Y)] = E\big[E[X|Y]g(Y)\big], \quad\text{for all bounded } g.$$
Since $g$ in the above formula is an arbitrary bounded function, and since $h$ is also bounded, we can rewrite the above formula by replacing $g(Y)$ with $g(Y)h(Y)$ for arbitrary bounded $g$. We thus have
$$E[Xg(Y)h(Y)] = E\big[E[X|Y]g(Y)h(Y)\big], \quad\text{for all bounded } g,$$
which is equivalent to $(*)$.

54. We must show that $E[X|q(Y)]$ satisfies the characterizing equation of $E\big[E[X|Y]\,\big|\,q(Y)\big]$. To write down this characterizing equation, it is convenient to use the notation $Z := E[X|Y]$. Then the characterizing equation for $E[Z|q(Y)]$ is
$$E[Zg(q(Y))] = E\big[E[Z|q(Y)]g(q(Y))\big], \quad\text{for all bounded } g.$$
We must show that this equation holds when $E[Z|q(Y)]$ is replaced by $E[X|q(Y)]$; i.e., we must show that
$$E[Zg(q(Y))] = E\big[E[X|q(Y)]g(q(Y))\big], \quad\text{for all bounded } g.$$
Replacing $Z$ by its definition, we must show that
$$E\big[E[X|Y]g(q(Y))\big] = E\big[E[X|q(Y)]g(q(Y))\big], \quad\text{for all bounded } g. \quad (**)$$
We begin with the characterizing equation for $E[X|Y]$, which is
$$E[Xh(Y)] = E\big[E[X|Y]h(Y)\big], \quad\text{for all bounded } h.$$
Since $h$ is bounded and arbitrary, we can replace $h(Y)$ by $g(q(Y))$ for arbitrary bounded $g$. Thus,
$$E[Xg(q(Y))] = E\big[E[X|Y]g(q(Y))\big], \quad\text{for all bounded } g.$$
We next consider the characterizing equation of $E[X|q(Y)]$, which is
$$E[Xg(q(Y))] = E\big[E[X|q(Y)]g(q(Y))\big], \quad\text{for all bounded } g.$$
Combining these last two equations yields $(**)$.

55. The desired result,
$$E[h(Y)Xg(Y)] = E[h(Y)E[X|Y]g(Y)],$$
can be rewritten as
$$E[(X - E[X|Y])g(Y)h(Y)] = 0,$$
where $g$ is a bounded function and $h(Y) \in L^2$. But then $g(Y)h(Y) \in L^2$, and therefore this last equation must hold by the orthogonality principle, since $E[X|Y]$ is the projection of $X$ onto $M = \{v(Y) : E[v(Y)^2] < \infty\}$.

Page 232: Gubner

Chapter 13 Problem Solutions 231

56. Let $h_n(Y)$ be bounded and converge to $h(Y)$. Then for bounded $g$,
$$E[h(Y)Xg(Y)] = E\Big[\lim_{n\to\infty}h_n(Y)Xg(Y)\Big] = \lim_{n\to\infty}E[Xh_n(Y)g(Y)] = \lim_{n\to\infty}E\big[E[X|Y]h_n(Y)g(Y)\big] = E\Big[\lim_{n\to\infty}h_n(Y)E[X|Y]g(Y)\Big] = E\big[h(Y)E[X|Y]g(Y)\big].$$
By uniqueness, $E[h(Y)X|Y] = h(Y)E[X|Y]$.

57. First write
$$Y = E[Y|Y] = E[X_1 + \cdots + X_n|Y] = \sum_{i=1}^n E[X_i|Y].$$
By symmetry, we must have $E[X_i|Y] = E[X_1|Y]$ for all $i$. Then $Y = nE[X_1|Y]$, or $E[X_1|Y] = Y/n$.
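
The identity $E[X_1|Y] = Y/n$ can be visualized with a quick binning check; the sketch below (not part of the original solution, with an arbitrary i.i.d. distribution) compares the conditional average of $X_1$ within a $Y$-bin to the bin's average of $Y/n$.

```python
import numpy as np

# Check E[X_1 | Y] = Y/n for Y = X_1 + ... + X_n with i.i.d. X_i.
rng = np.random.default_rng(4)
n, trials = 5, 1_000_000
X = rng.exponential(1.0, size=(trials, n))     # arbitrary i.i.d. choice
Y = X.sum(axis=1)

bins = np.quantile(Y, np.linspace(0, 1, 21))   # 20 equal-probability bins
idx = np.digitize(Y, bins[1:-1])
for k in [2, 9, 16]:                           # a few representative bins
    sel = idx == k
    print(X[sel, 0].mean(), Y[sel].mean() / n) # conditional mean vs Y/n
```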

58. Write
$$E[X_{n+1}|Y_1,\dots,Y_n] = E[Y_{n+1} + Y_n + \cdots + Y_1|Y_1,\dots,Y_n] = E[Y_{n+1}|Y_1,\dots,Y_n] + Y_n + \cdots + Y_1 = E[Y_{n+1}] + X_n,$$
by independence and the definition of $X_n$, and this equals $X_n$ since $E[Y_{n+1}] = 0$.

59. For $n \ge 1$,
$$E[X_{n+1}] = E\big[E[X_{n+1}|Y_n,\dots,Y_1]\big] = E[X_n],$$
where the second equality uses the definition of a martingale. Hence, $E[X_n] = E[X_1]$ for $n \ge 1$.

60. For $n \ge 1$,
$$E[X_{n+1}] = E\big[E[X_{n+1}|Y_n,\dots,Y_1]\big] \le E[X_n],$$
where the inequality uses the definition of a supermartingale. Since $X_n \ge 0$, $E[X_n] \ge 0$. Hence, $0 \le E[X_n] \le E[X_1]$ for $n \ge 1$.

61. Since $X_{n+1} := E[Z|Y_{n+1},\dots,Y_1]$,
$$E[X_{n+1}|Y_n,\dots,Y_1] = E\big[E[Z|Y_{n+1},\dots,Y_1]\,\big|\,Y_n,\dots,Y_1\big] = E[Z|Y_n,\dots,Y_1], \text{ by the smoothing property,} =: X_n.$$

62. Since $X_n := w(Y_n)\cdots w(Y_1)$, observe that $X_n$ is a function of $Y_1,\dots,Y_n$ and that $X_{n+1} = w(Y_{n+1})X_n$. Then
$$E[X_{n+1}|Y_n,\dots,Y_1] = E[w(Y_{n+1})X_n|Y_n,\dots,Y_1] = X_nE[w(Y_{n+1})|Y_n,\dots,Y_1] = X_nE[w(Y_{n+1})], \text{ by independence}.$$
It remains to compute, with $w(y) = \tilde f(y)/f(y)$,
$$E[w(Y_{n+1})] = \int_{-\infty}^\infty w(y)f(y)\,dy = \int_{-\infty}^\infty\frac{\tilde f(y)}{f(y)}f(y)\,dy = \int_{-\infty}^\infty \tilde f(y)\,dy = 1.$$
Hence, $E[X_{n+1}|Y_n,\dots,Y_1] = X_n$; i.e., $X_n$ is a martingale with respect to $Y_n$.
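
Here is a small simulation sketch of this likelihood-ratio martingale (not part of the original solution; the choice $f = N(0,1)$, $\tilde f = N(0.5,1)$ is arbitrary): under $f$, the product $X_n$ has constant mean 1, as the computation above requires.

```python
import numpy as np
from scipy.stats import norm

# X_n = prod_{i<=n} w(Y_i) with w = f_tilde / f is a martingale under f,
# so E[X_n] = 1 for every n.  Here f = N(0,1) and f_tilde = N(0.5, 1).
rng = np.random.default_rng(5)
trials, N = 500_000, 10
Y = rng.standard_normal((trials, N))                 # Y_i ~ f
w = norm.pdf(Y, loc=0.5) / norm.pdf(Y)               # likelihood ratios
X = np.cumprod(w, axis=1)                            # martingale X_n
print(X.mean(axis=0))                                # all entries near 1
```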

63. To begin, write
$$w_{n+1}(y_1,\dots,y_{n+1}) = \frac{\tilde f_{Y_{n+1}\cdots Y_1}(y_{n+1},\dots,y_1)}{f_{Y_{n+1}\cdots Y_1}(y_{n+1},\dots,y_1)} = \frac{\tilde f_{Y_{n+1}|Y_n\cdots Y_1}(y_{n+1}|y_n,\dots,y_1)}{f_{Y_{n+1}|Y_n\cdots Y_1}(y_{n+1}|y_n,\dots,y_1)}\cdot\frac{\tilde f_{Y_n\cdots Y_1}(y_n,\dots,y_1)}{f_{Y_n\cdots Y_1}(y_n,\dots,y_1)}.$$
If we put
$$\tilde w_{n+1}(y_{n+1},\dots,y_1) := \frac{\tilde f_{Y_{n+1}|Y_n\cdots Y_1}(y_{n+1}|y_n,\dots,y_1)}{f_{Y_{n+1}|Y_n\cdots Y_1}(y_{n+1}|y_n,\dots,y_1)},$$
then $X_{n+1} := w_{n+1}(Y_1,\dots,Y_{n+1}) = \tilde w_{n+1}(Y_{n+1},\dots,Y_1)X_n$, where $X_n$ is a function of $Y_1,\dots,Y_n$. We can now write
$$E[X_{n+1}|Y_n,\dots,Y_1] = E[\tilde w_{n+1}(Y_{n+1},\dots,Y_1)X_n|Y_n,\dots,Y_1] = X_nE[\tilde w_{n+1}(Y_{n+1},\dots,Y_1)|Y_n,\dots,Y_1].$$
We now show that this last factor is equal to one. Write
$$E[\tilde w_{n+1}(Y_{n+1},Y_n,\dots,Y_1)|Y_n = y_n,\dots,Y_1 = y_1] = \int_{-\infty}^\infty \tilde w_{n+1}(y,y_n,\dots,y_1)f_{Y_{n+1}|Y_n\cdots Y_1}(y|y_n,\dots,y_1)\,dy = \int_{-\infty}^\infty \tilde f_{Y_{n+1}|Y_n\cdots Y_1}(y|y_n,\dots,y_1)\,dy = 1.$$

64. Since $W_k$ depends on $Y_k$ and $Y_{k-1}$, and since $X_n = W_1 + \cdots + W_n$, $X_n$ depends on $Y_0,\dots,Y_n$. Note also that $X_{n+1} = X_n + W_{n+1}$. Now write
$$E[X_{n+1}|Y_n,\dots,Y_0] = X_n + E[W_{n+1}|Y_n,\dots,Y_0].$$
Next,
$$E[W_{n+1}|Y_n = y_n,\dots,Y_0 = y_0] = E\Big[Y_{n+1} - \int_{-\infty}^\infty zp(z|Y_n)\,dz\,\Big|\,Y_n = y_n,\dots,Y_0 = y_0\Big] = E\Big[Y_{n+1} - \int_{-\infty}^\infty zp(z|y_n)\,dz\,\Big|\,Y_n = y_n,\dots,Y_0 = y_0\Big]$$
$$= E[Y_{n+1}|Y_n = y_n,\dots,Y_0 = y_0] - \int_{-\infty}^\infty zp(z|y_n)\,dz = \int_{-\infty}^\infty zf_{Y_{n+1}|Y_n\cdots Y_0}(z|y_n,\dots,y_0)\,dz - \int_{-\infty}^\infty zp(z|y_n)\,dz$$
$$= \int_{-\infty}^\infty zf_{Y_{n+1}|Y_n}(z|y_n)\,dz - \int_{-\infty}^\infty zp(z|y_n)\,dz = \int_{-\infty}^\infty zp(z|y_n)\,dz - \int_{-\infty}^\infty zp(z|y_n)\,dz = 0.$$
Hence, $E[X_{n+1}|Y_n,\dots,Y_0] = X_n$; i.e., $X_n$ is a martingale with respect to $Y_n$.

65. First write
$$E[Y_{n+1}|X_n = i_n,\dots,X_0 = i_0] = E\big[\rho^{X_{n+1}}\,\big|\,X_n = i_n,\dots,X_0 = i_0\big] = \sum_j\rho^jP(X_{n+1} = j|X_n = i_n,\dots,X_0 = i_0) = \sum_j\rho^jP(X_{n+1} = j|X_n = i_n).$$
Next, for $0 < i < N$,
$$\sum_j\rho^jP(X_{n+1} = j|X_n = i) = \rho^{i-1}(1-a) + \rho^{i+1}a = \Big(\frac{1-a}{a}\Big)^{i-1}(1-a) + \Big(\frac{1-a}{a}\Big)^{i+1}a = \Big(\frac{1-a}{a}\Big)^{i-1}\Big[1-a + \Big(\frac{1-a}{a}\Big)^2a\Big]$$
$$= \Big(\frac{1-a}{a}\Big)^{i-1}\Big[1-a + \frac{(1-a)^2}{a}\Big] = \frac{(1-a)^i}{a^{i-1}}\Big[1 + \frac{1-a}{a}\Big]\cdot\frac{1}{1/a}\cdot\frac1a = \rho^i,$$
since $1 + (1-a)/a = 1/a$. If $X_n = 0$ or $X_n = N$, then $X_{n+1} = X_n$, and so
$$E[\rho^{X_{n+1}}|X_n,\dots,X_0] = E[\rho^{X_n}|X_n,\dots,X_0] = \rho^{X_n} = Y_n.$$
Hence, in all cases, $E[Y_{n+1}|X_n,\dots,X_0] = Y_n$, and so $Y_n$ is a martingale with respect to $X_n$.

66. First, since $X_n$ is a submartingale, it is clear that
$$A_{n+1} := A_n + (E[X_{n+1}|Y_n,\dots,Y_1] - X_n) \ge 0$$
and is a function of $Y_1,\dots,Y_n$. To show that $M_n$ is a martingale, write
$$E[M_{n+1}|Y_n,\dots,Y_1] = E[X_{n+1} - A_{n+1}|Y_n,\dots,Y_1] = E[X_{n+1}|Y_n,\dots,Y_1] - A_{n+1} = E[X_{n+1}|Y_n,\dots,Y_1] - \big[A_n + (E[X_{n+1}|Y_n,\dots,Y_1] - X_n)\big] = X_n - A_n = M_n.$$

67. Without loss of generality, assume $f_1 \le f_2$. Then
$$E[Z_{f_1}Z_{f_2}^*] = E[Z_{f_1}(Z_{f_2} - Z_{f_1})^*] + E[|Z_{f_1}|^2] = E[(Z_{f_1} - Z_{-1/2})(Z_{f_2} - Z_{f_1})^*] + E[|Z_{f_1}|^2] = E[Z_{f_1} - Z_{-1/2}]E[(Z_{f_2} - Z_{f_1})^*] + E[|Z_{f_1}|^2] = 0\cdot0 + E[|Z_{f_1}|^2] = \int_{-1/2}^{f_1}S(\nu)\,d\nu.$$

68. Using the geometric series formula,
$$\frac1n\sum_{k=1}^n e^{j2\pi fk} = \frac1n\cdot\frac{e^{j2\pi f} - (e^{j2\pi f})^{n+1}}{1 - e^{j2\pi f}} = \frac{e^{j2\pi f}}{n}\cdot\frac{1 - e^{j2\pi fn}}{1 - e^{j2\pi f}} = \frac{e^{j2\pi f}}{n}\cdot\frac{e^{j\pi fn}}{e^{j\pi f}}\cdot\frac{e^{-j\pi fn} - e^{j\pi fn}}{e^{-j\pi f} - e^{j\pi f}} = \frac{e^{j2\pi f}}{n}\cdot\frac{e^{j\pi fn}}{e^{j\pi f}}\cdot\frac{\sin(\pi fn)}{\sin(\pi f)}.$$

69. Following the hint, write
$$E[|Y|^2] = E[YY^*] = E\Big[\Big(\sum_{n=-N}^N d_nX_n\Big)\Big(\sum_{k=-N}^N d_kX_k\Big)^*\Big] = \sum_{n=-N}^N\sum_{k=-N}^N d_nd_k^*E[X_nX_k] = \sum_{n=-N}^N\sum_{k=-N}^N d_nd_k^*R(n-k)$$
$$= \sum_{n=-N}^N\sum_{k=-N}^N d_nd_k^*\int_{-1/2}^{1/2}S(f)e^{j2\pi f(n-k)}\,df = \int_{-1/2}^{1/2}S(f)\Big|\sum_{n=-N}^N d_ne^{j2\pi fn}\Big|^2\,df.$$
Hence, if $\sum_{n=-N}^N d_ne^{j2\pi fn} = 0$, then $E[|Y|^2] = 0$.

70. Write
$$E[T(G_0)T(H_0)^*] = E\Big[\Big(\sum_{n=-N}^N g_nX_n\Big)\Big(\sum_{k=-N}^N h_kX_k\Big)^*\Big] = \sum_{n=-N}^N\sum_{k=-N}^N g_nh_k^*R(n-k) = \sum_{n=-N}^N\sum_{k=-N}^N g_nh_k^*\int_{-1/2}^{1/2}S(f)e^{j2\pi f(n-k)}\,df$$
$$= \int_{-1/2}^{1/2}S(f)\Big(\sum_{n=-N}^N g_ne^{j2\pi fn}\Big)\Big(\sum_{k=-N}^N h_ke^{j2\pi fk}\Big)^*\,df = \int_{-1/2}^{1/2}S(f)G_0(f)H_0(f)^*\,df.$$

71. (a) Let $\tilde G_n$ be a second sequence of trigonometric polynomials converging to $G$, with $T(\tilde G_n) \to Y$. Write
$$\|T(G) - Y\|_2 \le \|T(G) - T(G_n)\|_2 + \|T(G_n) - T(\tilde G_n)\|_2 + \|T(\tilde G_n) - Y\|_2.$$
On the right-hand side, the first and third terms go to zero. To analyze the middle term, write
$$\|T(G_n) - T(\tilde G_n)\|_2 = \|T(G_n - \tilde G_n)\|_2 = \|G_n - \tilde G_n\| \le \|G_n - G\| + \|G - \tilde G_n\| \to 0.$$

(b) To show $T$ is norm preserving on $L^2(S)$, let $G_n \to G$ with $T(G_n) \to T(G)$. Then by Problem 21, $\|T(G_n)\|_2 \to \|T(G)\|_2$, and similarly, $\|G_n\| \to \|G\|$. Now write
$$\|T(G)\|_2 = \lim_{n\to\infty}\|T(G_n)\|_2 = \lim_{n\to\infty}\|G_n\|, \text{ since each } G_n \text{ is a trigonometric polynomial,} = \|G\|.$$

(c) To show $T$ is linear on $L^2(S)$, fix $G,H \in L^2(S)$, and let $G_n$ and $H_n$ be trigonometric polynomials converging to $G$ and $H$, respectively. Then
$$\alpha G_n + \beta H_n \to \alpha G + \beta H, \quad (*)$$
and we can write
$$\|T(\alpha G + \beta H) - [\alpha T(G) + \beta T(H)]\|_2 \le \|T(\alpha G + \beta H) - T(\alpha G_n + \beta H_n)\|_2 + \|T(\alpha G_n + \beta H_n) - [\alpha T(G_n) + \beta T(H_n)]\|_2 + \|[\alpha T(G_n) + \beta T(H_n)] - [\alpha T(G) + \beta T(H)]\|_2.$$
Now, the first term on the right goes to zero on account of $(*)$ and the definition of $T$. The second term on the right is equal to zero because $T$ is linear on trigonometric polynomials. The third term goes to zero upon observing that
$$\|[\alpha T(G_n) + \beta T(H_n)] - [\alpha T(G) + \beta T(H)]\|_2 \le |\alpha|\,\|T(G_n) - T(G)\|_2 + |\beta|\,\|T(H_n) - T(H)\|_2.$$

(d) Using parts (b) and (c), $\|T(G) - T(H)\|_2 = \|T(G - H)\|_2 = \|G - H\|$ implies that $T$ is actually uniformly continuous.

72. It suffices to show that $I_{[-1/2,f]} \in L^2(S)$. Write
$$\int_{-1/2}^{1/2}\big|I_{[-1/2,f]}(\nu)\big|^2S(\nu)\,d\nu = \int_{-1/2}^f S(\nu)\,d\nu \le \int_{-1/2}^{1/2}S(\nu)\,d\nu = R(0) = E[X_n^2] < \infty.$$

73. (a) We know that for trigonometric polynomials, $E[T(G)] = 0$. Hence, if $G_n$ is a sequence of trigonometric polynomials converging to $G$ in $L^2(S)$, then $T(G_n) \to T(G)$ in $L^2$, and then $E[T(G_n)] \to E[T(G)]$, so that $E[T(G)] = 0$.

(b) For trigonometric polynomials $G$ and $H$, we have
$$\langle T(G),T(H)\rangle := E[T(G)T(H)^*] = \int_{-1/2}^{1/2}G(f)H(f)^*S(f)\,df.$$
If $G_n$ and $H_n$ are sequences of trigonometric polynomials converging to $G$ and $H$ in $L^2(S)$, then we can use the result of Problem 24 to write
$$\langle T(G),T(H)\rangle = \lim_{n\to\infty}\langle T(G_n),T(H_n)\rangle = \lim_{n\to\infty}\int_{-1/2}^{1/2}G_n(f)H_n(f)^*S(f)\,df = \int_{-1/2}^{1/2}G(f)H(f)^*S(f)\,df.$$

(c) For $-1/2 \le f_1 < f_2 \le f_3 < f_4 \le 1/2$, write
$$E[(Z_{f_2} - Z_{f_1})(Z_{f_4} - Z_{f_3})^*] = E[T(I_{(f_1,f_2]})T(I_{(f_3,f_4]})^*] = \int_{-1/2}^{1/2}I_{(f_1,f_2]}(f)I_{(f_3,f_4]}(f)S(f)\,df = 0.$$

74. (a) We take
$$L^2(S_Y) := \Big\{G : \int_{-1/2}^{1/2}|G(f)|^2S_Y(f)\,df < \infty\Big\}.$$

(b) Write
$$T_Y(G_0) = \sum_{n=-N}^N g_nY_n = \sum_{n=-N}^N g_n\sum_{k=-\infty}^\infty h_kX_{n-k} = \sum_{n=-N}^N g_n\sum_{k=-\infty}^\infty h_k\int_{-1/2}^{1/2}e^{j2\pi f(n-k)}\,dZ_f = \int_{-1/2}^{1/2}G_0(f)H(f)\,dZ_f.$$

(c) For $G \in L^2(S_Y)$, let $G_n$ be a sequence of trigonometric polynomials converging to $G$ in $L^2(S_Y)$. Since $T_Y$ is continuous,
$$T_Y(G) = \lim_{n\to\infty}T_Y(G_n) = \lim_{n\to\infty}\int_{-1/2}^{1/2}G_n(f)H(f)\,dZ_f = \lim_{n\to\infty}T(G_nH) = T(GH), \text{ since } T \text{ is continuous,} = \int_{-1/2}^{1/2}G(f)H(f)\,dZ_f.$$

(d) Using part (c),
$$V_f := T_Y(I_{[-1/2,f]}) = \int_{-1/2}^{1/2}I_{[-1/2,f]}(\nu)H(\nu)\,dZ_\nu \quad (*)$$
$$= \int_{-1/2}^f H(\nu)\,dZ_\nu.$$

(e) A slight generalization of $(*)$ establishes the result for piecewise-constant functions. For general $G \in L^2(S_Y)$, approximate $G$ by a sequence of piecewise-constant functions $G_n$ and write
$$\int_{-1/2}^{1/2}G(f)\,dV_f = \lim_{n\to\infty}\int_{-1/2}^{1/2}G_n(f)\,dV_f = \lim_{n\to\infty}\int_{-1/2}^{1/2}G_n(f)H(f)\,dZ_f = \lim_{n\to\infty}T(G_nH) = T(GH) = \int_{-1/2}^{1/2}G(f)H(f)\,dZ_f.$$


CHAPTER 14

Problem Solutions

1. We must show that $P(|X_n| \ge \varepsilon) \to 0$. Since $X_n \sim \text{Cauchy}(1/n)$ has an even density, we can write
$$P(|X_n| \ge \varepsilon) = 2\int_\varepsilon^\infty\frac{1/(n\pi)}{(1/n)^2 + x^2}\,dx = 2\int_{n\varepsilon}^\infty\frac{1/\pi}{1 + y^2}\,dy \to 0.$$
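
For this example the probability can also be evaluated in closed form, $P(|X_n| \ge \varepsilon) = 1 - (2/\pi)\tan^{-1}(n\varepsilon)$, so the convergence is easy to tabulate; a brief sketch (not part of the original solution):

```python
import numpy as np

# P(|X_n| >= eps) for X_n ~ Cauchy(1/n): closed form vs Monte Carlo.
rng = np.random.default_rng(6)
eps = 0.1
for n in [1, 10, 100, 1000]:
    closed = 1 - (2 / np.pi) * np.arctan(n * eps)
    samples = rng.standard_cauchy(200_000) / n       # scale parameter 1/n
    print(n, closed, np.mean(np.abs(samples) >= eps))
```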

2. Since $|c_n - c| \to 0$, given $\varepsilon > 0$, there is an $N$ such that for $n \ge N$, $|c_n - c| < \varepsilon$. For such $n$,
$$\{\omega \in \Omega : |c_n - c| \ge \varepsilon\} = \varnothing,$$
and so $P(|c_n - c| \ge \varepsilon) = 0$.

3. We show that $X_n$ converges in probability to zero. Observe that $X_n$ takes only the values $n$ and zero. Hence, for $0 < \varepsilon < 1$, $|X_n| \ge \varepsilon$ if and only if $X_n = n$, which happens if and only if $U \le 1/\sqrt n$. We can now write
$$P(|X_n| \ge \varepsilon) = P(U \le 1/\sqrt n\,) = 1/\sqrt n \to 0.$$

4. Write
$$P(|X_n| \ge \varepsilon) = P(|V| \ge \varepsilon c_n) = 2\int_{\varepsilon c_n}^\infty f_V(v)\,dv \to 0,$$
since $c_n \to \infty$.

5. Using the hint,
$$P(X \ne Y) = P\Big(\bigcup_{k=1}^\infty\{|X - Y| \ge 1/k\}\Big) = \lim_{K\to\infty}P(|X - Y| \ge 1/K).$$
We now show that the above limit is zero. To begin, observe that
$$|X_n - X| < 1/(2K) \quad\text{and}\quad |X_n - Y| < 1/(2K)$$
imply
$$|X - Y| = |X - X_n + X_n - Y| \le |X - X_n| + |X_n - Y| < 1/(2K) + 1/(2K) = 1/K.$$
Hence, $|X - Y| \ge 1/K$ implies $|X_n - X| \ge 1/(2K)$ or $|X_n - Y| \ge 1/(2K)$, and we can write
$$P(|X - Y| \ge 1/K) \le P\big(\{|X_n - X| \ge 1/(2K)\}\cup\{|X_n - Y| \ge 1/(2K)\}\big) \le P\big(|X_n - X| \ge 1/(2K)\big) + P\big(|X_n - Y| \ge 1/(2K)\big),$$
which goes to zero as $n \to \infty$.


6. Let $\varepsilon > 0$ be given. We must show that for every $\eta > 0$, for all sufficiently large $n$, $P(|X_n - X| \ge \varepsilon) < \eta$. Without loss of generality, assume $\eta < \varepsilon$. Let $n$ be so large that $P(|X_n - X| \ge \eta) < \eta$. Then since $\eta < \varepsilon$,
$$P(|X_n - X| \ge \varepsilon) \le P(|X_n - X| \ge \eta) < \eta.$$

7. (a) Observe that
$$P(|X| > \alpha) = 1 - P(|X| \le \alpha) = 1 - P(-\alpha \le X \le \alpha) = 1 - [F_X(\alpha) - F_X((-\alpha)^-)] = 1 - F_X(\alpha) + F_X((-\alpha)^-) \le 1 - F_X(\alpha) + F_X(-\alpha).$$
Now, since $F_X(x) \to 1$ as $x \to \infty$ and $F_X(x) \to 0$ as $x \to -\infty$, for large $\alpha$, $1 - F_X(\alpha) < \varepsilon/8$ and $F_X(-\alpha) < \varepsilon/8$, and then $P(|X| > \alpha) < \varepsilon/4$. Similarly, for large $\beta$, $P(|Y| > \beta) < \varepsilon/4$.

(b) Observe that if the four conditions hold, then
$$|X_n| = |X_n - X + X| \le |X_n - X| + |X| < \delta + \alpha < 2\alpha,$$
and similarly, $|Y_n| \le 2\beta$. Now that both $(X_n,Y_n)$ and $(X,Y)$ lie in the rectangle,
$$|g(X_n,Y_n) - g(X,Y)| < \varepsilon.$$

(c) By part (b), observe that
$$\{|g(X_n,Y_n) - g(X,Y)| \ge \varepsilon\} \subset \{|X_n - X| \ge \delta\}\cup\{|Y_n - Y| \ge \delta\}\cup\{|X| > \alpha\}\cup\{|Y| > \beta\}.$$
Hence,
$$P(|g(X_n,Y_n) - g(X,Y)| \ge \varepsilon) \le P(|X_n - X| \ge \delta) + P(|Y_n - Y| \ge \delta) + P(|X| > \alpha) + P(|Y| > \beta) < \varepsilon/4 + \varepsilon/4 + \varepsilon/4 + \varepsilon/4 = \varepsilon.$$

8. (a) Since the $X_i$ are i.i.d., the $X_i^2$ are i.i.d. and therefore uncorrelated and have common mean $E[X_i^2] = \sigma^2 + m^2$ and common variance
$$E\big[(X_i^2 - E[X_i^2])^2\big] = E[X_i^4] - (E[X_i^2])^2 = \gamma_4 - (\sigma^2 + m^2)^2.$$
By the weak law of large numbers, $V_n$ converges in probability to $E[X_i^2] = \sigma^2 + m^2$.

(b) Observe that $S_n^2 = g(M_n,V_n)\,n/(n-1)$, where $g(m,v) := v - m^2$ is continuous. Hence, by the preceding problem, $g(M_n,V_n)$ converges in probability to $g(m,\sigma^2 + m^2) = (\sigma^2 + m^2) - m^2 = \sigma^2$. Next, by Problem 2, $n/(n-1)$ converges in probability to 1, and then the product $[n/(n-1)]g(M_n,V_n)$ converges in probability to $1\cdot\sigma^2 = \sigma^2$.


9. (a) Since $X_n$ converges in probability to $X$, with $\varepsilon = 1$ we have $P(|X_n - X| \ge 1) \to 0$ as $n \to \infty$. Now, if $|X - X_n| < 1$, then $\big||X| - |X_n|\big| < 1$, and it follows that
$$|X| < 1 + |X_n| \le 1 + |Y|.$$
Equivalently, $|X| \ge |Y| + 1$ implies $|X_n - X| \ge 1$. Hence,
$$P(|X| \ge |Y| + 1) \le P(|X_n - X| \ge 1) \to 0.$$

(b) Following the hint, write
$$E[|X_n - X|] = E[|X_n - X|I_{A_n}] + E[|X_n - X|I_{A_n^c}] \le E\big[(|X_n| + |X|)I_{A_n}\big] + \varepsilon P(A_n^c) \le E\big[(Y + |X|)I_{A_n}\big] + \varepsilon \le \varepsilon + \varepsilon,$$
since for large $n$, $P(A_n) < \delta$ implies $E\big[(Y + |X|)I_{A_n}\big] < \varepsilon$. Hence, $E[|X_n - X|] < 2\varepsilon$.

10. (a) Suppose $g(x)$ is bounded, nonnegative, and $g(x) \to 0$ as $x \to 0$. Then given $\varepsilon > 0$, there exists a $\delta > 0$ such that $g(x) < \varepsilon/2$ for all $|x| < \delta$. For $|x| \ge \delta$, we use the fact that $g$ is bounded to write $g(x) \le G$ for some positive, finite $G$. Since $X_n$ converges to zero in probability, for large $n$, $P(|X_n| \ge \delta) < \varepsilon/(2G)$. Now write
$$E[g(X_n)] = E[g(X_n)I_{[0,\delta)}(|X_n|)] + E[g(X_n)I_{[\delta,\infty)}(|X_n|)] \le E[(\varepsilon/2)I_{[0,\delta)}(|X_n|)] + E[GI_{[\delta,\infty)}(|X_n|)] = \frac\varepsilon2P(|X_n| < \delta) + GP(|X_n| \ge \delta) < \frac\varepsilon2 + G\frac{\varepsilon}{2G} = \varepsilon.$$

(b) By applying part (a) to the function $g(x) = |x|/(1 + |x|)$, it follows that if $X_n$ converges in probability to zero, then
$$\lim_{n\to\infty}E\Big[\frac{|X_n|}{1 + |X_n|}\Big] = 0.$$
Now we show that if the above limit holds, then $X_n$ must converge in probability to zero. Following the hint, we use the fact that $g(x) = x/(1 + x)$ is an increasing function for $x \ge 0$. Write
$$E[g(|X_n|)] = E[g(|X_n|)I_{[\varepsilon,\infty)}(|X_n|)] + E[g(|X_n|)I_{[0,\varepsilon)}(|X_n|)] \ge E[g(|X_n|)I_{[\varepsilon,\infty)}(|X_n|)] \ge E[g(\varepsilon)I_{[\varepsilon,\infty)}(|X_n|)] = g(\varepsilon)P(|X_n| \ge \varepsilon).$$
Thus, if $g$ is nonnegative and nondecreasing and $E[g(|X_n|)] \to 0$, then $X_n$ converges in probability to zero.


11. First note that for the constant random variable $Y \equiv c$, $F_Y(y) = I_{[c,\infty)}(y)$. Similarly, for $Y_n \equiv c_n$, $F_{Y_n}(y) = I_{[c_n,\infty)}(y)$. Since the only point at which $F_Y$ is not continuous is $y = c$, we must show that $I_{[c_n,\infty)}(y) \to I_{[c,\infty)}(y)$ for all $y \ne c$. Consider a $y$ with $c < y$. For all sufficiently large $n$, $c_n$ will be very close to $c$, so close that $c_n < y$, which implies $F_{Y_n}(y) = 1 = F_Y(y)$. Now consider $y < c$. For all sufficiently large $n$, $c_n$ will be very close to $c$, so close that $y < c_n$, which implies $F_{Y_n}(y) = 0 = F_Y(y)$.

12. For $0 < c < \infty$, $F_Y(y) = P(cX \le y) = F_X(y/c)$, and $F_{Y_n}(y) = F_X(y/c_n)$. Now, $y$ is a continuity point of $F_Y$ if and only if $y/c$ is a continuity point of $F_X$. For such $y$, since $y/c_n \to y/c$, $F_X(y/c_n) \to F_X(y/c)$. For $c = 0$, $Y \equiv 0$, and $F_Y(y) = I_{[0,\infty)}(y)$. For $y \ne 0$,
$$y/c_n \to \begin{cases} +\infty, & y > 0,\\ -\infty, & y < 0,\end{cases} \quad\text{and so}\quad F_{Y_n}(y) = F_X(y/c_n) \to \begin{cases} 1, & y > 0,\\ 0, & y < 0,\end{cases}$$
which is exactly $F_Y(y)$ for $y \ne 0$.

13. Since $X_t \le Y_t \le Z_t$,
$$\{Z_t \le y\} \subset \{Y_t \le y\} \subset \{X_t \le y\},$$
we can write $P(Z_t \le y) \le P(Y_t \le y) \le P(X_t \le y)$, or $F_{Z_t}(y) \le F_{Y_t}(y) \le F_{X_t}(y)$, and it follows that
$$\underbrace{\lim_{t\to\infty}F_{Z_t}(y)}_{=\,F(y)} \le \lim_{t\to\infty}F_{Y_t}(y) \le \underbrace{\lim_{t\to\infty}F_{X_t}(y)}_{=\,F(y)}.$$
Thus, $F_{Y_t}(y) \to F(y)$.

14. Since $X_n$ converges in mean to $X$, the inequality
$$\big|E[X_n] - E[X]\big| = \big|E[X_n - X]\big| \le E[|X_n - X|]$$
shows that $E[X_n] \to E[X]$. We now need the following implications:
conv. in mean $\Rightarrow$ conv. in probability $\Rightarrow$ conv. in distribution $\Rightarrow$ $\varphi_{X_n}(\nu) \to \varphi_X(\nu)$.
Since $X_n$ is exponential, we also have
$$\varphi_{X_n}(\nu) = \frac{1/E[X_n]}{1/E[X_n] - j\nu} \to \frac{1/E[X]}{1/E[X] - j\nu}.$$
Since limits are unique, the above right-hand side must be $\varphi_X(\nu)$, which implies $X$ is an exponential random variable.

15. Since $X_n$ and $Y_n$ each converge in distribution to constants $x$ and $y$, respectively, they also converge in probability. Hence, as noted in the text, $X_n + Y_n$ converges in probability to $x + y$. Since convergence in probability implies convergence in distribution, $X_n + Y_n$ converges in distribution to $x + y$.


16. Following the hint, we note that each $Y_n$ is a finite linear combination of independent Gaussian increments. Hence, each $Y_n$ is Gaussian. Since $Y$ is the mean-square limit of the Gaussian $Y_n$, the distribution of $Y$ is also Gaussian by the example cited in the hint. Furthermore, since each $Y_n = \int_0^\infty g_n(\tau)\,dW_\tau$, $Y_n$ has zero mean and variance $\sigma^2\int_0^\infty g_n(\tau)^2\,d\tau = \sigma^2\|g_n\|^2$. By the cited example, $Y$ has zero mean. Also, since
$$\big|\|g_n\| - \|g\|\big| \le \|g_n - g\| \to 0,$$
we have
$$\operatorname{var}(Y) = E[Y^2] = \lim_{n\to\infty}E[Y_n^2] = \lim_{n\to\infty}\operatorname{var}(Y_n) = \lim_{n\to\infty}\sigma^2\|g_n\|^2 = \sigma^2\|g\|^2 = \sigma^2\int_0^\infty g(\tau)^2\,d\tau.$$

17. For any constants $c_1,\dots,c_n$, write
$$\sum_{i=1}^n c_iX_{t_i} = \sum_{i=1}^n c_i\int_0^\infty g(t_i,\tau)\,dW_\tau = \int_0^\infty\underbrace{\Big(\sum_{i=1}^n c_ig(t_i,\tau)\Big)}_{=:\,g(\tau)}\,dW_\tau,$$
which is normal by the previous problem.

18. The plan is to show that the increments are Gaussian and uncorrelated. It will then follow that the increments are independent. For $0 \le u < v \le s < t < \infty$, write
$$\begin{bmatrix} X_v - X_u\\ X_t - X_s\end{bmatrix} = \begin{bmatrix} -1 & 1 & 0 & 0\\ 0 & 0 & -1 & 1\end{bmatrix}\begin{bmatrix} X_u\\ X_v\\ X_s\\ X_t\end{bmatrix}.$$
By writing
$$X_t := \int_0^t g(\tau)\,dW_\tau = \int_0^\infty g(\tau)I_{[0,t]}(\tau)\,dW_\tau = \int_0^\infty h(t,\tau)\,dW_\tau,$$
where $h(t,\tau) := g(\tau)I_{[0,t]}(\tau)$, we see that by the preceding problem, the vector on the right above is Gaussian, and hence, so are the increments in the vector on the left. Next,
$$E[(X_t - X_s)(X_v - X_u)] = E\Big[\int_s^t g(\tau)\,dW_\tau\int_u^v g(\tau)\,dW_\tau\Big] = E\Big[\int_0^\infty g(\tau)I_{(s,t]}(\tau)\,dW_\tau\int_0^\infty g(\tau)I_{(u,v]}(\tau)\,dW_\tau\Big] = \sigma^2\int_0^\infty g(\tau)^2I_{(s,t]}(\tau)I_{(u,v]}(\tau)\,d\tau = 0.$$

19. Given a linear combination $\sum_{k=1}^n c_kA_k$, put $g(\tau) := \sum_{k=1}^n c_k\varphi_k(\tau)$. Then
$$\sum_{k=1}^n c_kA_k = \sum_{k=1}^n c_k\Big[\int_a^b X_\tau\varphi_k(\tau)\,d\tau\Big] = \int_a^b X_\tau\Big[\sum_{k=1}^n c_k\varphi_k(\tau)\Big]d\tau = \int_a^b g(\tau)X_\tau\,d\tau,$$
which is a mean-square limit of sums of the form
$$\sum_i g(\tau_i)X_{\tau_i}(t_i - t_{i-1}).$$
Since $X_\tau$ is a Gaussian process, these sums are Gaussian, and hence, so is their mean-square limit.

20. If $M_{X_n}(s) \to M_X(s)$, then this holds when $s = j\nu$; i.e., $\varphi_{X_n}(\nu) \to \varphi_X(\nu)$. But this implies that $X_n$ converges in distribution to $X$. Similarly, if $G_{X_n}(z) \to G_X(z)$, then this holds when $z = e^{j\nu}$; i.e., $\varphi_{X_n}(\nu) \to \varphi_X(\nu)$. But this implies $X_n$ converges in distribution to $X$.

21. (a) Write
$$p_n(k) = F_{X_n}(k + 1/2) - F_{X_n}(k - 1/2) \to F_X(k + 1/2) - F_X(k - 1/2) = p(k),$$
where we have used the fact that since $X$ is integer valued, $k \pm 1/2$ is a continuity point of $F_X$.

(b) The continuity points of $F_X$ are the noninteger values of $x$. For such $x > 0$, suppose $k < x < k + 1$. Then
$$F_{X_n}(x) = P(X_n \le x) = P(X_n \le k) = \sum_{i=0}^k p_n(i) \to \sum_{i=0}^k p(i) = P(X \le k) = F_X(x).$$

22. Let
$$p_n(k) := P(X_n = k) = \binom{n}{k}p_n^k(1 - p_n)^{n-k} = \frac{n!}{k!(n-k)!}p_n^k(1 - p_n)^{n-k}$$
and $p(k) := P(X = k) = \lambda^ke^{-\lambda}/k!$. Next, by Stirling's formula, as $n \to \infty$,
$$q_n := \frac{\sqrt{2\pi}\,n^{n+1/2}e^{-n}}{n!} \to 1 \quad\text{and}\quad r_n(k) := \frac{(n-k)!}{\sqrt{2\pi}\,(n-k)^{n-k+1/2}e^{-(n-k)}} \to 1,$$
and so $q_nr_n(k) \to 1$ as well. If we can show that $p_n(k)q_nr_n(k) \to p(k)$, then
$$\lim_{n\to\infty}p_n(k) = \lim_{n\to\infty}\frac{p_n(k)q_nr_n(k)}{q_nr_n(k)} = \frac{\lim_{n\to\infty}p_n(k)q_nr_n(k)}{\lim_{n\to\infty}q_nr_n(k)} = \frac{p(k)}{1} = p(k).$$
Now write
$$\lim_{n\to\infty}p_n(k)q_nr_n(k) = \lim_{n\to\infty}p_n(k)\frac{\sqrt{2\pi}\,n^{n+1/2}e^{-n}}{n!}\cdot\frac{(n-k)!}{\sqrt{2\pi}\,(n-k)^{n-k+1/2}e^{-(n-k)}} = \frac{e^{-k}}{k!}\lim_{n\to\infty}p_n^k(1 - p_n)^{n-k}\cdot\frac{n^{n+1/2}}{(n-k)^{n-k+1/2}}$$
$$= \frac{e^{-k}}{k!}\lim_{n\to\infty}p_n^k(1 - p_n)^{n-k}\cdot\frac{n^{n+1/2}}{n^{n-k+1/2}(1 - k/n)^{n-k+1/2}} = \frac{e^{-k}}{k!}\lim_{n\to\infty}\frac{(np_n)^k}{(1 - k/n)^{-k+1/2}}\cdot\frac{[1 - (np_n)/n]^n(1 - p_n)^{-k}}{(1 - k/n)^n}$$
$$= \frac{e^{-k}}{k!}\cdot\frac{\lambda^k}{1}\cdot\frac{e^{-\lambda}\cdot1}{e^{-k}} = \frac{\lambda^ke^{-\lambda}}{k!} = p(k).$$
Note that since $np_n \to \lambda$, $p_n \to 0$, and so $(1 - p_n)^{-k} \to 1$.
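
The limit $p_n(k) \to p(k)$ is quick to see numerically; a brief sketch (not part of the original solution) comparing the binomial$(n,\lambda/n)$ pmf with the Poisson$(\lambda)$ pmf:

```python
from math import comb, exp, factorial

# Binomial(n, lambda/n) pmf approaches the Poisson(lambda) pmf as n grows.
lam, k = 3.0, 4
for n in [10, 100, 1000, 10000]:
    p = lam / n
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    print(n, binom)
print("Poisson:", lam**k * exp(-lam) / factorial(k))
```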

23. First write
$$G_{X_n}(z) = [(1 - p_n) + p_nz]^n = [1 + p_n(z - 1)]^n = \Big[1 + \frac{np_n(z - 1)}{n}\Big]^n.$$
Since $np_n(z - 1) \to \lambda(z - 1)$, $G_{X_n}(z) \to e^{\lambda(z-1)} = G_X(z)$.

24. (a) [Sketches omitted: the indicator $I_{(-\infty,a]}(t)$, the function $g_{a,b}(t)$, which equals 1 for $t \le a$ and decreases linearly to 0 at $t = b$, and the indicator $I_{(-\infty,b]}(t)$, each plotted against $t$ with the points $a$ and $b$ marked.]

(b) From part (a) we can write
$$I_{(-\infty,a]}(Y) \le g_{a,b}(Y) \le I_{(-\infty,b]}(Y),$$
from which it follows that
$$\underbrace{E[I_{(-\infty,a]}(Y)]}_{=\,P(Y\le a)} \le E[g_{a,b}(Y)] \le \underbrace{E[I_{(-\infty,b]}(Y)]}_{=\,P(Y\le b)},$$
or
$$F_Y(a) \le E[g_{a,b}(Y)] \le F_Y(b).$$

(c) Since $F_{X_n}(x) \le E[g_{x,x+\Delta x}(X_n)]$,
$$\varlimsup_{n\to\infty}F_{X_n}(x) \le \lim_{n\to\infty}E[g_{x,x+\Delta x}(X_n)] = E[g_{x,x+\Delta x}(X)] \le F_X(x + \Delta x).$$

(d) Similarly,
$$F_X(x - \Delta x) \le E[g_{x-\Delta x,x}(X)] = \lim_{n\to\infty}E[g_{x-\Delta x,x}(X_n)] \le \varliminf_{n\to\infty}F_{X_n}(x).$$

(e) If $x$ is a continuity point of $F_X$, then given any $\varepsilon > 0$, for sufficiently small $\Delta x$,
$$F_X(x) - \varepsilon < F_X(x - \Delta x) \quad\text{and}\quad F_X(x + \Delta x) < F_X(x) + \varepsilon.$$
Combining this with parts (c) and (d), we obtain
$$F_X(x) - \varepsilon \le \varliminf_{n\to\infty}F_{X_n}(x) \le \varlimsup_{n\to\infty}F_{X_n}(x) < F_X(x) + \varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, the liminf and the limsup are equal, $\lim_{n\to\infty}F_{X_n}(x)$ exists and is equal to $F_X(x)$.

25. If $X_n$ converges in distribution to zero, then $X_n$ converges in probability to zero, and by Problem 10,
$$\lim_{n\to\infty}E\Big[\frac{|X_n|}{1 + |X_n|}\Big] = 0.$$
Conversely, if the above limit holds, then by Problem 10, $X_n$ converges in probability to zero, which implies convergence in distribution to zero.

26. Observe that $f_n(x) = nf(nx)$ implies
$$F_n(x) = F(nx) \to \begin{cases} 1, & x > 0,\\ F(0), & x = 0,\\ 0, & x < 0.\end{cases}$$
So, for $x \ne 0$, $F_n(x) \to I_{[0,\infty)}(x)$, which is the cdf of $X \equiv 0$. In other words, $X_n$ converges in distribution to zero, which implies convergence in probability to zero.

27. Since $X_n$ converges in mean square to $X$, $X_n$ converges in distribution to $X$. Since $g(x) := x^2e^{-x^2}$ is a bounded continuous function, $E[g(X_n)] \to E[g(X)]$.

28. Let $0 < \delta < 1$ be given. For large $t$, we have $|u(t) - 1| < \delta$, or
$$-\delta < u(t) - 1 < \delta \quad\text{or}\quad 1 - \delta < u(t) < 1 + \delta.$$
Hence, for $z > 0$,
$$P(Z_t \le z(1 - \delta)) \le P(Z_t \le zu(t)) \le P(Z_t \le z(1 + \delta)).$$
Rewrite this as
$$F_{Z_t}(z(1 - \delta)) \le P(Z_t \le zu(t)) \le F_{Z_t}(z(1 + \delta)).$$
Then
$$F(z(1 - \delta)) = \lim_{t\to\infty}F_{Z_t}(z(1 - \delta)) \le \varliminf_{t\to\infty}P(Z_t \le zu(t))$$
and
$$\varlimsup_{t\to\infty}P(Z_t \le zu(t)) \le \lim_{t\to\infty}F_{Z_t}(z(1 + \delta)) = F(z(1 + \delta)).$$
Since $0 < \delta < 1$ is arbitrary, and since $F$ is continuous, we must have
$$\lim_{t\to\infty}P(Z_t \le zu(t)) = F(z).$$
The case for $z < 0$ is similar.


29. Rewrite $P(c(t)Z_t \le z)$ as $P(Z_t \le z/c(t)) = P(Z_t \le (z/c)(c/c(t)))$. Then if we put $u(t) := c/c(t)$, we have by the preceding problem that $P(c(t)Z_t \le z) \to F(z/c)$.

30. First write $F_{X_t}(x) = P(Z_t + s(t) \le x) = P(Z_t \le x - s(t))$. Let $\varepsilon > 0$ be given. Then $s(t) \to 0$ implies that for large $t$, $|s(t)| < \varepsilon$, or
$$-\varepsilon < s(t) < \varepsilon \quad\text{or}\quad -\varepsilon < -s(t) < \varepsilon.$$
Then
$$F_{Z_t}(x - \varepsilon) = P(Z_t \le x - \varepsilon) \le F_{X_t}(x) \le P(Z_t \le x + \varepsilon) = F_{Z_t}(x + \varepsilon).$$
It follows that
$$F(x - \varepsilon) \le \varliminf_{t\to\infty}F_{X_t}(x) \le \varlimsup_{t\to\infty}F_{X_t}(x) \le F(x + \varepsilon).$$
Since $F$ is continuous and $\varepsilon > 0$ is arbitrary,
$$\lim_{t\to\infty}F_{X_t}(x) = F(x).$$

31. Starting with $N_{\lfloor t\rfloor} \le N_t \le N_{\lceil t\rceil}$, it is easy to see that
$$X_t := \frac{\dfrac{N_{\lfloor t\rfloor}}{t} - \lambda}{\sqrt{\lambda/t}} \le Y_t \le \frac{\dfrac{N_{\lceil t\rceil}}{t} - \lambda}{\sqrt{\lambda/t}} =: Z_t.$$
According to Problem 13, it suffices to show that $X_t$ and $Z_t$ converge in distribution to $N(0,1)$ random variables. By the preceding two problems, the distribution limit of $Z_t$ is the same as that of $c(t)Z_t + s(t)$ if $c(t) \to 1$ and $s(t) \to 0$. We take
$$c(t) := \frac{t/\lceil t\rceil}{\sqrt{t/\lceil t\rceil}} = \sqrt{t/\lceil t\rceil} \to 1 \quad\text{and}\quad s(t) := \frac{\lambda(t/\lceil t\rceil - 1)}{\sqrt\lambda}\sqrt{\lceil t\rceil} \to 0.$$
Finally, observe that
$$c(t)Z_t + s(t) = \frac{\dfrac{N_{\lceil t\rceil}}{\lceil t\rceil} - \lambda}{\sqrt{\lambda/\lceil t\rceil}}$$
goes through the values of $Y_n$ and therefore converges in distribution to an $N(0,1)$ random variable. It is similar to show that the distribution limit of $X_t$ is also $N(0,1)$.

32. (a) Let $G := \{X_n \to X\}$. For $\omega \in G$,
$$\frac{1}{1 + X_n(\omega)^2} \to \frac{1}{1 + X(\omega)^2}.$$
Since $P(G^c) = 0$, $1/(1 + X_n^2)$ converges almost surely to $1/(1 + X^2)$.

(b) Since almost sure convergence implies convergence in probability, which implies convergence in distribution, we have $1/(1 + X_n^2)$ converging in distribution to $1/(1 + X^2)$. Since $g(x) = 1/(1 + x^2)$ is bounded and continuous, $E[g(X_n)] \to E[g(X)]$; i.e.,
$$\lim_{n\to\infty}E\Big[\frac{1}{1 + X_n^2}\Big] = E\Big[\frac{1}{1 + X^2}\Big].$$

33. Let $G_X := \{X_n \to X\}$ and $G_Y := \{Y_n \to Y\}$, where $P(G_X^c) = P(G_Y^c) = 0$. Let $G := \{g(X_n,Y_n) \to g(X,Y)\}$. We must show that $P(G^c) = 0$. Our plan is to show that $G_X\cap G_Y \subset G$. It follows that $G^c \subset G_X^c\cup G_Y^c$, and we can then write
$$P(G^c) \le P(G_X^c) + P(G_Y^c) = 0.$$
For $\omega \in G_X\cap G_Y$, $(X_n(\omega),Y_n(\omega)) \to (X(\omega),Y(\omega))$. Since $g(x,y)$ is continuous, for such $\omega$, $g(X_n(\omega),Y_n(\omega)) \to g(X(\omega),Y(\omega))$. Thus, $G_X\cap G_Y \subset G$.

34. Let $G_X := \{X_n \to X\}$ and $G_{XY} := \{X = Y\}$, where $P(G_X^c) = P(G_{XY}^c) = 0$. Put $G_Y := \{X_n \to Y\}$. We must show that $P(G_Y^c) = 0$. Our plan is to show that $G_X\cap G_{XY} \subset G_Y$. It follows that $G_Y^c \subset G_X^c\cup G_{XY}^c$, and we can then write
$$P(G_Y^c) \le P(G_X^c) + P(G_{XY}^c) = 0.$$
For $\omega \in G_X\cap G_{XY}$, $X_n(\omega) \to X(\omega)$ and $X(\omega) = Y(\omega)$. Hence, for such $\omega$, $X_n(\omega) \to Y(\omega)$; i.e., $\omega \in G_Y$. Thus, $G_X\cap G_{XY} \subset G_Y$.

35. Let $G_X := \{X_n \to X\}$ and $G_Y := \{X_n \to Y\}$, where $P(G_X^c) = P(G_Y^c) = 0$. Put $G_{XY} := \{X = Y\}$. We must show that $P(G_{XY}^c) = 0$. Our plan is to show that $G_X\cap G_Y \subset G_{XY}$. It follows that $G_{XY}^c \subset G_X^c\cup G_Y^c$, and we can then write
$$P(G_{XY}^c) \le P(G_X^c) + P(G_Y^c) = 0.$$
For $\omega \in G_X\cap G_Y$, $X_n(\omega) \to X(\omega)$ and $X_n(\omega) \to Y(\omega)$. Since limits of sequences of numbers are unique, for such $\omega$, $X(\omega) = Y(\omega)$; i.e., $\omega \in G_{XY}$. Thus, $G_X\cap G_Y \subset G_{XY}$.

36. Let $G_X := \{X_n \to X\}$, $G_Y := \{Y_n \to Y\}$, and $G_I := \bigcap_{n=1}^\infty\{X_n \le Y_n\}$, where $P(G_X^c) = P(G_Y^c) = P(G_I^c) = 0$. This last equality follows because
$$P(G_I^c) \le \sum_{n=1}^\infty P(X_n > Y_n) = 0.$$
Let $G := \{X \le Y\}$. We must show that $P(G^c) = 0$. Our plan is to show that $G_X\cap G_Y\cap G_I \subset G$. It follows that $G^c \subset G_X^c\cup G_Y^c\cup G_I^c$, and we can then write
$$P(G^c) \le P(G_X^c) + P(G_Y^c) + P(G_I^c) = 0.$$
For $\omega \in G_X\cap G_Y\cap G_I$, $X_n(\omega) \to X(\omega)$, $Y_n(\omega) \to Y(\omega)$, and for all $n$, $X_n(\omega) \le Y_n(\omega)$. By properties of sequences of real numbers, for such $\omega$, we must have $X(\omega) \le Y(\omega)$. Thus, $G_X\cap G_Y\cap G_I \subset G$.

37. Suppose $X_n$ converges almost surely to $X$, and $X_n$ converges in mean to $Y$. Then $X_n$ converges in probability to $X$ and to $Y$. By Problem 5, $X = Y$ a.s.


38. If $X(\omega) > 0$, then $c_nX(\omega) \to \infty$. If $X(\omega) < 0$, then $c_nX(\omega) \to -\infty$. If $X(\omega) = 0$, then $c_nX(\omega) = 0 \to 0$. Hence,
$$Y(\omega) = \begin{cases} +\infty, & \text{if } X(\omega) > 0,\\ 0, & \text{if } X(\omega) = 0,\\ -\infty, & \text{if } X(\omega) < 0.\end{cases}$$

39. By a limit property of probability, we can write
$$P\Big(\bigcap_{N=1}^\infty\bigcup_{n=N}^\infty\{X_n = j\}\,\Big|\,X_0 = i\Big) = \lim_{M\to\infty}P\Big(\bigcup_{n=M}^\infty\{X_n = j\}\,\Big|\,X_0 = i\Big) \le \lim_{M\to\infty}\sum_{n=M}^\infty P(X_n = j|X_0 = i) = \lim_{M\to\infty}\sum_{n=M}^\infty p_{ij}^{(n)},$$
which must be zero since the $p_{ij}^{(n)}$ are summable.

40. Following the hint, write
$$P(S = \infty) = P\Big(\bigcap_{n=1}^\infty\{S > n\}\Big) = \lim_{N\to\infty}P(S > N) \le \lim_{N\to\infty}\frac{E[S]}{N} = 0.$$

41. Since $S_n$ converges in mean to $S$, by the problems referred to in the hint,
$$E[S] = \lim_{n\to\infty}E[S_n] = \lim_{n\to\infty}\sum_{k=1}^n E[|h_k||X_k|] = \lim_{n\to\infty}\sum_{k=1}^n |h_k|E[|X_k|] \le B^{1/p}\sum_{k=1}^\infty|h_k| < \infty.$$

42. From the inequality
$$\int_n^{n+1}\frac{1}{t^p}\,dt \ge \int_n^{n+1}\frac{1}{(n+1)^p}\,dt = \frac{1}{(n+1)^p},$$
we obtain
$$\infty > \int_1^\infty\frac{1}{t^p}\,dt = \sum_{n=1}^\infty\int_n^{n+1}\frac{1}{t^p}\,dt \ge \sum_{n=1}^\infty\frac{1}{(n+1)^p} = \sum_{n=2}^\infty\frac{1}{n^p},$$
which implies $\sum_{n=1}^\infty 1/n^p < \infty$.

43. Write
$$E[M_n^4] = \frac{1}{n^4}E\Big[\Big(\sum_{i=1}^n X_i\Big)\Big(\sum_{j=1}^n X_j\Big)\Big(\sum_{l=1}^n X_l\Big)\Big(\sum_{m=1}^n X_m\Big)\Big]$$
$$= \frac{1}{n^4}\Big\{\sum_{i=1}^n E[X_i^4] + \sum_{i=1}^n\sum_{l=1,\,l\ne i}^n E[X_iX_iX_lX_l] + \sum_{i=1}^n\sum_{j=1,\,j\ne i}^n E[X_iX_jX_iX_j] + \sum_{i=1}^n\sum_{j=1,\,j\ne i}^n E[X_iX_jX_jX_i] + 4\sum_{i=1}^n\sum_{j=1,\,j\ne i}^n\underbrace{E[X_i^3X_j]}_{=\,0}\Big\}$$
$$= \frac{1}{n^4}\big[n\gamma + 3n(n-1)\sigma^4\big].$$


44. It suffices to show that $\sum_{n=1}^\infty P(|X_n| \ge \varepsilon) < \infty$. To this end, write
$$P(|X_n| \ge \varepsilon) = 2\int_\varepsilon^\infty\frac n2e^{-nx}\,dx = -e^{-nx}\Big|_\varepsilon^\infty = e^{-n\varepsilon}.$$
Then
$$\sum_{n=1}^\infty P(|X_n| \ge \varepsilon) = \sum_{n=1}^\infty(e^{-\varepsilon})^n = \frac{e^{-\varepsilon}}{1 - e^{-\varepsilon}} < \infty.$$

45. First use the Rayleigh cdf to write
$$P(|X_n| \ge \varepsilon) = P(X_n \ge \varepsilon) = e^{-(n\varepsilon)^2/2}.$$
Then
$$\sum_{n=1}^\infty P(|X_n| \ge \varepsilon) = \sum_{n=1}^\infty e^{-\varepsilon^2n^2/2} \le \sum_{n=1}^\infty(e^{-\varepsilon^2/2})^n \le \frac{1}{1 - e^{-\varepsilon^2/2}} < \infty.$$
Thus, $X_n$ converges almost surely to zero.

46. For $n > 1/\varepsilon$, write
$$P(|X_n| \ge \varepsilon) = \int_\varepsilon^\infty\frac{p-1}{n^{p-1}}x^{-p}\,dx = \frac{1}{n^{p-1}\varepsilon^{p-1}}.$$
Then
$$\sum_{n=1}^\infty P(|X_n| \ge \varepsilon) = \sum_{n<1/\varepsilon}P(|X_n| \ge \varepsilon) + \frac{1}{\varepsilon^{p-1}}\sum_{n>1/\varepsilon}\frac{1}{n^{p-1}} < \infty$$
since $p - 1 > 1$. Thus, $X_n$ converges almost surely to zero.

47. To begin, first observe that
$$E[|X_n - X|] = E[|n^2(-1)^nY_n|] = n^2E[Y_n] = n^2p_n.$$
Also observe that
$$P(|X_n - X| \ge \varepsilon) = P(n^2Y_n \ge \varepsilon) = P(Y_n \ge \varepsilon/n^2) = P(Y_n = 1) = p_n.$$
In order to have $X_n$ converge almost surely to $X$, it is sufficient to consider $p_n$ such that $\sum_{n=1}^\infty p_n < \infty$.

(a) If $p_n = 1/n^{3/2}$, then $\sum_{n=1}^\infty p_n < \infty$, but $n^2p_n = n^{1/2} \not\to 0$. For this choice of $p_n$, $X_n$ converges almost surely but not in mean to $X$.

(b) If $p_n = 1/n^3$, then $\sum_{n=1}^\infty p_n < \infty$ and $n^2p_n = 1/n \to 0$. For this choice of $p_n$, $X_n$ converges almost surely and in mean to $X$.

48. To apply the weak law of large numbers of this section to $(1/n)\sum_{i=1}^n X_i^2$ requires only that the $X_i^2$ be i.i.d. and have finite mean; there is no second-moment requirement on $X_i^2$ (which would be a requirement on $X_i^4$).


49. By writing
$$\frac{N_n}{n} = \frac1n\sum_{k=1}^n(N_k - N_{k-1}),$$
which is the sample mean of i.i.d. Poisson($\lambda$) random variables, we see that $N_n/n$ converges almost surely to $\lambda$ by the strong law of large numbers. We next observe that $N_{\lfloor t\rfloor} \le N_t \le N_{\lceil t\rceil}$. Then
$$\frac{N_{\lfloor t\rfloor}}{t} \le \frac{N_t}{t} \le \frac{N_{\lceil t\rceil}}{t},$$
and it follows that
$$\frac{N_{\lfloor t\rfloor}}{\lceil t\rceil} \le \frac{N_t}{t} \le \frac{N_{\lceil t\rceil}}{\lfloor t\rfloor},$$
and then
$$\underbrace{\frac{\lfloor t\rfloor}{\lceil t\rceil}}_{\to\,1}\cdot\underbrace{\frac{N_{\lfloor t\rfloor}}{\lfloor t\rfloor}}_{\to\,\lambda} \le \frac{N_t}{t} \le \underbrace{\frac{N_{\lceil t\rceil}}{\lceil t\rceil}}_{\to\,\lambda}\cdot\underbrace{\frac{\lceil t\rceil}{\lfloor t\rfloor}}_{\to\,1}.$$
Hence $N_t/t$ converges almost surely to $\lambda$.
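
A direct simulation of the counting process makes the limit visible; the sketch below (not part of the original solution) builds $N_t$ from i.i.d. exponential interarrival times and watches $N_t/t$ approach $\lambda$.

```python
import numpy as np

# N_t / t -> lambda a.s. for a rate-lambda Poisson process.
rng = np.random.default_rng(7)
lam = 2.5
arrivals = np.cumsum(rng.exponential(1 / lam, size=2_000_000))
for t in [10, 100, 1000, 100_000]:
    N_t = np.searchsorted(arrivals, t)      # number of arrivals in [0, t]
    print(t, N_t / t)
```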

50. (a) By the strong law of large numbers, for $\omega$ not in a set of probability zero,
$$\frac1n\sum_{k=1}^n X_k(\omega) \to \mu.$$
Hence, for $\varepsilon > 0$, for all sufficiently large $n$,
$$\Big|\frac1n\sum_{k=1}^n X_k(\omega) - \mu\Big| < \varepsilon,$$
which implies
$$\frac1n\sum_{k=1}^n X_k(\omega) - \mu < \varepsilon,$$
from which it follows that
$$\sum_{k=1}^n X_k(\omega) < n(\mu + \varepsilon).$$

(b) Given $M > 0$, let $\varepsilon > 0$ and choose $n$ in part (a) so that both $n(\mu + \varepsilon) > M$ and $n \ge M$ hold. Then
$$T_n(\omega) = \sum_{k=1}^n X_k(\omega) < n(\mu + \varepsilon).$$
Now, for $t > n(\mu + \varepsilon) \ge T_n(\omega)$,
$$N_t(\omega) = \sum_{k=1}^\infty I_{[0,t]}(T_k(\omega)) \ge n \ge M.$$

(c) As noted in the solution of part (a), the strong law of large numbers implies
$$\frac{T_n}{n} = \frac1n\sum_{k=1}^n X_k \to \mu \text{ a.s.}$$
Hence, $n/T_n \to 1/\mu$ a.s.

(d) First observe that
$$Y_{N_t} = \frac{N_t}{T_{N_t}} \ge \frac{N_t}{t},$$
and so
$$\frac1\mu = \lim_{t\to\infty}Y_{N_t} \ge \varlimsup_{t\to\infty}\frac{N_t}{t}.$$
Next,
$$Y_{N_t+1} = \frac{N_t + 1}{T_{N_t+1}} \le \frac{N_t + 1}{t} = \frac{N_t}{t} + \frac1t,$$
and so
$$\frac1\mu = \lim_{t\to\infty}Y_{N_t+1} \le \varliminf_{t\to\infty}\frac{N_t}{t} + 0.$$
Hence,
$$\lim_{t\to\infty}\frac{N_t}{t} = \frac1\mu.$$

51. Let $X_n := nI_{(0,1/n]}(U)$, where $U \sim \text{uniform}(0,1]$. Then for every $\omega$, $U(\omega) \in (0,1]$. For $n > 1/U(\omega)$, or $U(\omega) > 1/n$, $X_n(\omega) = 0$. Thus, $X_n(\omega) \to 0$ for every $\omega$. However,
$$E[X_n] = nP(U \le 1/n) = 1.$$
Hence, $X_n$ does not converge in mean to zero.

52. Suppose $D_\varepsilon$ contains $n$ points, say $a \le x_1 < \cdots < x_n \le b$. Then
$$n\varepsilon < \sum_{k=1}^n[G(x_k+) - G(x_k-)] = G(x_n+) + \sum_{k=1}^{n-1}G(x_k+) - \sum_{k=2}^n G(x_k-) - G(x_1-)$$
$$\le G(b) + \sum_{k=1}^{n-1}G(x_{k+1}) - \sum_{k=2}^n G(x_{k-1}) - G(a) = G(b) + \sum_{k=1}^{n-1}[G(x_{k+1}) - G(x_k)] - G(a)$$
$$\le G(b) + G(x_n) - G(x_1) - G(a) \le 2[G(b) - G(a)].$$

53. Write
$$P\Big(U \in \bigcup_{n=1}^\infty K_n\Big) = P\Big(\bigcup_{n=1}^\infty\{U \in K_n\}\Big) \le \sum_{n=1}^\infty P(U \in K_n) \le \sum_{n=1}^\infty\frac{2\varepsilon}{2^n} = 2\varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, the probability in question must be zero.


CHAPTER 15

Problem Solutions

1. Using the formula $F_{W_t}(x) = F_{W_1}(t^{-H}x)$, we see that
$$\lim_{t\downarrow0}F_{W_t}(x) = \begin{cases} F_{W_1}(\infty) = 1, & x > 0,\\ F_{W_1}(-\infty) = 0, & x < 0,\end{cases}$$
which is the cdf of the zero random variable for $x \ne 0$. Hence, $W_t$ converges in distribution to the zero random variable.
Next,
$$X(\omega) := \lim_{t\to\infty}t^HW_1(\omega) = \begin{cases} \infty, & \text{if } W_1(\omega) > 0,\\ 0, & \text{if } W_1(\omega) = 0,\\ -\infty, & \text{if } W_1(\omega) < 0.\end{cases}$$
Thus,
$$P(X = \infty) = P(W_1 > 0) = 1 - F_{W_1}(0), \qquad P(X = -\infty) = P(W_1 < 0) = F_{W_1}(0-),$$
and
$$P(X = 0) = P(W_1 = 0) = F_{W_1}(0) - F_{W_1}(0-).$$

2. Since the joint characteristic function of a zero-mean Gaussian process is completely determined by the covariance matrix, we simply observe that
$$E[W_{\lambda t_1}W_{\lambda t_2}] = \sigma^2\min(\lambda t_1,\lambda t_2) = \lambda\sigma^2\min(t_1,t_2)$$
and
$$E[(\lambda^{1/2}W_{t_1})(\lambda^{1/2}W_{t_2})] = \lambda\sigma^2\min(t_1,t_2).$$

3. Fix $\tau > 0$, and consider the process $Z_t := W_t - W_{t-\tau}$. Since the Wiener process is Gaussian with zero mean, so is the process $Z_t$. Hence, it suffices to consider the covariance
$$E[Z_{t_1}Z_{t_2}] = E[(W_{t_1} - W_{t_1-\tau})(W_{t_2} - W_{t_2-\tau})].$$
The time intervals involved do not overlap if $t_1 < t_2 - \tau$ or if $t_2 < t_1 - \tau$. Hence,
$$E[Z_{t_1}Z_{t_2}] = \begin{cases} 0, & |t_2 - t_1| > \tau,\\ \sigma^2(\tau - |t_2 - t_1|), & |t_2 - t_1| \le \tau,\end{cases}$$
which depends on $t_1$ and $t_2$ only through their difference.

4. For $H = 1/2$, $q_H(\theta) = I_{(0,\infty)}(\theta)$, and $C_H = 1$. So,
$$B_H(t) - B_H(s) = \int_{-\infty}^\infty[I_{(-\infty,t)}(\tau) - I_{(-\infty,s)}(\tau)]\,dW_\tau = W_t - W_s.$$


5. It suffices to show that
$$\int_1^\infty[(1+\theta)^{H-1/2} - \theta^{H-1/2}]^2\,d\theta < \infty.$$
Consider the function $f(t) := t^{H-1/2}$. By the mean-value theorem of calculus,
$$(1+\theta)^{H-1/2} - \theta^{H-1/2} = f'(t), \quad\text{for some } t \in (\theta,\theta+1).$$
Since $f'(t) = (H - 1/2)t^{H-3/2}$,
$$\big|(1+\theta)^{H-1/2} - \theta^{H-1/2}\big| \le |H - 1/2|/\theta^{3/2-H}.$$
Then
$$\int_1^\infty[(1+\theta)^{H-1/2} - \theta^{H-1/2}]^2\,d\theta \le (H - 1/2)^2\int_1^\infty\frac{1}{\theta^{3-2H}}\,d\theta < \infty,$$
since $3 - 2H > 1$.

6. The expression
$$P\Big(\Big|\frac{M_n - \mu}{\sigma/n^{1-H}}\Big| \le y\Big) = 1 - \alpha$$
says that
$$\mu = M_n \pm \frac{y\sigma}{n^{1-H}} \quad\text{with probability } 1 - \alpha.$$
Hence, the width of the confidence interval is
$$\frac{2y\sigma}{n^{1-H}} = \frac{2y\sigma}{\sqrt n}\cdot n^{H-1/2}.$$

7. Suppose $E[Y_k^2] = \sigma^2k^{2H}$ for $k = 1,\dots,n$. Substituting this into the required formula yields
$$\big(E[Y_{n+1}^2] - \sigma^2n^{2H}\big) - \big(\sigma^2n^{2H} - \sigma^2(n-1)^{2H}\big) = 2C(n),$$
which we are assuming is equal to
$$\sigma^2\big[(n+1)^{2H} - 2n^{2H} + (n-1)^{2H}\big].$$
It follows that $E[Y_{n+1}^2] = \sigma^2(n+1)^{2H}$.

8. (a) Clearly, the aggregated process
$$X^{(m)}_\nu := \frac1m\sum_{k=(\nu-1)m+1}^{\nu m}(X_k - \mu)$$
is zero mean. Also,
$$m^2E\big[X^{(m)}_\nu X^{(m)}_n\big] = \sum_{k=(\nu-1)m+1}^{\nu m}\;\sum_{l=(n-1)m+1}^{nm}C(k-l) = \sum_{k=(\nu-1)m+1}^{\nu m}\sum_{i=1}^m C(k - [i + (n-1)m]) = \sum_{i=1}^m\sum_{j=1}^m C([j + (\nu-1)m] - [i + (n-1)m]) = \sum_{i=1}^m\sum_{j=1}^m C(j - i + m(\nu - n)).$$
Thus, $X^{(m)}_\nu$ is WSS.

(b) From the solution of part (a), we see that
$$m^2C^{(m)}(0) = \sum_{i=1}^m\sum_{j=1}^m C(j - i) = mC(0) + 2\sum_{k=1}^{m-1}C(k)(m-k) = E[Y_m^2] = \sigma^2m^{2H}, \text{ by Problem 7.}$$
Thus, $C^{(m)}(0) = \sigma^2m^{2H}/m^2 = \sigma^2m^{2H-2}$.

9. Starting with
$$\lim_{m\to\infty}\frac{C^{(m)}(n)}{m^{2H-2}} = \frac{\sigma_\infty^2}{2}\big[|n+1|^{2H} - 2|n|^{2H} + |n-1|^{2H}\big],$$
put $n = 0$ to get
$$\lim_{m\to\infty}\frac{C^{(m)}(0)}{m^{2H-2}} = \sigma_\infty^2.$$
Then observe that
$$\frac{C^{(m)}(0)}{m^{2H-2}} = \frac{E[(X^{(m)}_1 - \mu)^2]}{m^{2H-2}} = \frac{E\Big[\Big|\dfrac1m\sum_{k=1}^m(X_k - \mu)\Big|^2\Big]}{m^{2H-2}}.$$

10. Following the hint, observe that
$$\frac{C^{(m)}(n)}{m^{2H-2}} = \frac12\cdot\frac{\big(E[Y^2_{(n+1)m}] - E[Y^2_{nm}]\big) - \big(E[Y^2_{nm}] - E[Y^2_{(n-1)m}]\big)}{m^{2H}}$$
$$= \frac12\Big[(n+1)^{2H}\frac{E[Y^2_{(n+1)m}]}{[(n+1)m]^{2H}} - 2n^{2H}\frac{E[Y^2_{nm}]}{(nm)^{2H}} + (n-1)^{2H}\frac{E[Y^2_{(n-1)m}]}{[(n-1)m]^{2H}}\Big] \to \frac{\sigma_\infty^2}{2}\big[(n+1)^{2H} - 2n^{2H} + (n-1)^{2H}\big].$$

11. If the equation cited in the text holds, then in particular,
$$\lim_{m\to\infty}\frac{C^{(m)}(0)}{m^{2H-2}} = \sigma_\infty^2,$$
and
$$\lim_{m\to\infty}\rho^{(m)}(n) = \lim_{m\to\infty}\frac{C^{(m)}(n)}{C^{(m)}(0)} = \lim_{m\to\infty}\frac{C^{(m)}(n)/m^{2H-2}}{C^{(m)}(0)/m^{2H-2}} = \frac12\big[|n+1|^{2H} - 2|n|^{2H} + |n-1|^{2H}\big].$$
Conversely, if both conditions hold,
$$\lim_{m\to\infty}\frac{C^{(m)}(n)}{m^{2H-2}} = \lim_{m\to\infty}\frac{C^{(m)}(0)\rho^{(m)}(n)}{m^{2H-2}} = \lim_{m\to\infty}\frac{C^{(m)}(0)}{m^{2H-2}}\lim_{m\to\infty}\rho^{(m)}(n) = \frac{\sigma_\infty^2}{2}\big[|n+1|^{2H} - 2|n|^{2H} + |n-1|^{2H}\big].$$

12. To begin, write
$$S(f) = \big|1 - e^{-j2\pi f}\big|^{-2d} = \big|e^{-j\pi f}[e^{j\pi f} - e^{-j\pi f}]\big|^{-2d} = \Big|2j\,\frac{e^{j\pi f} - e^{-j\pi f}}{2j}\Big|^{-2d} = \big|2\sin(\pi f)\big|^{-2d} = \big[4\sin^2(\pi f)\big]^{-d}.$$
Then
$$C(n) = \int_{-1/2}^{1/2}\big[4\sin^2(\pi f)\big]^{-d}e^{j2\pi fn}\,df = \int_{-1/2}^{1/2}\big[4\sin^2(\pi f)\big]^{-d}\cos(2\pi fn)\,df = 2\int_0^{1/2}\big[4\sin^2(\pi f)\big]^{-d}\cos(2\pi fn)\,df = \frac1\pi\int_0^\pi\big[4\sin^2(\nu/2)\big]^{-d}\cos(n\nu)\,d\nu. \quad (*)$$
Next, as suggested in the hint, apply the change of variable $\theta = 2\pi - \nu$ to
$$\frac1\pi\int_\pi^{2\pi}\big[4\sin^2(\nu/2)\big]^{-d}\cos(n\nu)\,d\nu$$
to get
$$\frac1\pi\int_0^\pi\big[4\sin^2([2\pi - \theta]/2)\big]^{-d}\cos(n[2\pi - \theta])\,d\theta,$$
which, using a trigonometric identity, is equal to $(*)$. Thus,
$$C(n) = \frac{1}{2\pi}\int_0^{2\pi}\big[4\sin^2(\nu/2)\big]^{-d}\cos(n\nu)\,d\nu = \frac1\pi\int_0^\pi\big[4\sin^2(t)\big]^{-d}\cos(2nt)\,dt.$$
By the formula provided in the hint (the powers of 2 cancel against the factor $4^{-d}$),
$$C(n) = \frac{\cos(n\pi)\Gamma(2-2d)}{(1-2d)\Gamma((2-2d+2n)/2)\Gamma((2-2d-2n)/2)} = \frac{(-1)^n\Gamma(1-2d)}{\Gamma(1-d+n)\Gamma(1-d-n)}.$$
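
The final Gamma expression can be checked against direct numerical integration of $(*)$; here is a sketch (not part of the original solution) using scipy. The integrand has integrable endpoint singularities, which `quad` handles.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

# Check C(n) = (1/pi) * int_0^pi [4 sin^2 t]^(-d) cos(2 n t) dt
#            = (-1)^n Gamma(1-2d) / (Gamma(1-d+n) Gamma(1-d-n)).
d = 0.2
for n in range(4):
    integral = quad(lambda t: (4 * np.sin(t)**2) ** (-d) * np.cos(2 * n * t),
                    0, np.pi)[0] / np.pi
    closed = (-1)**n * gamma(1 - 2*d) / (gamma(1 - d + n) * gamma(1 - d - n))
    print(n, integral, closed)
```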

13. Following hint (i), let $u = \sin^2\theta$ and $dv = \theta^{\alpha-3}\,d\theta$. Then $du = 2\sin\theta\cos\theta\,d\theta = \sin2\theta\,d\theta$ and $v = \theta^{\alpha-2}/(\alpha-2)$. Hence,
$$\int_\varepsilon^r\theta^{\alpha-3}\sin^2\theta\,d\theta = \frac{\theta^{\alpha-2}\sin^2\theta}{\alpha-2}\Big|_\varepsilon^r - \frac{1}{\alpha-2}\int_\varepsilon^r\theta^{\alpha-2}\sin2\theta\,d\theta.$$
Next,
$$\frac{1}{2-\alpha}\int_\varepsilon^r\theta^{\alpha-2}\sin2\theta\,d\theta = \frac{1}{2-\alpha}\int_{2\varepsilon}^{2r}(t/2)^{\alpha-2}\sin t\,\frac{dt}{2} = \frac{(1/2)^{\alpha-1}}{2-\alpha}\int_{2\varepsilon}^{2r}t^{\alpha-2}\sin t\,dt = \frac{2^{1-\alpha}}{2-\alpha}\Big[\frac{t^{\alpha-1}}{\alpha-1}\sin t\Big|_{2\varepsilon}^{2r} + \frac{1}{1-\alpha}\int_{2\varepsilon}^{2r}t^{\alpha-1}\cos t\,dt\Big].$$
Now write
$$\int_{2\varepsilon}^{2r}t^{\alpha-1}\cos t\,dt = \operatorname{Re}\int_{2\varepsilon}^{2r}t^{\alpha-1}e^{-jt}\,dt \to \operatorname{Re}\,e^{-j\alpha\pi/2}\Gamma(\alpha) = \cos(\alpha\pi/2)\Gamma(\alpha)$$
as $\varepsilon \to 0$ and $r \to \infty$ by hint (iii). To obtain the complete result, observe that
$$\theta^{\alpha-2}\sin^2\theta = \theta^\alpha\Big(\frac{\sin\theta}{\theta}\Big)^2 \quad\text{and}\quad t^{\alpha-1}\sin t = t^\alpha\,\frac{\sin t}{t}$$
both tend to zero as their arguments tend to zero or to infinity.

14. (a) First observe that
$$Q(-f) = \sum_{i=1}^\infty\frac{1}{|i - f|^{2H+1}} = \sum_{l=-\infty}^{-1}\frac{1}{|-l - f|^{2H+1}} = \sum_{i=-\infty}^{-1}\frac{1}{|i + f|^{2H+1}}.$$
Hence,
$$S(f) = \Big[Q(-f) + \frac{1}{|f|^{2H+1}} + Q(f)\Big]\sin^2(\pi f),$$
and then
$$\frac{S(f)}{|f|^{1-2H}} = |f|^{2H-1}[Q(-f) + Q(f)]\sin^2(\pi f) + \frac{\sin^2(\pi f)}{f^2} \to \pi^2.$$

(b) We have
$$\int_{-1/2}^{1/2}S(f)\,df = \sigma^2 = \pi^2\cdot\frac{4\cos([1-H]\pi)\Gamma(2-2H)}{(2\pi)^{2-2H}(2H-1)2H} = \frac{(2\pi)^{2H}\cos(\pi H)\Gamma(2-2H)}{2H(1-2H)}.$$

15. Write
$$C(n) = \frac{(-1)^n\Gamma(1-2d)}{\Gamma(n+1-d)\Gamma(1-d-n)} = \frac{(-1)^n\Gamma(1-2d)}{\Gamma(n+1-d)\cdot(-1)^n\Gamma(d)\Gamma(1-d)/\Gamma(n+d)} = \frac{\Gamma(1-2d)}{\Gamma(1-d)\Gamma(d)}\cdot\frac{\Gamma(n+d)}{\Gamma(n+1-d)}.$$
Now, with $\varepsilon = 1-d$, so that $\Gamma(n+d)/\Gamma(n+1-d) = \Gamma(n+1-\varepsilon)/\Gamma(n+\varepsilon)$, Stirling's formula gives
$$\frac{\Gamma(n+1-\varepsilon)}{\Gamma(n+\varepsilon)} \sim \frac{(n+1-\varepsilon)^{n+1/2-\varepsilon}e^{-(n+1-\varepsilon)}}{(n+\varepsilon)^{n+\varepsilon-1/2}e^{-(n+\varepsilon)}} = e^{1-2d}\,n^{2d-1}\,\frac{[1+(1-\varepsilon)/n]^{n+1/2-\varepsilon}}{(1+\varepsilon/n)^{n+\varepsilon-1/2}}.$$
Since $[1+(1-\varepsilon)/n]^{n+1/2-\varepsilon} \to e^{1-\varepsilon}$ and $(1+\varepsilon/n)^{n+\varepsilon-1/2} \to e^\varepsilon$, the last factor tends to $e^{1-2\varepsilon} = e^{2d-1}$, and therefore
$$n^{1-2d}\,\frac{\Gamma(n+d)}{\Gamma(n+1-d)} \to e^{1-2d}e^{2d-1} = 1.$$
Thus, $C(n) \sim cn^{-\alpha}$ with $\alpha = 1-2d$ and $c = \Gamma(1-2d)/[\Gamma(1-d)\Gamma(d)]$.
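
Since the Stirling bookkeeping here is easy to slip on, the limiting constant is worth confirming numerically; a sketch (not part of the original solution) using `scipy.special.gammaln`:

```python
import numpy as np
from scipy.special import gammaln

# Check n^(1-2d) * Gamma(n+d) / Gamma(n+1-d) -> 1, which gives
# C(n) ~ [Gamma(1-2d)/(Gamma(1-d)Gamma(d))] * n^(2d-1).
d = 0.2
for n in [10, 100, 1000, 100_000]:
    log_ratio = gammaln(n + d) - gammaln(n + 1 - d)
    print(n, np.exp((1 - 2 * d) * np.log(n) + log_ratio))   # tends to 1
```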

16. Evaluating the integral, we have
$$\frac{I_n}{n^{1-\alpha}} = \frac{n^{1-\alpha} - k^{1-\alpha}}{(1-\alpha)n^{1-\alpha}} = \frac{1 - (k/n)^{1-\alpha}}{1-\alpha} \to \frac{1}{1-\alpha}.$$
Now, given a small $\varepsilon > 0$, for large $n$, $I_n/n^{1-\alpha} > 1/(1-\alpha) - \varepsilon$, which implies $I_n/n^{-\alpha} > n(1/(1-\alpha) - \varepsilon) \to \infty$. With $B_n$ as in the hint, we have from $B_n + n^{-\alpha} - k^{-\alpha} \le I_n \le B_n$ that
$$\frac{B_n}{n^{1-\alpha}} + \frac1n - \frac{k^{-\alpha}}{n^{1-\alpha}} \le \frac{I_n}{n^{1-\alpha}} \le \frac{B_n}{n^{1-\alpha}},$$
or
$$\frac{I_n}{n^{1-\alpha}} \le \frac{B_n}{n^{1-\alpha}} \le \frac{I_n}{n^{1-\alpha}} - \frac1n + \frac{k^{-\alpha}}{n^{1-\alpha}}.$$
Thus, $B_n/n^{1-\alpha} \to 1/(1-\alpha)$.

17. We begin with the inequality
$$\underbrace{\sum_{\nu=k}^{n-1}\nu^{1-\alpha}}_{=:\,B_n} \le \underbrace{\int_k^n t^{1-\alpha}\,dt}_{=:\,I_n} \le \sum_{\nu=k}^{n-1}(\nu+1)^{1-\alpha} = B_n - k^{1-\alpha} + n^{1-\alpha}.$$
Next, since $I_n = (n^{2-\alpha} - k^{2-\alpha})/(2-\alpha)$,
$$\frac{I_n}{n^{2-\alpha}} = \frac{1 - (k/n)^{2-\alpha}}{2-\alpha} \to \frac{1}{2-\alpha}.$$
Also,
$$\frac{I_n}{n^{2-\alpha}} \le \frac{B_n}{n^{2-\alpha}} - \frac{k^{1-\alpha}}{n^{2-\alpha}} + \frac1n \quad\text{and}\quad \frac{B_n}{n^{2-\alpha}} \le \frac{I_n}{n^{2-\alpha}}$$
imply $B_n/n^{2-\alpha} \to 1/(2-\alpha)$ as required.

18. Observe that
$$\sum_{n=q}^\infty|h_n| = \sum_{n=q}^\infty\Big|\sum_{k=n-q}^n\alpha_kb_{n-k}\Big| \le \sum_{n=q}^\infty\sum_{k=0}^\infty|\alpha_k||b_{n-k}|I_{[n-q,n]}(k) = \sum_{k=0}^\infty|\alpha_k|\sum_{n=q}^\infty|b_{n-k}|I_{[0,q]}(n-k)$$
$$\le M\Big(\sum_{i=0}^q|b_i|\Big)\sum_{k=0}^\infty(1 + \delta/2)^{-k} = M\Big(\sum_{i=0}^q|b_i|\Big)\frac{1}{1 - 1/(1 + \delta/2)} < \infty.$$

19. Following the hint, we first compute
$$\sum_n\alpha_{m-n}Y_n = \sum_n\alpha_{m-n}\Big(\sum_k a_kX_{n-k}\Big) = \sum_n\alpha_{m-n}\sum_l a_{n-l}X_l = \sum_l X_l\sum_n\alpha_{m-n}a_{n-l} = \sum_l X_l\underbrace{\sum_\nu\alpha_{m-(\nu+l)}a_\nu}_{=\,\delta(m-l)},$$
where the reduction to the impulse follows because the convolution of $\alpha_n$ and $a_n$ corresponds in the $z$-transform domain to the product $[1/A(z)]\cdot A(z) = 1$, and 1 is the transform of the unit impulse. Thus, $\sum_n\alpha_{m-n}Y_n = X_m$.
Next,
$$\sum_n\alpha_{m-n}Y_n = \sum_n\alpha_{m-n}\Big(\sum_k b_kZ_{n-k}\Big) = \sum_n\alpha_{m-n}\sum_l b_{n-l}Z_l = \sum_l Z_l\sum_n\alpha_{m-n}b_{n-l} = \sum_l Z_l\underbrace{\sum_k\alpha_{m-(l+k)}b_k}_{=\,h_{m-l}},$$
since this last convolution corresponds in the $z$-transform domain to the product $[1/A(z)]\cdot B(z) =: H(z)$.

20. Since $X_n$ is WSS, $E[|X_n|^2]$ is a finite constant. Since $\sum_{k=0}^\infty|h_k| < \infty$, we have by an example in Chapter 13 or by Problem 26 in Chapter 13 that $\sum_{k=0}^m h_kX_{n-k}$ converges in mean square as $m \to \infty$. By another example in Chapter 13,
$$E[Y_n] = E\Big[\sum_{k=0}^\infty h_kX_{n-k}\Big] = \sum_{k=0}^\infty h_kE[X_{n-k}].$$
If $E[X_{n-k}] = \mu$, then
$$E[Y_n] = \sum_{k=0}^\infty h_k\mu$$
is finite and does not depend on $n$. Next, by the continuity of the inner product (Problem 24 in Chapter 13),
$$E\Big[X_l\Big(\sum_{k=0}^\infty h_kX_{n-k}\Big)\Big] = \lim_{m\to\infty}E\Big[X_l\Big(\sum_{k=0}^m h_kX_{n-k}\Big)\Big] = \lim_{m\to\infty}\sum_{k=0}^m h_kE[X_lX_{n-k}] = \lim_{m\to\infty}\sum_{k=0}^m h_kR_X(l - n + k) = \sum_{k=0}^\infty h_kR_X([l-n] + k).$$
Similarly,
$$E[Y_nY_l] = E\Big[\Big(\sum_{k=0}^\infty h_kX_{n-k}\Big)Y_l\Big] = \lim_{m\to\infty}E\Big[\Big(\sum_{k=0}^m h_kX_{n-k}\Big)Y_l\Big] = \lim_{m\to\infty}\sum_{k=0}^m h_kE[X_{n-k}Y_l] = \lim_{m\to\infty}\sum_{k=0}^m h_kR_{XY}(n - k - l) = \sum_{k=0}^\infty h_kR_{XY}([n-l] - k).$$
Thus, $X_n$ and $Y_n$ are J-WSS.
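
The J-WSS conclusion can be illustrated by filtering a simulated WSS input and estimating the cross-correlation at a fixed lag over windows starting at different absolute times. The following is an informal sketch (not part of the original solution), with an arbitrary AR(1)-type input and a short FIR impulse response.

```python
import numpy as np
from scipy.signal import lfilter

# Filter a WSS input X_n with an absolutely summable h and check that the
# estimated cross-correlation E[X_{n+k} Y_n] depends only on the lag k.
rng = np.random.default_rng(8)
n_samp = 1_000_000
Z = rng.standard_normal(n_samp)
X = lfilter([1.0], [1.0, -0.7], Z)       # AR(1) input, an arbitrary WSS choice

h = np.array([1.0, 0.5, 0.25])           # absolutely summable FIR response
Y = np.convolve(X, h)[:n_samp]           # Y_n = sum_k h_k X_{n-k}

k, win = 1, 200_000                      # fixed lag, estimation window
for start in [100, 10_000, 600_000]:     # different absolute times n
    est = np.mean(X[start + k : start + k + win] * Y[start : start + win])
    print(start, est)                    # roughly equal across starts
```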