
Fundamentals of Statistical Signal Processing: Estimation Theory (Stephen Kay): Chapter 2 Detailed Solutions

Question 1

The estimator is given as:

$$\hat{\sigma}^2 = \frac{1}{N} \sum_{n=0}^{N-1} x^2[n] \quad (1)$$

First, we find the expected value (it should equal $\sigma^2$ if the estimator is unbiased):

$$E\{\hat{\sigma}^2\} = \frac{1}{N} \sum_{n=0}^{N-1} E\{x^2[n]\} = \frac{1}{N} \sum_{n=0}^{N-1} \left( \mathrm{var}\{x[n]\} + E\{x[n]\}^2 \right) = \frac{1}{N} \sum_{n=0}^{N-1} (\sigma^2 + 0^2) = \frac{N \sigma^2}{N} = \sigma^2 \quad (2)$$

This implies that the estimator is unbiased.

Next, we calculate the variance:

$$\mathrm{var}\{\hat{\sigma}^2\} = \frac{1}{N^2} \sum_{n=0}^{N-1} \mathrm{var}\{x^2[n]\} = \frac{N}{N^2} \mathrm{var}\{x^2[n]\} = \frac{1}{N} \mathrm{var}\{x^2[n]\} \quad (3)$$

In the above equation,

$$\mathrm{var}\{x^2[n]\} = E\{x^4[n]\} - E\{x^2[n]\}^2 = E\{x^4[n]\} - \sigma^4 \quad (4)$$

Using our knowledge that $x[n]$ is normally distributed, we use the moment generating function to compute $E\{x^4[n]\}$. The moment generating function, $\varphi(t)$, for a Normal distribution is given as:

$$\varphi(t) = \exp\left\{ \mu t + \frac{\sigma^2 t^2}{2} \right\} \quad (5)$$

It is important to note that in general:

$$\varphi^{(n)}(0) = E\{X^n\}, \quad n \ge 1 \quad (6)$$

Since we are interested in $E\{x^4[n]\}$, we take the fourth derivative of $\varphi(t)$ and evaluate it at $t = 0$:

$$\varphi''''(t) = 3\sigma^4 \exp\left\{ \frac{\sigma^2 t^2}{2} + \mu t \right\} + \exp\left\{ \frac{\sigma^2 t^2}{2} + \mu t \right\} (t\sigma^2 + \mu)^4 + 6\sigma^2 \exp\left\{ \frac{\sigma^2 t^2}{2} + \mu t \right\} (t\sigma^2 + \mu)^2 \quad (7)$$

Evaluating at $t = 0$, and using $\mu = 0$ since the samples are zero mean:

$$\varphi''''(0) = 3\sigma^4 = E\{x^4[n]\} \quad (8)$$
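As a quick sanity check (not part of the original solution), the fourth derivative can be verified symbolically; a minimal sketch, assuming `sympy` is available:

```python
import sympy as sp

t, mu, sigma = sp.symbols('t mu sigma', positive=True)

# Moment generating function of a Normal(mu, sigma^2) random variable
phi = sp.exp(mu * t + sigma**2 * t**2 / 2)

# The fourth derivative evaluated at t = 0 gives E{X^4}
fourth_moment = sp.diff(phi, t, 4).subs(t, 0)

print(sp.expand(fourth_moment))   # mu**4 + 6*mu**2*sigma**2 + 3*sigma**4
print(fourth_moment.subs(mu, 0))  # 3*sigma**4, matching equation (8)
```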

Substituting the obtained expressions back into (3), we get:

$$\mathrm{var}\{\hat{\sigma}^2\} = \frac{1}{N} (3\sigma^4 - \sigma^4) = \frac{2\sigma^4}{N} \quad (9)$$

Evidently, as $N \to \infty$ the variance goes to zero, so the estimator becomes better with more data.
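A short Monte Carlo sketch (my own addition; $N = 50$ and $\sigma^2 = 2$ are arbitrary choices) confirms both results numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma2, trials = 50, 2.0, 200_000

# trials independent realizations of sigma2_hat = (1/N) * sum(x^2)
x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))
sigma2_hat = np.mean(x**2, axis=1)

print(np.mean(sigma2_hat))  # ~ sigma2 = 2.0: the estimator is unbiased
print(np.var(sigma2_hat))   # ~ 2*sigma2**2/N = 0.16, i.e. 2*sigma^4/N
```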


Question 2

In a uniform distribution over $(a, b)$, $E\{x[n]\} = \frac{a+b}{2}$. For this question we have $a = 0$, $b = \theta$, so $E\{x[n]\} = \frac{\theta}{2}$. Since for an unbiased estimator $E\{\hat{\theta}\} = \theta$, we can simply average the samples and multiply the outcome by 2, so that:

$$\hat{\theta} = \frac{2}{N} \sum_{n=0}^{N-1} x[n] \quad (10)$$
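A quick numerical check (my addition; $\theta = 3$ and $N = 100$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, N, trials = 3.0, 100, 100_000

# trials realizations of theta_hat = (2/N) * sum(x[n]) for x[n] ~ Uniform(0, theta)
x = rng.uniform(0.0, theta, size=(trials, N))
theta_hat = 2.0 * np.mean(x, axis=1)

print(np.mean(theta_hat))  # ~ theta = 3.0, confirming unbiasedness
```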

Question 3

In Example 2.1, we saw that $x[n] = A + w[n]$, where $w[n]$ is WGN. The estimator then was:

$$\hat{A} = \frac{1}{N} \sum_{n=0}^{N-1} x[n] \quad (11)$$

This implies that the estimator is a linear sum: the $x[n]$ values are simply summed and averaged. Because a linear combination of jointly Gaussian random variables is itself Gaussian, the estimator has the same type of distribution as $x[n]$, or as $w[n]$, hence Gaussian (Normal).

We also know that the estimator is unbiased, and so its expected value is simply A. The variance is easily

found:

$$\mathrm{var}\{\hat{A}\} = \frac{1}{N^2} \sum_{n=0}^{N-1} \mathrm{var}\{x[n]\} = \frac{1}{N^2} \sum_{n=0}^{N-1} \mathrm{var}\{w[n]\} = \frac{1}{N^2} \sum_{n=0}^{N-1} \sigma^2 = \frac{\sigma^2}{N} \quad (12)$$

Hence we can say that the estimator is normally distributed with a mean of $A$ and a variance of $\sigma^2/N$, i.e. $\hat{A} \sim \mathcal{N}(A, \sigma^2/N)$.

Question 4

An averaging estimator in this problem is defined as

$$\hat{h} = \frac{1}{N} \sum_{i=0}^{N-1} \hat{h}_i \quad (13)$$

$$E\{\hat{h}\} = \frac{1}{N} \sum_{i=0}^{N-1} E\{\hat{h}_i\} = \alpha h \quad (14)$$

$$\mathrm{var}\{\hat{h}\} = \frac{1}{N^2} \sum_{i=0}^{N-1} \mathrm{var}\{\hat{h}_i\} = \frac{\mathrm{var}\{\hat{h}_i\}}{N} = \frac{1}{N} \quad (15)$$

Mean: For $\alpha = 1$, $E\{\hat{h}\} = h$ and $E\{\hat{h}_i\} = h$. Similarly, for $\alpha = 0.5$, $E\{\hat{h}\} = 0.5h$ and $E\{\hat{h}_i\} = 0.5h$. So averaging does not improve the estimation of the mean: when $\alpha = 0.5$, the estimate is biased no matter how many measurements are averaged.

Variance: With $N = 10$ measurements, for $\alpha = 1$, $\mathrm{var}\{\hat{h}\} = 0.1$ while $\mathrm{var}\{\hat{h}_i\} = 1$; similarly, for $\alpha = 0.5$, $\mathrm{var}\{\hat{h}\} = 0.1$ and $\mathrm{var}\{\hat{h}_i\} = 1$. So averaging reduces the variance, which should be expected. However, when $\alpha = 0.5$, the estimate actually becomes worse as the distribution narrows around the wrong value, as the sketch below illustrates.
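The following simulation is my own illustration; $h = 5$ and Gaussian measurement noise are assumptions, only the mean $\alpha h$ and unit variance of $\hat{h}_i$ come from the problem:

```python
import numpy as np

rng = np.random.default_rng(2)
h, N, trials = 5.0, 10, 100_000

for alpha in (1.0, 0.5):
    # Each h_i has E{h_i} = alpha*h and var{h_i} = 1 (Gaussian noise assumed)
    h_i = rng.normal(alpha * h, 1.0, size=(trials, N))
    h_bar = np.mean(h_i, axis=1)
    # Averaging always shrinks the variance to 1/N = 0.1, but the mean stays
    # at alpha*h: the bias survives averaging when alpha = 0.5.
    print(alpha, np.mean(h_bar), np.var(h_bar))
```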


Question 5

We are told that the estimator $\hat{\sigma}^2$ is unbiased, which means that $E\{\hat{\sigma}^2\} = \sigma^2$. Next, $\hat{\sigma}^2$ is expressed as a scaled sum of two squares, $\hat{\sigma}^2 = \frac{1}{2}(x^2[0] + x^2[1])$, where the $x[n]$ terms are normally distributed.

The chi-squared distribution has a similar form, where the statistic $Y$ is chi-squared distributed:

$$Y = \sum_{i=1}^{k} \left( \frac{X_i - \mu_i}{\sigma_i} \right)^2 \quad (16)$$

The chi-squared distribution has a pdf (for $k = 2$) of:

$$p(y) = \frac{e^{-y/2}}{2\,\Gamma(1)} = \frac{1}{2} e^{-y/2} \quad (17)$$

Using our knowledge of $x[n]$ (zero mean, variance $\sigma^2$), we can write:

$$Y = \frac{x^2[0]}{\sigma^2} + \frac{x^2[1]}{\sigma^2} \quad (18)$$

Comparing this to the expression for $\hat{\sigma}^2$, we can see that

$$\hat{\sigma}^2 = \frac{Y \sigma^2}{2} \quad (19)$$

Next, if we know the pdf of a random variable $X$, it is possible to calculate the pdf of another variable $Y$ that is related to $X$. The basic relationship is:

$$|f_Y(y)\,dy| = |f_X(x)\,dx| \quad (20)$$

$$f_Y(y) = \frac{f_X(x)}{|dy/dx|} \quad (21)$$

So in our case, with $y = 2\hat{\sigma}^2/\sigma^2$,

$$p(\hat{\sigma}^2) = \frac{\frac{1}{2} e^{-\frac{1}{2} \cdot \frac{2\hat{\sigma}^2}{\sigma^2}}}{\sigma^2/2} = \frac{1}{\sigma^2} e^{-\hat{\sigma}^2/\sigma^2} \quad (22)$$

This pdf is only defined for $\hat{\sigma}^2 \ge 0$. From the equation it is evident that the pdf is a decaying exponential, which is not symmetric (in particular, not symmetric about $\sigma^2$).
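A simulation sketch of my own ($\sigma^2 = 1.5$ is an arbitrary choice) agrees with the exponential pdf in (22), whose mean and variance are $\sigma^2$ and $\sigma^4$:

```python
import numpy as np

rng = np.random.default_rng(3)
sigma2, trials = 1.5, 500_000

# trials realizations of sigma2_hat = (x^2[0] + x^2[1]) / 2
x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, 2))
sigma2_hat = 0.5 * np.sum(x**2, axis=1)

print(np.mean(sigma2_hat))  # ~ sigma2 = 1.5 (unbiased; mean of the exponential pdf)
print(np.var(sigma2_hat))   # ~ sigma2**2 = 2.25 (variance of that exponential)
```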

Question 6

For the given estimator, we can define the mean and the variance:

$$\hat{A} = \sum_{n=0}^{N-1} a_n x[n] \quad (23)$$

$$E\{\hat{A}\} = \sum_{n=0}^{N-1} a_n E\{x[n]\} = \sum_{n=0}^{N-1} a_n A \quad (24)$$

$$\mathrm{var}\{\hat{A}\} = \sum_{n=0}^{N-1} a_n^2 \mathrm{var}\{x[n]\} = \sum_{n=0}^{N-1} a_n^2 \sigma^2 \quad (25)$$

Given the unbiasedness constraint $E\{\hat{A}\} = A$, we can say that

$$\sum_{n=0}^{N-1} a_n = 1 \quad (26)$$


Then we minimize the variance using Lagrange multipliers:

$$J = \mathrm{var}\{\hat{A}\} + \lambda \left( \sum_{n=0}^{N-1} a_n - 1 \right) = \sum_{n=0}^{N-1} a_n^2 \sigma^2 + \lambda \left( \sum_{n=0}^{N-1} a_n - 1 \right) = \sum_{n=0}^{N-1} a_n^2 \sigma^2 + \lambda \sum_{n=0}^{N-1} a_n - \lambda \quad (27)$$

$$\frac{\partial J}{\partial a_i} = 2 a_i \sigma^2 + \lambda = 0 \quad (28)$$

$$a_i = \frac{-\lambda}{2\sigma^2} \quad (29)$$

The value of $a_i$ is a constant that does not depend on $i$, which means all the $a_i$ are equal. From the constraint we can then say that $N a_i = 1$, or $a_i = 1/N$; a short numerical check follows.
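This check is my own addition ($N = 8$, $\sigma^2 = 1$ are arbitrary): any other weights summing to one give a larger variance than $a_n = 1/N$.

```python
import numpy as np

rng = np.random.default_rng(4)
N, sigma2 = 8, 1.0

best = sigma2 / N  # variance achieved by the equal weights a_n = 1/N

for _ in range(5):
    a = rng.random(N)
    a /= a.sum()  # random weights satisfying the constraint sum(a_n) = 1
    # var{A_hat} = sigma^2 * sum(a_n^2) is never below sigma^2/N
    print(sigma2 * np.sum(a**2) >= best)  # True every time
```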

Question 7

When we are interested in evaluating something of the form $\Pr\{|\hat{\theta} - \theta| > \epsilon\}$, we can look at z-scores (a.k.a. the standard score). The z-score is a normalized metric; writing $\hat{\theta}_1$ for the estimator with the smaller variance and $\hat{\theta}_2$ for the other, the claim to show is:

$$\Pr\left\{ \frac{|\hat{\theta}_1 - \theta|}{\sqrt{\mathrm{var}(\hat{\theta}_1)}} > \frac{\epsilon}{\sqrt{\mathrm{var}(\hat{\theta}_1)}} \right\} < \Pr\left\{ \frac{|\hat{\theta}_2 - \theta|}{\sqrt{\mathrm{var}(\hat{\theta}_2)}} > \frac{\epsilon}{\sqrt{\mathrm{var}(\hat{\theta}_2)}} \right\} \quad (30)$$

The probability (Pr above) can be calculated as:

$$\Pr(x > a) = \int_a^\infty \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx \quad (31)$$

This rewrites the expression as:

$$\int_a^\infty \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx < \int_b^\infty \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx \quad (32)$$

Here, $a = \epsilon/\sqrt{\mathrm{var}(\hat{\theta}_1)}$ and $b = \epsilon/\sqrt{\mathrm{var}(\hat{\theta}_2)}$, so $a > b$. Since we are integrating a decaying Gaussian tail, moving the lower limit outward toward $\infty$ shrinks the integral, making the probability associated with $\hat{\theta}_1$ smaller than that for $\hat{\theta}_2$.

Question 8

Similarly to the previous question, we can normalize the probability expression and rewrite it in terms of an

integral:

$$\Pr\left\{ \frac{|\hat{A} - A|}{\sqrt{\mathrm{var}(\hat{A})}} > \frac{\epsilon}{\sqrt{\mathrm{var}(\hat{A})}} \right\} = 2 \int_{\epsilon/\sqrt{\mathrm{var}(\hat{A})}}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx \quad (33)$$

The lower limit on the integral is $\epsilon/\sqrt{\mathrm{var}(\hat{A})}$, where $\mathrm{var}(\hat{A}) = \sigma^2/N$, making the limit $\epsilon\sqrt{N}/\sigma$. The value of the integral shrinks as the lower limit grows, since the integrand is a decaying exponential past $x = 0$ (and the limit is always positive because we are looking at absolute values). As $N \to \infty$, $\epsilon\sqrt{N}/\sigma \to \infty$, and hence the probability $\to 0$.

Next, we look at the modified estimator $\check{A} = \frac{1}{2N} \sum_{n=0}^{N-1} x[n]$. The variance of this estimator is:

$$\mathrm{var}(\check{A}) = \frac{\sigma^2}{4N} \quad (34)$$


At first glance this might look like it will reduce the probability faster than the previous estimator, but if we analyze $\check{A}$ more closely, we will notice that it is biased (its expected value is centered at $A/2$). The estimate therefore concentrates around the wrong value $A/2$, so for any $\epsilon < |A|/2$, $|\check{A} - A|$ eventually exceeds $\epsilon$ and $\Pr \to 1$ as $N \to \infty$.
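A simulation sketch (my own addition; $A = 1$, $\sigma = 1$, $\epsilon = 0.1$ are arbitrary choices) contrasts the two estimators:

```python
import numpy as np

rng = np.random.default_rng(5)
A, sigma, eps, trials = 1.0, 1.0, 0.1, 50_000

for N in (10, 100, 1000):
    x = rng.normal(A, sigma, size=(trials, N))
    a_hat = np.mean(x, axis=1)             # unbiased: (1/N) * sum(x[n])
    a_check = np.sum(x, axis=1) / (2 * N)  # biased: centered at A/2
    print(N,
          np.mean(np.abs(a_hat - A) > eps),    # -> 0 as N grows
          np.mean(np.abs(a_check - A) > eps))  # -> 1 as N grows
```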

Question 9

In Example 2.1 the estimator was:

$$\hat{A} = \frac{1}{N} \sum_{n=0}^{N-1} x[n] \quad (35)$$

$$E\{\hat{A}\} = A \quad (36)$$

Now, for $\hat{\theta} = \hat{A}^2$,

$$E\{\hat{\theta}\} = \frac{1}{N^2} E\left\{ \left( \sum_{n=0}^{N-1} x[n] \right)^2 \right\} = \frac{1}{N^2} \left( E\left\{ \sum_{n=0}^{N-1} x[n] \right\}^2 + \mathrm{var}\left\{ \sum_{n=0}^{N-1} x[n] \right\} \right) \quad (37)$$

$$E\{\hat{\theta}\} = \frac{1}{N^2} \left( (NA)^2 + N\sigma^2 \right) = \frac{NA^2 + \sigma^2}{N} = A^2 + \frac{\sigma^2}{N} = \theta + \frac{\sigma^2}{N} \quad (38)$$

Since the expected value is not equal to $\theta$, the estimator is biased (although the bias $\sigma^2/N$ vanishes as $N \to \infty$).
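A Monte Carlo sketch of my own ($A = 2$, $\sigma^2 = 1$, $N = 20$ chosen arbitrarily) reproduces the predicted bias of $\sigma^2/N$:

```python
import numpy as np

rng = np.random.default_rng(6)
A, sigma2, N, trials = 2.0, 1.0, 20, 200_000

x = rng.normal(A, np.sqrt(sigma2), size=(trials, N))
theta_hat = np.mean(x, axis=1) ** 2  # theta_hat = A_hat^2

print(np.mean(theta_hat) - A**2)  # ~ sigma2/N = 0.05, the bias from equation (38)
```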

Question 10

We already know that the estimator $\hat{A}$ is unbiased. We only need to find the expected value of $\hat{\sigma}^2 = \frac{1}{N-1} \sum_{n=0}^{N-1} (x[n] - \hat{A})^2$. By symmetry, every term in the sum has the same expectation, so:

$$E\{\hat{\sigma}^2\} = \frac{N}{N-1} E\{(x[n] - \hat{A})^2\} = \frac{N}{N-1} \left( E\{x[n] - \hat{A}\}^2 + \mathrm{var}\{x[n] - \hat{A}\} \right)$$

$$= \frac{N}{N-1} \left[ \left( E\{x[n]\} - E\left\{ \frac{1}{N} \sum_{m=0}^{N-1} x[m] \right\} \right)^2 + \mathrm{var}\left\{ \frac{N-1}{N} x[n] - \frac{1}{N} \sum_{m=0,\, m \neq n}^{N-1} x[m] \right\} \right]$$

$$= \frac{N}{N-1} \left( (A - A)^2 + \frac{(N-1)^2}{N^2} \sigma^2 + \frac{N-1}{N^2} \sigma^2 \right) = \frac{N}{N-1} \cdot \frac{N-1}{N} \sigma^2 = \sigma^2$$

Thus the estimator is unbiased.
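A quick check of my own ($A = 1$, $\sigma^2 = 2$, $N = 10$ arbitrary), using numpy's `ddof` argument to switch between the $1/(N-1)$ and $1/N$ normalizations:

```python
import numpy as np

rng = np.random.default_rng(7)
A, sigma2, N, trials = 1.0, 2.0, 10, 200_000

x = rng.normal(A, np.sqrt(sigma2), size=(trials, N))

print(np.mean(np.var(x, axis=1, ddof=1)))  # ~ sigma2 = 2.0, the unbiased form
print(np.mean(np.var(x, axis=1, ddof=0)))  # ~ sigma2*(N-1)/N = 1.8, biased low
```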

Question 11

We are dealing with a uniform distribution over $(0, 1/\theta)$, the pdf of which is defined as:

$$f_X(x) = \frac{1}{b - a} = \frac{1}{1/\theta - 0} = \theta \quad (39)$$

Because we want an unbiased estimator, we also know that

$$E\{\hat{\theta}\} = \theta \quad (40)$$


In general we can also write the expected value as

$$E\{\hat{\theta}\} = \theta = \int_{-\infty}^{\infty} g(x) f(x)\,dx \quad (41)$$

Here $g(x)$ is a measurable function of $x$ (the estimator applied to the single data sample), and $f(x)$ is the pdf of $x$.

$$\theta = \int_{-\infty}^{\infty} g(x) f(x)\,dx = \int_0^{1/\theta} g(x[0])\,\theta\,d(x[0])$$

Cancelling out θ, we get:

$$1 = \int_0^{1/\theta} g(x[0])\,d(x[0]) = \int_0^{1/\theta} g(u)\,du$$

Next we need to prove that a function g(x[0]) cannot be found to satisfy this condition for all θ > 0.

Let's take two values of $\theta$, $\theta_1$ and $\theta_2$, that are not equal; say $\theta_1 < \theta_2$, so that $1/\theta_2 < 1/\theta_1$. Then,

$$1 = \int_0^{1/\theta_1} g(u)\,du, \qquad 1 = \int_0^{1/\theta_2} g(u)\,du$$

Subtracting these two results in:

$$0 = \int_{1/\theta_2}^{1/\theta_1} g(u)\,du$$

But since $\theta_1$ and $\theta_2$ are arbitrary, this must hold over every such interval, which is only possible when $g(u) = 0$ almost everywhere. A zero function does not represent $\theta$ and makes the estimator biased, so no unbiased estimator exists for all $\theta > 0$.
