
6.434J/16.391J Statistics for Engineers and Scientists (MIT, Spring 2006), Apr 11, Handout #13

Solution 5

Problem 1: Let a > 0 be a known constant, and let θ > 0 be a parameter. Suppose X1, X2, . . . , Xn is a sample from a population with one of the following densities.

(a) The beta, β(θ, 1), density: $f_X(x \mid \theta) = \theta x^{\theta-1}$, for 0 < x < 1.

(b) The Weibull density: $f_X(x \mid \theta) = \theta a x^{a-1} e^{-\theta x^a}$, for x > 0.

(c) The Pareto density: $f_X(x \mid \theta) = \theta a^{\theta} x^{-(\theta+1)}$, for x > a.

In each case, find a real-valued sufficient statistic for θ.

Solution Let $X \triangleq (X_1, X_2, \dots, X_n)$ be a collection of i.i.d. random variables $X_i$, and let $x \triangleq (x_1, x_2, \dots, x_n)$ be a collection of observed data.

(a) For any x, the joint pdf is

\[
f_X(x \mid \theta) =
\begin{cases}
\theta^n (x_1 x_2 \cdots x_n)^{\theta-1}, & \text{if } \forall i,\ 0 < x_i < 1; \\
0, & \text{otherwise;}
\end{cases}
\]
\[
= \underbrace{\theta^n (x_1 x_2 \cdots x_n)^{\theta-1}}_{\triangleq\, g(T(x) \mid \theta)}
\times \underbrace{I_{(0,1)}(x_1)\, I_{(0,1)}(x_2) \cdots I_{(0,1)}(x_n)}_{\triangleq\, h(x)} .
\]

The factorization theorem implies that $T(x) \triangleq x_1 x_2 \cdots x_n$ is a sufficient statistic for θ.

(b) For any x, the joint pdf is

\[
f_X(x \mid \theta) =
\begin{cases}
\theta^n a^n (x_1 x_2 \cdots x_n)^{a-1} e^{-\theta \sum_{i=1}^{n} x_i^a}, & \text{if } \forall i,\ x_i > 0; \\
0, & \text{otherwise;}
\end{cases}
\]
\[
= \underbrace{\theta^n e^{-\theta \sum_{i=1}^{n} x_i^a}}_{\triangleq\, g(T(x) \mid \theta)}
\times \underbrace{a^n (x_1 x_2 \cdots x_n)^{a-1}\, I_{(0,\infty)}(x_1)\, I_{(0,\infty)}(x_2) \cdots I_{(0,\infty)}(x_n)}_{\triangleq\, h(x)} .
\]


The factorization theorem implies that $T(x) \triangleq \sum_{i=1}^{n} x_i^a$ is a sufficient statistic for θ.

(c) For any x, the joint pdf is

\[
f_X(x \mid \theta) =
\begin{cases}
\dfrac{\theta^n a^{n\theta}}{(x_1 x_2 \cdots x_n)^{\theta+1}}, & \text{if } \forall i,\ x_i > a; \\
0, & \text{otherwise;}
\end{cases}
\]
\[
= \underbrace{\dfrac{\theta^n a^{n\theta}}{(x_1 x_2 \cdots x_n)^{\theta+1}}}_{\triangleq\, g(T(x) \mid \theta)}
\times \underbrace{I_{(a,\infty)}(x_1)\, I_{(a,\infty)}(x_2) \cdots I_{(a,\infty)}(x_n)}_{\triangleq\, h(x)} .
\]

The factorization theorem implies that $T(x) \triangleq x_1 x_2 \cdots x_n$ is a sufficient statistic for θ.
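As a quick numerical sanity check (not part of the original handout), the following Python sketch, assuming numpy is available, illustrates the factorization-theorem conclusion for all three densities: two datasets sharing the same value of T produce a log-likelihood difference that is constant in θ, so the data enter the likelihood only through T. The sample values and the choices a = 2 (Weibull) and a = 1 (Pareto) are illustrative.

```python
import numpy as np

def ll_beta(x, th):              # log of theta^n * prod(x)^(theta-1) on (0, 1)
    return len(x) * np.log(th) + (th - 1) * np.log(x).sum()

def ll_weibull(x, th, a=2.0):    # log of theta*a*x^(a-1)*exp(-theta*x^a), x > 0
    return (len(x) * np.log(th * a) + (a - 1) * np.log(x).sum()
            - th * (x ** a).sum())

def ll_pareto(x, th, a=1.0):     # log of theta*a^theta*x^(-(theta+1)), x > a
    return (len(x) * np.log(th) + len(x) * th * np.log(a)
            - (th + 1) * np.log(x).sum())

thetas = np.linspace(0.5, 5.0, 20)

# (a) equal product of observations, hence equal T(x) = x1*...*xn
x, y = np.array([0.2, 0.3]), np.array([0.1, 0.6])           # both products 0.06
d = [ll_beta(x, t) - ll_beta(y, t) for t in thetas]
assert np.ptp(d) < 1e-12                                    # difference is theta-free

# (b) equal sum of x_i^a (a = 2), hence equal T(x) = sum x_i^a
x, y = np.array([1.0, 2.0]), np.sqrt(np.array([0.5, 4.5])) # both sums of squares 5
d = [ll_weibull(x, t) - ll_weibull(y, t) for t in thetas]
assert np.ptp(d) < 1e-12

# (c) equal product, all observations above a = 1
x, y = np.array([2.0, 3.0]), np.array([1.5, 4.0])           # both products 6
d = [ll_pareto(x, t) - ll_pareto(y, t) for t in thetas]
assert np.ptp(d) < 1e-12
print("likelihood ratios are constant in theta whenever T agrees")
```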

Problem 2:

a) Let X1, X2, . . . , Xn be independent random variables, each uniformly distributed on the interval [−θ, θ], for some θ > 0. Find a sufficient statistic for θ.

b) Let X1, X2, . . . , Xn be a random sample of size n from a normal N(θ, θ) distribution, for some θ > 0. Find a sufficient statistic for θ.

Solution

a) For any $x \triangleq (x_1, x_2, \dots, x_n)$, the joint pdf is given by

\[
f_X(x \mid \theta) =
\begin{cases}
\left(\dfrac{1}{2\theta}\right)^{n}, & \text{if } \forall i,\ -\theta \le x_i \le \theta; \\
0, & \text{otherwise;}
\end{cases}
\]
\[
=
\begin{cases}
\left(\dfrac{1}{2\theta}\right)^{n}, & \text{if } -\theta \le \min(x_1, \dots, x_n) \text{ and } \max(x_1, \dots, x_n) \le \theta; \\
0, & \text{otherwise;}
\end{cases}
\]
\[
= \underbrace{\left(\frac{1}{2\theta}\right)^{n} I_{[-\theta,\infty)}\big(\min(x_1, \dots, x_n)\big)\, I_{(-\infty,\theta]}\big(\max(x_1, \dots, x_n)\big)}_{\triangleq\, g(T(x) \mid \theta)} \times \underbrace{1}_{\triangleq\, h(x)} .
\]


The factorization theorem implies that $T(x) \triangleq \big(\min(x_1, \dots, x_n),\ \max(x_1, \dots, x_n)\big)$ is jointly sufficient for θ.
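A small numerical illustration of this (a Python sketch assuming numpy; not from the handout): two samples sharing the same minimum and maximum have identical likelihoods at every θ, which is exactly what sufficiency of (min, max) means here. The sample values are arbitrary.

```python
import numpy as np

def lik_uniform(x, th):
    """Joint pdf of n i.i.d. Uniform[-theta, theta] observations."""
    if x.min() >= -th and x.max() <= th:
        return (1.0 / (2.0 * th)) ** len(x)
    return 0.0

# Same min (-0.5) and max (0.9), different interior point.
x = np.array([-0.5, 0.2, 0.9])
y = np.array([-0.5, 0.7, 0.9])

for th in [0.5, 0.9, 1.0, 2.0, 10.0]:   # includes thetas where the pdf is zero
    assert lik_uniform(x, th) == lik_uniform(y, th)
print("likelihood depends on the data only through (min, max)")
```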

b) For any $x \triangleq (x_1, x_2, \dots, x_n)$, the joint pdf is given by

\[
f_X(x \mid \theta) = \left(\frac{1}{\sqrt{2\pi\theta}}\right)^{n} e^{-\frac{1}{2\theta} \sum_{i=1}^{n} (x_i - \theta)^2}
= \left(\frac{1}{\sqrt{2\pi\theta}}\right)^{n} e^{-\frac{1}{2\theta} \left( \sum_{i=1}^{n} x_i^2 \,-\, 2\theta \sum_{i=1}^{n} x_i \,+\, n\theta^2 \right)}
\]
\[
= \left(\frac{1}{\sqrt{2\pi\theta}}\right)^{n} e^{-\frac{1}{2\theta} \sum_{i=1}^{n} x_i^2 \,+\, \sum_{i=1}^{n} x_i \,-\, \frac{n\theta}{2}}
= \underbrace{\left(\frac{1}{\sqrt{2\pi}}\right)^{n} e^{\sum_{i=1}^{n} x_i}}_{\triangleq\, h(x)}
\times \underbrace{\left(\frac{1}{\sqrt{\theta}}\right)^{n} e^{-\frac{1}{2\theta} \sum_{i=1}^{n} x_i^2 \,-\, \frac{n\theta}{2}}}_{\triangleq\, g(T(x) \mid \theta)} .
\]

The factorization theorem implies that $T(x) \triangleq \sum_{i=1}^{n} x_i^2$ is a sufficient statistic for θ.
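To double-check the algebra above, the following sketch (assuming numpy and scipy are available; not part of the handout) confirms that $h(x)\, g(T(x) \mid \theta)$ reproduces the joint N(θ, θ) density for an arbitrary small sample:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=5)              # any small sample works

for th in [0.5, 1.0, 2.0]:
    joint = norm.pdf(x, loc=th, scale=np.sqrt(th)).prod()
    n, T = len(x), (x ** 2).sum()
    h = (2 * np.pi) ** (-n / 2) * np.exp(x.sum())      # h(x), theta-free
    g = th ** (-n / 2) * np.exp(-T / (2 * th) - n * th / 2)  # g(T | theta)
    assert np.isclose(joint, h * g)
print("h(x) * g(T(x) | theta) matches the joint pdf")
```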

Problem 3: Let X be the number of trials up to (and including) the first success in a sequence of Bernoulli trials with probability of success θ, for 0 < θ < 1. Then X has a geometric distribution with parameter θ:

\[
P_\theta\{X = k\} = (1 - \theta)^{k-1}\, \theta, \qquad k = 1, 2, 3, \dots .
\]

Show that the family of geometric distributions is a one-parameter exponential family with T(x) = x. [Hint: $x^{\alpha} = e^{\alpha \ln x}$, for x > 0.]

Solution Recall that the pmf of a one-parameter (θ) exponential family is of the form

\[
p(x \mid \theta) = h(x)\, e^{\eta(\theta) T(x) - B(\theta)},
\]

where $x \in \mathcal{X}$. Rewriting the pmf of a geometric random variable yields

\[
P_\theta\{X = x\} = e^{(x-1)\ln(1-\theta) + \ln\theta}
= e^{x \ln(1-\theta) - (\ln(1-\theta) - \ln\theta)},
\]


where $x \in \{1, 2, 3, \dots\}$. Thus, the geometric distribution is a one-parameter exponential family with

\[
h(x) \triangleq 1, \qquad
\eta(\theta) \triangleq \ln(1-\theta), \qquad
T(x) \triangleq x, \qquad
B(\theta) \triangleq \ln(1-\theta) - \ln\theta, \qquad
\mathcal{X} \triangleq \{1, 2, 3, \dots\}.
\]
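As a sanity check (a Python sketch assuming scipy, whose geom distribution uses the same support {1, 2, . . . }; not part of the handout), the exponential-family form reproduces the geometric pmf exactly:

```python
import numpy as np
from scipy.stats import geom

def expfam_pmf(x, th):
    # h(x) = 1, eta = ln(1-th), T(x) = x, B = ln(1-th) - ln(th)
    return np.exp(x * np.log(1 - th) - (np.log(1 - th) - np.log(th)))

ks = np.arange(1, 11)
for th in [0.1, 0.5, 0.9]:
    assert np.allclose(expfam_pmf(ks, th), geom.pmf(ks, th))
print("exponential-family form matches the geometric pmf")
```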

Problem 4: Let X1, X2, . . . , Xn be a random sample of size n from the truncated Bernoulli probability mass function (pmf),

\[
P\{X = x \mid p\} =
\begin{cases}
p, & \text{if } x = 1; \\
1 - p, & \text{if } x = 0.
\end{cases}
\]

(a) Show that the joint pmf of X1, X2, . . . , Xn is a member of the exponential family of distributions.

(b) Find a minimal sufficient statistic for p.

Solution

(a) Let $x \triangleq (x_1, x_2, \dots, x_n)$ denote the observed values of the i.i.d. Bernoulli random variables. The joint pmf is given by

\[
P\{X = x \mid p\}
= \left[ p^{x_1} (1-p)^{1-x_1} \right] \left[ p^{x_2} (1-p)^{1-x_2} \right] \cdots \left[ p^{x_n} (1-p)^{1-x_n} \right]
= p^{\sum_{i=1}^{n} x_i} (1-p)^{n - \sum_{i=1}^{n} x_i}
\]
\[
= e^{(\ln p) \sum_{i=1}^{n} x_i}\, e^{[\ln(1-p)]\left[ n - \sum_{i=1}^{n} x_i \right]}
= e^{[\ln p - \ln(1-p)] \sum_{i=1}^{n} x_i + n \ln(1-p)},
\]

for $x \in \{0,1\}^n$. Therefore, the joint pmf is a member of the exponential family, with the mappings

\[
\theta = p, \qquad
h(x) = 1, \qquad
\eta(p) = \ln p - \ln(1-p), \qquad
T(x) = \sum_{i=1}^{n} x_i, \qquad
B(p) = -n \ln(1-p), \qquad
\mathcal{X} = \{0,1\}^n.
\]

(b) Let $x, y \in \{0,1\}^n$ be given. Consider the likelihood ratio,

\[
\frac{P\{X = x \mid p\}}{P\{X = y \mid p\}}
= e^{[\ln p - \ln(1-p)] \left[ \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i \right]} .
\]


Define a function $k(x, y) \triangleq h(x)/h(y) = 1$, which is bounded and non-zero for any $x \in \mathcal{X}$ and $y \in \mathcal{X}$. The likelihood ratio above is independent of p exactly when $\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$, so such x and y are equivalent under the likelihood-ratio partition, with the constant function $k(x, y)$ satisfying its requirement. Therefore, $T(x) \triangleq \sum_{i=1}^{n} x_i$ is a minimal sufficient statistic for p.
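A quick check of the exponential-family form in part (a) (a Python sketch assuming numpy; not part of the handout):

```python
import numpy as np

def joint_pmf(x, p):
    """Product of Bernoulli(p) pmfs."""
    return np.prod(p ** x * (1 - p) ** (1 - x))

def expfam_pmf(x, p):
    # eta(p) = ln p - ln(1-p), T(x) = sum x_i, B(p) = -n ln(1-p), h(x) = 1
    n, T = len(x), x.sum()
    return np.exp((np.log(p) - np.log(1 - p)) * T + n * np.log(1 - p))

x = np.array([1, 0, 1, 1, 0])
for p in [0.2, 0.5, 0.8]:
    assert np.isclose(joint_pmf(x, p), expfam_pmf(x, p))
print("joint Bernoulli pmf matches its exponential-family form")
```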

Problem 5: Let X1, X2, . . . , Xm and Y1, Y2, . . . , Yn be two independent samples from N(µ, σ²) and N(µ, τ²) populations, respectively. Here, −∞ < µ < ∞, σ² > 0, and τ² > 0. Find a minimal sufficient statistic for $\theta \triangleq (\mu, \sigma^2, \tau^2)$.

Solution Let $X \triangleq (X_1, X_2, \dots, X_m)$ and $Y \triangleq (Y_1, Y_2, \dots, Y_n)$ denote the collections of random samples. The joint pdf (of the $X_j$'s and $Y_i$'s), evaluated at $x \triangleq (x_1, x_2, \dots, x_m)$ and $y \triangleq (y_1, y_2, \dots, y_n)$, is given by

\[
f_{X,Y}(x, y \mid \theta)
= \left(\frac{1}{\sqrt{2\pi\sigma^2}}\right)^{m} e^{-\frac{\sum_{j=1}^{m} (x_j - \mu)^2}{2\sigma^2}}
\cdot \left(\frac{1}{\sqrt{2\pi\tau^2}}\right)^{n} e^{-\frac{\sum_{i=1}^{n} (y_i - \mu)^2}{2\tau^2}}
\]
\[
= e^{-\frac{1}{2\sigma^2} \sum_{j=1}^{m} x_j^2 \,-\, \frac{1}{2\tau^2} \sum_{i=1}^{n} y_i^2 \,+\, \frac{\mu}{\sigma^2} \sum_{j=1}^{m} x_j \,+\, \frac{\mu}{\tau^2} \sum_{i=1}^{n} y_i \,-\, B(\mu, \sigma^2, \tau^2)},
\]

where $B(\mu, \sigma^2, \tau^2) \triangleq \frac{m}{2} \ln 2\pi\sigma^2 + \frac{n}{2} \ln 2\pi\tau^2 + \frac{m\mu^2}{2\sigma^2} + \frac{n\mu^2}{2\tau^2}$.

Notice that the joint pdf belongs to the exponential family, so the minimal sufficient statistic for θ is given by

\[
T(X, Y) \triangleq \left( \sum_{j=1}^{m} X_j^2,\ \sum_{i=1}^{n} Y_i^2,\ \sum_{j=1}^{m} X_j,\ \sum_{i=1}^{n} Y_i \right).
\]

Note: One should not be surprised that the joint pdf belongs to the exponential family of distributions. Recall that the Gaussian distribution is a member of the exponential family and that the random variables, the $X_j$'s and $Y_i$'s, are mutually independent; thus, their joint pdf belongs to the exponential family as well.
Note: To derive the minimal sufficient statistic, one may alternatively consider the likelihood-ratio partition.

The set $D_0$ is defined to be

\[
D_0 \triangleq \left\{ (x, y) \in \mathbb{R}^{m+n} \,\middle|\, \text{for all } \mu,\ \text{all } \sigma^2 > 0,\ \text{all } \tau^2 > 0:\ f_{X,Y}(x, y \mid \mu, \sigma^2, \tau^2) = 0 \right\} = \emptyset \ \text{(the empty set)}.
\]


Let $(x, y) \notin D_0$ and $(v, w) \notin D_0$ be given. Their likelihood ratio is given by

\[
\frac{f_{X,Y}(x, y \mid \theta)}{f_{X,Y}(v, w \mid \theta)}
= \exp\left\{ -\frac{1}{2\sigma^2} \left( \sum_{j=1}^{m} x_j^2 - \sum_{j=1}^{m} v_j^2 \right)
- \frac{1}{2\tau^2} \left( \sum_{i=1}^{n} y_i^2 - \sum_{i=1}^{n} w_i^2 \right)
+ \frac{\mu}{\sigma^2} \left( \sum_{j=1}^{m} x_j - \sum_{j=1}^{m} v_j \right)
+ \frac{\mu}{\tau^2} \left( \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} w_i \right) \right\}.
\]

By definition, $(x, y) \notin D_0$ and $(v, w) \notin D_0$ are equivalent iff there exists a function $0 < k(\cdot, \cdot, \cdot, \cdot) < \infty$, independent of θ, such that

\[
\frac{f_{X,Y}(x, y \mid \theta)}{f_{X,Y}(v, w \mid \theta)} = k(x, y, v, w).
\]

The likelihood ratio implies that $(x, y) \notin D_0$ and $(v, w) \notin D_0$ are equivalent if and only if

\[
\sum_{j=1}^{m} x_j^2 = \sum_{j=1}^{m} v_j^2, \qquad (1)
\]
\[
\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} w_i^2, \qquad (2)
\]
\[
\sum_{j=1}^{m} x_j = \sum_{j=1}^{m} v_j, \qquad \text{and} \qquad (3)
\]
\[
\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} w_i, \qquad (4)
\]

in which case $k(x, y, v, w) \triangleq 1$. That is, $(x, y)$ and $(v, w)$ are in the same equivalence class iff conditions (1)-(4) are satisfied. Then a representative of the equivalence class is given by

\[
T(X, Y) \triangleq \left( \sum_{j=1}^{m} X_j^2,\ \sum_{i=1}^{n} Y_i^2,\ \sum_{j=1}^{m} X_j,\ \sum_{i=1}^{n} Y_i \right).
\]

Thus, we have a minimal sufficient statistic, $T(X, Y)$.
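A numerical sketch of the likelihood-ratio argument (Python, assuming numpy and scipy; not from the handout): two pairs of samples that agree in all four components of T give a likelihood ratio of 1 for every θ = (µ, σ², τ²). The particular samples below are constructed by hand to have equal sums and sums of squares.

```python
import numpy as np
from scipy.stats import norm

def loglik(x, y, mu, s2, t2):
    return (norm.logpdf(x, mu, np.sqrt(s2)).sum()
            + norm.logpdf(y, mu, np.sqrt(t2)).sum())

# (x, y) and (v, w) differ as datasets but share
# (sum X^2, sum Y^2, sum X, sum Y) = (4, 1, 0, 0).
x = np.array([1.0, 1.0, -1.0, -1.0])
v = np.array([np.sqrt(2), -np.sqrt(2), 0.0, 0.0])
y = np.array([0.5, 0.5, -0.5, -0.5])
w = np.array([np.sqrt(0.5), -np.sqrt(0.5), 0.0, 0.0])

rng = np.random.default_rng(1)
for _ in range(5):
    mu, s2, t2 = rng.normal(), rng.uniform(0.5, 3), rng.uniform(0.5, 3)
    assert np.isclose(loglik(x, y, mu, s2, t2), loglik(v, w, mu, s2, t2))
print("equal T implies a theta-free (here, unit) likelihood ratio")
```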

Problem 6: The two hypotheses about the probability density $f_X(x)$ of an observed random variable X are

\[
H_1:\ f_X(x) = \tfrac{1}{2}\, e^{-|x|}, \quad \text{for any } x;
\]
\[
H_0:\ f_X(x) = \tfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} x^2}, \quad \text{for any } x.
\]


(a) Find the likelihood ratio Λ(x).

(b) The test is of the form

\[
\Lambda(x) \ \underset{H_0}{\overset{H_1}{\gtrless}}\ \eta .
\]

Compute the decision regions for various values of the threshold η.

Solution

(a) Let $x \in \mathbb{R}$ denote an observation. The likelihood ratio is given by

\[
\Lambda(x) \triangleq \frac{f_{X \mid H}(x \mid H_1)}{f_{X \mid H}(x \mid H_0)} .
\]

Substituting the densities of the random variable X (under hypothesis H1 and under hypothesis H0) yields the likelihood ratio

\[
\Lambda(x) = \frac{\tfrac{1}{2}\, e^{-|x|}}{\tfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} x^2}}
= \sqrt{\frac{\pi}{2}}\, e^{\frac{1}{2} x^2 - |x|} .
\]
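A one-line numerical confirmation of this closed form (a Python sketch assuming scipy; the standard Laplace density is exactly $\tfrac{1}{2} e^{-|x|}$):

```python
import numpy as np
from scipy.stats import laplace, norm

xs = np.linspace(-4, 4, 101)
lam = np.sqrt(np.pi / 2) * np.exp(0.5 * xs ** 2 - np.abs(xs))
assert np.allclose(lam, laplace.pdf(xs) / norm.pdf(xs))
print("closed-form likelihood ratio matches the density ratio")
```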

(b) The decision region for hypothesis H1, $R_1$, is the set of points x that give rise to the output decision H1:

\[
R_1 \triangleq \{x \mid \text{the test decides } H_1 \text{ on input } x\} = \{x \mid \Lambda(x) > \eta\}.
\]

Similarly, the decision region for hypothesis H0, $R_0$, is given by

\[
R_0 \triangleq \{x \mid \text{the test decides } H_0 \text{ on input } x\} = \{x \mid \Lambda(x) \le \eta\} = \mathbb{R} \setminus R_1,
\]

where the symbol "$\setminus$" denotes the set difference.

Substituting the expression of the likelihood ratio from part (a) yields the following definition of decision region $R_1$:

\[
R_1 = \left\{ x \,\middle|\, \sqrt{\frac{\pi}{2}}\, e^{\frac{1}{2} x^2 - |x|} > \eta \right\}.
\]

When η ≤ 0, we have $R_1 = \mathbb{R}$, since $e^y > 0 \ge \eta$ for any y. Thus, we consider the case η > 0 below.


Taking the natural log of both sides of the inequality and writing $x^2$ as $|x|^2$ yields

\[
R_1 = \left\{ x \,\middle|\, \tfrac{1}{2} |x|^2 - |x| - \ln\!\left( \eta \tfrac{\sqrt{2}}{\sqrt{\pi}} \right) > 0 \right\}.
\]

Viewed as a quadratic in $t = |x|$, the left-hand side $\tfrac{1}{2} t^2 - t - \ln(\eta\sqrt{2}/\sqrt{\pi})$ has discriminant $1 + 2\ln(\eta\sqrt{2}/\sqrt{\pi})$. When $1 + 2\ln(\eta\sqrt{2}/\sqrt{\pi}) < 0$, that is, when $0 < \eta < \sqrt{\tfrac{\pi}{2}}\, e^{-\frac{1}{2}}$, the quadratic has no real roots; since it opens upward, it is strictly positive for every x, so $R_1 = \mathbb{R}$. (Equivalently: the minimum of $\Lambda(x)$ is $\sqrt{\pi/2}\, e^{-1/2}$, attained at $|x| = 1$, so any smaller threshold is exceeded everywhere.)

When $1 + 2\ln(\eta\sqrt{2}/\sqrt{\pi}) \ge 0$, or equivalently $\eta \ge \sqrt{\tfrac{\pi}{2}}\, e^{-\frac{1}{2}}$, the quadratic has the roots $1 \pm \sqrt{1 + 2\ln(\eta\sqrt{2}/\sqrt{\pi})}$, and

\[
R_1 = \left\{ x \,\middle|\, |x| > 1 + \sqrt{1 + 2\ln\!\left(\eta\tfrac{\sqrt{2}}{\sqrt{\pi}}\right)}
\ \text{ or } \
|x| < 1 - \sqrt{1 + 2\ln\!\left(\eta\tfrac{\sqrt{2}}{\sqrt{\pi}}\right)} \right\}.
\]

The lower root $1 - \sqrt{1 + 2\ln(\eta\sqrt{2}/\sqrt{\pi})}$ is positive iff $\ln(\eta\sqrt{2}/\sqrt{\pi}) < 0$, i.e., iff $\eta < \sqrt{\pi/2}$; in that case the second condition describes a nonempty interval around the origin. For $\eta \ge \sqrt{\pi/2}$ the lower root is nonpositive, so the condition $|x| < 1 - \sqrt{\cdots}$ is vacuous (an absolute value cannot be negative) and only the first condition remains.

Therefore, writing $c \triangleq \ln(\eta\sqrt{2}/\sqrt{\pi})$ for brevity, the decision region $R_1$ is given by

\[
R_1 =
\begin{cases}
\mathbb{R}, & \text{for } \eta < \sqrt{\tfrac{\pi}{2}}\, e^{-1/2} \ \text{(including all } \eta \le 0\text{)}; \\[4pt]
\left\{ x \,\middle|\, |x| > 1 + \sqrt{1 + 2c} \ \text{ or } \ |x| < 1 - \sqrt{1 + 2c} \right\}, & \text{for } \sqrt{\tfrac{\pi}{2}}\, e^{-1/2} \le \eta < \sqrt{\tfrac{\pi}{2}}; \\[4pt]
\left\{ x \,\middle|\, |x| > 1 + \sqrt{1 + 2c} \right\}, & \text{for } \eta \ge \sqrt{\tfrac{\pi}{2}},
\end{cases}
\]

while the decision region $R_0$ is given by

\[
R_0 = \mathbb{R} \setminus R_1 =
\begin{cases}
\emptyset, & \text{for } \eta < \sqrt{\tfrac{\pi}{2}}\, e^{-1/2}; \\[4pt]
\left\{ x \,\middle|\, 1 - \sqrt{1 + 2c} \le |x| \le 1 + \sqrt{1 + 2c} \right\}, & \text{for } \sqrt{\tfrac{\pi}{2}}\, e^{-1/2} \le \eta < \sqrt{\tfrac{\pi}{2}}; \\[4pt]
\left[\, -1 - \sqrt{1 + 2c},\ 1 + \sqrt{1 + 2c} \,\right], & \text{for } \eta \ge \sqrt{\tfrac{\pi}{2}}.
\end{cases}
\]
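The piecewise answer can be verified numerically; a minimal sketch (Python, assuming numpy and scipy; not part of the handout) compares direct evaluation of $\Lambda(x) > \eta$ with membership in $R_1$ for thresholds spanning all three regimes:

```python
import numpy as np
from scipy.stats import laplace, norm

def in_R1(x, eta):
    """Piecewise decision region for H1 derived above."""
    if eta < np.sqrt(np.pi / 2) * np.exp(-0.5):
        return True                              # Lambda(x) > eta everywhere
    r = np.sqrt(1 + 2 * np.log(eta * np.sqrt(2 / np.pi)))
    return abs(x) > 1 + r or abs(x) < 1 - r      # inner set vacuous once 1 - r <= 0

xs = np.linspace(-5, 5, 1001)
for eta in [-1.0, 0.3, 0.5, 1.0, 1.5, 5.0]:      # spans all three regimes
    for x in xs:
        direct = laplace.pdf(x) / norm.pdf(x) > eta
        assert direct == in_R1(x, eta)
print("piecewise decision region agrees with direct thresholding")
```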
