6.434J/16.391J Statistics for Engineers and Scientists, MIT, Spring 2006
Handout #13, Apr 11
Solution 5
Problem 1: Let a > 0 be a known constant, and let θ > 0 be a parameter. Suppose X1, X2, . . . , Xn is a sample from a population with one of the following densities.
(a) The beta, β(θ, 1), density: f_X(x | θ) = θ x^(θ−1), for 0 < x < 1.
(b) The Weibull density: f_X(x | θ) = θ a x^(a−1) e^(−θx^a), for x > 0.
(c) The Pareto density: f_X(x | θ) = θ a^θ / x^(θ+1), for x > a.
In each case, find a real-valued sufficient statistic for θ.
Solution Let X ≜ (X1, X2, . . . , Xn) be a collection of i.i.d. random variables Xi's, and let x ≜ (x1, x2, . . . , xn) be a collection of observed data.
(a) For any x, the joint pdf is

f_X(x | θ) = θ^n (x1 x2 · · · xn)^(θ−1), if ∀i, 0 < xi < 1; and 0, otherwise
           = θ^n (x1 x2 · · · xn)^(θ−1) × I_(0,1)(x1) I_(0,1)(x2) · · · I_(0,1)(xn),

where the first factor is g(T(x) | θ) and the product of indicators is h(x). The factorization theorem implies that

T(x) ≜ x1 x2 · · · xn

is a sufficient statistic for θ.
(b) For any x, the joint pdf is

f_X(x | θ) = θ^n a^n (x1 x2 · · · xn)^(a−1) exp(−θ ∑_{i=1}^n xi^a), if ∀i, xi > 0; and 0, otherwise
           = θ^n exp(−θ ∑_{i=1}^n xi^a) × a^n (x1 x2 · · · xn)^(a−1) I_(0,∞)(x1) I_(0,∞)(x2) · · · I_(0,∞)(xn),

where the first factor is g(T(x) | θ) and the remaining factors form h(x). The factorization theorem implies that

T(x) ≜ ∑_{i=1}^n xi^a

is a sufficient statistic for θ.
(c) For any x, the joint pdf is

f_X(x | θ) = θ^n a^(nθ) / (x1 x2 · · · xn)^(θ+1), if ∀i, xi > a; and 0, otherwise
           = [θ^n a^(nθ) / (x1 x2 · · · xn)^(θ+1)] × I_(a,∞)(x1) I_(a,∞)(x2) · · · I_(a,∞)(xn),

where the first factor is g(T(x) | θ) and the product of indicators is h(x). The factorization theorem implies that

T(x) ≜ x1 x2 · · · xn

is a sufficient statistic for θ.
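As a quick numerical sanity check (mine, not part of the handout): if T is sufficient, then two datasets with the same value of T(x) produce a likelihood ratio that does not vary with θ. The function names and datasets below are illustrative.

```python
import math

# Sketch of a sufficiency check: the log-likelihood-ratio of two datasets
# with equal T(x) should be the same constant for every value of θ.
a = 2.0  # known constant in the Weibull density (illustrative choice)

def loglik_beta(xs, theta):     # β(θ,1): f(x | θ) = θ x^(θ−1), 0 < x < 1
    return sum(math.log(theta) + (theta - 1.0) * math.log(x) for x in xs)

def loglik_weibull(xs, theta):  # f(x | θ) = θ a x^(a−1) e^(−θ x^a), x > 0
    return sum(math.log(theta * a) + (a - 1.0) * math.log(x) - theta * x ** a
               for x in xs)

# (a) Beta: x and y share the same product T(x) = x1 x2 x3 = 0.09.
x, y = [0.2, 0.5, 0.9], [0.3, 0.5, 0.6]
beta_ratios = [loglik_beta(x, t) - loglik_beta(y, t) for t in (0.5, 1.0, 3.0)]

# (b) Weibull: x and y share T(x) = ∑ xi^a = 5.0.
x, y = [1.0, 2.0], [0.5, math.sqrt(4.75)]
wbl_ratios = [loglik_weibull(x, t) - loglik_weibull(y, t) for t in (0.5, 1.0, 3.0)]

print(beta_ratios)  # the same value for every θ: the ratio is free of θ
print(wbl_ratios)
```

The Pareto case behaves like the beta case, since its sufficient statistic is also the product of the observations.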
Problem 2:
a) Let X1, X2, . . . , Xn be independent random variables, each uniformly distributed on the interval [−θ, θ], for some θ > 0. Find a sufficient statistic for θ.
b) Let X1, X2, . . . , Xn be a random sample of size n from a normal N(θ, θ) distribution, for some θ > 0. Find a sufficient statistic for θ.
Solution
a) For any x ≜ (x1, x2, . . . , xn), the joint pdf is given by

f_X(x | θ) = (1/(2θ))^n, if ∀i, −θ ≤ xi ≤ θ; and 0, otherwise
           = (1/(2θ))^n, if −θ ≤ min(x1, . . . , xn) and max(x1, . . . , xn) ≤ θ; and 0, otherwise
           = (1/(2θ))^n I_[−θ,∞)(min(x1, . . . , xn)) I_(−∞,θ](max(x1, . . . , xn)) × 1,

where the first factor is g(T(x) | θ) and the trailing 1 is h(x). The factorization theorem implies that

T(x) ≜ (min(x1, . . . , xn), max(x1, . . . , xn))

is jointly sufficient for θ. Equivalently, since both conditions hold iff max(|x1|, . . . , |xn|) ≤ θ, the single real-valued statistic max(|x1|, . . . , |xn|) is also sufficient.
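The statistic above can be observed numerically (a sketch of my own, with illustrative names): under the uniform [−θ, θ] model, the likelihood depends on the data only through the pair (min, max).

```python
# Two different datasets with the same min and max have identical
# likelihoods for every θ, illustrating joint sufficiency of (min, max).
def likelihood(xs, theta):
    if all(-theta <= x <= theta for x in xs):
        return (1.0 / (2.0 * theta)) ** len(xs)
    return 0.0

x = [-0.4, 0.1, 0.7]
y = [-0.4, 0.6, 0.7]   # different data, same min = -0.4 and max = 0.7
for theta in (0.5, 0.7, 1.0, 2.0):
    print(theta, likelihood(x, theta) == likelihood(y, theta))  # always True
```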
b) For any x ≜ (x1, x2, . . . , xn), the joint pdf is given by

f_X(x) = (1/√(2πθ))^n exp(−(1/(2θ)) ∑_{i=1}^n (xi − θ)²)
       = (1/√(2πθ))^n exp(−(1/(2θ)) (∑_{i=1}^n xi² − 2θ ∑_{i=1}^n xi + nθ²))
       = (1/√(2πθ))^n exp(−(1/(2θ)) ∑_{i=1}^n xi² + ∑_{i=1}^n xi − nθ/2)
       = (1/√(2π))^n exp(∑_{i=1}^n xi) × (1/√θ)^n exp(−(1/(2θ)) ∑_{i=1}^n xi² − nθ/2),

where the first factor is h(x) and the second is g(T(x) | θ). The factorization theorem implies that

T(x) ≜ ∑_{i=1}^n xi²

is a sufficient statistic for θ.
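The factorization above can be confirmed numerically (a sketch of my own; the function names f, h, g mirror the notation of the derivation, and the data are arbitrary).

```python
import math

# Check that f(x | θ) = h(x) · g(T(x) | θ) for several values of θ,
# with T(x) = ∑ xi², as derived above.
def f(xs, theta):
    n = len(xs)
    return ((2.0 * math.pi * theta) ** (-n / 2.0)
            * math.exp(-sum((x - theta) ** 2 for x in xs) / (2.0 * theta)))

def h(xs):
    n = len(xs)
    return (2.0 * math.pi) ** (-n / 2.0) * math.exp(sum(xs))

def g(t, n, theta):  # t = ∑ xi², the sufficient statistic
    return theta ** (-n / 2.0) * math.exp(-t / (2.0 * theta) - n * theta / 2.0)

xs = [0.3, 1.2, -0.5, 2.0]
t = sum(x * x for x in xs)
for theta in (0.5, 1.0, 2.5):
    print(abs(f(xs, theta) - h(xs) * g(t, len(xs), theta)) < 1e-12)  # True
```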
Problem 3: Let X be the number of trials up to (and including) the first success in a sequence of Bernoulli trials with probability of success θ, for 0 < θ < 1. Then, X has a geometric distribution with the parameter θ:

Pθ{X = k} = (1 − θ)^(k−1) θ, k = 1, 2, 3, . . . .

Show that the family of geometric distributions is a one-parameter exponential family with T(x) = x.
[Hint: x^α = e^(α ln x), for x > 0.]
Solution Recall that the pmf of a one-parameter (θ) exponential family is of the form

p(x | θ) = h(x) e^(η(θ)T(x) − B(θ)),

where x ∈ X. Rewriting the pmf of a geometric random variable yields

Pθ{X = x} = e^((x−1) ln(1−θ) + ln θ)
          = e^(x ln(1−θ) − (ln(1−θ) − ln θ)),

where x ∈ {1, 2, 3, . . . }. Thus, the geometric distribution is a one-parameter exponential family with

h(x) ≜ 1,  η(θ) ≜ ln(1 − θ),
T(x) ≜ x,  B(θ) ≜ ln(1 − θ) − ln θ,
X ≜ {1, 2, 3, . . . }.
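A brief numeric confirmation (not in the handout) that the geometric pmf matches the exponential-family form with the mappings above; the function names are mine.

```python
import math

# Verify Pθ{X = x} = h(x) · exp(η(θ)·T(x) − B(θ)) with h(x) = 1, T(x) = x,
# η(θ) = ln(1−θ), B(θ) = ln(1−θ) − ln θ, over a range of x and θ.
def pmf(x, theta):
    return (1.0 - theta) ** (x - 1) * theta

def expfam(x, theta):
    eta = math.log(1.0 - theta)
    B = math.log(1.0 - theta) - math.log(theta)
    return 1.0 * math.exp(eta * x - B)   # h(x) = 1, T(x) = x

ok = all(abs(pmf(x, th) - expfam(x, th)) < 1e-12
         for x in range(1, 20) for th in (0.1, 0.5, 0.9))
print(ok)  # True
```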
Problem 4: Let X1, X2, . . . , Xn be a random sample of size n from the truncated Bernoulli probability mass function (pmf),

P{X = x | p} = p, if x = 1; and (1 − p), if x = 0.

(a) Show that the joint pmf of X1, X2, . . . , Xn is a member of the exponential family of distributions.

(b) Find a minimal sufficient statistic for p.
Solution
(a) Let X ≜ (X1, X2, . . . , Xn) denote the collection of i.i.d. Bernoulli random variables. The joint pmf is given by

P{X = x | p} = [p^(x1) (1 − p)^(1−x1)] [p^(x2) (1 − p)^(1−x2)] · · · [p^(xn) (1 − p)^(1−xn)]
             = p^(∑_{i=1}^n xi) (1 − p)^(n − ∑_{i=1}^n xi)
             = e^((ln p) ∑_{i=1}^n xi) e^([ln(1−p)][n − ∑_{i=1}^n xi])
             = e^([ln p − ln(1−p)] ∑_{i=1}^n xi + n ln(1−p)),

for x ∈ {0, 1}^n. Therefore, the joint pmf is a member of the exponential family, with the mappings:

θ = p,  h(x) = 1,
η(p) = ln p − ln(1 − p),  T(x) = ∑_{i=1}^n xi,
B(p) = −n ln(1 − p),  X = {0, 1}^n.
(b) Let x, y ∈ {0, 1}^n be given. Consider the likelihood ratio,

P{X = x | p} / P{X = y | p} = e^([ln p − ln(1−p)][∑_{i=1}^n xi − ∑_{i=1}^n yi]).

Define a function k(x, y) ≜ h(x)/h(y) = 1, which is bounded and non-zero for any x ∈ X and y ∈ X. The ratio above is independent of p if and only if ∑_{i=1}^n xi = ∑_{i=1}^n yi, in which case it equals k(x, y). Hence x and y are equivalent exactly when their sums agree, and the likelihood-ratio partition argument shows that

T(x) ≜ ∑_{i=1}^n xi

is a minimal sufficient statistic for p.
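The partition argument can be checked exhaustively for a small n (a sketch of my own, with illustrative names): the likelihood ratio of two binary vectors is free of p exactly when their sums agree.

```python
import itertools

# For n = 3, verify that P(x|p)/P(y|p) is independent of p iff ∑xi = ∑yi,
# by comparing the ratio at two distinct values of p.
def pmf(xs, p):
    s = sum(xs)
    return p ** s * (1.0 - p) ** (len(xs) - s)

def ratio_is_p_free(xs, ys):
    r1 = pmf(xs, 0.3) / pmf(ys, 0.3)
    r2 = pmf(xs, 0.8) / pmf(ys, 0.8)
    return abs(r1 - r2) < 1e-12

ok = all(ratio_is_p_free(x, y) == (sum(x) == sum(y))
         for x in itertools.product((0, 1), repeat=3)
         for y in itertools.product((0, 1), repeat=3))
print(ok)  # True
```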
Problem 5: Let X1, X2, . . . , Xm and Y1, Y2, . . . , Yn be two independent samples from N(µ, σ²) and N(µ, τ²) populations, respectively. Here, −∞ < µ < ∞, σ² > 0, and τ² > 0. Find a minimal sufficient statistic for θ ≜ (µ, σ², τ²).
Solution Let X ≜ (X1, X2, . . . , Xm) and Y ≜ (Y1, Y2, . . . , Yn) denote the collections of random samples. The joint pdf (of the Xj's and Yi's), evaluated at x ≜ (x1, x2, . . . , xm) and y ≜ (y1, y2, . . . , yn), is given by

f_{X,Y}(x, y | θ) = (1/√(2πσ²))^m exp(−∑_{j=1}^m (xj − µ)² / (2σ²)) · (1/√(2πτ²))^n exp(−∑_{i=1}^n (yi − µ)² / (2τ²))
                  = exp(−(1/(2σ²)) ∑_{j=1}^m xj² − (1/(2τ²)) ∑_{i=1}^n yi² + (µ/σ²) ∑_{j=1}^m xj + (µ/τ²) ∑_{i=1}^n yi − B(µ, σ², τ²)),

where B(µ, σ², τ²) ≜ (m/2) ln(2πσ²) + (n/2) ln(2πτ²) + mµ²/(2σ²) + nµ²/(2τ²).
Notice that the joint pdf belongs to the exponential family, so that a minimal sufficient statistic for θ is given by

T(X, Y) ≜ (∑_{j=1}^m Xj², ∑_{i=1}^n Yi², ∑_{j=1}^m Xj, ∑_{i=1}^n Yi).
Note: One should not be surprised that the joint pdf belongs to the exponential family of distributions. Recall that the Gaussian distribution is a member of the exponential family and that the random variables, the Xj's and Yi's, are mutually independent. Thus, their joint pdf belongs to the exponential family as well.
Note: To derive the minimal sufficient statistic, one may alternatively consider the likelihood-ratio partition.
The set D0 is defined to be

D0 ≜ {(x, y) ∈ R^(m+n) | f_{X,Y}(x, y | µ, σ², τ²) = 0 for all µ, for all σ² > 0, for all τ² > 0} = ∅ (empty set).

Let (x, y) ∉ D0 and (v, w) ∉ D0 be given. Their likelihood ratio is given by

f_{X,Y}(x, y | θ) / f_{X,Y}(v, w | θ)
  = exp{ −(1/(2σ²)) (∑_{j=1}^m xj² − ∑_{j=1}^m vj²) − (1/(2τ²)) (∑_{i=1}^n yi² − ∑_{i=1}^n wi²)
         + (µ/σ²) (∑_{j=1}^m xj − ∑_{j=1}^m vj) + (µ/τ²) (∑_{i=1}^n yi − ∑_{i=1}^n wi) }.
By definition, (x, y) ∉ D0 and (v, w) ∉ D0 are equivalent iff there exists a function, 0 < k(·, ·, ·, ·) < ∞, which is independent of θ, such that

f_{X,Y}(x, y | θ) / f_{X,Y}(v, w | θ) = k(x, y, v, w).

The likelihood ratio implies that (x, y) ∉ D0 and (v, w) ∉ D0 are equivalent if and only if

∑_{j=1}^m xj² = ∑_{j=1}^m vj², (1)
∑_{i=1}^n yi² = ∑_{i=1}^n wi², (2)
∑_{j=1}^m xj = ∑_{j=1}^m vj, and (3)
∑_{i=1}^n yi = ∑_{i=1}^n wi, (4)

where the function k(x, y, v, w) ≜ 1. That is, (x, y) and (v, w) are in the same equivalence class iff conditions (1)-(4) are satisfied. Then a representation of the equivalence class is given by
T(X, Y) ≜ (∑_{j=1}^m Xj², ∑_{i=1}^n Yi², ∑_{j=1}^m Xj, ∑_{i=1}^n Yi).

Thus, we have a minimal sufficient statistic, T(X, Y).
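A small numerical illustration (mine, under the model above): two pairs of samples that agree in all four components of T give a likelihood ratio of 1 for every choice of θ = (µ, σ², τ²).

```python
import math

# Two-sample normal log-density; data pairs (x, y) and (v, w) below agree in
# (∑x², ∑y², ∑x, ∑y), so their log-likelihoods coincide for every θ.
def logpdf(xs, ys, mu, s2, t2):
    lx = sum(-0.5 * math.log(2 * math.pi * s2) - (x - mu) ** 2 / (2 * s2)
             for x in xs)
    ly = sum(-0.5 * math.log(2 * math.pi * t2) - (y - mu) ** 2 / (2 * t2)
             for y in ys)
    return lx + ly

x, v = [1.0, -1.0, 2.0], [-1.0, 2.0, 1.0]   # same sums and sums of squares
y, w = [0.5, 0.5], [0.5, 0.5]
for (mu, s2, t2) in [(0.0, 1.0, 1.0), (1.5, 0.3, 2.0)]:
    print(abs(logpdf(x, y, mu, s2, t2) - logpdf(v, w, mu, s2, t2)) < 1e-12)
```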
Problem 6: The two hypotheses about the probability density f_X(x) of an observed random variable X are

H1: f_X(x) = (1/2) e^(−|x|), for any x;
H0: f_X(x) = (1/√(2π)) e^(−x²/2), for any x.
(a) Find the likelihood ratio Λ(x).
(b) The test is of the form

Λ(x) ≷ η (decide H1 if Λ(x) > η; decide H0 otherwise).
Compute the decision regions for various values of the threshold η.
Solution
(a) Let x ∈ R denote an observation. The likelihood ratio is given by

Λ(x) ≜ f_{X|H}(x | H1) / f_{X|H}(x | H0).

Substituting the densities of the random variable X (under hypothesis H1 and under hypothesis H0) yields the likelihood ratio

Λ(x) = [(1/2) e^(−|x|)] / [(1/√(2π)) e^(−x²/2)] = √(π/2) e^(x²/2 − |x|).
(b) The decision region for hypothesis H1, R1, is the set of points x that give rise to the output decision H1:

R1 ≜ {x | the test decides H1 on input x} = {x | Λ(x) > η}.

Similarly, the decision region for hypothesis H0, R0, is given by

R0 ≜ {x | the test decides H0 on input x} = {x | Λ(x) ≤ η} = R\R1,

where the symbol "\" denotes the set difference.

Substituting the expression of the likelihood ratio from part (a) yields the following definition of decision region R1:

R1 = {x | √(π/2) e^(x²/2 − |x|) > η}.
When η ≤ 0, we will have R1 = R since Λ(x) > 0 ≥ η for any x. Thus, we will consider the case when η > 0.

Taking the natural log of both sides of the inequality and writing x² as |x|² yields

R1 = {x | (1/2)|x|² − |x| − ln(η√2/√π) > 0}.

The left-hand side is a quadratic in t ≜ |x| with positive leading coefficient; when the discriminant 1 + 2 ln(η√2/√π) is non-negative, its roots are t = 1 ± √(1 + 2 ln(η√2/√π)).

When 1 + 2 ln(η√2/√π) < 0, or equivalently, 0 < η < √(π/2) e^(−1/2), the quadratic has no real roots and is therefore strictly positive for every t, so R1 = R. (Indeed, Λ attains its minimum value √(π/2) e^(−1/2) at |x| = 1, so any threshold below that minimum places every x in R1.)

When 1 + 2 ln(η√2/√π) ≥ 0, or equivalently, η ≥ √(π/2) e^(−1/2), the quadratic is positive exactly outside its roots, so

R1 = {x | |x| > 1 + √(1 + 2 ln(η√2/√π)) or |x| < 1 − √(1 + 2 ln(η√2/√π))}.

Two sub-cases arise, depending on the sign of the smaller root. If √(π/2) e^(−1/2) ≤ η < √(π/2), then 0 ≤ 1 − √(1 + 2 ln(η√2/√π)) < 1, and both parts of R1 are non-empty. If η ≥ √(π/2), then 1 − √(1 + 2 ln(η√2/√π)) ≤ 0, the condition |x| < 1 − √(1 + 2 ln(η√2/√π)) cannot be met (an absolute value cannot be negative), and only the outer part survives.

Therefore, writing r+ ≜ 1 + √(1 + 2 ln(η√2/√π)) and r− ≜ 1 − √(1 + 2 ln(η√2/√π)) for brevity, the decision region R1 is given by

R1 = R, for η < √(π/2) e^(−1/2);
     (−∞, −r+) ∪ (−r−, r−) ∪ (r+, ∞), for √(π/2) e^(−1/2) ≤ η < √(π/2);
     (−∞, −r+) ∪ (r+, ∞), for η ≥ √(π/2),

while the decision region R0 = R\R1 is given by

R0 = ∅, for η < √(π/2) e^(−1/2);
     [−r+, −r−] ∪ [r−, r+], for √(π/2) e^(−1/2) ≤ η < √(π/2);
     [−r+, r+], for η ≥ √(π/2).
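The case analysis can be checked numerically; the sketch below (my own, with illustrative names) evaluates Λ on a grid, confirms the minimum value √(π/2) e^(−1/2) at |x| = 1, and verifies the decision-region boundary for a threshold above √(π/2).

```python
import math

# Λ(x) = √(π/2) · e^(x²/2 − |x|): decreases from √(π/2) at x = 0 to its
# minimum √(π/2)·e^(−1/2) at |x| = 1, then increases without bound.
def lam(x):
    return math.sqrt(math.pi / 2.0) * math.exp(x * x / 2.0 - abs(x))

lam_min = math.sqrt(math.pi / 2.0) * math.exp(-0.5)   # ≈ 0.7602
grid = [i / 100.0 for i in range(-500, 501)]
print(abs(min(lam(x) for x in grid) - lam_min) < 1e-9)     # True: min at |x| = 1

# For a threshold below the minimum, every x lies in R1 = {x : Λ(x) > η}.
eta = 0.5
print(all(lam(x) > eta for x in grid))                     # True

# For η ≥ √(π/2), R1 = {|x| > r+} with r+ = 1 + √(1 + 2 ln(η√2/√π)).
eta = 2.0
rp = 1.0 + math.sqrt(1.0 + 2.0 * math.log(eta * math.sqrt(2.0 / math.pi)))
print(all((lam(x) > eta) == (abs(x) > rp) for x in grid))  # True
```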