Part IB Statistics

2017
Paper 1, Section I

7H Statistics
(a) State and prove the Rao–Blackwell theorem.

(b) Let X1, . . . , Xn be an independent sample from Poisson(λ) with θ = e^{−λ} to be estimated. Show that Y = 1_{0}(X1) is an unbiased estimator of θ and that T = ∑_i Xi is a sufficient statistic.

What is E[Y | T]?

Paper 2, Section I

8H Statistics
(a) Define a 100γ% confidence interval for an unknown parameter θ.

(b) Let X1, . . . , Xn be i.i.d. random variables with distribution N(µ, 1) with µ unknown. Find a 95% confidence interval for µ.

[You may use the fact that Φ(1.96) ≃ 0.975.]

(c) Let U1, U2 be independent U[θ − 1, θ + 1] with θ to be estimated. Find a 50% confidence interval for θ.

Suppose that we have two observations u1 = 10 and u2 = 11.5. What might be a better interval to report in this case?
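An illustrative Python sketch for part (b), not part of the original question: it computes x̄ ± 1.96/√n for a hypothetical sample, using the normal quantile from scipy.

```python
# Illustrative sketch (not part of the question): the 95% interval
# xbar ± 1.96/sqrt(n) for N(mu, 1) data; the sample values are hypothetical.
import numpy as np
from scipy.stats import norm

x = np.array([9.8, 10.4, 10.1, 9.6, 10.2])   # hypothetical observations
z = norm.ppf(0.975)                          # about 1.96, since Phi(1.96) ~ 0.975
half_width = z / np.sqrt(len(x))
print((x.mean() - half_width, x.mean() + half_width))
```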

Paper 4, Section II

19H Statistics
(a) State and prove the Neyman–Pearson lemma.

(b) Let X be a real random variable with density f(x) = (2θx + 1 − θ)1_{[0,1]}(x) with −1 ≤ θ ≤ 1.

Find a most powerful test of size α of H0 : θ = 0 versus H1 : θ = 1.

Find a uniformly most powerful test of size α of H0 : θ = 0 versus H1 : θ > 0.

Paper 1, Section II

19H Statistics
(a) Give the definitions of a sufficient and a minimal sufficient statistic T for an unknown parameter θ.

Let X1, X2, . . . , Xn be an independent sample from the geometric distribution with success probability 1/θ and mean θ > 1, i.e. with probability mass function

p(m) = (1/θ)(1 − 1/θ)^{m−1}   for m = 1, 2, . . . .

Find a minimal sufficient statistic for θ. Is your statistic a biased estimator of θ?

[You may use results from the course provided you state them clearly.]

(b) Define the bias of an estimator. What does it mean for an estimator to be unbiased?

Suppose that Y has the truncated Poisson distribution with probability mass function

p(y) = (e^θ − 1)^{−1} · θ^y / y!   for y = 1, 2, . . . .

Show that the only unbiased estimator T of 1 − e^{−θ} based on Y is obtained by taking T = 0 if Y is odd and T = 2 if Y is even.

Is this a useful estimator? Justify your answer.

Paper 3, Section II

20H Statistics
Consider the general linear model

Y = Xβ + ε,

where X is a known n × p matrix of full rank p < n, ε ∼ Nn(0, σ²I) with σ² known and β ∈ R^p is an unknown vector.

(a) State without proof the Gauss–Markov theorem.

Find the maximum likelihood estimator β̂ for β. Is it unbiased?

Let β∗ be any unbiased estimator for β which is linear in (Yi). Show that

var(tᵀβ̂) ≤ var(tᵀβ∗)

for all t ∈ R^p.

(b) Suppose now that p = 1 and that β and σ² are both unknown. Find the maximum likelihood estimator for σ². What is the joint distribution of β̂ and σ̂² in this case? Justify your answer.
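As a brief illustration (not part of the question), a numpy sketch of the least squares / maximum likelihood estimator β̂ = (XᵀX)⁻¹XᵀY on simulated data; the design matrix and coefficients below are made up.

```python
# Illustrative sketch (not part of the question): beta_hat = (X^T X)^{-1} X^T Y
# for the general linear model, on simulated data with sigma^2 = 1.
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))          # known design matrix, full rank with high probability
beta = np.array([1.0, -2.0, 0.5])    # "true" coefficients used only for the simulation
Y = X @ beta + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)                      # close to beta, consistent with unbiasedness
```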

2016
Paper 1, Section I

7H Statistics
Let X1, . . . , Xn be independent samples from the exponential distribution with density f(x; λ) = λe^{−λx} for x > 0, where λ is an unknown parameter. Find the critical region of the most powerful test of size α for the hypotheses H0 : λ = 1 versus H1 : λ = 2. Determine whether or not this test is uniformly most powerful for testing H′0 : λ ≤ 1 versus H′1 : λ > 1.

Paper 2, Section I

8H Statistics
The efficacy of a new medicine was tested as follows. Fifty patients were given the medicine, and another fifty patients were given a placebo. A week later, the number of patients who got better, stayed the same, or got worse was recorded, as summarised in this table:

          medicine   placebo
better        28        22
same           4        16
worse         18        12

Conduct a Pearson chi-squared test of size 1% of the hypothesis that the medicine and the placebo have the same effect.

[Hint: You may find the following values relevant:

Distribution     χ²₁    χ²₂    χ²₃    χ²₄    χ²₅    χ²₆
99% percentile  6.63   9.21  11.34  13.3  15.09  16.81.]
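An illustrative numerical check, not part of the original question: the Pearson statistic for this 3 × 2 table, with expected counts taken as (row total × column total)/grand total.

```python
# Illustrative check (not part of the question): Pearson's chi-squared statistic
# for the medicine/placebo table, compared with the 99% point of chi^2_2.
import numpy as np
from scipy.stats import chi2

obs = np.array([[28, 22],   # better
                [ 4, 16],   # same
                [18, 12]])  # worse
expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / obs.sum()
stat = ((obs - expected) ** 2 / expected).sum()
df = (obs.shape[0] - 1) * (obs.shape[1] - 1)
print(stat, df, chi2.ppf(0.99, df))
```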

Paper 4, Section II

19H Statistics
Consider the linear regression model

Yi = α + βxi + εi,

for i = 1, . . . , n, where the non-zero numbers x1, . . . , xn are known and are such that x1 + . . . + xn = 0, the independent random variables ε1, . . . , εn have the N(0, σ²) distribution, and the parameters α, β and σ² are unknown.

(a) Let (α̂, β̂) be the maximum likelihood estimator of (α, β). Prove that for each i, the random variables α̂, β̂ and Yi − α̂ − β̂xi are uncorrelated. Using standard facts about the multivariate normal distribution, prove that α̂, β̂ and ∑_{i=1}^n (Yi − α̂ − β̂xi)² are independent.

(b) Find the critical region of the generalised likelihood ratio test of size 5% for testing H0 : α = 0 versus H1 : α ≠ 0. Prove that the power function of this test is of the form w(α, β, σ²) = g(α/σ) for some function g. [You are not required to find g explicitly.]

Paper 1, Section II

19H Statistics
(a) What does it mean to say a statistic T is sufficient for an unknown parameter θ? State the factorisation criterion for sufficiency and prove it in the discrete case.

(b) State and prove the Rao–Blackwell theorem.

(c) Let X1, . . . , Xn be independent samples from the uniform distribution on [−θ, θ] for an unknown positive parameter θ. Consider the two-dimensional statistic

T = (minᵢ Xi, maxᵢ Xi).

Prove that T is sufficient for θ. Determine, with proof, whether or not T is minimally sufficient.

Paper 3, Section II

20H Statistics
Let X1, . . . , Xn be independent samples from the Poisson distribution with mean θ.

(a) Compute the maximum likelihood estimator of θ. Is this estimator biased?

(b) Under the assumption that n is very large, use the central limit theorem to find an approximate 95% confidence interval for θ. [You may use the notation zα for the number such that P(Z > zα) = α for a standard normal Z ∼ N(0, 1).]

(c) Now suppose the parameter θ has the Γ(k, λ) prior distribution. What is the posterior distribution? What is the Bayes point estimator for θ for the quadratic loss function L(θ, a) = (θ − a)²? Let Xn+1 be another independent sample from the same distribution. Given X1, . . . , Xn, what is the posterior probability that Xn+1 = 0?

[Hint: The density of the Γ(k, λ) distribution is f(x; k, λ) = λ^k x^{k−1} e^{−λx}/Γ(k), for x > 0.]
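An illustrative sketch (not part of the question) of the CLT-based interval θ̂ ± z₀.₀₂₅ √(θ̂/n) from part (b), applied to simulated Poisson data.

```python
# Illustrative sketch (not part of the question): approximate 95% interval
# theta_hat ± z * sqrt(theta_hat / n) for Poisson data; the data are simulated.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x = rng.poisson(lam=4.0, size=1000)      # hypothetical large sample
theta_hat = x.mean()                     # MLE of theta
z = norm.ppf(0.975)
half = z * np.sqrt(theta_hat / len(x))
print((theta_hat - half, theta_hat + half))
```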

2015
Paper 1, Section I

7H Statistics
Suppose that X1, . . . , Xn are independent normally distributed random variables, each with mean µ and variance 1, and consider testing H0 : µ = 0 against H1 : µ = 1. Explain what is meant by the critical region, the size and the power of a test.

For 0 < α < 1, derive the test that is most powerful among all tests of size at most α. Obtain an expression for the power of your test in terms of the standard normal distribution function Φ(·).

[Results from the course may be used without proof provided they are clearly stated.]

Paper 2, Section I

8H Statistics
Suppose that, given θ, the random variable X has P(X = k) = e^{−θ}θ^k/k!, k = 0, 1, 2, . . . . Suppose that the prior density of θ is π(θ) = λe^{−λθ}, θ > 0, for some known λ (> 0). Derive the posterior density π(θ | x) of θ based on the observation X = x.

For a given loss function L(θ, a), a statistician wants to calculate the value of a that minimises the expected posterior loss

∫ L(θ, a) π(θ | x) dθ.

Suppose that x = 0. Find a in terms of λ in the following cases:

(a) L(θ, a) = (θ − a)2;

(b) L(θ, a) = |θ − a|.

Paper 4, Section II

19H Statistics
Consider a linear model Y = Xβ + ε where Y is an n × 1 vector of observations, X is a known n × p matrix, β is a p × 1 (p < n) vector of unknown parameters and ε is an n × 1 vector of independent normally distributed random variables each with mean zero and unknown variance σ². Write down the log-likelihood and show that the maximum likelihood estimators β̂ and σ̂² of β and σ² respectively satisfy

XᵀXβ̂ = XᵀY,   (1/σ̂⁴)(Y − Xβ̂)ᵀ(Y − Xβ̂) = n/σ̂²

(ᵀ denotes the transpose). Assuming that XᵀX is invertible, find the solutions β̂ and σ̂² of these equations and write down their distributions.

Prove that β̂ and σ̂² are independent.

Consider the model Yij = µi + γxij + εij, i = 1, 2, 3 and j = 1, 2, 3. Suppose that, for all i, xi1 = −1, xi2 = 0 and xi3 = 1, and that εij, i, j = 1, 2, 3, are independent N(0, σ²) random variables where σ² is unknown. Show how this model may be written as a linear model and write down Y, X, β and ε. Find the maximum likelihood estimators of µi (i = 1, 2, 3), γ and σ² in terms of the Yij. Derive a 100(1 − α)% confidence interval for σ² and for µ2 − µ1.

[You may assume that, if W = (W1ᵀ, W2ᵀ)ᵀ is multivariate normal with cov(W1, W2) = 0, then W1 and W2 are independent.]

Paper 1, Section II

19H Statistics
Suppose X1, . . . , Xn are independent identically distributed random variables each with probability mass function P(Xi = xi) = p(xi; θ), where θ is an unknown parameter. State what is meant by a sufficient statistic for θ. State the factorisation criterion for a sufficient statistic. State and prove the Rao–Blackwell theorem.

Suppose that X1, . . . , Xn are independent identically distributed random variables with

P(Xi = xi) = (m choose xi) θ^{xi}(1 − θ)^{m−xi},   xi = 0, . . . , m,

where m is a known positive integer and θ is unknown. Show that θ̂ = X1/m is unbiased for θ.

Show that T = ∑_{i=1}^n Xi is sufficient for θ and use the Rao–Blackwell theorem to find another unbiased estimator θ̃ for θ, giving details of your derivation. Calculate the variance of θ̃ and compare it to the variance of θ̂.

A statistician cannot remember the exact statement of the Rao–Blackwell theorem and calculates E(T | X1) in an attempt to find an estimator of θ. Comment on the suitability or otherwise of this approach, giving your reasons.

[Hint: If a and b are positive integers then, for r = 0, 1, . . . , a + b, (a+b choose r) = ∑_{j=0}^r (a choose j)(b choose r−j).]

Paper 3, Section II

20H Statistics
(a) Suppose that X1, . . . , Xn are independent identically distributed random variables, each with density f(x) = θ exp(−θx), x > 0, for some unknown θ > 0. Use the generalised likelihood ratio to obtain a size α test of H0 : θ = 1 against H1 : θ ≠ 1.

(b) A die is loaded so that, if pi is the probability of face i, then p1 = p2 = θ1, p3 = p4 = θ2 and p5 = p6 = θ3. The die is thrown n times and face i is observed xi times. Write down the likelihood function for θ = (θ1, θ2, θ3) and find the maximum likelihood estimate of θ.

Consider testing whether or not θ1 = θ2 = θ3 for this die. Find the generalised likelihood ratio statistic Λ and show that

2 logₑ Λ ≈ T,   where T = ∑_{i=1}^3 (oi − ei)²/ei,

where you should specify oi and ei in terms of x1, . . . , x6. Explain how to obtain an approximate size 0.05 test using the value of T. Explain what you would conclude (and why) if T = 2.03.

2014
Paper 1, Section I

7H Statistics
Consider an estimator θ̂ of an unknown parameter θ, and assume that E_θ(θ̂²) < ∞ for all θ. Define the bias and mean squared error of θ̂.

Show that the mean squared error of θ̂ is the sum of its variance and the square of its bias.

Suppose that X1, . . . , Xn are independent identically distributed random variables with mean θ and variance θ², and consider estimators of θ of the form kX̄ where X̄ = (1/n)∑_{i=1}^n Xi.

(i) Find the value of k that gives an unbiased estimator, and show that the mean squared error of this unbiased estimator is θ²/n.

(ii) Find the range of values of k for which the mean squared error of kX̄ is smaller than θ²/n.

Paper 2, Section I

8H Statistics
There are 100 patients taking part in a trial of a new surgical procedure for a particular medical condition. Of these, 50 patients are randomly selected to receive the new procedure and the remaining 50 receive the old procedure. Six months later, a doctor assesses whether or not each patient has fully recovered. The results are shown below:

                 Fully       Not fully
                recovered    recovered
Old procedure      25           25
New procedure      31           19

The doctor is interested in whether there is a difference in full recovery rates for patients receiving the two procedures. Carry out an appropriate 5% significance level test, stating your hypotheses carefully. [You do not need to derive the test.] What conclusion should be reported to the doctor?

[Hint: Let χ²ₖ(α) denote the upper 100α percentage point of a χ²ₖ distribution. Then χ²₁(0.05) = 3.84, χ²₂(0.05) = 5.99, χ²₃(0.05) = 7.82, χ²₄(0.05) = 9.49.]

Paper 4, Section II

19H Statistics
Consider a linear model

Y = Xβ + ε,    (†)

where X is a known n × p matrix, β is a p × 1 (p < n) vector of unknown parameters and ε is an n × 1 vector of independent N(0, σ²) random variables with σ² unknown. Assume that X has full rank p. Find the least squares estimator β̂ of β and derive its distribution. Define the residual sum of squares RSS and write down an unbiased estimator σ̂² of σ².

Suppose that Vi = a + b ui + δi and Zi = c + d wi + ηi, for i = 1, . . . , m, where ui and wi are known with ∑_{i=1}^m ui = ∑_{i=1}^m wi = 0, and δ1, . . . , δm, η1, . . . , ηm are independent N(0, σ²) random variables. Assume that at least two of the ui are distinct and at least two of the wi are distinct. Show that Y = (V1, . . . , Vm, Z1, . . . , Zm)ᵀ (where ᵀ denotes transpose) may be written as in (†) and identify X and β. Find β̂ in terms of the Vi, Zi, ui and wi. Find the distribution of b̂ − d̂ and derive a 95% confidence interval for b − d.

[Hint: You may assume that RSS/σ² has a χ²_{n−p} distribution, and that β̂ and the residual sum of squares are independent. Properties of χ² distributions may be used without proof.]

Paper 1, Section II

19H Statistics
Suppose that X1, X2, and X3 are independent identically distributed Poisson random variables with expectation θ, so that

P(Xi = x) = e^{−θ}θ^x / x!,   x = 0, 1, . . . ,

and consider testing H0 : θ = 1 against H1 : θ = θ1, where θ1 is a known value greater than 1. Show that the test with critical region {(x1, x2, x3) : ∑_{i=1}^3 xi > 5} is a likelihood ratio test of H0 against H1. What is the size of this test? Write down an expression for its power.

A scientist counts the number of bird territories in n randomly selected sections of a large park. Let Yi be the number of bird territories in the ith section, and suppose that Y1, . . . , Yn are independent Poisson random variables with expectations θ1, . . . , θn respectively. Let ai be the area of the ith section. Suppose that n = 2m, a1 = · · · = am = a (> 0) and am+1 = · · · = a2m = 2a. Derive the generalised likelihood ratio Λ for testing

H0 : θi = λai   against   H1 : θi = λ1 for i = 1, . . . , m and θi = λ2 for i = m + 1, . . . , 2m.

What should the scientist conclude about the number of bird territories if 2 logₑ(Λ) is 15.67?

[Hint: Let F_θ(x) be P(W ≤ x) where W has a Poisson distribution with expectation θ. Then

F1(3) = 0.998, F3(5) = 0.916, F3(6) = 0.966, F5(3) = 0.433.]
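A quick numerical check, not part of the question: under H0 the sum X1 + X2 + X3 is Poisson(3), so the size of the test with critical region {∑xi > 5} is 1 − F3(5).

```python
# Illustrative check (not part of the question): F_3(5) and the size 1 - F_3(5)
# of the test with critical region {x1 + x2 + x3 > 5} under H0 (Poisson(3) sum).
from scipy.stats import poisson

print(poisson.cdf(5, 3))        # about 0.916, as in the hint
print(1 - poisson.cdf(5, 3))    # size of the test, about 0.084
```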

Paper 3, Section II

20H Statistics
Suppose that X1, . . . , Xn are independent identically distributed random variables with

P(Xi = x) = (k choose x) θ^x (1 − θ)^{k−x},   x = 0, . . . , k,

where k is known and θ (0 < θ < 1) is an unknown parameter. Find the maximum likelihood estimator θ̂ of θ.

Statistician 1 has prior density for θ given by π1(θ) = αθ^{α−1}, 0 < θ < 1, where α > 1. Find the posterior distribution for θ after observing data X1 = x1, . . . , Xn = xn.

Write down the posterior mean θ̂₁^(B), and show that

θ̂₁^(B) = c θ̂ + (1 − c)θ1,

where θ1 depends only on the prior distribution and c is a constant in (0, 1) that is to be specified.

Statistician 2 has prior density for θ given by π2(θ) = α(1 − θ)^{α−1}, 0 < θ < 1. Briefly describe the prior beliefs that the two statisticians hold about θ. Find the posterior mean θ̂₂^(B) and show that θ̂₂^(B) < θ̂₁^(B).

Suppose that α increases (but n, k and the xi remain unchanged). How do the prior beliefs of the two statisticians change? How does c vary? Explain briefly what happens to θ̂₁^(B) and θ̂₂^(B).

[Hint: The Beta(α, β) (α > 0, β > 0) distribution has density

f(x) = [Γ(α + β)/(Γ(α)Γ(β))] x^{α−1}(1 − x)^{β−1},   0 < x < 1,

with expectation α/(α + β) and variance αβ/((α + β + 1)(α + β)²). Here, Γ(α) = ∫_0^∞ x^{α−1}e^{−x} dx, α > 0, is the Gamma function.]

2013
Paper 1, Section I

7H Statistics
Let x1, . . . , xn be independent and identically distributed observations from a distribution with probability density function

f(x) = λe^{−λ(x−µ)} for x > µ,   and   f(x) = 0 for x < µ,

where λ and µ are unknown positive parameters. Let β = µ + 1/λ. Find the maximum likelihood estimators λ̂, µ̂ and β̂.

Determine for each of λ̂, µ̂ and β̂ whether or not it has a positive bias.

Paper 2, Section I

8H Statistics
State and prove the Rao–Blackwell theorem.

Individuals in a population are independently of three types {0, 1, 2}, with unknown probabilities p0, p1, p2 where p0 + p1 + p2 = 1. In a random sample of n people the ith person is found to be of type xi ∈ {0, 1, 2}.

Show that an unbiased estimator of θ = p0 p1 p2 is

θ̂ = 1 if (x1, x2, x3) = (0, 1, 2), and θ̂ = 0 otherwise.

Suppose that ni of the individuals are of type i. Find an unbiased estimator of θ, say θ∗, such that var(θ∗) < θ(1 − θ).

Paper 4, Section II

19H Statistics
Explain the notion of a sufficient statistic.

Suppose X is a random variable with distribution F taking values in {1, . . . , 6}, with P(X = i) = pi. Let x1, . . . , xn be a sample from F. Suppose ni is the number of these xj that are equal to i. Use a factorization criterion to explain why (n1, . . . , n6) is sufficient for θ = (p1, . . . , p6).

Let H0 be the hypothesis that pi = 1/6 for all i. Derive the statistic of the generalized likelihood ratio test of H0 against the alternative that this is not a good fit.

Assuming that ni ≈ n/6 when H0 is true and n is large, show that this test can be approximated by a chi-squared test using a test statistic

T = −n + (6/n) ∑_{i=1}^6 ni².

Suppose n = 100 and T = 8.12. Would you reject H0? Explain your answer.
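An illustrative sketch, not part of the question: computing T = −n + (6/n)∑ni² for hypothetical face counts and confirming it agrees with the usual Pearson statistic.

```python
# Illustrative sketch (not part of the question): T for hypothetical counts,
# checked against the standard Pearson statistic sum (n_i - n/6)^2 / (n/6).
import numpy as np

counts = np.array([20, 14, 17, 23, 11, 15])     # hypothetical face counts, n = 100
n = counts.sum()
T = -n + 6.0 / n * (counts ** 2).sum()
pearson = (((counts - n / 6) ** 2) / (n / 6)).sum()
print(T, pearson)                               # the two values agree
```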

Paper 1, Section II

19H Statistics
Consider the general linear model Y = Xθ + ε where X is a known n × p matrix, θ is an unknown p × 1 vector of parameters, and ε is an n × 1 vector of independent N(0, σ²) random variables with unknown variance σ². Assume the p × p matrix XᵀX is invertible. Let

θ̂ = (XᵀX)⁻¹XᵀY,   ε̂ = Y − Xθ̂.

What are the distributions of θ̂ and ε̂? Show that θ̂ and ε̂ are uncorrelated.

Four apple trees stand in a 2 × 2 rectangular grid. The annual yield of the tree at coordinate (i, j) conforms to the model

yij = αi + βxij + εij,   i, j ∈ {1, 2},

where xij is the amount of fertilizer applied to tree (i, j), α1, α2 may differ because of varying soil across rows, and the εij are N(0, σ²) random variables that are independent of one another and from year to year. The following two possible experiments are to be compared:

I :  (xij) = ( 0  1 )        II :  (xij) = ( 0  2 )
             ( 2  3 )                       ( 3  1 ).

Represent these as general linear models, with θ = (α1, α2, β). Compare the variances of estimates of β under I and II.

With II the following yields are observed:

(yij) = ( 100  300 )
        ( 600  400 ).

Forecast the total yield that will be obtained next year if no fertilizer is used. What is the 95% predictive interval for this yield?

Paper 3, Section II

20H Statistics
Suppose x1 is a single observation from a distribution with density f over [0, 1]. It is desired to test H0 : f(x) = 1 against H1 : f(x) = 2x.

Let δ : [0, 1] → {0, 1} define a test by δ(x1) = i ⟺ ‘accept Hi’. Let αi(δ) = P(δ(x1) = 1 − i | Hi). State the Neyman–Pearson lemma using this notation.

Let δ be the best test of size 0.10. Find δ and α1(δ).

Consider now δ : [0, 1] → {0, 1, ⋆} where δ(x1) = ⋆ means ‘declare the test to be inconclusive’. Let γi(δ) = P(δ(x) = ⋆ | Hi). Given prior probabilities π0 for H0 and π1 = 1 − π0 for H1, and some w0, w1, let

cost(δ) = π0 (w0 α0(δ) + γ0(δ)) + π1 (w1 α1(δ) + γ1(δ)).

Let δ∗(x1) = i ⟺ x1 ∈ Ai, where A0 = [0, 0.5), A⋆ = [0.5, 0.6), A1 = [0.6, 1]. Prove that for each value of π0 ∈ (0, 1) there exist w0, w1 (depending on π0) such that cost(δ∗) = min_δ cost(δ). [Hint: w0 = 1 + 2(0.6)(π1/π0).]

Hence prove that if δ is any test for which

αi(δ) ≤ αi(δ∗),   i = 0, 1,

then γ0(δ) ≥ γ0(δ∗) and γ1(δ) ≥ γ1(δ∗).

2012
Paper 1, Section I

7H Statistics
Describe the generalised likelihood ratio test and the type of statistical question for which it is useful.

Suppose that X1, . . . , Xn are independent and identically distributed random variables with the Gamma(2, λ) distribution, having density function λ²x exp(−λx), x > 0. Similarly, Y1, . . . , Yn are independent and identically distributed with the Gamma(2, µ) distribution. It is desired to test the hypothesis H0 : λ = µ against H1 : λ ≠ µ. Derive the generalised likelihood ratio test and express it in terms of R = ∑_i Xi / ∑_i Yi.

Let F^{(1−α)}_{ν1,ν2} denote the value that a random variable having the F_{ν1,ν2} distribution exceeds with probability α. Explain how to decide the outcome of a size 0.05 test when n = 5 by knowing only the value of R and the value F^{(1−α)}_{ν1,ν2}, for some ν1, ν2 and α, which you should specify.

[You may use the fact that the χ²_k distribution is equivalent to the Gamma(k/2, 1/2) distribution.]

Paper 2, Section I

8H Statistics
Let the sample x = (x1, . . . , xn) have likelihood function f(x; θ). What does it mean to say T(x) is a sufficient statistic for θ?

Show that if a certain factorization criterion is satisfied then T is sufficient for θ.

Suppose that T is sufficient for θ and there exist two samples, x and y, for which T(x) ≠ T(y) and f(x; θ)/f(y; θ) does not depend on θ. Let

T1(z) = T(z) if z ≠ y,   and   T1(z) = T(x) if z = y.

Show that T1 is also sufficient for θ.

Explain why T is not minimally sufficient for θ.

Paper 4, Section II

19H Statistics
From each of 3 populations, n data points are sampled and these are believed to obey

yij = αi + βi(xij − x̄i) + εij,   j ∈ {1, . . . , n}, i ∈ {1, 2, 3},

where x̄i = (1/n)∑_j xij, the εij are independent and identically distributed as N(0, σ²), and σ² is unknown. Let ȳi = (1/n)∑_j yij.

(i) Find expressions for α̂i and β̂i, the least squares estimates of αi and βi.

(ii) What are the distributions of α̂i and β̂i?

(iii) Show that the residual sum of squares, R1, is given by

R1 = ∑_{i=1}^3 [ ∑_{j=1}^n (yij − ȳi)² − β̂i² ∑_{j=1}^n (xij − x̄i)² ].

Calculate R1 when n = 9, {α̂i}_{i=1}^3 = {1.6, 0.6, 0.8}, {β̂i}_{i=1}^3 = {2, 1, 1},

{∑_{j=1}^9 (yij − ȳi)²}_{i=1}^3 = {138, 82, 63},   {∑_{j=1}^9 (xij − x̄i)²}_{i=1}^3 = {30, 60, 40}.

(iv) H0 is the hypothesis that α1 = α2 = α3. Find an expression for the maximum likelihood estimator of α1 under the assumption that H0 is true. Calculate its value for the above data.

(v) Explain (stating without proof any relevant theory) the rationale for a statistic which can be referred to an F distribution to test H0 against the alternative that it is not true. What should be the degrees of freedom of this F distribution? What would be the outcome of a size 0.05 test of H0 with the above data?
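A quick numerical check, not part of the question: R1 computed from the summary values given in part (iii).

```python
# Illustrative check (not part of the question): R1 = sum_i [Syy_i - beta_i^2 * Sxx_i]
# using the summary figures quoted above.
beta_hat = [2.0, 1.0, 1.0]
Syy = [138.0, 82.0, 63.0]     # sum_j (y_ij - ybar_i)^2 for i = 1, 2, 3
Sxx = [30.0, 60.0, 40.0]      # sum_j (x_ij - xbar_i)^2 for i = 1, 2, 3
R1 = sum(s_yy - b * b * s_xx for b, s_yy, s_xx in zip(beta_hat, Syy, Sxx))
print(R1)                     # 18 + 22 + 23 = 63
```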

Paper 1, Section II

19H Statistics
State and prove the Neyman–Pearson lemma.

A sample of two independent observations, (x1, x2), is taken from a distribution with density f(x; θ) = θx^{θ−1}, 0 ≤ x ≤ 1. It is desired to test H0 : θ = 1 against H1 : θ = 2. Show that the best test of size α can be expressed using the number c such that

1 − c + c log c = α.

Is this the uniformly most powerful test of size α for testing H0 against H1 : θ > 1?

Suppose that the prior distribution of θ is P(θ = 1) = 4γ/(1 + 4γ), P(θ = 2) = 1/(1 + 4γ), where 1 > γ > 0. Find the test of H0 against H1 that minimizes the probability of error.

Let w(θ) denote the power function of this test at θ (> 1). Show that

w(θ) = 1 − γθ + γθ log γθ.

Paper 3, Section II

20H Statistics
Suppose that X is a single observation drawn from the uniform distribution on the interval [θ − 10, θ + 10], where θ is unknown and might be any real number. Given θ0 ≠ 20 we wish to test H0 : θ = θ0 against H1 : θ = 20. Let φ(θ0) be the test which accepts H0 if and only if X ∈ A(θ0), where

A(θ0) = [θ0 − 8, ∞) if θ0 > 20,   and   A(θ0) = (−∞, θ0 + 8] if θ0 < 20.

Show that this test has size α = 0.10.

Now consider

C1(X) = {θ : X ∈ A(θ)},   C2(X) = {θ : X − 9 ≤ θ ≤ X + 9}.

Prove that both C1(X) and C2(X) specify 90% confidence intervals for θ. Find the confidence interval specified by C1(X) when X = 0.

Let Li(X) be the length of the confidence interval specified by Ci(X). Let β(θ0) be the probability of the Type II error of φ(θ0). Show that

E[L1(X) | θ = 20] = E[ ∫_{−∞}^{∞} 1{θ0 ∈ C1(X)} dθ0 | θ = 20 ] = ∫_{−∞}^{∞} β(θ0) dθ0.

Here 1{B} is an indicator variable for event B. The expectation is over X. [Orders of integration and expectation can be interchanged.]

Use what you know about constructing best tests to explain which of the two confidence intervals has the smaller expected length when θ = 20.

2011
Paper 1, Section I

7H Statistics
Consider the experiment of tossing a coin n times. Assume that the tosses are independent and the coin is biased, with unknown probability p of heads and 1 − p of tails. A total of X heads is observed.

(i) What is the maximum likelihood estimator p̂ of p?

Now suppose that a Bayesian statistician has the Beta(M, N) prior distribution for p.

(ii) What is the posterior distribution for p?

(iii) Assuming the loss function is L(p, a) = (p − a)², show that the statistician’s point estimate for p is given by

(M + X)/(M + N + n).

[The Beta(M, N) distribution has density [Γ(M + N)/(Γ(M)Γ(N))] x^{M−1}(1 − x)^{N−1} for 0 < x < 1 and mean M/(M + N).]

Paper 2, Section I

8H Statistics
Let X1, . . . , Xn be random variables with joint density function f(x1, . . . , xn; θ), where θ is an unknown parameter. The null hypothesis H0 : θ = θ0 is to be tested against the alternative hypothesis H1 : θ = θ1.

(i) Define the following terms: critical region, Type I error, Type II error, size, power.

(ii) State and prove the Neyman–Pearson lemma.

Paper 1, Section II

19H Statistics
Let X1, . . . , Xn be independent random variables with probability mass function f(x; θ), where θ is an unknown parameter.

(i) What does it mean to say that T is a sufficient statistic for θ? State, but do not prove, the factorisation criterion for sufficiency.

(ii) State and prove the Rao–Blackwell theorem.

Now consider the case where f(x; θ) = (1/x!)(− log θ)^x θ for non-negative integer x and 0 < θ < 1.

(iii) Find a one-dimensional sufficient statistic T for θ.

(iv) Show that θ̂ = 1{X1 = 0} is an unbiased estimator of θ.

(v) Find another unbiased estimator θ̃ which is a function of the sufficient statistic T and that has smaller variance than θ̂. You may use the following fact without proof: X1 + · · · + Xn has the Poisson distribution with parameter −n log θ.

Paper 3, Section II

20H Statistics
Consider the general linear model

Y = Xβ + ε

where X is a known n × p matrix, β is an unknown p × 1 vector of parameters, and ε is an n × 1 vector of independent N(0, σ²) random variables with unknown variance σ². Assume the p × p matrix XᵀX is invertible.

(i) Derive the least squares estimator β̂ of β.

(ii) Derive the distribution of β̂. Is β̂ an unbiased estimator of β?

(iii) Show that (1/σ²)‖Y − Xβ̂‖² has the χ² distribution with k degrees of freedom, where k is to be determined.

(iv) Let β̃ be an unbiased estimator of β of the form β̃ = CY for some p × n matrix C. By considering the matrix E[(β̃ − β̂)(β̃ − β̂)ᵀ] or otherwise, show that β̂ and β̃ − β̂ are independent.

[You may use standard facts about the multivariate normal distribution as well as results from linear algebra, including the fact that I − X(XᵀX)⁻¹Xᵀ is a projection matrix of rank n − p, as long as they are carefully stated.]

Paper 4, Section II

19H Statistics
Consider independent random variables X1, . . . , Xn with the N(µX, σ²X) distribution and Y1, . . . , Yn with the N(µY, σ²Y) distribution, where the means µX, µY and variances σ²X, σ²Y are unknown. Derive the generalised likelihood ratio test of size α of the null hypothesis H0 : σ²X = σ²Y against the alternative H1 : σ²X ≠ σ²Y. Express the critical region in terms of the statistic

T = SXX / (SXX + SYY)

and the quantiles of a beta distribution, where

SXX = ∑_{i=1}^n Xi² − (1/n)(∑_{i=1}^n Xi)²   and   SYY = ∑_{i=1}^n Yi² − (1/n)(∑_{i=1}^n Yi)².

[You may use the following fact: if U ∼ Γ(a, λ) and V ∼ Γ(b, λ) are independent, then U/(U + V) ∼ Beta(a, b).]

2010
Paper 1, Section I

7E Statistics
Suppose X1, . . . , Xn are independent N(0, σ²) random variables, where σ² is an unknown parameter. Explain carefully how to construct the uniformly most powerful test of size α for the hypothesis H0 : σ² = 1 versus the alternative H1 : σ² > 1.

Paper 2, Section I

8E Statistics
A washing powder manufacturer wants to determine the effectiveness of a television advertisement. Before the advertisement is shown, a pollster asks 100 randomly chosen people which of the three most popular washing powders, labelled A, B and C, they prefer. After the advertisement is shown, another 100 randomly chosen people (not the same as before) are asked the same question. The results are summarized below.

           A    B    C
before    36   47   17
after     44   33   23

Derive and carry out an appropriate test at the 5% significance level of the hypothesis that the advertisement has had no effect on people’s preferences.

[You may find the following table helpful:

                 χ²₁    χ²₂    χ²₃    χ²₄    χ²₅    χ²₆
95 percentile   3.84   5.99   7.82   9.49  11.07  12.59.]

Paper 1, Section II

19E Statistics
Consider the linear regression model

Yi = β xi + εi,

where the numbers x1, . . . , xn are known, the independent random variables ε1, . . . , εn have the N(0, σ²) distribution, and the parameters β and σ² are unknown. Find the maximum likelihood estimator for β.

State and prove the Gauss–Markov theorem in the context of this model.

Write down the distribution of an arbitrary linear estimator for β. Hence show that there exists a linear, unbiased estimator β̂ for β such that

E_{β,σ²}[(β̂ − β)⁴] ≤ E_{β,σ²}[(β̃ − β)⁴]

for all linear, unbiased estimators β̃.

[Hint: If Z ∼ N(a, b²) then E[(Z − a)⁴] = 3b⁴.]

Paper 3, Section II

20E Statistics
Let X1, . . . , Xn be independent Exp(θ) random variables with unknown parameter θ. Find the maximum likelihood estimator θ̂ of θ, and state the distribution of n/θ̂. Show that θ/θ̂ has the Γ(n, n) distribution. Find the 100(1 − α)% confidence interval for θ of the form [0, C θ̂] for a constant C > 0 depending on α.

Now, taking a Bayesian point of view, suppose your prior distribution for the parameter θ is Γ(k, λ). Show that your Bayesian point estimator θ̂B of θ for the loss function L(θ, a) = (θ − a)² is given by

θ̂B = (n + k) / (λ + ∑_i Xi).

Find a constant CB > 0 depending on α such that the posterior probability that θ ≤ CB θ̂B is equal to 1 − α.

[The density of the Γ(k, λ) distribution is f(x; k, λ) = λ^k x^{k−1} e^{−λx}/Γ(k), for x > 0.]

Paper 4, Section II

19E Statistics
Consider a collection X1, . . . , Xn of independent random variables with common density function f(x; θ) depending on a real parameter θ. What does it mean to say T is a sufficient statistic for θ? Prove that if the joint density of X1, . . . , Xn satisfies the factorisation criterion for a statistic T, then T is sufficient for θ.

Let each Xi be uniformly distributed on [−√θ, √θ]. Find a two-dimensional sufficient statistic T = (T1, T2). Using the fact that θ̂ = 3X1² is an unbiased estimator of θ, or otherwise, find an unbiased estimator of θ which is a function of T and has smaller variance than θ̂. Clearly state any results you use.

2009
Paper 1, Section I

7H Statistics

What does it mean to say that an estimator θ̂ of a parameter θ is unbiased?

An n-vector Y of observations is believed to be explained by the model

Y = Xβ + ε,

where X is a known n × p matrix, β is an unknown p-vector of parameters, p < n, and ε is an n-vector of independent N(0, σ²) random variables. Find the maximum-likelihood estimator β̂ of β, and show that it is unbiased.

Paper 3, Section I

8H Statistics

In a demographic study, researchers gather data on the gender of children in families with more than two children. For each of the four possible outcomes GG, GB, BG, BB of the first two children in the family, they find 50 families which started with that pair, and record the gender of the third child of the family. This produces the following table of counts:

First two children   Third child B   Third child G
GG                        16              34
GB                        28              22
BG                        25              25
BB                        31              19

In view of this, is the hypothesis that the gender of the third child is independent of the genders of the first two children rejected at the 5% level?

[Hint: the 95% point of a χ²₃ distribution is 7.8147, and the 95% point of a χ²₄ distribution is 9.4877.]

Paper 1, Section II

18H Statistics

What is the critical region C of a test of the null hypothesis H0 : θ ∈ Θ0 against the alternative H1 : θ ∈ Θ1? What is the size of a test with critical region C? What is the power function of a test with critical region C?

State and prove the Neyman–Pearson Lemma.

If X1, . . . , Xn are independent with common Exp(λ) distribution, and 0 < λ0 < λ1, find the form of the most powerful size-α test of H0 : λ = λ0 against H1 : λ = λ1. Find the power function as explicitly as you can, and prove that it is increasing in λ. Deduce that the test you have constructed is a size-α test of H0 : λ ≤ λ0 against H1 : λ = λ1.

Paper 2, Section II

19H Statistics

What does it mean to say that the random d-vector X has a multivariate normal distribution with mean µ and covariance matrix Σ?

Suppose that X ∼ Nd(0, σ²Id), and that for each j = 1, . . . , J, Aj is a dj × d matrix. Suppose further that

Aj Aiᵀ = 0   for j ≠ i.

Prove that the random vectors Yj ≡ AjX are independent, and that Y ≡ (Y1ᵀ, . . . , YJᵀ)ᵀ has a multivariate normal distribution.

[Hint: Random vectors are independent if their joint MGF is the product of their individual MGFs.]

If Z1, . . . , Zn is an independent sample from a univariate N(µ, σ²) distribution, prove that the sample variance SZZ ≡ (n − 1)⁻¹∑_{i=1}^n (Zi − Z̄)² and the sample mean Z̄ ≡ n⁻¹∑_{i=1}^n Zi are independent.

Paper 4, Section II

19H Statistics

What is a sufficient statistic? State the factorization criterion for a statistic to be sufficient.

Suppose that X1, . . . , Xn are independent random variables uniformly distributed over [a, b], where the parameters a < b are not known, and n ≥ 2. Find a sufficient statistic for the parameter θ ≡ (a, b) based on the sample X1, . . . , Xn. Based on your sufficient statistic, derive an unbiased estimator of θ.

2008
1/I/7H Statistics

A Bayesian statistician observes a random sample X1, . . . , Xn drawn from a N(µ, τ⁻¹) distribution. He has a prior density for the unknown parameters µ, τ of the form

π0(µ, τ) ∝ τ^{α0−1} exp(−½ K0 τ(µ − µ0)² − β0 τ) √τ,

where α0, β0, µ0 and K0 are constants which he chooses. Show that after observing X1, . . . , Xn his posterior density πn(µ, τ) is again of the form

πn(µ, τ) ∝ τ^{αn−1} exp(−½ Kn τ(µ − µn)² − βn τ) √τ,

where you should find explicitly the form of αn, βn, µn and Kn.

1/II/18H Statistics

Suppose that X1, . . . , Xn is a sample of size n with common N(µX, 1) distribution, and Y1, . . . , Yn is an independent sample of size n from a N(µY, 1) distribution.

(i) Find (with careful justification) the form of the size-α likelihood-ratio test of the null hypothesis H0 : µY = 0 against alternative H1 : (µX, µY) unrestricted.

(ii) Find the form of the size-α likelihood-ratio test of the hypothesis

H0 : µX ≥ A, µY = 0,

against H1 : (µX, µY) unrestricted, where A is a given constant.

Compare the critical regions you obtain in (i) and (ii) and comment briefly.

2/II/19H Statistics

Suppose that the joint distribution of random variables X, Y taking values in Z⁺ = {0, 1, 2, . . .} is given by the joint probability generating function

ϕ(s, t) ≡ E[s^X t^Y] = (1 − α − β)/(1 − αs − βt),

where the unknown parameters α and β are positive, and satisfy the inequality α + β < 1. Find E(X). Prove that the probability mass function of (X, Y) is

f(x, y | α, β) = (1 − α − β) (x+y choose x) α^x β^y   (x, y ∈ Z⁺),

and prove that the maximum-likelihood estimators of α and β based on a sample of size n drawn from the distribution are

α̂ = X̄/(1 + X̄ + Ȳ),   β̂ = Ȳ/(1 + X̄ + Ȳ),

where X̄ (respectively, Ȳ) is the sample mean of X1, . . . , Xn (respectively, Y1, . . . , Yn).

By considering α̂ + β̂ or otherwise, prove that the maximum-likelihood estimator is biased. Stating clearly any results to which you appeal, prove that as n → ∞, α̂ → α, making clear the sense in which this convergence happens.

3/I/8H Statistics

If X1, . . . , Xn is a sample from a density f(·|θ) with θ unknown, what is a 95% confidence set for θ?

In the case where the Xi are independent N(µ, σ²) random variables with σ² known, µ unknown, find (in terms of σ²) how large the size n of the sample must be in order for there to exist a 95% confidence interval for µ of length no more than some given ε > 0.

[Hint: If Z ∼ N(0, 1) then P (Z > 1.960) = 0.025 .]
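An illustrative sketch, not part of the question: the 95% interval has length 2 · 1.96 · σ/√n, so the required sample size is the least integer n with n ≥ (2 · 1.96 · σ/ε)²; the values of σ² and ε below are made up.

```python
# Illustrative sketch (not part of the question): smallest n giving a 95%
# interval of length at most eps, for hypothetical sigma^2 and eps.
import math

sigma2, eps = 4.0, 0.5                      # hypothetical values
n = math.ceil((2 * 1.96 * math.sqrt(sigma2) / eps) ** 2)
print(n)
```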

4/II/19H Statistics

(i) Consider the linear model

Yi = α + βxi + εi,

where observations Yi, i = 1, . . . , n, depend on known explanatory variables xi, i = 1, . . . , n, and independent N(0, σ²) random variables εi, i = 1, . . . , n.

Derive the maximum-likelihood estimators of α, β and σ².

Stating clearly any results you require about the distribution of the maximum-likelihood estimators of α, β and σ², explain how to construct a test of the hypothesis that α = 0 against an unrestricted alternative.

(ii) A simple ballistic theory predicts that the range of a gun fired at angle of elevation θ should be given by the formula

Y = (V²/g) sin 2θ,

where V is the muzzle velocity, and g is the gravitational acceleration. Shells are fired at 9 different elevations, and the ranges observed are as follows:

θ (degrees)    5       15      25      35      45     55      65      75      85
sin 2θ         0.1736  0.5     0.7660  0.9397  1      0.9397  0.7660  0.5     0.1736
Y (m)          4322    11898   17485   20664   21296  19491   15572   10027   3458

The model

Yi = α + β sin 2θi + εi    (∗)

is proposed. Using the theory of part (i) above, find expressions for the maximum-likelihood estimators of α and β.

The t-test of the null hypothesis that α = 0 against an unrestricted alternative does not reject the null hypothesis. Would you be willing to accept the model (∗)? Briefly explain your answer.

[You may need the following summary statistics of the data. If xi = sin 2θi, then x̄ ≡ n⁻¹∑xi = 0.63986, Ȳ = 13802, Sxx ≡ ∑(xi − x̄)² = 0.81517, Sxy ≡ ∑Yi(xi − x̄) = 17186.]
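An illustrative numerical check, not part of the question: the least squares fit of Y = α + β sin 2θ to the data above, which can be compared with the quoted summary statistics.

```python
# Illustrative check (not part of the question): least squares / maximum
# likelihood fit of Y = alpha + beta * sin(2*theta) to the range data.
import numpy as np

theta_deg = np.array([5, 15, 25, 35, 45, 55, 65, 75, 85])
Y = np.array([4322, 11898, 17485, 20664, 21296, 19491, 15572, 10027, 3458])
x = np.sin(2 * np.deg2rad(theta_deg))

xbar, Ybar = x.mean(), Y.mean()
Sxx = ((x - xbar) ** 2).sum()
Sxy = (Y * (x - xbar)).sum()
beta_hat = Sxy / Sxx
alpha_hat = Ybar - beta_hat * xbar
print(xbar, Ybar, Sxx, Sxy)      # compare with the quoted summary statistics
print(alpha_hat, beta_hat)
```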

2007
1/I/7C Statistics

Let X1, . . . , Xn be independent, identically distributed random variables from the N(µ, σ²) distribution where µ and σ² are unknown. Use the generalized likelihood-ratio test to derive the form of a test of the hypothesis H0 : µ = µ0 against H1 : µ ≠ µ0.

Explain carefully how the test should be implemented.

1/II/18C Statistics

Let X1, . . . , Xn be independent, identically distributed random variables with

P(Xi = 1) = θ = 1 − P(Xi = 0),

where θ is an unknown parameter, 0 < θ < 1, and n ≥ 2. It is desired to estimate the quantity φ = θ(1 − θ) = n Var((X1 + · · · + Xn)/n).

(i) Find the maximum-likelihood estimate, φ̂, of φ.

(ii) Show that φ̂1 = X1(1 − X2) is an unbiased estimate of φ and hence, or otherwise, obtain an unbiased estimate of φ which has smaller variance than φ̂1 and which is a function of φ̂.

(iii) Now suppose that a Bayesian approach is adopted and that the prior distribution for θ, π(θ), is taken to be the uniform distribution on (0, 1). Compute the Bayes point estimate of φ when the loss function is L(φ, a) = (φ − a)².

[You may use the fact that when r, s are non-negative integers,

∫_0^1 x^r (1 − x)^s dx = r!s!/(r + s + 1)!]

2/II/19C Statistics

State and prove the Neyman–Pearson lemma.

Suppose that X is a random variable drawn from the probability density function

f(x | θ) = ½ |x|^{θ−1} e^{−|x|} / Γ(θ),   −∞ < x < ∞,

where Γ(θ) = ∫_0^∞ y^{θ−1} e^{−y} dy and θ ≥ 1 is unknown. Find the most powerful test of size α, 0 < α < 1, of the hypothesis H0 : θ = 1 against the alternative H1 : θ = 2. Express the power of the test as a function of α.

Is your test uniformly most powerful for testing H0 : θ = 1 against H1 : θ > 1? Explain your answer carefully.

3/I/8C Statistics

Light bulbs are sold in packets of 3 but some of the bulbs are defective. A sample of 256 packets yields the following figures for the number of defectives in a packet:

No. of defectives     0     1     2    3
No. of packets       116    94    40   6

Test the hypothesis that each bulb has a constant (but unknown) probability θ of being defective independently of all other bulbs.

[Hint: You may wish to use some of the following percentage points:

Distribution     χ²₁    χ²₂    χ²₃    χ²₄    t₁     t₂     t₃     t₄
90% percentile   2·71   4·61   6·25   7·78   3·08   1·89   1·64   1·53
95% percentile   3·84   5·99   7·81   9·49   6·31   2·92   2·35   2·13]
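An illustrative numerical check, not part of the question: fitting Binomial(3, θ) by maximum likelihood and forming the goodness-of-fit statistic; one parameter is estimated from the data, so the reference distribution has 4 − 1 − 1 = 2 degrees of freedom.

```python
# Illustrative check (not part of the question): Binomial(3, theta) goodness of
# fit; theta_hat is the total number of defectives over the total number of bulbs.
import numpy as np
from scipy.stats import binom

counts = np.array([116, 94, 40, 6])              # packets with 0, 1, 2, 3 defectives
n_packets = counts.sum()
theta_hat = (counts * np.arange(4)).sum() / (3 * n_packets)
expected = n_packets * binom.pmf(np.arange(4), 3, theta_hat)
stat = ((counts - expected) ** 2 / expected).sum()
print(theta_hat, stat)                           # compare stat with chi^2_2 points above
```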

4/II/19C Statistics

Consider the linear regression model

Yi = α + βxi + εi,   1 ≤ i ≤ n,

where ε1, . . . , εn are independent, identically distributed N(0, σ²), x1, . . . , xn are known real numbers with ∑_{i=1}^n xi = 0 and α, β and σ² are unknown.

(i) Find the least-squares estimates α̂ and β̂ of α and β, respectively, and explain why in this case they are the same as the maximum-likelihood estimates.

(ii) Determine the maximum-likelihood estimate σ̂² of σ² and find a multiple of it which is an unbiased estimate of σ².

(iii) Determine the joint distribution of α̂, β̂ and σ̂².

(iv) Explain carefully how you would test the hypothesis H0 : α = α0 against the alternative H1 : α ≠ α0.

2006
1/I/7C Statistics

A random sample X1, . . . , Xn is taken from a normal distribution having unknown mean θ and variance 1. Find the maximum likelihood estimate θ̂M for θ based on X1, . . . , Xn.

Suppose that we now take a Bayesian point of view and regard θ itself as a normal random variable of known mean µ and variance τ⁻¹. Find the Bayes’ estimate θ̂B for θ based on X1, . . . , Xn, corresponding to the quadratic loss function (θ − a)².

1/II/18C Statistics

Let X be a random variable whose distribution depends on an unknown parameter θ. Explain what is meant by a sufficient statistic T(X) for θ.

In the case where X is discrete, with probability mass function f(x|θ), explain, with justification, how a sufficient statistic may be found.

Assume now that X = (X1, . . . , Xn), where X1, . . . , Xn are independent non-negative random variables with common density function

f(x|θ) = λe^{−λ(x−θ)} if x > θ, and 0 otherwise.

Here θ ≥ 0 is unknown and λ is a known positive parameter. Find a sufficient statistic for θ and hence obtain an unbiased estimator θ̂ for θ of variance (nλ)⁻².

[You may use without proof the following facts: for independent exponential random variables X and Y, having parameters λ and µ respectively, X has mean λ⁻¹ and variance λ⁻² and min{X, Y} has exponential distribution of parameter λ + µ.]

2/II/19C Statistics

Suppose that X1, . . . , Xn are independent normal random variables of unknown mean θ and variance 1. It is desired to test the hypothesis H0 : θ ≤ 0 against the alternative H1 : θ > 0. Show that there is a uniformly most powerful test of size α = 1/20 and identify a critical region for such a test in the case n = 9. If you appeal to any theoretical result from the course you should also prove it.

[The 95th percentile of the standard normal distribution is 1.65.]

3/I/8C Statistics

One hundred children were asked whether they preferred crisps, fruit or chocolate. Of the boys, 12 stated a preference for crisps, 11 for fruit, and 17 for chocolate. Of the girls, 13 stated a preference for crisps, 14 for fruit, and 33 for chocolate. Answer each of the following questions by carrying out an appropriate statistical test.

(a) Are the data consistent with the hypothesis that girls find all three types ofsnack equally attractive?

(b) Are the data consistent with the hypothesis that boys and girls show the samedistribution of preferences?

4/II/19C Statistics

Two series of experiments are performed, the first resulting in observations X1, . . . , Xm, the second resulting in observations Y1, . . . , Yn. We assume that all observations are independent and normally distributed, with unknown means µX in the first series and µY in the second series. We assume further that the variances of the observations are unknown but are all equal.

Write down the distributions of the sample mean X̄ = m⁻¹∑_{i=1}^m Xi and sum of squares SXX = ∑_{i=1}^m (Xi − X̄)².

Hence obtain a statistic T(X, Y) to test the hypothesis H0 : µX = µY against H1 : µX > µY and derive its distribution under H0. Explain how you would carry out a test of size α = 1/100.

2005
1/I/7D Statistics

The fast-food chain McGonagles have three sizes of their takeaway haggis, Large, Jumbo and Soopersize. A survey of 300 randomly selected customers at one branch chose 92 Large, 89 Jumbo and 119 Soopersize haggises.

Is there sufficient evidence to reject the hypothesis that all three sizes are equally popular? Explain your answer carefully.

[Distribution     t₁     t₂     t₃     χ²₁    χ²₂    χ²₃    F₁,₂    F₂,₃
95% percentile   6·31   2·92   2·35   3·84   5·99   7·82   18·51   9·55]
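A quick numerical check, not part of the question: the Pearson statistic against equal popularity (expected count 100 for each size).

```python
# Illustrative check (not part of the question): goodness-of-fit statistic for
# the equal-popularity hypothesis; compare with the chi^2_2 95% point above.
import numpy as np

counts = np.array([92, 89, 119])
expected = np.full(3, counts.sum() / 3)
stat = ((counts - expected) ** 2 / expected).sum()
print(stat)
```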

1/II/18D Statistics

In the context of hypothesis testing define the following terms: (i) simple hypothesis; (ii) critical region; (iii) size; (iv) power; and (v) type II error probability.

State, without proof, the Neyman–Pearson lemma.

Let X be a single observation from a probability density function f. It is desired to test the hypothesis

H0 : f = f0 against H1 : f = f1,

with f0(x) = ½ |x| e^{−x²/2} and f1(x) = Φ′(x), −∞ < x < ∞, where Φ(x) is the distribution function of the standard normal, N(0, 1).

Determine the best test of size α, where 0 < α < 1, and express its power in terms of Φ and α.

Find the size of the test that minimizes the sum of the error probabilities. Explain your reasoning carefully.

2/II/19D Statistics

Let X1, . . . , Xn be a random sample from a probability density function f(x | θ), where θ is an unknown real-valued parameter which is assumed to have a prior density π(θ). Determine the optimal Bayes point estimate a(X1, . . . , Xn) of θ, in terms of the posterior distribution of θ given X1, . . . , Xn, when the loss function is

L(θ, a) = γ(θ − a) when θ > a,   and   δ(a − θ) when θ ≤ a,

where γ and δ are given positive constants.

Calculate the estimate explicitly in the case when f(x | θ) is the density of the uniform distribution on (0, θ) and π(θ) = e^{−θ}θ^n/n!, θ > 0.

3/I/8D Statistics

Let X1, . . . , Xn be a random sample from a normal distribution with mean µ and variance σ², where µ and σ² are unknown. Derive the form of the size-α generalized likelihood-ratio test of the hypothesis H0 : µ = µ0 against H1 : µ ≠ µ0, and show that it is equivalent to the standard t-test of size α.

[You should state, but need not derive, the distribution of the test statistic.]

4/II/19D Statistics

Let Y1, . . . , Yn be observations satisfying

Yi = βxi + εi,   1 ≤ i ≤ n,

where ε1, . . . , εn are independent random variables each with the N(0, σ²) distribution. Here x1, . . . , xn are known but β and σ² are unknown.

(i) Determine the maximum-likelihood estimates (β̂, σ̂²) of (β, σ²).

(ii) Find the distribution of β̂.

(iii) By showing that Yi − β̂xi and β̂ are independent, or otherwise, determine the joint distribution of β̂ and σ̂².

(iv) Explain carefully how you would test the hypothesis H0 : β = β0 against H1 : β ≠ β0.

2004
1/I/10H Statistics

Use the generalized likelihood-ratio test to derive Student’s t-test for the equality of the means of two populations. You should explain carefully the assumptions underlying the test.

1/II/21H Statistics

State and prove the Rao–Blackwell Theorem.

Suppose that X1, X2, . . . , Xn are independent, identically-distributed random variables with distribution

P(X1 = r) = p^{r−1}(1 − p),   r = 1, 2, . . . ,

where p, 0 < p < 1, is an unknown parameter. Determine a one-dimensional sufficient statistic, T, for p.

By first finding a simple unbiased estimate for p, or otherwise, determine an unbiased estimate for p which is a function of T.

2/I/10H Statistics

A study of 60 men and 90 women classified each individual according to eye colourto produce the figures below.

Blue Brown Green

Men 20 20 20

Women 20 50 20

Explain how you would analyse these results. You should indicate carefully any underlyingassumptions that you are making.

A further study took 150 individuals and classified them both by eye colour and bywhether they were left or right handed to produce the following table.

Blue Brown Green

Left Handed 20 20 20

Right Handed 20 50 20

How would your analysis change? You should again set out your underlying assumptionscarefully.

[You may wish to note the following percentiles of the χ2 distribution.

    Distribution       χ2_1   χ2_2   χ2_3   χ2_4   χ2_5   χ2_6
    95% percentile     3.84   5.99   7.81   9.49  11.07  12.59
    99% percentile     6.64   9.21  11.34  13.28  15.09  16.81 ]
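Purely as a numerical aid (the inference questions above still turn on spelling out the assumptions), the Pearson chi-squared statistics for the two tables can be computed with scipy; chi2_contingency treats each row as a multinomial sample and compares observed with expected counts.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Pearson chi-squared tests for the two 2x3 tables above (df = 2 in each case).
sex_by_eye = np.array([[20, 20, 20],    # men:   blue, brown, green
                       [20, 50, 20]])   # women: blue, brown, green
hand_by_eye = np.array([[20, 20, 20],   # left handed
                        [20, 50, 20]])  # right handed

for name, table in [("sex vs eye colour", sex_by_eye),
                    ("handedness vs eye colour", hand_by_eye)]:
    stat, pval, dof, expected = chi2_contingency(table, correction=False)
    print(f"{name}: chi2 = {stat:.2f}, df = {dof}, p = {pval:.4f}")
```

Numerically the two statistics coincide because the counts are identical; the question turns on how the sampling scheme (fixed row totals versus a single cross-classified sample of 150) changes the hypotheses and assumptions.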

2/II/21H Statistics

Defining carefully the terminology that you use, state and prove the Neyman–Pearson Lemma.

Let X be a single observation from the distribution with density function

f(x | θ) = (1/2) e^{−|x−θ|},  −∞ < x < ∞,

for an unknown real parameter θ. Find the best test of size α, 0 < α < 1, of the hypothesis H0 : θ = θ0 against H1 : θ = θ1, where θ1 > θ0.

When α = 0.05, for which values of θ0 and θ1 will the power of the best test be at least 0.95?

Part IB 2004

4/I/9H Statistics

Suppose that Y1, . . . , Yn are independent random variables, with Yi having the normal distribution with mean βxi and variance σ2; here β, σ2 are unknown and x1, . . . , xn are known constants.

Derive the least-squares estimate of β.

Explain carefully how to test the hypothesis H0 : β = 0 against H1 : β ≠ 0.

4/II/19H Statistics

It is required to estimate the unknown parameter θ after observing X, a single random variable with probability density function f(x | θ); the parameter θ has the prior distribution with density π(θ) and the loss function is L(θ, a). Show that the optimal Bayesian point estimate minimizes the posterior expected loss.

Suppose now that f(x | θ) = θ e^{−θx}, x > 0 and π(θ) = µ e^{−µθ}, θ > 0, where µ > 0 is known. Determine the posterior distribution of θ given X.

Determine the optimal Bayesian point estimate of θ in the cases when

(i) L(θ, a) = (θ − a)2, and

(ii) L(θ, a) = |(θ − a) /θ|.
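A numerical sketch (illustrative values; µ and x are invented) that minimises the posterior expected loss on a grid, which can be used to check whatever closed forms you derive for (i) and (ii). Multiplying likelihood and prior shows the posterior is proportional to θ e^{−(x+µ)θ}, a Gamma density with shape 2 and rate x + µ.

```python
import numpy as np
from scipy.stats import gamma

# Posterior for f(x|theta) = theta e^{-theta x} with prior pi(theta) = mu e^{-mu theta}:
# proportional to theta e^{-(x+mu) theta}, i.e. Gamma(shape 2, rate x+mu).
mu, x = 1.5, 0.8                         # invented known prior rate and observation
rate = x + mu
theta = np.linspace(1e-4, 30 / rate, 200001)
w = gamma.pdf(theta, a=2, scale=1 / rate)
w /= w.sum()                             # discretised posterior weights

def bayes_estimate(loss):
    grid = np.linspace(1e-3, 10 / rate, 2001)
    risks = [np.sum(loss(theta, a) * w) for a in grid]
    return grid[int(np.argmin(risks))]

est_quadratic = bayes_estimate(lambda t, a: (t - a) ** 2)
est_scaled_abs = bayes_estimate(lambda t, a: np.abs((t - a) / t))
print("quadratic loss:", est_quadratic, "(posterior mean =", 2 / rate, ")")
print("|(theta - a)/theta| loss:", est_scaled_abs)
```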

Part IB 2004

1/I/3H Statistics

Derive the least squares estimators α̂ and β̂ for the coefficients of the simple linear regression model

Yi = α + β(xi − x̄) + εi,  i = 1, . . . , n,

where x1, . . . , xn are given constants, x̄ = n^{−1} Σ_{i=1}^{n} xi, and the εi are independent with

E εi = 0, Var εi = σ2, i = 1, . . . , n.

A manufacturer of optical equipment has the following data on the unit cost (in pounds) of certain custom-made lenses and the number of units made in each order:

 No. of units, xi       1    3    5   10   12
 Cost per unit, yi     58   55   40   37   22

Assuming that the conditions underlying simple linear regression analysis are met, estimate the regression coefficients and use the estimated regression equation to predict the unit cost in an order for 8 of these lenses.

[Hint: for the data above, Sxy = Σ_{i=1}^{n} (xi − x̄) yi = −257.4.]
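The arithmetic in the last part can be checked with a few lines of Python (a convenience, not a substitute for showing the working):

```python
import numpy as np

# Least-squares fit of y_i = alpha + beta*(x_i - xbar) + eps_i for the lens data,
# then prediction of the unit cost for an order of 8 lenses.
x = np.array([1, 3, 5, 10, 12], dtype=float)
y = np.array([58, 55, 40, 37, 22], dtype=float)

xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)
Sxy = np.sum((x - xbar) * y)            # matches the hint: -257.4

alpha_hat = y.mean()                    # intercept in the centred parametrisation
beta_hat = Sxy / Sxx
prediction = alpha_hat + beta_hat * (8 - xbar)
print(Sxy, alpha_hat, beta_hat, prediction)
```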

1/II/12H Statistics

Suppose that six observations X1, . . . , X6 are selected at random from a normal distribution for which both the mean µX and the variance σ_X^2 are unknown, and it is found that SXX = Σ_{i=1}^{6} (Xi − X̄)2 = 30, where X̄ = (1/6) Σ_{i=1}^{6} Xi. Suppose also that 21 observations Y1, . . . , Y21 are selected at random from another normal distribution for which both the mean µY and the variance σ_Y^2 are unknown, and it is found that SY Y = 40. Derive carefully the likelihood ratio test of the hypothesis H0 : σ_X^2 = σ_Y^2 against H1 : σ_X^2 > σ_Y^2 and apply it to the data above at the 0.05 level.

[Hint:
    Distribution       χ2_5   χ2_6   χ2_20  χ2_21  F5,20  F6,21
    95% percentile    11.07  12.59  31.41  32.68   2.71   2.57 ]
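For the numerical part, the variance-ratio (F) statistic implied by the data can be computed directly; the sketch below uses the given sums of squares and compares with the tabulated 95% point of F5,20 (the derivation of the test itself is still required above).

```python
from scipy.stats import f

# Test of H0: sigma_X^2 = sigma_Y^2 against H1: sigma_X^2 > sigma_Y^2.
SXX, n_x = 30.0, 6      # sum of squares and sample size for the X sample
SYY, n_y = 40.0, 21     # sum of squares and sample size for the Y sample

F_stat = (SXX / (n_x - 1)) / (SYY / (n_y - 1))   # ratio of unbiased variance estimates
crit = f.ppf(0.95, dfn=n_x - 1, dfd=n_y - 1)     # close to the tabulated 2.71
print(F_stat, crit, F_stat > crit)
```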

2/I/3H Statistics

Let X1, . . . , Xn be a random sample from the N(θ, σ2) distribution, and suppose that the prior distribution for θ is N(µ, τ2), where σ2, µ, τ2 are known. Determine the posterior distribution for θ, given X1, . . . , Xn, and the best point estimate of θ under both quadratic and absolute error loss.

Part IB 2003

2/II/12H Statistics

An examination was given to 500 high-school students in each of two large cities, and their grades were recorded as low, medium, or high. The results are given in the table below.

            Low   Medium   High
 City A     103      145    252
 City B     140      136    224

Derive carefully the test of homogeneity and test the hypothesis that the distributions of scores among students in the two cities are the same.

[Hint:
    Distribution       χ2_1   χ2_2   χ2_3   χ2_5   χ2_6
    99% percentile     6.63   9.21  11.34  15.09  16.81
    95% percentile     3.84   5.99   7.81  11.07  12.59 ]

4/I/3H Statistics

The following table contains a distribution obtained in 320 tosses of 6 coins and the corresponding expected frequencies calculated with the formula for the binomial distribution for p = 0.5 and n = 6.

 No. heads                0     1     2     3     4     5     6
 Observed frequencies     3    21    85   110    62    32     7
 Expected frequencies     5    30    75   100    75    30     5

Conduct a goodness-of-fit test at the 0.05 level for the null hypothesis that the coins are all fair.

[Hint:
    Distribution       χ2_5   χ2_6   χ2_7
    95% percentile    11.07  12.59  14.07 ]
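The arithmetic for the goodness-of-fit statistic can be checked as below; with p = 0.5 fixed in advance, nothing is estimated from the data, so the reference distribution has one fewer degree of freedom than the number of cells.

```python
import numpy as np
from scipy.stats import chi2

# Pearson goodness-of-fit statistic for the coin-tossing data.
observed = np.array([3, 21, 85, 110, 62, 32, 7], dtype=float)
expected = np.array([5, 30, 75, 100, 75, 30, 5], dtype=float)

stat = np.sum((observed - expected) ** 2 / expected)
df = len(observed) - 1                 # p = 0.5 is specified, so nothing is estimated
crit = chi2.ppf(0.95, df)              # matches the tabulated 12.59
print(stat, crit, stat > crit)
```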

4/II/12H Statistics

State and prove the Rao–Blackwell theorem.

Suppose that X1, . . . , Xn are independent random variables uniformly distributed over (θ, 3θ). Find a two-dimensional sufficient statistic T(X) for θ. Show that an unbiased estimator of θ is θ̂ = X1/2.

Find an unbiased estimator of θ which is a function of T(X) and whose mean square error is no more than that of θ̂.

Part IB 2003

1/I/3H Statistics

State the factorization criterion for sufficient statistics and give its proof in the discrete case.

Let X1, . . . , Xn form a random sample from a Poisson distribution for which the value of the mean θ is unknown. Find a one-dimensional sufficient statistic for θ.

1/II/12H Statistics

Suppose we ask 50 men and 150 women whether they are early risers, late risers, or risers with no preference. The data are given in the following table.

            Early risers   Late risers   No preference   Totals
 Men              17            22             11           50
 Women            43            78             29          150
 Totals           60           100             40          200

Derive carefully a (generalized) likelihood ratio test of independence of classification. What is the result of applying this test at the 0.01 level?

[ Distribution       χ2_1   χ2_2   χ2_3   χ2_5   χ2_6
  99% percentile     6.63   9.21  11.34  15.09  16.81 ]

2/I/3H Statistics

Explain what is meant by a uniformly most powerful test, its power function and size.

Let Y1, . . . , Yn be independent identically distributed random variables with common density ρ e^{−ρy}, y ≥ 0. Obtain the uniformly most powerful test of ρ = ρ0 against alternatives ρ < ρ0 and determine the power function of the test.
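A short sketch (with illustrative sample size and rates) of the resulting power function, assuming the usual form of the test: reject ρ = ρ0 in favour of ρ < ρ0 when Σ Yi is large, with the cut-off taken from the Gamma(n, rate ρ0) distribution of the sum under the null.

```python
from scipy.stats import gamma

# Power function of the size-alpha test of rho = rho0 against rho < rho0,
# which rejects when sum(Y) exceeds the upper alpha point of Gamma(n, rate rho0).
alpha, n, rho0 = 0.05, 10, 2.0                   # illustrative choices
c = gamma.ppf(1 - alpha, a=n, scale=1 / rho0)    # critical value for sum(Y)

def power(rho):
    return gamma.sf(c, a=n, scale=1 / rho)       # P_rho( sum(Y) > c )

for rho in [2.0, 1.5, 1.0, 0.5]:
    print(f"rho = {rho:.1f}: power = {power(rho):.3f}")
# power(rho0) equals the size alpha, and the power increases as rho decreases.
```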

Part IB 2002

2/II/12H Statistics

For ten steel ingots from a production process the following measures of hardness were obtained:

73.2, 74.3, 75.4, 73.8, 74.4, 76.7, 76.1, 73.0, 74.6, 74.1.

On the assumption that the variation is well described by a normal density function obtain an estimate of the process mean.

The manufacturer claims that he is supplying steel with mean hardness 75. Derive carefully a (generalized) likelihood ratio test of this claim. Knowing that for the data above

SXX = Σ_{i=1}^{n} (Xi − X̄)2 = 12.824,

what is the result of the test at the 5% significance level?

[ Distribution         t9     t10
  95% percentile      1.83   1.81
  97.5% percentile    2.26   2.23 ]
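For the numerical part, the one-sample t statistic implied by the data and SXX can be computed as follows (using the 97.5% point of t9 for a two-sided test at the 5% level):

```python
import numpy as np
from scipy.stats import t

# Two-sided t-test of H0: mu = 75 for the hardness data.
x = np.array([73.2, 74.3, 75.4, 73.8, 74.4, 76.7, 76.1, 73.0, 74.6, 74.1])
n = len(x)
SXX = np.sum((x - x.mean()) ** 2)       # matches the quoted 12.824

t_stat = (x.mean() - 75.0) / np.sqrt(SXX / (n * (n - 1)))
crit = t.ppf(0.975, df=n - 1)           # close to the tabulated 2.26
print(x.mean(), SXX, t_stat, crit, abs(t_stat) > crit)
```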

4/I/3H Statistics

From each of 100 concrete mixes six sample blocks were taken and subjected to strength tests, the number out of the six blocks failing the test being recorded in the following table:

 No. x failing strength tests     0    1    2    3    4    5    6
 No. of mixes with x failures    53   32   12    2    1    0    0

On the assumption that the probability of failure is the same for each block, obtain an unbiased estimate of this probability and explain how to find a 95% confidence interval for it.
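A numerical sketch for this question, treating the 600 blocks as independent Bernoulli trials with common failure probability p and using the usual normal approximation for the interval:

```python
import numpy as np
from scipy.stats import norm

# Unbiased estimate of the failure probability and an approximate 95% CI,
# treating the 100 x 6 = 600 blocks as independent Bernoulli(p) trials.
counts = np.array([53, 32, 12, 2, 1, 0, 0])   # number of mixes with x failures
x_vals = np.arange(7)

failures = np.sum(x_vals * counts)            # total failing blocks
n_blocks = 6 * counts.sum()                   # 600 blocks in all
p_hat = failures / n_blocks

z = norm.ppf(0.975)
half_width = z * np.sqrt(p_hat * (1 - p_hat) / n_blocks)
print(p_hat, (p_hat - half_width, p_hat + half_width))
```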

4/II/12H Statistics

Explain what is meant by a prior distribution, a posterior distribution, and a Bayes estimator. Relate the Bayes estimator to the posterior distribution for both quadratic and absolute error loss functions.

Suppose X1, . . . , Xn are independent identically distributed random variables from a distribution uniform on (θ − 1, θ + 1), and that the prior for θ is uniform on (20, 50).

Calculate the posterior distribution for θ, given x = (x1, . . . , xn), and find the point estimate for θ under both quadratic and absolute error loss functions.

Part IB 2002

1/I/3D Statistics

Let X1, . . . , Xn be independent, identically distributed N(µ, µ2) random variables, µ > 0.

Find a two-dimensional sufficient statistic for µ, quoting carefully, without proof, any result you use.

What is the maximum likelihood estimator of µ?

1/II/12D Statistics

What is a simple hypothesis? Define the terms size and power for a test of one simple hypothesis against another.

State, without proof, the Neyman–Pearson lemma.

Let X be a single random variable, with distribution F. Consider testing the null hypothesis H0 : F is standard normal, N(0, 1), against the alternative hypothesis H1 : F is double exponential, with density (1/4) e^{−|x|/2}, x ∈ R.

Find the test of size α, α < 1/4, which maximises power, and show that the power is e^{−t/2}, where Φ(t) = 1 − α/2 and Φ is the distribution function of N(0, 1).

[Hint: if X ∼ N(0, 1), P (|X| > 1) = 0.3174.]

2/I/3D Statistics

Suppose the single random variable X has a uniform distribution on the interval [0, θ] and it is required to estimate θ with the loss function

L(θ, a) = c(θ − a)2,

where c > 0.

Find the posterior distribution for θ and the optimal Bayes point estimate with respect to the prior distribution with density p(θ) = θ e^{−θ}, θ > 0.

Part IB 2001

2/II/12D Statistics

What is meant by a generalized likelihood ratio test? Explain in detail how to perform such a test.

Let X1, . . . , Xn be independent random variables, and let Xi have a Poisson distribution with unknown mean λi, i = 1, . . . , n.

Find the form of the generalized likelihood ratio statistic for testing H0 : λ1 = . . . = λn, and show that it may be approximated by

(1/X̄) Σ_{i=1}^{n} (Xi − X̄)2,

where X̄ = n^{−1} Σ_{i=1}^{n} Xi.

If, for n = 7, you found that the value of this statistic was 27.3, would you accept H0? Justify your answer.

4/I/3D Statistics

Consider the linear regression model

Yi = βxi + εi,

i = 1, . . . , n, where x1, . . . , xn are given constants, and ε1, . . . , εn are independent, identically distributed N(0, σ2), with σ2 unknown.

Find the least squares estimator β̂ of β. State, without proof, the distribution of β̂ and describe how you would test H0 : β = β0 against H1 : β ≠ β0, where β0 is given.

4/II/12D Statistics

Let X1, . . . , Xn be independent, identically distributed N(µ, σ2) random variables, where µ and σ2 are unknown.

Derive the maximum likelihood estimators µ̂, σ̂2 of µ, σ2, based on X1, . . . , Xn. Show that µ̂ and σ̂2 are independent, and derive their distributions.

Suppose now it is intended to construct a “prediction interval” I(X1, . . . , Xn) for a future, independent, N(µ, σ2) random variable X0. We require

P{X0 ∈ I(X1, . . . , Xn)} = 1 − α,

with the probability over the joint distribution of X0, X1, . . . , Xn.

Let

Iγ(X1, . . . , Xn) = ( µ̂ − γ σ̂ √(1 + 1/n),  µ̂ + γ σ̂ √(1 + 1/n) ).

By considering the distribution of (X0 − µ̂)/(σ̂ √((n+1)/(n−1))), find the value of γ for which

P{X0 ∈ Iγ(X1, . . . , Xn)} = 1 − α.
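A simulation sketch (not required by the question) comparing the statistic suggested in the last part with the t distribution on n − 1 degrees of freedom; here µ̂ and σ̂2 are the maximum-likelihood estimators, so σ̂2 divides by n, and all numerical values are arbitrary illustrations.

```python
import numpy as np
from scipy.stats import t

# Simulate (X0 - mu_hat) / (sigma_hat * sqrt((n+1)/(n-1))) and compare its
# empirical quantiles with those of the t distribution on n-1 degrees of freedom.
rng = np.random.default_rng(3)
n, mu, sigma, reps = 8, 2.0, 1.5, 200_000    # arbitrary illustrative values

X = rng.normal(mu, sigma, size=(reps, n))
X0 = rng.normal(mu, sigma, size=reps)

mu_hat = X.mean(axis=1)
sigma_hat = np.sqrt(np.mean((X - mu_hat[:, None]) ** 2, axis=1))  # MLE, divides by n
stat = (X0 - mu_hat) / (sigma_hat * np.sqrt((n + 1) / (n - 1)))

for q in [0.90, 0.95, 0.975]:
    print(q, np.quantile(stat, q), t.ppf(q, df=n - 1))
```

The close agreement of the empirical and t quantiles is what makes it possible to read off the required γ from a t table.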

Part IB 2001