Lecture Notes on Nonparametric Methods

January 19, 2010

Contents

1 Introduction and basic definitions

2 Counting Statistics

3 Ranking Statistics

4 Counting and ranking combined

5 Distribution of the order statistic

6 The empirical distribution


Chapter 1

Introduction and basic definitions

Definition 1.1 Let Xi : (Ωi, Ai) → R, i = 1, . . . , n, denote real-valued random variables. Further let f : R^n → R^m, m ∈ N, denote a measurable function. Then the random variable T = f(X1, . . . , Xn) is called a statistic based on X1, . . . , Xn.

Remark 1.1 As random variables, the Xi : (Ωi, Ai) → (R, B) are measurable functions, with B denoting the Borel σ-algebra. Then the composition

f(X1, . . . , Xn) : Ω1 × . . . × Ωn → R^m

is measurable (A1 × . . . × An → B^m), where B^m denotes the Borel σ-algebra on R^m. Thus T = f(X1, . . . , Xn) is indeed a random variable.

Example 1.1 Let m = 1 and f(x1, . . . , xn) := (1/n) ∑_{i=1}^n xi. Then

X̄ = f(X1, . . . , Xn) = (1/n) ∑_{i=1}^n Xi

is the sample mean of X1, . . . , Xn.

Example 1.2 Let m = 1 and define f(x1, . . . , xn) := (1/(n − 1)) ∑_{i=1}^n (xi − µ)², where µ = µ(x1, . . . , xn) := (1/n) ∑_{i=1}^n xi. Then

S = f(X1, . . . , Xn) = (1/(n − 1)) ∑_{i=1}^n (Xi − X̄)²

is the sample variance of X1, . . . , Xn.


Example 1.3 Define the function Ψ : R → [0, 1] by

Ψ(x) =
    1, x > 0
    0, x ≤ 0

Let m = 1 and f(x1, . . . , xn) := ∑_{i=1}^n Ψ(xi). Then

B = f(X1, . . . , Xn) = ∑_{i=1}^n Ψ(Xi)

is called the sign test statistic.

Example 1.4 Let m = n and define f(x1, . . . , xn) := (x(1), . . . , x(n)), where x(1) ≤ . . . ≤ x(n). Then

T = f(X1, . . . , Xn) = (X(1), . . . , X(n))

is called the order statistic of X1, . . . , Xn.

Definition 1.2 Let C denote a class of distributions on R^n. A statistic T is called distribution-free over C if T = f(X1, . . . , Xn) has the same distribution for all distributions F ∈ C of X1, . . . , Xn.

Example 1.5 Let C := {N(µ0, σ²)^n : σ² > 0} denote the class of joint distributions of n independent and identically distributed (or shortly iid) random variables X1, . . . , Xn that have a normal distribution with known mean µ0 ∈ R and unknown variance σ² > 0. Let X̄ be the sample mean and S be the sample variance. Then the statistic

T = √n (X̄ − µ0) / √S

has a t-distribution with n − 1 degrees of freedom for all σ² > 0. Thus T is distribution-free over C. It is called the t-statistic. T is used to test the null hypothesis H0 : µ = µ0. Note that H0 coincides with C.
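As a numerical illustration of distribution-freeness, the following minimal Python sketch (an editorial addition, assuming numpy is available; the sample size, seed, and choices of σ are arbitrary) estimates quantiles of T by simulation for two different values of σ². Up to simulation error they agree, as claimed.

import numpy as np

rng = np.random.default_rng(0)
n, mu0 = 10, 0.0

def t_statistic(x, mu0):
    # T = sqrt(n) * (sample mean - mu0) / sqrt(sample variance)
    return np.sqrt(len(x)) * (x.mean() - mu0) / x.std(ddof=1)

for sigma in (1.0, 5.0):
    t = [t_statistic(rng.normal(mu0, sigma, n), mu0) for _ in range(100_000)]
    # The empirical quantiles should match those of the t-distribution
    # with n - 1 degrees of freedom, independently of sigma.
    print(sigma, np.quantile(t, [0.05, 0.5, 0.95]))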

Definition 1.3 Let T denote a statistic that is distribution-free over a class C of distributions. If C is finite-dimensional, then T is called parametric. If C has infinite dimension, then T is called nonparametric.


Chapter 2

Counting Statistics

Example 2.1 Let X1, . . . , Xn be independent random variables with a common p-quantile θ, i.e. P(Xi ≤ θ) = p for all i = 1, . . . , n. Define the function

Ψθ(x) := Ψ(x − θ) =
    1, x > θ
    0, x ≤ θ

Then the statistic B := ∑_{i=1}^n Ψθ(Xi) has a binomial distribution with parameters n and 1 − p. Thus B is nonparametric distribution-free over the class C = {∏_{i=1}^n Fi : Fi is a distribution function, Fi(θ) = p}. B is called the sign test statistic. This generalises example 1.3, where θ = 0.
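For concreteness, here is a minimal Python sketch of the sign test from example 2.1 (an editorial addition; the data and the choices θ = 0 and p = 1/2 are hypothetical). It computes B and an exact two-sided p-value from the binomial null distribution, summing the probabilities of all outcomes at most as likely as the observed one.

from math import comb

def sign_statistic(x, theta):
    # B = number of observations strictly greater than theta
    return sum(1 for xi in x if xi > theta)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

x = [2.1, -0.3, 1.7, 0.9, -1.2, 3.4, 0.5, -0.8]   # hypothetical sample
b, n = sign_statistic(x, theta=0.0), len(x)
# Under H0 the median is 0, so p = 1/2 and B has a Bin(n, 1/2) distribution.
p_value = sum(binom_pmf(k, n, 0.5) for k in range(n + 1)
              if binom_pmf(k, n, 0.5) <= binom_pmf(b, n, 0.5))
print(b, p_value)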

This observation translates immediately to the following

Theorem 2.1 Let C denote a class of distributions under which the events A1, . . . , An are independent and P(Ai) = p for all i = 1, . . . , n and P ∈ C. Then the statistic B := ∑_{i=1}^n 1_{Ai} has a binomial distribution with parameters n and p for all P ∈ C. B is called a counting statistic.

Exercise 2.1 Let X1, . . . , Xn be iid continuous random variables which are symmetric around 0. Let F denote the distribution function of X1. Describe a test based on a counting statistic for the null hypothesis H0 : F(1) − F(−1) = 1/2.

Exercise 2.2 Consider the regression model Yi = βci + Ei for i = 1, . . . , n, where the Yi are the observations, β ≥ 0 is an unknown parameter, the ci > 0 are known constants, and the Ei are iid continuous random variables with median 0. Describe a test based on a counting statistic for the null hypothesis H0 : β = 0.


Chapter 3

Ranking Statistics

Let Z1, . . . , Zn denote iid real-valued random variables with a continuous distribution function F. Let Z(1) ≤ . . . ≤ Z(n) denote the order statistic of (Z1, . . . , Zn), see example 1.4.

Remark 3.1 The event {Zi = Zj} for j ≠ i produces a tied rank. For independent random variables with continuous distributions, tied ranks are null events, i.e. they have probability 0. We can thus assume without loss of generality that there are no ties and Z(1) < . . . < Z(n).

Definition 3.1 The number R∗i ∈ {1, . . . , n} such that Z(R∗i) = Zi is called the rank of Zi. Let R∗ = (R∗1, . . . , R∗n) denote the vector of ranks.
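Computationally, the ranks can be obtained by sorting twice; a minimal numpy sketch (an editorial addition):

import numpy as np

def ranks(z):
    # argsort of argsort gives the 0-based position of each Z_i in the
    # sorted sample; adding 1 yields ranks R*_i in {1, ..., n}.
    return np.argsort(np.argsort(np.asarray(z))) + 1

print(ranks([0.4, -1.3, 2.2, 0.1]))   # -> [3 1 4 2]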

Theorem 3.1 Let Z1, . . . , Zn denote iid real-valued random variables with a continuous distribution function and R∗ = (R∗1, . . . , R∗n) be the corresponding vector of ranks. Then R∗ is uniformly distributed over the set R of all permutations of {1, . . . , n}.

Proof: Let π = (π1, . . . , πn) ∈ R. Since Z1, . . . , Zn are iid, we obtain

(Zπ1, . . . , Zπn) =d (Z1, . . . , Zn)

where =d means equality in distribution. Hence

P(R∗ = π) = P(Zπ1 < . . . < Zπn) = P(Z1 < . . . < Zn) = P(R∗ = (1, . . . , n))

Since π can be chosen arbitrarily, the statement follows.


Exercise 3.1 Show that P(R∗i = k) = 1/n and

P(R∗i = k, R∗j = l) = 1 / (n(n − 1))

for all k ≠ l ∈ {1, . . . , n} and i ≠ j ∈ {1, . . . , n}.

Corollary 3.1 Let Z1, . . . , Zn denote iid real-valued random variables with a continuous distribution function. Further let R∗ = (R∗1, . . . , R∗n) be the corresponding vector of ranks. Then any statistic V(R∗) based on R∗1, . . . , R∗n is distribution-free over this class of joint distributions for Z1, . . . , Zn.

Definition 3.2 A statistic V(R∗) based on the ranks R∗1, . . . , R∗n of iid real-valued random variables is called a rank statistic.

Example 3.1 (Two-sample location problem) Let X1, . . . , Xm and Y1, . . . , Yn denote iid real-valued random variables with continuous distribution functions F(x) and G(x) = F(x − ∆), respectively, where ∆ ∈ R is an unknown shift parameter. Let R∗ = (Q1, . . . , Qm, R1, . . . , Rn) denote the vector of ranks for (X1, . . . , Xm, Y1, . . . , Yn). Then

W = ∑_{i=1}^n Ri    and    U = ∑_{i=1}^m ∑_{j=1}^n Ψ(Yj − Xi)

are called the Wilcoxon and the Mann-Whitney rank sum statistics, respectively. Under the null hypothesis H0 : ∆ = 0, W is a rank statistic.

Exercise 3.2 Show that W = U + n(n + 1)/2. This implies that under H0 : ∆ = 0, U is a rank statistic, too.
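A minimal numpy sketch computing both statistics (an editorial addition; the normal samples are hypothetical). It also checks the identity W = U + n(n + 1)/2 from exercise 3.2 numerically.

import numpy as np

def wilcoxon_W(x, y):
    # Rank the pooled sample (X_1,...,X_m,Y_1,...,Y_n) and sum the ranks
    # R_1,...,R_n belonging to the Y-observations.
    pooled = np.concatenate([x, y])
    r = np.argsort(np.argsort(pooled)) + 1
    return int(r[len(x):].sum())

def mann_whitney_U(x, y):
    # U counts the pairs (i, j) with Y_j - X_i > 0, i.e. Psi(Y_j - X_i) = 1.
    return sum(int(yj > xi) for xi in x for yj in y)

rng = np.random.default_rng(1)
x, y = rng.normal(size=5), rng.normal(size=4)
W, U, n = wilcoxon_W(x, y), mann_whitney_U(x, y), 4
print(W, U + n * (n + 1) // 2)   # equal, cf. exercise 3.2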

Theorem 3.2 Under H0 : ∆ = 0, the distribution of W is given by

P0(W = k) = t_{m,n}(k) / (n+m choose n)

where t_{m,n}(k) = |{A = {k1, . . . , kn} ⊂ {1, . . . , n + m} : ∑_{i=1}^n ki = k}|.

Remark 3.2 t_{m,n}(k) is the number of subsets of {1, . . . , n + m} with n elements for which the sum of all elements equals k.


Proof: Under H0 : ∆ = 0, the vector (X1, . . . , Xm, Y1, . . . , Yn) consists of iid random variables with a continuous distribution function. By theorem 3.1 the vector of ranks R∗ = (Q1, . . . , Qm, R1, . . . , Rn) is then uniformly distributed over all permutations of {1, . . . , n + m}. In particular, the set {R1, . . . , Rn} of ranks of the Y-observations is uniformly distributed over the n-element subsets of {1, . . . , n + m}. Since (n+m choose n) is the number of all subsets of {1, . . . , n + m} with n elements, we obtain the statement.
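The null distribution can be tabulated directly from the definition of t_{m,n}(k) by enumerating all n-element subsets; a small Python sketch (an editorial addition, exact but only feasible for small m and n):

from itertools import combinations
from math import comb

def wilcoxon_null(m, n):
    # P0(W = k) = t_{m,n}(k) / C(n+m, n); t_{m,n}(k) counts the n-element
    # subsets of {1, ..., n+m} whose elements sum to k.
    counts = {}
    for A in combinations(range(1, n + m + 1), n):
        counts[sum(A)] = counts.get(sum(A), 0) + 1
    return {k: c / comb(n + m, n) for k, c in sorted(counts.items())}

print(wilcoxon_null(4, 3))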

Exercise 3.3 Determine the distribution of W for m = 3 and n = 2.

Theorem 3.3 Under H0 : ∆ = 0, the distribution of W is symmetric around µ = n · (m + n + 1)/2.

Proof: Under H0, the random variables X1, . . . , Xm, Y1, . . . , Yn are iid. Thus, the random variables −X1, . . . , −Xm, −Y1, . . . , −Yn are iid, too. This implies that

(Q1, . . . , Qm, R1, . . . , Rn) =d (N + 1 − Q1, . . . , N + 1 − Qm, N + 1 − R1, . . . , N + 1 − Rn)

where N = m + n. From this we obtain

∑_{i=1}^n Ri =d ∑_{i=1}^n (N + 1 − Ri) = n · (N + 1) − ∑_{i=1}^n Ri

which translates to

W − n(N + 1)/2 =d n(N + 1)/2 − W


Chapter 4

Counting and ranking combined

Recall the function Ψ defined in example 1.3.

Lemma 4.1 Let Z be a continuous random variable that is symmetric around 0.Then the random variables |Z| and Ψ(Z) are independent.

Proof: For all x > 0 we obtain

P(Ψ(Z) = 1, |Z| ≤ x) = P(Z > 0, |Z| ≤ x) = P(0 < Z ≤ x) = (1/2) · P(−x ≤ Z ≤ x)

since Z is continuous and symmetric around 0. Since Z has median 0, the last expression equals

(1/2) · P(|Z| ≤ x) = P(Ψ(Z) = 1) · P(|Z| ≤ x)

The proof for P(Ψ(Z) = 0, |Z| ≤ x) = P(Ψ(Z) = 0) · P(|Z| ≤ x) is analogous and left as an exercise.

Definition 4.1 Let Z1, . . . , Zn be real-valued random variables. The absolute rank of Zi is the rank of |Zi| among |Z1|, . . . , |Zn|. It shall be denoted by R+i. The signed rank of Zi is Ψ(Zi)R+i. A statistic based on Ψ(Z1)R+1, . . . , Ψ(Zn)R+n is called a signed rank statistic.

Remark 4.1

Ψ(Zi)R+i =
    R+i, Zi > 0
    0,   Zi ≤ 0


Theorem 4.1 Let Z1, . . . , Zn be iid continuous real-valued random variables that are symmetric around 0. Let R+ = (R+1, . . . , R+n) denote the vector of absolute ranks of Z1, . . . , Zn. Then the random variables Ψ(Z1), . . . , Ψ(Zn), R+ are independent. Each Ψ(Zi) has a Bernoulli distribution with parameter p = 1/2 and R+ is uniformly distributed over the set R of permutations of {1, . . . , n}.

Proof: By lemma 4.1, independence of Z1, . . . , Zn implies independence of the random variables Ψ(Z1), |Z1|, . . . , Ψ(Zn), |Zn|. Since R+ depends on |Z1|, . . . , |Zn| only, Ψ(Z1), . . . , Ψ(Zn), R+ are independent, too. Since each Zi is continuous and symmetric around 0, it is clear that Ψ(Zi) ∼ Be(1/2) for all i = 1, . . . , n. Finally, theorem 3.1 yields R+ ∼ U(R) as the |Z1|, . . . , |Zn| are iid.

Corollary 4.1 Assume that Z1, . . . , Zn are iid random variables with Z1 ∼ F. Let S denote a statistic based on Ψ(Z1), . . . , Ψ(Zn), R+. Then S is distribution-free over the class of joint distributions ∏_{i=1}^n F, where F is any continuous distribution function with F(−x) = 1 − F(x) for all x > 0.

Example 4.1 Let X1, . . . , Xn be iid continuous real-valued random variables that are symmetric around θ ∈ R, where θ is unknown. For some θ0 ∈ R, define Zi := Xi − θ0 for all i = 1, . . . , n. Let R+ = (R+1, . . . , R+n) denote the vector of absolute ranks for Z1, . . . , Zn. Then

W+ := ∑_{i=1}^n Ψ(Zi)R+i

is called the Wilcoxon signed rank statistic.
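A minimal numpy sketch computing W+ (an editorial addition; the sample and θ0 are hypothetical):

import numpy as np

def w_plus(x, theta0):
    # Absolute ranks R+_i of |Z_i| for Z_i = X_i - theta0; W+ sums the
    # absolute ranks of the observations with Z_i > 0.
    z = np.asarray(x, dtype=float) - theta0
    r_plus = np.argsort(np.argsort(np.abs(z))) + 1
    return int(r_plus[z > 0].sum())

print(w_plus([5.1, 3.8, 6.3, 4.8, 5.7, 4.4], theta0=5.0))   # -> 11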

Theorem 4.2 Under the null hypothesis H0 : θ = θ0, the distribution of W+ is

P0(W+ = k) = cn(k) / 2^n

for all k = 0, 1, . . . , n(n + 1)/2, where cn(k) = |{A ⊂ {1, . . . , n} : ∑_{i∈A} i = k}|.

Remark 4.2 cn(k) is the number of subsets of {1, . . . , n} for which the sum of all elements equals k.

Proof: For ψ ∈ {0, 1}^n and r ∈ R a permutation of {1, . . . , n}, the vector (ψ, r) is called a constellation. Theorem 4.1 yields

P((Ψ(Z1), . . . , Ψ(Zn), R+) = (ψ, r)) = (1/2^n) · (1/n!)

for all constellations (ψ, r). Let A ⊂ {1, . . . , n} with ∑_{i∈A} i = k, and write q := |A|. Further write ψ = (ψ1, . . . , ψn) and r = (r1, . . . , rn). There are (n choose q) · q! · (n − q)! = n! constellations such that A = {ri : ψi = 1}. The definition of cn(k) yields

P(W+ = k) = cn(k) · n! · (1/2^n) · (1/n!) = cn(k) / 2^n

Exercise 4.1 Compute the distribution of W+ for n = 3.

Theorem 4.3 Under H0 : θ = θ0, the distribution of W+ is symmetric around µ = n · (n + 1)/4.

Proof: By theorem 4.1, the random variables Ψ(Z1), . . . , Ψ(Zn), R+ are independent and Ψ(Zi) ∼ Be(1/2) for all i = 1, . . . , n. This implies that the random variables 1 − Ψ(Z1), . . . , 1 − Ψ(Zn), R+ are independent, too, and

(Ψ(Z1), . . . , Ψ(Zn), R+) =d (1 − Ψ(Z1), . . . , 1 − Ψ(Zn), R+)

Hence we obtain

∑_{i=1}^n Ψ(Zi)R+i =d ∑_{i=1}^n (1 − Ψ(Zi))R+i = n(n + 1)/2 − ∑_{i=1}^n Ψ(Zi)R+i

which means that

W+ − n(n + 1)/4 =d n(n + 1)/4 − W+

Exercise 4.2 For random variables X1, . . . , Xn and any i ≤ j ∈ {1, . . . , n}, the average (Xi + Xj)/2 of Xi and Xj is called a Walsh average. Show that W+ equals the number of Walsh averages that are greater than θ0.

Exercise 4.3 Let X and Y be two real-valued random variables and denote the difference by Z := Y − X. Assume that X and Y are symmetric around a and b, respectively. Show that then Z is symmetric around b − a.


Example 4.2 (Treatment effect, paired replicates) Let (X1, Y1), . . . , (Xn, Yn) be iid continuous bivariate random variables. Define Zi := Yi − Xi for all i = 1, . . . , n and let θ ∈ R denote the unknown median of Z1. Then the Wilcoxon signed rank statistic W+ may be used to test the null hypothesis H0 : θ = θ0. Often the Xi are called the pre-treatment variables and the Yi are called the post-treatment variables. Then θ is called the treatment effect.


Chapter 5

Distribution of the order statistic

Let X1, . . . , Xn denote iid real-valued random variables with an absolutely continuous distribution function F, i.e. there is a density function f(x) = F′(x) for all x ∈ R. Let X(1) < . . . < X(n) be the order statistic of X1, . . . , Xn (see example 1.4) and recall remark 3.1.

Theorem 5.1 The joint density function of the order statistic is given by

P(X(1) ∈ dx1, . . . , X(n) ∈ dxn) = n! f(x1) . . . f(xn) dx1 . . . dxn

for all x1 < . . . < xn ∈ R.

Proof: The random variables X1, . . . , Xn are independent with joint density f(x1) . . . f(xn), and each of the n! permutations of {1, . . . , n} leads to the same ordered vector (x1, . . . , xn).

Theorem 5.2 For all i ∈ {1, . . . , n} and x ∈ R,

P(X(i) ∈ dx) = n! / ((i − 1)!(n − i)!) · (F(x))^{i−1} (1 − F(x))^{n−i} f(x) dx

Proof: The event X(i) ∈ dx implies X(k) < x for all k = 1, . . . , i − 1 and X(k) > x + dx for all k = i + 1, . . . , n. This is the result of a trinomial experiment with probability

n! / ((i − 1)! · 1! · (n − i)!) · (F(x))^{i−1} (F(x + dx) − F(x)) (1 − F(x))^{n−i}

The statement now follows from F(x + dx) − F(x) = f(x) dx + o(dx).
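A quick Monte Carlo check of this density formula (an editorial addition; the standard exponential distribution and the values n = 5, i = 2 are arbitrary choices):

import numpy as np
from math import exp, factorial

n, i = 5, 2
rng = np.random.default_rng(2)
# Simulate X_(i) for standard exponential X_j, where F(x) = 1 - exp(-x).
samples = np.sort(rng.exponential(size=(100_000, n)), axis=1)[:, i - 1]

def density(x):
    F, f = 1 - exp(-x), exp(-x)
    return factorial(n) / (factorial(i - 1) * factorial(n - i)) \
        * F**(i - 1) * (1 - F)**(n - i) * f

# The histogram of the simulated X_(i) should lie close to the density.
hist, edges = np.histogram(samples, bins=30, density=True)
mids = (edges[:-1] + edges[1:]) / 2
print(max(abs(h - density(x)) for h, x in zip(hist, mids)))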


Definition 5.1 Let F denote the distribution function of a real-valued random variable X. Define its generalised inverse by

F^{−1}(y) := inf{x ∈ R : F(x) ≥ y}

for 0 < y < 1. The random variable F(X) is called the probability integral transform of X.

Remark 5.1 If F is strictly increasing, then F^{−1} coincides with the usual inverse function of F. If F is continuous, then F(F^{−1}(y)) = y for all 0 < y < 1. In general, F(F^{−1}(y)) ≥ y, as the following example shows.

Example 5.1 Let F be the distribution function of the Dirac measure δ0 in 0, i.e.

F(x) =
    0, x < 0
    1, x ≥ 0

Then F^{−1}(1/2) = 0 and F(F^{−1}(1/2)) = F(0) = 1 > 1/2.

Theorem 5.3 Let X be a continuous random variable with distribution function F. Then the probability integral transform F(X) is uniformly distributed over [0, 1].

Proof: Choose any 0 < y < 1. Since F is continuous and increasing, the implication

X ≤ F^{−1}(y) ⇒ F(X) ≤ F(F^{−1}(y)) = y

holds certainly, i.e. {X ≤ F^{−1}(y)} ⊂ {F(X) ≤ y}. We further observe that P(F(X) = y) = 0, since X is continuous. By definition of F^{−1}, the implication

X > F^{−1}(y) ⇒ F(X) ≥ y

holds certainly, i.e. {F(X) < y} ⊂ {X ≤ F^{−1}(y)}. Altogether we obtain

P(F(X) ≤ y) = P(X ≤ F^{−1}(y)) = F(F^{−1}(y)) = y

Since 0 < y < 1 can be chosen arbitrarily, this yields the statement.
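A simulation sketch of the theorem (an editorial addition; the exponential distribution with rate 2 is an arbitrary choice): applying F to the sample should produce uniform values.

import numpy as np

rng = np.random.default_rng(3)
# X ~ Exp(2), so F(x) = 1 - exp(-2x); a rate of 2 means scale 1/2.
x = rng.exponential(scale=0.5, size=100_000)
u = 1 - np.exp(-2 * x)

# Empirical quantiles of F(X) should match those of U(0, 1).
print(np.quantile(u, [0.1, 0.25, 0.5, 0.75, 0.9]))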


Theorem 5.4 Let X1, . . . , Xn be iid absolutely continuous random variables with distribution function F and X(1) < . . . < X(n) their order statistic. Define Z(i) := F(X(i)) for all i = 1, . . . , n. Then the density function of Z(i) is given by

P(Z(i) ∈ dz) = n! / ((i − 1)!(n − i)!) · z^{i−1} (1 − z)^{n−i} dz

for 0 < z < 1.

Proof: Theorem 5.3 states that F(X1), . . . , F(Xn) are iid with Zi := F(Xi) ∼ U(0, 1) for all i = 1, . . . , n. Since F is increasing, Z(1) < . . . < Z(n) is the order statistic of Z1, . . . , Zn. Now the statement follows from theorem 5.2.

Remark 5.2 The distribution in the above theorem is a beta distribution with positive integer parameters i and n − i + 1.


Chapter 6

The empirical distribution

Let X1, . . . , Xn be iid real-valued random variables with an unknown distribution function G. Their empirical distribution function is defined as

Fn(x) := (1/n) ∑_{i=1}^n 1_{Xi ≤ x}

for all x ∈ R, where

1_A :=
    1, if A is true
    0, if A is false

is the indicator function of an event A.

Remark 6.1 The Xi can be seen as independent observations from a population with distribution function G. Then Fn(x) is the number of observations of size at most x divided by the sample size n.

Remark 6.2 Under a null hypothesis H0 : G = F, where F is a known distribution function, we obtain P0(Xi ≤ x) = F(x) for all x ∈ R and i = 1, . . . , n. Thus for a fixed x ∈ R, the random variable n · Fn(x) has a binomial distribution with parameters n and F(x). This implies E(Fn(x)) = F(x) and Var(Fn(x)) = F(x) · (1 − F(x))/n.
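A minimal numpy sketch of the empirical distribution function (an editorial addition; the normal sample is hypothetical):

import numpy as np

def ecdf(sample):
    # Returns F_n as a function: F_n(t) = #{i : X_i <= t} / n, computed by
    # binary search in the sorted sample.
    x, n = np.sort(np.asarray(sample)), len(sample)
    return lambda t: np.searchsorted(x, t, side="right") / n

rng = np.random.default_rng(4)
Fn = ecdf(rng.normal(size=1000))
print(Fn(0.0))   # close to F(0) = 1/2 for the standard normal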

Exercise 6.1 Let Y be a real-valued random variable and h(y) ≥ 0 an increasing function. Show that for any ε > 0,

P(Y ≥ ε) ≤ E(h(Y)) / h(ε)


Lemma 6.1 Let Y and (Yn : n ≥ 1) be real-valued random variables. If

∑_{n=1}^∞ P(|Yn − Y| ≥ ε) < ∞

for all ε > 0, then Yn → Y for n → ∞ with probability 1.

Proof: The statement P(lim_{n→∞} Yn = Y) = 1 holds if for every δ, ε > 0 there is an m ∈ N such that

P(|Yn − Y| < ε for all n ≥ m) > 1 − δ

This is equivalent to

P(⋃_{n≥m} {|Yn − Y| ≥ ε}) < δ

Choose any δ, ε > 0. By assumption, ∑_{n=1}^∞ P(|Yn − Y| ≥ ε) < ∞, which implies that there is an m ∈ N such that

∑_{n=m}^∞ P(|Yn − Y| ≥ ε) < δ

Now the statement follows from

P(⋃_{n≥m} {|Yn − Y| ≥ ε}) ≤ ∑_{n=m}^∞ P(|Yn − Y| ≥ ε)

Theorem 6.1 (Borel) Under H0 : G = F and for any x ∈ R,

Fn(x) → F(x) for n → ∞

holds with probability 1.

Proof: Define the random variables Yn := |Fn(x) − F(x)| and the increasing function h(y) := y^4 for all y ≥ 0. Then exercise 6.1 yields

P(|Fn(x) − F(x)| ≥ ε) ≤ (1/ε^4) · E((Fn(x) − F(x))^4)

Write p := F(x) and q := 1 − F(x). Since Z := nFn(x) has a binomial distribution with parameters n and p, the expectation on the right-hand side is related to the fourth centralised moment of Z as

E((Fn(x) − F(x))^4) = (1/n^4) · E((Z − np)^4) = (1/n^4) · (3(npq)^2 + npq(1 − 6pq))

see e.g. [1], p. 110, for the last equality. Thus we obtain a bound

E((Fn(x) − F(x))^4) = (1/n^2) · (3(pq)^2 + pq(1 − 6pq)/n) ≤ C/n^2

which holds for all n ≥ 1 with the constant C := 3(pq)^2 + pq|1 − 6pq|. Hence

P(|Fn(x) − F(x)| ≥ ε) ≤ C / (ε^4 n^2)

for all ε > 0 and n ≥ 1. Since ∑_{n=1}^∞ 1/n^2 < ∞, the statement follows from lemma 6.1 with Yn := Fn(x) and Y := F(x).

Theorem 6.2 (Glivenko-Cantelli) Under H0 : G = F and if F is continuous, then

lim_{n→∞} sup_{x∈R} |Fn(x) − F(x)| = 0

with probability 1.

Proof: Let r ∈ N and k ∈ {1, . . . , r − 1}. Define x_{r,k} := inf{x ∈ R : F(x) = k/r} and further x_{r,r} := ∞ as well as x_{r,0} := −∞. Then for k = 1, . . . , r − 2 and x ∈ [x_{r,k}, x_{r,k+1}[ we obtain the bound

Fn(x) − F(x) ≤ Fn(x_{r,k+1}) − F(x_{r,k})
            = Fn(x_{r,k+1}) − F(x_{r,k+1}) + F(x_{r,k+1}) − F(x_{r,k})
            ≤ Fn(x_{r,k+1}) − F(x_{r,k+1}) + 1/r

Analogously one arrives at

Fn(x) − F(x) ≥ Fn(x_{r,k}) − F(x_{r,k}) − 1/r

For x < x_{r,1} the same arguments yield

−1/r ≤ Fn(x) − F(x) ≤ Fn(x_{r,1}) − F(x_{r,1}) + 1/r

while for x ≥ x_{r,r−1} we obtain

Fn(x_{r,r−1}) − F(x_{r,r−1}) − 1/r ≤ Fn(x) − F(x) ≤ 1/r

Thus for any r ∈ N,

sup_{x∈R} |Fn(x) − F(x)| ≤ max_{1≤k≤r−1} |Fn(x_{r,k}) − F(x_{r,k})| + 1/r     (6.1)

For all 1 ≤ k ≤ r − 1, define the events

E_{k,r} := {lim_{n→∞} |Fn(x_{r,k}) − F(x_{r,k})| = 0}

Theorem 6.1 yields P0(E_{k,r}) = 1 for all 1 ≤ k ≤ r − 1. Defining E_r := ⋂_{k=1}^{r−1} E_{k,r}, this implies P0(E_r) = 1 for all r ∈ N. For the limit E := ⋂_{r=1}^∞ E_r we then obtain P0(E) = 1. Due to the bound (6.1), this implies the statement.

Definition 6.1 The Kolmogorov-Smirnov (one-sample) statistics are defined as

D+n := sup_{x∈R} (Fn(x) − F(x)),    D−n := sup_{x∈R} (F(x) − Fn(x))

and

Dn := sup_{x∈R} |Fn(x) − F(x)| = max{D+n, D−n}

Theorem 6.3 Assume that F is continuous. Then the statistics D+n, D−n, Dn are distribution-free under H0 : G = F.

Proof: By definition,

Fn(x) =
    0,   x < X(1)
    k/n, X(k) ≤ x < X(k+1), k = 1, . . . , n − 1
    1,   x ≥ X(n)

where X(1) < . . . < X(n) is the order statistic of X1, . . . , Xn. Define X(0) := −∞ and X(n+1) := ∞. Then

D+n = sup_{x∈R} (Fn(x) − F(x)) = max_{0≤k≤n} sup_{X(k)≤x<X(k+1)} (k/n − F(x)) = max_{0≤k≤n} (k/n − F(X(k)))

since F is increasing. Analogously,

D−n = max_{1≤k≤n+1} (F(X(k)) − (k − 1)/n)

Finally Dn = max{D+n, D−n}. Since F is continuous, theorem 5.4 states that the F(X(k)) are distribution-free under H0, which implies the statement.
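These finite maxima make the statistics easy to compute; a minimal numpy sketch (an editorial addition, testing a hypothetical uniform sample against F = U(0, 1)):

import numpy as np

def ks_statistics(sample, F):
    # Uses the formulas from the proof: with X_(1) < ... < X_(n),
    # D+_n = max_k (k/n - F(X_(k))) and D-_n = max_k (F(X_(k)) - (k-1)/n),
    # where the boundary terms k = 0 and k = n + 1 reduce to 0.
    x = np.sort(np.asarray(sample))
    n, k = len(x), np.arange(1, len(x) + 1)
    Fx = F(x)
    d_plus = max(0.0, float(np.max(k / n - Fx)))
    d_minus = max(0.0, float(np.max(Fx - (k - 1) / n)))
    return d_plus, d_minus, max(d_plus, d_minus)

rng = np.random.default_rng(5)
sample = rng.uniform(size=200)
print(ks_statistics(sample, lambda t: np.clip(t, 0.0, 1.0)))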

Remark 6.3 For exact expressions of the null distribution of D+n, D−n, and Dn, see [3], section 3.5.b.


Bibliography

[1] N. L. Johnson, S. Kotz, and A. W. Kemp. Univariate discrete distributions, 3rd ed. Wiley, 2005.

[2] E. Manoukian. Mathematical nonparametric statistics. Gordon and Breach, 1986.

[3] R. Randles and D. Wolfe. Introduction to the theory of nonparametric statistics. Wiley, 1979.
