Testing for Serial Correlation by means of Extreme Values

Ishay Weissman

Technion - Israel Institute of Technology

ieriw01@ie.technion.ac.il

Vimeiro 2013

A typical lecture in statistics begins as

follows:

Let X1, X2, · · · , Xn

be an iid sample from some df F ...

I will open with

Ivette, Ivette Jr., Tiago de Oliveira

Vimeiro, 1983

And now, let

X1, X2, · · · , Xn

be a sample from a continuous df F0 and F0

is U [0,1]

(if not, replace Xi by F0(Xi)).

We suspect the data exhibit some serial

correlation (dependence).

The main purpose of this work is to study

the effectiveness of the LARGEST

SPACING (LS) as a tool to detect serial

dependence.

OVERVIEW

- Background on Spacings

- Possible Competitors

- Autoregressive Model and a Surprising

Connection to Extreme Values

- Power Comparisons

- Two More Models

- Conclusions

Want to test

H0 : "iid-uniform"

There is no optimal test against all possible

alternatives !!!

Concentrate on Autoregressive Model

Xi = ρXi−1 + (1 − ρ)Ui

(1 ≤ i ≤ n , 0 ≤ ρ ≤ 1),

{Ui : i ≥ 0} is an iid-U [0,1] sequence,

X0 = U0.

So, here we test

H0 : ρ = 0 vs. H1 : ρ > 0 .
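
The autoregressive recursion above can be simulated directly. The following is a minimal sketch, not the code used for the talk; the function name `simulate_ar` and the seed are illustrative choices.

```python
import random

def simulate_ar(n, rho, rng):
    """Return [X_1, ..., X_n] with X_i = rho*X_{i-1} + (1-rho)*U_i and X_0 = U_0."""
    x_prev = rng.random()                      # X_0 = U_0
    xs = []
    for _ in range(n):
        x_prev = rho * x_prev + (1.0 - rho) * rng.random()
        xs.append(x_prev)
    return xs

rng = random.Random(2013)                      # arbitrary seed, for reproducibility
sample_h0 = simulate_ar(1000, 0.0, rng)        # rho = 0: iid U[0,1]
sample_h1 = simulate_ar(1000, 0.3, rng)        # rho > 0: serially dependent
print(min(sample_h0), max(sample_h0))
print(min(sample_h1), max(sample_h1))
```

With ρ = 0 the recursion returns the Ui themselves, so H0 is recovered as a special case.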

SPACINGS

Let Y1 ≤ Y2 ≤ · · · ≤ Yn (Y0 ≡ 0, Yn+1 ≡ 1)

be the order statistics of the {Xi} and let

Vi = Yi − Yi−1 (i = 1,2, · · · , n + 1)

be the spacings and Vmax be the largest.

When ρ = 0, for 0 ≤ y ≤ 1,

$$P\{V_{\max} \le y\} = \sum_{j=0}^{n+1} (-1)^j \binom{n+1}{j} \{(1 - jy)_+\}^n$$

(Whitworth (1897), Darling (1953)).
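
The exact null distribution of Vmax can be evaluated term by term. This is a sketch under the assumption that n is moderate: the sum alternates in sign, so plain floating point loses accuracy for large n. The function name `largest_spacing_cdf` is illustrative.

```python
from math import comb

def largest_spacing_cdf(y, n):
    """P{Vmax <= y} for n iid U[0,1] points (n+1 spacings), by the alternating sum."""
    total = 0.0
    for j in range(n + 2):                     # j = 0, 1, ..., n+1
        base = max(1.0 - j * y, 0.0)           # (1 - j*y)_+
        if base == 0.0 and j > 0:
            break                              # every later term vanishes as well
        total += (-1) ** j * comb(n + 1, j) * base ** n
    return total

# For n = 1 the largest spacing is max(X, 1 - X), so the cdf is 2y - 1 on [1/2, 1].
print(largest_spacing_cdf(0.75, 1))
print(largest_spacing_cdf(0.10, 20))
```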

If E1, E2, · · · , En+1 are iid unit-exponential and

$$T_{n+1} = \sum_{i=1}^{n+1} E_i ,$$

then

$$(V_1, V_2, \dots, V_{n+1}) \overset{D}{=} \frac{(E_1, E_2, \dots, E_{n+1})}{T_{n+1}} = \frac{n+1}{T_{n+1}} \cdot \frac{(E_1, E_2, \dots, E_{n+1})}{n+1},$$

independent of Tn+1. Since

Tn+1/(n + 1) → 1 a.s., for large n, the

spacings behave (approximately) as iid

exponential (λ = n + 1).

Hence, for −∞ < x < ∞,

$$\lim_{n\to\infty} P\{(n+1)V_{\max} - \log(n+1) \le x\} = \exp\{-e^{-x}\},$$

i.e. attraction to the Gumbel distribution.
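
Inverting the Gumbel limit gives an approximate critical value for the LS test, cα ≈ (log(n + 1) − log(−log(1 − α)))/(n + 1). The sketch below assumes this approximation is used (the slides do not say how cα was obtained) and checks the rejection rate under H0 by a small Monte Carlo run. Function names, seed and replication count are illustrative.

```python
import math
import random

def gumbel_critical_value(n, alpha):
    """Approximate c_alpha with P{Vmax > c_alpha} close to alpha under H0."""
    x_alpha = -math.log(-math.log(1.0 - alpha))    # Gumbel (1 - alpha)-quantile
    return (math.log(n + 1) + x_alpha) / (n + 1)

def largest_spacing(xs):
    """Largest gap between consecutive order statistics, with 0 and 1 as end points."""
    ys = [0.0] + sorted(xs) + [1.0]
    return max(b - a for a, b in zip(ys, ys[1:]))

# Monte Carlo check of the rejection rate under H0 (iid uniform).
rng = random.Random(0)
n, alpha, reps = 200, 0.05, 2000
c = gumbel_critical_value(n, alpha)
rejections = sum(
    largest_spacing([rng.random() for _ in range(n)]) > c for _ in range(reps)
)
print(c, rejections / reps)
```

Because the limit is only approached as n grows, the observed rate may differ somewhat from the nominal α for finite n.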

We want to compare the power of LS with that of some competitors, that is, the power of the test which rejects H0 when Vmax > cα with the powers of tests based on:

- Likelihood ratio (LR)

- Sample serial correlation (SSC)

- Kolmogorov-Smirnov (K-S)

LR: Most powerful, used as a benchmark to see how close LS is to LR.

SSC: Least squares estimator of ρ, intuitive.

K-S: Very popular, similar in nature: K-S uses an extreme vertical distance, LS an extreme horizontal distance.
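
The three data-driven statistics can be computed as follows. This is a sketch: the slides do not give the exact form of the SSC statistic, so the lag-1 sample correlation used here is an assumption, and all function names are illustrative.

```python
def largest_spacing(xs):
    """LS: largest horizontal gap between neighbouring order statistics (with 0 and 1)."""
    ys = [0.0] + sorted(xs) + [1.0]
    return max(b - a for a, b in zip(ys, ys[1:]))

def ks_statistic(xs):
    """K-S: largest vertical distance between the empirical df and the U[0,1] df."""
    ys = sorted(xs)
    n = len(ys)
    return max(max((i + 1) / n - y, y - i / n) for i, y in enumerate(ys))

def sample_serial_correlation(xs):
    """SSC: lag-1 sample serial correlation (assumed form, see the note above)."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

data = [0.1, 0.4, 0.9]
print(largest_spacing(data), ks_statistic(data), sample_serial_correlation(data))
```

Note that LS and K-S depend on the data only through the order statistics, while SSC uses the time order of the observations.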

[Figure: Empirical df vs. Uniform df, two panels, both axes from 0.0 to 1.0. Panel captions: "LS and K-S occur together"; "K-S is large due to accumulation".]

LIKELIHOOD RATIO

Denote

X = (X1, X2, · · · , Xn)

U = (U1, U2, · · · , Un)

and let U0 = X0 = x0 ∈ [0,1]. Then

Ui = (Xi − ρXi−1)(1 − ρ)^{−1} (1 ≤ i ≤ n).

The Jacobian of U ↦ X is (1 − ρ)^{−n}.

Hence the joint density of X, conditioned on

U0 = X0 = x0, at x ∈ [0,1]n, is given by

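$$f_X(x) = (1-\rho)^{-n} \prod_{i=1}^{n} I\{\rho x_{i-1} \le x_i \le \rho x_{i-1} + 1 - \rho\}$$

$$= (1-\rho)^{-n} \, I\Big\{\rho \le \min_{1 \le i \le n} \min\Big(\frac{x_i}{x_{i-1}}, \frac{1-x_i}{1-x_{i-1}}\Big)\Big\}.$$

If we define

$$T_i = \min\Big\{\frac{X_i}{X_{i-1}}, \frac{1-X_i}{1-X_{i-1}}\Big\} \ (\in [\rho, 1]) \quad (1 \le i \le n), \qquad T_{\min} = \min_{1 \le i \le n} T_i ,$$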

then the following facts follow from Slide 11:

Fact 1. The {Ti} are iid uniform on [ρ,1].

Fact 2. The likelihood function is given by

L(ρ) = (1 − ρ)^{−n} I{ρ ≤ Tmin} (0 ≤ ρ ≤ 1).

Fact 3. The statistic Tmin is sufficient with

respect to ρ and it is the maximum likelihood

estimator (MLE) of ρ.

Fact 4. For testing

H0 : ρ = 0 vs. H1 : ρ > 0

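the test which rejects H0 when

$$T_{\min} > c_\alpha = 1 - \alpha^{1/n}$$

is the most powerful α-level test, with power given by

$$\pi_\alpha(\rho) = \begin{cases} \alpha (1-\rho)^{-n} & \text{if } \rho \le c_\alpha , \\ 1 & \text{if } \rho \ge c_\alpha . \end{cases}$$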
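
The most powerful test is easy to compute. The sketch below evaluates Tmin, the critical value cα = 1 − α^{1/n} and the power α(1 − ρ)^{−n} implied by Fact 1 (the Ti are iid uniform on [ρ, 1]), and compares that power with a small simulation. Function names, seed and replication count are illustrative.

```python
import random

def t_min(xs, x0):
    """Tmin = min over i of min{X_i/X_{i-1}, (1-X_i)/(1-X_{i-1})}, with X_0 = x0."""
    prev, best = x0, 1.0
    for x in xs:
        # Division by zero needs X_{i-1} in {0, 1}, a probability-zero event.
        best = min(best, x / prev, (1.0 - x) / (1.0 - prev))
        prev = x
    return best

def lr_critical_value(n, alpha):
    return 1.0 - alpha ** (1.0 / n)

def lr_power(rho, n, alpha):
    """alpha * (1 - rho)^(-n) for rho below c_alpha, and 1 from c_alpha on."""
    if rho >= lr_critical_value(n, alpha):
        return 1.0
    return alpha / (1.0 - rho) ** n

# Small simulation check of the power formula.
rng = random.Random(11)
n, alpha, rho, reps = 20, 0.05, 0.05, 4000
c = lr_critical_value(n, alpha)
hits = 0
for _ in range(reps):
    x0 = rng.random()
    xs, prev = [], x0
    for _ in range(n):
        prev = rho * prev + (1.0 - rho) * rng.random()
        xs.append(prev)
    hits += t_min(xs, x0) > c
print(hits / reps, lr_power(rho, n, alpha))
```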

Interesting case:

a sample extreme (minimum) is most

powerful for testing existence of serial

correlation !!!

POWER COMPARISONS

For each pair ρ, n we generated 10^5 samples

from the autoregressive model and computed

the (empirical) power, namely, the proportion

of samples for which H0 : ρ = 0 was rejected.

The significance level is α = .05 in all cases.
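
The empirical power of the LS test can be estimated along these lines. This is a rough sketch, not the original simulation code: it uses far fewer replications than 10^5, and it assumes the Gumbel approximation for the critical value, which the slides do not specify. All names are illustrative.

```python
import math
import random

def simulate_ar(n, rho, rng):
    """X_i = rho*X_{i-1} + (1-rho)*U_i with X_0 = U_0."""
    x_prev = rng.random()
    xs = []
    for _ in range(n):
        x_prev = rho * x_prev + (1.0 - rho) * rng.random()
        xs.append(x_prev)
    return xs

def largest_spacing(xs):
    ys = [0.0] + sorted(xs) + [1.0]
    return max(b - a for a, b in zip(ys, ys[1:]))

def empirical_power(n, rho, alpha, reps, rng):
    """Proportion of simulated samples for which the LS test rejects H0."""
    # Assumed critical value: Gumbel approximation to the null law of Vmax.
    c = (math.log(n + 1) - math.log(-math.log(1.0 - alpha))) / (n + 1)
    hits = sum(largest_spacing(simulate_ar(n, rho, rng)) > c for _ in range(reps))
    return hits / reps

rng = random.Random(105)
for rho in (0.0, 0.1, 0.2, 0.4):
    print(rho, empirical_power(100, rho, 0.05, 2000, rng))
```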

[Figure: Power Functions, Autoregressive Model, α = .05. One panel for each n = 10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000. LR (blue), LS (black), K-S (red), SSC (green).]

To be fair to Kolmogorov-Smirnov, we have

run similar simulations on samples from beta

models beta(γ, 1), namely

Xi = Ui^{1/γ}.

Independent, but not uniform.

H0 : γ = 1 vs. H1 : γ > 1

H0 : γ = 1 vs. H2 : γ < 1 .

[Figure: Power Functions, Beta Model, α = .05. Panels for n = 10, 50, 100, 1000; horizontal axis 1/gamma, from 0.5 to 2.0. LR (blue), LS (black), K-S (red).]

LR here refers to the likelihood ratio test for this model (∗), namely the most powerful test.

K-S tends to the optimum, while LS stays far below.

(∗) Reject H0 vs. γ > 1 when −2Σ log Xi < χ²_{2n}(.05)

Reject H0 vs. γ < 1 when −2Σ log Xi > χ²_{2n}(.95).
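
The two one-sided LR tests for the beta model can be sketched as below. The Python standard library has no chi-square quantile function, so this sketch substitutes the Wilson-Hilferty approximation for the exact quantiles; that substitution and all function names are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist

def chi2_quantile_wh(p, df):
    """Wilson-Hilferty approximation to the chi-square p-quantile (not exact)."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * math.sqrt(a)) ** 3

def beta_lr_statistic(xs):
    """-2 * sum(log X_i), chi-square with 2n degrees of freedom under H0."""
    return -2.0 * sum(math.log(x) for x in xs)

def reject_gamma_greater(xs, alpha=0.05):
    """Reject H0: gamma = 1 vs. gamma > 1 when the statistic is small."""
    return beta_lr_statistic(xs) < chi2_quantile_wh(alpha, 2 * len(xs))

def reject_gamma_smaller(xs, alpha=0.05):
    """Reject H0: gamma = 1 vs. gamma < 1 when the statistic is large."""
    return beta_lr_statistic(xs) > chi2_quantile_wh(1.0 - alpha, 2 * len(xs))

rng = random.Random(19)
gamma = 3.0
# 1 - random() lies in (0, 1], which keeps log() well defined.
sample = [(1.0 - rng.random()) ** (1.0 / gamma) for _ in range(50)]
print(beta_lr_statistic(sample), reject_gamma_greater(sample))
```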

TWO MORE MODELS

Binomial Model:

Let B1, B2, · · · be an iid Bernoulli sequence with

parameter p, independent of the {Ui}

sequence.

Define

Yi = BiYi−1 + (1 − Bi)Ui

( i ≥ 1 , Y0 = U0 )

Notice, the marginal distribution of Yi is

U [0,1],

the first serial correlation, P{Yi = Yi+1}

and the extremal index, all three are equal

to p. Clusters of equal neighbors are of

random (geometric) length.

Moving-max model:

Let ξ1, ξ2, · · · be a sequence of iid β(k^{−1}, 1)

random variables, where k is a fixed positive

integer. Let

Zi = max{ξi, ξi+1, · · · , ξi+k−1} (i ≥ 1).

The Z-sequence is called a moving-max

sequence of order k. For each i, Zi is

U [0,1]-distributed but neighboring values are

dependent. Upper extreme values appear in

clusters of size k, which implies that the

extremal index is equal to k^{−1}.

For k = 2, the first serial correlation is 3/7

and P{Zi = Zi+1} = 1/3.
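
Both models are simple to simulate, which gives a quick check of the tie probabilities quoted above. This is a sketch with illustrative function names and seeds; it uses the fact that U^k is β(k^{−1}, 1)-distributed when U is U[0,1].

```python
import random

def simulate_binomial_model(n, p, rng):
    """Y_i = B_i*Y_{i-1} + (1-B_i)*U_i with B_i ~ Bernoulli(p) and Y_0 = U_0."""
    y = rng.random()                   # Y_0 = U_0
    ys = []
    for _ in range(n):
        if rng.random() >= p:          # B_i = 0: draw a fresh uniform
            y = rng.random()
        ys.append(y)                   # B_i = 1: repeat the previous value
    return ys

def simulate_moving_max(n, k, rng):
    """Z_i = max(xi_i, ..., xi_{i+k-1}) with xi = U**k, i.e. beta(1/k, 1)."""
    xi = [rng.random() ** k for _ in range(n + k - 1)]
    return [max(xi[i:i + k]) for i in range(n)]

def tie_fraction(xs):
    """Proportion of neighbouring pairs with equal values."""
    return sum(a == b for a, b in zip(xs, xs[1:])) / (len(xs) - 1)

ys = simulate_binomial_model(20000, 1 / 3, random.Random(3))
zs = simulate_moving_max(20000, 2, random.Random(4))
print(tie_fraction(ys), tie_fraction(zs), sum(zs) / len(zs))
```

Both tie fractions should come out near 1/3, which is why the binomial model with p = .333 is a natural companion to Moving-Max(2).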

[Figure: Scatter points (i, Yi) and (i, Zi), i from 0 to 100. Panels: Binomial, p = .333; Moving-Max(2).]

The two plots look very similar. In both

cases, the experienced practitioner will reject

the independence hypothesis just on the

basis of the fact that for continuous random

variables, the probability of a tie is 0. We

include these cases to see how well the LS

and K-S tests detect the dependence.

[Figure: Power Functions, alpha = .05. Panels: Moving-Max(2); Moving-Max(3). Logarithmic scale from 1 to 10000, LS (black), K-S (red).]

K-S test is not consistent !

(Similar results for the Binomial Model.)

CONCLUSION

- We presented here evidence (not a

theorem) that the largest spacing is quite

sensitive to serial dependence.

- K-S is more sensitive to deviation from

"uniform distribution".

- As a byproduct, in the Autoregressive

Model, the optimal test for serial correlation

is based on lower extremes.

THANK YOU FOR

YOUR ATTENTION

SEE YOU ALL

IN VIMEIRO 2043
