
Empirical Processes: Lecture 19 Spring, 2014

Introduction to Empirical Processes and Semiparametric Inference

Lecture 19: M-Estimators

Yair Goldberg, Ph.D., and Michael R. Kosorok, Ph.D.

University of North Carolina-Chapel Hill


M-Estimators

M-estimators are (approximate) maximizers (or minimizers) θn of objective functions θ ↦ Mn(θ).

Examples include:

• maximum likelihood estimators

• least squares estimators

• least absolute deviation estimators


Usually the objective function θ ↦ Mn(θ) is an empirical (data-generated) process, while θ ↦ M(θ) is a limiting process of some kind.

Often,

Mn(θ) = Pn mθ,

where {mθ : θ ∈ Θ} is a class of measurable functions x ↦ mθ(x) on the sample space X.
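To make the notation concrete, here is a minimal numerical sketch (the normal sample, the grid, and the helper name `M_n` are our own illustrative choices, not from the lecture) showing the sample mean and median arising as maximizers of Mn(θ) = Pn mθ for the least-squares and least-absolute-deviation criteria:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=5000)

def M_n(m_theta, thetas):
    # Mn(theta) = Pn m_theta = (1/n) * sum_i m_theta(X_i), evaluated on a grid
    return np.array([np.mean(m_theta(x, t)) for t in thetas])

thetas = np.linspace(0.0, 4.0, 801)   # grid spacing 0.005

# Least squares: m_theta(x) = -(x - theta)^2, maximized near the sample mean
ls_hat = thetas[np.argmax(M_n(lambda x, t: -(x - t) ** 2, thetas))]
# Least absolute deviation: m_theta(x) = -|x - theta|, maximized near the median
lad_hat = thetas[np.argmax(M_n(lambda x, t: -np.abs(x - t), thetas))]

print(ls_hat, x.mean())        # grid argmax vs. exact maximizer (sample mean)
print(lad_hat, np.median(x))   # grid argmax vs. exact maximizer (sample median)
```

Both grid maximizers agree with the closed-form maximizers up to the grid resolution.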

The Argmax theorem studies the limiting distribution of M-estimators

through the limiting behavior of the associated objective functions.


The Argmax Theorem

Let Mn, M be stochastic processes indexed by a metric space H. Assume:

(A) The sample paths h ↦ M(h) are upper semicontinuous and possess a unique maximum at a (random) point h which, as a random map in H, is tight.

(B) Mn ⇝ M in ℓ∞(K) for every compact K ⊂ H.

(C) The sequence hn is uniformly tight and satisfies Mn(hn) ≥ sup_{h∈H} Mn(h) − oP(1).

Then hn ⇝ h in H.


A sequence Xn is asymptotically tight if for every ε > 0 there is a compact set K such that lim inf P∗(Xn ∈ K^δ) > 1 − ε for every δ > 0, where K^δ = {x : d(x, K) < δ}.

A sequence Xn is uniformly tight if for every ε > 0 there is a compact set K such that P(Xn ∈ K) > 1 − ε.

In R^p, Xn is asymptotically tight if and only if Xn is uniformly tight.


Rate of Convergence

Let θ ↦ M(θ) be twice differentiable at a point of unique maximum θ0.

Then ∂M(θ0)/∂θ = 0, while ∂²M(θ0)/∂θ² is negative definite.

Hence we can expect that

M(θ) − M(θ0) ≤ −c d²(θ, θ0)

for some c > 0 in a neighborhood of θ0.
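For instance, in the normal location model with mθ(x) = −(x − θ)², the limit M(θ) = −(θ − θ0)² − 1 is available in closed form, and the quadratic drop can be checked directly (a small sketch; the constant c = 0.9, sitting below the curvature 1, is an arbitrary choice of ours):

```python
import numpy as np

theta0 = 1.0

def M(theta):
    # M(theta) = E[-(X - theta)^2] for X ~ N(theta0, 1) equals -(theta - theta0)^2 - 1
    return -((theta - theta0) ** 2) - 1.0

thetas = np.linspace(theta0 - 0.5, theta0 + 0.5, 101)
drop = M(thetas) - M(theta0)            # nonpositive, zero only at theta0
bound = -0.9 * (thetas - theta0) ** 2   # -c * d^2(theta, theta0) with c = 0.9

print(bool(np.all(drop <= bound + 1e-12)))  # True: the quadratic bound holds
```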


Sometimes we replace the metric d by a function

d̃ : Θ × Θ → [0, ∞)

that satisfies d̃(θn, θ0) → 0 whenever d(θn, θ0) → 0.

This is useful, for example, when different parameters of the model have

different rates of convergence.


The modulus of continuity of a stochastic process {X(t) : t ∈ T} is defined by

mX(δ) ≡ sup_{s,t∈T: d(s,t)≤δ} |X(s) − X(t)| .

An upper bound for the rate of convergence of an M-estimator can be

obtained from the modulus of continuity of Mn −M at θ0.


Theorem 14.4: Rate of convergence

Let Mn be a sequence of stochastic processes indexed by a semimetric

space (Θ, d) and M : Θ → R a deterministic function.

Assume that

(A) There exists a c1 > 0 such that, for every θ in a neighborhood of θ0,

M(θ) − M(θ0) ≤ −c1 d²(θ, θ0).


(B) For all n large enough and sufficiently small δ, the centered process Mn − M satisfies

E∗ sup_{d(θ,θ0)<δ} √n |(Mn − M)(θ) − (Mn − M)(θ0)| ≤ c2 φn(δ),

for c2 < ∞ and functions φn such that δ ↦ φn(δ)/δ^α is decreasing for some α < 2 not depending on n.


(C) The sequence θn converges in outer probability to θ0 and satisfies

Mn(θn) ≥ sup_{θ∈Θ} Mn(θ) − OP(rn^{−2})

for some sequence rn that satisfies

rn² φn(1/rn) ≤ c3 √n, for every n and some c3 < ∞.

Then

rn d(θn, θ0) = OP(1).


Remark

The “modulus of continuity” of the empirical process gives an upper bound

on the rate.

When φn(δ) = δ^α, the rate is at least n^{1/(4−2α)}. For φn(δ) = δ we get the √n rate.

Proof

We assume for simplicity that θn maximizes Mn(θ) and that d̃ = d.

Our goal is to show that rn d(θn, θ0) = OP(1). This is equivalent to showing that for every ε > 0 there is a constant K such that P∗(rn d(θn, θ0) > 2^K) < ε for all n large enough.

For each n, the parameter space (minus the point θ0) can be partitioned into “peels”

Sj,n = {θ : 2^{j−1} < rn d(θ, θ0) ≤ 2^j},

with j ranging over the integers.


Fix η > 0 small enough that

M(θ) − M(θ0) ≤ −c1 d²(θ, θ0) for all θ with d(θ, θ0) < η,

and such that for all δ < η,

E∗ sup_{d(θ,θ0)<δ} √n |(Mn − M)(θ) − (Mn − M)(θ0)| ≤ c2 φn(δ).

Such an η exists by Assumptions (A) and (B).


Note that if rn d(θn, θ0) > 2^K for a given integer K, then θn is in one of the peels Sj,n with j > K.

Thus

P∗( rn d(θn, θ0) > 2^K )
≤ ∑_{j≥K, 2^j≤ηrn} P∗( sup_{θ∈Sj,n} [Mn(θ) − Mn(θ0)] ≥ 0 ) + P∗( 2 d(θn, θ0) ≥ η ).


By Assumption (A), for every θ ∈ Sj,n with 2^j ≤ ηrn,

M(θ) − M(θ0) ≤ −c1 d²(θ, θ0) ≤ −c1 2^{2j−2} rn^{−2}.

By Assumption (B), Markov’s inequality, and the fact that φn(cδ) ≤ c^α φn(δ) for every c > 1,

P∗( sup_{θ∈Sj,n} |(Mn − M)(θ) − (Mn − M)(θ0)| ≥ c1 2^{2j−2} rn^{−2} )
≤ c2 φn(2^j/rn) rn² / ( √n c1 2^{2j−2} )
≤ 2 c2 c3 2^{jα−2j+2} / c1.


Summarizing,

P∗( rn d(θn, θ0) > 2^K )
≤ ∑_{j≥K, 2^j≤ηrn} P∗( sup_{θ∈Sj,n} [Mn(θ) − Mn(θ0)] ≥ 0 ) + P∗( 2 d(θn, θ0) ≥ η )
≤ ∑_{j>K} 2 c2 c3 2^{jα−2j+2} / c1 + P∗( 2 d(θn, θ0) ≥ η ).

Since α < 2, the series ∑_j 2^{j(α−2)} converges, so the first term is smaller than ε for all K large enough. The second term is smaller than ε for all n large enough since θn is consistent. This proves that rn d(θn, θ0) = OP(1). □


Regular Euclidean M-Estimators

Let mθ : X → R, where θ ∈ Θ ⊂ R^p.

Let Mn(θ) = Pn mθ and M(θ) = P mθ.

Theorem 2.13

Assume

(A) θ0 maximizes M(θ) and M(θ) has a non-singular second derivative

matrix V .


(B) There exist measurable functions Fδ : X → R and ṁθ0 : X → R^p such that

|mθ1(x) − mθ2(x)| ≤ Fδ(x) ‖θ1 − θ2‖,

P( mθ1 − mθ0 − ṁθ0′(θ1 − θ0) )² = o(‖θ1 − θ0‖²),

and P Fδ² < ∞ and P‖ṁθ0‖² < ∞, for parameters in some neighborhood Θ0 ⊂ Θ that contains θ0.

(C) θn →P θ0 and Mn(θn) ≥ sup_{θ∈Θ} Mn(θ) − OP(n^{−1}).

Then √n(θn − θ0) ⇝ −V^{−1} Z, where Z is the limiting distribution of Gn ṁθ0.
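As a sanity check in the least-squares case mθ(x) = −(x − θ)² with Var(X) = σ²: here V = −2, ṁθ0(x) = 2(x − θ0), Z ~ N(0, 4σ²), so −V^{−1}Z ~ N(0, σ²). Since the maximizer of Pn mθ is the sample mean, the limit is easy to verify by Monte Carlo (a sketch; the simulation settings are our own choices):

```python
import numpy as np

rng = np.random.default_rng(3)
theta0, sigma, n, reps = 0.0, 1.0, 400, 2000

# For m_theta(x) = -(x - theta)^2 the maximizer of Pn m_theta is the sample mean;
# V = -2 and Z ~ N(0, 4 sigma^2), so -V^{-1} Z ~ N(0, sigma^2).
samples = rng.normal(theta0, sigma, size=(reps, n))
draws = np.sqrt(n) * (samples.mean(axis=1) - theta0)

print(draws.mean(), draws.std())  # should be near 0 and near sigma = 1
```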


Monotone Density Estimation

Let X1, . . . , Xn be a sample of size n from a Lebesgue density f on [0, ∞) that is known to be decreasing. Note that this means that the distribution function F is concave.

Fix t > 0. We assume that f is differentiable at t with derivative

−∞ < f ′(t) < 0.

The maximum likelihood estimator f̂n of f is the non-increasing step function equal to the left derivative of F̂n, the least concave majorant of the empirical distribution function Fn. This f̂n is known as the Grenander estimator (Grenander, 1956).
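One way to compute f̂n is an upper-hull (Graham-scan-style) pass over the points of the empirical cdf. The sketch below is our own minimal implementation, not a library routine; the names `grenander` and `f_hat` are ours, and it assumes distinct observations:

```python
import numpy as np

def grenander(x):
    """Left derivative of the least concave majorant (LCM) of the empirical cdf.

    A minimal sketch for distinct observations. Returns the LCM knots and the
    non-increasing step values of the Grenander estimator f_hat_n.
    """
    xs = np.sort(x)
    n = len(xs)
    px = np.concatenate(([0.0], xs))   # cdf points: (0, 0), (x_(i), i/n)
    py = np.arange(n + 1) / n
    stack = [0]                        # upper-hull scan keeps the LCM vertices
    for i in range(1, n + 1):
        while len(stack) >= 2:
            a, b = stack[-2], stack[-1]
            # drop b if the slope a->b does not strictly exceed the slope b->i
            if (py[b] - py[a]) * (px[i] - px[b]) <= (py[i] - py[b]) * (px[b] - px[a]):
                stack.pop()
            else:
                break
        stack.append(i)
    knots = px[stack]
    vals = np.diff(py[stack]) / np.diff(knots)   # slope on each LCM segment
    return knots, vals

def f_hat(t, knots, vals):
    if t > knots[-1]:
        return 0.0   # the LCM is flat beyond the largest observation
    j = np.searchsorted(knots[1:], t, side="left")
    return vals[j]

rng = np.random.default_rng(1)
x = rng.exponential(size=2000)   # decreasing density f(u) = exp(-u) on [0, inf)
knots, vals = grenander(x)
print(f_hat(1.0, knots, vals))   # should be near exp(-1) ≈ 0.368 for large n
```

By construction the step values are non-increasing and integrate to one over the knot intervals.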


Consistency

LEMMA 1 (Marshall’s lemma).

sup_{t≥0} |F̂n(t) − F(t)| ≤ sup_{t≥0} |Fn(t) − F(t)| .

The proof is an exercise.


Fix 0 < δ < t. Note that

[F̂n(t + δ) − F̂n(t)]/δ ≤ f̂n(t) ≤ [F̂n(t) − F̂n(t − δ)]/δ.

By Marshall’s lemma,

[F̂n(t + δ) − F̂n(t)]/δ →as∗ [F(t + δ) − F(t)]/δ,

[F̂n(t) − F̂n(t − δ)]/δ →as∗ [F(t) − F(t − δ)]/δ.

By the assumptions on F and the arbitrariness of δ, we obtain f̂n(t) →as∗ f(t).


Rate of Convergence

The inverse function representation

Define the stochastic process

sn(a) = arg max_{s≥0} {Fn(s) − as}, for a > 0,

where the largest value is selected when multiple maximizers exist.

The function sn is a sort of inverse of the function f̂n in the sense that f̂n(t) ≤ a if and only if sn(a) ≤ t, for every t ≥ 0 and a > 0.
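Since Fn(s) − as is constant between jumps and then pulled down linearly, its maximum can only occur at a jump point, so the argmax can be searched over {0, X(1), …, X(n)}. A small self-contained sketch (the helper name `s_n` is ours):

```python
import numpy as np

def s_n(a, x):
    """Largest maximizer of s -> F_n(s) - a*s over s >= 0 (illustrative sketch)."""
    xs = np.sort(x)
    n = len(xs)
    cand = np.concatenate(([0.0], xs))      # jump points: the only candidate maximizers
    obj = np.arange(n + 1) / n - a * cand   # F_n - a*s evaluated at each candidate
    best = obj.max()
    return cand[np.nonzero(obj >= best - 1e-12)[0][-1]]   # take the largest maximizer

rng = np.random.default_rng(2)
x = rng.exponential(size=1000)

# As a (generalized) inverse of the non-increasing f_hat_n, s_n is non-increasing in a
a_grid = np.linspace(0.1, 1.5, 15)
s_vals = np.array([s_n(a, x) for a in a_grid])
print(bool(np.all(np.diff(s_vals) <= 1e-9)))  # True
```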

[Figure 1: sn(a) = arg max_{s≥0} {Fn(s) − as}, for a > 0.]

Define

Mn(g) ≡ Fn(t + g) − Fn(t) − f(t)g − xg n^{−1/3},

M(g) ≡ F(t + g) − F(t) − f(t)g.

By the change of variable s ↦ t + g in the definition of sn, combined with the fact that the location of the maximum of a function does not change when the function is shifted vertically, we have

sn(f(t) + x n^{−1/3}) − t = arg max_{g>−t} {Fn(t + g) − (f(t) + x n^{−1/3})(t + g)} = arg max_{g>−t} Mn(g).


Define gn = arg max_{g>−t} Mn(g).

Our goal is to show that the conditions of Theorem 14.4 hold for gn with rate rn = n^{1/3}, where

θ = g, θ0 = 0, d(θ, θ0) = |θ − θ0|.

Note that by the existence of the derivative of f at t we have

M(g) = F(t + g) − F(t) − f(t)g = (1/2) f′(t) g² + o(g²).

Since by assumption f′(t) < 0, Assumption (A), namely M(θ) − M(θ0) ≤ −c1 d²(θ, θ0), holds.


Recall that Assumption (B) of Theorem 14.4 states: for all n large enough and sufficiently small δ, the centered process Mn − M satisfies

E∗ sup_{d(θ,θ0)<δ} √n |(Mn − M)(θ) − (Mn − M)(θ0)| ≤ c2 φn(δ),

for c2 < ∞ and functions φn such that δ ↦ φn(δ)/δ^α is decreasing for some α < 2 not depending on n.


Recall that

Mn(g) ≡ Fn(t + g) − Fn(t) − f(t)g − xg n^{−1/3},

M(g) ≡ F(t + g) − F(t) − f(t)g,

and thus Mn(0) = M(0) = 0.


Hence

E∗ sup_{|g|<δ} √n |Mn(g) − M(g)|
≤ E∗ sup_{|g|<δ} |Gn(1{X ≤ t + g} − 1{X ≤ t})| + O(√n δ n^{−1/3})
≲ φn(δ) ≡ δ^{1/2} + √n δ n^{−1/3}.

Clearly

φn(δ)/δ^α = (δ^{1/2} + √n δ n^{−1/3}) / δ^α

is decreasing in δ for α = 3/2 < 2.


Assumption (C) of Theorem 14.4: the sequence θn converges in outer probability to θ0 and satisfies

Mn(θn) ≥ sup_{θ∈Θ} Mn(θ) − OP(rn^{−2})

for some sequence rn that satisfies rn² φn(1/rn) ≤ c3 √n, for every n and some c3 < ∞.


• M(g) = F(t + g) − F(t) − f(t)g is continuous and has a unique maximum at g = 0.

• Mn(g) ⇝ M(g) uniformly on compacts.

• Mn(gn) = sup_g Mn(g).

Thus by the argmax theorem, gn ⇝ 0.

Choose rn = n^{1/3}. Then

rn² φn(1/rn) = n^{2/3} φn(n^{−1/3}) = n^{2/3}(n^{−1/6} + √n n^{−1/3} n^{−1/3}) = 2 n^{1/2} = O(√n).

Thus Assumption (C) holds. Hence n^{1/3} gn = OP(1).


Weak Convergence

Denote hn = n^{1/3} gn = n^{1/3} arg max_{g>−t} Mn(g).

Rewriting, and multiplying by n^{2/3}, we have

n^{2/3} Mn(n^{−1/3} h)
= n^{2/3}(Pn − P)(1{X ≤ t + h n^{−1/3}} − 1{X ≤ t})
+ n^{2/3}[F(t + h n^{−1/3}) − F(t) − f(t) h n^{−1/3}] − xh.

It can be shown that

n^{2/3} Mn(n^{−1/3} h) ⇝ H(h) ≡ √f(t) Z(h) + (1/2) f′(t) h² − xh,

where Z is a two-sided Brownian motion.


We use the argmax theorem to prove that

hn = arg max_h {n^{2/3} Mn(n^{−1/3} h)} ⇝ h ≡ arg max_h H(h).

We need to show:

• H is continuous and has a unique maximum.

• n^{2/3} Mn(n^{−1/3} h) ⇝ H(h) uniformly on compacts.

• Mn(n^{−1/3} hn) = sup_h Mn(n^{−1/3} h).


Using the rescaling properties of Brownian motion, we have

arg max H = |4f(t)/[f′(t)]²|^{1/3} arg max_h {Z(h) − h²} + x/f′(t).

Simple algebra yields

P( |4f(t)/[f′(t)]²|^{1/3} arg max_h {Z(h) − h²} + x/f′(t) ≤ 0 )
= P( |4f′(t)f(t)|^{1/3} arg max_h {Z(h) − h²} ≤ x ).


By the inverse function representation we have

P( n^{1/3}(f̂n(t) − f(t)) ≤ x )
= P( f̂n(t) ≤ f(t) + x n^{−1/3} )
= P( sn(f(t) + x n^{−1/3}) ≤ t )
= P( arg max_h {Mn(n^{−1/3} h)} ≤ 0 )
= P( hn ≤ 0 )
→ P( h ≤ 0 )
= P( |4f′(t)f(t)|^{1/3} arg max_h {Z(h) − h²} ≤ x ).


Summarizing:

n^{1/3}(f̂n(t) − f(t)) ⇝ |4f′(t)f(t)|^{1/3} C,

where the random variable C ≡ arg max_h {Z(h) − h²} has Chernoff’s distribution.
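Chernoff’s distribution has no elementary closed form, but draws of C can be approximated by discretizing a two-sided Brownian motion. The step size and truncation window below are our own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)

def chernoff_draw(h_max=4.0, dt=0.001):
    """One approximate draw of C = argmax_h {Z(h) - h^2}: two-sided Brownian
    motion truncated to [-h_max, h_max] and discretized with step dt."""
    m = int(h_max / dt)
    steps = rng.normal(0.0, np.sqrt(dt), size=(2, m))
    right = np.concatenate(([0.0], np.cumsum(steps[0])))  # Z on [0, h_max]
    left = np.concatenate(([0.0], np.cumsum(steps[1])))   # Z on [0, -h_max]
    grid = np.concatenate((-np.arange(m, 0, -1) * dt, np.arange(m + 1) * dt))
    z = np.concatenate((left[1:][::-1], right))           # glue the two halves at 0
    obj = z - grid ** 2                                   # Z(h) - h^2
    return grid[np.argmax(obj)]

draws = np.array([chernoff_draw() for _ in range(300)])
print(draws.mean())   # near 0: Chernoff's distribution is symmetric about 0
```

Because the drift −h² grows quickly, truncating at |h| ≤ 4 loses essentially nothing.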
