Simulation et Estimation des Processus de Hawkes20140319... · Hawkes processes - general...

Simulation et Estimation des Processus deHawkes

E.BacryCNRS, CMAP, Ecole Polytechnique,

France

http ://www.cmap.polytechnique.fr/ b̃acry

E. Bacry, Dauphine, 2014 Simulation et Estimation des processus de Hawkes 1

Hawkes processes - general definition in dimension D

Nt : a D-dimensional jump process (jumps are all of size 1)

λt : D-dimensional stochastic intensity

µ : D-dimensional exogenous intensity

Φ(t) : D × D square matrix of kernel functions Φij(t) whichare positive and causal (i.e., supported by R+).

Moreover ||Φij ||L1 < +∞, 1 ≤ i , j ≤ D

”Auto-regressive” relation

λt = µ+Φ ⋆ dNt,

where by definition

(Φ ⋆ dNt)ij =

D∑

k=1

∫ +∞

−∞Φik(t − s)dNk(s)


Simulation

Inverse Methods −→ based on time change

Cluster Methods −→ hierarchical method

Thinning methods −→ based on a bound of intensity


Simulation : Inverse Methods

In 1d (D = 1), it gives

F (t) =

∫ t

0λ(u)du

N(F−1(t)) is an homogeneous Poisson process of intensity 1.

=⇒ One needs to know how to simulate F−1(ti+1)− F−1(ti )

where ti+1 − ti is exponentially distributed

Not that easy !


Simulation : Inverse Methods

Dassios, Zhao, 2013

Exponential case φ(t) = αe−βt

F−1(ti+1)− F−1(ti ) is simulated using 2 iid exponential

variables −→ Total is 2N

update of the intensity at each jump : multiplication by anexponential −→ Total is N(D)

where N be the total number of jumps

Complexity ≃ O(ND)


Simulation : Clustering Methods

The “basic“ method in 1d (D = 1)

i. Simulate the ”immigrants” {t0k}k on [0,T ] (using Poisson µ)

ii. Set n = 0

iii. For each point tnk of generation n, we generate a Poissonprocess on t ∈ [tnk ,T ] of intensity φ(t − t

nk )

iv. The so-obtained points form generation n+ 1 : {tn+1l }l .

v n← n + 1 and go back to iii.


Simulation : Clustering Methods

The “basic“ method Problems

not ”causal”

Edge problems : clusters generated by immigrants before time0 may contain offspring in [0,T ].

=⇒ Use the ”perfect” ( !) algorithm by Moller, Rasmussen2014


Simulation : Thinning algorithm

Ogata 1981

Key point : φ(t) is bounded by a decreasing function φ̄(t)

i. Let t0 be a time of jump. n← 0

ii. Compute λ̄t = λt−n |Ft−+ φ̄(t − tn)

iii. s ← tn

iv. s ′ ← s + d , where d is exponential of parameter λ̄s

vi. If u < λs′/λ̄s with u ∼ Uniform in [0,1]Record a new jump tn+1 = s

′, n← n + 1 and go to ii.

vii. Thinning : s ← s ′ and go to iv.


Simulation : Thinning algorithm

If N ′ is the total number of jumps (including the thinned ones)

Any kernel ⇒ Complexity is O(N ′2D)

Exponential kernel ⇒ Complexity is O(N ′D)


Estimation

Parametric estimation (Maximum likelihood)First work : Ogata 78

Non parametric estimation

Marsan Lengliné (2008), generalized by Lewis, Mohler (2010)→ Expected Maximization (EM) procedure of a (penalized)likelihood function→ Monovariate Hawkes processes, Small amount of data, Notheoretical resultsReynaud-Bouret and Schbath (2010)→ Developed for small amount of data (Sparse penalization)E.B. and J.F.Muzy (2014)→ Developed for large amount of data


Hawkes processes - Stationarity condition

In the following, I shall suppose that

ρ(||Φ||L1) < 1,

where ||Φ||L1 = {||Φij ||L1}i ,j .

It implies that λt is (asymptotically) stationary and

E (λt) = Λ = (I− ||Φ||L1)−1µ

where

Ψ(t) =

+∞∑

k=1

Φ(∗k)(t).



E.B. and J.F. Muzy (2014)

The true values of Φ and µ verify a Wiener-Hopfequation (Hawkes 71 - E.B., J.F. Muzy, 2013)

g(t) = Φ(t) + Φ ∗ g(t), ∀t > 0

where g ij(t)dt = E (dN it |dNj0 = 1)

Given a ”good” g , it has a unique solution in Φ

=⇒ The second-order statistics characterize a Hawkesprocess



E.B. and J.F. Muzy (2014)

Wiener-Hopf equation

g(t) = Φ(t) +

∫ +∞

0φ(s)g(t − s)ds, ∀t > 0

Solution is computed using Nyström method (Gaussianquadrature)

Warning : g(t) is discontinuous at t = 0

Use of Gaussian quadrature ⇐⇒ regularization

Can be generalized in the case of a marked Hawkes process

λt = µ+Φ ⋆ f (mt)dNt ,

where the marks mt (iid) are known and f (.) is a newparameter


Application to High-frequency financial time-series

E.B, J.F.Muzy (2013)

4 dimension Hawkes process : Nt =

(

TtPt

)

,

Upward/Downward price jumps : Pt =

(

P+tP−t

)

Buying/Selling trades : Tt =

(

T+tT−t

)


Application to High-frequency financial time-series

E.B, J.F.Muzy (2013)

The kernels

Φ(t) =

(

ΦT (t) ΦR(t)ΦI (t) ΦN(t)

)

each Φ and Φ are 2× 2 symmetric matrices :

(

φs φc

φc φs

)

ΦT (t) : Auto-correlation of trades

ΦI (t) : Impact of trades on the price

ΦN(t) : Influence of past price moves on future price moves

ΦR(t) : Retro-influence of price moves on trades


Non parametric estimation of ΦT for Eurostoxx Futures10h-12h, 2009-2012 (800 days)

Trade auto-correlation −→ ”Positive” correlation : Order splitting


Non parametric estimation of ΦI for Eurostoxx Futures10h-12h, 2009-2012 (800 days)

Trade ”instantaneous” impact


Non parametric estimation of ΦN for Eurostoxx Futures10h-12h, 2009-2012 (800 days)

Influence of past price moves on future price moves−→ Mostly mean reverting (micro-structure)


Non parametric estimation of ΦR for Eurostoxx Futures10h-12h, 2009-2012 (800 days)

Retro-influence of price moves on anonymous trades :Price goes up =⇒ more sell market orders


Non parametric estimation : Application to ETAS model

ETAS 1d model (Kagan and Knopoff 81, 87 - Ogata, 88)

λt = µ+

∫ t

0φ(t − s)f (mt)dNt

the marks (magnitude) mt are supposed to be iid

mark : f (m) = Aeαm

kernel φ(t) = C(1+t/c)p

Many parametric estimation

From our knowledge, single previous non parametricestimation for earthquakes : Marsan and Lengliné (2008),


Non parametric estimation : Application to ETAS model

E.B. and J.F. Muzy (2014)estimation from North-Carolina Earthquake Catalogp ≃ 0.7 and c ≃ 1 mn

0 3 6 9 12

t [h]0.0

0.1

0.2

0.3

φ(t)

1 2 3 4 5 6m

0

10

20

30

f(m

)

−4 −3 −2 −1 0 1 2

ln(t)

−6

−4

−2

ln[φ

(t)]

1 2 3 4 5 6m

0.0

0.5

1.0

1.5

log10[f(m

)]


Parametric estimation : Maximum likelihood

Likelihood of a Hawkes process {Nt}t∈[0,T ] (Ogata, 78) :

L(Φ, µ,Nt) =

D∑

i=1

∫ T

0ln(λit)dN

it −

D∑

i=1

∫ T

0λitdt

Parametric estimation :

General kernel case :−→ complexity O(N2D) (N : total number of jumps)

Exponential kernel case : Φij(t) = aije−βij t

−→ complexity O(ND)


Parametric estimation in large dimensions

E.B, S.Gaiffas, J.F.Muzy (in prep.)Application : Social Network, Finance, . . .

Exponential kernels : Φij(t) = aije−βij t

A = {aij}1≤i ,j≤D : ”adjacency” matrix{βij}1≤i ,j≤D : ”decay” matrix

Likelihood convex parametrization : Φij(t) = aije−βij/aij t

B = {βij/aij}1≤i ,j≤D

Regularization (group of similar agents)

Sparsity of A −→ L1 penalization on the aij

A should be low rank −→ sparsity on the spectrum of A,Trace-norm penalization of A


Convex optimization / numerical aspects

Accelerated proximal gradient descent such as Fista [BeckTeboulle (2009)] or Prisma [Orabona et al (2012)]

When carefully done complexity of one gradient is O(ND)(instead of O(N2D) for the naive approach)

We have a parallelized code for this : the gradient on eachnode j ∈ {1, . . . ,D} can be computed in parallel

Computation bottleneck is the heavy use of exp and log ! (canbe accelerated using some ugly hacking)

Proximal operator of ℓ1-norm and trace-norm are the standardsoft-thresholding and spectral soft-thresholding operators

spectral soft-thresholding operators requires truncated SVD :we use the Lanczos’s default implementation of Python, it isfast enough



Dimension = 10 (210 parameters)Adjacency matrix :



No Penalization



No penalization (left) and Lasso Penalization (right)



No penalization (left) and Trace norm Penalization (right)



Lasso (left), Trace Norm (middle), both (right)



Dimension = 100 (20100 Parameters), Lasso penalization



Dimension = 100 (20100 Parameters), Trace norm penalization



Oracle Inequality

L2 framework

Penalization in θ̂ depending on data-driven weights

pen(θ) = ||µ||1,ŵ + ||A||1,Ŵ + ŵ∗||A||∗

given by empirical variance terms



E.B, S.Gaiffas, J.F.Muzy (in prep.)

Theorem : Oracle inequality

We have

||λθ̂ − λ∗||2T ≤ inf

θ

{

||λθ − λ∗||2T + complexity(θ)

}

with a probability larger than 1− ce−x with

complexity(θ) .||µ||0(x + logD)

Tmax

jN j ([0,T ])/T

+||A||0(x + 2 logD)

Tmaxj ,k

V̂jk(T )

+rank(A)(x + logD)

T||V̂1(T )||op ∨ ||V̂2(T )||op

Leading constant is 1



Main tool : new matrix-martingale concentration inequality forcontinuous-time martingale

Consider the random matrix Z(t)

Z(t) = dM ⋆Φ ⋆ dNt

[M = martingale obtained by compensation of N]

This is the noise term in our problem



E.B, S.Gaiffas, J.F.Muzy (in prep.)

Theorem

For any x > 0, we have

||Z(t)||opt

≤ 4

√

(x + logD + ℓ̂x(t))||V̂1(t)||op ∨ ||V̂2(t)||opt

+(x + logD + ℓ̂x(t))(10.34 + 2.65 supt∈[0,T ] ||H(t)||2,∞)

t

with a probability larger than 1− 84.9e−x

First non-commutative version of Bernstein’s inequality forcountinuous-time martingales

Extension of [Tropp (2012)] results



Where : ||H(t)||2,∞ = maximum ℓ2-norm of rows of H(t), where

H(t) = Φ ⋆ dNt

V̂1(t) diagonal matrix with entries

V̂jj

1 (t) =1

t

∫ t

0

||H(s)||22,∞dNj(s)

V̂2(t) matrix with entries

V̂jk

2 (t) =1

t

∫ t

0

||H(s)||22,∞

d∑

l=1

Φjl (s)Φkl (s)

||Hl,•(s)||22dNl(s),


Simulation et Estimation des Processus de Hawkes20140319... · Hawkes processes - general...

Documents

Transcript of Simulation et Estimation des Processus de Hawkes20140319... · Hawkes processes - general...

LIMIT THEOREMS FOR ADDITIVE FUNCTIONALS OF A MARKOV …hektor.umcs.lublin.pl/~komorow/mydoc/ips2.pdf · Markov jump processes X t, whose skeleton chain satis es our as-sumptions.

The Complexity of Crazy Frog Puzzle and Permutation ...The frog must follow a given sequence of horizontal, vertical or diagonal jumps of varying length; at every jump the frog can

Lightning Jump Algorithm Update W. Petersen, C. Schultz, L. Carey, E. Hill.

Chapitre 5: Processus stochastique à temps discret. Section 1. … · 2020-03-17 · Chapitre 5: Processus stochastique à temps discret. Remarquer que dire que X est F-adapté,

Centre de Recherche en Mathématiques de la Décision …mischler/Enseignements/ProcContM1/... · de l’´etude des processus de Poisson dans les prochains chapitres. 1.2 Martingales

Laser induced Temperature jump - Chemistry DevelSAS.part3-ultrafast + V… · Laser induced Temperature jump IR pulse heats the solvent ( Raman shifted YAG to 1.9 μfor D 2 O) Probe

Macroeconomic Effects on Stock Jumps

Binary Jumps Stochastics

Open Channel Flow Part 2 - University of Notre Damecefluids/files/Open_channel_part2...2. Open Channel Flow – Specific Energy and Rapid Transitions – Hydraulic Jumps – Slowly

To jump the queue or wait my turn? - ERNETifcam/new_avenue/Slides/Seminars/Manjunath.pdfD. Manjunath (IIT Bombay)To jump the queue or wait my turn?January 14, 2014 4 / 24 Adding value

arXivAsymptotics of orthogonal polynomials for a weight with a jump on [ 1;1] A. Foulqui e Moreno, A. Mart nez-Finkelshtein, and V.L. Sousa …

Dipartimento di Matematica, P.zza Porta S. Donato, 40127 ... · MEAN-VARIANCE HEDGING WITH RANDOM VOLATILITY JUMPS Francesca Biagini Dipartimento di Matematica, P.zza Porta S. Donato,

Quantum jumps in a two-level atom - CERNcds.cern.ch/record/380923/files/9903023.pdfQuantum jumps in a two-level atom H.M. Wiseman 1and G.E. Toombes;2 1Centre for Laser Science, Department

Impact of EO on Innovation and Entrepeneurship @ESA · 2020-07-16 · ESA UNCLASSIFIED - For Official Use | Slide 10 Entrepreneurship “An entrepreneur is someone who jumps off a

414 535 - M12BPS · pinch point can dig into the surface of the material causing the wheel to climb out or kick out. The wheel may either jump toward or away from the operator, depending

l’Hypertrophie Bénigne de la Prostate...• Les α-bloquants : Ils inhibent la contraction musculaire lisse de la prostate, ils n’agissent pas sur le processus de prolifération

Markov Chain Monte Carlo Simulation of a System with Jumps

Control-Flow Graphs Dataflow Analysis · – nodes: labeled basic blocks of instructions • single-entry, single-exit • i.e., no jumps, branching, or labels inside block – edges:

@u nu d'ordre n > 2 @t @xn - Claude Bernard University Lyon 1math.univ-lyon1.fr/~alachal/exposes/transp_pseudo_proc_2008.pdf · Un panorama de lois remarquables pour le pseudo-processus

Folding of a heterogeneous β-hairpin peptide from ... · by a fast temperature jump (T-jump) (12, 14). TZ2 (Fig. 1) is a 12-residue peptide designed to form a type I′ β-turn due