On Clustering Financial Time Series - Data Science Summer School



ON CLUSTERING FINANCIAL TIME SERIES
GAUTIER MARTI, PHILIPPE DONNAT AND FRANK NIELSEN

NOISY CORRELATION MATRICES

Let X be the N × T matrix storing the standardized returns of N = 560 assets (credit default swaps) over a period of T = 2500 trading days.

Then, the empirical correlation matrix of the returns is

$$C = \frac{1}{T} X X^\top.$$

We can compute the empirical density of its eigenvalues

$$\rho(\lambda) = \frac{1}{N} \frac{dn(\lambda)}{d\lambda},$$

where $n(\lambda)$ counts the number of eigenvalues of C less than $\lambda$.
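The two definitions above can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the function names, the histogram-based density estimate, and the synthetic Gaussian returns (standing in for the credit default swap data, which is not available here) are all assumptions.

```python
import numpy as np

def empirical_correlation(returns):
    """Correlation matrix C = X X^T / T for returns of shape (N, T)."""
    # Standardize each asset (row) to zero mean and unit variance.
    mean = returns.mean(axis=1, keepdims=True)
    std = returns.std(axis=1, keepdims=True)
    X = (returns - mean) / std
    T = X.shape[1]
    return X @ X.T / T

def eigenvalue_density(C, bins=50):
    """Histogram estimate of the eigenvalue density rho(lambda)."""
    eigenvalues = np.linalg.eigvalsh(C)  # C is symmetric; result is ascending
    density, edges = np.histogram(eigenvalues, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return eigenvalues, centers, density

# Synthetic stand-in data (assumption): 50 assets, 400 days of i.i.d. noise.
rng = np.random.default_rng(0)
returns = rng.standard_normal((50, 400))
C = empirical_correlation(returns)
eigenvalues, centers, density = eigenvalue_density(C)
```

Because each row is standardized, the diagonal of C equals 1 and the eigenvalues sum to N.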

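From random matrix theory, the Marchenko-Pastur distribution gives the limiting eigenvalue density for the correlation matrix of uncorrelated returns as N → ∞, T → ∞ with T/N fixed. It reads:

$$\rho(\lambda) = \frac{T/N}{2\pi} \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda},$$

where $\lambda^{\max}_{\min} = 1 + N/T \pm 2\sqrt{N/T}$, and $\lambda \in [\lambda_{\min}, \lambda_{\max}]$.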
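As a quick check of the formula, the sketch below evaluates the Marchenko-Pastur bounds and density for the poster's N = 560 and T = 2500. The function names are hypothetical; only the formula itself comes from the text.

```python
import numpy as np

def marchenko_pastur_bounds(N, T):
    """Return (lambda_min, lambda_max) = 1 + N/T -/+ 2 sqrt(N/T)."""
    q = N / T
    return 1 + q - 2 * np.sqrt(q), 1 + q + 2 * np.sqrt(q)

def marchenko_pastur_density(lam, N, T):
    """Marchenko-Pastur density on an array of points; zero outside the bounds."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    lam_min, lam_max = marchenko_pastur_bounds(N, T)
    density = np.zeros_like(lam)
    inside = (lam > lam_min) & (lam < lam_max)
    x = lam[inside]
    density[inside] = (T / N) / (2 * np.pi) * np.sqrt((lam_max - x) * (x - lam_min)) / x
    return density

lam_min, lam_max = marchenko_pastur_bounds(560, 2500)
print(f"lambda_min = {lam_min:.4f}, lambda_max = {lam_max:.4f}")
```

With N/T = 0.224 the bounds come out at roughly 0.28 and 2.17, so any eigenvalue of the empirical correlation matrix well above 2.17 lies outside the pure-noise band.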

Figure 1: Marchenko-Pastur density vs. empirical density of the correlation matrix eigenvalues (horizontal axis: λ; vertical axis: ρ(λ))

Notice that the Marchenko-Pastur density fits the empirical density well, meaning that most of the information contained in the empirical correlation matrix amounts to noise: only 26 eigenvalues are greater than $\lambda_{\max}$. The highest eigenvalue corresponds to the ‘market’; the 25 others can be associated with ‘industrial sectors’.
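Counting the eigenvalues that escape the noise band can be sketched as below. The data is synthetic: a single common 'market' factor plus independent noise, with hypothetical sizes and loading chosen only for illustration, so the count it prints says nothing about the 26 eigenvalues reported for the real data.

```python
import numpy as np

rng = np.random.default_rng(42)
N, T = 100, 1000  # hypothetical sizes, not the poster's 560 x 2500

# Assumed toy model: every asset loads on one common factor plus its own noise.
market = rng.standard_normal(T)
noise = rng.standard_normal((N, T))
returns = 0.5 * market + noise

# Standardize rows and build the empirical correlation matrix C = X X^T / T.
X = (returns - returns.mean(axis=1, keepdims=True)) / returns.std(axis=1, keepdims=True)
C = X @ X.T / T
eigenvalues = np.linalg.eigvalsh(C)  # ascending order

# Upper edge of the Marchenko-Pastur band.
lam_max = 1 + N / T + 2 * np.sqrt(N / T)
n_above = int(np.sum(eigenvalues > lam_max))
print(f"{n_above} eigenvalue(s) above lambda_max = {lam_max:.3f}")
print(f"largest eigenvalue: {eigenvalues[-1]:.2f}")
```

In this toy model the common factor produces one large eigenvalue far above the band, which mirrors the 'market' interpretation of the highest eigenvalue.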

CLUSTERING TIME SERIES

Given a correlation matrix of the returns,

Figure 2: An empirical and noisy correlation matrix

one can re-order assets using a hierarchical clustering algorithm to make the hierarchical correlation pattern apparent,

Figure 3: The same noisy correlation matrix re-ordered by a hierarchical clustering algorithm

and finally filter the noise according to the correlation pattern:

Figure 4: The resulting filtered correlation matrix
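The re-ordering and filtering steps can be sketched with SciPy as follows. The poster does not say which linkage or which filter it uses, so both choices here are assumptions: average linkage on the distance 1 − C, and a filter that replaces each correlation by one minus the cophenetic distance read off the dendrogram. The function name and the two-group synthetic data are also hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list, cophenet
from scipy.spatial.distance import squareform

def reorder_and_filter(C, method="average"):
    """Re-order a correlation matrix by hierarchical clustering and filter it."""
    # Turn correlations into distances; symmetrize to remove rounding noise.
    D = 1.0 - 0.5 * (C + C.T)
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D, checks=False), method=method)
    order = leaves_list(Z)  # leaf order of the dendrogram
    C_reordered = C[np.ix_(order, order)]
    # Assumed filter: correlations implied by the dendrogram merge heights.
    C_filtered = 1.0 - squareform(cophenet(Z))
    return order, C_reordered, C_filtered[np.ix_(order, order)]

# Synthetic example (assumption): two groups of 10 assets, shuffled.
rng = np.random.default_rng(1)
T = 500
labels = np.repeat([0, 1], 10)
rng.shuffle(labels)
factors = rng.standard_normal((2, T))
returns = factors[labels] + rng.standard_normal((20, T))
C = np.corrcoef(returns)
order, C_reordered, C_filtered = reorder_and_filter(C)
print(labels[order])  # members of each group end up next to each other
```

After re-ordering, the two groups appear as diagonal blocks, and the filtered matrix is piecewise constant on the blocks defined by the dendrogram.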

BEYOND CORRELATION

Sklar’s Theorem. For any random vector $X = (X_1, \ldots, X_N)$ having continuous marginal cumulative distribution functions $F_i$, its joint cumulative distribution $F$ is uniquely expressed as

$$F(X_1, \ldots, X_N) = C(F_1(X_1), \ldots, F_N(X_N)),$$

where $C$, a multivariate distribution with uniform marginals, is known as the copula of $X$.
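In practice the marginals $F_i$ are unknown, and a common way to approximate samples from the copula is to replace each observation by its normalized rank. The sketch below assumes that estimator; the function name and the division by T + 1 are illustrative choices, not taken from the poster.

```python
import numpy as np
from scipy.stats import rankdata

def pseudo_observations(returns):
    """Map returns of shape (N, T) to normalized ranks in (0, 1), row by row."""
    T = returns.shape[1]
    return rankdata(returns, axis=1) / (T + 1)

# Synthetic data (assumption): 3 assets, 1000 days.
rng = np.random.default_rng(0)
N, T = 3, 1000
returns = rng.standard_normal((N, T))
U = pseudo_observations(returns)

# A strictly increasing transform of one marginal changes its distribution
# but leaves the ranks, and hence the copula sample, untouched.
transformed = returns.copy()
transformed[0] = np.exp(transformed[0])
U_transformed = pseudo_observations(transformed)
print(np.allclose(U, U_transformed))
```

This invariance is what lets the dependence structure (the copula) be studied separately from the marginal distributions.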

Figure 5: ArcelorMittal and Société Générale prices are projected on dependence ⊕ distribution space; notice their heavy-tailed exponential distribution.

Let $\theta \in [0, 1]$. Let $(X, Y) \in V^2$. Let $G = (G_X, G_Y)$, where $G_X$ and $G_Y$ are respectively the marginal cdfs of $X$ and $Y$. We define the following distance

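$$d_\theta^2(X, Y) = \theta\, d_1^2(G_X(X), G_Y(Y)) + (1 - \theta)\, d_0^2(G_X, G_Y),$$

where

$$d_1^2(G_X(X), G_Y(Y)) = 3\,\mathbb{E}\left[|G_X(X) - G_Y(Y)|^2\right],$$

and

$$d_0^2(G_X, G_Y) = \frac{1}{2} \int_{\mathbb{R}} \left(\sqrt{\frac{dG_X}{d\lambda}} - \sqrt{\frac{dG_Y}{d\lambda}}\right)^2 d\lambda.$$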
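A sample-based sketch of this distance is given below. The estimators are assumptions, not the authors' implementation: the cdfs in $d_1^2$ are replaced by normalized ranks, and the densities in $d_0^2$ by histograms on a shared grid; all function names are hypothetical. For uniform marginals, $3\,\mathbb{E}[|U - V|^2] = (1 - \rho_S)/2$ with $\rho_S$ the Spearman correlation, so the first term is a rank-correlation distance, while the second is a squared Hellinger distance between the marginal distributions.

```python
import numpy as np
from scipy.stats import rankdata

def dependence_distance_sq(x, y):
    """Estimate d_1^2 = 3 E[|G_X(X) - G_Y(Y)|^2] using normalized ranks."""
    T = len(x)
    u = rankdata(x) / T
    v = rankdata(y) / T
    return 3.0 * np.mean((u - v) ** 2)

def distribution_distance_sq(x, y, bins=50):
    """Estimate d_0^2 (squared Hellinger distance) from shared-grid histograms."""
    edges = np.linspace(min(x.min(), y.min()), max(x.max(), y.max()), bins + 1)
    p, _ = np.histogram(x, bins=edges)
    q, _ = np.histogram(y, bins=edges)
    p = p / p.sum()  # bin probabilities; the bin width cancels in the integral
    q = q / q.sum()
    return 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)

def distance_theta_sq(x, y, theta=0.5, bins=50):
    """Combine both terms: theta * d_1^2 + (1 - theta) * d_0^2."""
    return (theta * dependence_distance_sq(x, y)
            + (1 - theta) * distribution_distance_sq(x, y, bins))

# Synthetic example (assumption): same dependence, different volatility.
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
y = 3.0 * x
print(distance_theta_sq(x, y, theta=1.0))  # dependence only
print(distance_theta_sq(x, y, theta=0.0))  # distribution only
```

Setting θ = 1 ignores the marginals entirely, and θ = 0 ignores the dependence; intermediate values weight the two sources of information.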

CLUSTERING RESULTS & STABILITY

Figure 6: (Top) The returns correlation structure appears more clearly using rank correlation; (Bottom) Clusters of returns distributions can be partly described by the returns volatility (panel: Standard Deviations Histogram; horizontal axis: standard deviation in basis points; vertical axis: number of occurrences)

Figure 7: Stability test on Odd/Even trading days sub-sampling: our approach (GNPR) yields more stable clusters with respect to this perturbation than standard approaches (using Pearson correlation or L2 distances).
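An odd/even stability test of this kind can be sketched as below. The poster does not state how cluster agreement is scored, so the Rand index used here is an assumption, as are the function names, the average-linkage clustering, and the synthetic data. The sketch clusters on Pearson correlation, i.e. one of the baselines; a GNPR version would swap in a different distance.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_labels(returns, n_clusters, method="average"):
    """Hierarchical clustering of assets (rows) on the distance 1 - Pearson."""
    D = 1.0 - np.corrcoef(returns)
    D = 0.5 * (D + D.T)
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D, checks=False), method=method)
    return fcluster(Z, t=n_clusters, criterion="maxclust")

def rand_index(a, b):
    """Fraction of asset pairs on which two clusterings agree."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    pairs = np.triu_indices(len(a), k=1)
    return float(np.mean(same_a[pairs] == same_b[pairs]))

def odd_even_stability(returns, n_clusters):
    """Cluster odd and even trading days separately and compare the results."""
    odd, even = returns[:, 0::2], returns[:, 1::2]
    return rand_index(cluster_labels(odd, n_clusters),
                      cluster_labels(even, n_clusters))

# Synthetic example (assumption): two well-separated groups of assets.
rng = np.random.default_rng(2)
T = 500
groups = np.repeat([0, 1], 10)
factors = rng.standard_normal((2, T))
returns = factors[groups] + rng.standard_normal((20, T))
print(odd_even_stability(returns, n_clusters=2))
```

A score of 1 means both sub-samples yield the same partition; lower values indicate clusters that change under the perturbation.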
