• date post

11-Oct-2020
• Category

## Documents

• view

1

0

Embed Size (px)

### Transcript of High-dimensional covariance matrix estimation with ... · PDF file Bias-variance trade-o...

• High-dimensional covariance matrix estimation with applications to microarray studies and

portfolio optimization

Esa Ollila

Department of Signal Processing and Acoustics Aalto University, Finland

June 7th, 2018, Gipsa-Lab

• Covariance estimation problem

x : p-variate (centered) random vector (p large)

x1, . . . ,xn i.i.d. realizations of x.

Problem: Find an estimate Σ̂ of the pos. def. covariance matrix

Σ = E [ (x− E[x])(x− E[x])>

] ∈ Sp×p++

The sample covariance matrix (SCM),

S = 1

n− 1 n∑

i=1

(xi − x)(xi − x)>,

is the most commonly used estimator of Σ.

Challenges in HD:

1 Insufficient sample support (ISS) case: p > n. =⇒ Estimate of Σ−1 can not be computed!

2 Low sample support (LSS) (i.e., p of the same magnitude as n) =⇒ Σ̂ is estimated with a lot of error.

3 Outliers or heavy-tailed non-Gaussian data

2/40

• Why covariance estimation?

Portfolio selection Discriminant Analysis The Most Important Applications

graphical models clustering/discriminant analysis

Ilya Soloveychik (HUJI) Robust Covariance Estimation 15 / 47

PCA

The Most Important Applications

graphical models clustering/discriminant analysis

Ilya Soloveychik (HUJI) Robust Covariance Estimation 15 / 47

Graphical models

Gaussian graphical model

n-dimensional Gaussian vector

x = (x1, . . . , xn) ∼ N (0,Σ)

xi, xj are conditionally independent (given the rest of x) if

(Σ−1)ij = 0

modeled as undirected graph with n nodes; arc i, j is absent if (Σ−1)ij = 0

1

2

34

5 Σ−1 =

 

• • 0 • • • • • 0 • 0 • • • 0 • 0 • • 0 • • 0 0 •

 

13/40

Frobenius norm: ‖A‖2F = tr(A2). Any estimator Σ̂ ∈ Sp×p++ of Σ verifies

MSE(Σ̂) = var(Σ̂) + bias(Σ̂)

where MSE(Σ̂) , E[‖Σ̂−Σ‖2F]. Suppose Σ̂ = S. Then, since S is unbiased, i.e., bias(Σ̂) = ‖E[Σ̂]−Σ‖2F = 0, one has that

MSE(Σ̂) = var(Ŝ) (= tr{cov(vec(S))})

7 However, S is not applicable for high-dimensional problems

n ≈ p ⇒ var(S) is very large p > n or p� n ⇒ S is singular (non-invertible).

3 Use an estimator Σ̂ = Ŝβ that shrinks S towards a structure (e.g., a scaled identity matrix) using a tuning (shrinkage) parameter β

MSE(Σ̂) can be reduced by introducing some bias! Positive definiteness of Σ̂ can be ensured!

4/40

Frobenius norm: ‖A‖2F = tr(A2). Any estimator Σ̂ ∈ Sp×p++ of Σ verifies

MSE(Σ̂) = var(Σ̂) + bias(Σ̂)

where MSE(Σ̂) , E[‖Σ̂−Σ‖2F]. Suppose Σ̂ = S. Then, since S is unbiased, i.e., bias(Σ̂) = ‖E[Σ̂]−Σ‖2F = 0, one has that

MSE(Σ̂) = var(Ŝ) (= tr{cov(vec(S))})

7 However, S is not applicable for high-dimensional problems

n ≈ p ⇒ var(S) is very large p > n or p� n ⇒ S is singular (non-invertible).

3 Use an estimator Σ̂ = Ŝβ that shrinks S towards a structure (e.g., a scaled identity matrix) using a tuning (shrinkage) parameter β

MSE(Σ̂) can be reduced by introducing some bias! Positive definiteness of Σ̂ can be ensured!

4/40

Regularized SCM (RSCM) a lá Ledoit and Wolf

Sβ = βS + (1− β)[tr(S)/p]I where β > 0 denotes the shrinkage (regularization) parameter.

,_

e ,_

w

Total error Variance

Bias2

Tuning parameter

5/40

• This talk: choose β optimally under the random sampling from elliptical distribution + applications

Esa Ollila. Optimal high-dimensional shrinkage covariance estimation for elliptical distributions. In EUSIPCO’17, Kos, Greece, 2017, pp 1689–1693.

Esa Ollila and Elias Raninen. Optimal shrinkage covariance matrix estimator under random sampling from elliptical distributions. in Arxiv soon.

M. N. Tabassum and E. Ollila. Compressive regularized discriminant analysis of high-dimensional data with applications to microarray studies. In ICASSP’18, Calgary, Canada, 2017, pp. 4204 –4208.

6/40

• 1 Portfolio optimization

2 Ell-RSCM estimators

3 Estimates of oracle parameters

4 Simulation studies

5 Compressive Regularized Discriminant Analysis

• Modern portfolio theory (MPT)

mathematical framework for portfolio allocations that balances the return-risk tradeoff by ??.

Important inputs by ???∗

A portfolio consist of p assets, e.g.:

- equity securities (stocks), market indexes - fixed-income securities (e.g., government or corporate bonds),

commodities (e.g., oil, gold, etc), ETF-s - currencies (exchange rates), derivatives, . . . ,

To use MPT one needs to estimate the mean vector µ and the covariance matrix Σ of asset returns.

7 often p, the number of assets is larger (or of similar magnitude) to n, the number of historical returns.

⇒ MPT standard plug-in estimates is not applicable ∗Noble price winners: James Tobin (1981), Harry Markovitz (1990) and Willian F.

Sharpe (1990), and Eugene F. Fama (2013)

7/40

• Basic definitions

Portfolio weight at (discrete) time index t:

wt = (wt,1, . . . , wt,p) > s.t.

p∑

i=1

wt,i = 1 >wt = 1

Let Ci,t > 0 be the price of i th asset

The net return of the ith asset over one interval is

ri,t = Ci,t − Ci,t−1

Ci,t−1 =

Ci,t Ci,t−1

− 1 ∈ [−1,∞)

Single period net returns of p assets is a p-variate vector

rt = (r1,t, . . . , rp,t) >

The portfolio net return at period t+ 1 is

Rt+1 = w > t rt+1 =

p∑

i=1

wi,tri,t+1

8/40

• Basic definitions (cont’d)

Given historical returns {rt}nt=1,

µ = E[rt] and Σ ≡ cov(rt) = E [ (rt − µt)(rt − µt)>

]

holds for all t (so drop the index t from subscript).

Let r denote (random vector) of returns, then the expected return of the portfolio is

µw = E[R] = w>µ

and the variance (or risk) of the portfolio is

σ2w = var(R) = w >Σw.

The standard deviation σw = √

var(R) is called the volatility

Global mean variance portfolio (GMVP) optimization problem:

minimize w∈Rp

w>Σw subject to 1>w = 1.

The solution is wo = Σ−11

1>Σ−11 .

9/40

• S&P 500 and Nasdaq-100 indexes for year 2017

01-03 03-16 05-26 08-08 10-18 12-29

2300

2400

2500

2600

S&P 500 index prices

01-03 03-16 05-26 08-08 10-18 12-29

5000

5500

6000

6500 NASDAQ-100 prices

01-03 03-16 05-26 08-08 10-18 12-29

-0.02

-0.01

0

0.01

0.02

NASDAQ-100 daily net returns

01-03 03-16 05-26 08-08 10-18 12-29

-0.02

-0.01

0

0.01

0.02

S&P 500 daily net returns

10/40

• Are historical returns Gaussian?

QQ-plots and estimated pdf-s

-4 -2 0 2 4

Standard Normal Quantiles

-4

-2

0

2

4

Q u

a n

ti le

s o

f In

p u

t S

a m

p le

S&P 500

-4 -2 0 2 4

Standard Normal Quantiles

-4

-2

0

2

4

Q u

a n

ti le

s o

f In

p u

t S

a m

p le

NASDAQ-100

-4 -2 0 2 4

Standard Normal Quantiles

-4

-2

0

2

4

Q u

a n

ti le

s o

f In

p u

t S

a m

p le

N(0,1) sample

S&P 500

-4 -2 0 2

0

0.1

0.2

0.3

0.4

0.5

0.6 NASDAQ-100

-4 -2 0 2 4

0

0.1

0.2

0.3

0.4

0.5

0.6 N(0,1) sample

-2 -1 0 1 2

0

0.1

0.2

0.3

0.4

0.5

0.6

11/40

• Are historical returns Gaussian (cont’d)?

Scatter plots and estimated 99%, 95% and 50% tolerance ellipses

-0.02 -0.015 -0.01 -0.005 0 0.005 0.01 0.015

-0.03

-0.02

-0.01

0

0.01

0.02

0.03

inside the 50% ellipse: 65.6% of returns inside the 95 and 99% ellipses = 93.2% 95.6% of returns

12/40

• Stocks are unpredictable (and non-Gaussian)!

03-16 05-26 08-08 10-18 12-29 -5

0

5

10

15

Shorting allowed

long-only

05-26 08-08 10-18 12-29 -0.2

-0.15

-0.1

-0.05

0

0.05

0.1

Shorting allowed

long-only