Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some...

23
Dynamical low rank approximation in hierarchical tensor format R. Schneider (TUB Matheon) John von Neumann Lecture – TU Munich, 2012

Transcript of Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some...

Page 1: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Dynamical low rank approximation inhierarchical tensor format

R. Schneider (TUB Matheon)

John von Neumann Lecture – TU Munich, 2012

Page 2: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

MotivationEquations describing complex systems with multi-variatesolution spaces, e.g.

B stationary/instationary Schrodinger type equations

i~∂

∂tΨ(t , x) = (−1

2∆ + V )︸ ︷︷ ︸H

Ψ(t , x), HΨ(x) = EΨ(x)

describing quantum-mechanical many particle systemsB stochastic DEs and the Fokker-Planck equation,

∂p(t , x)∂t

=d∑

i=1

∂xi

(fi(t , x)p(t , x)

)+

12

d∑i,j=1

∂2

∂xi∂xj

(Bi,j(t , x)p(t , x)

)describing mechanical systems in stochastic environment,

B chemical master equations, parametric PDEs, machinelearning, . . .

Solutions depend on x = (x1, . . . , xd ), where usually, d >> 3!

Page 3: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Setting - TensorsGoal: Generic perspective on methods for high-dimensionalproblems, i.e. problems posed on tensor spaces,

H :=⊗d

i=1 Vi , today: H =⊗d

i=1 Rn = R(nd )

Notation: (x1, . . . , xd ) 7→ U = U(x1, . . . , xd ) ∈ H

Main problem:

dim H = O(nd ) – Curse of dimensionality!

e.g. n = 100,d = 10 10010 basis functions, coefficient vectors of 800× 1018 Bytes = 800 Exabytes

Approach: Some higher order tensors can be constructed(data-) sparsely from lower order quantities.

As for matrices, incomplete SVD:

A(x1, x2) ≈r∑

k=1

σk(uk (x1)⊗ vk (x2)

)

Page 4: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Setting - TensorsGoal: Generic perspective on methods for high-dimensionalproblems, i.e. problems posed on tensor spaces,

H :=⊗d

i=1 Vi , today: H =⊗d

i=1 Rn = R(nd )

Notation: (x1, . . . , xd ) 7→ U = U(x1, . . . , xd ) ∈ H

Main problem:

dim H = O(nd ) – Curse of dimensionality!

e.g. n = 100,d = 10 10010 basis functions, coefficient vectors of 800× 1018 Bytes = 800 Exabytes

Approach: Some higher order tensors can be constructed(data-) sparsely from lower order quantities.

As for matrices, incomplete SVD:

A(x1, x2) ≈r∑

k=1

σk(uk (x1)⊗ vk (x2)

)

Page 5: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Setting - TensorsGoal: Generic perspective on methods for high-dimensionalproblems, i.e. problems posed on tensor spaces,

H :=⊗d

i=1 Vi , today: H =⊗d

i=1 Rn = R(nd )

Notation: (x1, . . . , xd ) 7→ U = U(x1, . . . , xd ) ∈ H

Main problem:

dim H = O(nd ) – Curse of dimensionality!

e.g. n = 100,d = 10 10010 basis functions, coefficient vectors of 800× 1018 Bytes = 800 Exabytes

Approach: Some higher order tensors can be constructed(data-) sparsely from lower order quantities.

Canonical decomposition for order-d-tensors:

U(x1, . . . , xd ) ≈r∑

k=1

σk(⊗d

i=1 ui,k (xi)).

Page 6: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Tensor formatsB Hierarchical Tucker format

(HT; Hackbusch/Kuhn, Grasedyck, Kressner, Q: Tree-tensor networks)

{1,2,3,4,5}B

{4,5}

U4 5 U

B

B

B

U

UU

3

2 1

{1,2,3}

{1,2}

U{1,2}

U{1,2,3}

Page 7: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Tensor formatsB Hierarchical Tucker format

(HT; Hackbusch/Kuhn, Grasedyck, Kressner, Q: Tree-tensor networks)

B Tucker format (Q: MCTDH(F))

U(x1, .., xd ) =

r1∑k1=1

. . .

rd∑kd =1

B(k1, .., kd )d⊗

i=1

Ui(ki , xi)

{1,2,3,4,5}

1 2 3 54

Page 8: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Tensor formatsB Hierarchical Tucker format

(HT; Hackbusch/Kuhn, Grasedyck, Kressner, Q: Tree-tensor networks)

B Tucker format (Q: MCTDH(F))

B Tensor Train (TT-)format(Oseledets/Tyrtyshnikov, ' MPS-format of quantum physics)

U(x) =

r1∑k1=1

. . .

rd−1∑kd−1=1

d∏i=1

Bi(ki−1, xi , ki) = B1(x1) · · ·Bd (xd )

{1,2,3,4,5}

{1} {2,3,4,5}

{2} {3,4,5}

{4,5}

{5}

{3}

{4}

U1 U2 U3 U4 U5

r1 r2 r3 r4

n1 n2 n3 n4 n5

Page 9: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Approximation on low-rank manifoldM⊆ V

B for optimisation tasks J (U)→ min:

Solve first order condition J ′(U) = 0 on tangent space,

〈J ′(U),V 〉 = 0 ∀V ∈ TU .

(Dirac-Frenkel variational principle, Absil et al., Q.Chem.: MCSCF, . . . )

J’(U) = X − U

X

U M

T UM

Page 10: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Approximation on low-rank manifoldM⊆ V

B for differential equations X = f (X ),X (0) = X0:

Solve projected DE, U = PU f (U),U(0) = U0 ∈M,

〈U(t),V 〉 = 〈f (U(t)),V 〉 ∀V ∈ TU(t) .(Dirac-Frenkel variational principle, Lubich et al., Q.Chem.: TDMCH . . . )

U M

X = F(X).

U = PUF(U).

F(U) TUM

Page 11: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Manifolds and gauge conditionsLubich et al. (2009), Holtz/Rohwedder/S. (2011a), Uschmajew/Vandereycken (2012),Lubich/Rohwedder/S./Vandereycken (in prep.)

B The sets of above tree (HT, TT or Tucker) tensors of fixedrank r each provide embedded submanifoldsMr of R(nd ).

B Canonical tangent space parametrization via componentfunctions Wt ∈ Ct is redundant, but unique via gaugeconditions for nodes t 6= tr , e.g.

Gt ={

Wt ∈ Ct | 〈WTt ,Bt〉 resp. 〈WT

t ,Ut〉 = 0 ∈ Rkt×kt }

B Linear isomorphism

E : ×t∈T Gt → TUM, E =∑t∈T

Et

Et : “node-t embedding operators”, defined via currentiterate (Ut ,Bt ).

Projector onto TUM: P = EE+.

Page 12: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Manifolds and gauge conditionsLinear isomorphism

E = E(U) : ×t∈T Gt → TU , E(U) =∑t∈T

Et (U)

E+ Moore Penrose inverse of E

Projector onto TUM: P(U) = EE+.

Theorem (Lubich/Rohwedder/S./Vandereycken (in prep.))

For tensor B,U,V; ‖U − V‖ ≤ cρ; there exists C dependingonly on n,d, such that there holds

‖(P(U)− P(V )

)B‖ ≤ Cρ−1‖U − V‖‖B‖

‖(I − P(U)

)(U − V )‖ ≤ Cρ−1‖U − V‖2 .

These are estimates for the curvature ofMr at U.

Page 13: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Optimization problems/differential flow

The problems

〈J ′(U),V 〉 = 0 resp. 〈U,V 〉 = 〈f (U),V 〉 ∀V ∈ TU

on can now be re-cast into equations for components (Ut ,Bt )representing low-rank tensor

U = τ(Ut ,Bt ) :

With P⊥t projector to Gt , embedding operator Et = EUt as

above, solve

P⊥t ETt J ′(U) = 0 resp. Ut = P⊥t E+

t f (U),

for t 6= tr , and

ETtr J′(U) = 0 resp. Ut = E+

t f (U).

for the “root” (e.g. by standard methods for nonlinear eqs.).

Page 14: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Differential equations for components under gradientflow

Let U(t) = U1(t) · · ·Ui(t) · · ·Ud (t) ∈ Tr be fixed.Then

U′d (t) = −ETd (t)

(f (U(t))

)∈ Rrd−1×nd .

provided that Ui(t), i = 1, . . . ,d − 1, are left orthogonal, sinceDd (t) = I.The other components U′i(t), , i = 1, . . . ,d − 1, are given by

U′i(t) =

((I− Pi(t))⊗ Di

−1(t))

ETi (t)

(f (U(t))

).

with the orthogonal projection Pi(t) onto the parameter space,

Pi(t)W(ki−1, xi , ki) =

=∑

k ′i1,x ′i ,k

′i

Ui(t , k ′i−1, x′i , k′i )W(k ′i−1, x

′i , ki)Ui(t , ki−1, xi , k ′i ) .

Page 15: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Stabilization and preconditioning

In time step t → t + ∆t compute the componentsVi(τ) ≈ (I⊗ Di(t))Ui(τ), , i = 1, . . . ,d − 1, t ≤ τ < t + ∆t by

V′i(τ) = (I⊗ Di(t))U′i(τ) =

((I− Pi(τ))⊗ I

)ET

i (τ)(f (U(τ))

).

and Ui(t + δ) = left-orth(Vi(t + ∆t)

)I. Oseledets et al. : Strang

splitting and alternating directions ALS, (compare TD DMRG)Generalization of HOSVD bases of Hackbusch.For non-leaf vertices α ∈ TD , α 6= D, we have

∑rα`=1(σ

(α)` )2 C(α,`)C(α,`)H = Σ2

α1,∑rα

`=1(σ(α)` )2 C(α,`)TC(α,`) = Σ2

α2,

where α1, α2 are the first and second son of α ∈ TD and Σαi the diagonal of the

singular values ofMαi (v).

Page 16: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Some current results and trends

Optimization:

B Alternating optimization of components for TT robustpractical algorithm (ALS/MALS, Holtz/Rohwedder/S., SISC 2012)

B DMRG = MALS sees boost of interest in quantumphysics/quantum chemistry community

I Gradient methods — gradient flow (see below)B (Quasi-) Newton methods onM (Rohwedder/Vandereycken, in

prep.)

Time-dependent equations:

B Quasi-optimal error bounds(Lubich/Rohwedder/S./Vandereycken, in prep.)solution X (t) with approx. U(t) ∈Mr , X (0) = U(0),

‖U(t)− Ubest(t)‖ . t · maxs∈[0,t]

‖Ubest(s)− X (s)‖.

Page 17: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Some current results and trends

Optimization:

B Alternating optimization of components for TT robustpractical algorithm (ALS/MALS, Holtz/Rohwedder/S., SISC 2012)

B DMRG = MALS sees boost of interest in quantumphysics/quantum chemistry community

I Gradient methods — gradient flow (see below)B (Quasi-) Newton methods onM (Rohwedder/Vandereycken, in

prep.)

Time-dependent equations:

B Quasi-optimal error bounds(Lubich/Rohwedder/S./Vandereycken, in prep.)solution X (t) with approx. U(t) ∈Mr , X (0) = U(0),

‖U(t)− Ubest(t)‖ . t · maxs∈[0,t]

‖Ubest(s)− X (s)‖.

Page 18: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Practical example: EVP in Q-TT (Khoromkij/Oseledets, 2011)

Problem from quantum molecular dynamics:

HΨ = (−12

∆ + V )Ψ = EΨ

with potential energy surface given by Henon-Heiles potential

V (q1, . . . , qf ) =12

f∑k=1

q2k + λ

f−1∑k=1

(q2

k qk+1 −13

q3k

).

phys. dim. f = 4, . . . , 256; one-direction grid size n = 128 = 27; QTT-tensors ∈ (R2)7f .

Page 19: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Parabolic PDEsjoint paper with B. Khoromskij, I. Oseledets

∂tΨ = HΨ = (−1

2∆ + V )Ψ , Ψ(0) = Ψ0 .

Timings and error dependence for the modified heat equation (imaginary time) with aHenon-Heiles potential

time interval [0, 1], τ = 10−2, the manifold has ranks 10

Table: Time

Dimension Time (sec)2 2.774 21.398 64.8216 142.232 346.964 832.31

Table: Errror

τ Error1.000e-01 3.137e-035.000e-02 7.969e-042.500e-02 2.000e-041.250e-02 5.001e-056.250e-03 1.247e-053.125e-03 3.081e-061.563e-03 7.335e-07

Next slide: Schrodinger type equation :

∂tΨ = iHΨ = i(−

12

∆ + V )Ψ , Ψ(0) = Ψ0 .

Page 20: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment
Page 21: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

QC-DMRG and TT resp, MPS approximations

In courtesy of O Legeza (Hess & Legeza & ..)dissoziation of a diatomic molecule LiF , 1st + 2nd eigenvalue

Page 22: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Summary and some open questions

I Manifold approach, in particular tangent spacerepresentation, allows treatment and analysis ofhigh-dimensional optimization tasks and PDEs on low-ranktensor manifolds.

I TT and HT formats overcome weaknesses of older formats(CANDECOMP, Tucker).

I Quantum chemistry applications quite well-established andblooming, Fokker-Planck eq.: see Kohromskij et al. Stillmostly open.

I Algorithms for ordering of indices?

I Theory: Uniqueness of minimizers, function space setting?

Page 23: Dynamical low rank approximation in hierarchical tensor format · 2013-07-25 · Summary and some open questions IManifold approach, in particular tangent space representation, allowstreatment

Thank youfor your attention.

References:D. Conte and C. Lubich, An error analysis of the multi-configuration time-dependent Hartree method of quantum

dynamics, M2AN, 44 (2010), pp. 759–780.

Chr. Lubich, Th. Rohwedder, R. Schneider, B. Vandereycken, Dynamical approximation of tensors in thehierarchical Tucker and Tensor train formats (working title), in preparation, 2012.

B. Khoromskij, I. Oseledets, DMRG+QTT approach to computation of the ground stateof the molecular Schrodinger operator, submitted to Num. Math., 2011.

S. Holtz, Th. Rohwedder, R. Schneider, On manifolds of tensors with fixed TT rank,Num. Math. DOI: 10.1007/s00211-011-0419-7, 2011.

O. Koch, C. Lubich, Dynamical tensor approximation,SIAM J. Matrix Anal. Appl. 31, p. 2360, 2010.

I. Oseledets, B.N. Khoromskij, and R. Schneider. Efficient time-stepping scheme for dynamics on TT manifolds.Preprint 24/2012 MPI MiS Leipzig, 2012.

A. Uschmajew and B. Vandereycken, The geometry of algorithms using hierarchical tensors, SPP 1324 preprint,2012,

Remark: there are many more interesting papers of B. Khoromskij, to all previous lectures, only few arecited here, for a collection see e.g.

http://personal-homepages.mis.mpg.de/bokh/reference1.html