Sparse-grid collocation methods to solve the vibrational...

Post on 16-Oct-2020

0 views 0 download

Transcript of Sparse-grid collocation methods to solve the vibrational...

Sparse-grid collocation methods to solve thevibrational Schroedinger equation

Gustavo Avila and Tucker Carrington

Chemistry DepartmentQueen’s University

6 octobre 2016

1 / 49

Objective

Compute a vibrational spectrum by solving the vibrationalSchroedinger equation

Hψk = Ekψk

H = K + V .

Represent wavefunctions with basis functions

ψk(q) =∑n

ckn fn(q)

2 / 49

The curse

Often one wishes to use product basis functions

fk1,k2,··· ,kD = φk1(x1)φk2(x2) · · ·φkD (xD)

The size of the required tensor product basis is about 103Natoms−6.

3 / 49

Reducing the size of the basis

Keep only basis functions for which∑c

nc ≤ b

4 / 49

One can do better

We also use

α1n1 + · · ·+ αDnD ≤ b.

and more generally,

g1(n1) + · · ·+ gD(nD) ≤ b ,

where g c(n) is an general monotonic function of n, designed toinclude basis functions we suspect are important.

5 / 49

Because we cannot use exact matrix elements

We have

used a Galerkin method with a Smolyak quadrature

used a sparse-grid collocation method

6 / 49

With a Galerkin method I get a generalized eigenvalueproblem

there are no efficient iterative eigensolvers forHU = SUE

7 / 49

Collocation

obviates integrals

has the advantage that as the basis improves, the choice ofthe points becomes irrelevant.

8 / 49

What is collocation ?

Apply H − Ek to

ψk(q) =∑n

ukn fn(q)

and determine the ukn by demanding that the Schroedingerequation be satisfied at a set of points, i.e. solve

(T + VB)U = BUE ,

9 / 49

The most important drawback of established collocationmethods

B 6= Iand it is necessary to solve a generalized eigenvalue problem.

This makes collocation almost unusable (when the number ofbasis functions is larger than about 50’000).

In this talk I present a new collocation method thatobviates the need to solve a generalized eigenvalueproblem

10 / 49

Avoiding a generalized eigenvalue problem

In 1-D this is easily accomplished by using as basis functionsLagrange-like basis functions that are one at one of the collocationpoints and zero at all the others.

( B = I because fb(xa) = δab)

11 / 49

Solving the D-d Schroedinger equation

To solve the Schroedinger equation we replace Ψn with aninterpolant and demand that

(K + V )I (D,H)Ψn(x1, x2, · · · , xD) = EnI (D,H)Ψn(x1, x2, · · · , xD)

be satisfied at the interpolation points[x1 = xk′1 , x2 = xk′2 , · · · , xD = xk′D ]

12 / 49

V (x1, x2, · · · , xD)

K =3N−6∑k,l

(Gkl

∂xk

∂xl

)+

3N−6∑k

(Hk

∂xk

)+ V ′

13 / 49

When I (D,H)Ψn(x1, x2, · · · , xD) is written in terms of Lagrangetype functions, then the RHS is simply EnΨn(xk′1 , xk′2 , · · · , xk′D ), i.e.( B = I) ; and

V is a diagonal matrix.

One obtains

K I (D,H)Ψn(x1, x2, · · · , xD)|[x1=xk′1,x2=xk′

2,··· ,xD=xk′

D]

+ V (xk′1 , xk′2 , · · · , xk′D )Ψn(xk′1 , xk′2 , · · · , xk′D )

= EnΨn(xk′1 , xk′2 , · · · , xk′D ) ,

The hard part is applying K

14 / 49

We exploit the fact that when nested points are used, a sparse-gridinterpolant can be written

I (D,H)Ψn(x1, x2, · · · , xD) =∑g(M1(n1+1),M2(n2+1),··· ,MD(nD+1))≤H

[Cnn1,n2,··· ,nD

× ϕn1(x1)ϕn2(x2) · · ·ϕnD (xD)] .

it is then fairly straightforward to apply K

15 / 49

Interpolation

For any basis ϕn(x), it is possible to make Lagrange-like functionsak(x), so that

f (x) = Umf (x) =nmax+1∑k=1

ak(x)f (xk) ,

where

ak(x) =nmax∑n=0

Bk,nϕn(x), k = 1, 2, · · · ,m = nmax + 1 .

16 / 49

It is easy to write a multi-D direct product interpolant,

F n(x1, x2, · · · , xD) = Um1 ⊗ Um2 ⊗ · · · ⊗ UmDF n(x1, x2, · · · , xD)

=

m1∑k1=1

m2∑k2=1

· · ·mD∑

kD=1

ak1(x1)ak2(x2) · · · akD (xD)F n(xk1 , xk2 , · · · , xkD ) .

When D ∼ 12, it is not possible to storeF n(xk1 , xk2 , · · · , xkD ) in memory

17 / 49

Sparse-grid interpolation

I (D,H) =∑

g(i1,i2,··· ,iD)≤H

∆U i1(x1)⊗∆U i2(x2)⊗ · · · ⊗∆U iD (xD)

∆U ic (xc) = U ic (xc)− U ic−1(xc) .

18 / 49

Points and functions are organized into levels

The interpolant is built using multiple sets of nested points. Forexample, if the number of functions in level ic and in level ic + 1differ by one :

xc1 for ic = 1,

xc1 , xc2 for ic = 2,

xc1 , xc2 , x

c3 for ic = 3,

xc1 , xc2 , x

c3 , x

c4 for ic = 4,

...

xc1 , xc2 , · · · , xcK−1, xcK for ic = K .

19 / 49

The interpolant is built using multiple sets of nested functions,

ϕc0 for ic = 1,

ϕc0, ϕ

c1 for ic = 2,

ϕc0, ϕ

c1, ϕ

c2 for ic = 3,

ϕc0, ϕ

c1, ϕ

c2, ϕ

c3 for ic = 4,...

ϕc0, ϕ

c1, · · · , ϕc

K−2, ϕcK−1 for ic = K .

20 / 49

Insert the definition of U ic (xc)

F n(x1, x2, · · · , xD) ≈ I (D,H)F n(x1, x2, · · · , xD)

=∑

g(i1,i2,··· ,iD)≤H

m1(i1)∑k1=1

m2(i2)∑k2=1

· · ·mD(iD)∑kD=1

F n(x i1k1 , xi2k2, · · · , x iDkD )

[ai1k1(x1)− ai1−1k1(x1)][ai2k2(x2)− ai2−1k2

(x2)] · · · [aiDkD (xD)− aiDkD (xD)]

21 / 49

Insert the definition of aickc

∑g(i1,i2,··· ,iD)≤H

∆U i1(x1)⊗∆U i2(x2)⊗ · · · ⊗∆U iD (xD)F n(x1, x2, · · · , xD)

=∑

g(i1,i2,··· ,iD)≤H

mD(iD)∑kD=1

nmaxD (iD)∑nD=0

[B iDkD ,nD

− B iD−1kD ,nD

]ϕnD (xD)

mD−1(iD−1)∑kD−1=1

nmaxD−1(iD−1)∑nD−1=0

[BiD−1

kD−1,nD−1− B

iD−1−1kD−1,nD−1

]ϕnD−1(xD−1)

· · ·m1(i1)∑k1=1

nmax1 (i1)∑n1=0

[B i1k1,n1− B i1−1

k1,n1]ϕn1(x1)

× F n(x i1k1 , xi2k2, · · · , x iDkD )

22 / 49

When the points are nested, it is possible to move the sums overn1, n2, · · · to the left to obtain

∑g(M1(n1+1),M2(n2+1),··· ,MD(nD+1))≤H

Cnn1,n2,··· ,nD ϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) .

where, M i (ni + 1) is an integer that is the index of the interpolationrule U i with the smallest value of i containing the function ni + 1.

23 / 49

Two equivalent interpolation equations

One in which the coefficients are values of the function,

F n(x1, x2, · · · , xD) ≈ I (D,H)F n(x1, x2, · · · , xD)

=∑

g(i1,i2,··· ,iD)≤H

m1(i1)∑k1=1

m2(i2)∑k2=1

· · ·mD(iD)∑kD=1

F n(xk1 , xk2 , · · · , xkD )

[ai1k1(x1)− ai1−1k1(x1)][ai2k2(x2)− ai2−1k2

(x2)] · · · [aiDkD (xD)− aiDkD (xD)].

The points in the union of the sets of points for all levels aredenoted (xk1 , xk2 , · · · , xkD )

24 / 49

Another in which they are Cnn1,n2,··· ,nD , coefficients of products of

ϕnc (xc)

∑g(M1(n1+1),M2(n2+1),··· ,MD(nD+1))≤H

Cnn1,n2,··· ,nD

× ϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) .

It is important to put the first equation on the RHS (to avoid ageneralized eigenvalue problem) and helpful to apply the KEO tothe second equation.

25 / 49

To use collocation for solving the Schroedinger equation, I mustapply the KEO to the interpolant and write the result in terms ofthe unknowns I will determine, Ψn(xk1 , xk2 , · · · , xkD ).We therefore replace

C nn1,n2,··· ,nD =∑

g(M1(k1)+M2(k2)+···+MD(kD))≤H

T(n1,n2,··· ,nD),(k1,k2,··· ,kD)Ψn(xk1 , xk2 , · · · , xkD )

Nondirect product pseudo-spectral method

This transformation must be done efficiently

T is never built

26 / 49

The eigenvalue problem we solve is

[K + V]U = UE ,

It is solved with an Arnoldi method by evaluating matrix-vectorproducts.

For the kinetic matrix-vector product the first step is

Cn1,n2,··· ,nD =∑

g(M1(k1)+M2(k2)+···+MD(kD))≤H

T(n1,n2,··· ,nD),(k1,k2,··· ,kD)zk1,k2,··· ,kD ,

Avila and Carrington, Journal of Chemical Physics, 143, 214108 (2015)

27 / 49

Once Cn1,n2,··· ,nD is known, we must compute

∑g(M1(n1+1)+M2(n2+1)+···+MD(nD+1))≤H

[Kϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) ]|[x1=xk′1,x2=xk′

2,··· ,xD=xk′

D]

Cn1,n2,··· ,nD

which is efficiently done by evaluating sums sequentially

28 / 49

But how does one compute Cn1,n2,··· ,nD ?

This is done most efficiently if we replace ϕnc (xc) with ϕnc (xc)where, the ϕnc (xc) we use are defined as follows :

ϕnc (xc) =nc−1∑pc=0

Anc ,pcϕpc (xc) + ϕnc (xc) .

The Anc ,pc are determined so that

ϕnc (xpc ) = 0, if nc > 0, pc = 1, · · · , nc .

A function in level i will be zero at the points in levels1, 2, · · · , i − 1 and, ϕnc (xc) in level i will also be zero at pointsxkc in level i for which kc < nc .

29 / 49

Points and functions are organized into levels

The interpolant is built using multiple sets of nested points,

xc1 for ic = 1,

ϕc0 for ic = 1,

xc1 , xc2 for ic = 2,

ϕc0, ϕ

c1 for ic = 2,

(xc1 , xc2 ), xc3 , x

c4 for ic = 3,

(ϕc0, ϕ

c1), ϕc

2, ϕc3 for ic = 3,...

30 / 49

Differences between these “hierarchical functions” andthose used by previous authors

Our hierarchical basis functions can be made from anyϕnc (xc) ; in our applications the ϕnc (xc) from which ourhierarchical functions are made are not polynomials whosedegree is linked to nc .

We have complete control over the number of functions inlevel ic + 1 that are not in level ic . In standard hierarchicalbases, the number of functions per level increasesexponentially with ic By minimizing the number of functionsin level ic + 1 that are not in level ic , we significantly reducethe number of points in the grid.

Unlike previous hierarchical bases, our hierarchical functions inlevel ic are zero not only at points in level ic − 1, but also atsome points in level ic that are not in level ic − 1.

31 / 49

Using this new procedure :

A set of hierarchical functions can be made from a list of anyϕnc (xc) and a list of any points by dividing the lists into levels ;

The number of functions and points in each of the levels canbe chosen at will.

Similar ideas are also used in the Master thesis (2016) of VasilVelikov (Jan Hamaekers, Michael Griebel, Jochen Garcke)

32 / 49

Back to computing Cn1,n2,··· ,nD

By equating the two forms of the interpolant one finds

Cn1,n2,··· ,nD =∑

g(i1,i2,··· ,iD)≤H

mD(iD)∑kD=1

[B iDkD ,nD

− B iD−1kD ,nD

]

mD−1(iD−1)∑kD−1=1

[BiD−1

kD−1,nD−1− B

iD−1−1kD−1,nD−1

] · · ·m2(i2)∑k2=1

[B i2k2,n2− B i2−1

k2,n2]

m1(i1)∑k1=1

[B i1k1,n1− B i1−1

k1,n1]Fk1,k2,··· ,kD

In general this sum is costly to evaluate, but if we use theϕnc (xc) then ∆B ic

kc ,nc= B ic

kc ,nc− B ic−1

kc ,nc, is only non-zero for one

value of ic

33 / 49

Cn1,n2,··· ,nD−1,nD =

kmaxD∑

kD=1

∆BMD(nD+1)kD ,nD

kmaxD−1∑

kD−1=1

∆BMD−1(nD−1+1)kD−1,nD−1

· · ·kmax2∑

k2=1

∆BM2(n2+1)k2,n2

kmax1∑

k1=1

∆BM1(n1+1)k1,n1

Fk1,k2,k3,k4,k5,k6 ,

kmaxD =MIN(mD [H −M1(1)−M2(1)− · · · −MD−1(1)], nD + 1),

kmaxD−1 =MIN(mD−1[H −M1(1)−M2(1)− · · · −MD−2(1)−MD(kD)], nD−1 + 1),

...

kmax2 =MIN(m2[H −M1(1)−M3(k3)− · · · −MD−1(kD−1)−MD(kD)], n2 + 1),

kmax1 =MIN(m1[H −M2(k2)−M3(k3)− · · · −MD−1(kD−1)−MD(kD)], n1 + 1),

34 / 49

This transformation between Cn1,n2,··· ,nD−1,nD and F (x1, x2, · · · , xD)is useful whenever one wishes to compute the interpolant at pointsthat are not interpolation points.

It is costly to evaluate

I (D,H)F (x1, x2, · · · , xD)

=∑

g(i1,i2,··· ,iD)≤H

m1(i1)∑k1=1

m2(i2)∑k2=1

· · ·mD(iD)∑kD=1

F (x i1k1 , xi2k2, · · · , x iDkD )

[ai1k1(x1)− ai1−1k1(x1)][ai2k2(x2)− ai2−1k2

(x2)] · · · [aiDkD (xD)− aiDkD (xD)]

35 / 49

It is inexpensive to use

I (D,H)F (x1, x2, · · · , xD)

=∑

g(M1(n1+1),M2(n2+1),··· ,MD(nD+1))≤H

Cn1,n2,··· ,nD

× ϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) .

36 / 49

Choosing the points

For a given basis set, the best solutions are obtained by usinga Galerkin method, and exact matrix elements, to computethe expansion coefficients.

ψk(q) =∑n

znk fn(q)

[K + V]Z = SZE ,

For example,

Vn′n =

∫fn′(q)V (q)fn(q)dq

For all interesting problems, exact matrix elements do notexist for K,V, and S

The best one can do is to use quadrature

37 / 49

If the basis is excellent, it is best to calculate all integrals byquadrature, even if exact integrals are known for one or moreof K,V and S

Eigenvalues obtained by computing all integrals by quadratureare identical to those obtained with collocation

We therefore choose points to optimize the calculation of S

38 / 49

Pseudo-Gauss (PG) points

N PG points and weights evaluate overlap (Gram) integrals

Sn,m =

∫fn(x)fm(x)dx

with with n ≤ m and n + m ≤ (2N − 1) as exactly as possible.

We start with one Gauss point and add a point whose positionis chosen, using a simplex optimization method, to minimizethe error of overlap integrals with n + m ≤ (2 ∗ 2− 1). Wethen add a third, a fourth, · · · points in the same fashion.

39 / 49

Cost

The expensive part of this collocation method is applying K toI (D,H)Ψn(x1, x2, · · · , xD)

Obtaining Cn1,n2,··· ,nD from Fk1,k2,k3,··· ,kDEvaluating

∑g(M1(n1+1)+M2(n2+1)+···+MD(nD+1))≤H

[Kϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) ]|[x1=xk′1,x2=xk′

2,··· ,xD=xk′

D]

Cn1,n2,··· ,nD

Both are done by evaluating sums sequentially, exploiting thestructure of the basis and the grid ; there are D + 1 nestedloops

40 / 49

First Fk1,k2,k3,··· ,kD → Cn1,n2,··· ,nD

Cn1,n2,··· ,nD−1,nD =

kmaxD∑

kD=1

∆BMD(nD+1)kD ,nD

kmaxD−1∑

kD−1=1

∆BMD−1(nD−1+1)kD−1,nD−1

· · ·kmax2∑

k2=1

∆BM2(n2+1)k2,n2

kmax1∑

k1=1

∆BM1(n1+1)k1,n1

Fk1,k2,k3,k4,k5,k6 ,

If the upper limits of all the sums were n, then the cost wouldbeDnD+1 ≈ nD log nD = N logN.

In reality the upper limits are a lot smaller.

41 / 49

This favourable scaling depends only the functions and pointsbeing nested.

It does not depend on using “hat functions”

It does not depend on using orthogonal basis functions

It does not depend on using localized basis functions

It does not depend on using a fast (e.g FFT) transform

It does not depend on exploiting sparsity

It works for any g1(n1) + · · ·+ gD(nD) ≤ b restriction

It is not a result of using the uni-directional principle

It does not depend on the number of functions added to leveli to make level i + 1

We often use H ≈ 25

42 / 49

Once we know Cn1,n2,··· ,nD

We evaluate,

∑g(M1(n1+1)+M2(n2+1)+···+MD(nD+1))≤H

[Kϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) ]|[x1=xk′1,x2=xk′

2,··· ,xD=xk′

D]

Cn1,n2,··· ,nD

term by term

K =3N−6∑k,l

(Gkl

∂xk

∂xl

)+

3N−6∑k

(Hk

∂xk

)+ V ′

Cost of computing the spectrum does not depend on the formor complexity of the functions Gkl and Hk

43 / 49

Test calculations

HONO with bond length and bond angles coordinates

Cut eigenfunctions for 1-D bases (different basis functions anddifferent points for each coordinate) ; basis functions andpoints adapted for the problem I am solving.

We have done similar calculations in 12D

44 / 49

For HONO, the 330 lowest levels are accurately calculated with

H = 27 and 2× 106 points.

HONO is a hard problem because it has a double-well potentialand significant coupling.

Errors ∼ 0.3 cm−1 (one part in 104)

45 / 49

Of general interest

General pruning of the basis and the grid,

g1(n1) + · · ·+ gD(nD) ≤ b ,

46 / 49

We use nested points, but avoid 2i + 1 ; the differencebetween the number of points in level i + 1 and level i is verysmall (often one)

We use various point sets, but often choose points to optimizethe calculation of the (1-d) overlap (Gram) matrix

47 / 49

Multidimensional collocation without a “mass matrix”, crucialif iterative eigensolvers are necessary

Collocation with spectral Lagrange-type functions

New “hierarchical” functions make it possible to efficientlyobtain Cn1,n2,··· ,nD from Fk1,k2,k3,··· ,kD

The new “hierarchical” functions can be made from any (i.e.non-polynomial) basis and there are not restrictions on thenumber of functions (points) per level

48 / 49

This work has been supported by theNatural Sciences and EngineeringResearch Council of Canada, the Reseau quebecois de calculde haute performance, and the Canada Research Chairsprogramme.