Sparse-grid collocation methods to solve the vibrational...
Transcript of Sparse-grid collocation methods to solve the vibrational...
Sparse-grid collocation methods to solve thevibrational Schroedinger equation
Gustavo Avila and Tucker Carrington
Chemistry DepartmentQueen’s University
6 octobre 2016
1 / 49
Objective
Compute a vibrational spectrum by solving the vibrationalSchroedinger equation
Hψk = Ekψk
H = K + V .
Represent wavefunctions with basis functions
ψk(q) =∑n
ckn fn(q)
2 / 49
The curse
Often one wishes to use product basis functions
fk1,k2,··· ,kD = φk1(x1)φk2(x2) · · ·φkD (xD)
The size of the required tensor product basis is about 103Natoms−6.
3 / 49
Reducing the size of the basis
Keep only basis functions for which∑c
nc ≤ b
4 / 49
One can do better
We also use
α1n1 + · · ·+ αDnD ≤ b.
and more generally,
g1(n1) + · · ·+ gD(nD) ≤ b ,
where g c(n) is an general monotonic function of n, designed toinclude basis functions we suspect are important.
5 / 49
Because we cannot use exact matrix elements
We have
used a Galerkin method with a Smolyak quadrature
used a sparse-grid collocation method
6 / 49
With a Galerkin method I get a generalized eigenvalueproblem
there are no efficient iterative eigensolvers forHU = SUE
7 / 49
Collocation
obviates integrals
has the advantage that as the basis improves, the choice ofthe points becomes irrelevant.
8 / 49
What is collocation ?
Apply H − Ek to
ψk(q) =∑n
ukn fn(q)
and determine the ukn by demanding that the Schroedingerequation be satisfied at a set of points, i.e. solve
(T + VB)U = BUE ,
9 / 49
The most important drawback of established collocationmethods
B 6= Iand it is necessary to solve a generalized eigenvalue problem.
This makes collocation almost unusable (when the number ofbasis functions is larger than about 50’000).
In this talk I present a new collocation method thatobviates the need to solve a generalized eigenvalueproblem
10 / 49
Avoiding a generalized eigenvalue problem
In 1-D this is easily accomplished by using as basis functionsLagrange-like basis functions that are one at one of the collocationpoints and zero at all the others.
( B = I because fb(xa) = δab)
11 / 49
Solving the D-d Schroedinger equation
To solve the Schroedinger equation we replace Ψn with aninterpolant and demand that
(K + V )I (D,H)Ψn(x1, x2, · · · , xD) = EnI (D,H)Ψn(x1, x2, · · · , xD)
be satisfied at the interpolation points[x1 = xk′1 , x2 = xk′2 , · · · , xD = xk′D ]
12 / 49
V (x1, x2, · · · , xD)
K =3N−6∑k,l
(Gkl
∂
∂xk
∂
∂xl
)+
3N−6∑k
(Hk
∂
∂xk
)+ V ′
13 / 49
When I (D,H)Ψn(x1, x2, · · · , xD) is written in terms of Lagrangetype functions, then the RHS is simply EnΨn(xk′1 , xk′2 , · · · , xk′D ), i.e.( B = I) ; and
V is a diagonal matrix.
One obtains
K I (D,H)Ψn(x1, x2, · · · , xD)|[x1=xk′1,x2=xk′
2,··· ,xD=xk′
D]
+ V (xk′1 , xk′2 , · · · , xk′D )Ψn(xk′1 , xk′2 , · · · , xk′D )
= EnΨn(xk′1 , xk′2 , · · · , xk′D ) ,
The hard part is applying K
14 / 49
We exploit the fact that when nested points are used, a sparse-gridinterpolant can be written
I (D,H)Ψn(x1, x2, · · · , xD) =∑g(M1(n1+1),M2(n2+1),··· ,MD(nD+1))≤H
[Cnn1,n2,··· ,nD
× ϕn1(x1)ϕn2(x2) · · ·ϕnD (xD)] .
it is then fairly straightforward to apply K
15 / 49
Interpolation
For any basis ϕn(x), it is possible to make Lagrange-like functionsak(x), so that
f (x) = Umf (x) =nmax+1∑k=1
ak(x)f (xk) ,
where
ak(x) =nmax∑n=0
Bk,nϕn(x), k = 1, 2, · · · ,m = nmax + 1 .
16 / 49
It is easy to write a multi-D direct product interpolant,
F n(x1, x2, · · · , xD) = Um1 ⊗ Um2 ⊗ · · · ⊗ UmDF n(x1, x2, · · · , xD)
=
m1∑k1=1
m2∑k2=1
· · ·mD∑
kD=1
ak1(x1)ak2(x2) · · · akD (xD)F n(xk1 , xk2 , · · · , xkD ) .
When D ∼ 12, it is not possible to storeF n(xk1 , xk2 , · · · , xkD ) in memory
17 / 49
Sparse-grid interpolation
I (D,H) =∑
g(i1,i2,··· ,iD)≤H
∆U i1(x1)⊗∆U i2(x2)⊗ · · · ⊗∆U iD (xD)
∆U ic (xc) = U ic (xc)− U ic−1(xc) .
18 / 49
Points and functions are organized into levels
The interpolant is built using multiple sets of nested points. Forexample, if the number of functions in level ic and in level ic + 1differ by one :
xc1 for ic = 1,
xc1 , xc2 for ic = 2,
xc1 , xc2 , x
c3 for ic = 3,
xc1 , xc2 , x
c3 , x
c4 for ic = 4,
...
xc1 , xc2 , · · · , xcK−1, xcK for ic = K .
19 / 49
The interpolant is built using multiple sets of nested functions,
ϕc0 for ic = 1,
ϕc0, ϕ
c1 for ic = 2,
ϕc0, ϕ
c1, ϕ
c2 for ic = 3,
ϕc0, ϕ
c1, ϕ
c2, ϕ
c3 for ic = 4,...
ϕc0, ϕ
c1, · · · , ϕc
K−2, ϕcK−1 for ic = K .
20 / 49
Insert the definition of U ic (xc)
F n(x1, x2, · · · , xD) ≈ I (D,H)F n(x1, x2, · · · , xD)
=∑
g(i1,i2,··· ,iD)≤H
m1(i1)∑k1=1
m2(i2)∑k2=1
· · ·mD(iD)∑kD=1
F n(x i1k1 , xi2k2, · · · , x iDkD )
[ai1k1(x1)− ai1−1k1(x1)][ai2k2(x2)− ai2−1k2
(x2)] · · · [aiDkD (xD)− aiDkD (xD)]
21 / 49
Insert the definition of aickc
∑g(i1,i2,··· ,iD)≤H
∆U i1(x1)⊗∆U i2(x2)⊗ · · · ⊗∆U iD (xD)F n(x1, x2, · · · , xD)
=∑
g(i1,i2,··· ,iD)≤H
mD(iD)∑kD=1
nmaxD (iD)∑nD=0
[B iDkD ,nD
− B iD−1kD ,nD
]ϕnD (xD)
mD−1(iD−1)∑kD−1=1
nmaxD−1(iD−1)∑nD−1=0
[BiD−1
kD−1,nD−1− B
iD−1−1kD−1,nD−1
]ϕnD−1(xD−1)
· · ·m1(i1)∑k1=1
nmax1 (i1)∑n1=0
[B i1k1,n1− B i1−1
k1,n1]ϕn1(x1)
× F n(x i1k1 , xi2k2, · · · , x iDkD )
22 / 49
When the points are nested, it is possible to move the sums overn1, n2, · · · to the left to obtain
∑g(M1(n1+1),M2(n2+1),··· ,MD(nD+1))≤H
Cnn1,n2,··· ,nD ϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) .
where, M i (ni + 1) is an integer that is the index of the interpolationrule U i with the smallest value of i containing the function ni + 1.
23 / 49
Two equivalent interpolation equations
One in which the coefficients are values of the function,
F n(x1, x2, · · · , xD) ≈ I (D,H)F n(x1, x2, · · · , xD)
=∑
g(i1,i2,··· ,iD)≤H
m1(i1)∑k1=1
m2(i2)∑k2=1
· · ·mD(iD)∑kD=1
F n(xk1 , xk2 , · · · , xkD )
[ai1k1(x1)− ai1−1k1(x1)][ai2k2(x2)− ai2−1k2
(x2)] · · · [aiDkD (xD)− aiDkD (xD)].
The points in the union of the sets of points for all levels aredenoted (xk1 , xk2 , · · · , xkD )
24 / 49
Another in which they are Cnn1,n2,··· ,nD , coefficients of products of
ϕnc (xc)
∑g(M1(n1+1),M2(n2+1),··· ,MD(nD+1))≤H
Cnn1,n2,··· ,nD
× ϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) .
It is important to put the first equation on the RHS (to avoid ageneralized eigenvalue problem) and helpful to apply the KEO tothe second equation.
25 / 49
To use collocation for solving the Schroedinger equation, I mustapply the KEO to the interpolant and write the result in terms ofthe unknowns I will determine, Ψn(xk1 , xk2 , · · · , xkD ).We therefore replace
C nn1,n2,··· ,nD =∑
g(M1(k1)+M2(k2)+···+MD(kD))≤H
T(n1,n2,··· ,nD),(k1,k2,··· ,kD)Ψn(xk1 , xk2 , · · · , xkD )
Nondirect product pseudo-spectral method
This transformation must be done efficiently
T is never built
26 / 49
The eigenvalue problem we solve is
[K + V]U = UE ,
It is solved with an Arnoldi method by evaluating matrix-vectorproducts.
For the kinetic matrix-vector product the first step is
Cn1,n2,··· ,nD =∑
g(M1(k1)+M2(k2)+···+MD(kD))≤H
T(n1,n2,··· ,nD),(k1,k2,··· ,kD)zk1,k2,··· ,kD ,
Avila and Carrington, Journal of Chemical Physics, 143, 214108 (2015)
27 / 49
Once Cn1,n2,··· ,nD is known, we must compute
∑g(M1(n1+1)+M2(n2+1)+···+MD(nD+1))≤H
[Kϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) ]|[x1=xk′1,x2=xk′
2,··· ,xD=xk′
D]
Cn1,n2,··· ,nD
which is efficiently done by evaluating sums sequentially
28 / 49
But how does one compute Cn1,n2,··· ,nD ?
This is done most efficiently if we replace ϕnc (xc) with ϕnc (xc)where, the ϕnc (xc) we use are defined as follows :
ϕnc (xc) =nc−1∑pc=0
Anc ,pcϕpc (xc) + ϕnc (xc) .
The Anc ,pc are determined so that
ϕnc (xpc ) = 0, if nc > 0, pc = 1, · · · , nc .
A function in level i will be zero at the points in levels1, 2, · · · , i − 1 and, ϕnc (xc) in level i will also be zero at pointsxkc in level i for which kc < nc .
29 / 49
Points and functions are organized into levels
The interpolant is built using multiple sets of nested points,
xc1 for ic = 1,
ϕc0 for ic = 1,
xc1 , xc2 for ic = 2,
ϕc0, ϕ
c1 for ic = 2,
(xc1 , xc2 ), xc3 , x
c4 for ic = 3,
(ϕc0, ϕ
c1), ϕc
2, ϕc3 for ic = 3,...
30 / 49
Differences between these “hierarchical functions” andthose used by previous authors
Our hierarchical basis functions can be made from anyϕnc (xc) ; in our applications the ϕnc (xc) from which ourhierarchical functions are made are not polynomials whosedegree is linked to nc .
We have complete control over the number of functions inlevel ic + 1 that are not in level ic . In standard hierarchicalbases, the number of functions per level increasesexponentially with ic By minimizing the number of functionsin level ic + 1 that are not in level ic , we significantly reducethe number of points in the grid.
Unlike previous hierarchical bases, our hierarchical functions inlevel ic are zero not only at points in level ic − 1, but also atsome points in level ic that are not in level ic − 1.
31 / 49
Using this new procedure :
A set of hierarchical functions can be made from a list of anyϕnc (xc) and a list of any points by dividing the lists into levels ;
The number of functions and points in each of the levels canbe chosen at will.
Similar ideas are also used in the Master thesis (2016) of VasilVelikov (Jan Hamaekers, Michael Griebel, Jochen Garcke)
32 / 49
Back to computing Cn1,n2,··· ,nD
By equating the two forms of the interpolant one finds
Cn1,n2,··· ,nD =∑
g(i1,i2,··· ,iD)≤H
mD(iD)∑kD=1
[B iDkD ,nD
− B iD−1kD ,nD
]
mD−1(iD−1)∑kD−1=1
[BiD−1
kD−1,nD−1− B
iD−1−1kD−1,nD−1
] · · ·m2(i2)∑k2=1
[B i2k2,n2− B i2−1
k2,n2]
m1(i1)∑k1=1
[B i1k1,n1− B i1−1
k1,n1]Fk1,k2,··· ,kD
In general this sum is costly to evaluate, but if we use theϕnc (xc) then ∆B ic
kc ,nc= B ic
kc ,nc− B ic−1
kc ,nc, is only non-zero for one
value of ic
33 / 49
Cn1,n2,··· ,nD−1,nD =
kmaxD∑
kD=1
∆BMD(nD+1)kD ,nD
kmaxD−1∑
kD−1=1
∆BMD−1(nD−1+1)kD−1,nD−1
· · ·kmax2∑
k2=1
∆BM2(n2+1)k2,n2
kmax1∑
k1=1
∆BM1(n1+1)k1,n1
Fk1,k2,k3,k4,k5,k6 ,
kmaxD =MIN(mD [H −M1(1)−M2(1)− · · · −MD−1(1)], nD + 1),
kmaxD−1 =MIN(mD−1[H −M1(1)−M2(1)− · · · −MD−2(1)−MD(kD)], nD−1 + 1),
...
kmax2 =MIN(m2[H −M1(1)−M3(k3)− · · · −MD−1(kD−1)−MD(kD)], n2 + 1),
kmax1 =MIN(m1[H −M2(k2)−M3(k3)− · · · −MD−1(kD−1)−MD(kD)], n1 + 1),
34 / 49
This transformation between Cn1,n2,··· ,nD−1,nD and F (x1, x2, · · · , xD)is useful whenever one wishes to compute the interpolant at pointsthat are not interpolation points.
It is costly to evaluate
I (D,H)F (x1, x2, · · · , xD)
=∑
g(i1,i2,··· ,iD)≤H
m1(i1)∑k1=1
m2(i2)∑k2=1
· · ·mD(iD)∑kD=1
F (x i1k1 , xi2k2, · · · , x iDkD )
[ai1k1(x1)− ai1−1k1(x1)][ai2k2(x2)− ai2−1k2
(x2)] · · · [aiDkD (xD)− aiDkD (xD)]
35 / 49
It is inexpensive to use
I (D,H)F (x1, x2, · · · , xD)
=∑
g(M1(n1+1),M2(n2+1),··· ,MD(nD+1))≤H
Cn1,n2,··· ,nD
× ϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) .
36 / 49
Choosing the points
For a given basis set, the best solutions are obtained by usinga Galerkin method, and exact matrix elements, to computethe expansion coefficients.
ψk(q) =∑n
znk fn(q)
[K + V]Z = SZE ,
For example,
Vn′n =
∫fn′(q)V (q)fn(q)dq
For all interesting problems, exact matrix elements do notexist for K,V, and S
The best one can do is to use quadrature
37 / 49
If the basis is excellent, it is best to calculate all integrals byquadrature, even if exact integrals are known for one or moreof K,V and S
Eigenvalues obtained by computing all integrals by quadratureare identical to those obtained with collocation
We therefore choose points to optimize the calculation of S
38 / 49
Pseudo-Gauss (PG) points
N PG points and weights evaluate overlap (Gram) integrals
Sn,m =
∫fn(x)fm(x)dx
with with n ≤ m and n + m ≤ (2N − 1) as exactly as possible.
We start with one Gauss point and add a point whose positionis chosen, using a simplex optimization method, to minimizethe error of overlap integrals with n + m ≤ (2 ∗ 2− 1). Wethen add a third, a fourth, · · · points in the same fashion.
39 / 49
Cost
The expensive part of this collocation method is applying K toI (D,H)Ψn(x1, x2, · · · , xD)
Obtaining Cn1,n2,··· ,nD from Fk1,k2,k3,··· ,kDEvaluating
∑g(M1(n1+1)+M2(n2+1)+···+MD(nD+1))≤H
[Kϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) ]|[x1=xk′1,x2=xk′
2,··· ,xD=xk′
D]
Cn1,n2,··· ,nD
Both are done by evaluating sums sequentially, exploiting thestructure of the basis and the grid ; there are D + 1 nestedloops
40 / 49
First Fk1,k2,k3,··· ,kD → Cn1,n2,··· ,nD
Cn1,n2,··· ,nD−1,nD =
kmaxD∑
kD=1
∆BMD(nD+1)kD ,nD
kmaxD−1∑
kD−1=1
∆BMD−1(nD−1+1)kD−1,nD−1
· · ·kmax2∑
k2=1
∆BM2(n2+1)k2,n2
kmax1∑
k1=1
∆BM1(n1+1)k1,n1
Fk1,k2,k3,k4,k5,k6 ,
If the upper limits of all the sums were n, then the cost wouldbeDnD+1 ≈ nD log nD = N logN.
In reality the upper limits are a lot smaller.
41 / 49
This favourable scaling depends only the functions and pointsbeing nested.
It does not depend on using “hat functions”
It does not depend on using orthogonal basis functions
It does not depend on using localized basis functions
It does not depend on using a fast (e.g FFT) transform
It does not depend on exploiting sparsity
It works for any g1(n1) + · · ·+ gD(nD) ≤ b restriction
It is not a result of using the uni-directional principle
It does not depend on the number of functions added to leveli to make level i + 1
We often use H ≈ 25
42 / 49
Once we know Cn1,n2,··· ,nD
We evaluate,
∑g(M1(n1+1)+M2(n2+1)+···+MD(nD+1))≤H
[Kϕn1(x1)ϕn2(x2) · · ·ϕnD (xD) ]|[x1=xk′1,x2=xk′
2,··· ,xD=xk′
D]
Cn1,n2,··· ,nD
term by term
K =3N−6∑k,l
(Gkl
∂
∂xk
∂
∂xl
)+
3N−6∑k
(Hk
∂
∂xk
)+ V ′
Cost of computing the spectrum does not depend on the formor complexity of the functions Gkl and Hk
43 / 49
Test calculations
HONO with bond length and bond angles coordinates
Cut eigenfunctions for 1-D bases (different basis functions anddifferent points for each coordinate) ; basis functions andpoints adapted for the problem I am solving.
We have done similar calculations in 12D
44 / 49
For HONO, the 330 lowest levels are accurately calculated with
H = 27 and 2× 106 points.
HONO is a hard problem because it has a double-well potentialand significant coupling.
Errors ∼ 0.3 cm−1 (one part in 104)
45 / 49
Of general interest
General pruning of the basis and the grid,
g1(n1) + · · ·+ gD(nD) ≤ b ,
46 / 49
We use nested points, but avoid 2i + 1 ; the differencebetween the number of points in level i + 1 and level i is verysmall (often one)
We use various point sets, but often choose points to optimizethe calculation of the (1-d) overlap (Gram) matrix
47 / 49
Multidimensional collocation without a “mass matrix”, crucialif iterative eigensolvers are necessary
Collocation with spectral Lagrange-type functions
New “hierarchical” functions make it possible to efficientlyobtain Cn1,n2,··· ,nD from Fk1,k2,k3,··· ,kD
The new “hierarchical” functions can be made from any (i.e.non-polynomial) basis and there are not restrictions on thenumber of functions (points) per level
48 / 49
This work has been supported by theNatural Sciences and EngineeringResearch Council of Canada, the Reseau quebecois de calculde haute performance, and the Canada Research Chairsprogramme.