Intception: Automatic generation of code for the evaluation of molecular integrals

J.C. Womack and F.R. Manby

Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, BS8 1TS

Molecular integrals

The need to evaluate integrals over electronic coordinates is a common feature of methods which approximately solve the Schrödinger equation for molecules:

$$E = \frac{\langle\Psi|H|\Psi\rangle}{\langle\Psi|\Psi\rangle}$$

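In these electronic structure methods, the molecular wavefunction, $|\Psi\rangle$, is typically expressed in a basis of one-electron functions, leading to integrals over one- and two-electron coordinates, e.g.

$$\langle i|f|j\rangle = \int d\mathbf{r}\, \psi_i^*(\mathbf{r})\, f\, \psi_j(\mathbf{r}), \qquad (ia|jb) = \int d\mathbf{r}_1\, d\mathbf{r}_2\, \psi_i^*(\mathbf{r}_1)\, \psi_j^*(\mathbf{r}_2)\, r_{12}^{-1}\, \psi_a(\mathbf{r}_1)\, \psi_b(\mathbf{r}_2)$$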

Evaluation of molecular integrals is computationally intensive; in some methods it is the most expensive step in a calculation. The computational implementation of electronic structure methods therefore requires the development of efficient molecular integral evaluation code.

Evaluating molecular integrals

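For molecular calculations, contracted Gaussian-type orbitals (GTOs) are often employed, constructed from fixed linear combinations of primitive Gaussian functions, e.g.

$$|a) \equiv \phi_a(\mathbf{r}; \mathbf{A}, \mathbf{a}) = \sum_m^{K} d_{ma}\, g(\mathbf{r}; \zeta_m, \mathbf{A}, \mathbf{a})$$

$$|a_m) \equiv g(\mathbf{r}; \zeta_m, \mathbf{A}, \mathbf{a}) = (x - A_x)^{a_x} (y - A_y)^{a_y} (z - A_z)^{a_z} \exp(-\zeta_m |\mathbf{r} - \mathbf{A}|^2)$$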

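with centre $\mathbf{A}$, angular momentum vector $\mathbf{a}$, total angular momentum $l_a = a_x + a_y + a_z$, and per-primitive exponent $\zeta_m$. Integrals over contracted GTOs are obtained by contraction (and spherical transformation) of integrals over primitive Cartesian Gaussians, e.g. for 2-index overlap:

$$(a|b) = \sum_m^{K_a} \sum_n^{K_b} d_{ma} d_{nb}\, (a_m|b_n) = \sum_m^{K_a} \sum_n^{K_b} d_{ma} d_{nb} \int d\mathbf{r}\, g(\mathbf{r}; \zeta_m, \mathbf{A}, \mathbf{a})\, g(\mathbf{r}; \zeta_n, \mathbf{B}, \mathbf{b})$$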

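To obtain primitive integrals with higher angular momentum, $l$, it is generally only necessary to explicitly evaluate the $l = 0$ case, e.g.

$$(0_A|0_B) = (\pi/\zeta)^{3/2} \exp(-\xi |\mathbf{A} - \mathbf{B}|^2)$$

where, in the usual Obara-Saika notation [1], $\zeta = \zeta_a + \zeta_b$ and $\xi = \zeta_a \zeta_b / \zeta$.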
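As a rough illustration of the contraction step, the sketch below combines the l = 0 base expression with the double sum over primitives for s-type functions. It assumes unnormalised primitives and the standard definitions ζ = ζa + ζb and ξ = ζaζb/ζ. The function names and the exponents and coefficients in the usage lines are invented for this example; this is not code produced by Intception.

```python
import math

def primitive_s_overlap(zeta_a, A, zeta_b, B):
    """(0_A|0_B) for two unnormalised primitive s-type Gaussians."""
    zeta = zeta_a + zeta_b               # combined exponent
    xi = zeta_a * zeta_b / zeta          # reduced exponent
    r2 = sum((a - b) ** 2 for a, b in zip(A, B))   # |A - B|^2
    return (math.pi / zeta) ** 1.5 * math.exp(-xi * r2)

def contracted_s_overlap(prims_a, A, prims_b, B):
    """Contract primitive overlaps; prims_* are lists of (exponent, coefficient)."""
    return sum(d_a * d_b * primitive_s_overlap(z_a, A, z_b, B)
               for z_a, d_a in prims_a
               for z_b, d_b in prims_b)

# Illustrative (made-up) contraction data, not taken from a real basis set.
prims_a = [(3.0, 0.4), (0.5, 0.7)]
prims_b = [(1.2, 1.0)]
print(contracted_s_overlap(prims_a, (0.0, 0.0, 0.0), prims_b, (0.0, 0.0, 1.0)))
```

Because the contraction is a plain weighted sum, it can be applied at any point after the primitive integrals it needs are available.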

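Integrals over primitive Cartesian Gaussians with higher $l$ can then be obtained using (vertical) recurrence relations (VRRs) [1], e.g.

$$(\mathbf{a} + \mathbf{1}_i|\mathbf{b}) = PA_i\, (\mathbf{a}|\mathbf{b}) + \frac{a_i}{2\zeta} (\mathbf{a} - \mathbf{1}_i|\mathbf{b}) + \frac{b_i}{2\zeta} (\mathbf{a}|\mathbf{b} - \mathbf{1}_i)$$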
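The overlap VRR can be sketched as a memoised recursion. The example assumes the standard Obara-Saika definitions ζ = ζa + ζb, ξ = ζaζb/ζ, P = (ζa A + ζb B)/ζ and PA_i = P_i − A_i, and it raises angular momentum on centre B with the symmetric counterpart of the relation above (PB_i in place of PA_i). The function name and structure are illustrative only, and the primitives are unnormalised.

```python
import math
from functools import lru_cache

def primitive_overlap(a, b, zeta_a, zeta_b, A, B):
    """Overlap (a|b) of two unnormalised primitive Cartesian Gaussians.

    a, b are angular momentum tuples (ax, ay, az); A, B are centres.
    """
    zeta = zeta_a + zeta_b
    xi = zeta_a * zeta_b / zeta
    P = [(zeta_a * Ai + zeta_b * Bi) / zeta for Ai, Bi in zip(A, B)]
    r2 = sum((Ai - Bi) ** 2 for Ai, Bi in zip(A, B))
    base = (math.pi / zeta) ** 1.5 * math.exp(-xi * r2)   # (0_A|0_B)

    def lower(v, i):
        """Return v - 1_i."""
        return v[:i] + (v[i] - 1,) + v[i + 1:]

    @lru_cache(maxsize=None)
    def s(a, b):
        if min(a + b) < 0:           # negative index: term vanishes
            return 0.0
        if sum(a) + sum(b) == 0:     # l = 0 base case
            return base
        for i in range(3):           # raise angular momentum on centre A
            if a[i] > 0:
                am = lower(a, i)
                return ((P[i] - A[i]) * s(am, b)
                        + am[i] / (2 * zeta) * s(lower(am, i), b)
                        + b[i] / (2 * zeta) * s(am, lower(b, i)))
        for i in range(3):           # then on centre B (symmetric relation)
            if b[i] > 0:
                bm = lower(b, i)
                return ((P[i] - B[i]) * s(a, bm)
                        + a[i] / (2 * zeta) * s(lower(a, i), bm)
                        + bm[i] / (2 * zeta) * s(a, lower(bm, i)))

    return s(tuple(a), tuple(b))

# (p_x|s) overlap for two unit-exponent primitives one unit apart along x.
print(primitive_overlap((1, 0, 0), (0, 0, 0), 1.0, 1.0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)))
```

Memoisation stands in for the explicit ordering of intermediates that a generated routine would presumably fix ahead of time.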

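The horizontal recurrence relation (HRR) can be used to shift angular momentum between centres and can be applied to contracted integrals [2].

$$(\mathbf{a}(\mathbf{b} + \mathbf{1}_i)|\mathbf{c}) = ((\mathbf{a} + \mathbf{1}_i)\mathbf{b}|\mathbf{c}) + AB_i\, (\mathbf{ab}|\mathbf{c})$$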
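A minimal one-dimensional sketch of the VRR-then-HRR route follows. It assumes the two-index analogue of the HRR, (a|b+1) = (a+1|b) + AB (a|b) with AB = A − B, and uses the x-factor of the l = 0 overlap, sqrt(π/ζ) exp(−ξ AB²), as the base case. The function name is hypothetical and the primitives are unnormalised.

```python
import math

def overlap_1d_hrr(la, lb, zeta_a, zeta_b, Ax, Bx):
    """1D overlap (la|lb): VRR builds (a|0), then HRR shifts momentum to B."""
    zeta = zeta_a + zeta_b
    xi = zeta_a * zeta_b / zeta
    Px = (zeta_a * Ax + zeta_b * Bx) / zeta
    PA = Px - Ax
    AB = Ax - Bx

    # VRR with b = 0: (a+1|0) = PA (a|0) + a/(2 zeta) (a-1|0), for a up to la+lb
    s = [math.sqrt(math.pi / zeta) * math.exp(-xi * AB ** 2)]
    for a in range(la + lb):
        prev = s[a - 1] if a > 0 else 0.0
        s.append(PA * s[a] + a / (2 * zeta) * prev)

    # HRR: (a|b+1) = (a+1|b) + AB (a|b); each pass raises b by one
    row = s                      # row[a] holds (a|b) for the current b
    for _ in range(lb):
        row = [row[a + 1] + AB * row[a] for a in range(len(row) - 1)]
    return row[la]

print(overlap_1d_hrr(1, 1, 1.0, 1.0, 0.0, 0.0))
```

The HRR step involves only the centre separation AB, not the exponents, which is why it can be applied after contraction.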

VRRs, HRRs and contractions may be combined to create multiple integral evaluation schemes:

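[Diagram: two example evaluation schemes for the overlap integral (a|b).]

VRR-only: (0|0) primitive (la = lb = 0) → VRR → (a|0) primitive (lb = 0) → VRR → (a|b) primitive → Contract → (a|b) contracted

VRR and HRR: (0|0) primitive (la = lb = 0) → VRR → (a|0) primitive (lb = 0) → Contract → (a|0) contracted (lb = 0) → HRR → (a|b) contracted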

Electron repulsion integrals (ERIs)

ERIs are the most computationally expensive integrals required in many electronic structure methods (e.g. HF, DFT, MP2, CC), so efficient software implementation is vital.

$$(ab|r_{12}^{-1}|c) = \int d\mathbf{r}_1\, d\mathbf{r}_2\, g_a(\mathbf{r}_1)\, g_b(\mathbf{r}_1)\, r_{12}^{-1}\, g_c(\mathbf{r}_2)$$

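To enable use of Cartesian RRs, an integral transform, $r_{12}^{-1} = 2\pi^{-1/2} \int_0^\infty du\, g(\mathbf{r}_2; u^2, \mathbf{r}_1, \mathbf{0})$, is used and an auxiliary index, $m$, introduced:

$$((\mathbf{a} + \mathbf{1}_i)\mathbf{b}|\mathbf{c})^{(m)} = \mathrm{VRR}\{(\mathbf{ab}|\mathbf{c})^{(m)}, (\mathbf{ab}|\mathbf{c})^{(m+1)}, \ldots\}$$

When $m = 0$, the true ERIs are obtained, i.e. $(\mathbf{ab}|\mathbf{c})^{(0)} \equiv (ab|r_{12}^{-1}|c)$ [1]. Additionally, the $l = 0$ case is complicated by the need to evaluate the Boys function [3]:

$$(0_A 0_B|0_C)^{(m)} = f(\zeta_a, \zeta_b, \zeta_c, \mathbf{A}, \mathbf{B}, \mathbf{C})\, F_m(T), \qquad F_m(T) = \int_0^1 dt\, t^{2m} \exp(-T t^2)$$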
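One simple way to evaluate the Boys function for all orders up to a maximum m is sketched below. It uses two standard identities that are not stated on the poster: the series F_m(T) = exp(−T) Σ_k (2T)^k / [(2m+1)(2m+3)…(2m+2k+1)] for the highest order, and the downward recursion F_m(T) = [2T F_{m+1}(T) + exp(−T)] / (2m+1). This is a clarity-first sketch with a hypothetical function name; production integral codes typically use faster schemes such as tabulation and interpolation.

```python
import math

def boys(m_max, T):
    """Return [F_0(T), ..., F_m_max(T)] for T >= 0."""
    # Series for the highest order: each term is the previous one times
    # 2T / (2*m_max + 2k + 1).
    term = 1.0 / (2 * m_max + 1)
    total = term
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= 2 * T / (2 * m_max + 2 * k + 1)
        total += term

    F = [0.0] * (m_max + 1)
    F[m_max] = math.exp(-T) * total

    # Downward recursion fills in the lower orders.
    for m in range(m_max - 1, -1, -1):
        F[m] = (2 * T * F[m + 1] + math.exp(-T)) / (2 * m + 1)
    return F

print(boys(3, 2.0))
```

The recursion is run downwards because the upward form divides by 2T and subtracts nearly equal numbers when T is small.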

Intception

Intception is designed to automatically generate molecular integral evaluation code, addressing the following issues:

Difficult and time-consuming development process
● Efficient algorithms may be specific to integral types.
● Very efficient code may not be easy to read or debug.
● Discourages development of methods requiring new integral classes.

A shifting software/hardware environment
● Over the lifetime of scientific software, the operating environment of the software may change significantly.
● Efficient algorithms are specific to the software/hardware environment.
● Fully utilising new software/hardware (e.g. GPGPU) requires modification or rewriting of existing code.

[Images: Cray 2 supercomputer (1980s); x86 workstation (1990-2000s); Nvidia GPGPU (2000s-)]

References and Acknowledgements

[1] Obara, S. & Saika, A. J. Chem. Phys. 84, 3963–3974 (1986).
[2] Head-Gordon, M. & Pople, J. A. J. Chem. Phys. 89, 5777–5786 (1988).
[3] Helgaker, T., Jørgensen, P. & Olsen, J. Molecular Electronic-Structure Theory (Wiley, 2000), pp. 365–368.
[4] The Python programming language, version 3.x. https://www.python.org/.
[5] ISO/IEC. Programming languages - C (ISO/IEC 9899:1999(E)) (1999).
[6] MOLPRO, H.-J. Werner, P. J. Knowles, G. Knizia, F. R. Manby, M. Schütz, and others, see http://www.molpro.net.
[7] Optimisation of code in collaboration with MEng student Tom Rumsey.
[8] Dunning Jr., T. H. J. Chem. Phys. 90, 1007–1023 (1989).
[9] Wilson, A. K., Woon, D. E., Peterson, K. A. & Dunning Jr., T. H. J. Chem. Phys. 110, 7667–7676 (1999).

Image credits: Cray 2 image by NASA [Public domain], via Wikimedia Commons; x86 workstation image by Vernon Chan [CC-BY-2.0], via Wikimedia Commons; Nvidia GPU image by Flickr user GBPublic PR [CC BY-NC-SA 2.0], via Flickr. All images modified to add text descriptions.

Intception

[Diagram: Input script (Python) → Intception → Source code (C)]

Input script: integral classes defined in DSL
● Integral indexes
● Base expression
● Recurrence relations

Intception
● Process DSL input
● Construct optimized algorithm
● Output integral evaluation code

Source code
● Can be compiled into a library
● Can be interfaced with existing packages

Key features
● Input is written using a domain-specific language (DSL), built using Python [4].
● Users define integrals using RRs and a base (l = 0) expression.
● Source code is generated by customising an “algorithm template”.
● Output is in C (C99 standard), allowing wide compatibility with other software packages [5].

Domain-specific language (DSL)

The DSL encapsulates the abstract mathematical problem of molecular integral evaluation using Python classes to represent relevant objects.

[Figure: example expression tree built from dsl_binop nodes (op_pow, op_mul; each with left and right operands), dsl_unop nodes (op_exp, op_neg; each with a single arg), dsl_scalar leaves (pi, o_o_xp, xaxb_o_xp, RAB2) and the numeric constant 1.5. A second diagram shows the product ga * gb of two dsl_cartesian_gaussian objects represented as a dsl_binop with left operand ga, right operand gb and dsl_op op_mul.]

The DSL modifies the default behaviour of Python operators, allowing expressions composed of DSL objects to be parsed and manipulated directly within the Python language. Mathematical expressions are represented as trees of binary and unary operations on DSL objects, which may easily be analysed and manipulated in order to generate source code.
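The operator-overloading idea can be sketched in a few lines of Python. The class names here (`Expr`, `Scalar`, `Const`, `BinOp`, `UnOp`) and the `to_c` method are hypothetical stand-ins, not Intception's actual classes or API. The scalar names are borrowed from the poster's expression-tree figure, and the example expression is assumed to correspond to the l = 0 overlap formula.

```python
class Expr:
    """Base class: overloaded operators build tree nodes instead of numbers."""
    def __mul__(self, other):
        return BinOp("*", self, wrap(other))
    def __rmul__(self, other):
        return BinOp("*", wrap(other), self)
    def __pow__(self, other):
        return BinOp("pow", self, wrap(other))
    def __neg__(self):
        return UnOp("neg", self)

class Scalar(Expr):
    """Named scalar variable (a leaf of the tree)."""
    def __init__(self, name):
        self.name = name
    def to_c(self):
        return self.name

class Const(Expr):
    """Numeric constant (a leaf of the tree)."""
    def __init__(self, value):
        self.value = value
    def to_c(self):
        return repr(float(self.value))

class BinOp(Expr):
    """Binary operation with left and right operands."""
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right
    def to_c(self):
        left, right = self.left.to_c(), self.right.to_c()
        if self.op == "pow":
            return f"pow({left}, {right})"
        return f"({left} {self.op} {right})"

class UnOp(Expr):
    """Unary operation with a single argument."""
    def __init__(self, op, arg):
        self.op, self.arg = op, arg
    def to_c(self):
        arg = self.arg.to_c()
        return f"exp({arg})" if self.op == "exp" else f"(-{arg})"

def wrap(x):
    """Promote plain Python numbers to Const nodes."""
    return x if isinstance(x, Expr) else Const(x)

def exp(arg):
    return UnOp("exp", wrap(arg))

pi, o_o_xp, xaxb_o_xp, RAB2 = (Scalar(n) for n in ("pi", "o_o_xp", "xaxb_o_xp", "RAB2"))
expr = (pi * o_o_xp) ** 1.5 * exp(-(xaxb_o_xp * RAB2))
print(expr.to_c())
```

Writing the expression in ordinary Python syntax yields a tree rather than a number, so a later pass can walk it to emit C, which is the role `to_c` plays in this sketch.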

Integral evaluation algorithm template

[Figure: integral evaluation algorithm template, with components: base function, VRR sequence, contraction sequence, HRR sequence, and copy to output array. The routes shown start from primitive base integrals (00|0)^(m) and end with contracted (ab|c) integrals, in some cases passing through (a0|c) intermediates.]

Source code is generated by customising a general template for molecular integral evaluation. The template is customised based on a user-provided description of the integral type (e.g. RRs, expression for l = 0 case) and specification of which algorithm components to use (e.g. VRR, HRR, contraction), written in the DSL.
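To make the idea of customising a template concrete, here is a deliberately small sketch using Python's standard `string.Template`. The template text, placeholder names and generated C routine are all invented for illustration (a 1D (a|0) overlap routine); Intception's real template and output are not shown on the poster and will differ.

```python
from string import Template

# Hypothetical C template; placeholders are filled from the user's description.
ROUTINE_TEMPLATE = Template("""\
void ${name}(int la, double zeta, double PA, double base, double *out)
{
    /* Base function: l = 0 integral */
    out[0] = ${base_expr};
${vrr_block}}
""")

# Optional algorithm component, emitted only when requested.
VRR_BLOCK = Template("""\
    /* VRR sequence */
    for (int a = 0; a < la; ++a) {
        out[a + 1] = ${vrr_expr};
    }
""")

def generate_routine(name, base_expr, vrr_expr=None):
    """Return C source text; the VRR stage is included only if vrr_expr is given."""
    vrr_block = VRR_BLOCK.substitute(vrr_expr=vrr_expr) if vrr_expr else ""
    return ROUTINE_TEMPLATE.substitute(
        name=name, base_expr=base_expr, vrr_block=vrr_block)

source = generate_routine(
    "overlap_a0",
    base_expr="base",
    vrr_expr="PA * out[a] + (a > 0 ? a / (2.0 * zeta) * out[a - 1] : 0.0)",
)
print(source)
```

Leaving out `vrr_expr` produces a base-function-only routine, a toy version of choosing which algorithm components to include.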

Results

Generated molecular integral evaluation code is numerically accurate when compared to existing integral code [6]. It is possible to rapidly generate and test routines using different routes through the algorithm template. At present, generated code is less efficient than the existing code in Molpro, though we anticipate significant improvements with further optimisation [7].

[Figure: two plots, for H2O and Kr, of time taken / s against repetitions (0 to 1000) for three routines: Molpro built-in, Generated (VRR-only), and Generated (VRR and HRR). The time axis runs from 0 to 4 s for H2O and from 0 to 18 s for Kr.]

Plots of serial code execution time to evaluate all contracted, spherically transformed 3-index Coulomb integrals $(ab|r_{12}^{-1}|c)$, using a cc-pVDZ basis set [8, 9] for all atoms. Each repetition is a single loop iteration over a call to the integral evaluation routine.

Type           Algorithm    H2O                                     Kr
                            Av. mag.  Max diff.      RMSD           Av. mag.  Max diff.      RMSD
(a|b)          VRR-only     10^-1     3.33 × 10^-16  4.68 × 10^-17  10^-1     2.64 × 10^-16  2.73 × 10^-17
(a|b)          VRR and HRR  10^-1     3.55 × 10^-16  6.62 × 10^-17  10^-1     2.22 × 10^-16  1.95 × 10^-17
(a|b|c)        VRR-only     10^-2     4.44 × 10^-16  1.05 × 10^-17  10^-2     1.78 × 10^-15  3.88 × 10^-17
(a|b|c)        VRR and HRR  10^-2     4.44 × 10^-16  1.41 × 10^-17  10^-2     3.55 × 10^-15  4.20 × 10^-17
(a|r12^-1|b)   VRR-only     1         4.26 × 10^-14  5.68 × 10^-15  1         1.42 × 10^-14  1.35 × 10^-15
(ab|r12^-1|c)  VRR-only     10^-1     2.35 × 10^-14  1.01 × 10^-15  10^-2     3.55 × 10^-15  1.05 × 10^-16
(ab|r12^-1|c)  VRR and HRR  10^-1     2.35 × 10^-14  1.03 × 10^-15  10^-2     3.55 × 10^-15  9.76 × 10^-17

Numerical comparison of integrals evaluated using Molpro’s [6] built-in routines and generated routines for several types of contracted, spherically transformed integrals, using a cc-pVDZ basis set [8, 9]. Maximum difference is the largest per-integral difference for an integral type, while the root mean square deviation is calculated for the entire block of integrals of each type. The average (mean) magnitude for all integrals computed of each type is reported to provide context.
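The three reported statistics could be computed along the following lines. This is a sketch with a hypothetical function name; it assumes both integral blocks are flat sequences in the same order and that the average magnitude is taken over the reference values, which the caption does not specify.

```python
import math

def compare_integrals(reference, generated):
    """Return (average magnitude, maximum difference, RMSD) for two integral blocks."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("blocks must be non-empty and of equal length")
    diffs = [abs(r - g) for r, g in zip(reference, generated)]
    avg_mag = sum(abs(r) for r in reference) / len(reference)   # assumption: reference block
    max_diff = max(diffs)                                       # largest per-integral difference
    rmsd = math.sqrt(sum(d * d for d in diffs) / len(diffs))    # over the whole block
    return avg_mag, max_diff, rmsd

# Made-up numbers purely to exercise the function.
print(compare_integrals([1.0, -2.0, 3.0], [1.0, -2.5, 3.0]))
```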

March 2015. Research funded by EPSRC.