Citation: C. J. Tymczak and Matt Challacombe, The Journal of Chemical Physics 122, 134102 (2005); doi: 10.1063/1.1853374. View online: http://dx.doi.org/10.1063/1.1853374. Published by AIP Publishing. This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions.


Linear scaling computation of the Fock matrix. VII. Periodic density functional theory at the Γ point

C. J. Tymczak a) and Matt Challacombe
Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545

(Received 18 October 2004; accepted 8 December 2004; published online 1 April 2005)

Linear scaling quantum chemical methods for density functional theory are extended to the condensed phase at the Γ point. For the two-electron Coulomb matrix, this is achieved with a tree-code algorithm for fast Coulomb summation [M. Challacombe and E. Schwegler, J. Chem. Phys. 106, 5526 (1997)], together with multipole representation of the crystal field [M. Challacombe, C. White, and M. Head-Gordon, J. Chem. Phys. 107, 10131 (1997)]. A periodic version of the hierarchical cubature algorithm [M. Challacombe, J. Chem. Phys. 113, 10037 (2000)], which builds a telescoping adaptive grid for numerical integration of the exchange-correlation matrix, is shown to be efficient when the problem is posed as integration over the unit cell. Commonalities between the Coulomb and exchange-correlation algorithms are discussed, with an emphasis on achieving linear scaling through the use of modern data structures. With these developments, convergence of the Γ-point supercell approximation to the k-space integration limit is demonstrated for MgO and NaCl. Linear scaling construction of the Fockian and control of error are demonstrated for RBLYP/6-21G* diamond up to 512 atoms. © 2005 American Institute of Physics. [DOI: 10.1063/1.1853374]

I. INTRODUCTION

Quantum chemical methods that employ Gaussian-type atomic orbitals (GTAOs) offer a number of advantages in materials science. First, because they are local basis functions, it is possible to achieve a linear scaling cost with system size for insulating systems. Second, almost all one- and two-electron integrals involving GTAOs are analytic, enabling the rapid evaluation of expectation values involving complicated operators that are often involved in the computation of response properties.1–3 The Dalton quantum chemical program4 is a premier example of this capability, offering a wide range of electric and magnetic molecular response properties. The ability to treat core states analytically also opens the ability to go beyond the pseudopotential approximation in computation of relativistic effects with the four-component Dirac–Hartree–Fock5,6 and Dirac–Kohn–Sham7 theories. Perhaps most important though, the exact Hartree–Fock (HF) exchange may be computed efficiently with a GTAO basis set. In addition to providing a reference for correlated wave function methods, the exact HF exchange is central to hybrid HF/DFT models.8–11 The use of hybrid methods in the condensed phase, pioneered by the CRYSTAL group,12 has proven to be a useful improvement beyond the generalized gradient approximation for a number of properties, including bulk geometries, electronic properties,13,14 and absorption energies.15,16

Recently, we have developed linear scaling quantum chemical methods for gas phase density functional theory (DFT), including computation of the Coulomb matrix $J$ (Ref. 17) and the exchange-correlation matrix $K_{xc}$.18 In this contribution, these linear scaling methods are extended to periodic boundary conditions at the Γ point.

With periodic linear scaling quantum chemical algorithms, it is possible to begin bridging the gap between methods developed for small molecule chemistry and large scale problems in the solid state. With the results presented here, together with $O(N)$ methods for solving the self-consistent-field equations19,20 and linear scaling algorithms for computing the periodic HF exchange,21 it is now possible to perform condensed phase HF/DFT calculations on systems larger than 500 atoms with a single processor. In addition, with the advent of linear scaling density matrix perturbation theory,22,23 well developed quantum chemical methods for the analytic computation of response properties may be brought to bear on large solid state problems.

This paper is organized as follows: In Sec. II, periodic boundary conditions and the Γ-point approximation are introduced. Next, in Sec. III, the relationship between the numerical error estimates and data structures that underlie the fast linear scaling algorithms for computation of the Coulomb and exchange-correlation matrices is outlined. In Sec. IV, we extend previous work on the Nijboer and De Wette24,25 lattice sum method to linear scaling computation of quantum Coulomb sums and tin-foil boundaries. Then, in Sec. V, $O(N)$ methods for computing the GTAO-based exchange-correlation matrix are presented. In Sec. VI A we discuss the implementation of these developments in the MONDOSCF26 suite of linear scaling quantum chemistry codes. In Sec. VI B comparison of the Γ-point results is made with those obtained with CRYSTAL98 using k-space integration for NaCl and MgO. Next, in Sec. VI C, linear scaling is demonstrated for construction of the diamond Fock matrix at the RBLYP/6-21G* level of theory. Finally, in Sec. VII we present our conclusions.

a) Electronic mail: [email protected]

II. PERIODIC BOUNDARY CONDITIONS, LINEAR SCALING, AND BASIS SETS

In the conventional implementations of periodic boundary conditions, the Bloch functions

$$
\psi_{a\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}}\,\phi_a(\mathbf{r}-\mathbf{R}) \tag{1}
$$

are often constructed from nonorthogonal functions local to the unit cell (UC). Here, the local function $\phi_a$ is a GTAO centered on atom $A$, while the sum on $\mathbf{R}$ runs over the Bravais lattice defined by integer translates of the primitive lattice vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$. These Bloch functions (crystal orbitals) yield all possible translational symmetries through variation of the reciprocal lattice vector $\mathbf{k}$. Programs such as CRYSTAL98 perform a careful sampling of reciprocal space to achieve an accurate description of the periodic system. An alternative approach to including these important symmetries is to set $\mathbf{k}=0$, and then use a larger supercell created through replication and translation of the primitive unit cell. This is the supercell Γ-point approximation, used primarily for the study of defects and vacancies rather than as a replacement for k-space integration.
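As an illustration of the supercell construction described above, the following minimal sketch replicates a primitive cell along its lattice vectors. The function name `build_supercell`, the row-wise lattice layout, and the fcc example cell are assumptions made for this example and are not taken from MONDOSCF.

```python
import numpy as np

def build_supercell(lattice, frac_coords, reps):
    """Replicate a primitive cell reps = (na, nb, nc) times along a, b, c.

    lattice     : 3x3 array whose rows are the primitive vectors a, b, c
    frac_coords : (natoms, 3) fractional coordinates in the primitive cell
    Returns the supercell lattice and the Cartesian atom positions.
    """
    lattice = np.asarray(lattice, dtype=float)
    frac = np.asarray(frac_coords, dtype=float)
    na, nb, nc = reps
    # Integer translations of the primitive cell that tile the supercell
    shifts = np.array([(i, j, k) for i in range(na)
                       for j in range(nb) for k in range(nc)], dtype=float)
    # (fractional + integer shift) expressed in the primitive lattice
    cart = (frac[None, :, :] + shifts[:, None, :]).reshape(-1, 3) @ lattice
    # Each supercell vector is an integer multiple of a primitive vector
    super_lattice = lattice * np.array(reps, dtype=float)[:, None]
    return super_lattice, cart

# Hypothetical two-atom fcc primitive cell with unit lattice constant
fcc = 0.5 * np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
super_lattice, positions = build_supercell(fcc, [[0, 0, 0], [0.5, 0.5, 0.5]], (2, 2, 2))
```

A 2×2×2 replication of a two-atom cell gives 16 atoms in a cell of eight times the volume, which is the object treated at $\mathbf{k}=0$ in the Γ-point approximation.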

In this contribution, $O(N)$ algorithms are developed specifically for the Γ-point approximation, allowing the use of large supercells in the case of high symmetry, as well as large primary cells in the case of disordered systems. While $\mathbf{k}$ dependence is avoided, lattice summation and formal integration over the unit cell volume $V_{\mathrm{UC}}$ are retained. At first sight this would seem to make matrix construction quite different than in the gas phase, where integrals are typically taken over all space, $V_\infty$. Thus, elements of the gas phase overlap matrix,

$$
S_{ab} = \int_{V_\infty} d\mathbf{r}\,\phi_a(\mathbf{r})\,\phi_b(\mathbf{r}), \tag{2}
$$

become

$$
S_{ab} = \sum_{\mathbf{R},\mathbf{R}'} \int_{V_{\mathrm{UC}}} d\mathbf{r}\,\phi_a(\mathbf{r}+\mathbf{R})\,\phi_b(\mathbf{r}+\mathbf{R}') \tag{3}
$$

in the periodic Γ-point regime. However, this formalism can be brought into a form more closely related to its quantum chemical counterpart via the transformation,

$$
\sum_{\mathbf{R}} \int_{V_{\mathrm{UC}}} d\mathbf{r}\, f(\mathbf{r}+\mathbf{R}) \rightarrow \int_{V_\infty} d\mathbf{r}\, f(\mathbf{r}), \tag{4}
$$

allowing use of conventional analytic integral technologies. For example, elements of the periodic overlap matrix become

$$
S_{ab} = \sum_{\mathbf{R}} \int_{V_\infty} d\mathbf{r}\,\phi_a(\mathbf{r})\,\phi_b(\mathbf{r}+\mathbf{R}). \tag{5}
$$
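The equivalence of the unit-cell form, Eq. (3), and the all-space form, Eq. (5), can be checked numerically in a simplified setting. The sketch below assumes one-dimensional, unnormalized s-type Gaussians $e^{-\zeta(x-A)^2}$ on a lattice of period `L`; the function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def overlap_all_space(za, A, zb, B, L, nmax=20):
    """1D analogue of Eq. (5): single lattice sum of analytic all-space overlaps."""
    R = L * np.arange(-nmax, nmax + 1)
    chi = za * zb / (za + zb)  # Gaussian product exponent
    # phi_b(x + R) is centered at B - R, so the center separation is A - B + R
    return np.sum(np.sqrt(np.pi / (za + zb)) * np.exp(-chi * (A - B + R) ** 2))

def overlap_unit_cell(za, A, zb, B, L, nmax=20, npts=2000):
    """1D analogue of Eq. (3): double lattice sum integrated over one cell."""
    x = np.linspace(0.0, L, npts, endpoint=False)
    R = L * np.arange(-nmax, nmax + 1)
    # Periodized basis functions sum_R phi(x + R), evaluated on the grid
    fa = np.exp(-za * (x[:, None] + R[None, :] - A) ** 2).sum(axis=1)
    fb = np.exp(-zb * (x[:, None] + R[None, :] - B) ** 2).sum(axis=1)
    # The rectangle rule converges very rapidly for smooth periodic integrands
    return np.sum(fa * fb) * (L / npts)

s_inf = overlap_all_space(0.8, 0.2, 1.3, 1.1, 3.0)
s_uc = overlap_unit_cell(0.8, 0.2, 1.3, 1.1, 3.0)
```

The two values agree to within quadrature and lattice-truncation error, which is the content of the transformation in Eq. (4).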

For compactness of notation, let us define the intermediate basis function products (distributions) $\rho_{ab}(\mathbf{r})=\sum_{\mathbf{R},\mathbf{R}'}\phi_a(\mathbf{r}+\mathbf{R})\,\phi_b(\mathbf{r}+\mathbf{R}')$ associated with integration over $V_{\mathrm{UC}}$ and the corresponding distributions $\rho^{\infty}_{ab}(\mathbf{r})=\sum_{\mathbf{R}}\phi_a(\mathbf{r})\,\phi_b(\mathbf{r}+\mathbf{R})$ associated with integration over $V_\infty$. We likewise define the electron density $\rho(\mathbf{r})=\sum_{ab}P_{ab}\,\rho_{ab}(\mathbf{r})$ associated with integration over $V_{\mathrm{UC}}$ and the corresponding density $\rho^{\infty}(\mathbf{r})=\sum_{ab}P_{ab}\,\rho^{\infty}_{ab}(\mathbf{r})$ associated with integration over $V_\infty$, where $P_{ab}$ is the one-electron reduced density matrix. In this convention, $V_\infty$ is the default volume of integration, and elements of the periodic overlap matrix are expressed simply as $S_{ab}=\int d\mathbf{r}\,\rho^{\infty}_{ab}(\mathbf{r})$, while the electron count is $N_{el}=\int d\mathbf{r}\,\rho^{\infty}(\mathbf{r})$.

It is worth noting that the complexity of $\rho^{\infty}$ is $O(N)$, due to the exponential prefactor $e^{-\chi_{ab}(\mathbf{A}-\mathbf{B}-\mathbf{R})^2}$ that enters each term in the sum over $A$, $B$, and $\mathbf{R}$. Thus, $N$ scaling may be achieved a priori with a simple distance test. However, for small exponents, care must be exercised in truncation of periodic sums to avoid overlap matrices that are not positive definite. While these situations can often be ameliorated with a tighter distance neglect criterion, they are typically a symptom of near linear dependence, often due to the use of basis sets designed for gas phase calculations in conjunction with small unit cells.
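The simple distance test mentioned above might look as follows. This is a sketch only: it assumes that the exponent in the prefactor is the usual Gaussian product exponent $\zeta_a\zeta_b/(\zeta_a+\zeta_b)$, which the text does not state explicitly, and the names `significant_images`, `tau`, and `nmax` are hypothetical.

```python
import itertools
import numpy as np

def significant_images(A, B, za, zb, lattice, tau=1e-10, nmax=3):
    """Lattice vectors R whose prefactor exp(-chi*|A-B-R|^2) is at least tau."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    lattice = np.asarray(lattice, dtype=float)   # rows are a, b, c
    chi = za * zb / (za + zb)                    # assumed product exponent
    cutoff2 = -np.log(tau) / chi                 # prefactor < tau beyond this |A-B-R|^2
    kept = []
    for n in itertools.product(range(-nmax, nmax + 1), repeat=3):
        R = np.asarray(n, dtype=float) @ lattice
        if np.sum((A - B - R) ** 2) <= cutoff2:
            kept.append(R)
    return np.array(kept)

# Same-center pair with unit exponents in a cubic cell of edge 4 (arbitrary units)
images = significant_images([0, 0, 0], [0, 0, 0], 1.0, 1.0, 4.0 * np.eye(3))
```

Because the number of retained images per pair is bounded by a constant that depends only on the exponents and the threshold, the total number of surviving terms grows linearly with the number of atoms. A production code would enumerate only the images inside the cutoff sphere rather than scanning a fixed block of cells as done here.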

These considerations and others are discussed by Towler in an excellent overview of Gaussian basis sets for the condensed phase.27 Also, there are at least two (albeit related) libraries28,29 of Gaussian basis sets appropriate for materials at standard densities. For high densities though, these basis sets may still encounter problems with linear dependence and sensitivity to truncation. One solution to this problem, suggested by Grüneich and Hess30 for even tempered basis sets, is to scale the exponents by the inverse square of the lattice constant. In many cases though, especially for large systems, standard quantum chemical basis sets work well.

III. DATA STRUCTURES AND ERROR ESTIMATES

Both the quantum chemical tree code (QCTC) (Ref. 17) for computing the Coulomb matrix and Hierarchical Cubature (HiCu) (Ref. 18) for computing the exchange-correlation matrix are fast $O(N \ln N)$ algorithms whose performance is coupled to underlying data structures and error estimates. It is important to understand some of these particulars first, before addressing their extension to periodic boundary conditions. Also, the current version of QCTC is quite different from previous descriptions and deserves some introduction.

Both QCTC and HiCu are homeomorphic, involving k-d tree representation of the density. In our implementations, k-d trees are doubly linked lists with axis aligned bounding boxes (AABBs) delimiting the spatial extent of each node and its children. This scheme is similar to well developed technologies for ray tracing and database searches, allowing fast $O(\ln N)$ range queries of overlapping components through AABB intersection tests.31 In the case of QCTC, this fast look up constitutes the penetration acceptability criterion (PAC), which identifies spatial clusters or agglomerations $\rho_Q$ of the density that may be accurately represented via a multipole approximation due to the absence of charge-charge penetration effects.
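The kind of data structure described above can be sketched as a binary k-d tree whose nodes carry AABBs merged up from the leaves, with range queries pruned by AABB intersection tests. This is an illustrative simplification, not the doubly linked list layout used in the authors' codes; the names `Node`, `build`, `overlaps`, and `query` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class Node:
    lo: np.ndarray                  # lower corner of the AABB
    hi: np.ndarray                  # upper corner of the AABB
    index: Optional[int] = None     # primitive index for leaves, None otherwise
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(centers, extents, idx=None):
    """Split along the widest dimension until each primitive is resolved,
    then merge the children's AABBs on the way back up."""
    if idx is None:
        idx = np.arange(len(centers))
    if len(idx) == 1:
        i = int(idx[0])
        return Node(centers[i] - extents[i], centers[i] + extents[i], index=i)
    pts = centers[idx]
    axis = int(np.argmax(pts.max(axis=0) - pts.min(axis=0)))
    order = idx[np.argsort(pts[:, axis], kind="stable")]
    half = len(order) // 2
    left = build(centers, extents, order[:half])
    right = build(centers, extents, order[half:])
    return Node(np.minimum(left.lo, right.lo), np.maximum(left.hi, right.hi),
                left=left, right=right)

def overlaps(lo1, hi1, lo2, hi2):
    """AABB intersection test: the boxes overlap along every axis."""
    return bool(np.all(lo1 <= hi2) and np.all(lo2 <= hi1))

def query(node, lo, hi, out):
    """Collect leaves whose AABB intersects [lo, hi], pruning whole subtrees."""
    if not overlaps(node.lo, node.hi, lo, hi):
        return out
    if node.index is not None:
        out.append(node.index)
    else:
        query(node.left, lo, hi, out)
        query(node.right, lo, hi, out)
    return out

rng = np.random.default_rng(0)
centers = rng.uniform(0.0, 10.0, (50, 3))
extents = rng.uniform(0.1, 0.5, 50)      # stand-ins for the primitive extents
root = build(centers, extents)
hits = sorted(query(root, np.array([2.0, 2.0, 2.0]), np.array([5.0, 5.0, 5.0]), []))
```

Subtrees whose merged box misses the query box are rejected with a single test, which is what makes the look up logarithmic for well separated data.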

For accepted clusters a second test, the multipole acceptability criterion (MAC), is performed to check translation errors in the multipole expansion. This second test is critical to the overall accuracy of the Coulomb matrix build. We have recently developed a different MAC in Ref. 32 that has several advantages. First, it takes into account the magnitude or weight of the distribution within the cluster. Second, it correctly takes into account the angular symmetry of the primitive Gaussian distributions. Third, and most important, it is always an exact bound to the translation error.

For each primitive bra distribution $\rho_{ab}$, a fast range query is performed on the k-d tree representation of the total density, leading to an on the fly partition of near-field (NF) and far-field (FF) interactions in construction of the gas phase Coulomb matrix, which may be written as

$$
J_{ab} = \sum_{Q\in \mathrm{FF}} \sum_{l} (-1)^{l} \sum_{m} O_{l}^{m}[\rho_{ab}] \sum_{l'} \sum_{m'} M_{l+l'}^{m+m'}\, O_{l'}^{m'}[\rho_{Q}]
+ \sum_{q\in \mathrm{NF}} \int d\mathbf{r} \int d\mathbf{r}'\, \rho_{ab}(\mathbf{r})\, |\mathbf{r}-\mathbf{r}'|^{-1}\, \rho_{q}(\mathbf{r}'), \tag{6}
$$

where $M_{l}^{m}$ is the irregular solid harmonic interaction tensor, $O_{l}^{m}[f]=\int d\mathbf{r}\, O_{l}^{m}(\mathbf{r})\, f(\mathbf{r})$ is a moment of the regular solid harmonic, $Q$ runs over the highest possible nodes in the density tree consistent with the PAC and MAC, and $q$ runs over leftover near-field primitive distributions in the density. See Refs. 17 and 33 for further details on this representation.

A fundamental difference between QCTC and FMM based methods is that QCTC pushes the near/far-field partition to the limit, employing the PAC and MAC best-case error estimates to resolve individual primitive distributions. On the other hand, FMM based methods employ static, worst-case error estimates. While recursing down the density tree to the level of individual primitives precludes well developed technologies for the integral evaluation of contracted functions, it accelerates the onset of linear scaling through early clustering.

The quantum chemical tree code generally employs the total density, which simplifies the code, allows electrostatic screening in MAC error estimates, and provides charge neutrality, an essential feature for periodic calculations. Thus, the Coulomb matrix employed here includes the electron-nuclear terms; $J \equiv J_{ee} + V_{en}$.

In the case of HiCu, two separate k-d tree structures are used. The rho-tree holds the electron density, while the cube-tree contains a hierarchical grid for integration of the exchange-correlation potential. Each node of the cube-tree is composed of a Cartesian nonproduct integration rule with the grid points enclosed by its AABB. The cube-tree is constructed iteratively through recursive bisection of the primary volume (the root AABB), using exact error bounds to achieve arbitrary precision of the integrated density and its gradients. As the cube-tree is extended, AABB intersection tests are performed while traversing the rho-tree, avoiding parts of the density that do not overlap with that portion of the grid. Upon construction of the grid, the reverse procedure is carried out; for each primitive distribution, the cube-tree is walked, selecting only overlapping portions of the grid via the AABB intersection test.

For both of these fast algorithms, the trade-off between efficiency and accuracy is controlled by the AABB, which in turn depends on the extent or range $R_q$ of a primitive Gaussian distribution $\rho_q$, beyond which it is assumed to be negligible. Of course, negligible depends on the use to which the distribution is put, as will become obvious in the following.

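Both HiCu and QCTC employ the Hermite–Gaussian representation of distributions34

$$
\rho_q(\mathbf{r}) = \sum_{lmn} d_{lmn}\, \Lambda^{q}_{lmn}(\mathbf{r}), \tag{7}
$$

where

$$
\Lambda^{q}_{lmn}(\mathbf{r}) = \frac{\partial^{\,l+m+n}}{\partial Q_x^{\,l}\, \partial Q_y^{\,m}\, \partial Q_z^{\,n}}\, e^{-\zeta_q (\mathbf{r}-\mathbf{Q})^2}. \tag{8}
$$

This representation provides an intermediate form into which elements of the density matrix may be folded, and allows the use of McMurchie–Davidson recurrence relations35 in analytic integral evaluation and density evaluation. For this form, Cramer's inequality36 provides a bound for the behavior of a Hermite–Gaussian distribution:

$$
\rho_q(\mathbf{r}) \le C_q\, e^{-\tilde{\zeta}_q (\mathbf{r}-\mathbf{Q})^2}, \tag{9}
$$

where

$$
C_q = \sum_{lmn} |d_{lmn}|\, K^{3} \left[ 2^{\,l+m+n}\, l!\, m!\, n!\, \zeta_q^{\,l+m+n} \right]^{1/2}, \tag{10}
$$

the constant $K=1.09$, and

$$
\tilde{\zeta}_q = \begin{cases} \zeta_q, & l+m+n=0 \\ \tfrac{1}{2}\zeta_q, & \text{otherwise.} \end{cases} \tag{11}
$$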

The overlap extent $R_q^{o}$ is the value beyond which numerical evaluation of the distribution $\rho_q$ yields a value less than $\tau$,

$$
C_q\, e^{-\tilde{\zeta}_q (R_q^{o})^2} = \tau. \tag{12}
$$
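Equation (12) can be inverted in closed form, $R_q^{o}=\sqrt{\ln(C_q/\tau)/\tilde{\zeta}_q}$. The sketch below shows this inversion; the function name and the convention of returning zero when $C_q \le \tau$ (the distribution is negligible everywhere) are assumptions made for this example.

```python
import math

def overlap_extent(C, zeta_tilde, tau):
    """Solve C * exp(-zeta_tilde * R**2) = tau for R >= 0.

    Returns 0.0 when C <= tau, i.e., when the bound is already below the
    threshold at the center and the distribution can be dropped.
    """
    if C <= tau:
        return 0.0
    return math.sqrt(math.log(C / tau) / zeta_tilde)

R_o = overlap_extent(1.0, 1.0, 1e-8)   # unit-exponent s-type example
```

The extent grows only as the square root of the logarithm of $C_q/\tau$, so tightening the threshold by several orders of magnitude enlarges the AABB modestly.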

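For QCTC, errors in the electrostatic potential due to penetration must also be considered. For this purpose, the penetration extent $R_q^{p}$ is introduced, satisfying the equation

$$
C_q \int \left[ (\pi/\tilde{\zeta}_q)^{3/2}\, \delta(\mathbf{r}) - e^{-\tilde{\zeta}_q r^2} \right] |\mathbf{r}-\mathbf{R}_q^{p}|^{-1}\, d\mathbf{r} = \tau. \tag{13}
$$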

In both HiCu and QCTC, the density tree is constructed by recursively splitting the largest box dimension, until each primitive has been resolved. Then the primitive AABBs are computed from their extents and merged recursively back up the tree. For HiCu, this is all there is to it, but for QCTC multipole moments are also translated to a common center and recursively merged up the tree. Also, when computing matrix elements of $J$, the primitive bra AABB is computed with $R_q^{o}$, while the $R_q^{p}$ are used to construct AABBs of the density tree.

In Fig. 1, differences between the penetration and overlap extent are shown for a diffuse s-type Gaussian. For large extents, such as those encountered in a static FMM-type error bound, the two extents behave in a similar way. However, with aggressive use of the multipole approximation as in QCTC, the distinction becomes critical.
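For an s-type Gaussian the integral in Eq. (13) has a closed form: the potential of the Gaussian is $(\pi/\tilde{\zeta})^{3/2}\,\mathrm{erf}(\sqrt{\tilde{\zeta}}R)/R$, so the difference from the point charge is $(\pi/\tilde{\zeta})^{3/2}\,\mathrm{erfc}(\sqrt{\tilde{\zeta}}R)/R$. The sketch below solves the resulting equation by bisection. It covers only this s-type case, and the function names and bracketing strategy are assumptions of this example rather than the QCTC implementation.

```python
import math

def penetration_error(C, zeta, R):
    """Left-hand side of Eq. (13) for an s-type Gaussian at distance R > 0."""
    return C * (math.pi / zeta) ** 1.5 * math.erfc(math.sqrt(zeta) * R) / R

def penetration_extent(C, zeta, tau, r_hi=1.0):
    """Bisect for the R at which the penetration error equals tau."""
    # The error decreases monotonically with R, so grow the bracket first
    while penetration_error(C, zeta, r_hi) > tau:
        r_hi *= 2.0
    r_lo = 0.0
    for _ in range(200):
        mid = 0.5 * (r_lo + r_hi)
        if penetration_error(C, zeta, mid) > tau:
            r_lo = mid
        else:
            r_hi = mid
    return r_hi

R_p = penetration_extent(1.0, 1.0, 1e-8)
# When C equals tau the overlap extent of Eq. (12) vanishes, but R_p does not
R_p_small = penetration_extent(1e-8, 1.0, 1e-8)
```

The second call illustrates the behavior summarized in the caption of Fig. 1: as $C_q$ approaches $\tau$ the overlap extent collapses to zero, while the $1/R$ Coulomb factor keeps the penetration extent finite.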


IV. PERIODIC QUANTUM COULOMB SUMS

In the Γ-point approximation, elements of the periodic Coulomb matrix are

$$
J_{ab} = \int_{V_{\mathrm{UC}}} d\mathbf{r} \int d\mathbf{r}'\, \rho_{ab}(\mathbf{r})\, |\mathbf{r}-\mathbf{r}'|^{-1}\, \rho_{\mathrm{tot}}(\mathbf{r}')
= \sum_{\mathbf{R}} \int\!\!\int d\mathbf{r}\, d\mathbf{r}'\, \rho^{\infty}_{ab}(\mathbf{r})\, |\mathbf{r}-\mathbf{r}'|^{-1}\, \rho^{\infty}_{\mathrm{tot}}(\mathbf{r}'+\mathbf{R}), \tag{14}
$$

where $\rho_{\mathrm{tot}}$ is the total periodic density including both electronic and nuclear terms. These integrals involve infinite summation over the lattice vectors $\mathbf{R}$, and must be handled with care. There are at least two main approaches to handling this summation: multipole expansion of the Ewald potential or Ewald-like summation of the multipole expansion. Expansion of the Ewald potential yields tin-foil (TF) boundary conditions, requires reciprocal and real space summation with every $J$ build, and scales as $O(N^{3/2})$. An alternative is the Ewald-like summation of the multipole interaction tensor, which was first described by Nijboer and De Wette (NDW) (Refs. 24 and 25) and later reviewed and extended by Challacombe, White, and Head-Gordon33 to lattice summation of the irregular solid harmonic multipole interaction tensor. This Ewald-like summation is taken over the periodic far field $V_{\mathrm{PFF}}$ and is equivalent to a direct lattice summation (not a true Ewald sum) excluding an inner region $V_{\mathrm{In}}$ surrounding the unit cell. This inner region has been subtracted to avoid penetration errors and to guarantee convergence of the multipole expansion. With the summed interaction tensors cheaply precomputed and reused, the cost of Coulomb summation over the PFF scales as $O(Np^2)$, where $p$ is the order of the multipole expansion. With this partition, the $N$-scaling periodic quantum Coulomb sums involve the contributions

$$
J = J^{\mathrm{In}} + J^{\mathrm{PFF}} + J^{\mathrm{TF}}, \tag{15}
$$

corresponding to the three separate regions shown in Fig. 2. Here, $J^{\mathrm{In}}$ is computed using the fast $O(N \ln N)$ QCTC algorithm outlined previously in Sec. III. Construction of $J^{\mathrm{PFF}}$ will be developed in the following section, while in Sec. IV B the term $J^{\mathrm{TF}}$, necessary to introduce tin-foil boundary conditions, is detailed.

A. The periodic far field

By construction, the PFF term in the Coulomb matrix,

$$
J^{\mathrm{PFF}}_{ab} = \sum_{\mathbf{R}\in \mathrm{PFF}} \int\!\!\int d\mathbf{r}\, d\mathbf{r}'\, \rho^{\infty}_{ab}(\mathbf{r})\, |\mathbf{r}-\mathbf{r}'+\mathbf{R}|^{-1}\, \rho^{\infty}_{\mathrm{tot}}(\mathbf{r}'), \tag{16}
$$

involves charge distributions that are well separated with respect to both penetration and the convergence of multipole expansion errors, as outlined in Fig. 2 and discussed in the following.

With these conditions, and assuming the unit cell is centered at the origin, the bipolar multipole expansion employing the regular and irregular solid harmonics, $O_{l}^{m}$ and $M_{l}^{m}$, respectively, is

$$
|\mathbf{r}-\mathbf{r}'+\mathbf{R}|^{-1} \approx \sum_{l=0}^{p} (-1)^{l} \sum_{l'=0}^{p} \left[ \sum_{m=-l}^{l} \sum_{m'=-l'}^{l'} O_{l}^{m}(\mathbf{r})\, M_{l+l'}^{m+m'}(\mathbf{R})\, O_{l'}^{m'}(\mathbf{r}') \right]. \tag{17}
$$

FIG. 1. (Color) Behavior of the overlap extent $R^{o}$ and the penetration extent $R^{p}$ as a function of $\tau/C_q$ for an s-type Gaussian with exponent $\zeta=1$. For small $C_q$ (occurring, for example, due to a large atom-atom separation and/or small density matrix prefactor), $R^{o}$ goes to zero at the origin and its distribution is eliminated, while $R^{p}$ goes slowly to zero due to the Coulomb singularity.

FIG. 2. Schematic of the regions contributing to $N$-scaling summation of the Coulomb matrix. The inner cells that make up $V_{\mathrm{In}}$ provide a buffer region that guarantees convergence of the multipole expansion of Coulomb interactions between the unit cell and all cells in $V_{\mathrm{PFF}}$. The periodic far-field region $V_{\mathrm{PFF}}$ is the spherically ordered lattice extending to infinity but excluding $V_{\mathrm{In}}$. For large cells and/or high order multipole expansions, $V_{\mathrm{In}}$ includes just the unit cell's 27 nearest neighbors. In FMM notation this corresponds to ws = 1. However, for smaller cells and/or lower order multipole expansions, $V_{\mathrm{In}}$ tends to a spherical distribution of cells surrounding the unit cell. Direct summation over $V_{\mathrm{PFF}}$ leads to charges at the infinite surface $S_\infty$, which must be canceled by tin-foil (conducting) boundary conditions to achieve equivalence with Ewald summation.

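Inserting Eq. (17) into Eq. (16) yields

$$
J^{\mathrm{PFF}}_{ab} = \sum_{\mathbf{R}\in \mathrm{PFF}} \sum_{l=0}^{p} (-1)^{l} \sum_{l'=0}^{p} \sum_{m=-l}^{l} \left( \sum_{m'=-l'}^{l'} O_{l}^{m}[\rho^{\infty}_{ab}]\, M_{l+l'}^{m+m'}(\mathbf{R})\, O_{l'}^{m'}[\rho^{\infty}_{\mathrm{tot}}] \right). \tag{18}
$$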

This expression decouples the complexity of $\rho^{\infty}_{ab}$ from $\rho^{\infty}_{\mathrm{tot}}$ through the precomputed multipole moments $O_{l}^{m}[\rho^{\infty}_{ab}]=\int d\mathbf{r}\, O_{l}^{m}(\mathbf{r})\, \rho^{\infty}_{ab}(\mathbf{r})$ and $O_{l}^{m}[\rho^{\infty}_{\mathrm{tot}}]=\int d\mathbf{r}\, O_{l}^{m}(\mathbf{r})\, \rho^{\infty}_{\mathrm{tot}}(\mathbf{r})$. Following Nijboer and De Wette,24,25,33 we introduce the effective multipole interaction tensor

$$
\mathcal{M}_{l}^{m} = \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} M_{l}^{m}(\mathbf{R}), \tag{19}
$$

which can be efficiently computed on the fly for each new lattice, both to high accuracy and to high order (large $p$) using the new methods detailed in the Appendix. Note that this is a direct sum of the interaction tensor, and is not equivalent to Ewald summation. Nevertheless, with this simplification, the $O(p^2 N)$ working equation

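$$
J^{\mathrm{PFF}}_{ab} = \sum_{l=0}^{p} \sum_{m=-l}^{l} O_{l}^{m}[\rho^{\infty}_{ab}]\, \mathcal{J}_{l}^{m} \tag{20}
$$

is obtained, where the intermediate tensor

$$
\mathcal{J}_{l}^{m} = (-1)^{l} \sum_{l'=0}^{p} \sum_{m'=-l'}^{l'} \mathcal{M}_{l+l'}^{m+m'}\, O_{l'}^{m'}[\rho^{\infty}_{\mathrm{tot}}] \tag{21}
$$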

is cheaply precomputed at the start of each Coulomb build.

Because Eq. (20) is inexpensive, our strategy is to define a minimal buffer region $V_{\mathrm{In}}$ sufficient to control penetration errors, subtracting effort from the computation of $J^{\mathrm{In}}$ via QCTC and replacing it with cheaper multipole work in the computation of $J^{\mathrm{PFF}}$. To this end, a fixed inner region $V_{\mathrm{In}}$ is constructed from neighboring cells that have simple Gaussian overlap with the unit cell, defined by the radius $R^{o}$. As explained in Sec. III, for the relatively large distances considered at this level the differences between the penetration and overlap extent are negligible. With $V_{\mathrm{In}}$ fixed, the precision of $J^{\mathrm{PFF}}$ is controlled entirely by the expansion order $p$. In general $p$ will be much higher than the expansion order ($\sim 5$) employed by QCTC in computation of $J^{\mathrm{In}}$. With QCTC, accuracy is controlled on the fly by the MAC and PAC, establishing a dynamic near/far-field partition, while computation of $J^{\mathrm{PFF}}$ involves a static, worst-case error dominated by the multipole expansion. This static error is controlled by using the FMM-like error bound,

$$
\frac{2^{p}\, C^{2}\, d_{\max}^{\,p+1}}{(R^{o})^{p+1}\, |R^{o}-2d_{\max}|} \le \tau_{\mathrm{MAC}}, \tag{22}
$$

to set the appropriate expansion order $p$. In Eq. (22), $d_{\max}$ is the maximum translational distance, $C$ is the asymptotic Unsöld weight of the total density, and $\tau_{\mathrm{MAC}}$ is the threshold controlling the translation errors. See Ref. 32 for development of this expression and further explanation of these parameters.
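The saving expressed by Eqs. (20) and (21) is a reordering of sums: the contraction of the lattice-summed tensor with the moments of the total density is done once, after which each distribution costs only $O(p^2)$. The sketch below checks this factorization against the full double sum over $(l,m)$ and $(l',m')$. It uses random real numbers in place of actual multipole moments and interaction tensors, and all names are hypothetical.

```python
import numpy as np

def lm_pairs(p):
    """All (l, m) with 0 <= l <= p and -l <= m <= l."""
    return [(l, m) for l in range(p + 1) for m in range(-l, l + 1)]

def direct_sum(O_ab, O_tot, M, p):
    """Full double sum over (l, m) and (l', m') with the lattice-summed tensor."""
    total = 0.0
    for l, m in lm_pairs(p):
        for l2, m2 in lm_pairs(p):
            total += (-1) ** l * O_ab[l, m] * M[l + l2, m + m2] * O_tot[l2, m2]
    return total

def precompute_J(O_tot, M, p):
    """Eq. (21): one O(p^4) pass that does not depend on the pair ab."""
    return {(l, m): (-1) ** l * sum(M[l + l2, m + m2] * O_tot[l2, m2]
                                    for l2, m2 in lm_pairs(p))
            for l, m in lm_pairs(p)}

def contract(O_ab, J):
    """Eq. (20): O(p^2) work for each distribution."""
    return sum(O_ab[lm] * J[lm] for lm in J)

rng = np.random.default_rng(1)
p = 3
O_ab = {lm: rng.normal() for lm in lm_pairs(p)}      # stand-in bra moments
O_tot = {lm: rng.normal() for lm in lm_pairs(p)}     # stand-in density moments
M = {lm: rng.normal() for lm in lm_pairs(2 * p)}     # stand-in summed tensor
J_lm = precompute_J(O_tot, M, p)
value_fast = contract(O_ab, J_lm)
value_slow = direct_sum(O_ab, O_tot, M, p)
```

Note that the summed tensor is needed through order $2p$, since the indices $l+l'$ and $m+m'$ range up to twice the expansion order.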

B. Tin-foil boundary conditions

The surface charges created by direct summation over $V_{\mathrm{PFF}}$ must be canceled to achieve equivalence with Ewald summation. Achieving this equivalence is more than semantic, since without tin-foil boundary conditions matrix elements lack translational invariance and often incur dramatic charge sloshing instabilities. The correction is strongly dependent on ordering of the direct sum; as the Nijboer and De Wette method corresponds to spherical summation due to symmetry of the real/reciprocal space partition, the appropriate correction is37

$$
\Phi^{\mathrm{Ew}}(\mathbf{r}) = \Phi^{\mathrm{SS}}(\mathbf{r}) + \frac{2\pi}{3V_{\mathrm{UC}}}\, (Q - 2\,\mathbf{r}\cdot\mathbf{D}), \tag{23}
$$

where $\mathbf{D}$ is the system dipole moment, $Q$ is the trace of the system quadrupole, and we have assumed origin centering. The tin-foil correction to the Coulomb matrix is then

$$
J^{\mathrm{TF}}_{ab} = \frac{2\pi}{3V_{\mathrm{UC}}}\, (Q\, S_{ab} - 2\,\mathbf{d}_{ab}\cdot\mathbf{D}), \tag{24}
$$

with $S_{ab}$ being an element of the overlap matrix and $\mathbf{d}_{ab}$ the dipole moment of the distribution $\rho^{\infty}_{ab}$.
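Given the overlap and dipole matrices, the tin-foil term is a single array expression. The sketch below takes the prefactor $2\pi/(3V_{\mathrm{UC}})$ from Eq. (23), integrated against each distribution; the function name and the array layout (dipole integrals stored as an `(n, n, 3)` array) are assumptions of this example.

```python
import numpy as np

def tin_foil_correction(S, d, D, Q, volume):
    """J^TF_ab = 2*pi / (3*V_UC) * (Q * S_ab - 2 * d_ab . D).

    S      : (n, n) overlap matrix
    d      : (n, n, 3) dipole moments of the distributions rho_ab
    D      : (3,) dipole moment of the total (origin-centered) density
    Q      : trace of the quadrupole of the total density
    volume : unit cell volume V_UC
    """
    S = np.asarray(S, dtype=float)
    d = np.asarray(d, dtype=float)
    D = np.asarray(D, dtype=float)
    # d @ D contracts the Cartesian index, giving an (n, n) matrix
    return 2.0 * np.pi / (3.0 * volume) * (Q * S - 2.0 * (d @ D))

# Toy input: two orthonormal functions, no distribution dipoles
J_tf = tin_foil_correction(np.eye(2), np.zeros((2, 2, 3)), [0.1, 0.0, 0.0], 3.0, 2.0 * np.pi)
```

The cost is linear in the number of significant distributions, so this term does not affect the overall scaling of the Coulomb build.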

V. PERIODIC EXCHANGE CORRELATION

The HiCu algorithm is ideally suited for periodic boundary conditions, as the unit cell $V_{\mathrm{UC}}$ can be simply transformed into an equivalent rectangular integration domain $V_{\square}$, that is, the cube-tree's AABB. These volumes, shown in Fig. 3, are equivalent due to full periodicity of both distributions and density. The integration is then simply

$$
K^{xc}_{ab} = \int_{V_{\square}} d\mathbf{r}\, \rho_{ab}(\mathbf{r})\, v_{xc}[\rho;\mathbf{r}]. \tag{25}
$$

This approach should be contrasted with more conventional quantum chemical methods for computing the exchange-correlation matrix, involving the "Becke weights,"38 which demand numerical integration over $V_\infty$.

While we have written Eq. (25) in terms of the exchange-correlation potential for simplicity, in practice HiCu employs the Pople, Gill, and Johnson formulation.18,39

Because the distributions and density both involve a double sum over lattice vectors, there will be a large number of atom-atom pairs that do not overlap with $V_{\square}$. A similar situation is encountered in the gas phase for parallel versions of HiCu,40 where each processor has a small, local cube-tree that may overlap only a few of the many possible atom-atom pairs. The solution to this problem again comes from the ray tracing literature, in the form of a modified ray-AABB (Ref. 31) and sphere-AABB test.41 The ray-AABB test has been modified into a cylinder-AABB test, where the radius of the cylinder is a maximal overlap extent of the atom-atom pair. In the case of a same center atom-atom pair, it is of course more appropriate to employ a sphere-AABB test. In both cases, overlap between the HiCu integration volume and atom-atom pairs is established with a negligible prefactor when using these tests.
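A sphere-AABB test of the kind used for same center pairs can be written in a few lines: clamp the sphere center to the box to find the nearest box point, then compare distances. This is the standard formulation from the graphics literature, shown here as a sketch; the function name and argument layout are hypothetical, and the cylinder-AABB variant for two-center pairs is not covered.

```python
import numpy as np

def sphere_intersects_aabb(center, radius, lo, hi):
    """True when a sphere overlaps the axis aligned box [lo, hi]."""
    center, lo, hi = (np.asarray(v, dtype=float) for v in (center, lo, hi))
    # Nearest point of the box to the sphere center, found by clamping
    nearest = np.clip(center, lo, hi)
    return bool(np.sum((center - nearest) ** 2) <= radius ** 2)

# A same-center pair one unit outside the face of the unit box
box_lo, box_hi = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
misses = sphere_intersects_aabb([2.0, 0.5, 0.5], 0.9, box_lo, box_hi)
touches = sphere_intersects_aabb([2.0, 0.5, 0.5], 1.1, box_lo, box_hi)
```

Here the sphere radius would play the role of the overlap extent of the pair, and the box that of the integration volume or a cube-tree node.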

VI. RESULTS

A. Implementation

All developments were implemented in a serial version of MONDOSCF v1.0a9,26 a suite of linear scaling quantum chemistry codes. The code was compiled using the Portland Group F90 compiler PGF90 V4.2 (Ref. 42) with the -O1 -tp athlon options and with the GNU C compiler GCC V3.2.2 using the -O1 flag. All calculations were carried out on a 1.6 GHz AMD Athlon running RedHat Linux V9.0 (Ref. 43).

Thresholds controlling the cost to accuracy ratio of HiCu and QCTC are set by the accuracy levels LOOSE, GOOD, and TIGHT, which have been empirically chosen to deliver 4–5, 6–7, and 8–9 digits, respectively, of relative accuracy in the energy. Values of these thresholds are listed in Appendix B of Ref. 21. The unmodified two-electron threshold $\tau_{2E}$ sets the overlap extent $R_q^{o}$ in Eq. (12) and the penetration extent $R_q^{p}$ in Eq. (13), both of which control the PAC. As explained in Ref. 32, the threshold $\tau_{\mathrm{MAC}}$ controlling the MAC is set as $\tau_{\mathrm{MAC}}=10^{2}\,\tau_{2E}$. The HiCu threshold $\tau_{\mathrm{HiCu}}$ likewise sets two subthresholds. The overlap extent $R_q^{o}$ in Eq. (12), defining accuracy of the density on the grid, is set using $10^{-1}\tau_{\mathrm{HiCu}}$ ($\tau_{\rho}$ in Ref. 18). The target relative error defining accuracy of the HiCu grid is just $\tau_{\mathrm{HiCu}}$ ($\tau_{r}$ in Ref. 18). It should be pointed out that of all the thresholding schemes, those governing HiCu are the least conservative; it is a simple (and not too expensive) matter to tighten the HiCu threshold if intermediate accuracies are required.

The multipole interaction and contraction code used by QCTC in the near/far-field partition has been highly optimized by symbolic manipulation and factorization, using real arithmetic and expansions through seventh order in the calculation of $J^{\mathrm{In}}$. The computation of $J^{\mathrm{PFF}}$ employs a general code for multipole contraction, allowing expansion through $p=64$. Eigensolution of the self-consistent-field equations has been used throughout, with the corresponding matrix and distribution thresholds given in Appendix B of Ref. 21. All calculations were performed with $C_1$ (no) symmetry, and all results are reported in atomic units.

B. Validation

The ability of our implementation to reproduce true Ewald summation is shown in Fig. 4 for a periodic system of 64 classical water molecules. Note that with both the Ewald sum and the Nijboer and De Wette approach, ordering the real and reciprocal space sums is critical; high order agreement is achieved only when the summation proceeds from the smallest to the largest terms.

The use of Cartesian Gaussian basis sets in many cases allows direct numerical comparison of different programs, at least to within the approximations, grids, etc., peculiar to a code. Here, we make connection with the preeminent Gaussian orbital program for periodic calculations, CRYSTAL98.12

TABLE I. Comparison of CRYSTAL98 and MONDOSCF Γ-point calculations on NaCl at the RBLYP/8-511/8-631G level of theory.

Program      Nat    Energy (a.u.)    Energy/Nat
MONDOSCF     2 a    −622.391 01      −311.195 51
CRYSTAL98    2 a    −622.391 14      −311.195 57
MONDOSCF     8 b    −2490.0016       −311.250 20
CRYSTAL98    8 b    −2490.0013       −311.250 16

a Triclinic.
b Cubic.

FIG. 3. (Color) Transformation between the unit cell with volume $V_{\mathrm{UC}}$ (described by the primitive lattice vectors a, b, and c) and the rectangular integration volume $V_h$ employed by HiCu.

FIG. 4. (Color) Error in the Coulomb energy computed with the Nijboer and De Wette scheme relative to true Ewald summation. Shown is the error in the Coulomb energy vs $p$ with one ($w_s=1$) and two ($w_s=2$) layers of cells in $V_{\mathrm{PFF}}$ for a periodic system of 64 classical water molecules.


Calculations have been carried out largely with basis sets optimized for the condensed phase (Ref. 28), which tend to have less diffuse valence functions. Tables I–III make a direct comparison with CRYSTAL98 for NaCl and MgO obtained with the MONDOSCF TIGHT precision level. For the CRYSTAL98 calculations, we used the following threshold parameters: TOLDENS=10, TOLPOT=10, TOLGRID=15, and BASIS=4. The BASIS parameter determines the auxiliary functions used to fit the exchange-correlation potential.

In Table I, comparison is made for Γ-point NaCl with the 8-511G (Ref. 44) basis set for sodium, the 8-631G (Ref. 45) basis set for chlorine, and using the restricted BLYP functional (Refs. 46 and 47). Next, in Table II, convergence of the supercell Γ-point approximation is demonstrated for NaCl with the STO-3G basis set and the restricted local density approximation. Then, in Table III, convergence of the supercell Γ-point approximation to the k-space integration result is demonstrated for MgO, using the 8-61G (Ref. 48) basis set for magnesium, the 8-51G (Ref. 48) basis set for oxygen, and the restricted BLYP functional. The primitive lattice coordinates for these systems are given in Ref. 49.

Finally, in Table IV, convergence of the supercell Γ-point approximation is shown for diamond at the GOOD accuracy level, using the restricted BLYP functional and the 6-21G* (Ref. 50) basis set. Since MONDOSCF employs 6-d and 10-f functions, while CRYSTAL98 employs 5-d and 7-f, we were not able to make a direct comparison for this basis set.
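The supercell convergence reported for MgO can be checked with a few lines of arithmetic. The sketch below copies the MONDOSCF total energies and the CRYSTAL98 reference from Table III, recomputes the energy per atom, and prints its deviation from the k-space result; the variable names are illustrative only.

```python
# (N_at, total energy in a.u.) for MgO, copied from Table III (MONDOSCF rows).
gamma_point = [
    (2, -275.09097),
    (8, -1101.7295),
    (16, -2203.6904),
    (54, -7437.7989),
    (64, -8815.2131),
    (128, -17630.430),
    (216, -29751.352),
]
# CRYSTAL98 reference from Table III: 2 atoms, 6x6x6 k-space grid.
reference_per_atom = -275.47547 / 2

deviations = []
for n_atoms, energy in gamma_point:
    per_atom = energy / n_atoms
    deviation = abs(per_atom - reference_per_atom)
    deviations.append(deviation)
    print(f"{n_atoms:4d}  {per_atom:.5f}  {deviation:.1e}")
```

The deviation falls from roughly 0.2 a.u. per atom for the two-atom cell to the level of the last tabulated digits for the largest supercells, where rounding of the table entries starts to dominate.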

C. Scaling and accuracy

Demonstrating linear scaling at the outset, Fig. 5 shows the CPU time for $J$ and $K_{xc}$ builds with RBLYP/6-21G* diamond at its standard density. These timings correspond to a GOOD level of accuracy, targeting six digits in the total energy and corresponding to the values listed in Table IV. Shown in Fig. 6 is the precision of the computed energies, obtained by performing a second set of calculations with all thresholds reduced by one order of magnitude. For these calculations, the largest source of error is the numerical integration performed by HiCu, as the QCTC thresholds are significantly more conservative.
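One common way to quantify the onset of linear scaling is to fit the slope of log(time) against log(system size); a slope near 1 indicates linear scaling. The sketch below shows such a fit. The system sizes are the atom counts of Table IV, but the timings are synthetic placeholders generated from exact linear and quadratic models, since the measured CPU times appear only graphically in Fig. 5. The function name `scaling_exponent` is hypothetical.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) versus log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# System sizes from Table IV; the timings are SYNTHETIC placeholders, not
# measurements: they are generated from exact linear and quadratic models.
sizes = [8, 16, 32, 64, 96, 144, 216, 288, 384]
linear_times = [0.5 * n for n in sizes]
quadratic_times = [0.01 * n * n for n in sizes]

linear_slope = scaling_exponent(sizes, linear_times)
quadratic_slope = scaling_exponent(sizes, quadratic_times)
print(f"linear model   : slope = {linear_slope:.3f}")
print(f"quadratic model: slope = {quadratic_slope:.3f}")
```

With measured timings, the fit would normally be restricted to the larger systems, because fixed overheads flatten the curve at small sizes.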

VII. CONCLUSIONS

We have extended linear scaling quantum chemical methods for computation of exchange-correlation and Coulomb matrices to periodic boundary conditions at the Γ point. These methods have demonstrated an early onset of linear scaling and error control for diamond, allowing calculations up to 512 atoms at the RBLYP/6-21G* level of theory. In both cases, this early onset of linear scaling has been enabled by the use of modern data structures such as the k-d tree, together with reliable error estimates for the Gaussian extent.

TABLE II. Convergence of the Γ-point supercell approximation for NaCl, computed with MONDOSCF at the RLDA/STO-3G level of theory.

Program       Nat     Energy (a.u.)    Energy/Nat
MONDOSCF      2a      −610.97536       −305.48768
MONDOSCF      8b      −2444.3584       −305.54480
MONDOSCF      16a     −4888.7002       −305.54377
MONDOSCF      54a     −16499.490       −305.54611
MONDOSCF      64b     −19554.956       −305.54618
MONDOSCF      128a    −39109.912       −305.54618
MONDOSCF      216b    −65997.977       −305.54619
CRYSTAL98c    2a      −611.09228       −305.54614

a: Triclinic. b: Cubic. c: 6×6×6 k-space grid.

TABLE III. Convergence of the Γ-point supercell approximation for MgO, computed with MONDOSCF at the RBLYP/8-61G/8-51G level of theory.

Program       Nat     Energy (a.u.)    Energy/Nat
MONDOSCF      2a      −275.09097       −137.54548
MONDOSCF      8b      −1101.7295       −137.71618
MONDOSCF      16a     −2203.6904       −137.73065
MONDOSCF      54a     −7437.7989       −137.73702
MONDOSCF      64b     −8815.2131       −137.73771
MONDOSCF      128a    −17630.430       −137.73774
MONDOSCF      216b    −29751.352       −137.73774
CRYSTAL98c    2a      −275.47547       −137.73774

a: Triclinic. b: Cubic. c: 6×6×6 k-space grid.

TABLE IV. Convergence of the Γ-point supercell approximation for diamond, computed with MONDOSCF at the RBLYP/6-21G* level of theory.

Nat     Energy (a.u.)    Energy/Nat
8       −303.989         −37.9986
16      −608.667         −38.0417
32      −1218.02         −38.0632
64      −2436.28         −38.0669
96      −3654.59         −38.0687
144     −5482.04         −38.0697
216     −8223.10         −38.0699
288     −10964.1         −38.0700
384     −14618.9         −38.0700

FIG. 5. (Color) Computational complexity of the $J$ and $K_{xc}$ matrix builds for cubic diamond at the RBLYP/6-21G* level of theory at the GOOD accuracy level.


These algorithms have been parallelized (Refs. 40 and 51), demonstrating high efficiencies up to 128 processors, and have been used recently to determine the $T=0$ K equation of state for pentaerythritol tetranitrate (Ref. 52) at the RPBE/6-31G** level of theory and a GOOD accuracy level.

While this contribution has focused on demonstrating linear scaling for diamond, the methods presented here work for slabs and wires as well, using the methods for computing the two- and one-dimensional $M$ tensor outlined in the Appendix. Our experience has shown that, for the same number of atoms, these lower dimensional systems run much faster.

ACKNOWLEDGMENTS

The authors would like to acknowledge Tommy Sewell and Ed Kober for their advice and support. They would also like to thank Chee Kwan Gan for a careful reading of this manuscript.

APPENDIX: COMPUTATION OF THE M TENSOR

Following Nijboer and De Wette (Refs. 24 and 25), and later Challacombe, White, and Head-Gordon (Ref. 33) (CWHG), we begin with the partition

$$
\frac{1}{r^{l+1}} = G_l(\beta,r) + F_l(\beta,r) \tag{A1}
$$

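involving the functions

$$
G_l(\beta,r) = \frac{\Gamma\!\left(l+\tfrac{1}{2},\,\beta^2 r^2\right)}{\Gamma\!\left(l+\tfrac{1}{2}\right) r^{l+1}} \tag{A2}
$$

and

$$
F_l(\beta,r) = \frac{\gamma\!\left(l+\tfrac{1}{2},\,\beta^2 r^2\right)}{\Gamma\!\left(l+\tfrac{1}{2}\right) r^{l+1}}, \tag{A3}
$$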

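where $\Gamma$ is the gamma function, $\gamma$ is the incomplete gamma function (Ref. 53), and $\beta$ is a parameter controlling the partition. With this separation of length scales, the lattice sum defining the multipole interaction tensor $M_l^m$ may be expressed as

$$
M_l^m = \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} M_l^m[\mathbf{R}]
= \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} \tilde{P}_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, G_l(\beta,|\mathbf{R}|)
+ \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} \tilde{P}_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, F_l(\beta,|\mathbf{R}|). \tag{A4}
$$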
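The partition of Eq. (A1) can be checked numerically. The sketch below evaluates $G_l$ and $F_l$ by two independent routes and confirms that they add up to $1/r^{l+1}$: the upper incomplete gamma function at half-integer order is built from `erfc` with the standard recurrence $\Gamma(a+1,x)=a\,\Gamma(a,x)+x^a e^{-x}$, and the lower one from its power series. The function names and the value of $\beta$ are assumptions chosen for illustration, not taken from the MONDOSCF implementation.

```python
import math


def upper_gamma_half(l: int, x: float) -> float:
    """Gamma(l + 1/2, x), built up from Gamma(1/2, x) = sqrt(pi) * erfc(sqrt(x))."""
    value = math.sqrt(math.pi) * math.erfc(math.sqrt(x))
    for k in range(l):
        a = k + 0.5
        value = a * value + x**a * math.exp(-x)  # Gamma(a+1, x) from Gamma(a, x)
    return value


def lower_gamma_series(a: float, x: float, tol: float = 1e-17) -> float:
    """gamma(a, x) from its power series; independent of upper_gamma_half."""
    term = 1.0 / a
    total = term
    k = 0
    while term > tol * total:
        k += 1
        term *= x / (a + k)
        total += term
    return x**a * math.exp(-x) * total


def G(l: int, beta: float, r: float) -> float:
    """Short-ranged part of 1/r^(l+1), Eq. (A2)."""
    a = l + 0.5
    return upper_gamma_half(l, (beta * r) ** 2) / (math.gamma(a) * r ** (l + 1))


def F(l: int, beta: float, r: float) -> float:
    """Long-ranged, smooth part of 1/r^(l+1), Eq. (A3)."""
    a = l + 0.5
    return lower_gamma_series(a, (beta * r) ** 2) / (math.gamma(a) * r ** (l + 1))


beta = 1.2  # arbitrary illustrative partition parameter
max_rel_dev = 0.0
for l in range(5):
    for r in (0.5, 1.0, 2.0, 3.0):
        exact = r ** -(l + 1)
        rel_dev = abs(G(l, beta, r) + F(l, beta, r) - exact) / exact
        max_rel_dev = max(max_rel_dev, rel_dev)
print(f"max relative deviation from 1/r^(l+1): {max_rel_dev:.1e}")
```

For $l=0$ the two pieces reduce to the familiar Ewald split, $\mathrm{erfc}(\beta r)/r$ and $\mathrm{erf}(\beta r)/r$, which is why $G_l$ dies off quickly in real space while $F_l$ is smooth enough to be summed in reciprocal space.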

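Following CWHG, this expression can be further developed into real and reciprocal space terms:

$$
M_l^m = \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} \tilde{P}_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, G_l(\beta,|\mathbf{R}|)
- \sum_{\mathbf{R}\in V_{\mathrm{In}}} \tilde{P}_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, F_l(\beta,|\mathbf{R}|)
+ \frac{4\pi^{3/2}\left(\tfrac{i}{2}\right)^{l}}{V_{\mathrm{UC}}\,\Gamma\!\left(l+\tfrac{1}{2}\right)} \sum_{\mathbf{G}\neq\{0\}} |\mathbf{G}|^{l-2}\, e^{-\pi^2|\mathbf{G}|^2/\beta^2}\, \tilde{P}_l^m(\cos\theta_{\mathbf{G}})\, e^{im\phi_{\mathbf{G}}}, \tag{A5}
$$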

where $\mathbf{G}$ are reciprocal lattice vectors. With an appropriate choice of $\beta \sim \sqrt{\pi}/(V_{\mathrm{UC}})^{1/3}$, and summing terms from smallest to largest, the periodic multipole interaction tensor can be computed to high precision assuming an accurate representation of the incomplete gamma function. In previous work by CWHG, the upward recursion

$$
\Gamma(m+1,x) = m\,\Gamma(m,x) + x^{m} e^{-x} \tag{A6}
$$

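was used, which results in a loss of precision for large values of $x$ and $m$, demanding extended precision arithmetic and precluding on the fly computation. This problem is overcome by analytically summing the incomplete gamma function, collecting terms, and then rewriting it as

$$
\Gamma\!\left(m+\tfrac{1}{2},x\right) = \Gamma\!\left(m+\tfrac{1}{2}\right)\left\{ \mathrm{erfc}\!\left(\sqrt{x}\right) + \sqrt{\frac{x}{\pi}} \sum_{n=0}^{m-1} \left(S_n\, x\, e^{-x/n}\right)^{n} \right\}, \tag{A7}
$$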

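where the terms

$$
S_n = \left( \frac{\Gamma\!\left(\tfrac{1}{2}\right)}{\Gamma\!\left(n+\tfrac{3}{2}\right)} \right)^{1/\max(n,1)}, \tag{A8}
$$

are simply pretabulated. This version of the incomplete gamma function is both easy to program and precise, even for large values of $x$ or $m$.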
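A minimal sketch of the erfc-plus-finite-sum idea follows. Unrolling the recursion (A6) at half-integer order from $\Gamma(\tfrac{1}{2},x)=\sqrt{\pi}\,\mathrm{erfc}(\sqrt{x})$ gives $\Gamma(m+\tfrac{1}{2},x)=\Gamma(m+\tfrac{1}{2})\left[\mathrm{erfc}(\sqrt{x})+e^{-x}\sum_{n=0}^{m-1}x^{n+1/2}/\Gamma(n+\tfrac{3}{2})\right]$, which has the same structure as Eq. (A7). The code tabulates the coefficients $1/\Gamma(n+\tfrac{3}{2})$ directly rather than the roots $S_n$ of Eq. (A8); that choice, the table length `M_MAX`, and the function names are assumptions of this sketch and not necessarily what MONDOSCF does.

```python
import math

M_MAX = 32
# Pretabulated coefficients 1 / Gamma(n + 3/2), n = 0 .. M_MAX - 1.
INV_GAMMA = [1.0 / math.gamma(n + 1.5) for n in range(M_MAX)]


def upper_gamma_closed(m: int, x: float) -> float:
    """Gamma(m + 1/2, x) as erfc plus a finite sum of positive terms."""
    sqrt_x = math.sqrt(x)
    series = 0.0
    power = sqrt_x  # x**(n + 1/2), starting at n = 0
    for n in range(m):
        series += power * INV_GAMMA[n]
        power *= x
    return math.gamma(m + 0.5) * (math.erfc(sqrt_x) + math.exp(-x) * series)


def upper_gamma_recursive(m: int, x: float) -> float:
    """Gamma(m + 1/2, x) by upward recursion, Eq. (A6) at half-integer order."""
    value = math.sqrt(math.pi) * math.erfc(math.sqrt(x))
    for k in range(m):
        a = k + 0.5
        value = a * value + x**a * math.exp(-x)
    return value


worst = 0.0
for m in range(9):
    for x in (0.1, 1.0, 5.0):
        closed = upper_gamma_closed(m, x)
        recursive = upper_gamma_recursive(m, x)
        worst = max(worst, abs(closed - recursive) / recursive)
print(f"largest relative difference: {worst:.1e}")
```

The comparison covers moderate arguments only, where both routes agree to rounding error; it does not attempt to reproduce the large-$x$, large-$m$ regime discussed in the text.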

FIG. 6. (Color) Relative error with system size for RBLYP/6-21G* diamond at the GOOD accuracy level.


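In one dimension, the $M$ tensor can be computed analytically as

$$
M_l^m = \frac{\tilde{P}_l^m(\cos\theta_0)\, e^{im\phi_0}}{a_0^{\,l+1}} \sum_{n=n_0}^{\infty} \frac{1}{n^{l+1}}
+ \frac{\tilde{P}_l^m\!\left[\cos\!\left(\theta_0+\tfrac{\pi}{2}\right)\right] e^{im(\phi_0+\pi)}}{a_0^{\,l+1}} \sum_{n=n_0}^{\infty} \frac{1}{n^{l+1}}
= Q_l^m[a_0,\theta_0,\phi_0] \left\{ \zeta(l+1) - \sum_{n=1}^{n_0-1} \frac{1}{n^{l+1}} \right\}, \tag{A9}
$$

where $a_0$, $\theta_0$, and $\phi_0$ are the initial box dimension and angles, which are independent of the summation, $Q_l^m$ collects the two angular prefactors, and $\zeta$ is the Riemann zeta function (Ref. 53).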
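The bracketed factor in Eq. (A9) is the tail of a zeta series, so it needs no explicit infinite sum. The sketch below evaluates it for $l \geq 1$ with a small zeta routine (a direct sum plus an Euler-Maclaurin tail estimate, used here only to keep the example free of external libraries). The function names `zeta` and `lattice_tail` and the cutoff `n_direct` are assumptions of this illustration.

```python
import math


def zeta(s: float, n_direct: int = 50) -> float:
    """Riemann zeta for s > 1: direct sum plus an Euler-Maclaurin tail estimate."""
    if s <= 1.0:
        raise ValueError("the series only converges for s > 1")
    N = n_direct
    head = sum(n ** -s for n in range(1, N))
    tail = (
        N ** (1.0 - s) / (s - 1.0)
        + 0.5 * N ** -s
        + s * N ** (-s - 1.0) / 12.0
        - s * (s + 1.0) * (s + 2.0) * N ** (-s - 3.0) / 720.0
    )
    return head + tail


def lattice_tail(l: int, n0: int) -> float:
    """Sum over n >= n0 of n**-(l+1), as zeta(l+1) minus the first n0-1 terms."""
    s = l + 1
    return zeta(s) - sum(n ** -s for n in range(1, n0))


# Compare with a long explicit sum for l = 3, n0 = 5.
brute_force = sum(n ** -4 for n in range(5, 200_001))
print(lattice_tail(3, 5), brute_force)
```

Multiplying `lattice_tail(l, n0)` by the angular prefactor $Q_l^m[a_0,\theta_0,\phi_0]$ would then give the one-dimensional tensor element of Eq. (A9).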

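In two dimensions, the Fourier integrals for the calculation of the $M$ tensor become more complicated. Taking the limit as the box dimension in the nonperiodic direction goes to infinity (the $z$ direction in the following), we obtain from Eq. (A5)

$$
M_l^m = \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} (l-m)!\, P_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, G_l(\beta,|\mathbf{R}|)
- \sum_{\mathbf{R}'\in V_{\mathrm{In}}} (l-m)!\, P_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, F_l(\beta,|\mathbf{R}|)
+ C_2 \sum_{\mathbf{G}\neq\{0\}} \int_{-\infty}^{\infty} dG_z\, |\mathbf{G}|^{l-2}\, e^{-\pi^2|\mathbf{G}|^2/\beta^2}\, \tilde{P}_l^m(\cos\theta_{\mathbf{G}})\, e^{im\phi_{\mathbf{G}}}, \tag{A10}
$$

where

$$
C_2 = \frac{4\pi^{3/2}\left(\tfrac{i}{2}\right)^{l}}{A_{\mathrm{UC}}\,\Gamma\!\left(l+\tfrac{1}{2}\right)} \tag{A11}
$$

and $A_{\mathrm{UC}}$ is the area of the cell along the nonperiodic direction. In practice, we carry out numerical evaluation of this integral.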
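As a rough illustration of what such a numerical evaluation might look like, the sketch below applies the trapezoid rule to the radial part of the $G_z$ integrand in Eq. (A10), with $|\mathbf{G}|^2 = G_\parallel^2 + G_z^2$. It is deliberately simplified: the angular factor $\tilde{P}_l^m(\cos\theta_{\mathbf{G}})\,e^{im\phi_{\mathbf{G}}}$ is omitted, the integration range is truncated where the Gaussian is negligible, and the quadrature rule, the names `gz_integral` and `g_par`, and the parameter values are all assumptions rather than the authors' procedure.

```python
import math


def gz_integral(l: int, g_par: float, beta: float, n_steps: int = 400) -> float:
    """Trapezoid estimate of the G_z integral of |G|^(l-2) exp(-pi^2 |G|^2 / beta^2).

    |G|^2 = g_par^2 + G_z^2 with g_par > 0; the angular factor of Eq. (A10)
    is left out of this simplified sketch.
    """
    z_max = 7.0 * beta / math.pi  # Gaussian factor is below exp(-49) beyond this
    h = 2.0 * z_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        gz = -z_max + i * h
        g2 = g_par * g_par + gz * gz
        value = g2 ** (0.5 * (l - 2)) * math.exp(-math.pi**2 * g2 / beta**2)
        total += value if 0 < i < n_steps else 0.5 * value
    return h * total


# Closed forms available for this simplified integrand at l = 2 and l = 4.
beta, g_par = 1.0, 0.5
a = math.pi**2 / beta**2
gauss = math.exp(-a * g_par**2) * math.sqrt(math.pi / a)
exact_l2 = gauss
exact_l4 = gauss * (g_par**2 + 0.5 / a)
print(gz_integral(2, g_par, beta), exact_l2)
print(gz_integral(4, g_par, beta), exact_l4)
```

The trapezoid rule converges very quickly for smooth, rapidly decaying integrands of this kind, which is why a few hundred points already match the closed forms.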

1. M. Honda, K. Sato, and S. Obara, J. Chem. Phys. 94, 3790 (1991).
2. T. Helgaker and P. R. Taylor, Theor. Chim. Acta 83, 177 (1992).
3. P. J. Stephens, F. J. Devlin, C. F. Chabalowski, and M. J. Frisch, J. Phys. Chem. 98, 11623 (1994).
4. T. Helgaker, H. Jensen, P. Jorgensen et al., DALTON, a molecular electronic structure program, Release 1.2, 2001, URL http://www.kjemi.uio.no/software/dalton/dalton.html
5. J. K. Laerdahl, T. Saue, and K. Faegri, Theor. Chem. Acc. 97, 177 (1997).
6. I. P. Grant and H. M. Quiney, Int. J. Quantum Chem. 80, 283 (2000).
7. T. Yanai, H. Iikura, T. Nakajima, Y. Ishikawa, and K. Hirao, J. Chem. Phys. 115, 8267 (2001).
8. P. M. Gill, B. J. Johnson, J. A. Pople, and M. J. Frisch, Int. J. Quantum Chem. S26, 319 (1992).
9. A. D. Becke, J. Chem. Phys. 98, 1372 (1993).
10. V. Barone, C. Adamo, and F. Mele, Chem. Phys. Lett. 249, 290 (1996).
11. C. Adamo, M. Cossi, and V. Barone, THEOCHEM 493, 145 (1999).
12. V. Saunders, R. Dovesi, C. Roetti, M. Causà, N. Harrison, R. Orlando, and C. M. Zicovich-Wilson, CRYSTAL98, http://www.chimifm.unito.it/teorica/crystal/ (1998).
13. T. Bredow and A. R. Gerson, Phys. Rev. B 61, 5194 (2000).
14. J. Muscat, A. Wander, and N. M. Harrison, Chem. Phys. Lett. 342, 397 (2001).
15. P. Baranek, A. Lichanot, R. Orlando, and R. Dovesi, Chem. Phys. Lett. 340, 362 (2001).
16. A. Wander and N. M. Harrison, J. Chem. Phys. 105, 6191 (2001).
17. M. Challacombe and E. Schwegler, J. Chem. Phys. 106, 5526 (1997).
18. M. Challacombe, J. Chem. Phys. 113, 10037 (2000).
19. A. M. N. Niklasson, Phys. Rev. B 66, 155115 (2002).
20. A. M. N. Niklasson, C. J. Tymczak, and M. Challacombe, J. Chem. Phys. 118, 8611 (2003).
21. C. J. Tymczak, V. Weber, E. Schwegler, and M. Challacombe, Phys. Rev. B (submitted).
22. A. M. N. Niklasson and M. Challacombe, Phys. Rev. Lett. 92, 193001 (2004).
23. V. Weber, A. M. N. Niklasson, and M. Challacombe, Phys. Rev. Lett. 92, 193002 (2004).
24. B. R. A. Nijboer and F. W. De Wette, Physica (Amsterdam) 23, 309 (1957).
25. B. R. A. Nijboer and F. W. De Wette, Physica (Amsterdam) 24, 422 (1958).
26. M. Challacombe, E. Schwegler, C. J. Tymczak, C. K. Gan, K. Nemeth, V. Weber, A. M. N. Niklasson, and G. Henkelman, MONDOSCF v1.0a9, A program suite for massively parallel, linear scaling SCF theory and ab initio molecular dynamics, 2001, URL http://www.t12.lanl.gov/home/mchalla/; Los Alamos National Laboratory Report No. LA-CC 01-2 (unpublished), Copyright University of California.
27. M. Towler, An introductory guide to Gaussian basis sets in solid-state electronic structure calculations (2000), notes for Summer School, Torino.
28. R. Dovesi, V. Saunders, C. Roetti, M. Causa, N. Harrison, R. Orlando, and C. Zicovich-Wilson, CRYSTAL98 Basis Sets, http://www.crystal.unito.it/BasisISets/ptable.html (2003).
29. M. Towler, CRYSTAL98 Basis Sets, http://www.tcm.phy.cam.ac.uk/~mdt26/crystal.html (1998).
30. A. Grüneich and B. A. Hess, Theor. Chem. Acc. 100, 253 (1998).
31. M. Gomez, Simple Intersection Tests for Games (1999), URL http://www.gamasutra.com/
32. C. J. Tymczak and M. Challacombe (unpublished).
33. M. Challacombe, C. White, and M. Head-Gordon, J. Chem. Phys. 107, 10131 (1997).
34. G. R. Ahmadi and J. Almlöf, Chem. Phys. Lett. 246, 364 (1995).
35. L. E. McMurchie and E. R. Davidson, J. Comput. Phys. 26, 218 (1978).
36. E. Hille, Ann. Math. 27, 427 (1926).
37. A. Redlack and J. Grindlay, Can. J. Phys. 50, 2815 (1972).
38. A. D. Becke, J. Chem. Phys. 88, 2547 (1988).
39. J. A. Pople, P. M. W. Gill, and B. G. Johnson, Chem. Phys. Lett. 199, 557 (1992).
40. C. K. Gan and M. Challacombe, J. Chem. Phys. 118, 9128 (2003).
41. Graphics Gems, edited by A. S. Glassner (Academic, New York, 1990).
42. The Portland Group, PGF90 V4.2 (2002), URL http://www.pgroup.com/
43. Redhat, REDHAT V9.0, http://www.redhat.com/ (2004).
44. M. Prencipe, A. Zupan, R. Dovesi, E. Apra, and V. R. Saunders, Phys. Rev. B 51, 3391 (1995).
45. N. M. Harrison and V. R. Saunders, J. Phys. I 4, 3873 (1992).
46. C. T. Lee, W. T. Yang, and R. G. Parr, Phys. Rev. B 37, 785 (1988).
47. A. D. Becke, Phys. Rev. A 38, 3098 (1988).
48. M. Causa, R. Dovesi, C. Pisani, and C. Roetti, Phys. Rev. B 33, 1308 (1986).
49. Periodic coordinates used in MONDOSCF validation (2004), URL http://www.t12.lanl.gov/~mchalla/
50. M. Catti, A. Pavese, R. Dovesi, and V. R. Saunders, Phys. Rev. B 47, 9189 (1993).
51. C. K. Gan, C. J. Tymczak, and M. Challacombe, J. Chem. Phys. (to be published).
52. C. K. Gan, T. D. Sewell, and M. Challacombe, Phys. Rev. B 69, 035116 (2004).
53. Handbook of Mathematical Functions, edited by M. Abramowitz and I. A. Stegun, 9th ed. (Dover, New York, 1987).
