Linear scaling computation of the Fock matrix. VII. Periodic density functional theory at the Γ...
Linear scaling computation of the Fock matrix. VII. Periodic density functional theory at the Γ point
C. J. Tymczak and Matt Challacombe
Citation: The Journal of Chemical Physics 122, 134102 (2005); doi: 10.1063/1.1853374
View online: http://dx.doi.org/10.1063/1.1853374
Published by AIP Publishing
This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP:
129.49.23.145 On: Wed, 17 Dec 2014 18:45:48
Linear scaling computation of the Fock matrix. VII. Periodic density functional theory at the Γ point
C. J. Tymczak a) and Matt Challacombe
Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545
(Received 18 October 2004; accepted 8 December 2004; published online 1 April 2005)
Linear scaling quantum chemical methods for density functional theory are extended to the condensed phase at the Γ point. For the two-electron Coulomb matrix, this is achieved with a tree-code algorithm for fast Coulomb summation [M. Challacombe and E. Schwegler, J. Chem. Phys. 106, 5526 (1997)], together with multipole representation of the crystal field [M. Challacombe, C. White, and M. Head-Gordon, J. Chem. Phys. 107, 10131 (1997)]. A periodic version of the hierarchical cubature algorithm [M. Challacombe, J. Chem. Phys. 113, 10037 (2000)], which builds a telescoping adaptive grid for numerical integration of the exchange-correlation matrix, is shown to be efficient when the problem is posed as integration over the unit cell. Commonalities between the Coulomb and exchange-correlation algorithms are discussed, with an emphasis on achieving linear scaling through the use of modern data structures. With these developments, convergence of the Γ-point supercell approximation to the k-space integration limit is demonstrated for MgO and NaCl. Linear scaling construction of the Fockian and control of error is demonstrated for RBLYP/6-21G* diamond up to 512 atoms. © 2005 American Institute of Physics. [DOI: 10.1063/1.1853374]
I. INTRODUCTION
Quantum chemical methods that employ Gaussian-type atomic orbitals (GTAOs) offer a number of advantages in materials science. First, because they are local basis functions, it is possible to achieve a linear scaling cost with system size for insulating systems. Second, almost all one- and two-electron integrals involving GTAOs are analytic, enabling the rapid evaluation of expectation values involving complicated operators that are often involved in the computation of response properties [1–3]. The Dalton quantum chemical program [4] is a premier example of this capability, offering a wide range of electric and magnetic molecular response properties. The ability to treat core states analytically also makes it possible to go beyond the pseudopotential approximation in computation of relativistic effects with the four-component Dirac–Hartree–Fock [5,6] and Dirac–Kohn–Sham [7] theories. Perhaps most important though, the exact Hartree–Fock (HF) exchange may be computed efficiently with a GTAO basis set. In addition to providing a reference for correlated wave function methods, the exact HF exchange is central to hybrid HF/DFT models [8–11]. The use of hybrid methods in the condensed phase, pioneered by the CRYSTAL group [12], has proven to be a useful improvement beyond the generalized gradient approximation for a number of properties, including bulk geometries, electronic properties [13,14], and absorption energies [15,16].

Recently, we have developed linear scaling quantum chemical methods for gas phase density functional theory (DFT), including computation of the Coulomb matrix $J$ (Ref. 17) and the exchange-correlation matrix $K^{xc}$ [18]. In this contribution, these linear scaling methods are extended to periodic boundary conditions at the Γ point.

With periodic linear scaling quantum chemical algorithms, it is possible to begin bridging the gap between methods developed for small molecule chemistry and large scale problems in the solid state. Together with the results presented here, O(N) methods for solving the self-consistent-field equations [19,20] and linear scaling algorithms for computing the periodic HF exchange [21], it is now possible to perform condensed phase HF/DFT calculations on systems larger than 500 atoms with a single processor. In addition, with the advent of linear scaling density matrix perturbation theory [22,23], well developed quantum chemical methods for the analytic computation of response properties may be brought to bear on large solid state problems.
This paper is organized as follows: In Sec. II, periodic boundary conditions and the Γ-point approximation are introduced. Next, in Sec. III, the relationship between the numerical error estimates and data structures that underlie the fast linear scaling algorithms for computation of the Coulomb and exchange-correlation matrices is outlined. In Sec. IV, we extend previous work on the Nijboer and De Wette [24,25] lattice sum method to linear scaling computation of quantum Coulomb sums and tin-foil boundaries. Then, in Sec. V, O(N) methods for computing the GTAO-based exchange-correlation matrix are presented. In Sec. VI A we discuss the implementation of these developments in the MONDOSCF [26] suite of linear scaling quantum chemistry codes. In Sec. VI B comparison of the Γ-point results is made with those obtained with CRYSTAL98 using k-space integration for NaCl and MgO. Next, in Sec. VI C, linear scaling is demonstrated for construction of the diamond Fock matrix at the RBLYP/6-21G* level of theory. Finally, in Sec. VII we present our conclusions.

a) Electronic mail: [email protected]
II. PERIODIC BOUNDARY CONDITIONS, LINEAR SCALING, AND BASIS SETS
In the conventional implementations of periodic boundary conditions, the Bloch functions

$$\psi_a^{\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}}\,\phi_a(\mathbf{r}-\mathbf{R}) \tag{1}$$

are often constructed from nonorthogonal functions local to the unit cell (UC). Here, the local function $\phi_a$ is a GTAO centered on atom $A$, while the sum on $\mathbf{R}$ runs over the Bravais lattice defined by integer translates of the primitive lattice vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$. These Bloch functions (crystal orbitals) yield all possible translational symmetries through variation of the reciprocal lattice vector $\mathbf{k}$. Programs such as CRYSTAL98 perform a careful sampling of reciprocal space to achieve an accurate description of the periodic system. An alternative approach to including these important symmetries is to set $\mathbf{k}=0$, and then use a larger supercell created through replication and translation of the primitive unit cell. This is the supercell Γ-point approximation, used primarily for the study of defects and vacancies rather than as a replacement for k-space integration.
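The supercell construction described above can be sketched in a few lines of Python. This is only an illustration, not the MONDOSCF implementation: the function name `build_supercell`, the two-atom rock-salt cell, and the lattice constant `a0` are assumptions introduced here.

```python
from itertools import product

def build_supercell(lattice, atoms, reps):
    """Replicate a primitive cell reps = (na, nb, nc) times along a, b, c.

    lattice: three primitive lattice vectors (each a 3-tuple)
    atoms:   list of (symbol, (x, y, z)) in Cartesian coordinates
    Returns the supercell lattice vectors and the replicated atom list.
    """
    na, nb, nc = reps
    super_lattice = [tuple(n * x for x in vec) for n, vec in zip(reps, lattice)]
    super_atoms = []
    for i, j, k in product(range(na), range(nb), range(nc)):
        # Lattice translation R = i*a + j*b + k*c
        shift = [i * lattice[0][d] + j * lattice[1][d] + k * lattice[2][d]
                 for d in range(3)]
        for symbol, pos in atoms:
            super_atoms.append((symbol, tuple(p + s for p, s in zip(pos, shift))))
    return super_lattice, super_atoms

# Hypothetical two-atom rock-salt primitive cell (a0 is illustrative only)
a0 = 5.64
fcc = [(0.0, a0 / 2, a0 / 2), (a0 / 2, 0.0, a0 / 2), (a0 / 2, a0 / 2, 0.0)]
primitive = [("Na", (0.0, 0.0, 0.0)), ("Cl", (a0 / 2, a0 / 2, a0 / 2))]
cell, atoms = build_supercell(fcc, primitive, (2, 2, 2))
print(len(atoms))  # 2 atoms x 8 replicas
```

A 2×2×2 replication of the two-atom cell gives a 16-atom triclinic supercell, which would then be treated with $\mathbf{k}=0$ only.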
In this contribution, O(N) algorithms are developed specifically for the Γ-point approximation, allowing the use of large supercells in the case of high symmetry, as well as large primary cells in the case of disordered systems. While k dependence is avoided, lattice summation and formal integration over the unit cell volume $V_{UC}$ are retained. At first sight this would seem to make matrix construction quite different than in the gas phase, where integrals are typically taken over all space, $V_\infty$. Thus, elements of the gas phase overlap matrix,

$$S_{ab} = \int_{V_\infty} d\mathbf{r}\,\phi_a(\mathbf{r})\,\phi_b(\mathbf{r}), \tag{2}$$

become

$$S_{ab} = \sum_{\mathbf{R}\mathbf{R}'} \int_{V_{UC}} d\mathbf{r}\,\phi_a(\mathbf{r}+\mathbf{R})\,\phi_b(\mathbf{r}+\mathbf{R}') \tag{3}$$

in the periodic Γ-point regime. However, this formalism can be brought into a form more closely related to its quantum chemical counterpart via the transformation

$$\sum_{\mathbf{R}} \int_{V_{UC}} d\mathbf{r}\, f(\mathbf{r}+\mathbf{R}) \rightarrow \int_{V_\infty} d\mathbf{r}\, f(\mathbf{r}), \tag{4}$$

allowing use of conventional analytic integral technologies. For example, elements of the periodic overlap matrix become

$$S_{ab} = \sum_{\mathbf{R}} \int_{V_\infty} d\mathbf{r}\,\phi_a(\mathbf{r})\,\phi_b(\mathbf{r}+\mathbf{R}). \tag{5}$$
For compactness of notation, let us define the intermediate basis function products (distributions) $\rho_{ab}(\mathbf{r})=\sum_{\mathbf{R}\mathbf{R}'}\phi_a(\mathbf{r}+\mathbf{R})\phi_b(\mathbf{r}+\mathbf{R}')$ associated with integration over $V_{UC}$ and the corresponding distributions $\rho_{ab}^{\infty}(\mathbf{r})=\sum_{\mathbf{R}}\phi_a(\mathbf{r})\phi_b(\mathbf{r}+\mathbf{R})$ associated with integration over $V_\infty$. We likewise define the electron density $\rho(\mathbf{r})=\sum_{ab}P_{ab}\rho_{ab}(\mathbf{r})$ associated with integration over $V_{UC}$ and the corresponding density $\rho^{\infty}(\mathbf{r})=\sum_{ab}P_{ab}\rho_{ab}^{\infty}(\mathbf{r})$ associated with integration over $V_\infty$, where $P_{ab}$ is the one-electron reduced density matrix. In this convention, $V_\infty$ is the default volume of integration, and elements of the periodic overlap matrix are expressed simply as $S_{ab}=\int d\mathbf{r}\,\rho_{ab}^{\infty}(\mathbf{r})$, while the electron count is $N_{el}=\int d\mathbf{r}\,\rho^{\infty}(\mathbf{r})$.

It is worth noting that the complexity of $\rho^{\infty}$ is O(N), due to the exponential prefactor $e^{-\chi_{ab}(\mathbf{A}-\mathbf{B}-\mathbf{R})^2}$ that enters each term in the sum over $A$, $B$, and $\mathbf{R}$. Thus, N scaling may be achieved a priori with a simple distance test. However, for small exponents, care must be exercised in truncation of periodic sums to avoid overlap matrices that are not positive definite. While these situations can often be ameliorated with a tighter distance neglect criterion, they are typically a symptom of near linear dependence, often due to the use of basis sets designed for gas phase calculations in conjunction with small unit cells.
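As a rough illustration of the lattice sum in Eq. (5) combined with the simple distance test, the sketch below evaluates the periodic overlap of two unnormalized s-type Gaussians. It assumes an orthorhombic cell and uses $\chi = \alpha\beta/(\alpha+\beta)$ for the Gaussian product exponent; the function name, the threshold `tau`, and the cell are hypothetical choices made for this example.

```python
import math
from itertools import product

def periodic_overlap_s(alpha, A, beta, B, lattice, tau=1e-10, max_shell=50):
    """Gamma-point overlap of two unnormalized s-type Gaussians.

    Each lattice image contributes (pi/(alpha+beta))^{3/2} exp(-chi |A-B-R|^2)
    with chi = alpha*beta/(alpha+beta). Images whose exponential prefactor
    falls below tau are skipped (the distance test).
    Assumes an orthorhombic cell: lattice = (Lx, Ly, Lz).
    """
    chi = alpha * beta / (alpha + beta)
    norm = (math.pi / (alpha + beta)) ** 1.5
    r_cut = math.sqrt(-math.log(tau) / chi)   # exp(-chi r^2) < tau beyond r_cut
    n = [min(max_shell, int(math.ceil(r_cut / L)) + 1) for L in lattice]
    total, kept = 0.0, 0
    for i, j, k in product(*(range(-m, m + 1) for m in n)):
        d2 = sum((a - b - t * L) ** 2
                 for a, b, t, L in zip(A, B, (i, j, k), lattice))
        if d2 > r_cut ** 2:
            continue                           # negligible image, skipped
        total += norm * math.exp(-chi * d2)
        kept += 1
    return total, kept

S, kept = periodic_overlap_s(1.0, (0.0, 0.0, 0.0), 1.0, (0.0, 0.0, 0.0),
                             (4.0, 4.0, 4.0))
print(S, kept)
```

Because the number of retained images depends only on the exponents and the threshold, the work per basis function pair stays bounded as the cell grows, which is the origin of the O(N) complexity noted above.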
These considerations and others are discussed by Towler in an excellent overview of Gaussian basis sets for the condensed phase [27]. Also, there are at least two (albeit related) libraries [28,29] of Gaussian basis sets appropriate for materials at standard densities. For high densities though, these basis sets may still encounter problems with linear dependence and sensitivity to truncation. One solution to this problem, suggested by Grüneich and Hess [30] for even tempered basis sets, is to scale the exponents by the inverse square of the lattice constant. In many cases though, especially for large systems, standard quantum chemical basis sets work well.
III. DATA STRUCTURES AND ERROR ESTIMATES
Both the quantum chemical tree code (QCTC) (Ref. 17) for computing the Coulomb matrix and Hierarchical Cubature (HiCu) (Ref. 18) for computing the exchange-correlation matrix are fast O(N ln N) algorithms whose performance is coupled to underlying data structures and error estimates. It is important to understand some of these particulars first, before addressing their extension to periodic boundary conditions. Also, the current version of QCTC is quite different from previous descriptions and deserves some introduction.

Both QCTC and HiCu are homeomorphic, involving k-d tree representation of the density. In our implementations, k-d trees are doubly linked lists with axis aligned bounding boxes (AABBs) delimiting the spatial extent of each node and its children. This scheme is similar to well developed technologies for ray tracing and database searches, allowing fast O(ln N) range queries of overlapping components through AABB intersection tests [31]. In the case of QCTC, this fast lookup constitutes the penetration acceptability criterion (PAC), which identifies spatial clusters or agglomerations $\rho_Q$ of the density that may be accurately represented via a multipole approximation due to the absence of charge-charge penetration effects.
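A minimal sketch of a k-d tree with AABBs and a range query is given below. It is not the QCTC data structure: it uses plain child links rather than doubly linked lists, stores only a primitive index at each leaf, and the names `Node`, `build`, and `query` are introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

def boxes_overlap(a, b):
    """AABB intersection test: boxes overlap iff they overlap on every axis."""
    return all(a[0][d] <= b[1][d] and b[0][d] <= a[1][d] for d in range(3))

def merge(a, b):
    """Smallest AABB enclosing both boxes; a box is a (lo, hi) pair."""
    lo = tuple(min(a[0][d], b[0][d]) for d in range(3))
    hi = tuple(max(a[1][d], b[1][d]) for d in range(3))
    return lo, hi

@dataclass
class Node:
    box: tuple
    item: Optional[int] = None      # leaf payload: index of a primitive
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(items):
    """items: list of (index, center, extent). Split the widest axis at the median."""
    if len(items) == 1:
        idx, c, r = items[0]
        return Node((tuple(x - r for x in c), tuple(x + r for x in c)), idx)
    def spread(d):
        return max(i[1][d] for i in items) - min(i[1][d] for i in items)
    axis = max(range(3), key=spread)
    items = sorted(items, key=lambda i: i[1][axis])
    mid = len(items) // 2
    left, right = build(items[:mid]), build(items[mid:])
    # Merge the child AABBs back up the tree
    return Node(merge(left.box, right.box), None, left, right)

def query(node, box, hits):
    """Collect indices of all leaves whose AABB overlaps the query box."""
    if not boxes_overlap(node.box, box):
        return hits                  # prune the whole subtree
    if node.item is not None:
        hits.append(node.item)
    else:
        query(node.left, box, hits)
        query(node.right, box, hits)
    return hits

items = [(i, (float(i), 0.0, 0.0), 0.3) for i in range(10)]
tree = build(items)
hits = sorted(query(tree, ((2.5, -1.0, -1.0), (4.5, 1.0, 1.0)), []))
print(hits)
```

Subtrees whose bounding box misses the query box are rejected with a single test, which is what gives the logarithmic cost of a range query on a balanced tree.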
For accepted clusters a second test, the multipole acceptability criterion (MAC), is performed to check translation errors in the multipole expansion. This second test is critical to the overall accuracy of the Coulomb matrix build. We have recently developed a different MAC in Ref. 32 that has several advantages. First, it takes into account the magnitude or weight of the distribution within the cluster. Second, it correctly takes into account the angular symmetry of the primitive Gaussian distributions. Third, and most important, it is always an exact bound to the translation error.
For each primitive bra distribution $\rho_{ab}$, a fast range query is performed on the k-d tree representation of the total density, leading to an on the fly partition of near-field (NF) and far-field (FF) interactions in construction of the gas phase Coulomb matrix, which may be written as

$$J_{ab} = \sum_{Q\in FF}\sum_{l}(-1)^{l}\sum_{m} O_{l}^{m}[\rho_{ab}] \sum_{l'}\sum_{m'} M_{l+l'}^{m+m'}\,O_{l'}^{m'}[\rho_{Q}] + \sum_{q\in NF}\int d\mathbf{r}\int d\mathbf{r}'\,\rho_{ab}(\mathbf{r})\,|\mathbf{r}-\mathbf{r}'|^{-1}\,\rho_{q}(\mathbf{r}'), \tag{6}$$

where $M_{l}^{m}$ is the irregular solid harmonic interaction tensor, $O_{l}^{m}[f]=\int d\mathbf{r}\,O_{l}^{m}(\mathbf{r})f(\mathbf{r})$ is a moment of the regular solid harmonic, $Q$ runs over the highest possible nodes in the density tree consistent with the PAC and MAC, and $q$ runs over leftover near-field primitive distributions in the density. See Refs. 17 and 33 for further details on this representation.
A fundamental difference between QCTC and FMM-based methods is that QCTC pushes the near/far-field partition to the limit, employing the PAC and MAC best-case error estimates to resolve individual primitive distributions. On the other hand, FMM-based methods employ static, worst-case error estimates. While recursing down the density tree to the level of individual primitives precludes well developed technologies for the integral evaluation of contracted functions, it accelerates the onset of linear scaling through early clustering.
The quantum chemical tree code generally employs the total density, which simplifies the code, allows electrostatic screening in MAC error estimates, and provides charge neutrality, an essential feature for periodic calculations. Thus, the Coulomb matrix employed here includes the electron-nuclear terms: $J \equiv J_{ee}+V_{en}$.

In the case of HiCu, two separate k-d tree structures are used. The rho-tree holds the electron density, while the cube-tree contains a hierarchical grid for integration of the exchange-correlation potential. Each node of the cube-tree is composed of a Cartesian nonproduct integration rule with the grid points enclosed by its AABB. The cube-tree is constructed iteratively through recursive bisection of the primary volume (the root AABB), using exact error bounds to achieve arbitrary precision of the integrated density and its gradients. As the cube-tree is extended, AABB intersection tests are performed while traversing the rho-tree, avoiding parts of the density that do not overlap with that portion of the grid. Upon construction of the grid, the reverse procedure is carried out; for each primitive distribution, the cube-tree is walked, selecting only overlapping portions of the grid via the AABB intersection test.
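The idea of growing a grid by recursive bisection until a local error target is met can be illustrated with a one-dimensional analogue. The sketch below is not HiCu: it substitutes Simpson's rule and the usual heuristic difference estimate for the nonproduct cubature rules and exact error bounds described above, and the model "density" is an assumption made for the example.

```python
import math

def _simpson(f, a, b):
    """Simpson's rule on a single interval [a, b]."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f(0.5 * (a + b)) + f(b))

def adaptive_grid(f, a, b, tol, depth=0, max_depth=40):
    """Return (integral, leaves); leaves are the accepted intervals (the grid)."""
    m = 0.5 * (a + b)
    coarse = _simpson(f, a, b)
    fine = _simpson(f, a, m) + _simpson(f, m, b)
    # Heuristic error estimate from the coarse/fine difference
    if abs(fine - coarse) <= 15.0 * tol or depth >= max_depth:
        return fine, [(a, b)]
    # Bisect and give each child half of the error budget
    left, grid_l = adaptive_grid(f, a, m, 0.5 * tol, depth + 1, max_depth)
    right, grid_r = adaptive_grid(f, m, b, 0.5 * tol, depth + 1, max_depth)
    return left + right, grid_l + grid_r

def density(x):
    """Model one-dimensional density: a unit-exponent Gaussian."""
    return math.exp(-x * x)

value, grid = adaptive_grid(density, -5.0, 5.0, 1e-9)
widths = [hi - lo for lo, hi in grid]
print(value, len(grid), min(widths), max(widths))
```

The accepted intervals are narrow where the density varies rapidly and wide in the tails, which is the one-dimensional counterpart of a telescoping adaptive grid.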
For both of these fast algorithms, the trade-off between efficiency and accuracy is controlled by the AABB, which in turn depends on the extent or range $R_q$ of a primitive Gaussian distribution $\rho_q$, beyond which it is assumed to be negligible. Of course, negligible depends on the use to which the distribution is put, as will become obvious in the following.
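Both HiCu and QCTC employ the Hermite–Gaussian representation of distributions [34]

$$\rho_q(\mathbf{r}) = \sum_{lmn} d_{lmn}\,\Lambda_{lmn}^{q}(\mathbf{r}), \tag{7}$$

where

$$\Lambda_{lmn}^{q}(\mathbf{r}) = \frac{\partial^{\,l+m+n}}{\partial Q_x^{l}\,\partial Q_y^{m}\,\partial Q_z^{n}}\, e^{-\zeta_q(\mathbf{r}-\mathbf{Q})^2}. \tag{8}$$

This representation provides an intermediate form into which elements of the density matrix may be folded, and allows the use of McMurchie–Davidson recurrence relations [35] in analytic integral evaluation and density evaluation. For this form, Cramér's inequality [36] provides a bound for the behavior of a Hermite–Gaussian distribution:

$$\rho_q(\mathbf{r}) \le C_q\, e^{-\tilde{\zeta}_q(\mathbf{r}-\mathbf{Q})^2}, \tag{9}$$

where

$$C_q = \sum_{lmn} |d_{lmn}|\, K^{3}\left[2^{\,l+m+n}\, l!\, m!\, n!\, \zeta_q^{\,l+m+n}\right]^{1/2}, \tag{10}$$

the constant $K=1.09$, and

$$\tilde{\zeta}_q = \begin{cases} \zeta_q, & l+m+n=0 \\ \tfrac{1}{2}\zeta_q, & \text{otherwise.} \end{cases} \tag{11}$$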
The overlap extent $R_q^{o}$ is the value beyond which numerical evaluation of the distribution $\rho_q$ yields a value less than $\tau$,

$$C_q\, e^{-\tilde{\zeta}_q (R_q^{o})^2} = \tau. \tag{12}$$
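Equations (10)–(12) can be evaluated in closed form. The sketch below is a hedged illustration with hypothetical function names; it assumes that the halved exponent of Eq. (11) applies whenever the distribution contains any term with $l+m+n>0$, and it returns a zero extent when $C_q \le \tau$, in which case the distribution would be dropped.

```python
import math

K = 1.09  # constant quoted with Eq. (10)

def cramer_prefactor(coeffs, zeta):
    """C_q of Eq. (10); coeffs maps (l, m, n) -> d_lmn."""
    total = 0.0
    for (l, m, n), d in coeffs.items():
        order = l + m + n
        weight = (2.0 ** order * math.factorial(l) * math.factorial(m)
                  * math.factorial(n) * zeta ** order)
        total += abs(d) * K ** 3 * math.sqrt(weight)
    return total

def effective_exponent(coeffs, zeta):
    """Eq. (11): zeta if only l+m+n = 0 terms are present, zeta/2 otherwise
    (an interpretation assumed for this sketch)."""
    return zeta if all(sum(lmn) == 0 for lmn in coeffs) else 0.5 * zeta

def overlap_extent(coeffs, zeta, tau):
    """Solve C_q exp(-zeta~ R^2) = tau for R (Eq. 12); 0 if C_q <= tau."""
    C = cramer_prefactor(coeffs, zeta)
    if C <= tau:
        return 0.0
    return math.sqrt(math.log(C / tau) / effective_exponent(coeffs, zeta))

s_type = {(0, 0, 0): 1.0}
R_o = overlap_extent(s_type, 1.0, 1e-6)
print(R_o)
```

The extent grows only as the square root of $\ln(C_q/\tau)$, so tightening the threshold by an order of magnitude enlarges the AABB modestly.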
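For QCTC, errors in the electrostatic potential due to penetration errors must be considered. For this purpose, the penetration extent $R_q^{p}$ is introduced, satisfying the equation

$$C_q \int \left[\left(\pi/\tilde{\zeta}_q\right)^{3/2}\delta(\mathbf{r}) - e^{-\tilde{\zeta}_q r^2}\right] |\mathbf{r}-\mathbf{R}_q^{p}|^{-1}\, d\mathbf{r} = \tau. \tag{13}$$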
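For an s-type Gaussian the integral in Eq. (13) has the closed form $(\pi/\zeta)^{3/2}\,\mathrm{erfc}(\sqrt{\zeta}R)/R$, since the potential of the Gaussian charge is $(\pi/\zeta)^{3/2}\,\mathrm{erf}(\sqrt{\zeta}R)/R$. The following sketch solves for the penetration extent by bisection under that assumption; the function names and the bracketing interval are illustrative choices, not taken from the paper.

```python
import math

def penetration_error(C, zeta, R):
    """Left side of Eq. (13) for an s-type Gaussian at distance R:
    point-charge potential minus Gaussian potential, scaled by C."""
    return C * (math.pi / zeta) ** 1.5 * math.erfc(math.sqrt(zeta) * R) / R

def penetration_extent(C, zeta, tau, r_max=1.0e3, iters=200):
    """Bisection for the R at which the penetration error equals tau."""
    lo, hi = 1.0e-12, r_max
    if penetration_error(C, zeta, lo) <= tau:
        return 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # The error decreases monotonically with R
        if penetration_error(C, zeta, mid) > tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

R_p = penetration_extent(1.0, 1.0, 1e-6)
R_p_small = penetration_extent(1e-9, 1.0, 1e-6)   # prefactor well below tau
print(R_p, R_p_small)
```

Even when the prefactor is far below the threshold, the $1/R$ singularity keeps the penetration extent finite, whereas the overlap extent of Eq. (12) would already be zero.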
In both HiCu and QCTC, the density tree is constructed by recursively splitting the largest box dimension, until each primitive has been resolved. Then the primitive AABBs are computed from their extents and merged recursively back up the tree. For HiCu, this is all there is to it, but for QCTC multipole moments are also translated to a common center and recursively merged up the tree. Also, when computing matrix elements of $J$, the primitive bra AABB is computed with $R_q^{o}$, while the $R_q^{p}$ are used to construct AABBs of the density tree.

In Fig. 1, differences between the penetration and overlap extent are shown for a diffuse s-type Gaussian. For large extents, such as those encountered in a static FMM-type error bound, the two extents behave in a similar way. However, with aggressive use of the multipole approximation as in QCTC, the distinction becomes critical.
IV. PERIODIC QUANTUM COULOMB SUMS
In the Γ-point approximation, elements of the periodic Coulomb matrix are

$$J_{ab} = \int_{V_{UC}} d\mathbf{r}\int d\mathbf{r}'\,\rho_{ab}(\mathbf{r})\,|\mathbf{r}-\mathbf{r}'|^{-1}\,\rho_{tot}(\mathbf{r}') = \sum_{\mathbf{R}}\int\!\!\int d\mathbf{r}\,d\mathbf{r}'\,\rho_{ab}^{\infty}(\mathbf{r})\,|\mathbf{r}-\mathbf{r}'|^{-1}\,\rho_{tot}^{\infty}(\mathbf{r}'+\mathbf{R}), \tag{14}$$

where $\rho_{tot}$ is the total periodic density including both electronic and nuclear terms. These integrals involve infinite summation over the lattice vectors $\mathbf{R}$, and must be handled with care. There are at least two main approaches to handling this summation: multipole expansion of the Ewald potential or Ewald-like summation of the multipole expansion. Expansion of the Ewald potential yields tin-foil (TF) boundary conditions, requires reciprocal and real space summation with every $J$ build, and scales as $O(N^{3/2})$. An alternative is the Ewald-like summation of the multipole interaction tensor, which was first described by Nijboer and De Wette (NDW) (Refs. 24 and 25) and later reviewed and extended by Challacombe, White, and Head-Gordon [33] to lattice summation of the irregular solid harmonic multipole interaction tensor. This Ewald-like summation is taken over the periodic far field $V_{PFF}$ and is equivalent to a direct lattice summation (not a true Ewald sum) excluding an inner region $V_{In}$ surrounding the unit cell. This inner region has been subtracted to avoid penetration errors and to guarantee convergence of the multipole expansion. With the summed interaction tensors cheaply precomputed and reused, the cost of Coulomb summation over the PFF scales as $O(Np^2)$, where $p$ is the order of the multipole expansion. With this partition, the N-scaling periodic quantum Coulomb sums involve the contributions

$$J = J^{In} + J^{PFF} + J^{TF}, \tag{15}$$

corresponding to the three separate regions shown in Fig. 2. Here, $J^{In}$ is computed using the fast O(N ln N) QCTC algorithm outlined previously in Sec. III. Construction of $J^{PFF}$ will be developed in the following section, while in Sec. IV B the term $J^{TF}$, necessary to introduce tin-foil boundary conditions, is detailed.
A. The periodic far field
By construction, the PFF term in the Coulomb matrix,

$$J_{ab}^{PFF} = \sum_{\mathbf{R}\in PFF}\int\!\!\int d\mathbf{r}\,d\mathbf{r}'\,\rho_{ab}^{\infty}(\mathbf{r})\,|\mathbf{r}-\mathbf{r}'+\mathbf{R}|^{-1}\,\rho_{tot}^{\infty}(\mathbf{r}'), \tag{16}$$

involves charge distributions that are well separated with respect to both penetration and the convergence of multipole expansion errors, as outlined in Fig. 2 and discussed in the following.
With these conditions, and assuming the unit cell is centered at the origin, the bipolar multipole expansion employing the regular and irregular solid harmonics, $O_{l}^{m}$ and $M_{l}^{m}$, respectively, is

$$|\mathbf{r}-\mathbf{r}'+\mathbf{R}|^{-1} \approx \sum_{l=0}^{p}(-1)^{l}\sum_{l'=0}^{p}\left[\sum_{m=-l}^{l}\sum_{m'=-l'}^{l'} O_{l}^{m}(\mathbf{r})\,M_{l+l'}^{m+m'}(\mathbf{R})\,O_{l'}^{m'}(\mathbf{r}')\right]. \tag{17}$$

FIG. 1. (Color) Behavior of the overlap extent $R^{o}$ and the penetration extent $R^{p}$ as a function of $\tau/C_q$ for an s-type Gaussian with exponent $\zeta=1$. For small $C_q$ (occurring, for example, due to a large atom-atom separation and/or small density matrix prefactor), $R^{o}$ goes to zero at the origin and its distribution is eliminated, while $R^{p}$ goes slowly to zero due to the Coulomb singularity.

FIG. 2. Schematic of the regions contributing to N-scaling summation of the Coulomb matrix. The inner cells that make up $V_{In}$ provide a buffer region that guarantees convergence of the multipole expansion of Coulomb interactions between the unit cell and all cells in $V_{PFF}$. The periodic far-field region $V_{PFF}$ is the spherically ordered lattice extending to infinity but excluding $V_{In}$. For large cells and/or high order multipole expansions, $V_{In}$ includes just the unit cell's 27 nearest neighbors. In FMM notation this corresponds to ws=1. However, for smaller cells and/or lower order multipole expansions, $V_{In}$ tends to a spherical distribution of cells surrounding the unit cell. Direct summation over $V_{PFF}$ leads to charges at the infinite surface $S_\infty$, which must be canceled by tin-foil (conducting) boundary conditions to achieve equivalence with Ewald summation.
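Inserting Eq. (17) into Eq. (16) yields

$$J_{ab}^{PFF} = \sum_{\mathbf{R}\in PFF}\sum_{l=0}^{p}(-1)^{l}\sum_{l'=0}^{p}\sum_{m=-l}^{l}\left(\sum_{m'=-l'}^{l'} O_{l}^{m}[\rho_{ab}^{\infty}]\,M_{l+l'}^{m+m'}(\mathbf{R})\,O_{l'}^{m'}[\rho_{tot}^{\infty}]\right). \tag{18}$$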
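This expression decouples the complexity of $\rho_{ab}^{\infty}$ from $\rho_{tot}^{\infty}$ through the precomputed multipole moments $O_{l}^{m}[\rho_{ab}^{\infty}]=\int d\mathbf{r}\,O_{l}^{m}(\mathbf{r})\rho_{ab}^{\infty}(\mathbf{r})$ and $O_{l}^{m}[\rho_{tot}^{\infty}]=\int d\mathbf{r}\,O_{l}^{m}(\mathbf{r})\rho_{tot}^{\infty}(\mathbf{r})$. Following Nijboer and De Wette [24,25,33], we introduce the effective multipole interaction tensor

$$\mathcal{M}_{l}^{m} = \sum_{\mathbf{R}\in V_{PFF}} M_{l}^{m}(\mathbf{R}), \tag{19}$$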
which can be efficiently computed on the fly for each new lattice, both to high accuracy and to high order (large $p$) using the new methods detailed in the Appendix. Note that this is a direct sum of the interaction tensor, and is not equivalent to Ewald summation. Nevertheless, with this simplification, the $O(p^2N)$ working equation

$$J_{ab}^{PFF} = \sum_{l=0}^{p}\sum_{m=-l}^{l} O_{l}^{m}[\rho_{ab}^{\infty}]\,\mathcal{J}_{l}^{m} \tag{20}$$

is obtained, where the intermediate tensor

$$\mathcal{J}_{l}^{m} = (-1)^{l}\sum_{l'=0}^{p}\sum_{m'=-l'}^{l'} \mathcal{M}_{l+l'}^{m+m'}\,O_{l'}^{m'}[\rho_{tot}^{\infty}] \tag{21}$$
is cheaply precomputed at the start of each Coulomb build.

Because Eq. (20) is inexpensive, our strategy is to define a minimal buffer region $V_{In}$ sufficient to control penetration errors, subtracting effort from the computation of $J^{In}$ via QCTC and replacing it with cheaper multipole work in the computation of $J^{PFF}$. To this end, a fixed inner region $V_{In}$ is constructed from neighboring cells that have simple Gaussian overlap with the unit cell, defined by the radius $R^{o}$. As explained in Sec. III, for the relatively large distances considered at this level the differences between the penetration and overlap extent are negligible. With $V_{In}$ fixed, the precision of $J^{PFF}$ is controlled entirely by the expansion order $p$. In general $p$ will be much higher than the expansion order (<5) employed by QCTC in computation of $J^{In}$. With QCTC, accuracy is controlled on the fly by the MAC and PAC, establishing a dynamic near/far-field partition, while computation of $J^{PFF}$ involves a static, worst-case error dominated by the multipole expansion. This static error is controlled by using the FMM-like error bound,

$$\frac{2^{p}\,C^{2}\,d_{max}^{\,p+1}}{(R^{o})^{p+1}\,|R^{o}-2d_{max}|} \le \tau_{MAC}, \tag{22}$$

to set the appropriate expansion order $p$. In Eq. (22), $d_{max}$ is the maximum translational distance, $C$ is the asymptotic Unsöld weight of the total density, and $\tau_{MAC}$ is the threshold controlling the translation errors. See Ref. 32 for development of this expression and further explanation of these parameters.
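The two-step contraction of Eqs. (20) and (21) can be checked against the direct double sum obtained from Eqs. (18) and (19). The sketch below uses random real numbers as placeholders for the solid-harmonic moments and the lattice-summed interaction tensor, so it only demonstrates the index bookkeeping and the cost saving, not physically meaningful values; all names are hypothetical.

```python
import random

def lm_pairs(order):
    """All (l, m) index pairs with 0 <= l <= order and -l <= m <= l."""
    return [(l, m) for l in range(order + 1) for m in range(-l, l + 1)]

def precompute_J(M_eff, O_tot, p):
    """Eq. (21): J[l,m] = (-1)^l sum_{l',m'} M_eff[l+l', m+m'] O_tot[l', m']."""
    return {(l, m): (-1) ** l * sum(M_eff[(l + lp, m + mp)] * O_tot[(lp, mp)]
                                    for lp, mp in lm_pairs(p))
            for l, m in lm_pairs(p)}

def j_pff(O_ab, J_lm, p):
    """Eq. (20): contract the bra moments with the intermediate tensor."""
    return sum(O_ab[lm] * J_lm[lm] for lm in lm_pairs(p))

# Placeholder tensors; the effective interaction tensor must extend to order 2p
random.seed(0)
p = 3
M_eff = {lm: random.uniform(-1, 1) for lm in lm_pairs(2 * p)}
O_tot = {lm: random.uniform(-1, 1) for lm in lm_pairs(p)}
O_ab = {lm: random.uniform(-1, 1) for lm in lm_pairs(p)}

J_lm = precompute_J(M_eff, O_tot, p)        # done once per Coulomb build
fast = j_pff(O_ab, J_lm, p)                 # O(p^2) work per matrix element
direct = sum((-1) ** l * O_ab[(l, m)] * M_eff[(l + lp, m + mp)] * O_tot[(lp, mp)]
             for l, m in lm_pairs(p) for lp, mp in lm_pairs(p))
print(fast, direct)
```

Precomputing the intermediate tensor once reduces the per-element work from a sum over all pairs of $(l,m)$ and $(l',m')$ to a single sum over $(l,m)$.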
B. Tin-foil boundary conditions
The surface charges created by direct summation over $V_{PFF}$ must be canceled to achieve equivalence with Ewald summation. Achieving this equivalence is more than semantic, since without tin-foil boundary conditions matrix elements lack translational invariance and often incur dramatic charge sloshing instabilities. The correction is strongly dependent on ordering of the direct sum; as the Nijboer and De Wette method corresponds to spherical summation due to symmetry of the real/reciprocal space partition, the appropriate correction is [37]

$$\Phi^{Ew}(\mathbf{r}) = \Phi^{SS}(\mathbf{r}) + \frac{2\pi}{3V_{UC}}\left(Q - 2\,\mathbf{r}\cdot\mathbf{D}\right), \tag{23}$$

where $\mathbf{D}$ is the system dipole moment, $Q$ is the trace of the system quadrupole, and we have assumed origin centering. The tin-foil correction to the Coulomb matrix is then

$$J_{ab}^{TF} = \frac{2\pi}{3V_{UC}}\left(Q\,S_{ab} - 2\,\mathbf{d}_{ab}\cdot\mathbf{D}\right), \tag{24}$$

with $S_{ab}$ being an element of the overlap matrix and $\mathbf{d}_{ab}$ the dipole moment of the distribution $\rho_{ab}^{\infty}$.
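As a small worked example, the tin-foil matrix element can be written as a one-line function. The sketch integrates the correction term of Eq. (23) against $\rho_{ab}^{\infty}$, so it takes the prefactor $2\pi/(3V_{UC})$ from that equation; the function name and the test values are hypothetical.

```python
import math

def tin_foil_element(S_ab, d_ab, D, Q, volume):
    """(2*pi / (3*V_UC)) * (Q * S_ab - 2 * d_ab . D) for an origin-centered cell.

    S_ab:   overlap matrix element
    d_ab:   dipole moment of the distribution rho_ab (3-vector)
    D:      system dipole moment (3-vector)
    Q:      trace of the system quadrupole
    volume: unit cell volume V_UC
    """
    dot = sum(x * y for x, y in zip(d_ab, D))
    return 2.0 * math.pi / (3.0 * volume) * (Q * S_ab - 2.0 * dot)

# Illustrative numbers only
quadrupole_only = tin_foil_element(1.0, (0.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                                   3.0, 2.0 * math.pi)
dipole_only = tin_foil_element(0.0, (1.0, 0.0, 0.0), (0.5, 0.0, 0.0),
                               0.0, 2.0 * math.pi / 3.0)
print(quadrupole_only, dipole_only)
```

For a cell with vanishing dipole moment only the quadrupole trace contributes, and the correction is simply proportional to the overlap matrix.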
V. PERIODIC EXCHANGE CORRELATION
The HiCu algorithm is ideally suited for periodic boundary conditions, as the unit cell $V_{UC}$ can be simply transformed into an equivalent rectangular integration domain $V_{\square}$, that is, the cube-tree's AABB. These volumes, shown in Fig. 3, are equivalent due to full periodicity of both distributions and density. The integration is then simply

$$K_{ab}^{xc} = \int_{V_{\square}} d\mathbf{r}\,\rho_{ab}(\mathbf{r})\,v_{xc}[\rho;\mathbf{r}]. \tag{25}$$

This approach should be contrasted with more conventional quantum chemical methods for computing the exchange-correlation matrix, involving the "Becke weights" [38], which demand numerical integration over $V_\infty$.

While we have written Eq. (25) in terms of the exchange-correlation potential for simplicity, in practice HiCu employs the Pople, Gill, and Johnson formulation [18,39].
Because the distributions and density both involve a double sum over lattice vectors, there will be a large number of atom-atom pairs that do not overlap with $V_{\square}$. A similar situation is encountered in the gas phase for parallel versions of HiCu [40], where each processor has a small, local cube-tree that may overlap only a few of the many possible atom-atom pairs. The solution to this problem again comes from the ray tracing literature, in the form of a modified ray-AABB (Ref. 31) and sphere-AABB test [41]. The ray-AABB test has been modified into a cylinder-AABB test, where the radius of the cylinder is a maximal overlap extent of the atom-atom pair. In the case of a same center atom-atom pair, it is of course more appropriate to employ a sphere-AABB test. In both cases, overlap between the HiCu integration volume and atom-atom pairs is established with a negligible prefactor when using these tests.
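The sphere-AABB test for a same-center pair can be sketched with the standard closest-point construction. The example below also lists which lattice images of an atom touch the integration box; it assumes an orthorhombic cell, and the function names, extent radius, and geometry are illustrative rather than taken from the HiCu code. The cylinder-AABB variant for two-center pairs is not shown.

```python
from itertools import product

def sphere_overlaps_aabb(center, radius, box_lo, box_hi):
    """Closest-point test: accumulate the squared distance from the sphere
    center to the box along each axis and compare with radius^2."""
    d2 = 0.0
    for c, lo, hi in zip(center, box_lo, box_hi):
        if c < lo:
            d2 += (lo - c) ** 2
        elif c > hi:
            d2 += (c - hi) ** 2
    return d2 <= radius ** 2

def overlapping_images(center, radius, box_lo, box_hi, lengths, shells=1):
    """Lattice images (i, j, k) of an atom whose extent sphere touches the box.
    Assumes an orthorhombic cell with edge lengths `lengths`."""
    hits = []
    for shift in product(range(-shells, shells + 1), repeat=3):
        image = [c + t * L for c, t, L in zip(center, shift, lengths)]
        if sphere_overlaps_aabb(image, radius, box_lo, box_hi):
            hits.append(shift)
    return hits

images = overlapping_images((0.2, 2.0, 2.0), 0.5, (0.0, 0.0, 0.0),
                            (4.0, 4.0, 4.0), (4.0, 4.0, 4.0))
print(images)
```

Only the atom itself and its image translated by one cell along x reach into the box here, so the remaining 25 neighboring images would be skipped before any grid work is done.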
VI. RESULTS
A. Implementation
All developments were implemented in a serial version of MONDOSCF v1.0α9 [26], a suite of linear scaling quantum chemistry codes. The code was compiled using the Portland Group F90 compiler PGF90 V4.2 (Ref. 42) with the -O1 -tp athlon options and with the GNU C compiler GCC V3.2.2 using the -O1 flag. All calculations were carried out on a 1.6 GHz AMD Athlon running REDHAT LINUX V9.0 [43].
Thresholds controlling the cost to accuracy ratio of HiCu and QCTC are set by the accuracy levels LOOSE, GOOD, and TIGHT, which have been empirically chosen to deliver 4–5, 6–7, and 8–9 digits, respectively, of relative accuracy in the energy. Values of these thresholds are listed in Appendix B of Ref. 21. The unmodified two-electron threshold $\tau_{2E}$ sets the overlap extent $R_q^{o}$ in Eq. (12) and the penetration extent $R_q^{p}$ in Eq. (13), both of which control the PAC. As explained in Ref. 32, the threshold $\tau_{MAC}$ controlling the MAC is set as $\tau_{MAC}=10^{2}\tau_{2E}$. The HiCu threshold $\tau_{HiCu}$ likewise sets two subthresholds. The overlap extent $R_q^{o}$ in Eq. (12), defining accuracy of the density on the grid, is set using $10^{-1}\tau_{HiCu}$ ($\tau_r$ in Ref. 18). The target relative error defining accuracy of the HiCu grid is just $\tau_{HiCu}$ ($\tau_r$ in Ref. 18). It should be pointed out that of all the thresholding schemes, those governing HiCu are the least conservative; it is a simple (and not too expensive) matter to tighten the HiCu threshold if intermediate accuracies are required.
The multipole interaction and contraction code used by QCTC in the near/far-field partition has been highly optimized by symbolic manipulation and factorization, using real arithmetic and expansions through seventh order in the calculation of $J^{In}$. The computation of $J^{PFF}$ employs a general code for multipole contraction, allowing expansion through $p=64$. Eigensolution of the self-consistent-field equations has been used throughout, with the corresponding matrix and distribution thresholds given in Appendix B of Ref. 21. All calculations were performed with $C_1$ (no) symmetry, and all results are reported in atomic units.
B. Validation
The ability of our implementation to reproduce trueEwald summation is shown in Fig. 4 for a periodic system of64 classical water molecules. Note that with both the Ewaldsum and the Nijboer and De Wette approach, ordering thereal and reciprocal space sums is critical; high order agree-ment is achieved only when the summation proceeds fromthe smallest to the largest terms.
The use of Cartesian Gaussian basis sets in many cases allows direct numerical comparison of different programs, at least to within the approximations, grids, etc., peculiar to a code. Here, we make connection with the preeminent Gaussian orbital program for periodic calculations, CRYSTAL98 [12].
TABLE I. Comparison of CRYSTAL98 and MONDOSCF Γ-point calculations on NaCl at the RBLYP/8-511G/8-631G level of theory.

Program      N_at    Energy (a.u.)    Energy/N_at
MONDOSCF     2 (a)   −622.391 01      −311.195 51
CRYSTAL98    2 (a)   −622.391 14      −311.195 57
MONDOSCF     8 (b)   −2490.0016       −311.250 20
CRYSTAL98    8 (b)   −2490.0013       −311.250 16

(a) Triclinic.
(b) Cubic.
FIG. 3. (Color) Transformation between the unit cell with volume $V_{UC}$ (described by the primitive lattice vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$) and the rectangular integration volume $V_{\square}$ employed by HiCu.

FIG. 4. (Color) Error in the Coulomb energy computed with the Nijboer and De Wette scheme relative to true Ewald summation. Shown is the error in the Coulomb energy vs $p$ with one (ws=1) and two (ws=2) layers of cells in $V_{PFF}$ for a periodic system of 64 classical water molecules.
Calculations have been carried out largely with basis sets optimized for the condensed phase [28], which tend to have less diffuse valence functions. Tables I–III make a direct comparison with CRYSTAL98 for NaCl and MgO obtained with the MONDOSCF TIGHT precision level. For the CRYSTAL98 calculations, we used the following threshold parameters: TOLDENS=10, TOLPOT=10, TOLGRID=15, and BASIS=4. The BASIS parameter determines the auxiliary functions used to fit the exchange-correlation potential.
In Table I, comparison is made for Γ-point NaCl with the 8-511G (Ref. 44) basis set for sodium, the 8-631G (Ref. 45) basis set for chlorine, and using the restricted BLYP functional [46,47]. Next, in Table II, convergence of the supercell Γ-point approximation is demonstrated for NaCl with the STO-3G basis set and the restricted local density approximation. Then, in Table III, convergence of the supercell Γ-point approximation to the k-space integration result is demonstrated for MgO, using the 8-61G (Ref. 48) basis set for magnesium, the 8-51G (Ref. 48) basis set for oxygen, and the restricted BLYP functional. The primitive lattice coordinates for these systems are given in Ref. 49.
Finally, in Table IV, convergence of the supercell Γ-point approximation is shown for diamond at the GOOD accuracy level, using the restricted BLYP functional and the 6-21G* (Ref. 50) basis set. Since MONDOSCF employs 6-d and 10-f (Cartesian) functions, while CRYSTAL98 employs 5-d and 7-f (pure spherical) functions, we were not able to make a direct comparison for this basis set.
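The 6-d/10-f versus 5-d/7-f mismatch follows from the number of components per angular momentum shell. As a small illustration (the function names are hypothetical and not taken from either program), a Cartesian shell of angular momentum l has (l+1)(l+2)/2 components, while a pure spherical shell has 2l+1:

```python
def n_cartesian(l: int) -> int:
    """Number of Cartesian Gaussian components x^i y^j z^k with i + j + k = l."""
    return (l + 1) * (l + 2) // 2


def n_spherical(l: int) -> int:
    """Number of pure (spherical harmonic) components for angular momentum l."""
    return 2 * l + 1


# (Cartesian, spherical) counts for the s, p, d, and f shells
shell_counts = {
    name: (n_cartesian(l), n_spherical(l))
    for name, l in (("s", 0), ("p", 1), ("d", 2), ("f", 3))
}

for name, (cart, sph) in shell_counts.items():
    print(f"{name}: {cart} Cartesian, {sph} spherical")
```

The two conventions coincide for s and p shells and differ from d onward, which is why basis sets with polarization functions, such as 6-21G*, span slightly different spaces in the two codes.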
C. Scaling and accuracy
Figure 5 shows the CPU time for the J and K_xc builds with RBLYP/6-21G* diamond at its standard density, demonstrating an early onset of linear scaling. These timings correspond to a GOOD level of accuracy, targeting six digits in the total energy, and to the values listed in Table IV. Shown in Fig. 6 is the precision of the computed energies, obtained by performing a second set of calculations with all thresholds reduced by one order of magnitude. For these calculations, the largest source of error is the numerical integration performed by HiCu, as the QCTC thresholds are significantly more conservative.
VII. CONCLUSIONS
We have extended linear scaling quantum chemical methods for computation of exchange-correlation and Coulomb matrices to periodic boundary conditions at the Γ point. These methods have demonstrated an early onset of linear scaling and error control for diamond, allowing calculations up to 512 atoms at the RBLYP/6-21G* level of theory. In both cases, this early onset of linear scaling has been enabled by the use of modern data structures such as the k-d tree, together with reliable error estimates for the Gaussian extent.
TABLE II. Convergence of the Γ-point supercell approximation for NaCl, computed with MONDOSCF at the RLDA/STO-3G level of theory.
Program      Nat    Energy (a.u.)   Energy/Nat
MONDOSCF     2a     −610.975 36     −305.487 68
             8b     −2444.3584      −305.544 80
             16a    −4888.7002      −305.543 77
             54a    −16 499.490     −305.546 11
             64b    −19 554.956     −305.546 18
             128a   −39 109.912     −305.546 18
             216b   −65 997.977     −305.546 19
CRYSTAL98c   2a     −611.092 28     −305.546 14
a: Triclinic. b: Cubic. c: 6×6×6 k-space grid.
TABLE III. Convergence of the Γ-point supercell approximation for MgO, computed with MONDOSCF at the RBLYP/8-61G/8-51G level of theory.
Program      Nat    Energy (a.u.)   Energy/Nat
MONDOSCF     2a     −275.090 97     −137.545 48
             8b     −1101.7295      −137.716 18
             16a    −2203.6904      −137.730 65
             54a    −7437.7989      −137.737 02
             64b    −8815.2131      −137.737 71
             128a   −17 630.430     −137.737 74
             216b   −29 751.352     −137.737 74
CRYSTAL98c   2a     −275.475 47     −137.737 74
a: Triclinic. b: Cubic. c: 6×6×6 k-space grid.
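The convergence shown in Tables II and III can be checked directly from the tabulated totals. The following sketch (variable and function names are hypothetical) recomputes the energy per atom for each supercell and its deviation from the CRYSTAL98 k-space reference; the numbers are copied from the two tables, and nothing else is assumed.

```python
# Total energies (a.u.) keyed by the number of atoms in the supercell.
NACL_TOTALS = {2: -610.97536, 8: -2444.3584, 16: -4888.7002, 54: -16499.490,
               64: -19554.956, 128: -39109.912, 216: -65997.977}
MGO_TOTALS = {2: -275.09097, 8: -1101.7295, 16: -2203.6904, 54: -7437.7989,
              64: -8815.2131, 128: -17630.430, 216: -29751.352}

# CRYSTAL98 reference: two-atom primitive cell with a 6x6x6 k-space grid.
NACL_REF_PER_ATOM = -611.09228 / 2
MGO_REF_PER_ATOM = -275.47547 / 2


def per_atom_deviation(totals, ref_per_atom):
    """Return {Nat: E/Nat - reference E/Nat} in atomic units."""
    return {n: e / n - ref_per_atom for n, e in sorted(totals.items())}


nacl_dev = per_atom_deviation(NACL_TOTALS, NACL_REF_PER_ATOM)
mgo_dev = per_atom_deviation(MGO_TOTALS, MGO_REF_PER_ATOM)

for label, dev in (("NaCl", nacl_dev), ("MgO", mgo_dev)):
    for n, d in dev.items():
        print(f"{label:4s} Nat={n:3d}  deviation per atom = {d * 1e6:12.1f} microhartree")
```

Because the per-atom deviation is formed from a total energy quoted to a fixed number of digits, the last digit or so of the result for the largest cells reflects rounding in the tables rather than the method itself.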
TABLE IV. Convergence of the Γ-point supercell approximation for diamond, computed with MONDOSCF at the RBLYP/6-21G* level of theory.
Nat    Energy (a.u.)   Energy/Nat
8      −303.989        −37.9986
16     −608.667        −38.0417
32     −1218.02        −38.0632
64     −2436.28        −38.0669
96     −3654.59        −38.0687
144    −5482.04        −38.0697
216    −8223.10        −38.0699
288    −10964.1        −38.0700
384    −14618.9        −38.0700
FIG. 5. (Color) Computational complexity of the J and K_xc matrix builds for cubic diamond at the RBLYP/6-21G* level of theory at the GOOD accuracy level.
These algorithms have been parallelized,40,51 demonstrating high efficiencies up to 128 processors, and have been used recently to determine the T=0 K equation of state for pentaerythritol tetranitrate52 at the RPBE/6-31G** level of theory and a GOOD accuracy level.
While this contribution has focused on demonstrating linear scaling for diamond, the methods presented here work for slabs and wires as well, using the methods for computing the two- and one-dimensional M tensor outlined in the Appendix. Our experience has shown that, for the same number of atoms, these lower dimensional systems run much faster.
ACKNOWLEDGMENTS
The authors would like to acknowledge Tommy Sewell and Ed Kober for their advice and support. They would also like to thank Chee Kwan Gan for a careful reading of this manuscript.
APPENDIX: COMPUTATION OF THE M TENSOR
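Following Nijboer and De Wette,24,25 and later Challacombe, White, and Head-Gordon33 (CWHG), we begin with the partition

\[
\frac{1}{r^{l+1}} = G_l(\beta,r) + F_l(\beta,r) \tag{A1}
\]

involving the functions

\[
G_l(\beta,r) = \frac{\Gamma\!\left(l+\tfrac{1}{2},\,\beta^2 r^2\right)}{\Gamma\!\left(l+\tfrac{1}{2}\right) r^{l+1}} \tag{A2}
\]

and

\[
F_l(\beta,r) = \frac{\gamma\!\left(l+\tfrac{1}{2},\,\beta^2 r^2\right)}{\Gamma\!\left(l+\tfrac{1}{2}\right) r^{l+1}}, \tag{A3}
\]

where $\Gamma(a)$ is the gamma function, $\Gamma(a,x)$ and $\gamma(a,x)$ are the upper and lower incomplete gamma functions,53 and $\beta$ is a parameter controlling the partition. With this separation of length scales, the lattice sum defining the multipole interaction tensor $M_l^m$ may be expressed as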
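\[
\begin{aligned}
M_l^m = \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} M_l^m[\mathbf{R}]
={}& \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} \tilde{P}_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, G_l(\beta,|\mathbf{R}|) \\
&+ \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} \tilde{P}_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, F_l(\beta,|\mathbf{R}|).
\end{aligned}
\tag{A4}
\]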
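Following CWHG, this expression can be further developed into real and reciprocal space terms:

\[
\begin{aligned}
M_l^m ={}& \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} \tilde{P}_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, G_l(\beta,|\mathbf{R}|) \\
&- \sum_{\mathbf{R}\in V_{\mathrm{ln}}} \tilde{P}_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, F_l(\beta,|\mathbf{R}|) \\
&+ \frac{4\pi^{3/2}\left(\frac{i}{2}\right)^{l}}{V_{\mathrm{UC}}\,\Gamma\!\left(l+\tfrac{1}{2}\right)}
\sum_{\mathbf{G}\neq\{0\}} |\mathbf{G}|^{l-2}\, e^{-\pi^2|\mathbf{G}|^2/\beta^2}\,
\tilde{P}_l^m(\cos\theta_{\mathbf{G}})\, e^{im\phi_{\mathbf{G}}},
\end{aligned}
\tag{A5}
\]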
where $\mathbf{G}$ are reciprocal lattice vectors. With an appropriate choice of $\beta \sim \sqrt{\pi}/(V_{\mathrm{UC}})^{1/3}$, and summing terms from smallest to largest, the periodic multipole interaction tensor can be computed to high precision assuming an accurate representation of the incomplete gamma function. In previous work by CWHG, the upward recursion

\[
\Gamma(m+1,x) = m\,\Gamma(m,x) + x^{m} e^{-x} \tag{A6}
\]
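was used, which results in a loss of precision for large values of $x$ and $m$, demanding extended precision arithmetic and precluding on-the-fly computation. This problem is overcome by analytically summing the $\gamma$ function, collecting terms, and then rewriting it as

\[
\Gamma\!\left(m+\tfrac{1}{2},x\right) = \Gamma\!\left(m+\tfrac{1}{2}\right)
\left\{ \operatorname{erfc}(\sqrt{x}) + \sqrt{\frac{x}{\pi}} \sum_{n=0}^{m-1} \left( S_n\, x\, e^{-x/n} \right)^{n} \right\}, \tag{A7}
\]

where the terms

\[
S_n = \left( \frac{\Gamma\!\left(\tfrac{1}{2}\right)}{\Gamma\!\left(n+\tfrac{3}{2}\right)} \right)^{1/\max(n,1)} \tag{A8}
\]

are simply pretabulated. This version of the $\gamma$ function is both easy to program and precise, even for large values of $x$ or $m$.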
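As an illustration of Eqs. (A6)–(A8), the sketch below evaluates $\Gamma(m+\tfrac{1}{2},x)$ in two ways using only the Python standard library. It is not the MONDOSCF implementation. It assumes that the finite sum is the one obtained by unrolling Eq. (A6) from $\Gamma(\tfrac{1}{2},x)=\sqrt{\pi}\,\operatorname{erfc}(\sqrt{x})$, so that each term equals $x^{n+1/2}e^{-x}/\Gamma(n+\tfrac{3}{2})$, with the $n=0$ term read as $2\sqrt{x/\pi}\,e^{-x}$. Instead of pretabulating $S_n$, the terms are built from the ratio of successive terms. The function names are hypothetical.

```python
import math


def upper_gamma_half_closed(m: int, x: float) -> float:
    """Gamma(m + 1/2, x) as erfc plus a finite sum of m positive terms."""
    total = math.erfc(math.sqrt(x))
    # term_n = x^(n + 1/2) e^(-x) / Gamma(n + 3/2); start with n = 0
    term = math.sqrt(x) * math.exp(-x) / math.gamma(1.5)
    for n in range(m):
        total += term
        term *= x / (n + 1.5)  # Gamma(n + 5/2) = (n + 3/2) Gamma(n + 3/2)
    return math.gamma(m + 0.5) * total


def upper_gamma_half_recursive(m: int, x: float) -> float:
    """Gamma(m + 1/2, x) by upward recursion, Eq. (A6), at half-integer order."""
    a = 0.5
    value = math.sqrt(math.pi) * math.erfc(math.sqrt(x))  # Gamma(1/2, x)
    for _ in range(m):
        value = a * value + x**a * math.exp(-x)
        a += 1.0
    return value


def split_kernel(l: int, beta: float, r: float) -> tuple[float, float]:
    """Return (G_l, F_l) of Eqs. (A2) and (A3); their sum is 1 / r^(l+1)."""
    x = (beta * r) ** 2
    full = math.gamma(l + 0.5)
    upper = upper_gamma_half_closed(l, x)
    norm = full * r ** (l + 1)
    # F_l is formed by subtraction here for brevity; this is fine for a
    # demonstration but loses digits when beta * r is small.
    return upper / norm, (full - upper) / norm


G2, F2 = split_kernel(2, 1.0, 6.0)
print(G2, F2, G2 + F2, 1.0 / 6.0**3)
```

For $\beta r = 6$ the short-range part $G_l$ is already negligible next to $1/r^{l+1}$, which is the separation of length scales that lets the $G_l$ sum converge quickly in real space and the $F_l$ sum in reciprocal space.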
FIG. 6. (Color) Relative error with system size for RBLYP/6-21G* diamond at the GOOD accuracy level.
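In one dimension, the M tensor can be computed analytically as

\[
\begin{aligned}
M_l^m ={}& \frac{\tilde{P}_l^m(\cos\theta_0)\, e^{im\phi_0}}{a_0^{\,l+1}} \sum_{n=n_0}^{\infty} \frac{1}{n^{l+1}}
+ \frac{\tilde{P}_l^m\!\left[\cos\!\left(\theta_0+\tfrac{\pi}{2}\right)\right] e^{im(\phi_0+\pi)}}{a_0^{\,l+1}} \sum_{n=n_0}^{\infty} \frac{1}{n^{l+1}} \\
={}& Q_l^m[a_0,\theta_0,\phi_0] \left\{ \zeta(l+1) - \sum_{n=1}^{n_0-1} \frac{1}{n^{l+1}} \right\},
\end{aligned}
\tag{A9}
\]

where $a_0$, $\theta_0$, and $\phi_0$ are the initial box dimension and angles, which are independent of the summation, and $\zeta$ is the Riemann zeta function.53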
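The bracketed factor in Eq. (A9) replaces an infinite lattice sum by a zeta value minus a short finite sum. A minimal sketch of that step is given below, assuming $l \ge 1$ so that $\zeta(l+1)$ converges. The standard library has no zeta function, so a direct sum with an Euler–Maclaurin tail correction is used here as a stand-in; the function names and the `n_direct` parameter are hypothetical.

```python
import math


def zeta(s: float, n_direct: int = 20) -> float:
    """Riemann zeta for s > 1: direct sum up to n_direct - 1 plus an
    Euler-Maclaurin estimate of the remaining tail."""
    if s <= 1:
        raise ValueError("this sketch requires s > 1")
    big_n = n_direct
    head = sum(n ** -s for n in range(1, big_n))
    tail = (big_n ** (1 - s) / (s - 1)          # integral of x^-s from N to infinity
            + 0.5 * big_n ** -s                 # half of the first tail term
            + s * big_n ** (-s - 1) / 12        # B2 correction
            - s * (s + 1) * (s + 2) * big_n ** (-s - 3) / 720)  # B4 correction
    return head + tail


def lattice_tail_sum(l: int, n0: int) -> float:
    """Sum over n >= n0 of n^-(l+1), written as in the braces of Eq. (A9)."""
    if l < 1 or n0 < 1:
        raise ValueError("requires l >= 1 and n0 >= 1")
    return zeta(l + 1) - sum(n ** -(l + 1) for n in range(1, n0))


print(zeta(2), math.pi ** 2 / 6)
print(lattice_tail_sum(3, 5))
```

Since the subtraction removes only the first $n_0-1$ terms, the cost is independent of how slowly the original sum would converge if it were accumulated term by term.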
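In two dimensions, the Fourier integrals for the calculation of the M tensor become more complicated. Taking the limit as the box dimension in the nonperiodic direction goes to infinity (the z direction in the following), we obtain from Eq. (A5)

\[
\begin{aligned}
M_l^m ={}& \sum_{\mathbf{R}\in V_{\mathrm{PFF}}} (l-m)!\, P_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, G_l(\beta,|\mathbf{R}|) \\
&- \sum_{\mathbf{R}'\in V_{\mathrm{ln}}} (l-m)!\, P_l^m(\cos\theta_{\mathbf{R}})\, e^{im\phi_{\mathbf{R}}}\, F_l(\beta,|\mathbf{R}|) \\
&+ C_2 \sum_{\mathbf{G}\neq\{0\}} \int_{-\infty}^{\infty} dG_z\, |\mathbf{G}|^{l-2}\, e^{-\pi^2|\mathbf{G}|^2/\beta^2}\,
\tilde{P}_l^m(\cos\theta_{\mathbf{G}})\, e^{im\phi_{\mathbf{G}}},
\end{aligned}
\tag{A10}
\]

where

\[
C_2 = \frac{4\pi^{3/2}\left(\frac{i}{2}\right)^{l}}{A_{\mathrm{UC}}\,\Gamma\!\left(l+\tfrac{1}{2}\right)} \tag{A11}
\]

and $A_{\mathrm{UC}}$ is the area of the cell along the nonperiodic direction. In practice, we carry out numerical evaluation of this integral.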
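The text does not specify the quadrature used for the $G_z$ integral in Eq. (A10). As one possible sketch, the trapezoid rule converges very quickly for Gaussian-damped integrands of this kind. The example below integrates only the radial factor $|\mathbf{G}|^{l-2}e^{-\pi^2|\mathbf{G}|^2/\beta^2}$ for a fixed in-plane magnitude `g_par`; the angular factor $\tilde{P}_l^m(\cos\theta_{\mathbf{G}})e^{im\phi_{\mathbf{G}}}$ is omitted for brevity and would multiply the integrand in a full evaluation. The function name, the cutoff, and the step count are all assumptions.

```python
import math


def gz_integral(l: int, g_par: float, beta: float, n_steps: int = 2000) -> float:
    """Trapezoid estimate of the integral over G_z of
    |G|^(l-2) exp(-pi^2 |G|^2 / beta^2), with |G|^2 = g_par^2 + G_z^2."""
    if g_par <= 0.0:
        raise ValueError("the in-plane reciprocal vector must be nonzero")
    a = (math.pi / beta) ** 2
    half_width = 8.0 * beta / math.pi  # integrand is damped by exp(-64) here
    h = 2.0 * half_width / n_steps

    def integrand(gz: float) -> float:
        g2 = g_par * g_par + gz * gz
        return g2 ** (0.5 * (l - 2)) * math.exp(-a * g2)

    total = 0.5 * (integrand(-half_width) + integrand(half_width))
    for k in range(1, n_steps):
        total += integrand(-half_width + k * h)
    return total * h


beta, g_par = 1.5, 0.8
a = (math.pi / beta) ** 2
# For l = 2 the radial factor is a pure Gaussian with a closed-form integral.
exact_l2 = math.exp(-a * g_par**2) * beta / math.sqrt(math.pi)
print(gz_integral(2, g_par, beta), exact_l2)
```

The $l=2$ and $l=4$ cases have simple closed forms and serve as checks on the quadrature before the angular factor is added.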
1. M. Honda, K. Sato, and S. Obara, J. Chem. Phys. 94, 3790 (1991).
2. T. Helgaker and P. R. Taylor, Theor. Chim. Acta 83, 177 (1992).
3. P. J. Stephens, F. J. Devlin, C. F. Chabalowski, and M. J. Frisch, J. Phys. Chem. 98, 11623 (1994).
4. T. Helgaker, H. Jensen, P. Jorgensen et al., DALTON, a molecular electronic structure program, Release 1.2, 2001, URL http://www.kjemi.uio.no/software/dalton/dalton.html
5. J. K. Laerdahl, T. Saue, and K. Faegri, Theor. Chem. Acc. 97, 177 (1997).
6. I. P. Grant and H. M. Quiney, Int. J. Quantum Chem. 80, 283 (2000).
7. T. Yanai, H. Iikura, T. Nakajima, Y. Ishikawa, and K. Hirao, J. Chem. Phys. 115, 8267 (2001).
8. P. M. Gill, B. J. Johnson, J. A. Pople, and M. J. Frisch, Int. J. Quantum Chem. S26, 319 (1992).
9. A. D. Becke, J. Chem. Phys. 98, 1372 (1993).
10. V. Barone, C. Adamo, and F. Mele, Chem. Phys. Lett. 249, 290 (1996).
11. C. Adamo, M. Cossi, and V. Barone, THEOCHEM 493, 145 (1999).
12. V. Saunders, R. Dovesi, C. Roetti, M. Causà, N. Harrison, R. Orlando, and C. M. Zicovich-Wilson, CRYSTAL98, http://www.chimifm.unito.it/teorica/crystal/ (1998).
13. T. Bredow and A. R. Gerson, Phys. Rev. B 61, 5194 (2000).
14. J. Muscat, A. Wander, and N. M. Harrison, Chem. Phys. Lett. 342, 397 (2001).
15. P. Baranek, A. Lichanot, R. Orlando, and R. Dovesi, Chem. Phys. Lett. 340, 362 (2001).
16. A. Wander and N. M. Harrison, J. Chem. Phys. 105, 6191 (2001).
17. M. Challacombe and E. Schwegler, J. Chem. Phys. 106, 5526 (1997).
18. M. Challacombe, J. Chem. Phys. 113, 10037 (2000).
19. A. M. N. Niklasson, Phys. Rev. B 66, 155115 (2002).
20. A. M. N. Niklasson, C. J. Tymczak, and M. Challacombe, J. Chem. Phys. 118, 8611 (2003).
21. C. J. Tymczak, V. Weber, E. Schwegler, and M. Challacombe, Phys. Rev. B (submitted).
22. A. M. N. Niklasson and M. Challacombe, Phys. Rev. Lett. 92, 193001 (2004).
23. V. Weber, A. M. N. Niklasson, and M. Challacombe, Phys. Rev. Lett. 92, 193002 (2004).
24. B. R. A. Nijboer and F. W. De Wette, Physica (Amsterdam) 23, 309 (1957).
25. B. R. A. Nijboer and F. W. De Wette, Physica (Amsterdam) 24, 422 (1958).
26. M. Challacombe, E. Schwegler, C. J. Tymczak, C. K. Gan, K. Nemeth, V. Weber, A. M. N. Niklasson, and G. Henkelman, MONDOSCF v1.0a9, A program suite for massively parallel, linear scaling SCF theory and ab initio molecular dynamics, 2001, URL http://www.t12.lanl.gov/home/mchalla/; Los Alamos National Laboratory Report No. LA-CC 01-2 (unpublished), Copyright University of California.
27. M. Towler, An introductory guide to Gaussian basis sets in solid-state electronic structure calculations (2000), notes for Summer School, Torino.
28. R. Dovesi, V. Saunders, C. Roetti, M. Causa, N. Harrison, R. Orlando, and C. Zicovich-Wilson, CRYSTAL98 Basis Sets, http://www.crystal.unito.it/Basis_Sets/ptable.html (2003).
29. M. Towler, CRYSTAL98 Basis Sets, http://www.tcm.phy.cam.ac.uk/~mdt26/crystal.html (1998).
30. A. Grüneich and B. A. Hess, Theor. Chem. Acc. 100, 253 (1998).
31. M. Gomez, Simple Intersection Tests for Games (1999), URL http://www.gamasutra.com/
32. C. J. Tymczak and M. Challacombe (unpublished).
33. M. Challacombe, C. White, and M. Head-Gordon, J. Chem. Phys. 107, 10131 (1997).
34. G. R. Ahmadi and J. Almlöf, Chem. Phys. Lett. 246, 364 (1995).
35. L. E. McMurchie and E. R. Davidson, J. Comput. Phys. 26, 218 (1978).
36. E. Hille, Ann. Math. 27, 427 (1926).
37. A. Redlack and J. Grindlay, Can. J. Phys. 50, 2815 (1972).
38. A. D. Becke, J. Chem. Phys. 88, 2547 (1988).
39. J. A. Pople, P. M. W. Gill, and B. G. Johnson, Chem. Phys. Lett. 199, 557 (1992).
40. C. K. Gan and M. Challacombe, J. Chem. Phys. 118, 9128 (2003).
41. Graphics Gems, edited by A. S. Glassner (Academic, New York, 1990).
42. The Portland Group, PGF90 V4.2 (2002), URL http://www.pgroup.com/
43. Redhat, REDHAT V9.0, http://www.redhat.com/ (2004).
44. M. Prencipe, A. Zupan, R. Dovesi, E. Apra, and V. R. Saunders, Phys. Rev. B 51, 3391 (1995).
45. N. M. Harrison and V. R. Saunders, J. Phys. I 4, 3873 (1992).
46. C. T. Lee, W. T. Yang, and R. G. Parr, Phys. Rev. B 37, 785 (1988).
47. A. D. Becke, Phys. Rev. A 38, 3098 (1988).
48. M. Causa, R. Dovesi, C. Pisani, and C. Roetti, Phys. Rev. B 33, 1308 (1986).
49. Periodic coordinates used in MONDOSCF validation (2004), URL http://www.t12.lanl.gov/~mchalla/
50. M. Catti, A. Pavese, R. Dovesi, and V. R. Saunders, Phys. Rev. B 47, 9189 (1993).
51. C. K. Gan, C. J. Tymczak, and M. Challacombe, J. Chem. Phys. (to be published).
52. C. K. Gan, T. D. Sewell, and M. Challacombe, Phys. Rev. B 69, 035116 (2004).
53. Handbook of Mathematical Functions, edited by M. Abramowitz and I. A. Stegun, 9th ed. (Dover, New York, 1987).