Implementation of the Molecular Electrostatic Potential over Graphics

May 30, 2019 - Inf. Model.2019XXXXXXXXXX-XXX .... (22−25) Such implementations show that there is a compromise between the accuracy and the speedup ...
1 downloads 0 Views 2MB Size
Article Cite This: J. Chem. Inf. Model. 2019, 59, 3120−3127

pubs.acs.org/jcim

Implementation of the Molecular Electrostatic Potential over Graphics Processing Units J. Ceś ar Cruz,† Raymundo Hernań dez-Esparza,† Á lvaro Vaź quez-Mayagoitia,‡ Rubicelia Vargas,† and Jorge Garza*,† †

Downloaded via UNIV OF SOUTHERN INDIANA on July 24, 2019 at 01:54:20 (UTC). See https://pubs.acs.org/sharingguidelines for options on how to legitimately share published articles.

Departamento de Química, División de Ciencias Básicas e Ingeniería, Universidad Autónoma Metropolitana-Iztapalapa, San Rafael Atlixco 186, Col. Vicentina, Iztapalapa. C. P. 09340, Ciudad de México, México ‡ Argonne Leadership Computing Facility, Argonne National Laboratory, 9700 S. Cass Avenue, Lemont, Illinois 60439, United States S Supporting Information *

ABSTRACT: The molecular electrostatic potential (MEP) generated by quantum chemistry methods and Gaussian functions is evaluated over graphics processing units (GPUs). This implementation is based on full-range Rys polynomials with nodes and weights obtained in each thread of a GPU. For high angular moments, the corresponding integral is solved using a onedimension vertical recurrence relation. Thus, we computed the MEP with minimal approximations. We show that this implementation is stable and very efficient since the time consumed over GPUs is quite small compared with similar implementations over CPUs. The implementation was done by using CUDA-C programming techniques within the Graphics Processing Units for Atoms and Molecules (GPUAM) project, which has been designed to analyze quantum chemistry fields over heterogeneous computational resources. With this new scalar field GPUAM is a useful application for the quantum chemistry community, in particular for people interested in chemical reactivity analysis.



INTRODUCTION

In many quantum chemistry methods, the electron density is represented in terms of a set of orbitals,11,12 {ψi(r)}, from the expression

The molecular electrostatic potential (MEP) is an important property of a molecular system that may be used to elucidate its intrinsic chemical reactivity.1−4 The MEP definition is based on Coulomb’s law5,6 Φ(r2) =

∑ A

ZA − |r2 − RA|

ρ(r) =

i

1

2

(1)

where the charges in the system are involved in the evaluation of this property at the point r2. In our case, ZA represents the nuclear charge of each atom in the system and the electron density is represented by ρ(r). From definition 1, one can deduce that there are sites in a molecule where nuclei command and therefore Φ(r) is positive and regions where the electron density rules and consequently Φ(r) is negative; this characteristic has been widely used in quantum chemistry applications.2,7−10 © 2019 American Chemical Society

(2)

where the occupation of each orbital is represented by ωi. In this article, we do discuss the evaluation of eq 2 by using a basis set, {ϕμ}, through the linear combination

ρ (r )

∫ dr1 |r −1r |

∑ ωiψi*(r)ψi(r)

K

ψi(r) =

∑ cμ(i)ϕμ(r) μ=1

(3)

where K represents the number of functions in the basis set and {c(i) μ } are obtained in the minimization process of the total Received: December 23, 2018 Published: May 30, 2019 3120

DOI: 10.1021/acs.jcim.8b00951 J. Chem. Inf. Model. 2019, 59, 3120−3127

Article

Journal of Chemical Information and Modeling energy. If the basis set is build with real functions then {c(i) μ } are real and eq 2 has the form K

K

∑ ∑ Pνμϕμ(r)ϕν(r)

ρ(r) =

(4)

μ=1 ν=1

with Pνμ =

∑ ωicν(i)cμ(i)

(5)

i

In this way Φ(r2) =

∑ A

ZA − |r2 − RA|

K

∑ μ,ν=1

Pνμ

∫ dr1

ϕμ(r1)ϕν(r1) |r1 − r2|

Figure 1. Kernel used in the GPUAM code to distribute the MEP evaluation over each GPU thread.

(6)

The main problem to evaluate Φ(r2) is related to the integral of the last equation since it must be computed at each point r2 contained within a grid where the MEP is analyzed. As an example, we will discuss at the end of this article one system that requires of 3 873 096 points in the grid to obtain a highquality picture of the MEP mapped on the electron density for a system with 146 nuclei. There are proposals to evaluate this kind of integrals,13−21 but this is still an open issue in quantum chemistry to accelerate the corresponding calculations, although many of them use some approximations to estimate it.22−25 Such implementations show that there is a compromise between the accuracy and the speedup to evaluate the MEP since the bigger accuracy, the bigger the computational effort. We can find programs that evaluate very fast the MEP. However, sometimes, this evaluation is achieved by using strong approximations, and therefore, one must use it only for qualitative interpretations since atomic charges are involved in computing this property.26 During the last years, graphics processing units (GPUs) have gained an important place within the high-performance computing. Evidently computational modeling has taken advantage of this hardware with a huge impact in chemistry. We can mention two fields where GPUs have contributed to accelerate some applications: molecular dynamics27−30 and quantum chemistry.16,20,31−36 In this context, some of us have reported the evaluation of scalar and vector fields defined in quantum chemistry, over grids by using GPUs. The graphics processing units for atoms and molecules (GPUAM)37,38 project has as its primary purpose the development of a code where personal computers, with an installed GPU, can be used for some demanding computational tasks. This code uses as part of the input data WFN or WFX files provided by programs as GAMESS,31 Gaussian,32 NWChem,33 or Molden2AIM,39 to obtain nuclei coordinates, exponents of the basis set, and the coefficients {ciν}. At the end of the evaluation of a scalar or vector field, GPUAM writes the corresponding results in permanent memory through files with cube format, which is a standard extension introduced by Gaussian, Inc. to analyze some scalar fields defined by quantum chemistry. GPUAM has been designed to run over personal computers as a tool to analyze the electron density or related quantum chemistry fields. However, parts of this code can be inserted with small efforts in quantum chemistry codes to accelerate the evaluation of some scalar or vector fields through GPUs. In GPUAM, each point of the grid selected by a user is assigned to each thread of a GPU. In Figure 1 we present the kernel where GPUAM distributes the grid over a GPU,37,38 which has been developed under CUDA-C programming

techniques.40 In these days some GPUs can be used for highperformance computing41−49 since they have installed many threads with appreciable clock rate; consequently, the time used by this computing architecture is smaller than that obtained by central processing units (CPUs).38 To the best of our knowledge, there have been a few efforts to evaluate Φ(r) from eq 6 over GPUs, although important contributions have been reported around the evaluation of two-electron integrals,15,16,20,21,36,50,51 which is strongly related to the MEP. Thus, the aim of this article is the implementation of this quantity through the GPUAM code combining two methods well-established in quantum chemistry, with the primary purpose to reduce the elapsed time involved in the evaluation of the MEP. In the next section, we give some details of the implementation. We show its efficiency in the Results and Discussion section when this is applied over some chemical systems, and their results are contrasted with those obtained by the multiwfn code,52 which is widely used by the quantum chemistry community, and consequently, this is a reference for these kinds of tasks.10,53−55



IMPLEMENTATION DETAILS In this section, we do present details to evaluate the integral in eq 6 for Gaussian functions defined as i

j

2

k

ϕμ(r; A) = xAμyAμ zAμe−ζμrA

(7)

with iμ, jμ, kμ = 0, 1, 2, ... rA2 = |r − RA|2 = xA2 + yA2 + zA2

(8)

xA = x − XA

(9)

and so on for y and z. The standard way to transform the integral starts writing the exponential as 2

2

2

e−ζμrA − ζνrB = Me−p |r − P|

(10)

with p = ζμ + ζν P=

(11)

ζμRA + ζν R B p

(12)

and 2

M = e−ζμζν |RA − RB| 3121

/p

(13) DOI: 10.1021/acs.jcim.8b00951 J. Chem. Inf. Model. 2019, 59, 3120−3127

Article

Journal of Chemical Information and Modeling Following the approach of Boys56 coupled with the Macmurchie and Davidson57 method to deal with powers of xA, yA, and zA, the integral becomes

∫ dr1

ϕμ(r1)ϕν(r1) |r1 − r2|

XPB = Px − XB

The process starts from I00 = 1

L

=

∑ CmFm(X )

where Fm(X) is Fm(X ) =

∫0

1

dt t 2me−Xt

2

(15)

and Cm are expansion coefficients where m reaches its maximum value when L = iμ + iν + jμ + jν + kμ + kν. The exponent X is defined from X = p |P − r2|2

(16)

Jn + 1(t , X ) = (t − αn)Jn (t , X ) − βnJn − 1(t , X )

Another way to express eq 14 is in terms of a polynomial, PL(t2), of degree L as

∫ dr1

ϕμ(r1)ϕν(r1) |r1 − r2|

=

∫0

1

dtPL(t 2)e−Xt

∫0

dte

−Xt 2

2

π = erf( p |P − r2|) 2 p |P − r2|

b

(17)

αn =

1

2

dtPL(t 2)e−Xt =

βn =

with N = L/2 + 2. Weights, wκ, and roots, tκ, of Rys polynomials are needed to evaluate exactly integral 19. It is convenient that PL(t2) can be written in terms of onedimensional Cartesian integrals Iiμiν(t2), Ijμjν(t2), and Ikμkν(t2) for x, y, and z coordinates, respectively.58 Thus, we have PL(t 2) =

2π p

μν

μ ν

μν

κ=1

(iIiμ − 1, iν + jIi

μ , iν − 1

Iiμ , iν + 1(t 2) = (XPB − t 2X Pr2)Iiμiν + (iIiμ − 1, iν + jIi

μ , iν − 1

)

β0 =

(28)

βn Jn − 1 ,

n = 0, 1, ... N

(29)

1

∫−1 dt w(t , X ) = ∫−1 dt e−Xt

2

(30)

Moments of the weight function (μn) appear in the evaluation of βn, which are defined as 1

μn =

2

∫−1 dte−Xt t n

(31)

A recurrence relation is obtained for these moments in the integration by parts (21)

μn =

1 (1 − t 2) 2p

n−1 1 μ − e −X , 2X n − 2 X

n = 2, 4, 6

(32)

with μ0 =

(22)

π erf( X ) X

(33)

For large values of n, this method is numerically unstable.62 However, in the evaluation of the MEP high values of n are not necessary since, in the worst case, angular moment values are around 12 and this method starts to be unstable for n > 20.63

where XPA and XPB are given by XPA = Px − XA

βn + 1 Jn + 1 +

1

(20)

1 (1 − t 2) 2p

)

b

∫a dt w(t , X )Jn2− 1(t , X )

This equation is obtained from eq 26 by using the normalization condition of the monic orthogonal polynomials.61 The starting point of the recurrence relation 29 is J−1 = 0, J0 = 1, and

from here it is clear the dependence of PL on μ and ν. Rys, Dupuis, and King (RDK) designed a recursive method to obtain these integrals.58 In this report we use the vertical recurrence relations for each Cartesian coordinate, e.g., for the X coordinate Iiμ + 1, iν(t 2) = (XPA − t 2X Pr2)Iiμiν +

∫a dt w(t , X )Jn2 (t , X )

tJn(t , X ) =

N

∑ wκIi i (tκ2)I j j (tκ2)Ik k (tκ2)

(27)

For the integral interval t ∈ [−1, 1], the recursion coefficient αn is zero because this quantity has involved an odd function. For that reason, we do use full-range Rys polynomials to avoid computational work. Thus, the recurrence relation is written as60

(19)

κ=1

b

∫a dt w(t , X )Jn2 (t , X ) b

(18)

N

∑ wκPL(tκ2)

∫a dt w(t , X )tJn2(t , X )

and

where erf represents the error function. For L > 0 the corresponding integral is evaluated numerically. In our case we use a N-point Gaussian quadrature rule

∫0

(26)

is involved for monic orthogonal polynomials, with αn and βn obtained from

Precisely, in this article the integral involved in eq 17 is evaluated over a GPU. The first case is obtained for L = 0 1

(25)

The Rys polynomials form an orthogonal set of functions 2 with the weight function w(X, t) = e−Xt ; the half-range Rys polynomials, Rm(t), in the interval t ∈ [0, 1] and the full-range polynomials, Jm(t), in the interval t ∈ [−1, 1]. From here, it is clear that roots and weights of Rys polynomials must be evaluated over each point of the grid where the MEP is computed. This task is computationally demanding, and for that reason, some approximations have been reported to evaluate roots and weights.59 In general, a three-term recurrence relation

(14)

m=0

(24)

(23)

and 3122

DOI: 10.1021/acs.jcim.8b00951 J. Chem. Inf. Model. 2019, 59, 3120−3127

Article

Journal of Chemical Information and Modeling To control possible numerical issues, in this article, we used two different methods to obtain βn coefficients. For X ⩽ 100, integrals of eq 28 were solved numerically using a Lagrange Gaussian quadrature of one hundred points according to the discretized Gautschi-Stieltjes method.64 For X > 100 we use the Gram-Schmidt procedure to construct the Rys polynomials, in this way numerical errors are not present when X → ∞. To obtain roots and weights of the Rys polynomials, the N × N Jacobi matrix jij jj jj jj jj jj jj jj jj jj J = jjjj jj jj jj jj jj jj jj jj jj jj k

0

β1

0

0

...

β1

0

β2

0

...

0

β2

0

β3

...

0 .

0 .

β3 .

0 .

... .

.

.

.

.

.

0

0

0

0

βN

0 zyz zz zz 0 zzzz zz zz 0 zzzz zz zz 0 zzzz z . zzzz zz z βN zzz zz zz z 0 zz {

To test our implementation in GPUs, we studied 106 molecules, where the number of primitive functions and the number of occupied orbitals were increased in ascending order. From molecule 1 to molecule 62 the number of primitive functions and occupied orbitals allow the use of a GPU installed in a laptop. For this set of molecules we do compare results obtained by a GTX 1050Ti GPU and 4 Intel(R) Core i7-4770 CPU @ 3.4 GHz. For this comparison, we used in average 133017 points in the grid for each molecule. From molecule 63 to molecule 106 we used a high-performance computing system comparing results between a Titan V GPU and 32 Intel(R) Xeon E5-2683 CPU v4 @ 2.1 GHz, and 132 578 points in average within the grid. All structures considered in this article were optimized by using different methods and basis sets. Each structure, method, basis set, and name for each system is listed in the Supporting Information (SI). Thus, we have tested the MEP implementation in the GPUAM code with different molecules, methods, and basis sets. All these molecules were optimized, and their minima were confirmed by frequency analysis. All calculations were done by using the G09 code (previous release of G1632), which provided the corresponding WFX files. The MEP obtained by the GPUAM code was contrasted with that computed by the multiwfn code52 using the same number points in the grid. For the five largest molecules involved in our set of study we tested 5 GPUs: Tesla K20m, GTX 1050Ti, Quadro GP100, Titan V, and Tesla V100-SXM2. All calculations were done in single precision, and the differences presented with regard to double precision are contrasted at the end of the next section.

(34)

must be diagonalized, its eigenvalues are the roots (tκ) and the weights (wκ) are obtained from the Christoffel-Darboux relation63 1

wκ =

N−1 ∑i = 0 Ji (tκ )2

(35)

In our implementation, the QL algorithm was used to diagonalize J.65 Finally, to evaluate the MEP only K(K + 1)/2 values are required since we have symmetric matrices to obtain K

1

ρ (r ) ∫ dr1 |r −1r | = ∑ Gμμ̃ ∫−1 dtPL(t 2)e−Xt 1

K

2

μ

K

̃ +2 ∑ ∑ Gμν μ ν>μ

1

∫−1 dtPL(t 2)e−Xt

2

(36)

2

(37)

where K

̃ = Gμν

2

∑ Pνμe ζ ζ /p |A− B| μ ν

μν

(38)

Thus, this is the final expression coded in the GPUAM. In the algorithm below, we present a summary of the operations to be executed over each thread of a GPU, which is a mix of two methods: Gram-Schmidt plus the discretized Gautschi-Stieltjes method, which can be implemented in GPUs or CPUs. The accuracy of the MEP depends on the number of points used to evaluate numerically the integrals involved in the recurrence relationships 21 and 22, and also on the numerical method to evaluate the βn coefficients, which are obtained by two methods to control the numerical precision, which is confirmed in Table S1 of the Supporting Information (SI). A full discussion around errors involved in the Rys quadrature was done by Yasuda.15 In our implementation the number of points is determined by L through the relation N = L/2 + 2, we found this number adequate to obtain a good performance/ accuracy ratio.



RESULTS AND DISCUSSION Technical details of our implementation can be found in the SI. Thus, in the main text we left the corresponding results for their discussion. The time used to evaluate the MEP for 106 molecules is reported in Tables S4 and S5. For these two tables GPUAM and multiwfn codes were executed with same input data to obtain a fair comparison. GPUAM was executed over a NVIDIA(R) GeForce GTX 1050Ti GPU (Table S4) and a NVIDIA(R) Titan V GPU (Table S5). The multiwfn code was executed over 4 Intel(R) Core i7-4770 CPU @ 3.40 GHz (Table S4) and 32 Intel(R) Xeon E5-2683 CPU v4 @ 2.1 GHz 3123

DOI: 10.1021/acs.jcim.8b00951 J. Chem. Inf. Model. 2019, 59, 3120−3127

Article

Journal of Chemical Information and Modeling (Table S5). An extract from these tables is presented in Table 1. In these tables, we report the number of primitive Gaussian Table 1. Time (s) Used to Evaluate the MEP time (s) molecule

primitives

orbitals

points

1 21 39 47 54 57

62 200 302 456 631 749

7 18 28 38 59 70

135675 131936 133920 132600 130052 134504

75 77 91 101 103 104

416 700 1008 1647 2210 2371

51 80 86 130 104 127

133380 133000 132540 132300 134848 134504

GPUAMa 1 10 22 56 68 100 GPUAMc 4 11 21 49 95 90

multiwfnb 3 40 96 230 476 714 multiwfnd 41 120 262 751 1667 2063

Figure 2. Logarithm of the root-mean-square deviation (RMSD) between data obtained by GPUAM and multiwfn for 106 molecules.

points. For this comparison, we used three GPUs installed in personal computers: GTX 1050Ti, Titan V (used previously in Table 1), and a Quadro GP100, besides, two Tesla GPUs used mainly for high-performance computing: K20m and V100SXM2. We want to mention that the Titan V GPU is installed in the same desktop where the multiwfn code was executed with 8 Intel(R) Core i7 CPUs. We stress this point since in this case, we do show that desktop computers with GPUs as boosters offer an excellent solution for demanding computational tasks. As an example, in Figure 3, the MEP is projected over an isosurface of the corresponding electron density for the molecule 106. The elapsed time to compute MEP is reported in Table 2 for molecules 102−106.

a

NVIDIA(R) GeForce GTX 1050Ti. b4 Intel(R) Core i7-4770 CPU @ 3.40 GHz. cNVIDIA(R) Titan V. d32 Intel(R) Xeon E5-2683 CPU v4 @ 2.1 GHz.

functions, the number of occupied orbitals, and the number of points in the grid points involved in the MEP evaluation for each molecule. The time, in seconds, used by GPUAM or multiwfn is reported in all tables. From Table 1 it is clear that GPUAM uses less time to evaluate the MEP than multiwfn, and this difference is enhanced while the number of primitive functions times the number of orbitals is increased. In principle, we expect a linear relationship between the elapsed time for the execution and the product primitives × orbitals × points. However, the Jacobian matrix depends on the angular moment involved in the basis set and consequently the time to find its eigenvalues is not linear; for this reason when the basis set contains d or f functions the total time is increased notably. We want to mention that the multiwfn code evaluates integrals by using approximations to the Boys function according to Cook’s book,66 which induces several numerical errors, and for that reason, other programs do not use anymore this approach to evaluate two-electron integrals.31,32 Thus, the MEP obtained by GPUAM and multiwfn is not precisely the same in a molecule. For this reason, we contrasted such results between cube files, comparing both results for each point in the grid. For this comparison we used the root-mean-square deviation (RMSD) defined as RMSD =

1 points

Figure 3. MEP of a system with 146 nuclei, 301 orbitals, and 2680 primitive functions over a grid of 3 873 096 points mapped over the electron density (ρ = 0.01 a.u.). Negative regions in blue color, and positive regions in red color.

points

∑ i=1

|viGPUAM − vimultiwfn|2 (39)

From Table 2, it is clear the impact of high angular moment values in the basis set over our implementation, since for the molecule 102 the product primitives × orbitals is less than that obtained for the molecule 106. However, for both cases GPUAM spends almost the same time to compute the MEP. The main difference in these systems is that the molecule 102 has f functions in its basis set, and as we have discussed above, this is a critical parameter in our implementation. Results in Table 2 also reveal how NVIDIA(R) has improved its GPUs through the time since a GPU in a laptop (GTX 1050Ti) shows a similar performance than that exhibited by a Tesla K20m GPU, which was designed for

where points is the number of points in the grid for this quantity, which is reported in Figure 2 for the 106 molecules tested in this article. From here, we can say that there is a small deviation of the MEP between GPUAM and multiwfn, even when multiwfn truncates the evaluation of integrals for large values of the exponents. To compare the performance of several GPUs, the MEP was evaluated by using different NVIDIA(R) GPUs for the last five molecules in our set; 102−106. The basis set of these molecules contain more than 2000 primitive functions, and we computed the MEP over grids with more than 2 millions of 3124

DOI: 10.1021/acs.jcim.8b00951 J. Chem. Inf. Model. 2019, 59, 3120−3127

Article

Journal of Chemical Information and Modeling

Table 2. Time Used for GPUAM to Evaluate MEP in Five Systems over Five Different GPUs: Tesla K20m (GPU1), GTX 1050Ti (GPU2), Quadro GP100 (GPU3), Titan V (GPU4), Tesla V100-SXM2 (GPU5) time (s) molecule

primitives

orbitals

points

GPU1

GPU2

GPU3

GPU4

GPU5

102 103 104 105 106

2118 2210 2371 2442 2680

95 104 127 249 301

2987580 2203422 2718328 3154250 3873096

7015 14402 12025 5466 7240

7229 11686 9752 5662 7845

1468 2797 2342 1147 1514

1163 2174 1756 879 1169

1004 (537) 2515 (1360) 2101 (1180) 824 (453) 1030 (553)

Table 3. Comparison between Single and Double Precision Obtained with a GPU NVIDIA Quadro GP100 for Five Moleculesa time (s)

minimum MEP

maximum MEP

system

points

single

double

single

double

single

double

RMSD

22 30 65 68 101

1048125 1165376 1465282 1223790 1666746

5.0 4.0 6.0 13.0 121.0

9.0 8.0 10.0 24.0 243.0

−0.04710197 −0.03896332 −0.11490200 −0.00436378 −0.29938130

−0.04709860 −0.03896302 −0.11490240 −0.00436497 −0.29940280

362.95680 145.08070 89.910100 337.22880 99.774070

362.95830 145.08080 89.910080 337.23150 99.773860

0.000003 0.000002 0.000003 0.000006 0.000061

a

Minimum and maximum represent the lowest and the biggest values of the MEP in atomic units.

several NVIDIA(R) GPUs. Besides, if a user does not have NVIDIA GPUs installed in the system, we have an OpenMP implementation over CPUs optimized for Intel compilers. The binary file can be found in the same site of the GPUs version.

high performance computing, although this company releases the K20m 6 years ago. Concerning the Titan V performance, it is impressive the short times exhibited by the GPUAM on this GPU, although the performance shown by the Quadro GP100 GPU is excellent. In Table 2, the GPU5 has two rows, the second one has results in parentheses obtained by 2 T V100SXM2 GPUs. From these results, GPUAM shows its capability to take advantage of the two GPUs installed in the system, in fact, this code is designed to use all NVIDIA(R) GPUs installed in a server, if a user wants to do it. The impact of the double precision to evaluate the MEP over GPUs is presented in Table 3 comparing single and double precision calculations. First, we do observe an expected result for a GPU optimized to execute double precision applications since the elapsed time to run over double precision is twice the time used by a single precision execution. In terms of precision, in the same table the maximum and minimum value delivered by the MEP for each system is compared between single and double precision executions, with differences in the sixth figure of this comparison, except in systems where the basis set contains large angular moment values, for these cases the difference is increased. This difference has an important impact over the RMSD found between single and double precision data sets, which is also reported in Table 3. This deterioration from double to single precision is due to the number of points used to evaluate numerically the corresponding integral. It is worth noting that GPUs present a crucial limitation to work in double precision when an applications requires constant memory since these devices contain only 65 kB, and consequently, the arrays used to evaluate the electron density are restricted in size. In our opinion, the GPUs for personal computers are good solutions for tasks involved in many quantum chemistry applications. In our implementation, we have used CUDA to take advantage of GPUs. However, there are alternatives to use heterogeneous hardware through OpenCL programming techniques. On this way, we are rewriting our code to offer an application which can be exploited by GPUs, CPUs, or any multiple thread device with OpenCL support. The implementation presented in this article can be obtained from https:// github.com/gpuam/binaries.git, which has been tested over



CONCLUSIONS An implementation of the molecular electrostatic potential over graphics processing units was given in detail in this article. We proposed an approach where two numerical techniques are mixed to obtain reliable results and avoid numerical errors. The implementation over the GPUAM code was tested on personal computers with GPUs of the last generation. To the best of our knowledge, this is the first time where such an approach is proposed to use GPUs as accelerators of personal computers. In this work we use a recursive Rys approach because this method needs small memory. However, there are additional approaches to evaluate these integrals. Quite recently Zhang proposed a mix of three approaches to evaluate two-electron integrals over CPUs.67 We think this is a good approach that must be tested in GPUs or heterogeneous computing.



ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.8b00951. (1)Technical details to evaluate numerically the molecular electrostatic potential. (2) Comparison between GPUAM and multiwfn to evaluate the molecular electrostatic potential. (3) Coordinates of molecular systems considered in this article (PDF)



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. ORCID

Jorge Garza: 0000-0003-4249-6078 Notes

The authors declare no competing financial interest. 3125

DOI: 10.1021/acs.jcim.8b00951 J. Chem. Inf. Model. 2019, 59, 3120−3127

Article

Journal of Chemical Information and Modeling



(18) Sandberg, J. A. R.; Rinkevicius, Z. An algorithm for the efficient evaluation of two-electron repulsion integrals over contracted Gaussian-type basis functions. J. Chem. Phys. 2012, 137, 234105. (19) Miao, Y.; Merz, K. M. Acceleration of Electron Repulsion Integral Evaluation on Graphics Processing Units via Use of Recurrence Relations. J. Chem. Theory Comput. 2013, 9, 965−976. (20) Kussmann, J.; Ochsenfeld, C. Preselective Screening for LinearScaling Exact Exchange-Gradient Calculations for Graphics Processing Units and General Strong-Scaling Massively Parallel Calculations. J. Chem. Theory Comput. 2015, 11, 918−922. (21) Kussmann, J.; Ochsenfeld, C. Hybrid CPU/GPU Integral Engine for Strong-Scaling Ab Initio Methods. J. Chem. Theory Comput. 2017, 13, 3153−3159. (22) Kikuchi, O.; Nakajima, H.; Horikoshi, K.; Takahashi, O. Rapid evaluation of molecular electrostatic potential maps by simple analytical functions - extension to compounds containing pi-electron systems. J. Mol. Struct.: THEOCHEM 1993, 285, 57−69. (23) Lee, M.; Leiter, K.; Eisner, C.; Knap, J. Atom-partitioned multipole expansions for electrostatic potential boundary conditions. J. Comput. Phys. 2017, 328, 344−353. (24) Weiner, P. K.; Langridge, R.; Blaney, J. M.; Schaefer, R.; Kollman, P. A. Electrostatic potential molecular-surfaces. Proc. Natl. Acad. Sci. U. S. A. 1982, 79, 3754−3758. (25) Gasteiger, J.; Marsili, M. Iterative partial equalization of orbital electronegativity - a rapid access to atomic charges. Tetrahedron 1980, 36, 3219−3228. (26) O’Boyle, N. M.; Banck, M.; James, C. A.; Morley, C.; Vandermeersch, T.; Hutchison, G. R. Open Babel: An open chemical toolbox. J. Cheminf. 2011, 3, 33. (27) Lee, T.-S.; Cerutti, D. S.; Mermelstein, D.; Lin, C.; LeGrand, S.; Giese, T. J.; Roitberg, A.; Case, D. A.; Walker, R. C.; York, D. M. GPU-Accelerated Molecular Dynamics and Free Energy Methods in Amber18: Performance Enhancements and New Features. J. Chem. Inf. Model. 2018, 58, 2043−2050. (28) Abraham, M. J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J. C.; Hess, B.; Lindahl, E. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1−2, 19−25. (29) Stone, J. E.; Hardy, D. J.; Ufimtsev, I. S.; Schulten, K. GPUaccelerated molecular modeling coming of age. J. Mol. Graphics Modell. 2010, 29, 116. (30) Anderson, J. A.; Lorenz, C. D.; Travesset, A. General purpose molecular dynamics simulations fully implemented on graphics processing units. J. Comput. Phys. 2008, 227, 5342. (31) Schmidt, M. W.; Baldridge, K. K.; Boatz, J. A.; Elbert, S. T.; Gordon, M. S.; Jensen, J. H.; Koseki, S.; Matsunaga, N.; Nguyen, K. A.; Su, S. J.; Windus, T. L.; Dupuis, M.; Montgomery, J. A. General Atomic and Molecular Electronic-Structure System. J. Comput. Chem. 1993, 14, 1347−1363. (32) Frisch, M. J. et al. Gaussian 16 Revision A.03; Gaussian Inc.: Wallingford CT, 2016. (33) Valiev, M.; Bylaska, E.; Govind, N.; Kowalski, K.; Straatsma, T.; Van Dam, H. J. J.; Wang, D.; Nieplocha, J.; Apra, E.; Windus, T.; de Jong, W. NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations. Comput. Phys. Commun. 2010, 181, 1477−1489. (34) Liu, F.; Luehr, N.; Kulik, H. J.; Martínez, T. J. Quantum Chemistry for Solvated Molecules on Graphical Processing Units Using Polarizable Continuum Models. J. Chem. Theory Comput. 2015, 11, 3131−3144. (35) Liu, F.; Sanchez, D. M.; Kulik, H. J.; Martínez, T. J. Exploiting graphical processing units to enable quantum chemistry calculation of large solvated molecules with conductor-like polarizable continuum models. Int. J. Quantum Chem. 2019, 119, No. e25760. (36) Kussmann, J.; Ochsenfeld, C. Employing OpenCL to Accelerate Ab Initio Calculations on Graphics Processing Units. J. Chem. Theory Comput. 2017, 13, 2712−2716. (37) Hernández-Esparza, R.; Mejia-Chica, S. M.; Zapata-Escobar, A. D.; Guevara-García, A.; Martínez-Melchor, A.; Hernández-Pérez, J.

ACKNOWLEDGMENTS The authors are thankful for the facilities provided by the Laboratorio de Supercómputo y Visualización en Paralelo at the Universidad Autónoma Metropolitana-Iztapalapa and NVIDIA Corporation since part of the results related to GPUs were provided by the NVIDIA PSG Cluster. This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357. Partial funding was provided by CONACYT, México, through the project FC-2016/2412. R.H.-E. and J.-C.C. thank CONACYT for scholarships 283251 and 620190, respectively.



REFERENCES

(1) Murray, J. S.; Politzer, P. The electrostatic potential: an overview. Wiley Interdiscip. Rev.-Comput. Mol. Sci. 2011, 1, 153−163. (2) Liu, L.; Miao, L.; Li, L.; Li, F.; Lu, Y.; Shang, Z.; Chen, J. Molecular Electrostatic Potential: A New Tool to Predict the Lithiation Process of Organic Battery Materials. J. Phys. Chem. Lett. 2018, 9, 3573−3579. (3) Wang, J.; Liu, Z.; Frank, J.; Moore, P. B. Identification of ions in experimental electrostatic potential maps. IUCrJ 2018, 5, 375−381. (4) Ishikawa, T. Ab initio quantum chemical calculation of electron density, electrostatic potential, and electric field of biomolecule based on fragment molecular orbital method. Int. J. Quantum Chem. 2018, 118, No. e25535. (5) Scrocco, E.; Tomasi, J. The electrostatic molecular potential as a tool for the interpretation of molecular properties. New Concepts II; Berlin, Heidelberg, 1973; pp 95−170. (6) Scrocco, E.; Tomasi, J. In Electronic Molecular Structure, Reactivity and Intermolecular Forces: An Euristic Interpretation by Means of Electrostatic Molecular Potentials; Löwdin, P.-O., Ed.; Advances in Quantum Chemistry; Academic Press, 1978; Vol. 11; pp 115−193. (7) Bauza, A.; Seth, S. K.; Frontera, A. Molecular electrostatic potential and “atoms-in-molecules” analyses of the interplay between π-hole and lone pair··· π /X-H··· π /metal··· π interactions. J. Comput. Chem. 2018, 39, 458−463. (8) Murray, J. S.; Politzer, P. Molecular electrostatic potentials and noncovalent interactions. Wiley Interdiscip. Rev.-Comput. Mol. Sci. 2017, 7, No. e1326. (9) Bijina, P. V.; Suresh, C. H.; Gadre, S. R. Electrostatics for probing lone pairs and their interactions. J. Comput. Chem. 2018, 39, 488−499. (10) Mehmood, A.; Jones, S. I.; Tao, P.; Janesko, B. G. An OrbitalOverlap Complement to Ligand and Binding Site Electrostatic Potential Maps. J. Chem. Inf. Model. 2018, 58, 1836−1846. (11) Szabo, A.; Ostlund, N. S. Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory; Dover: New York, 1996. (12) Helgaker, T.; Jørgensen, P.; Olsen, J. Molecular Electronic Structure Theory; John Wiley & Sons, LTD: Chichester, 2000. (13) Pople, J. A.; Hehre, W. J. Computation of electron repulsion integrals involving contracted Gaussian basis functions. J. Comput. Phys. 1978, 27, 161−168. (14) Gill, P. M. Molecular integrals Over Gaussian Basis Functions. Adv. Quantum Chem. 1994, 25, 141−205. (15) Yasuda, K. Two-electron integral evaluation on the graphics processor unit. J. Comput. Chem. 2008, 29, 334−342. (16) Ufimtsev, I. S.; Martínez, T. J. Quantum Chemistry on Graphical Processing Units. 1. Strategies for Two-Electron Integral Evaluation. J. Chem. Theory Comput. 2008, 4, 222−231. (17) Asadchev, A.; Allada, V.; Felder, J.; Bode, B. M.; Gordon, M. S.; Windus, T. L. Uncontracted Rys Quadrature Implementation of up to G Functions on Graphical Processing Units. J. Chem. Theory Comput. 2010, 6, 696−704. 3126

DOI: 10.1021/acs.jcim.8b00951 J. Chem. Inf. Model. 2019, 59, 3120−3127

Article

Journal of Chemical Information and Modeling M.; Vargas, R.; Garza, J. Grid-Based algorithm to search critical points, in the electron density, accelerated by graphics processing units. J. Comput. Chem. 2014, 35, 2272−2278. (38) Hernández-Esparza, R.; Vázquez-Mayagoitia, A.; SorianoAgueda, L.-A.; Vargas, R.; Garza, J. GPUs as boosters to analyze scalar and vector fields in quantum chemistry. Int. J. Quantum Chem. 2019, 119, e25671. (39) Zou, W.; Nori-Shargh, D.; Boggs, J. E. On the Covalent Character of Rare Gas Bonding Interactions: A New Kind of Weak Interaction. J. Phys. Chem. A 2013, 117, 207−212. (40) Sanders, J.; Kandrot, E. CUDA by Example: An Introduction to General-purpose GPU Programming; Tsinghua University Press: Boston, 2010. (41) Sánchez-Linares, I.; Pérez-Sánchez, H.; Cecilia, J. M.; García, J. M. High-Throughput parallel blind Virtual Screening using BINDSURF. BMC Bioinf. 2012, 13, S13. (42) Imbernón, B.; Cecilia, J. M.; Pérez-Sánchez, H.; Giménez, D. METADOCK: A parallel metaheuristic schema for virtual screening methods. Int. J. High Perform. Comput. Appl. 2018, 32, 789−803. (43) Zhang, Q.; Wang, J.; Guerrero, G. D.; Cecilia, J. M.; García, J. M.; Li, Y.; Pérez-Sánchez, H.; Hou, T. Accelerated Conformational Entropy Calculations Using Graphic Processing Units. J. Chem. Inf. Model. 2013, 53, 2057−2064. (44) Guerrero, G. D.; Imbernón, B.; Pérez-Sánchez, H.; Sanz, F.; García, J. M.; Cecilia, J. M. A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing. BioMed Res. Int. 2014, 2014, 474219. (45) Guerrero, G.; Pérez-Sánchez, H.; Wenzel, W.; Cecilia, J.; García, J. In 5th International Conference on Practical Applications of Computational Biology & Bioinformatics; Rocha, M., Rodríguez, J., Fdez-Riverola, F., Valencia, A., Eds.; Springer, 2011; Vol. 93, Chapter 9, pp 63−70. (46) Computation of Non-Bonded Interactions in Drug Discovery: Nvidia GPUs vs. Intel Xeon Phi. Work-Conference on Bioinformatics and Biomedical Engineering, Granada, 2014; pp 579−588. (47) Guerrero, G. D.; Pérez-Sánchez, H.; Cecilia, J. M.; García, J. M. Parallelization of Virtual Screening in Drug Discovery on Massively Parallel Architectures. 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2012; pp 588−595. (48) Sánchez-Linares, I.; Pérez-Sánchez, H.; García, J. M. Accelerating Grid Kernels for Virtual Screening on Graphics Processing Units. Adv. Parallel Comput. 2011, 413−420. (49) Pérez-Sánchez, H.; Wenzel, W. Implementation of an effective non-bonded interactions kernel for biomolecular simulations on the Cell processor. Informatik 2009−Im Focus das Leben; Gesellschaft für Informatik: Lübeck, Deutschland, 2009; pp 721−729. (50) Luehr, N.; Ufimtsev, I. S.; Martinez, T. J. Dynamic Precision for Electron Repulsion Integral Evaluation on Graphical Processing Units (GPUs). J. Chem. Theory Comput. 2011, 7, 949−954. (51) Ufimtsev, I. S.; Martinez, T. J. Quantum Chemistry on Graphical Processing Units. 2. Direct Self-Consistent-Field Implementation. J. Chem. Theory Comput. 2009, 5, 1004−1015. (52) Lu, T.; Chen, F. Multiwfn: A multifunctional wavefunction analyzer. J. Comput. Chem. 2012, 33, 580−592. (53) Kang, X.; Qian, J.; Deng, J.; Latif, U.; Zhao, Y. Novel molecular descriptors for prediction of H2S solubility in ionic liquids. J. Mol. Liq. 2018, 265, 756−764. (54) Shayakhmetova, R. K.; Khamitov, E. M.; Mustafin, A. G. Modeling the Self-Assembly of 5-Hydroxy-6-methyluracil within Electrostatic Potential Approach. Russ. J. Phys. Chem. A 2018, 92, 1523−1529. (55) Yao, H.; Qian, D.; Zhang, H.; Qin, Y.; Xu, B.; Cui, Y.; Yu, R.; Gao, F.; Hou, J. Critical Role of Molecular Electrostatic Potential on Charge Generation in Organic Solar Cells. Chin. J. Chem. 2018, 36, 491−494. (56) Boys, S. F. Electronic wave functions. I. A general method of calculation for the stationary states of any molecular system. Proc. R. Soc. Lond. A 1950, 200, 542−554.

(57) Mcmurchie, L. E.; Davidson, E. R. One- and Two-electron Integrals over Cartesian Gaussian Functions. J. Comput. Phys. 1978, 26, 218−231. (58) Rys, J.; Dupuis, M.; King, H. F. Computation of Electron Repulsion Integrals Using the Rys Quadrature Method. J. Comput. Chem. 1983, 4, 154−157. (59) King, H. F. Strategies for Evaluation of Rys Roots and Weights. J. Phys. Chem. A 2016, 120, 9348−9351. (60) Shizgal, B. D. A novel Rys quadrature algorithm for use in the calculation of electron repulsion integrals. Comput. Theor. Chem. 2015, 1074, 178−184. (61) Gautschi, W. Orthogonal Polynomials: computation and approximation.; Oxford University Press: New York, 2004. (62) Schwenke, D. W. On the computation of high orders Rys quadrature weights and nodes. Comput. Phys. Commun. 2014, 185, 762−763. (63) Shizgal, B. D. Spectral Methods in Chemistry and Physics. Applications to kinetic theory and quantum mechanics; Wiley: Boston, 2015. (64) Sagar, R. P.; Smith, V. H. On the calculation of Rys Polynomials and Quadratures. Int. J. Quantum Chem. 1992, 42, 827−836. (65) Bowdler, H.; Martin, R.; Reinsch, C.; Wilkinson, J. Handbook series linear algebra - QR and QL Algorithms for Symmetric Matrices. Numer. Math. 1968, 11, 293. (66) Cook, D. B. Handbook of Computational Quantum Chemistry; Oxford University Press: New York, 1998. (67) Zhang, J. LIBRETA:Computerized Optimization and Code Synthesis for Electron Repulsion Integral Evaluation. J. Chem. Theory Comput. 2018, 14, 572−587.

3127

DOI: 10.1021/acs.jcim.8b00951 J. Chem. Inf. Model. 2019, 59, 3120−3127