
Application Note

WarpEngine, a flexible platform for distributed computing implemented in the VEGA program and specially targeted for Virtual Screening studies
Alessandro Pedretti, Angelica Mazzolari, and Giulio Vistoli
J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.8b00086 • Publication Date (Web): 10 May 2018


Journal of Chemical Information and Modeling

WarpEngine, a flexible platform for distributed computing implemented in the VEGA program and specially targeted for Virtual Screening studies
Alessandro Pedretti*, Angelica Mazzolari, and Giulio Vistoli
Dipartimento di Scienze Farmaceutiche, Facoltà di Farmacia, Università degli Studi di Milano, Via Luigi Mangiagalli, 25, I-20133 Milano, Italy


Abstract
The manuscript describes WarpEngine, a novel platform implemented within the VEGA ZZ suite of software for performing distributed simulations in both local and wide area networks. Despite being tailored for structure-based virtual screening campaigns, WarpEngine possesses the flexibility required to carry out distributed calculations with various pieces of software, which can be easily encapsulated within this platform without changing their source code. WarpEngine takes advantage of all the cheminformatics features implemented in the VEGA ZZ program as well as of its largely customizable scripting architecture, thus allowing an efficient distribution of various time-demanding simulations. To offer an example of the WarpEngine potential, the manuscript includes a set of virtual screening campaigns based on the ACE dataset of the DUD-E collection using PLANTS as the docking application. Benchmarking analyses revealed a satisfactory linearity of the WarpEngine performance, the speed-up values being roughly equal to the number of utilized cores. Moreover, the computed scalability values showed that the vast majority (i.e., > 90%) of the performed simulations benefits from the distributed platform presented here. WarpEngine can be freely downloaded along with the VEGA ZZ program at www.vegazz.net.




1. Introduction
The advancement in the “omics” disciplines has markedly increased the amount of available scientific data. The need to store, handle and analyze these huge databases has induced remarkable changes in both hardware architectures and computational tools. [1] In drug discovery, an example of such a big data revolution is offered by virtual screening (VS) campaigns, which analyze very large compound libraries to increase the chance of finding novel bioactive molecules. [2] Although VS studies can also involve ligand-based methods, the time-demanding structure-based docking approaches represent a very efficient way to perform successful VS campaigns [3,4]. Thus, maximizing the computational resources is an essential ingredient in developing enhanced VS approaches. [5] To this end, both parallel and distributed computing can be considered. The former has recently been fostered by the diffusion of motherboards equipped with multicore CPUs and finds its maximum expression in HPC supercomputers. Parallel computing requires specific code in which the main calculation is subdivided into a number of sub-calculations that run together, thus permitting calculations that would be unfeasible on single-core systems. [6] The efficiency of a parallel calculation depends on the strategy by which the main calculation is subdivided, which implies that old programs have to be extensively re-engineered to perform parallel calculations [7]. In contrast, distributed computing involves networks of independent computers which collaborate to achieve an objective. The individual computers do not share resources but only transfer input and output data through the network. [8] As a rule, parallel computing is more suitable for very complex simulations, such as MD runs or quantum mechanical calculations, while distributed computing becomes more productive when the overall simulation in fact comprises a huge number of reasonably simple and independent calculations, as seen in docking-based VS campaigns. A relevant motivation for distributed computing can also be found in the observation that desktop computers are often underused and can thus run calculations in the background without users noticing them, while optimizing the return on investment for hardware resources. Notably, old programs do not require significant re-engineering of their source code to run on distributed systems since they still run sequentially on each connected computer.

Distributed computing requires a general platform which encapsulates the executables, dispatches the calculations to the connected hardware resources and manages the I/O communications between them. Despite the great interest attracted, very few platforms for distributed computing have been proposed, and indeed most reported grid computing projects are based on the BOINC platform, which represents the gold standard for managing distributed calculations ranging from astrophysics to life sciences [9]. Alternatively, there are very specific in silico applications whose features include the possibility of distributing the computational effort to connected hardware resources. Recent examples focused on molecular modelling are offered by (a) MoSGrid Science Gateway, which performs various types of molecular simulations by distributed computing [10], (b) QuBiLS-MIDAS, which allows the calculation of molecular descriptors to be distributed on the web [11], (c) Copernicus, which is a distributed high-performance platform for molecular simulations [12], and (d) ChemScreener, which carries out library generation and virtual screening analyses in a platform-independent distributed computing environment [13].

On these grounds, the present study describes the WarpEngine architecture, a flexible client/server platform for distributed computing. It is implemented in the VEGA ZZ suite of programs, thus taking advantage of both its computational features (such as trajectory analysis or database handling) and its graphical interfaces. [14] While being substantially targeted at cheminformatics and bioinformatics applications, WarpEngine possesses the flexibility required to be largely customized and adapted for different purposes, running different pieces of software without changing their source code but using scripting languages. WarpEngine was designed to run in rather closed environments, such as the computers connected within a laboratory, but includes the network features necessary to perform calculations distributed worldwide. Along with a detailed description of its major characteristics, the capabilities of this platform are documented by performing a VS campaign based on a dataset taken from the DUD-E collection.



2. Methods
2.1 WarpEngine platform: the server side
WarpEngine is fully integrated in the VEGA ZZ program as a part of the already described PowerNet plug-in. [15] The same software can run either as a server or as a client, and thus specific versions for each role are not required. WarpEngine can work in both LAN and WAN but, in the latter, some minor restrictions can be applied to ensure security. Due to the asynchronous nature of the code, all implemented modules run as separate threads and the communications are implemented through events, mutexes and message queues as provided by the cross-platform library HyperDrive, which allows the parallelization of several time-critical functions required by VEGA ZZ (see Table S1). In detail, HyperDrive shows the following key characteristics:
(a) Hardware independent: the same application can be developed for every operating system without specific code.
(b) Same software for single or multiprocessor systems: the library checks the number and the type of CPUs and automatically switches itself from sequential to parallel mode, also taking advantage of the HyperThreading technology.
(c) No specific compiler required: any C/C++ compiler can be freely utilized and special symmetrical multiprocessor (SMP) extensions are not required.

As shown in Figure 1A, the server code includes several logically linked modules.

The Project manager handles the user-defined projects, which are defined by specific XML files. It can manage more than one calculation on the same server.

The Client manager watches the client activities based on user-defined policies. It supervises the client connections and the initial negotiation phase in which the client is recognized through an encrypted digital signature, thus avoiding malicious actions by non-authorized clients. Then, the client is added to the calculation pool and receives the software and the input data required for the calculation. Since the client-server communication is fully asynchronous and the Client manager does not know the status of the connected clients, client and server periodically exchange messages to check whether they are working properly. Hence, the Client manager automatically disconnects the clients from which it does not receive the expected messages. Moreover, the Client manager tracks the client errors: if a client reports too many consecutive errors (the maximum number of consecutive errors is a user-defined parameter), the Client manager forces its disconnection to avoid further problems.

The Job manager distributes the jobs to the clients and tracks them through the Client manager. If a client returns an error, the corresponding job is not re-submitted to the clients, while if a client is inadvertently disconnected, its uncompleted jobs are re-submitted to another client.

The HTTP server manages the client-server communication. It is embedded in the PowerNet plug-in and is protected by the IP filter and, optionally, by an encrypted tunnel. The IP filter implements a basic network protection system by which simple rules can be applied to grant or deny access to the server by specifying the IP addresses. The HTTP server is interfaced to the VEGA ZZ core and is connected with its database engine. In detail, the WarpEngine server includes an abstraction layer which translates the HTTP methods into SQL code to retrieve the input files required by each job.

The User Datagram Protocol (UDP) server receives the broadcast messages sent by the clients to recognize the WarpEngine servers in the local network. The UDP server generates a suitable answer according to the server status. In turn, the server can send broadcast messages in the local network to detect and recruit the clients automatically, while the clients outside the LAN have to be added manually.
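The error and re-submission policies described above for the Client manager and the Job manager can be illustrated by a minimal Python sketch. It is only a toy model written for this purpose: the class and method names (`JobTracker`, `report_error`, and so on) are hypothetical and do not belong to the WarpEngine API, and the interpretation that a client is disconnected once its consecutive errors exceed the user-defined maximum is an assumption.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ClientState:
    consecutive_errors: int = 0
    running: set = field(default_factory=set)  # job IDs currently assigned


class JobTracker:
    """Toy model of the job-tracking policies (hypothetical names)."""

    def __init__(self, job_ids, max_consecutive_errors=3):
        self.pending = deque(job_ids)
        self.completed = []
        self.failed = []
        self.clients = {}
        self.max_errors = max_consecutive_errors  # user-defined parameter

    def connect(self, client_id):
        self.clients[client_id] = ClientState()

    def next_job(self, client_id):
        """Hand the next pending job to a connected client, or None."""
        if client_id not in self.clients or not self.pending:
            return None
        job = self.pending.popleft()
        self.clients[client_id].running.add(job)
        return job

    def report_success(self, client_id, job):
        state = self.clients[client_id]
        state.running.discard(job)
        state.consecutive_errors = 0  # the error streak is broken
        self.completed.append(job)

    def report_error(self, client_id, job):
        """A job that returned an error is recorded and NOT re-submitted."""
        state = self.clients[client_id]
        state.running.discard(job)
        self.failed.append(job)
        state.consecutive_errors += 1
        if state.consecutive_errors > self.max_errors:
            self.disconnect(client_id)  # forced disconnection

    def disconnect(self, client_id):
        """Uncompleted jobs of a lost client return to the queue."""
        state = self.clients.pop(client_id, None)
        if state:
            self.pending.extend(sorted(state.running))
```

In this sketch a failed job ends in `failed` and never returns to `pending`, whereas the jobs still assigned to a disconnected client are queued again for the remaining clients. The periodic message exchange used to detect silent clients is omitted for brevity.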
The Project manager plays key roles, which can be schematized as follows (see Figure 1B):
1) Set-up of the calculation environment: the Project manager reads the project files written in XML format and validates the projects by checking whether the required data files and programs are available and compatible with the hardware architecture. Currently, WarpEngine supports C-scripts, and the object code is generated by the Script compiler, which is based on Tcc [16] and built into the PowerNet plug-in.
2) Initialization of the calculations: when the project check is completed, the server script is embedded into the WarpEngine main code as a Server module. The script, which includes the customizations required for a specific calculation, is structured in different sections (i.e., functions in the specific case of the C language), each called when an event occurs. The initialization phase also manages the optional graphical user interface.
3) Performing the calculations: based on the client actions, the Client manager and the Job manager work together by sending messages via the Event handler to the Server module and by calling the specific section of the code. In detail, the Server module distributes the input data to the clients, collects the results from the clients and frees the resources when the calculation is finished.
4) Stop of the calculations: when all jobs are completed, the Job manager signals this to the Event handler, which releases all allocated resources. The same action can be triggered by manually stopping all calculations.
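The project check performed during the set-up phase can be sketched as follows. This is a minimal Python illustration under stated assumptions: the XML layout, the `file` tag and its `role` attribute are hypothetical placeholders (the real tags are those listed in Table S2 and are not reproduced here), and the compatibility check with the hardware architecture is left out.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List

# Hypothetical project description: tag and attribute names are assumptions
# made for illustration only, not the actual WarpEngine project.xml format.
EXAMPLE_PROJECT = """
<project id="1" description="Docking example">
  <file role="client_script">client.c</file>
  <file role="program">docking_program</file>
  <file role="input">receptor.mol2</file>
</project>
"""


def missing_files(project_xml: str, project_dir: Path) -> List[str]:
    """Return the files listed in the project that are absent from project_dir."""
    root = ET.fromstring(project_xml.strip())
    names = [(element.text or "").strip() for element in root.iter("file")]
    return [name for name in names if not (project_dir / name).is_file()]


def is_valid(project_xml: str, project_dir: Path) -> bool:
    """A project passes this check only if every required file is available."""
    return not missing_files(project_xml, project_dir)
```

A project would be enabled only when `is_valid` returns `True`; otherwise the list returned by `missing_files` tells the user what has to be added to the project directory.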

2.2 WarpEngine platform: the client side
Figure 1C shows the key modules of the client side, which can be described as follows. The Project manager is the client counterpart of the server Project manager: as detailed below, it retrieves the files required by the calculation from the server through the HTTP client and supervises the Multithreaded worker, which processes the jobs. The Multithreaded worker performs the calculation as specified in the client script. The WarpEngine main code creates one instance for each core or thread and supports the Simultaneous Multi-Threading (SMT) technology to process parallel jobs. The results and the error messages (if the calculation fails) are sent to the server by the HTTP client. The HTTP client handles all the communications with the server required by the calculations. Furthermore, it sends the results through the standard HTTP GET and POST methods. As mentioned above for the server side, the UDP client sends messages on the local network to find active WarpEngine servers.

Figure 1D schematizes the main events handled by the client Project manager, which parallel those handled by the corresponding module on the server side, as summarized below.
1) Initialization of the client: the XML project file is downloaded from the server by the HTTP client as supervised by the server Project manager. The file thus obtained is then parsed and decoded. Besides the main parameters for the set-up, the project file also includes the list of the files required for the calculation, such as the client script, the program to run (e.g., the docking software) and the input files (e.g., the receptor and the ligand files when performing a docking calculation). All these files are then downloaded and their digital signature is verified. Finally, the client script is compiled and its initialization code is run.
2) Performing the calculation: the Client Project manager includes the additional code required by the Multithreaded worker to carry out the specific tasks needed to complete each job. In particular, the module includes the code to download and process the input data as well as to upload the results. Thus, the client performs the calculation (e.g., the docking simulation) and, when the job is completed, sends the results (e.g., scores and ligand poses) to the server through POST or PUT methods as implemented by the HTTP client. All calculations are always carried out as low-priority processes to avoid slowing down the interactive use of the involved computers.
3) End of the calculation: when the project is completed, the Job manager triggers the exit from the calculation section of the Client module, during which the client unregisters itself from the server, closes the connection and releases all resources by calling the specific code of the Client module.
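The behaviour of the Multithreaded worker (one instance per core or thread, with results and error messages reported separately) can be approximated by the following Python sketch. It is not the WarpEngine implementation: `run_job` is a hypothetical placeholder for a single calculation, the scores it returns are dummy values, and the network transfer and the low-priority scheduling are only indicated in the comments.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_job(ligand_id):
    """Hypothetical placeholder for one calculation (e.g. one docking run)."""
    if ligand_id.startswith("bad"):
        raise ValueError(f"cannot process {ligand_id}")
    return {"ligand": ligand_id, "score": -float(len(ligand_id))}  # dummy score


def process_jobs(job_ids, n_workers=None):
    """Run one worker per logical core and separate results from errors."""
    n_workers = n_workers or os.cpu_count() or 1
    results, errors = [], []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = {pool.submit(run_job, job): job for job in job_ids}
        for future in as_completed(futures):
            job = futures[future]
            try:
                # In a real client the result would be uploaded to the server.
                results.append(future.result())
            except Exception as exc:
                # A failed job yields an error message instead of a result.
                errors.append((job, str(exc)))
    return results, errors
```

In a real client each worker would typically launch the external executable as a separate low-priority process; here the threads only orchestrate placeholder jobs, which is enough to show how results and errors are kept apart.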



2.3 WarpEngine platform: the graphical interface
VEGA ZZ includes both the client and the server codes, thus allowing applications to be carried out locally by running two WarpEngine instances on the same node. Figure S1A shows the main server interface. Each running project is identified by its identification number and a short description. It can be regularly checked by monitoring the status of the calculation, the number of errors that occurred, the time required to end the calculation and the time elapsed so far. The Summary of projects box reports the number of available projects, the number of enabled/checked projects and the number of completed projects. As depicted in Figure S1B, the Clients tab shows the connected clients, their status and some statistics about their activity. For each client, it reports the identification code, the host name, the IP address, the ID of the working project (PID), the session ID (SID) used to identify different WarpEngine instances running on the same client, the number of working threads, the number of completed jobs, the calculation speed (jobs per minute), the number of recoverable errors that occurred, and the time elapsed since the first and the last client contacts. Collective data are reported at the bottom, such as the number of active clients, the number of completed jobs, the number of threads, the number of remaining jobs, and the average ratios of jobs per client and per thread. In the Performances tab, the calculation performance can be monitored by checking the average speed, the maximum average speed, the current speed (as completed jobs per minute) and the CPU load of the server (as a percentage). The servers in the local network are automatically detected. Alternatively, the client interface allows their manual inclusion by specifying the server name, the IP address and the communication port. After starting a calculation, both interfaces can be minimized to the system tray, while still allowing their major features to be controlled. Some relevant WarpEngine projects are pre-installed, including applications for docking simulations (based on the PLANTS program [17]), semi-empirical calculations (based on MOPAC [18]) and rescoring analyses (based on ReScore+ [19] and NAMD [20]). They can be used as such or adapted to develop different applications (see below).




2.4 Developing a WarpEngine application
Due to its flexible structure, WarpEngine can be adapted to specific calculations through dedicated server and client scripts, which can be written in the C-Script language to allow direct access to the WarpEngine APIs. The required WarpEngine files are stored in a Projects directory, which holds the data for each project in specific subdirectories (e.g., Project_1), while the corresponding Templates subdirectories contain the auxiliary files which can be used without changes for the different projects. In each project directory, the project.xml file describes the calculation by defining a set of XML tags as listed in Table S2 (Supporting Information). The WarpEngine installation includes an empty project useful for building a new calculation. It includes a pre-defined project.xml file and basic client.c and server.c scripts. Both scripts support language localization, include different functions called when a specific event occurs and extensively use the HyperDrive library. While a detailed description of the variables and the events which can be managed by these scripts is avoided here, the interested reader can find in the Supporting Information the lists of (a) global variables (Table S3); events managed by (b) the server script (Figure S2A and Table S4) and (c) the client script (Figure S2B and Table S5); (d) the implemented APIs (Table S6) and (e) macros (Table S7). Tables S8 and S9 include simple examples of the script code required for server and client. The interested reader can also refer to the user guide at http://nova.disfarm.unimi.it/manual/plugins/warpengine.htm.

2.5 Case study from DUD-E datasets
To document the potential of the WarpEngine platform and the specific performance of the included scripts, a VS campaign based on the Angiotensin-converting enzyme (ACE) dataset from the DUD-E collection was performed [21]. Docking simulations were carried out by utilizing a distributed system composed of 104 threads on 3 blade servers. As shown in Table 1, each of the three servers is equipped with two Intel Xeon E5 CPUs of different clock rates, providing either 16 or 20 cores with 32 or 40 threads, respectively. The system has a total of 192 GB of RAM and was connected through a slow 100 Mb/s Ethernet switch. Docking simulations involved the resolved 3bkl protein structure, which was prepared as described elsewhere [19], plus a dataset including 808 active compounds and 17144 decoys. In order to test all included scripts, the virtual screening simulations were performed using the following procedure: (1) all ligands were optimized by PM7 semi-empirical calculations using MOPAC, which also allows a precise calculation of the atomic charges [22]; (2) docking simulations were performed using PLANTS by scoring the computed poses with all three implemented scoring functions (i.e., ChemPLP, PLP and PLP95) and by testing all three possible values (i.e., S1, S2 and S4) of the parameter (Speed) by which the program varies the accuracy of the calculations: the lower the speed, the higher the accuracy [17]; (3) all generated complexes were rescored by ReScore+ with and without complex minimization [19]. In all performed simulations, the search was focused within 15 Å of the bound kAW inhibitor, thus encompassing the entire catalytic cavity.
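To give an idea of the size of the docking step (2) above, the combinations of scoring function and speed can be enumerated as in the following Python sketch. The dictionary keys are hypothetical labels rather than PLANTS keywords, and the total assumes that every combination is run on the whole dataset with a single set of atomic charges, which is an assumption made for illustration.

```python
from itertools import product

SCORING_FUNCTIONS = ["ChemPLP", "PLP", "PLP95"]  # as listed in the text
SPEEDS = ["S1", "S2", "S4"]                      # lower speed, higher accuracy

N_ACTIVES = 808
N_DECOYS = 17144


def docking_runs():
    """Enumerate every (scoring function, speed) combination."""
    return [{"scoring": s, "speed": v} for s, v in product(SCORING_FUNCTIONS, SPEEDS)]


def total_docking_jobs():
    """Jobs needed if each combination docks every ligand once (assumption)."""
    return len(docking_runs()) * (N_ACTIVES + N_DECOYS)
```

Under these assumptions the protocol amounts to 9 docking set-ups and 161568 single-ligand jobs, which illustrates why such a campaign is a natural candidate for distribution.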

3. Results
3.1 Preliminary tests
First, WarpEngine was tested for its network performance using the hardware configuration described above. The first test was based on the Apache Bench program [23], which tested the WarpEngine network features by executing 100 HTTP GET requests and processing 5 requests concurrently. The system was able to serve 3205.13 pages·s⁻¹, namely 1.56 ms per request. For comparison and under the same conditions, Microsoft Internet Information Services (IIS) 6.0 [24] required 4.69 ms per request, i.e., its speed is about one third of that of WarpEngine. The second, more specific test was designed to measure the efficiency in delivering the jobs among the connected PCs. In this test, WarpEngine was able to deliver 1327.53 jobs·s⁻¹, namely 7.53 ms per delivered job. The last test was focused on the performance of the database engine by monitoring its efficiency in delivering the molecules to the clients. The test involved extraction (by SQL query), decompression and delivery of molecules from a database to the clients, and WarpEngine proved able to deliver 685.25 molecules·s⁻¹, namely 14.59 ms per molecule. Altogether, these preliminary tests emphasized that the WarpEngine network features should be efficient enough to allow a satisfactory management of the distributed calculations, even for very extended distributed systems and/or when the calculations require the exchange of large amounts of data.
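The relation between the throughput and the time per request reported above can be checked with a short Python calculation. It follows the Apache Bench convention in which the mean time per request is the concurrency divided by the throughput. The concurrency of 5 is stated for the first test only; the value of 10 used for the job and molecule delivery tests is an inference that happens to reproduce the reported figures, not a setting given in the text.

```python
def time_per_request_ms(throughput_per_s, concurrency=1):
    """Mean time per request seen by each concurrent caller, in milliseconds."""
    return 1000.0 * concurrency / throughput_per_s


# HTTP test: concurrency of 5 as stated in the text.
http_ms = time_per_request_ms(3205.13, concurrency=5)

# Job and molecule delivery: concurrency of 10 is an ASSUMPTION (see lead-in).
job_ms = time_per_request_ms(1327.53, concurrency=10)
molecule_ms = time_per_request_ms(685.25, concurrency=10)

# Ratio between the WarpEngine and IIS times per request (4.69 ms for IIS).
speed_ratio = http_ms / 4.69

print(f"{http_ms:.2f} ms, {job_ms:.2f} ms, {molecule_ms:.2f} ms, ratio {speed_ratio:.2f}")
```

With these inputs the three times round to 1.56, 7.53 and 14.59 ms, and the ratio of about 0.33 corresponds to the "about one third" comparison with IIS.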

3.2 Docking calculations: results and benchmarking Table S10 collects the Top 1% enrichment factors (EF) obtained for the performed VS campaigns using the ACE dataset from DUD-E and allows for some meaningful considerations. The first observation concerns the influence of post-docking minimization on the resulting VS performances. The comparison involved the simulations based on the ChemPLP score and on the empirical atomic charges and revealed that non-minimized complexes perform markedly better regardless of the calculation accuracy. This result, which reminds that already obtained when considering the BChE substrates [19], emphasizes that these scoring functions are optimized to evaluate the stability of the complexes generated by PLANTS and can worsen their reliability when they are refined by different approaches or force fields. Accordingly, the remaining simulations were focused on non-minimized complexes only. Again, ChemPLP is always the best performing score although the rescoring calculations involved a large set of scoring functions. This result is in line with previously reported simulations which revealed that PLANTS is the program affording the best results for the ACE dataset of the DUD-E collection. [25] As expected, the obtained performances parallel the calculation accuracy as defined by the Speed parameter: the worsening is modest when shifting from S1 to S2 to become more relevant when selecting the S4 speed. The role of the charge calculation appears to be marginal with the S1 and S2 speeds, while ligands with MOPAC charges performs better when selecting the least accurate S4 speed. In other words, the obtained results suggest that using more precise atomic charges can conveniently counterbalance the approximations introduced by less accurate docking simulations. This finding might be relevant 13 ACS Paragon Plus Environment


when screening huge ligand databases, since it permits a marked reduction of the overall computational costs without worsening the reliability of the obtained results. The EF averages for the specific scores as obtained by the docking simulations before and after rescoring afford less expected results since ChemPLP is the best performing score when considering docking results before rescoring, while the subsequent rescoring generates the following rank PLP95 > ChemPLP > PLP which is retained regardless of the atomic charges. This finding reveals that the ChemPLP performances are more sensitive to the calculation accuracy than the other scoring functions and suggests that PLP95 is the best performing score in the initial search phase. By considering that the ChemPLP calculation is more time demanding than PLP95, these results suggest that combining PLP95 for pose generation with ChemPLP for rescoring can be a fruitful procedure to speed up the screening of huge databases. Although the here performed docking simulations had the almost exclusive objective to test the WarpEngine performances in terms of achieved calculation speed (see below), the obtained results suggest that increasing the accuracy of the atomic charges or combining different scoring functions might represent useful expedients to reduce the computational time. To derive general rules, these suggestions should be substantiated by more extended validations, in which also the reliability of the obtained poses is taken into careful consideration. Table 1 compiles the performances provided by the three utilized workstations when used alone and variously combined. This benchmark analysis involved only the most demanding docking procedure (i.e. using ChemPLP with speed = 1) and was limited to the 806 active compounds. The performances of the utilized systems are evaluated by considering the scalability based on the core number as well as the one day scale up values. 
Table 1 also lists the time required to finish the calculations as well as the resulting one day throughputs. The comparison between the performances obtained by using PLANTS in a standard sequential way, i.e. without WarpEngine, and those reached by using WarpEngine reveals a satisfactory linearity in the monitored performances since the speed up values are roughly equal to the number of involved cores. More 14 ACS Paragon Plus Environment

Page 14 of 24

Page 15 of 24 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60


importantly, such a remarkable result is reached both when using the three workstations alone and when combining two or three workstations as exemplified by the more extended configuration including all workstations. To better illustrate the performances of the tested configurations, Table 1 includes the scalability as computed per workstation and reveals that the portion of the simulation which benefits from the distributed platform is greater than 90% even when combining all available CPUs. Again, the benchmarking tests also included simulations based on the WarpEngine platform but involving only one thread (similarly to the sequential way). By comparing the performances obtained using only one thread with and without WarpEngine, one may reveal the computational cost of the WarpEngine platform. This cost can be attributed to two major factors: on one hand, the time spent by distributing the jobs and sharing the I/O files on the net; on the other hand, the time required to initialize the calculations. Indeed and when using PLANTS in a standard sequential way, a single calculation run (with a single initialization process) screens the entire ligand dataset. In contrast and when using WarpEngine, each ligand is simulated by a distinct calculation thus implying that the initialization costs can markedly increase. Nevertheless, Table 1 shows that the loss when using WarpEngine with only one core is equal to about 5%, a computational cost which can be counterbalanced even when using WarpEngine in the simplest possible system, namely when docking two ligands using two CPUs. Notice that such a computational cost is very similar to that seen when considering the scalability per workstation thus suggesting that the simulation part which does not benefit from the distributed computing can correspond to the initialization processes. 
This implies that the overall WarpEngine performance is markedly influenced by the start-up processes required by the distributed programs, and this aspect should be carefully considered when planning distributed calculations. Taken globally, these analyses emphasize the potential of WarpEngine, especially considering that the reported benchmarking studies were performed on a 100 Mbit/s Ethernet network. To give a concrete example of the performance reached, Table 1 shows that the combination of all available workstations allows docking simulations for more than 35000 molecules to be carried out in one day, with an average time per molecule of about 2.4 seconds.
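The figures quoted above can be checked directly against Table 1. The following is a minimal sketch, not part of WarpEngine: the function and variable names are illustrative, and the only inputs are the completion times and the throughput reported in Table 1. It computes the single-thread cost of WarpEngine per workstation and the average time per molecule for the configuration including all workstations.

```python
SECONDS_PER_DAY = 86_400

# Completion times (s) from Table 1 for each workstation:
# (sequential PLANTS with one thread, one thread driven by WarpEngine)
single_thread = {
    1: (115200, 122816),
    2: (86640, 91586),
    3: (105660, 111655),
}


def overhead_percent(t_plain, t_we):
    """Extra wall-clock time (in %) spent when the run goes through WarpEngine."""
    return 100.0 * (t_we - t_plain) / t_plain


def speed_up(t_plain, t_we):
    """Speed-up relative to the sequential run (below 1 means a loss)."""
    return t_plain / t_we


overheads = {ws: overhead_percent(*times) for ws, times in single_thread.items()}
speedups = {ws: speed_up(*times) for ws, times in single_thread.items()}

# Throughput (molecules/day) of the 1+2+3 configuration, from Table 1
throughput_all = 35654.3
seconds_per_molecule = SECONDS_PER_DAY / throughput_all

for ws in sorted(single_thread):
    print(f"workstation {ws}: overhead {overheads[ws]:.1f}%, "
          f"speed-up {speedups[ws]:.2f}")
print(f"all workstations: {seconds_per_molecule:.2f} s per molecule")
```

Rounded to two decimals, the computed speed-ups match the 0.94 to 0.95 values listed in Table 1 for the "One thread +WE" rows.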

Conclusions

One of the most challenging problems in the big data scenario is that data grow faster than computational power, which calls for new computational strategies to analyze them conveniently. In this context, distributed calculations can offer key advantages since they require neither dedicated hardware architectures nor a substantial rewriting of existing codes; instead, they take advantage of existing codes and of the available (often underused) hardware resources. Moreover, distributed computing allows a more flexible management of failures since a single node can be easily excluded without hampering the entire calculation. Despite these remarkable advantages, few customizable platforms for distributed computing have been proposed so far. This lack is particularly evident in the cheminformatics field, even though its typical calculations are well suited to being distributed since they are composed of a huge number of relatively simple and independent simulations (as seen in docking analyses). On these grounds, the study describes the WarpEngine platform, which is embedded in the VEGA ZZ suite of programs and allows distributed calculations in both LAN and WAN systems. WarpEngine is highly customizable by exploiting the scripting architecture and all the cheminformatics features implemented in the VEGA ZZ program. To document the potential of WarpEngine, the study reports a set of exemplificative VS campaigns based on a DUD-E dataset. The results emphasize the notable performance of WarpEngine, which shows satisfactory scalability in the benchmarking tests: more than 90% per workstation even when all three workstations are combined.
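The point about failure management can be illustrated with a small, generic sketch. It is not the WarpEngine protocol or API: `dispatch`, `flaky_run`, the worker names, and the ligand names are all hypothetical, and a failure is simulated by raising `ConnectionError`. Because each docking job is independent, a job assigned to a failed node is simply requeued and the node is excluded, while the remaining nodes complete the campaign.

```python
from collections import deque


def dispatch(jobs, workers, run_job):
    """Run independent jobs round-robin; exclude any worker that fails.

    A job whose worker fails is put back at the front of the queue, so
    the whole calculation completes as long as one worker survives.
    """
    pending = deque(jobs)
    active = deque(workers)
    results = {}
    while pending:
        if not active:
            raise RuntimeError("all workers failed; jobs left unprocessed")
        job = pending.popleft()
        worker = active.popleft()
        try:
            results[job] = run_job(worker, job)
        except ConnectionError:
            pending.appendleft(job)  # requeue the job, drop the worker
            continue
        active.append(worker)  # a healthy worker rejoins the rotation
    return results


def flaky_run(worker, job):
    """Hypothetical job runner in which the node 'ws2' is unreachable."""
    if worker == "ws2":
        raise ConnectionError(f"{worker} is not responding")
    return f"{job} docked on {worker}"


results = dispatch(
    ["lig1", "lig2", "lig3", "lig4", "lig5"],
    ["ws1", "ws2", "ws3"],
    flaky_run,
)
for job, outcome in results.items():
    print(outcome)
```

In this toy run all five ligands are processed even though one of the three nodes never answers, which is the behavior the text refers to when it says that a single node can be excluded without hampering the entire calculation.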




This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.




Table 1: Main results of the benchmarking tests reported here, comparing the performance with and without WarpEngine and using the three available workstations either alone or in various combinations.

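PLANTS ChemPLP Speed 1 (+WE: with WarpEngine)

| System | CPU type (Intel Xeon) | Clock (GHz) | CPUs / Cores | Completion time (s) | Throughput (mol./day) | Speed-up | Workstation scalability |
|--------|-----------------------|-------------|----------------|---------------------|-----------------------|----------|-------------------------|
| 1      | E5-2640 v2            | 2.00        | One thread     | 115200              | 606.0                 | 1.00     | ---                     |
| 2      | E5-2630 v3            | 2.40        | One thread     | 86640               | 805.8                 | 1.00     | ---                     |
| 3      | E5-2650 v3            | 2.30        | One thread     | 105660              | 660.7                 | 1.00     | ---                     |
| 1      | E5-2640 v2            | 2.00        | One thread +WE | 122816              | 568.4                 | 0.94     | ---                     |
| 2      | E5-2630 v3            | 2.40        | One thread +WE | 91586               | 762.2                 | 0.95     | ---                     |
| 3      | E5-2650 v3            | 2.30        | One thread +WE | 111655              | 625.2                 | 0.95     | ---                     |
| 1      | E5-2640 v2            | 2.00        | 2 / 16         | 6680                | 10450.8               | 17.25    | 100.00                  |
| 2      | E5-2630 v3            | 2.40        | 2 / 16         | 5521                | 12644.7               | 15.69    | 100.00                  |
| 3      | E5-2650 v3            | 2.30        | 2 / 20         | 4506                | 15492.9               | 23.45    | 100.00                  |
| 1+2    | -                     | -           | 4 / 32         | 3130                | 22303.9               | 31.59    | 96.57                   |
| 1+3    | -                     | -           | 4 / 36         | 2756                | 25330.6               | 39.80    | 97.64                   |
| 2+3    | -                     | -           | 4 / 36         | 2538                | 27506.4               | 37.93    | 97.76                   |
| 1+2+3  | -                     | -           | 6 / 52         | 1958                | 35654.3               | 51.78    | 92.40                   |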
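The "workstation scalability" column can be reproduced from the throughput column. The sketch below rests on an assumption inferred from the tabulated numbers rather than on a definition given in the text: the scalability of a combined system is taken as its measured throughput divided by the sum of the throughputs of its member workstations when each runs alone on all of its cores. The function name and the data layout are illustrative.

```python
# Throughput (molecules/day) from Table 1, runs using all cores
throughput = {
    "1": 10450.8,
    "2": 12644.7,
    "3": 15492.9,
    "1+2": 22303.9,
    "1+3": 25330.6,
    "2+3": 27506.4,
    "1+2+3": 35654.3,
}


def workstation_scalability(system):
    """Measured throughput as a percentage of the ideal additive throughput.

    Assumed definition: the ideal value is the sum of the throughputs
    that the member workstations reach when used alone.
    """
    members = system.split("+")
    ideal = sum(throughput[member] for member in members)
    return 100.0 * throughput[system] / ideal


for system in ("1+2", "1+3", "2+3", "1+2+3"):
    print(f"{system}: {workstation_scalability(system):.2f}%")
```

Under this assumption the computed values agree with the tabulated column to two decimals, and the loss of about 7.6% for the 1+2+3 configuration is the quantity the text compares with the single-thread cost of WarpEngine.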




Figure 1: Main logical units of the server (1A) and the client (1C) personalities (yellow boxes: network units; blue boxes: management units; green boxes: VEGA features). Main events handled by project managers at the server (1B) and client (1D) sides.



ASSOCIATED CONTENT

Supporting Information. Table S1 describes the HyperDrive features; Tables S2-S9 report commands, events, variables, APIs, macros and examples to develop novel WarpEngine applications; Table S10 includes the results of the reported VS campaigns; Figure S1 shows the main windows of the WarpEngine GUI; and Figure S2 details the events managed by the client and server scripts (PDF).

AUTHOR INFORMATION

Corresponding Author
Alessandro Pedretti: [email protected]; Phone: +390250319332; Fax: +390250319359

Author Contributions
The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

ABBREVIATIONS

ACE: Angiotensin-converting enzyme; EF: enrichment factor; HTTP: HyperText Transfer Protocol; LAN: local area network; MD: molecular dynamics; UDP: User Datagram Protocol; VS: virtual screening; WAN: wide area network.




References

1. Bellazzi, R. Big Data and Biomedical Informatics: A Challenging Opportunity. Yearb Med. Inform. 2014, 9, 8-13.

2. Basak, S. C.; Vracko, M.; Bhattacharjee, A. K. Big Data and New Drug Discovery: Tackling "Big Data" for Virtual Screening of Large Compound Databases. Curr. Comput. Aided Drug. Des. 2015, 11, 197-201.

3. Fradera, X.; Babaoglu, K. Overview of Methods and Strategies for Conducting Virtual Small Molecule Screening. Curr Protoc Chem Biol. 2017, 9, 196-212.

4. Shin, W.H.; Christoffer, C.W.; Kihara, D. In silico structure-based approaches to discover protein-protein interaction-targeting drugs. Methods. 2017, 131, 22-32.

5. Feinstein, W.; Brylinski, M. Structure-Based Drug Discovery Accelerated by Many-Core Devices. Curr. Drug Targets. 2016, 17, 1595-1609.

6. Bischof, C. In Advances in Parallel Computing - Parallel Computing: Architectures, Algorithms, and Applications; IOS Press: Amsterdam, 2008; Vol. 15.

7. Foster, I. In Designing and Building Parallel Programs at http://www.mcs.anl.gov/~itf/dbpp/ (accessed May 7, 2018).

8. Kshemkalyani, A. D.; Singhal, M. In Distributed Computing: Principles, Algorithms, and Systems; Cambridge University Press: Cambridge, 2011.

9. Anderson, D.P. Boinc: A System for Public-resource Computing and Storage. In Grid Computing, 2004 Proceedings Fifth IEEE/ACM International Workshop on Grid Computing; IEEE: Piscataway, 2004, pp 4-10.

10. Krüger, J.; Grunzke, R.; Gesing, S.; Breuers, S.; Brinkmann, A.; de la Garza, L.; Kohlbacher, O.; Kruse, M.; Nagel, W.E.; Packschies, .; Müller-Pfefferkorn, R.; Schäfer, P.; Schärfe, C.; Steinke, T.; Schlemmer, T.; Warzecha, K.D.; Zink, A.; Herres-Pawlis, S. The MoSGrid Science Gateway - A Complete Solution for Molecular Simulations. J Chem Theory Comput. 2014, 10, 2232-45.

11. García-Jacas, C.R.; Marrero-Ponce, Y.; Acevedo-Martínez, L.; Barigye, S.J.; Valdés-Martiní, J.R.; Contreras-Torres, E. QuBiLS-MIDAS: a parallel free-software for molecular descriptors computation based on multilinear algebraic maps. J Comput Chem. 2014, 35, 1395-409.

12. Pronk, S.; Pouya, I.; Lundborg, M.; Rotskoff, G.; Wesén, B.; Kasson, P.M.; Lindahl, E. Molecular simulation workflows as parallel algorithms: the execution engine of Copernicus, a distributed high-performance computing platform. J Chem Theory Comput. 2015, 11, 2600-8.

13. Karthikeyan, M.; Pandit, D.; Vyas, R. ChemScreener: A Distributed Computing Tool for Scaffold based Virtual Screening. Comb Chem High Throughput Screen. 2015, 18, 544-61.

14. Pedretti, A.; Villa, L.; Vistoli, G. VEGA: A Versatile Program to Convert, Handle and Visualize Molecular Structure on Windows-based PCs. J. Mol. Graph. Model. 2002, 21, 47-49.

15. Pedretti, A.; Villa, L.; Vistoli, G. VEGA--An Open Platform to Develop Chemo-bio-informatics Applications, Using Plug-in Architecture and Script Programming. J. Comput. Aided Mol. Des. 2004, 18, 167-173.

16. TCC: Tiny C Compiler, https://bellard.org/tcc/ (accessed May 7, 2018).

17. Korb, O.; Stützle, T.; Exner, T. E. Empirical Scoring Functions for Advanced Protein-ligand Docking with PLANTS. J. Chem. Inf. Model. 2009, 49, 84-96.

18. MOPAC2016, J. J. P. Stewart, Stewart Computational Chemistry, Colorado Springs, CO, USA.

19. Vistoli, G.; Mazzolari, A.; Testa, B.; Pedretti, A. Binding Space Concept: A New Approach To Enhance the Reliability of Docking Scores and Its Application to Predicting Butyrylcholinesterase Hydrolytic Activity. J. Chem. Inf. Model. 2017, 57, 1691-1702.

20. Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kalé, L.; Schulten, K. Scalable Molecular Dynamics with NAMD. J. Comput. Chem. 2005, 26, 1781-1802.

21. Mysinger, M. M.; Carchia, M.; Irwin, J. J.; Shoichet, B. K. Directory of Useful Decoys, Enhanced (DUD-E): Better Ligands and Decoys for Better Benchmarking. J. Med. Chem. 2012, 55, 6582-6594.

22. Stewart, J. J. Optimization of Parameters for Semiempirical Methods VI: More Modifications to the NDDO Approximations and Re-optimization of Parameters. J. Mol. Model. 2013, 19, 1-32.

23. Ab: Apache HTTP server benchmarking tool, http://httpd.apache.org/docs/2.2/en/programs/ab.html (accessed May 7, 2018).

24. Internet Information Services (IIS) 6.0 Resource Kit Tools, https://www.microsoft.com/en-us/download/details.aspx?id=17275 (accessed May 7, 2018).

25. Ericksen, S. S.; Wu, H.; Zhang, H.; Michael, L. A.; Newton, M. A.; Hoffmann, F. M.; Wildman, S. A. Machine Learning Consensus Scoring Improves Performance Across Targets in Structure-Based Virtual Screening. J. Chem. Inf. Model. 2017, 57, 1579-1590.







