
Application Note

Descriptions and implementations of DL_F Notation: A natural chemical expression system of atom types for molecular simulations
Chin W. Yong
J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.6b00323 • Publication Date (Web): 25 Jul 2016
Downloaded from http://pubs.acs.org on July 29, 2016




Journal of Chemical Information and Modeling

Descriptions and Implementations of DL_F Notation: A Natural Chemical Expression System of Atom Types for Molecular Simulations

Chin W. Yong1,2

1 Scientific Computing Department, Science and Technology Facilities Council, Daresbury Laboratory, Sci-Tech Daresbury, Warrington WA4 4AD, UK

2 Manchester Pharmacy School, Faculty of Medical and Human Sciences, Manchester Academic Health Science Centre, the University of Manchester, Manchester M13 9NT, UK. Email: [email protected]

ABSTRACT: DL_F Notation is an easy-to-understand, standardised atom typesetting expression for molecular simulations that covers a range of organic force field (FF) schemes such as OPLSAA, PCFF and CVFF. It is implemented within DL_FIELD, a software program that facilitates the setting up of molecular FF models for the DL_POLY molecular dynamics simulation software. By making use of the Notation, a single core conversion module (the DL_F conversion Engine) implemented within DL_FIELD can analyse a molecular structure and determine the types of atoms for a given FF scheme. Users only need to provide the molecular input structure in a simple xyz format, and DL_FIELD produces the necessary force field file for DL_POLY automatically. Commensurate with the development concept of DL_FIELD, which places emphasis on robustness and user friendliness, the Engine provides a single-step solution for setting up complex FF models. This allows users to switch seamlessly from one of the above-mentioned FF schemes to another, while providing consistent atom typing that is expressed in a natural chemical sense.

INTRODUCTION

Molecular dynamics (MD) is a powerful molecular simulation technique for studying the interactions of many-body systems at an atomistic scale.1 However, the outcome and quality of MD simulations depend very much upon the choice of parameters defined for the set of mathematical functions that describe the chemical and physical behaviour of a molecular system. In molecular simulations, this set of functions, which describes the energy of the system, is called the force field (FF) of a system model.1 The derivation of these parameters is largely interpolative, based either on experimental data or on ab initio quantum mechanical calculations. There is no general consensus or standard procedure for how these parameters should be derived. For this reason, there are numerous FF schemes, each with its own philosophy and specific parametric fitting procedures for setting up FF libraries, which contain parameters that bear little or no correlation

ACS Paragon Plus Environment


among the FF schemes. Examples of popular schemes are CHARMM,2 AMBER3 and OPLSAA,4 which are designed specifically to model certain classes of molecules such as proteins, lipids, carbohydrates and some organic molecules in condensed phases. More generalised FF schemes are also available; DREIDING,5 UFF,6 CVFF7 and PCFF8 are just a few examples. These FF schemes are designed to be generic and can be used to model a wide range of different molecules. Depending on the type of molecules of interest, the quality of one FF scheme can differ from that of another, and there is often a need to switch from one FF model to another in order to investigate a molecular system more reliably.

Migration or conversion of a structure configuration to a different FF model is often a non-trivial task. This is because each FF scheme contains a unique set of force field parameters that are described in different formats and often with different functional forms. However, the single most difficult aspect of converting and setting up a FF model is the assignment of the types of atoms, or atom typing, which ensures that the correct sets of potential parameters are used to describe the molecular system. Each software package has adopted an independent approach to atom typing. Some rely on internal files that contain sets of rules for atom type determination, or on a modified standard molecular input notation such as SMARTS9 to indicate the chemical position of an atom within a molecule, as implemented within the IMPACT program.10 Others, such as ATEN,11 use a natural typing language to indicate precisely the location of an atom, and hence its corresponding type, within a molecular system. However, these methods often result in atom types that are referenced as strings of text which are not immediately apparent to human cognition and which can vary significantly from one FF scheme to another. The situation is further complicated by the fact that the actual labels (the atom keys) used to indicate the atom types are often expressed in cryptic terms, with little or no consensus on the format of the description.

To this end, DL_FIELD12 has been developed to provide a one-step conversion from a user’s configuration to a chosen FF model that can readily run in the DL_POLY13 MD simulation package, without intervention from the user or the support of additional scripting. More recently, DL_FIELD version 3.5 implements the standard DL_F Notation, a system of expression that provides consistent atom typing notation for molecular models. This ensures smooth data transition among various FF schemes and removes the ambiguity of atom type assignment. The Notation describes the actual chemical identity of an atom within a molecule in a way that can be easily interpreted by both computational modellers and experimentalists. Owing to the universal nature of the Notation, the use of a single core conversion methodology, namely the DL_F conversion Engine, becomes feasible for atom type determination. The DL_F conversion Engine implemented in DL_FIELD only requires configuration input files in


the simple xyz format to produce complex force field models for a number of different FF schemes. In this paper, DL_F Notation and its implementation in the DL_F conversion Engine are described with reference to the OPLSAA, CVFF and PCFF schemes that are implemented within DL_FIELD 3.5.

CONSISTENT ATOM TYPESETTING

Depending on the nature of its bond connections, an atom of a given element within a molecule will exhibit different chemical behaviour. Usually, for a given FF scheme, symbolic terms called atom keys are used to distinguish one atom type from another for a given element. The unique combination of these keys within a molecule ensures the correct mapping of the potential parameters that are assigned to the molecular model. This is where the art of atom typing comes into play: the methodology that ensures the correct identification of the atom types and hence the corresponding atom keys.

The aim of the DL_F Notation is to provide a universal scheme with consistent atom typing that is common to all FF schemes implemented within DL_FIELD. To achieve this, for every type of atom defined for a given FF scheme, DL_FIELD keeps two records in the FF library files: the human-readable atom type label (called the ATOM_TYPE within the DL_FIELD context14) and the corresponding atom key (called the ATOM_KEY within the DL_FIELD context14), which in turn is used to map the appropriate FF parameters. This is in contrast to most other simulation packages, which keep track of only the atom keys in the library files. In DL_FIELD, all ATOM_TYPEs are unique, whereas ATOM_KEYs are not. In other words, different ATOM_TYPEs can refer to the same ATOM_KEY. To implement the DL_F Notation, all ATOM_TYPEs are expressed in human-readable forms that bear the actual chemical identity of the atoms (see the following section), and all FF schemes share the same ATOM_TYPEs expressed according to the DL_F Notation. However, the ATOM_KEYs are expressed in the same original forms as those in the respective FF schemes. In this way, only one core conversion engine is required to determine the ATOM_TYPEs, which are then referenced to the appropriate ATOM_KEYs. The determination of ATOM_TYPEs is therefore the same for different FF schemes, while the corresponding ATOM_KEYs, which retain the original information of a particular FF scheme, enable DL_FIELD to assign the appropriate sets of FF parameters when setting up DL_POLY’s FF file (the FIELD file). Note that users also have the option to produce FIELD files that express atom keys in the DL_F Notation (the DL_F ATOM_KEYs) rather than in the original FF format. There is no need to keep a separate reference list for the DL_F ATOM_KEYs, as these can be generated in situ within the DL_FIELD program if needed.

DL_F NOTATION FEATURES


In this section, the definitions and naming conventions of the DL_F Notation are described. DL_F Notation avoids the use of cryptic terms whenever possible. This enables users to recognise the chemical identity of an atom within a molecule easily, without referring to a manual.

(1) DL_F Atoms – A collective term for the elements to which the DL_F Notation rules apply. These include:
(i) Elements commonly encountered in organic molecules: hydrogen (H), carbon (C), oxygen (O), nitrogen (N), sulphur (S), phosphorus (P) and boron (B).
(ii) Halogens such as fluorine (F), chlorine (Cl), bromine (Br) and iodine (I).
(iii) Inert gases such as helium (He), neon (Ne), argon (Ar), krypton (Kr), xenon (Xe) and radon (Rn).
(iv) Alkali metals such as lithium (Li), sodium (Na) and potassium (K).
(2) Chemical Group (CG) – The name refers to a specific group of DL_F Atoms and bonds within a molecular structure that gives the molecule a characteristic chemical behaviour. It is broadly similar to the functional group in chemistry.
(3) Chemical Group Index (CGI) – A unique index number, ranging from 1 to 999 inclusive, that is assigned to a CG.
(4) Primary DL_F Token – A group of optional supporting tokens that can be expressed along with a DL_F Atom. It is used to locate a DL_F Atom, or to distinguish one DL_F Atom from another, within a CG. The available tokens are:
(i) p, s, t, q – primary, secondary, tertiary and quaternary DL_F Atom, that is, a DL_F Atom attached to one, two, three or four carbon atoms, respectively.
(ii) E – terminal or end position within a CG.
(iii) L – link, or non-terminal (inner) DL_F Atom within a CG.
(iv) ‘+’ – a positively charged DL_F Atom.
(v) ‘-’ – a negatively charged DL_F Atom.
(vi) 1, 2, 3, etc. – numerical values. These tokens usually apply to aromatic cyclic compounds.
(5) Secondary DL_F Token – A group of optional supporting tokens that can be expressed along with a DL_F Atom to indicate the type of atom to which it is connected. The available tokens are:
(i) A – alpha atom, or the first DL_F Atom that connects directly to another DL_F Atom within a CG.
(ii) B – beta atom, or the second DL_F Atom that connects directly to another DL_F Atom within a CG.
(iii) U – next to an unsaturated carbon atom.
(iv) R – next to an aryl carbon atom.
(v) The element symbol of the neighbouring DL_F Atom itself.

The rules for using the DL_F Notation are as follows.
(1) Every DL_F Atom in a system model has an ATOM_TYPE that carries the tag of the CG to which it belongs. This means that the corresponding DL_F ATOM_KEY carries the corresponding CGI.


(2) A DL_F Atom can be associated with at most one token from each DL_F Token group. An atom can therefore carry a single token, from either the Primary or the Secondary DL_F Token group, or two tokens provided that one comes from each group.
(3) Single-valence DL_F Atoms such as hydrogen and the halogens must always be expressed along with a Secondary DL_F Token. This is usually the element symbol of a neighbouring DL_F Atom.

DL_F NOTATION EXPRESSIONS

The ATOM_TYPE of a DL_F Atom is always expressed in the following format:

A[tp][ts]_CG

where A is the element symbol of the DL_F Atom, tp and ts in square brackets are the optional Primary and Secondary DL_F Tokens, respectively, and CG is the Chemical Group to which A belongs. The corresponding ATOM_KEY for the DL_F Atom is always expressed as follows:

ACGI[tp][ts]

that is, the element symbol A followed directly by the CGI of the Chemical Group and then the optional tokens. For example, a tertiary alkyl carbon atom has the ATOM_TYPE Ct_alkane and the corresponding ATOM_KEY C1t. Here, C is the DL_F Atom, ‘alkane’ is the CG and the value ‘1’ is the unique CGI for ‘alkane’. The t (tertiary) is one of the Primary DL_F Tokens. On the other hand, an alkyl hydrogen atom has the ATOM_TYPE HC_alkane and the corresponding ATOM_KEY H1C. In this case, C is a Secondary DL_F Token, namely the neighbouring DL_F Atom to which the hydrogen atom is connected.

Note that most of the CGs are based on actual chemical functional groups. However, certain molecules behave anomalously compared with other molecules that contain the same functional group, and specific sets of potential parameters are usually derived for such molecules. For instance, ‘cyclopropyl’ and ‘cyclobutyl’ are two CGs that are distinguished from the larger cycloalkane molecules.

Fig. 1 shows the ball and stick representations of ethanol molecules, labelled with the original atom keys for three different FF schemes: OPLSAA, CVFF and PCFF. Note that not only are these keys expressed in cryptic terms, but quite often an identical atom key is shared among various types of atoms. For instance, in CVFF, ethanol contains the atom keys c3 and c2, referring to the methyl (–[C]H3) and methylene (–[C]H2–) carbon atoms respectively, whereas OPLSAA does not make such a distinction. However, by using the DL_F Notation, it is possible to produce a universal set of human-readable ATOM_TYPEs (Fig. 1, on the right) for the molecule, and the corresponding ATOM_KEYs are produced in situ by the DL_FIELD program. The value 15 is the CGI for the ‘alcohol’ CG.
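The expression format and the token rules stated above can be illustrated with a minimal Python sketch. This is not DL_FIELD code: the function names `atom_type` and `atom_key` and the helper `_validate` are hypothetical, and only the token sets, the two formats and the worked examples (Ct_alkane, C1t, HC_alkane, H1C) are taken from the text.

```python
# Token sets transcribed from the Notation definitions in the text.
DL_F_ATOMS = {
    "H", "C", "O", "N", "S", "P", "B",    # common organic elements
    "F", "Cl", "Br", "I",                 # halogens
    "He", "Ne", "Ar", "Kr", "Xe", "Rn",   # inert gases
    "Li", "Na", "K",                      # alkali metals
}
PRIMARY_TOKENS = {"p", "s", "t", "q", "E", "L", "+", "-"}
SECONDARY_TOKENS = {"A", "B", "U", "R"}
SINGLE_VALENCE = {"H", "F", "Cl", "Br", "I"}  # hydrogen and halogens


def _validate(element, tp, ts):
    """Hypothetical helper: apply the token rules, raise ValueError otherwise."""
    if element not in DL_F_ATOMS:
        raise ValueError(f"{element!r} is not a DL_F Atom")
    # Primary token: one of the listed symbols or a numerical value.
    if tp and tp not in PRIMARY_TOKENS and not tp.isdigit():
        raise ValueError(f"unknown Primary DL_F Token {tp!r}")
    # Secondary token: A, B, U, R or the element symbol of a neighbour.
    if ts and ts not in SECONDARY_TOKENS and ts not in DL_F_ATOMS:
        raise ValueError(f"unknown Secondary DL_F Token {ts!r}")
    # Rule (3): single-valence atoms always carry a Secondary token.
    if element in SINGLE_VALENCE and not ts:
        raise ValueError(f"{element} requires a Secondary DL_F Token")


def atom_type(element, cg, tp="", ts=""):
    """Build an ATOM_TYPE in the form A[tp][ts]_CG."""
    _validate(element, tp, ts)
    return f"{element}{tp}{ts}_{cg}"


def atom_key(element, cgi, tp="", ts=""):
    """Build a DL_F ATOM_KEY in the form ACGI[tp][ts]."""
    _validate(element, tp, ts)
    if not 1 <= cgi <= 999:
        raise ValueError("CGI must lie between 1 and 999 inclusive")
    return f"{element}{cgi}{tp}{ts}"


# Worked examples from the text ('alkane' has CGI 1).
tertiary_carbon = (atom_type("C", "alkane", tp="t"), atom_key("C", 1, tp="t"))
alkyl_hydrogen = (atom_type("H", "alkane", ts="C"), atom_key("H", 1, ts="C"))
```

Rule (2), at most one token from each group, is reflected here simply by giving each function a single `tp` and a single `ts` argument.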


It can be seen that the DL_F Notation effectively removes layers of complexity involving data structure conversion and ensures smooth model transition across various FF schemes. The DL_POLY FIELD files generated for ethanol molecules are listed in the Supporting Information. These files were produced by DL_FIELD version 3.5 with the option to specify the atoms in the DL_F Notation for OPLSAA and CVFF.

Fig. 1 Chemical models of ethanol molecules. On the left, the molecules are labelled with the standard atom keys for OPLSAA, CVFF and PCFF. On the right, the atom keys are converted into a universal set of keys expressed in the DL_F Notation.
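The two-record bookkeeping described earlier (unique ATOM_TYPEs shared by all schemes, non-unique ATOM_KEYs specific to each scheme) can be pictured as a simple lookup. In the sketch below the scheme names `scheme_A` and `scheme_B`, the key strings and the function `keys_for` are hypothetical placeholders and do not reproduce any entry of the actual DL_FIELD libraries; the ATOM_TYPE labels are formed according to the Notation purely for illustration.

```python
# Hypothetical library records: ATOM_TYPE -> ATOM_KEY for each FF scheme.
# Every ATOM_TYPE is unique; within a scheme, ATOM_KEYs may repeat.
LIBRARY = {
    # A scheme that does not distinguish between these carbon types.
    "scheme_A": {"Cp_alkane": "ka", "Cs_alkane": "ka", "Ct_alkane": "ka"},
    # A scheme that uses a separate key for each of them.
    "scheme_B": {"Cp_alkane": "kb1", "Cs_alkane": "kb2", "Ct_alkane": "kb3"},
}


def keys_for(atom_types, scheme):
    """Map ATOM_TYPEs from a single typing pass onto the keys of one scheme."""
    table = LIBRARY[scheme]
    return [table[t] for t in atom_types]


# The typing result is computed once and reused for every scheme.
typed_atoms = ["Cp_alkane", "Cs_alkane", "Cp_alkane"]
keys_a = keys_for(typed_atoms, "scheme_A")
keys_b = keys_for(typed_atoms, "scheme_B")
```

The point of the design is that only the final lookup changes when the FF scheme changes; the list of ATOM_TYPEs does not.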

DL_F CONVERSION ENGINE

Central to the successful implementation of DL_F Notation is the atom typing Engine module contained within the DL_FIELD program. This is where the CG of every atom, and hence the ATOM_TYPE, is determined. The Engine can analyse and detect the chemical structures of atoms and the functional groups to which they belong. Other programs for chemical functionality and structural analysis, such as checkmol,15 are also available. However, the DL_F conversion Engine is designed specifically for molecular simulations and can precisely determine the chemical location of every member atom within a functional group in a molecule.

The only information required for the conversion is the element symbols and the corresponding xyz coordinates of an input configuration. Note that DL_FIELD cannot detect possible tautomerism; it is up to the user to decide which tautomer to use when setting up the configuration model. From the input configuration, DL_FIELD maps out the bond connectivity matrix based on atomic distance criteria. Bond analysis is then carried out to determine the bond orders and to assign any formal charges.

Fig. 2 shows the schematic workflow for the Engine. The green boxes represent various functional analysis subroutines and the yellow boxes the corresponding possible outcomes, namely the


identification of CGs for the atoms in the structure. The solid arrows indicate the direction of the data flow.

Fig. 2 Workflow for the DL_F conversion Engine. The green boxes show the functional analysis subroutines and the solid arrows show the direction of the data flow. The yellow boxes illustrate some of the possible analysis outcomes, which are the CGs of atoms. The dotted arrows indicate the possible CG identifications, depending on the input structure.
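The first stage of the workflow, reading element symbols and coordinates and deriving a bond connectivity matrix from distance criteria, might look like the following Python sketch. The actual criteria used by DL_FIELD are not given in the text, so the rule used here (two atoms are bonded if their separation does not exceed the sum of approximate covalent radii plus a tolerance) is an assumption, as are the radii table, the tolerance value and the function names `parse_xyz` and `connectivity`.

```python
import math

# Approximate covalent radii in angstrom. Assumed values for illustration,
# not the table used by DL_FIELD.
COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def parse_xyz(text):
    """Read a simple xyz block: atom count, comment line, then 'El x y z'."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    atoms = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms


def connectivity(atoms, tolerance=0.4):
    """Return a symmetric boolean bond matrix from a distance criterion."""
    n = len(atoms)
    bonded = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            (sym_i, pos_i), (sym_j, pos_j) = atoms[i], atoms[j]
            # Assumed criterion: sum of covalent radii plus a tolerance.
            cutoff = COVALENT_RADIUS[sym_i] + COVALENT_RADIUS[sym_j] + tolerance
            if math.dist(pos_i, pos_j) <= cutoff:
                bonded[i][j] = bonded[j][i] = True
    return bonded


# Illustrative water geometry (coordinates in angstrom).
WATER_XYZ = """3
water, illustrative geometry
O  0.000  0.000  0.000
H  0.957  0.000  0.000
H -0.240  0.927  0.000
"""

water_atoms = parse_xyz(WATER_XYZ)
water_bonds = connectivity(water_atoms)
```

For this geometry both O–H pairs fall within the cutoff and the H–H pair does not. Determining bond orders and formal charges, which the Engine performs next, is beyond this sketch.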

First of all, the Engine identifies any one-particle CGs, such as ‘noble_gas’, that do not form bonds. After that, atoms belonging to cyclic structures are analysed so as to match them with any ring-specific CGs. This is done by detecting the presence of any ring structures using the molecular p-graph analysis method.16 Instead of searching for every possible combination of rings, larger overlapping rings are removed, keeping only the smallest set of rings. The Engine identifies whether a ring is aromatic or non-aromatic based on the Hückel rule of 4n+2 π electrons and on its conjugation structure. Once the CG of a ring is identified, any single-valence DL_F Atoms attached to the ring, such as hydrogen atoms and halogens, are assigned to the same CG.

After that, the Engine continues the analysis for the remaining parts of the molecule, including any ring structures that are still undefined, through a series of independent subroutines, each of which is responsible for a specific functional analysis, depending on the presence of certain types of bond arrangement. Within each functional analysis subroutine, detection of the CG with the largest number of member atoms is carried out first, followed by CGs with successively smaller numbers of member atoms. For example, within the >C=O carbonyl bond analysis routine, detection of the ‘imide’ and ‘acid_anhydride’ CGs, which contain two carbonyl bonds, is carried out first. This is followed by ‘carbonate_ester’, ‘carboxylic’, ‘ester’, etc. and finally ‘ketone’, the CG with the smallest number of member atoms, which is the >C=O itself. After that, the analysis carries on to the next subroutine, which looks for –O-C-O– types of bonds and, subsequently, the presence of any acetal-type CGs.

The example below shows the assignment of the ATOM_TYPEs for atorvastatin, a member of the drug class known as statins. The input file is in a simple xyz format (listed in the Supporting


Information). Fig. 3 shows the atom type detection flow diagram, along with a sketch of the drug molecule. The numbers 1 to 9 indicate the detection sequence for groups of atoms (molecular fragments), enclosed in red boxes, as the information is passed through the analysis subroutines (labelled in green). Within the ring analysis routine, four aromatic rings are detected, labelled 1 to 4. The first two are regular benzene rings and the third is a fluorobenzene derivative. The fourth is a 5-membered pyrrole ring. After that, the Engine detects two molecular fragments containing the carbonyl bond (>C=O): the ‘carboxylic’ CG followed by the ‘amide’ CG. The process continues until the last analysis subroutine, which detects carbon-only CGs such as ‘alkane’. Whenever a CG is identified, the Engine assigns the appropriate ATOM_TYPEs to all member atoms belonging to the CG. All the ATOM_TYPEs detected for atoms in atorvastatin are shown in SI4.
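The electron-counting part of the aromaticity test used in the ring analysis can be written in a few lines of Python. The sketch covers only the 4n+2 count; the check on the conjugation structure that the Engine also performs is not reproduced, and the function names and the way π electrons are tallied (two per ring double bond, two per lone pair donated by a heteroatom) are simplifying assumptions.

```python
def ring_pi_electrons(double_bonds, donated_lone_pairs=0):
    """Count ring pi electrons: two per double bond, two per donated lone pair."""
    return 2 * double_bonds + 2 * donated_lone_pairs


def satisfies_huckel(pi_electrons):
    """True if the count equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Benzene: three ring double bonds.
benzene = satisfies_huckel(ring_pi_electrons(3))
# Pyrrole: two ring double bonds plus the nitrogen lone pair.
pyrrole = satisfies_huckel(ring_pi_electrons(2, donated_lone_pairs=1))
# Cyclobutadiene: two ring double bonds, four pi electrons.
cyclobutadiene = satisfies_huckel(ring_pi_electrons(2))
```

Benzene and pyrrole each have six π electrons and pass the test, consistent with the aromatic rings identified in atorvastatin, whereas cyclobutadiene with four does not.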

Fig. 3 Determination of ATOM_TYPEs for atorvastatin
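The largest-first ordering used within each functional analysis subroutine can be sketched as a greedy assignment. The sketch assumes that candidate matches have already been found by some pattern search and that a smaller candidate is skipped once any of its atoms has been claimed by a larger CG; the text does not spell out this detail, so it is an assumption, as are the function name `assign_chemical_groups`, the atom indices and the member lists of the candidates.

```python
def assign_chemical_groups(candidates):
    """Assign CG names to atoms, trying the largest candidate groups first.

    candidates: list of (cg_name, atom_indices) pairs.
    Returns a dict mapping atom index to CG name.
    """
    assigned = {}
    # Sort by number of member atoms, largest first.
    ordered = sorted(candidates, key=lambda c: len(c[1]), reverse=True)
    for name, members in ordered:
        if any(index in assigned for index in members):
            continue  # an atom here already belongs to a larger CG
        for index in members:
            assigned[index] = name
    return assigned


# Hypothetical candidates for a molecule containing one ester-like fragment
# (atoms 1, 2, 3) and one isolated carbonyl (atoms 7, 8). The bare >C=O
# pattern also matches inside the ester, at atoms 1 and 2.
candidates = [
    ("ketone", (1, 2)),
    ("ketone", (7, 8)),
    ("ester", (1, 2, 3)),
]
groups = assign_chemical_groups(candidates)
```

Because the three-atom candidate is processed first, atoms 1 and 2 are labelled ‘ester’ and the overlapping two-atom match is discarded, while the isolated carbonyl keeps the ‘ketone’ label.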

Note that the atom typing will fail if the Engine comes across an unknown CG, or a set of atoms in an unknown arrangement. To date, the Engine can recognise about 130 CGs. They are listed in the DL_FIELD 3.5 library and are also shown in the Supporting Information.

CONCLUSION AND FUTURE PLANS

DL_F Notation is a standardised expression system implemented within the DL_FIELD program to describe the types of atoms for organic molecules in molecular simulations. A core conversion Engine carries out the atom typing for FF schemes such as OPLSAA, PCFF and CVFF, which demonstrates the capability and robustness of the Engine in analysing the chemical structure of a molecule using only a simple xyz file as the input. The FF model files can be set up in a single step without further intervention from the user. Furthermore, the paper shows that DL_F Notation can be implemented to produce consistent atom typing across a range of FF schemes, with precise chemical identification of every atom in the system.

At the moment, DL_F Notation only applies to general organic molecules, including drug molecules. Tests for the first 50 small drug molecules (both synthetic and natural-product


derived) listed in ChEMBL’s approved drug data freeze (December 2015) show that about 50% of the molecules have completely successful CG detection. Such a low percentage coverage is not surprising, since DL_FIELD at the moment contains only a subset of a total of about 220 known functional groups, excluding their derivatives.

More specific classes of biomolecules, such as amino acid residues for proteins and carbohydrate molecules, are usually associated with specific sets of FF parameters. Within the DL_FIELD program, an entirely different approach, namely the template matching method, is used to set up these FF models. This method is not described here since it is beyond the scope of this paper.

Future development will expand the DL_F Notation to other FF schemes that are already available within the DL_FIELD program, such as CHARMM, DREIDING and AMBER. More CGs will also be included in future DL_FIELD releases in order to improve the functionality of the Engine.

The DL_FIELD program is supplied to individuals under an academic license, which is free to academics pursuing scientific research of a non-commercial nature. Daresbury Laboratory is the sole center for distribution of the software. To obtain a copy of the software, please visit http://www.ccp5.ac.uk/DL_FIELD/

ASSOCIATED CONTENT

Supporting Information available: Lists of force field files (the FIELD files) for ethanol expressed in the DL_F Notation for OPLSAA and CVFF, input coordinates and atom typing for atorvastatin, and a list of the available CGs in DL_FIELD version 3.5. (PDF)

ACKNOWLEDGEMENTS

The author is grateful for funding from the Engineering and Physical Sciences Research Council (EPSRC) for the development of DL_FIELD under the auspices of the EPSRC's Collaborative Computational Project No. 5 (CCP5), grant nos. EP/J010480/1 and EP/M022617/1. Ilian Todorov is acknowledged for helping with DL_POLY support. Marcin Miklitz is acknowledged for providing test molecules.


REFERENCES

1. Leach, A. R. Molecular Modelling: Principles and Applications (2nd Edition); Pearson Education: Harlow, 2001.
2. Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations. J. Comput. Chem. 1983, 4, 187-217.
3. Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz Jr., K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids and Organic Molecules. J. Am. Chem. Soc. 1995, 117, 5179-5197.
4. Damm, W.; Frontera, A.; Tirado-Rives, J.; Jorgensen, W. L. OPLS All-atom Force Field for Carbohydrates. J. Comput. Chem. 1997, 18, 1955-1970.
5. Mayo, S. L.; Olafson, B. D.; Goddard III, W. A. DREIDING: A Generic Force Field for Molecular Simulations. J. Phys. Chem. 1990, 94, 8897-8909.
6. Rappe, A. K.; Casewit, C. J.; Colwell, K. S.; Goddard III, W. A.; Skiff, W. M. UFF, a Full Periodic Table Force Field for Molecular Mechanics and Molecular Dynamics Simulations. J. Am. Chem. Soc. 1992, 114, 10024-10035.
7. Dauber-Osguthorpe, P.; Roberts, V. A.; Osguthorpe, D. J.; Wolff, J.; Genest, M.; Hagler, A. T. Structure and Energetics of Ligand Binding to Proteins: E. coli Dihydrofolate Reductase-trimethoprim, a Drug-receptor System. Prot. Struct. Funct. Genet. 1988, 4, 31-47.
8. Sun, H.; Mumby, S. J.; Maple, J. R.; Hagler, A. T. An ab-initio CFF93 All-atom Force Field for Polycarbonates. J. Am. Chem. Soc. 1994, 116, 2978-2987.
9. Weininger, D. SMILES, a Chemical Language and Information System. 1. Introduction to Methodology and Encoding rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31-36.
10. Banks, J. L.; Beard, H. S.; Cao, Y.; Cho, A. E.; Damm, W.; Farid, R.; Felts, A. K.; Halgren, T. A.; Mainz, D. T.; Maple, J. R.; Murphy, R.; Philip, D. M.; Repasky, M. P.; Zhang, L. Y.; Berne, B. J.; Friesner, R. A.; Gallicchio, E.; Levy, R. M. Integrated Modelling Program, Applied Chemical Theory (IMPACT). J. Comput. Chem. 2005, 26, 1752-1780.
11. Youngs, T. G. A. Aten – An Application for the Creation, Editing, and Visualization of Coordinates for Glasses, Liquids, Crystals, and Molecules. J. Comput. Chem. 2010, 31, 639-648.
12. DL_FIELD version 3.5; A force field and model development software tool for DL_POLY; Daresbury Laboratory: Warrington, 2016.
13. Smith, W.; Yong, C. W.; Rodger, P. M. DL_POLY: Application to Molecular Simulation. Mol. Sim. 2002, 28, 385-471.


14. Yong, C. W. DL_FIELD User Manual version 3.5, 2016.
15. Haider, N. Functionality Pattern Matching as an Efficient Complementary Structure/reaction Search Tool: an Open-source Approach. Molecules 2010, 15, 5079-5092.
16. Hanser, Th.; Jauffret, Ph.; Kaufmann, G. A New Algorithm for Exhaustive Ring Perception in a Molecular Graph. J. Chem. Inf. Comput. Sci. 1996, 36, 1146-1152.
