Research Article pubs.acs.org/acscombsci

Enabling the Discovery and Virtual Screening of Potent and Safe Antimicrobial Peptides. Simultaneous Prediction of Antibacterial Activity and Cytotoxicity

Valeria V. Kleandrova,† Juan M. Ruso,‡ Alejandro Speck-Planche,*,‡,§ and M. Natália Dias Soeiro Cordeiro*,§

† Faculty of Technology and Production Management, Moscow State University of Food Production, Volokolamskoe shosse 11, Moscow, Russia
‡ Department of Applied Physics, University of Santiago de Compostela (USC), 15782 Santiago de Compostela, Spain
§ LAQV@REQUIMTE, Department of Chemistry and Biochemistry, University of Porto, 4169-007 Porto, Portugal

Supporting Information

ABSTRACT: Antimicrobial peptides (AMPs) represent promising alternatives to fight against bacterial pathogens. However, cellular toxicity remains one of the main concerns in the early development of peptide-based drugs. This work introduces the first multitasking (mtk) computational model focused on performing simultaneous predictions of antibacterial activities, and cytotoxicities of peptides. The model was created from a data set containing 3592 cases, and it displayed accuracy higher than 96% for classifying/predicting peptides in both training and prediction (test) sets. The technique known as alanine scanning was computationally applied to illustrate the calculation of the quantitative contributions of the amino acids (in their respective positions of the sequence) to the biological effects of a defined peptide. A small library formed by 10 peptides was generated, where peptides were designed by considering the interpretations of the different descriptors in the mtk-computational model. All the peptides were predicted to exhibit high antibacterial activities against multiple bacterial strains, and low cytotoxicity against various cell types. The present mtk-computational model can be considered a very useful tool to support high throughput research for the discovery of potent and safe AMPs.

KEYWORDS: alanine scanning, AMP, autocorrelations, contributions, mtk-computational model



INTRODUCTION

Bacterial diseases have plagued mankind for centuries, and they still represent serious unresolved health problems.1 Over time, many species of bacteria have become resistant to current antibiotics.1,2 In this context, antimicrobial peptides (AMPs) have been widely regarded as promising therapeutic alternatives, exhibiting wide-spectrum antiproliferative activities.3 Despite their multiple applications, peptides can undergo fast degradation in proteolytic environments,4 and therefore, stability is an aspect of great importance that should be considered when designing new peptides. In any case, the stability of peptides can be enhanced in different ways,5,6 which include incorporating D-amino acids or α-aminoxy amino acids, changing the backbone chemistry, or cyclization. However, such chemical modifications may not be enough to reduce the cytotoxicity, which remains one of the main concerns in the early development of peptide-based drugs,7,8 thus preventing the future use of the AMPs for clinical purposes.

Remarkable advances achieved in disciplines such as chemoinformatics and bioinformatics have enabled the creation of several computational models from different public sources,9 which are capable of discriminating AMPs from those with other biological functions/profiles.10−28 At the same time, computational approaches have been applied to the recognition of toxic and nontoxic peptides.29−31 However, there are several drawbacks in the use of the current computational models regarding the discovery of highly active and toxicologically safe AMPs. For instance, most of the models predict AMPs unspecifically, that is, without assessing the microorganisms against which an AMP can be active. That prevents the efficacious search for highly active and versatile AMPs. Additionally, descriptors calculated from the sequences and/or the 3D-structures of the AMPs are only global in nature; the absence of local descriptors prevents the analysis of the critical residues and regions responsible for the activities of the AMPs.

© 2016 American Chemical Society

Received: May 1, 2016. Published: June 9, 2016

DOI: 10.1021/acscombsci.6b00063 ACS Comb. Sci. 2016, 18, 490−498


ACS Combinatorial Science

and afterward multiplied by 10^3, yielding the final values expressed in μM. It should be noted that the transformations of the MIC, CC50, and HC50 values permitted a more accurate comparison of the potencies or cytotoxicities among the peptides. Not all the peptides were reported against all the biological systems, and therefore, the data set contained 3592 cases of peptides (combinations involving peptide + measure of the biological effect + biological system). Each peptide in the data set was assigned to 1 of 2 possible groups/classes, namely, positive [BEq(cr) = 1, indicating high antibacterial activity or low cytotoxicity] or negative [BEq(cr) = −1, indicating low antibacterial activity or high cytotoxicity]. The cutoff values to annotate a peptide as positive were: MIC ≤ 14.97 μM, or CC50 ≥ 60.91 μM, or HC50 ≥ 105.7 μM. Here, BEq(cr) is a binary variable that encodes the biological effect of the qth peptide under the experimental condition cr. It should be emphasized that the experimental condition cr is an ontology of the form cr → (me, bs), where me and bs have been explained above. The descriptors were calculated in order to characterize the structural information present in the peptide sequences. Once again, the program ProtDCal was used for this task. An important detail of this software is that several families of molecular descriptors can be calculated for peptides/proteins.43−45 In the present study, we selected the topological indices known as Broto-Moreau autocorrelations, which have been widely studied.46 The configuration used to calculate these descriptors is summarized in Supporting Information 1 (SM1). The mathematical formalism underpinning the calculation of the ACk(w) indices is represented in the following form:
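As an aside on the data-preparation steps described in this paragraph (unit conversion and class annotation, not the autocorrelation formalism), the following minimal Python sketch may help. The function names and the example numbers are illustrative assumptions; only the cutoff values and the conversion rule (divide by the molar mass, multiply by 10^3) come from the text.

```python
# Cutoffs quoted in the text, all in micromolar (uM).
CUTOFFS_UM = {"MIC": 14.97, "CC50": 60.91, "HC50": 105.7}


def ug_per_ml_to_um(value_ug_per_ml: float, molar_mass_g_per_mol: float) -> float:
    """Convert a concentration from ug/mL to uM.

    ug/mL divided by g/mol gives mmol/L; multiplying by 10^3 gives umol/L.
    """
    if molar_mass_g_per_mol <= 0:
        raise ValueError("molar mass must be positive")
    return value_ug_per_ml / molar_mass_g_per_mol * 1e3


def annotate(measure: str, value_um: float) -> int:
    """Return +1 (positive class) or -1 (negative class) for one case."""
    cutoff = CUTOFFS_UM[measure]
    if measure == "MIC":
        # A low MIC means high antibacterial activity.
        return 1 if value_um <= cutoff else -1
    # A high CC50 or HC50 means low cytotoxicity.
    return 1 if value_um >= cutoff else -1


# Hypothetical example: 32 ug/mL for a peptide of 2000 g/mol (made-up values).
mic_um = ug_per_ml_to_um(32.0, 2000.0)
label = annotate("MIC", mic_um)
```

In this hypothetical example the converted MIC is about 16 μM, which lies above the 14.97 μM cutoff, so the case would be annotated as negative.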

The same problems mentioned above occur in models focused on identifying toxic and nontoxic peptides, where toxicity is predicted without considering the cells against which a peptide may or may not be toxic. Very recently, a multitarget chemobioinformatic model has been created to perform simultaneous predictions of antibacterial activity of peptides against many Gram-positive bacterial strains,32 and that work has overcome some of the aforementioned handicaps of the current computational models. However, to date, no model is able to simultaneously predict antimicrobial activity and cellular toxicity of peptides. In recent years, several researchers have emphasized the use of multitasking (mtk) computational models, which are able to integrate different kinds of chemical and biological data, allowing the assessment of multiple biological activities against diverse biological systems (microorganisms, cell lines, etc.).33−36 A distinctive feature of all these mtk-computational models is that they are based on the use of graph-theoretic invariants (topological descriptors), which allow the characterization of the molecular diversity and complexity at local and global levels.37 In addition, topological descriptors are very useful for determining the relationships between the substructures/fragments and the biological effects.36,38 In this work, to speed up peptide discovery, we introduce the first mtk-computational model focused on performing simultaneous predictions of antibacterial activities against multiple Gram-negative bacterial strains, and cytotoxicities against different cell types.



MATERIALS AND METHODS

Retrieving the Data Set and Calculating the Descriptors. All chemical and biological data regarding the antibacterial activity and toxicity of peptides were extracted from the public source named Database of Antimicrobial Activity and Structure of Peptides (DBAASP).39 This database was chosen because of its high level of curation in terms of the experimental assays under which the peptides were tested. For instance, DBAASP takes into consideration the different strains of the same bacteria that were assayed against one or more AMPs. Nevertheless, we performed an additional curation of the data set retrieved from DBAASP by using the StrainInfo Web site, which is based on very precise annotations of many different microbial strains.40 Our data set was formed by 2123 peptides, containing from 4 to 119 amino acids. Each peptide was experimentally evaluated against at least 1 out of 70 biological systems (bs), which included diverse Gram-negative bacterial strains, and different mammalian cell types. In this data set, three measures of biological effects (me) were considered, namely, minimum inhibitory concentration (MIC), cytotoxic concentration at 50% (CC50), and hemolytic concentration at 50% (HC50). The MIC, CC50, and HC50 values were transformed from microgram per milliliter (μg/mL) to micromolar (μM). In order to perform this task, all the peptide sequences were stored in a txt file, which was converted to a FASTA file by employing the online tool named format converter. This tool is available at http://www.hiv.lanl.gov/content/sequence/FORMAT_CONVERSION/form.html, being a part of the public source known as HIV Sequence Database.41 Afterward, the program ProtDCal created by Marrero-Ponce et al. was used to calculate the approximate molecular weight/molar mass of each peptide.42 Then, the MIC, CC50, and HC50 values expressed in μg/mL were divided by the molecular weights/molar masses,

$$AC_k(w) = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i \cdot w_j \cdot \delta(d_{ij}; k) \qquad (1)$$

In eq 1, N is the total number of amino acids, and wi and wj are the physicochemical/structural properties of the amino acids i and j, respectively. In this equation, δ is the Kronecker delta; δ = 1 if dij = k and Lp ≥ k, where dij is the topological distance between the ith and jth amino acids, k is the cutoff value of the topological distance, and Lp is the length of the peptide (number of amino acids). However, for the specific case where dij < k and Lp > 1, the ACk(w) index is calculated as a constant value, being the simple sum of the squares of the physicochemical/structural properties of each amino acid. In this work, the ACk(w) descriptors were modified according to the following formalism:

$$MAC_k(w) = \frac{AC_k(w) \cdot \theta^{nt} \cdot K^{ct}}{L_p} \qquad (2)$$
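Equations 1 and 2 can be illustrated with a short Python sketch. It is not ProtDCal's implementation. It assumes that the topological distance in a linear peptide is the sequence separation |i − j|, it follows the double sum of eq 1 literally (so each residue pair is counted in both orders), and it does not implement the special constant-value case mentioned in the text. The function names are hypothetical, and the per-residue property values must be supplied by the caller.

```python
THETA = 1.306    # Mills's constant, as quoted in the text
K_CONST = 1.132  # Viswanath's constant, as quoted in the text


def autocorrelation(weights, k):
    """Eq 1: sum of w_i * w_j over all ordered pairs (i, j) with d_ij == k.

    Assumption: d_ij = |i - j| for a linear peptide sequence.
    """
    n = len(weights)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if abs(i - j) == k:  # Kronecker delta of eq 1
                total += weights[i] * weights[j]
    return total


def modified_autocorrelation(weights, k, nt=0, ct=0):
    """Eq 2: AC_k(w) * theta**nt * K**ct / Lp.

    nt and ct are the binary flags for N-terminal and C-terminal modification.
    """
    lp = len(weights)
    if lp == 0:
        raise ValueError("empty peptide")
    return autocorrelation(weights, k) * THETA ** nt * K_CONST ** ct / lp


# Placeholder property values for a 4-residue peptide (not real ProtDCal data).
toy_weights = [1.0, 2.0, 3.0, 4.0]
ac2 = autocorrelation(toy_weights, 2)                 # pairs (1,3) and (2,4), both orders
mac2_plain = modified_autocorrelation(toy_weights, 2)
mac2_acetylated = modified_autocorrelation(toy_weights, 2, nt=1)
```

With nt = ct = 0 both correction factors equal 1, so the index reduces to the length-normalized autocorrelation; setting either flag scales it by the corresponding constant.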

In eq 2, MACk(w) is the modified Broto-Moreau autocorrelation, while nt and ct represent binary variables describing the presence/absence of groups modifying the N-terminus (acetyl) and the C-terminus (amino), respectively. The term Lp is employed to normalize the ACk(w) descriptors. On the other hand, the arbitrary mathematical terms θ = 1.306 (Mills’s constant) and K = 1.132 (Viswanath’s constant) are introduced to increase/decrease the values of the MACk(w) indices according to the chemical modifications present in the N-terminus and C-terminus, respectively. Thus, the fact that θ > K indicates that the acetyl group modifying the N-terminus is larger than the amino group modifying the C-terminus. Therefore, with the introduction of θ and K, there are no


BEq(cr), and transforms them into scores (continuous values) of biological effects. After the use of default procedures applied by the program to find the discriminant function, each predicted continuous score is converted to its predicted categorical value [Pred-BEq(cr)]. The performance of the model was determined through the analysis of several statistical indices such as Wilks’s lambda (λ), chi-square (χ2), p-value, sensitivity, specificity, accuracy, Matthews’s correlation coefficient (MCC), and ROC (receiver operating characteristic) curves.36,38 The general steps leading to the development of the mtk-computational model appear depicted in Figure 1.

drastic changes in the values of MACk(w), but the values change enough to differentiate unmodified peptides from those with modifications in the N-terminus and/or C-terminus. Notice, however, that other mathematical constants may be employed.

Box−Jenkins Moving Averages and Generation of the mtk-Computational Model. It is intuitive to see that the MACk(w) indices depend only on the peptide sequences. Consequently, they will not be capable of discriminating the antibacterial activity or the cytotoxicity of a peptide when tested against different Gram-negative bacterial strains or diverse mammalian cell types, respectively. The Box−Jenkins approach offers a simple solution through the calculations of moving averages, which were initially used in time series analysis.47 Thus, in numerous fields of research in drug discovery, various works have reported the development of advanced chemoinformatic models inspired by the idea regarding the calculation of Box−Jenkins moving averages.33−36,48−58 In the first step, the following equation is employed:

$$\mathrm{avg\_MAC}_k(w)_{c_r} = \frac{1}{n(c_r)} \sum_{q=1}^{n(c_r)} MAC_k^{q}(w) \qquad (3)$$

In eq 3, n(cr) represents the number of peptides assayed by considering the same element of the experimental condition (ontology) cr, which have also been annotated as positive. For instance, for the case of the element bs, n(cr) will be the number of peptides tested against the same biological system, which have been assigned to the class named positive. The same deduction can be made for the element me. In addition, in eq 3, avg_MACk(w)cr is the arithmetic mean of the MACkq(w) indices. After the calculation of avg_MACk(w)cr, a subsequent expression can be used for each peptide:

$$DMAC_k^{q}(w)_{c_r} = MAC_k^{q}(w) - \mathrm{avg\_MAC}_k(w)_{c_r} \qquad (4)$$

Figure 1. General steps involved in the development of the mtk-computational model.
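Equations 3 and 4 amount to a group-wise mean over the positive cases followed by a subtraction. The sketch below illustrates this in Python under stated assumptions: each case is a plain dictionary with the hypothetical keys `mac` (one MACk(w) value), `label` (+1 or −1), and the condition elements `me` and `bs`. The authors' actual data layout is not described in the text.

```python
from collections import defaultdict


def condition_averages(cases, element):
    """Eq 3: mean MAC value over the positive cases sharing one element of c_r.

    `element` is the name of the condition element, e.g. "me" or "bs".
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for case in cases:
        if case["label"] == 1:  # only cases annotated as positive contribute
            sums[case[element]] += case["mac"]
            counts[case[element]] += 1
    return {key: sums[key] / counts[key] for key in sums}


def deviations(cases, element):
    """Eq 4: DMAC = MAC minus the average for the case's own condition element."""
    averages = condition_averages(cases, element)
    result = []
    for case in cases:
        key = case[element]
        if key not in averages:
            # No positive case exists for this element, so eq 3 is undefined.
            raise ValueError(f"no positive cases for {element}={key!r}")
        result.append(case["mac"] - averages[key])
    return result


# Made-up values, for illustration only.
toy_cases = [
    {"mac": 2.0, "label": 1, "bs": "E. coli"},
    {"mac": 4.0, "label": 1, "bs": "E. coli"},
    {"mac": 10.0, "label": -1, "bs": "E. coli"},
]
toy_dmac = deviations(toy_cases, "bs")
```

Note that negative cases also receive a deviation, but it is measured against the average of the positive cases only, as stated for n(cr) in the text.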

RESULTS AND DISCUSSION

Mtk-Computational Model. Although forward stepwise selection was used as the variable selection strategy, we also took into account the fact that the best model should exhibit a very high statistical quality while containing as few descriptors as possible. This idea is related to the well-known principle of parsimony. The best mtk-computational model found by us contained 4 variables/descriptors:

In eq 4, the deviation term DMACkq(w) is an adaptation of the Box−Jenkins moving averages, and it takes into account both the peptide sequence and a defined element of the experimental condition cr under which a peptide was tested. For this reason, only the DMACkq(w) descriptors (210 in total) were considered for the creation of the mtk-computational model. The data set formed by the 3592 cases of peptides was randomly partitioned into two series: training and prediction (test) sets. The training set was used to search for the best mtk-computational model. This set contained 2711 cases, 1404 annotated as positive, and 1307 negative. On the other hand, the prediction set contained 881 cases, 440 considered as positive, and the remaining 441 assigned as negative. This set was employed to assess the predictive power of the mtk-computational model. Linear discriminant analysis (LDA) was used as the data analysis method for generating the model, where the procedure known as forward stepwise served as the variable selection strategy. The statistical analysis was carried out with the software STATISTICA, version 6.0.59 The mtk-computational model can be expressed according to the following general form:

$$BE_q(c_r) = a_0 + \sum_{i=1}^{m} b_i \cdot [DMAC_k^{q}(w)]_i \qquad (5)$$

$$BE_q(c_r) = 6.8 \times 10^{-5}\, DMAC_2(\mathrm{At})_{me} + 1.2 \times 10^{-4}\, DMAC_6(\mathrm{Vm})_{me} - 1.61 \times 10^{-4}\, DMAC_3(\mathrm{Anp})_{bs} + 0.03\, DMAC_4(\mathrm{IP})_{bs} + 1.19 \qquad (6)$$

$$N = 2711 \qquad \lambda = 0.39 \qquad \chi^2 = 2527.03 \qquad p\text{-value} < 10^{-16}$$

The symbols and concepts related to the different descriptors in eq 6 are depicted in Table 1. It is easy to infer that the small magnitude of the statistical indices p-value and λ, and the large χ2 are indicators of the good quality of the mtk-computational model. In fact, the percentages of correct classification for positive (sensitivity) and negative (specificity) peptides support the idea regarding the great performance of the model. In this sense, the sensitivity of the mtk-computational model was 98.36% in the training set, which means that 1381 out of 1404 positive cases of peptides were correctly classified. In this same set, 1266 out of 1307 negative cases of peptides were properly classified, for a specificity of 96.86%. The accuracy was 97.64%. At the same time, 429 out of 440 positive (sensitivity = 97.5%) and 427 out of 441 negative (specificity = 96.83%) cases of peptides were correctly classified in the prediction (test) set,

In eq 5, a0 represents the constant term, while bi is used to describe the coefficients of the descriptors. We need to point out that during the creation of the model, the software STATISTICA, version 6.0, takes the initial categorical values of


Table 1. Descriptors Present in the Final mtk-Computational Model

| symbol | definition |
|---|---|
| DMAC2(At)me | Deviation of the Broto−Moreau autocorrelation index weighted by the accessible surface area of each amino acid in unfolded state, considering all the amino acids placed at topological distance equal to 2, and depending on the measure of the biological effect. |
| DMAC6(Vm)me | Deviation of the Broto−Moreau autocorrelation index weighted by the volume, considering all the amino acids placed at topological distance equal to 6, and depending on the measure of the biological effect. |
| DMAC3(Anp)bs | Deviation of the Broto−Moreau autocorrelation index weighted by the nonpolar area of each amino acid in unfolded state, considering all the amino acids placed at topological distance equal to 3, and depending on the biological system against which a peptide was assayed. |
| DMAC4(IP)bs | Deviation of the Broto−Moreau autocorrelation index weighted by the isoelectric point, considering all the amino acids placed at topological distance equal to 4, and depending on the biological system against which a peptide was assayed. |

which yielded an accuracy of 97.16%. All the particular details of the statistical classifications related to each AMP, as well as the chemical and biological data appear in Supporting Information 2 (SM2). On the other hand, we also calculated another statistical index named percentage of correctly classified peptides depending on the biological systems [% CCP(bs)] against which the peptides were tested. This index considered both Gram-negative bacterial strains and mammalian cell types. Thus, the %CCP(bs) values ranged in the interval 83.33−100% for positive peptides, and 80−100% for negative peptides. A similar index was calculated for the different measures of biological effects used in this study [% CCP(me)], where the values were higher than 95%. This information can be found in Supporting Information 3 (SM3). In addition to the statistical indices mentioned above, MCC served as another measure of the statistical quality and predictive power of the mtk-computational model. This comes from the fact that MCC measures the strength of the correlation between the observed and predicted values of the categorical variable BEq(cr), which encodes the relative biological effect of each peptide in the data set. The MCC values obtained from the mtk-computational model were 0.95 and 0.94 for the training and prediction sets, respectively. Notice that these MCC values are quite close to 1, which demonstrates that there is a very high correlation between the observed and predicted values of BEq(cr). As a final proof of the significant performance of the model, we used the areas under the ROC curves (Figure 2). The values of areas under the ROC curves were 0.998 in the training set, and 0.997 for the prediction set. The values for this statistical index clearly indicate the great efficiency of the mtk-computational model,

which does not behave as a random classifier (area = 0.5). The integrated statistical analysis of the model points to its high quality and predictive power. A particular characteristic of the mtk-computational model developed in this study is that the descriptors depend on both the peptide sequences and at least one of the elements of the experimental condition cr, namely, the measure of the biological effect (me), and the biological system against which a peptide was assayed (bs). Thus, a peptide may appear more than once in the data set because different biological systems (Gram-negative bacterial strains or mammalian cell types) and diverse measures of biological effects (MIC, CC50, HC50) were employed during the assays. Consequently, for each time that a peptide was repeated in the data set, the mtk-computational model yielded a different classification/prediction result. On the other hand, alignment rules are not needed, and the use of sequences instead of 3D-structures helps to minimize the computational cost associated with procedures based on geometry optimizations. Therefore, the present model can be used as a complementary in silico tool for techniques such as high-throughput screening, enabling the virtual search in large libraries of peptides, and filtering those peptides with potent and versatile antibacterial activity, and low cytotoxicity.

Interpretation of the Descriptors from a Physicochemical/Structural Point of View. One of the most important aspects of the mtk-computational model developed in this study is the fact that all the descriptors have relatively simple physicochemical/structural interpretations. The DMACk(w)cr descriptors calculated in this work are derived from autocorrelations. Therefore, they measure the strength of the relationship between any two amino acids as a function of the topological distance between them, describing how a defined physicochemical/structural property is distributed along the peptide sequence.
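The fitted model of eq 6 and the classification statistics reported for it can be checked with a few lines of Python. This is a sketch under stated assumptions rather than the STATISTICA procedure: the function names and the descriptor dictionary keys are illustrative, the false-negative and false-positive counts are derived from the reported totals (for example, 1404 − 1381 = 23), and the rule that converts a continuous score into a class is not given in the text, so only the score is computed.

```python
import math

# Coefficients and constant term as printed in eq 6.
EQ6_COEFFICIENTS = {
    "DMAC2(At)me": 6.8e-5,
    "DMAC6(Vm)me": 1.2e-4,
    "DMAC3(Anp)bs": -1.61e-4,
    "DMAC4(IP)bs": 0.03,
}
EQ6_INTERCEPT = 1.19


def eq6_score(descriptors):
    """Continuous score of eq 6 for one case (descriptor name -> value)."""
    return EQ6_INTERCEPT + sum(
        coeff * descriptors[name] for name, coeff in EQ6_COEFFICIENTS.items()
    )


def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, and MCC from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Counts taken or derived from the reported classification results.
training = classification_metrics(tp=1381, fn=23, tn=1266, fp=41)
prediction = classification_metrics(tp=429, fn=11, tn=427, fp=14)
```

Evaluated on these counts, the functions give values that round to the reported figures: 98.36%, 96.86%, and 97.64% (MCC about 0.95) for the training set, and 97.5%, 96.83%, and 97.16% (MCC about 0.94) for the prediction set.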
Taking into account all these elements, we will analyze the DMACk(w)cr indices with the aim of obtaining important information regarding how these descriptors should be varied to improve the biological effect (increasing antibacterial activity and decreasing cytotoxicity) of any peptide. With the aim of providing as much information as possible, we will rely on the relative influences of the diverse DMACk(w)cr descriptors, which have been assessed through the generation of the absolute values of the standardized coefficients (Figure 3). By inspecting eq 6, one can see that three of the four descriptors are steric in nature. The descriptor DMAC2(At)me is the second most important variable in the mtk-computational model, and it characterizes the increment of the accessible surface area of the amino acids that are placed at topological distance equal to 2, depending on the measure of the biological effect. The accessible surface area is defined as the portion of area of a molecule that is accessible to a solvent (water in the case of biological media). However, as we are

Figure 2. ROC curves as measures of the high performance of the model.


suggests not only the influence of electronic, steric, or hydrophobic factors, but also the presence of specific patterns.30,31,60 Therefore, it is intuitive to deduce that certain amino acids may exhibit an intrinsic effect on the antibacterial activity or the cytotoxicity of a peptide. At the same time, that intrinsic effect may vary depending on the position of the amino acid in the peptide sequence. Thus, the calculation of the quantitative contributions of the amino acids to the antibacterial activity and cytotoxicity of the peptides is a crucial aspect toward the optimization of the desirable properties that a peptide should have. To calculate such contributions, we have computationally applied the approach known as alanine scanning, a well-known experimental technique used in molecular biology with the aim of assessing the influence of the amino acids on the stability and functions of a given protein.61 The peptide LVIRTVIAGYNLYRAIKKK (Lp = 19, being Lp the number of amino acids) was employed as a case of study for the calculation of the contributions. This AMP appears four times in our data set. In this context, LVIRTVIAGYNLYRAIKKK was assigned as positive (high antibacterial activity) against Salmonella enterica subsp. enterica serovar Typhimurium (ATCC 14028) and Escherichia coli (ATCC 25922/KCTC 1682), and it was also annotated as positive (low cytotoxicity) against Madin−Darby canine kidney cells and human erythrocytes. By applying alanine scanning, we substituted the amino acid at position 1 by alanine. Consequently, due to the substitution, a mutated sequence was obtained. Then, the descriptors were calculated for both the original and the mutated sequence through the application of all the equations referred to above, and two different scores of biological effects (one per each sequence) were generated by substituting the corresponding DMACkq(w)cr descriptors in eq 6. 
The subtraction of the score of the mutated sequence from the score of the original sequence yielded a score named SBE1, which characterized the contribution of the amino acid (in the kth position) to the biological effect of the peptide under study, depending on the measure of the biological effect (me), and the biological system (bs) against which the peptide was assayed. This procedure was applied to each amino acid of the peptide under analysis. It was possible to calculate a second score for the amino acid in the kth position of the peptide sequence. Such score was calculated by substituting in eq 6 the descriptors of the amino acid that was replaced (in the kth position) by alanine. This score, annotated as SBE2, embodied the intrinsic influence of the aforementioned amino acid, without considering its molecular environment, that is, regardless of its neighborhood with respect to other amino acids. It should be noticed that SBE2 also depends on the elements me and bs. The knowledge about SBE1 and SBE2 allowed us to apply the following mathematical formalism:

Figure 3. Standardized coefficients as measures of the importance of the descriptors in the mtk-computational model. The higher the standardized coefficient the greater the significance of the descriptor in the model.

describing how the descriptor influences the biological effect of the peptides, DMAC2(At)me may also be related to regions in the peptides that are accessible to a surrounding medium other than water. Amino acids with the highest accessible surface area are tryptophan (W), arginine (R), tyrosine (Y), methionine (M), and lysine (K). At the same time, DMAC3(Anp)bs considers the diminution of the nonpolar area of the amino acids placed at topological distance equal to 3. This descriptor is the most important of all, and it depends on the biological systems against which a peptide has been tested. We would like to emphasize that DMAC3(Anp)bs indicates the presence of amino acids with small nonpolar area such as arginine (R), glutamine (Q), glutamic acid (E), aspartic acid (D), and serine (S), among others. On the other hand, DMAC4(IP)bs depends on the biological systems against which a peptide was assayed, and it characterizes the increment of the isoelectric point of the amino acids placed at topological distance equal to 4. This means that some basic amino acids such as arginine (R) and lysine (K) may be required in the peptide sequence, and will help to improve the biological effect. The descriptor DMAC4(IP)bs has the lowest significance in the mtk-computational model. Finally, DMAC6(Vm)me expresses that the biological effect of a peptide can be improved if the volumes of the amino acids placed at topological distance equal to 6 are increased. The descriptor DMAC6(Vm)me also depends on the measure of biological effect, and it is the third most significant variable. Notice that the bulkiest amino acids are tryptophan (W), tyrosine (Y), and phenylalanine (F), and two of them (W and F) are nonpolar. Therefore, DMAC6(Vm)me may account for the presence of nonpolar bulky amino acids. 
The combined interpretations of the descriptors in the mtk-computational model suggest that peptides where polar amino acids are concentrated in small regions (dipeptides and tripeptides), and bulky amino acids are dispersed along the peptide sequence, will have high antibacterial activity and low cytotoxicity. Determining the Contributions of the Amino Acids to the Antibacterial Activity and Cytotoxicity of a Peptide. The current knowledge regarding the requirements responsible for the appearance or enhancement of the antibacterial activity of the peptides and the diminution of their cytotoxicities

$$WSBE = \frac{n_1}{L_p} \cdot SBE_1 + \frac{n_2}{L_p} \cdot SBE_2 \qquad (7)$$
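The computational alanine scan and eq 7 can be sketched in Python as follows. The function names are hypothetical. SBE1 and SBE2 are treated as inputs, because producing them requires the full descriptor pipeline (ProtDCal property values, eqs 1 to 4, and eq 6). The standardization uses the sample standard deviation, which is an assumption: the text does not say whether the sample or the population form was used.

```python
import statistics


def alanine_mutants(sequence):
    """Yield (position, original_residue, mutated_sequence), 1-based positions.

    Positions that already hold alanine yield a sequence identical to the original.
    """
    for idx, residue in enumerate(sequence):
        yield idx + 1, residue, sequence[:idx] + "A" + sequence[idx + 1:]


def wsbe(sbe1, sbe2, lp, n2=1):
    """Eq 7: weighted mean of SBE1 and SBE2, with n1 + n2 = Lp."""
    n1 = lp - n2
    return (n1 / lp) * sbe1 + (n2 / lp) * sbe2


def standardize(scores):
    """Subtract the mean and divide by the (sample) standard deviation."""
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # assumption: sample SD; needs >= 2 scores
    return [(score - mean) / sd for score in scores]


# The case-study peptide from the text (Lp = 19, so n1 = 18 and n2 = 1).
peptide = "LVIRTVIAGYNLYRAIKKK"
mutants = list(alanine_mutants(peptide))
```

For a single substitution in this peptide the weights are 18/19 for SBE1 and 1/19 for SBE2, so the environment-dependent score SBE1 dominates the weighted score.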

In eq 7, n2 stands for the number of amino acids that were substituted in the peptide. In our specific case, we are analyzing the substitution of only one amino acid (although sequence fragments can also be substituted). For this reason, n2 = 1. At the same time, n1 is the number of amino acids (of the original sequence) that did not undergo any substitution. It should be noted that the condition n1 + n2 = Lp is always valid, and


therefore n1 = 18. Finally, in eq 7, WSBE represents the weighted score of biological effect, being calculated as the weighted arithmetic mean of SBE1 and SBE2. As the final step, we standardized all the WSBE scores. To perform this task, both the mean and the standard deviation of the whole set of WSBE scores were determined. The standardized WSBE score of each amino acid was calculated by subtracting the mean (of all the WSBE scores) from its unstandardized WSBE score (eq 7), and then dividing the result of the subtraction by the standard deviation (of all the WSBE scores). The WSBE scores can be interpreted as the relative quantitative contributions of the amino acids to the different biological effects of the peptide LVIRTVIAGYNLYRAIKKK, depending on me and bs. A summary regarding the contributions is depicted in Table 2, whereas the complete

contributions. While L1 has negative influence on the diverse biological effects of the peptide under analysis (diminution of the antibacterial activity and increment of the cytotoxicity), L12 has the opposite effect. These observations suggest, on the one hand, that L1 should be replaced by another amino acid with more desirable properties according to the interpretations of the descriptors in the mtk-computational model (eq 6). On the other hand, they demonstrate the sensitivity of the different descriptors to the positions of the amino acids in the peptide sequence. As described above, we have employed the peptide LVIRTVIAGYNLYRAIKKK as a case of study for calculating the contributions of the amino acids to its multiple biological effects (antibacterial activity and cytotoxicity). It is necessary to point out that the procedure can be applied to any peptide, even if it is not present in our data set, and the contributions of the amino acids will be valid only for the peptide under analysis. Therefore, this means that for different peptides, the contributions of their respective amino acids will vary. An interesting aspect is that many peptides assigned (and correctly classified) as positive may have in their sequences amino acids with undesirable properties according to the descriptors in the mtk-computational model. At the same time, some amino acids with desirable properties may be found in many peptides annotated (and correctly classified) as negative. This is not a surprising phenomenon, but the information gathered from it tells us that the presence or absence of such amino acids is not enough for an appreciable enhancement of the antibacterial activity and/or the diminution of the cytotoxicity. 
What matters is the combination of certain amino acids and the way in which they are connected and placed at certain topological distances; these are the essential features influencing the improvement of the antibacterial activity and the decrease of the cytotoxicity. In any case, the calculation of the relative quantitative contributions of the amino acids to the diverse biological effects of any peptide allows the elimination of amino acids with undesirable properties (negative contributions), or their substitution by others with positive contributions. Amino acids with mixed contributions should be analyzed with caution. All these ideas may optimize the search for highly active and versatile AMPs against multiple Gram-negative bacterial strains that also exhibit low cytotoxicity.

Virtual Design of New Peptides and Prediction of Their Biological Profiles. So far there has been a tendency to use the existing computational models as tools for the virtual screening of AMPs (ref 9). However, it must be emphasized that the descriptors used in any model can characterize only a small portion of the chemical diversity and complexity of the peptides, which means that the models (even those using combinations of many different descriptors) are just approximations to those two aspects. Consequently, computational models employed in peptide discovery will consider a reduced part of the vast chemical space related to peptides, and therefore, they will correctly predict peptides only to some extent. Thus, it is natural to infer that the computational models should be used beyond the classical task of predicting large (external) data sets of peptides.
In this section, our purpose is to demonstrate that, at least from a theoretical point of view, potent antibacterial peptides that also exhibit low cytotoxicity can be virtually designed by using the interpretations of the different descriptors together with the information provided by the standardized coefficients in Figure

Table 2. Relative Quantitative Contributions of the Amino Acids to the Different Biological Effects of the Peptide LVIRTVIAGYNLYRAIKKK (footnotes a−c)

| amino acid | position | me = MIC: Salmonella enterica subsp. enterica serovar Typhimurium (ATCC 14028) | me = MIC: Escherichia coli (ATCC 25922/KCTC 1682) | me = CC50: Madin−Darby canine kidney cells | me = HC50: human erythrocytes |
| L | 1 | −0.33 | −0.32 | −0.34 | −0.27 |
| V | 2 | −0.80 | −0.79 | −0.80 | −0.74 |
| I | 3 | −0.81 | −0.80 | −0.82 | −0.75 |
| R | 4 | 1.79 | 1.80 | 1.78 | 1.85 |
| T | 5 | −0.68 | −0.67 | −0.68 | −0.62 |
| V | 6 | −0.83 | −0.82 | −0.84 | −0.77 |
| I | 7 | −0.30 | −0.29 | −0.30 | −0.24 |
| A | 8 | −1.13 | −1.12 | −1.14 | −1.07 |
| G | 9 | −1.18 | −1.17 | −1.18 | −1.12 |
| Y | 10 | 0.17 | 0.18 | 0.16 | 0.22 |
| N | 11 | 0.19 | 0.20 | 0.19 | 0.25 |
| L | 12 | 0.95 | 0.96 | 0.95 | 1.01 |
| Y | 13 | −0.01 | 0.00 | −0.02 | 0.05 |
| R | 14 | 2.33 | 2.34 | 2.32 | 2.38 |
| A | 15 | −1.13 | −1.12 | −1.14 | −1.07 |
| I | 16 | −0.85 | −0.84 | −0.85 | −0.79 |
| K | 17 | 0.84 | 0.85 | 0.83 | 0.90 |
| K | 18 | 0.91 | 0.92 | 0.90 | 0.97 |
| K | 19 | 0.57 | 0.58 | 0.56 | 0.62 |

a The element me refers to the measure of biological effect employed in the assay.
b The bacteria and cells listed in this table are the different biological systems (bs) against which the experimental tests were performed.
c Each value in the table represents the contribution of an amino acid to a defined biological effect of the peptide under study. The higher this contribution value, the greater the favorable influence of the amino acid on the biological effect of the peptide.
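The values in Table 2 can be sorted into residues with favorable, unfavorable, and mixed contributions. The Python sketch below does this using the four columns of the table; the grouping rule (all values above zero, all below zero, otherwise mixed) and the short column labels are assumptions made for illustration, not a procedure stated by the authors.

```python
# Table 2 columns, in sequence order (positions 1-19).
SEQUENCE = "LVIRTVIAGYNLYRAIKKK"
TABLE2 = {
    "MIC, S. Typhimurium": [-0.33, -0.80, -0.81, 1.79, -0.68, -0.83, -0.30,
                            -1.13, -1.18, 0.17, 0.19, 0.95, -0.01, 2.33,
                            -1.13, -0.85, 0.84, 0.91, 0.57],
    "MIC, E. coli": [-0.32, -0.79, -0.80, 1.80, -0.67, -0.82, -0.29,
                     -1.12, -1.17, 0.18, 0.20, 0.96, 0.00, 2.34,
                     -1.12, -0.84, 0.85, 0.92, 0.58],
    "CC50, MDCK cells": [-0.34, -0.80, -0.82, 1.78, -0.68, -0.84, -0.30,
                         -1.14, -1.18, 0.16, 0.19, 0.95, -0.02, 2.32,
                         -1.14, -0.85, 0.83, 0.90, 0.56],
    "HC50, erythrocytes": [-0.27, -0.74, -0.75, 1.85, -0.62, -0.77, -0.24,
                           -1.07, -1.12, 0.22, 0.25, 1.01, 0.05, 2.38,
                           -1.07, -0.79, 0.90, 0.97, 0.62],
}


def group_residues(sequence, table):
    """Group residues by the sign of their contributions across all columns."""
    groups = {"favorable": [], "unfavorable": [], "mixed": []}
    for index, residue in enumerate(sequence):
        values = [column[index] for column in table.values()]
        label = f"{residue}{index + 1}"
        if all(v > 0 for v in values):
            groups["favorable"].append(label)
        elif all(v < 0 for v in values):
            groups["unfavorable"].append(label)
        else:
            groups["mixed"].append(label)
    return groups


groups = group_residues(SEQUENCE, TABLE2)
```

Under this rule, the unfavorable group lists the candidates for elimination or substitution, while the mixed group collects the residues that the text recommends analyzing with caution.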

details of all the calculations are given in Supporting Information 4 (SM4). The larger the contribution value, the greater the favorable influence of an amino acid on the biological effect of the peptide. Results from Table 2 indicate that amino acids such as arginine (R4 and R14) and lysine (K17−K19) have high positive contributions under all the conditions, which indicates that they have a remarkable influence on the enhancement of the antibacterial activity and the diminution of the cytotoxicity of the peptide LVIRTVIAGYNLYRAIKKK. Interestingly, two different leucine residues (L1 and L12) have very different

DOI: 10.1021/acscombsci.6b00063 ACS Comb. Sci. 2016, 18, 490−498


Table 3. Probabilities Predicted by the mtk-Computational Model to Classify the Designed Peptides As Positive under Multiple Experimental Conditions (footnotes a, b)

| peptide ID | me = MIC (%): Escherichia coli (ATCC 25922/KCTC 1682) | me = MIC (%): Pseudomonas aeruginosa (ATCC 27853) | me = CC50 (%): mouse fibroblast cells (NIH 3T3) | me = CC50 (%): human monocytic THP-1 cells | me = HC50 (%): human erythrocytes |
| M1 | 99.61 | 99.54 | 99.98 | 99.83 | 99.74 |
| M2 | 97.15 | 96.63 | 99.82 | 98.71 | 98.06 |
| M3 | 99.64 | 99.57 | 99.98 | 99.84 | 99.75 |
| M4 | 97.93 | 97.55 | 99.87 | 99.07 | 98.6 |
| M5 | 99.96 | 99.96 | 100 | 99.98 | 99.98 |
| M6 | 99.95 | 99.94 | 100 | 99.98 | 99.97 |
| M7 | 99.62 | 99.55 | 99.98 | 99.83 | 99.75 |
| M8 | 97.31 | 96.81 | 99.83 | 98.78 | 98.17 |
| M9 | 99.66 | 99.59 | 99.98 | 99.85 | 99.77 |
| M10 | 99.98 | 99.98 | 100 | 99.99 | 99.99 |

a The element me refers to the measure of biological effect employed in the assay.
b The bacteria and cells listed in this table are some of the most representative biological systems (bs) against which the peptides were assayed.
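The probabilities in Table 3 can be summarized per peptide by taking the least favorable condition. The Python sketch below computes that worst-case probability and ranks the designed peptides by it; ranking by the minimum is an illustrative choice, not a criterion used by the authors, and the short column labels are ours.

```python
PEPTIDES = [f"M{i}" for i in range(1, 11)]

# Table 3 columns, in peptide order M1-M10 (probabilities in %).
TABLE3 = {
    "MIC, E. coli": [99.61, 97.15, 99.64, 97.93, 99.96,
                     99.95, 99.62, 97.31, 99.66, 99.98],
    "MIC, P. aeruginosa": [99.54, 96.63, 99.57, 97.55, 99.96,
                           99.94, 99.55, 96.81, 99.59, 99.98],
    "CC50, NIH 3T3": [99.98, 99.82, 99.98, 99.87, 100.0,
                      100.0, 99.98, 99.83, 99.98, 100.0],
    "CC50, THP-1": [99.83, 98.71, 99.84, 99.07, 99.98,
                    99.98, 99.83, 98.78, 99.85, 99.99],
    "HC50, erythrocytes": [99.74, 98.06, 99.75, 98.6, 99.98,
                           99.97, 99.75, 98.17, 99.77, 99.99],
}

# Worst-case (minimum) predicted probability for each peptide.
worst_case = {
    peptide: min(column[i] for column in TABLE3.values())
    for i, peptide in enumerate(PEPTIDES)
}

# Peptides ordered from the highest to the lowest worst-case probability.
ranked = sorted(PEPTIDES, key=worst_case.get, reverse=True)

# True when every tabulated probability exceeds 96%.
all_above_96 = all(p > 96.0 for p in worst_case.values())
```

For the conditions listed in the table, the lowest value belongs to M2 against Pseudomonas aeruginosa, and it still exceeds 96%.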

foundations for the design of large libraries of active and potentially safe AMPs using the mtk-computational model as a knowledge generator. This work paves the way to the rational discovery of potent, versatile, and toxicologically desirable antibacterial peptides, envisaging new horizons toward their uses as encouraging therapeutic options.

3. Thereafter, the biological effects of such peptides can be predicted to check whether the virtual design was well performed. Taking into account the previous ideas, and in order to illustrate how the mtk-computational model works in practice, we created a small library of 10 peptides (each containing 20 amino acids) that were not present in our data set. The peptides were generated in a simple way: they were designed to be rich in amino acids such as arginine (R), glutamine (Q), glutamic acid (E), and asparagine (N). These amino acids have very low nonpolar areas, a property characterized by the descriptor DMAC3(Anp)bs, which has the highest influence on the biological effects of the peptides (Figure 3). Additionally, the amino acids mentioned above have acceptable values for some of the other physicochemical/structural properties explained by descriptors other than DMAC3(Anp)bs. The mtk-computational model was used to predict the antibacterial activities and the cytotoxicities of the designed peptides. A summary of the predictions is given in Table 3, which covers some of the most representative bacterial strains and mammalian cell types reported in our data set. All these peptides were predicted with probabilities higher than 96% to exhibit high antibacterial activity (MIC ≤ 14.97 μM) against all the Gram-negative bacteria, and low cytotoxicity (CC50 ≥ 60.91 μM, and HC50 ≥ 105.7 μM) against all the mammalian cell types. The specific details can be found in Supporting Information 5 (SM5).
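The cutoffs quoted above translate directly into a labeling rule for a measured value. The Python sketch below encodes them; the function name, the dictionary layout, and the example measurements are hypothetical, and only the three cutoff values come from the text.

```python
# Cutoffs (in micromolar) quoted in the text for the positive class:
# high antibacterial activity or low cytotoxicity.
CUTOFFS_UM = {
    "MIC": ("<=", 14.97),   # lower MIC means higher antibacterial activity
    "CC50": (">=", 60.91),  # higher CC50 means lower cytotoxicity
    "HC50": (">=", 105.7),  # higher HC50 means lower hemolytic toxicity
}


def is_positive(me, value_um):
    """Return True when a measured value falls in the desirable range."""
    operator, cutoff = CUTOFFS_UM[me]
    if operator == "<=":
        return value_um <= cutoff
    return value_um >= cutoff


# Hypothetical measurements (illustrative only, not data from this work).
examples = [("MIC", 8.0), ("MIC", 32.0), ("CC50", 75.0), ("HC50", 90.0)]
labels = [is_positive(me, value) for me, value in examples]
```

Note that the direction of the comparison differs by measure: a peptide is desirable when its MIC is at or below the cutoff, but when its CC50 or HC50 is at or above it.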



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acscombsci.6b00063.
Configurations of the program ProtDCal (PDF)
Descriptors, averages, input model, and classification (XLSX)
Total number of peptides tested and percentage of correct classification (XLSX)
Contributions (XLSX)
DMAC2, DMAC6, DMAC3, DMAC4, Pred-BEq, and probabilities (XLSX)



AUTHOR INFORMATION

Corresponding Authors

*E-mail: [email protected]. Fax: +351 220402659. *E-mail: [email protected]. Fax: +351 220402659.



Notes

The authors declare no competing financial interest.



CONCLUSION

Recent advances in the development and application of theoretical and computational approaches have strengthened the ties between bioinformatics and allied disciplines such as chemoinformatics, enabling the virtual discovery of new and efficacious AMPs. The mtk-computational model generated in this work represents a promising alternative to achieve that goal. Our model was able to classify/predict many peptides tested against dissimilar Gram-negative bacterial strains and diverse mammalian cell types. At the same time, our model was successfully employed to design new peptides, which were predicted to be antibacterial agents with high activity and low cytotoxicity. The interpretation of the different descriptors, together with the methodology focused on the calculation of the relative quantitative contributions of the amino acids to the different biological effects of the peptides, create the

ACKNOWLEDGMENTS

The authors are grateful for the joint financial support given by the Portuguese Fundação para a Ciência e a Tecnologia (FCT/MEC) and FEDER (Projects No. UID/QUI/50006/2013 and POCI/01/0145/FEDER/007265). Prof. Juan M. Ruso also acknowledges the financial support given by MICINN-Spain (Project No. MAT2011-25501).



REFERENCES

(1) Brachman, P. S.; Abrutyn, E. Bacterial Infections of Humans: Epidemiology and Control, 4th ed.; Springer Science+Business Media, LLC: New York, NY, 2009.
(2) Ryan, K. J.; Ray, C. G. Sherris Medical Microbiology. An Introduction to Infectious Diseases, 4th ed.; McGraw-Hill Companies, Inc: AZ, 2004.
(3) Fjell, C. D.; Hiss, J. A.; Hancock, R. E.; Schneider, G. Designing antimicrobial peptides: form follows function. Nat. Rev. Drug Discovery 2012, 11, 37−51.
(4) Gorris, H. H.; Bade, S.; Rockendorf, N.; Albers, E.; Schmidt, M. A.; Franek, M.; Frey, A. Rapid profiling of peptide stability in proteolytic environments. Anal. Chem. 2009, 81, 1580−1586.
(5) Gentilucci, L.; De Marco, R.; Cerisoli, L. Chemical modifications designed to improve peptide stability: incorporation of non-natural amino acids, pseudo-peptide bonds, and cyclization. Curr. Pharm. Des. 2010, 16, 3185−3203.
(6) Chen, F.; Ma, B.; Yang, Z. C.; Lin, G.; Yang, D. Extraordinary metabolic stability of peptides containing alpha-aminoxy acids. Amino Acids 2012, 43, 499−503.
(7) Maher, S.; McClean, S. Investigation of the cytotoxicity of eukaryotic and prokaryotic antimicrobial peptides in intestinal epithelial cells in vitro. Biochem. Pharmacol. 2006, 71, 1289−1298.
(8) Maher, S.; McClean, S. Melittin exhibits necrotic cytotoxicity in gastrointestinal cells which is attenuated by cholesterol. Biochem. Pharmacol. 2008, 75, 1104−1114.
(9) Aguilera-Mendoza, L.; Marrero-Ponce, Y.; Tellez-Ibarra, R.; Llorente-Quesada, M. T.; Salgado, J.; Barigye, S. J.; Liu, J. Overlap and diversity in antimicrobial peptide databases: compiling a nonredundant set of sequences. Bioinformatics 2015, 31, 2553−2559.
(10) Jenssen, H.; Fjell, C. D.; Cherkasov, A.; Hancock, R. E. QSAR modeling and computer-aided design of antimicrobial peptides. J. Pept. Sci. 2008, 14, 110−114.
(11) Fjell, C. D.; Jenssen, H.; Hilpert, K.; Cheung, W. A.; Pante, N.; Hancock, R. E.; Cherkasov, A. Identification of novel antibacterial peptides by chemoinformatics and machine learning. J. Med. Chem. 2009, 52, 2006−2015.
(12) Cherkasov, A.; Hilpert, K.; Jenssen, H.; Fjell, C. D.; Waldbrook, M.; Mullaly, S. C.; Volkmer, R.; Hancock, R. E. Use of artificial intelligence in the design of small peptide antibiotics effective against a broad spectrum of highly antibiotic-resistant superbugs. ACS Chem. Biol. 2009, 4, 65−74.
(13) Torrent, M.; Andreu, D.; Nogues, V. M.; Boix, E. Connecting peptide physicochemical and antimicrobial properties by a rational prediction model. PLoS One 2011, 6, e16968.
(14) Mooney, C.; Haslam, N. J.; Holton, T. A.; Pollastri, G.; Shields, D. C. PeptideLocator: prediction of bioactive peptides in protein sequences. Bioinformatics 2013, 29, 1120−1126.
(15) Porto, W. F.; Pires, A. S.; Franco, O. L. CS-AMPPred: an updated SVM model for antimicrobial activity prediction in cysteine-stabilized peptides. PLoS One 2012, 7, e51444.
(16) Ng, X. Y.; Rosdi, B. A.; Shahrudin, S. Prediction of antimicrobial peptides based on sequence alignment and support vector machine-pairwise algorithm utilizing LZ-complexity. BioMed Res. Int. 2015, 2015, 212715.
(17) Khosravian, M.; Kazemi Faramarzi, F.; Mohammad Beigi, M.; Behbahani, M.; Mohabatkar, H. Predicting antibacterial peptides by the concept of Chou’s pseudo-amino acid composition and machine learning methods. Protein Pept. Lett. 2013, 20, 180−186.
(18) Lira, F.; Perez, P. S.; Baranauskas, J. A.; Nozawa, S. R. Prediction of antimicrobial activity of synthetic peptides by a decision tree model. Appl. Environ. Microbiol. 2013, 79, 3156−3159.
(19) Khamis, A. M.; Essack, M.; Gao, X.; Bajic, V. B. Distinct profiling of antimicrobial peptide families. Bioinformatics 2015, 31, 849−856.
(20) Xiao, X.; Wang, P.; Lin, W. Z.; Jia, J. H.; Chou, K. C. iAMP-2L: a two-level multi-label classifier for identifying antimicrobial peptides and their functional types. Anal. Biochem. 2013, 436, 168−177.
(21) Juretic, D.; Vukicevic, D.; Ilic, N.; Antcheva, N.; Tossi, A. Computational design of highly selective antimicrobial peptides. J. Chem. Inf. Model. 2009, 49, 2873−2882.
(22) Wang, P.; Hu, L.; Liu, G.; Jiang, N.; Chen, X.; Xu, J.; Zheng, W.; Li, L.; Tan, M.; Chen, Z.; Song, H.; Cai, Y. D.; Chou, K. C. Prediction of antimicrobial peptides based on sequence alignment and feature selection methods. PLoS One 2011, 6, e18476.

(23) Melo, M. N.; Ferre, R.; Feliu, L.; Bardaji, E.; Planas, M.; Castanho, M. A. Prediction of antibacterial activity from physicochemical properties of antimicrobial peptides. PLoS One 2011, 6, e28549.
(24) Vishnepolsky, B.; Pirtskhalava, M. Prediction of linear cationic antimicrobial peptides based on characteristics responsible for their interaction with the membranes. J. Chem. Inf. Model. 2014, 54, 1512−1523.
(25) Freire, J. M.; Almeida Dias, S.; Flores, L.; Veiga, A. S.; Castanho, M. A. Mining viral proteins for antimicrobial and cell-penetrating drug delivery peptides. Bioinformatics 2015, 31, 2252−2256.
(26) Chang, K. Y.; Lin, T. P.; Shih, L. Y.; Wang, C. K. Analysis and prediction of the critical regions of antimicrobial peptides based on conditional random fields. PLoS One 2015, 10, e0119490.
(27) Toropova, M. A.; Veselinovic, A. M.; Veselinovic, J. B.; Stojanovic, D. B.; Toropov, A. A. QSAR modeling of the antimicrobial activity of peptides as a mathematical function of a sequence of amino acids. Comput. Biol. Chem. 2015, 59, 126−130.
(28) Toropov, A. A.; Toropova, A. P.; Raska, I., Jr.; Benfenati, E.; Gini, G. QSAR modeling of endpoints for peptides which is based on representation of the molecular structure by a sequence of amino acids. Struct. Chem. 2012, 23, 1891−1904.
(29) Gupta, S.; Kapoor, P.; Chaudhary, K.; Gautam, A.; Kumar, R.; Raghava, G. P. Peptide toxicity prediction. Methods Mol. Biol. 2015, 1268, 143−157.
(30) Gupta, S.; Kapoor, P.; Chaudhary, K.; Gautam, A.; Kumar, R.; Raghava, G. P. In silico approach for predicting toxicity of peptides and proteins. PLoS One 2013, 8, e73957.
(31) Chaudhary, K.; Kumar, R.; Singh, S.; Tuknait, A.; Gautam, A.; Mathur, D.; Anand, P.; Varshney, G. C.; Raghava, G. P. A web server and mobile app for computing hemolytic potency of peptides. Sci. Rep. 2016, 6, 22843.
(32) Speck-Planche, A.; Kleandrova, V. V.; Ruso, J. M.; Cordeiro, M. N. D. S. First multitarget chemo-bioinformatic model to enable the discovery of antibacterial peptides against multiple Gram-positive pathogens. J. Chem. Inf. Model. 2016, 56, 588−598.
(33) Romero-Duran, F. J.; Alonso, N.; Yanez, M.; Caamano, O.; Garcia-Mera, X.; Gonzalez-Diaz, H. Brain-inspired cheminformatics of drug-target brain interactome, synthesis, and assay of TVP1022 derivatives. Neuropharmacology 2016, 103, 270−278.
(34) Tenorio-Borroto, E.; Penuelas-Rivas, C. G.; Vasquez-Chagoyan, J. C.; Castanedo, N.; Prado-Prado, F. J.; Garcia-Mera, X.; Gonzalez-Diaz, H. Model for high-throughput screening of drug immunotoxicity - Study of the anti-microbial G1 over peritoneal macrophages using flow cytometry. Eur. J. Med. Chem. 2014, 72, 206−220.
(35) Speck-Planche, A.; Cordeiro, M. N. D. S. Simultaneous virtual prediction of anti-Escherichia coli activities and ADMET profiles: A chemoinformatic complementary approach for high-throughput screening. ACS Comb. Sci. 2014, 16, 78−84.
(36) Speck-Planche, A.; Cordeiro, M. N. D. S. Chemoinformatics for medicinal chemistry: in silico model to enable the discovery of potent and safer anti-cocci agents. Future Med. Chem. 2014, 6, 2013−2028.
(37) Todeschini, R.; Consonni, V. Molecular Descriptors for Chemoinformatics; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, 2009.
(38) Speck-Planche, A.; Cordeiro, M. N. D. S. Simultaneous modeling of antimycobacterial activities and ADMET profiles: a chemoinformatic approach to medicinal chemistry. Curr. Top. Med. Chem. 2013, 13, 1656−1665.
(39) Gogoladze, G.; Grigolava, M.; Vishnepolsky, B.; Chubinidze, M.; Duroux, P.; Lefranc, M. P.; Pirtskhalava, M. DBAASP: database of antimicrobial activity and structure of peptides. FEMS Microbiol. Lett. 2014, 357, 63−68.
(40) Verslyppe, B.; De Smet, W.; De Baets, B.; De Vos, P.; Dawyndt, P. StrainInfo introduces electronic passports for microorganisms. Syst. Appl. Microbiol. 2014, 37, 42−50.
(41) Gaschen, B.; Kuiken, C.; Korber, B.; Foley, B. Retrieval and on-the-fly alignment of sequence fragments from the HIV database. Bioinformatics 2001, 17, 415−418.

(42) Ruiz-Blanco, Y. B.; Paz, W.; Green, J.; Marrero-Ponce, Y. ProtDCal: A program to compute general-purpose-numerical descriptors for sequences and 3D-structures of proteins. BMC Bioinf. 2015, 16, 162.
(43) Ruiz-Blanco, Y. B.; Marrero-Ponce, Y.; Garcia, Y.; Puris, A.; Bello, R.; Green, J.; Sotomayor-Torres, C. M. A physics-based scoring function for protein structural decoys: Dynamic testing on targets of CASP-ROLL. Chem. Phys. Lett. 2014, 610−611, 135−140.
(44) Ruiz-Blanco, Y. B.; Marrero-Ponce, Y.; Paz, W.; Garcia, Y.; Salgado, J. Global stability of protein folding from an empirical free energy function. J. Theor. Biol. 2013, 321, 44−53.
(45) Ruiz-Blanco, Y. B.; Garcia, Y.; Sotomayor-Torres, C. M.; Marrero-Ponce, Y. New set of 2D/3D thermodynamic indices for proteins. A formalism based on “Molten Globule” theory. Phys. Procedia 2010, 8, 63−72.
(46) Todeschini, R.; Consonni, V. Handbook of Molecular Descriptors; Wiley-VCH Verlag GmbH: Weinheim, 2000.
(47) Hill, T.; Lewicki, P. STATISTICS Methods and Applications. A Comprehensive Reference for Science, Industry and Data Mining; StatSoft: Tulsa, 2006.
(48) Prado-Prado, F.; Garcia-Mera, X.; Abeijon, P.; Alonso, N.; Caamano, O.; Yanez, M.; Garate, T.; Mezo, M.; Gonzalez-Warleta, M.; Muino, L.; Ubeira, F. M.; Gonzalez-Diaz, H. Using entropy of drug and protein graphs to predict FDA drug-target network: theoretic-experimental study of MAO inhibitors and hemoglobin peptides from Fasciola hepatica. Eur. J. Med. Chem. 2011, 46, 1074−1094.
(49) Garcia, I.; Fall, Y.; Gomez, G.; Gonzalez-Diaz, H. First computational chemistry multi-target model for anti-Alzheimer, antiparasitic, anti-fungi, and anti-bacterial activity of GSK-3 inhibitors in vitro, in vivo, and in different cellular lines. Mol. Diversity 2011, 15, 561−567.
(50) Marzaro, G.; Chilin, A.; Guiotto, A.; Uriarte, E.; Brun, P.; Castagliuolo, I.; Tonus, F.; Gonzalez-Diaz, H. Using the TOPSMODE approach to fit multi-target QSAR models for tyrosine kinases inhibitors. Eur. J. Med. Chem. 2011, 46, 2185−2192.
(51) Speck-Planche, A.; Kleandrova, V. V.; Luan, F.; Cordeiro, M. N. D. S. In silico discovery and virtual screening of multi-target inhibitors for proteins in Mycobacterium tuberculosis. Comb. Chem. High Throughput Screening 2012, 15, 666−673.
(52) Speck-Planche, A.; Luan, F.; Cordeiro, M. N. D. S. Abelson tyrosine-protein kinase 1 as principal target for drug discovery against leukemias. Role of the current computer-aided drug design methodologies. Curr. Top. Med. Chem. 2012, 12, 2745−2762.
(53) Speck-Planche, A.; Cordeiro, M. N. D. S. Multi-target QSAR approaches for modeling protein inhibitors. Simultaneous prediction of activities against biomacromolecules present in gram-negative bacteria. Curr. Top. Med. Chem. 2015, 15, 1801−1813.
(54) Speck-Planche, A.; Kleandrova, V. V.; Luan, F.; Cordeiro, M. N. D. S. Fragment-based QSAR model toward the selection of versatile anti-sarcoma leads. Eur. J. Med. Chem. 2011, 46, 5910−5916.
(55) Speck-Planche, A.; Kleandrova, V. V.; Luan, F.; Cordeiro, M. N. D. S. Rational drug design for anti-cancer chemotherapy: multi-target QSAR models for the in silico discovery of anti-colorectal cancer agents. Bioorg. Med. Chem. 2012, 20, 4848−4855.
(56) Speck-Planche, A.; Kleandrova, V. V.; Luan, F.; Cordeiro, M. N. D. S. Chemoinformatics in anti-cancer chemotherapy: multi-target QSAR model for the in silico discovery of anti-breast cancer agents. Eur. J. Pharm. Sci. 2012, 47, 273−279.
(57) Gonzalez-Diaz, H.; Prado-Prado, F. J. Unified QSAR and network-based computational chemistry approach to antimicrobials, part 1: multispecies activity models for antifungals. J. Comput. Chem. 2008, 29, 656−667.
(58) Prado-Prado, F. J.; Ubeira, F. M.; Borges, F.; Gonzalez-Diaz, H. Unified QSAR & network-based computational chemistry approach to antimicrobials. II. Multiple distance and triadic census analysis of antiparasitic drugs complex networks. J. Comput. Chem. 2010, 31, 164−173.
(59) Statsoft-Team STATISTICA. Data Analysis Software System, version 6.0; Statsoft: Tulsa, 2001.

(60) Hilpert, K.; Elliott, M. R.; Volkmer-Engert, R.; Henklein, P.; Donini, O.; Zhou, Q.; Winkler, D. F.; Hancock, R. E. Sequence requirements and an optimization strategy for short antimicrobial peptides. Chem. Biol. 2006, 13, 1101−1107.
(61) Morrison, K. L.; Weiss, G. A. Combinatorial alanine-scanning. Curr. Opin. Chem. Biol. 2001, 5, 302−307.