Article
Prediction of the radical scavenging activities of some antioxidant from their molecular structure Mohammad Hossein Fatemi, and Elham Gholami Rostami Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/ie4001426 • Publication Date (Web): 15 Jun 2013 Downloaded from http://pubs.acs.org on June 16, 2013
Industrial & Engineering Chemistry Research
Prediction of the radical scavenging activities of some antioxidant from their molecular structure
Mohammad H. Fatemi*, Elham Gholami Rostami
Chemometrics Laboratory, Faculty of Chemistry, University of Mazandaran, Babolsar, Iran
E-mail:
[email protected], Tel: +98 112 5242931 Fax: +98 112 5342350
ABSTRACT A quantitative structure–activity relationship (QSAR) study was performed on the radical scavenging activities of a set of compounds covering various families of antioxidants. Five-parameter predictive models correlating antioxidant activity with selected descriptors derived from the 2D and 3D representations of the molecules were set up using multiple linear regression (MLR) and a multilayer perceptron neural network (MLP-NN), separately. The best models had statistics of R2 = 0.968 and q2 = 0.898 for the MLP-NN model and R2 = 0.902 and q2 = 0.862 for the MLR model. The chemical applicability domains of these models were determined via the leverage approach. The results indicate that the proposed models can be successfully used to predict the radical scavenging activities of new antioxidants.
INTRODUCTION Free radicals play a crucial role in the pathogenesis of several human diseases, such as cancer, rheumatoid arthritis, and various neurodegenerative and pulmonary diseases.1 An abnormal level
of reactive oxygen species leads to damage to specific molecules, with consequential injury to cells or tissue. Many forms of cancer are thought to be the result of reactions between free radicals and DNA, resulting in mutations that can adversely affect the cell cycle, potentially leading to malignancy.2,3 The human body has a number of mechanisms to minimize free-radical-induced damage and to repair the damage that occurs. Antioxidants play a key role in these defense mechanisms. Antioxidants are chemical entities that function by breaking free radical chain reactions and by chelating metal ions, which would otherwise catalyze free-radical-induced systemic damage.4 Antioxidant activity is primarily based on three different molecular mechanisms: (a) hydrogen atom transfer (HAT), (b) single-electron transfer followed by proton transfer (SET-PT), and (c) sequential proton loss electron transfer (SPLET).5,6 Within the wide range of methods used to screen antioxidants, the Trolox equivalent antioxidant capacity (TEAC) assay is very popular.7 This assay is based on scavenging the relatively stable blue/green ABTS [2,2'-azinobis(3-ethylbenzothiazoline-6-sulfonic acid)] radical and converting it into a colorless product. The degree of decolorization reflects the amount of ABTS radical that has been scavenged and can be determined spectrophotometrically. One important class of antioxidants is the phenolic compounds. These compounds have gained interest because of their numerous beneficial health effects, which have been demonstrated over the years. These beneficial effects include anti-inflammatory, anticancer, and antiviral activities.8-10 The position of hydroxyl groups and other features in the chemical structure of phenolic compounds have significant effects on their antioxidant and free radical scavenging activities.11 Since the antioxidant activities of these chemicals depend on their structures, it is possible to predict the antioxidant activities of compounds of interest from their structural features by using quantitative structure–activity relationship (QSAR) approaches. QSAR methodology has often been used to find correlations between activity and molecular structural
descriptors of different classes of chemicals.12,13 There are several published reports on QSAR prediction of the antioxidant activities of organic compounds.14-19 For example, R. M. V. Abreu et al. built a QSAR model for the description and prediction of the radical scavenging activity of di(hetero)arylamine derivatives of benzo[b]thiophenes using the partial least squares (PLS) projection to latent structures method, which gave statistical parameters of R2 = 0.881 and q2LOO = 0.844.20 In another report, K. Roy et al. modeled a series of coumarin derivatives for their antioxidant activities based on their ability to inhibit DPPH (1,1-diphenyl-2-picrylhydrazyl) free radicals. Different QSAR approaches (a descriptor-based QSAR model, a 3D pharmacophore model, and a fragment-based QSAR model developed using the hologram QSAR technique) were employed to identify the essential structural attributes imparting a potential antioxidant activity profile to the coumarin derivatives. Their results match those of the pharmacophore analysis, which indicates the importance of the fused benzene ring and the oxygen atom of the pyran ring for capturing the hydrophobic feature and one of the hydrogen bond acceptor (HBA) features, respectively.21 Moreover, A. Pérez-Garrido et al. developed a QSAR model to predict the antioxidant activity of a heterogeneous group of chemicals by using the bond contributions method.22 The information extracted from their QSAR model revealed that the major driving forces for radical scavenging activity are hydrogen bond donation and polarity. In the present study, the main goal was to build QSAR models for predicting the radical scavenging activity of different types of antioxidants, using multiple linear regression (MLR) and an artificial neural network (ANN) as feature mapping techniques. The developed QSAR models will help to better recognize the structural features of molecules that affect their antioxidant activities and will guide the synthesis of potential new antioxidant radical scavengers.
EXPERIMENTAL Data set: The data set, shown in Table 1, consists of the names of 96 antioxidants and their experimental activities, which were determined by the in vitro TEAC assay and were recently reported by A. Pérez-Garrido et al.22 The radical scavenging activity against ABTS+ was expressed in terms of Trolox equivalent antioxidant capacity (TEAC/mol L-1). As can be seen in Table S1 (Supporting Information), various types of chemicals are found in the data set, such as flavanols, chalcones, flavones, flavonols, stilbenes and curcuminoids, coumarins, anthraquinones and naphthoquinones, flavanones, lignans, isoflavones, and tannins. The TEAC values ranged from 0.00 for flavanone to 9.18 for procyanidin B2 digallate. The compounds in the data set were divided into training, internal test, and external test sets consisting of 77, 10, and 9 members, respectively. In order to compare our results with those reported by A. Pérez-Garrido et al.,22 the training set was chosen to match theirs, and the remaining 19 molecules were subdivided into the internal and external test sets.
Molecular Descriptor: Molecular descriptors are simple mathematical representations of a molecule that are used to encode its significant structural features. In order to calculate molecular descriptors, the Hyperchem program (ver. 7)23 was used to construct all molecular structures. The molecular geometry was then optimized with the Austin Model 1 (AM1) semiempirical method using the Polak–Ribière algorithm. After geometry optimization, the Hyperchem output files were used as input to the Dragon program (ver. 3.0)24 to calculate molecular descriptors. This package can calculate various types of descriptors, such as constitutional, topological, geometrical, charge, GETAWAY (geometry, topology and atoms-weighted assembly), WHIM (weighted holistic invariant molecular descriptors), 3D-MoRSE (3D-molecular representation of structure based on electron diffraction), molecular walk counts,
BCUT descriptors, 2D-autocorrelations, aromaticity indices, Randic molecular profiles, radial distribution functions, functional groups, and atom-centered fragments.25 During model development, great care was taken to avoid the inclusion of highly collinear molecular descriptors. Collinear descriptors encode similar molecular information; therefore, the descriptors were screened to eliminate those with low variation and those that encoded similar information (pairs with an absolute Pearson correlation coefficient above 0.9). The most significant descriptors were then selected from the remaining pool by the forward selection method. A simple technique to control model expansion is the 'break point' procedure. In this method, the improvement in the statistical quality of the models is analyzed by plotting the adjusted squared correlation coefficient (R2u) of the obtained models versus the number of descriptors involved in each model. The model corresponding to the break point is considered the optimum model. As can be seen in Figure 1, the 'break point' procedure led to the conclusion that the best model had five parameters. These five descriptors were used for developing the linear and nonlinear models. Table S2 (Supporting Information) shows the correlation matrix between these descriptors. As can be seen in this table, there are no highly correlated pairs (R

$$q^2 > 0.5 \tag{4}$$

$$r^2 > 0.6 \tag{5}$$

$$\frac{r^2 - r_0^2}{r^2} < 0.1 \quad \text{or} \quad \frac{r^2 - r_0^{\prime 2}}{r^2} < 0.1 \tag{6}$$

$$0.85 \le k \le 1.15 \quad \text{or} \quad 0.85 \le k' \le 1.15 \tag{7}$$

$$\left| r_0^2 - r_0^{\prime 2} \right| < 0.3 \tag{8}$$

The definitions of the above parameters are given in ref 36 and are not repeated here for brevity.
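As an illustration of how conditions (4)-(8) can be applied once the statistics are available, the following minimal sketch checks a set of precomputed values against the thresholds. The function name and argument names are hypothetical, the statistics themselves are assumed to have been computed elsewhere (their definitions are in ref 36), and condition (8) is assumed to be an upper bound of 0.3. The example values are those reported for the MLP-NN model in the Results and Discussion.

```python
def passes_external_validation(q2, r2, r0_ratio, r0p_ratio, k, k_prime, r0_diff):
    """Check precomputed validation statistics against conditions (4)-(8).

    r0_ratio  = (r2 - r0^2) / r2
    r0p_ratio = (r2 - r0'^2) / r2
    r0_diff   = |r0^2 - r0'^2|
    Returns (overall_pass, per_condition_results).
    """
    checks = {
        "eq4": q2 > 0.5,
        "eq5": r2 > 0.6,
        "eq6": r0_ratio < 0.1 or r0p_ratio < 0.1,
        "eq7": 0.85 <= k <= 1.15 or 0.85 <= k_prime <= 1.15,
        "eq8": r0_diff < 0.3,  # assumption: condition (8) is an upper bound
    }
    return all(checks.values()), checks


# Statistics reported in this paper for the MLP-NN external validation.
ok, detail = passes_external_validation(
    q2=0.898, r2=0.961, r0_ratio=0.015, r0p_ratio=0.019,
    k=0.971, k_prime=1.000, r0_diff=0.004,
)
print(ok, detail)
```

Returning the per-condition dictionary alongside the overall flag makes it easy to see which criterion a model fails, rather than only that it fails.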
The statistical significance of the relationship between activity and the descriptors can be further checked by a randomization test (Y-randomization). This test was carried out to rule out the possibility of chance correlation.37 The Y-randomization technique scrambles the Y-column data while keeping the descriptor matrix (X-matrix) unchanged. Each time, a model is built using the scrambled data and its correlation coefficient is calculated. Two types of randomization techniques were used in this work, namely process randomization and model randomization. In process randomization (at the 90% confidence level), the values of the dependent variable are randomly scrambled and variable selection is performed afresh from the whole descriptor matrix. In model randomization (at the 99% confidence level), the Y-column entries are scrambled and new QSAR models are developed using the same set of variables as in the unrandomized model. For an acceptable QSAR model, the average correlation coefficient of the randomized models (Rr) should be less than the correlation coefficient of the non-randomized model (R). To quantify whether the difference between R2 and R2r is large enough to support the reliability of the developed QSAR model, we used another parameter, R2p.38,39 The R2p parameter penalizes the model R2 for small differences between R2 and R2r and is calculated as:

$$R_p^2 = R^2 \sqrt{R^2 - R_r^2} \tag{9}$$

The Rp2 parameter ensures that the developed models are not obtained by chance. For an acceptable model, the value of Rp2 should be greater than 0.5. However, in the ideal case, the average value of R2 for the randomized models should be zero, i.e., Rr2 should be zero. In such a case, the value of Rp2 should be equal to the value of R2 for the
developed QSAR model. Thus, the corrected formula of R2p (cR2p), as proposed by Todeschini,40 is given by:

$$^{c}R_p^2 = R \sqrt{R^2 - R_r^2} \tag{10}$$
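The model-randomization idea and the two penalized parameters can be sketched as follows. This is a toy illustration only: it uses a single hypothetical descriptor and the squared Pearson correlation in place of the five-descriptor models of this work, and it assumes the commonly used forms Rp2 = R2·sqrt(R2 − Rr2) and cRp2 = R·sqrt(R2 − Rr2). All function names and the data are invented for the example.

```python
import math
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def mean_randomized_r2(x, y, n_trials=100, seed=0):
    """Average R2 after scrambling y while keeping x fixed (model randomization)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        y_perm = list(y)
        rng.shuffle(y_perm)
        total += r_squared(x, y_perm)
    return total / n_trials


def rp2(r2, r2_rand):
    """Assumed form of eq 9; the max() guards against a negative difference."""
    return r2 * math.sqrt(max(r2 - r2_rand, 0.0))


def c_rp2(r2, r2_rand):
    """Assumed form of eq 10 (corrected parameter)."""
    return math.sqrt(r2) * math.sqrt(max(r2 - r2_rand, 0.0))


# Hypothetical one-descriptor data with a strong linear trend.
x = list(range(1, 21))
y = [2 * v + 0.5 * (-1) ** v for v in x]

r2_true = r_squared(x, y)
r2_rand = mean_randomized_r2(x, y)
print(r2_true, r2_rand, rp2(r2_true, r2_rand), c_rp2(r2_true, r2_rand))
```

With Rr2 = 0 the corrected parameter reduces to R2 itself, whereas the uncorrected one gives R2·R, which is the motivation for the correction stated in the text.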
RESULTS AND DISCUSSION As noted earlier, the quantitative relation between the antioxidant activities and the structural descriptors was investigated using linear and nonlinear models. The MLR model obtained using the descriptors selected by the break point procedure is shown in Table S3 (Supporting Information). External validation of this five-parameter model gave standard errors of prediction of 0.621 and 0.596 for the training and test sets, respectively. The antioxidant activities calculated by this model for all molecules in the data set are shown in Table 1. Moreover, the robustness and predictive power of this model were validated by the leave-one-out (LOO) cross-validation procedure, which gave Q2 = 0.862 and SPRESS = 0.616. Other statistical parameters of this QSAR model are: = 0.851, ∆ = 0.046, R2pre = 0.887, = 0.752, ∆ = 0.099, = 0.843, ∆ = 0.064, and Q2ext(F1) = 0.834.
Neural Network Modeling: The mediocre statistical parameters of the MLR model prompted us to apply an artificial neural network to investigate nonlinear relationships between the molecular structural descriptors and the antioxidant activities of the molecules of interest. This ANN had five nodes in the input layer and one node in the output layer. The number of nodes in the hidden layer was optimized by varying the number of hidden neurons from 1 to 10,
and selecting the value that gave the lowest prediction error of the network, which was found to be five. The resulting 5:5:1 network was then trained on the training set to optimize its weights and bias values. Finally, the developed multilayer perceptron network was used to calculate the antioxidant activity of the molecules in the training, internal test, and external test sets (Table 1). The standard errors of prediction of this model were 0.333, 0.523, and 0.550 for the training, internal test, and external test sets, respectively. Figure 2 shows the plot of experimental versus calculated TEAC values for the MLP-NN model. The good agreement between the calculated and experimental TEAC values (R2train = 0.968, R2int = 0.927, R2ext = 0.938) reveals the suitability of the developed model. In addition, the r2m value for the data set is 0.865. The residuals of the predicted TEAC values are plotted against the experimental values in Figure 3. The scatter of the residuals on both sides of the zero line indicates that no systematic error exists in the developed MLP-NN model. The statistical values for the external validation set used in MLP-NN modeling were q2 = 0.898, r2 = 0.961, [(r2 - r02)/r2] = 0.015, [(r2 - r'02)/r2] = 0.019, k = 0.971, k' = 1.000, and |r02 - r'02| = 0.004. These values satisfy the limits described earlier, demonstrating once again the high predictive ability of the MLP-NN model. Comparison of these statistics with those obtained for the MLR model reveals the superiority of the nonlinear model over the linear one. Moreover, the standard errors and R2 values of the QSAR model developed by A. Pérez-Garrido et al. on this data set were SEtrain = 0.589, SEtest = 0.483, R2train = 0.907, and R2test = 0.903. Inspection of these values indicates the superiority of our MLP-NN model over that reported by A. Pérez-Garrido et al.22 In the case of the MLR model, the values of cR2p were 0.832 and 0.841 for process and model randomization, respectively, while for the MLP-NN model these values were 0.925 and 0.915, respectively. These values indicate that the good results of our original models are not due to chance correlation or structural dependency of the data set.
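The hidden-layer optimization described above (scanning 1 to 10 hidden neurons and keeping the size with the lowest prediction error) amounts to a simple selection loop, sketched below. The function `select_hidden_nodes` and the error curve `toy_error` are hypothetical: in practice the error function would train a 5:h:1 network and return its prediction error, which is not reproduced here.

```python
def select_hidden_nodes(candidates, prediction_error):
    """Return (best_size, errors), where best_size has the lowest prediction error."""
    errors = {h: prediction_error(h) for h in candidates}
    best = min(errors, key=errors.get)
    return best, errors


def toy_error(h):
    # Hypothetical stand-in for "train a 5:h:1 network and evaluate it";
    # the curve is invented so that the minimum falls at h = 5.
    return 0.5 + 0.02 * (h - 5) ** 2


best, errors = select_hidden_nodes(range(1, 11), toy_error)
print(best)
```

Evaluating the error on compounds not used for weight optimization (for example, the internal test set) is one way to keep this selection from simply favoring the largest network; the text does not state which set was used.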
According to the results of the sensitivity analysis on the MLP-NN model, the order of importance of the descriptors was C-026 > Mor19u > MATS2v > GATS5p > MATS4e. The names and meanings of these descriptors are shown in Table S4 (Supporting Information). All of these descriptors encode topological and electronic aspects of the chemicals, which affect their antioxidant activities. A detailed description of these descriptors can be found in the Handbook of Molecular Descriptors.41 The most important descriptor is C-026, which belongs to the atom-centered fragments and accounts for the R--CX--R functionality. This group of descriptors describes the size and position of heteroatoms in molecules. As mentioned earlier, the presence and position of OH groups on the benzene ring have an important influence on radical scavenging activities. The value of C-026 encodes this information about the OH group. This descriptor has a positive effect on antioxidant activity. This observation is in agreement with that reported by I. Mitra et al.42 The increase in antioxidant activity with an increasing number of hydroxyl substituents can be explained by the HAT43 mechanism of antioxidant action: the antioxidant donates a hydrogen atom to neutralize the unpaired electron of the free radical.
Applicability Domain analysis: Before a QSAR model is put into use for screening chemicals, its applicability domain (AD) must be defined.44 A simple measure of a chemical being too far from the applicability domain of the model is its leverage hi, which is defined as:
$$h_i = x_i^{T} \left( X^{T} X \right)^{-1} x_i \qquad (i = 1, \ldots, n) \tag{11}$$
where xi is the descriptor row-vector of the query compound and X is the n × (p − 1) matrix of descriptor values for the n training set compounds, p being the number of model parameters. The superscript T refers to the transpose of the matrix/vector. The warning leverage h* is generally fixed at 3p/n, where n is the number of training compounds. To visualize the AD of a QSAR model, the standardized residuals versus
leverage (hat diagonal) values (hi) were plotted (Figure 4) for immediate and simple graphical detection of both the response outliers (i.e., compounds with standardized residuals greater than three standard deviation units, > 3σ) and the structurally influential chemicals in the model (h > h*). As can be seen from this figure, all predictions were reliable except for compound number 1 in the training set (procyanidin B2 digallate), whose leverage exceeds the cutoff value of h* = 0.233. The abnormal behavior of this compound could be due to incorrect experimental input data or to a different antioxidant activity mechanism.
CONCLUSION Linear and nonlinear QSAR models were developed to predict the antioxidant activity of a heterogeneous group of substances using five molecular descriptors that take into account 2D and 3D aspects of the molecular structure. The descriptors that appear in the QSAR models provide information on different molecular aspects that can participate in the intermolecular interactions affecting the antioxidant activity of chemicals. The good agreement between the experimental TEAC values and those predicted by the ANN model confirms the validity of the obtained QSAR model. The results of this work indicate that the MLP-NN model exhibits reasonable prediction capability and is superior to the MLR model as well as to the model developed by A. Pérez-Garrido et al. The developed QSAR model points to the important contribution of hydroxyl groups attached to the phenyl ring to radical scavenging activity, because antioxidant compounds can act as hydrogen atom donors through this functional group. Additionally, the structural information on the free radical scavenging activity of antioxidants could provide deeper insight into the mechanisms of untested compounds.
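As an illustration of the leverage-based applicability domain check described in the Results and Discussion, the sketch below computes h = x^T (X^T X)^-1 x for query compounds using only the Python standard library. The helper names and the small descriptor matrix are hypothetical. Taking p = 6 (five descriptors plus one) together with n = 77 is an inference that reproduces the reported h* of about 0.233; it is not stated explicitly in the text.

```python
def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def leverages(X_train, X_query):
    """Leverage h = x^T (X^T X)^-1 x for each query row x."""
    k = len(X_train[0])
    xtx = [[sum(row[i] * row[j] for row in X_train) for j in range(k)]
           for i in range(k)]
    result = []
    for x in X_query:
        z = solve(xtx, list(x))  # z = (X^T X)^-1 x
        result.append(sum(xi * zi for xi, zi in zip(x, z)))
    return result


def warning_leverage(p, n):
    """Warning leverage h* = 3p/n."""
    return 3.0 * p / n


# Hypothetical training matrix: 4 compounds, 2 descriptors.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.5]]
h_train = leverages(X, X)
h_far = leverages(X, [[40.0, 35.0]])[0]  # query far outside the training range
h_star = warning_leverage(6, 77)         # assumed p = 6, n = 77
print(h_train, h_far, h_star)
```

The training leverages sum to the number of descriptor columns, which is a quick sanity check on the implementation; a query compound would be flagged as outside the domain when its leverage exceeds h*.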
ASSOCIATED CONTENT Supporting Information Additional tables. These tables are available free of charge via the Internet at http://pubs.acs.org.
REFERENCES
(1) Halliwell, B.; Gutteridge, J. M. C. Free Radicals in Biology and Medicine; Oxford University Press: New York, 1999.
(2) Halliwell, B.; Gutteridge, J. M. C. Free Radicals in Biology and Medicine; Oxford University Press: New York, 1985.
(3) Cooke, M. S.; Evans, M. D.; Dizdaroglu, M.; Lunec, J. Oxidative DNA damage: mechanisms, mutation, and disease. FASEB J. 2003, 17, 1195-1214.
(4) Gutteridge, J. M. C.; Halliwell, B. Antioxidants in Nutrition, Health and Disease; Oxford University Press: Oxford, 1994.
(5) Wright, J. S.; Johnson, E. R.; DiLabio, G. A. Predicting the activity of phenolic antioxidants: theoretical method, analysis of substituent effects, and application to major families of antioxidants. J. Am. Chem. Soc. 2001, 123, 1173-1183.
(6) Musialik, M.; Litwinienko, G. Scavenging of dpph• radicals by vitamin E is accelerated by its partial ionization: the role of sequential proton loss electron transfer. Org. Lett. 2005, 7, 4951-4954.
(7) Arts, M. J. T. J.; Dallinga, J. S.; Voss, H.-P.; Haenen, G. R. M. M.; Bast, A. A critical appraisal of the use of the antioxidant capacity (TEAC) assay in defining optimal antioxidant structures. Food Chem. 2003, 80, 409-414.
(8) Chung, K. T.; Wong, T. Y.; Huang, Y. W.; Lin, Y. Tannins and human health. Crit. Rev. Food Sci. 1998, 38, 421-464.
(9) Cassidy, A.; Hanley, B.; Lamuela-Raventos, R. M. Isoflavones, lignans and stilbenes - origins, metabolism and potential importance to human health. J. Sci. Food Agr. 2000, 80, 1044-1062.
(10) Tapiero, H.; Tew, K. D.; Ba, N.; Mathe, G. Polyphenols: do they play a role in the prevention of human pathologies? Biomed. Pharmacother. 2002, 56, 200-207.
(11) Kruzlicova, D.; Danihelova, M.; Veverka, M. Quantitative Structure-Antioxidant Activity Relationship of Quercetin and its New Synthetised Derivatives. Nova Biotechnologica et Chimica 2012, 11, 37-44.
(12) Kontogiorgis, A. C.; Pontiki, A. E.; Hadjipavlou-Litina, D. A Review on Quantitative Structure-Activity Relationships (QSARs) of Natural and Synthetic Antioxidants Compounds. Mini-Rev. Med. Chem. 2005, 5, 563-574.
(13) Quintero, F. A.; Patel, S. J.; Muñoz, F.; Mannan, M. S. Review of Existing QSAR/QSPR Models Developed for Properties Used in Hazardous Chemicals Classification System. Ind. Eng. Chem. Res. 2012, 51, 16101-16115.
(14) Mitra, I.; Saha, A.; Roy, K. Predictive chemometric modeling of DPPH free radical scavenging activity of azole derivatives using 2D- and 3D-quantitative structure-activity relationship tools. Future Med. Chem. 2013, 5, 261-280.
(15) Jing, P.; Zhao, S.-J.; Jian, W.-J.; Qian, B.-J.; Dong, Y.; Pang, J. Quantitative studies on Structure-DPPH• scavenging activity relationships of food phenolic acids. Molecules 2012, 11, 12910-12924.
(16) Sarkar, A.; Middya, T. R.; Jana, A. D. A QSAR study of radical scavenging antioxidant activity of a series of flavonoids using DFT based quantum chemical descriptors - The importance of group frontier electron density. J. Mol. Model. 2012, 18, 2621-2631.
(17) Ghinet, A.; Farce, A.; Oudir, S.; Pommery, J.; Vamecq, J.; Hénichart, J.-P.; Rigo, B.; Gautret, P. Antioxidant activity of new benzo[de]quinolines and lactams: 2D-Quantitative structure-activity relationships. J. Med. Chem. 2012, 8, 942-946.
(18) Gupta, S.; Matthew, S.; Abreu, P. M.; Aires-de-Sousa, J. QSAR analysis of phenolic antioxidants using MOLMAP descriptors of local properties. Bioorg. Med. Chem. 2006, 14, 1199-1206.
(19) Amic, D.; Davidovic-Amic, D.; Beslo, D.; Rastija, V.; Lucic, B.; Trinajstic, N. SAR and QSAR of the antioxidant activity of flavonoids. Curr. Med. Chem. 2007, 14, 827-845.
(20) Abreu, R. M. V.; Ferreira, I. C. F. R.; Queiroz, M. J. R. P. QSAR model for predicting radical scavenging activity of di(hetero)arylamines derivatives of benzo[b]thiophenes. Eur. J. Med. Chem. 2009, 44, 1952-1958.
(21) Mitra, I.; Saha, A.; Roy, K. Predictive modeling of antioxidant coumarin derivatives using multiple approaches: Descriptor-based QSAR, 3D-pharmacophore mapping, and HQSAR. Sci. Pharm. 2013, 81, 57-80.
(22) Pérez-Garrido, A.; Helguera, A. M.; Ruiz, J. M. M.; Rentero, P. Z. Topological substructural molecular design approach: Radical scavenging activity. Eur. J. Med. Chem. 2012, 49, 86-94.
(23) HyperChem Release 7.0 for Windows; Hypercube, Inc., 2002.
(24) http://www.disat.unimib.it/chem.
(25) Todeschini, R.; Consonni, V. Molecular Descriptors for Chemoinformatics; Wiley-VCH: Weinheim, Germany, 2009.
(26) Liu, P.; Long, W. Current Mathematical Methods Used in QSAR/QSPR Studies. Int. J. Mol. Sci. 2009, 10, 1978-1998.
(27) Niculescu, S. P. Artificial neural networks and genetic algorithms in QSAR. J. Mol. Struct. 2003, 622, 71-83.
(28) Beal, M. T.; Hagan, H. B.; Demuth, M. Neural Network Design; PWS: Boston, 1996.
(29) Konoz, E.; Fatemi, M. H.; Faraji, R. Prediction of Kovats Retention Indices of Some Aliphatic Aldehydes and Ketones on Some Stationary Phases at Different Temperatures Using Artificial Neural Network. J. Chromatogr. Sci. 2008, 46, 1-7.
(30) http://www.statsoft.com/
(31) Lahmiri, S. A comparative study of backpropagation algorithm in financial prediction. TIJCSA 2001, 1, 15-21.
(32) Roy, P. P.; Roy, K. On some aspects of variable selection for partial least squares regression models. QSAR Comb. Sci. 2008, 27, 302-313.
(33) Roy, K.; Mitra, I.; Kar, S.; Ojha, P. K.; Das, R. N.; Kabir, H. Comparative studies on some metrics for external validation of QSPR models. J. Chem. Inf. Model. 2012, 52, 396-408.
(34) Ojha, P. K.; Mitra, I.; Das, R.; Roy, K. Further exploring rm2 metrics for validation of QSPR models. Chemom. Intell. Lab. Syst. 2011, 107, 194-205.
(35) Tropsha, A. Best practices for QSAR model development, validation, and exploitation. Mol. Inf. 2010, 29, 476-488.
(36) Roy, K.; Chakraborty, P.; Mitra, I.; Ojha, P. K.; Kar, S.; Das, R. N. Some Case Studies on Application of "rm2" Metrics for Judging Quality of Quantitative Structure–Activity Relationship Predictions: Emphasis on Scaling of Response Data. J. Comput. Chem. 2013, 34, 1071-1082.
(37) Walczak, B.; Massart, D. L. Robust principal components regression as a detection tool for outliers. Chemom. Intell. Lab. Syst. 1995, 27, 41-54.
(38) Mitra, I.; Saha, A.; Roy, K. Quantitative structure–activity relationship modeling of antioxidant activities of hydroxybenzalacetones using quantum chemical, physicochemical and spatial descriptors. Chem. Biol. Drug Des. 2009, 73, 526-536.
(39) Roy, P. P.; Paul, S.; Mitra, I.; Roy, K. On two novel parameters for validation of predictive QSAR models. Molecules 2009, 14, 1660-1701.
(40) Todeschini, R. Milano Chemometrics, University of Milano-Bicocca, Milano, Italy. Personal communication, 2010.
(41) Todeschini, R.; Consonni, V. Handbook of Molecular Descriptors; Wiley-VCH: Weinheim, 2000.
(42) Mitra, I.; Saha, A.; Roy, K. Chemometric modeling of free radical scavenging activity of flavone derivatives. Eur. J. Med. Chem. 2010, 45, 5071-5079.
(43) Wright, J. S.; Johnson, E. R.; DiLabio, G. A. Predicting the activity of phenolic antioxidants: theoretical method, analysis of substituent effects, and application to major families of antioxidants. J. Am. Chem. Soc. 2001, 123, 1173-1183.
(44) Tropsha, A.; Gramatica, P.; Gombar, V. K. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 2003, 22, 69-77.
Table 1. Compound names with experimental and predicted TEAC values (mol L⁻¹)

No.  Set*  Name                          TEACexp  TEACANN  TEACMLR
1          Procyanidin B2 digallate        9.180    8.841    9.868
2          Procyanidin C1                  8.290    8.457    7.586
3          Corilagin                       7.760    7.846    7.110
4    int   Procyanidin B1                  6.140    6.409    5.017
5          (-)-Epigallocatechin gallate    5.950    5.994    5.683
6          (-)-Epicatechin gallate         5.270    5.344    5.031
7          Quercetin                       4.420    4.530    3.517
8    ext   (-)-Epigallocatechin            3.710    1.581    3.274
9          Gallic acid                     3.520    3.610    2.678
10         (-)-Epicatechin                 3.060    2.892    2.410
11         Morin                           2.680    2.671    2.867
12         Baicalein                       2.560    2.131    1.385
13         Piceatannol                     2.530    2.267    2.772
14         Butein                          2.420    2.662    2.927
15         Quercetin 3 glucoside           2.390    2.711    2.511
16         Esculetin                       2.380    2.152    1.826
17         Curcumin                        2.240    1.403    1.611
18         Quercitrin                      2.180    1.814    1.486
19         Luteolin                        2.180    2.170    2.295
20         Resveratrol                     2.140    2.374    1.700
21         Rutin                           2.020    1.605    1.836
22         para-Coumaric acid              1.960    1.270    1.132
23         Sappanchalcone                  1.930    1.699    2.130
24         Purpurin                        1.930    1.302    1.495
25         Ferulic acid                    1.920    1.883    1.070
26         Phloretin                       1.790    1.894    1.677
27   int   Demethoxycurcumin               1.630    0.978    1.264
28         Piceatannol-3'-glucoside        1.620    1.883    1.934
29         Pseudopurpurin                  1.620    1.640    1.152
30   ext   Kaempferol                      1.590    1.706    2.459
31         Chlorogenic acid                1.560    0.716    1.208
32   int   Quercetin 3 glucoside 7 rham    1.560    2.690    0.967
33         Baicalin                        1.550    1.095    1.223
34         Isoferulic acid                 1.530    1.536    1.130
35   ext   Cinaroside                      1.470    0.544    1.804
36   int   Carthamin                       1.430    0.366    2.151
37         Syringic acid                   1.390    1.408    0.513
38   ext   Resveratrol-3-glucoside         1.350   -0.077    1.003
39         Caffeic acid                    1.310    0.551    2.183
40   int   2,4-Hydroxybenzoic acid         1.220    0.660    0.729
41         Bisdemethoxycurcumin            1.180    1.326    0.862
42   ext   Protocatechuic acid             1.150    0.321    1.775
43         Galangin                        1.120    1.197    1.422
44         Alizarin                        1.070    0.900    1.011
45         ortho-Coumaric acid             0.930    0.883    1.084
46         meta-Coumaric acid              0.820    0.993    1.237
47         Flavonol                        0.707    0.828    0.583
48         Resveratrol-4'-glucoside        0.558    0.744    1.114
49   int   Quinizarin                      0.548    4.475    0.861
50         Hesperetin                      0.403    1.178    1.312
51   ext   Scopoletin                      0.383    2.444    0.392
52         Secoisolariciresinol            0.308   -0.070    0.420
53         Matairesinol                    0.253    0.312    0.928
54         Naringenin                      0.217    0.563    0.828
55         Vitexin                         0.216    0.297   -0.435
56   int   Magnolol                        0.209    1.499    0.682
57         Esculetin-6-glucoside           0.164    0.535    0.743
58         Astragalin                      0.138    0.398    0.625
59         Shikonin                        0.124    0.048   -0.785
60         Genistein                       0.123    0.073    0.795
61   ext   Acetylshikonin                  0.107    1.995   -0.578
62         Alizarin-2-glucoside            0.105    0.408    0.041
63         Juglone                         0.105   -0.047    0.728
64         Hesperidin                      0.104    0.061    0.416
65         Arctigenin                      0.104    0.571    1.344
66         Daidzein                        0.101   -0.307    0.742
67   int   Naringin                        0.098    1.713   -0.405
68   ext   Glycitein                       0.097    0.226    0.806
69   int   Emodin                          0.095    0.008    0.852
70         Vanillic acid                   0.092    0.168    0.593
71         Apigenin                        0.086    0.063    0.795
72         Apigetrin                       0.083    0.363    0.712
73   ext   Chrysin                         0.081    0.571    0.055
74         Genistin                        0.077   -0.084    0.498
75         Aloe-emodin                     0.077   -0.230    0.382
76         Rhein                           0.076    0.226    1.068
77         1,5-Dihydroxyanthraquinone      0.076    0.550    0.877
78         Ruberythric acid                0.073   -0.039   -0.611
79         Daidzin                         0.072   -0.004    0.049
80         2,6-Dihydroxyanthraquinone      0.072    0.755    1.276
81         Chrysophanol                    0.069   -0.427    0.062
82         Physcion                        0.068    0.098   -0.105
83         Chrysazine                      0.068    0.523    0.855
84   int   ortho-Hydroxybenzoic acid       0.037    0.728   -0.266
85         para-Hydroxybenzoic acid        0.028    0.660    0.729
86         meta-Hydroxybenzoic acid        0.025    0.654    1.003
87         Anthraquinone                   0.009    0.209   -0.431
88         trans-Cinnamic acid             0.007    0.090    0.313
89         Benzoic acid                    0.005    0.076   -0.085
90         Isoflavone                      0.005   -0.492   -0.907
91         5-Methoxyfuranocoumarin         0.003    0.375    0.291
92         Flavone                         0.003   -0.149   -0.592
93         trans-Stilbene                  0.002   -0.086   -0.470
94         Coumarin                        0.001    0.062    0.553
95         trans-Chalcone                  0.001    0.435   -0.287
96         Flavanone                       0.000    0.109   -1.041

* int and ext denote compounds used in the internal and external test sets, respectively.
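As a rough illustration of how the two prediction columns in Table 1 can be compared against experiment, the Python sketch below computes residuals and a root-mean-square error for the first six rows of the table. The choice of RMSE, the sign convention (experimental minus predicted), and the names `residuals` and `rmse` are assumptions made for this example; they are not necessarily the statistics or procedures used in the paper.

```python
import math

# First six rows of Table 1 (compounds 1-6), values copied as printed.
teac_exp = [9.180, 8.290, 7.760, 6.140, 5.950, 5.270]
teac_ann = [8.841, 8.457, 7.846, 6.409, 5.994, 5.344]
teac_mlr = [9.868, 7.586, 7.110, 5.017, 5.683, 5.031]


def residuals(observed, predicted):
    """Return observed minus predicted per compound (assumed sign convention)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [o - p for o, p in zip(observed, predicted)]


def rmse(observed, predicted):
    """Root-mean-square error over the supplied compounds."""
    res = residuals(observed, predicted)
    return math.sqrt(sum(r * r for r in res) / len(res))


if __name__ == "__main__":
    print(f"RMSE (ANN, rows 1-6): {rmse(teac_exp, teac_ann):.3f}")
    print(f"RMSE (MLR, rows 1-6): {rmse(teac_exp, teac_mlr):.3f}")
```

On these six rows the ANN column lies closer to the experimental values than the MLR column; the subset is too small to say anything about the full data set or the test sets.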
Figure 1. Variation of R2u with the number of descriptors in the MLR model. [Plot: R2u (y-axis, 0.700 to 1.000) versus number of descriptors (x-axis, 0 to 15).]
Figure 2. Plot of MLP-NN calculated versus experimental values of TEAC. [Plot: TEACcal (y-axis) versus TEACexp (x-axis), both from -1 to 9; series: training set, internal test set, external test set.]
Figure 3. Plot of predicted residuals versus experimental values of TEAC. [Plot: residuals (y-axis, -1.5 to 1.5) versus TEACexp (x-axis, -1 to 9); series: training set, internal test set, external test set.]
Figure 4. Plot of standardized residuals versus hat values, with a warning leverage of h* = 0.233 (Williams plot). [Plot: standardized residuals (y-axis, -4 to 4) versus leverage (x-axis, 0 to 0.4); series: training set, internal test set, external test set.]
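The hat values plotted in a Williams plot are the diagonal elements of the hat matrix, h_i = x_i^T (X^T X)^-1 x_i, computed from the descriptor matrix of the training set. The Python sketch below shows one way to compute them and to flag compounds above a warning leverage. It assumes the common convention h* = 3(p + 1)/n, where p is the number of descriptors and n the number of training compounds; this section does not state which formula or which counts were used, so the function names and the example values n = 77 and p = 5 are hypothetical (they would give about 0.234, close to the reported 0.233).

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[float(v) for v in matrix[i]] + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot_row][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def hat_diagonal(descriptor_rows):
    """Leverage of each compound; an intercept column is prepended to X."""
    X = [[1.0] + [float(v) for v in row] for row in descriptor_rows]
    k = len(X[0])
    # Normal matrix X^T X (k x k) and its inverse.
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    inv = invert(xtx)
    return [sum(x[i] * inv[i][j] * x[j] for i in range(k) for j in range(k))
            for x in X]


def warning_leverage(n_train, n_descriptors):
    """Assumed convention: h* = 3(p + 1)/n."""
    return 3.0 * (n_descriptors + 1) / n_train


def flag_high_leverage(hat_values, threshold):
    """Indices of compounds whose leverage exceeds the threshold."""
    return [i for i, h in enumerate(hat_values) if h > threshold]


if __name__ == "__main__":
    # Toy single-descriptor data, not taken from the paper.
    h = hat_diagonal([[0.0], [1.0], [2.0], [3.0]])
    print([round(v, 3) for v in h])
    print(round(warning_leverage(77, 5), 4))  # hypothetical n and p
```

A useful sanity check is that the leverages sum to the number of fitted parameters (descriptors plus intercept) whenever the descriptor matrix has full column rank.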