Prediction of the Radical Scavenging Activities of ... - ACS Publications

Jun 15, 2013 - Chemometrics Laboratory, Faculty of Chemistry, University of Mazandaran, Babolsar, Iran. Ind. Eng. Chem. Res., 2013, 52 (28), pp 9525–...

Article

Prediction of the radical scavenging activities of some antioxidant from their molecular structure Mohammad Hossein Fatemi, and Elham Gholami Rostami Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/ie4001426 • Publication Date (Web): 15 Jun 2013 Downloaded from http://pubs.acs.org on June 16, 2013


Industrial & Engineering Chemistry Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Industrial & Engineering Chemistry Research

Prediction of the radical scavenging activities of some antioxidant from their molecular structure
Mohammad H. Fatemi*, Elham Gholami Rostami
Chemometrics Laboratory, Faculty of Chemistry, University of Mazandaran, Babolsar, Iran
E-mail: [email protected], Tel: +98 112 5242931, Fax: +98 112 5342350

ABSTRACT
A quantitative structure–activity relationship (QSAR) study was performed on the radical scavenging activities of a set of compounds drawn from various families of antioxidants. Five-parameter predictive models, correlating selected descriptors derived from the 2D and 3D representations of the molecules with antioxidant activity, were built separately using multiple linear regression (MLR) and a multilayer perceptron neural network (MLP-NN). The best models had statistics of R² = 0.968 and q² = 0.898 for the MLP-NN model and R² = 0.902 and q² = 0.862 for the MLR model. The chemical applicability domains of these models were determined via the leverage approach. The results indicate that the proposed models can be used successfully to predict the radical scavenging activities of new antioxidants.

INTRODUCTION
Free radicals play a crucial role in the pathogenesis of several human diseases, such as cancer, rheumatoid arthritis, and various neurodegenerative and pulmonary diseases.1 An abnormal level

1 ACS Paragon Plus Environment


of reactive oxygen species damages specific molecules, with consequent injury to cells or tissue. Many forms of cancer are thought to result from reactions between free radicals and DNA, which produce mutations that can adversely affect the cell cycle and potentially lead to malignancy.2,3 The human body has a number of mechanisms to minimize free-radical-induced damage and to repair the damage that does occur. Antioxidants play a key role in these defense mechanisms. Antioxidants are chemical entities that act by breaking free radical chain reactions and by chelating metal ions, which would otherwise catalyze free-radical-induced systemic damage.4 Antioxidant activity is primarily based on three molecular mechanisms: (a) hydrogen atom transfer (HAT), (b) single-electron transfer followed by proton transfer (SET-PT), and (c) sequential proton loss electron transfer (SPLET).5,6 Among the wide range of methods used to screen antioxidants, the Trolox equivalent antioxidant capacity (TEAC) assay is very popular.7 This assay is based on scavenging of the relatively stable blue/green ABTS [2,2'-azinobis(3-ethylbenzothiazoline-6-sulfonic acid)] radical, which is converted into a colorless product. The degree of decolorization reflects the amount of ABTS radical that has been scavenged and can be determined spectrophotometrically. One important class of antioxidants is the phenolic compounds. These compounds have attracted interest because of their numerous beneficial health effects, which include anti-inflammatory, anticancer, and antiviral activities.8-10 The position of the hydroxyl groups and other features of the chemical structure of phenolic compounds have significant effects on their antioxidant and free radical scavenging activities.11 Because the antioxidant activities of these chemicals depend on their structures, it is possible to predict the antioxidant activities of compounds of interest from their structural features by using quantitative structure–activity relationship (QSAR) approaches. QSAR methodology has often been used to find correlations between activity and molecular structural


descriptors of different classes of chemicals.12,13 There are some published reports on QSAR prediction of the antioxidant activities of organic compounds.14-19 For example, Abreu et al. built a QSAR model for the description and prediction of the radical scavenging activity of di(hetero)arylamine derivatives of benzo[b]thiophenes using the partial least squares (PLS) projection to latent structures method, which gave statistical parameters of R² = 0.881 and q²LOO = 0.844.20 In another report, Roy et al. modeled the antioxidant activities of a series of coumarin derivatives on the basis of their ability to inhibit DPPH (1,1-diphenyl-2-picrylhydrazyl) free radicals. Different QSAR approaches (a descriptor-based QSAR model, a 3D pharmacophore model, and a fragment-based QSAR model developed using the hologram QSAR technique) were employed to identify the essential structural attributes that impart a potential antioxidant activity profile to the coumarin derivatives. Their results aptly match those of the pharmacophore analysis, which indicates the importance of the fused benzene ring and the oxygen atom of the pyran ring for capturing the hydrophobic feature and one of the hydrogen bond acceptor (HBA) features, respectively.21 Moreover, Pérez-Garrido et al. developed a QSAR model to predict the antioxidant activity of a heterogeneous group of chemicals by using the bond contributions method.22 The information extracted from their QSAR model revealed that the major driving forces for radical scavenging activity are hydrogen bond donation and polarity. The main goal of the present study was to build QSAR models for predicting the radical scavenging activity of different types of antioxidants, using multiple linear regression (MLR) and an artificial neural network (ANN) as feature mapping techniques. The developed QSAR models will help to identify the structural features of molecules that affect their antioxidant activities and will guide the synthesis of potential new antioxidant radical scavengers.


EXPERIMENTAL
Data set: The data set, shown in Table 1, consists of the names of 96 antioxidants and their experimental activities, which were determined by the in vitro TEAC assay and recently reported by Pérez-Garrido et al.22 The radical scavenging activity against ABTS•+ was expressed in terms of Trolox equivalent antioxidant capacity (TEAC/mol L⁻¹). As can be seen in Table S1 (Supporting Information), various types of chemicals are present in the data set, such as flavanols, chalcones, flavones, flavonols, stilbenes and curcuminoids, coumarins, anthraquinones and naphthoquinones, flavanones, lignans, isoflavones, and tannins. The TEAC values range from 0.00 for flavanone to 9.18 for procyanidin B2 digallate. The compounds in the data set were divided into training, internal test, and external test sets consisting of 77, 10, and 9 members, respectively. In order to compare our results with those reported by Pérez-Garrido et al.,22 the training set was chosen to be the same as in their work, and the remaining 19 molecules were subdivided into the internal and external test sets.

Molecular Descriptor: Molecular descriptors are simple mathematical representations of a molecule that are used to encode its significant structural features. To calculate the molecular descriptors, the HyperChem program (ver. 7)23 was used to construct all molecular structures. The molecular geometry was then optimized with the Austin Model 1 (AM1) semiempirical method using the Polak–Ribière algorithm. After geometry optimization, the HyperChem output files were used as input to the Dragon program (ver. 3.0)24 to calculate the molecular descriptors. This package can calculate various types of descriptors, such as constitutional, topological, geometrical, charge, GETAWAY (geometry, topology, and atom-weights assembly), WHIM (weighted holistic invariant molecular descriptors), 3D-MoRSE (3D molecular representation of structure based on electron diffraction), molecular walk counts,


BCUT descriptors, 2D autocorrelations, aromaticity indices, Randić molecular profiles, radial distribution functions, functional groups, and atom-centered fragments.25 During model development, great care was taken to avoid including highly collinear molecular descriptors. Collinear descriptors encode similar molecular information; it was therefore essential to screen the descriptors and eliminate those with low variance and those encoding similar information (descriptor pairs with an absolute Pearson correlation coefficient above 0.9). The most significant descriptors were then selected from the remaining pool by the forward selection method. A simple technique for controlling model expansion is the "break point" procedure. In this method, the improvement in the statistical quality of the models is analyzed by plotting the adjusted squared correlation coefficient (R²u) of the obtained models against the number of descriptors in each model. The model corresponding to the break point is taken as the optimum model. As can be seen in Figure 1, applying the break point procedure led to the conclusion that the best model has five parameters. These five descriptors can be used to develop linear and nonlinear models. Table S2 (Supporting Information) shows the correlation matrix of these descriptors. As can be seen in this table, there are no highly correlated pairs (R

$$q^2 > 0.5 \tag{4}$$

$$r^2 > 0.6 \tag{5}$$

$$\frac{r^2 - r_0^2}{r^2} < 0.1 \quad \text{or} \quad \frac{r^2 - r_0'^2}{r^2} < 0.1 \tag{6}$$

$$0.85 \le k \le 1.15 \quad \text{or} \quad 0.85 \le k' \le 1.15 \tag{7}$$

$$\left| r_0^2 - r_0'^2 \right| < 0.3 \tag{8}$$

Definitions of the above parameters are given in ref 36 and are not repeated here for brevity.
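As an illustration of how the threshold criteria of eqs 4-8 might be applied in practice, the following minimal Python sketch checks a set of external validation statistics against them. The function name and argument names are hypothetical. The last criterion is taken as |r0² − r'0²| < 0.3, the usual form of this threshold. The example inputs are the MLP-NN statistics reported in the Results section; r0² and r'0² are not reported directly and are back-calculated here from the reported ratios (0.015 and 0.019), which is an assumption of this sketch.

```python
def passes_external_validation(q2, r2, r0_sq, r0p_sq, k, kp):
    """Check external validation statistics against the thresholds of eqs 4-8.

    r0_sq and r0p_sq are the squared correlation coefficients of the
    regressions through the origin (predicted vs observed and vice versa);
    k and kp are the corresponding slopes.
    """
    checks = {
        "eq4_q2": q2 > 0.5,
        "eq5_r2": r2 > 0.6,
        "eq6_origin": (r2 - r0_sq) / r2 < 0.1 or (r2 - r0p_sq) / r2 < 0.1,
        "eq7_slope": 0.85 <= k <= 1.15 or 0.85 <= kp <= 1.15,
        "eq8_difference": abs(r0_sq - r0p_sq) < 0.3,
    }
    return all(checks.values()), checks


# Reported MLP-NN statistics; r0^2 and r'0^2 are back-calculated (assumption)
# from r^2 = 0.961 and the reported ratios 0.015 and 0.019.
r2 = 0.961
r0_sq = r2 * (1 - 0.015)
r0p_sq = r2 * (1 - 0.019)
ok, detail = passes_external_validation(0.898, r2, r0_sq, r0p_sq, 0.971, 1.000)
print(ok, detail)
```

Returning the individual checks alongside the overall verdict makes it easy to see which criterion fails when a model is rejected.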


The statistical significance of the relationship between the activity and the descriptors can be checked further by a randomization test (Y-randomization). This test was carried out to check for possible chance correlation.37 In the Y-randomization technique, the Y-column data are scrambled while the descriptor matrix (X-matrix) is kept unchanged. Each time, models are built using the scrambled data and the correlation coefficients are calculated. Two types of randomization techniques were used in this work, namely process randomization and model randomization. In process randomization (at the 90% confidence level), the values of the dependent variable are randomly scrambled and variable selection is performed afresh from the whole descriptor matrix. In model randomization (at the 99% confidence level), the Y-column entries are scrambled and new QSAR models are developed using the same set of variables as in the unrandomized model. For an acceptable QSAR model, the average correlation coefficient of the randomized models (Rr) should be less than the correlation coefficient of the nonrandomized model (R). To quantify the difference between R² and R²r, which signifies the reliability of the developed QSAR model, we used another parameter named R²p.38,39 The R²p parameter penalizes the model R² for small differences between R² and R²r and is calculated by the following formula:

$$R_p^2 = R^2 \sqrt{R^2 - R_r^2} \tag{9}$$

The parameter R²p ensures that the developed models are not obtained by chance; its value should be greater than 0.5 for an acceptable model. In the ideal case, however, the average value of R² for the randomized models should be zero, i.e., R²r should be zero. In such a case, the value of R²p should be equal to the value of R² for the


developed QSAR model. Thus, the corrected form of R²p (cR²p), as proposed by Todeschini,40 is given by:

$${}^{c}R_p^2 = R \sqrt{R^2 - R_r^2} \tag{10}$$
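The model randomization test and the two penalized parameters can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: it uses a single hypothetical descriptor and toy data rather than the five-descriptor models of this work, it takes R²p = R²·√(R² − R²r) and cR²p = R·√(R² − R²r) as the forms of the two parameters, and it guards the square root with an absolute value. All function names are hypothetical.

```python
import math
import random


def pearson_r2(x, y):
    """Squared Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def rp2(r2, r2_rand):
    """Assumed form of R2p: R^2 * sqrt(|R^2 - Rr^2|)."""
    return r2 * math.sqrt(abs(r2 - r2_rand))


def c_rp2(r2, r2_rand):
    """Assumed corrected form cR2p: R * sqrt(|R^2 - Rr^2|)."""
    return math.sqrt(r2) * math.sqrt(abs(r2 - r2_rand))


def y_randomization(x, y, n_trials=100, seed=0):
    """Scramble y, keep x fixed, and average R^2 over the scrambled models."""
    rng = random.Random(seed)
    r2 = pearson_r2(x, y)
    scrambled = []
    for _ in range(n_trials):
        y_perm = list(y)
        rng.shuffle(y_perm)
        scrambled.append(pearson_r2(x, y_perm))
    r2_rand = sum(scrambled) / n_trials
    return r2, r2_rand, c_rp2(r2, r2_rand)


# Toy data (not from this study): one descriptor, nearly linear response.
x = [float(v) for v in range(20)]
y = [2.0 * v + 0.5 * (-1) ** int(v) for v in x]
r2, r2_rand, crp2 = y_randomization(x, y)
print(r2, r2_rand, crp2)
```

With R²r = 0, `c_rp2` returns R² itself, which is the ideal-case behavior that motivates the corrected form, whereas `rp2` returns a smaller value.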

RESULTS AND DISCUSSION
As noted earlier, the quantitative relationship between the activities of the antioxidants and their structural descriptors was investigated by using linear and nonlinear models. The MLR model obtained with the descriptors selected by the break point procedure is shown in Table S3 (Supporting Information). External validation of this five-parameter model gives standard errors of prediction of 0.621 and 0.596 for the training and test sets, respectively. The antioxidant activities calculated by this model for all molecules in the data set are shown in Table 1. Moreover, the model was validated for robustness and predictive power by the leave-one-out (LOO) cross-validation procedure, which gives Q² = 0.862 and SPRESS = 0.616. Other statistical parameters of this QSAR model are … = 0.851, Δ… = 0.046, R²pre = 0.887, … = 0.752, Δ… = 0.099, … = 0.843, Δ… = 0.064, and Q²ext(F1) = 0.834.

Neural Network Modeling: The moderate statistical quality of the MLR model prompted us to apply an artificial neural network to investigate nonlinear relationships between the molecular structural descriptors and the antioxidant activities of the molecules of interest. This ANN had five nodes in the input layer and one node in the output layer. The number of nodes in the hidden layer was optimized by varying the number of hidden neurons from 1 to 10,


and selecting the value that gave the lowest prediction error of the network, which was found to be five. The resulting 5:5:1 network was then trained on the training set to optimize its weights and bias values. Finally, the developed multilayer perceptron network was used to calculate the antioxidant activity of the molecules in the training, internal test, and external test sets (Table 1). The standard errors of prediction of this model were 0.333, 0.523, and 0.550 for the training, internal test, and external test sets, respectively. Figure 2 shows the plot of experimental versus calculated TEAC values for the MLP-NN model. The good agreement between calculated and experimental TEAC values (R²train = 0.968, R²int = 0.927, R²ext = 0.938) reveals the suitability of the developed model. The rm² value for the data set is 0.865. The residuals of the predicted TEAC values are plotted against the experimental values in Figure 3. The scatter of the residuals on both sides of the zero line indicates that no systematic error exists in the developed MLP-NN model. The statistical values for the external validation set used in MLP-NN modeling were q² = 0.898, r² = 0.961, [(r² − r0²)/r²] = 0.015, [(r² − r'0²)/r²] = 0.019, k = 0.971, k' = 1.000, and |r0² − r'0²| = 0.004. These values are in good agreement with the limits described earlier, demonstrating once again the high predictive ability of the MLP-NN model. Comparison of these statistics with those obtained for the MLR model reveals the superiority of the nonlinear model over the linear one. Moreover, the standard errors and R² values of the QSAR model developed by Pérez-Garrido et al. on this data set were SEtrain = 0.589, SEtest = 0.483, R²train = 0.907, and R²test = 0.903. Inspection of these values indicates the superiority of our MLP-NN model over the model reported by Pérez-Garrido et al.22 For the MLR model, the values of cR²p were 0.832 and 0.841 for process and model randomization, respectively, while for the MLP-NN model these values were 0.925 and 0.915. These values indicate that the good results of our original models are not due to chance correlation or structural dependency of the data set.
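To make the 5:5:1 architecture concrete, the following short Python sketch builds such a network and evaluates one forward pass. It is a minimal illustration, not the implementation used in this work: the tanh hidden activation, the linear output node, the random weight initialization, and all names are assumptions, and no training step is shown.

```python
import math
import random


def init_mlp(n_in=5, n_hidden=5, n_out=1, seed=0):
    """Create random weights and zero biases for an n_in:n_hidden:n_out network."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    return w1, b1, w2, b2


def forward(params, x):
    """One forward pass: tanh hidden layer (assumed), linear output layer."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def n_parameters(params):
    """Count adjustable weights and biases."""
    w1, b1, w2, b2 = params
    return (sum(len(r) for r in w1) + len(b1)
            + sum(len(r) for r in w2) + len(b2))


params = init_mlp()
# Five scaled descriptor values for one hypothetical compound.
prediction = forward(params, [0.1, -0.3, 0.7, 0.0, 0.2])
print(prediction, n_parameters(params))
```

A 5:5:1 network of this form has 36 adjustable parameters (25 + 5 input-to-hidden weights and biases, 5 + 1 hidden-to-output), which is worth keeping in mind next to a training set of 77 compounds.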


According to the results of the sensitivity analysis on the MLP-NN model, the order of importance of the descriptors was C-026 > Mor19u > MATS2v > GATS5p > MATS4e. The names and meanings of these descriptors are shown in Table S4 (Supporting Information). All of these descriptors encode topological and electronic aspects of the chemicals that affect their antioxidant activities. A detailed description of these descriptors can be found in the Handbook of Molecular Descriptors.41 The most important descriptor is C-026, an atom-centered fragment descriptor that accounts for the R--CX--R functionality. This group of descriptors describes the size and position of heteroatoms in molecules. As mentioned earlier, the presence and position of OH groups on the benzene ring have an important influence on radical scavenging activities, and the value of C-026 encodes this information about the OH groups. This descriptor has a positive effect on antioxidant activity, an observation in agreement with that reported by Mitra et al.42 The increase in antioxidant activity with increasing hydroxyl substitution can be explained by the HAT43 mechanism of antioxidant action: the antioxidant donates a hydrogen atom to neutralize the unpaired electron of the free radical.

Applicability Domain analysis: Before a QSAR model is put into use for screening chemicals, its domain of application (AD) must be defined.44 A simple measure of how far a chemical lies from the applicability domain of the model is its leverage hi, which is defined as:

$$h_i = x_i^{T} \left( X^{T} X \right)^{-1} x_i \qquad (i = 1, \ldots, n) \tag{11}$$

where xi is the descriptor row vector of the query compound and X is the n × (p − 1) matrix of model descriptor values for the n training set compounds, p being the number of model parameters. The superscript T refers to the transpose of the matrix/vector. The warning leverage h* is generally fixed at 3p/n, where n is the number of training compounds. To visualize the AD of a QSAR model, the standardized residuals versus


leverage (hat diagonal) values (hi) were plotted (Figure 4) for an immediate and simple graphical detection of both the response outliers (i.e., compounds with standardized residuals greater than three standard deviation units, > 3σ) and the structurally influential chemicals in the model (h > h*). As can be seen from this figure, all predictions were reliable except for compound number 1 in the training set (procyanidin B2 digallate), whose leverage exceeds the warning value of h* = 0.233. The abnormal behavior of this compound could be due to incorrect experimental input data or to a different antioxidant activity mechanism.

CONCLUSION
Linear and nonlinear QSAR models were developed to predict the antioxidant activity of a heterogeneous group of substances using five molecular descriptors that take into account 2D and 3D aspects of the molecular structure. The descriptors that appear in the QSAR models provide information on different molecular aspects that can participate in the intermolecular interactions affecting the antioxidant activity of chemicals. The good agreement between the experimental TEAC values and those predicted by the ANN model confirms the validity of the obtained QSAR model. The results of this work indicate that the MLP-NN model exhibits reasonable prediction capability and is superior to the MLR model as well as to the model developed by Pérez-Garrido et al. The developed QSAR model indicates the significant contribution of the hydroxyl group attached to the phenyl ring to radical scavenging activity, because antioxidant compounds can act as hydrogen atom donors through this functional group. Additionally, the structural information on the free radical scavenging activity of antioxidants could provide deeper insight into the mechanisms of untested compounds.
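For readers who wish to reproduce the leverage-based applicability domain check described in the Applicability Domain analysis, the following Python sketch computes hi = xiᵀ(XᵀX)⁻¹xi for a query compound and the warning leverage 3p/n. It is a minimal illustration with hypothetical function names and toy data. Taking p = 6 (five descriptors plus one) with n = 77 is an inference made here, chosen because it gives a value close to the reported h* = 0.233.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def leverage(x_train, x_query):
    """Leverage h = x^T (X^T X)^-1 x of a query row against the training matrix."""
    p = len(x_query)
    xtx = [[sum(row[i] * row[j] for row in x_train) for j in range(p)]
           for i in range(p)]
    z = solve(xtx, x_query)  # z = (X^T X)^-1 x, without forming the inverse
    return sum(a * b for a, b in zip(x_query, z))


def warning_leverage(n_params, n_train):
    """Warning leverage h* = 3p/n."""
    return 3.0 * n_params / n_train


# Toy training matrix (not from this study): a constant column and one descriptor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
h_train = [leverage(X, row) for row in X]
h_far = leverage(X, [1.0, 10.0])  # query far outside the training range
h_star = warning_leverage(6, 77)  # p = 6 is an assumption of this sketch
print(h_train, h_far, h_star)
```

A query compound whose leverage exceeds the warning value lies outside the descriptor space covered by the training set, so its prediction is an extrapolation and should be treated with caution.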


ASSOCIATED CONTENT
Supporting Information
Additional tables. These tables are available free of charge via the Internet at http://pubs.acs.org.

REFERENCES
(1) Halliwell, B.; Gutteridge, J. M. C. Free Radicals in Biology and Medicine; Oxford University Press: New York, 1999.
(2) Halliwell, B.; Gutteridge, J. M. C. Free Radicals in Biology and Medicine; Oxford University Press: New York, 1985.
(3) Cooke, M. S.; Evans, M. D.; Dizdaroglu, M.; Lunec, J. Oxidative DNA damage: mechanisms, mutation, and disease. FASEB J. 2003, 17, 1195-1214.
(4) Gutteridge, J. M. C.; Halliwell, B. Antioxidants in Nutrition, Health and Disease; Oxford University Press: Oxford, 1994.
(5) Wright, J. S.; Johnson, E. R.; DiLabio, G. A. Predicting the activity of phenolic antioxidants: theoretical method, analysis of substituent effects, and application to major families of antioxidants. J. Am. Chem. Soc. 2001, 123, 1173-1183.
(6) Musialik, M.; Litwinienko, G. Scavenging of dpph• radicals by vitamin E is accelerated by its partial ionization: the role of sequential proton loss electron transfer. Org. Lett. 2005, 7, 4951-4954.


(7) Arts, M. J. T. J.; Dallinga, J. S.; Voss, H.-P.; Haenen, G. R. M. M.; Bast, A. A critical appraisal of the use of the antioxidant capacity (TEAC) assay in defining optimal antioxidant structures. Food Chem. 2003, 80, 409-414.
(8) Chung, K. T.; Wong, T. Y.; Huang, Y. W.; Lin, Y. Tannins and human health. Crit. Rev. Food Sci. 1998, 38, 421-464.
(9) Cassidy, A.; Hanley, B.; Lamuela-Raventos, R. M. Isoflavones, lignans and stilbenes - origins, metabolism and potential importance to human health. J. Sci. Food Agr. 2000, 80, 1044-1062.
(10) Tapiero, H.; Tew, K. D.; Ba, N.; Mathe, G. Polyphenols: do they play a role in the prevention of human pathologies? Biomed. Pharmacother. 2002, 56, 200-207.
(11) Kruzlicova, D.; Danihelova, M.; Veverka, M. Quantitative Structure-Antioxidant Activity Relationship of Quercetin and its New Synthetised Derivatives. Nova Biotechnologica et Chimica 2012, 11, 37-44.
(12) Kontogiorgis, A. C.; Pontiki, A. E.; Hadjipavlou-Litina, D. A Review on Quantitative Structure-Activity Relationships (QSARs) of Natural and Synthetic Antioxidants Compounds. Mini-Rev. Med. Chem. 2005, 5, 563-574.
(13) Quintero, F. A.; Patel, S. J.; Muñoz, F.; Mannan, M. S. Review of Existing QSAR/QSPR Models Developed for Properties Used in Hazardous Chemicals Classification System. Ind. Eng. Chem. Res. 2012, 51, 16101-16115.
(14) Mitra, I.; Saha, A.; Roy, K. Predictive chemometric modeling of DPPH free radical-scavenging activity of azole derivatives using 2D- and 3D-quantitative structure-activity relationship tools. Future Med. Chem. 2013, 5, 261-280.


(15) Jing, P.; Zhao, S.-J.; Jian, W.-J.; Qian, B.-J.; Dong, Y.; Pang, J. Quantitative studies on Structure-DPPH• scavenging activity relationships of food phenolic acids. Molecules 2012, 11, 12910-12924.
(16) Sarkar, A.; Middya, T. R.; Jana, A. D. A QSAR study of radical scavenging antioxidant activity of a series of flavonoids using DFT based quantum chemical descriptors - The importance of group frontier electron density. J. Mol. Model. 2012, 18, 2621-2631.
(17) Ghinet, A.; Farce, A.; Oudir, S.; Pommery, J.; Vamecq, J.; Hénichart, J.-P.; Rigo, B.; Gautret, P. Antioxidant activity of new benzo[de]quinolines and lactams: 2D-Quantitative structure-activity relationships. J. Med. Chem. 2012, 8, 942-946.
(18) Gupta, S.; Matthew, S.; Abreu, P. M.; Aires-de-Sousa, J. QSAR analysis of phenolic antioxidants using MOLMAP descriptors of local properties. Bioorg. Med. Chem. 2006, 14, 1199-1206.
(19) Amic, D.; Davidovic-Amic, D.; Beslo, D.; Rastija, V.; Lucic, B.; Trinajstic, N. SAR and QSAR of the antioxidant activity of flavonoids. Curr. Med. Chem. 2007, 14, 827-845.
(20) Abreu, R. M. V.; Ferreira, I. C. F. R.; Queiroz, M. J. R. P. QSAR model for predicting radical scavenging activity of di(hetero)arylamines derivatives of benzo[b]thiophenes. Eur. J. Med. Chem. 2009, 44, 1952-1958.
(21) Mitra, I.; Saha, A.; Roy, K. Predictive modeling of antioxidant coumarin derivatives using multiple approaches: Descriptor-based QSAR, 3D-pharmacophore mapping, and HQSAR. Sci. Pharm. 2013, 81, 57-80.


(22) Pérez-Garrido, A.; Helguera, A. M.; Ruiz, J. M. M.; Rentero, P. Z. Topological substructural molecular design approach: Radical scavenging activity. Eur. J. Med. Chem. 2012, 49, 86-94.
(23) HyperChem Release 7.0 for Windows; Hypercube, Inc.: 2002.
(24) http://www.disat.unimib.it/chem.
(25) Todeschini, R.; Consonni, V. Molecular Descriptors for Chemoinformatics; Wiley-VCH: Weinheim, Germany, 2009.
(26) Liu, P.; Long, W. Current Mathematical Methods Used in QSAR/QSPR Studies. Int. J. Mol. Sci. 2009, 10, 1978-1998.
(27) Niculescu, S. P. Artificial neural networks and genetic algorithms in QSAR. J. Mol. Struct. 2003, 622, 71-83.
(28) Beal, M. T.; Hagan, H. B.; Demuth, M. Neural Network Design; PWS: Boston, 1996.
(29) Konoz, E.; Fatemi, M. H.; Faraji, R. Prediction of Kovats Retention Indices of Some Aliphatic Aldehydes and Ketones on Some Stationary Phases at Different Temperatures Using Artificial Neural Network. J. Chromatogr. Sci. 2008, 46, 1-7.
(30) http://www.statsoft.com/
(31) Lahmiri, S. A comparative study of backpropagation algorithm in financial prediction. TIJCSA 2001, 1, 15-21.
(32) Roy, P. P.; Roy, K. On some aspects of variable selection for partial least squares regression models. QSAR Comb. Sci. 2008, 27, 302-313.
(33) Roy, K.; Mitra, I.; Kar, S.; Ojha, P. K.; Das, R. N.; Kabir, H. Comparative studies on some metrics for external validation of QSPR models. J. Chem. Inf. Model. 2012, 52, 396-408.


(34) Ojha, P. K.; Mitra, I.; Das, R.; Roy, K. Further exploring rm² metrics for validation of QSPR models. Chemom. Intell. Lab. Syst. 2011, 107, 194-205.
(35) Tropsha, A. Best practices for QSAR model development, validation, and exploitation. Mol. Inf. 2010, 29, 476-488.
(36) Roy, K.; Chakraborty, P.; Mitra, I.; Ojha, P. K.; Kar, S.; Das, R. N. Some Case Studies on Application of "rm²" Metrics for Judging Quality of Quantitative Structure–Activity Relationship Predictions: Emphasis on Scaling of Response Data. J. Comput. Chem. 2013, 34, 1071-1082.
(37) Walczak, B.; Massart, D. L. Robust principal components regression as a detection tool for outliers. Chemom. Intell. Lab. Syst. 1995, 27, 41-54.
(38) Mitra, I.; Saha, A.; Roy, K. Quantitative structure–activity relationship modeling of antioxidant activities of hydroxybenzalacetones using quantum chemical, physicochemical and spatial descriptors. Chem. Biol. Drug Des. 2009, 73, 526-536.
(39) Roy, P. P.; Paul, S.; Mitra, I.; Roy, K. On two novel parameters for validation of predictive QSAR models. Molecules 2009, 14, 1660-1701.
(40) Todeschini, R. Milano Chemometrics, University of Milano-Bicocca, Milano, Italy (personal communication), 2010.
(41) Todeschini, R.; Consonni, V. Handbook of Molecular Descriptors; Wiley-VCH: Weinheim, 2000.
(42) Mitra, I.; Saha, A.; Roy, K. Chemometric modeling of free radical scavenging activity of flavone derivatives. Eur. J. Med. Chem. 2010, 45, 5071-5079.


(43) Wright, S.; Johnson, E. R.; DiLabio, G. A. Predicting the activity of phenolic antioxidants: theoretical method, analysis of substituent effects, and application to major families of antioxidants. J. Am. Chem. Soc. 2001, 123, 1173-1183.
(44) Tropsha, A.; Gramatica, P.; Gombar, V. K. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 2003, 22, 69-77.


Table 1. Names and experimental and predicted TEAC values (mol L-1)

No.*      Name                            TEACexp  TEACANN  TEACMLR
1         Procyanidin B2 digallate          9.180    8.841    9.868
2         Procyanidin C1                    8.290    8.457    7.586
3         Corilagin                         7.760    7.846    7.110
4 int†    Procyanidin B1                    6.140    6.409    5.017
5         (-)-Epigallocatechin gallate      5.950    5.994    5.683
6         (-)-Epicatechin gallate           5.270    5.344    5.031
7         Quercetin                         4.420    4.530    3.517
8 ext†    (-)-Epigallocatechin              3.710    1.581    3.274
9         Gallic acid                       3.520    3.610    2.678
10        (-)-Epicatechin                   3.060    2.892    2.410
11        Morin                             2.680    2.671    2.867
12        Baicalein                         2.560    2.131    1.385
13        Piceatannol                       2.530    2.267    2.772
14        Butein                            2.420    2.662    2.927
15        Quercetin 3 glucoside             2.390    2.711    2.511
16        Esculetin                         2.380    2.152    1.826
17        Curcumin                          2.240    1.403    1.611
18        Quercitrin                        2.180    1.814    1.486
19        Luteolin                          2.180    2.170    2.295
20        Resveratrol                       2.140    2.374    1.700
21        Rutin                             2.020    1.605    1.836
22        para-Coumaric acid                1.960    1.270    1.132
23        Sappanchalcone                    1.930    1.699    2.130
24        Purpurin                          1.930    1.302    1.495
25        Ferulic acid                      1.920    1.883    1.070
26        Phloretin                         1.790    1.894    1.677
27 int    Demethoxycurcumin                 1.630    0.978    1.264
28        Piceatannol-3'-glucoside          1.620    1.883    1.934
29        Pseudopurpurin                    1.620    1.640    1.152
30 ext    Kaempferol                        1.590    1.706    2.459
31        Chlorogenic acid                  1.560    0.716    1.208
32 int    Quercetin 3 glucoside 7 rham      1.560    2.690    0.967
33        Baicalin                          1.550    1.095    1.223
34        Isoferulic acid                   1.530    1.536    1.130
35 ext    Cinaroside                        1.470    0.544    1.804
36 int    Carthamin                         1.430    0.366    2.151
37        Syringic acid                     1.390    1.408    0.513
38 ext    Resveratrol-3-glucoside           1.350   -0.077    1.003
39        Caffeic acid                      1.310    0.551    2.183
40 int†   2,4-Hydroxybenzoic acid           1.220    0.660    0.729
41        Bisdemethoxycurcumin              1.180    1.326    0.862
42 ext    Protocatechuic acid               1.150    0.321    1.775
43        Galangin                          1.120    1.197    1.422
44        Alizarin                          1.070    0.900    1.011
45        ortho-Coumaric acid               0.930    0.883    1.084
46        meta-Coumaric acid                0.820    0.993    1.237
47        Flavonol                          0.707    0.828    0.583
48        Resveratrol-4'-glucoside          0.558    0.744    1.114
49 int†   Quinizarin                        0.548    4.475    0.861
50        Hesperetin                        0.403    1.178    1.312
51 ext    Scopoletin                        0.383    2.444    0.392
52        Secoisolariciresinol              0.308   -0.070    0.420
53        Matairesinol                      0.253    0.312    0.928
54        Naringenin                        0.217    0.563    0.828
55        Vitexin                           0.216    0.297   -0.435
56 int†   Magnolol                          0.209    1.499    0.682
57        Esculetin-6-glucoside             0.164    0.535    0.743
58        Astragalin                        0.138    0.398    0.625
59        Shikonin                          0.124    0.048   -0.785
60        Genistein                         0.123    0.073    0.795
61 ext†   Acetylshikonin                    0.107    1.995   -0.578
62        Alizarin-2-glucoside              0.105    0.408    0.041
63        Juglone                           0.105   -0.047    0.728
64        Hesperidin                        0.104    0.061    0.416
65        Arctigenin                        0.104    0.571    1.344
66        Daidzein                          0.101   -0.307    0.742
67 int    Naringin                          0.098    1.713   -0.405
68 ext    Glycitein                         0.097    0.226    0.806
69 int    Emodin                            0.095    0.008    0.852
70        Vanillic acid                     0.092    0.168    0.593
71        Apigenin                          0.086    0.063    0.795
72        Apigetrin                         0.083    0.363    0.712
73 ext†   Chrysin                           0.081    0.571    0.055
74        Genistin                          0.077   -0.084    0.498
75        Aloe-emodin                       0.077   -0.230    0.382
76        Rhein                             0.076    0.226    1.068
77        1,5-Dihydroxyanthraquinone        0.076    0.550    0.877
78        Ruberythric acid                  0.073   -0.039   -0.611
79        Daidzin                           0.072   -0.004    0.049
80        2,6-Dihydroxyanthraquinone        0.072    0.755    1.276
81        Chrysophanol                      0.069   -0.427    0.062
82        Physcion                          0.068    0.098   -0.105
83        Chrysazine                        0.068    0.523    0.855
84 int    ortho-Hydroxybenzoic acid         0.037    0.728   -0.266
85        para-Hydroxybenzoic acid          0.028    0.660    0.729
86        meta-Hydroxybenzoic acid          0.025    0.654    1.003
87        Anthraquinone                     0.009    0.209   -0.431
88        trans-Cinnamic acid               0.007    0.090    0.313
89        Benzoic acid                      0.005    0.076   -0.085
90        Isoflavone                        0.005   -0.492   -0.907
91        5-Methoxyfuranocoumarin           0.003    0.375    0.291
92        Flavone                           0.003   -0.149   -0.592
93        trans-Stilbene                    0.002   -0.086   -0.470
94        Coumarin                          0.001    0.062    0.553
95        trans-Chalcone                    0.001    0.435   -0.287
96        Flavanone                         0.000    0.109   -1.041

* int and ext denote compounds used in the internal and external test sets, respectively.
† In the source, this set marker is printed detached from the compound number, next to a pair of adjacent numbers; it is assigned here to the second number of the pair.
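As a rough illustration of how the two prediction columns of Table 1 can be compared, the following minimal sketch computes the root-mean-square error (RMSE) of the ANN and MLR predictions against the experimental TEAC values. It uses only six rows copied from the table (compounds 1, 2, 5, 6, 9, and 10, none of which carries an int or ext marker). The function name `rmse` and the choice of RMSE as the summary statistic are assumptions made for this example; they are not necessarily the statistics used by the authors.

```python
import math

# (name, TEAC_exp, TEAC_ANN, TEAC_MLR) copied from Table 1,
# compounds 1, 2, 5, 6, 9 and 10 (an illustrative subset only).
rows = [
    ("Procyanidin B2 digallate",     9.180, 8.841, 9.868),
    ("Procyanidin C1",               8.290, 8.457, 7.586),
    ("(-)-Epigallocatechin gallate", 5.950, 5.994, 5.683),
    ("(-)-Epicatechin gallate",      5.270, 5.344, 5.031),
    ("Gallic acid",                  3.520, 3.610, 2.678),
    ("(-)-Epicatechin",              3.060, 2.892, 2.410),
]


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


teac_exp = [r[1] for r in rows]
teac_ann = [r[2] for r in rows]
teac_mlr = [r[3] for r in rows]

rmse_ann = rmse(teac_exp, teac_ann)
rmse_mlr = rmse(teac_exp, teac_mlr)

print(f"RMSE (ANN, 6 rows): {rmse_ann:.3f}")
print(f"RMSE (MLR, 6 rows): {rmse_mlr:.3f}")
```

On these six rows the ANN error is the smaller of the two (about 0.18 versus 0.61). The subset is far too small to characterize the models, so a real comparison would use all rows of each set (training, internal test, external test) separately.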


[Plot: x-axis, No. of descriptors (0 to 15); y-axis, R2u (0.700 to 1.000).]

Figure 1. Variation of R2u with the number of descriptors in the MLR model.


[Plot: x-axis, TEACexp (-1 to 9); y-axis, TEACcal (-1 to 9); series: training set, internal test, external test.]

Figure 2. Plot of MLP-NN calculated versus experimental values of TEAC.


[Plot: x-axis, TEACexp (-1 to 9); y-axis, residuals (-1.5 to 1.5); series: training set, internal test set, external test set.]

Figure 3. Plot of predicted residuals versus experimental values of TEAC.


[Plot: x-axis, leverage (0 to 0.4); y-axis, standardized residuals (-4 to 4); series: training set, internal test, external test.]

Figure 4. Plot of standardized residuals versus hat values, with a warning leverage of h* = 0.233 (Williams plot).
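The hat values on the horizontal axis of a Williams plot can be sketched as follows. This is a minimal example under stated assumptions, not the authors' procedure: leverages are taken as the diagonal of the hat matrix, h_i = x_i^T (X^T X)^-1 x_i, with an intercept column added to the descriptor matrix, and the warning leverage follows the common convention h* = 3(k + 1)/n for k descriptors and n training compounds. The descriptor values are invented toy data, and the names `solve`, `leverages`, and `warning_leverage` are hypothetical helpers.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def leverages(descriptors):
    """Hat values h_i = x_i^T (X^T X)^-1 x_i, with an intercept column."""
    x = [[1.0] + list(row) for row in descriptors]
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    return [sum(xi * zi for xi, zi in zip(row, solve(xtx, row))) for row in x]


def warning_leverage(n_compounds, n_descriptors):
    """Warning leverage h* = 3(k + 1)/n (a common convention, assumed here)."""
    return 3.0 * (n_descriptors + 1) / n_compounds


# Toy data: ten hypothetical compounds described by one descriptor.
# The last compound lies far from the others in descriptor space.
toy_descriptors = [[1.0], [2.0], [3.0], [4.0], [5.0],
                   [6.0], [7.0], [8.0], [9.0], [30.0]]

h = leverages(toy_descriptors)
h_star = warning_leverage(len(toy_descriptors), 1)
flagged = [i for i, value in enumerate(h) if value > h_star]

print("warning leverage:", round(h_star, 3))
print("compounds above h*:", flagged)
```

In this toy set only the last compound exceeds h* and would fall to the right of the vertical warning line in a Williams plot. For scale, purely illustrative counts of n = 77 and k = 5 (not taken from the text) give 18/77, or about 0.234, under the same convention.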
