
Article

A new quantitative structure-activity relationship model for angiotensin-converting enzyme inhibitory dipeptides based on integrated descriptors
Baichuan Deng, Xiaojun Ni, Zhenya Zhai, Tianyue Tang, Chengquan Tan, Yijing Yan, Jinping Deng, and Yulong Yin
J. Agric. Food Chem., Just Accepted Manuscript • DOI: 10.1021/acs.jafc.7b03367 • Publication Date (Web): 06 Oct 2017
Downloaded from http://pubs.acs.org on October 7, 2017


Journal of Agricultural and Food Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Agricultural and Food Chemistry

A new quantitative structure-activity relationship model for angiotensin-converting enzyme inhibitory dipeptides based on integrated descriptors

Baichuan Deng,†,ǁ Xiaojun Ni,†,ǁ Zhenya Zhai,† Tianyue Tang,† Chengquan Tan,† Yijing Yan,† Jinping Deng,*,† Yulong Yin*,†,‡

† College of Animal Science, South China Agricultural University, Guangzhou, 510642, Guangdong, P.R. China

‡ National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, Key Laboratory of Agro-ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha 410125, Hunan, P.R. China

ABSTRACT: Angiotensin-converting enzyme (ACE) inhibitory peptides derived from food proteins have been widely reported for hypertension treatment. In this paper, a benchmark dataset containing 141 unique ACE-inhibitory dipeptides was constructed through database mining, and a quantitative structure–activity relationship (QSAR) study was carried out to predict the half-inhibitory concentration (IC50) of ACE activity. Sixteen descriptors were tested, and the model generated by the G-scale descriptor showed the best predictive performance, with a coefficient of determination (R²) and a cross-validated R² (Q²) of 0.6692 and 0.6220, respectively. For most other descriptors, R² ranged from 0.52 to 0.68 and Q² ranged from 0.48 to 0.61. A complex model combining all 16 descriptors was built, and variable selection was performed to further improve the prediction performance. The quality of the model using integrated descriptors (R² 0.7340±0.0038, Q² 0.7151±0.0019) was better than that of the G-scale model. An in-depth study of variable importance showed that the properties most correlated with ACE-inhibitory activity were hydrophobicity, steric and electronic properties, and that C-terminal amino acids contribute more than N-terminal amino acids. Five novel predicted ACE-inhibitory peptides were synthesized and their IC50 values were validated through in vitro experiments. The results indicated that the constructed model could give a reliable prediction of the ACE-inhibitory activity of peptides, and it may be useful in the design of novel ACE-inhibitory peptides.

KEYWORDS: ACE-inhibitory peptides, QSAR, variable selection, variable importance, amino acid descriptors

INTRODUCTION

Nowadays, inhibitors of angiotensin-converting enzyme (ACE) are considered first-line therapy for hypertension.1 ACE is a zinc- and chloride-dependent metallopeptidase (EC 3.4.15.1)2 and plays a dual role in regulating the renin-angiotensin system (RAS) and the kallikrein-kinin system (KKS). It catalyzes the conversion of inactive angiotensin I (a decapeptide) into the strongly vasoconstrictive angiotensin II (an octapeptide) and also inactivates the vasodilator bradykinin.3 Therefore, ACE has become an appropriate target for antihypertensives. Inhibition of ACE reduces angiotensin II production and consequently lowers blood pressure.4 Various synthetic ACE inhibitors, such as captopril, enalapril, ramipril and lisinopril, have been developed for the clinical treatment of hypertension.5 However, synthetic ACE inhibitors can cause adverse side effects such as cough, allergic reactions, taste disturbances, and skin rashes.6 Thus, numerous ACE-inhibitory peptides have been identified from the hydrolytic products of food-derived proteins; they could be used as potent functional food additives and represent a healthier, natural alternative to ACE-inhibitory drugs. These peptides originate from milk,7 porcine skeletal muscle,8 bovine collagen,9 bovine blood,10 egg,11 soybean,12 rapeseed,13 oat (Avena sativa),14 marine sources,15 etc.

Among approaches for studying bioactive peptides, many ACE-inhibitory peptides have been discovered by the classical approach, which involves peptide production, isolation, purification and identification.16 These newly discovered ACE-inhibitory peptides are then collected and deposited in related databases. Building on databases or the literature, the bioinformatic approach has become a more efficient and economical tool for peptide research and the discovery of new bioactive peptides than the classical approach.17 In particular, the quantitative structure–activity relationship (QSAR) is a crucial tool in the bioinformatic approach and plays an important role in the study of bioactive peptides. In recent years, a number of experimentally validated ACE-inhibitory peptides have been used to build QSAR models and, in certain cases, to predict novel and potent ACE-inhibitory peptides.5, 18 Among these, di- and tripeptides were most frequently studied because they have excellent biological properties: they can be absorbed intact into the blood circulation and they are usually resistant to gastrointestinal proteolysis.19 A classical dataset of 58 ACE-inhibitory dipeptide sequences20 is often used to test the effectiveness of diverse amino acid descriptors in QSAR studies. A database consisting of 168 dipeptides, in which 95 sequences are unique, was constructed from the published literature to study the QSAR of ACE-inhibitory peptides.21 However, most previous studies used only a single amino acid descriptor to construct the QSAR model, which may lose descriptive information and neglect the connections between different descriptors.

Databases such as BIOPEP,22 ACEpepDB (http://www.cftri.com/pepdb/index.php) and PepBank23 contain ACE-inhibitory peptides, but the number is limited. Records with experimentally validated IC50 values are even fewer. In recent years, new ACE-inhibitory peptides have been continuously reported in the literature. Kumar et al. established a new database dedicated to antihypertensive peptides, AHTPDB, which contains 5978 peptide entries.24 Among these, 3364 entries provide IC50 values and 1694 are unique peptides.24 Moreover, this database contains 1878 records of dipeptides, including 141 unique dipeptide sequences with IC50 values.

In this study, we used the 141 unique ACE-inhibitory dipeptides from AHTPDB to construct a dataset. It is, to our knowledge, the largest number of unique dipeptides ever used in a single QSAR model. Sixteen different descriptors were used to construct a sophisticated QSAR model so that more comprehensive information could be used to describe the amino acids. We also used outlier elimination and variable selection methods to optimize the model and improve the prediction performance. The newly predicted ACE-inhibitory peptides were synthesized and their IC50 values were validated through in vitro experiments. The objective of this study was to construct a reliable QSAR model for predicting ACE-inhibitory dipeptides, which may be useful in the design of novel ACE-inhibitory peptides.

MATERIALS AND METHODS

Chemicals. Angiotensin-converting enzyme (ACE) from rabbit lung and hippuryl-histidyl-leucine (HHL), the substrate of ACE, were purchased from Sigma–Aldrich (St. Louis, MO, USA). The chemically synthesized peptides (purity >95%) were obtained from DGpeptides Co., Ltd. (Hangzhou, China). All other reagents used in this study were of analytical grade.

Assay for ACE-inhibitory activity. ACE-inhibitory activity was assayed by the method of Cushman and Cheung (1971) with slight modifications.25 The peptide solution (50 µL) was mixed with 5 mM HHL solution (150 µL), followed by pre-incubation for 5 min at 37 °C. Afterwards, 50 µL of a 25 mU/mL ACE solution (prepared in 0.1 M sodium borate buffer containing 0.3 M NaCl at pH 8.3) was added, and the reaction mixture was incubated for a further 30 min at 37 °C. The enzymatic reaction was terminated by adding 250 µL of 1 M HCl, and the liberated hippuric acid was extracted with 1.5 mL of ethyl acetate by vortex mixing for 30 s. After centrifugation (4000 × g, 10 min), a 1 mL aliquot of the upper layer was transferred into a glass tube and evaporated by heating at 120 °C for 30 min. The hippuric acid was redissolved in 3 mL of distilled water. The absorbance was measured at 228 nm using a UV spectrophotometer (UV-2501PC, Shimadzu, Tokyo, Japan). The IC50 value was defined as the concentration of inhibitor required to inhibit 50% of the ACE activity.

Data collection. Datasets for the antihypertensive peptides (AHTPs) were manually collected from AHTPDB (http://crdd.osdd.net/raghava/ahtpdb/), a database of experimentally validated antihypertensive peptides, most of which belong to the family of angiotensin I converting enzyme inhibiting peptides.24 The information in this database was mainly collected from three databases, i.e. ACEpepDB (http://www.cftri.com/pepdb/), BIOPEP (http://www.uwm.edu.pl/biochemia/index.php/pl/biopep) and the EROP-Moscow database (http://erop.inbi.ras.ru/), and from the published literature. First, we selected all the dipeptides in this database, including information about sequence, half maximal inhibitory concentration (IC50), IC50 determination assay, source and molecular mass. IC50 represents the concentration that inhibits 50% of the activity of ACE.

Data processing. During data collection from the database, it was noticed that many identical peptides appeared more than once, with the same or different IC50 values. For a peptide with multiple IC50 values, the median value was retained to remove duplicates. A total of 1878 dipeptide records were obtained before merging. After merging, the total number of unique peptides included in the QSAR model was 141. All IC50 values were log-transformed prior to modeling.
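The merging step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the source does not give the base of the logarithm, so base 10 is assumed here, and the function name `merge_duplicates` and the example records are hypothetical rather than taken from AHTPDB.

```python
import math
from collections import defaultdict
from statistics import median


def merge_duplicates(records):
    """Collapse repeated sequences to one log10(median IC50) per sequence.

    `records` is an iterable of (sequence, ic50) pairs with positive ic50.
    Base-10 logarithm is an assumption; the text only says "log-transformed".
    """
    grouped = defaultdict(list)
    for sequence, ic50 in records:
        grouped[sequence].append(ic50)
    # Take the median first, then log-transform, following the order in the text.
    return {seq: math.log10(median(values)) for seq, values in grouped.items()}


# Made-up records for illustration only (not real database entries).
example = [("VW", 1.0), ("VW", 10.0), ("VW", 100.0), ("AY", 1000.0)]
merged = merge_duplicates(example)
```

Using the median rather than the mean keeps a single extreme literature value from dominating the retained IC50 of a peptide.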

Building the QSAR model. QSAR is defined as a relationship linking the structural characteristics of molecules to their biological or physicochemical properties. The data sets for the processed dipeptides are presented in Table S1. The peptide sequences were transformed into an X-matrix by means of each of the 16 descriptors in turn, while the dependent variable Y represents the activity values (IC50) of the peptides. These descriptors, which represent the structural characteristics of amino acids well in QSAR models, were collected from published articles and include Z-scale,26 5Z-scale,27 DPPS,28 MS-WHIM,29 ISA-ECI,30 VHSE,31 FASGAI,32 VSW,33 T-scale,34 ST-scale,35 E-scale,36 V,37 G-scale,38 HESH39 and HSEHPCSV.40 For a set of peptide analogues, the structures are characterized by describing each varied amino acid position with the parameter values of the descriptor. For example, the G-scale descriptor includes eight parameters, so when it is used to describe dipeptides, the chemical structure of each dipeptide is described by 16 (8 parameters × 2 amino acids) variables. More generally, a set of peptide sequences varied in n positions can be described by 8×n variables. The amino acid at the N-terminus was designated as n1, and its properties were described as n1G1, n1G2, n1G3, n1G4, n1G5, n1G6, n1G7 and n1G8 in the G-scale model. The amino acid at the C-terminus was designated as n2, and its properties were named analogously.
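The position-wise encoding described above can be illustrated with a minimal sketch. The function name `encode_dipeptide` and the lookup table `toy_scale` are hypothetical, and the numbers in the table are placeholders, not the published G-scale parameters; only the naming scheme (n1G1 ... n2G8) follows the text.

```python
def encode_dipeptide(sequence, scale, prefix="G"):
    """Return an ordered list of (variable_name, value) pairs for a dipeptide.

    `scale` maps a one-letter residue code to its list of descriptor parameters.
    """
    if len(sequence) != 2:
        raise ValueError("expected a dipeptide (two residues)")
    features = []
    # Position 1 is the N-terminus (n1), position 2 the C-terminus (n2).
    for position, residue in enumerate(sequence, start=1):
        for k, value in enumerate(scale[residue], start=1):
            features.append((f"n{position}{prefix}{k}", value))
    return features


# Placeholder values, NOT the published G-scale parameters.
toy_scale = {
    "A": [0.1 * k for k in range(1, 9)],
    "W": [-0.1 * k for k in range(1, 9)],
}
row = encode_dipeptide("AW", toy_scale)
names = [name for name, _ in row]
```

With an eight-parameter scale, each dipeptide yields one row of 16 named variables, which is how one row of the X-matrix would be laid out.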

After that, the variables of all descriptors were combined for each dipeptide, giving 204 predictor variables defined as follows: Z-scale(D1-D6); 5Z-scale(D7-D16); DPPS(D17-D36); MS-WHIM1(D37-D42); MS-WHIM2(D43-D48); ISA-ECI(D49-D52); VHSEA(D53-D68); FASGAI(D69-D80); VSW(D81-D98); E(D99-D108); T(D109-D118); ST(D119-D134); V(D135-D140); G(D141-D156); HESH(D157-D180); HSEHPCSV(D181-D204).

Partial least squares (PLS) regression41 was used to model the relationship between the amino acid descriptors (predictors, X) and the log-transformed IC50 (dependent variable, Y), and it was implemented using MATLAB R2015a software. All variables were auto-scaled to unit variance prior to the analyses. The model was validated internally by cross-validation, and the number of significant PLS components was chosen automatically using rules based on a statistic called Q², the cross-validated R², which reflects the predictive ability of the model. R² is the coefficient of determination, which is also an important parameter in PLS analysis and provides an estimate of the model fit.

Model population analysis (MPA). MPA is a general framework for chemical modeling that uses random resampling and statistical analysis techniques to extract important information from the data.42 Generally, it contains three steps: (1) a random resampling procedure to obtain sub-datasets, (2) a model building procedure to build sub-models, and (3) a statistical analysis procedure to extract information from the outcome of the sub-models. In this study, MPA was applied for outlier detection and variable selection.

Outlier detection. To obtain a robust and highly predictive model, it is crucial to identify and remove outlying samples from the measured data before modeling. The MPA-based method was used to detect outliers in the data.43 To begin with, a number of (e.g. 5000) sub-datasets were generated by applying a random resampling procedure in sample space. Each sub-dataset contained 80% of the samples, randomly selected from the pool of samples. Then, a PLS regression model was built for each sub-dataset, giving the same number of (e.g. 5000) sub-models. In the next step, each sub-model was used to predict the IC50 values of the remaining samples, and the prediction errors for each sample were recorded. Finally, a statistical analysis was applied to the prediction errors of each sample. The average of the prediction errors (MEAN) and the standard deviation of the prediction errors (STD) were used as the basis for outlier detection. In this study, the 3-sigma rule was applied, and samples that exceeded the range of mean±3*standard deviation for MEAN (or STD) were considered outliers. Outliers were eliminated one by one until all samples were within the range.
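The resampling scheme described above can be sketched as follows. This is a simplified illustration, not the authors' MATLAB implementation: a one-variable least-squares line stands in for the PLS sub-model, only a single detection pass is shown (the study removes outliers one at a time and repeats), and all function names and the synthetic data are hypothetical.

```python
import random
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for the PLS sub-model."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def mpa_error_stats(xs, ys, n_runs=500, train_fraction=0.8, seed=0):
    """Per-sample MEAN and STD of held-out prediction errors over random splits.

    Assumes n_runs is large enough that every sample is held out at least once.
    """
    rng = random.Random(seed)
    n = len(xs)
    n_train = int(round(train_fraction * n))
    errors = [[] for _ in range(n)]
    for _ in range(n_runs):
        train = set(rng.sample(range(n), n_train))
        idx = sorted(train)
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):
            if i not in train:  # predict only the samples left out of this split
                errors[i].append(ys[i] - (slope * xs[i] + intercept))
    return [(mean(e), pstdev(e)) for e in errors]


def three_sigma_flags(values):
    """True where a value lies outside mean ± 3 * standard deviation."""
    centre, spread = mean(values), pstdev(values)
    return [abs(v - centre) > 3 * spread for v in values]


def detect_outliers(xs, ys, **kwargs):
    """Indices flagged on MEAN or on STD of their prediction errors."""
    stats = mpa_error_stats(xs, ys, **kwargs)
    mean_flags = three_sigma_flags([m for m, _ in stats])
    std_flags = three_sigma_flags([s for _, s in stats])
    return [i for i in range(len(xs)) if mean_flags[i] or std_flags[i]]


# Synthetic data: a noisy line with one grossly shifted sample at index 10.
xs = [float(x) for x in range(30)]
ys = [2.0 * x + 1.0 + 0.1 * ((7 * x) % 5 - 2) for x in range(30)]
ys[10] += 50.0
flagged = detect_outliers(xs, ys)
```

Because each sample is judged only by models that did not see it, a single aberrant point stands out through its large mean prediction error instead of being absorbed into the fit.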

Variable selection. Variable selection was carried out after excluding the outliers. In the present study, the bootstrapping soft shrinkage (BOSS) method was applied for variable selection.44 It is also based on the idea of MPA.42 First, a number of sub-models were generated using bootstrap resampling in sample space. Then, the regression coefficients of each sub-model were extracted. The regression coefficients of the sub-models were summed to obtain weights for the variables. In the next step, weighted bootstrap resampling45 was applied to build new sub-models, in which variables with larger weights had larger probabilities of being selected into the sub-models. The resampling procedure was repeated and the less important variables were eliminated gradually. This variable selection method compares multiple models instead of a single model and considers random combinations of variables, which is an advantage in selecting the optimal variable combination compared with previous methods.44 A selected variable is represented as n1/n2-descriptor name-parameter number, where n1 denotes the N-terminus and n2 denotes the C-terminus. For example, ‘n1-G-1’ means that the selected important variable is the first parameter of the G-scale describing the amino acid at the N-terminus.
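One weighting and resampling step of the procedure above can be sketched as below. This is an assumption-laden illustration rather than the BOSS reference implementation: absolute coefficient values are summed so that positive and negative coefficients do not cancel (the text only says the coefficients were "summed up"), and the function names and toy coefficients are hypothetical.

```python
import random


def variable_weights(coefficient_sets):
    """Sum absolute regression coefficients over sub-models and normalise to 1.

    Using absolute values is an assumption made for this sketch.
    """
    n_vars = len(coefficient_sets[0])
    totals = [sum(abs(coefs[j]) for coefs in coefficient_sets) for j in range(n_vars)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def weighted_bootstrap(weights, n_draws, rng):
    """Draw variable indices with replacement, with probability proportional
    to weight, and keep the unique indices for the next sub-model."""
    drawn = rng.choices(range(len(weights)), weights=weights, k=n_draws)
    return sorted(set(drawn))


# Toy coefficients from two sub-models over three variables (made up).
coefs = [[0.9, 0.0, -0.4], [1.1, 0.0, 0.6]]
weights = variable_weights(coefs)
selected = weighted_bootstrap(weights, n_draws=3, rng=random.Random(1))
```

Because drawing is done with replacement and duplicates are dropped, each round tends to keep fewer variables than the last, which is what shrinks the variable set gradually instead of by a hard cut-off.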

Statistical Analysis. All statistical analyses were performed using MATLAB software (Version R2015a, The MathWorks, Inc.).

RESULTS AND DISCUSSION

(Please insert Table 1)

QSAR study on ACE inhibitory dipeptides. Modeling of these dipeptides was conducted using the data sets in Table S1. Table 1 summarizes the most important statistical parameters of the models built on the dipeptide dataset with the 16 descriptors above. After elimination of outliers, the final sizes of the calibration datasets differed slightly between descriptors, and the resulting models were substantially improved. According to the three-sigma rule, 2-6 outliers were excluded for each descriptor. Figure 1 shows the process of outlier elimination for the model built with the G-scale descriptor. The outlier numbers are displayed in each panel of Figure 1a-e; the outliers were eliminated in the order 130, 80, 127 and 125. After the outliers were removed, all samples were within the range given by the three-sigma rule (dashed line) (Figure 1e). The process of removing outliers for the QSAR models with other descriptors is the same as for the G-scale descriptor. The models established after eliminating outliers are presented, together with their descriptors, in Table 1. It can be seen clearly that the model derived from the G-scale descriptor has the best predictive performance. The G-scale model has the highest Q² (0.6220) and explains 66.92% of the sum of squares in the Y-variance (R²) after excluding outliers. The Q² values of the G-scale, HSEHPCSV and 5Z-scale models are larger than 0.6. For most of the other descriptors, Q² values are between 0.5 and 0.6. Only the models of MS-WHIM1 and ISA-ECI show Q² values lower than 0.5.
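For readers who want to see how the two statistics compared above differ, the following sketch computes R² from fitted values and Q² from cross-validated predictions. It rests on assumptions not stated in the source: leave-one-out cross-validation is used (the text says only "cross-validation"), a one-variable least-squares line replaces the PLS model, and the data are made up.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for the PLS model."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """1 - (residual sum of squares) / (total sum of squares about the mean)."""
    y_bar = mean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Same formula as R², but every prediction comes from a model
    that was fitted without the sample being predicted."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return r_squared(ys, preds)


# Made-up data for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
q2 = q_squared_loo(xs, ys)
```

Q² cannot exceed R² for a least-squares fit evaluated this way, so the gap between the two gives a rough indication of how much of the apparent fit fails to carry over to unseen samples.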

(Please insert Figure 1)

To further improve the model, we built a model using integrated variables, in which the variables of all 16 descriptors were combined, giving a large dataset with 204 variables. This was followed by outlier elimination and variable selection on the PLS regression models. The aim of this process is to integrate the information of the different descriptors to obtain a better model. The process of outlier elimination for the integrated model is displayed in Figure 2. The outliers were eliminated in the order 127, 130, 125, 124 and 80. After all outliers were removed, all samples were within the range given by the three-sigma rule (dashed line) (Figure 2f). Table 1 shows that the Q² and R² values obtained using the integrated descriptors are 0.6205 and 0.7110, respectively. This is comparable to the result of the best-performing single descriptor (the G-scale descriptor). Moreover, the integrated model leaves more room for further improvement than any single-descriptor model.

(Please insert Figure 2)

Variable selection. The bootstrapping soft shrinkage (BOSS) method was applied for variable selection on the integrated descriptors model.44 The effectiveness of BOSS has been demonstrated elsewhere.46 In the present study, BOSS was run 100 times and the results are shown in Table 1. Compared with the model using all descriptors, variable selection not only reduced the number of variables but also improved the prediction performance of the model. The Q² value after variable selection is 0.7151, a distinct improvement over the full-variable model, for which the Q² value is 0.6205 (Table 1). On average, 48 variables were selected from the 204 variables by the variable selection method. Previous research has shown that not all molecular descriptors are related to biological activity, so it is necessary to delete redundant descriptors to improve the prediction performance of a QSAR model.39 These results also emphasize the importance of outlier removal and variable selection in QSAR modeling.

(Please insert Table 2)

Comparison with the reported models. QSAR studies have been carried out on 58 ACE-inhibitory dipeptides using T-scale, G-scale and HESH descriptors, with Q² values of 0.784, 0.831 and 0.838, respectively (Table 2).38 The Q² value of integrated descriptors combined with BOSS is 0.910, which is larger than those in previous reports. Wu et al. carried out a QSAR study of 168 ACE-inhibitory dipeptides with Z-scale, obtaining a Q² of 0.711 and an R² of 0.732.21 Fu et al. further improved the model to obtain a Q² of 0.716 and an R² of 0.746.9 By using integrated descriptors, the Q² and R² were further improved to 0.804 and 0.816, respectively (Table 2). The comparison with the previous reports showed that our method can give higher prediction accuracy on the same datasets.

It should be noted that the dipeptides in Wu’s study contained 72 duplicated sequences (only 94 unique dipeptides). The existence of duplicated sequences may result in an over-optimistic Q². In the present study, duplicated sequences were eliminated and the median IC50 value of each unique sequence was retained. As a result, 141 unique dipeptides were used for modeling. It is, to our knowledge, the largest number of unique dipeptides ever used in a single QSAR model. Thus, the prediction performance of our model (Q²=0.7151) is better than or comparable to that of the previous studies.

(Please insert Figure 3)

Evaluate the importance of variables. For the 16 single-descriptor models, the importance of the amino acid properties at each position was evaluated using the PLS regression coefficients and the variable importance in projection (VIP).47 Figure 3 shows the evaluation of variable importance for the G-scale model. From the PLS regression coefficients (Figure 3a), it is observed that the variables G1, G5, G6 and G7 are important for the bioactivities of ACE-inhibitory dipeptides. For position n1, G1, G2, G4 and G5 are negatively related to the log IC50 values, while G3, G6, G7 and G8 are positively related. For position n2, G1, G2, G3, G5 and G6 are negatively related to the log IC50 values, while G4, G7 and G8 are positively related. It is evident that position n2 is more relevant to biological activity than position n1. The eight parameters of the G-scale descriptor were derived from 457 physicochemical properties in the amino acid index database, which were classified into three categories: hydrophobic, steric and electronic properties.38 The eight parameters were encoded as G1∼G8, which represent hydrophobicity, STERIMOL minimum width of the side chain, loss of side chain hydropathy by helix formation, optical rotation, side chain molecular volume, frequency of the 4th residue in turn, amino acid composition of EXT of multi-spanning proteins and net charge index, respectively. For the ACE-inhibitory dipeptides, amino acid properties describing hydrophobic and steric characteristics are the most important for biological activity. The importance of the amino acid residue at position n2 is mainly determined by G1 followed by G5, which represent hydrophobicity and side chain molecular volume, respectively. For both positions, amino acid residues with bulky and hydrophobic side chains are preferred, such as phenylalanine, tryptophan, and tyrosine. VIP plots of the PLS model using the G-scale descriptor are summarized in Figure 3b. VIP reflects the size of the contribution of each variable to the activity. For the QSAR model using the G-scale descriptor, the most influential property parameters for ACE-inhibitory dipeptides are G1, G5 and G7, and the properties contributing to the model rank as hydrophobicity > steric properties > electronic properties according to the VIP values. Position n2 is clearly the most influential for biological activity, and the order of the important variables at n2 is G5 > G1 > G7 > G3 > G2 (VIP value >1). For position n1, the eight parameters rank as G2 > G1 > G8 > G5 > G7 > G6 > G3 > G4. Overall, the results of the PLS regression coefficients and the VIP values are similar.
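As background to the VIP values discussed above, one common definition of VIP for variable j is sqrt(p · Σ_a SS_a (w_ja/‖w_a‖)² / Σ_a SS_a), where p is the number of variables, w_a is the PLS weight vector of component a and SS_a is the sum of squares of Y explained by that component. The sketch below implements this definition; whether the authors' software uses exactly this variant is an assumption, and the weights and sums of squares shown are made up.

```python
import math


def vip_scores(weights, explained_ss):
    """VIP per variable.

    weights[a][j]   : PLS weight of variable j in component a
    explained_ss[a] : sum of squares of Y explained by component a
    """
    n_vars = len(weights[0])
    total_ss = sum(explained_ss)
    scores = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ss_a in zip(weights, explained_ss):
            norm_sq = sum(w * w for w in w_a)  # squared length of the weight vector
            acc += ss_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_vars * acc / total_ss))
    return scores


# Made-up two-component model with three variables.
toy_weights = [[0.8, 0.6, 0.0], [0.0, 0.6, 0.8]]
toy_ss = [3.0, 1.0]
vip = vip_scores(toy_weights, toy_ss)
```

Under this definition the squared VIP values average to 1 across variables, which is why a VIP value above 1 is the usual threshold for calling a variable important.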

The variable importance evaluation of E-scale, HSEHPCSV, 5Z-scale, VHSEA and Z-scale showed that hydrophobicity and steric properties were important for the bioactivities. For HESH, the most significant property contributing to the model was hydrophobicity, especially at the C-terminus. The regression coefficients of FASGAI show that the bulky-property parameters may be conducive to enhancing the bioactivity of ACE inhibitors. The important variables of DPPS were steric and electronic properties. For the 3D descriptors, the most important X-variable of ISA-ECI in the QSAR model was the isotropic surface area, while the MS-WHIM descriptor primarily reflected electrostatic potential. According to the regression coefficients and VIP values of all the descriptors, position n2 (C-terminus) of the dipeptide played an important role in ACE-inhibitory activity. Among the properties of the variables, hydrophobicity, steric properties, and electronic properties were crucial, in addition to hydrogen bonding.

Many studies have suggested that a hydrophobic amino acid at the C-terminus is most strongly and positively correlated with the bioactivity of ACE inhibitors.18, 20, 31 Dipeptides with aromatic side chains or proline at the C-terminus and branched aliphatic amino acids at the N-terminus were essential for high inhibitory activity.48 Our results are in agreement with these previous findings.

(Please insert Figure 4)

The BOSS method includes some randomness in its algorithm, so the variables selected in each run can differ slightly. This property gives BOSS a new way of evaluating variable importance, i.e. the frequency with which variables are selected. Variables that are selected more frequently by BOSS have higher variable importance. Figure 4 displays the frequency of the variables selected by the BOSS method on the ACE dipeptide dataset after 100 runs. Among the 16 descriptors, G-scale and HSEHPCSV have the highest frequency, followed by ST-scale, FASGAI, MS-WHIM and VSW. At the level of the individual parameters of each descriptor, the top variables in descending order are as follows: ST-4 > G-1 > FASGAI-6 > HSEHPCSV-11 > G-3 > G-8 > HSEHPCSV-4 > VSW-5 > MS-WHIM2-1. They may be highly correlated with the bioactivity of ACE inhibitors. In other words, hydrophobicity, steric properties, electronic properties and hydrogen bonding were the properties most relevant to the biological activities of ACE-inhibitory dipeptides.

It can be seen from Figure 4 that the variables with high frequency come from different descriptors, and the combination of descriptors greatly improved the prediction ability of the QSAR model. Most of the frequently selected descriptors, such as G-scale, HSEHPCSV and FASGAI, perform well when applied separately in QSAR models. However, some descriptors, such as ST-scale and MS-WHIM, have poor predictive performance when modeled separately, yet they are also selected with high frequency in variable selection. Coupled with the fact that the model obtained by BOSS has improved prediction performance, we may conclude that these descriptors are also important in QSAR model building. This suggests that BOSS can extract additional information from the poorly performing descriptors and takes their interaction with the well-performing descriptors into account.

347

(Please insert Table 3)

348

Prediction and validation of potential ACE-inhibitory dipeptides. Using the constructed QSAR models, the ACE-inhibitory activities of the remaining dipeptides were predicted. Predictions from QSAR models inevitably carry a certain degree of error; therefore, in vitro experiments are required to further validate the activity of the predicted peptides.8 In this study, the five dipeptides with the lowest predicted IC50 were chemically synthesized and their IC50 values were determined. Table 3 compares the predicted and experimental values. The predicted logIC50 values of CW, TW, HW, QW and CY were 0.98, 1.20, 1.24, 1.24 and 1.35, respectively, and the experimental values were 0.54, 1.15, 1.09, 2.03 and 1.63, respectively. All five predicted dipeptides were verified to have ACE-inhibitory activity, and the absolute prediction errors are lower than 0.5 for all of them except QW (0.79). Among the five dipeptides, CW has the lowest predicted and measured logIC50, which means the highest ACE-inhibitory activity. HW, TW, CY and QW also show strong ACE-inhibitory activities. Based on structure-activity relationships, it has been suggested that peptides with high ACE-inhibitory activity should contain hydrophobic amino acids, especially at the C-terminus. The C-terminus of each of the five predicted dipeptides is tryptophan or tyrosine, both of which are strongly hydrophobic. These results indicate the validity of the prediction models, which can provide reliable predictions of the ACE-inhibitory activity of peptides.
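The comparison between predicted and experimental values can be reproduced with a few lines of Python. The sketch below uses the logIC50 values quoted in the text and takes the error as observed minus predicted, which matches the signs in Table 3. The conversion back to IC50 assumes a base-10 logarithm, which the manuscript does not state explicitly; the variable names are illustrative.

```python
# logIC50 values (IC50 in µM) as reported in the text and Table 3.
predicted = {"CW": 0.98, "TW": 1.20, "HW": 1.24, "QW": 1.24, "CY": 1.35}
observed = {"CW": 0.54, "TW": 1.15, "HW": 1.09, "QW": 2.03, "CY": 1.63}

# Error defined as observed minus predicted, matching the sign used in Table 3.
errors = {p: round(observed[p] - predicted[p], 2) for p in predicted}

# Peptides whose absolute error is below the 0.5 log-unit mark cited in the text.
within_half_log = [p for p in predicted if abs(errors[p]) < 0.5]

# Convert observed logIC50 back to IC50 in µM (assumes log base 10).
ic50_um = {p: 10 ** observed[p] for p in observed}
```

Under the base-10 assumption, the observed logIC50 of 0.54 for CW corresponds to an IC50 of roughly 3.5 µM.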

In conclusion, we constructed a benchmark dataset for QSAR studies of ACE-inhibitory dipeptides, which contains 141 unique dipeptides. To our knowledge, this is the largest number of unique dipeptides ever used in a single QSAR model. Among the 16 amino acid descriptors, the G-scale descriptor has the best predictive performance and can be selected to describe the structure of ACE-inhibitory dipeptides. Moreover, the predictive ability of the QSAR model was further improved by using integrated descriptors combined with a variable selection method. The newly predicted ACE-inhibitory peptides were validated through in vitro experiments, which verified the reliability of the model. The QSAR model we built may be useful in the design of novel ACE-inhibitory peptides.

AUTHOR INFORMATION

Corresponding Authors

*(J.D.) Fax: +86 84615285 Tel: +8613787136677 E-mail: [email protected]

*(Y.Y.) Fax: +86 84615285 Tel: +8613974915255 E-mail: [email protected]

Author Contributions

ǁBaichuan Deng and Xiaojun Ni contributed equally.

Funding

The authors gratefully thank the National Natural Science Foundation of China for support of the projects (Grant Nos. 31330075, 31572420 and 31110103909). The studies meet with the approval of the university’s review board.

Notes

The authors declare no competing financial interest.

ABBREVIATIONS USED

ACE, angiotensin-converting enzyme; QSAR, quantitative structure-activity relationship; PLS, partial least squares regression; IC50, half maximal inhibitory concentration; VIP, variable importance in projection; BOSS, bootstrapping soft shrinkage method.

Supporting Information

A table listing the sequence and IC50 values of 141 unique ACE-inhibitory dipeptides.

REFERENCES

(1) Hanif, K.; Bid, H. K.; Konwar, R., Reinventing the ACE inhibitors: some old and new implications of ACE inhibition. Hypertens. Res. 2010, 33, 11-21.

(2) Sturrock, E. D.; Natesh, R.; van Rooyen, J. M.; Acharya, K. R., Structure of angiotensin I-converting enzyme. Cell. Mol. Life Sci. 2004, 61, 2677-2686.

(3) Skeggs, L. T., Jr.; Kahn, J. R.; Shumway, N. P., The preparation and function of the hypertensin-converting enzyme. J. Exp. Med. 1956, 103, 295-299.

(4) Lonn, E. M.; Yusuf, S.; Jha, P.; Montague, T. J.; Teo, K. K.; Benedict, C. R.; Pitt, B., Emerging role of angiotensin-converting enzyme inhibitors in cardiac and vascular protection. Circulation 1994, 90, 2056-2069.

(5) Jahangiri, R.; Soltani, S.; Barzegar, A., A review of QSAR studies to predict activity of ACE peptide inhibitors. Pharm. Sci. 2014, 20, 122-129.

(6) Cooper, W. O.; Hernandez-Diaz, S.; Arbogast, P. G.; Dudley, J. A.; Dyer, S.; Gideon, P. S.; Hall, K.; Ray, W. A., Major congenital malformations after first-trimester exposure to ACE inhibitors. N. Engl. J. Med. 2006, 354, 2443-2451.

(7) Nakamura, Y.; Y. N.; Sakai, K.; Okubo, A.; Yamazaki, S.; Takano, T., Antihypertensive Effect of Sour Milk and Peptides Isolated from It That are Inhibitors to Angiotensin I-Converting Enzyme. J. Dairy Sci. 1995, 78, 1253-1257.

(8) Castellano, P.; Aristoy, M. C.; Sentandreu, M. A.; Vignolo, G.; Toldra, F., Peptides with angiotensin I converting enzyme (ACE) inhibitory activity generated from porcine skeletal muscle proteins by the action of meat-borne Lactobacillus. J. Proteomics 2013, 89, 183-190.

(9) Fu, Y.; Young, J. F.; Løkke, M. M.; Lametsch, R.; Aluko, R. E.; Therkildsen, M., Revalorisation of bovine collagen as a potential precursor of angiotensin I-converting enzyme (ACE) inhibitory peptides based on in silico and in vitro protein digestions. J. Funct. Foods 2016, 24, 196-206.

(10) Zhang, T.; Nie, S.; Liu, B.; Yu, Y.; Zhang, Y.; Liu, J., Activity prediction and molecular mechanism of bovine blood derived angiotensin I-converting enzyme inhibitory peptides. PLoS One 2015, 10, e0119598.

(11) Majumder, K.; Wu, J. P., A new approach for identification of novel antihypertensive peptides from egg proteins by QSAR and bioinformatics. Food Res. Int. 2011, 44, 1371-1378.

(12) Wu, J.; Ding, X., Hypotensive and physiological effect of angiotensin converting enzyme inhibitory peptides derived from soy protein on spontaneously hypertensive rats. J. Agric. Food Chem. 2001, 49, 501-506.

(13) He, R.; Malomo, S. A.; Alashi, A.; Girgih, A. T.; Ju, X.; Aluko, R. E., Purification and hypotensive activity of rapeseed protein-derived renin and angiotensin converting enzyme inhibitory peptides. J. Funct. Foods 2013, 5, 781-789.

(14) Cheung, I. W. Y.; Nakayama, S.; Hsu, M. N. K., Angiotensin-I Converting Enzyme Inhibitory Activity of Hydrolysates from Oat (Avena sativa) Proteins by In Silico and In Vitro Analyses. J. Agric. Food Chem. 2009, 57, 9234-9242.

(15) He, H. L.; Liu, D.; Ma, C. B., Review on the angiotensin-I-converting enzyme (ACE) inhibitor peptides from marine proteins. Appl. Biochem. Biotechnol. 2013, 169, 738-749.

(16) Capriotti, A. L.; Cavaliere, C.; Piovesana, S.; Samperi, R.; Lagana, A., Recent trends in the analysis of bioactive peptides in milk and dairy products. Anal. Bioanal. Chem. 2016, 408, 2677-2685.

(17) Udenigwe, C. C., Bioinformatics approaches, prospects and challenges of food bioactive peptide research. Trends Food Sci. Technol. 2014, 36, 137-143.

(18) Nongonierma, A.; Fitzgerald, D., Learnings from quantitative structure activity relationship (QSAR) studies with respect to food protein-derived bioactive peptides: A review. RSC Adv. 2016, 6, 75400-75413.

(19) Miner-Williams, W. M.; Stevens, B. R.; Moughan, P. J., Are intact peptides absorbed from the healthy gut in the adult human? Nutr. Res. Rev. 2014, 27, 308-329.

(20) Hellberg, S.; Eriksson, L.; Jonsson, J.; Lindgren, F.; Sjöström, M.; Skagerberg, B.; Wold, S.; Andrews, P., Minimum analogue peptide sets (MAPS) for quantitative structure-activity relationships. Int. J. Pept. Protein Res. 1991, 37, 414-424.

(21) Wu, J.; Aluko, R. E.; Nakai, S., Structural requirements of Angiotensin I-converting enzyme inhibitory peptides: quantitative structure-activity relationship study of di- and tripeptides. J. Agric. Food Chem. 2006, 54, 732-738.

(22) Minkiewicz, P.; Dziuba, J.; Iwaniak, A.; Dziuba, M.; Darewicz, M., BIOPEP database and other programs for processing bioactive peptide sequences. J. AOAC Int. 2008, 91, 965-980.

(23) Shtatland, T.; Guettler, D.; Kossodo, M.; Pivovarov, M.; Weissleder, R., PepBank--a database of peptides based on sequence text mining and public peptide data sources. BMC Bioinf. 2007, 8, 280.

(24) Kumar, R.; Chaudhary, K.; Sharma, M.; Nagpal, G.; Chauhan, J. S.; Singh, S.; Gautam, A.; Raghava, G. P., AHTPDB: a comprehensive platform for analysis and presentation of antihypertensive peptides. Nucleic Acids Res. 2015, 43, D956-D962.

(25) Cushman, D. W.; Cheung, H. S., Spectrophotometric assay and properties of the angiotensin-converting enzyme of rabbit lung. Biochem. Pharmacol. 1971, 20, 1637.

(26) Hellberg, S.; Sjöström, M.; Skagerberg, B.; Wold, S., Peptide quantitative structure-activity relationships, a multivariate approach. J. Med. Chem. 1987, 30, 1126-1135.

(27) Sandberg, M.; Eriksson, L.; Jonsson, J.; Sjöström, M.; Wold, S., New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids. J. Med. Chem. 1998, 41, 2481-2491.

(28) Tian, F.; Yang, L.; Lv, F., In silico quantitative prediction of peptides binding affinity to human MHC molecule: an intuitive quantitative structure-activity relationship approach. Amino Acids 2009, 36, 535-554.

(29) Zaliani, A.; Gancia, E., ChemInform Abstract: MS‐WHIM Scores for Amino Acids: A New 3D‐Description for Peptide QSAR and QSPR Studies. J. Chem. Inf. Model. 1999, 39, 525-533.

(30) Collantes, E. R.; Rd, D. W., Amino acid side chain descriptors for quantitative structure-activity relationship studies of peptide analogues. J. Med. Chem. 1995, 38, 2705-2713.

(31) Mei, H.; Liao, Z. H.; Zhou, Y.; Li, S. Z., A new set of amino acid descriptors and its application in peptide QSARs. Biopolymers 2005, 80, 775-786.

(32) Liang, G.; Yang, L.; Kang, L.; Mei, H.; Li, Z., Using multidimensional patterns of amino acid attributes for QSAR analysis of peptides. Amino Acids 2009, 37, 583-591.

(33) Tong, J.; Liu, S.; Zhou, P.; Wu, B.; Li, Z., A novel descriptor of amino acids and its application in peptide QSAR. J. Theor. Biol. 2008, 253, 90-97.

(34) Tian, F.; Zhou, P.; Li, Z., T-scale as a novel vector of topological descriptors for amino acids and its application in QSARs of peptides. J. Mol. Struct. 2007, 830, 106-115.

(35) Yang, L.; Shu, M.; Ma, K.; Mei, H.; Jiang, Y.; Li, Z., ST-scale as a novel amino acid descriptor and its application in QSAM of peptides and analogues. Amino Acids 2010, 38, 805-816.

(36) Venkatarajan, M. S.; Braun, W., New quantitative descriptors of amino acids based on multidimensional scaling of a large number of physical–chemical properties. J. Mol. Model. 2001, 7, 445-453.

(37) Lin, Z. H.; Long, H. X.; Bo, Z.; Wang, Y. Q.; Wu, Y. Z., New descriptors of amino acids and their application to peptide QSAR study. Peptides 2008, 29, 1798-1805.

(38) Wang, X.; Wang, J.; Lin, Y.; Ding, Y.; Wang, Y.; Cheng, X.; Lin, Z., QSAR study on angiotensin-converting enzyme inhibitor oligopeptides based on a novel set of sequence information descriptors. J. Mol. Model. 2011, 17, 1599-1606.

(39) Shu, M.; Mei, H.; Yang, S.; Liao, L.; Li, Z., Structural Parameter Characterization and Bioactivity Simulation Based on Peptide Sequence. QSAR Comb. Sci. 2009, 28, 27-35.

(40) Dan-Qun; LIANG; Gui-Zhao; ZHANG; Zhi-Liang, New Descriptors of Amino Acids and Its Applications to Peptide Quantitative Structure-activity Relationship. Chin. J. Struct. Chem. 2008, 27, 1375-1383.

(41) Wold, S.; Sjöström, M.; Eriksson, L., PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109-130.

(42) Deng, B. C.; Yun, Y. H.; Liang, Y. Z., Model population analysis in chemometrics. Chemom. Intell. Lab. Syst. 2015, 149, 166-176.

(43) Cao, D. S.; Liang, Y. Z.; Xu, Q. S.; Li, H. D.; Chen, X., A new strategy of outlier detection for QSAR/QSPR. J. Comput. Chem. 2010, 31, 592-602.

(44) Deng, B. C.; Yun, Y. H.; Cao, D. S.; Yin, Y. L.; Wang, W. T.; Lu, H. M.; Luo, Q. Y.; Liang, Y. Z., A bootstrapping soft shrinkage approach for variable selection in chemical modeling. Anal. Chim. Acta 2016, 908, 63-74.

(45) Barbe, P.; Bertail, P., The Weighted Bootstrap. Lecture Notes in Statistics 1995, 98.

(46) Lin, Y. W.; Deng, B. C.; Wang, L. L.; Xu, Q. S.; Liu, L.; Liang, Y. Z., Fisher optimal subspace shrinkage for block variable selection with applications to NIR spectroscopic analysis. Chemom. Intell. Lab. Syst. 2016, 159.

(47) Wold, S.; Johansson, E.; Cocchi, M., PLS: Partial Least Squares Projections to Latent Structures, 3D QSAR in drug design. 1993; Vol. 1, p 523-550.

(48) Cheung, H. S.; Wang, F. L.; Ondetti, M. A.; Sabo, E. F.; Cushman, D. W., Binding of peptide substrates and inhibitors of angiotensin-converting enzyme. Importance of the COOH-terminal dipeptide sequence. J. Biol. Chem. 1980, 255, 401-407.

Table 1. Comparisons among different QSAR models for ACE-inhibitory dipeptidesᵃ

| Descriptors | Q² (before outlier elimination) | R² (before) | optPC (before) | Q² (after outlier elimination) | R² (after) | optPC (after) | Outliers (sample no.) |
|---|---|---|---|---|---|---|---|
| G | 0.5331 | 0.5619 | 1 | 0.6220 | 0.6692 | 2 | 130, 80, 125, 127 |
| E | 0.4589 | 0.5202 | 1 | 0.5965 | 0.6490 | 1 | 130, 80, 125, 127, 124 |
| HSEHPCSV | 0.4824 | 0.5920 | 4 | 0.6087 | 0.6809 | 3 | 80, 127, 130, 125 |
| 5Z_scale | 0.4132 | 0.5181 | 5 | 0.5629 | 0.6294 | 5 | 130, 125, 127, 124, 123 |
| HESH | 0.4306 | 0.4823 | 1 | 0.5419 | 0.6776 | 8 | 127, 130, 125, 124 |
| FASGAI | 0.4521 | 0.5074 | 1 | 0.5918 | 0.6436 | 2 | 127, 130, 125, 80 |
| VHSEA | 0.4980 | 0.5396 | 1 | 0.5650 | 0.6033 | 1 | 127, 125, 130 |
| V | 0.4420 | 0.4715 | 1 | 0.5501 | 0.5827 | 2 | 127, 80, 125, 130 |
| T | 0.4943 | 0.5585 | 6 | 0.5716 | 0.6271 | 6 | 127, 124, 130, 125 |
| ST | 0.4918 | 0.5859 | 4 | 0.5755 | 0.6457 | 4 | 127, 130, 124 |
| Z_scale | 0.4753 | 0.5149 | 2 | 0.6028 | 0.6387 | 3 | 130, 125, 127, 124, 123, 80 |
| DPPS | 0.4704 | 0.5449 | 2 | 0.5501 | 0.6155 | 2 | 125, 127, 130 |
| VSW | 0.5170 | 0.6101 | 2 | 0.5430 | 0.6185 | 1 | 127, 125, 130 |
| MS_WHIM2 | 0.4294 | 0.4782 | 2 | 0.5348 | 0.5808 | 5 | 127, 124, 130, 125 |
| MS_WHIM1 | 0.4069 | 0.4540 | 1 | 0.4809 | 0.5238 | 1 | 127, 125 |
| ISA_ECI | 0.4323 | 0.4709 | 4 | 0.4851 | 0.5239 | 4 | 130, 127 |
| Integrated descriptors | 0.5095 | 0.5528 | 1 | 0.6205 | 0.7110 | 2 | 127, 125, 130, 124, 80 |

BOSS: Q² = 0.7151 ± 0.0019; R² = 0.7340 ± 0.0038; 0.4976 (column unclear in source).

ᵃ R² is the coefficient of determination; Q² is the cross-validated R²; optPC is the optimal number of principal components for the PLS regression model; the results of BOSS are shown as mean ± standard deviation over 100 runs.
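The two statistics defined in the footnote of Table 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the modeling code used in this study: it replaces PLS with ordinary least squares on a single predictor to stay dependency-free, it assumes leave-one-out cross-validation (the cross-validation scheme is not restated in this section), and the data and the function names `fit_line` and `r2_and_q2` are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def r2_and_q2(xs, ys):
    """Return (R2, Q2); Q2 uses leave-one-out cross-validation."""
    n = len(ys)
    my = sum(ys) / n
    ss_tot = sum((y - my) ** 2 for y in ys)

    # R2: residuals of the model fitted to all samples.
    a, b = fit_line(xs, ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

    # Q2: each sample is predicted by a model fitted without it (PRESS).
    press = 0.0
    for i in range(n):
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a_i + b_i * xs[i])) ** 2

    return 1 - ss_res / ss_tot, 1 - press / ss_tot

# Made-up data for illustration only (not from this study).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.3, 5.9]
r2, q2 = r2_and_q2(xs, ys)
```

Because each left-out sample is predicted by a model that never saw it, Q² computed this way does not exceed R², which is consistent with every row of Table 1.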

Table 2. The comparison between QSAR models of ACE-inhibitory dipeptides using reported datasets

| Dataset | Descriptor | Modeling Method | Q² (CV) | R² | Ref |
|---|---|---|---|---|---|
| 58 dipeptides | T-scale | PLS | 0.784 | 0.868 | 34 |
| 58 dipeptides | G-scale | PLS | 0.831 | 0.870 | 38 |
| 58 dipeptides | HESH | PLS | 0.838 | 0.877 | 39 |
| 58 dipeptides | Integral + BOSS | PLS | 0.910±0.002 | 0.937±0.004 | |
| 168 dipeptides | Z-score | PLS | 0.711 | 0.732 | 21 |
| 168 dipeptides | 5Z-scale | PLS | 0.716 | 0.746 | 9 |
| 168 dipeptides | Integral + BOSS | PLS | 0.804±0.001 | 0.816±0.002 | |

Table 3. Prediction and experimental validation of potent ACE-inhibitory dipeptidesᵃ

| Peptide | Predicted logIC50 | Observed logIC50 | Error |
|---|---|---|---|
| CW | 0.98 | 0.54 | -0.44 |
| TW | 1.20 | 1.15 | -0.05 |
| HW | 1.24 | 1.09 | -0.15 |
| QW | 1.24 | 2.03 | 0.79 |
| CY | 1.35 | 1.63 | 0.28 |

ᵃ Predicted activity refers to the values obtained from the PLS regression model; observed activity refers to the experimentally determined activity using synthetic dipeptides; logIC50 refers to the logarithmic form of IC50 (µM).

Figure captions

Figure 1. The process of outlier elimination on the model built with G-scale descriptor. The dashed line is defined as the threshold for outliers, which is mean±3*standard deviation for MEAN (or STD). (a) No outlier was eliminated, (b) sample no. 130 was eliminated, (c) sample no. 80 was eliminated, (d) sample no. 127 was eliminated, (e) sample no. 125 was eliminated and all outliers were removed from the model.

Figure 2. The process of outlier elimination on the model built with integrated descriptors. The dashed line is defined as the threshold for outliers, which is mean±3*standard deviation for MEAN (or STD). (a) No outlier was eliminated, (b) sample no. 127 was eliminated, (c) sample no. 130 was eliminated, (d) sample no. 125 was eliminated, (e) sample no. 124 was eliminated, (f) sample no. 80 was eliminated and all outliers were removed from the model.

Figure 3. (a) PLS regression coefficients and (b) VIP of the G-scale model of the ACE-inhibitory dipeptides. A larger VIP value and a larger absolute value of the regression coefficient denote higher variable importance.

Figure 4. The frequency of variables selected in BOSS method on ACE dataset (100 runs). The higher frequency denotes higher variable importance.
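The one-sample-at-a-time elimination described in the captions of Figures 1 and 2 can be sketched in Python as follows. This is a simplified illustration, not the procedure of ref 43: it applies the mean ± 3 × standard deviation threshold to a single per-sample statistic, whereas the figures apply it to both MEAN and STD. The input values and the function name `eliminate_outliers` are hypothetical.

```python
from statistics import mean, stdev

def eliminate_outliers(values, k=3.0):
    """Iteratively remove the most extreme value outside mean ± k*SD.

    Returns the original indices of removed samples, in removal order.
    """
    remaining = dict(enumerate(values))  # index -> value
    removed = []
    while len(remaining) > 2:
        vals = list(remaining.values())
        m, s = mean(vals), stdev(vals)
        # Samples lying outside the threshold (the dashed line in the figures).
        flagged = [i for i, v in remaining.items() if abs(v - m) > k * s]
        if not flagged:
            break
        # Remove only the worst offender, then recompute the threshold.
        worst = max(flagged, key=lambda i: abs(remaining[i] - m))
        removed.append(worst)
        del remaining[worst]
    return removed

# Made-up per-sample statistics (e.g. mean prediction errors); illustrative only.
values = [0.9, 1.1] * 9 + [6.0, 3.0]
removed = eliminate_outliers(values)
```

In this toy input the value 3.0 is not flagged in the first pass because the larger outlier inflates the standard deviation; it only crosses the threshold once 6.0 has been removed. This masking effect is one reason to recompute the threshold after each elimination.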

Figure 1

Figure 2

Figure 3

Figure 4

Graphic for table of contents.