
Article

A straightforward and highly efficient strategy for hepatocellular carcinoma glycoprotein biomarker discovery using a nonglycopeptide-based mass spectrometry (NGP-MS) pipeline Weiqian Cao, Biyun Jiang, Jiangming Huang, Lei Zhang, Mingqi Liu, Jun Yao, Mengxi Wu, Lijuan Zhang, Siyuan Kong, Yi Wang, and Pengyuan Yang Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.9b03074 • Publication Date (Web): 27 Aug 2019 Downloaded from pubs.acs.org on August 28, 2019



Analytical Chemistry

A straightforward and highly efficient strategy for hepatocellular carcinoma glycoprotein biomarker discovery using a nonglycopeptide-based mass spectrometry (NGP-MS) pipeline

Wei-Qian Cao1,2,#,*, Bi-Yun Jiang1,#, Jiang-Ming Huang1,3, Lei Zhang1, Ming-Qi Liu1,3, Jun Yao1, Meng-Xi Wu1,3, Li-Juan Zhang1, Si-Yuan Kong1, Yi Wang1, Peng-Yuan Yang1,3,*

1. The Fifth People’s Hospital of Shanghai and Institutes of Biomedical Sciences, Fudan University, Shanghai, China

2. NHC Key Laboratory of Glycoconjugates Research, Fudan University, Shanghai, China

3. Department of Chemistry, Fudan University, Shanghai, China

# These authors contributed equally to this work

* To whom correspondence should be addressed:

P.-Y.Y. ([email protected])

C.-W.Q. ([email protected])

ABSTRACT

Efficient detection of aberrant glycoproteins in serum is particularly important for biomarker discovery. However, direct quantitation of glycoproteins in serum remains technically challenging due to the extraordinary complexity of the serum proteome. In the current work, we proposed a straightforward and highly efficient strategy that uses the nonglycopeptides released from specifically enriched glycoproteins for targeted glycoprotein quantification. With this so-called nonglycopeptide-based mass spectrometry (NGP-MS) strategy, a powerful and nondiscriminatory pipeline for HCC glycoprotein biomarker discovery, verification and validation has been developed. Firstly, a dataset of 234 NGPs was strictly established for MRM quantification in serum. Secondly, the NGPs enriched from 20 HCC serum mixtures and 20 normal serum mixtures were labeled with mTRAQ reagents (Δ0 and Δ8, respectively) to find the differentially expressed glycoproteins in HCC. A total of 97 glycoprotein candidates were preliminarily screened and submitted for absolute quantitation with NGP-based SID-MRM in individual samples of 38 HCC sera and 24 normal controls. Finally, 21 glycoproteins were absolutely quantified with high quality. The diagnostic sensitivity results showed that three glycoproteins, beta-2-glycoprotein 1 (APOH), alpha-1-acid glycoprotein 2 (ORM2) and complement C3 (C3), could be used for the discrimination between HCC patients and healthy people. In this study, a novel glycoprotein biomarker panel (APOH, ORM2, C3 and AFP) outperformed AFP, the known HCC serum biomarker, alone. We believe that this strategy and the panel of glycoproteins might hold great clinical value for HCC detection in the future.

Glycosylation, as one of the most important posttranslational protein modifications, plays significant roles in various biological processes.1 Abnormal changes in protein glycosylation not only affect the biological functions of glycoproteins, but also are associated with a variety of diseases, including cancers.2,3 Approximately 25% of FDA-approved tumor markers are glycoproteins.4,5 Thus, efficient identification and quantification of abnormal glycoproteins between healthy and cancerous individuals would be useful for the study of the pathological mechanism of cancer and the development of specific cancer biomarkers.6

Human serum is a rich source of biomarkers and is generally considered crucial for disease diagnosis and therapeutic target discovery. Serum contains various proteins and presents the deepest version of the human proteome. Aberrant glycoproteins in serum can reflect abnormal states of cancer patients, since aberrant glycoproteins can either be secreted into the bloodstream or shed from cell membranes via abnormally enhanced protease activity.7 Consequently, quantitative studies of aberrant changes of glycoproteins in serum are particularly important for biomarker discovery. Direct identification and quantitation of glycoproteins in serum based on mass spectrometry8,9 are technically challenging due to the extraordinary complexity of the serum proteome, the inherent low stoichiometry and microheterogeneity of glycosylation, as well as the serious signal suppression of glycopeptides by abundant nonglycopeptides during MS analysis.

In recent years, multiple reaction monitoring (MRM) MS has been introduced to compensate for the shortfall in biomarker development10 and recognized as a rapid and cost-effective measurement technology for highly specific detection of targeted proteins in extremely complex biological samples.11,12 Different strategies employing MRM have been designed for glycoprotein quantification.13 For example, Lebrilla C.B. et al. applied the MRM technique to quantify immunoglobulins G, A, and M and their site-specific glycans simultaneously and directly from human serum/plasma without protein enrichment.14 Qian X. et al. used MRM-MS to analyze the partially deglycosylated glycopeptides and obtain site-specific quantification information of core fucosylated peptides.15

Although MRM has shown vital value and great potential in both glycoproteome research and biomarker discovery, the broad application of this method has been impeded by the limited choice of internal references. MRM is based on the concept of dilution with stable isotope-labeled synthetic reference peptides, which precisely mimic the deglycosylated form of candidate glycopeptides as internal references, for the purpose of glycoprotein quantification. The deglycosylated peptides that are selected as internal references must meet several basic requirements, such as no missed cleavage sites and no methionine in the peptide sequence.16 However, since only 5% of the total number of tryptic peptides contain the N-X-S/T motif, the choice of internal references is very limited.17 Moreover, microheterogeneity exists at each N-glycosylation site and every glycoprotein has multiple glycosylation sites, which further increases the difficulty of choosing internal peptides and can lead to ambiguous results for glycoprotein quantification.18 To solve the abovementioned problems, in the current study, we proposed a promising targeted glycoproteomic MRM-MS strategy based on nonglycopeptides from N-glycoproteins instead, developed an efficient pipeline for high-throughput screening of differentially expressed glycoproteins and provided a dataset of nonglycopeptide references for monitoring glycoprotein quantification in clinical HCC serum.

EXPERIMENTAL SECTION

Chemicals and Reagents

Sequencing grade trypsin was purchased from HUA LISHI SCIENTIFIC Corporation (Beijing, China). PNGase F (glycerol free) was purchased from New England Biolabs (Ipswich, MA). Affi-Gel® Hz Hydrazide Gel was purchased from Bio-Rad Laboratories (Hercules, CA). An mTRAQ reagent 10 assay kit was purchased from SCIEX (Redwood, CA). The crude and isotope-labeled peptides were obtained from Guo Tai Biological Technology Corporation (Hefei, China). The synthetic peptides were assessed by MALDI-TOF MS and reversed-phase high-performance liquid chromatography (RP-HPLC). All other chemicals were purchased from Sigma-Aldrich (St. Louis, MO).

Human Serum Samples Collection

A total of 102 human serum samples, including 58 HCC and 44 normal controls, were collected at Shanghai Zhongshan Hospital and Huashan Hospital from April to December 2014. The collected blood samples were immediately placed on ice and allowed to stand for 30 min. The samples were then centrifuged at 2000g for 15 min. The supernatant was collected and stored at -80 °C until use.

Informed consent was obtained under protocols approved by an institutional review board. The research followed the tenets of the Declaration of Helsinki and was approved by the Ethics Committee of the Fudan University Shanghai Zhongshan Hospital and Huashan Hospital. HCC diagnoses were confirmed by histopathologic study. Normal controls were selected from those without a history of liver disease and hepatitis B or hepatitis C infection and with normal liver biochemical function. The clinical information pertaining to HCC samples and normal controls is included in Table S1.

LC-MS/MS Identification of Nonglycopeptides and Glycopeptides for Glycoproteins Enriched from Serum Sample

In the identification stage, a pooled 10 μL serum sample from 20 HCC and 20 normal serum mixtures or a mixture of five standard glycoproteins with E. coli proteins was used for subsequent analysis.

For identification, glycoproteins were first captured from samples using the hydrazide chemistry method.19 Then, the bound glycoproteins were reduced with 10 mM dithiothreitol in 8 M urea/100 mM ammonium bicarbonate buffer at 37 °C for 2 h, and alkylated with 20 mM iodoacetamide at room temperature in the dark for 0.5 h. The resins were washed with 100 mM ammonium bicarbonate containing 8 M urea, 1.5 M NaCl and 100 mM ammonium bicarbonate twice to remove unbound nonglycosylated proteins. Finally, the nonglycopeptides were released from captured glycoproteins by trypsin digestion. Released nonglycopeptides were collected, lyophilized with SpeedVac, and stored at -80 °C for later analysis. Glycopeptides still bound to the resins were released by further incubating the resins with PNGase F at 37 °C overnight. Glycopeptides were also lyophilized with SpeedVac and stored at -80 °C for later analysis.

The enriched nonglycopeptides and glycopeptides were then analyzed by LC-MS/MS. The analyses were carried out on nano-LC-ESI MS/MS. The peptides were suspended in 5% (v/v) ACN containing 0.1% (v/v) FA (phase A) and separated by a 15 cm reversed-phase column with a gradient of 5%–45% phase B (95% ACN with 0.1% formic acid) over 100 min at a constant column-tip flow rate of 500 nL/min. The peptides were analyzed using an LC-20AB system (Shimadzu, Tokyo, Japan) connected to an LTQ Orbitrap mass spectrometer (Thermo Electron, Bremen, Germany) equipped with an online nanoelectrospray ion source. The spray voltage was 2.3 kV and the heated capillary was set at 180 °C. The peptides were analyzed by MS and data-dependent MS/MS acquisition, selecting the 10 most abundant precursor ions for MS/MS with a dynamic exclusion duration of 60 s. The resolution was set to 60,000 at m/z 400. AGC was set to 1,000,000 for MS1 and 10,000 for MS2. The scan range was set from m/z 300 to m/z 1600.

The acquired MS/MS spectra from LC-MS/MS were searched by MASCOT (version 2.3) against the human protein Swiss-Prot database for human serum (including 20212 entries), or against the manually combined dataset of five standard glycoproteins from UniProt for the standard glycoproteins. The search parameters were set as follows: fixed modification of cysteine residues (C, +57 Da), variable modification of methionine oxidation (M, +16 Da), maximum of two missed tryptic cleavage sites, 20 ppm error tolerance in MS and 1 Da error tolerance in MS/MS. The cut-off false discovery rate for all peptide identification processes was controlled to below 1%. For the glycopeptide search, variable modification of deamidation (N, +0.98 Da for glycan release in H₂¹⁶O, or N, +2.98 Da for glycan release in H₂¹⁸O) was added. Only peptides with an N-X-S/T (X≠P) sequon were considered N-glycopeptides.
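The sequon rule used to accept N-glycopeptides (N-X-S/T with X≠P) can be illustrated with a short sketch. This is not the authors' software; the function names and the example peptide strings below are hypothetical, and the code only shows one way the motif test could be expressed.

```python
import re

# N-X-S/T where X is any residue except proline.
# The lookahead keeps the match one residue wide, so overlapping
# motifs such as "NNST" are both reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(peptide: str) -> list[int]:
    """Return 1-based positions of Asn residues inside an N-X-S/T (X != P) motif."""
    return [m.start() + 1 for m in SEQUON.finditer(peptide.upper())]


def is_candidate_n_glycopeptide(peptide: str) -> bool:
    """True if the peptide carries at least one sequon (hypothetical helper)."""
    return bool(sequon_positions(peptide))


if __name__ == "__main__":
    # Made-up sequences, used only to exercise the rule.
    for seq in ("AANGTK", "ANPSK", "NNSTK"):
        print(seq, sequon_positions(seq))
```

A peptide with proline in the X position (for example the made-up "ANPSK") is rejected, which mirrors the X≠P condition stated above.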

MRM Analysis

All MRM experiments were carried out on a 6500 QTRAP hybrid triple quadrupole/linear ion trap mass spectrometer (SCIEX, CA) interfaced with an Eksigent nano 1D plus system (AB Sciex, CA). Peptides were separated by a 15 cm reversed-phase column (75-μm inner diameter; C18 3-μm silica beads) with a gradient of 5%–80% phase B (98% ACN with 0.1% formic acid) over 60 min at a constant column-tip flow rate of 300 nL/min. The ion spray voltage was set to 2300 V and the curtain gas pressure was 30 psi. The Q1 and Q3 quadrupoles were set at unit resolution (0.7 FWHM). The dwell time for each MRM transition was set to 0.02 s. Skyline was used to generate the MRM method and analyze the data from the MRM assay.

MRM-MS optimization: A total of 234 nonglycopeptides from 105 N-glycoproteins of human serum were selected and synthesized for MRM-MS optimization according to the following basic principles: (1) no missed cleavage sites; (2) 2+ or 3+ charge states; (3) no methionine in the peptide sequence; and (4) a length of 5-25 amino acids. Transitions and collision energy (CE) were optimized to guarantee high sensitivity. The 2 or 3 most intense daughter ions of the signature peptides were selected for the MRM transitions. Singly charged y-ions, which yield the highest intensity and the largest mass differences between the labeled and unlabeled peptides, were the preferred selection. The CE was optimized based on calculation and experiments. The default collision energies used for the 6500 QTRAP instrument were calculated according to the formulas CE = 0.057 × (precursor m/z) − 4.265 and CE = 0.031 × (precursor m/z) + 7.082, for doubly and triply charged precursor ions, respectively. The transitions for each peptide were measured for 11 different CE values (five steps and 1 V step size on either side of the default CE).
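As a worked illustration of the CE calculation described above, the following sketch evaluates the two quoted linear formulas and builds the 11-point ramp (five 1 V steps on either side of the default value). The function names and the example precursor m/z of 600 are assumptions for illustration; the actual method was generated in Skyline.

```python
def default_ce(precursor_mz: float, charge: int) -> float:
    """Default collision energy from the linear formulas quoted in the text."""
    if charge == 2:
        return 0.057 * precursor_mz - 4.265
    if charge == 3:
        return 0.031 * precursor_mz + 7.082
    raise ValueError("the quoted formulas cover only 2+ and 3+ precursors")


def ce_ramp(precursor_mz: float, charge: int,
            steps: int = 5, step_size: float = 1.0) -> list[float]:
    """CE values tested per transition: `steps` values on each side of the default."""
    center = default_ce(precursor_mz, charge)
    return [round(center + k * step_size, 3) for k in range(-steps, steps + 1)]


if __name__ == "__main__":
    # Hypothetical doubly charged precursor at m/z 600.
    print(default_ce(600.0, 2))   # about 29.9 V
    print(ce_ramp(600.0, 2))      # 11 values centered on the default
```

With the defaults, the ramp always contains 2 × 5 + 1 = 11 values, matching the number of CE settings reported for each transition.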

mTRAQ labeling and MRM relative quantitation: Nonglycopeptides enriched from 20 HCC serum mixtures and 20 normal serum mixtures were labeled using the mTRAQ Reagent 10 Assay Kit (Δ0 and Δ8 reagent, respectively, SCIEX 4440014 and 4427697) according to the product manual. The labeled peptides were then desalted with C18 columns, lyophilized with SpeedVac, and stored at -80 °C for later analysis. Proteins with a fold change greater than 1.20 or less than 0.83 were selected as differentially expressed proteins. Nonglycopeptides enriched from five standard glycoproteins were labeled using the mTRAQ Reagent 10 Assay Kit (Δ0 and Δ4 reagent, respectively, SCIEX 4440014 and 4427696).
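The fold-change cut-offs for the HCC versus normal comparison can be written as a small classifier. This is a minimal sketch under stated assumptions: the ratio is taken as the Δ0 (HCC) peak area divided by the Δ8 (normal) peak area, the comparisons are strict as in the text, and the function names are hypothetical. Note that 0.83 is approximately 1/1.20, so the two thresholds are symmetric on a log scale.

```python
def fold_change(hcc_area: float, normal_area: float) -> float:
    """Ratio of the Δ0-labeled (HCC) to the Δ8-labeled (normal) peak area (assumed convention)."""
    if normal_area <= 0:
        raise ValueError("normal_area must be positive")
    return hcc_area / normal_area


def classify_fold_change(ratio: float, up: float = 1.20, down: float = 0.83) -> str:
    """Label a ratio using the cut-offs stated in the text (strict inequalities)."""
    if ratio > up:
        return "up"
    if ratio < down:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # Made-up peak areas, for illustration only.
    ratio = fold_change(300.0, 200.0)
    print(ratio, classify_fold_change(ratio))
```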

Absolute quantitation with stable isotope dilution-based MRM (SID-MRM): For absolute quantitation, stable isotope-labeled (SIL) peptides of targeted proteins were synthesized (arginine or lysine of each peptide was isotope-labeled with 13C and 15N, Guopingyaoye, Ltd, China). First, SIL peptides were added to the digested serum proteins for MRM-MS to estimate the concentration of the peptides in the sample. Then, SIL peptides were added to the digested serum proteins and tested in triplicate to construct a standard curve. Finally, SIL peptides were spiked into each sample at a defined concentration (Table S2), and a total of 1 μg of proteotypic peptides from each sample was applied for analysis.
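The principle behind stable isotope dilution can be sketched with a single-point calculation: the endogenous (light) peptide amount is estimated from the light-to-heavy peak-area ratio multiplied by the known spiked amount of the SIL peptide. This is a simplified illustration, not the authors' procedure, which also uses a standard curve; the function name, units, and numbers are assumptions.

```python
def light_amount(light_area: float, heavy_area: float, heavy_spiked: float) -> float:
    """Single-point isotope dilution estimate.

    light_area, heavy_area: integrated MRM peak areas of the endogenous and
    SIL peptide; heavy_spiked: known spiked SIL amount (any consistent unit).
    Assumes equal response of light and heavy forms.
    """
    if heavy_area <= 0:
        raise ValueError("heavy_area must be positive")
    return light_area / heavy_area * heavy_spiked


if __name__ == "__main__":
    # Made-up values: light peak twice as large as a 10 fmol heavy spike.
    print(light_amount(2.0e5, 1.0e5, 10.0))
```

The estimate is expressed in the same unit as the spiked amount, so the result is only as accurate as the SIL peptide quantity and the assumption of equal response for both forms.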

Statistical Construction of a Diagnostic Model

The quantitative results from the SID-MRM analysis were processed and visualized using Prism 5.0 (GraphPad Software Inc., La Jolla, CA, USA). Binary logistic regression was performed to calculate the receiver operating characteristic (ROC) curves and to produce predictive models by comparing the area under the curve (AUC) using SPSS (v19.0, IBM, Armonk, NY, USA). Parameters of binary logistic regression using SPSS were set as follows: dependent variable was set as 1; covariates were set as the average concentration of the glycoproteins. The binary logistic regression results were saved as Probability (P) and are shown in Table S3.
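To make the AUC comparison concrete, the sketch below computes the area under the ROC curve directly from predicted probabilities, using the equivalence between AUC and the probability that a randomly chosen case scores higher than a randomly chosen control. It is not a reproduction of the SPSS workflow; the function name and the probabilities are hypothetical.

```python
def roc_auc(case_scores: list[float], control_scores: list[float]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Made-up predicted probabilities for three HCC cases and three controls.
    print(roc_auc([0.9, 0.8, 0.4], [0.3, 0.5, 0.1]))
```

Two models (for example a single marker versus a panel) can then be compared by computing this value for each model's predicted probabilities on the same samples.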

Immunohistochemistry (IHC) Validation

Whole sections of a formalin-fixed and paraffin-embedded tissue microarray with 75 hepatocellular carcinoma tissue points and 75 paracancerous tissue points were purchased from Xin Chao Corporation (Beijing, China) and used for immunohistochemistry analysis. The pathological information pertaining to the tissue chip is included in Table S4. After dewaxing, rehydration and antigen retrieval, the sections were preincubated with 3% hydrogen peroxide to inactivate endogenous peroxidase and blocked with 5% bovine serum albumin (BSA, Sigma) for 1 h. The sections were then incubated with the primary antibody at 4 °C overnight followed by a secondary antibody at 37 °C for 30 min. Peroxidase activity was revealed by 3,3′-diaminobenzidine (DAB). After staining with hematoxylin, the sections were dehydrated in an alcohol gradient, cleared with xylene, mounted and imaged under a light microscope.

RESULTS AND DISCUSSION

Nonglycopeptide-based Mass Spectrometry (NGP-MS) Strategy

MRM-MS is a powerful tool for proteomic quantitation. However, glycoprotein quantitation is hampered by the limited choice of glycopeptides as internal references. In the current study, we proposed a straightforward nonglycopeptide-based MS (NGP-MS) strategy for targeted glycoproteomic quantitation. In this strategy, glycoproteins were first captured on solid beads through hydrazide chemistry. After thorough washing, the nonglycopeptides (NGPs) of captured glycoproteins were released through trypsin digestion and analyzed by LC-MS/MS. Finally, the selected NGPs were synthesized for the MRM assay (Fig. 1A).

The overall scheme of the NGP-MS-based pipeline for HCC biomarker development is shown in Figure 1B-1D. First, to ensure the feasibility and quantitative accuracy of the strategy, we quantitatively analyzed a mixture of defined amounts of five standard glycoproteins, cytochrome c (Cyto C), immunoglobulin G (IgG), ovalbumin (OVA), horseradish peroxidase (HRP) and fetuin, using the NGP-MS strategy (Fig. 1B). The NGP-MS strategy was demonstrated to be an efficient strategy for glycoprotein quantitation, exhibiting high accuracy and good reproducibility. Then, NGP-MS was applied to human serum analysis (Fig. 1C). A total of 1924 NGPs from 259 glycoproteins were identified. Through peptide selection and optimization, a dataset containing 234 NGPs of 105 glycoproteins with optimum parameters was established for MRM quantitation. Based on the dataset, NGP-MS was ultimately used for the discovery of HCC candidate biomarkers (Fig. 1D). We found that 97 glycoproteins were significantly changed in HCC serum through primary screening by mTRAQ labeling and MRM relative quantitation (fold-change>1.20 or