A multiplex fragment ion-based method for accurate proteome quantification


Article

A multiplex fragment ion-based method for accurate proteome quantification Jianhui Liu, Yuan Zhou, Yichu Shan, Baofeng Zhao, Yechen Hu, Zhigang Sui, Zhen Liang, Lihua Zhang, and YuKui Zhang Anal. Chem., Just Accepted Manuscript • Publication Date (Web): 21 Feb 2019 Downloaded from http://pubs.acs.org on February 21, 2019



Analytical Chemistry

A multiplex fragment ion-based method for accurate proteome quantification

Jianhui Liu†,‡,‖, Yuan Zhou†,§,‖, Yichu Shan†, Baofeng Zhao†, Yechen Hu†,‡, Zhigang Sui†, Zhen Liang†, Lihua Zhang†,*, and Yukui Zhang†

† CAS Key Laboratory of Separation Sciences for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China.
‡ University of Chinese Academy of Sciences, Beijing 100049, China.
§ School of Medical Technology, Xuzhou Medical University, Xuzhou 221004, China.

ABSTRACT: Multiplex proteome quantification with high accuracy is urgently required to achieve a comprehensive understanding of dynamic cellular and physiological processes. Among the existing quantification strategies, fragment ion-based methods can provide highly accurate results, but the multiplex capacity is limited to 3-plex. Herein, we developed a multiplex pseudo-isobaric dimethyl labeling (m-pIDL) method to extend the capacity of the fragment ion-based method to 6-plex by one-step dimethyl labeling with several millidalton and dalton mass differences between precursor ions and enlarging the isolation window of precursor ions to 10 m/z during data acquisition. M-pIDL showed high quantification accuracy within the 20-fold dynamic range. Notably, the ratio compression was 1.13-fold in a benchmark two-proteome model (5:1 mixed E. coli proteins with HeLa proteins as interference), indicating that by m-pIDL, the ratio distortion of isobaric labeling approaches and the approximate 40% ratio shift of label-free quantification strategy could be effectively eliminated. Additionally, m-pIDL did not show ratio variation among post-translational modifications (CV 6.66%), which could benefit the measurement of universal protein properties for proteomic atlases. We further employed m-pIDL to monitor the time-resolved responses of TGF-β-induced epithelial-mesenchymal transition (EMT) in lung adenocarcinoma A549 cell lines, which facilitated the finding of new potential regulatory proteins. Therefore, the 6-plex quantification of m-pIDL with the remarkably high accuracy might create new prospects for comprehensive proteome analysis.

Due to the diversity and dynamic nature of biological systems, measurements of proteomics across multiple conditions, such as different perturbations, subcellular localizations and dynamic interaction networks, are urgently needed to understand complex protein properties and behavior in depth.13 Multiplex proteome quantification with high accuracy constitutes the crucial and ever-present requirement for numerous studies. Fueled by advances in mass spectrometry and related technologies, various relative quantification approaches have been widely used including label-free quantification strategies, data-independent acquisition (DIA) strategies, and isobaric labeling strategies.4

Label-free quantification (LFQ) strategies, especially for MS1 intensity-based methods, are extensively used in high-throughput proteomics due to the simplicity, unlimited sample number and high quantification accuracy.5-7 By this strategy, peptide quantities can be determined by integrating the extracted ion chromatograms (XICs) of precursor ions across different runs in data-dependent acquisition (DDA) mode.8,9 Continued improvements in mass spectrometer hardware have also promoted the development of data-independent acquisition (DIA) strategies in recent years with high quantification reproducibility. As a remarkable example, the sequential isolation window acquisition of all theoretical fragment-ion spectra (SWATH)–MS is implemented by selecting and fragmenting peptides within m/z ranges (that typically span 25 m/z) and repeatedly cycling through the entire mass range.10 Then, appropriate algorithms are needed to extract the fragment ion signals for each peptide from the tandem mass spectra within the retention time. However, multiple samples can only be analyzed in different runs by the above-mentioned methods, which might result in the same peptide having different retention times and ionization efficiencies between runs. Therefore, the high reproducibility of HPLC separation is indispensable for the sophisticated alignment and normalization procedures of data processing. Additionally, the required instrument time is multiplied along with the increase of the sample number.

Such shortcomings could be overcome by multiplex isobaric labeling strategies, such as using isobaric tags for relative and absolute quantification (iTRAQ)11 and using tandem mass tags (TMTs)12,13. To realize the analysis of multiple samples in a single run, samples are labelled with different isotope reagents. After labeling, the precursor ion masses among samples are the same, and the yielded reporter ions from part of the labeling reagents can be distinguished and quantified in the tandem mass spectra. The reagents of up to 8-plex iTRAQ and 11-plex TMTs have been commercially available, and such methods have been widely used for intricate proteomes14,15 because they not only make quantitative comparison more straightforward and simpler, especially for various sample pre-fractionation procedures, but also reduce the instrument time needed.16 However, these methods still suffer from critical ratio distortion owing to the fragmentation of co-isolated interference. The MultiNotch MS3 method17,18 has been developed to improve the quantification accuracy, but the improvement comes at the expense of a reduction in quantitative sensitivity, and the method has some special requirements for MS instruments. In addition, routine uses of the multiplex isobaric labeling strategies have been stifled by their high cost, mainly due to the multistep synthesis of labeling reagents with moderate to low yields. So, improving the quantification accuracy and reducing the cost are the main issues to be addressed.17,19

To improve the quantification accuracy, fragment ion-based strategies have shown good performance20,21 since fragment ions have peptide specificity and high signal-to-noise (S/N) ratios in tandem mass spectra after labeling. The strategies were achieved by the co-fragmentation of heavy and light labeled peptides, mainly based on isobaric peptide termini labeling20,22 or a wide precursor window23. However, the strategies have not been widely used because the multiplex capacity is limited to 3-plex, confined by the multi-dalton mass differences between isotopic series24. In our previous works, we introduced pseudo-isobaric dimethyl labeling (pIDL) with a 5.84 mDa mass difference between fragment ions based on mass defects of 12C/13C and 1H/2H, and it showed the advantages of high accuracy and reduced MS/MS complexity. However, only two samples could be analyzed simultaneously.25,26

Herein, we developed a multiplex pIDL-based proteome quantification method, named m-pIDL, using 6-plex dimethyl labeling and a 10 m/z isolation window of precursor ions for fragmentation. It showed excellent quantification accuracy and a 20-fold dynamic range in E. coli digests. The quantification accuracy was further confirmed by the analysis of a two-proteome interference model (a mixture of E. coli and HeLa digests). Moreover, we showed the applicability among standard peptides with different post-translational modifications. We further applied the method to the time course analysis of TGF-β-induced epithelial-mesenchymal transition (EMT) in lung adenocarcinoma A549 cell lines, and our understanding of the cancer invasion and metastasis mechanism was improved through the newly quantified differentially expressed proteins.

EXPERIMENTAL SECTION

Sample Preparation. Escherichia coli cells (E. coli, strain K12) were grown in Luria-Bertani broth medium (5 g/L yeast extracts, 10 g/L NaCl and 10 g/L tryptone). HeLa cells were grown in DMEM supplemented with 10% FBS and 1% penicillin-streptomycin. The cells were lysed in ice-cold lysis buffer composed of 8 M urea and 1% (v/v) protease inhibitor cocktail and ultrasonicated on ice for protein extraction. Then, the lysates were centrifuged at 20 000 g for 30 min to remove cell debris, and the protein concentrations were measured using a BCA assay (Beyotime, Nantong, China). Disulfide bonds of E. coli, HeLa and β-casein proteins were reduced by dithiothreitol at 56°C for 1 h, and cysteine residues were alkylated with iodoacetamide in darkness at room temperature for 30 min. After dilution to ten volumes with 50 mM phosphate buffer (pH 8.0), proteins were digested overnight at 37°C with trypsin at a ratio of 1:50 (enzyme/protein, m/m).

A549 human lung adenocarcinoma cells were cultured in MEM supplemented with 10% FBS and 1% penicillin-streptomycin. Cells were treated with recombinant human transforming growth factor beta 1 (TGF-β1, R&D Systems, Minneapolis, MN, USA), and five samples were collected during the EMT process at the time points of 12 h, 18 h, 24 h, 36 h and 48 h, respectively. The control group was the A549 cells cultured 48 h without treatment. Proteins were extracted by the lysis buffer27 containing 10% C12Im-Cl and 1% protease inhibitor cocktail dissolved in 50 mM NH4HCO3. After centrifugation to remove cell debris and concentration measurement through a BCA assay, proteins were incubated in 0.1 M DTT at 95°C for 5 min. Then, the extracted proteins were transferred to 10 kDa filter devices and washed with 50 mM phosphate buffer by centrifugation at 16 000 g. Next, 100 μL of 50 mM IAA was added to the concentrates and kept in the darkness for 30 min. Subsequently, the resulting concentrates were washed three times with 50 mM phosphate buffer and digested by trypsin overnight at the ratio of 1:30 (enzyme/protein, m/m). The digests were finally obtained through centrifugation, and the filter devices were washed twice with water.

The two-proteome model for the LFQ strategy contained two samples for relative quantification. One consisted of 10 μg E. coli digests and 50 μg HeLa digests. The other consisted of 50 μg E. coli digests and 50 μg HeLa digests. Both were stored at -20°C until use.

Dimethyl Labeling. The E. coli, HeLa and A549 cell digests were labeled with 30L (13CH2O and NaBH3CN), 30H (CH2O, NaBD3CN), 32L (13CH2O, NaBD3CN), 32H (CD2O, NaBH3CN), 34L (13CD2O, NaBH3CN), and 34H (CD2O, NaBD3CN) dimethyl labeling reagents. For this, 100 μg digests were labeled with 20 μL of 4% CH2O or its isotopes and 20 μL of 0.6 M NaBH3CN or NaBD3CN for 1 h at room temperature. β-casein and standard peptides were labeled with 32L, 32H, 34L and 34H. The six-plex labeled E. coli digests were mixed at the ratios of 1:1:1:1:1:1 and 1:20:2:5:10:15 (m/m/m/m/m/m, with all mixed ratios below based on the mass of each label unless otherwise specified). In the two-proteome model for m-pIDL, E. coli digests were mixed at 1:5:1:5:1:5, and HeLa digests were mixed at 1:1:1:1:1:1. Then, the two components were combined with the same peptide amount in channels of 30H, 32H and 34H. The sample for PTM analysis consisted of β-casein digests, N-phosphopeptides of TCpHAAIIAR, AANDDLLNSFWLLDSEpKGEAR, FASTpHTDSSAQTVSLEDYVSR, MGSTGIGNGIAIPpHGKLEEDTLR, ApKLESLVEDLVNR, AGYAEDEVVAVSpKLGDIEYR, QWVNLPLVLpHGASGLSTK, TSpHTSIMAR, AEAGIVISASpHNPFYDNGIK, and SpKIFDFVpKPGVITGDDVQpK, O-phosphopeptides of VALQDAGLSVpSDIDDVILVGGQTR, FMVMQVTGpYKR, GRRNpSIGK, VGDIVIFNDGpYGVK, and LNFpSHGDYAEHGQR, and O-glycopeptide of TAPTgSTIAPGR at the mixture ratio of 1:5:1:5. All the mixed samples were desalted by a home-made C18-trap column and lyophilized in a SpeedVac.

LC-MS/MS Analysis. The six-plex labeled A549 cell digests were first equally combined and separated by high-pH RPLC using a Shimadzu DGU-20A5 liquid chromatography system (Tokyo, Japan). Buffer A was 98% H2O and 2% ACN, NH3·H2O (pH 10), and buffer B was 2% H2O and 98% ACN, NH3·H2O (pH 10). The separation column was a home-made C18 column (5 μm, 100 Å, 150 mm×2.1 mm i.d., Durashell, China). A 60 min separation gradient was performed using 2−25% B (0.1−50 min), 25−45% B (50−55 min), and 45−80% B (55−60 min), with fractions collected every 1 min. The resulting fractions were consolidated into 30 samples with equal intervals and lyophilized. All samples above were resuspended with 0.1% formic acid (FA). All nanoRPLC-MS experiments were performed on an Accela 600 HPLC system (Thermo Fisher Scientific, San Jose, CA, USA) coupled to a Q-Exactive mass spectrometer (Thermo Fisher Scientific, San Jose, CA, USA). Peptides were separated on C18 columns (5 μm, 150 mm×75 μm i.d., Agela, China, or 3 μm 150 mm×150 μm i.d., Dr. Maisch GmbH, Germany) with low-pH mobile phases (Buffer A: 98% H2O+2% ACN+0.1% FA; Buffer B: 2% H2O+98% ACN+0.1% FA). The separation gradient for method evaluation was achieved by applying 5%-22% B for 110 min and 22%-35% B for 25 min. The separation gradient for application to the EMT process was performed by applying 5%-22% B for 45 min and 22%-35% B for 15 min.

The eluted peptides were analyzed with a Q-Exactive mass spectrometer in data-dependent mode. Survey scans were performed by the Orbitrap at 70,000@200 resolving power within the scan range of 300-1800 m/z. The AGC target for the survey scans was 1e6 charges, and the maximum injection time was 50 ms. The top 10 most intense ions were selected with the isolation window of 10.0 m/z for MS/MS events. Precursors were fragmented by higher-energy collision dissociation (HCD) with the normalized collision energy of 28%. 1E5 fragment ions were accumulated within a maximum injection time of 100 ms and detected in the Orbitrap analyzer at the resolution of 35,000@200 with the fixed first mass of 50 m/z. For the label free strategy, loop count was 20, with the isolation window of 2.0 m/z. MS/MS scans were detected at the resolution of 17,500@200 with the fixed first mass of 110 m/z.

Database Searching and Quantification. MS raw files were analyzed in the MaxQuant environment28 (v.1.6.1.0) employing the Andromeda29 search engine. The tandem mass spectra were searched against the Uniprot FASTA databases of E. coli (downloaded in May, 2017, 4,444 entries), human (downloaded in May, 2017, 46,913 entries) or β-casein sequence together with standard peptides. A common contaminants database was added to each database. Enzyme specificity was set to trypsin with a maximum of two missed cleavages allowed for the database search. Fixed modification was set to cysteine carbamidomethylation, and methionine oxidation was set as variable modification. For the six-plex labeling experiments, dimethyl labeling (30H, +30.043 85 Da) of the N-terminal and lysine with neutral losses of -2.012 55 Da and -4.025 11 Da, dimethyl labeling (32H, +32.056 41 Da) of the N-terminal and lysine with neutral losses of -2.012 55 Da and 2.012 55 Da or dimethyl labeling (34H, +34.068 96 Da) of the N-terminal and lysine with neutral losses of 2.012 55 Da and 4.025 11 Da were separately added as variable modifications and searched three times. Then, the three results were combined for quantification. For the four-plex labeling experiment of PTM peptides, two additional variable modifications of phosphorylation (HKRSTY) and HexNAc (S) were set. Dimethyl labeling (32H, +32.056 41 Da) of the N-terminal and lysine with a neutral loss of -2.012 55 Da or dimethyl labeling (34H, +34.068 96 Da) of the N-terminal and lysine with a neutral loss of 2.012 55 Da were separately added and combined for quantification. The allowed mass deviations for peptide identification were up to 10 ppm for precursor ions and 20 ppm for fragment ions. The resulting spectra were filtered with an FDR of 0.01 for PSMs, peptides and proteins on the basis of a decoy database approach.

For the m-pIDL quantification method, the intensities of fragment ion clusters for relative quantification comparison in each PSM were extracted by in-house built java scripts. The fragment ions with quantification intensities in all channels below 200 m/z were used. Meanwhile, y1 ion clusters from the peptides with the C-terminal lysine were excluded. The ratio of each PSM was calculated as the ratio of the total intensities for all kinds of quantitative fragment ions. Peptides with amino group numbers larger than their charge states were excluded from quantification, so that all the quantified channels of peptides were within the isolation window. Each peptide ratio was calculated as the median of all spectra from the peptide, and then each protein ratio was calculated as the median of the quantified peptides matching the same protein. For the label-free quantification method, six raw files from the two samples were analyzed in the MaxQuant environment (v.1.6.1.0). The search included cysteine carbamidomethylation as the fixed modification and methionine oxidation and acetylation of protein N-terminal as variable modifications. The searching tolerance for precursor ions was 10 ppm, and that for fragment ions was 20 ppm. Matching between runs with retention time window of 0.7 min and the LFQ algorithm were performed.

Bioinformatic Analysis for EMT Study. A normalization process was performed so that the log2 median ratio in each channel equaled zero. Differentially expressed proteins were considered as proteins of which the log2 ratios were filtered after ANOVA analysis with p-value cutoff of 0.05 and at least 1.5-fold ratio changes. Hierarchical clustering of differentially expressed proteins was performed in the Perseus software environment30. Interaction network analysis was performed using the Cytoscape plug-in with the network-construction algorithm of ReactomeFIViz31.

RESULTS AND DISCUSSION

Principle of the m-pIDL Method

We implemented the multiplex fragment ion-based quantification method by 6-plex dimethyl labeling and a wide isolation window (Figure 1). First, we employed one-step dimethyl labeling to achieve 3-plex labels with mass increases of ~30 Da, ~32 Da and ~34 Da for each amino group. Then, the idea of adding subtle mass differences of 5.84 mDa was introduced for each labeling channel and doubled the capacity of the method to 6-plex. Therefore, six-plex dimethyl labeling was achieved via a one-step simple reaction using the labeling reagents 13CH2O, NaBH3CN (30L); CH2O, NaBD3CN (30H); 13CH2O, NaBD3CN (32L); CD2O, NaBH3CN (32H); 13CD2O, NaBH3CN (34L) and CD2O, NaBD3CN (34H), with the mass increases of 30.03801 Da, 30.04385 Da, 32.05056 Da, 32.05641 Da, 34.06312 Da and 34.06896 Da for each amino group (Table S-1). After mixing, the largest mass difference, generated between the channels of 30L and 34H, is 4.03095 Da for one amino group. Since the peptide with a C-terminal of lysine has two amino groups after trypsin digestion, the largest mass difference between precursor ions is 8.06190 Da. The about 4 m/z mass difference of the precursor ion series with a charge state of 2 is beyond the mostly used isolation window of 2 m/z, so that the precursor ion series cannot be co-fragmented into the same tandem mass spectrum for quantification. Herein, we adopted a strategy that only enlarged the precursor ion isolation window of the routine data-dependent acquisition mode to achieve the co-fragmentation of all labeled precursor ion series and the compatibility with most instruments. Considering the balance between the co-fragmentation of all labeled precursor ion series and the co-fragmentation of interference peptides, we compared the effect of different isolation windows (5-10 m/z) on the quantification accuracy using 1:1:1:1:1:1 mixed E. coli digests. The results showed that protein ratios were more widely distributed as the isolation window narrowed, especially for channels with mass differences of several daltons (Figure S-1). It was mainly caused by the incomplete co-fragmentation of the 6-plex labeled precursor ion series. Therefore, 10 m/z is proper for the precursor ion isolation window of the 6-plex m-pIDL method. Meanwhile, for 4-plex labels with 30L/30H/32L/32H or 32L/32H/34L/34H, the isolation window could be reduced to 5 m/z. In addition, the multiplex capacity could be further increased to 8-plex by using labeling reagents CH2O, NaBH3CN (28 Da) and 13CD2O, NaBD3CN (36 Da), together with a wider isolation window, insofar as the co-isolated peptides do not affect the m-pIDL performance.

Then, relative quantification was achieved by comparing the fragment ion intensities of each tandem mass spectrum acquired by a high-resolution Orbitrap instrument. To resolve all the labeling channels with a minimum mass difference of 5.84 mDa, the selection of resolution was a tradeoff between the cycle time of collecting the tandem mass spectrum and the number of fragment ion series used for quantification. In m-pIDL, we selected the MS/MS resolution of 35,000@200, so that fragment ion series below 200 m/z were extracted as quantification ions. For peptides with a C-terminal of arginine, the quantification ions were a and b ion series, and the quantification ions were a, b and y ion series for peptides with a C-terminal of lysine. However, y1 ions were excluded from quantification to avoid the co-isolation interference from other peptides. Furthermore, for the peptides with C-terminal of arginine, the overlap of third isotopic peak among 30L-32L-34L and 30H-32H-34H channels in MS does not cause the overlap of fragment ion series in the low mass range and will not influence the quantification accuracy.
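The per-label mass increases and the precursor spacing quoted in this section can be reproduced from isotope masses. The Python sketch below is illustrative only and is not part of the authors' workflow. It assumes that each dimethylated amino group gains two carbons and four hydrogens from formaldehyde plus two hydrogens from the cyanoborohydride while losing two amine hydrogens; the isotope masses are standard rounded values, and the names `LABELS`, `label_mass` and `precursor_span_mz` are hypothetical.

```python
# Monoisotopic masses in Da (standard values, rounded).
C12, C13 = 12.0, 13.00335484
H1, H2 = 1.00782503, 2.01410178

# Hypothetical table: label -> (formaldehyde carbon, formaldehyde hydrogen,
# reductant hydrogen), following the reagent pairs listed in the text.
LABELS = {
    "30L": (C13, H1, H1),  # 13CH2O, NaBH3CN
    "30H": (C12, H1, H2),  # CH2O,   NaBD3CN
    "32L": (C13, H1, H2),  # 13CH2O, NaBD3CN
    "32H": (C12, H2, H1),  # CD2O,   NaBH3CN
    "34L": (C13, H2, H1),  # 13CD2O, NaBH3CN
    "34H": (C12, H2, H2),  # CD2O,   NaBD3CN
}


def label_mass(carbon, h_formaldehyde, h_reductant):
    """Mass added per amino group by dimethylation (assumed stoichiometry)."""
    return 2 * carbon + 4 * h_formaldehyde + 2 * h_reductant - 2 * H1


MASSES = {name: label_mass(*isotopes) for name, isotopes in LABELS.items()}


def precursor_span_mz(n_amino, charge):
    """m/z spread between the 30L and 34H precursors of one peptide."""
    return n_amino * (MASSES["34H"] - MASSES["30L"]) / charge


for name, mass in MASSES.items():
    print(f"{name}: {mass:.5f} Da")
print(f"L/H spacing: {(MASSES['30H'] - MASSES['30L']) * 1000:.2f} mDa")
print(f"span for 2 amino groups at z=2: {precursor_span_mz(2, 2):.3f} m/z")
```

Under these assumptions, a lysine-terminated peptide with two amino groups at charge 2 spans about 4.03 m/z, which exceeds a 2 m/z isolation window but fits within a 10 m/z one.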

Figure 1. Scheme of the m-pIDL method. Six samples were separately labeled by the dimethyl labeling reagents 13CH2O, NaBH3CN (30L); CH2O, NaBD3CN (30H); 13CH2O, NaBD3CN (32L); CD2O, NaBH3CN (32H); 13CD2O, NaBH3CN (34L); and CD2O, NaBD3CN (34H) and mixed. Then, fragment ion-based quantification was achieved by enlarging the precursor ion isolation window to 10 m/z.
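The restriction of quantification ions to below 200 m/z can be illustrated with a rough resolving-power check. The sketch below is a simplification under stated assumptions and is not the authors' calculation: it assumes that Orbitrap resolving power scales with the inverse square root of m/z, treats two peaks as resolved when the FWHM resolving power is at least m/Δm, and uses hypothetical function names.

```python
import math

DELTA_M = 0.00584  # Da, L/H channel spacing for one labeled amino group
R_REF, MZ_REF = 35_000, 200.0  # MS/MS resolution setting quoted in the text


def available_resolution(mz, r_ref=R_REF, mz_ref=MZ_REF):
    """Assumed Orbitrap scaling: resolving power proportional to 1/sqrt(m/z)."""
    return r_ref * math.sqrt(mz_ref / mz)


def required_resolution(mz, delta_m=DELTA_M, charge=1):
    """FWHM resolving power needed to separate peaks delta_m (Da) apart."""
    return mz / (delta_m / charge)


def resolvable(mz, **kwargs):
    """True if the assumed available resolution meets the simple criterion."""
    return available_resolution(mz) >= required_resolution(mz, **kwargs)


for mz in (100.0, 150.0, 200.0, 300.0):
    print(mz, round(available_resolution(mz)), round(required_resolution(mz)),
          resolvable(mz))
```

With these assumptions the 5.84 mDa doublet of a singly charged fragment is just resolved at 200 m/z and no longer resolved at 300 m/z. Fragments carrying two labeled amino groups have twice the spacing, which can be explored by passing `delta_m=2 * DELTA_M`.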

Quantification Accuracy and Dynamic Range

To evaluate the quantification accuracy of the m-pIDL method, E. coli digests were first mixed at the ratio of 1:1:1:1:1:1 after 6-plex labeling and quantified in three replicates. The dimethyl labeling efficiency was 98.23%, which provided the necessary prerequisite for accurate quantification. In all the quantified channels, the median values between channels did not show much difference (Figure S-2). On average, 97.85% of all proteins were quantified within 2-fold change, and 92.22% proteins were quantified within 1.5-fold change, showing the achievement of the method in proteome quantification. Additionally, the fragment ion series used for quantification were mainly a1+ ions (99.63% PSMs), together with a2+ (9.54% PSMs), b1+ (5.55% PSMs) and b2+ ions (4.17% PSMs). The quantified fragment ion series showed peptide specificity to distinguish co-isolated peptides from the target one to a great extent (Figure S-3).

We further investigated the dynamic range and the reproducibility of the m-pIDL method by mixing E. coli digests at the ratio of 1:2:5:10:15:20. First, to characterize the ratios influenced by different fragment ion intensities, log2 quantification ratios of the PSMs were plotted as a function of the log10 quantification ion intensities for the mixed ratio of 1:20 (Figure 2a). The medians and ratio distributions were nearly the same across the 4 orders of magnitude of the quantification ion intensities, indicating that the quantification accuracy was not affected by the intensity of the fragment ion response. In addition, the median protein ratios of the 6-plex mixed sample were 1:1.59:5.59:10.85:13.45:20.11, and the average SD (log2) value was 0.34 obtained from all the measured ratios in the channels (Figure 2b). As the ratio increased, no ratio compression was observed, demonstrating that the m-pIDL method could ensure accurate quantification across a 20-fold dynamic range, contributed by the high S/N ratios of the tandem mass spectra. Furthermore, as shown in Figure 2c, excellent reproducibility across the 20-fold dynamic range was obtained from the three replicates, with the average Pearson correlation coefficient of 0.94. The above results reveal the excellent accuracy and reproducibility of m-pIDL for the multiplex quantification of proteomic samples across a usable dynamic range.

Figure 2. Quantification accuracy and dynamic range of m-pIDL. (a) Scatter plots showing the distribution of the log2 ratios against the log10 fragment ion intensities for the 1:20 mixed ratios. (b) Box plots showing measured protein ratios with the mixed ratio of 1:2:5:10:15:20. (c) Scatter plots of log2 protein ratios from three technical replicates of the sample mixed at 1:2:5:10:15:20.
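The protein ratios plotted in Figure 2 come from the median-based roll-up described in the Experimental Section: summed fragment ion intensities give each PSM ratio, peptide ratios are medians over PSMs, and protein ratios are medians over peptides. A minimal Python sketch of that roll-up follows. The authors used in-house Java scripts that are not shown, so the data layout and the names `psm_ratio` and `protein_ratios` are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def psm_ratio(numerator_ions, denominator_ions):
    """Ratio of summed quantification-ion intensities between two channels.

    Each argument maps a fragment ion name (e.g. "a1") to its intensity
    in one labeling channel of a single PSM.
    """
    return sum(numerator_ions.values()) / sum(denominator_ions.values())


def protein_ratios(psms):
    """Roll PSM ratios up to proteins via peptide-level medians.

    psms: iterable of (protein, peptide, ratio) tuples.
    """
    by_peptide = defaultdict(list)
    for protein, peptide, ratio in psms:
        by_peptide[(protein, peptide)].append(ratio)

    by_protein = defaultdict(list)
    for (protein, _peptide), ratios in by_peptide.items():
        by_protein[protein].append(median(ratios))  # peptide ratio

    return {protein: median(r) for protein, r in by_protein.items()}


# Toy input with made-up numbers, only to show the call pattern.
example = [
    ("P1", "PEPA", 1.0), ("P1", "PEPA", 3.0), ("P1", "PEPA", 2.0),
    ("P1", "PEPB", 4.0),
    ("P2", "PEPC", 5.0),
]
print(protein_ratios(example))
```

Medians at both levels keep a single outlying spectrum or peptide from dominating the protein ratio, which is one plausible reason for choosing them over means.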


the model and distinguish E. coli peptides from HeLa peptides to a great extent with wider ratio distribution. However, a total shift of all log2 ratios was displayed, that is, the ratios of E. coli and HeLa peptides with medians of 2.64 and 0.55 were both centered at approximately 40% smaller values than expected. The SD values of the log2 ratios were 0.66 for E. coli peptides and 0.68 for HeLa peptides. A similar result was also reported in the MaxLFQ method.8 This is probably because the normalization procedure, an essential part of data processing due to the different ionization efficiencies between runs, was disturbed by the large proportion (31%) of proteins with changing ratios (E. coli peptides) in this model. However, by our m-pIDL method, the normalization procedure could be avoided because different samples were mixed into the same run and co-fragmented into the same tandem mass spectrum to ensure the same ionization efficiency. Therefore, the quantification accuracy of proteins obtained by the m-pIDL method is not influenced by the relative scale of changed proteins among samples, which is beneficial to analyzing proteome samples with significantly changing portions of proteins.

Quantitative Analysis in a Two-Proteome Model of Interference In proteome research, usually only a portion of proteins show expression changes. But the peptides from the differentially expressed proteins are frequently co-isolated and cofragmented with the peptides from other proteins. Therefore, it is important to accurately quantify differentially expressed proteins from interfering proteins. However, the wide isolation window used in this method might increase the number of cofragmented peptides and influence the quantification accuracy. To maximize the interference effect and evaluate the quantification accuracy of the m-pIDL method, we presented a benchmark model by mixing two distinguishable proteomes. After labeling the protein digests of E. coli and HeLa cell lines with 30L, 30H, 32L, 32H, 34L and 34H, we mixed the labeled E. coli peptides at the ratio of 1:5:1:5:1:5 and the HeLa peptides equally (1:1:1:1:1:1) (Figure 3a). Then, the HeLa cell digests, as interference, were mixed with the E. coli peptides to be the two-proteome model, and the total amounts of E. coli digests in channels 30H, 32H, 34H were equal to the amounts of HeLa digests in the same channels.

Moreover, we evaluated the interference effect among the multiplex channels. For E. coli proteins, the average median compression of the expected 5:1 ratios was 1.13-fold, with an RSD of 12.46% among the channels, and 0.98-fold for the 1:1 ratios, with an RSD of 10.35% (Figure 3d- left and Table S-2). In addition, the medians of the interfering HeLa peptides (averagely 1.03) were also not changed, despite different ratios of E. coli peptides being added (Figure 3d- right and Table S2). The results demonstrate that the high accuracy of the mpIDL method is maintained across the multiplex channels. In contrast, taking the widely used isobaric labeling method of TMT as an example, reporter ion-based quantification bears critically ratio distortion, with 2.2-fold median compression for an expected 4:1 ratio in a similar model, as reported.17 This further demonstrate the significant advantage of using multiplex fragment ions as quantification ions to improve accuracy.

We first evaluated the quantification accuracy of both targeted E. coli and interfering HeLa peptides along with different quantification ion intensities between channels 30L and 30H. 2061 peptides were quantified, and the median values were 4.47 for E. coli peptides and 1.01 for HeLa peptides without systematic distortion across the entire intensity range, especially for the ratios of low-intensity peptides (Figure 3b). Additionally, the ratios from E. coli and HeLa peptides were almost separated into two groups. The SD values of all log2 ratios, used to show the ratio distributions, were 0.65 for E. coli peptides and 0.59 for HeLa peptides. Therefore, by the m-pIDL method, the interference effect could be avoided to ensure the accurate quantification of complex samples, independent of the intensity of the quantitative peptides. For comparison, the mixture model was also used to evaluate the quantification performance of the LFQ strategy8. As shown in Figure 3c, the LFQ strategy could quantify 24423 peptides in

Figure 3. The quantitative results of the two-proteome model. (a) The composition of the two-proteome model. E. coli peptides were labeled with 30L, 30H, 32L, 32H, 34L and 34H and mixed at a ratio of 1:5:1:5:1:5. HeLa peptides were labeled in the same way and mixed equally (1:1:1:1:1:1). The two samples were combined so that channels 30H, 32H and 34H contained equal amounts of E. coli and HeLa peptides. (b, c) Scatter plots showing the log2 ratios of E. coli (orange) and HeLa (blue) peptides between channels 30L and 30H against the log10 quantification ion intensities in channel 30H obtained by the (b) m-pIDL and (c) LFQ methods. (d) Box plots showing the measured E. coli protein ratios (left) and HeLa protein ratios (right) in the different channels.

Application to Studying PTMs

We further investigated the applicability of the m-pIDL method to peptides with different post-translational modifications (PTMs). -casein digests, O-phosphopeptides, N-phosphopeptides and a glycopeptide were mixed, labeled separately with 32L, 32H, 34L and 34H, and then combined at a ratio of 1:5:1:5. The median ratios of the peptides without modification, the O-phosphopeptides, the N-phosphopeptides and the O-glycopeptide were 0.92:5.08:1:3.42, 1.02:5.77:1:3.48, 0.93:5.74:1:3.56 and 0.87:6.59:1:3.63, respectively. The average of the CVs among the differently modified peptides was 6.66%, indicating that our method shows no significant ratio variation among diversely modified peptides. Additionally, for a 6-plex experiment to quantify 100 μg PTM peptides per channel, the total cost of labeling reagents is under $5, which is much lower than the price of commercial reagents. Therefore, the wide applicability of m-pIDL will promote further studies of low-abundance PTMs, such as N-phosphopeptides32 and proteolytic PTMs of proteins33, and the analysis of multidimensional protein properties to generate a comprehensive proteomic atlas.

Application of m-pIDL to Monitoring the Time-Resolved Responses of EMT

EMT, as a central driver of epithelial-derived tumor malignancies, has been shown to trigger the migration of primary carcinoma cells to seed a new tumor.34 It is a shift from the cobblestone-like epithelial state to the scattered and elongated mesenchymal state, together with the down-regulation of epithelial markers and the up-regulation of mesenchymal markers. Current research has illustrated the dynamic and complex nature of EMT, but the precise molecular mechanism remains largely unknown.35 Taking advantage of the high accuracy and multiplex capacity of our m-pIDL method, we carried out an in-depth study of the dynamic changes of proteins during TGF-β-induced EMT in the lung adenocarcinoma A549 cell line. Cells without TGF-β stimulation (control) and cells collected 12 h, 18 h, 24 h, 36 h and 48 h after TGF-β stimulation were labeled with 30L, 30H, 32L, 32H, 34L and 34H, respectively (Figure 4a, Figure S-4). Across the three biological replicates, 7206 proteins were quantified. The up-regulation of fibronectin and vimentin (mesenchymal markers) and the down-regulation of occludin and desmoplakin (epithelial markers) were observed in our experiment (Table S-3), indicating the accurate quantification of
proteins by our method. We further detected 368 differentially expressed proteins with changes greater than 1.5-fold after ANOVA (Table S-4). In the hierarchical clustering heatmap (Figure 4b), most of the proteins showed sustained changes over time, but their expression changed at different rates. Our multiplex quantification method made it possible to distinguish proteins with rapid expression changes from those with slow expression changes. The differentially expressed proteins were involved in pathways such as the regulation of actin cytoskeleton, tight junction, metabolic pathways, focal adhesion, endocytosis and pathways in cancer. Notably, EMT has been reported to involve a dramatic phenotypic change, accompanied by the reorganization of the actin cytoskeleton and cell junctions, that allows cancer cells to gain migratory properties. This reorganization is a prerequisite for cell motility during the process. With m-pIDL, we found dynamic changes in many differentially expressed proteins in the actin cytoskeleton regulation and tight junction pathways, such as ITGA2, ITGB1, VCL, MYH9 and ACTN1 (Figure 4c). These data correlated well with the known EMT process36 associated with promoting cancer cell migration. In addition, we found that band 4.1-like protein 2 (EPB41L2) and alpha-II spectrin (SPTAN1) were gradually up-regulated (Figure 4d), but such changes have not been reported in the EMT process of non-small cell lung cancer. Band 4.1 proteins have been shown to be important intracellular components that mediate signaling events and cytoskeletal reorganization via integrin-induced cell spreading.37 In our data, EPB41L2 expression was 2.26-fold higher at 48 h after TGF-β stimulation than in the control. In combination with the up-regulation of ITGB1, EPB41L2 is a potential regulatory protein in promoting EMT via integrin-mediated cytoskeletal reorganization. Additionally, SPTAN1 is a scaffolding protein involved in cytoskeletal and filament organization, contributing to cell adhesion and migration.38 With a 1.87-fold change at 48 h, it may also affect the phenotypic changes of the EMT process through the reorganization of the actin cytoskeleton and cell junctions. These data demonstrate that the m-pIDL method can accurately monitor protein abundance changes in complex biological processes and help in discovering potential key proteins.
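As a rough illustration of how time-resolved ratios can separate fast from slow responders, the sketch below applies a symmetric 1.5-fold threshold to per-protein ratios relative to the control channel and reports the first time point at which each protein crosses it. This is not the authors' pipeline: the ANOVA step is omitted, the helper names are hypothetical, and all profile values are invented placeholders except the 48 h ratios of 2.26 (EPB41L2) and 1.87 (SPTAN1) quoted above.

```python
# Time points after TGF-β stimulation, in hours (control channel is the reference).
TIME_POINTS_H = [12, 18, 24, 36, 48]


def symmetric_fold(ratio):
    """Fold change irrespective of direction: a ratio of 0.5 counts as 2-fold."""
    return max(ratio, 1.0 / ratio)


def max_fold_change(ratios):
    """Largest symmetric fold change over the time course."""
    return max(symmetric_fold(r) for r in ratios)


def first_time_exceeding(ratios, threshold=1.5):
    """First time point (h) at which the fold change reaches the threshold, else None."""
    for time_h, ratio in zip(TIME_POINTS_H, ratios):
        if symmetric_fold(ratio) >= threshold:
            return time_h
    return None


# Ratios versus control. Only the 48 h values of EPB41L2 and SPTAN1 come from
# the text; every other number is a hypothetical placeholder.
profiles = {
    "EPB41L2": [1.10, 1.25, 1.48, 1.90, 2.26],
    "SPTAN1": [1.05, 1.15, 1.30, 1.60, 1.87],
    "fast_responder_example": [1.70, 1.90, 2.00, 2.10, 2.10],
    "unchanged_example": [1.02, 0.97, 1.01, 1.04, 0.99],
}

for name, ratios in profiles.items():
    onset = first_time_exceeding(ratios)
    label = "below threshold" if onset is None else f"crosses 1.5-fold at {onset} h"
    print(f"{name}: max fold change {max_fold_change(ratios):.2f}, {label}")
```

Using the onset time rather than only the end-point ratio is one simple way to express the difference between rapid and gradual expression changes that a multiplexed time course makes visible.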

Figure 4. Analysis of the EMT process by the m-pIDL method. (a) Schematic illustration of the experimental setup, with the A549 cells stimulated by TGF-β and collected at six time points. (b) Unsupervised hierarchical clustering of 368 differentially expressed proteins. (c) Network of differentially expressed proteins in the pathways of the regulation of actin cytoskeleton and tight junction. (d) Time-dependent profiles of EPB41L2 and SPTAN1.

CONCLUSIONS

In summary, we have developed a fragment ion-based method for 6-plex proteome quantification, namely, m-pIDL, which combines one-step dimethyl labeling with data-dependent acquisition using a wide isolation window of 10 m/z. The method demonstrated high accuracy over a 20-fold dynamic range and only 1.13-fold ratio compression of E. coli proteins in the high-interference two-proteome model. It also showed broad applicability and low cost for various PTM peptides, which should benefit the study of PTMs. Such a method is of great significance for providing a system-wide view of protein properties, not only to enhance our understanding of various cellular and physiological processes but also to support future advances in drug development and medical diagnosis.

ASSOCIATED CONTENT

Supporting Information
The Supporting Information is available free of charge on the ACS Publications website.

Labeling reagents of the m-pIDL method, quantification results of the 6-plex labels isolated by 5-10 m/z isolation windows, quantification results of the 1:1:1:1:1:1 mixed model and the two-proteome model, the tandem mass spectrum of the co-fragmented peptides ALHFGAGNIGR and GLSLGMR, the micrographs, and quantification ratios of mesenchymal and epithelial markers during the TGF-β-induced EMT (PDF)
The quantification ratios of the differentially expressed proteins during TGF-β-induced EMT in A549 cell lines (XLSX)
The in-house built Java scripts for the m-pIDL quantification (ZIP)

AUTHOR INFORMATION

Corresponding Author
* E-mail: [email protected]. Phone and fax: +86-41184379720.

Author Contributions
‖Jianhui Liu and Yuan Zhou made equal contributions to this work.

Notes
The authors declare no competing financial interest. All the raw data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the iProx partner repository with the dataset identifier PXD012457. The Java scripts are available at https://github.com/DICP1810/mpIDL.

ACKNOWLEDGEMENTS
The authors are grateful for the financial support from The National Key Research and Development Program of China (2017YFA0505003 and 2016YFA0501401), National Natural Science Foundation (21725506, 91543201, 91753110 and 21405154), CAS Key Project in Frontier Science (QYZDYSSW-SLH017), and Innovation program from DICP, CAS (DICP TMSR201601, DMTO201701).

REFERENCES
(1) Larance, M.; Lamond, A. I. Multidimensional proteomics for cell biology. Nat. Rev. Mol. Cell Biol. 2015, 16, 269-280.
(2) Aebersold, R.; Mann, M. Mass-spectrometric exploration of proteome structure and function. Nature 2016, 537, 347-355.
(3) Wang, Y.; Song, L.; Liu, M.; Ge, R.; Zhou, Q.; Liu, W.; Li, R.; Qie, J.; Zhen, B.; Wang, Y.; He, F.; Qin, J.; Ding, C. A proteomics landscape of circadian clock in mouse liver. Nat. Commun. 2018, 9, 1553.
(4) Mayne, J.; Ning, Z.; Zhang, X.; Starr, A. E.; Chen, R.; Deeke, S.; Chiang, C. K.; Xu, B.; Wen, M.; Cheng, K.; Seebun, D.; Star, A.; Moore, J. I.; Figeys, D. Bottom-Up Proteomics (2013-2015): Keeping up in the Era of Systems Biology. Anal. Chem. 2016, 88, 95-121.
(5) Humphrey, S. J.; Azimifar, S. B.; Mann, M. High-throughput phosphoproteomics reveals in vivo insulin signaling dynamics. Nat. Biotechnol. 2015, 33, 990-995.
(6) Geyer, P. E.; Kulak, N. A.; Pichler, G.; Holdt, L. M.; Teupser, D.; Mann, M. Plasma Proteome Profiling to Assess Human Health and Disease. Cell Syst. 2016, 2, 185-195.
(7) Robles, M. S.; Humphrey, S. J.; Mann, M. Phosphorylation Is a Central Mechanism for Circadian Control of Metabolism and Physiology. Cell Metab. 2017, 25, 118-127.
(8) Cox, J.; Hein, M. Y.; Luber, C. A.; Paron, I.; Nagaraj, N.; Mann, M. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol. Cell. Proteomics 2014, 13, 2513-2526.
(9) Tyanova, S.; Temu, T.; Cox, J. The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat. Protoc. 2016, 11, 2301-2319.
(10) Gillet, L. C.; Navarro, P.; Tate, S.; Rost, H.; Selevsek, N.; Reiter, L.; Bonner, R.; Aebersold, R. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol. Cell. Proteomics 2012, 11, O111.016717.
(11) Ross, P. L.; Huang, Y. N.; Marchese, J. N.; Williamson, B.; Parker, K.; Hattan, S.; Khainovski, N.; Pillai, S.; Dey, S.; Daniels, S.; Purkayastha, S.; Juhasz, P.; Martin, S.; Bartlet-Jones, M.; He, F.; Jacobson, A.; Pappin, D. J. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 2004, 3, 1154-1169.
(12) Thompson, A.; Schafer, J.; Kuhn, K.; Kienle, S.; Schwarz, J.; Schmidt, G.; Neumann, T.; Johnstone, R.; Mohammed, A. K.; Hamon, C. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 2003, 75, 1895-1904.
(13) Dayon, L.; Hainard, A.; Licker, V.; Turck, N.; Kuhn, K.; Hochstrasser, D. F.; Burkhard, P. R.; Sanchez, J. C. Relative quantification of proteins in human cerebrospinal fluids by MS/MS using 6-plex isobaric tags. Anal. Chem. 2008, 80, 2921-2931.
(14) Gao, Y.; Liu, X.; Tang, B.; Li, C.; Kou, Z.; Li, L.; Liu, W.; Wu, Y.; Kou, X.; Li, J.; Zhao, Y.; Yin, J.; Wang, H.; Chen, S.; Liao, L.; Gao, S. Protein Expression Landscape of Mouse Embryos during Pre-implantation Development. Cell Rep. 2017, 21, 3957-3969.
(15) Marx, H.; Minogue, C. E.; Jayaraman, D.; Richards, A. L.; Kwiecien, N. W.; Siahpirani, A. F.; Rajasekar, S.; Maeda, J.; Garcia, K.; Del Valle-Echevarria, A. R.; Volkening, J. D.; Westphall, M. S.; Roy, S.; Sussman, M. R.; Ane, J. M.; Coon, J. J. A proteomic atlas of the legume Medicago truncatula and its nitrogen-fixing endosymbiont Sinorhizobium meliloti. Nat. Biotechnol. 2016, 34, 1198-1205.
(16) Zhang, Y.; Fonslow, B. R.; Shan, B.; Baek, M. C.; Yates, J. R., 3rd. Protein analysis by shotgun/bottom-up proteomics. Chem. Rev. 2013, 113, 2343-2394.
(17) Ting, L.; Rad, R.; Gygi, S. P.; Haas, W. MS3 eliminates ratio distortion in isobaric multiplexed quantitative proteomics. Nat. Methods 2011, 8, 937-940.
(18) McAlister, G. C.; Nusinow, D. P.; Jedrychowski, M. P.; Wuhr, M.; Huttlin, E. L.; Erickson, B. K.; Rad, R.; Haas, W.; Gygi, S. P. MultiNotch MS3 enables accurate, sensitive, and multiplexed detection of differential expression across cancer cell line proteomes. Anal. Chem. 2014, 86, 7150-7158.
(19) Xiang, F.; Ye, H.; Chen, R.; Fu, Q.; Li, L. N,N-dimethyl leucines as novel isobaric tandem mass tags for quantitative proteomics and peptidomics. Anal. Chem. 2010, 82, 2817-2825.
(20) Koehler, C. J.; Strozynski, M.; Kozielski, F.; Treumann, A.; Thiede, B. Isobaric peptide termini labeling for MS/MS-based quantitative proteomics. J. Proteome Res. 2009, 8, 4333-4341.
(21) Nie, A. Y.; Zhang, L.; Yan, G. Q.; Yao, J.; Zhang, Y.; Lu, H. J.; Yang, P. Y.; He, F. C. In vivo termini amino acid labeling for quantitative proteomics. Anal. Chem. 2011, 83, 6026-6033.
(22) Waldbauer, J.; Zhang, L.; Rizzo, A.; Muratore, D. diDO-IPTL: A Peptide-Labeling Strategy for Precision Quantitative Proteomics. Anal. Chem. 2017, 89, 11498-11504.
(23) Zhang, G.; Neubert, T. A. Automated comparative proteomics based on multiplex tandem mass spectrometry and stable isotope labeling. Mol. Cell. Proteomics 2006, 5, 401-411.
(24) Koehler, C. J.; Arntzen, M. O.; de Souza, G. A.; Thiede, B. An approach for triplex-isobaric peptide termini labeling (triplex-IPTL). Anal. Chem. 2013, 85, 2478-2485.
(25) Zhou, Y.; Shan, Y.; Wu, Q.; Zhang, S.; Zhang, L.; Zhang, Y. Mass defect-based pseudo-isobaric dimethyl labeling for proteome quantification. Anal. Chem. 2013, 85, 10658-10663.
(26) Yang, K. G.; Liu, J. H.; Sun, J. D.; Zhou, Y.; Zhao, Q.; Li, S. W.; Liu, L. K.; Zhang, L. H.; Zhao, J. Y.; Zhang, Y. K. Proteomic study provides new clues for complications of hemodialysis caused by dialysis membrane. Sci. Bull. 2017, 62, 1251-1255.
(27) Zhao, Q.; Fang, F.; Liang, Y.; Yuan, H.; Yang, K.; Wu, Q.; Liang, Z.; Zhang, L.; Zhang, Y. 1-Dodecyl-3-methylimidazolium chloride-assisted sample preparation method for efficient integral membrane proteome analysis. Anal. Chem. 2014, 86, 7544-7550.
(28) Cox, J.; Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 2008, 26, 1367-1372.
(29) Cox, J.; Neuhauser, N.; Michalski, A.; Scheltema, R. A.; Olsen, J. V.; Mann, M. Andromeda: A Peptide Search Engine Integrated into the MaxQuant Environment. J. Proteome Res. 2011, 10, 1794-1805.
(30) Tyanova, S.; Temu, T.; Sinitcyn, P.; Carlson, A.; Hein, M. Y.; Geiger, T.; Mann, M.; Cox, J. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat. Methods 2016, 13, 731-740.
(31) Wu, G.; Dawson, E.; Duong, A.; Haw, R.; Stein, L. ReactomeFIViz: a Cytoscape app for pathway and network-based data analysis. F1000Research 2014, 3, 146.
(32) Junker, S.; Maaß, S.; Otto, A.; Michalik, S.; Morgenroth, F.; Gerth, U.; Hecker, M.; Becher, D. Spectral Library Based Analysis of Arginine Phosphorylations in Staphylococcus aureus. Mol. Cell. Proteomics 2018, 17, 335-348.
(33) Kleifeld, O.; Doucet, A.; Prudova, A.; auf dem Keller, U.; Gioia, M.; Kizhakkedathu, J. N.; Overall, C. M. Identifying and quantifying proteolytic events and the natural N terminome by terminal amine isotopic labeling of substrates. Nat. Protoc. 2011, 6, 1578-1611.
(34) Nieto, M. A.; Huang, R. Y.; Jackson, R. A.; Thiery, J. P. EMT: 2016. Cell 2016, 166, 21-45.
(35) Bottoni, P.; Isgro, M. A.; Scatena, R. The epithelial-mesenchymal transition in cancer: a potential critical topic for translational proteomic research. Expert Rev. Proteomics 2016, 13, 115-133.
(36) Yilmaz, M.; Christofori, G. EMT, the cytoskeleton, and cancer cell invasion. Cancer Metast. Rev. 2009, 28, 15-33.
(37) Jung, Y.; McCarty, J. H. Band 4.1 proteins regulate integrin-dependent cell spreading. Biochem. Bioph. Res. Co. 2012, 426, 578-584.
(38) Hinrichsen, I.; Ernst, B. P.; Nuber, F.; Passmann, S.; Schafer, D.; Steinke, V.; Friedrichs, N.; Plotz, G.; Zeuzem, S.; Brieger, A. Reduced migration of MLH1 deficient colon cancer cells depends on SPTAN1. Mol. Cancer 2014, 13, 11.

Figure 1. Scheme of the m-pIDL method. Six samples were separately labeled by the dimethyl labeling reagents 13CH2O, NaBH3CN (30L); CH2O, NaBD3CN (30H); 13CH2O, NaBD3CN (32L); CD2O, NaBH3CN (32H); 13CD2O, NaBH3CN (34L); and CD2O, NaBD3CN (34H) and mixed. Then, fragment ion-based quantification was achieved by enlarging the precursor ion isolation window to 10 m/z.
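The channel names in Figure 1 can be related to the reagents by working out the monoisotopic mass added per dimethylated amine. The sketch below assumes the standard reductive dimethylation stoichiometry, in which each methyl group takes one carbon and two hydrogens from formaldehyde and one hydrogen from cyanoborohydride while the amine loses one hydrogen. The isotope masses are standard values; the dictionary and function names are hypothetical and are not taken from the authors' scripts.

```python
# Monoisotopic masses in Da (standard values).
MASS = {"12C": 12.0, "13C": 13.0033548, "1H": 1.00782503, "2H": 2.01410178}

# Channel -> (carbon isotope in formaldehyde,
#             hydrogen isotope in formaldehyde,
#             hydrogen isotope in cyanoborohydride), following the Figure 1 caption.
REAGENTS = {
    "30L": ("13C", "1H", "1H"),  # 13CH2O, NaBH3CN
    "30H": ("12C", "1H", "2H"),  # CH2O, NaBD3CN
    "32L": ("13C", "1H", "2H"),  # 13CH2O, NaBD3CN
    "32H": ("12C", "2H", "1H"),  # CD2O, NaBH3CN
    "34L": ("13C", "2H", "1H"),  # 13CD2O, NaBH3CN
    "34H": ("12C", "2H", "2H"),  # CD2O, NaBD3CN
}


def dimethyl_shift(carbon, h_formaldehyde, h_reductant):
    """Mass added to one amine by two methyl groups (assumed stoichiometry)."""
    per_methyl = (
        MASS[carbon]
        + 2 * MASS[h_formaldehyde]
        + MASS[h_reductant]
        - MASS["1H"]  # hydrogen lost from the amine
    )
    return 2 * per_methyl


SHIFTS = {channel: dimethyl_shift(*isotopes) for channel, isotopes in REAGENTS.items()}

for channel, shift in SHIFTS.items():
    print(f"{channel}: +{shift:.4f} Da per labeled amine")

for nominal in ("30", "32", "34"):
    delta_mda = 1000 * (SHIFTS[nominal + "H"] - SHIFTS[nominal + "L"])
    print(f"{nominal}H - {nominal}L: {delta_mda:.2f} mDa")
```

Under these assumptions, the L and H members of each pair share a nominal mass shift of 30, 32 or 34 Da per labeled amine and differ by only about 5.8 mDa, while the three pairs are spaced roughly 2 Da apart.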

Figure 2. Quantification accuracy and dynamic range of m-pIDL. (a) Scatter plots showing the distribution of the log2 ratios against the log10 fragment ion intensities for the 1:20 mixed ratios. (b) Box plots showing measured protein ratios with the mixed ratio of 1:2:5:10:15:20. (c) Scatter plots of log2 protein ratios from three technical replicates of the sample mixed at 1:2:5:10:15:20.
