Article pubs.acs.org/jpr
MassyTools: A High-Throughput Targeted Data Processing Tool for Relative Quantitation and Quality Control Developed for Glycomic and Glycoproteomic MALDI-MS
Bas C. Jansen,† Karli R. Reiding,† Albert Bondt,†,‡ Agnes L. Hipgrave Ederveen,† Magnus Palmblad,† David Falck,† and Manfred Wuhrer*,†,§
† Center for Proteomics and Metabolomics, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
‡ Department of Rheumatology, Erasmus University Medical Center, 3000 CA Rotterdam, The Netherlands
§ Division of BioAnalytical Chemistry, VU University Amsterdam, 1081 HV Amsterdam, The Netherlands
Supporting Information
ABSTRACT: The study of N-linked glycosylation has long been complicated by a lack of bioinformatics tools. In particular, there is still a lack of fast and robust data processing tools for targeted (relative) quantitation. We have developed modular, high-throughput data processing software, MassyTools, that is capable of calibrating spectra, extracting data, and performing quality control calculations based on a user-defined list of glycan or glycopeptide compositions. Typical examples of output include relative areas after background subtraction, isotopic pattern-based quality scores, spectral quality scores, and signal-to-noise ratios. We demonstrated MassyTools’ performance on MALDI-TOF-MS glycan and glycopeptide data from different samples. MassyTools yielded better calibration than the commercial software flexAnalysis, generally showing 2-fold better ppm errors after internal calibration. Relative quantitation using MassyTools and flexAnalysis gave similar results, yielding a relative standard deviation (RSD) of the main glycan of ∼6%. However, MassyTools yielded 2- to 5-fold lower RSD values for low-abundant analytes than flexAnalysis. Additionally, feature curation based on the computed quality criteria improved the data quality. In conclusion, we show that MassyTools is a robust automated data processing tool for high-throughput, high-performance glycosylation analysis. The package is released under the Apache 2.0 license and is freely available on GitHub (https://github.com/Tarskin/MassyTools).
KEYWORDS: Bioinformatics, glycomics, glycoproteomics, biopharmaceuticals, profiling, mass spectrometry, matrix-assisted laser desorption/ionization, relative quantitation, quality control
■
INTRODUCTION
Protein glycosylation is involved in a multitude of biological processes and diseases.1,2 Glycosylation analysis has successfully been used in biomedical research, uncovering associations with age, sex, and disease. Furthermore, it has also proven to be critical for the development of biopharmaceuticals.3−8 Recently, the limitations in sample throughput have started to disappear, with rapid sample preparation, method integration, and robotization decreasing the hands-on time required to prepare and measure glycan samples.9 The resulting high-throughput studies have led to the advent of genome-wide association studies on glycomics traits, revealing aspects of the genetic regulation of glycosylation.10,11 One of the most rapid approaches to analyze glycans is matrix-assisted laser desorption/ionization (MALDI)-mass spectrometry (MS). MALDI-MS shows intrinsic robustness, accuracy, speed, and high versatility, allowing the detection of free glycans as well as glycoconjugates such as glycopeptides.12,13 MALDI-MS also exhibits high-throughput potential and the option to combine it with robotized sample preparation techniques.14 Currently, high-throughput MALDI-MS is regularly used in clinical microbiology and clinical metabolomic, proteomic, and glycomic analyses,4,15−19 as well as for quality control of biopharmaceuticals (e.g., the monitoring of fragment crystallizable (Fc)-linked N-glycosylation of monoclonal antibodies).4,15−20 Moreover, several recent reports have demonstrated that this technology allows quantitative measurements to be obtained when appropriate sample preparation methods and internal standards are applied.21,22
This recent progress in sample preparation and MS data acquisition of glycomics and glycoproteomics samples has worsened the bottleneck represented by spectral preprocessing and relative quantitation in targeted approaches. The analysis of large MS data sets can be extremely time-consuming and is often the bottleneck in large studies. Calibration, smoothing, and baseline subtraction of a mass spectrum are common procedures before further processing is performed. A wide variety of software is available that can perform these steps, including mMass and MZmine2.23−25 Additionally, there are many bioinformatics tools available for further analysis, such as primary sequence analysis and quantitation of proteins.25 However, the analysis of posttranslational modifications such as glycosylation is still limited.26
Identification of glycosylation features from single-protein samples can be adequately performed by several software packages including ProteinScape, Glycominer, SweetHeart, and GlycoWorkbench.27−30 With the advent of identification software, it has become easier to use targeted approaches (i.e., analyzing the same set of analytes from an array of biological samples), which are especially useful in the routine analysis of biopharmaceuticals or clinical cohorts, where compositions are often comparable between samples. However, in targeted analysis, the time-consuming (relative) quantitation of a large number of samples with possibly hundreds of analytes becomes the bottleneck. Existing data analysis software packages allow only semiautomatic (relative) quantitation of glycosylation in MS spectra and are not suitable for robust and repeated data processing and extraction. Examples of current data analysis software that are capable of doing repeated data extraction are flexAnalysis (Bruker Daltonics, Bremen, Germany) and MALDIquant.31 These software tools are only capable of performing untargeted data extraction. A disadvantage of untargeted extraction derives from automated peak picking. Each sample can yield a different set of compositions, e.g., due to a composition not being detected in a given sample. Therefore, manual curation of the results will be required to correct for missing values. Additionally, spectral quality control features are limited, although they are a necessity to allow for spectrum and analyte curation.
Therefore, there is an urgent need for a fast and robust data processing tool for targeted relative quantitation and quality control of MS spectra containing glycosylation data. Here, we have generated a modular data processing toolkit for targeted MALDI-MS analysis of glycans and glycopeptides. This toolkit, named MassyTools, offers a wide range of selectable and customizable functionalities for (1) relative quantitation by integration of analyte peak areas and automated total area normalization, (2) data preprocessing, including calibration and dynamic background detection, and (3) various assessments of the quality of individual extracted features as well as the total spectrum. The MassyTools output is designed in such a way that the results are easily transferable to most statistical analysis programs with minimal data transformation. Many (especially freely available) bioinformatics tools suffer from usability issues, such as the lack of a graphical user interface. MassyTools is fully automated and designed to run on any computer with an Anaconda Python 2.7 installation.32 Its graphical interface increases accessibility to nonexperts. Its usability is further improved by providing a full set of preoptimized analysis parameters that allow, but do not require, input to suit all levels of user experience. The performance of MassyTools was evaluated using MALDI-TOF-MS spectra of the total plasma N-glycome (TPNG) and monoclonal antibody glycans and glycopeptides.
© 2015 American Chemical Society
Received: March 25, 2015 Published: November 2, 2015
DOI: 10.1021/acs.jproteome.5b00658 J. Proteome Res. 2015, 14, 5088−5098
Journal of Proteome Research
■
MATERIALS AND METHODS
Software and Hardware
Geany 0.21 was used as an integrated development environment (IDE) for developing MassyTools. MassyTools was written in the Python 2.7 programming language.33 MassyTools uses the nonstandard Python libraries NumPy,34 SciPy,35 matplotlib,36 and TkInter. These packages are all included in the Anaconda distribution of Python. Benchmarks were performed on an Intel i3-3220 CPU desktop computer with 16 GB of RAM running 64-bit Windows 7 SP1.
Materials
An IgG1-derived monoclonal antibody (mAb1, tryptic glycopeptide sequence EEQYNSTYR) and pooled plasma from 20 healthy human donors were used in this study. The plasma was obtained from Affinity Biologicals (Ancaster, Canada). Sodium dodecyl sulfate (SDS), trifluoroacetic acid (TFA), ammonium bicarbonate (NH4HCO3), and ethanol were purchased from Merck (Darmstadt, Germany). Nonidet P-40 substitute (NP-40) and sodium hydroxide (NaOH) were obtained from Sigma-Aldrich (Steinheim, Germany). Sequencing grade trypsin was purchased from Promega (Leiden, The Netherlands). Cotton was acquired from Pipoos (Utrecht, The Netherlands). HPLC SupraGradient acetonitrile (ACN) was supplied by Biosolve (Valkenswaard, The Netherlands). 4-Chloro-α-cyanocinnamic acid (CHCA) and 2,5-dihydroxybenzoic acid (DHB) were obtained from Bruker Daltonics (Bremen, Germany). Lastly, peptide-N-glycosidase F (PNGase-F) was acquired from Roche Diagnostics (Mannheim, Germany).
Sample Preparation
For the mAb1 glycopeptide data set, 20 μg of antibody in 20 μL of 25 mM NH4HCO3 buffer (pH 8) was mixed with 2 μL of 0.2% SDS in 50 mM NH4HCO3 buffer (pH 8), which resulted in a final SDS concentration of 0.02%. After a 15 min incubation at 60 °C, 2 μg of sequencing grade trypsin in 18 μL of 25 mM NH4HCO3 (pH 8) was added for an overnight digestion at 37 °C. The glycopeptides were purified by hydrophilic interaction liquid chromatography (HILIC)-solid phase extraction (SPE) using cotton thread tips according to a method published earlier for N-glycans, with minor modifications.17 Briefly, the cotton material was preconditioned by pipetting 20 μL of water three times (deionized by a Purelab Ultra (Elga LabWater, Ede, The Netherlands)), followed by equilibration with 20 μL of 85% ACN three times. Trapping of the glycopeptides by HILIC-SPE was achieved from a solution that contained 10 μL of the (40 μL) tryptic digestion mix (equivalent to 5 μg of mAb1) and 58 μL of ACN. The tips were sequentially washed three times with 20 μL 85% ACN + 1% TFA and three times with 85% ACN. Purified glycopeptides were eluted in 10 μL of water. Of this 10 μL, 5 μL was spotted, dried, and overlaid with 2 μL of matrix on an MTP 384 polished steel MALDI target plate (Bruker Daltonics). The matrix used for glycopeptide experiments was a 5 mg/mL solution of CHCA in 70% aqueous ACN, as described elsewhere.12 The mAb1 glycans were acquired by mixing 2 μL of sample (50 μg mAb), 3 μL of phosphate buffered saline (PBS), and 10 μL of 2% SDS and incubating on a shaker for 10 min. Denaturation occurred at 60 °C for 10 min. Samples were left to cool to room temperature (RT) before being mixed with a release mixture of 5 μL of 4% NP-40, 5 μL of 5× PBS, and 0.5 μL of PNGase F before overnight incubation at 37 °C.
Of the 25 μL containing the released glycans, 2 μL (equivalent to 4 μg of mAb1) was subjected to ethyl esterification, as described in the literature (specifically, the fibrinogen protocol for released glycans).37
course, be changed. In addition, the user can specify the desired charge carrier (H+, Na+) in the “CHARGE_CARRIER” section (the default being H+) or override this for individual compositions by adding [old charge carrier identifier]-1[new charge carrier identifier]1. Salt formation can be handled similarly, e.g., [proton]-2[sodium]2 for a sodium adduct with an additional carboxylic acid sodium salt or [proton]1[sodium]1 if sodium is set as the desired charge carrier.
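The composition notation, the default H2O mass modifier, and the choice of charge carrier can be illustrated with a minimal Python 3 sketch. It is not taken from the MassyTools source: the dictionary names, the function names, and the `carrier` parameter are hypothetical, the residue masses are standard monoisotopic values rather than the entries of the program's "BLOCKS" section, and only H, N, and F are included.

```python
import re

# Standard monoisotopic residue masses in Da (assumed values, not copied
# from the MassyTools "BLOCKS" section).
BLOCKS = {
    "H": 162.052823,  # hexose residue, C6H10O5
    "N": 203.079373,  # N-acetylhexosamine residue, C8H13NO5
    "F": 146.057909,  # fucose (deoxyhexose) residue, C6H10O4
}
MASS_MODIFIER = 18.010565  # H2O, added to every composition by default
CHARGE_CARRIERS = {
    "proton": 1.007276,   # H+ (hydrogen atom minus one electron)
    "sodium": 22.989221,  # Na+ (sodium atom minus one electron)
}


def parse_composition(text):
    """Split a string such as 'H5N4F1' into {'H': 5, 'N': 4, 'F': 1}."""
    units = re.findall(r"([A-Za-z]+)(\d+)", text)
    if not units or "".join(name + count for name, count in units) != text:
        raise ValueError("expected [unit][count] notation, got %r" % text)
    return {name: int(count) for name, count in units}


def ion_mz(text, carrier="proton"):
    """Singly charged m/z: residues + mass modifier + charge carrier."""
    composition = parse_composition(text)
    residues = sum(BLOCKS[name] * count for name, count in composition.items())
    return residues + MASS_MODIFIER + CHARGE_CARRIERS[carrier]


# Sodiated H5N2, one of the mAb1 glycan calibrants used in this study.
print(round(ion_mz("H5N2", carrier="sodium"), 3))
```

Under these assumptions the sketch reproduces the [M + Na]+ values listed for the H5N2 and H3N4F1 calibrants to within 0.001 Da.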
TPNG samples were treated in a similar manner as the mAb1 glycan samples. Briefly, 10 μL of plasma standard was denatured by addition of 20 μL of 2% SDS and incubation for 10 min at 60 °C. Afterward, deglycosylation was performed by addition of 20 μL of release mixture (1:1 4% NP-40 substitute/5× PBS containing 1 U PNGase F) and overnight incubation at 37 °C. Of the resulting glycan mixture, 1 μL was derivatized by ethyl esterification as reported previously.37 HILIC-SPE was performed for all released, ethyl-esterified glycan samples (mAb1, TPNG) in the same manner as for the glycopeptides, except that only 20 μL of ACN was added to the derivatization mixture prior to sample loading. For mAb1 and TPNG samples, 1 μL of the 10 μL SPE eluate was spotted and dried on an MTP AnchorChip 800/384 MALDI target plate (Bruker Daltonics). Afterward, the dried sample was overlaid with 1 μL of a 5 mg/mL solution of DHB in 50% aqueous ACN containing 1 mM NaOH, dried again, and recrystallized with 0.2 μL ethanol. The addition of sodium to the matrix caused the suppression of other salt adducts.
Isotopic Distribution Calculation
The isotopic distribution is calculated for all supplied compositions, making use of the isotopic abundance ratios in the building blocks section of the program.39 Briefly, the isotopic pattern is calculated per chemical element using a binomial distribution. Subsequently, all elemental isotopic distributions are combined into a single molecular isotopic distribution. Lastly, isotopic species with the same nominal mass but different mass defects, resolved only at very high mass resolutions, are summed within a user-specified distance (ε), with the default being m/z 0.1. The isotopic pattern calculation has been made available as a separate Python package in the GitHub repository of MassyTools (https://github.com/Tarskin/MassyTools). For extraction, all isotopic peaks contributing at least 1% to the overall isotopic distribution of an analyte are extracted.
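The three steps (per-element binomial pattern, combination into a molecular pattern, and merging within ε) can be sketched as follows. This is an illustrative Python 3 approximation, not the MassyTools implementation: it treats every element as having a single heavy isotope (17O is ignored), the abundance table and all function names are assumptions, and the elemental formula is passed in directly instead of being derived from building blocks.

```python
from math import comb

# (mass shift of the heavy isotope in Da, its natural abundance); assumed
# textbook values, one heavy isotope per element for simplicity.
HEAVY = {
    "C": (1.003355, 0.0107),    # 13C
    "H": (1.006277, 0.000115),  # 2H
    "N": (0.997035, 0.00364),   # 15N
    "O": (2.004246, 0.00205),   # 18O (17O is ignored in this sketch)
}


def element_pattern(element, count, min_prob=1e-6):
    """Binomial distribution of the number of heavy isotopes of one element."""
    shift, p = HEAVY[element]
    peaks = []
    for k in range(count + 1):
        prob = comb(count, k) * p ** k * (1 - p) ** (count - k)
        if prob < min_prob and k > count * p:
            break  # past the mode, the remaining terms only get smaller
        peaks.append((k * shift, prob))
    return peaks


def merge(peaks, eps):
    """Sum peaks whose mass offsets lie within eps of the previous peak."""
    merged = []
    for offset, prob in sorted(peaks):
        if merged and offset - merged[-1][0] <= eps:
            old_offset, old_prob = merged[-1]
            total = old_prob + prob
            merged[-1] = ((old_offset * old_prob + offset * prob) / total, total)
        else:
            merged.append((offset, prob))
    return merged


def isotopic_distribution(formula, eps=0.1, min_fraction=0.01):
    """Return (mass offset, fraction) pairs relative to the monoisotopic peak."""
    dist = [(0.0, 1.0)]
    for element, count in formula.items():
        pattern = element_pattern(element, count)
        combined = [(a + b, pa * pb) for a, pa in dist for b, pb in pattern]
        dist = merge(combined, eps)
    return [(offset, prob) for offset, prob in dist if prob >= min_fraction]


# H5N4 plus H2O corresponds to the elemental formula C62H104N4O46.
for offset, fraction in isotopic_distribution({"C": 62, "H": 104, "N": 4, "O": 46}):
    print("%.4f  %.3f" % (offset, fraction))
```

The `min_fraction` cutoff mirrors the 1% extraction threshold described above, and `eps` corresponds to the merging distance ε.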
Data Acquisition
Derivatized glycan and native glycopeptide samples were analyzed using the reflectron positive ion mode of an Ultraflextreme MALDI-TOF-MS (Bruker Daltonics), equipped with a Smartbeam-II laser and operated by flexControl 3.4, build 135. Prior to measurement, the instrument was calibrated with a peptide calibration standard (Bruker Daltonics). Analyte acceleration was performed at 25 kV with 140 ns delayed extraction. For glycan samples, 20 000 spectra were accumulated within a window from m/z 1000 to 5000. Glycopeptide samples were analyzed within a window from m/z 1000 to 4000, accumulating 5000 spectra. Spectra were acquired using a random walk pattern at a laser repetition rate of 2000 Hz. In order to perform MassyTools processing, the obtained spectra were exported into a simple text-based format. The format consisted of an m/z and intensity value per line separated by a tab, from now on referred to as an (x,y) file. An alternative file format that MassyTools accepts is mzML, which is the current standard in the field of MS.38
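Reading such an (x,y) file is straightforward; a minimal Python 3 sketch is given below. The function name `read_xy` and the in-memory example are hypothetical and do not reflect how MassyTools itself loads spectra.

```python
import io


def read_xy(handle):
    """Parse tab-separated m/z and intensity pairs, one pair per line."""
    spectrum = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        mz, intensity = line.split("\t")
        spectrum.append((float(mz), float(intensity)))
    return spectrum


# A three-point excerpt standing in for an exported spectrum file.
example = io.StringIO("1257.40\t120.0\n1257.42\t850.0\n1257.44\t310.0\n")
spectrum = read_xy(example)
print(spectrum)
```

For a file on disk, the same function can be called on the handle returned by `open(path)`.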
Calibrant Lists
Calibration of mAb1 and TPNG samples was compared between MassyTools and flexAnalysis. The calibrants for the mAb1 glycans were H5N2 (1257.423 Da), H3N4F1 (1485.534 Da), H4N4F1 (1647.587 Da), H5N4F1 (1809.639 Da), H5N4F1L1 (2082.724 Da), and H5N4F1L2 (2355.809 Da), where all masses were calculated as [M + Na]+. mAb1 glycopeptide calibrants were mAb1pep-H5N2 (2406.938 Da), mAb1pep-H3N4F1 (2635.049 Da), mAb1pep-H4N4F1 (2797.102 Da), mAb1pep-H5N4F1 (2959.154 Da), and mAb1pep-H5N4F1S2 (3250.250 Da), where all masses were calculated as [M + H]+ of the tryptic glycopeptide. Calibrants used for the TPNG spectra were H5N4E1 (1982.708 Da), H5N4F1E1 (2128.766 Da), H5N4E2 (2301.835 Da), H6N5E2L1 (2940.052 Da), H6N5F1E2L1 (3086.110 Da), H7N6E1L3 (3532.227 Da), and H7N6E2L2 (3578.269 Da), where all masses were calculated as [M + Na]+.
Analyte Preprocessing in MassyTools
MassyTools software performs calibration and/or extraction of data on the basis of a user-defined list of analytes (containing the composition and, optionally, the isolation m/z width). Compositions must contain building blocks that are defined in the program’s “BLOCKS” section and must be in the [unit1 identifier][unit1 count][unit2 identifier][unit2 count] notation (e.g., H5N4 in the case of 5 hexoses and 4 N-acetylhexosamines). A unit identifier can contain only letters, and units used in this article are H for hexose, N for N-acetylhexosamine, F for fucose, E for α2,6-linked N-acetylneuraminic acid, L for α2,3-linked N-acetylneuraminic acid, S for N-acetylneuraminic acid with undefined linkage, and mAb for an IgG1 monoclonal antibody. Defined in the script for each building block are the monoisotopic reduced mass as well as the comprising elements (carbon, hydrogen, nitrogen, oxygen, and sulfur) used for isotopic pattern calculation. By following the requirements, additional building blocks can be added by the user, if desired. A table listing the currently available building blocks is shown in Supporting Information, Table S1. Next to building block counts defined per composition, building blocks can be added to all compositions at once. This is defined in the “MASS_MODIFIERS” section of the script. By default, H2O is added to all compositions, but this can, of
Calibration
MassyTools performs calibration of mass spectra on the basis of a user-defined list containing the names and m/z values of known compounds (calibration list). In order to consistently determine the accurate mass of the calibrants, a cubic spline is fitted through all of the data points within an m/z window around the exact mass (the window is user-definable and defaults to m/z ±0.4). The accurate mass is identified as the m/z value at the cubic spline local maximum. The signal-to-noise ratio (S/N) of the user-defined calibrants is calculated as detailed below. Calibrants are excluded if they are below a definable minimum S/N (9 by default). For the calibration, a second-degree polynomial is fitted through the pairs of accurate and exact masses such that the average residual postcalibration parts-per-million (ppm) error is minimized. The second-degree polynomial acquired by this method is applied to calibrate the entire spectrum. The flexAnalysis calibration of the mAb1 and TPNG samples was performed after automatic peak picking. Briefly, a maximum of 500 peaks with an S/N above 9 were picked
area of an analyte (and its background-subtracted variant) by total area normalization (sum of all extracted analytes).
using the SNAP algorithm. The accurate mass of the calibrants was automatically detected using flexAnalysis, with a peak assignment tolerance of 1000 m/z. Afterward, a quadratic calibration is applied to the spectrum inside flexAnalysis.
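The polynomial step of the MassyTools calibration can be sketched in plain Python 3 as follows. This is a hedged illustration, not the MassyTools code: it assumes the accurate (observed) calibrant masses have already been determined, omits the spline-based peak detection and the S/N filter, uses an ordinary least-squares fit, and fits the mass correction (exact minus observed) on a centered and scaled axis for numerical stability. All function names are hypothetical, and the observed masses in the example are simulated.

```python
def fit_quadratic(xs, ys):
    """Least-squares second-degree polynomial through (x, y); returns a callable."""
    center = sum(xs) / len(xs)
    scale = max(abs(x - center) for x in xs) or 1.0
    ts = [(x - center) / scale for x in xs]  # centered, scaled to [-1, 1]
    sums = [sum(t ** k for t in ts) for k in range(5)]
    rhs = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]
    rows = [[sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the normal equations.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    coeffs = [rows[i][3] / rows[i][i] for i in range(3)]
    return lambda x: sum(c * ((x - center) / scale) ** k for k, c in enumerate(coeffs))


def ppm_error(observed, exact):
    """Signed mass error in parts per million."""
    return (observed - exact) / exact * 1e6


def calibrate(observed, exact):
    """Return a function that maps a measured m/z to a calibrated m/z."""
    correction = fit_quadratic(observed, [e - o for o, e in zip(observed, exact)])
    return lambda mz: mz + correction(mz)


# Exact [M + Na]+ masses of the mAb1 glycan calibrants; the observed values
# are simulated with a 20 ppm drift plus a 0.005 Da offset.
exact = [1257.423, 1485.534, 1647.587, 1809.639, 2082.724, 2355.809]
observed = [m * (1 + 20e-6) + 0.005 for m in exact]
calibration = calibrate(observed, exact)
before = [ppm_error(o, e) for o, e in zip(observed, exact)]
after = [ppm_error(calibration(o), e) for o, e in zip(observed, exact)]
print(max(abs(v) for v in before), max(abs(v) for v in after))
```

The returned `calibration` function would be applied to every m/z value of the spectrum, not only to the calibrants.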
Background Determination
MassyTools uses adaptive background determination to identify a background region (Figure 1). Within a user-
$$\mathrm{area} = \sum_{i=1}^{X} \left( m/z_{i+1} - m/z_{i} \right) \times \mathrm{intensity}_{i} \qquad (1)$$
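The integration of eq 1 and the subsequent total area normalization can be sketched in Python 3 as shown below. The function names are hypothetical, the default half-width of 0.49 Da is the extraction width given in the Relative Quantitation section, and the spectrum is assumed to be a list of (m/z, intensity) pairs sorted by m/z.

```python
def peak_area(spectrum, center, half_width=0.49):
    """Eq 1: sum of intensity_i * (mz_{i+1} - mz_i) for points in the window."""
    area = 0.0
    for (mz, intensity), (next_mz, _) in zip(spectrum, spectrum[1:]):
        if center - half_width <= mz <= center + half_width:
            area += (next_mz - mz) * intensity
    return area


def relative_areas(areas):
    """Total area normalization over all extracted analytes."""
    total = sum(areas.values())
    return {name: area / total for name, area in areas.items()}


# Toy triangular peak sampled every 0.1 m/z units.
toy = [(100.0, 0.0), (100.1, 10.0), (100.2, 20.0), (100.3, 10.0), (100.4, 0.0)]
print(peak_area(toy, center=100.2))
print(relative_areas({"analyte A": 3.0, "analyte B": 1.0}))
```

The last data point of a spectrum has no successor and therefore contributes no width; this is a simplification of the sketch.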
Isotopic Quality Control (QC)
The isotopic quality control (QC) value is calculated by comparing the measured isotopic pattern with the theoretical isotopic pattern and relating the result to the noise, as reported previously (Supporting Information, Figure S1).40 Briefly, the difference between the measured (Si(obs)) pattern and theoretical (Si(est)) pattern is calculated per isotope (i), squared, and divided by the noise (N) squared. The square root of the sum of all isotope QC values yields the analyte QC value (eq 2). This QC value can be used to identify overlapping analytes and peak saturation, since either of the described problems can cause the observed isotopic pattern to deviate from the calculated one.
$$\mathrm{QC\ value} = \sqrt{\sum_{i=1}^{X} \frac{\left( S_{i}^{*}(\mathrm{est}) - S_{i}^{*}(\mathrm{obs}) \right)^{2}}{N^{2}}} \qquad (2)$$
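A minimal Python 3 sketch of the QC value of eq 2 is given below. How the theoretical pattern is scaled before comparison is not specified in this section; the sketch assumes that the theoretical fractions are scaled to the summed observed signal, and the function name and arguments are hypothetical.

```python
from math import sqrt


def isotopic_qc(observed, fractions, noise):
    """Eq 2: root of the summed squared pattern deviations over noise squared.

    observed  -- background-subtracted signal per isotope
    fractions -- theoretical isotopic fractions for the same isotopes
    noise     -- noise estimate N for this analyte
    """
    total = sum(observed)
    expected = [fraction * total for fraction in fractions]  # assumed scaling
    return sqrt(sum((e - o) ** 2 for e, o in zip(expected, observed)) / noise ** 2)


# A distorted pattern scores higher than one that matches the theory.
print(isotopic_qc([100.0, 60.0, 20.0], [0.5, 0.3, 0.2], noise=2.0))
print(isotopic_qc([50.0, 30.0, 20.0], [0.5, 0.3, 0.2], noise=2.0))
```

A value near zero indicates agreement between the observed and theoretical patterns; larger values point to overlapping analytes or saturation, as described above.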
■
RESULTS AND DISCUSSION
MassyTools was developed to allow the (pre)processing of glycomics MALDI-MS spectra in a robust and high-throughput manner. This includes calibration, quality control (ppm errors, S/N, isotopic pattern matching), and targeted quantitation of all isotopes belonging to user-defined analytes. The MassyTools package is released under the Apache 2.0 license and is freely available on GitHub (https://github.com/Tarskin/MassyTools). MassyTools offers a set of output blocks, which can be split into (1) quantitation results and (2) analyte/spectrum quality criteria. The user can select which output blocks are to be used in the final summary via the graphical interface (Figure 2). In order to demonstrate the performance and features of MassyTools, we applied it to glycomics and glycoproteomics data. The analysis of glycosylation of a monoclonal antibody (mAb1) and healthy volunteer TPNG provides particular examples of work in the fields of biopharmaceuticals and clinical studies and reflects many of the challenges encountered in MALDI-MS analysis of glycosylation profiles. Additionally, the measurements helped us to determine the default settings for the definable analysis parameters. All definable and user-defined parameters of MassyTools are shown in Supporting Information, Table S2. Described in the next sections are the results and discussion pertaining to calibration, background determination and subtraction, and relative quantification and its repeatability, as well as a display of the quality control made possible by MassyTools.
Figure 1. Background determination. MALDI-TOF-MS spectrum showing a m/z region containing the mAb1 H5N4F1 analyte. The m/z range for background determination of 10 m/z per analyte is illustrated with a line. The region that has the lowest average intensity will be taken as the background region, as indicated by a red box. The inset shows a detailed view of the m/z region that contains the background region.
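The adaptive background determination illustrated in Figure 1 can be sketched in Python 3 as follows. The sketch is an assumption-laden illustration rather than the MassyTools code: the function names are hypothetical, the defaults (m/z range of 20, window half-width of 0.49 Da) are taken from values reported elsewhere in this article, the sample standard deviation is used for the noise, and the averaging of the five window areas into a background area is omitted.

```python
from statistics import mean, stdev

C13_SPACING = 1.00335  # Da, mass difference between 12C and 13C


def points_in_window(spectrum, center, half_width):
    """Intensities of all data points within center +/- half_width."""
    return [i for mz, i in spectrum if abs(mz - center) <= half_width]


def find_background(spectrum, mono_mz, mz_range=20.0, half_width=0.49, n=5):
    """Find the n consecutive windows with the lowest average intensity."""
    steps = int(mz_range / C13_SPACING)
    centers = [mono_mz + k * C13_SPACING for k in range(-steps, steps + 1)]
    windows = [points_in_window(spectrum, c, half_width) for c in centers]
    best = None
    for start in range(len(centers) - n + 1):
        points = [p for w in windows[start:start + n] for p in w]
        if len(points) < 2:
            continue  # not enough data to estimate intensity and noise
        average = mean(points)
        if best is None or average < best[0]:
            best = (average, start, points)
    if best is None:
        raise ValueError("no data points around m/z %.3f" % mono_mz)
    average, start, points = best
    return {
        "centers": centers[start:start + n],
        "intensity": average,     # background intensity
        "noise": stdev(points),   # SD of the points in the background region
    }


# Synthetic spectrum: baseline of 10, a quiet region of 2 between m/z 1795
# and 1801, and an analyte signal of 100 around m/z 1809.639.
synthetic = []
for i in range(1201):
    mz = 1780.0 + 0.05 * i
    if abs(mz - 1809.639) < 0.3:
        synthetic.append((mz, 100.0))
    elif 1795.0 <= mz <= 1801.0:
        synthetic.append((mz, 2.0))
    else:
        synthetic.append((mz, 10.0))
background = find_background(synthetic, 1809.639)
print(background["centers"], background["intensity"], background["noise"])
```

In this synthetic example the five selected windows fall entirely inside the quiet region, which is the behavior the red box in Figure 1 depicts.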
specified m/z range before and after the monoisotopic peak, a number of m/z windows are integrated with equal width as the analyte isotopic peaks. The considered windows are distant from the monoisotopic peak by a multiple of 1.00335 Da (the mass difference between 12C and 13C). Both width and spacing are important to accurately capture the relevant contribution of the chemical noise. Across the m/z range, the average intensity is calculated for all possibilities of five consecutive windows. The five consecutive windows that yield the lowest average intensity are determined to be the background region. The area of each window is calculated as described in the Relative Quantitation section. The five areas of the background region are then averaged to yield the background area. Additionally, background intensity and noise are computed as the average and standard deviation (SD) of all data points within the five windows of the background region, respectively.
Relative Quantitation
MassyTools calculates the signal area of an analyte by taking the sum of all data point intensities multiplied by the m/z distance between data points (i, eq 1). The extraction width is a definable parameter, with 0.49 Da being the default setting for MALDI-TOF-MS data, meaning that all data points within ±0.49 Da around the exact mass are included. Using the same calculation, the background area for an analyte can be reported (see the Background Determination section). The program reports the area of an analyte, the background area around an analyte, the background-subtracted area of an analyte, and the background-subtracted area of each isotope of an analyte (Figure 2). Furthermore, the program calculates the relative
Calibration
To judge the applicability of the MassyTools calibration algorithm, we compared it to flexAnalysis 3.3 calibration using sets of MALDI-TOF-MS glycopeptide and derivatized N-glycan spectra of mAb1 (Figure 3). Additionally, a set of derivatized MALDI-TOF-MS glycan spectra of TPNG from a pool of healthy volunteers was calibrated (Figure 4; Supporting Information, Figure S2). Both mAb1 and TPNG samples were calibrated using a predefined list of analytes and their m/z values.
Figure 2. MassyTools graphical user interface. (A) Main screen showing an open MALDI-TOF-MS spectrum of mAb1 glycans. (B) Screen showing the postcalibration mass errors of mAb1 glycans. (C) Batch process control window: this is where the user selects all the relevant files when processing a set of data files. (D) Screen showing all possible outputs that can be produced by MassyTools.
The maximum postcalibration mass error and SD of the major mAb1 calibrant glycan (H4N4F1) and major noncalibrant glycan (H3N4) were calculated for the calibrated spectra from MassyTools and flexAnalysis. Additionally, the values were also calculated for the major mAb1 calibrant glycopeptide (mAb1-H4N4F1) and major noncalibrant glycopeptide (mAb1-H3N4). Note that flexAnalysis calibration was performed using an in-house developed script (in the flexAnalysis scripting language). MassyTools appeared to perform better than flexAnalysis with regard to calibration, offering, in general, a 2-fold decrease in the mass errors. For example, the major calibrant glycan shows a maximum mass error of −2.1 ppm with MassyTools (mean absolute value 0.8 ppm, SD ± 0.6 ppm) and −3.5 ppm with flexAnalysis (mean absolute value 1.7 ppm, SD ± 1.1 ppm) (Supporting Information, Figure S3). The major noncalibrant glycan shows a maximum mass error of 2.5 ppm with MassyTools (mean absolute value 1.0 ppm, SD ± 0.7 ppm) and 4.0 ppm with flexAnalysis (mean absolute value 1.7 ppm, SD ± 1.2 ppm). Measurements of the major calibrant glycopeptide showed a maximum mass error of 1.7 ppm with MassyTools (mean absolute value 0.7 ppm, SD ± 0.5 ppm) and −3.5 ppm with flexAnalysis (mean absolute value 1.7 ppm, SD ± 1.1 ppm). The major noncalibrant glycopeptide showed a maximum mass error of −2.9 ppm with MassyTools (mean absolute value 1.0 ppm, SD ± 0.8 ppm) and 3.9 ppm with flexAnalysis (mean absolute value 1.4 ppm, SD ± 1.4 ppm). Furthermore, all glycan and glycopeptide calibrants showed maximum mass errors below 10 ppm in all measurements (Supporting Information, Figure S3; see Supporting Information, Table S3, for the complete set of calibration results).
In addition, we compared MassyTools calibration with flexAnalysis calibration using the TPNG spectra (Supporting Information, Table S4). For calibrated spectra from MassyTools and flexAnalysis, the maximum postcalibration mass error and SD are given below for the major TPNG calibrant glycan (H5N4E2) and major noncalibrant glycan (H5N4E1L1) and calculated for all glycans above 1% relative abundance. The results showed that MassyTools performs better than flexAnalysis, showing, in general, an almost 2-fold improvement with regard to mass accuracy. Measurements showed that the major calibrant glycan has a maximum mass error of 8.6 ppm with MassyTools (mean absolute value 2.8 ppm, SD ± 2.2 ppm) and −5.4 ppm with flexAnalysis (mean absolute value 3.1 ppm, SD ± 1.1 ppm). The comparable calibration of the main calibrant peak was unexpected on the basis of the difference
Figure 3. Observed glycans in mAb1. (A) MALDI-TOF-MS spectrum of ethyl-esterified released glycans of mAb1, measured in RP mode. (B) MALDI-TOF-MS spectrum of mAb1 tryptic glycopeptides, measured in RP mode. All analytes used for calibration are displayed with red labels. Proposed glycan structures are based on the literature and observed mass.42
observed in the mAb1 data. Therefore, we also compared the second most abundant calibrant (H5N4E1), which showed a maximum mass error of 4.7 ppm with MassyTools (mean absolute value 1.9 ppm, SD ± 1.4 ppm) and −7.1 ppm with flexAnalysis (mean absolute value 4.5 ppm, SD ± 1.3 ppm). Furthermore, the major noncalibrant glycan showed a maximum mass error of −5.6 ppm with MassyTools (mean absolute value 1.9 ppm, SD ± 1.4 ppm) and −7.2 ppm with flexAnalysis (mean absolute value 3.6 ppm, SD ± 1.5 ppm). Lastly, 99% of the calculated mass errors for the 11 most abundant glycans were below 10 ppm using MassyTools. In contrast, when using flexAnalysis, 93% of the calculated mass errors for the 11 most abundant glycans were below 10 ppm (Supporting Information, Table S4). We also compared the computational time required for any of the data sets. MassyTools performed the calibration of either the mAb1 or TPNG samples in 1.7 s (SD ± 0.1 s). The simple flexAnalysis calibration of mAb1 glycans took 1.9 s. Summarizing, MassyTools and flexAnalysis processing time is roughly equal. The increased accuracy of calibration can most likely be attributed to the specific accurate mass determination performed in MassyTools. Namely, MassyTools performs peak detection using a spline fit instead of conventional
methods such as APEX (m/z of the highest intensity data point). The improved accuracy derives from using a continuous function, as opposed to taking a set of discrete data points for peak detection, while not assuming a specific shape, which is the case when fitting, e.g., a Gaussian peak. For each peak, the data point with the highest intensity value may not always properly reflect the top of the peak (flexAnalysis method), whereas the continuous function appears to provide a better estimate of the peak maximum (MassyTools method), as visualized in Supporting Information, Figure S4. An alternative continuous function (Gaussian) was also evaluated, showing comparable results to the spline fit for high-intensity peaks, although it proved to be less robust for lower intensities (data not shown). In conclusion, we showed that using the spline fit for MassyTools results in an improved accuracy after calibration compared to the APEX method (Supporting Information, Table S5) and comparable accuracy to a Gaussian fit (data not shown).
Background Determination
MassyTools calculates the background levels by using a novel background determination algorithm, which identifies the region with the lowest average intensity around each given analyte (Figure 1). We assessed the performance of the
Figure 4. Reference TPNG spectrum of healthy volunteers. MALDI-TOF-MS spectrum of ethyl-esterified released glycans from TPNG, measured in RP mode. Analytes that were used for calibration are displayed with red labels. The displayed spectrum is annotated with the most abundant glycan structures. Fully annotated spectra can be found in Supporting Information, Figure S2. For high-mannose and hybrid structures, the proposed structures are based on the synthesis pathway.43 Glycans that are contributed mainly by immunoglobulin G (e.g., H4N3F1) have been well-characterized.44 The localization of galactose to the specific antennae cannot be elucidated by MALDI-MS. Glycans that contain an N-acetylhexosamine, additional to the core or LacNAc units (e.g., H5N5F1S1 and H5N5S1), can be bisected or contain a truncated antenna. Two of the major plasma glycoproteins, IgA and IgM, are known to contain diantennary bisected glycans.45,46 Antenna fucosylation has been observed on triantennary glycans, for example, in α1-acid glycoprotein.47 However, there are other highly abundant glycoproteins with fucosylated triantennary structures for which the fucose linkage is unknown. Lastly, sialic acid linkages in the spectrum are based on a derivatization technique that creates a unique mass for α2,3-linked and α2,6-linked sialic acids.37 A full list of all compositions that were extracted from this sample can be found in Supporting Information, Table S5.
preprocessing described previously (see Materials and Methods). The choice of the set of glycan analytes that were used for extraction was based entirely on their composition. The reason for that is that the only information that can be taken from MS1 level data is the accurate mass. Consequently, MassyTools cannot differentiate between isomers. Therefore, the use of a derivatization technique, exoglycosidase treatment, or MS/MS will be required to elucidate linkage differences and quantify the contribution of isomers. For example, the use of either α1,3- or α1,4-fucosidase may allow the user to distinguish sLeX and sLeA, respectively, from α1,6-linked core fucosylated species.41 Analyte compositions that were used for quantitation of mAb1 were previously reported and annotated accordingly (Figure 3).42 Analyte compositions that were used for quantitation of TPNG were also based on the literature and annotated accordingly (Figure 4).37 The quantitation of mAb1 provided highly similar relative areas for both glycans and glycopeptides, with RSD values for the main peak (H4N4F1) being 2.2% (MassyTools) and 1.9% (flexAnalysis) in the glycan spectra (Figure 5). A negative correlation was observed between the relative area and RSD of the analytes for analytes with a relative abundance above 0.1%. mAb1 glycans between 0.1 and 1% show an RSD range of 5.7− 31.8% using MassyTools and an RSD range of 6.2−37.1% using flexAnalysis (Supporting Information, Figure S6), indicating that MassyTools also yields equally good results as those with flexAnalysis for the quantitation of low-abundance glycan species. However, there was a significant difference (Wilcoxon sign rank test, p = 0.018) observed for analytes below 0.1% relative abundance, after manually removing outliers (RSD
background determination method on MALDI-TOF-MS mAb1 and TPNG glycan data using m/z ranges of ±30, 25, 20, 15, 10, and 6. For the mAb1 glycan data, using a window of m/z ±10, the data showed a relative standard deviation (RSD) of the main peak (H4N4F1) of 2% and an average RSD of 13% for the top 10 features (Supporting Information, Figure S4). For MALDI-TOF-MS TPNG glycan data, a window of m/z ±20 showed optimal results when comparing m/z ranges of ±50, 45, 40, 35, 30, 25, 20, 15, 10, 6, and 1 (data not shown). The difference in the optimum is likely due to the difference in sample complexity (TPNG contains far more signals than mAb1 spectra), and an m/z range of 20 for background determination is currently the default value for this definable setting. Care has to be taken when selecting the window for background determination. A range that is too small will detect a background region containing other analytes and/or contaminants, whereas a range that is too large will increase the probability of underestimating the background area. This is reflected in the RSD values increasing to either side of the optimum (Supporting Information, Figure S5).
Relative Quantitation
We compared automated MassyTools quantitation with manual quantitation using flexAnalysis. The areas reported by flexAnalysis were acquired using the Centroid algorithm and manual selection of the desired peaks. Baseline correction in flexAnalysis was performed on a whole spectrum level to compensate for the lack of an analyte-specific background-subtraction feature. The areas reported by MassyTools were extracted using an m/z window of ±0.2, after automated
DOI: 10.1021/acs.jproteome.5b00658 J. Proteome Res. 2015, 14, 5088−5098
Figure 5. mAb1 data quantitation. (A) Glycan relative area with SD of analytes above 1% relative abundance. (B) Glycan relative area with SD of analytes below 1% relative abundance. (C) Glycopeptide relative area with SD of analytes above 1% relative abundance. (D) Glycopeptide relative area with SD of analytes below 1% relative abundance. The x-axis labels use a single-letter monosaccharide code, specifically, H = hexose, N = N-acetylhexosamine, F = deoxyhexose (fucose), L = α2,3-linked N-acetylneuraminic acid, and S = N-acetylneuraminic acid.
spectrum, manually integrating the first 3 isotopes of all desired analytes, copying the data to Excel, and summing the data from all isotopes.
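The relative areas and RSD values compared in this section follow from simple arithmetic on the extracted areas. The sketch below is illustrative only and is not taken from the MassyTools source: the function names and the replicate areas are hypothetical, and it assumes that each area has already been background-subtracted and summed over its isotopes.

```python
from statistics import mean, stdev


def relative_areas(areas):
    """Normalize absolute analyte areas so that they sum to 100%."""
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total area must be positive")
    return {name: 100.0 * area / total for name, area in areas.items()}


def rsd_percent(values):
    """Relative standard deviation (sample SD divided by mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical areas from three replicate spectra (not measured data).
replicates = [
    {"H4N4F1": 820.0, "H3N4F1": 120.0, "H5N4F1": 60.0},
    {"H4N4F1": 800.0, "H3N4F1": 130.0, "H5N4F1": 70.0},
    {"H4N4F1": 810.0, "H3N4F1": 125.0, "H5N4F1": 65.0},
]

normalized = [relative_areas(r) for r in replicates]
main_peak = [n["H4N4F1"] for n in normalized]
print(f"H4N4F1: {mean(main_peak):.1f}% relative area, "
      f"RSD {rsd_percent(main_peak):.1f}%")
```

Normalizing to the total of all extracted analytes means that removing analytes from the list changes every relative area, which is why the RSD of the main peak shifts when low-abundance analytes are excluded.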
above 100%). mAb1 glycopeptides below 0.1% relative abundance showed an RSD range of 8.0−19.7% using MassyTools and an RSD range of 11.2−36.4% using flexAnalysis. The correlation between RSD and analyte area for all analytes is given in Supporting Information, Figure S6. All quantitation results are included in Supporting Information, Table S6. Next, we performed automated integration of the TPNG samples (Figure 4). A total of 108 previously identified analyte compositions were used for extraction, yielding a relative area of 51.5% and an RSD of 6.4% for the main peak (H5N4E2) (Supporting Information, Table S7).37 Examining the values for all analytes showed there was a large number of analytes that had less than 0.2% relative abundance. Removal of 73 analytes with less than 0.2% relative abundance showed a repeatability that was in agreement with previously published results using the same sample (Supporting Information, Figure S7).37 When normalizing on the total intensity of the 35 most abundant features, the RSD of the main peak dropped to 5.8%. Importantly, we compared the time it took to process 108 compositions from 24 TPNG measurements. The total analysis time with MassyTools was approximately 5 min, consisting of selecting the necessary files, running the program, and loading the output into Excel. The total analysis time with flexAnalysis was approximately 10 h, consisting of baseline-subtracting each
Quality Control
A signal is defined as the maximum intensity of a feature within the m/z width set for (relative) quantitation. The S/N of each isotope of an analyte is calculated by MassyTools on the basis of the background intensity, noise, and signal. The signal is first background-corrected (i.e., the background is subtracted), and the corrected signal intensity is then divided by the noise. Per analyte, the program can report the highest calculated S/N value as well as the S/N value for each individual isotope. We evaluated the performance of the spectral quality criteria on the TPNG data. A first quality criterion is the quotient of the sum of the analyte intensity and the total spectrum intensity, which is reported as “fraction of spectrum explained by analytes”. This fraction of spectrum explained by analytes was found to be 34.0% (SD ± 2.7%) for the TPNG measurements. The maximum observed value was 39.9%, whereas the minimum observed value was 28.3%. The difference in this value was caused mainly by the intensity of the spectra, as demonstrated by the main glycan having a maximum intensity of 8.1 × 10⁵ in the best spectrum, whereas it had an intensity of only 3.5 × 10⁵ in the worst spectrum.
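The S/N definition and the first spectral quality criterion can be written out as a minimal sketch. The function names and example numbers are hypothetical, and the background and noise are assumed to have been determined beforehand; how MassyTools estimates them is not reproduced here.

```python
def signal_to_noise(signal, background, noise):
    """Background-corrected S/N: (signal - background) / noise."""
    if noise <= 0:
        raise ValueError("noise must be positive")
    return (signal - background) / noise


def analyte_sn(isotope_signals, background, noise):
    """Return the S/N of each isotope and the highest value among them."""
    per_isotope = [signal_to_noise(s, background, noise)
                   for s in isotope_signals]
    return per_isotope, max(per_isotope)


def fraction_explained(analyte_intensities, total_spectrum_intensity):
    """Summed analyte intensity divided by the total spectrum intensity."""
    if total_spectrum_intensity <= 0:
        raise ValueError("total spectrum intensity must be positive")
    return sum(analyte_intensities) / total_spectrum_intensity


# Hypothetical maximum intensities of three isotopes of one analyte.
per_isotope, highest = analyte_sn([1050.0, 650.0, 250.0],
                                  background=50.0, noise=20.0)
print(per_isotope, highest)
print(f"{100.0 * fraction_explained([300.0, 40.0], 1000.0):.1f}% explained")
```

Reporting both the per-isotope values and the maximum mirrors the two outputs described above; a curation step would typically compare the maximum against a user-defined cutoff.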
Figure 6. Feature curation. (A) Example of an analyte (H5N5F1) that was included based on the isotopic pattern and S/N quality criteria. (B) Example of an analyte (H4N4) that was excluded based on the QC value. The theoretical isotopic pattern is displayed inside the mass spectra with a dashed red line. Additionally, the QC value is listed in the top left of each panel.
results show that MassyTools calibration performs better than flexAnalysis calibration. Furthermore, quantitation using MassyTools is better than quantitation using flexAnalysis for analytes below 0.1% relative abundance, as demonstrated by lower RSD values. Importantly, MassyTools was significantly faster in all aspects of the analysis (calibration and quantitation). MassyTools allows a user unskilled in scripting to repeatedly integrate and analyze a set of analytes from MS glycosylation data. The program outputs quality control criteria to judge the validity of single analytes, as well as whole spectra. These are invaluable tools in the analysis of large clinical cohorts and biopharmaceuticals data sets, and we envision that MassyTools will enable researchers to focus on the interpretation of data rather than the technical aspects of data processing. In addition, we believe that the potential of MassyTools extends beyond glycomics and MALDI-TOF-MS, as it can be applied to any two-dimensional data (e.g., MALDI-Fourier transform ion cyclotron resonance-MS).
A second quality criterion that we evaluated was the quotient of the sum of the relative areas of analytes that have an S/N above the specified cutoff (6 by default) and the total sum of all extracted analyte areas. For the TPNG samples, this quotient showed a value of 93.7% (SD ± 2.9%). The maximum value for this parameter was 96.9%, whereas the minimum value that was observed was 83.7%. The spectra that were responsible for these values were the same spectra that gave the highest and lowest quotient of the sum of the analyte intensity and the total spectrum intensity, respectively. All analyses indicated a consistently high data quality, and we concluded that it was not necessary to remove any spectra.
Analyte Curation
Curation of analytes based on their relative abundance, as shown previously (Relative Quantitation section), may result in the loss of relevant features. Therefore, we evaluated the performance of the analyte quality criteria by curating the list of 108 analytes that were previously observed in the TPNG data.37 We used the difference between the theoretical isotopic pattern and measured isotopic pattern (QC value) and the maximum S/N ratio of an analyte as curation parameters. We removed all analytes that had a maximum S/N below 9 and a QC value below 1.0 × 10⁻⁵. The resulting list of 40 analytes was used for integration and yielded an RSD of 6.0% for the main peak. We examined the difference in the curated lists, and an example feature that was accepted using the quality criteria is H5N5F1 (Figure 6A). Alternatively, an analyte that was removed due to failing the isotopic pattern QC value is H3N4 (Figure 6B). Curating analytes based on logical quality criteria instead of looking at the most abundant analytes will ensure that overlapping signals or wrongly classified features will be removed. Furthermore, it will make it easier to identify good quality, low-abundant signals that might be very relevant to an actual study.
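To make the idea of an isotopic pattern-based score concrete, the sketch below compares a theoretical and a measured isotopic pattern. It is a simplified illustration under stated assumptions: the text defines the QC value only as the difference between the two patterns, so the specific formula used here (sum of absolute differences between normalized isotope fractions) and the function name are hypothetical and may differ from the calculation in MassyTools.

```python
def isotopic_pattern_difference(theoretical, measured):
    """Sum of absolute differences between two normalized isotopic patterns.

    Both inputs list one value per isotope, in the same order. Each pattern
    is scaled to sum to 1 before comparison, so the result lies in [0, 2]
    and is 0 for a perfect match. This formula is an assumption made for
    illustration only.
    """
    if len(theoretical) != len(measured):
        raise ValueError("patterns must cover the same number of isotopes")
    t_total = sum(theoretical)
    m_total = sum(measured)
    if t_total <= 0 or m_total <= 0:
        raise ValueError("patterns must have a positive total intensity")
    return sum(abs(t / t_total - m / m_total)
               for t, m in zip(theoretical, measured))


# Hypothetical theoretical fractions and measured isotope areas.
theory = [0.5, 0.3, 0.2]
good_match = [50.0, 30.0, 20.0]   # same shape as theory
distorted = [20.0, 30.0, 50.0]    # e.g. an overlapping signal on isotope 3

print(isotopic_pattern_difference(theory, good_match))
print(isotopic_pattern_difference(theory, distorted))
```

A score of this kind is independent of absolute intensity, which is what allows a low-abundance analyte with a clean isotopic pattern to pass curation while a more intense but overlapped signal fails.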
■
ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jproteome.5b00658. MassyTools (ZIP) Isotopic quality criteria (Figure S1); TPNG reference spectra (Figure S2); mAb1 calibration (Figure S3); cubic spline fit (Figure S4); background window (Figure S5); mAb1 integration (Figure S6); TPNG glycan profile (Figure S7) (PDF) MassyTools building blocks (Table S1); MassyTools parameters (Table S2); mAb1 calibration (Table S3); TPNG calibration (Table S4); calibration methods (Table S5); mAb1 integration (Table S6); TPNG integration (Table S7) (XLSX)
■
CONCLUDING REMARKS
Here, we have presented an easy-to-use high-throughput data processing tool for the analysis of MALDI-MS glycosylation data. MassyTools was applied to a biopharmaceutical sample and a complex (TPNG) sample to compare its performance with existing and commonly used data processing tools. The
■
AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected]. Tel.: +31-71-5268744.
Author Contributions
T.; Vitart, V.; Scheijen, B.; Uh, H.-W.; Molokhia, M.; Patrick, A. L.; McKeigue, P.; Kolčić, I.; Lukić, I. K.; Swann, O.; van Leeuwen, F. N.; Ruhaak, L. R.; Houwing-Duistermaat, J. J.; Slagboom, P. E.; Beekman, M.; de Craen, A. J. M.; Deelder, A. M.; Zeng, Q.; Wang, W.; Hastie, N. D.; Gyllensten, U.; Wilson, J. F.; Wuhrer, M.; Wright, A. F.; Rudd, P. M.; Hayward, C.; Aulchenko, Y.; Campbell, H.; Rudan, I. Loci Associated with N-Glycosylation of Human Immunoglobulin G Show Pleiotropy with Autoimmune Diseases and Haematological Cancers. PLoS Genet. 2013, 9, e1003225. (11) Huffman, J. E.; Knezevic, A.; Vitart, V.; Kattla, J.; Adamczyk, B.; Novokmet, M.; Igl, W.; Pucic, M.; Zgaga, L.; Johannson, A.; Redzic, I.; Gornik, O.; Zemunik, T.; Polasek, O.; Kolcic, I.; Pehlic, M.; Koeleman, C. A.; Campbell, S.; Wild, S. H.; Hastie, N. D.; Campbell, H.; Gyllensten, U.; Wuhrer, M.; Wilson, J. F.; Hayward, C.; Rudan, I.; Rudd, P. M.; Wright, A. F.; Lauc, G. Polymorphisms in B3GAT1, SLC9A9 and MGAT5 are associated with variation within the human plasma N-glycome of 3533 European adults. Hum. Mol. Genet. 2011, 20, 5000−5011. (12) Selman, M. H.; Hoffmann, M.; Zauner, G.; McDonnell, L. A.; Balog, C. I.; Rapp, E.; Deelder, A. M.; Wuhrer, M. MALDI-TOF-MS analysis of sialylated glycans and glycopeptides using 4-chloro-alpha-cyanocinnamic acid matrix. Proteomics 2012, 12, 1337−1348. (13) Wuhrer, M. Glycomics using mass spectrometry. Glycoconjugate J. 2013, 30, 11−22. (14) Bladergroen, M. R.; Derks, R. J.; Nicolardi, S.; de Visser, B.; van Berloo, S.; van der Burgt, Y. E.; Deelder, A. M. Standardized and automated solid-phase extraction procedures for high-throughput proteomics of body fluids. J. Proteomics 2012, 77, 144−153. (15) Pirman, D. A.; Efuet, E.; Ding, X.-P.; Pan, Y.; Tan, L.; Fischer, S. M.; DuBois, R. N.; Yang, P. Changes in Cancer Cell Metabolism Revealed by Direct Sample Analysis with MALDI Mass Spectrometry. PLoS One 2013, 8, e61379. (16) Yang, X.; Hu, L.; Ye, M.; Zou, H. 
Analysis of the human urine endogenous peptides by nanoparticle extraction and mass spectrometry identification. Anal. Chim. Acta 2014, 829, 40−47. (17) Bondt, A.; Rombouts, Y.; Selman, M. H.; Hensbergen, P. J.; Reiding, K. R.; Hazes, J. M.; Dolhain, R. J.; Wuhrer, M. IgG Fab glycosylation analysis using a new mass spectrometric high-throughput profiling method reveals pregnancy-associated changes. Mol. Cell. Proteomics 2014, 13, 3029−3039. (18) Opota, O.; Croxatto, A.; Prod’hom, G.; Greub, G. Blood culture-based diagnosis of bacteraemia: state of the art. Clin. Microbiol. Infect. 2015, 21, 313−322. (19) Patel, R. MALDI-TOF MS for the diagnosis of infectious diseases. Clin. Chem. 2015, 61, 100−111. (20) Reusch, D.; Haberger, M.; Selman, M. H. J.; Bulau, P.; Deelder, A. M.; Wuhrer, M.; Engler, N. High-throughput work flow for IgG Fc-glycosylation analysis of biotechnological samples. Anal. Biochem. 2013, 432, 82−89. (21) Buse, J.; Purves, R. W.; Verrall, R. E.; Badea, I.; Zhang, H.; Mulligan, C. C.; Peru, K. M.; Bailey, J.; Headley, J. V.; El-Aneed, A. The development and assessment of high-throughput mass spectrometry-based methods for the quantification of a nanoparticle drug delivery agent in cellular lysate. J. Mass Spectrom. 2014, 49, 1171−1180. (22) Niklas, J.; Hollemeyer, K.; Heinzle, E. High-throughput phospholipid quantitation in mammalian cells using matrix-assisted laser desorption ionization-time of flight mass spectrometry with N-trifluoroacetyl-phosphatidylethanolamine as internal standard. Anal. Biochem. 2011, 419, 351−353. (23) Strohalm, M.; Kavan, D.; Novák, P.; Volný, M.; Havlíček, V. mMass 3: A Cross-Platform Software Environment for Precise Analysis of Mass Spectrometric Data. Anal. Chem. 2010, 82, 4648−4651. (24) Pluskal, T.; Castillo, S.; Villar-Briones, A.; Oresic, M. MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinf. 2010, 11, 395. 
(25) ms-utils.org bioinformatics tools. http://www.ms-utils.org/wiki/pmwiki.php/Main/SoftwareList.
B.C.J. designed the software, wrote the software, and analyzed the data. K.R.R. measured the TPNG data and assisted in designing and testing the software. A.B. assisted in designing and testing the software. A.L.H.E. measured the mAb1 data. M.P. advised the project with regard to software development. D.F. analyzed the mAb1 data, assisted in designing and testing the software, assisted with MS figures-of-merit data interpretation, and assisted in preparing the manuscript. M.W. directed and advised the project with regard to software development and data analysis and revised the manuscript.
Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS
We thank S. Nicolardi for fruitful discussions regarding the principles of isotopic pattern-based QC values. We would also like to thank R. Plomp for designing the MassyTools logo. D. Falck additionally acknowledges financial support by Hoffmann-La Roche (Penzberg, Germany). Furthermore, we would like to thank D. Reusch (Hoffmann-La Roche), M. Haberger (Hoffmann-La Roche), and various participants of the Stack Exchange network for helpful discussions regarding programming techniques. This work was supported by the Horizon Programme Zenith project funded by the Netherlands Genomic Initiative (project no. 93511033) as well as by the European Union (Seventh Framework Programme HighGlycan project, grant no. 278535).
■
REFERENCES
(1) Arnold, J. N.; Wormald, M. R.; Sim, R. B.; Rudd, P. M.; Dwek, R. A. The impact of glycosylation on the biological function and structure of human immunoglobulins. Annu. Rev. Immunol. 2007, 25, 21−50. (2) Varki, A. Biological roles of oligosaccharides: all of the theories are correct. Glycobiology 1993, 3, 97−130. (3) Omtvedt, L. A.; Royle, L.; Husby, G.; Sletten, K.; Radcliffe, C. M.; Harvey, D. J.; Dwek, R. A.; Rudd, P. M. Glycan analysis of monoclonal antibodies secreted in deposition disorders indicates that subsets of plasma cells differentially process IgG glycans. Arthritis Rheum. 2006, 54, 3433−3440. (4) Selman, M. H. J.; McDonnell, L. A.; Palmblad, M.; Ruhaak, L. R.; Deelder, A. M.; Wuhrer, M. Immunoglobulin G Glycopeptide Profiling by Matrix-Assisted Laser Desorption Ionization Fourier Transform Ion Cyclotron Resonance Mass Spectrometry. Anal. Chem. 2010, 82, 1073−1081. (5) Adamczyk, B.; Tharmalingam, T.; Rudd, P. M. Glycans as cancer biomarkers. Biochim. Biophys. Acta, Gen. Subj. 2012, 1820, 1347−1353. (6) Christiansen, M. N.; Chik, J.; Lee, L.; Anugraham, M.; Abrahams, J. L.; Packer, N. H. Cell surface protein glycosylation in cancer. Proteomics 2014, 14, 525−546. (7) Rombouts, Y.; Willemze, A.; van Beers, J. J.; Shi, J.; Kerkman, P. F.; van Toorn, L.; Janssen, G. M.; Zaldumbide, A.; Hoeben, R. C.; Pruijn, G. J.; Deelder, A. M.; Wolbink, G.; Rispens, T.; van Veelen, P. A.; Huizinga, T. W.; Wuhrer, M.; Trouw, L. A.; Scherer, H. U.; Toes, R. E. Extensive glycosylation of ACPA-IgG variable domains modulates binding to citrullinated antigens in rheumatoid arthritis. Ann. Rheum. Dis. 2015, DOI: 10.1136/annrheumdis-2014-206598. (8) Beck, A.; Wurch, T.; Bailly, C.; Corvaia, N. Strategies and challenges for the next generation of therapeutic antibodies. Nat. Rev. Immunol. 2010, 10, 345−352. (9) Shubhakar, A.; Reiding, K.; Gardner, R.; Spencer, D. R.; Fernandes, D.; Wuhrer, M. High-Throughput Analysis and Automation for Glycomics Studies. 
Chromatographia 2015, 78, 321−333. (10) Lauc, G.; Huffman, J. E.; Pučić, M.; Zgaga, L.; Adamczyk, B.; Mužinić, A.; Novokmet, M.; Polašek, O.; Gornik, O.; Krištić, J.; Keser,
(26) Dallas, D. C.; Martin, W. F.; Hua, S.; German, J. B. Automated glycopeptide analysis – review of current state and future directions. Briefings Bioinf. 2013, 14, 361−374. (27) Hufnagel, P.; Resemann, A.; Jabs, W.; Marx, K.; Schweiger-Hufnagel, U. Automated Detection and Identification of N- and O-glycopeptides, Proceedings of the Beilstein glyco-bioinformatics symposium, Potsdam, Germany, June 10−14, 2013. (28) Ozohanics, O.; Krenyacz, J.; Ludányi, K.; Pollreisz, F.; Vékey, K.; Drahos, L. GlycoMiner: a new software tool to elucidate glycopeptide composition. Rapid Commun. Mass Spectrom. 2008, 22, 3245−3254. (29) Ceroni, A.; Maass, K.; Geyer, H.; Geyer, R.; Dell, A.; Haslam, S. M. GlycoWorkbench: A Tool for the Computer-Assisted Annotation of Mass Spectra of Glycans. J. Proteome Res. 2008, 7, 1650−1659. (30) Wu, S.-W.; Liang, S.-Y.; Pu, T.-H.; Chang, F.-Y.; Khoo, K.-H. Sweet-Heart – An integrated suite of enabling computational tools for automated MS2/MS3 sequencing and identification of glycopeptides. J. Proteomics 2013, 84, 1−16. (31) Gibb, S.; Strimmer, K. MALDIquant: a versatile R package for the analysis of mass spectrometry data. Bioinformatics 2012, 28, 2270−2271. (32) Anaconda. https://store.continuum.io/cshop/anaconda/. (33) van Rossum, G.; Drake, F. L., Jr. Python reference manual, Technical Report CS-R9526; Centrum voor Wiskunde en Informatica: Amsterdam, 1995. (34) van der Walt, S.; Colbert, S. C.; Varoquaux, G. The NumPy Array: A Structure for Efficient Numerical Computation. Comput. Sci. Eng. 2011, 13, 22−30. (35) Oliphant, T. E. Python for Scientific Computing. Comput. Sci. Eng. 2007, 9, 10−20. (36) Hunter, J. D. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 2007, 9, 90−95. (37) Reiding, K. R.; Blank, D.; Kuijper, D. M.; Deelder, A. M.; Wuhrer, M. High-throughput profiling of protein N-glycosylation by MALDI-TOF-MS employing linkage-specific sialic acid esterification. Anal. Chem. 2014, 86, 5784−5793. 
(38) Deutsch, E. W. Mass spectrometer output file format mzML. Methods Mol. Biol. 2010, 604, 319−331. (39) Berglund, M.; Wieser, M. E. Isotopic compositions of the elements 2009 (IUPAC Technical Report). Pure Appl. Chem. 2009, 83, 397−410. (40) Nicolardi, S.; Palmblad, M.; Dalebout, H.; Bladergroen, M.; Tollenaar, R. E. M.; Deelder, A.; van der Burgt, Y. M. Quality control based on isotopic distributions for high-throughput MALDI-TOF and MALDI-FTICR serum peptide profiling. J. Am. Soc. Mass Spectrom. 2010, 21, 1515−1525. (41) Royle, L.; Radcliffe, C.; Dwek, R.; Rudd, P. Detailed Structural Analysis of N-Glycans Released From Glycoproteins in SDS-PAGE Gel Bands Using HPLC Combined With Exoglycosidase Array Digestions. Glycobiology Protocols 2006, 347, 125−144. (42) Reusch, D.; Haberger, M.; Maier, B.; Maier, M.; Kloseck, R.; Zimmermann, B.; Hook, M.; Szabo, Z.; Tep, S.; Wegstein, J.; Alt, N.; Bulau, P.; Wuhrer, M. Comparison of methods for the analysis of therapeutic immunoglobulin G Fc-glycosylation profiles – Part 1: Separation-based methods. mAbs 2015, 7, 167−179. (43) Kornfeld, R.; Kornfeld, S. Assembly of asparagine-linked oligosaccharides. Annu. Rev. Biochem. 1985, 54, 631−664. (44) Fujii, S.; Nishiura, T.; Nishikawa, A.; Miura, R.; Taniguchi, N. Structural heterogeneity of sugar chains in immunoglobulin G. Conformation of immunoglobulin G molecule and substrate specificities of glycosyltransferases. J. Biol. Chem. 1990, 265, 6009−6018. (45) Arnold, J. N.; Wormald, M. R.; Suter, D. M.; Radcliffe, C. M.; Harvey, D. J.; Dwek, R. A.; Rudd, P. M.; Sim, R. B. Human Serum IgM Glycosylation: Identification of Glycoforms That Can Bind to Mannan-Binding Lectin. J. Biol. Chem. 2005, 280, 29080−29087. (46) Mattu, T. S.; Pleass, R. J.; Willis, A. C.; Kilian, M.; Wormald, M. R.; Lellouch, A. C.; Rudd, P. M.; Woof, J. M.; Dwek, R. A. The Glycosylation and Structure of Human Serum IgA1, Fab, and Fc
Regions and the Role of N-Glycosylation on Fcα Receptor Interactions. J. Biol. Chem. 1998, 273, 2260−2272. (47) Dage, J. L.; Ackermann, B. L.; Halsall, H. B. Site localization of sialyl Lewis x antigen on α1-acid glycoprotein by high performance liquid chromatography-electrospray mass spectrometry. Glycobiology 1998, 8, 755−760.