
Article

Characterizing Protein Glycosylation through On-Chip Glycan Modification and Probing Bryan S. Reatini, Elliot Ensink, Brian Liau, Jessica Y. Sinha, Thomas W. Powers, Katie Partyka, Marshall Bern, Randall E. Brand, Pauline Mary Rudd, Doron Kletter, Richard R. Drake, and Brian B. Haab Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.6b02998 • Publication Date (Web): 03 Nov 2016 Downloaded from http://pubs.acs.org on November 12, 2016


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

On-Chip Glycan Analysis

Characterizing Protein Glycosylation through On-Chip Glycan Modification and Probing

Bryan S. Reatini,1 Elliot Ensink,1° Brian Liau,2 Jessica Y. Sinha,1† Thomas W. Powers,3 Katie Partyka,1 Marshall Bern,4 Randall E. Brand,5 Pauline M. Rudd,2,6 Doron Kletter,4 Richard Drake,4 and Brian B. Haab1*

1 Van Andel Research Institute, Grand Rapids, Michigan; 2 Bioprocessing Technology Institute, Singapore; 3 Medical University of South Carolina, Charleston, South Carolina; 4 Protein Metrics, Inc., San Carlos, California; 5 University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania; 6 National Institute for Bioprocessing Research and Training, Dublin, Ireland

ABSTRACT: Glycans are critical to protein biology and are useful as disease biomarkers. Many studies of glycans rely on clinical specimens, but the low amount of sample available for some specimens limits the experimental options. Here we present a method to obtain information about protein glycosylation using a minimal amount of protein. We treat proteins that were captured or directly spotted in small microarrays (2.2 x 2.2 mm) with exoglycosidases to successively expose underlying features, and then we probe the native or exposed features using a panel of lectins or glycan-binding reagents. We developed an algorithm to interpret the data and provide predictions about the glycan motifs that are present in the sample. We demonstrated the efficacy of the method to characterize differences between glycoproteins in their sialic acid linkages and N-linked glycan branching, and we validated the assignments by comparing results from mass spectrometry and chromatography. The amount of protein used on-chip was about 11 nanograms. The method also proved effective for analyzing the glycosylation of a cancer biomarker in human plasma, MUC5AC, using only 20 µL of the plasma. A glycan on MUC5AC that is associated with cancer had mostly 2,3-linked sialic acid, whereas other glycans on MUC5AC had a 2,6 linkage of sialic acid. The on-chip glycan modification and probing (on-chip GMAP) method provides a platform for analyzing protein glycosylation in clinical specimens and could complement the existing toolkit for studying glycosylation in disease.

INTRODUCTION
Deciphering glycan structures on glycoproteins is an important goal for better understanding the roles of protein glycosylation in normal and disease biology1. For disease research, studies using clinical specimens are particularly valuable, as they enable a direct look at the location and amount of expression of glycans in the biological context. But such studies are challenging using current methods of analyzing glycans, mainly because the methods require more sample than is typically available from clinical specimens. The use of mass spectrometry and chromatography to analyze protein glycosylation requires purification of a protein in microgram quantities2, which is not possible if the specimen is available only in small amounts or if the protein of interest has too low a concentration. Furthermore, the throughput and precision of methods that require multiple preparation and purification steps normally would not be high enough to determine differences among patient cohorts. Although specialized centers have automated systems with powerful capabilities3, new methods nevertheless are needed to provide complementary information and to broaden access to glycan analysis.

Here we developed a method that combines the low-volume and multiplexing capabilities of microarrays with the precision of affinity reagents and enzymes to probe and modify glycans. Researchers have used affinity reagents, including lectins and glycan-binding antibodies, for distinguishing closely related or rare glycan motifs in a variety of experimental formats4 such as histochemical staining5 and lectin microarrays6-7. A format that is valuable for probing glycosylation on specific proteins is a sandwich ELISA-type microtiter assay, in which a surface-bound antibody captures a protein out of solution, and a solution-phase lectin binds to glycans on the captured protein. Researchers used such an assay to analyze glycosylation variants of alpha-fetoprotein8 and prostate-specific antigen9-10, among others. We multiplexed and miniaturized the format using antibody microarrays11. The microarray was particularly valuable for biomarker studies because it required only microliters of sample and concentrations of the targeted proteins in the low ng/mL range11. Moreover, we obtained the throughput and precision necessary for examining large patient cohorts.
Using antibody-lectin sandwich arrays, we identified glycan biomarkers in the plasma of pancreatic cancer patients12 and demonstrated that specific protein glycoforms can be better biomarkers than the core proteins13-15. Other groups also have effectively used this format for disease and biomarker studies16-18. A limitation in the use of lectins is that they provide information on only one glycan motif that is accessible to the lectin. Additional information about the underlying glycan structures would be valuable in most studies and can be provided by LC-MS based approaches, but as noted above, the information can be difficult to obtain owing to sample limitations. An alternative approach is to combine the use of lectins with the use of exoglycosidases to remove terminal saccharides and expose underlying features, analogous to the well-established method of sequential glycosidase digestions followed by chromatographic separations19. We postulated that such a method could apply to proteins captured on antibody arrays; instead of examining changes in glycan size, we could examine changes in lectin binding profile. The strategy is to measure binding across a panel of lectins incubated with and without prior enzymatic digestion, and then to predict which glycan motifs are present based on the integrated information. The possibility of acquiring such data was clear from previous work using sialidase on proteins captured by antibody arrays11 and on glycans in glycan arrays20.

But the experimental methods are only one part of the challenge; one also needs software to interpret the data. Manual or qualitative interpretation becomes ineffective when attempting to integrate information from multiple lectins and from changes in binding patterns upon enzymatic treatment, especially when accounting for the unique and complex binding specificity of each lectin. We approached this problem by building on our previous work in informatics for glycan analysis. Our group was the first to present an algorithm for the automated analysis of glycan array data, a method called Motif Segregation21, which was followed by additional, useful methods for glycan array analysis22-24. A large repository of glycan array data is available through the Consortium for Functional Glycomics (CFG), and we analyzed the set in its entirety to obtain the binding specificities of hundreds of lectins and glycan-binding antibodies25. The analysis was critical for developing a method to predict the glycan motifs in a biological sample based on measurements from a panel of lectins26. Starting from the above foundations, in this research we developed the ability to treat a microarray of proteins with exoglycosidases and to probe the glycans both in their native form and in their enzyme-treated form. Furthermore, we developed an algorithm to interpret the data to provide evaluations of the glycan motifs present on the glycoproteins.
In this report we present 1) a description of the experimental method and analysis algorithm; 2) the testing of the method on control proteins and comparing the results to data from orthogonal methods (mass spectrometry and chromatography); and 3) the application of the method to interrogating biomarker glycosylation using a small amount of a clinical specimen. The biomarker was MUC5AC obtained from 20 µL of human plasma, and we sought to answer a specific question about a glycan that is upregulated in pancreatic cancer.

EXPERIMENTAL SECTION

Protein and Antibody Microarray Fabrication and Use
The antibodies, lectins, control proteins, and enzymes were purchased from various sources (Table S-1). We printed the capture antibodies or glycoproteins onto coated microscope slides (PATH, Grace Bio-Labs, Bend, OR) using a robotic arrayer (2470, Aushon Biosystems, Billerica, MA) (see the Supporting Information for additional details). Each slide contained 192 identical arrays arranged in an 8 x 24 grid with 2.25 mm spacing between arrays, and each array had the same antibodies or proteins printed in six replicates. After printing, hydrophobic borders were imprinted onto the slides (SlideImprinter, The Gel Company, San Francisco, CA) to segregate the arrays and allow for multiple, separate sample incubations on each slide27.

The assays were modified from the protocol described previously11 (see the Supporting Information for additional details). For experiments involving the analysis of human plasma, we diluted each sample 2-fold into a buffer (1X PBS with 0.1% Tween-20, 0.1% Brij-35, species-specific blocking antibodies, and protease inhibitor) and incubated the sample on an antibody array overnight at 4 °C. We prepared α2-3 Neuraminidase (P0728L, New England Biolabs, Ipswich, MA) or α2-3, 6, 8 Neuraminidase (P0720S, New England Biolabs, Ipswich, MA) at a concentration of 250 U/mL in the supplied reaction buffer and incubated each separately on arrays containing the captured or spotted glycoproteins overnight at 37 °C. The arrays not treated with enzymes were incubated with the enzyme reaction buffer in the same conditions. We incubated each array with a biotinylated lectin solution (3 µg/mL in 1X PBS with 0.1% Tween-20 and 0.1% BSA) and subsequently with Cy5-conjugated streptavidin (43-4316, Invitrogen, Carlsbad, CA) (2 µg/mL in the same buffer as the lectins). The slides were scanned for fluorescence at 633 nm using a microarray scanner (LS Reloaded, TECAN, Morrisville, NC).

Data Analysis and Software
The fluorescence images were quantified and analyzed using custom, in-house software28 (see the Supporting Information for additional details). For the calculation of motif scores we used custom software written in Java (Sun Microsystems), and we used Matlab (R2015b, Mathworks) for calculating the motif prediction scores. For final data processing and figure making, we used Microsoft Excel, GraphPad Pro, and Deneba Canvas.
Human Plasma Samples
All collections took place at the University of Pittsburgh Medical Center following informed consent of the participants and prior to any surgical or medical procedures. The donors were patients with pancreatic cancer (n = 206) and patients with pancreatitis or benign biliary obstruction (n = 49). All blood samples (EDTA plasma) were collected according to the standard operating procedure from the Early Detection Research Network and were frozen at -70 °C or colder within 4 hours of time of collection. Aliquots were shipped on dry ice and thawed no more than three times prior to analysis.


Glycan Analysis by Mass Spectrometry
N-glycans released by PNGaseF digestion of fetuin and transferrin were extracted from precipitated protein and dried by vacuum centrifugation in preparation for ethyl esterification of terminal sialic acid residues. The ethyl esterification protocol, including the modification and enrichment, was adapted from Reiding et al.29 as previously reported30. Glycans were spotted with CHCA matrix and analyzed by MALDI-FTICR as previously described30.

Glycan Analysis by Chromatography
N-glycans were released from transferrin and AGP by in-gel digestion and analyzed by sequencing chromatography according to the method of Royle et al.3 with modifications (see the Supporting Information for details).

Safety Considerations
Human blood plasma is a potential source of infectious agents. Researchers should handle specimens collected from human subjects with the appropriate precautions.


RESULTS

On-Chip Glycan Modification and Probing
To acquire data on the glycosylation of a protein, we probe the protein with a panel of lectins that are incubated either with or without prior modification of the glycans by an exoglycosidase (Fig. 1). The signals are quantified and used in an algorithm to predict the glycan motifs that are present on the proteins.

Figure 1. On-chip glycan modification and probing. Glycoproteins were immobilized onto a planar surface, and the glycans on the proteins were probed by a panel of lectins, either with or without prior modification of the glycans using enzymes. We quantified the binding of each lectin under each condition and then used an algorithm to predict the motifs that are present on the glycoprotein.

To test the method, we sought to distinguish between two related motifs: alpha 2,3-linked sialic acid and alpha 2,6-linked sialic acid. Enzymes are available that differentially cleave these features (Fig. 2A). One cleaves only 2,3-linked sialic acid (referred to as sialidase 1) and another is a pan-sialidase that cleaves 2,3, 2,6, 2,8 and 2,9 linkages (referred to as sialidase 2). We also selected a panel of lectins that bind either sialic acid in one of its linkages or that bind non-sialylated glycans that would be exposed upon removal of sialic acid (Fig. 2B). We then applied these reagents to the analysis of purified glycoproteins that we had printed in microarrays on microscope slides.

The amount of protein in each microarray is about 170 picograms (170 x 10^-12 grams), based on a protein concentration of 250 µg/mL, a spot volume of 170 pL, and 4 replicate spots per array. The entire analysis used 63 microarrays, based on 7 lectins, 3 conditions (no enzyme, sialidase 1, and sialidase 2), and 3 replicate arrays per condition. Thus the protein consumption was around 170 pg x 63 = 11 nanograms, equivalent to 140 femtomoles for an 80 kDa protein such as transferrin. Here the value of low-volume microarrays is apparent. The 2.2 x 2.2 mm arrays use 1.5 µL per incubation, so the lectin and enzyme consumption is 63 x 1.5 µL = 95 µL for the entire analysis.
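The consumption figures above follow from a few unit conversions. The short Python sketch below reproduces them from the quantities quoted in the text; the variable names are ours, and the unrounded results agree with the quoted values to within rounding (the text rounds to 11 ng before converting to femtomoles).

```python
# Protein and reagent consumption for the spotted-glycoprotein experiment,
# computed only from the quantities quoted in the text.
CONC_UG_PER_ML = 250.0    # concentration of the printed protein solution
SPOT_VOLUME_PL = 170.0    # volume of one printed spot
SPOTS_PER_ARRAY = 4       # replicate spots per array
LECTINS, CONDITIONS, REPLICATES = 7, 3, 3
INCUBATION_UL = 1.5       # reagent volume per array incubation
MW_G_PER_MOL = 80_000.0   # approximate molar mass of transferrin

# 1 ug/mL x 1 pL = 1e-6 g/mL x 1e-9 mL = 1e-15 g = 1e-3 pg
pg_per_spot = CONC_UG_PER_ML * SPOT_VOLUME_PL * 1e-3
pg_per_array = pg_per_spot * SPOTS_PER_ARRAY
n_arrays = LECTINS * CONDITIONS * REPLICATES
total_ng = pg_per_array * n_arrays / 1000.0
total_fmol = total_ng * 1e-9 / MW_G_PER_MOL * 1e15
reagent_ul = n_arrays * INCUBATION_UL

print(f"{pg_per_array:.0f} pg per array, {n_arrays} arrays")
print(f"{total_ng:.1f} ng protein, {total_fmol:.0f} fmol, {reagent_ul:.1f} uL reagent")
```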


The fluorescence images of the lectin binding showed major differences between the lectins and conditions (Fig. 2C). The quantified data (Fig. 2D) showed that the binding of lectins with specificity for sialic acid decreased upon enzymatic treatment, whereas the binding of lectins with specificity for underlying glycans increased. The differences between signals obtained with enzymatic treatment and without enzymatic treatment more clearly showed the changes and the differences between the proteins (Fig. 2D). SNA, which binds primarily 2,6-linked sialic acid, showed loss of binding after application of sialidase 2 but not sialidase 1. In contrast, MAL-1, which binds primarily 2,3-linked sialic acid, showed loss of binding after application of either sialidase. Thus the changes in binding qualitatively matched our expectations.

Figure 2. Test case: distinguishing α2,3 from α2,6 sialic acid. A) We used two enzymes. Sialidase 1 cleaves only α2,3 sialic acid, and sialidase 2 cleaves all linkages of sialic acid. B) We selected lectins based on their primary specificities against the motifs targeted by the enzymes and the motifs expected to be exposed after enzyme treatment. The fine specificities of the lectins are more complex than depicted here. C) The images show the fluorescence signals from microarrays produced by direct spotting of purified glycoproteins. We present representative examples from each treatment condition. D) The heatmaps show the relative signals under each condition as well as the differences between signals obtained with and without enzyme treatment, indicated by ∆Sialidase1 and ∆Sialidase2. To normalize the signals, all the values obtained with a given lectin were divided by the highest value for that lectin, making the range 0 to 1 for the original values and the range -1 to 1 for the differences.

We evaluated the reproducibility of the measurements using 3 independent experiments (Fig. S-3). The coefficients of variation between the replicates averaged 0.42 for transferrin and 0.37 for fetuin, which is acceptable given the developmental stage of the assays. The average Pearson correlations across the replicate sets were 0.94 for transferrin and 0.95 for fetuin. These numbers indicate stability in the measurements and the ability to provide consistent motif predictions.
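The per-lectin normalization, the ∆Sialidase differences, and the coefficient of variation described above can be illustrated with a minimal Python sketch. It is not the in-house software used for the study; the function names, the condition labels, and the raw signal values are hypothetical and were chosen only to mimic the qualitative pattern reported for SNA, MAL-1, and a lectin that binds an exposed underlying glycan (ECL).

```python
from statistics import mean, stdev

def normalize_by_lectin(signals):
    """Divide every signal for a lectin by that lectin's highest signal.

    signals: {lectin: {condition: raw fluorescence}}. Assumes each lectin
    has at least one positive value, so the result lies in the range 0 to 1.
    """
    normalized = {}
    for lectin, by_condition in signals.items():
        top = max(by_condition.values())
        normalized[lectin] = {c: v / top for c, v in by_condition.items()}
    return normalized

def delta_from_untreated(normalized, untreated="none"):
    """Enzyme-treated signal minus untreated signal (range -1 to 1)."""
    out = {}
    for lectin, by_condition in normalized.items():
        base = by_condition[untreated]
        out[lectin] = {c: v - base
                       for c, v in by_condition.items() if c != untreated}
    return out

def coefficient_of_variation(replicates):
    """Sample standard deviation divided by the mean of replicate values."""
    return stdev(replicates) / mean(replicates)

# Hypothetical raw signals, for illustration only.
raw = {
    "SNA":   {"none": 1000.0, "sialidase1": 950.0, "sialidase2": 100.0},
    "MAL-1": {"none": 400.0,  "sialidase1": 40.0,  "sialidase2": 20.0},
    "ECL":   {"none": 50.0,   "sialidase1": 200.0, "sialidase2": 800.0},
}
normalized = normalize_by_lectin(raw)
deltas = delta_from_untreated(normalized)
for lectin, d in deltas.items():
    print(lectin, {c: round(v, 3) for c, v in d.items()})
```

In this made-up example the SNA signal falls only after sialidase 2, the MAL-1 signal falls after either enzyme, and the ECL signal rises as sialic acid is removed, which is the pattern the difference values are meant to expose.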


Automated Interpretation of Lectin Binding and Glycan Modification
An important requirement for the usability of the method is an algorithm to interpret the measurements. Such a process requires quantification both of the specificities of the lectins and of the expected changes in binding following enzymatic modification. For this step we used glycan array data and a bioinformatics method called motif segregation21. First we defined a set of substructures of glycans, referred to as motifs, that represent potential binding determinants of lectins. We defined 158 motifs (Table S-2) covering sialylated and non-sialylated features commonly found in mammalian glycans. Next, using glycan array data obtained from the Consortium for Functional Glycomics, we determined the presence or absence of each motif in each glycan on the array. (For example, the glycan ‘Galβ1-4(Fucα1-3)GlcNAcβ1’ contains the motif ‘terminal Galβ1-4’, but the glycan ‘Galβ1-3(Fucα1-3)GlcNAcβ’ does not.) Finally, using the binding intensities to each glycan for a particular lectin, we calculated a score for each motif based on the difference in intensities between the glycans that contain a motif and those that do not. In this way we calculated the 158 “motif scores” for each lectin.

Figure 3. Motif scores for the original and modified motifs. A) We defined 158 motifs covering the variants we were probing. For each motif, we defined modified versions representing treatment by either sialidase 1 or sialidase 2. Two representative motifs are illustrated here. The angled brackets refer to attributes of the immediately following monosaccharide. (See supplementary information for details on the motif language and the motifs.) B) For each of the 158 motifs we calculated a motif score from glycan array data (obtained from the Consortium for Functional Glycomics) for each lectin, using both the original and modified motifs. We then calculated the differences between the modified and original motif scores, indicated by ∆Sialidase1 and ∆Sialidase2.

To obtain the expected changes in lectin binding upon enzymatic modification, we defined modified motifs corresponding to the original motifs. For example, the motif Siaα2-3Galβ1-4GlcNAcβ becomes Galβ1-4GlcNAcβ after treatment with sialidase 1 (the sialidase with specificity for only 2,3-linked sialic acid), but the motif Siaα2-6Galβ1-4GlcNAcβ is unchanged after treatment with sialidase 1 (Fig. 3A). We calculated motif scores using the modified motifs, and we then determined the differences between the modified and the original motif scores for each lectin and enzyme (Fig. 3B). The latter we refer to as “delta motif scores.”

Next, we arrived at a “motif prediction score.” The motif score for each lectin is multiplied by the normalized binding intensity for the corresponding lectin, and the resulting products are summed over all lectins (Fig. 4A). In addition, the delta motif scores are multiplied by the delta intensities and summed over the lectins (Figs. 4B and 4C). Finally, the motif prediction scores are summed over all conditions. When applied to transferrin, several motifs showed high scores only after the use of enzymatic modifications (Fig. 4D).

Figure 4. Calculation of the motif prediction scores for transferrin. A) Each motif score for a particular lectin was multiplied by the fluorescence signal for that lectin. The resulting products were summed over all lectins for each motif, giving the motif prediction scores for the unmodified glycans. The calculation was the same for the glycans after modification by sialidase 1 (panel B) and sialidase 2 (panel C), using the ∆motif scores and the ∆intensities. The final step was to sum the motif prediction scores over all conditions. D) The top motif was 6-sialyl N-acetyllactosamine connected to unbranched mannose. Some of the top 5 motifs had high scores only after summing over the enzyme conditions.

Comparison to Orthogonal Methods
We next assessed the accuracy of the method by comparing results on the control glycoproteins to data from orthogonal glycan analysis methods. A cluster of the top motifs calculated for 5 glycoproteins showed clear groupings among the proteins (Fig. 5A). The groupings mainly were defined by the linkage of sialic acid, branching, and terminal mannose or


galactose. According to motif prediction, fetuin displays more 2,3-linked than 2,6-linked sialic acid on simple N-glycan extensions (motifs 37 and 55) and extensions with branching (motifs 44 and 62), but transferrin has more 2,6-linked sialic acid. AGP is predicted to have more branched than unbranched glycans, in contrast to transferrin and fetuin.

Figure 5. Validation of predicted motifs. A) The heatmap of top prediction motifs shows both similarities and differences among the 5 proteins. The top motifs showed a relative dominance of motifs with α2,3 sialic acid in fetuin and motifs with α2,6 sialic acid in transferrin. AGP showed higher amounts of branched motifs (e.g. motifs 62 and 44) relative to transferrin, which showed higher unbranched motifs (e.g. motif 55). B) A quantification by mass spectrometry showed that the top glycans in transferrin primarily displayed α2,6 sialic acid, but the top glycans in fetuin primarily displayed α2,3 sialic acid. The glycans shown are probable structures based on biosynthetic rules. C) Quantitative glycan analysis by chromatography confirmed differences between transferrin and AGP in the branching of N-linked glycans. D) The top 15 motifs in either fetuin or transferrin had probable assignments in the top glycans found by MS, whereas related motifs with low MP scores had no probable assignments.

We obtained mass spectrometry analysis of fetuin and transferrin (Fig. S-1) and chromatographic analysis of transferrin and AGP (Fig. S-2). The mass spectrometry analysis used ethyl esterification of sialic acids29 to distinguish 2,3 from 2,6 sialic acid linkages, and the chromatography enabled quantification of the percentage of branched glycans. The MS data revealed that the top N-linked glycans on fetuin primarily have 2,3-linked sialic acid, but the top glycans on transferrin primarily have 2,6-linked sialic acid (Fig. 5B). The chromatographic analysis showed that transferrin mainly has unbranched N-linked glycans, unlike AGP with mainly branched glycans (Fig. 5C). Thus the motif predictions of 2,3 relative to 2,6 sialic acid, as well as differences in the amount of branching on N-linked glycans, agreed with the results from independent methods. The prediction of terminal mannose on transferrin but not fetuin agreed with the mass spectrometry data, which showed glycans with terminal mannose only on transferrin (Fig. S-1). Terminal mannose is not commonly observed on transferrin, so its presence could be due to contamination, but the agreement between the methods suggests the identification is not false. We further explored the validity of the findings by asking whether the top predicted motifs could be reasonably assigned to the top glycan compositions found by MS. Among the top 15 motifs found in either fetuin or transferrin, all but one could be assigned to the probable structures found by MS (Fig. 5D). In contrast, among motifs that are closely related but had low motif prediction scores, none could be assigned to the probable structures.
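The motif scores and motif prediction scores validated in this comparison follow the calculation described under "Automated Interpretation of Lectin Binding and Glycan Modification". The Python sketch below is a simplified illustration of that calculation, not the Java and Matlab software used in the study. It assumes that the presence of each motif in each array glycan has already been determined, and it scores a motif by a plain difference of mean intensities, which stands in for the statistic used by motif segregation. All names and numerical values are hypothetical.

```python
from statistics import mean

def motif_score(intensities, has_motif):
    """Mean lectin signal on glycans containing the motif minus the mean
    signal on glycans lacking it (a simplified stand-in statistic)."""
    with_m = [x for x, h in zip(intensities, has_motif) if h]
    without = [x for x, h in zip(intensities, has_motif) if not h]
    if not with_m or not without:
        return 0.0  # score is undefined unless both groups are populated
    return mean(with_m) - mean(without)

def delta_motif_score(intensities, has_original, has_modified):
    """Score of the enzyme-modified motif minus score of the original."""
    return (motif_score(intensities, has_modified)
            - motif_score(intensities, has_original))

def motif_prediction_score(scores, intensity, delta_scores, delta_intensity):
    """Sum over lectins of score x normalized intensity for the untreated
    condition, plus delta score x delta intensity for each enzyme.

    scores, intensity: {lectin: value}
    delta_scores, delta_intensity: {enzyme: {lectin: value}}
    """
    total = sum(scores[l] * intensity[l] for l in intensity)
    for enzyme, d_int in delta_intensity.items():
        total += sum(delta_scores[enzyme][l] * d_int[l] for l in d_int)
    return total

# Hypothetical values for one motif carrying 2,6-linked sialic acid.
scores = {"SNA": 0.8, "ECL": -0.2}
intensity = {"SNA": 1.0, "ECL": 0.1}
delta_scores = {
    "sialidase1": {"SNA": 0.0, "ECL": 0.0},    # motif unchanged by sialidase 1
    "sialidase2": {"SNA": -0.8, "ECL": 0.9},   # sialic acid removed
}
delta_intensity = {
    "sialidase1": {"SNA": -0.05, "ECL": 0.05},
    "sialidase2": {"SNA": -0.9, "ECL": 0.85},
}
print(round(motif_prediction_score(scores, intensity,
                                   delta_scores, delta_intensity), 3))
```

In this example the sialidase 2 terms add to the score because the expected change (loss of SNA binding, gain of ECL binding) has the same sign as the measured change, which is how the enzyme conditions can raise a motif that the untreated condition alone would not distinguish.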
The lack of assignment of one top motif could be due to a false interpretation of the MS results, given that other structures are possible besides the ones shown. In any case the comparison supports the validity of the predicted motifs. Application to Clinical Specimens We next asked whether we could apply the method to glycoproteins captured out of biological fluid. We studied the nature of a glycoform of MUC5AC that we previously showed is elevated in pancreatic cancer12. The glycan is sialylated version of a stem cell marker detected by an antibody called TRA-1-6031. TRA-1-60 detects the non-sialylated glycan, so we removed the sialic acid using sialidase prior to detection (Fig. 6A). The sialylated glycan, called LSTa, is attached to the protein MUC5AC and is strongly elevated in the plasma of cancer patients relative to patients with benign pancreatic disease (Fig. 6B)12. The linkage of the sialylation, whether 2,3linked or 2,6-linked, was not evident from our previous assay because we used the broad-

ACS Paragon Plus Environment

- 11 -

On-Chip Glycan Analysis 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Analytical Chemistry

Page 12 of 27

specificity sialidase. Therefore we sought to determine the linkage of sialic acid on LSTa attached to MUC5AC. We tested whether on-chip GMAP, used with a small set of lectins, could provide accurate information for MUC5AC captured out of a plasma sample from a cancer patient and another from a control subject (Fig. 6C). The motif prediction scores calculated without the use of enzymatic

Figure 6. Probing glycans on MUC5AC captured from plasma. A) The TRA-1-60 antibody detects a non-sialylated glycan. To detect the glycan on MUC5AC, we captured MUC5AC from plasma using antibody arrays, treated the captured protein with sialidase, and probed with the antibody. B) The application of the method in panel A to samples from pancreatic cancer patients and subjects with benign pancreatic conditions revealed elevated levels in a significant number of cancer patients. C) Antibody arrays incubated with a cancer patient sample were probed with either BPL, ECL, or SNA using either no enzyme modification or treatment with one of the sialidases. D) We applied the motif prediction algorithm using either just the no enzyme condition or the sum over all conditions. The cluster shows the top 20 motifs from either method. Motif 50 was high using only in the summed value. E) A comparison of motifs 50, 68, and 88 showed that without summing over enzymatic modifications, no values were achieved for motifs 50 and 68. F) We probed a cancer patient sample and a control sample with the TRA-1-60 antibody using the 3 conditions. The pattern of increased binding after the sialidases agrees with the motif prediction result.

modifications showed mainly 2,6-linked sialic acid motifs and type-2 N-acetyl-lactosamine (LacNAc) (Fig. 6D), which is more common than type-1 LacNAc, and the motifs containing the LSTa antigen did not have positive motif prediction scores (Fig. 6D). When we incorporated enzymatic modifications into the calculations, a motif representing the LSTa antigen (type-1 and type-2 LacNAc in sequence) with 2,3-linked sialic acid (motif 50) showed up as one of the top motifs. Comparison motifs with 2,6-linked sialic acid (motif 68) or no sialic acid (motif 88) had low scores (Fig. 6D), and the control plasma sample showed low scores for all motifs (not shown). Information about the sialylation of the LSTa antigen was derived only with the use of enzymatic modifications (Fig. 6E).

The amount of plasma used in the complete analysis was 20.3 µL, based on 3 lectins, 3 conditions, 3 replicates, 1.5 µL/array, and a 2-fold dilution of the plasma. The concentration of MUC5AC in the plasma was not quantified because we did not have a calibration standard. Such a standard would be difficult to produce because the capture antibody (clone 45M1) poorly recognizes synthesized portions of MUC5AC, and because we are detecting specific glycoforms rather than the core protein. (The ability of the 45M1 clone to capture MUC5AC was not affected by varying glycosylation, according to a previous analysis.) Previous estimates of the detection limit of the antibody-lectin sandwich assay for an analyte in blood plasma are around 10 ng/mL, or 130 pM for an 80 kDa protein. Given a concentration in the plasma of 1 µg/mL for MUC5AC, total protein required was 0.2 µg, or around 1 femtomole for a 250 kDa protein (the molecular weight of MUC5AC is variable depending on fragmentation and glycosylation). In addition, the reproducibility of the measurements was good; the average coefficient of variation across 3 replicates was 0.17, and the average Pearson correlation between sets was 0.87 (Fig. S-3).
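The plasma volume and the molar detection limit quoted above follow from simple arithmetic, which the short sketch below reproduces. It is only an illustration: the function names are hypothetical and are not part of the software used in the study, and the inputs are the values stated in the text.

```python
def plasma_volume_ul(n_lectins, n_conditions, n_replicates,
                     ul_per_array, dilution_factor):
    """Undiluted plasma (in µL) consumed across all arrays."""
    n_arrays = n_lectins * n_conditions * n_replicates
    diluted_volume_ul = n_arrays * ul_per_array
    return diluted_volume_ul / dilution_factor


def mass_conc_to_picomolar(ng_per_ml, molecular_weight_da):
    """Convert a mass concentration in ng/mL to a molar concentration in pM."""
    # 1 ng/mL equals 1 µg/L, so dividing by g/mol gives µmol/L.
    micromolar = ng_per_ml / molecular_weight_da
    return micromolar * 1e6  # 1 µM = 1e6 pM


# 3 lectins x 3 conditions x 3 replicates at 1.5 µL/array, 2-fold dilution
volume = plasma_volume_ul(3, 3, 3, 1.5, 2)       # 20.25 µL, reported as 20.3 µL

# 10 ng/mL of an 80 kDa analyte
limit_pm = mass_conc_to_picomolar(10, 80_000)    # 125 pM, reported as about 130 pM
```

The same two helpers could be reused to budget sample volume for a larger lectin panel or a different number of enzyme conditions.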
We could assess the accuracy of the prediction by examining the binding of the TRA-1-60 antibody with and without treatment by each of the sialidases. Binding increased greatly upon application of the sialidase specific for 2,3-linked sialic acid (sialidase 1) and increased slightly more with the broader sialidase (sialidase 2) (Fig. 6F). Thus the pattern of increased binding of TRA-1-60 agrees with the motif prediction of primarily 2,3-linked sialic acid on the LSTa antigen.

DISCUSSION

Because of the importance of characterizing protein glycoforms in clinical samples, researchers will need methods that are compatible with limited sample volumes or low concentrations of the targeted proteins. Here we demonstrate a framework for achieving that goal using microscale capture, enzymatic modification, and lectin probing of the glycans, combined with an algorithm for the automated interpretation of the data. We validated the calculations of particular motif predictions by comparing results from control proteins to data obtained from MS and chromatography. The amount of protein consumed on-chip was only ~11 ng. We then demonstrated that, without purification from large amounts of sample, we could learn about a specific feature of MUC5AC glycosylation that is associated with cancer, namely the sialic acid linkage on the LSTa glycan. The amount of protein required was about 0.2 µg of a 1 µg/mL protein in 20 µL of plasma. In contrast, most of the previous studies of mucin glycosylation used cell lines or several milliliters of a human specimen in order to purify at least 5-10 µg of protein, as in investigations of MUC1 (ref. 32) and MUC16 (ref. 33) glycosylation.

The protein requirements of certain methods could be on par with on-chip GMAP. For example, a process of 2D gel separation, in-gel glycan release from individual protein spots, and analysis of intact glycans by chromatography was demonstrated for identifying a glycan released from possibly as little as 50 ng of acute phase proteins (ref. 34). Also, in-situ glycan release from tissue and analysis by MALDI mass spectrometry could be compatible with similarly low protein quantities (ref. 35). Such methods could provide complementary capabilities and could be used in combination with on-chip GMAP for added value.
MS and chromatography would provide information about complete compositions and, used in a glycoproteomics mode, can give site mapping and density, whereas on-chip GMAP would provide motif information with minimal sample usage and software to aid interpretation. The ambiguities from each method potentially could be cleared up by the use of all the methods in conjunction (ref. 36). The features of on-chip GMAP that could be particularly useful for disease research include the ability to process many samples, low cost, and reproducible measurements that allow comparisons across populations. In addition, automated interpretations can improve accuracy and throughput while opening up the methods to researchers with little expertise in glycan analysis.

The on-chip GMAP strategy has several limitations. It does not give information about the complete composition of a glycan; it does not readily sort out information between glycans in samples with multiple glycans; and it does not provide information about site occupancy on a glycoprotein. In addition, the method currently is not quantitative for determining whether one
motif is present at a higher occupancy than another within a given protein, because the computation of the motif prediction score is based on many factors that are variable between motifs. One could mitigate these limitations in various ways. For quantitation, one could use standards for the lectins and enzymes in order to calibrate the experimental measurements. To sort out the analysis of complex samples or proteins, one could simplify the sample by cleaving off components that are not relevant to the analysis, for example by removing N-glycans using the PNGase F enzyme prior to analyzing O-glycans. An enzyme to remove O-glycans is not available, so the reverse experiment is not possible, but the software potentially could distinguish between N-glycan and O-glycan motifs based on PNGase F treatment after each round of lectin probing.

The main goal in the current work was to test whether the results are valid, which was confirmed in the comparisons with MS and chromatography. Thus the method presented here does not represent a fully developed system but rather a platform onto which we may add capabilities. Since the method gives information only according to the lectins and enzymes used in the assay, an important area for further development is the validation of additional lectins and enzymes that could probe a broader range of glycans. The repertoire of reagents currently available covers many structures, but undoubtedly more lectins and enzymes are needed to probe uncommon features or non-mammalian glycans. Given the continual discoveries of lectins and enzymes with novel specificities, such resources will be needed. In some cases, the required specificities may already be available but simply need to be identified. A database of analyzed and searchable glycan array data would be useful for finding reagents, as demonstrated earlier (ref. 25).

Another area for development is improved information about the specificities of the lectins and enzymes.
In the current state of development, we have incomplete information about most reagents. The progress in glycan array technology, in both breadth of coverage and availability, gives a good opportunity for filling in some of the information. For example, developers of glycan arrays have created arrays to cover mammalian glycans (refs. 37, 38), microbial glycans (refs. 39, 40), various types of sialylated structures (refs. 41-43), N-linked glycans with asymmetric branching (ref. 44), and others (reviewed in ref. 45). Ideally we will be able to merge information from multiple, diverse types of glycan arrays, as demonstrated earlier (ref. 24), in order to overcome the limitations of any particular array. The structural modeling of lectin-glycan interactions also could provide insights into lectin specificities beyond what is possible from glycan array data (refs. 46, 47). Glycan arrays also would be useful for learning more about the specificities of exoglycosidases; one could probe changes in the
glycans subsequent to treatment with an exoglycosidase. A study of influenza neuraminidase (ref. 20) demonstrated the feasibility of such an approach.

CONCLUSIONS

The method presented here promises to be valuable in applications where sample amounts or analyte concentrations are low, or where precise comparisons over multiple samples are needed. Moreover, it should be valuable as a complement to orthogonal methods such as mass spectrometry and sequencing chromatography. The latter methods reveal glycan compositions and some structural information, but additional data about particular motifs could help to resolve ambiguities (ref. 48). With further work to make the protocols routine and the reagents and software readily available, the method could improve accessibility of glycan analysis to researchers involved in a broad range of studies.

ASSOCIATED CONTENT

Supporting Information

Supplementary Methods
Format of the Motif Language. (Page S-2)
Protein and Antibody Microarray Fabrication and Use. (Page S-3)
Image Processing. (Page S-4)
Glycan Analysis by Chromatography. (Page S-4)

Supplementary Tables
Table S-1. Lectins, proteins, and enzymes used in the study. (Page S-5)
Table S-2. Set of 158 motifs used in the study. (Page S-6)

Supplementary Figures
Figure S-1. Mass spectra of detached N-linked glycans from human transferrin and bovine fetuin. (Page S-9)
Figure S-2. Chromatographic traces and peak assignments. (Page S-10)
Figure S-3. Analysis of reproducibility. (Page S-11)

AUTHOR INFORMATION

Corresponding Author
*Brian B. Haab
Van Andel Research Institute, 333 Bostwick Ave., N.E., Grand Rapids, Michigan 49503
Email: [email protected]
Phone: (616) 234-5268
Fax: (616) 234-5269

Present Address
†Jessica Y. Sinha: Cyprotex US, LLC, Kalamazoo, MI
°Elliot Ensink: College of Osteopathic Medicine, Michigan State University, East Lansing, MI

Author Contributions
The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Conflict of Interest Disclosure
The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This work was supported by the National Cancer Institute (Alliance of Glycobiologists for Cancer Detection, 1U01CA168896; R21CA186799); the National Institute of General Medical Sciences (1R41GM112750); the state of South Carolina SmartState Endowed Research program; the Strategic Positioning Fund (SPF2013/001) ("GlycoSing") from the Biomedical Research Council (BMRC) of the Agency for Science, Technology and Research (A*STAR), Singapore; and A*STAR's Joint Council (JCO) Visiting Investigator Programme ("HighGlycoART").

REFERENCES

(1) Thaysen-Andersen, M.; Packer, N. H.; Schulz, B. L. Mol Cell Proteomics 2016, 6, 1773-1790.
(2) Wada, Y.; Dell, A.; Haslam, S. M.; Tissot, B.; Canis, K.; Azadi, P.; Backstrom, M.; Costello, C. E.; Hansson, G. C.; Hiki, Y.; Ishihara, M.; Ito, H.; Kakehi, K.; Karlsson, N.; Hayes, C. E.; Kato, K.; Kawasaki, N.; Khoo, K. H.; Kobayashi, K.; Kolarich, D.; Kondo, A.; Lebrilla, C.; Nakano, M.; Narimatsu, H.; Novak, J.; Novotny, M. V.; Ohno, E.; Packer, N. H.; Palaima, E.; Renfrow, M. B.; Tajiri, M.; Thomsson, K. A.; Yagi, H.; Yu, S. Y.; Taniguchi, N. Mol Cell Proteomics 2010, 4, 719-727.
(3) Royle, L.; Radcliffe, C. M.; Dwek, R. A.; Rudd, P. M. Methods Mol Biol 2006, 125-143.
(4) Sharon, N. J Biol Chem 2007, 5, 2753-2764.
(5) Ching, C. K.; Black, R.; Helliwell, T.; Savage, A.; Barr, H.; Rhodes, J. M. J Clin Pathol 1988, 3, 324-328.
(6) Pilobello, K. T.; Krishnamoorthy, L.; Slawek, D.; Mahal, L. K. Chembiochem 2005, 6, 985-989.
(7) Kuno, A.; Uchiyama, N.; Koseki-Kuno, S.; Ebe, Y.; Takashima, S.; Yamada, M.; Hirabayashi, J. Nat Methods 2005, 11, 851-856.
(8) Wu, J. T. Ann Clin Lab Sci 1990, 2, 98-105.
(9) Dwek, M. V.; Jenks, A.; Leathem, A. J. Clin Chim Acta 2010, 23-24, 1935-1939.
(10) Meany, D. L.; Zhang, Z.; Sokoll, L. J.; Zhang, H.; Chan, D. W. J Proteome Res 2009, 2, 613-619.
(11) Chen, S.; LaRoche, T.; Hamelinck, D.; Bergsma, D.; Brenner, D.; Simeone, D.; Brand, R. E.; Haab, B. B. Nat Methods 2007, 5, 437-444.
(12) Tang, H.; Partyka, K.; Hsueh, P.; Sinha, J. Y.; Kletter, D.; Zeh, H.; Huang, Y.; Brand, R. E.; Haab, B. B. Cell Mol Gastroenterology & Hepatology 2016, 2, 201-221.
(13) Cao, Z.; Maupin, K.; Curnutte, B.; Fallon, B.; Feasley, C. L.; Brouhard, E.; Kwon, R.; West, C. M.; Cunningham, J.; Brand, R.; Castelli, P.; Crippa, S.; Feng, Z.; Allen, P.; Simeone, D. M.; Haab, B. B. Mol Cell Proteomics 2013, 10, 2724-2734.
(14) Yue, T.; Goldstein, I. J.; Hollingsworth, M. A.; Kaul, K.; Brand, R. E.; Haab, B. B. Mol Cell Proteomics 2009, 7, 1697-1707.
(15) Yue, T.; Maupin, K. A.; Fallon, B.; Li, L.; Partyka, K.; Anderson, M. A.; Brenner, D. E.; Kaul, K.; Zeh, H.; Moser, A. J.; Simeone, D. M.; Feng, Z.; Brand, R. E.; Haab, B. B. PLoS ONE 2011, 12, e29180.
(16) Li, Y.; Tao, S. C.; Bova, G. S.; Liu, A. Y.; Chan, D. W.; Zhu, H.; Zhang, H. Anal Chem 2011, 22, 8509-8516.
(17) Li, D.; Chiu, H.; Zhang, H.; Chan, D. W. Clin Proteomics 2013, 1, 12.
(18) Rho, J. H.; Mead, J. R.; Wright, W. S.; Brenner, D. E.; Stave, J. W.; Gildersleeve, J. C.; Lampe, P. D. J Proteomics 2014, 291-299.
(19) Marino, K.; Bones, J.; Kattla, J. J.; Rudd, P. M. Nat Chem Biol 2010, 10, 713-723.
(20) Tappert, M. M.; Smith, D. F.; Air, G. M. J Virol 2011, 23, 12146-12159.
(21) Porter, A.; Yue, T.; Heeringa, L.; Day, S.; Suh, E.; Haab, B. B. Glycobiology 2010, 3, 369-380.
(22) Xuan, P.; Zhang, Y.; Tzeng, T. R.; Wan, X. F.; Luo, F. Glycobiology 2012, 4, 552-560.
(23) Cholleti, S. R.; Agravat, S.; Morris, T.; Saltz, J. H.; Song, X.; Cummings, R. D.; Smith, D. F. Omics: a journal of integrative biology 2012, 10, 497-512.
(24) Wang, L.; Cummings, R. D.; Smith, D. F.; Huflejt, M.; Campbell, C. T.; Gildersleeve, J. C.; Gerlach, J. Q.; Kilcoyne, M.; Joshi, L.; Serna, S.; Reichardt, N. C.; Parera Pera, N.; Pieters, R. J.; Eng, W.; Mahal, L. K. Glycobiology 2014, 6, 507-517.
(25) Kletter, D.; Singh, S.; Bern, M.; Haab, B. B. Mol Cell Proteomics 2013, 4, 1026-1035.
(26) McCarter, C.; Kletter, D.; Tang, H.; Partyka, K.; Ma, Y.; Singh, S.; Yadav, J.; Bern, M.; Haab, B. B. Proteomics Clinical Applications 2013, 632-641.
(27) Forrester, S.; Kuick, R.; Hung, K. E.; Kucherlapati, R.; Haab, B. B. Molecular Oncology 2007, 216-225.
(28) Ensink, E.; Sinha, J.; Sinha, A.; Tang, H.; Calderone, H. M.; Hostetter, G.; Winter, J.; Cherba, D.; Brand, R. E.; Allen, P. J.; Sempere, L. F.; Haab, B. B. Anal Chem 2015, 19, 9715-9721.
(29) Reiding, K. R.; Blank, D.; Kuijper, D. M.; Deelder, A. M.; Wuhrer, M. Anal Chem 2014, 12, 5784-5793.
(30) Powers, T. W.; Holst, S.; Wuhrer, M.; Mehta, A. S.; Drake, R. R. Biomolecules 2015, 4, 2554-2572.
(31) Andrews, P. W.; Banting, G.; Damjanov, I.; Arnaud, D.; Avner, P. Hybridoma 1984, 4, 347-361.
(32) Parry, S.; Hanisch, F. G.; Leir, S. H.; Sutton-Smith, M.; Morris, H. R.; Dell, A.; Harris, A. Glycobiology 2006, 7, 623-634.
(33) Kui Wong, N.; Easton, R. L.; Panico, M.; Sutton-Smith, M.; Morrison, J. C.; Lattanzio, F. A.; Morris, H. R.; Clark, G. F.; Dell, A.; Patankar, M. S. J Biol Chem 2003, 31, 28619-28634.
(34) Abd Hamid, U. M.; Royle, L.; Saldova, R.; Radcliffe, C. M.; Harvey, D. J.; Storr, S. J.; Pardo, M.; Antrobus, R.; Chapman, C. J.; Zitzmann, N.; Robertson, J. F.; Dwek, R. A.; Rudd, P. M. Glycobiology 2008, 12, 1105-1118.
(35) Powers, T. W.; Neely, B. A.; Shao, Y.; Tang, H.; Troyer, D. A.; Mehta, A. S.; Haab, B. B.; Drake, R. R. PLoS One 2014, 9, e106255.
(36) Yu, Y.; Lasanajak, Y.; Song, X.; Hu, L.; Ramani, S.; Mickum, M. L.; Ashline, D. J.; Prasad, B. V.; Estes, M. K.; Reinhold, V. N.; Cummings, R. D.; Smith, D. F. Mol Cell Proteomics 2014, 11, 2944-2960.
(37) Blixt, O.; Head, S.; Mondala, T.; Scanlan, C.; Huflejt, M. E.; Alvarez, R.; Bryan, M. C.; Fazio, F.; Calarese, D.; Stevens, J.; Razi, N.; Stevens, D. J.; Skehel, J. J.; van Die, I.; Burton, D. R.; Wilson, I. A.; Cummings, R.; Bovin, N.; Wong, C. H.; Paulson, J. C. Proc Natl Acad Sci U S A 2004, 49, 17033-17038.
(38) Fukui, S.; Feizi, T.; Galustian, C.; Lawson, A. M.; Chai, W. Nat Biotechnol 2002, 10, 1011-1017.
(39) Stowell, S. R.; Arthur, C. M.; McBride, R.; Berger, O.; Razi, N.; Heimburg-Molinaro, J.; Rodrigues, L. C.; Gourdine, J. P.; Noll, A. J.; von Gunten, S.; Smith, D. F.; Knirel, Y. A.; Paulson, J. C.; Cummings, R. D. Nat Chem Biol 2014, 6, 470-476.
(40) Wang, D.; Liu, S.; Trummer, B. J.; Deng, C.; Wang, A. Nat Biotechnol 2002, 3, 275-281.
(41) Nycholat, C. M.; McBride, R.; Ekiert, D. C.; Xu, R.; Rangarajan, J.; Peng, W.; Razi, N.; Gilbert, M.; Wakarchuk, W.; Wilson, I. A.; Paulson, J. C. Angew Chem Int Ed Engl 2012, 4860-4863.
(42) Song, X.; Yu, H.; Chen, X.; Lasanajak, Y.; Tappert, M. M.; Air, G. M.; Tiwari, V. K.; Cao, H.; Chokhawala, H. A.; Zheng, H.; Cummings, R. D.; Smith, D. F. J Biol Chem 2011, 31610-31622.
(43) Padler-Karavani, V.; Song, X.; Yu, H.; Hurtado-Ziola, N.; Huang, S.; Muthana, S.; Chokhawala, H. A.; Cheng, J.; Verhagen, A.; Langereis, M. A.; Kleene, R.; Schachner, M.; de Groot, R. J.; Lasanajak, Y.; Matsuda, H.; Schwab, R.; Chen, X.; Smith, D. F.; Cummings, R. D.; Varki, A. J Biol Chem 2012, 27, 22593-22608.
(44) Wang, Z.; Chinoy, Z. S.; Ambre, S. G.; Peng, W.; McBride, R.; de Vries, R. P.; Glushka, J.; Paulson, J. C.; Boons, G. J. Science 2013, 6144, 379-383.
(45) Rillahan, C. D.; Paulson, J. C. Annu Rev Biochem 2011, 797-823.
(46) Taylor, M. E.; Drickamer, K. Glycobiology 2009, 11, 1155-1162.
(47) Grant, O. C.; Tessier, M. B.; Meche, L.; Mahal, L. K.; Foley, B. L.; Woods, R. J. Glycobiology 2016, 7, 772-783.
(48) Ashline, D. J.; Yu, Y.; Lasanajak, Y.; Song, X.; Hu, L.; Ramani, S.; Prasad, V.; Estes, M. K.; Cummings, R. D.; Smith, D. F.; Reinhold, V. N. Mol Cell Proteomics 2014, 11, 2961-2974.

Figure 1. On-chip glycan modification and probing. Glycoproteins were immobilized onto a planar surface, and the glycans on the proteins were probed by a panel of lectins, either with or without prior modification of the glycans using enzymes. We quantified the binding of each lectin under each condition and then used an algorithm to predict the

Figure 2. Test case: distinguishing 2,3 from 2,6 sialic acid. A) We used two enzymes. Sialidase 1 cleaves only 2,3 sialic acid, and sialidase 2 cleaves all linkages of sialic acid. B) We selected lectins based on their primary specificities against the motifs targeted by the enzymes and the motifs expected to be exposed after enzyme treatment. The fine specificities of the lectins are more complex than depicted here. C) The images show the fluorescence signals from microarrays produced by direct spotting of purified glycoproteins. We present representative examples from each treatment condition. D) The heatmaps show the relative signals under each condition as well as the differences between signals obtained with and without enzyme treatment, indicated by ∆Sialidase1 and ∆Sialidase2. To normalize the signals, all the values obtained with a given lectin were divided by the highest value for that lectin, making the range 0 to 1 for the original values and the range -1 to 1 for the differences.
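The normalization described for panel D can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout keyed by (protein, condition), the condition label "none", and all signal values are hypothetical, and signals are assumed to be non-negative so that the normalized range is 0 to 1.

```python
def normalize_by_lectin(signals):
    """Divide every value for a lectin by the highest value for that lectin.

    signals: {lectin: {(protein, condition): fluorescence}} (assumed layout).
    """
    normalized = {}
    for lectin, values in signals.items():
        peak = max(values.values())
        if peak <= 0:
            raise ValueError(f"no positive signal for lectin {lectin}")
        normalized[lectin] = {key: v / peak for key, v in values.items()}
    return normalized


def enzyme_differences(normalized, baseline="none"):
    """Signal after enzyme treatment minus the signal without enzyme."""
    diffs = {}
    for lectin, values in normalized.items():
        diffs[lectin] = {
            (protein, cond): v - values[(protein, baseline)]
            for (protein, cond), v in values.items()
            if cond != baseline
        }
    return diffs


# Made-up signals for illustration only (not data from the study).
raw = {
    "SNA": {("transferrin", "none"): 8000.0,
            ("transferrin", "sialidase1"): 7600.0,
            ("transferrin", "sialidase2"): 400.0},
    "ECL": {("transferrin", "none"): 500.0,
            ("transferrin", "sialidase1"): 1000.0,
            ("transferrin", "sialidase2"): 5000.0},
}
norm = normalize_by_lectin(raw)
diff = enzyme_differences(norm)
```

Because every normalized value lies between 0 and 1, each difference necessarily lies between -1 and 1, which matches the ranges stated in the caption.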

Figure 3. Motif scores for the original and modified motifs. A) We defined 158 motifs covering the variants we were probing. For each motif, we defined modified versions representing treatment by either sialidase 1 or sialidase 2. Two representative motifs are illustrated here. The angled brackets refer to attributes of the immediately following monosaccharide. (See supplementary information for details on the motif language and the motifs.) B) For each of the 158 motifs we calculated a motif score from glycan array data (obtained from the Consortium for Functional Glycomics) for each lectin, using both the original and modified motifs. We then calculated the differences between the modified and original motif scores, indicated by ∆Sialidase1 and ∆Sialidase2.

Figure 4. Calculation of the motif prediction scores for transferrin. A) Each motif score for a particular lectin was multiplied by the fluorescence signal for that lectin. The resulting products were summed over all lectins for each motif, giving the motif prediction scores for the unmodified glycans. The calculation was the same for the glycans after modification by sialidase 1 (panel B) and sialidase 2 (panel C), using the motif scores and the intensities. The final step was to sum the motif prediction scores over all conditions. D) The top motif was 6-sialyl N-acetyl-lactosamine connected to unbranched mannose. Some of the top 5 motifs had high scores only after summing over the enzyme conditions.
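The calculation summarized in this caption (multiply each motif score by the lectin signal, sum over lectins, then sum over conditions) can be written compactly. The sketch below is an assumption-laden illustration rather than the custom software used in the study: the function names, the data layout, the motif labels, and all numbers are hypothetical, and the example treats the enzyme-condition inputs as differences relative to the no-enzyme condition, which is one possible reading of the caption.

```python
def motif_prediction_scores(motif_scores, signals):
    """Sum over lectins of (motif score x lectin signal) for one condition.

    motif_scores: {motif: {lectin: score}}; signals: {lectin: signal}.
    """
    return {
        motif: sum(score * signals[lectin] for lectin, score in by_lectin.items())
        for motif, by_lectin in motif_scores.items()
    }


def summed_prediction_scores(conditions):
    """Add the per-condition prediction scores for each motif.

    conditions: iterable of (motif_scores, signals) pairs, one per condition.
    """
    totals = {}
    for motif_scores, signals in conditions:
        per_condition = motif_prediction_scores(motif_scores, signals)
        for motif, value in per_condition.items():
            totals[motif] = totals.get(motif, 0.0) + value
    return totals


def top_motifs(totals, n=5):
    """Motif names ranked by summed prediction score, highest first."""
    return sorted(totals, key=totals.get, reverse=True)[:n]


# Hypothetical inputs: two motifs, two lectins, two conditions.
no_enzyme = (
    {"motif_A": {"SNA": 2.0, "ECL": 0.0}, "motif_B": {"SNA": 0.0, "ECL": 1.0}},
    {"SNA": 1.0, "ECL": 0.1},
)
sialidase2 = (
    {"motif_A": {"SNA": -2.0, "ECL": 3.0}, "motif_B": {"SNA": 0.0, "ECL": 0.0}},
    {"SNA": -0.9, "ECL": 0.8},
)
totals = summed_prediction_scores([no_enzyme, sialidase2])
ranking = top_motifs(totals, n=2)   # ["motif_A", "motif_B"]
```

In this toy example motif_A gains most of its score from the enzyme condition, mirroring the observation that some top motifs score highly only after summing over the enzyme conditions.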

Figure 5. Validation of predicted motifs. A) The heatmap of top prediction motifs shows both similarities and differences among the 5 proteins. The top motifs showed a relative dominance of motifs with 2,3 sialic acid in fetuin and motifs with 2,6 sialic acid in transferrin. AGP showed higher amounts of branched motifs (e.g. motifs 62 and 44) relative to transferrin, which showed higher unbranched motifs (e.g. motif 55). B) A quantification by mass spectrometry showed that the top glycans in transferrin primarily displayed 2,6 sialic acid, but the top glycans in fetuin primarily displayed 2,3 sialic acid. The glycans shown are probable structures based on biosynthetic rules. C) Quantitative glycan analysis by chromatography confirmed differences between transferrin and AGP in the branching of N-linked glycans. D) The top 15 motifs in either fetuin or transferrin had probable assignments in the top glycans found by MS, whereas related motifs with low MP scores had no probable assignments.
