GlycoPep DB: A Tool for Glycopeptide Analysis Using a “Smart Search”

Department of Chemistry, University of Kansas, Lawrence, Kansas 66045. Anal. Chem. , 2007, 79 (4), pp 1708–1713. DOI: 10.1021/ac061548c. Publication...
Anal. Chem. 2007, 79, 1708-1713

GlycoPep DB: A Tool for Glycopeptide Analysis Using a “Smart Search”

Eden P. Go, Kathryn R. Rebecchi, Dilusha S. Dalpathado, Mary L. Bandu, Ying Zhang, and Heather Desaire*

Department of Chemistry, University of Kansas, Lawrence, Kansas 66045

Mass spectrometry is emerging as a versatile analytical tool for profiling glycan and glycopeptide structures. While the interpretation of MS data remains a challenging task, substantial efforts have been made to develop informatics tools that ease MS data interpretation. Here, we present a web-based tool, GlycoPep DB, designed to facilitate compositional assignment for glycopeptides by comparing experimentally measured masses to all calculated glycopeptide masses from a carbohydrate database with N-linked glycans. GlycoPep DB is an advance over current tools to assign N-linked glycans because it uses a concept of “smart searching”, where only biologically relevant carbohydrate compositions are searched when matching against the MS data, making glycopeptide compositional assignment more efficient. This is in contrast to currently used tools, where many implausible glycan structures are present in the search output, but fewer biologically relevant glycan motifs are predicted. The utility of GlycoPep DB is illustrated in the analysis of glycopeptides derived from a proteolytic digest of follicle stimulating hormone.

Glycosylation is the most common and extensive form of protein post-translational modification, and its presence is fundamental in a variety of biological processes. Glycan moieties attached to proteins are ubiquitously present in cells and tissues as well as extracellular fluids and matrices. There is growing evidence that glycans regulate and modulate cellular functions by interacting with proteins and molecules at the cell-extracellular interface.1-3 In addition, changes in glycosylation have been linked to disease states and are known to influence effector functions and binding to specific receptors.1,4-9

* To whom correspondence should be addressed. E-mail: [email protected].
(1) Dwek, R. A. Chem. Rev. 1996, 96, 683-720.
(2) Dwek, R. A. Dev. Biol. Stand. 1998, 96, 43-7.
(3) Raman, R.; Raguram, S.; Venkataraman, G.; Paulson, J. C.; Sasisekharan, R. Nat. Methods 2005, 2, 817-24.
(4) Rudd, P. M.; Elliott, T.; Cresswell, P.; Wilson, I. A.; Dwek, R. A. Science 2001, 291, 2370-6.
(5) Zitzmann, N.; Block, T.; Mehta, A.; Rudd, P.; Burton, D.; Wilson, I.; Platt, F.; Butters, T.; Dwek, R. A. Adv. Exp. Med. Biol. 2005, 564, 1-2.
(6) Jefferis, R.; Lund, J.; Pound, J. D. Immunol. Rev. 1998, 163, 59-76.
(7) Krapp, S.; Mimura, Y.; Jefferis, R.; Huber, R.; Sondermann, P. J. Mol. Biol. 2003, 325, 979-89.
(8) Mimura, Y.; Church, S.; Ghirlando, R.; Ashton, P. R.; Dong, S.; Goodall, M.; Lund, J.; Jefferis, R. Mol. Immunol. 2000, 37, 697-706.

Thus, elucidating the glycan

1708 Analytical Chemistry, Vol. 79, No. 4, February 15, 2007

structures and profiling their changes in response to extracellular perturbations allow for the determination of the glycan’s functional activity. However, unlike proteins and nucleic acids, glycan sequence information has lagged behind, due to its diversity and heterogeneity.10 Glycans are not directly genetically encoded in the genome, and therefore, the structure and sequence cannot be simply deduced from the DNA sequence.11 As a result, glycan compositions must be analytically determined, as the glycan residues present can have considerable variability, depending on which carbohydrate enzymes are active during the glycan synthesis.

A variety of analytical methodologies have been described for glycan compositional analysis, such as high-pressure anion exchange chromatography with pulsed amperometric detection, high-performance liquid chromatography, or affinity methods combined with mass spectrometry, lectin microarrays, and nuclear magnetic resonance spectroscopy.4,12-21 Among these methodologies, mass spectrometry is emerging as the method of choice for glycan analysis, due to low sample requirements, high sensitivity, and ability to determine molecular weights. Typical sample preparation methods prior to MS analysis usually entail enzymatic digestion of a glycoprotein followed by either an enzymatic or chemical release of the glycan chains or fractionation of the glycopeptide pool. The glycan compositions are then deduced from the analysis of MS and tandem MS data.

(9) Yoo, E. M.; Morrison, S. L. Clin. Immunol. 2005, 116, 3-10.
(10) von der Lieth, C. W.; Bohne-Lang, A.; Lohmann, K. K.; Frank, M. Brief Bioinformatics 2004, 5, 164-78.
(11) Varki, A. Essentials of Glycobiology; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 1999.
(12) Harvey, D. J. Glycoconjugate J. 1992, 9, 1-12.
(13) Harvey, D. J. Expert Rev. Proteomics 2005, 2, 87-101.
(14) Harvey, D. J. Proteomics 2005, 5, 1774-86.
(15) Kuno, A.; Uchiyama, N.; Koseki-Kuno, S.; Ebe, Y.; Takashima, S.; Yamada, M.; Hirabayashi, J. Nat. Methods 2005, 2, 851-6.
(16) Macmillan, D.; Daines, A. M. Curr. Med. Chem. 2003, 10, 2733-73.
(17) Manzi, A. E.; Norgard-Sumnicht, K.; Argade, S.; Marth, J. D.; van Halbeek, H.; Varki, A. Glycobiology 2000, 10, 669-89.
(18) Mizushima, T.; Hirao, T.; Yoshida, Y.; Lee, S. J.; Chiba, T.; Iwai, K.; Yamaguchi, Y.; Kato, K.; Tsukihara, T.; Tanaka, K. Nat. Struct. Mol. Biol. 2004, 11, 365-70.
(19) Rudd, P. M.; Guile, G. R.; Kuster, B.; Harvey, D. J.; Opdenakker, G.; Dwek, R. A. Nature 1997, 388, 205-7.
(20) Solis, D.; Jimenez-Barbero, J.; Kaltner, H.; Romero, A.; Siebert, H. C.; von der Lieth, C. W.; Gabius, H. J. Cells Tissues Organs 2001, 168, 5-23.
(21) Wormald, M. R.; Petrescu, A. J.; Pao, Y. L.; Glithero, A.; Elliott, T.; Dwek, R. A. Chem. Rev. 2002, 102, 371-86.

These compositional and structural assignments are often painstaking and time-consuming, since only a few software tools and glycan libraries

© 2007 American Chemical Society Published on Web 01/11/2007

Figure 1. Screenshot of the GlycoPep DB user interface for glycopeptide compositional analysis by database query. The user enters experimentally determined masses as well as peptide sequence, cysteine modification, variable modification from other amino acid residues, charge state and carrier, and mass tolerance.

are publicly accessible. Even so, considerable efforts have been made in the development of glycan web-based tools and databases to simplify analysis.10,22,23 To date, a number of useful resources such as CarbBank (http://biol.lancs.ac.uk/gig/pages/gag/carbbank.htm), EuroCarb DB (http://www.eurocarbdb.org), GlycoMod (http://www.expasy.ch/tools/glycomod/),24 Sugabase (http://www.boc.chem.uu.nl/sugabase/sugabase.html), BOLD,25 GlySpy, and OSCAR,26,27 GlycoSuite DB (http://www.glycosuite.com),28,29 Consortium for Functional Glycomics (http://www.functionalglycomics.org), and Central Spectroscopy Department of the German Cancer Research Centre (http://www.dkfz.de/spec/)30-32 are becoming valuable tools for the analysis of an ensemble of glycan structures.

When the goal is to assign glycopeptide compositions to MS data, GlycoMod is perhaps the most useful of these tools. GlycoMod is a web-based tool that is freely available at the ExPASy proteomics web site (http://www.expasy.ch/tools), which is designed to find all theoretically possible glycan or glycopeptide compositions from observed glycan or glycopeptide mass peaks. So far, GlycoMod has been successfully applied to singly charged free or derivatized glycans.24

(22) Berteau, O.; Stenutz, R. Carbohydr. Res. 2004, 339, 929-36.
(23) Perez, S.; Mulloy, B. Curr. Opin. Struct. Biol. 2005, 15, 517-24.
(24) Cooper, C. A.; Gasteiger, E.; Packer, N. H. Proteomics 2001, 1, 340-9.
(25) Cooper, C. A.; Wilkins, M. R.; Williams, K. L.; Packer, N. H. Electrophoresis 1999, 20, 3589-98.
(26) Lapadula, A. J.; Hatcher, P. J.; Hanneman, A. J.; Ashline, D. J.; Zhang, H.; Reinhold, V. N. Anal. Chem. 2005, 77, 6271-9.
(27) Zhang, H.; Singh, S.; Reinhold, V. N. Anal. Chem. 2005, 77, 6263-70.
(28) Cooper, C. A.; Harrison, M. J.; Wilkins, M. R.; Packer, N. H. Nucleic Acids Res. 2001, 29, 332-5.
(29) Cooper, C. A.; Joshi, H. J.; Harrison, M. J.; Wilkins, M. R.; Packer, N. H. Nucleic Acids Res. 2003, 31, 511-3.
(30) Lohmann, K. K.; von der Lieth, C. W. Proteomics 2003, 3, 2028-35.
(31) Lohmann, K. K.; von der Lieth, C. W. Nucleic Acids Res. 2004, 32, W261-6.
(32) Loss, A.; Bunsmann, P.; Bohne, A.; Loss, A.; Schwarzer, E.; Lang, E.; von der Lieth, C. W. Nucleic Acids Res. 2002, 30, 405-8.

However, this program does not have an option for multiply charged species, which are common for glycan/glycopeptide MS data obtained

from ESI experiments. In order to determine the composition of these species, singly charged forms of these ions would have to be calculated manually. More importantly, no “filtering” of nonbiologically relevant carbohydrate compositions is incorporated. Although putative glycan/glycopeptide compositions can be obtained, the vast majority of the matches will either be nonbiologically relevant or will not be relevant to the protein of interest. Consequently, the user is left the painstaking task of sifting through thousands of possible “hits” and must use other resources to sort out incorrect hits. In order to reduce the numbers of incorrect hits, users often reduce their search space and, as a result, do not find any biologically relevant “hits” for some peaks. To address these issues, we developed a web-based tool that will not only expedite the analysis of glycopeptides but also eliminate the incidence of incorrect assignments.

In this paper, we describe GlycoPep DB, a publicly accessible web-based tool designed to facilitate compositional analysis of glycopeptide peaks from MS and MSn data by comparing experimentally measured masses to predicted masses derived from a carbohydrate database. GlycoPep DB incorporates common glycomics tool features such as web interface, standard data formats, and database. The utility of GlycoPep DB is described herein, and its features are compared to other, similar glycan analysis tools.

EXPERIMENTAL SECTION

GlycoPep DB (http://hexose.chem.ku.edu/sugar.php) is a freely accessible web-based tool developed for glycopeptide analysis. A screen shot of the interface is found in Figure 1. GlycoPep DB matches mass spectral data input by the user to biologically relevant glycopeptide compositions, based on the information input by the user. The GlycoPep DB input file is a text file containing a list of experimental mass-to-charge (m/z) ratios extracted from the MS and MSn data separated by white spaces or new lines.
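The input format described above (m/z values separated by white space or new lines) can be read with a few lines of code. The following is only a minimal sketch of such a reader, not the GlycoPep DB implementation; the function name `parse_peak_list` and the choice to silently skip non-numeric tokens are assumptions made for illustration.

```python
def parse_peak_list(text):
    """Parse whitespace- or newline-separated m/z values into a sorted list."""
    peaks = []
    for token in text.split():  # split() handles spaces, tabs and new lines
        try:
            peaks.append(float(token))
        except ValueError:
            # Assumption: non-numeric tokens are ignored rather than rejected.
            continue
    return sorted(peaks)


# Example peak list using m/z values quoted in this paper.
example_input = """1376.015 1386.4975
1301.4396"""
peaks = parse_peak_list(example_input)
print(peaks)
```

Sorting the peaks up front is a convenience for later steps such as isotope-pattern inspection; it is not required by the format itself.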
The extracted peak list can be input manually or pasted from the text file. If the peak list for the mass spectra


is very large, containing more than 1000 data points, the data should be reduced prior to input, to increase the speed of the search. This is a common strategy used for similar automated peak assignment programs, and the data are normally reduced by employing data processing steps such as baseline correction, peak detection, and filtering 13C peaks. After the MS data are input, the user selects the database to query and enters the peptide sequence to be searched, prior to initiating the search. The carbohydrate database includes glycan structures from serum/plasma, pituitary hormones, and HIV envelope glycoproteins reported in the literature. For cases where a specific carbohydrate database to be searched is not available, a search can be made on all carbohydrate entries in the database. Redundant entries were removed from this combined set so that each glycan appears only once. The query results are displayed in a tabular format, which can be copied and pasted to Excel or can be saved as a text file.

Once the glycopeptide compositions are determined, the matched peaks from the query result are refined by evaluating the corresponding isotopic pattern of the matched peak from the MS or MSn data. When an isotopic pattern is observed, the peak is further examined to confirm that the peak assignment corresponds to a 12C and not a 13C peak by zooming into the peak. Most often, after this verification, the correct compositional assignment is obtained.

Implementation. GlycoPep DB was initially coded in Perl script and was converted to PHP script for web accessibility. Access to this program is realized through a web interface (Figure 1) implemented using two open-source software tools: a relational database system, MySQL (http://www.mysql.com), and a server-side scripting language, PHP (http://www.php.net) running under an Apache web server on a Linux system.
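The 12C-versus-13C check described in the text is done by visual inspection, but the underlying idea can be sketched in code: for an ion of charge z, isotope peaks are spaced about 1.00335/z apart, so a matched peak that has a neighbor one spacing below it is probably a 13C peak rather than the monoisotopic one. The function name, the 10 ppm window, and the example isotope peaks below are hypothetical; the sketch also ignores peak intensities, which a real check would need to consider.

```python
C13_SPACING = 1.003355  # mass difference between 13C and 12C, in Da


def is_monoisotopic(mz, charge, peak_list, tol_ppm=10.0):
    """Return True if no peak lies one isotope spacing below mz.

    Assumption: intensities are ignored, so any peak found at the
    expected position counts as a lighter isotopologue.
    """
    expected_lower = mz - C13_SPACING / abs(charge)
    tol = expected_lower * tol_ppm * 1e-6
    return not any(abs(p - expected_lower) <= tol for p in peak_list)


# Hypothetical isotope cluster of a doubly charged ion (spacing ~0.5017).
cluster = [1301.4396, 1301.9413, 1302.4430]
print(is_monoisotopic(1301.4396, 2, cluster))  # first peak of the cluster
print(is_monoisotopic(1301.9413, 2, cluster))  # second peak of the cluster
```

In this example the first call reports the peak as monoisotopic and the second does not, because 1301.4396 sits one spacing below 1301.9413.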
Currently, the carbohydrate database contains N-linked glycans derived from pituitary hormones, serum/plasma proteins, and HIV envelope glycoproteins reported in the literature. Future updates of GlycoPep DB will include compilations of glycan structures derived from other biologically relevant glycoproteins including those with O-linked glycosylation. Following other online carbohydrate databases, GlycoPep DB intends to provide accurate database annotation, a minimum level of redundancy, and integration with other databases.

RESULTS AND DISCUSSION

Overview of GlycoPep DB. As a web-based tool designed to facilitate the analysis of MS and MSn data obtained from a glycopeptide pool, GlycoPep DB provides the retrieval of all plausible compositions of observed glycopeptide masses from a specified carbohydrate database. It currently contains a compilation of 319 N-linked glycans from several different types of glycoproteins reported in the literature. The main page of the web interface provides access to the content of the database (Figure 1), and a search can be performed by specifying the carbohydrate database, peptide sequence, charge state, charge carriers, mass tolerance in ppm, cysteine modification, variable modification from other amino acid residues, and the experimental peak list. In the cases where the peptide sequence is not known, the user can input the experimental peptide mass obtained from MS/MS data or MS data from deglycosylated glycopeptides. Once the MS data are submitted, GlycoPep DB generates a list of all matched theoretically possible glycopeptide compositions in ascending order of the glycopeptide masses. The query result consists of the


experimental glycopeptide masses, the mass error, the charge state, the matched theoretical glycopeptide masses, and the corresponding compositions in symbolic form. The following abbreviations are used in the query result: (a) Hex for hexoses (monosaccharides with the same mass, e.g., mannose and galactose), (b) HexNAc for N-acetylhexosamine, (c) Fuc for fucose (deoxyhexose), (d) SO3 for sulfate, (e) PO4 for phosphate, and (f) NeuNAc for sialic acid. The full description of the user instruction is provided on the main page.

Compositional Assignment of Glycopeptides: Follicle Stimulating Hormone (FSH). The utility of GlycoPep DB is demonstrated by comparing its output with the output of GlycoMod and an in-house Excel macro33,34 written in VBA (Visual Basic for Applications) in the MS analysis of glycopeptides derived from a proteinase K digest of equine FSH (eFSH). Details of the glycopeptide sample preparation have been described elsewhere.35 The ESI-Fourier transform ion cyclotron resonance (FTICR) MS data of the glycopeptide fraction derived from eFSH are shown in Figure 2A. Glycopeptide assignments of these data were initially done using GlycoMod and an in-house Excel macro, and the comparison of those results with the output from GlycoPep DB is described below.

GlycoMod can be accessed from the Expasy Proteomics website (http://www.expasy.org/tools/). The program calculates possible glycopeptide compositions from the set of experimental mass values entered by the user. GlycoMod is different from GlycoPep DB in that it does not query a database of biologically relevant glycans. Instead, it generates an exhaustive enumeration of possible glycan compositions that match the experimental mass within a user-specified mass tolerance and input parameters. For structures that are biologically relevant within this list, a corresponding link to GlycoSuite DB is provided. A detailed description of the input parameters can be found in the Documentation link of the tool.
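To make the database approach concrete, the sketch below matches a peak list against glycopeptide masses computed from a small list of glycan compositions, in contrast to enumerating every possible composition. It is an illustration under stated assumptions, not the GlycoPep DB code: the function names, the three-entry `toy_database`, and the use of deprotonated ions ([M - zH]z-) as the charge carrier are hypothetical choices; the monoisotopic residue masses are standard values. Glycan residue masses are added directly to the peptide mass because forming the glycosidic bond to the peptide releases one water.

```python
PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # monoisotopic mass of H2O, Da

# Monoisotopic amino acid residue masses (only those needed for the example).
RESIDUE = {"L": 113.084064, "E": 129.042593, "N": 114.042927,
           "H": 137.058912, "T": 101.047679, "Q": 128.058578}

# Monoisotopic glycan residue masses, keyed by the abbreviations in the text.
SUGAR = {"Hex": 162.052824, "HexNAc": 203.079373, "Fuc": 146.057909,
         "NeuNAc": 291.095417, "SO3": 79.956815}


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in seq) + WATER


def glycan_mass(composition):
    """Summed residue masses of a glycan composition, e.g. {"Hex": 3}."""
    return sum(SUGAR[name] * count for name, count in composition.items())


def search(peaks, seq, database, charge, tol_ppm):
    """Return (observed, theoretical, ppm error, composition) for each match."""
    hits = []
    pep = peptide_mass(seq)
    for mz in peaks:
        for composition in database:
            neutral = pep + glycan_mass(composition)
            # Assumption: negative-mode ions formed by loss of `charge` protons.
            theoretical = (neutral - charge * PROTON) / charge
            error_ppm = (mz - theoretical) / theoretical * 1e6
            if abs(error_ppm) <= tol_ppm:
                hits.append((mz, round(theoretical, 4),
                             round(error_ppm, 2), composition))
    return sorted(hits, key=lambda hit: hit[1])  # ascending theoretical m/z


# Hypothetical three-entry database; the first two compositions are quoted
# later in the paper, the third is a made-up decoy.
toy_database = [
    {"HexNAc": 5, "Hex": 4, "NeuNAc": 1, "SO3": 1},
    {"HexNAc": 6, "Hex": 3, "SO3": 2},
    {"HexNAc": 4, "Hex": 5, "NeuNAc": 2},
]
hits = search([1386.4975, 1301.4396], "LENHTQ", toy_database,
              charge=2, tol_ppm=5.0)
for hit in hits:
    print(hit)
```

Because the candidate list comes from a curated database, every hit is a previously reported composition; an enumeration-based search over the same residue types would return many more candidates for the same tolerance.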
In the compositional analysis of the FTICR-MS data of the eFSH glycopeptide fraction using GlycoMod (http://www.expasy.org/tools/glycomod), the following input parameters were used: (a) peptide sequence, NIT; (b) peptide modification, all cysteines were reduced, no cutting, methionines not oxidized; (c) type of glycosylation, N-linked glycosylation and no glycan derivatization; (d) ion mode, negative and Na adducts; (e) mass error of 5 ppm; (f) monosaccharide residues present, Hex, HexNAc, fucose, sulfate, and sialic acid; (g) the peak list from the MS data. Since GlycoMod is designed to calculate plausible glycan/glycopeptide compositions from singly charged species, experimental masses of doubly and triply charged ions were manually converted to the theoretical mass of the singly charged species prior to analysis. Query results from these parameters generated over 1000 glycan compositions for the peptide, NIT. This initial list of candidate glycopeptide compositions was sorted using the following criteria: (1) structures should be biologically relevant, and (2) the MS/MS data of the corresponding peak should be consistent with the putative assignment.

(33) Irungu, J.; Dalpathado, D. S.; Go, E. P.; Jiang, H.; Ha, H. V.; Bousfield, G. R.; Desaire, H. Anal. Chem. 2006, 78, 1181-90.
(34) Jiang, H.; Desaire, H.; Butnev, V. Y.; Bousfield, G. R. J. Am. Soc. Mass Spectrom. 2004, 15, 750-8.
(35) Dalpathado, D. S.; Irungu, J.; Go, E. P.; Butnev, V. Y.; Norton, K.; Bousfield, G. R.; Desaire, H. Biochemistry 2006, 45, 8665-73.

To illustrate how the results are sorted,

Figure 2. (A) ESI FTICR-MS data of the eFSH glycopeptide fraction in the negative ion mode. (B) A representative query result from the GlycoMod tool.

consider the singly charged ion with m/z 2753.03643, which corresponds to the doubly charged ion at m/z 1376.015 in the MS data (Figure 2A). GlycoMod generated three possible N-linked glycan compositions for this ion, when the peptide NIT was input (Figure 2B). Each of these structures consists of the same core, (Man)3(GlcNAc)4, followed by several monosaccharide residues in the antenna. After some consideration of biological precedence, based on the glycobiology literature,1,11 an experienced analyst can rule out the top two entries in this particular example as structures that are not biologically relevant. After doing so, the third entry becomes a reasonable structure to be considered as a compositional assignment for this peak. To further verify this structure, MS/MS data of this peak need to be evaluated to determine whether the structure is consistent with all the MS data available. In some cases, even when only one biologically relevant structure is listed in the GlycoMod output, it may not match the MS/MS data. In such cases, the correct structures must be either elucidated from the MS/MS data manually (which can be quite challenging) or the analyst must assume that the “wrong” peptide sequence was queried for that particular peak, and the analysis must be repeated, by considering a different peptide sequence. Overall ∼98% of the GlycoMod query results are ruled out in assigning these spectra, and such an analysis takes days to weeks.
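The manual conversion of multiply charged ions to their singly charged equivalents, which GlycoMod requires, is simple arithmetic. The sketch below assumes negative-mode ions formed by deprotonation ([M - zH]z- converted to [M - H]-); the function name is hypothetical.

```python
PROTON = 1.007276  # mass of a proton, Da


def to_singly_charged(mz, charge):
    """Convert the m/z of [M - zH]z- to the m/z of [M - H]-.

    Assumption: the charge carrier is proton loss (negative ion mode).
    """
    neutral_mass = mz * charge + charge * PROTON
    return neutral_mass - PROTON


singly = to_singly_charged(1376.015, 2)
print(round(singly, 4))
```

For the doubly charged ion at m/z 1376.015 this gives a value within about 0.001 of the m/z 2753.03643 quoted above; the small difference presumably reflects rounding of the doubly charged m/z as printed.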

This assessment is based on the following facts: Considering the whole mass range of the data shown in Figure 2A and interrogating these data for the peptide sequence NIT alone, a total of 1325 compositions were generated. After several days of analysis, the query results could be used to assign 18 peaks to NIT-containing glycopeptides. These 18 assigned compositions were further confirmed by manually verifying that the monoisotopic mass of the assigned glycopeptide corresponded to a monoisotopic (12C) peak on the spectrum, and not a 13C peak. In the end, the analysis of these data gave 16 reasonable assignments after days of analysis, which were only focused on assigning glycopeptides that contained the peptide sequence, NIT. With the sheer number of potential glycopeptide compositions generated by GlycoMod (>1300 for one peptide sequence, in this case), the user faces the laborious task of sifting through the data to assign the glycopeptides. In addition to the problem of providing too many wrong assignments, this algorithm also suffers from the fact that the right assignments are often not output, because the user constantly attempts to limit the search space in order to stop generating so many incorrect hits. This is also a limitation of our in-house VBA macro, as described below.

The spectrum in Figure 3 was initially analyzed over a period of six weeks, using our VBA macro26,27 wherein possible glycan/


Figure 3. (a) Screenshot of a typical output from GlycoPep DB. (b) ESI-FTICR-MS data of equine FSH showing identified glycopeptide peaks with their corresponding charge state using in-house macro (top) and GlycoPep DB (bottom). Note that identified peaks that were not labeled in both spectra are doubly charged ions.

glycopeptide compositions can be obtained from the experimental mass by specifying the type of glycosylation, the ion mode, protein/peptide sequence, mass tolerance, and possible glycan residues. In comparison to GlycoMod, this macro has an added functionality of being able to process data for multiply charged ions, without the necessity of manually converting the peaks to their theoretical, singly charged mass. As with GlycoMod, the calculated masses of the set of predicted candidates are compared to the measured masses within the mass error specified by the user. Once a possible match is obtained, the corresponding isotopic pattern from the MSn data is used to support the compositional assignment and to differentiate noise spikes from signal. Depending on the search parameters, either thousands of incorrect hits are obtained (as described above) or many peaks are left unassigned, because the search criteria are too restrictive. Since our VBA macro and GlycoMod assign spectra in a similar fashion, our VBA macro also generates incorrect assignments that are either nonbiologically relevant or not relevant to the protein of interest, requiring the user to sort out the incorrect hits. In cases where there are no matches, the user is forced to identify a composition manually, using the incorrect hit as a


starting point. Putative candidates for unassigned peaks are manually determined by starting with glycopeptide compositions from peaks that have been assigned, then substituting, adding, or removing some glycan residues on the already-assigned glycan composition until a “hit” for the unassigned peak is identified. For example, the peak with an experimental m/z 1386.4975 was assigned to the doubly charged ion, [LENHTQ + 5HexNAc + 4Hex + NeuNAc + SO3] within 5 ppm mass tolerance. Thus, this peak was used as a starting point, in an attempt to manually assign a neighboring peak near m/z 1301.44. Using the initially predicted composition for m/z 1386.4975 as the starting point, the addition of a HexNAc and an SO3 group followed by the removal of a Hex and a NeuNAc results in a glycan sequence with a composition of [LENHTQ + 6HexNAc + 3Hex + 2SO3] and a calculated m/z of 1301.4395 (for the doubly charged ion). This calculated m/z matches the peak at m/z 1301.4396 in the mass spectrum within a mass error of 0.077 ppm, and its composition is further confirmed from MS/MS data. This peak at m/z 1301.4396 was also missed in GlycoMod, and the peak assignment had to be done manually. These manual peak assignments, which are required to fully assign a spectrum, consume considerable data analysis time.
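The arithmetic behind this manual assignment can be reproduced with a short calculation. The sketch below computes the m/z shift produced by the residue swap (one HexNAc and one SO3 added, one Hex and one NeuNAc removed) for a doubly charged ion, and the ppm error between the calculated and observed m/z values quoted above. The function names are hypothetical; the monoisotopic residue masses are standard values.

```python
# Monoisotopic glycan residue masses, Da.
SUGAR = {"Hex": 162.052824, "HexNAc": 203.079373,
         "NeuNAc": 291.095417, "SO3": 79.956815}


def mz_shift(changes, charge):
    """m/z shift caused by adding (+n) or removing (-n) glycan residues."""
    delta_mass = sum(SUGAR[name] * count for name, count in changes.items())
    return delta_mass / charge


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Residue swap described in the text, applied to a doubly charged ion.
changes = {"HexNAc": +1, "SO3": +1, "Hex": -1, "NeuNAc": -1}
shift = mz_shift(changes, charge=2)
error = ppm_error(1301.4396, 1301.4395)

print(f"m/z shift: {shift:.4f}")
print(f"mass error: {error:.3f} ppm")
```

The shift of roughly -85.06 m/z units applies to the theoretical m/z of the starting composition; applying it to the observed value of 1386.4975 would carry over that peak's own measurement error.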

While fully assigning the eFSH data using either GlycoMod or our VBA macro could take months, GlycoPep DB handles the same analysis much more rapidly. Analysis of the same data (Figure 2A) using the input parameters in the GlycoMod search generated 37 biologically relevant glycopeptide compositions for the peptide NIT (Figure 3a) within a few minutes. After determining which of the compositions matched monoisotopic peaks and further confirming the compositions from MS/MS data, a total of 17 compositions were obtained. Additional compositional analysis using GlycoPep DB starting with other eFSH peptide sequences generated 49 matched peaks, completed within a few hours, with all major and minor peaks assigned. Since GlycoPep DB only contains carbohydrate structures that are biologically relevant, these 49 hits do not need to be manually inspected for implausible structures.

The obvious significant benefit of using GlycoPep DB is the reduced analysis time required, which makes it amenable to high-throughput glycan/glycopeptide compositional analysis. In addition to orders of magnitude reduction in analysis time, more than twice the number of peaks were assignable. This is because GlycoPep DB limits the search space to only biologically relevant glycan motifs present in the database. This is in contrast with GlycoMod and the in-house VBA algorithm where an exhaustive enumeration of glycan compositions is obtained, generating far too many nonrelevant compositions in the process. GlycoPep DB has an additional feature in that if glycan analysis data are present for the protein of interest, the database can be restricted to only search for glycan matches that contain those carbohydrates. Essentially, this database approach filters out most of the incorrect assignments in an automated fashion. GlycoPep DB does not have the inherent

problem of having “too few” hits because it utilizes a database of previously characterized glycans found in the literature. As a result, GlycoPep DB generates a set of candidates that both GlycoMod and the VBA algorithm missed. As with GlycoMod and the VBA algorithm, the isotope patterns of the glycopeptide peaks are examined and the glycan composition information is further confirmed by tandem MS experiments, since it is always possible that the assigned glycan composition will be isobaric with the actual glycan composition.

CONCLUSIONS

Recent advances in glycobiology and glycomics have accelerated the pace of the development of bioanalytical and bioinformatic tools to study glycan structure-function relationships. Indeed, with the completion of numerous genome sequences, the amount of data has immensely increased and this trend has created new demands for rapid, automated, and reliable informatic tools for glycan analysis. GlycoPep DB attempts to address the current limitation in MS data interpretation by allowing automated and rapid compositional assignments of observed glycopeptide peaks from MS and MSn spectra. Currently, we are working to expand the database and integrate this tool with other proteomics and carbohydrate databases.

ACKNOWLEDGMENT

This work was supported by NIH grant RO1GM077226.

Received for review August 18, 2006. Accepted November 30, 2006. AC061548C