GlycoExtractor: A Web-Based Interface for High Throughput Processing of HPLC-Glycan Data
Natalia V. Artemenko,* Matthew P. Campbell,* and Pauline M. Rudd*
Dublin-Oxford Glycobiology Laboratory, National Institute for Bioprocessing Research and Training (NIBRT), Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland
Received November 13, 2009
Abstract: Recently, an automated high-throughput HPLC platform has been developed that can fully sequence and quantify low concentrations of N-linked sugars released from glycoproteins, supported by an experimental database (GlycoBase) and analytical tools (autoGU). However, the commercial packages that operate HPLC instruments and store their data lack facilities for extracting large volumes of data. The lack of resources and agreed formats in glycomics is now a major factor limiting the development of bioinformatic tools and automated workflows for high-throughput HPLC data analysis. GlycoExtractor is a web-based tool that interfaces with a commercial HPLC database/software solution to facilitate the extraction of large volumes of processed glycan profile data (peak number, peak areas, and glucose unit values). The tool allows the user to export a series of sample sets to a set of file formats (XML, JSON, and CSV) rather than a collection of disconnected files. This approach not only reduces the amount of manual refinement required to export data into a format suitable for data analysis but also opens the field to new approaches for high-throughput data interpretation and storage, including biomarker discovery and validation and monitoring of online bioprocessing conditions for next generation biotherapeutics.
Keywords: glycomics data analysis • data extraction • glycome informatics • glycan database • xml format in glycomics • web-based extractor • standards • HPLC • carbohydrates • high-throughput
* To whom correspondence should be addressed. E-mail: natalia.[email protected], [email protected], and [email protected].
10.1021/pr901213u © 2010 American Chemical Society

Introduction
It is widely accepted that carbohydrates are important factors in many biological recognition processes and that the full and detailed characterization of the glycome of cells, tissues, and organisms is an important feature in the era of postgenomic science. Glycosylation is the most common and structurally diverse post-translational modification of proteins and has an impact on a wide range of biological functions.1 Recently, it has been demonstrated that more than half of all gene products are glycosylated2 and that approximately 10% of genes expressed in humans encode proteins involved in glycan biosynthesis.3 Glycosylation is involved in different biochemical processes including protein folding, stability, and localization.4 Specific interactions between a glycan and its carbohydrate binding protein are associated with cellular communication and, in the immune system, functional recognition of Ig Fc is modulated by the glycans attached to the IgG.5–9 Alterations in the activity of a glycoprotein are often associated with differences in the glycosylation profile, which can include overall changes in the oligosaccharide structures at specific glycosylation sites and/or site occupancy.10,11 It has been demonstrated that aberrant changes in glycosylation have a crucial impact on mammalian diseases: immune deficiencies, cancer, cardiovascular disease, and hereditary and congenital disorders.12–15 Moreover, we have seen the emergence of research activities to decipher the structural and functional roles of carbohydrates in protein functions and the implications of glycans in health and disease.13,16–21
Similar to genomics and proteomics, glycomics requires a well-structured and curated set of databases with supporting data retrieval, querying and analytical software. The development of bioinformatic resources to support glycobiology has increased considerably, but there are still many disconnected and often incompatible databases of experimental and structural data, including recent initiatives that aim to provide access to the available data from a single portal.22–39 Currently, the field still lacks well-established, comprehensive and centralized data collections, data standards and information reporting protocols similar to those in genomics and proteomics (GenBank,40,41 Swiss-Prot/TrEMBL,42 Minimal Information About a Microarray Experiment,43 the Minimum Information About a Proteomics Experiment44).
Technical Notes. Journal of Proteome Research 2010, 9, 2037–2041. Published on Web 02/09/2010.

Genomics and proteomics have demonstrated that the availability of well-curated and extensive collections of structural and experimental data is pivotal in the development of analytical methodologies and helps to facilitate cross-discipline research and systems biology. To be useful, these databases need reliable and robust annotation and a common set of standards for storing and presenting glycan structures. In the past few years, a number of approaches have been developed to allow exchange and querying across databases, including GLYDE,45 GLYDE II,20 and GlycoCT,46 which have been implemented by several consortia.
The extent of glycoprotein microheterogeneity is a challenging analytical problem because there is a requirement to fully characterize all glycoforms (that is, each distinct form of a glycoprotein with a distinct glycan structure per glycosylation site) to better understand the biological function associated with particular glycan structures. This includes monosaccharide sequence and linkage information and quantitation. For example, a number of specific glycans or particular glycoforms have been shown to be promising cancer biomarkers47–49 and carbohydrate-based vaccines.50 Therefore, there is a requirement for the development of high-throughput (HT), highly sensitive and robust analytical strategies to define in detail the glycome of cells, tissues and fluids. Recent advances in HPLC methodologies, such as a robust, high-throughput Normal-Phase High Performance Liquid Chromatography (NP-HPLC) robotic compatible platform for glycan analysis,51 are allowing this in-depth analysis. However, the lack of tools and accredited data formats for exporting large volumes of data is a major bottleneck in high-throughput glycomic projects. Over the years, different manufacturers have developed various proprietary data formats for handling such data, for example, AIA/ANDI (Analytical Instrumentation Association/ANalytical Data Interchange) and ANDI/NetCDF (ANalytical Data Interchange/network Common Data Form). Often the existing mechanism requires manual, time-consuming and error-prone steps to export integrated profile data. This complicates the integration of new instruments into an analysis framework and the exchange and comparison of results from different experiments and laboratories. Some interchange formats have been proposed to address this: for instance, Creon Lab Control developed the Analytical Information Markup Language (AnIML) to facilitate the interchange of chromatography and spectroscopy data, and Thermo Fisher Scientific initiated a common data storage format known as GAML (Generalized Analytical Markup Language) to preserve analytical data from a range of different instrumentation.
Unfortunately, these formats have not been widely adopted and HPLC software providers do not offer a standard format for data sharing. There is still a limited number of software tools and databases to support the interpretation and annotation of HPLC data collections. Within the scope of NIBRT and EUROCarbDB, a suite of integrated analytical tools, databases and standards is being developed to assist data analysis and to provide access to well-curated data collections. First, an experimental relational database, GlycoBase,24 was developed to support our HT HPLC methodology, which is based on a 96-well plate format and includes sample immobilization, glycan release, fluorescent labeling and quantitative HPLC analysis to provide detailed structural information for charged and neutral glycans. The database contains the elution positions for 2-aminobenzamide-labeled (2AB) N-glycans expressed in the form of ‘glucose unit values’ (GU-values) together with predicted products of exoglycosidase digestions. Second, an automation tool, autoGU,24 was introduced to assist the interpretation of complex exoglycosidase digestion data by assigning possible glycan structures to each HPLC peak and refining the putative assignments based on the digest footprint of the sample data.
The existing mechanism for processing and exporting the large volumes of information required by autoGU and other analytical software is cumbersome because there are no workflows for exporting multiple sample sets. Exporting the integrated profiles often requires manual, time-consuming and error-prone steps, which are not adequate to support the growth and development of high-throughput glycomic platforms. To automate the routine process of data extraction and to allow visualization of the data, we have developed a web-based tool called GlycoExtractor.

Figure 1. (a) GlycoExtractor user interface and (b) guXML file format.
Experimental Section
Implementation. GlycoExtractor is a web-based solution to facilitate the high-throughput analysis of experimental HPLC-glycan data. The tool interfaces with the Waters chromatography software package (Waters Empower) to improve and automate the export of data from large scale sample runs to various formats (XML, JSON, CSV and HTML). These formats can either be incorporated into various analytical software packages for further statistical/comparative analysis (GU-values and percentage peak areas) or allow interactive offline examination of the data. The tool is based on Java servlet technology and connects to the experimental Waters Empower relational database using the Java Database Connectivity (JDBC) driver for Oracle. The interface was developed for Firefox 3.0 and Internet Explorer 7.0/8.0, with the application running on Apache Tomcat 5.5/6.0 on Linux operating systems (Red Hat 5 and Ubuntu). The application uses a modular architecture which allows different data sources and different output file formats to be added; therefore, the tool can be tailored to meet future requirements and advances in HPLC technology.
Database. The experimental database used by Waters Empower software operates within the Oracle RDBMS (Relational Database Management System) and consists of multiple schemas representing experimental projects. Each project represents a collection of multiple sample sets, sample runs, and dates. Each sample contains integrated information on experimental peaks in the form of elution time expressed in glucose unit values (GU-values) and relative peak areas. The relationships in the database are defined as a set of connected tables (Figure S1 in the Supporting Information).
Application Structure. In addition to using Waters Empower as its data source, the application lets users supply their own data file exported from other data management systems. This additional data source uses the guXML format, which is based on standard XML technology.
This gives GlycoExtractor broad applicability, because compatible inputs can easily be created using standard SQL queries. Meanwhile, the user interface, shown in Figure 1, is designed to be easy to use and provides efficient export and subsequent analysis without requiring programming skills. The hierarchical nature of the experimental database is reflected in the structure of the application, where each “sample set” in a given “project” is characterized by a set of specific features shared by its members (samples). Each sample can be characterized by the number of peaks and the date of experiment. Sample selection involves selecting a project, a sample set and then the sample profiles (Figure 1). To minimize the initial load time, the available sample names are determined dynamically as the user selects the sample sets. GlycoExtractor allows the user to export multiple samples from different sample sets, in a desired order, to a number of common file formats including XML, JSON and CSV, as well as an interactive HTML format which helps in data validation and allows quick visual inspection. The user has full control over the content and order of the exported experimental data. The HTML page contains the data and the embedded JavaScript necessary to give offline-browsing capabilities and to highlight gaps and discrepancies in the data. The web-based interface includes a number of time-saving features, including the ability to save partially completed work and/or the names of commonly used samples for later use, and multistage undo.
Availability. The GlycoExtractor tool can be found at http://glycobase.nibrt.ie:8080/DemoGlycoExtractor/MainPage. This version uses an example database that includes all features and resources. The site also allows users to upload their own data in guXML format. The tool can be customized for end user deployment in a local network when the user provides local privileges and database descriptions.
File Formats. GlycoExtractor introduces the first release of an XML and a JSON schema (guXML and guSON) to support the extraction of data from proprietary databases into a defined format using a controlled vocabulary that reflects the architecture of the experimental data (Supporting Information, Figure S2). The formats capture user-integrated data in a form optimized for automated analysis using existing experimental databases (GlycoBase) and tools (autoGU). The format and information content have been designed to facilitate data integration and processing within the EUROCarbDB framework.
This will allow the design and conditions of an experiment to be stored and associated with integrated experimental data in one unified platform.
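The export step described above can be illustrated with a minimal Python sketch, assuming a simple in-memory record for each sample. The field names, the element names (`glycanProfiles`, `sample`, `peak`) and the sample values are hypothetical placeholders for illustration only; they are not the guXML or guSON schema, which is defined in the Supporting Information (Figure S2), and GlycoExtractor itself is implemented with Java servlets rather than Python.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical record for one sample; names and values are illustrative only.
SAMPLES = [
    {
        "project": "DemoProject",
        "sample_set": "Plate01",
        "sample": "Serum_A1",
        "peaks": [
            {"number": 1, "gu": 5.42, "area_pct": 12.5},
            {"number": 2, "gu": 6.18, "area_pct": 87.5},
        ],
    },
]


def to_json(samples):
    """Serialize the whole selection as one JSON document."""
    return json.dumps({"samples": samples}, indent=2)


def to_csv(samples):
    """Flatten the project/sample set/sample/peak hierarchy to one row per peak."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["project", "sample_set", "sample", "peak", "gu", "area_pct"])
    for s in samples:
        for p in s["peaks"]:
            writer.writerow([s["project"], s["sample_set"], s["sample"],
                             p["number"], p["gu"], p["area_pct"]])
    return buf.getvalue()


def to_xml(samples):
    """Keep the hierarchy explicit: one <sample> element holding its <peak> children."""
    root = ET.Element("glycanProfiles")
    for s in samples:
        node = ET.SubElement(root, "sample", name=s["sample"],
                             project=s["project"], sampleSet=s["sample_set"])
        for p in s["peaks"]:
            ET.SubElement(node, "peak", number=str(p["number"]),
                          gu=str(p["gu"]), areaPct=str(p["area_pct"]))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_csv(SAMPLES))
    print(to_json(SAMPLES))
    print(to_xml(SAMPLES))
```

The point of the sketch is that all selected samples end up in a single file per format, so a downstream tool reads one document instead of a collection of per-sample exports.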
Results and Discussion
Proteomics and genomics have demonstrated the importance of developing file formats for the exchange and processing of large data collections. The field of proteomics has embraced open XML-based data formats (mzXML52 and mzData53) for data collections in mass spectrometry, one of which has been recently accepted by the PSI. Both formats have benefited from contributions from many groups, which led to the extension and refinement of data features. Currently, much of the data that is used in HPLC analysis is obtained by manual extraction of information from proprietary databases into flat files and/or using customized reporting methods. However, the comprehensive analysis of large amounts of data by automated informatics procedures requires data collection protocols that strictly control data formatting and information content. Adoption of common, standardized data formats and reporting protocols will facilitate the design of a glycobioinformatic architecture amenable to automated procedures. The evolution of glycomics and glycoproteomics frameworks can learn from the progress of genomic and proteomic informatics.
Requirement for Automated Data Extraction. To support large scale HPLC-glycan data analysis51 there is an urgent need to develop bioinformatic platforms that facilitate and automate data processing. In particular, the major limiting factor is exporting data from vendor-supplied solutions into formats that assist the end user with data interpretation, which requires a number of manual steps. For example, extracting experimental data from Waters Empower requires a repetitive process in which the user must first select a sample (Supporting Information, Figure S3) and then copy all the sample peaks (Supporting Information, Figure S4) to a spreadsheet for further analysis, a very time-consuming, laborious and error-prone method. GlycoExtractor is designed to eliminate the need to manually select each sample profile when extracting data. For example, exporting data from a list of 100 sample profiles to an XML, JSON or CSV file takes 1-2 s, compared to 90 min when manually exporting and saving the data. The data available for each profile (Supporting Information, Figure S4) can be visualized using the interactive HTML format, which is described next.
Interactive HTML Web-Page. An interactive HTML file can be exported from the GlycoExtractor web-page for a quick visual analysis of the data or to take data away for offline review. This format displays the data in a hierarchical form and highlights sample sets and samples that do not have the complete set of GU-values. The page groups the data by sample set (each equivalent to a single HPLC run) and then by sample. Each sample may contain several subsets of data related to experiments executed on different dates. Samples are color-coded to indicate which ones have no GU-values (red) or some GU-values (orange). This allows the user to quickly identify the valid subsets containing experimental GU-values. For instance, the illustrated sample in Figure 2 contains one completed subset with experimental data and two incomplete ones. Conversion of manufacturer-specific data formats to the proposed open format, implemented by GlycoExtractor, is a first step to improving high-throughput glycan data analysis.

Figure 2. Interactive HTML format.

Integrating GlycoExtractor with the EUROCarbDB Framework.
EUROCarbDB is a design initiative to provide resources and bioinformatic tools for storage and annotation of experimental data obtained by different analytical methods (HPLC, MS and NMR). The framework also implements common file formats and standards for data exchange and a series of workflows for storing, annotating and querying structural and experimental data. To ensure that next generation bioinformatic tools adopt a unified approach for the storage and sharing of data, GlycoExtractor will be integrated with the EUROCarbDB framework to establish a comprehensive high-throughput HPLC data analysis platform. The framework will be expanded to enable a connection between GlycoExtractor and autoGU for automated structural assignment of HPLC profile data collections using the agreed EUROCarbDB database schema and the file formats presented. It is, therefore, anticipated that this tool will set the foundations for new bioinformatic resources for HPLC-glycan technologies that will have implications in a number of research activities including biomarker discovery and validation and monitoring online bioprocessing conditions for next generation biotherapeutics.
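The color coding described for the interactive HTML page can be expressed as a small classification rule. The following Python sketch is an illustration under stated assumptions, not the tool's embedded JavaScript: the function name `gu_status`, the use of `None` for a missing GU-value, and the label `"complete"` for a fully populated subset are all hypothetical choices.

```python
def gu_status(gu_values):
    """Classify one subset of peaks by how many carry a GU-value.

    gu_values: one entry per peak, with None marking a missing GU-value
    (an assumed encoding). Returns "red" when no peak has a GU-value,
    "orange" when only some do, and "complete" when every peak does.
    """
    present = sum(1 for value in gu_values if value is not None)
    if present == 0:
        return "red"
    if present < len(gu_values):
        return "orange"
    return "complete"


# A sample with three dated subsets, loosely mirroring the Figure 2 example
# of one completed subset and two incomplete ones (values are made up).
subsets = {
    "run_1": [5.42, 6.18, 7.03],
    "run_2": [5.40, None, 7.01],
    "run_3": [None, None, None],
}
statuses = {name: gu_status(values) for name, values in subsets.items()}
```

With this rule, a reviewer scanning the page only needs to open the subsets that are not flagged red or orange to find usable GU-values.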
Conclusion
We have developed a web-based framework (GlycoExtractor) that facilitates the export of large volumes of HPLC-glycan data and the comparison of results obtained in a series of experiments. Current HPLC software lacks automated, accredited workflows for exporting multiple sample sets to open standardized file formats; here we have presented a novel solution and a first draft of proposals for querying, extracting and sharing information content for large scale high-throughput HPLC glycan studies. GlycoExtractor has significantly reduced the time required to prepare and export large collections of data from proprietary databases into file formats amenable to data analysis. As the tool uses JDBC’s database abstraction and a pluggable file writer architecture, it can be easily customized to support other databases and output file formats. It is anticipated that this tool will initiate new data interchange resources for HPLC-glycan technologies, for example, the development of centralized databases, tools and data collection protocols, to facilitate a global initiative to develop a unified glycoscience informatics solution for data sharing and annotation. We hope that this program will stimulate manufacturers to develop and offer common and standardized open formats for data storage and interchange which can be easily integrated into new systems and processes to support HPLC-glycan analysis. The proposed data formats used by the current application could be considered as a primary template for the development of a universal format and further informatic developments. Such a format will encourage and facilitate research activities in various emerging disciplines intertwined with glycobiology and glycoinformatics and will bring the field to the forefront of high-throughput analytical strategies with applications in drug design and discovery (next generation biotherapeutics) and the validation of new biomarkers.
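The pluggable file writer idea mentioned above can be sketched as a registry that maps a format name to a writer function, so that a new output format is added without changing the export routine. This is a hedged Python illustration of the general pattern only; the names `register_writer`, `export` and the peak fields are assumptions, and the actual Java implementation is not described in the text.

```python
from typing import Callable, Dict, List

Peak = Dict[str, float]
Writer = Callable[[List[Peak]], str]

# Registry of available output formats (hypothetical design).
_WRITERS: Dict[str, Writer] = {}


def register_writer(fmt: str) -> Callable[[Writer], Writer]:
    """Decorator that plugs a writer function in under a format name."""
    def decorator(func: Writer) -> Writer:
        _WRITERS[fmt] = func
        return func
    return decorator


def _delimited(peaks: List[Peak], sep: str) -> str:
    """Shared helper: header row plus one row per peak."""
    rows = [sep.join(["peak", "gu", "area_pct"])]
    rows += [sep.join(str(p[key]) for key in ("peak", "gu", "area_pct")) for p in peaks]
    return "\n".join(rows)


@register_writer("csv")
def write_csv(peaks: List[Peak]) -> str:
    return _delimited(peaks, ",")


@register_writer("tsv")  # a new format is added without touching export()
def write_tsv(peaks: List[Peak]) -> str:
    return _delimited(peaks, "\t")


def export(peaks: List[Peak], fmt: str) -> str:
    """Look up the writer for fmt and apply it; unknown formats are rejected."""
    try:
        writer = _WRITERS[fmt]
    except KeyError:
        raise ValueError(f"no writer registered for format {fmt!r}") from None
    return writer(peaks)
```

For example, `export([{"peak": 1, "gu": 5.42, "area_pct": 12.5}], "tsv")` returns a tab-separated header and one data row, and requesting an unregistered format raises `ValueError`.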
Acknowledgment. This research was partially supported by EUROCarbDB, which is a Research Infrastructure Design Study funded by the sixth Research Framework Program of the European Union (RIDS Contract number 011952).
Supporting Information Available: The UML diagram of the relevant part of the relational database schema from Waters Empower, followed by the description of the first release of the GlycoExtractor file formats (guXML, guSON). The proposed file formats could be considered as a primary template for the development of a universal format for data exchange and storage in glycobioinformatics. Also provided is a visual demonstration of the sample set and sample organization in the relational database, to illustrate the benefits of the automated approach in comparison with the manual one. This material is available free of charge via the Internet at http://pubs.acs.org.
References
(1) Dell, A.; Morris, H. R. Glycoprotein Structure Determination by Mass Spectrometry. Science 2001, 291 (5512), 2351–2356.
(2) Zaia, J. Mass Spectrometry and the Emerging Field of Glycomics. Chem. Biol. 2008, 15 (9), 881–892.
(3) Varki, A.; Cummings, R. D.; Esko, J. D.; Freeze, H. H.; Stanley, P.; Bertozzi, C. R.; Hart, G. W.; Etzler, M. E. Essentials of Glycobiology, 2nd ed.; Cold Spring Harbor Laboratory Press: La Jolla, 2008; p 783.
(4) Dwek, R. A. Glycobiology: Toward Understanding the Function of Sugars. Chem. Rev. 1996, 96 (2), 683–720.
(5) Varki, A. Biological roles of oligosaccharides: all of the theories are correct. Glycobiology 1993, 3 (2), 97–130.
(6) Rudd, P. M.; Woods, R. J.; Wormald, M. R.; Opdenakker, G.; Downing, A. K.; Campbell, I. D.; Dwek, R. A. The effects of variable glycosylation on the functional activities of ribonuclease, plasminogen and tissue plasminogen activator. BBA-Protein Struct. 1995, 1248 (1), 1–10.
(7) Rudd, P. M.; Wormald, M. R.; Stanfield, R. L.; Huang, M.; Mattsson, N.; Speir, J. A.; DiGennaro, J. A.; Fetrow, J. S.; Dwek, R. A.; Wilson, I. A. Roles for glycosylation of cell surface receptors involved in cellular immune recognition. J. Mol. Biol. 1999, 293 (2), 351–366.
(8) Arnold, J. N.; Wormald, M. R.; Sim, R. B.; Rudd, P. M.; Dwek, R. A. The impact of glycosylation on the biological function and structure of human immunoglobulins. Annu. Rev. Immunol. 2007, 25, 21–50.
(9) Haslam, S. M.; Julien, S.; Burchell, J. M.; Monk, C. R.; Ceroni, A.; Garden, O. A.; Dell, A. Characterizing the glycome of the mammalian immune system. Immunol. Cell Biol. 2008, 86 (7), 564–573.
(10) Mackeen, M. M.; Almond, A.; Deschamps, M.; Cumpstey, I.; Fairbanks, A. J.; Tsang, C.; Rudd, P. M.; Butters, T. D.; Dwek, R. A.; Wormald, M. R. The Conformational Properties of the Glc3Man Unit Suggest Conformational Biasing within the Chaperone-assisted Glycoprotein Folding Pathway. J. Mol. Biol. 2009, 387 (2), 335–347.
(11) Petrescu, A.-J.; Wormald, M. R.; Dwek, R. A. Structural aspects of glycomes with a focus on N-glycosylation and glycoprotein folding. Curr. Opin. Struct. Biol. 2006, 16 (5), 600–607.
(12) Butler, M.; Quelhas, D.; Critchley, A. J.; Carchon, H.; Hebestreit, H. F.; Hibbert, R. G.; Vilarinho, L.; Teles, E.; Matthijs, G.; Schollen, E.; Argibay, P.; Harvey, D. J.; Dwek, R. A.; Jaeken, J.; Rudd, P. M. Detailed glycan analysis of serum glycoproteins of patients with congenital disorders of glycosylation indicates the specific defective glycan processing step and provides an insight into pathogenesis. Glycobiology 2003, 13 (9), 601–622.
(13) Balzarini, J. Inhibition of HIV entry by carbohydrate-binding proteins. Antiviral Res. 2006, 71 (2-3), 237–247.
(14) Dube, D. H.; Bertozzi, C. R. Glycans in cancer and inflammation - potential for therapeutics and diagnostics. Nat. Rev. Drug Discovery 2005, 4 (6), 477–488.
(15) Fuster, M. M.; Esko, J. D. The sweet and sour of cancer: glycans as novel therapeutic targets. Nat. Rev. Cancer 2005, 5 (7), 526–542.
(16) Gornik, O.; Dumic, J.; Flogel, M.; Lauc, G. Glycoscience - a new frontier in rational drug design. Acta Pharm. 2006, 56 (1), 19–30.
(17) Feizi, T.; Chai, W. Oligosaccharide microarrays to decipher the glyco code. Nat. Rev. Mol. Cell. Biol. 2004, 5 (7), 582–588.
(18) Brumshtein, B.; Greenblatt, H. M.; Butters, T. D.; Shaaltiel, Y.; Aviezer, D.; Silman, I.; Futerman, A. H.; Sussman, J. L. Crystal structures of complexes of N-butyl- and N-nonyl-deoxynojirimycin bound to acid beta-glucosidase: insights into the mechanism of chemical chaperone action in Gaucher disease. J. Biol. Chem. 2007, 282 (39), 29052–29058.
(19) Wuhrer, M. Glycosylation profiling in clinical proteomics: heading for glycan biomarkers. Expert Rev. Proteomic 2007, 4 (2), 135–136.
(20) Packer, N. H.; von der Lieth, C.-W.; Aoki-Kinoshita, K. F.; Lebrilla, C. B.; Paulson, J. C.; Raman, R.; Rudd, P. M.; Sasisekharan, R.; Taniguchi, N.; York, W. S. Frontiers in glycomics: Bioinformatics and biomarkers in disease. Proteomics 2008, 8 (1), 8–20.
(21) Fraser, J.; Maxwell, K.; Davidson, A. Immunoglobulin-like domains on bacteriophage: weapons of modest damage? Curr. Opin. Microbiol. 2007, 10 (4), 382–387.
(22) Open Grid Services Architecture Working Group (OGSA-WG); https://forge.gridforum.org/projects/ogsa-wg.
(23) Aoki-Kinoshita, K. F.; Ichikawa, M.; Ikeda, S.; Yamada, K.; Yamaguchi, T. A Web-Based Resource for Glycome Informatics. In 17th International Conference on Genome Informatics (GIW 2006); Yokohama Pacifico: Japan, 2006.
(24) Campbell, M. P.; Royle, L.; Radcliffe, C. M.; Dwek, R. A.; Rudd, P. M. GlycoBase and autoGU: tools for HPLC-based glycan analysis. Bioinformatics 2008, 24 (9), 1214–1216.
(25) Cooper, C. A.; Harrison, M. J.; Wilkins, M. R.; Packer, N. H. GlycoSuiteDB: a new curated relational database of glycoprotein glycan structures and their biological sources. Nucleic Acids Res. 2001, 29 (1), 332–335.
(26) Doubet, S.; Albersheim, P. Letter to the Glyco-Forum: CarbBank. Glycobiology 1992, 2 (6), 505.
(27) Doubet, S.; Bock, K.; Smith, D.; Darvill, A.; Albersheim, P. The complex carbohydrate structure database. Trends Biochem. Sci. 1989, 14 (12), 475–477.
(28) Hashimoto, K.; Goto, S.; Kawano, S.; Aoki-Kinoshita, K. F.; Ueda, N.; Hamajima, M.; Kawasaki, T.; Kanehisa, M. KEGG as a glycome informatics resource. Glycobiology 2006, 16 (5), 63R–70R.
(29) Kameyama, A.; Kikuchi, N.; Nakaya, S.; Ito, H.; Sato, T.; Shikanai, T.; Takahashi, Y.; Takahashi, K.; Narimatsu, H. A Strategy for Identification of Oligosaccharide Structures Using Observational Multistage Mass Spectral Library. Anal. Chem. 2005, 77 (15), 4719–4725.
(30) Loß, A.; Bunsmann, P.; Bohne, A.; Lo, A.; Schwarzer, E.; Lang, E.; von der Lieth, C.-W. SWEET-DB: an attempt to create annotated data collections for carbohydrates. Nucleic Acids Res. 2002, 30 (1), 405–408.
(31) Lütteke, T.; Bohne-Lang, A.; Loß, A.; Goetz, T.; Frank, M.; von der Lieth, C.-W. GLYCOSCIENCES.de: an Internet portal to support glycomics and glycobiology research. Glycobiology 2006, 16 (5), 71R–81R.
(32) Narimatsu, H. Construction of a human glycogene library and comprehensive functional analysis. Glycoconjugate J. 2004, 21 (1), 17–24.
(33) Pahlevi, S. M.; Kojima, I. OGSA-WebDB: an OGSA-based system for bringing Web databases into the grid. In Information Technology: Coding and Computing (ITCC′04); Las Vegas, Nevada, 5-7 June 2004; Srimani, P. K. P. C.; Abraham, A.; Cannataro, M.; Domingo-Ferrer, J.; Hashemi, R.; Garuba, M.; Goharian, N.; Lawrence, E.; Mirto, M.; Mourelle, L.; Nedgah, N.; Orlic, P.; Regentova, E.; Sahinoglu, Z.; Sapiocha, P.; Chairs, V. S. T.; Aslandogan, Y. A.; Berge, L.; Boult, T.; Dua, S.; Hoban, S.; Chairs, R. F. S. S., Eds.; IEEE Computer Society: Las Vegas, NV, 2004; pp 105–109.
(34) Pahlevi, S. M.; Kojima, I. OGSA-WebDB: An OGSA-Based System for Bringing Web Databases into the Grid. J. Digit. Inf. Manage 2004, 2 (2), 48–53.
(35) Raman, R.; Venkataraman, M.; Ramakrishnan, S.; Lang, W.; Raguram, S.; Sasisekharan, R. Advancing glycomics: Implementation strategies at the Consortium for Functional Glycomics. Glycobiology 2006, 16 (5), 82R–90.
(36) Ranzinger, R.; Herget, S.; Wetter, T.; von der Lieth, C.-W. GlycomeDB - integration of open-access carbohydrate structure databases. BMC Bioinf. 2008, 9 (1), 384.
(37) Shinkawa, T.; Taoka, M.; Yamauchi, Y.; Ichimura, T.; Kaji, H.; Takahashi, N.; Isobe, T. STEM: A Software Tool for Large-Scale Proteomic Data Analyses. J. Proteome Res. 2005, 4 (5), 1826–1831.
(38) Toukach, P. V.; Joshi, H. J.; Ranzinger, R.; Knirel, Y.; von der Lieth, C.-W. Sharing of worldwide distributed carbohydrate-related digital resources: online connection of the Bacterial Carbohydrate Structure DataBase and GLYCOSCIENCES.de. Nucleic Acids Res. 2007, 35 (Suppl_1), D280–D286.
(39) van Kuik, J. A.; Vliegenthart, J. F. G. Databases of complex carbohydrates. Trends Biotechnol. 1992, 10, 182–185.
(40) Benson, D. A.; Karsch-Mizrachi, I.; Lipman, D. J.; Ostell, J.; Wheeler, D. L. GenBank. Nucleic Acids Res. 2008, 36 (Database issue), D25–D30.
(41) Benson, D. A.; Karsch-Mizrachi, I.; Lipman, D. J.; Ostell, J.; Sayers, E. W. GenBank. Nucleic Acids Res. 2009, 37 (Database issue), D26–D31.
(42) Bairoch, A.; Apweiler, R. The SWISS-PROT protein sequence data bank and its new supplement TREMBL. Nucleic Acids Res. 1996, 24 (1), 21–25.
(43) Brazma, A.; Hingamp, P.; Quackenbush, J.; Sherlock, G.; Spellman, P.; Stoeckert, C.; Aach, J.; Ansorge, W.; Ball, C. A.; Causton, H. C.; Gaasterland, T.; Glenisson, P.; Holstege, F. C.; Kim, I. F.; Markowitz, V.; Matese, J. C.; Parkinson, H.; Robinson, A.; Sarkans, U.; Schulze-Kremer, S.; Stewart, J.; Taylor, R.; Vilo, J.; Vingron, M. Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat. Genet. 2001, 29 (4), 365–371.
(44) Taylor, C. F.; Paton, N. W.; Lilley, K. S.; Binz, P.-A.; Julian, R. K.; Jones, A. R.; Zhu, W.; Apweiler, R.; Aebersold, R.; Deutsch, E. W.; Dunn, M. J.; Heck, A. J. R.; Leitner, A.; Macht, M.; Mann, M.; Martens, L.; Neubert, T. A.; Patterson, S. D.; Ping, P.; Seymour, S. L.; Souda, P.; Tsugita, A.; Vandekerckhove, J.; Vondriska, T. M.; Whitelegge, J. P.; Wilkins, M. R.; Xenarios, I.; Yates, J. R.; Hermjakob, H. The minimum information about a proteomics experiment (MIAPE). Nat. Biotechnol. 2007, 25 (8), 887–893.
(45) Sahoo, S. S.; Thomas, C.; Sheth, A.; Henson, C.; York, W. S. GLYDE-an expressive XML standard for the representation of glycan structure. Carbohydr. Res. 2005, 340 (18), 2802–2807.
(46) Herget, S.; Ranzinger, R.; Maass, K.; von der Lieth, C.-W. GlycoCT-a unifying sequence format for carbohydrates. Carbohydr. Res. 2008, 343 (12), 2162–2171.
(47) Abd Hamid, U. M.; Royle, L.; Saldova, R.; Radcliffe, C. M.; Harvey, D. J.; Storr, S. J.; Pardo, M.; Antrobus, R.; Chapman, C. J.; Zitzmann, N.; Robertson, J. F.; Dwek, R. A.; Rudd, P. M. A strategy to reveal potential glycan markers from serum glycoproteins associated with breast cancer progression. Glycobiology 2008, 18 (12), 1105–1118.
(48) Saldova, R.; Royle, L.; Radcliffe, C. M.; Abd Hamid, U. M.; Evans, R.; Arnold, J. N.; Banks, R. E.; Hutson, R.; Harvey, D. J.; Antrobus, R.; Petrescu, S. M.; Dwek, R. A.; Rudd, P. M. Ovarian Cancer is Associated with Changes in Glycosylation in Both Acute-Phase Proteins and IgG. Glycobiology 2007, 17 (12), 1344–1356.
(49) Saldova, R.; Wormald, M. R.; Dwek, R. A.; Rudd, P. M. Glycosylation changes on serum glycoproteins in ovarian cancer may contribute to disease pathogenesis. Dis. Markers 2008, 25 (4), 219–232.
(50) Liang, P.-H.; Wu, C.-Y.; Greenberg, W. A.; Wong, C.-H. Glycan arrays: biological and medical applications. Curr. Opin. Chem. Biol. 2008, 12 (1), 86–92.
(51) Royle, L.; Campbell, M. P.; Radcliffe, C. M.; White, D. M.; Harvey, D. J.; Abrahams, J. L.; Kim, Y.-G.; Henry, G. W.; Shadick, N. A.; Weinblatt, M. E.; Lee, D. M.; Rudd, P. M.; Dwek, R. A. HPLC-based analysis of serum N-glycans on a 96-well plate platform with dedicated database software. Anal. Biochem. 2008, 376 (1), 1–12.
(52) Pedrioli, P. G. A.; Eng, J. K.; Hubley, R.; Vogelzang, M.; Deutsch, E. W.; Raught, B.; Pratt, B.; Nilsson, E.; Angeletti, R. H.; Apweiler, R.; Cheung, K.; Costello, C. E.; Hermjakob, H.; Huang, S.; Julian, R. K.; Kapp, E.; McComb, M. E.; Oliver, S. G.; Omenn, G.; Paton, N. W.; Simpson, R.; Smith, R.; Taylor, C. F.; Zhu, W.; Aebersold, R. A common open representation of mass spectrometry data and its application to proteomics research. Nat. Biotechnol. 2004, 22 (11), 1459–1466.
(53) Orchard, S.; Taylor, C. F.; Hermjakob, H.; Zhu, W.; Julian, R. K.; Apweiler, R. Current status of proteomic standards development. Expert Rev. Proteomic 2004, 1 (2), 179–183.
Journal of Proteome Research • Vol. 9, No. 4, 2010 2041