Human Serum Proteins Preseparated by ... - ACS Publications

Human Serum Proteins Preseparated by Electrophoresis or Chromatography Followed by Tandem Mass Spectrometry

John Marshall,*,† Andy Jankowski, Shirley Furesz, Inga Kireeva, Lisa Barker, Mila Dombrovsky, Weimin Zhu, Kellie Jacks, Leslee Ingratta, Jenny Bruin, Erika Kristensen, Rulin Zhang, Eric Stanton,‡ Miyoko Takahashi, and George Jackowski§

SYNX PHARMA, 1 Marmac Drive, Toronto, Ontario, Canada, M9W 1E7

Electrophoretic and chromatographic sample preparations were compared and together detected the presence of some 600 types of protein products in human serum. Proteins from crude serum preseparated by ionic electrophoresis, chromatography, or a combination of both were analyzed. Proteins were digested with trypsin or chymotrypsin. Naturally occurring peptides were also collected by reversed-phase chromatography. The resulting peptides were identified by tandem mass spectrometry. The peptides were either desorbed by a laser from a metal chip into a quadrupole-time-of-flight mass spectrometer or ionized as an electrospray from reversed-phase chromatography via a metal needle under voltage into an ion-trap mass spectrometer. All of the commonly known proteins associated with serum were detected, and the two mass spectrometers agreed on the identity of abundant serum proteins. Preseparation of serum proteins prior to digestion markedly enhanced the capacity to detect uncommon proteins from blood. Electrophoresis- and chromatography-based experiments were found to be complementary. Many novel cellular proteins not previously associated with serum were recorded.

Keywords: human • serum • protein • proteome • electrophoresis • chromatography • tandem • mass spectrometer

* To whom correspondence should be addressed. Phone: (416) 979-5000 x-4219. E-mail: [email protected].
† Department of Chemistry and Biology, Faculty of Engineering and Applied Science, Ryerson University, Toronto, Ontario, Canada.
‡ Department of Cardiology, St. Joseph’s Hospital, McMaster University, Hamilton, Ontario, Canada.
§ Department of Laboratory Medicine and Pathobiology, Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada.

Journal of Proteome Research 2004, 3, 364-382
Published on Web 01/15/2004
10.1021/pr034039p CCC: $27.50 © 2004 American Chemical Society

Introduction

Mankind aims to enumerate all of the proteins in a human being and establish their structural and functional relationships.1 The fluids of the blood may be of particular biological complexity because they communicate between all of the cells, tissues, and organs, and thus may contain the most diverse set of proteins. The fractionation and identification of the proteinaceous components of blood has a long history.2 Of all human tissues, blood is of the greatest economic importance as a source of transfusion, therapeutics, and blood products or for diagnostic testing.

Partition chromatography has previously been shown to be effective for the selective purification of proteins prior to identification.3,4 Protein identity has been assigned using MS data of digested proteins recovered from PAGE.5 Many novel approaches to preparing proteins for biochemical analysis have recently been explored, including isoelectric focusing and chromatofocusing,6 free flow electrophoresis,7 and novel combinations of electrospray and TOF.8 Blood fluids or other biological solutions of proteins have been resolved by 2D-PAGE with identification of the proteins by MALDI-TOF or nanoflow electrospray ionization.9-11

Serum contains a small group of super-abundant proteins that interfere with the detection of low-abundance sample components. Serum albumin, at concentrations on the order of tens or hundreds of mg per mL, resists isoelectric focusing into a narrow pH range when 2D gels are loaded with useful levels of total protein, and it even spreads and fails to resolve into narrow bands on 1D glycine-based SDS-PAGE gels. Depleting sera of specific proteins such as HSA or IgG has been suggested to provide greater sensitivity for the remaining proteins in the complex mixture of blood fluids.12 Analysis of protein mixtures by multiple chromatography steps has been shown to detect uncommon sample components, including low-abundance polypeptides in the blood fluid.13 Digestion of crude proteins followed by LC-ESI-ION TRAP has been a successful approach in yeast, rice, and wheat.14-16 In contrast, serum has resisted the convenient enumeration of its sample components by high quality tandem mass spectrometry, and the successful analysis of serum may serve as a standard by which to compare the quality of sample preparation techniques.

To identify a protein by mass spectrometry, the protein may be cleaved into peptides by proteases.17 The method may be applied to proteins purified from crude mixtures by PAGE, and the mass distribution of tryptic peptides may be recorded by MALDI-TOF MS and used to identify the proteins.18 More recently, tandem mass spectrometry (MS/MS) has been employed to correlate peptide gas fragmentation patterns to genetic databases. In common MS/MS approaches, each peptide is fragmented by collision-induced dissociation (CID) via acceleration into homonuclear, inert gas molecules in a tandem mass spectrometer.19 The CID method is apparently limited by the relative momentum of the peptide and gas, and likely will not efficiently dissociate peptides much more massive than ∼6 kD into fragments.20,21 At present the two most robust instruments for CID fragmentation and identification of peptides are the Paul ion trap22 and Qq-TOF;21 but the linear ion trap, which is not completely dissimilar to the q in Qq-TOF, may represent an alternative.23,24 The MS/MS fragmentation pattern is then searched against the predicted mass spectra of peptides encoded within genetic databases.25-27 Some assumptions regarding the specificity of the protease, as well as some criteria regarding the acceptable goodness of fit of the recorded spectra, are required.12,28 The gas fragmentation and computer identification process is essentially stochastic.

It is possible that advances in protein sequencing tags29 or single accurate mass tagging by FTICR,30-33 fragmentation of peptides by high energy MALDI-TOF-TOF,34 top-down fragmentation by electron capture dissociation,35 or advances in post source decay36,37 may eventually change how identity is assigned to proteins. However, because the two most common ionization methods for elevating peptides into the gas phase, MALDI38,39 and ESI,40 are competitive reactions that often reflect the most abundant sample components, it remains likely that prefractionation of proteins and peptides prior to ionization will remain a universal feature of proteomics. We have explored chromatography or PAGE prior to tandem mass spectrometry by MALDI-Qq-TOF or ESI-LC ion trap as a means to obtain convincing detection of uncommon types of serum proteins.

Material and Methods

Materials. Except where indicated, all dry chemicals were obtained from the Sigma Chemical Company (St. Louis, MO) and were of a fine grade. All solvents were of an optical grade or better from Caledon Laboratories (Georgetown, Ontario, Canada). Blood sample tubes were from Becton Dickinson (Franklin Lakes, NJ). The various chromatographic resins and SDS-PAGE migration standards were obtained from Bio-Rad Laboratories (Hercules, CA). Reversed-phase resin was obtained from Millipore Laboratories (Bedford, MA). Pre-cast gels were obtained from NOVEX/InVitrogen (Burlington, Ontario, Canada). Sequencing grade trypsin was obtained from Promega (Madison, WI).

Blood Samples. Some normal serum samples were obtained from Serologicals Corporation (formerly Intergen) (Norcross, GA). Blood samples were drawn into serum tubes. The samples were thawed, aliquoted, and refrozen once before being used and discarded as described.41

Preparative Chromatography. Preparative partition chromatography was performed essentially as previously described.41 Briefly, the columns were typically equilibrated with a binding buffer of no more than 100 mM PBS according to the manufacturer’s protocol. Serum samples were mixed with binding buffer and passed over the columns. The column was washed in five volumes of binding buffer before eluting over a range of salt concentrations up to 100 mM PBS plus 1000 mM NaCl as described by the manufacturer. The profiles of the proteins bound to or eluted from the chromatographic resin under the various salt regimes differed, confirming that selective partition chromatography was obtained in these experiments. Additionally, proteins from DEAE-B columns were digested with trypsin or chymotrypsin, and the peptides were bound to HiS in 150 mM PBS and subsequently eluted with up to 1000 mM NaCl, following the MudPIT approach.42,43 In this paper, all protein separations were performed by simple gravity-drip chromatography. For an SDS-PAGE or 1D-LC-ESI-ION TRAP experiment, 25 µL of serum was typically used.

SDS-PAGE. The proteins eluted from the columns were mixed with an equal volume of 2× sample buffer and resolved on pre-cast 10% to 20% gradient tricine gels.44 Although albumin is difficult to compress into tight bands in IEF gels using ampholytes or on 1D glycine gels, high concentrations of the strong buffer tricine can effectively confine albumin to a narrow range of electrophoretic migration even from crude serum. The gels were stained with CBBR in 40% methanol and 10% acetic acid before destaining in methanol and acetic acid. Bands for MS/MS analysis were typically prepared by digestion overnight with sequencing grade trypsin in 50 mM tricine pH 8.5, 200 mM urea, and 5% acetonitrile.

MS/MS by MALDI-Qq-TOF. Polypeptides from sera or chromatographic samples were prepared for MALDI analysis by batch-mode collection on C18 reversed-phase resin, which was washed with several column volumes of 0.1% TFA. The polypeptides were eluted with 50% acetonitrile in water with 0.1% TFA and 5% formic acid and spotted onto a stainless steel target for a SCIEX QSTAR PULSAR I (Concord, Ontario, Canada). MS/MS spectra were collected using DHB as a matrix. The MS/MS fragmentation patterns were searched for significant probability-based Mowse scores at the p = 0.05 level against a nonredundant database of DNA, cDNA, EST and proteins, compiled from publicly available data, using MASCOT.27

MALDI-TOF Profiles and Naturally-Occurring Peptides. Sera and chromatography fractions for MALDI-TOF analysis were diluted in 0.1% TFA and 5% formic acid. For crude sera, samples were prepared for MALDI analysis by placing 1 µL of sera, diluted 1 part in 20 with 0.1% TFA and 5% formic acid, on a gold MALDI-TOF target. Alternatively, the peptides were collected in a batch mode by passage over C18 reversed-phase resin washed with several column volumes of 0.1% TFA and eluted with 2 µL of 50% acetonitrile in water with 0.1% TFA and 5% formic acid. The endogenous peptides were separated by offline LC-MALDI-Qq-TOF, with 2 µL per minute deposited onto each spot under the chromatography conditions (5% to 65% acetonitrile over 1 h) described immediately below. The eluted peptides were dried onto gold MALDI targets. After all of the samples had dried evenly, the energy-absorbing matrix CHCA was applied. A few mg of the matrix CHCA was washed by resuspension in 50% acetonitrile in 0.1% TFA in water before discarding the wash solution. The matrix was redissolved in fresh 50% acetonitrile 0.1% TFA to form a saturated solution. One µL of saturated matrix solution was applied to each MALDI target spot immediately before sampling. The data were collected using a TOF MS model PBSII provided by Ciphergen Biosystems (Fremont, CA).45 The targets were washed with 2% SDS followed by 50% acetonitrile with 5% formic acid before use. We also sequenced the peptides less than 3 kD by MALDI-Qq-TOF. Since no enzyme was used to digest these samples, we performed the MASCOT search with no enzyme.

Figure 1. SDS-PAGE gels of intact serum proteins obtained by preparative partition chromatography as compared to crude sera. Proteins were resolved on 7% tricine gels and stained by CBBR. Numbers: 1, apolipoprotein B-100 (517 kD); 2, fibronectin (262 kD); 3, complement C4 (192 kD); 4, complement C3 (187 kD); 5, α-2-macroglobulin (163 kD); 6, complement H factor (139 kD); 7, apolipoprotein A-II (111 kD); 8, apolipoprotein C-III (108 kD); 9, inter-α-trypsin inhibitor (101 kD); 10, fibrinogen (94 kD); 11, histidine-rich glycoprotein (59 kD); 12, vitronectin (54 kD); 13, apolipoprotein J (53 kD); 14, antithrombin-III precursor (52 kD); 15, hemopexin/β-1B-glycoprotein (52 kD); 16, immunoglobulin (49 kD); 17, antitrypsin (47 kD); 18, pigment epithelium derived factor (46 kD); 19, apolipoprotein A-IV (45 kD); 20, haptoglobin (45 kD); 21, apolipoprotein E (36 kD); 22, C1 inhibitor (36 kD); 23, apolipoprotein A-I (31 kD); 24, α-1-acid glycoprotein (23 kD); 25, transthyretin (14 kD); 26, serum amyloid A protein (13 kD); 27, serum albumin (66 kD). SB, sample buffer. The masses of the SDS-PAGE migration standards are shown in kilodaltons. Proteins identified by the combination of chromatography followed by SDS-PAGE and MALDI-Qq-TOF are shown alongside the descriptor CPM in Table 2.

MS/MS by LC-ESI-ION TRAP. Enzymatic peptides obtained from sera fractions, either directly or after separation with HiS resin, were resolved by C18 reversed-phase chromatography (Vydac 0.3 mm ID, 15 cm column). The sample was analyzed over a 90 min gradient from 5% to 65% acetonitrile at a flow rate of 2 µL per minute with an Agilent 1100 series capillary pump via a metal needle electrospray46 into a Deca XP-100 ION TRAP. The MS/MS data files from the LC-ESI-ION TRAP runs were searched against a nonredundant library of proteins, cDNAs, EST,47 and genomic DNA using SEQUEST.26 Because the samples in this paper were digested with trypsin or chymotrypsin, we accepted tryptic or chymotryptic peptides. Peptides identified by SEQUEST with significant X-correlations of ≥1.9 (1+), ≥2.5 (2+), and ≥3.75 (3+) and delta-correlation values ≥0.1 were reported as previously described.14 Only peptides that matched the enzyme used for digestion were accepted because we observed very few peptide assignments when enzymes were omitted from the reaction. The proteins were additionally searched against the human databases using Bioworks.48 The database subset was created from the downloaded NCBI NR database with the search strings Homo sapiens or human, and with the terms ribose, virus, viral, and HIV excluded from the FASTA headings.
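The acceptance criteria and the database subsetting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' software: the `PeptideHit` record, the function names, the example hits, and the choice of case-insensitive substring matching on FASTA headers are all hypothetical. The numeric thresholds and the search terms are the ones quoted in the text.

```python
from dataclasses import dataclass

# Minimum X-correlation per precursor charge state, as quoted in the text.
XCORR_MIN = {1: 1.9, 2: 2.5, 3: 3.75}
# Minimum delta-correlation, as quoted in the text.
DELTA_CN_MIN = 0.1

# Header terms quoted in the text; matching them as case-insensitive
# substrings is an assumption made for this sketch.
REQUIRED_TERMS = ("homo sapiens", "human")
EXCLUDED_TERMS = ("ribose", "virus", "viral", "hiv")


@dataclass
class PeptideHit:
    """Hypothetical record for one peptide assignment."""
    sequence: str
    charge: int
    xcorr: float
    delta_cn: float


def passes_filter(hit: PeptideHit) -> bool:
    """Apply the charge-dependent XCorr cutoff and the delta-correlation cutoff."""
    threshold = XCORR_MIN.get(hit.charge)
    if threshold is None:
        # Charge states outside 1+ to 3+ are not covered by the stated criteria.
        return False
    return hit.xcorr >= threshold and hit.delta_cn >= DELTA_CN_MIN


def keep_header(header: str) -> bool:
    """Keep a FASTA header that names a human source and has no excluded term."""
    text = header.lower()
    has_required = any(term in text for term in REQUIRED_TERMS)
    has_excluded = any(term in text for term in EXCLUDED_TERMS)
    return has_required and not has_excluded


# Made-up hits and headers, for illustration only.
hits = [
    PeptideHit("AAAK", 2, 2.8, 0.25),
    PeptideHit("AAAR", 3, 3.10, 0.30),
]
accepted = [h.sequence for h in hits if passes_filter(h)]
headers = [
    ">example1 serum albumin [Homo sapiens]",
    ">example2 human immunodeficiency virus protein",
]
subset = [h for h in headers if keep_header(h)]
```

Plain substring matching is the simplest reading of "search strings" and "terms excluded", but it would also reject any header that merely contains the letters "hiv", so a stricter word-boundary match may be preferable in practice.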

Results

Crude serum or serum fractionated with chromatography was resolved by SDS-PAGE (Figure 1). We found that the various chromatographic preparations had the effect of enhancing the representation of more modest protein constituents with respect to albumin, in agreement with previous results.12 Pre-fractionation of the samples by chromatography resulted in markedly different profiles after SDS-PAGE and CBBR-staining (Figure 1). The identities of some of the bands, as determined by MALDI-Qq-TOF, are listed in the legend of Figure 1 with reference to the numbered arrows on the edge of the gel lane. The proteins identified by this method show the appropriate trend in molecular mass when considered from the top to the bottom of the gel and agree with the mass of known serum proteins or processed forms of these proteins. In general, the MALDI-Qq-TOF unambiguously yielded the

Figure 2. Preparatory resolution of crude serum by SDS-PAGE and identification of gel bands by MALDI-Qq-TOF. A, an example of a y-ion series from peptides recovered from an SDS-PAGE band, as detected by MALDI-Qq-TOF, is illustrated with the example of apolipoprotein A-IV (from band number 19 in Figure 1). B, an SDS-PAGE gel of crude serum from which sixteen bands were excised and digested with trypsin before identification with MALDI-Qq-TOF or LC-ESI-ION TRAP. The masses of the SDS-PAGE migration standards are shown in kilodaltons. The comparison of the main proteins identified by the MALDI and LC MS/MS methods is shown in Table 1, and the full list of proteins obtained from the 16 bands is shown in Table 2 alongside the descriptor PAGE-LC. For an example of LC-ESI-ION TRAP spectra see Figure 7.
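For readers unfamiliar with the y-ion series mentioned in the Figure 2 legend, the following Python sketch shows how the expected singly protonated y-ion m/z values of a peptide can be computed from standard monoisotopic residue masses. It is a generic illustration, not the calculation performed by MASCOT or SEQUEST: the function name and the peptide `SAMPLER` are arbitrary and hypothetical, and modifications and higher charge states are ignored.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O (Da)
PROTON = 1.007276   # mass of a proton (Da)


def y_ion_series(peptide):
    """Return singly protonated y-ion m/z values, y1 first.

    Each y ion is a C-terminal fragment: the sum of its residue masses
    plus one water plus one proton. The last value corresponds to the
    intact protonated peptide, [M+H]+.
    """
    series = []
    running = WATER + PROTON
    for residue in reversed(peptide):
        running += RESIDUE_MASS[residue]
        series.append(round(running, 4))
    return series


# Arbitrary illustrative sequence, not a peptide reported in this paper.
for index, mz in enumerate(y_ion_series("SAMPLER"), start=1):
    print(f"y{index}: {mz}")
```

Consecutive y ions differ by exactly one residue mass, which is why a run of assigned y ions reads out the peptide sequence from the C-terminus.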

identity of the major proteins of CBBR-stained bands from PAGE gels of chromatographic fractions.21 We compared identification of PAGE bands directly by MALDI-Qq-TOF of the tryptic digest (Figure 2A) with identification after resolution by liquid chromatography prior to ESI-ION TRAP (Table 1). In general, we found that MALDI-Qq-TOF was a robust method to identify gel bands but that our device sometimes failed to directly identify ultrahigh molecular mass proteins of crude serum resolved on gels, presumably because of low molar amounts (Figure 2B). We also sequenced the bands from an SDS-PAGE gel of crude serum by separating the peptides by C18 reversed-phase chromatography prior to the ESI-ION TRAP. Consistent with previous results, MALDI-Qq-TOF and ESI-ION TRAP commonly agreed on the type of proteins that comprised the super-abundant sample components.49 However, LC-ESI-ION TRAP analysis of the sample apparently detected more than 100 additional faint proteins that were not detected in the direct MALDI-Qq-TOF analysis (see Table 2). The LC-based experiment apparently detected more proteins by resolving the mixture of peptides and thus relieving the suppression of minor sample

Table 1. Comparison of the Major Proteins Identified by MALDI-Qq-TOF versus LC-ESI-ION TRAP from the 16 Bands Indicated from the SDS-PAGE Gel of Crude Serum as Illustrated in Figure 2ᵃ

band # | MALDI-Qq-TOF | LC-ESI-ION TRAP
1 | no match | apolipoprotein B-100
2 | no match | apolipoprotein B-100, α-2-macroglobulin
3 | no match | apolipoprotein B-100, α-2-macroglobulin, fibronectin 1 isoform 1
4 | no match | α-2-macroglobulin, fibronectin 1 isoform 1
5 | inter-α trypsin inhibitor | α-2-macroglobulin, inter-α trypsin inhibitor
6 | α-2-macroglobulin | α-1-antitrypsin, α-2-macroglobulin, myosin heavy chain, ITI-heavy chain related protein, apolipoprotein A-I, antithrombin III
7 | ceruloplasmin or ferroxidase | α-2-macroglobulin, ferroxidase, complement C3
8 | complement C3 | complement C3, α-2-macroglobulin, ferroxidase
9 | complement C3 | complement C3, α-2-macroglobulin precursor
10 | albumin | albumin, complement 4A
11 | albumin | albumin, complement C3
12 | IgG | albumin, serine proteinase inhibitor clade A, α-1-antitrypsin, IgG heavy chain constant γ-3
13 | haptoglobin | apolipoprotein A-IV
14 | no match | apolipoprotein E
15 | apolipoprotein A-I | apolipoprotein A-I
16 | transthyretin or pre-albumin | pre-albumin, apolipoprotein A-I

ᵃ The set of gel bands was digested with trypsin, and a portion of the sample was identified by MALDI-Qq-TOF with a MASCOT search versus a similar portion resolved by 1D-LC with on-line identification by ESI-ION TRAP with a SEQUEST search as described in the material and methods. The super-abundant serum proteins that were common to both methods are shown in bold typeface. Many additional proteins were identified by LC-ESI-ION TRAP and these are listed alongside the descriptor PAGE-LC in Table 2.

components in the competitive ionization reaction. LC-MALDI-Qq-TOF and LC-ESI-ION TRAP have previously been shown to exhibit ∼90% overlap in the identification of proteins from similar samples.49

Typical PAGE gels have difficulty visualizing proteins of less than 10 kD,44 a crucially important part of the low-abundance serum proteome.50 Hence, we also used preparative reversed-phase chromatography to collect low mass peptides prior to MALDI-Qq-TOF (Figure 3). The MALDI-TOF spectra of crude serum contained few peaks, whereas the preparative C18 fraction revealed far more MALDI analytes (Figure 3). Thus, a great enrichment of low molecular mass peptides was visualized in the reversed-phase preparations compared to neat serum. Here, peptide fingerprints of sera prepared with C18 partition chromatography are shown, in contrast to previously examined profiles of retentate chromatography.45 The effect of the partition chromatography separation is to reveal complex peptide patterns with a large number of resolved analytes per spectrum and a high signal-to-noise ratio. We have identified some of the major peptides less than 3 kD from the preparative C18 reversed-phase fraction by MALDI-Qq-TOF.51,52 We found the naturally occurring, i.e., endogenous, peptides to be families of fragments from commonly known serum proteins including HSA, serum amyloid A, apolipoprotein E, α-fibrinogen, complement C3 or C4, clusterin, and other known serum proteins. In addition, low abundance proteins such as an androgen induced protein (Figure 4), or peptides that showed weak signals and had some similarity with integrins and other extracellular proteins, were observed.

At present, obtaining the identity of intense endogenous peptides less than 3 kD was found to be practical with a commercially available MALDI-Qq-TOF.21 As previously observed,41 we recorded not only apparent tryptic peptides from common serum proteins but also degraded forms of the peptides that were missing amino acids from the amino and occasionally the carboxyl terminus (not shown). Compared to other methods, offline LC-MALDI-Qq-TOF shows particular promise for elucidating the endogenous low molecular weight peptides of human serum, but further refinements to the sample preparation and perhaps a more recent MALDI tandem MS/MS instrument may be required.
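The agreement between MALDI-Qq-TOF and LC-ESI-ION TRAP discussed in this section can be tallied with simple set operations. The Python sketch below is a hypothetical illustration: the function name is invented, and scoring agreement as the shared fraction of all identifications (a Jaccard index) is an assumption, since the text does not state how the ∼90% overlap figure was defined. The example uses the band 8 entries of Table 1, with the protein names normalized by hand to a common spelling.

```python
def compare_identifications(maldi, lc):
    """Compare two lists of protein names found by two methods.

    Returns the shared names, the names unique to each method, and the
    shared fraction of all names (Jaccard index; an assumed definition).
    """
    maldi_set, lc_set = set(maldi), set(lc)
    shared = maldi_set & lc_set
    union = maldi_set | lc_set
    return {
        "shared": shared,
        "maldi_only": maldi_set - lc_set,
        "lc_only": lc_set - maldi_set,
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Band 8 of Table 1, with names normalized by hand.
band_8 = compare_identifications(
    ["complement C3"],
    ["complement C3", "alpha-2-macroglobulin", "ferroxidase"],
)
print(sorted(band_8["shared"]))
print(sorted(band_8["lc_only"]))
```

In practice the hard part is the name normalization, because the same protein often appears under synonyms or as a precursor entry in the two search outputs.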

Proteins from crude serum or chromatographic preparations were digested with trypsin or chymotrypsin and subjected to reversed-phase separation (Table 2). Examples of TIC traces for reversed-phase chromatography monitored with ESI-ION TRAP are shown (Figure 5). The DEAE-B fraction showed variation in the TIC traces after reversed-phase separation compared to crude serum. The ION TRAP mass spectrometer was set to collect MS/MS spectra when ions yielding strong signals were detected in the MS sweep. The bulk of analytes with MS/MS fragmentation patterns that matched best with known or predicted human proteins generally eluted between 20% and 40% acetonitrile, as would be expected for peptides resolved by reversed-phase chromatography. To illustrate the quality of the MS/MS spectra generated, MS/MS spectra for some significant peptides are shown in Figure 6. In general, we observed strong MS signals with little noise and consequently high signal-to-noise ratios. The high signal-to-noise ratios of the MS and MS/MS spectra obtained by the ion trap from peptides derived from refined serum, of which several examples are shown, are the most important result reported in this paper. Of the MS/MS spectra of peptides that met the filter conditions for subsequent searching, as many as 20 to 25% produced significant X-Correlation and delta X-Correlation scores, resulting in the assignment of X and Y ion series by SEQUEST.26 We inspected the MS/MS spectra of the significant single ions to ensure that some consecutive ions were assigned and that few major ions were unaccounted for.12 Commonly known serum proteins were typically identified by a large number of peptides per protein. Uncommon serum proteins were often redundantly identified by a single ion in replicate runs.
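The elution window quoted above can be related to run time with a short calculation. The Python sketch below assumes a strictly linear gradient from 5% to 65% acetonitrile over 90 min at 2 µL per minute, as given in the methods, and ignores any delay volume between the pump and the column. The linearity, the neglect of delay volume, and the function name are assumptions made for illustration.

```python
def percent_to_time(percent_acn, start=5.0, end=65.0, duration_min=90.0):
    """Time (min) at which a linear gradient reaches a given % acetonitrile."""
    if not start <= percent_acn <= end:
        raise ValueError("percentage lies outside the programmed gradient")
    return (percent_acn - start) / (end - start) * duration_min


FLOW_UL_PER_MIN = 2.0  # flow rate quoted in the methods

window_start = percent_to_time(20.0)  # about 22.5 min
window_end = percent_to_time(40.0)    # about 52.5 min
# Solvent volume delivered while the 20% to 40% window elutes.
window_volume_ul = (window_end - window_start) * FLOW_UL_PER_MIN  # about 60 µL
```

Under these assumptions the peptides that matched human proteins would be expected mostly in the middle third of the run, between roughly 22.5 and 52.5 min.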
We show a representative assignment of x and y ions to putative peptides by SEQUEST for one commonly known and one uncommon serum protein, illustrated with the examples of apolipoprotein and unknown protein af021799 (Figure 7). All of the major proteins detected by MALDI-Qq-TOF, or related protein(s)/synonymous proteins/pre-proteins, were also observed by LC-ESI-ION TRAP. The chromatography-based experiments and gel-based experiments agreed at the level of super-abundant proteins but were complementary at the level of low-abundance proteins. We tested the effect of chromato-


Table 2. Comparison of the Types of Protein Products Identified by Sample Preparations and Sampling Schemesᵃ

protein name

pre-albumin (NC_000909) conserved hypothetical protein (NC_003902) conserved hypothetical protein 26S proteasome non-ATPase regulatory subunit II 30S ribosomal protein S7 homolog 4-coumarate ligase-Lithosperum erythrohizon ABC transporter ecsA homolog S41121 acetyl-CoA carboxylase (EC 6.4.1.2) - human Hydroxyindole O-methyltransferase (HIOMT) (Acetylserotonin O-methyltransferase) actin similar to T-cell activation NFKB-like protein [Homo sapiens] adipose most abundant gene transcript 1 ADP-L-Glycero-D-mannoheptose 6 epimerase AF005898 (orf) afamin precursor; R-albumin [Homo sapiens] AK023732 unnamed protein R-1-antichymotrypsin R-1-antichymotrypsin precursor R-1-acid glycoprotein R-1B-glycoprotein [Homo sapiens] AF349032_1 R-2-macroglobulin [Homo sapiens] crude R-2-macroglobulin precursor [Homo sapiens] R-1 antitrypsin crude R-1-microglobulin/bikunin precursor; R-1-microglobulin/bikunin R-2 IX collagen pigment epithelium-derived factor [Homo sapiens] X R-2-HS-glycoprotein; [Homo sapiens] R-thrombin-haemadin complex (exocite II-binding inhibitor) Alu RNA binding protein amyloid related serum protein Similar to androgen-induced prostate proliferative shut off associated protein angiotensinogen ANK3_HUMAN Ankyrin 3 (ANK-3) (Ankyrin G) anti TNF-R antibody light-chain Fab fragment anti-Entamoeba histolytica immunoglobulin κ light chain anti-HBsAg immunoglobulin Fab κ chain anti-rabies SOJB immunoglobulin λ light chain [Homo sapiens] antithrombin III variant [Homo sapiens] R18 APETALA A3 homologue RFAP3-1 apolipoprotein A-I, Crystal Structure crude apolipoprotein A-I precursor [Homo sapiens] apolipoprotein A-II apolipoprotein A-II precursor apolipoprotein A-IV apolipoprotein A-IV precursor [Homo sapiens] apolipoprotein B fragment [Homo sapiens] apolipoprotein C-I precursor [Homo sapiens] apolipoprotein C-II precursor [Homo sapiens] apolipoprotein C-III precursor [Homo sapiens] apolipoprotein D, apoD [human, plasma, Peptide, 246 aa] apolipoprotein E apolipoprotein F 
apolipoprotein J apolipoprotein L apolipoprotein L1 isoform a precursor; apolipoprotein L; apolipoprotei apolipoprotein L-I ATP dependent serine activating enzyme ATP dependent zinc metallo peptidase dJ741H3.1.1 attractin (with dipeptidylpeptidase IV activity) secreted A42856 EPF autoantibody-reactive epitope-human (fragment) X& B cell antigen CD75 [Homo sapiens] ating protein 2 CFAB_HUMAN Complement factor B precursor (C3/C5 convertase) basic membrane protein BC008988 (protein for MGC:17333) BC010708 hypothetical protein B-cell CLL/lymphoma 11B isoform 2; B-cell lymphoma/leukaemia 11B; zinc β-1B-glycoprotein β-2-glycoprotein I precursor [Homo sapiens] β-2-glycoprotein I precursor [Homo sapiens] β-2-glycoprotein I, fifth domain, chain A B-factor properidin BRAC1 associated ring domain protein - Xenopus laevis similar to butyrophilin-like 2 (MHC class II associated) [Homo sapiens]


preparation/sampling/protease

chroma chroma chroma chroma PAGE-LC chroma chroma chroma chroma

trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin

PAGE-LC PAGE-LC PAGE-LC CPM PAGE-LC CPM PAGE-LC CPM PAGE-LC PAGE-LC

PAGE-LC

PAGE-LC PAGE-LC PAGE-LC PAGE-LC CPM PAGE-LC CPM PAGE-LC PAGE-LC CPM PAGE-LC CPM PAGE-LC PAGE-LC CPM PAGE-LC PAGE-LC

PAGE-LC

2

7705738

6

542750 1170276

11 1

27731085

2

4501987

10

21071030 13661814 4557225 393350 4502067

9 1 34 1 2

* 18483.26 1144299 91254.47 4502005 * * * 602.4891 24657779

chroma chroma chroma chroma chroma chroma chroma chroma chroma

trypsin trypsin trypsin trypsin trypsin trypsin trypsin chymo trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin

* 7277.278 * * * 15129.24 17136.71 * 5695949 1141568 * * * 2478.766 78477.69 403503.8 56543 133262.9 8982.031 324924.5 * * * 109.6259

7 4

1

21759000

39

27728687 576554

1 6

2914175 4557321

29 1

28762 28789 4502157 4502159 4557323 619383 4557325

1 8 5 3 2 6 13

21735614

1

trypsin * chroma trypsin * chroma trypsin * trypsin 68598.51 13160051

11

chroma chroma chroma chroma chroma chroma

chroma trypsin 151.8605 345836

1

chroma trypsin 17306.32 29389 chroma trypsin 744729.5 584908

7 16

chroma chroma chroma chroma

10

CPM chroma PAGE-LC PAGE-LC

219978

trypsin trypsin trypsin trypsin trypsin trypsin

CPM PAGE-LC PAGE-LC CPM PAGE-LC NEOLM chroma chroma CPM chroma chroma PAGE-LC PAGE-LC

459021.5 * * * 1761.871 * * 7798.266 1499.027

chroma chroma chroma chroma chroma chroma NEOLM

PAGE-LC

Gi no.

chroma trypsin * chroma trypsin 208.9137 trypsin * chroma trypsin * chroma trypsin * trypsin 4277.757 chroma trypsin * chroma trypsin * trypsin * trypsin * chroma trypsin 99851.8 chroma trypsin 133773.9 trypsin 2240336 chroma trypsin 1023276 chroma trypsin 5462.652

PAGE-LC

peptides

UniScore

chroma chroma chroma chroma

trypsin trypsin trypsin trypsin

* * * 1529.333 12597635

trypsin trypsin trypsin trypsin trypsin trypsin trypsin

* 6287.948 4557327 7430.16 4557327 * * * 12342.83 27715671

7 7

14

Table 2. (continued)

protein name

preparation/sampling/protease

C4b binding protein fragment chroma trypsin farnesyltransferase, CAAX box, β [Homo sapiens] chroma trypsin calmodulin-like skin protein [Homo sapiens] chroma trypsin carB protein homolog chroma trypsin CBP8_HUMAN Carboxypeptidase N 83 kDa chain chroma trypsin (Carboxypeptidase N regulatory subunit) cartilage intermediate layer protein [Homo sapiens] chroma trypsin catalase interacting protein cont trypsin CD20 Antigen chroma chymo similar to death effector filament-forming chroma trypsin Ced-4-like apoptosis protein cell surface hep chroma trypsin centaurin, β 2; centaurin β2; Arf GAP with coiled coil, ANK repeat chroma trypsin ceruloplasmin (ferroxidase); Ceruloplasmin [Homo sapiens] PAGE-LC chroma trypsin chain A cr2-cr3 complex structure chroma trypsin factor B [Homo sapiens] chroma trypsin chain A, crystal structure of the human Igg1 Fc-fragment PAGE-LC chroma trypsin chain A, NMR structure of human apolipoprotein C-Ii PAGE-LC trypsin in the presence Of SDS chain A, serum amyloid P component (Sap) PAGE-LC trypsin chain A, solution structure of the first Hmg box in trypsin human upstream binding F chain A, structure of the fab fragment from a chroma trypsin human Igm cold agglutinin chain A, X-ray structure of human complement protein PAGE-LC trypsin C8γ At Ph7.O immunoglobulin G Fc receptor IIIA [Homo sapiens] chroma trypsin chain B, crystal structure Of S-Nitroso-Nitrosyl PAGE-LC trypsin human hemoglobin A chain A, human serum transferrin, recombinant PAGE-LC trypsin N-terminal lobe, apo form chain F/chain C of fibrinogen fragment chroma trypsin chain H, crystal structure of an unliganded (native) Fv PAGE-LC chroma trypsin from a human Igm anti-peptide antibody chain H, crystal structure of tissue factor in complex PAGE-LC trypsin with humanized Fab D3h44 chain L, crystal structure of a human Igm rheumatoid factor Fab PAGE-LC chroma trypsin chain L, Igg Fab (human Igg1 κ) chimeric fragment (Cbr96) chroma trypsin chain L, crystal structure of tissue factor in 
complex PAGE-LC chroma trypsin with humanized Fab D3h44 chain L, Igg Fab (human Igg1, κ) chimeric fragment (Cbr96) PAGE-LC trypsin chaperonin (groEL) chroma trypsin CHLPN 76 kDa homologue Chlamydia trachomatis chroma chymo chromosome 20 open reading frame 96 [Homo sapiens] chroma trypsin CLU chroma trypsin clusterin (complement lysis inhibitor, SP-40,40, PAGE-LC NEOLM chroma trypsin sulfated glycoprotein coagulation factor II prothrombin precursor chroma trypsin FA9_HUMAN coagulation factor IX precursor (Christmas factor) chroma trypsin coagulation factor V jinjiang B domain [Homo sapiens] chroma trypsin coagulation factor X precursor; prothrombinase; chroma trypsin factor Xa [Homo sapiens] coagulation factor XI chroma trypsin dJ398D13.1 (coagulation factor XIII A chain precursor (F13A) chroma trypsin complement C1 chroma trypsin complement C1 β chroma trypsin complement C1 q chroma trypsin C1R_HUMAN complement C1r component precursor chroma trypsin C1s, chain A, crystal structure of the catalytic domain chroma trypsin of human complement C1QC_HUMAN complement C1q subcomponent, PAGE-LC trypsin C chain precursor complement C1s, chain A, crystal structure of the chroma trypsin catalytic domain of human complement component 2 precursor; chroma trypsin C3/C5 convertase [Homo sapiens] complement component 4A [Homo sapiens] crude NEOLM chroma trypsin complement C4A precursor PAGE-LC trypsin B20807 complement C4B - human (fragment) I chroma trypsin complement C7 [Homo sapiens] I chroma trypsin complement component 8, γ-polypeptide [Homo sapiens] chroma trypsin complement C8-β propetide chroma trypsin complement component 9 [Homo sapiens] chroma trypsin complement component 1, q subcomponent, PAGE-LC trypsin R polypeptide precursor complement component 1 inhibitor CPM PAGE-LC trypsin complement component 2 precursor; PAGE-LC trypsin C3/C5 convertase [Homo sapiens] complement component 3 precursor crude CPM PAGE-LC NEOLM chroma trypsin complement component 4 binding 
protein, R; PAGE-LC trypsin Complement component 4-b complement component 4A [Homo sapiens] PAGE-LC chroma trypsin complement component 4B proprotein [Homo sapiens] PAGE-LC trypsin

UniScore

Gi no.

* 1390.218 10835059 11994.59 8393159 * 60774.9 115877 2729.611 4502845 * * 40276.52 28511183 * 1429.142 17977656 845639.4 4557485 * 6157.958 758090 152091.4 28373341 * 10638.06 576259 2888.748 17942547

peptides

4 1 4 17 14 8 18 1 1 5 3

* * 70.41307 1478198 16623.91 3660145

1 1

315519.6 4389230

10

* * * * * 116910.3 18655500 * * * 5406.404 23943928 * 467147.6 4502905 * 102.1485 119772 18697.07 17426607 36910.19 4503625 * 473.2047 9453724 * * * 3433.832 115204 44603.06 13787045 32250.29 20178281 46531

1

13 19 1 8 7 3

8 4 4

13787045

4

27817.37 14550407

7

50387.47 * 24864.91 36839.11 3512.513 8842.707 413266.4 10886.54

15214496

5

87191 899271 4557393 29575 4502511 7705753

3 5 1 8 15 3

* 21926.99 14550407

7

2159179 4557385 7568.7 4502503

55 6

44209.56 15214496 1021830 4502501

5 36

Journal of Proteome Research • Vol. 3, No. 3, 2004 369

research articles

Marshall et al.

Table 2. (continued) protein name

complement component 5 [Homo sapiens] complement component 6 precursor CFAI_HUMAN complement factor I precursor (C3B/C4B inactivator) FHR-1; complement factor H-related protein 1 [Homo sapiens] complement factor H related protein related 3 complement H factor complement H factor 1 connective tissue activating peptide III conserved hypothetical protein gi|2501617| conserved hypothetical protein Mycoplasma pneumonia Y256_MYCPN conserved hypothetical protein NC_003902 - Xanthomonas campestris conserved hypothetical protein Xanthomonas axonopodis pv citri NP_640988.1 conseved protein NC_000916 Methanothermobacter thermautotrophicus COX-1 intron 1 protein - yeast A chain A, human C-reactive protein CRP2 binding protein cyclic nucleotide gated channel β 3; cyclic nucleotide-gated chanel cyclic nucleotide phospho diesterase cyclin dependent kinase 2 CYSQ PROTEIN HOMOLOG gi|1651207| cytochrome b-245, β-polypeptide (chronic granulomatous disease) cytochrome P450 - Nostoc spp. cytokeratin 9 [Homo sapiens] I crude cytokeratin type II [Homo sapiens] cytoskeleton assembly control protein homolog AF260332_1 DC33 [Homo sapiens] dead box ATP-dependent RNA Helicase Schizosaccharomyces pombe DEME 6 protein dermcidin precursor; AIDD protein [Homo sapiens] dihydroxyvitamin D3 induced protein and β arrestin 1 dimeric dihydroil dehydrogenase dJ1178H5.4.4 (novel protein (isoform 4)) [Homo sapiens] dJ1187J4.2 (novel protein similar to rat RYF3 - potential ligand binding protein) DNA excision repair protein haywire DNA polymerase III, subunit R (polC-1) similar to putative DNA polymerase; POL4P [Homo sapiens] DNA polymerase, β RA52_HUMAN DNA repair protein RAD52 homolog DNA segment, numerous copies, expressed probes (GS1 gene) A32618 DNA-directed RNA polymerase (EC 2.7.7.6) II 23K chain [validated] - human echinoderm microtubule associated protein like 1 ectonucleotide pyrophosphatase/phosphodiesterase 5 (putative function) Egp200-MR6 elongation factor Ts (EF-Ts) - Spiroplasma citri 
elongation factor Tu family protein; protein id: At1 g06220.1 Ethylene-inducible protein Arabidopsis thaliana expressed sequence AI429604 F18A5.250 Arabidopsis thaliana FA9_HUMAN coagulation factor IX precursor (christmas factor) RNA processing factor 1 [Homo sapiens] fatty acid binding protein homologue 3 fatty acid coenzyme A ligase Fc γ receptor 1ii FeMo cofactor biosynthesis protein - Rhodobacter capsulatus fibrin R C term fragment fibrin β fibrinogen R fibrinogen R A fibrinogen β I-118 fibrinogen β + β-B + I-118 fibrinogen β-B fibrinogen γ fibrinogen, R-chain isoform R-preproprotein [Homo sapiens] fibroblast activation protein fibronectin [Homo sapiens] fibronectin 1 isoform 1 preproprotein fibronectin [Homo sapiens] I fibronectin [Homo sapiens] similar to fibulin 1 isoform C precursor [Homo sapiens] ficolin 1 precursor; ficolin (collagen/fibrinogen domain-containing) 1 FCN3_HUMAN ficolin 3 precursor (collagen/fibrinogen domain-containing protein) FLJ00256 protein hypothetical protein FLJ12298 [Homo sapiens] y¨ hypothetical protein FLJ31579 [Homo sapiens] Fog-2 homology domian containing family member Caenorhabitis elegans


preparation/sampling/protease

PAGE-LC PAGE-LC

CPM PAGE-LC

UniScore

32

chroma trypsin 3712.539 1772972 chroma trypsin * trypsin * chroma trypsin * chroma trypsin * trypsin * chroma chymo * chroma trypsin * chroma chymo *

2

chroma chroma chroma chroma chroma chroma PAGE-LC chroma chroma PAGE-LC chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma

PAGE-LC PAGE-LC

trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin

4

* * 104.6788 * 2441.301 * * * 3556.862 * 112360.3 72444.76 * 1134.566 * * 2668.763 * * 339.6276 *

1942435

2

21361816

6

6996021

16

435476 4758618

7 8

12005902

1

16751921

1

5262926

2

chroma trypsin * trypsin * trypsin 2939.611 20826233 12 chroma trypsin * trypsin 194.8485 1172823 4 trypsin * chroma trypsin 81.05428 2135016 1 chroma trypsin 1629.987 4758268 chroma trypsin 192.0414 11034849 chroma chroma chroma chroma chroma chroma

PAGE-LC chroma chroma chroma chroma chroma chroma chroma NEOLM chroma chroma chroma chroma chroma chroma PAGE-LC chroma chroma

CPM

peptides

chroma trypsin 30776.45 4502507 trypsin * chroma trypsin 37019.07 116133

chroma trypsin

PAGE-LC PAGE-LC

Gi no.

PAGE-LC chroma PAGE-LC chroma chroma PAGE-LC chroma chroma chroma chroma

trypsin trypsin trypsin chymo trypsin chymo trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin

* * 700.4527 * * * 102.1485 2792.28 * * * * * * * * * * * * 132.6778 * 20524.94 * 206882.9 12539.33 482.8426 173.8071 4964.144

9 5

15221423

8

119772 17225079

1 4

11761629 14 4096858

2

15866738 4096854 17440040 8051584 13124185

8 2 1 3 3

chymo * chymo 166.1977 19263643 chymo 352.546 23397570 chymo *

8 7


Human Serum Proteins Table 2. (continued) protein name

UniScore

Gi no.

peptides

chroma trypsin * chroma trypsin 7054.038

68293

3

22035692

6

4504165 2119533

13 1

24476016

1

4504375 4826762 3337391 1335055

9 6 1 1

13195586 239718 1335098 23200172

2 2 5 18

4504489

9

28590

1

307075

4

21450665

13

21389407 20149675

6 3

preparation/sampling/protease

FOX P1 UFHUM fumarate hydratase (EC 4.2.1.2) precursor, mitochondrial human (fragment) GABA receptor-like GDNF family receptor R-1 isoform b preproprotein; glial cell lineGDNF receptor R-(RET ligand I) gelsolin (amyloidosis, Finnish type); Gelsolin [Homo sapiens] I52300 giantin - human glucosyl hydrolase - Xanthomonas axonopodis pv citri str 306 glutamine synthetase glutamyl TRNA synthetase Saccharomyces cerevisiae GP120 heavy chain related protein PAGE-LC G-protein coupled receptor [Homo sapiens] G-protein coupled receptor 64, epididymis specific GTP binding protein MX2 - Bos taurus H factor 1 (complement); H factor-1 (complement) [Homo sapiens] haptoglobin [Homo sapiens] crude CPM PAGE-LC haptoglobin-related protein precursor [Homo sapiens] heavy chain of factor I [Homo sapiens] PAGE-LC helicase II helicase LHR-related, Thermoplasma volcanium hemoglobin AF351127_1 hemoglobin R 1 globin chain [Homo sapiens] PAGE-LC hemoglobin β chain; β-globin [Homo sapiens] X& PAGE-LC hemopexin [Homo sapiens] ence AI426465 [Homo sapiens] CPM PAGE-LC chain A, crystal structure of native heparin cofactor Ii hero resistance protein 3 homologue PAGE-LC high mobility group protein highly similar to aspartate carbamoyl transferase Listeria innocua histidine-rich glycoprotein precursor; histidine-proline rich glycoprot PAGE-LC histone H2B.1 histone H4 homeo box 10 homologue to regulator of G protein signaling 10 homologue to Sec 13 related protein reading frame HSA [Homo sapiens] ] n ˜ crude CPM HSP90 homologue Candida albicans human activated protein C, chain C retinol-binding protein hypothetical ORF identified by homology hypothetical protein 548 aa long conserved - Sulfolobus tokadaii hypothetical protein Af021799 hypothetical protein DKFZp434C1717.1 hypothetical protein DKFZp434L187.1 hypothetical protein F45F2.1 - C. 
elegans hypothetical protein FLJ00012 hypothetical protein FLJ10661 hypothetical protein FLJ32745 [Homo sapiens] hypothetical protein L4325.09 hypothetical protein MGC39389 [Homo sapiens] cont hypothetical protein MGC4342 [Homo sapiens] PAGE-LC Hypothetical protein NP_640463.1 Xanthomas axonopodis pv citri hypothetical protein NZ_AAAP01003490 Magnetospirillum magnetotacticum hypothetical protein NZ_AAAV01000174 Novosphingobium aromaticivorans hypothetical protein NZ_AABA01000116 hypothetical protein NZ_AABA01000157 - Psuedomonas fluorescens hypothetical protein NZ_AABE01000004 - Cytophaga hutchinsonii hypothetical protein P214.45 hypothetical protein XP_062078 hypothetical protein XP_062258 hypothetical protein XP_063137 hypothetical protein XP_063574 hypothetical protein XP_068482 hypothetical protein XP_073810 hypothetical protein XP_089347 hypothetical protein XP_092393 hypothetical protein XP_093225 hypothetical protein XP_093254 hypothetical protein XP_093362 [Homo sapiens] X& hypothetical protein XP_093375 hypothetical protein XP_093639 hypothetical protein XP_094361 hypothetical protein XP_095555 hypothetical protein XP_097115 hypothetical protein XP_097662 hypothetical protein XP_097931 hypothetical protein XP_101014 hypothetical protein XP_104185 hypothetical protein XP_106664 hypothetical protein XP_106741

chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma

trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin chymo trypsin trypsin trypsin trypsin trypsin trypsin trypsin chymo trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin chymo trypsin

chroma trypsin chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma

* 16315.7 * 115624.3 336.2814 * * * * 1100.784 * * 42189.24 107481.2 7057.032 51059.99 * * * 1303.117 1022.363 16944.76 66305.02 * * * 112668 * * * * * 801.1822 * * 908.1013 * * * * * * * * 793.6963 * 1048.129 584.3965 * * *

trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin 456.955 27481147 trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin * trypsin *

7


Table 2. (continued) protein name

hypothetical protein XP_106754 hypothetical protein XP_107752 hypothetical protein XP_117091 hypothetical protein XP_119387 [Homo sapiens] X& hypothetical protein XP_119485 hypothetical protein XP_120939 hypothetical protein XP_121071 hypothetical protein XP_151306 hypothetical protein XP_166262 hypothetical protein XP_167403 hypothetical protein XP_169477 hypothetical protein XP_169612 hypothetical protein XP_169813 hypothetical protein XP_170264 hypothetical protein XP_174984 hypothetical protein Y50D4B.4 C. elegans ICB-1γ [Homo sapiens] IgA IgG fab fragment IgM heavy chain [Homo sapiens] IgG κ chain IgG light chain IgM IgM autoantibody heavy chain IgM hv, C µ IgM hv, C µ(107 AA) IIC Tryptic Immunoglobulins III D additional proteins identified only by chymotryptic peptides IIIB tryptic hypothetical proteins ALC1_HUMAN Ig R-1 chain C region immunoglobulin E immunoglobulin γ-1 heavy chain constant region immunoglobulin γ-4 chain C region immunoglobulin heavy chain [Homo sapiens] immunoglobulin heavy chain constant region immunoglobulin heavy chain constant region γ-1 immunoglobulin heavy chain variable region [Homo sapiens] immunoglobulin J polypeptide, linker protein for immunoglobulin R immunoglobulin κ chain NIG93 precursor PN0445 Ig κ chain precursor V-I region - human (fragment) immunoglobulin κ chain variable region [Homo sapiens] immunoglobulin κ chain variable region [Homo sapiens] C30601 Ig κ chain V-III region (Pay) - human (fragment) F30601 Ig κ chain V-III region (Neu) - human (fragment) KV4A_HUMAN IG κ chain V-IV region len immunoglobulin κ L-chain V-region immunoglobulin κ light chain [Homo sapiens] immunoglobulin κ light chain variable region [Homo sapiens] immunoglobulin κ light chain VLJ region S25741 Ig λ chain - human S25742 Ig λ chain - human immunoglobulin λ chain (Ke+O-) immunoglobulin λ light chain VLJ region [Homo sapiens] immunoglobulin light chain variable region [Homo sapiens] immunoglobulin µ chain [Homo sapiens] immunoglobulin µ heavy 
chain disease protein immunoglobulin µ heavy chain variable region [Homo sapiens] immunoglobulin V(H) gene V-D-J [Homo sapiens] nter Immunoglobulins inhibitor R 1 protease insulin-like growth factor binding protein, acid labile subunit; INSULI insulin receptor [Homo sapiens] integral membrane serine protease Seprase A36429 integrin β-4 chain precursor - human FZp434D0215.1 integrin like protein intein containing hypothetical protein inter R-trypsin inhibitor 101 kD inter R-trypsin inhibitor component II inter R-trypsin inhibitor component III hyaluronan binding protein 2; hyaluronic acid binding protein 2; hepato T46280 isocitrate dehydrogenase (NADP) (EC 1.1.1.42), cytosolic [similarity] - hum ITI H3 ITI heavy chain H1 ITI heavy chain H1 precursor immunoglobulin µ heavy chain disease protein immunoglobulin µ heavy chain variable region [Homo sapiens] immunoglobulin V(H) gene V-D-J [Homo sapiens] nter Immunoglobulins inhibitor R 1 protease insulin-like growth factor binding protein, acid labile subunit; INSULI insulin receptor [Homo sapiens]


preparation/sampling/protease

PAGE-LC PAGE-LC PAGE-LC

PAGE-LC

chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma

PAGE-LC CPM PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC

chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma

PAGE-LC chroma chroma chroma chroma chroma chroma cont chroma chroma chroma chroma

CPM

PAGE-LC PAGE-LC PAGE-LC PAGE-LC

UniScore

trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin chymo trypsin trypsin trypsin trypsin trypsin trypsin

* * * 557.2449 * * * * * * * * * * * * 8365.547 * * 439.6557 * * * * * * * * * 241195.8 * * * 3848.389 * * 32945.19 5279.296 * 124.8385 16043.65 22791.38 11636.7 4674.323 10189.83 * 2484.966 16043.65 * 54.95536 226.661 * 11021.93 28874.65 29720.06 * 8248.479 388.319 * * 17599.06 1049.706 * 5648.777 * * * * * 4288.875 2584.348

chroma trypsin * chroma trypsin * trypsin * chroma trypsin * chroma trypsin 8248.479 chroma trypsin 388.319 trypsin * chroma trypsin * chroma trypsin 17599.06 chroma trypsin 1049.706

Gi no.

peptides

20532845

3

13359161

3

1699441

1

113584

7

10334541

3

11137372 21489959

1 3

418844 15722775 10637404 106608 106607 1730075

1 1 1 1 1 1

1561606 18041836

1 2

106644 106643

2 3

21669631 3328006 184728

1 2 4

4995354 1359764

1 3

4826772 914086

7 2

2119645

26

4758502 11374664

6 5

4995354 1359764

1 3

4826772 914086

7 2

Table 2. (continued)

protein name | preparation/sampling/protease | UniScore | Gi no. | peptides
integral membrane serine protease Seprase | chroma trypsin | *
A36429 integrin β-4 chain precursor - human FZp434D0215.1 | chroma chymo | 5648.777 | 2119645 | 26
integrin like protein | chroma trypsin | *
intein containing hypothetical protein | cont trypsin | *
inter α-trypsin inhibitor 101 kD | | *
inter α-trypsin inhibitor component II | chroma trypsin | *
inter α-trypsin inhibitor component III | CPM chroma trypsin | *
hyaluronan binding protein 2; hyaluronic acid binding protein 2; hepato | chroma trypsin | 4288.875 | 4758502 | 6
T46280 isocitrate dehydrogenase (NADP) (EC 1.1.1.42), cytosolic [similarity] - hum | chroma trypsin | 2584.348 | 11374664 | 5
ITI H3 | chroma trypsin | *
ITI heavy chain H1 | PAGE-LC chroma trypsin | *
ITI heavy chain H1 precursor | PAGE-LC trypsin | *
ITH2_HUMAN Inter-α-trypsin inhibitor heavy chain H2 precursor (ITI heavy ch | PAGE-LC chroma trypsin | 1427060 | 125000 | 27
ITI heavy chain H2 precursor | PAGE-LC trypsin | *
ITI heavy chain H3 precursor | PAGE-LC trypsin | *
ITI heavy chain H4 (plasma kallikrein sensitive glycoprotein) | PAGE-LC chroma trypsin | *
ITH4_HUMAN inter-α-trypsin inhibitor heavy chain H4 precursor (ITI heavy | PAGE-LC chroma trypsin | 612449.9 | 13432192 | 15
keratin | chroma trypsin | *
keratin 1; keratin-1; cytokeratin 1; hair α protein [Homo sapiens] | crude cont PAGE-LC trypsin | 751029.2 | 17318569 | 22
keratin 10 (epidermolytic hyperkeratosis; keratosis palmaris et plantari | cont PAGE-LC trypsin | 48791.13 | 21961605 | 5
keratin 14; cytokeratin 14 [Homo sapiens] | cont PAGE-LC trypsin | 67754.41 | 15431310 | 5
keratin 2a [Homo sapiens] | cont PAGE-LC trypsin | 60417.65 | 4557703 | 11
keratin 5 (epidermolysis bullosa simplex, Dowling-Meara/Kobner/Weber-Coc | cont trypsin | 5454.786 | 18999435 | 10
keratin 6 isoform K6e [Homo sapiens] | cont trypsin | 4263.399 | 27465517 | 9
keratin type 2 cytoskeletal 1 | crude PAGE-LC trypsin | *
keratin 4; keratin-4; cytokeratin 4; keratin, type II cytoskeletal 4 | crude PAGE-LC trypsin | 11402.3 | 17318574 | 8
KIAA0563 | chroma trypsin | *
KIAA1092 protein [Homo sapiens] frame 44 | chroma trypsin | 745.6939 | 5689521 | 6
KIAA1843 protein [Homo sapiens] | chroma trypsin | 4071.433 | 20521990 | 16
kininogen [Homo sapiens] | PAGE-LC chroma trypsin | 11168.53 | 4504893 | 3
L8019.5 Leishmania major | chroma trypsin | *
λ HuHITI-13 | chroma trypsin | *
Laminin B receptor - Mus musculus | chroma trypsin | *
latrophilin-2 [Homo sapiens] | PAGE-LC trypsin | 4714.334 | 6273483 | 1
Len Bence Jones protein | chroma trypsin | *
leucine-rich repeat-containing G protein-coupled receptor 7 [Homo sapiens] | chroma trypsin | 8774.37 | 11056008 | 9
lifeguard | chroma trypsin | *
lipopolysaccharide binding protein [Homo sapiens] | chroma trypsin | 2502.91 | 18490598 | 4
lipoprotein C1 | chroma trypsin | *
lipoprotein CIII | chroma trypsin | *
lipoprotein Gln I | chroma trypsin | *
LSU ribosomal protein L6P Archaeoglobus fulgidus | chroma trypsin | *
lumican [Homo sapiens] | chroma trypsin | 377143 | 4505047 | 11
lymphocyte antigen 64 homolog, radioprotective 105kDa; Lymphocyte antigen | chroma trypsin | 3198.053 | 5031895 | 3
lysophospholipase | chroma trypsin | *
macropain subunit | chroma trypsin | *
macrophage capping protein Cap G | chroma trypsin | *
magnesium-dependent phosphatase 1 | chroma trypsin | *
mannitol-1-phosphate 5 dehydrogenase | chroma trypsin | *
mannose binding protein | chroma trypsin | *
mannose-6-phosphate/insulin like growth factor II receptor | chroma trypsin | *
MASP-1 mannose binding protein associated serine protease | chroma trypsin | *
MCM10 homologue [Homo sapiens] ated protein [Homo sapiens] | chroma trypsin | 1203.843 | 11527602 | 18
bG174L6.2 (MSF: megakaryocyte stimulating factor) [Homo sapiens] | chroma trypsin | 1438.458 | 13559026 | 21
meiotic checkpoint regulator | chroma trypsin | *
methionine synthetase I | chroma trypsin | *
MGC:33321 | chroma trypsin | *
hypothetical protein MGC4701 [Homo sapiens] | chroma trypsin | 10418.05 | 24308291 | 11
CFAB_HUMAN complement factor B precursor (C3/C5 convertase) (Properdin factor B | chroma trypsin | 704721.1 | 584908 | 16
microtubule associated protein homologue Drosophila melanogaster | chroma trypsin | *
microtubule associated proteins 1A/1B light chain 3 | chroma trypsin | *
microtubule vesicle-linker (CLIP-170) | chroma trypsin | *
mitochondrial L5 protein | cont trypsin | *
AF295356_1 moesin/anaplastic lymphoma kinase fusion protein [Homo sapiens] | chroma trypsin | 977.5149 | 14625824 | 3
Murinoglobin - Cavia porcellus | chroma trypsin | *
mutant keratin 9 | PAGE-LC trypsin | *
myosin XV; unconventional myosin-15 [Homo sapiens] | chroma trypsin | 2440.925 | 22547229 | 32
similar to myosin heavy chain, nonmuscle type B (Cellular myosin heavy | PAGE-LC trypsin | 2570.986 | 27482786 | 35
N utilization substance protein A - Yersinia pestis | chroma trypsin | *
Na,K-ATPase α-4 subunit [Homo sapiens] | PAGE-LC chroma chymo | 120.1657 | 17149816 | 1
NUAM_HUMAN NADH-ubiquinone oxidoreductase 75 kDa subunit, mitochondrial precurs | chroma trypsin | 3570.513 | 128826 | 12
natriuretic peptide receptor A/guanylate cyclase A (atrionatriuretic pe | chroma chymo | 113.5465 | 4505435 | 9
nebulin [Homo sapiens] | PAGE-LC trypsin | 9151.139 | 4758794 | 118
neurogenic differentiation 2 | chroma trypsin | *
neurotrophin 3 | chroma trypsin | *
neutrophil activating peptide-2 | chroma trypsin | *


Table 2. (continued) protein name

NPAT [Homo sapiens] nuclear protein gi 1351640 OATP-E oxygenase homolog paraoxonase paraoxonase 1 paraoxonase/arylesterase [Homo sapiens] PECAM-1 (CD31 antigen) peptidoglycan recognition protein L precursor [Homo sapiens] PF4-derived endothelial cell growth inhibitor peak II PfS230 (predicted secreted protein) Pheromone shutdown protein tra B Borrelia burdorferi phosphate ABC transporter, ATP-binding protein phosphate regulating gene with homologies to endopeptidases on the X c phosphatidylserine-specific phospholipase A1R [Homo sapiens] phycobilisome rod-core linker protein - Nostoc, Anabaena spp. pigment epithelium-derived factor [Homo sapiens] X& crude PK-120 precursor plasma kallikrein B1 precursor; Kallikrein, plasma; kallikrein 3, plasm plasma protein s vitamin k dependent - Rhesus macaque Plasma retinol-binding protein precursor plasminogen [Homo sapiens] platelet-activating factor acetylhydrolase, isoform Ib, R subunit ( polyunsaturated fat synthase subunit C - Schizochytrium sp. possible G-protein receptor possible NADH-dependent butanol-dehydrogenase 2 - Bacillus subtilis potassium voltage-gated channel, Shal-related subfamily, member 3 isof prealbumin [Homo sapiens] precollagen D AF116721_45 PRO1708 [Homo sapiens] antigen 6/11 (ME PRO2619 pro-apolipoprotein probable ATPase AF112207_1 translation initiation factor eIF-2b delta subunit [Homo sapien proliferating cell nuclear antigen [Homo sapiens] pro-melanin concentrating protein hormone-like 1 protein pro-platelet basic protein (includes platelet basic protein, bet; Pro-p protein C (inactivator of coagulation factors Va and VIIIa) [Homo sapie protein kinase C, epsilon protein phosphate inhibitor 2 protein tyrosine phosphatase receptor pi [Homo sapiens] protein Z dependent protease inhibitor prothrombin [Homo sapiens] apiens] Homo sapiens] [ putative ATP-binding component of dipeptide transport system - E. 
coli putative cytochrome P450 putative DTDP-6-deoxy-L-mannose-dehydrogenase homolog FXO4_HUMAN Putative fork head domain transcription factor AFX1 (Forkhead box) putative NADH dehydrogenase - Levenhookia leptantha putative protein At5 g27270.1 - Arabidopsis thaliana putative ras-related protein ARA-1 Arabidopsis thaliana PWWP domain protein - Arabidopsis thaliana pyruvate phosphate dikinase - Entamoeba histolytica RAB ras oncogene family like RAB5 interacting protein 3 RAC family serine/threonine kinase homologue RAD54 homologue RAD54B homologue isoform 1; RAD54, S. cerevisiae, homologue of, B [Homo sapiens] RAN-GTPASE activating protein FK506 binding protein 12-rapamycin associated protein 1; FK506 binding ras-responsive element binding transcription factor (RREB-1 homologue) recombinant IgG2 heavy chain reduced folate carrier regulator of chromosome condensation motifs replication protein E1 - human papillomavirus restriction modification system S chain homolog reticulocyte binding protein 2 homologue B [Plasmodium falciparum] reverse transcriptase - Drosophila teissieri AF268032_1 rhophilin-like protein [Homo sapiens] X& RIKEN cDNA 2210010C04 [Mus musculus] RNA polymerase sigma factor Xanthomonas axonopodis pv citir S20250 splicing factor U2Af large chain AF380577_1 SAM-dependent methyltransferase [Homo sapiens] sarcosine oxidase, subunit R-related (soxA) scaffold attachment factor B [Homo sapiens] 1, supp similar to SEC13 (S. cerevisiae)-like 1; SEC13-related protein [Homo sapiens] sec6 homologue [Homo sapiens] in [Homo sapiens] [Mus musculus] SecA homologue - Pisum sativa selenoprotein P [Homo sapiens]


UniScore

Gi no.

peptides

chymo trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin

1948.251 * * * * * 108390.9 * 8337.834 * * * * 4074.073 804.2936 * 18275.51 * 8290.075 * * 96265.9 1498.229 * * * 3450.682 455684 * 2718688 * * * 3082.569 613.2354 * 30092.37 281077.9 * * 358.9678 * 103342.9 * * * 7704.836

1304114

17

298532

4

21361845

5

10937867 7706661

11 2

1144299

7

4504877

7

4505881 4557741

8 7

27436984 219978

4 2

7959791

20

6563202 4505641

7 3

4505981 4506115

6 2

2351576

10

1335344

12

27923975

4

trypsin trypsin trypsin trypsin trypsin chymo chymo trypsin trypsin trypsin

* * * * * * * * * 3964.096 6912622

11

preparation/sampling/protease

chroma chroma chroma chroma chroma PAGE-LC PAGE-LC PAGE-LC

chroma chroma chroma chroma chroma

PAGE-LC

CPM PAGE-LC

PAGE-LC PAGE-LC

chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma

PAGE-LC

PAGE-LC

chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma

PAGE-LC

PAGE-LC

PAGE-LC

chroma trypsin chroma trypsin chroma trypsin chroma trypsin chroma trypsin chroma trypsin chroma trypsin chroma trypsin chroma chymo/ trypsin chroma trypsin chroma trypsin chroma trypsin chroma chymo chroma chymo chroma chymo trypsin chroma trypsin chroma trypsin

* 790.1774 3214.04 * * 140.2258 * 2407.15 2525.289

chroma chroma chroma

trypsin trypsin trypsin

* 2558.017 4826730 * * * * * * *

16

14279409 12963645

3 2

16417154

4

1213639 27712890

21 3

2458.865 3005727 * 9265.162 2654365

5 4

Table 2. (continued) protein name

serine (or cysteine) proteinase inhibitor, clade A (R-1 antiprotein AAH11171 serine (or cysteine) proteinase inhibitor, clade G (C1 inhibitor similar to serine (or cysteine) proteinase inhibitor, clade C (antithrom serine rich protein homolog similar to serine-arginine repressor protein (35 kDa) [Homo sapiens] [ serpin R-1 protease inhibitor serum albumin [Homo sapiens] iens] n ˜ serum amyloid A4, constitutive; C-SAA [Homo sapiens] serum amyloid A4, constitutive; C-SAA [Homo sapiens] serum ion transport protein similar to ADAMTS18 protein AC004836_1 similar to cadherin and Drosophila fat protein; similar to CAA6 Similar to cardiolipin synthase Bacilus subtilis similar to CED-4 similar to CG12056 gene product Similar to C-reactive protein Listeria monocytogenes similar to disintegrin-like mettalloprotease with TSP-1 motif similar to DJ568C11.2 similar to DNA-binding protein Spo0J-like similar to dynein light chain 2 TCTEX2 similar to hypothetical protein FLJ21562 [Homo sapiens] [Rattus norvegicus] hypothetical protein FLJ31614 [Homo sapiens] similar to γ-tubulin complex component 3 [Rattus norvegicus] similar to heavy metal-transporting ATPase [Bacillus subtilis] similar to hemicentrin similar to histone H2A [Homo sapiens] [Mus musculus] similar to hypothetical protein FLJ10408 [Homo sapiens] similar to Ig γ-2 chain C region similar to immunoglobulin heavy constant γ-3 similar to keratin 6 irs4 [Homo sapiens] [Mus musculus] AAH14152 Similar to keratin 6A [Homo sapiens] similar to keratin 8; Keratin-8 [Homo sapiens] similar to keritin 5 - Mus musculus similar to KIAA0738 gene product [Homo sapiens] [Rattus norvegicus] similar to KIAA0793 gene product [Homo sapiens] [Rattus norvegicus] similar to KIAA1635 protein [Homo sapiens] [Rattus norvegicus] similar to LDL induced EC protein similar to nucleosomal binding protein 1 similar to pancreatic elastase similar to peripheral benzodiazepine receptor associated protein 1 similar to phosphatidyl serine decarboxylase 
similar to PTS system fructose-specific IIA component similar to putative [Homo sapiens] roteasome 26S s similar to RAS-like, family 2, isoform 9 similar to rat myomegalin similar to ribosomal protein S12; 40S ribosomal protein S12 [Homo sapiens] similar to RIKEN cDNA 1300014I06 gene [Homo sapiens] similar to RIKEN cDNA 4930424G05 [Mus musculus] [Homo sapiens] similar to ring finger B-box similar to serine (or cysteine) proteinase inhibitor, clade A (R-1 similar to serine/threonine kinase similar to serine/threonine kinase 36 similar to serum albumin precursor synaptotagmin-like 2 isoform a; chromosome 11 synaptotagmin [Homo sapiens] similar to tetranectin precursor Similar to transcriptional regulator (GntR family) Bacillus subtilis TRIO_HUMAN triple functional domain protein (PTPRF interacting protein) similar to vitronectin - Mus musculus similar to XP_121071 sirtuin (silent mating type information regulation 2) SLP 76 tryosine phosphoprotein SM70 antigen Schistomsoma mansoni small protein A homolog SMC2 structural maintenance of chromosomes 2-like 1; structural mainten Sp4 transcription factor spastic ataxia of Charlevoix-Saguenay (sacsin) [Homo sapiens] SPI2 protein - Picea abies S-protein AF449428_1 SRrp35 [Homo sapiens] sulfide oxidase SWI/SNF chromatin remodeling complex subunit OSA2 T05255 targeted effector protein yopP TBX6_HUMAN T-box transcription factor TBX6 (T-box protein 6) tetranectin (plasminogen binding protein) tetranectin (plasminogen binding protein) prothrombin [Homo sapiens] apiens] Homo sapiens]

UniScore

Gi no.

peptides

trypsin chymo trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin chymo

5988.807 215237.7 15040.9 * 552.9138 * 5607.842 70358.3 67829 * * 944.9569

5453896 15029894 18490839

8 15 5

27714289

3

28592 10835095 10835095

3 4 4

4699969

1

chymo trypsin trypsin chymo trypsin trypsin trypsin trypsin chymo/ trypsin chymo trypsin trypsin chymo trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin chymo trypsin trypsin trypsin trypsin

* * * * * * * * 10335.12 27702719

6

461.2036 * * * 463.3977 418.5722 * * 274617.1 7163.126 133166.7 * 2937.453 340.6394 2605.818 * * * * * * 2542.402 * * 2538.34

22749181

4

20848444 27478299

3 3

20904305 15559584 27483752

6 1 3

27710096 27686369 27691326

11 3 8

20380883

6

27708262

1

trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin

3781.568 7383.858 * 3075.166 * * * 11120.79

28374392 27484244

2 1

27479577

5

15011902

9

chroma chroma chroma

trypsin chymo trypsin

* * 8753.27

8928460

38

chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma

trypsin trypsin trypsin trypsin trypsin trypsin trypsin chymo trypsin trypsin trypsin trypsin trypsin trypsin chymo trypsin trypsin trypsin trypsin trypsin

* * * * * * 9745.217 * 13669.52 * * 793.3724 * * * * 27099.06 23930.28 26268.72 126950

5453591

24

7657534

45

18034491

3

6094434 4507557 4507557 1335344

5 6 6 12

preparation/sampling/protease

PAGE-LC chroma PAGE-LC

CPM PAGE-LC

chroma chroma chroma chroma chroma

PAGE-LC chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma PAGE-LC chroma chroma PAGE-LC PAGE-LC

PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC

chroma chroma chroma chroma chroma

chroma chroma chroma PAGE-LC chroma chroma chroma chroma chroma chroma chroma chroma chroma cont chroma PAGE-LC chroma PAGE-LC chroma PAGE-LC chroma PAGE-LC

PAGE-LC chroma PAGE-LC chroma chroma

Journal of Proteome Research • Vol. 3, No. 3, 2004 375

research articles

Marshall et al.

Table 2. (continued) protein name

preparation/sampling/protease

TSP1•HUMAN Thrombospondin 1 precursor Tissue factor TonB-dependent receptor A49985 transaldolase (EC 2.2.1.2) - human transcription elongation factor B SIII transcription factor ICBP90 transcription termination factor (RNA pol II) transferrin [Homo sapiens] Transposase homologue B Heliobacter pylori transthyretin; TTR [Homo sapiens] n ˜ trigger transposable element homolog, Mus musculus troponin T2, cardiac neuronal tryptophan hydroxylase [Homo sapiens] tumor protein p53 binding protein, 1; tumor protein 53-binding protein, type 2 sretion system protein type II secretion system protein ubiquitin protease 1 ubiquitin-activating enzyme E1C similar to ubiquitin-like 5 [Homo sapiens] [Mus musculus] UDP-glucose 4-epimerase UDP-glucuronosyltransferase [Homo sapiens] ns] UDP-N-acetyl-D-mannosaminuronic acid dehydrogenase UL16 binding protein 3 unamed protein product BAB15362.1 unamed protein product MGC:39273 unknown (protein for IMAGE:4792618) unknown (protein for MGC:14588) unknown (protein for MGC:39273) [Homo sapiens] gme unknown (protein for MGC:9478) unknown gi13543597 unknown protein AF217999 AF217999_1 unknown [Homo sapiens] n ˜ unknown protein IMAGE:3533309 unknown protein MGC: 29484 unknown (protein for MGC:26123) [Homo sapiens] unknown protein product mass)42974 unnamed protein product gi 10434659 unnamed protein product gi 16549862 unnamed protein product gi16553682 unnamed protein product gi21755409 unnamed protein product gi22761175 vasopressin-activated calcium-mobilizing receptor-1; cullin-5 (vasopres group-specific component (vitamin D binding protein); hDBP [Homo sapiens] vitamin K-dependent protein S SGHU1V vitronectin precursor [validated] - human 014641 [Homo sapiens] SGHU1V vitronectin precursor [validated] - human 014641 [Homo sapiens] chain A, human Von Willebrand factor A3 domain VWF pre-pro-polypeptide (-22 to 2791) [Homo sapiens] WD repeat domain 7 protein isoform 1; TGF-β resistance associated g xanthomonadin biosynthesis related protein yeast 
GCN protein kinase activator homologue Schizosaccharomyces pombe R-2-glycoprotein 1, zinc; R-2-glycoprotein, zinc [Homo sapiens] similar to zinc-finger protein DZIP1 [Homo sapiens] [Rattus norvegicus zonadhesin splice variant 1 [Homo sapiens] ein 1;

PAGE-LC PAGE-LC CPM

chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma chroma

PAGE-LC chroma PAGE-LC chroma chroma chroma PAGE-LC chroma chroma chroma chroma chroma PAGE-LC PAGE-LC PAGE-LC PAGE-LC PAGE-LC

chroma

chroma chroma chroma chroma chroma cont chroma chroma PAGE-LC PAGE-LC PAGE-LC PAGE-LC

chroma chroma chroma chroma chroma

PAGE-LC chroma PAGE-LC chroma chroma chroma PAGE-LC PAGE-LC chroma

trypsin chymo trypsin trypsin trypsin trypsin trypsin trypsin chymo trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin chymo chymo trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin trypsin chymo trypsin trypsin trypsin

UniScore

Gi no.

peptides

3330.025 * * 434.3289 * * * 599319.3 * 308568.4 * * 59.60057 361.3617 * * * * 2742.132 * 95.8261 * * * * * * 604589.6 * * * 260.019 * * 511.4797 * * * * * * 2438.297 85668.9 * 233891.7 216738.6 5570.525 22259.8 2622.717 * * 2472.911 3395.105 4983.137

135717

13

1082840

1

4557871

18

1336728

2

27497159 5032189

5 8

20919929

1

7690346

2

18999465

8

10441928

1

21040475

7

4503167 9845255

30 13

72146 72146 2982053 37947 16579890

9 9 2 14 11

4502337 27721341 15721995

11 11 15

a

Proteins identified directly from sera are listed with the descriptor crude. Additionally, proteins were prepared by chromatography followed by analytical PAGE and identification by MALDI-Qq-TOF (CPM). In a separate experiment, proteins were identified from the 16 CBBR-stained bands of crude serum resolved by preparative SDS-PAGE, as illustrated in Figure 2, followed by trypsin digestion and analysis of the resulting peptides by 1D-LC-ESI-ION TRAP with a SEQUEST search (PAGE-LC). Naturally occurring peptides, i.e., no enzyme, were detected by off-line analytical C18 reversed-phase chromatography onto stainless steel metal plates prior to MALDI-TOF (NEOLM). Proteins were identified from human serum by gel-free chromatography of intact proteins with subsequent digestion by trypsin or chymotrypsin and 2D-LC fractionation of the resulting peptides by HiS chromatography prior to reversed-phase C18 LC-ESI-ION TRAP (chroma). In addition, the proteins found in laboratory dust were listed as controls (cont). The use of trypsin or chymotrypsin is indicated by (trypsin) or (chymo). All proteins (/) were identified by at least one significant peptide (X-CORR: +1/1.9, +2/2.5, +3/3.75) by SEQUEST or MASCOT (p < 0.05) against the NR database. Additionally, where the name of a protein with a significant X-CORR value agreed with analysis of the human database by BIOWORKS, the unified score generated, the Gi number, and the number of unique types of peptides obtained are shown. Note that the significant SEQUEST or MASCOT scores (*) that were obtained for every protein listed were sometimes replaced with Uniscores from BIOWORKS analysis of a limited subset of samples, and that not all proteins show significant Uniscores.
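The charge-dependent acceptance rule stated in the footnote (X-CORR: +1/1.9, +2/2.5, +3/3.75) can be sketched in a few lines of Python. This is a minimal illustration, not the SEQUEST or BIOWORKS implementation: the `PeptideHit` record and its field names are hypothetical, and treating the thresholds as inclusive (greater than or equal) is an assumption.

```python
from dataclasses import dataclass

# Minimum X-CORR per precursor charge state, as given in the Table 2 footnote.
XCORR_MIN = {1: 1.9, 2: 2.5, 3: 3.75}


@dataclass
class PeptideHit:
    """Hypothetical record for one peptide-spectrum match (not a SEQUEST format)."""
    protein: str
    sequence: str
    charge: int
    xcorr: float


def is_significant(hit):
    """True if the hit meets the threshold for its charge state.

    Charge states without a stated threshold are rejected (an assumption).
    """
    threshold = XCORR_MIN.get(hit.charge)
    return threshold is not None and hit.xcorr >= threshold


def proteins_with_significant_peptide(hits):
    """Names of proteins supported by at least one significant peptide."""
    return {hit.protein for hit in hits if is_significant(hit)}
```

A protein is retained here as soon as one of its peptides passes, which mirrors the "at least one significant peptide" wording of the footnote.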

graphic separation of whole proteins with subsequent 2D-LC separation of the enzymatic peptides, which were identified by MS/MS with the ION TRAP. The fractions derived from chromatography of intact proteins were digested with trypsin or chymotrypsin and the resulting peptides separated by HiS columns. The fractions from the HiS columns were then resolved by C18 LC-ESI-ION TRAP. The resulting list of peptide assignments from the combined 2D-LC runs apparently showed significant overlap of ions between adjacent fractions and good reproducibility, in agreement with previous results.14 We found that sample preparation at the level of peptides from digested proteins increased the capacity of the subsequent LC-ESI-ION TRAP experiment to identify uncommon proteins. A number of uncommon serum proteins (for example, unknown protein #af021799 above) were detected only after 2D-LC separation of the peptides prior to ION TRAP. A post-hoc search of protein databases for af021799, made after its MS/MS spectrum was selected for illustration, yielded an interleukin 17 receptor-like mRNA. In contrast, analysis of crude serum yielded the identification of only a few commonly known serum proteins.


Figure 3. MALDI-TOF analysis of serum polypeptides (i.e., without enzymatic digestion) collected over C18 reversed-phase resin prior to MALDI-TOF, as compared to crude sera. The polypeptides were spotted on gold MALDI-TOF chips, matrixed with CHCA, and analyzed at a laser intensity setting of 210 and a sensitivity of 7 on a Ciphergen PBS II. The same peptide set, labeled C18, was also resolved by off-line LC-MALDI-Qq-TOF, and the sources of the peptide families identified are listed under the heading NEOLM in Table 2.

Figure 4. Identification of naturally occurring peptides from human serum (see Figure 3) by MALDI-Qq-TOF. Peptides from human serum were prefractionated by analytical reversed-phase C18 separation prior to spotting on a stainless steel target, matrixed with DHB, and analyzed by MALDI-Qq-TOF. The typical identification of a serum peptide is illustrated by an apparent peptide from an androgen-induced protein.

Especially in the case of PAGE, but also with gel-less protein identification, keratins and other potential contaminants such as skin or hair proteins were detected. For example, calmodulin-like skin protein was potentially suspect. These may well constitute a part of the legitimate serum proteome. However, there is no way to determine which keratin sequences were truly found in blood and which were the result of dust contamination during preparation. Hence, we recorded the MS and MS/MS spectra of buffers that had no serum proteins, and of laboratory dust examined directly, using trypsin digestion followed by 1D LC-ESI-ION TRAP, and found keratins 1, 2a, 5, 6a, 10, and 14 and cytokeratin 9, as well as unknown protein product mass = 42 974, intein-containing hypothetical protein, similar to hypothetical protein MGC39389, catalase interacting protein, similar to ribosomal protein, and mitochondrial L5 protein (Table 2). We suggest that these proteins were apparently laboratory contaminants and should be considered with caution.

MS/MS spectra from the gel-free experiments described above were searched against a nonredundant library of enzymatic peptides, and more than 500 protein products with X-correlations of ≥ 2.5 (2+)/3.75 (3+) were observed (Table 2). When considered together with the proteins identified by SDS-PAGE-based approaches and low-mass MALDI-Qq-TOF peptides, we found over six hundred types of high-scoring protein products in serum using rudimentary chromatographic preseparation. The FASTA headers from the resulting list of tentative assignments were then analyzed for species. We observed that about 99% of the significant peptides identified here were assigned to known human serum proteins. When analyzed at the level of named proteins, we found that about 90% of the proteins were from the Kingdom Animalia (Figure 8). In terms of tentative function, a considerable number of the proteins named were previously observed to be common components of sera. A substantial fraction of the total were also computer-assigned to hypothetical proteins or open reading frames (ORFs). An ORF is merely a sizable stretch of genomic DNA that in one reading frame shows no stop codon and therefore might be a coding sequence. Peptides assigned to hypothetical proteins of humans and/or other species, while potentially important, will require close scrutiny to determine whether they reflect real additions to the serum proteome. Of the remaining named proteins, the assignments were distributed among enzymes, transporters, nuclear proteins, membrane proteins, receptors, regulatory proteins, proteins that contained known protein-protein interaction domains,53,54 and others, with a notably small fraction of the total assigned to kinases or phosphatases.
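The species analysis of FASTA headers described above can be illustrated with a short Python sketch. It is not the procedure the authors used. It assumes that the organism appears as a bracketed Latin binomial such as `[Homo sapiens]`, and that when several bracketed names are present the last one names the source organism. Headers written in other styles (for example "- Picea abies") are simply counted as unassigned. The function names are hypothetical.

```python
import re
from collections import Counter

# A bracketed Latin binomial, e.g. "[Homo sapiens]" (assumed header style).
BRACKETED_BINOMIAL = re.compile(r"\[([A-Z][a-z]+ [a-z]+)\]")


def species_of(header):
    """Return the last bracketed binomial in a header, or None if absent."""
    names = BRACKETED_BINOMIAL.findall(header)
    return names[-1] if names else None


def tally_species(headers):
    """Count headers per species; headers without a binomial are 'unassigned'."""
    counts = Counter()
    for header in headers:
        counts[species_of(header) or "unassigned"] += 1
    return counts
```

A tally of this kind gives species-level counts only. Grouping species into kingdoms or phyla, as in Figure 8, would need an additional taxonomy lookup that is not sketched here.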

Discussion

Of all the proteomes, the proteins of the blood are perhaps of the greatest biological, medicinal, and economic importance. The identification of the serum proteins presents a task of significant technical difficulty.2,12,13 Immunoglobulins aside, serum may represent one of the most complex sets of proteins, with perhaps 1 million different molecules spread over a million-fold range of physiological concentrations, but, infamously, it contains a small set of highly abundant proteins. If we assume the genome expresses 100 000 splice variants with an average of 10 post-translational or processing events per molecule, then there may be one million molecules in blood. If HSA is in the millimolar range and if most physiological interactions have Km values of no less than 10⁻¹⁰, then there is no physiological reason, besides tissue leakage, to consider much more than a million-fold range of concentrations. Two-dimensional electrophoresis of crude serum is severely limited by the presence of albumin. In anticipation of comparative proteomics, we need sample preparation techniques that are simple and not composed of multiple steps in series that by nature will not be easily reproduced. For this reason, we show that preparative size-fractionation of crude serum on tricine gels, or preparation of serum by DEAE blue chromatography followed by 2D-LC, can enumerate hundreds of proteins by MS/MS analysis. In contrast, the failure to sharply resolve albumin by standard 2D electrophoresis prevents identification of anything other than multiple forms of common serum proteins by 2D gels.

Figure 5. Total ion chromatograms (TIC) of reversed-phase LC-ESI-ION TRAP runs from the trypsin-digested proteins after the initial collection of the intact proteins over DEAE-B chromatography resin, as compared to crude sera. The peptides were resolved in a 15 cm × 0.3 mm i.d. C18 reversed-phase column at a flow rate of 2 µL per minute into a Finnigan XP-100 LCQ.

At present, commercially available CID instruments may enumerate thousands of proteins in a study,15 and experimental FTICR instruments may identify tens of thousands of peptides.30 MS/MS spectra from CID experiments will be commonly used to assign probable identity to peptides for at least the near future.55 Of the presently practical CID methods available to identify proteins, the standard for accuracy remains the Qq-TOF,29 and the introduction of the MALDI interface presents a major advance in the elegance and simplicity of the device,21 especially for naturally occurring peptides of normal serum. As previously demonstrated in yeast,49 we found that the present workhorse of proteomics, the LC-ESI-ION TRAP,14,22,43 was clearly in agreement with the standard MALDI-Qq-TOF, and with the literature,2 with respect to the major proteins of serum. Hence, there is good reason to consider the validity of the many low abundance proteins identified here by ION TRAP. At present, the convenience and sensitivity of online LC-ESI-ION TRAP make this system preferable for the mass spectral analysis of complex samples.15 However, the anticipated development of convenient tools for off-line LC separation of peptides prior to MALDI-based analysis may soon present a complementary approach.

Figure 6. Typical MS and MS/MS spectra from the LC-ESI-ION TRAP. A significant peptide as scored by SEQUEST was randomly selected and located in the appropriate TIC trace (see Figure 5). Subsequently, the next several consecutive spectra that met the minimum X-Correlation and delta-Correlation criteria of 2.5 (2+)/3.75 (3+) were used to illustrate the typical quality of the raw data that informed the MS/MS identification of peptides in this paper.

Direct analysis of serum by PAGE or LC-MS yielded tandem mass spectra of fewer than 30 proteins. In sharp contrast, direct analysis of rice proteins by 2D LC-MS yielded tandem mass spectra of thousands of proteins.15 The major conclusion of this study is that prefractionating sera to enrich low-abundance proteins, and carefully preparing the samples for mass spectral analysis,3 were the key steps that governed the progress made in elucidating the human serum proteome. For example, we found that chromatographic prefractionation of both intact proteins and digested peptides was required to identify low abundance proteins. Similarly, analysis of SDS-PAGE bands by reversed-phase separation of the resulting peptides significantly increased sensitivity for low abundance molecules. The effect of these rudimentary preseparation techniques in revealing low abundance sample components almost certainly results from releasing the suppression of minor peptides during the competitive MALDI and ESI ionization reactions. We found that PAGE-LC and gel-less chromatography were complementary and that each method revealed some unique proteins. Hence, in serum, the MALDI-Qq-TOF21 and metal-needle ESI-ION TRAP46 employed here were still limited by the merits of the sample preparation strategy. We found significant overlap and agreement with proteins identified by gel-free 2D-LC ION TRAP searched under a more flexible scoring regime.12

Figure 7. Assignment of X and Y ions to MS/MS spectra of a common and an uncommon serum protein by SEQUEST. (A) MS/MS fragmentation pattern of a parent ion from Apo B-100; (B) MS/MS fragmentation pattern of a parent ion from unknown protein af021799. After af021799 was selected to illustrate a typical unknown, low-abundance serum protein, we subsequently determined that this protein shows similarity to the interleukin 17 receptor.

The visualization and identification of naturally occurring low molecular mass polypeptides in serum is significant given the recent intense interest in using MALDI-TOF spectra of sera to fingerprint, i.e., to phenotype, human disease in order to find mass spectral biomarkers.37,56,57 Previous workers have used retentate chromatography, where the sample was adsorbed to chromatographic surfaces45 or adsorbed onto membranes that served as the MALDI target surface.58 The partition chromatography technique used here to prepare samples for MALDI-TOF apparently showed a high analyte complexity and reasonably useful peak shape and symmetry. The peptide fingerprints of the sera after partition chromatography also showed reproducible selectivity and high signal-to-noise ratios. To profile disease, there must be a large amount of useful information, i.e., a high signal-to-noise ratio, contained in the MALDI spectrum. The preparation of samples for MALDI profiling by partition chromatography resulted in spectra with a diverse array of analytes, a high information content, and sensitive detection of serum polypeptides. We sequenced the low molecular mass endogenous peptides of less than 3 kDa by LC-MALDI-Qq-TOF and found, in agreement with previous results, that this technique will likely prove an elegant approach to the identification of peptides.51,52 Again, the utility of this powerful new mass spectral technology was dependent on sample preparation, and no sequences were obtained from crude serum.

Figure 8. Kingdom and phylum assignments of the proteins identified by LC-ESI-ION TRAP. The peptides with X-correlation scores of ≥ 2.5 (2+) and ≥ 3.75 (3+) were collected, and the species of each protein identified was determined from the FASTA header. Each entry in Table 2 was classified as mammalian versus nonmammalian based on the Latin binomial.

After combining the significant peptides resulting from all the methods described above, we apparently observed peptides from some six hundred protein products using the computer searches described in the Materials and Methods. The peptides listed in Table 2 met or exceeded the previously established criteria for the correlation of tandem mass spectra.26,27 Given the relatively limited number of human genes and mRNA splice variants, perhaps 30 000 and 100 000, respectively,59-62 it remains likely that many of these assignments were correct. This conclusion was strongly supported by the fact that the overwhelming bulk of significant scores were mapped to known human serum proteins. However, some of the assignments were to apparently uncommon serum proteins or to proteins from which only a single significant peptide was recorded (typically detected in multiple measurements and/or experiments). We consulted the literature post hoc and found that some of the low abundance proteins identified by only a single or a few peptides have been previously demonstrated in blood, including angiotensinogen, carboxypeptidase N, connective tissue activating peptide, gelsolin, laminin-R, mannose binding protein, N-acetyl glucosaminidase, hyaluronan binding proteins, tetranectin, tissue factor, metalloproteinases plus their inhibitors, and Zn-glycoproteins, among others.2 We also obtained commercial antibodies against some of the low abundance proteins and found that immunologically related proteins were detectable in normal serum (not shown). In terms of the nature of the proteins identified, the most striking results are the abundance of apparent regulatory proteins, as indicated by the presence of protein-interaction domains, and the dearth of kinases and phosphatases detected, both of which remain to be interpreted.

We would like to draw attention to several limitations that should be observed when interpreting these data. By nature, the fragmentation and computer search of peptides is stochastic and therefore will be sometimes correct but sometimes wrong, especially with present-day Paul ion traps that show low mass accuracy. Because the cut sites of chymotrypsin are more ambiguous than those of trypsin, the proteins uniquely identified from chymotryptic digests may be more suspect. However, we note that proteins associated with many of these chymotryptic peptides were previously observed from tryptic digests under a more flexible scoring system.12 With regard to results from the ION TRAP, when we digested with trypsin we accepted tryptic peptides with charge/X-Corr scores of +1/1.9, +2/2.5, or +3/3.75, a criterion for 2+ ions in excess of that recommended by Wolters et al. (2001). For chymotryptic digests, we applied the same criteria to chymotryptic peptides. With regard to the MALDI-Qq-TOF, we used probability-based Mowse scores of MS/MS spectra at the p = 0.05 level or lower. However, because the majority of peptides listed here were detected in replicate experiments or measurements, the confidence associated with them may be higher than these cutoff values for a single observation suggest.

It might become important to list the hypothetical proteins detected by proteomic investigation because mass spectral detection might be used to confirm the expression and reading frame of hypothetical proteins. Given the nature of structural variation in immunoglobulin proteins, with potentially more than 10 000 000 variations per person generated by complex DNA rearrangements from the same set of parental genes,63 it remains most likely that assignment of target-specific identities to IgG molecules by this technique may not be reliable. For example, an antibody that shared a peptide sequence with a known rabies-specific antibody may have been detected (Table 2), but that alone cannot confirm the presence of rabies-specific antibody per se. Thus, the putative assignments of identity to IgG molecules have been reported. We might speculate that the abundance of hypothetical proteins may result from the random combination of the immense range of structural variation in immunoglobulins, post-transcriptionally spliced and post-translationally modified proteins, or other molecules, together with the large size of nucleic acid databases wherein the reading frame is unclear. Although the raw data seem as reasonably solid as practical, we note that the algorithms and data banks used to assign identity to proteins change over time, and it remains probable that some of these identities may have to be reconsidered.64

We have made a distinction between the identification of individual proteins, processed forms, or isoforms versus the type or category of protein. For example, a protein with homology to a protein(s) from the category G-protein coupled receptor was apparently detected, but it may be some time before a detailed analysis of the raw data could rule out which of the ∼700 G-protein coupled receptors known to date were not implicated based on the MS/MS data. Thus, for the sake of utility, we have conveyed the often highly specific result of the computer searches but feel that cautious interpretation at the level of types of proteins is an appropriate level of resolution for the present. It is possible that molecules listed as precursor proteins or related proteins may or may not reflect different protein products in the blood, and hence they must be qualified as apparently nonredundant. The capacity to correlate enzymatic peptides is not the rate-limiting step in definitively defining a proteome by these methods; rather, the capacity to definitively interpret the result is now the bottleneck.65 The results here were searched against a nonredundant (NR) database, implying that multiple accession numbers representing similar sequences at loci that may potentially be physically distinct have been ignored. The complete set of accession numbers that may have been implicated based on the correlation data available here may take considerable further computational effort to determine. Categorizing these groups of accession numbers, once obtained, in terms of molecular function and the associated biological process is also beyond the scope of this paper.

The focus of this paper is to compare sample preparation and sampling strategies that sensitively yield high-quality MS/MS spectra from small amounts of sera. By western blot, we have confirmed the presence in sera of proteins immunologically related to some of the low abundance proteins implicated. However, there remains the vast bulk of the scholarly effort required to definitively interpret even the small amount of serum raw data that was obtained here. We conclude that prefractionation of sera at the level of whole proteins by chromatography and by electrophoresis were both necessary and complementary, and each detected unique low abundance proteins the other missed. However, both methods required liquid chromatography of the digested peptides prior to tandem mass spectrometry to detect low-abundance peptides. Given all of the limitations of our study, we nonetheless observed some 600 types of protein products in human serum, the largest survey of highly significant tandem mass spectra of enzymatic peptides reported from sera to date.2 It would appear that the elucidation of the serum proteome will not be limited by the requirement for new mass spectral technologies, although these will help, but rather by the application of biochemical techniques for protein prefractionation. The exploration of continuous partition chromatography of intact proteins prior to 2D-LC of peptides, and of 2D-PAGE techniques prior to LC analysis of peptides, may serve to increase the size of the reported serum proteome by an order of magnitude. In terms of the entire human proteome, it is apparent that a concerted effort would likely be required to annotate the proteome as it is revealed and somehow provide commentary and interpretation of the result in some standardized format to a central repository of data.1
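The complementarity between preparation methods noted in the Discussion (each method detecting some proteins the other missed) amounts to a comparison of sets of protein names. The following Python sketch shows one way such a comparison could be computed. The function name is hypothetical and the protein names in the usage note are placeholders, not results from Table 2.

```python
def compare_methods(found_by):
    """Compare the protein sets detected by several preparation methods.

    found_by maps a method label (e.g. "PAGE-LC", "chroma") to the set of
    protein names detected by that method.
    Returns (shared, unique): proteins found by every method, and a dict of
    proteins found by exactly one method, keyed by that method.
    """
    sets = list(found_by.values())
    shared = set.intersection(*sets) if sets else set()
    unique = {}
    for method, proteins in found_by.items():
        # Union of everything detected by the other methods.
        others = set().union(
            *(p for m, p in found_by.items() if m != method)
        )
        unique[method] = proteins - others
    return shared, unique
```

For example, `compare_methods({"PAGE-LC": {"p1", "p2"}, "chroma": {"p1", "p3"}})` reports `p1` as shared and `p2` and `p3` as unique to their respective methods. Exact string matching is assumed here; in practice, names such as precursor and related forms would need to be reconciled first, as the Discussion cautions.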

Abbreviations

1D, one-dimensional; 2D, two-dimensional; CID, collision-induced dissociation; CBBR, Coomassie brilliant blue; cDNA, complementary DNA; CHCA, α-cyano-4-hydroxycinnamic acid; DEAE-B, blue-dye-affinity, diethylaminoethyl Sepharose; DHB, 2,5-dihydroxybenzoic acid; ESI, electro-spray ionization; EST, expressed sequence tag; FTICR, Fourier transform ion cyclotron resonance; LC, liquid chromatography; MALDI, matrix-assisted laser desorption and ionization; MS, mass spectrometry; MS/MS, tandem mass spectrometry; NHS, normal human sera; ORF, open reading frame; PAGE, polyacrylamide gel electrophoresis; PBS, phosphate buffered saline; Qq, quadrupole, radio-frequency-only quadrupole; SDS, sodium dodecyl sulfate; TOF, time-of-flight; TFA, trifluoroacetic acid.

Acknowledgment. This paper was supported in part by industrial research and development grants, and student fellowship grants from the Natural Sciences and Engineering Research Council of Canada.

References

(1) Hanash, S.; Celis, J. E. The Human Proteome Organization: a mission to advance proteome knowledge. Mol. Cell Proteomics 2002, 1, 413-414.
(2) Anderson, N. L.; Anderson, N. G. The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell Proteomics 2003, 2, 50.
(3) Issaq, H. J.; Conrads, T. P.; Janini, G. M.; Veenstra, T. D. Methods for fractionation, separation and profiling of proteins and peptides. Electrophoresis 2002, 23, 3048-3061.
(4) Wilchek, M.; Jakoby, W. B. The literature on affinity chromatography. Methods Enzymol. 1974, 34, 3-10.
(5) Wilm, M.; Shevchenko, A.; Houthaeve, T.; Breit, S.; Schweigerer, L.; Fotsis, T.; Mann, M. Femtomole sequencing of proteins from polyacrylamide gels by nano-electrospray mass spectrometry. Nature 1996, 379, 466-469.
(6) Lubman, D. M.; Kachman, M. T.; Wang, H.; Gong, S.; Yan, F.; Hamler, R. L.; O’Neil, K. A.; Zhu, K.; Buchanan, N. S.; Barder, T. J. Two-dimensional liquid separations-mass mapping of proteins from human cancer cell lysates. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2002, 782, 183-196.
(7) Hoffmann, P.; Ji, H.; Moritz, R. L.; Connolly, L. M.; Frecklington, D. F.; Layton, M. J.; Eddes, J. S.; Simpson, R. J. Continuous free-flow electrophoresis separation of cytosolic proteins from the human colon carcinoma cell line LIM 1215: a non two-dimensional gel electrophoresis-based proteome analysis strategy. Proteomics 2001, 1, 807-818.
(8) Tammen, H.; Hess, R.; Uckert, S.; Becker, A. J.; Stief, C. G.; Knappe, P. S.; Schrader, M.; Jonas, U. Detection of low-molecular-mass plasma peptides in the cavernous and systemic blood of healthy men during penile flaccidity and rigidity: an experimental approach using the novel differential peptide display technology. Urology 2002, 59, 784-789.
(9) Kennedy, S. Proteomic profiling from human samples: the body fluid alternative. Toxicol. Lett. 2001, 120, 379-384.
(10) Haynes, P.; Miller, I.; Aebersold, R.; Gemeiner, M.; Eberini, I.; Lovati, M. R.; Manzoni, C.; Vignati, M.; Gianazza, E. Proteins of rat serum: I. Establishing a reference two-dimensional electrophoresis map by immunodetection and microbore high performance liquid chromatography-electrospray mass spectrometry. Electrophoresis 1998, 19, 1484-1492.
(11) Wait, R.; Gianazza, E.; Eberini, I.; Sironi, L.; Dunn, M. J.; Gemeiner, M.; Miller, I. Proteins of rat serum, urine, and cerebrospinal fluid: VI. Further protein identifications and interstrain comparison. Electrophoresis 2001, 22, 3043-3052.
(12) Adkins, J. N.; Varnum, S. M.; Auberry, K. J.; Moore, R. J.; Angell, N. H.; Smith, R. D.; Springer, D. L.; Pounds, J. G. Toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry. Mol. Cell Proteomics 2002, 1, 947-955.

(13) Wu, S. L.; Amato, H.; Biringer, R.; Choudhary, G.; Shieh, P.; Hancock, W. S. Targeted proteomics of low-level proteins in human plasma by LC/MSn: using human growth hormone as a model system. J. Proteome Res. 2002, 1, 459-465.
(14) Washburn, M. P.; Wolters, D.; Yates, J. R., 3rd Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat. Biotechnol. 2001, 19, 242-247.
(15) Koller, A.; Washburn, M. P.; Lange, B. M.; Andon, N. L.; Deciu, C.; Haynes, P. A.; Hays, L.; Schieltz, D.; Ulaszek, R.; Wei, J.; Wolters, D.; Yates, J. R., 3rd Proteomic survey of metabolic pathways in rice. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 11969-11974.
(16) Andon, N. L.; Hollingworth, S.; Koller, A.; Greenland, A. J.; Yates, J. R., 3rd; Haynes, P. A. Proteomic characterization of wheat amyloplasts using identification of proteins by tandem mass spectrometry. Proteomics 2002, 2, 1156-1168.
(17) Gharahdaghi, F.; Kirchner, M.; Fernandez, J.; Mische, S. M. Peptide-mass profiles of poly(vinylidene difluoride)-bound proteins by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry in the presence of nonionic detergents. Anal. Biochem. 1996, 233, 94-99.
(18) Skehel, J. M.; Schneider, K.; Murphy, N.; Graham, A.; Benson, G. M.; Cutler, P.; Camilleri, P. Phenotyping apolipoprotein E*3-leiden transgenic mice by two-dimensional polyacrylamide gel electrophoresis and mass spectrometric identification. Electrophoresis 2000, 21, 2540-2545.
(19) Mann, M.; Hendrickson, R. C.; Pandey, A. Analysis of proteins and proteomes by mass spectrometry. Annu. Rev. Biochem. 2001, 70, 437-473.
(20) Badman, E. R.; Myung, S.; Clemmer, D. E. Gas-phase separations of protein and peptide ion fragments generated by collision-induced dissociation in an ion trap. Anal. Chem. 2002, 74, 4889-4894.
(21) Loboda, A. V.; Krutchinsky, A. N.; Bromirski, M.; Ens, W.; Standing, K. G. A tandem quadrupole/time-of-flight mass spectrometer with a matrix-assisted laser desorption/ionization source: design and performance. Rapid Commun. Mass Spectrom. 2000, 14, 1047-1057.
(22) Stafford, G. C. Instrumental aspects of positive and negative ion chemical ionization mass spectrometry. Environ. Health Perspect. 1980, 36, 85-88.
(23) Hager, J. W.; Yves Le Blanc, J. C. Product ion scanning using a Q-q-Q linear ion trap (Q TRAP™) mass spectrometer. Rapid Commun. Mass Spectrom. 2003, 17, 1056-1064.
(24) Schwartz, J. C.; Senko, M. W.; Syka, J. E. A two-dimensional quadrupole ion trap mass spectrometer. J. Am. Soc. Mass Spectrom. 2002, 13, 659-669.
(25) Mann, M.; Wilm, M. Error-tolerant identification of peptides in sequence databases by peptide sequence tags. Anal. Chem. 1994, 66, 4390-4399.
(26) Yates, J. R., 3rd Database searching using mass spectrometry data. Electrophoresis 1998, 19, 893-900.
(27) Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20, 3551-3567.
(28) Choudhary, G.; Wu, S. L.; Shieh, P.; Hancock, W. S. Multiple enzymatic digestion for enhanced sequence coverage of proteins in complex proteomic mixtures using capillary LC with ion trap MS/MS. J. Proteome Res. 2003, 2, 59-67.
(29) Shevchenko, A.; Chernushevich, I.; Ens, W.; Standing, K. G.; Thomson, B.; Wilm, M.; Mann, M. Rapid ‘de novo’ peptide sequencing by a combination of nanoelectrospray, isotopic labeling and a quadrupole/time-of-flight mass spectrometer. Rapid Commun. Mass Spectrom. 1997, 11, 1015-1024.
(30) Smith, R. D.; Anderson, G. A.; Lipton, M. S.; Pasa-Tolic, L.; Shen, Y.; Conrads, T. P.; Veenstra, T. D.; Udseth, H. R. An accurate mass tag strategy for quantitative and high-throughput proteome measurements. Proteomics 2002, 2, 513-523.
(31) Kruppa, G. H.; Schoeniger, J.; Young, M. M. A top down approach to protein structural studies using chemical cross-linking and Fourier transform mass spectrometry. Rapid Commun. Mass Spectrom. 2003, 17, 155-162.
(32) Hunt, D. F.; Shabanowitz, J.; Yates, J. R., 3rd; Zhu, N. Z.; Russell, D. H.; Castro, M. E. Tandem quadrupole Fourier transform mass spectrometry of oligopeptides and small proteins. Proc. Natl. Acad. Sci. U. S. A. 1987, 84, 620-623.
(33) Johnson, J. R.; Meng, F.; Forbes, A. J.; Cargile, B. J.; Kelleher, N. L. Fourier transform mass spectrometry for automated fragmentation and identification of 5-20 kDa proteins in mixtures. Electrophoresis 2002, 23, 3217-3223.

Journal of Proteome Research • Vol. 3, No. 3, 2004 381

(34) Yergey, A. L.; Coorssen, J. R.; Backlund, P. S., Jr.; Blank, P. S.; Humphrey, G. A.; Zimmerberg, J.; Campbell, J. M.; Vestal, M. L. De novo sequencing of peptides using MALDI/TOF-TOF. J. Am. Soc. Mass Spectrom. 2002, 13, 784-791.
(35) Sze, S. K.; Ge, Y.; Oh, H.; McLafferty, F. W. Top-down mass spectrometry of a 29-kDa protein for characterization of any posttranslational modification to within one residue. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 1774-1779.
(36) Marekov, L. N.; Steinert, P. M. Charge derivatization by 4-sulfophenyl isothiocyanate enhances peptide sequencing by postsource decay matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. J. Mass Spectrom. 2003, 38, 373-377.
(37) Uchida, T.; Fukawa, A.; Uchida, M.; Fujita, K.; Saito, K. Application of a novel protein biochip technology for detection and identification of rheumatoid arthritis biomarkers in synovial fluid. J. Proteome Res. 2002, 1, 495-499.
(38) Karas, M.; Hillenkamp, F. Laser desorption ionization of proteins with molecular masses exceeding 10 000 daltons. Anal. Chem. 1988, 60, 2299-2301.
(39) Nakanishi, T.; Okamoto, N.; Tanaka, K.; Shimizu, A. Laser desorption time-of-flight mass spectrometric analysis of transferrin precipitated with antiserum: a unique simple method to identify molecular weight variants. Biol. Mass Spectrom. 1994, 23, 230-233.
(40) Fenn, J. B.; Mann, M.; Meng, C. K.; Wong, S. F.; Whitehouse, C. M. Electrospray ionization for mass spectrometry of large biomolecules. Science 1989, 246, 64-71.
(41) Marshall, J.; Kupchak, P.; Zhu, W.; Yantha, J.; Vrees, T.; Furesz, S.; Jacks, K.; Smith, C.; Kireeva, I.; Zhang, R.; Takahashi, M.; Stanton, E.; Jackowski, G. Processing of serum proteins underlies the mass spectral fingerprinting of myocardial infarction. J. Proteome Res. 2003, 2, 361-372.
(42) Link, A. J.; Eng, J.; Schieltz, D. M.; Carmack, E.; Mize, G. J.; Morris, D. R.; Garvik, B. M.; Yates, J. R., 3rd Direct analysis of protein complexes using mass spectrometry. Nat. Biotechnol. 1999, 17, 676-682.
(43) Wolters, D. A.; Washburn, M. P.; Yates, J. R., 3rd An automated multidimensional protein identification technology for shotgun proteomics. Anal. Chem. 2001, 73, 5683-5690.
(44) Schagger, H.; Aquila, H.; Von Jagow, G. Coomassie blue-sodium dodecyl sulfate-polyacrylamide gel electrophoresis for direct visualization of polypeptides during electrophoresis. Anal. Biochem. 1988, 173, 201-205.
(45) Weinberger, S. R.; Boschetti, E.; Santambien, P.; Brenac, V. Surface-enhanced laser desorption-ionization retentate chromatography mass spectrometry (SELDI-RC-MS): a new method for rapid development of process chromatography conditions. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 2002, 782, 307-316.
(46) Guzzetta, A. W.; Thakur, R. A.; Mylchreest, I. C. A robust microelectrospray ionization technique for high-throughput liquid chromatography/mass spectrometry proteomics using a sanded metal needle as an emitter. Rapid Commun. Mass Spectrom. 2002, 16, 2067-2072.
(47) McCombie, W. R.; Adams, M. D.; Kelley, J. M.; FitzGerald, M. G.; Utterback, T. R.; Khan, M.; Dubnick, M.; Kerlavage, A. R.; Venter, J. C.; Fields, C. Caenorhabditis elegans expressed sequence tags identify gene families and potential disease gene homologues. Nat. Genet. 1992, 1, 124-131.
(48) Chelius, D.; Huhmer, A. F.; Shieh, C. H.; Lehmberg, E.; Traina, J. A.; Slattery, T. K.; Pungor, E., Jr. Analysis of the adenovirus type 5 proteome by liquid chromatography and tandem mass spectrometry methods. J. Proteome Res. 2002, 1, 501-513.
(49) Griffin, T. J.; Gygi, S. P.; Rist, B.; Aebersold, R.; Loboda, A.; Jilkine, A.; Ens, W.; Standing, K. G. Quantitative proteomic analysis using a MALDI quadrupole time-of-flight mass spectrometer. Anal. Chem. 2001, 73, 978-986.
(50) Jurgens, M.; Schrader, M. Peptidomic approaches in proteomic research. Curr. Opin. Mol. Ther. 2002, 4, 236-241.
(51) Verhaert, P.; Uttenweiler-Joseph, S.; de Vries, M.; Loboda, A.; Ens, W.; Standing, K. G. Matrix-assisted laser desorption/ionization quadrupole time-of-flight mass spectrometry: an elegant tool for peptidomics. Proteomics 2001, 1, 118-131.
(52) Shevchenko, A.; Loboda, A.; Ens, W.; Standing, K. G. MALDI quadrupole time-of-flight mass spectrometry: a powerful tool for proteomic research. Anal. Chem. 2000, 72, 2132-2141.

(53) Yaffe, M. B. Phosphotyrosine-binding domains in signal transduction. Nat. Rev. Mol. Cell Biol. 2002, 3, 177-186.
(54) Pawson, T.; Nash, P. Assembly of cell regulatory systems through protein interaction domains. Science 2003, 300, 445-452.
(55) Aebersold, R.; Mann, M. Mass spectrometry-based proteomics. Nature 2003, 422, 198-207.
(56) Ardekani, A. M.; Liotta, L. A.; Petricoin, E. F., 3rd Clinical potential of proteomics in the diagnosis of ovarian cancer. Expert Rev. Mol. Diagn. 2002, 2, 312-320.
(57) Li, J.; Zhang, Z.; Rosenzweig, J.; Wang, Y. Y.; Chan, D. W. Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer. Clin. Chem. 2002, 48, 1296-1304.
(58) Oleschuk, R. D.; McComb, M. E.; Chow, A.; Ens, W.; Standing, K. G.; Perreault, H.; Marois, Y.; King, M. Characterization of plasma proteins adsorbed onto biomaterials. By MALDI-TOF MS. Biomaterials 2000, 21, 1701-1710.
(59) Lander, E. S.; Linton, L. M.; Birren, B.; Nusbaum, C.; Zody, M. C.; Baldwin, J.; Devon, K.; Dewar, K.; Doyle, M.; FitzHugh, W.; Funke, R.; Gage, D.; Harris, K.; Heaford, A.; Howland, J.; Kann, L.; Lehoczky, J.; LeVine, R.; McEwan, P.; McKernan, K.; Meldrim, J.; Mesirov, J. P.; Miranda, C.; Morris, W.; Naylor, J.; Raymond, C.; Rosetti, M.; Santos, R.; Sheridan, A.; Sougnez, C.; Stange-Thomann, N.; Stojanovic, N.; Subramanian, A.; Wyman, D.; Rogers, J.; Sulston, J.; Ainscough, R.; Beck, S.; Bentley, D.; Burton, J.; Clee, C.; Carter, N.; Coulson, A.; Deadman, R.; Deloukas, P.; Dunham, A.; Dunham, I.; Durbin, R.; French, L.; Grafham, D.; Gregory, S.; Hubbard, T.; Humphray, S.; Hunt, A.; Jones, M.; Lloyd, C.; McMurray, A.; Matthews, L.; Mercer, S.; Milne, S.; Mullikin, J. C.; Mungall, A.; Plumb, R.; Ross, M.; Shownkeen, R.; Sims, S.; Waterston, R. H.; Wilson, R. K.; Hillier, L. W.; McPherson, J. D.; Marra, M. A.; Mardis, E. R.; Fulton, L. A.; Chinwalla, A. T.; Pepin, K. H.; Gish, W. R.; Chissoe, S. L.; Wendl, M. C.; Delehaunty, K. D.; Miner, T. L.; Delehaunty, A.; Kramer, J. B.; Cook, L. L.; Fulton, R. S.; Johnson, D. L.; Minx, P. J.; Clifton, S. W.; Hawkins, T.; Branscomb, E.; Predki, P.; Richardson, P.; Wenning, S.; Slezak, T.; Doggett, N.; Cheng, J. F.; Olsen, A.; Lucas, S.; Elkin, C.; Uberbacher, E.; Frazier, M., et al. Initial sequencing and analysis of the human genome. Nature 2001, 409, 860-921.
(60) Brett, D.; Pospisil, H.; Valcarcel, J.; Reich, J.; Bork, P. Alternative splicing and genome complexity. Nat. Genet. 2002, 30, 29-30.
(61) Schmucker, D.; Clemens, J. C.; Shu, H.; Worby, C. A.; Xiao, J.; Muda, M.; Dixon, J. E.; Zipursky, S. L. Drosophila Dscam is an axon guidance receptor exhibiting extraordinary molecular diversity. Cell 2000, 101, 671-684.
(62) Venter, J. C.; Adams, M. D.; Myers, E. W.; Li, P. W.; Mural, R. J.; Sutton, G. G.; Smith, H. O.; Yandell, M.; Evans, C. A.; Holt, R. A.; Gocayne, J. D.; Amanatides, P.; Ballew, R. M.; Huson, D. H.; Wortman, J. R.; Zhang, Q.; Kodira, C. D.; Zheng, X. H.; Chen, L.; Skupski, M.; Subramanian, G.; Thomas, P. D.; Zhang, J.; Gabor Miklos, G. L.; Nelson, C.; Broder, S.; Clark, A. G.; Nadeau, J.; McKusick, V. A.; Zinder, N.; Levine, A. J.; Roberts, R. J.; Simon, M.; Slayman, C.; Hunkapiller, M.; Bolanos, R.; Delcher, A.; Dew, I.; Fasulo, D.; Flanigan, M.; Florea, L.; Halpern, A.; Hannenhalli, S.; Kravitz, S.; Levy, S.; Mobarry, C.; Reinert, K.; Remington, K.; Abu-Threideh, J.; Beasley, E.; Biddick, K.; Bonazzi, V.; Brandon, R.; Cargill, M.; Chandramouliswaran, I.; Charlab, R.; Chaturvedi, K.; Deng, Z.; Di Francesco, V.; Dunn, P.; Eilbeck, K.; Evangelista, C.; Gabrielian, A. E.; Gan, W.; Ge, W.; Gong, F.; Gu, Z.; Guan, P.; Heiman, T. J.; Higgins, M. E.; Ji, R. R.; Ke, Z.; Ketchum, K. A.; Lai, Z.; Lei, Y.; Li, Z.; Li, J.; Liang, Y.; Lin, X.; Lu, F.; Merkulov, G. V.; Milshina, N.; Moore, H. M.; Naik, A. K.; Narayan, V. A.; Neelam, B.; Nusskern, D.; Rusch, D. B.; Salzberg, S.; Shao, W.; Shue, B.; Sun, J.; Wang, Z.; Wang, A.; Wang, X.; Wang, J.; Wei, M.; Wides, R.; Xiao, C.; Yan, C., et al. The sequence of the human genome. Science 2001, 291, 1304-1351.
(63) Tonegawa, S.; Steinberg, C.; Dube, S.; Bernardini, A. Evidence for somatic generation of antibody diversity. Proc. Natl. Acad. Sci. U. S. A. 1974, 71, 4027-4031.
(64) Harrison, P. M.; Kumar, A.; Lang, N.; Snyder, M.; Gerstein, M. A question of size: the eukaryotic proteome and the problems in defining it. Nucleic Acids Res. 2002, 30, 1083-1090.
(65) Patterson, S. D. Data analysis-the Achilles heel of proteomics. Nat. Biotechnol. 2003, 21, 221-222.

PR034039P