Different Immunoaffinity Fractionation Strategies to Characterize the Human Plasma Proteome

Yan Gong, Xiaohai Li, Bing Yang, Wantao Ying, Dong Li, Yangjun Zhang, Shujia Dai, Yun Cai, Jinglan Wang, Fuchu He, and Xiaohong Qian*

Department of Genomics and Proteomics, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing 100850, Beijing Proteome Research Center, 33 Life Park Road, Beijing 102206, People’s Republic of China

Received January 4, 2006

Plasma proteins often serve as indicators of disease and are a rich source for biomarker discovery. However, the intrinsically large dynamic range of plasma protein concentrations makes the analysis very challenging, because a large number of low abundance proteins are often masked by a few high abundance proteins. Prefractionation methods, such as depletion of the higher abundance proteins before protein profiling, can assist in the discovery and detection of less abundant proteins that may ultimately prove to be informative biomarkers. However, few studies have comprehensively investigated the proteins in both the depleted fraction and the remainder. In the present study, two immunoaffinity fractionation columns, targeting the top-6 or the top-12 proteins in plasma, were investigated, and the proteins in both the column-bound and flow-through fractions were subsequently analyzed. A two-dimensional peptide separation strategy, utilizing chromatographic separation techniques, combined with tandem mass spectrometry (MS/MS) was employed for proteomic analysis of the four fractions. Using the established HUPO PPP criteria, a total of 2401 unique plasma proteins were identified. The Multiple Affinity Removal System yielded 921 and 725 unique proteins from the flow-through and bound fractions, respectively, whereas the Seppro MIXED 12 column yielded 897 and 730 unique proteins from the flow-through and bound fractions, respectively. When more stringent criteria, based on searching against the reversed database, were implemented, 529 unique proteins were identified from the four fractions, with the confidence in peptide identification increased from 73.6% to 99%. To determine whether the presence of nontarget proteins in the immunoaffinity-bound fraction could be attributed to their interaction with high abundance proteins, co-immunoprecipitation analysis with an antibody to human plasma albumin was performed, which identified 40 unique proteins in the coimmunoprecipitate under the more stringent criteria. This study illustrates that combining the column-bound and flow-through fractions from immunoaffinity separation affords more extensive profiling of the protein content of human plasma. The presence of nontarget proteins in the column-bound fractions may be caused by their binding to the higher abundance proteins targeted by the immunoaffinity column.

Keywords: plasma • proteomics • immunoaffinity separation • two-dimensional separation • mass spectrometry • coimmunoprecipitation • anti-human plasma albumin

Introduction

Human plasma is a primary clinical specimen and holds promise for disease diagnosis and therapeutic monitoring. Plasma is a large repository of proteins, containing thousands of distinct proteins, including classical blood proteins and many other proteins secreted, shed, or lost from cells and tissues throughout the body. Furthermore, plasma is a rather unique biological sample in that no specific cellular genomic expression contributes to its protein content, and thus it remains beyond the scope of DNA- or RNA-based diagnostics.1

* To whom correspondence should be addressed. Tel: +86-10-807277771231. Fax: +86-10-8070-5155. E-mail: [email protected]. 10.1021/pr0600024 CCC: $33.50

© 2006 American Chemical Society

Journal of Proteome Research 2006, 5, 1379-1387. Published on Web 05/04/2006

There has been great interest in comprehensively characterizing the plasma proteome and in discovering biomarkers for clinical diagnosis and treatment. However, the large dynamic range of protein concentrations in plasma makes this proteome difficult to characterize effectively.1 The 22 most abundant proteins (on a mg/mL basis) constitute more than 99% of the mass of total plasma proteins. The remaining 1%, presumably the biologically interesting population, is composed of thousands of types of very low abundance proteins.2 The 10 or more orders of magnitude in protein concentration limit the ability of mass spectrometry to effectively monitor the low abundance species. Therefore, depletion of the high abundance proteins becomes a critical step in plasma proteome profiling, especially when the limited peak capacity and low dynamic range of mass spectrometric analysis constrain protein identification. Previous studies indicated that removal of high abundance proteins greatly improves the detection of lower abundance proteins.3-13 Nevertheless, the popular depletion strategies may result in concomitant loss of potentially important peptides and proteins.2,13-18 All things considered, identifying the proteins in both the column-bound and flow-through fractions from immunoaffinity separation should be a wise approach, because not only the lower abundance proteins but also the proteins bound specifically or nonspecifically to the column can then be detected.

There are a number of approaches to separate proteins based on their biochemical and biophysical features, such as molecular weight, mass, density, hydrophobicity, surface charge, and isoelectric point.2,6,11,19,20 However, these separation processes are not protein-specific and have variable capacities and limited reproducibility. Various tools and methods for protein separation using affinity capture reagents are capable of binding and removing specific targets.13,14,16,20,21 Among these techniques, immunoaffinity capture using antibodies is the classical, effective, and most reliable approach; it avoids the masking of low abundance proteins by high abundance proteins and allows for more comprehensive identification of low abundance proteins.

In the work reported here, a comprehensive analysis of plasma proteins was performed on the column-bound and flow-through fractions separated by two commercial immunoaffinity columns. The data indicated that a two-dimensional peptide separation strategy combined with tandem mass spectrometry was capable of extensively characterizing the human plasma proteome when both bound fractions and flow-through fractions were analyzed.
Using the HUPO Plasma Proteome Project (HUPO PPP) data criteria, 2401 unique proteins were identified, and the false positive rate was 26.4% based on searching against the reversed database. More stringent criteria were developed from this reversed database analysis, and reanalysis showed that 529 unique proteins were identified with high confidence at a false positive rate of 1%. Of these, 129 proteins were identified only in the bound fractions. These findings demonstrate that affinity removal of the high abundance proteins results in the loss of some protein species, and that analyzing the bound as well as the flow-through fractions from the two immunoaffinity columns affords greater coverage of human plasma proteins than either fraction alone. Subsequently, a coimmunoprecipitation analysis with an antibody against human plasma albumin was conducted, and the associated proteins were identified by SDS-PAGE-RPLC-MS/MS. Among the 40 proteins identified, 26 overlapped with those detected in the bound fractions. This finding illustrates that the specifically or nonspecifically bound proteins found in the bound fractions may result from their interaction with the most abundant proteins.
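The comparison between bound and flow-through fractions described above amounts to set operations on protein identifiers. The following is a minimal sketch of that bookkeeping, not the authors' software; the function name `compare_fractions` and the accession strings in the usage example are hypothetical.

```python
from typing import Dict, Set


def compare_fractions(flow_through: Set[str], bound: Set[str]) -> Dict[str, Set[str]]:
    """Partition protein identifiers from two fractions of one column.

    Both arguments are sets of protein accession strings (illustrative only).
    """
    return {
        "combined": flow_through | bound,           # coverage when both fractions are analyzed
        "shared": flow_through & bound,             # proteins seen in both fractions
        "bound_only": bound - flow_through,         # proteins lost if the bound fraction is discarded
        "flow_through_only": flow_through - bound,  # proteins seen only after depletion
    }


# Hypothetical accessions, used only to show the call.
result = compare_fractions({"P_A", "P_B", "P_C"}, {"P_C", "P_D"})
print(sorted(result["bound_only"]))
```

The "bound_only" set corresponds to the kind of count reported above for proteins identified only in the bound fractions; the same function could be applied to the coimmunoprecipitate and the bound fractions to obtain their overlap.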

Experimental Section

Materials. A Multiple Affinity Removal System (MARS) HPLC column, for the separation of the six plasma proteins of highest abundance, was purchased from Agilent Technologies (Palo Alto, CA). A Seppro MIXED12 (Seppro) HPLC column, for the separation of the twelve plasma proteins of highest abundance, was purchased from GenWay Biotech Inc. (San Diego, CA). Sequencing grade porcine trypsin was purchased from Promega (Madison, WI). DL-Dithiothreitol (DTT) and iodoacetamide were obtained from Pierce (Rockford, IL). Tris(2-carboxyethyl)phosphine hydrochloride (TCEP) and ammonium bicarbonate were purchased from Sigma-Aldrich (St. Louis, MO). HPLC-grade water was produced by Millipore (Billerica, MA). UV-grade acetonitrile (ACN) was purchased from Merck (Whitehouse Station, NJ). Rabbit anti-human plasma albumin polyclonal antibody was purchased from Abcam (Cambridge, UK).

Human Plasma Samples. Human plasma was acquired from a healthy anonymous donor at Beijing Northern Taiping Road Hospital who tested negative for human immunodeficiency virus (HIV), hepatitis B virus (HBV), hepatitis C virus (HCV), and syphilis. Blood was obtained by venipuncture from the donor and collected into tubes containing 0.109 M sodium citrate. The blood was centrifuged at 3000 rpm for 10 min at 4 °C within 6 h. The resultant plasma was transferred to a second set of centrifuge tubes and centrifuged again to remove any residual cells, and the plasma was stored in small aliquots at -80 °C. Aliquots were thawed only once and then discarded. The protein concentration of the initial plasma sample was 77 mg/mL as determined by BCA assay. The study was approved by the ethics committees of Beijing Northern Taiping Road Hospital and Beijing Institute of Radiation Medicine.

Separation of the Bound and Flow-through Fractions by MARS. According to the manufacturer’s instructions, crude plasma was diluted 5-fold with buffer A (product no. 5185-5987, Agilent Technologies) and then passed through a 0.45 µm filter by spinning at 12 000 rpm at room temperature. On an Elite 230 LC system (Dalian, China), each aliquot of the sample (equal to 25 µL of original plasma) was injected onto the MARS column in 100% buffer A at a flow rate of 0.5 mL/min for 10 min. After collection of the flow-through fraction (MARS-FF), the column was washed and the bound proteins were eluted with 100% buffer B (product no. 5185-5988, Agilent Technologies) at a flow rate of 1.0 mL/min for 8 min, and the bound fraction was collected (MARS-BF). The column was regenerated by equilibrating it with 100% buffer A for 10 min.

Separation of the Bound and Flow-through Fractions by Seppro. The operation followed the manufacturer’s specified protocol. Plasma was diluted with five volumes of Dilution Buffer (GenWay Biotech Inc.) and centrifuged through a 0.45 µm filter at 12 000 rpm. On a Beckman Proteome Lab PF 2D system, each aliquot of the sample (equal to 25 µL of original plasma) was injected onto the Seppro column. The method started at a flow rate of 0.1 mL/min for 10 min; the column was then washed at 0.2 mL/min for 7 min, after which the flow rate was raised to 1.0 mL/min to continue the wash for 5 min, and the flow-through fraction was collected (Seppro-FF). Bound proteins were eluted from the column with Stripping Buffer (GenWay Biotech Inc.) at a flow rate of 1.0 mL/min for 12 min, and the bound fraction was collected (Seppro-BF). The column was neutralized with Neutralizing Buffer (GenWay Biotech Inc.) at a flow rate of 1.0 mL/min for 6 min and then regenerated by equilibrating it with Dilution Buffer at a flow rate of 1.0 mL/min for 6 min.

Desalting and Concentrating of Four Fractions by Centrifugal Ultrafiltration. The sample was applied to a Centriplus centrifugal concentrator (YM-3, MWCO 3 kDa, Millipore, MA) and centrifuged (Biofuge Stratos, Heraeus Instruments, Germany) at 4500 rpm and 4 °C. Finally, the sample solution was buffer-exchanged gradually into 20 mM NH4HCO3 (pH 8.5). The protein content was assayed using a revised Bio-Rad RC-DC method.
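The column loading described above can be checked with simple arithmetic. The sketch below uses only values quoted in the text (25 µL of plasma per injection, 77 mg/mL protein, and the MARS flow rates and durations); the function names are hypothetical, and the calculation assumes that no protein is lost during dilution and filtration.

```python
def protein_load_mg(plasma_volume_ul: float, concentration_mg_per_ml: float) -> float:
    """Protein mass (mg) contained in a given volume of undiluted plasma."""
    return plasma_volume_ul / 1000.0 * concentration_mg_per_ml


def step_volume_ml(flow_rate_ml_per_min: float, duration_min: float) -> float:
    """Buffer volume (mL) delivered during one chromatographic step."""
    return flow_rate_ml_per_min * duration_min


# 25 µL of plasma at 77 mg/mL: roughly 1.9 mg of protein per injection.
load = protein_load_mg(25, 77)

# MARS loading step: 0.5 mL/min for 10 min of buffer A.
mars_loading = step_volume_ml(0.5, 10)

# MARS elution step: 1.0 mL/min for 8 min of buffer B.
mars_elution = step_volume_ml(1.0, 8)

print(f"{load:.2f} mg, {mars_loading:.1f} mL, {mars_elution:.1f} mL")
```

The same two helpers can be applied to the Seppro steps by substituting their flow rates and durations.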


Trypsin Digestion of Plasma Proteins and Desalting by RPLC. This step was performed as previously described.22 Solid urea was added to the samples, followed by 40 mM TCEP and 200 mM DTT solution. The samples were reduced at 37 °C for 4 h, and concentrated iodoacetamide in 100 mM NH4HCO3 solution was then added. The mixture was incubated for an additional 60 min in darkness. Finally, trypsin solution was added and the mixture was incubated at 37 °C for 24 h. After incubation, the sample was concentrated in a SpeedVac (Thermo Savant) and then desalted by RPLC. The ACN in each fraction was evaporated with a stream of nitrogen, and the remainder was concentrated in the SpeedVac.

Offline SCX for First-Dimension Chromatographic Separation of Peptides. Offline SCXLC separation of peptides from a desalted sample was performed on a Beckman Proteome Lab PF 2D system as previously described, with minor modifications.22 An analytical Hypersil SCX column (Thermo-Keystone, 4.6 mm i.d. × 25 cm) was used for the first-dimension separation. The following conditions were used: solution A was 5 mM NH4Cl solution (adjusted to pH 3.0 with formic acid) containing 25% ACN, solution B was 800 mM NH4Cl solution (adjusted to pH 3.0 with formic acid) containing 25% ACN, and solution C was 50 mM Tris, 500 mM NH4Cl solution (adjusted to pH 7.5 with formic acid). The gradient was 100% solvent A for 5 min, 0 to 60% B over 60 min, a ramp to 100% B over 20 min, a hold for another 30 min, and then 100% solvent C for 30 min. Fractions were collected using an automated Gilson fraction collector. The corresponding RPLC fractions from subsequent runs of the same sample were pooled in the same Eppendorf vials. All effluent fractions were combined into 60 fractions according to the OD value after comparison of chromatograms.

Analysis of Flow-Through and Bound Fractions Using RPLC-MS/MS. All peptide mixtures were analyzed using a Finnigan LCQ Deca XP Plus ion trap mass spectrometer. Two reverse phase C18 trap columns (100 µm i.d. × 5 mm) were connected to the 10-port column-switching valve. A PicoFrit column (BioBasic C18, 5 µm, 75 µm i.d. × 10 cm, 15 µm i.d. spray tip, New Objective, Woburn, MA) was used to separate the peptide mixture. The spray voltage was set at 1.8 kV. The temperature of the ion transfer capillary was set at 180 °C. Peptide ions were detected in a full scan followed by three data-dependent MS/MS scans on the three most intense ions. The isolation width was 3 Da and the normalized collision energy was 35%.

Data Processing and Analysis. All MS/MS spectra from the four fractions were searched independently against the normal human IPI protein database (v. 2.27) using the SEQUEST algorithm (Eng and Yates III, University of Washington, WA)23 for peptide and protein identifications. Spectra from Seppro-FF were also searched against the reversed human IPI protein database using the same algorithm to evaluate the false positive rates of peptide identifications from human plasma samples. The reversed human protein database was created by reversing the order of the amino acid sequence of each protein, as previously reported.24-26 Database searching allowed for differential modifications on cysteine and methionine residues and required full tryptic cleavage, with a peptide mass tolerance of 1.5 Da. The DTASelect and Contrast software were used to filter and organize the SEQUEST search data.27 The comparison of identified proteins from the four different fractions was based on the protein listed first in the database for each cluster.
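The construction of a sequence-reversed database, as described in the Data Processing paragraph, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' tool: the input is assumed to be FASTA-formatted text, and the function names and the "REV_" header prefix are hypothetical conventions chosen here.

```python
from typing import Iterator, List, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reverse_database(text: str, prefix: str = "REV_") -> str:
    """Return FASTA text in which every protein sequence is reversed.

    Protein count, protein lengths, and amino acid composition are unchanged;
    only the residue order differs. The header prefix is an assumption.
    """
    records = [f">{prefix}{header}\n{seq[::-1]}" for header, seq in read_fasta(text)]
    return "\n".join(records) + "\n"


example = ">P1 example protein\nMKTAY\nLLK\n>P2\nACDE\n"
print(reverse_database(example))
```

Because each sequence is reversed in place, the reversed database matches the normal database in protein number, protein size, and amino acid distribution, which is the property that the false positive estimate relies on.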

Coimmunoprecipitation and Protein Isolation. Protein-G sepharose beads (Amersham) were washed three times with fresh buffer (50 mmol/L Tris base, 150 mmol/L sodium chloride, 10% glycerol, 1% Tween-20, 0.2% NP-40, 1 mmol/L DTT, pH 7.4) before mixing with plasma or antibody. Prior to coimmunoprecipitation, 100 µL of plasma was incubated with 100 µL of protein-G sepharose beads at 4 °C with shaking for 2 h. The supernatant was removed after a 30 s centrifugation at 13 000 rpm, incubated with 300 µg of rabbit anti-human plasma albumin polyclonal antibody at 4 °C with shaking for 6 h, and then mixed with 200 µL of protein-G sepharose beads and shaken overnight at 4 °C. The coimmunoprecipitate was centrifuged for 3 min at 13 000 rpm and the supernatant was discarded. The pellet was resuspended and washed three times with fresh buffer. Bound proteins were released from the protein-G sepharose beads by adding 2× SDS gel loading buffer and boiling for 5 min. Following centrifugation, the supernatant was separated by SDS-PAGE.

In-Gel Digestion of Coimmunoprecipitation Product. In-gel digestion was performed as previously described, with minor changes.28 Briefly, the gel pieces were destained and shrunk using 100% acetonitrile, and the proteins were reduced by addition of 10 mM dithiothreitol followed by incubation at 56 °C for 1 h. The proteins were alkylated by adding 55 mM iodoacetamide and incubating for 1 h at room temperature in the dark. After an additional wash and shrinkage, 10 ng/µL trypsin in 0.1 M NH4HCO3 was added, followed by incubation overnight at 37 °C. The peptides were recovered from the gel slices by adding 50% acetonitrile and incubating at room temperature for 1 h.

Results and Discussion

Separation of the Bound and Flow-Through Fractions by Immunoaffinity Columns. Separation of complex protein mixtures with a large dynamic range of concentration, such as plasma or serum, is a great challenge for proteomic analysis. Sample prefractionation to remove abundant proteins prior to further protein separation and identification is a necessary step for exploring the plasma proteome more extensively. The HUPO PPP especially promoted the evaluation of various methods of plasma sample preparation. Several research results indicated that immunoaffinity separation is by far the most effective and specific tool for separating the high abundance proteins and preparing samples so that low abundance proteins can be detected with higher sensitivity and better resolution, thereby “digging deeper into the proteome”. Immunoaffinity separation of proteins using various types of antibodies has yielded encouraging data.9,12,28-31 MARS was designed for simultaneous binding of albumin, IgG, IgA, haptoglobin, transferrin, and α1-antitrypsin, while Seppro was designed for simultaneous binding of albumin, IgG, transferrin, α1-antitrypsin, IgA, IgM, α2-macroglobulin, haptoglobin, apolipoproteins A-I and A-II, orosomucoid (α1-acid glycoprotein), and fibrinogen. Compared to unfractionated samples, the target proteins were effectively separated from the flow-through samples (Figure 1). Proteins bound to the corresponding immunoaffinity column were mainly the expected targets. This result is similar to reports based on analysis of the two fractions by SDS-PAGE or 2DE.9,29 Classical Coomassie-stained gels can display at most 4 orders of magnitude in protein concentration32 and are not sensitive enough to display the low abundance proteins, especially those coexisting with very high abundance proteins in the bound fractions. Our results demonstrate that many proteins, in addition to the six or twelve target proteins, are present in the bound fractions as analyzed by offline peptide SCXLC separation coupled with RPLC-MS/MS.

Figure 1. SDS-PAGE of plasma fractionated on the MARS and Seppro immunoaffinity columns. Lane 1, Seppro-FF; Lane 2, Seppro-BF; Lane 3, MARS-FF; Lane 4, MARS-BF; Lane 5, unprocessed plasma; Lane 6, molecular weight standards. The target proteins are labeled on the left, and the molecular weights of the standards are shown on the right.

Peptide Separation of Flow-through and Bound Fractions Using SCXLC. As shown previously,9,29 the immunoaffinity approach offered high efficiency and binding specificity, and simultaneously increased the resolution and improved the intensity of low abundance proteins in SDS-PAGE or 2DE separation. Because SDS-PAGE or 2DE resolves at most 4 orders of magnitude, a non-gel-based approach was applied to adequately separate peptides for subsequent analysis of the protein components of the four fractions. As shown in Figure 2, the peptides generated from the flow-through and bound fractions exhibited good separation; in particular, low abundance peptides were effectively separated from the very high abundance peptides in the bound fractions. These results also showed that the offline SCX shotgun strategy selectively enriched the low abundance peptides, which significantly improved their mass spectrometric identification from the bound fractions.


Protein Database Searching, Filtering and False Positive Evaluation by Reversed Database Searching. The rate of false positive protein identification depends significantly on the nature of the sample, particularly the number of proteins found within the detectable dynamic range.33 The use of a sequence-reversed protein database for assessing the false positive rates of peptide identifications resulting from SEQUEST searching has been previously reported.24,34,35 This analysis presumes that, because the reversed database is identical to the normal database in protein number, protein size, and distribution of amino acids, the number of false positives arising from random “hits” should be similar for the normal database and the reversed database.33 In this study, we created and used a reversed human protein database to estimate the false positive rates of peptide identifications from human plasma samples. When peptide MS/MS assignments were filtered according to the recommended HUPO PPP criteria (Xcorr ≥ 1.9 with charge state 1+, Xcorr ≥ 2.2 with charge state 2+, Xcorr ≥ 3.75 with charge state 3+, DeltCn ≥ 0.1, and Rsp ≤ 4), 921 and 725 unique proteins were identified from MARS-FF and MARS-BF, respectively, and 897 and 730 unique proteins from Seppro-FF and Seppro-BF, respectively. In total, 2401 unique plasma proteins were identified from the four fractions. The false positive rate for these analyses was calculated to be 26.4% based on the MS/MS spectral assignments of Seppro-FF searched against the reversed database. Qian et al.33 indicated that the false positive rate for peptide identifications from human plasma using the above-mentioned criteria was about 30% by reversed database searching.
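The charge-dependent filtering and the reversed-database estimate described above can be sketched in a few lines. This is an illustration only: the `PSM` record and function names are hypothetical, and the false positive rate is taken here as the number of passing reversed-database assignments divided by the number of passing normal-database assignments, which is one common convention and an assumption, since the text does not spell out the formula.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class PSM:
    """One peptide-spectrum assignment from a SEQUEST search (hypothetical record)."""
    peptide: str
    charge: int
    xcorr: float
    deltcn: float
    rsp: int


# HUPO PPP thresholds quoted in the text: minimum Xcorr per charge state.
HUPO_PPP_XCORR: Dict[int, float] = {1: 1.9, 2: 2.2, 3: 3.75}


def passes(psm: PSM, xcorr_min: Dict[int, float] = HUPO_PPP_XCORR,
           deltcn_min: float = 0.1, rsp_max: int = 4) -> bool:
    """True if the assignment meets the Xcorr, DeltCn, and Rsp criteria."""
    threshold = xcorr_min.get(psm.charge)
    if threshold is None:  # charge states without a threshold are rejected
        return False
    return psm.xcorr >= threshold and psm.deltcn >= deltcn_min and psm.rsp <= rsp_max


def false_positive_rate(normal_hits: Iterable[PSM], reversed_hits: Iterable[PSM],
                        xcorr_min: Dict[int, float] = HUPO_PPP_XCORR) -> float:
    """Assumed estimate: passing reversed-database hits / passing normal-database hits."""
    n_normal = sum(passes(p, xcorr_min) for p in normal_hits)
    n_reversed = sum(passes(p, xcorr_min) for p in reversed_hits)
    return n_reversed / n_normal if n_normal else 0.0
```

Passing a different `xcorr_min` dictionary allows other threshold sets to be evaluated against the same search results.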
To increase the data confidence, we developed more stringent filtering criteria for human plasma proteome analysis by nanoESI-MS/MS, which decreased the rate of false positive identifications to 1% by reversed database searching: Xcorr ≥ 2.6 with charge state 1+, Xcorr ≥ 3.0 with charge state 2+, Xcorr ≥ 3.5 with charge state 3+, DeltCn ≥ 0.1, and Rsp ≤ 4. These Xcorr thresholds were more stringent than those used in the study of the cortical neuron proteome, which provided 99% confidence.36 More detailed data analyses were based on the latter, more stringent filtering criteria, which reduce the contribution of false positive identifications to the data.

Figure 2. SCXLC separation of trypsin digestate of Seppro-FF (A); MARS-FF (B); Seppro-BF (C); MARS-BF (D).

Table 1. Number of Proteins, Peptides, and Two-Peptide Hits Identified from Different Fractions^a

^a Numbers in brackets represent the number of unique proteins and two-peptide hits from the two fractions (not including the target top-6 or top-12 proteins and Igs).

Proteins Identified in Human Plasma. Enhanced protein detection by proteomics is facilitated by specific prefractionation or depletion methods that separate high abundance members from those in low abundance. In addition to reducing the total amount of a given protein, this approach further reduces ion suppression effects in the electrospray and signal suppression in the mass spectrometer, both of which increase the likelihood of protein identification.37 Using the more stringent criteria, 307 and 194 unique proteins from MARS-FF and MARS-BF, respectively, and 329 and 217 unique proteins from Seppro-FF and Seppro-BF, respectively, were identified, for a total of 529 unique plasma proteins (Table 1). Combining the bound and flow-through fractions gave 420 unique proteins for MARS (MARS-BF plus MARS-FF) and 432 for Seppro (Seppro-BF plus Seppro-FF). The 529 unique proteins obtained from these four fractions are listed in Supporting Information Table 1. In addition, 201 protein clusters were identified by peptides not unique to a single protein. Together with the 529 unique proteins, this indicates that a minimum of 730 proteins could be identified from this experiment. Examples of low abundance proteins (