In-Depth Proteomic Profiling of the Normal Human Kidney Glomerulus Using Two-Dimensional Protein Prefractionation in Combination with Liquid Chromatography-Tandem Mass Spectrometry Masahito Miyamoto,†,‡ Yutaka Yoshida,*,† Izumi Taguchi,† Yoshimi Nagasaka,† Masayuki Tasaki,† Ying Zhang,† Bo Xu,† Masaaki Nameta,§ Hiroshi Sezaki,| Lino M. Cuellar,† Tetsuo Osawa,⊥ Hideo Morishita,O Shigeki Sekiyama,# Eishin Yaoita,† Kenjiro Kimura,‡ and Tadashi Yamamoto† Department of Structural Pathology, Institute of Nephrology, Graduate School of Medical and Dental Sciences, Niigata University, Niigata, Japan, Division of Nephrology and Hypertension, Department of Internal Medicine, St. Marianna University School of Medicine, Kawasaki, Japan, Cooperative Laboratory for Electron Microscopy, Niigata University, Niigata, Japan, Agilent Technologies Japan, Ltd., Hachioji, Japan, Department of Urology, Niigata City General Hospital, Niigata, Japan, Department of Urology, Nagaoka Red Cross Hospital, Niigata, Japan, and International Academic Support Office, Niigata University, Niigata, Japan Received April 10, 2007
The kidney glomerulus plays a pivotal role in ultrafiltration of plasma into urine and is also the locus of kidney disease progressing to chronic renal failure. We have focused proteomic analysis on the glomerulus, the tissue most proximal to the disease locus. In the present study, we aimed to provide a confident, in-depth profile of the glomerulus proteome. The glomeruli were highly purified from the kidney cortex of a 68-year-old male patient who underwent nephroureterectomy due to ureter carcinoma. The patient was normal in clinical examinations including serum creatinine and urea levels and liver function, and did not receive any chemotherapy or radiotherapy. The cortical tissue was histologically normal, and no significant deposition of immunoglobulins or complement C3 was observed. We employed a novel strategy of protein separation using 1D (SDS-PAGE) and 2D (solution-phase IEF in combination with SDS-PAGE) prefractionation prior to shotgun analysis with LC-MS/MS. The protein prefractionation produced 90 fractions and eventually provided a confident set of identified proteins consisting of 6686 unique proteins (3679 proteins with two or more peptide matches and 3007 proteins with one peptide match), representing 2966 distinct genes. All the identified proteins were annotated and classified in terms of molecular function and biological process, compiled into 1D and 2D protein arrays consisting of 15 and 75 sections, respectively, corresponding to the protein fractions defined by MW and pI range, and deposited in a Web-based database (http://www.hkupp.org). The most remarkable feature of the glomerulus proteome was a high incidence of identification of cytoskeleton-related proteins, presumably reflecting the well-developed cytoskeletal organization of glomerular cells related to their physiological functions.
Keywords: human kidney • glomerulus • 2D protein prefractionation • solution-phase IEF • database
* To whom correspondence should be addressed. Yutaka Yoshida, Department of Structural Pathology, Institute of Nephrology, Graduate School of Medical and Dental Sciences, Niigata University, 1-757 Asahimachidori, Chuo-Ku, Niigata 951-8510, Japan. Phone, +81 25 227 2152; Fax, +81 25 227 0768; e-mail, [email protected].
† Department of Structural Pathology, Niigata University. ‡ St. Marianna University School of Medicine. § Cooperative Laboratory for Electron Microscopy, Niigata University. | Agilent Technologies Japan, Ltd. ⊥ Niigata City General Hospital. O Nagaoka Red Cross Hospital. # International Academic Support Office, Niigata University.
Journal of Proteome Research 2007, 6, 3680-3690. Published on Web 08/21/2007. 10.1021/pr070203n CCC: $37.00 © 2007 American Chemical Society
Introduction
The glomerulus is the site of plasma ultrafiltration and production of primary urine in the kidney. The structure not only plays a pivotal role in the fundamental kidney function, but also is the locus of various progressive diseases that lead to chronic renal failure. Patients afflicted with these glomerular diseases frequently progress to irreversible loss of renal function and inevitably require renal replacement therapies. The prevention or attenuation of progression of the glomerular diseases is one of the most promising approaches to reduce the number of patients with end-stage renal failure. However, molecular mechanisms underlying most of these diseases are still obscure despite a great number of clinical and experimental studies. Proteomics can be defined as a systemic study of proteins including their expression, functions, and interactions to characterize biological processes under physiological and
pathophysiological conditions.1 Proteomics holds special promise for the discovery of proteins relevant to physiological and pathophysiological processes that have remained veiled in the theory-driven, reductionistic approaches conducted so far. Despite extensive proteomic research, only a limited number of novel markers are used in clinical practice, and the rate of introduction of new protein biomarkers has not increased.2 The reasons for this disjunction are manifold, and could be explained in part by the complexity and wide dynamic range of targeted proteomes, the anticipated low relative abundance of disease-related proteins, and the considerable extent of human and disease variation.2 Most proteomic analyses of kidney diseases are currently conducted with urine,3-7 a biological fluid that is easily and noninvasively collected and is expected to contain thousands of proteins, including proteins secreted or shed by most cells and tissues as well as proteins that leak into the fluid from damaged tissues. Although a limited number of proteomic studies have identified biomarker candidates in urine, the use of tissues more proximal to disease loci, in combination with maximizing the sensitivity to detect and identify low-abundance proteins, is the most straightforward approach to address the current issue of proteomic analysis of diseases.2 We have focused our proteomic study on the glomerulus to acquire a comprehensive profile of the glomerulus proteome under physiological and pathophysiological conditions.8-10 In the present study, we employed two-dimensional fractionation of proteins obtained from highly purified glomeruli on the basis of their molecular weights and isoelectric points. The fractionated proteins were then digested with trypsin and subjected to nanoflow reverse-phase liquid chromatography (RP-LC) coupled with an ion-trap tandem mass spectrometer (MS/MS) with a higher scan speed.
The protein prefractionation produced 90 fractions of intact proteins, resulting in a drastic reduction of the complexity of glomerulus proteome. This proteomic approach enabled us to accomplish an in-depth and confident profiling of the glomerulus proteome resulting in identification of 6686 proteins with one or more peptide hits and 3679 proteins with two or more peptide hits.
Materials and Methods
Subject. This study was approved by the Ethics Committees of Niigata University Faculty of Medicine, Niigata City Hospital, and Nagaoka Red Cross Hospital, and conducted in accordance with their ethical principles. The kidney tissues were obtained, with informed consent and permission, from a 68-year-old male patient who underwent surgical nephroureterectomy due to ureter carcinoma. The patient was normal in clinical examinations including blood cell counts, serum creatinine and urea levels, and liver function, and did not receive any chemotherapy or radiotherapy. Pieces of cortex with normal appearance were excised, and part of the tissue was immediately fixed in methyl Carnoy’s solution for histological examination, embedded in paraffin, sectioned at a thickness of 3 µm, and stained with hematoxylin, periodic acid-Schiff (PAS) stain, and periodic acid methenamine silver (PAM) stain. Another part of the tissue was snap-frozen in n-hexane at -80 °C for immunofluorescence staining with FITC-conjugated goat antibodies against human IgA, IgG, IgM, and C3 (Tago Immunobiologicals, Burlingame, CA). The rest of the cortical tissue was brought to our laboratory in ice-cold PBS within 1 h after removal. Glomeruli were purified to apparent homogeneity, as examined under a phase-contrast microscope, according to the previously described method.8 In this study, we used kidney cortices with no apparent pathologic manifestations.
Sample Preparation. The purified glomeruli were homogenized with a Polytron homogenizer (Kinematica, Littau, Switzerland) in lysis solution (7 M urea, 2 M thiourea, 4% CHAPS, 20 mM DTT, 0.5 mM PMSF, 0.46 mM EDTA-Na, 85 µM bestatin, 14 µM pepstatin A, 11 µM E-64, and 10 µM leupeptin) at 40-80 µL/mg wet weight of glomeruli. The homogenate was adjusted to approximately pH 8.5 with 1 M Tris and incubated at room temperature for 30 min. N,N-Dimethylacrylamide (DMA) was then added to a final concentration of 0.5% (v/v), the mixture was incubated in the dark for an additional 30 min at room temperature, and 2 M DTT was added to a final concentration of 20 mM to quench alkylation of cysteine residues by DMA. The homogenate was centrifuged at 16 000g for 20 min at room temperature, and the resultant supernatant was stored at -80 °C until use. The protein concentration of samples was determined by the method of Ramagli and Rodriguez11 with BSA as a standard.
Solution-Phase Isoelectric Focusing. Solution-phase isoelectric focusing (IEF) was carried out using a ZOOM IEF Fractionator (Invitrogen, Carlsbad, CA) basically according to the instruction manual provided by the manufacturer. Briefly, the protein extract of glomeruli was precipitated by addition of acetone chilled at -20 °C to 80% (v/v), and left standing at -20 °C for more than 30 min. The precipitate was collected by centrifugation at 16 000g for 20 min at 4 °C, air-dried for 10 min at room temperature, and finally dissolved in IEF denaturant (7 M urea, 2 M thiourea, 4% CHAPS, 0.4% Zoom carrier ampholyte pH 3-10 linear range (Invitrogen), 20 mM DTT, and a trace amount of bromophenol blue).
The dissolved sample was diluted with IEF denaturant to 0.6 mg/mL and subjected to fractionation on a five-chamber device separated by immobiline gel membrane discs with pH values of 3.0, 4.6, 5.4, 6.2, 7.0, and 10.0 (Invitrogen). Aliquots (700 µL) of the sample were loaded into each of the five chambers. The proteins were focused for 20 min at 100 V, then 80 min at 200 V, and finally 80 min at 600 V. The fractionated samples were collected and stored at -80 °C until use.
One-Dimensional SDS-PAGE and Two-Dimensional Gel Electrophoresis. The five protein fractions obtained by solution-phase IEF were precipitated with acetone as described above and separated by one-dimensional (1D) SDS-PAGE. Aliquots of the respective samples containing 20 µg of protein were electrophoresed in a 1D SDS-PAGE gel (10%) under the Laemmli buffer system for a total distance of 6 cm and stained with Coomassie Brilliant blue R-250. Each lane was subsequently cut into 4 mm slices, and each slice was collected into an Eppendorf tube containing ultrapure water and subjected to in-gel digestion with trypsin as described below. Protein loss during solution-phase IEF was apparent from the relatively low recovery (∼60%) of total protein in the resulting fractions, presumably due to adsorption on the immobiline gel membrane discs and the walls of the separation chambers. We therefore additionally separated the protein extract on a 1D SDS-PAGE gel without solution-phase IEF to complement, as far as possible, the proteins lost in the process of IEF fractionation.
Two-dimensional gel electrophoresis (2-DE) was performed to confirm fractionation of glomerular proteins by solution-phase IEF. IEF in the first dimension was carried out in a ZOOM IPG Runner (Invitrogen) using 7 cm ZOOM strips of pH 3-10 linear range (Invitrogen). The ZOOM strips were rehydrated overnight at 20 °C with 155 µL of rehydration buffer (8 M urea, 2% CHAPS, 20 mM DTT, 0.5% Zoom carrier ampholyte pH 3-10 linear range, and 0.002% bromophenol blue) containing 3 µg of protein from each of the five fractions, and focused for 20 min at 200 V, followed by 15 min at 450 V, 15 min at 750 V, and finally 30 min at 2000 V. After IEF was completed, the ZOOM strips were equilibrated at room temperature for 15 min with reducing buffer containing NuPAGE sample reducing agent (Invitrogen), followed by 15 min with alkylation buffer containing 125 mM iodoacetamide (IAA). The equilibrated strips were subjected to second-dimensional SDS-PAGE in an XCell SureLock Mini-Cell (Invitrogen) using NuPAGE 4-12% linear gradient Bis-Tris gels (Invitrogen). Electrophoresis was carried out at a constant voltage of 200 V in NuPAGE MOPS Running buffer (Invitrogen). Separated proteins on 2-DE gels were visualized by silver staining using the protocol of the PlusOne silver staining kit (GE Healthcare, Chalfont St. Giles, U.K.).
Protein Identification. As described above, glomerular proteins were prefractionated by 2D prefractionation combining solution-phase IEF and subsequent 1D SDS-PAGE. Lanes of the 1D SDS-PAGE gel, corresponding to the respective protein fractions separated by solution-phase IEF as well as the total proteins without solution-phase IEF fractionation, were cut into 4 mm gel slices, which produced 90 fractions. These gel slices were subjected to in-gel digestion with trypsin (proteomics sequence grade obtained from Sigma, St. Louis, MO) according to the method previously described.9 The tryptic digests were analyzed with a nanoflow LC-iontrap-tandem mass spectrometer (nLC-IT-MS/MS, Agilent 1100 LC/MSD Trap XCT Ultra) equipped with an HPLC nanospray Chip (Protein ID chip #1, Agilent) integrating an enrichment column (40 nL, ZORBAX 300 SB-C18, 5 µm), an analytical column (43 × 0.075 mm, ZORBAX 300 SB-C18, 5 µm), and a spray needle. Mobile phase A was water containing 0.1% formic acid, and mobile phase B was acetonitrile containing 0.1% formic acid. Peptides were eluted at a flow rate of 300 nL/min with a 30 min linear gradient from 2% to 50% B, followed by a 10 min isocratic run at 50% B and a subsequent 10 min isocratic run at 80% B. All samples were analyzed in duplicate by two consecutive LC-MS/MS runs with the same amount of protein load to increase the chance of precursor ion selection for MS/MS events. In addition, the two consecutive sample runs were separated from the next sample by two consecutive blank runs to eliminate carryover from a previously analyzed sample. The mass spectrometer was operated in positive ion mode over the range of 350-2000 m/z in the data-dependent mode to select the four most intense precursor ions for acquisition of MS/MS spectra.
The identification of proteins was performed with the Spectrum Mill MS Proteomics Workbench platform (version A.03.12.060, Agilent Technologies) according to the workflow of Spectrum Mill. Briefly, MS/MS spectra were first extracted from raw data by merging nearby MS/MS spectra from the same precursor ion, reducing the number of noise spectra with minimum sequence tag length and S/N parameters, determining precursor charges where possible, and centroiding MS peaks.
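The elution program described above can be written as a small piecewise function, which may be convenient when the gradient has to be re-entered or compared across instruments. This is only a sketch: the function name `percent_b` is hypothetical, and the assumption that the change from 50% to 80% B is an instantaneous step at 40 min is ours, since the text does not state a ramp time.

```python
def percent_b(t_min: float) -> float:
    """Percentage of mobile phase B at time t_min (minutes) of the elution program.

    Assumption (not stated in the text): the change from 50% to 80% B
    at 40 min is an instantaneous step.
    """
    if not 0 <= t_min <= 50:
        raise ValueError("time is outside the 50 min elution program")
    if t_min <= 30:
        # 30 min linear gradient from 2% to 50% B
        return 2.0 + (50.0 - 2.0) * t_min / 30.0
    if t_min <= 40:
        # 10 min isocratic run at 50% B
        return 50.0
    # final 10 min isocratic run at 80% B
    return 80.0


FLOW_NL_PER_MIN = 300   # flow rate given in the text
PROGRAM_MIN = 50        # 30 + 10 + 10 min
# Total solvent volume delivered over one program, in microliters
total_volume_ul = FLOW_NL_PER_MIN * PROGRAM_MIN / 1000

if __name__ == "__main__":
    for t in (0, 15, 30, 35, 45):
        print(f"{t:>2} min: {percent_b(t):.1f}% B")
    print(f"volume per run: {total_volume_ul} uL")
```

At 300 nL/min the whole 50 min program delivers about 15 µL of solvent per run.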
The data extraction was performed with the following parameters: fixed modification, carbamidomethylation; MH+ range, 600-4000 Da; sequence tag length, more than 1; window settings for merging spectra from the same precursor, retention time within ± 1.5 s and mass value within ± 1.4 m/z; maximum charge, +7; minimum S/N, 25. The extracted and processed MS spectra were searched against an in-house-built IPI human protein database (version 3.18). Carbamidomethylation was set as a fixed modification, and oxidized methionine as a variable modification. Mass tolerance was ± 2.5 Da on MS peaks and ± 0.7 Da on MS/MS peaks. Two missed cleavages with trypsin were allowed. The instrument setting was specified as “ESI ion trap”. Search results were then autovalidated to select reliable protein identifications on the basis of the peptide score and the summation of peptide scores of peptides matched to the same protein. The peptide score indicates the quality of the match between an MS/MS spectrum and the sequence of an identified protein and was calculated on the basis of similarity between the acquired experimental MS/MS spectrum and a theoretical spectrum. The similarity is evaluated by the number of peaks common to the two spectra; bonus points are awarded depending on the ion type (b or y), and penalty points are assigned for unmatched peaks, which are inversely proportional to the relative peak intensity of the unmatched fragment ion.12 In addition, we manually inspected the quality of MS/MS spectra attributed to the autovalidated protein identifications. MS/MS spectra that did not pass the threshold of the autovalidation process were manually inspected and retrieved as valid spectra if a series of at least three consecutive y- and/or b-ion fragments was observed that was evenly distributed in the spectrum and detected at a high S/N ratio, as described in the instruction manual of Spectrum Mill (Agilent G2721AA). Furthermore, false-positive peptide hits were filtered based on the difference between the scores obtained from the forward and the reversed sequence databases, and the difference between the scores of rank 1 and rank 2 protein hits, according to the criteria of Spectrum Mill. Through the validation process, 112 241 spectra (5.5%) were finally selected as validated MS/MS spectra among the total spectra (2 052 113) produced by the mass spectrometer.
The false-positive rate was estimated for each sample by dividing the number of peptides identified in the randomized IPI human protein database by the number of peptides identified in the normal IPI human database, and was calculated to be 1.86 ± 0.24% (mean ± SEM).
Annotations and Compilation of Identified Proteins. The protein identification process provided fundamental annotations of the identified proteins, including IPI protein accession number, protein name, and theoretical molecular weight and pI, together with parameters for protein identification intrinsic to LC-MS/MS analysis (number of matched peptides, number of distinct peptides, identification score, and sequence coverage). In addition, we annotated the identified proteins with further information obtained by searching public-domain databases. These annotations include gene name, accession numbers of protein databases including Swiss-Prot, RefSeq, and TrEMBL, and functional classifications (molecular function and biological process) based on the Panther classification system (www.pantherdb.org). Panther classification is based on experimental evidence and evolutionary relationships to predict function, and its ontologies are controlled, structured vocabularies of molecular function and biological process terms similar to those of the Gene Ontology Consortium (GO), but greatly abbreviated and simplified to facilitate high-throughput analyses. Tools for mapping Panther ontology terms to GO terms and vice versa are available.13 As described above, the glomerular proteins were separated by 1D SDS-PAGE and by a 2D separation strategy combining solution-phase IEF and 1D SDS-PAGE, producing 90 fractions.
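The two summary figures reported under Protein Identification, the share of validated spectra and the per-sample false-positive rate given as mean ± SEM, follow from simple ratios. The sketch below illustrates the arithmetic only. The spectrum counts are those given in the text, whereas the per-sample decoy and target peptide counts are hypothetical placeholders (the per-sample values are not reported), and the function names are ours.

```python
from math import sqrt
from statistics import mean, stdev


def false_positive_rate(decoy_hits: int, target_hits: int) -> float:
    """Percent ratio of peptides matched in the randomized database
    to peptides matched in the normal database, for one sample."""
    if target_hits <= 0:
        raise ValueError("target_hits must be positive")
    return 100.0 * decoy_hits / target_hits


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample SD."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to compute an SEM")
    return mean(values), stdev(values) / sqrt(n)


# Spectrum counts reported in the text
validated_percent = 100.0 * 112_241 / 2_052_113

# HYPOTHETICAL (decoy, target) peptide counts for three samples
example_counts = [(2, 100), (3, 200), (5, 200)]
rates = [false_positive_rate(d, t) for d, t in example_counts]
fpr_mean, fpr_sem = mean_sem(rates)

if __name__ == "__main__":
    print(f"validated spectra: {validated_percent:.1f}%")
    print(f"false-positive rate: {fpr_mean:.2f} +/- {fpr_sem:.2f}% (mean +/- SEM)")
```

The SEM here uses the sample standard deviation divided by the square root of the number of samples; the text does not say which convention was used.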
As each fraction was defined by physicochemical properties, that is, MW range and pI range, we constructed 1D and 2D protein arrays consisting of 90 sections corresponding to the 90 protein fractions and compiled the identified proteins into these sections on the basis of the fraction(s) in which a particular protein was identified (Figure 5).

Figure 1. A representative phase-contrast micrograph of a highly purified preparation of glomeruli. The glomeruli preparation used in this study is shown. Original magnification 200×.

Results and Discussion
Subject. We obtained kidney tissue from a 68-year-old male patient who underwent nephroureterectomy due to ureter carcinoma. The cortical tissue with normal appearance was excised and used to purify glomeruli. The glomeruli were purified to apparent homogeneity and did not contain any tubules under light microscopy (Figure 1), although contamination of the purified glomeruli preparation with a small amount of blood-derived components was unavoidable. Ex vivo perfusion of the surgically resected kidney with physiological solution would effectively remove most of the blood-derived contaminants. However, the perfusion was practically difficult to perform because we could not obtain permission from the hospital where surgery was carried out, and the kidney was available to us only after dissection for removal of tissue for histological examination. The tissue used in this study was judged histologically normal by light microscopic examination (Figure 2), and no significant deposition of immunoglobulins (IgA, IgG, and IgM) or complement C3 was observed by immunofluorescence microscopy (data not shown). However, minimal age-related changes (the patient was 68 years old), including a slight widening of the mesangium, a small proportion of collapsed glomeruli, focal interstitial infiltration of lymphocytes, tubular atrophy, and hyaline degeneration, were present. Thus, the glomerulus proteome we analyzed in this study may be slightly affected by age-related changes. Clinical examination data were normal, and the patient was not treated with any chemotherapy or radiotherapy before nephroureterectomy, suggesting no influence of the tumor or medical treatments on the glomerulus proteome. However, we could not exclude the possibility of glomerular diseases undetectable with our histological examinations, since tumor-associated glomerulonephropathies have been documented, as reported by Magyarlaki et al.14 We analyzed glomeruli obtained from one subject. The database we constructed, therefore, cannot be assumed to be universally valid and may be biased by the genetic background and the physiological and pathophysiological conditions of the subject. We aimed in this study to establish and validate the most appropriate proteomic analysis for in-depth profiling of the glomerulus proteome, which could provide basic information regarding the depth and width of the human glomerulus proteome. We are now attempting to establish a platform for comprehensive, quantitative proteomic analysis of human glomeruli isolated from biopsy samples, which will contribute to clinical proteomics of glomerular diseases leading to discovery of protein biomarkers and drug targets.
Protein Identification Strategy. Reducing sample complexity prior to shotgun protein identification by LC-MS/MS is one of the critical steps in a comprehensive analysis of a targeted proteome. We employed two-dimensional (2D) fractionation of glomerular proteins by a combination of solution-phase IEF and SDS-PAGE (2D prefractionation), which was developed by Speicher and his colleagues for profiling of human plasma and serum proteomes.15 In addition to the 2D prefractionation, we separately fractionated the total glomerular proteins by one-dimensional (1D) SDS-PAGE without fractionation by solution-phase IEF to complement, as far as possible, the proteins lost in the IEF fractionation (1D prefractionation). Lanes of SDS-PAGE gels corresponding to the fractions separated by solution-phase IEF as well as the total proteins were cut into 4 mm gel slices, which produced 90 fractions consisting of 75 fractions from the 2D prefractionation and 15 fractions from the 1D prefractionation. These gel slices were subjected to in-gel digestion with trypsin and analyzed by a nanoflow LC-iontrap-tandem mass spectrometer (nLC-IT-MS/MS) for protein identification. With this prefractionation strategy, proteins identified in a specific fraction were associated with their intrinsic physicochemical properties, that is, a range of MW and pI. This structural information on the intact proteins is useful as supporting evidence for confident protein identification. The protein load capacity of the 2D prefractionation was up to 2 mg. The high protein capacity coupled with extensive prefractionation in our protein identification strategy should provide sufficient sensitivity to identify low-abundance proteins. A diagram of the workflow of our protein identification strategy is shown in Figure 3.
2D Prefractionation of Glomerular Proteins. A large amount of glomerular proteins (2 mg) was loaded and separated by solution-phase IEF, which produced 5 fractions with defined pH ranges: 3.0-4.6, 4.6-5.4, 5.4-6.2, 6.2-7.0, and 7.0-10.0. The fractionation of glomerular proteins by solution-phase IEF was verified by two-dimensional gel electrophoresis (2-DE) using IPG strips of pH range 3-10. As shown in Figure 4, the proteins in the respective solution-phase IEF fractions were largely confined to the defined pH range, and cross-contamination between fractions was minimal, although some spots were noticed in other pH ranges, possibly due to insufficient focusing either in the solution-phase IEF or in the first-dimensional IEF of the 2-DE separation. This incorrect focusing is consistent with the observed degree of horizontal streaking of some proteins on the 2-DE gels. In addition to incorrect focusing, the possibility that proteolysis or artificial modifications resulted in the appearance of protein spots outside the expected pH range could not be excluded. The fractionation by solution-phase IEF was reproducible, since repeated separation with solution-phase IEF followed by analysis with 2-DE gave almost the same result (data not shown).
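The fraction bookkeeping behind this strategy can be checked with a few lines of arithmetic. The sketch below derives the five pH ranges from consecutive membrane-disc pH values, the number of 4 mm slices in a 6 cm lane, the resulting 75 + 15 = 90 fractions, and the protein load implied by the Methods (five chambers of 700 µL at 0.6 mg/mL). The six disc values are inferred here from the five pH ranges reported above, and all variable names are hypothetical.

```python
# Membrane-disc pH values bounding the five chambers
# (inferred from the five reported pH ranges)
DISC_PH = [3.0, 4.6, 5.4, 6.2, 7.0, 10.0]

# Each chamber spans the interval between two neighbouring discs
ph_ranges = list(zip(DISC_PH[:-1], DISC_PH[1:]))

LANE_LENGTH_MM = 60   # SDS-PAGE separation distance of 6 cm
SLICE_MM = 4          # width of each gel slice
slices_per_lane = LANE_LENGTH_MM // SLICE_MM

fractions_2d = len(ph_ranges) * slices_per_lane   # five IEF lanes
fractions_1d = slices_per_lane                    # one unfractionated lane
total_fractions = fractions_1d + fractions_2d

CHAMBER_VOLUME_ML = 0.700   # 700 uL aliquot per chamber
SAMPLE_MG_PER_ML = 0.6      # diluted sample concentration
loaded_mg = len(ph_ranges) * CHAMBER_VOLUME_ML * SAMPLE_MG_PER_ML

if __name__ == "__main__":
    for i, (low, high) in enumerate(ph_ranges, start=1):
        print(f"Fr. {i}: pH {low}-{high}")
    print(f"slices per lane: {slices_per_lane}")
    print(f"fractions: {fractions_1d} (1D) + {fractions_2d} (2D) = {total_fractions}")
    print(f"protein loaded on IEF: about {loaded_mg:.1f} mg")
```

Under these assumptions the computed load is about 2.1 mg, in line with the roughly 2 mg stated in the text.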
Figure 2. Histological examination of kidney cortex of the subject in this study. A part of the tissue used for purification of glomeruli was fixed in methyl Carnoy’s solution, embedded in paraffin, sectioned at 3 µm, and stained with PAS (A) and PAM (B) stains. Immunofluorescence staining of frozen sections from the same specimen indicated no significant deposits of IgA, IgG, IgM, or complement C3 (data not shown). Original magnification 165×.
Figure 3. Diagram of protein identification strategy. Proteins extracted from highly purified glomeruli were cysteine-alkylated. A part of the protein extract was directly separated by SDS-PAGE (1D prefractionation), and the rest (2 mg) was separated by solution-phase IEF into 5 fractions and then by SDS-PAGE (2D prefractionation) for a total distance of 6 cm from the top of the separation gel. The lanes corresponding to the unfractionated total proteins as well as each of the 5 fractions separated by solution-phase IEF were cut into uniform 4 mm slices, producing 15 fractions (1D prefractionation) and 75 fractions (2D prefractionation), which were in-gel-digested with trypsin and analyzed by nanoflow LC-iontrap MS/MS. The protein identifications were conducted by searching the IPI human protein database using Spectrum Mill as the search engine.

The total protein recovery of the five fractions of solution-phase IEF was 55 ± 3.9% (mean ± SEM) in 4 replicate fractionations. The recovery was in good agreement with that reported by Zuo and Speicher.16 The loss of protein was attributable to adsorption of proteins on the immobiline gel membrane discs and the walls of the separation chambers, and to the loss of proteins with pI values outside the pH 3-10 range of the separation chambers into the two electrode solutions.16 We did not further analyze the proteins lost in the solution-phase IEF process because elution and recovery of these proteins were variable and would have introduced significant uncertainty and inconsistency into our comprehensive proteomic analysis. Since the loss of protein in the fractionation with solution-phase IEF was relatively high, we additionally fractionated the total protein by 1D SDS-PAGE (1D prefractionation) without solution-phase IEF fractionation to complement the proteins lost in the solution-phase IEF. Obviously, the 1D prefractionation could only partly compensate for the protein loss, and it should be noted that a significant number of the proteins lost in the solution-phase IEF, especially those with extreme pIs and those in low abundance, are missing from the list of identified proteins reported in this study.
Protein Identifications. As described above, glomerular proteins were prefractionated by 1D prefractionation using 1D SDS-PAGE and by 2D prefractionation using the solution-phase IEF coupled with SDS-PAGE (Figure 5A). The lanes in SDSPAGE gels corresponding to the unfractionated total protein (1D prefractionation) as well as the fractions separated by 2D prefractionation were cut into 4 mm gel slices, producing 90 fractions (Figure 5A). The gel slices were in-gel-digested with trypsin and analyzed for protein identification by a nanoflow LC-iontrap-tandem mass spectrometer (nLC-IT-MS/MS). The two consecutive LC-MS runs of the same sample were adopted to increase the number of protein identifications, since the number of detected peptide ions could be influenced by the random selection for MS/MS event especially in the analysis of complex mixture with ion trap type of mass spectrometer.17 In fact, it was demonstrated that multiple LC-MS runs of the same sample identified distinct proteins in each run.18 The two consecutive runs, coupled with IT-MS/MS with a higher scanning speed of MS and MS/MS spectra, resulted in identifications of a large number of proteins as summarized in Table 1. The protein identification strategy we adopted in the present study provided a confident set of identified proteins consisting of 6686 unique proteins. We categorized the identified proteins into those with high confidence and those with lower confi-
Comprehensive Profiling of Glomerulus Proteome
research articles
Figure 4. Evaluation of sample fractionation with solution-phase IEF by two-dimensional gel electrophoresis (2-DE). A representative result is shown. After fractionation of glomerular proteins using solution-phase IEF to five fractions: B, fraction 1 (pH 3-4.6); C, fraction 2 (pH 4.6-5.4); D, fraction 3 (pH 5.4-6.2); E, fraction 4 (pH 6.2-7.0); F, fraction 5 (pH 7.0-10.0), each of the fractionated sample (3 µg) was separated on 2-DE gels using IPG strips of pH 3-10 in the first dimensional IEF and 4-12% gradient gel in the second dimensional SDS-PAGE and stained with silver nitrate. Additionally, total protein sample without fractionation by solution-phase IEF was run as a reference (A, unfractionated).
dence according to the number of matched peptides (Table 1). Under the default settings, Spectrum Mill search engine generates groups of the identified proteins on the basis of matched peptides: identified proteins are included in the same group if they share the same matched peptide(s), and proteins in the same group are then sorted and ranked according to
their scores. By the protein grouping, most of identified proteins were grouped in terms of alternative splice forms, families, and paralogs. The grouping of proteins is useful to detect ambiguity in protein identifications. In principle, detailed inspection of shared or not shared distinct peptides among identified proteins in a protein group discriminate proteins Journal of Proteome Research • Vol. 6, No. 9, 2007 3685
research articles
Miyamoto et al.
Figure 5. Distribution of identified glomerular proteins in one-dimensional (1D) and two-dimensional (2D) protein arrays. In panel A, SDS-PAGE profiles of 1D and 2D protein prefractionation were shown with assigned sections, each of which corresponds to protein fractions separated by 1D and 2D prefractionation of glomerular proteins, and are defined with MW and pI range. The unfractionated total proteins (1D, fraction O) and the five fractions (2D, fractions A-E or F1-F5) separated by solution-phase IEF were loaded on 10% SDS-PAGE gels, electrophoresed for a distance of 6 cm, and stained with Coomassie Brilliant blue. It should be noted that some high-abundance proteins were confined to particular sections, which brought about a considerable reduction of complexity of protein mixture in other sections and would lead to efficient identification of low-abundance proteins. In panel B, the number of identified proteins in each of the sections in 1D and 2D protein arrays is indicated. A preliminary form of database constructed from the protein arrays and all the identified proteins with fundamental annotations is available on a Web site (http://www.hkupp.org).
Table 1. Summary of Protein Identifications by 1D and 2D Protein Prefractionation Strategy

                                        number of identified proteins
protein prefractionation      high confidence (a)   lower confidence (b)   total identified proteins (c)
1D prefractionation (d)              2630                   797                     3427
2D prefractionation (e)
  Fr. 1 (pH 3-4.6)                    992                  1088                     2080
  Fr. 2 (pH 4.6-5.4)                 1422                  1155                     2577
  Fr. 3 (pH 5.4-6.2)                 1140                  1144                     2284
  Fr. 4 (pH 6.2-7.0)                 1721                  1289                     3010
  Fr. 5 (pH 7.0-10.0)                1309                  1177                     2486
  Total number                       3653                  2684                     6337
Total distinct proteins (f)          3679                  3007                     6686

(a) The number of proteins identified with two or more peptide matches. (b) The number of proteins identified with one peptide match. (c) The number of proteins identified with one or more peptide matches. (d) Glomerular proteins were directly separated by one-dimensional (1D) SDS-PAGE. (e) Glomerular proteins were separated by two-dimensional (2D) fractionation consisting of solution-phase IEF in the first dimension and SDS-PAGE in the second dimension. (f) The number of distinct proteins identified by 1D and 2D protein prefractionation strategies including those commonly identified by both strategies.
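As a rough illustration of the grouping rule described in the text (proteins that share a matched peptide fall into one group, and members are ranked by score), the following Python sketch groups proteins with a union-find structure. It is not Spectrum Mill's implementation: the transitive merging of groups, the function name `group_proteins`, and the example identifiers, peptides, and scores are all hypothetical.

```python
from collections import defaultdict


def group_proteins(matches, scores):
    """Group proteins that share at least one matched peptide.

    matches: dict mapping protein name -> set of matched peptide labels
    scores:  dict mapping protein name -> numeric score
    Returns a list of groups, each a list of protein names sorted by
    descending score; groups are ordered by their top member's score.
    """
    parent = {protein: protein for protein in matches}

    def find(p):
        # Follow parent links to the group representative (path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Link each protein to the first protein seen with the same peptide.
    first_owner = {}
    for protein, peptides in matches.items():
        for peptide in peptides:
            if peptide in first_owner:
                parent[find(protein)] = find(first_owner[peptide])
            else:
                first_owner[peptide] = protein

    groups = defaultdict(list)
    for protein in matches:
        groups[find(protein)].append(protein)

    ranked = [sorted(members, key=lambda p: (-scores[p], p))
              for members in groups.values()]
    ranked.sort(key=lambda g: (-scores[g[0]], g[0]))
    return ranked


# Hypothetical example: two isoforms share "pep1"; a third protein is unrelated.
example_matches = {
    "isoform_1": {"pep1", "pep2"},
    "isoform_2": {"pep1"},
    "other": {"pep3"},
}
example_scores = {"isoform_1": 30.0, "isoform_2": 12.0, "other": 20.0}
example_groups = group_proteins(example_matches, example_scores)
```

In this toy input the two isoforms end up in one group, with the higher-scoring isoform ranked first. Whether the lower-ranked member is truly present is exactly the ambiguity the text describes.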
actually present in the original protein mixture (biological redundancy) from proteins identified with uncertainty arising from biological redundancy or bioinformatics redundancy, in which a shared peptide(s) could be mapped to more than one protein sequence. We did not further analyze our data set to resolve proteins identified with this type of ambiguity and did not exclude those proteins from the list of identified proteins. In addition, we excluded proteins belonging to the keratin family. Most of the keratin isoforms identified in this study were highly likely to be contaminants. There have been no reports demonstrating expression of keratin in the glomerulus, and our recent proteomic analysis of in-solution trypsin digests of isolated glomeruli from normal human kidney failed to detect any peptides derived from keratin isoforms (data not shown),
Journal of Proteome Research • Vol. 6, No. 9, 2007
Figure 6. Two-way Venn diagram showing the overlap between the proteins identified by the 1D and 2D prefractionation strategies. The 1D prefractionation identified 3427 proteins and the 2D prefractionation identified 6337 proteins. The number of proteins identified by both prefractionation strategies was 3078, giving a total of 6686 identified proteins. The number of identified proteins represents the total identified proteins, including those with high confidence and lower confidence (Table 1).
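The counts in the Venn diagram can be cross-checked with inclusion-exclusion. The short Python sketch below uses only the three numbers reported in the caption; the function name `venn_counts` and the dictionary keys are illustrative choices.

```python
def venn_counts(n_1d, n_2d, n_common):
    """Split two overlapping sets into exclusive parts and their union."""
    only_1d = n_1d - n_common
    only_2d = n_2d - n_common
    return {
        "only_1d": only_1d,
        "only_2d": only_2d,
        "common": n_common,
        "union": only_1d + only_2d + n_common,
    }


# Numbers reported for the 1D and 2D prefractionation strategies.
counts = venn_counts(n_1d=3427, n_2d=6337, n_common=3078)
```

The result gives 349 proteins unique to 1D, 3259 unique to 2D, and a union of 6686, in agreement with the totals in Table 1.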
although we cannot exclude the possibility of expression of certain keratin isoforms in the glomerulus.19 The complete list of identified proteins along with their associated groups is available in Table 1 of the Supporting Information. The 1D prefractionation strategy identified 3427 proteins, while the 2D prefractionation identified 6337 proteins. Not surprisingly, a considerable number of proteins (3078) were identified by both strategies (Figure 6). Beyond these shared proteins, 2D prefractionation identified an additional 3259 unique proteins. This result clearly demonstrates the advantage of 2D prefractionation over 1D prefractionation in sensitivity for identifying low-abundance proteins. Interestingly, it should be noted that 349
Figure 7. Distribution in the 2D protein array of ZO-1 (A), nephrin (B), and paxillin (C), proteins known to be expressed in the podocyte of the glomerulus. Sections in which these proteins were identified are highlighted in graded shades of red corresponding to the number of peptides matched to the respective proteins.
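One way to read such a distribution, following the reasoning in the text that the section with the most distinct peptides is where a protein is most enriched, is sketched below in Python. All values are hypothetical: the peptide counts, the MW ranges, and the theoretical MW and pI do not describe ZO-1, nephrin, or paxillin, and the function names `most_enriched_section` and `matches_theoretical` are invented for this example. Only the pH ranges of the IEF fractions come from the paper.

```python
def most_enriched_section(peptide_counts):
    """Return the section with the largest number of distinct matched peptides.

    peptide_counts maps a section key, here (IEF fraction, MW section), to
    the number of distinct peptides matched to one protein in that section.
    Ties are resolved in favour of the section listed first.
    """
    return max(peptide_counts, key=peptide_counts.get)


def matches_theoretical(section_range, theoretical_mw, theoretical_pi):
    """Check whether theoretical MW (kDa) and pI fall inside a section's ranges."""
    (mw_low, mw_high), (pi_low, pi_high) = section_range
    return (mw_low <= theoretical_mw <= mw_high
            and pi_low <= theoretical_pi <= pi_high)


# Hypothetical protein detected in three sections of the 2D array.
example_counts = {(2, 3): 4, (2, 4): 11, (3, 4): 2}

# Hypothetical MW ranges (kDa) paired with the pH ranges of fractions 2 and 3.
example_ranges = {
    (2, 3): ((150.0, 250.0), (4.6, 5.4)),
    (2, 4): ((100.0, 150.0), (4.6, 5.4)),
    (3, 4): ((100.0, 150.0), (5.4, 6.2)),
}

best = most_enriched_section(example_counts)
consistent = matches_theoretical(example_ranges[best],
                                 theoretical_mw=120.0, theoretical_pi=5.0)
```

A protein whose best section agrees with its theoretical MW and pI, as in this toy case, would be considered more reliably identified than one found only in sections far from its expected position.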
proteins were uniquely identified by the 1D prefractionation strategy, suggesting that a significant fraction of glomerulus proteins was lost in the solution-phase IEF fractionation step of the 2D prefractionation. The proteins identified only by 1D prefractionation may thus complement, in part, the comprehensive profiling of the glomerulus proteome.
Compiling of Identified Proteins into 1D and 2D Protein Arrays. Protein identification was conducted on each of the 90 fractions produced by the 1D and 2D prefractionations, and each fraction was defined by its MW and pI range, so that proteins identified in a fraction are associated with the physicochemical properties of the intact proteins. The 1D and 2D prefractionation strategy for protein identification confers a considerable advantage over conventional peptide-based proteomic analysis using multidimensional peptide separation by HPLC coupled with tandem mass spectrometry, since the latter approach loses the connectivity between peptides and their protein precursors. We compiled the proteins identified in the present study into a 1D protein array consisting of 15 sections defined by MW range and into a 2D protein array consisting of 75 sections defined by MW and pI range, as illustrated in Figure 5. A preliminary form of the database, including the compiled protein arrays and the list of all identified proteins with annotations, was constructed and deposited in an accessible form on a Web site (http://www.hkupp.org). The physicochemical properties intrinsic to the proteins identified in the present study could be used to exclude ambiguously identified proteins and to reduce redundancy in protein identifications arising from bioinformatics redundancy. We have now undertaken to construct a nonredundant database as a subdivision of the database of the human glomerulus proteome, now available on the Web, by
taking advantage of the MW and pI information and by detailed inspection of the peptides matched to proteins included in the same group, as discussed above. The distributions in the 2D protein array of three proteins known to be expressed in the podocyte are depicted in Figure 7: nephrin and ZO-1 (proteins associated with the slit diaphragm) and paxillin (a component of the integrin-associated focal adhesion complex). Nephrin and ZO-1 were identified in multiple sections of the 2D protein array. For both nephrin and ZO-1, however, the MW and pI range of the section with the largest number of distinct peptides (peptides matched to the protein) was closest to the theoretical MW and pI. As the number of observed matched peptides, as well as the hit rank and the matching score, can be considered indicators of protein abundance in a large-scale, comprehensive proteomic analysis,20,21 it is reasonable to assume that nephrin and ZO-1 were most enriched in the sections with the largest number of distinct peptides. Their identification in other sections could be explained by insufficient focusing in the solution-phase IEF separation, broad band zones in SDS-PAGE, post-translational modifications, and proteolytic degradation during protein fractionation. The wide distribution over multiple sections may also arise from bioinformatics redundancy, as discussed above. In contrast, paxillin was identified in only one section, and the MW and pI range of that section matched well with the theoretical MW and pI of paxillin, supporting the reliability of its identification. As exemplified above, the 1D and 2D protein prefractionation strategy and the compiling of identified proteins into 1D and 2D protein arrays not only provide mapping of a complex proteome in a manner similar to two-dimensional
Figure 8. Pie charts of functional classification of all the identified proteins of the glomerulus of normal human kidney in terms of molecular function (A) and biological process (B). The identified proteins with two or more matched peptides (high-confidence set) were analyzed with the Panther classification system. The result using all identified proteins (high-confidence and low-confidence sets) was similar (data not shown).
electrophoresis, but can also substantiate the reliability of protein identification.
Classification of Identified Proteins and Characterization of Glomerulus Proteome. All the identified proteins with high
confidence (3679 proteins), representing 2966 unique genes, were classified in terms of molecular function and biological process using the Panther classification system. Panther is software freely accessible on the Web (www.pantherdb.org) and
provides a platform for assigning families, functional classifications, and pathways to gene products.13 The results of the Panther classification are shown as pie charts in Figure 8. In the classification based on molecular function, proteins categorized as “cytoskeletal protein” were the most abundant, amounting to 12% of the total identified proteins, which was much higher than the proportion of genes assigned to cytoskeletal proteins (2.9%) among all human genes. Furthermore, the proportion of cytoskeletal proteins in the glomerulus proteome was higher than that (7%) in the human brain proteome,18 which was obtained by analysis with GoMiner (gominer.sourceforge.net). In the classification based on biological process, “signal transduction” (9%), “cell structure and motility” (9%), and “immunity and defense” (9%) were the main categories among the identified proteins. A similar result was obtained when all identified proteins with both high and low confidence (6686 proteins) were analyzed. The glomerulus is heterogeneous in its cellular components and is composed of different cell types (endothelial cells, mesangial cells, podocytes) and basement membrane. A systematic description of the characteristics of the glomerulus proteome on the basis of the comprehensive proteomic analysis accomplished in this study is, therefore, difficult and awaits further detailed analysis using isolated cells and/or cellular components. The significantly higher proportion of cytoskeletal proteins in the glomerulus proteome, however, may reflect the highly specialized architecture of the glomerulus. The glomerular capillary tufts are exposed to high and variable intraluminal hydrostatic pressures.
On the basis of morphological analyses, it has been proposed that the contractile mesangial cells as well as podocytes generate inwardly directed forces that afford stability to the capillary loop by opposing the expansile force of high hydrostatic glomerular capillary pressure.22 Mesangial cells are enriched in myosin-associated cytoplasmic filaments,23 and podocytes have highly organized cytoskeletal networks, especially in their foot processes.24 The glomerulus is a highly specialized structure that functions as the fundamental part of the filtration barrier in the kidney. The filtration barrier between the blood and the urinary space is composed of fenestrated endothelial cells, the glomerular basement membrane (GBM), and the slit diaphragm formed between foot processes of adjacent podocytes. The podocyte foot process is not a static structure, but rather contains well-developed, contractile, and specialized cytoskeletal organizations that are modified by a unique assembly of actin-associated proteins. The remarkably higher incidence of identification of cytoskeleton-associated proteins in the glomerulus might be explained by the well-developed cytoskeletal organizations in podocytes and mesangial cells. The identified proteins relevant to the podocyte and its foot process are summarized in Table 2 of the Supporting Information.
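The size of the cytoskeletal over-representation discussed above can be expressed as a simple ratio of proportions. The Python sketch below uses the three percentages reported in the text (12% in the glomerulus proteome, 2.9% of all human genes, 7% in the human brain proteome). The function name `fold_enrichment` is illustrative, and the ratio is a descriptive figure only, not a statistical test performed in the study.

```python
def fold_enrichment(observed_fraction, reference_fraction):
    """Return the ratio of an observed proportion to a reference proportion."""
    if reference_fraction <= 0:
        raise ValueError("reference_fraction must be positive")
    return observed_fraction / reference_fraction


# Proportions of cytoskeletal proteins reported in the text.
GLOMERULUS = 0.12
HUMAN_GENES = 0.029
BRAIN_PROTEOME = 0.07

vs_genes = fold_enrichment(GLOMERULUS, HUMAN_GENES)
vs_brain = fold_enrichment(GLOMERULUS, BRAIN_PROTEOME)
```

With these inputs the glomerulus proportion is roughly four times that of the human gene set and a little under twice that of the brain proteome. The brain value was obtained with a different classification tool (GoMiner), so the second comparison is only approximate.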
Conclusions
When 1D and 2D prefractionation was employed prior to shotgun analysis by LC-MS/MS, glomerular proteins of normal human kidney were fractionated into 90 fractions, which drastically reduced sample complexity and made the identification of low-abundance proteins feasible. The protein identification strategy provided a comprehensive, confident set of identified proteins consisting of 6686 unique proteins, representing 2966 distinct genes, of which 3679 proteins were identified with two or more peptide matches (high-confidence set) and 3007 proteins were identified with
one peptide match (low-confidence set). The identified proteins were annotated, classified in terms of molecular function and biological process, and compiled into 1D (15 sections) and 2D (75 sections) protein arrays, in which each section was defined by its MW and pI range, physicochemical properties intrinsic to intact proteins. In the classification based on molecular function, the most remarkable feature of the glomerulus proteome was a higher incidence of identification of cytoskeleton-related proteins, presumably reflecting the well-developed cytoskeletal organization of glomerular cells. The in-depth profiling of the glomerulus proteome of normal human kidney was used to construct a Web-based database (http://www.hkupp.org) and can contribute to clinical proteomic research for understanding the pathophysiology of kidney diseases as well as to the discovery of disease-related proteins.
Acknowledgment. This study was supported by a Grant-in-Aid for Scientific Research (B) (17390247) from the Japan Society for the Promotion of Science, and a Grant for Promotion of Niigata University Research Project.

Supporting Information Available: We have constructed a preliminary form of a database of the glomerulus proteome including all the proteins identified in this study, which is freely accessible on the Web site (http://www.hkupp.org). Table 1, a list of all the identified proteins; Table 2, the identified proteins relevant to the podocyte and its foot process. This material is available free of charge via the Internet at http://pubs.acs.org.

References
(1) Peng, J.; Gygi, S. P. Proteomics: the move to mixtures. J. Mass Spectrom. 2001, 36 (10), 1083-1091.
(2) Rifai, N.; Gillette, M. A.; Carr, S. A. Protein biomarker discovery and validation: the long and uncertain path to clinical utility. Nat. Biotech. 2006, 24 (8), 971-983.
(3) Cutillas, P.; Burlingame, A.; Unwin, R. Proteomic strategies and their application in studies of renal function. News Physiol. Sci. 2004, 19 (3), 114-119.
(4) O’Riordan, E.; Goligorsky, M. S. Emerging studies of the urinary proteome: the end of the beginning. Curr. Opin. Nephrol. Hypertens. 2005, 14 (6), 579-585.
(5) Thongboonkerd, V.; Malasit, P. Renal and urinary proteomics: Current applications and challenges. Proteomics 2005, 5 (2), 1033-1042.
(6) Schiffer, E.; Mischak, H.; Novak, J. High resolution proteome/peptidome analysis of body fluids by capillary electrophoresis coupled with MS. Proteomics 2006, 6 (20), 5615-5627.
(7) Pisitkun, T.; Johnstone, R.; Knepper, M. A. Discovery of urinary biomarkers. Mol. Cell. Proteomics 2006, 5 (10), 1760-1771.
(8) Yoshida, Y.; Miyazaki, K.; Kamiie, J.; Sato, M.; Okuizumi, S.; Kenmochi, A.; Kamijo, K.; Nabetani, A.; Xu, B.; Zhang, Y.; Yaoita, E.; Osawa, T.; Yamamoto, T. Two-dimensional electrophoretic profiling of normal human kidney glomerulus proteome and construction of an extensible markup language (XML)-based database. Proteomics 2005, 5 (4), 1083-1096.
(9) Xu, B.; Yoshida, Y.; Zhang, Y.; Yaoita, E.; Osawa, T.; Yamamoto, T. Two-dimensional electrophoretic profiling of normal human kidney: differential protein expression in glomerulus, cortex and medulla. J. Electrophoresis 2005, 49 (1), 5-13.
(10) Zhang, Y.; Yoshida, Y.; Xu, B.; Nameta, M.; Miyamoto, M.; Yaoita, E.; Yamamoto, T. Localization of tyrosine-phosphorylated proteins in normal rat kidney. Acta Med. Biol. 2007, 55 (1), 1-7.
(11) Ramagli, L. S. Quantifying protein in 2-D PAGE solubilization buffers. Methods Mol. Biol. 1999, 112, 99-103.
(12) Kapp, E. A.; Schultz, F.; Connolly, L. M.; Chakel, J. A.; Meza, J. E.; Miller, C. A.; Fenyo, D.; Eng, J. K.; Adkins, J. N.; Omenn, G. S.; Simpson, R. J. An evaluation, comparison, and accurate benchmarking of several publicly available MS/MS search algorithms: sensitivity and specificity analysis. Proteomics 2005, 5 (13), 3475-3490.
(13) Mi, H.; Guo, N.; Kejariwal, A.; Thomas, P. D. PANTHER version 6: protein sequence and function evolution data with expanded representation of biological pathways. Nucleic Acids Res. 2006, 35 (Database issue), D247-D252.
(14) Magyarlaki, T.; Kiss, B.; Buzogany, I.; Fazekas, A.; Sukosd, F.; Nagy, J. Renal cell carcinoma and paraneoplastic IgA nephropathy. Nephron 1999, 82 (2), 127-130.
(15) Tang, H. Y.; Ali-Khan, N.; Echan, L. A.; Levenkova, N.; Rux, J. J.; Speicher, D. W. A novel four-dimensional strategy combining protein and peptide separation methods enables detection of low-abundance proteins in human plasma and serum proteomes. Proteomics 2005, 5 (13), 3329-3342.
(16) Zuo, X.; Speicher, D. W. A method for global analysis of complex proteomes using sample prefractionation by solution isoelectrofocusing prior to two-dimensional electrophoresis. Anal. Biochem. 2000, 284 (2), 266-278.
(17) Ishihama, Y.; Oda, Y.; Tabata, T.; Sato, T.; Nagasu, T.; Rappsilber, J.; Mann, M. Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol. Cell. Proteomics 2005, 4 (9), 1265-1272.
(18) Park, Y. M.; Kim, J. Y.; Kwon, K. H.; Lee, S. K.; Kim, Y. H.; Kim, S. Y.; Park, G. W.; Lee, J. H.; Lee, B.; Yoo, J. S. Profiling human brain proteome by multi-dimensional separations coupled with MS. Proteomics 2006, 6 (18), 4978-4986.
(19) Vilafranca, M.; Ferrer, L.; Wohlsein, P.; Trautwein, G.; Sanchez, J.; Navarro, J. A. Ultrastructural co-localization of vimentin and cytokeratin in visceral glomerular epithelial cells of dogs with glomerulonephritis. Res. Vet. Sci. 1995, 59 (1), 87-91.
(20) Lasonder, E.; Ishihama, Y.; Andersen, J. S.; Vermunt, A. M.; Pain, A.; Sauerwein, R. W.; Eling, W. M.; Hall, N.; Waters, A. P.; Stunnenberg, H. G.; Mann, M. Analysis of the Plasmodium falciparum proteome by high-accuracy mass spectrometry. Nature 2002, 419 (6906), 537-542.
(21) Shen, Y.; Zhao, R.; Berger, S. J.; Anderson, G. A.; Rodriguez, N.; Smith, R. D. High-efficiency nanoscale liquid chromatography coupled on-line with mass spectrometry using nanoelectrospray ionization for proteomics. Anal. Chem. 2002, 74 (16), 4235-4249.
(22) Sterzel, R. B.; Hartner, A.; Schlötzer-Schrehardt, U. S.; Voi, S.; Hausnecht, B.; Doliana, R.; Colombatti, A.; Gibson, M. A.; Braghetta, P.; Bressan, G. M. Elastic fiber proteins in the glomerular mesangium in vivo and in cell culture. Kidney Int. 2000, 58 (4), 1588-1602.
(23) Mené, P.; Simonson, M. S.; Dunn, M. J. Physiology of the mesangial cell. Physiol. Rev. 1989, 69 (4), 1347-1424.
(24) Ransom, R. F. Podocyte proteomics. Contrib. Nephrol. 2004, 141, 189-211.
PR070203N