Comprehensive Genome-Wide Proteomic Analysis of Human

Article pubs.acs.org/jpr

Comprehensive Genome-Wide Proteomic Analysis of Human Placental Tissue for the Chromosome-Centric Human Proteome Project

Hyoung-Joo Lee,† Seul-Ki Jeong,† Keun Na,† Min Jung Lee,† Sun Hee Lee,† Jong-Sun Lim,† Hyun-Jeong Cha,† Jin-Young Cho,† Ja-Young Kwon,‡ Hoguen Kim,‡ Si Young Song,‡ Jong Shin Yoo,§ Young Mok Park,§ Hail Kim,∥ William S. Hancock,†,# and Young-Ki Paik*,†

† Yonsei Proteome Research Center and Department of Integrated Omics for Biomedical Science, World Class University Program, Yonsei University, Seoul, Korea
‡ Yonsei University College of Medicine, Seoul, Korea
§ Division of Mass Spectrometry Research, Korea Basic Science Institute, Ochang, Chungbuk, Korea
∥ Graduate School of Medical Science & Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Korea
# Barnett Institute and Department of Chemistry, Northeastern University, Boston, Massachusetts, United States

Supporting Information

ABSTRACT: As a starting point of the Chromosome-Centric Human Proteome Project (C-HPP), we established strategies of genome-wide proteomic analysis, including protein identification, quantitation of disease-specific proteins, and assessment of post-translational modifications, using paired human placental tissues from healthy and preeclampsia patients. This analysis resulted in identification of 4239 unique proteins with high confidence (two or more unique peptides with a false discovery rate less than 1%), covering 21% of approximately 20 059 (Ensembl v69, Oct 2012) human proteins, among which 28 proteins were differentially expressed in preeclampsia. When these proteins are assigned to all human chromosomes, the pattern of the newly identified placental protein population is proportional to that of the gene count distribution of each chromosome. We also identified 219 unique N-linked glycopeptides, 592 unique phosphopeptides, and 66 chromosome 13-specific proteins. In particular, protein evidence of 14 genes previously known to be specifically up-regulated in human placenta was verified by mass spectrometry. With respect to the functional implication of these proteins, 38 proteins were found to be involved in regulatory factor biosynthesis or the immune system in the placenta, but the molecular mechanism of these proteins during pregnancy warrants further investigation. As far as we know, this work produced the highest number of proteins identified in the placenta and will be useful for annotating and mapping all proteins encoded in the human genome.

KEYWORDS: C-HPP, glycosylation, phosphorylation, preeclampsia, quantitation, TMT



Special Issue: Chromosome-centric Human Proteome Project. Received: November 4, 2012. Published: January 30, 2013. © 2013 American Chemical Society

INTRODUCTION

The objective of the international Chromosome-Centric Human Proteome Project (C-HPP) was to map and annotate all proteins encoded by the genes on each human chromosome (Chr).1−3 To achieve this goal, each of the 25 national or international teams selected its clinical samples of interest (e.g., the liver, brain, heart, and placenta) and mapped those proteins according to the proposed five-step protocols set by the C-HPP standard guidelines.4 For example, in stage 1, the sample can be analyzed using a typical proteomic platform that includes large-scale protein identification, post-translational modification (PTM; acetyl-, glycosyl-, phospho-) analysis, quantitative assessment of disease-related proteins, and validation. Throughout this process, we will be able to newly identify some of the missing proteins that are defined as poorly characterized by mass spectrometry (MS).1,3 To this end, a well-established strategy for sample preparation is necessary: efficient fractionation and high-resolution liquid chromatography (LC) coupled to mass spectrometry (MS), which may be suitable for proteomic profiling of specific tissues of interest (e.g., the placenta and liver).5,6 In practice, to detect as many human proteins as possible to match the number of human genes (20 059, Ensembl v69, Oct 2012), it is necessary to integrate MS data from various tissues, particularly from tissues with less explored proteins in the database. Therefore, we chose the placenta as the primary tissue source and attempted to identify proteins during a pilot study. In the current study, we describe an initial experimental analysis of the protein parts list of chromosomes from human

dx.doi.org/10.1021/pr301040g | J. Proteome Res. 2013, 12, 2458−2466

Journal of Proteome Research

Article

raphy using a mixture of two lectins, concanavalin A and wheat germ agglutinin. Tissue proteins were equilibrated in a binding buffer [20 mM Tris, 1 mM MnCl2, 1 mM CaCl2, and 150 mM NaCl (pH 7.4)], and bound fractions were released using an elution buffer [0.5 M methyl α-D-mannopyranoside, 0.5 M N-acetyl-glucosamine, 0.8 M galactose, 20 mM Tris, and 0.5 M NaCl (pH 7.0)] and filtered through a 0.2-μm filter (PALL, New York, NY, USA). Eluted fractions were concentrated using a 3000-Dalton molecular mass cutoff spin column (Millipore). The protein concentration was determined using a 2D Quant kit (GE Healthcare, Uppsala, Sweden).

placental tissue known to contain the largest number of genes of any organ and to perform many functions, including exchange of gases, nutrients, and electrolytes.7 Although several proteomic studies on the human placenta have been reported,8−18 large-scale genome-wide proteomic analysis with quantitative assessment has not been reported. Thus, we wished to establish strategies for large-scale protein identification that includes quantitation of the disease-derived sample (preeclampsia), characterization of the glycan structure of glycoproteins, and mapping of phosphoproteins in the placenta. Results from this pilot study will contribute to further development of protein mapping techniques as well as standardization of protein quantitation of any given clinical samples used in C-HPP. We envision that this genome-wide protein identification of placental tissue would produce a proteomic parts list of the placenta that can serve as a pool of differentially expressed biomarker candidates against disease (e.g., preeclampsia) and identify missing proteins in each chromosome.19



Phosphopeptide Enrichment using TiO2 Magnetic Beads

Phosphopeptides from the tryptic-digested mixture were enriched using TiO2 magnetic beads according to the manufacturer’s protocol (P/N 28-9513-77; GE Healthcare).

LC−MS/MS Analysis

Nano-high-performance liquid chromatography (nano-HPLC) analysis was performed using an EASY-nLC system (Thermo Fisher). The capillary column used (150 × 0.075 mm) for LC−MS/MS analysis was obtained from Phoenix S&T (Chester, PA, USA), and the slurry was packed in-house using a 5-μm, 100-Å pore size Magic C18 stationary phase resin (Michrom BioResources, Auburn, CA, USA). The mobile phase A for LC separation was 0.1% formic acid in deionized water, and the mobile phase B was 0.1% formic acid in acetonitrile. The chromatography gradient was designed for a linear increase from 0 to 8% B in 5 min, 5 to 25% B in 100 min, 25 to 45% B in 10 min, and 45 to 60% B in 10 min. The LTQ-Orbitrap mass spectrometry system (Thermo Fisher) was used for either identification or quantification of peptides. The Xcalibur system (version 2.1; Thermo Fisher) was used to generate peak lists. Orbitrap full MS scans were acquired from m/z 350 to 1500 at a resolution of 15 000 (at m/z 400) using an automatic gain control (AGC) value of 2 × 10^5. The minimum threshold was set to 100 000 ion counts. Parent ions were fragmented using the LTQ (isolation width of 2 m/z units) with a maximum injection time of 100 ms combined with an AGC value of 1 × 10^4 using three fragmentation modes: collision-induced dissociation (CID) alone, electron-transfer dissociation (ETD) alone, and decision tree-based CID/ETD. For ETD MS/MS, the reagent ion source emission current, reagent ion electron energy, and reagent ion source chemical ionization pressure were set to 35 mA, 70 V, and 26 psi, respectively. The activation time was set to 100 ms, and the supplemental activation mode was enabled. For high-energy collision dissociation (HCD) MS/MS of TMT-labeled peptides, the AGC was set to 1 × 10^5 (isolation width of 3 m/z units) with a resolution of 7500. The dynamic exclusion time for precursor ion m/z values was 30 s.
Internal calibration was performed using the background polysiloxane ion signal at m/z 445.120025 as the calibrant. The Agilent 6530 Accurate-Mass Q-TOF combined with the nano chip HPLC system (Agilent, Wilmington, DE, USA) was employed for peptide identification.
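The internal calibration step can be illustrated with a short calculation. The sketch below is only an assumption about how a single-point lock-mass correction might be expressed (a multiplicative rescaling of m/z values); the instrument software performs the actual calibration, and the observed m/z values used here are hypothetical rather than taken from this study. Only the calibrant value m/z 445.120025 comes from the text.

```python
# Theoretical m/z of the background polysiloxane ion used as the calibrant (from the text).
LOCK_MASS_MZ = 445.120025


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Return the mass error of an observed m/z in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def recalibrate(mz_values, observed_lock_mz):
    """Rescale m/z values so the observed lock mass maps onto its theoretical value.

    Assumption: a single multiplicative correction factor is adequate.
    """
    scale = LOCK_MASS_MZ / observed_lock_mz
    return [mz * scale for mz in mz_values]


# Hypothetical scan in which the lock mass was observed slightly too high.
observed_lock = 445.121360
drift_ppm = ppm_error(observed_lock, LOCK_MASS_MZ)
corrected = recalibrate([observed_lock, 785.842100], observed_lock)

print(f"Lock-mass drift before correction: {drift_ppm:.2f} ppm")
print(f"Lock mass after correction: {corrected[0]:.6f}")
```

A drift of a few ppm, as in this hypothetical scan, is small relative to the 10 ppm precursor tolerance used in the database search, but correcting it keeps precursor masses centered within that window.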

MATERIALS AND METHODS

Sample Preparation

Human placental tissues were obtained with informed consent in accordance with institutional review board guidelines from the Yonsei University College of Medicine (Seoul, Korea). This research was approved by the institutional review board of the Yonsei University Health System. We used central chorionic tissue from two paired samples (two normal and two preeclampsia tissues). A 1-cm^3-sized placental tissue specimen was excised from the central region of the placenta from uncomplicated and preeclampsia pregnancies following Cesarean sections. Severe preeclampsia was defined as a systolic blood pressure of at least 150 mmHg or diastolic blood pressure of at least 110 mmHg on two occasions 6 h apart, associated with clinically significant proteinuria defined as 2+ by urine dipstick testing. The clinical characteristics of patients diagnosed with preeclampsia are outlined in Table S1 (Supporting Information). Tissues were rinsed extensively in sterile saline, and amnion and chorionic membranes were removed. Tissue samples were prepared as previously described.20 Samples were reduced, alkylated, and subjected to tryptic digestion as previously described.21

Tandem Mass Tag (TMT) Labeling and Peptide Fractionation

Each 100-μg sample in 20 μL of 500 mM TEAB was reduced, alkylated, digested, and labeled with TMT reagents (127:129) according to the manufacturer’s protocol (Thermo Fisher, San Jose, CA, USA). Tryptic-digested placental tissue proteins were fractionated using three different fractionation methods: hydrophilic interaction chromatography (HILIC),22 strong cation exchange chromatography (SCX),23 and OFFGEL electrophoresis according to the manufacturer’s protocol.

Deglycosylation and Glycoprotein Enrichment using Multilectin Columns

Peptide Identification and Quantification

ProteomeDiscoverer software (version 1.3; Thermo Fisher) was used for protein identification and quantification. The peptides were identified using UniProt (Released, April 2012) and the International Protein Index (IPI) human sequence database (ver IPI 3.75). Database search criteria were as follows: taxonomy Homo sapiens, carboxyamidomethylated

The samples were enzymatically deglycosylated using PNGase F (Q-A Bio, Palm Desert, CA, USA) as previously described.20 Placental tissues (100 mg) were chopped and homogenized in modified RIPA buffer [50 mM Tris-HCl, 150 mM NaCl, 1% NP-40, and 0.5% sodium deoxycholate (pH 7.4)], and the lysates were subjected to lectin affinity chromatog-


Figure 1. Overall workflow for genome-wide proteomic analysis of human placental tissue for the Chromosome-Centric Human Proteome Project. (A) Placental tissue proteins were extensively fractionated followed by identification and quantification using high-accuracy Orbitrap MS. All raw files produced in this study will be deposited in the public database (e.g., PRIDE, PeptideAtlas, or GPMDB) according to C-HPP guidelines.4 (B) Tryptic-digested placental tissue proteins were fractionated by either a TiO2 column for phosphopeptides or a multi-lectin affinity chromatography (MLAC) column for glycoproteins. All samples were subjected to high-accuracy MS followed by handling of their raw data using informatics tools. As an example of collaboration among C-HPP teams, Chr 13-specific protein data from human brain tissue supported by the Korea Basic Science Institute (Ochang, Korea) were shared to double the number of Chr 13-specific proteins.

(+57 Da) at cysteine residues for fixed modifications, oxidized at methionine (+16 Da) residues for variable modifications, a maximum of two allowed missed cleavages, 10 ppm MS tolerance, and MS/MS tolerances of 0.8 Da for CID and 20 mmu for HCD. Only peptides resulting from trypsin digestion were considered. For TMT-labeled peptides, TMT6 modification was added at peptide N termini (+229 Da) and at lysines (+229 Da) for fixed modification. Quantification was performed by calculating the ratio between the peak areas of the TMT reporter groups. To eliminate masking of changes in expression due to peptides that are shared between proteins, we calculated the protein ratio using only ratios from the spectra that are distinct to each protein. All quantitative results were normalized using protein medians (minimum protein count: 20). If not all of the quantification channels were present, the quantification values were rejected. PeptideProphet and ProteinProphet were used to estimate the false discovery rate (FDR). We identified proteins using two or more unique peptides with an FDR < 1% at the protein level.

We employed the following databases as the baseline metrics for estimating the protein numbers per chromosome and missing proteins: Ensembl v69 (Oct. 2012) [A], neXtProt (Gold, Nov 2012) [B], Human PeptideAtlas (Dec 2012, canonical) [C], GPMdb (Green, Nov. 26, 2012) [D], and Human Protein Atlas (high or medium, Dec 2012) [F]. The number of missing proteins per chromosome was estimated by subtracting the mean of the protein numbers in databases B, C, and D from that in database A.19
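The missing-protein estimate described above is a simple subtraction, which the following minimal sketch makes explicit. The function name and the per-chromosome counts are hypothetical and chosen only for illustration; they are not values from this study.

```python
from statistics import mean


def estimate_missing(ensembl_count, evidence_counts):
    """Missing proteins = protein-coding genes in A minus the mean count across B, C, and D."""
    return ensembl_count - mean(evidence_counts)


# Hypothetical counts for a single chromosome; these are NOT values from the study.
chromosome = {
    "A_ensembl": 320,        # protein-coding genes (Ensembl)
    "B_nextprot": 250,       # proteins with evidence in neXtProt
    "C_peptideatlas": 230,   # proteins with evidence in PeptideAtlas
    "D_gpmdb": 240,          # proteins with evidence in GPMdb
}

missing = estimate_missing(
    chromosome["A_ensembl"],
    [chromosome["B_nextprot"], chromosome["C_peptideatlas"], chromosome["D_gpmdb"]],
)
print(f"Estimated missing proteins: {missing:.0f}")
```

Averaging the three evidence databases dampens the effect of any single resource being unusually permissive or strict, at the cost of ignoring which specific proteins each database covers.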



Figure 2. (A) Comparison of the number of protein-coding genes per chromosome and number of proteins identified by MS per chromosome. (B) The number of Chr 13-specific proteins identified in human placenta and brain tissue. Data from the human brain were provided by the Chr 11 team (KBSI, Korea). Ensembl v69 (Oct. 2012) was used for estimating the protein numbers per chromosome.

RESULTS AND DISCUSSION

Overall Strategy for Genome-Wide Proteomic Analysis

We set up an overall experimental strategy to map and annotate proteins encoded by genes on each human chromosome using human placenta as outlined in Figure 1A. Briefly, placental tissue proteins were extensively fractionated followed by identification and quantification using high-accuracy MS. All raw files produced in the present study will be deposited in the public database (e.g., PRIDE, PeptideAtlas, or GPMDB) to


Figure 3. Representative significant biological pathways in which identified placental proteins are predicted to be involved. KEGG pathway analysis was performed using the identified placental proteins to evaluate which pathways are significantly represented (p ≤ 0.05). Out of 44 different pathways, 22 are linked to the global placenta proteome.

share with other C-HPP teams. Figure 1B shows detailed steps for producing proteomic data (e.g., non-PTM proteins, glycoproteins, phosphoproteins, and TMT-labeled proteins) using LC−MS/MS. The tryptic-digested placental tissue proteins were fractionated by either a TiO2 column for phosphopeptides or an MLAC column for glycopeptides. All samples were subjected to high-accuracy MS followed by rigorous analysis of raw data using the available informatics tools. For example, collaboration among C-HPP teams produced more enriched data sets of specific types (i.e., a 2-fold higher number of Chr 13-specific proteins when data from the brain and placenta were combined). Some parts of the Chr 13-specific placental protein data are also presented in an additional report.24

Table 1. Representative Molecular Functions of Placenta-Specific Proteins

molecular function | annotated proteins | p-value | fold enrichment(a)
Intramolecular oxidoreductase activity, interconversion of aldoses and ketoses | 9 | 1.02 × 10^−04 | 4.02
Aldehyde dehydrogenase (NAD) activity | 8 | 3.66 × 10^−04 | 4.02
Lipoic acid binding | 7 | 1.30 × 10^−03 | 4.02
Vinculin binding | 7 | 1.30 × 10^−03 | 4.02
Aldehyde-lyase activity | 6 | 4.51 × 10^−03 | 4.02
Phosphatidylcholine-sterol O-acyltransferase activator activity | 5 | 1.53 × 10^−02 | 4.02
Nucleobase binding | 5 | 1.53 × 10^−02 | 4.02
Isocitrate dehydrogenase activity | 5 | 1.53 × 10^−02 | 4.02
GPI-anchor transamidase activity | 5 | 1.53 × 10^−02 | 4.02
Acetyl-CoA C-acyltransferase activity | 4 | 5.00 × 10^−02 | 4.02
Signal recognition particle binding | 4 | 5.00 × 10^−02 | 4.02
Purine binding | 4 | 5.00 × 10^−02 | 4.02
NADPH:quinone reductase activity | 4 | 5.00 × 10^−02 | 4.02
Thioredoxin peroxidase activity | 4 | 5.00 × 10^−02 | 4.02
Phosphoserine binding | 4 | 5.00 × 10^−02 | 4.02
Phosphoglucomutase activity | 4 | 5.00 × 10^−02 | 4.02
Protein phosphatase type 1 regulator activity | 4 | 5.00 × 10^−02 | 4.02
Beta-N-acetylhexosaminidase activity | 4 | 5.00 × 10^−02 | 4.02
Epoxide hydrolase activity | 4 | 5.00 × 10^−02 | 4.02

(a) Ratio between the numbers of genes in the gene list belonging to a specific Gene Ontology term and the total number of genes in the gene list.

Figure 4. Representative tissue distribution of the identified placenta proteins. Tissue information was obtained from the UniProtKB database. Accession numbers of identified placenta proteins were cross checked with those present in the UniProtKB database. Among them, 4099 proteins contain their tissue information (81 tissues total) in the UniProtKB database.

Large-Scale Protein Identification in Human Placental Tissue

To perform large-scale protein identification in human placental tissue, tryptic-digested human placental tissue was fractionated using three different techniques: HILIC, SCX, and OFFGEL electrophoresis. All fractions were analyzed by high-accuracy MS. As a result, we identified 4239 unique proteins with high confidence (two or more unique peptides with an FDR less than 1%), including lower-abundance proteins related to calcium metabolism and signaling. This marks the highest


Table 2. Representative Biological Processes of Placenta-Specific Proteins

biological process | annotated proteins | p-value | fold enrichment(a)
Negative regulation of fibrinolysis | 6 | 4.43 × 10^−03 | 4.04
Barbed-end actin filament capping | 5 | 1.51 × 10^−02 | 4.04
Nuclear pore organization | 5 | 1.51 × 10^−02 | 4.04
Isocitrate metabolic process | 5 | 1.51 × 10^−02 | 4.04
High-density lipoprotein particle assembly | 5 | 1.51 × 10^−02 | 4.04
Fructose 1,6-bisphosphate metabolic process | 5 | 1.51 × 10^−02 | 4.04
Regulation of interferon-gamma-mediated signaling pathway | 4 | 4.95 × 10^−02 | 4.04
Nuclear migration | 4 | 4.95 × 10^−02 | 4.04
Outer mitochondrial membrane organization | 4 | 4.95 × 10^−02 | 4.04
Pinocytosis | 4 | 4.95 × 10^−02 | 4.04
Mitochondrial outer membrane translocase complex assembly | 4 | 4.95 × 10^−02 | 4.04
Cdc42 protein signal transduction | 4 | 4.95 × 10^−02 | 4.04
Regulation of plasma membrane long-chain fatty acid transport | 4 | 4.95 × 10^−02 | 4.04
Negative regulation of plasma membrane long-chain fatty acid transport | 4 | 4.95 × 10^−02 | 4.04

(a) Ratio between the numbers of genes in the gene list belonging to a specific Gene Ontology term and the total number of genes in the gene list.

Table 3. Regulatory Proteins Involved in Biological Processes in the Placenta

identified cytokine-related proteins

proteins
Leptin
Transforming growth factor-beta-induced protein ig-h3
Isoform 2 of Transforming growth factor beta-1
Transforming growth factor beta-1
Angiopoietin-4
Cytokine receptor-like factor 3
Dedicator of cytokinesis 8
Transforming growth factor, beta receptor I isoform 2 precursor
Latent-transforming growth factor beta-binding protein 1 isoform 5 precursor
Transforming growth factor beta (Fragment)
Interleukin enhancer-binding factor 2
Interleukin-18 receptor 1
Isoform 1 of interleukin-1 receptor accessory protein
Interleukin-27 subunit beta
Isoform 5 of interleukin enhancer-binding factor 3
Isoform 1 of interleukin enhancer-binding factor 3
Isoform 6 of interleukin enhancer-binding factor 3
Interleukin enhancer binding factor 3 isoform c variant (Fragment)

biological process

number of proteins identified from the placenta, covering >21.1% of the total predicted human proteins (20 059, Ensembl v69, Oct 2012). Figure 2A shows a comparison of the number of human protein-coding genes per chromosome and the number of proteins identified by MS (Supporting Information, Table S2). This protein distribution shows the proportion of human proteins identified by MS relative to the predicted protein number, which is quite similar across chromosomes. By combining the total number of identified proteins with the previously identified proteins available in the public database (Ensembl, v69), our data set yielded 33 missing proteins.

Given that the goal of the C-HPP is to map the entire proteome of the human chromosomes in a coordinated manner, it would be synergistic if the consortium members shared data by submitting chromosome-encoded protein data, including their own target chromosomal proteins, to the public database. In light of this sharing policy within the C-HPP consortium, we received 75 Chr 13-encoded brain proteins from the Chr 11 team and combined them with the already identified 66 Chr 13-encoded placental proteins, resulting in a total of 141 proteins. This result demonstrates that combining data sets of collaborating teams (e.g., YPRC for Chr 13 and Korea Basic Science Institute, Korea, for Chr 11) is very synergistic for filling the gap on the list of missing proteins in each chromosome (Figure 2B). This finding will also stimulate development of technology by sharing each unique analysis platform used for profiling different target tissues employed by different teams.

With regard to the C-HPP goal of identifying proteins with high confidence, particularly “missing proteins” or “low-abundance proteins,” ETD has been shown to be complementary to CID and is mainly used to increase

term Immune response mediated by circulating immunoglobulin KEGG pathway

MW [kDa]

pI

18.6 74.6

6.4 7.7

3.3 1.6

47.9 44.3 56.8 49.7 177.9 47.7

7.6 8.5 8.9 5.1 7.2 8.0

−1.6 -

142.6

4.9

-

4.8 43.0 62.3 65.4

9.0 5.3 7.9 8.1

-

25.4 74.6

9.2 8.2

-

95.3

8.8

-

76.5

7.9

-

50.1

6.6

-

pree/nora

no. of annotated proteins

p-value

17

0.001

term

no. of annotated proteins

p-value

Tryptophan metabolism

21

0.015

a

Tandem mass tag quantitation ratio [normal (nor) vs preeclampsia placental tissue (pree)]. Dashes indicate no detection (see Materials and Methods).

proteome coverage.25 Therefore, we used three fragmentation modes, CID alone, ETD alone, and decision tree-based CID−ETD, to improve protein sequence coverage and protein confidence. We found that ETD provided greater confidence in protein assignment in the present study. For example, for analysis of the high-molecular-weight AHNAK nucleoprotein (629 kDa), 44 peptides were detected by CID, whereas an additional 49 peptides were detected by ETD (data not shown).

Quantitative Analysis of Proteins Differentially Expressed by Normal and Preeclampsia Placental Tissue with and without TMT Labeling

After establishing a goal to explore the annotation and disease-related context of newly identified proteins in collaboration with the B/D project,1,4 we performed quantitative analysis for both normal and disease-related human placental tissue. Thus, each 100-μg sample of healthy and preeclampsia patient placental tissue was labeled with TMT reagent (127 and 129) and mixed at a 1:1 ratio. The mixture was fractionated into 24 fractions by OFFGEL electrophoresis. Each fraction was analyzed by LC−MS/MS using sequential combination


end, normal placenta (5 mg) was subjected to MLAC to enrich glycoproteins.29 The enriched glycoproteins were digested with trypsin and then subjected to LC−MS/MS analysis. We were able to identify 541 unique MLAC-enriched glycoproteins. Lectin affinity chromatography does not enrich only glycosylated proteins because of nonspecific binding. For example, nonglycoproteins (e.g., actin) were also identified after MLAC enrichment in this study. To verify the presence of N-linked glycoproteins, the samples were deglycosylated using PNGase-F enzyme and analyzed by LC−MS/MS, similar to the previous strategy.20,21 After rigorous database searching, N-linked glycopeptides were identified on the basis of the mass difference (0.98 Da, N converted to D) from the native sequence.20,21 A total of 219 peptides were identified as N-linked glycopeptides that were mapped to 141 unique glycoproteins. For example, five and six N-linked sites were identified on laminin subunit gamma-1 and pro-low-density lipoprotein receptor-related protein 1, respectively. Interestingly, ERO1-like protein alpha, corticosteroid-binding globulin, and tumor-associated calcium signal transducer 2 were identified as glycoproteins and were also significantly differentially expressed in preeclampsia tissue following TMT labeling. Therefore, we postulated that glycosylation of those three proteins might be related to the progression of preeclampsia. The characterization of carbohydrate modifications on glycoproteins will improve our understanding of the role of biomarkers that would be secreted into patient plasma. To accomplish this, we attempted to characterize glycan structure from identified glycoproteins using a previous approach.20,21 Tryptic-digested MLAC-enriched samples (3 μg, without PNGase-F treatment) were subjected to LC−MS/MS.
When intact glycopeptides are fragmented during MS/MS acquisition, marker ions corresponding to low-molecular-weight oxonium ions such as m/z 163 (Hex+), m/z 204 (HexNAc+), m/z 366 (Hex-HexNAc+), and m/z 292 (NeuNAc+) can be observed. After manual inspection of all MS/MS spectra, we found that MS/MS spectra derived from the precursor ions at m/z 1245.0320 (4+ charge state) and m/z 1038.7654 (3+ charge state) exhibited the typical fragmentation pattern. To identify the peptide sequence, the same sample was treated with PNGase-F and subjected to LC−MS/MS analysis. After a cross-comparison through a rigorous database search, we were able to identify those peptides that would have been produced from N-linked glycosylated vitronectin (N*ISDGFDGIPDNVDAALALPAHSYSGR and N*GSLFAFR, where the mark “*” indicates the deamidation site), corresponding to the precursor ions at m/z 1245.0320 (4+) and m/z 1038.7654 (3+). To characterize the glycan structure of those peptides, MS/MS spectra were manually assigned by counting constant neutral losses of terminal monosaccharides as previously described.12 Figure S3 (Supporting Information) shows manually assigned MS/MS spectra for N-linked glycopeptides from vitronectin (N*ISDGFDGIPDNVDAALALPAHSYSGR and N*GSLFAFR). The MS/MS spectral patterns were nearly identical to those of previous results.19 Interestingly, vitronectin was shown to be up-regulated in preeclampsia patient plasma as previously reported.9 Therefore, glycosylation of vitronectin that would have been secreted into plasma from placenta tissue could be functionally related to preeclampsia. This finding warrants further investigation. Using this strategy, it would also be possible to analyze the glycan structure of preeclampsia-specific glycoproteins.
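The screening of MS/MS spectra for oxonium marker ions was done by manual inspection in this study. As a rough illustration of the same idea, the sketch below flags spectra that contain peaks near the nominal marker m/z values named in the text. The function names, the 0.5 m/z tolerance, the requirement of at least two markers, and the example peak lists are all assumptions made for illustration.

```python
# Nominal m/z values of the oxonium marker ions named in the text.
OXONIUM_IONS = {
    "Hex+": 163.0,
    "HexNAc+": 204.0,
    "NeuNAc+": 292.0,
    "Hex-HexNAc+": 366.0,
}


def find_oxonium_ions(fragment_mzs, tolerance=0.5):
    """Return the marker ions that have at least one fragment peak within the tolerance."""
    found = []
    for name, marker_mz in OXONIUM_IONS.items():
        if any(abs(mz - marker_mz) <= tolerance for mz in fragment_mzs):
            found.append(name)
    return found


def looks_like_glycopeptide(fragment_mzs, min_markers=2, tolerance=0.5):
    """Flag a spectrum as a glycopeptide candidate if enough marker ions are present.

    Assumption: requiring at least two markers reduces chance matches.
    """
    return len(find_oxonium_ions(fragment_mzs, tolerance)) >= min_markers


# Hypothetical fragment peak lists (m/z only), not spectra from the study.
glyco_spectrum = [163.06, 204.09, 366.14, 512.30, 1038.77]
plain_spectrum = [175.12, 288.20, 401.29, 514.37]

print(find_oxonium_ions(glyco_spectrum))
print(looks_like_glycopeptide(plain_spectrum))
```

A filter of this kind only nominates candidate spectra; assigning the glycan composition still requires following the neutral losses of monosaccharides, as described in the text.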

scanning, the decision-tree mode (CID and ETD), and the HCD mode. All raw data were subjected to ProteomeDiscoverer software for quantitative analysis. In total, 1331 unique proteins were quantified. As shown in Figure S1 (Supporting Information), the expression ratio for all quantified proteins displayed a normal distribution. However, 15 proteins were found to be up-regulated (≥2-fold), and 13 proteins were down-regulated (≥2-fold). We assumed that these differentially expressed proteins might be related to the progression of preeclampsia. For example, the pregnancy zone protein and 14-3-3 protein were significantly down-regulated (3.9-fold) and up-regulated (3.4-fold), respectively, in patients with preeclampsia, which is consistent with previous reports.10,26 For the Chr 13-specific proteins, the expression level of FERM, RhoGEF (ARHGEF), and pleckstrin domain protein 1 was down-regulated (2.2-fold) in patients with preeclampsia.

In parallel with the TMT-labeling approach, we also attempted to perform label-free quantitation for normal and preeclampsia tissue to identify altered candidates. To this end, each sample (tryptic digested, 2 μg) was subjected to LC−MS/MS in triplicate runs. Figure S2A (Supporting Information) shows aligned base-peak spectra obtained from triplicate LC−MS/MS analysis of tryptic-digested normal and preeclampsia tissue. Full-scan spectra from a reference measurement (normal tissue) were compared with all other measurements. To normalize for inherent chromatographic variability, full-scan chromatograms from each data file were aligned by calculation of optimal correlations between spectra. Figure S2B (Supporting Information) shows a volcano plot of the ratio in log scale between the two groups (ratio vs the log p-value). From this work, 432 unique proteins were identified (FDR < 1%).
Among them, 12 unique proteins were up-regulated (>2-fold, >2 peptides, p-value < 0.05), whereas 9 unique proteins were down-regulated (>2-fold, >2 peptides, p-value < 0.05). For example, vimentin was significantly down-regulated (15.6-fold) in patients with preeclampsia. Consistent with this result, the protein was shown to be related to preeclampsia as previously reported.13 In this study, few overlaps were evident between the TMT labeling approach and the label-free method among proteins that showed a more than 2-fold change in expression, suggesting a difference in sensitivity for those target peptides (labeled vs nonlabeled) between the two quantification methods, as previously described.26 We also noticed that there were few common differentially expressed placental proteins between our work and others when we cross checked the protein list of our experiment with that of others.8−15,17 This may be due to differences in the samples (biofluids vs tissues) and analytical methods (e.g., gel-based separation vs LC, dye-stained quantification vs isotope labeling). However, the proteins identified in the current experiment need further verification in larger sample sets regarding their usefulness in the diagnosis of preeclampsia.

Global Profiling of Glycoproteins in Placental Tissue

A C-HPP working strategy is to characterize at least three major PTMs (i.e., phosphoryl-, glycosyl-, and acetyl-) for each protein.4 Therefore, we also focused on glycoprotein analysis because protein glycosylation is a major PTM that controls protein folding, conformational distribution, stability, and activity. Additionally, most proteins secreted into plasma from disease tissue are glycosylated, and thus they could be important disease biomarkers (e.g., preeclampsia).28 To this


Global Profiling of Phosphoproteins in Placental Tissue

proteins have as yet no information regarding their tissue specificity. The list of unique placenta tissue-specific proteins is summarized in Table S4 (Supporting Information). Additionally, the identified placental proteins were categorized into groups based on molecular function and biological process. The molecular functions (Supporting Information, Table S5) and biological processes (Supporting Information, Table S6) in which placental proteins showed much greater enrichment compared with other human proteins are listed (fold-enrichment value ≥2; p-value < 0.05). In particular, Tables 1 and 2 summarize the molecular functions and biological processes of placenta-specific proteins (fold-enrichment value ≥4; p-value < 0.05). As outlined in Table 1, the most significant molecular functions include oxidoreductase activity, interconversion of aldoses and ketoses, aldehyde dehydrogenase (NAD) activity, lipoic acid binding, and vinculin binding. Moreover, the most significant biological processes include negative regulation of fibrinolysis, barbed-end actin filament capping, nuclear pore organization, isocitrate metabolism, high-density lipoprotein particle assembly, and fructose 1,6-bisphosphate metabolism (Table 2). Having classified these identified proteins from the placenta, it would be reasonable in the future to survey a small group of proteins that perform particular molecular functions in the disease state.
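The enrichment filter (fold-enrichment value and p-value cutoffs) can be illustrated with a small calculation. The text does not state which statistical test produced the p-values, and the table footnotes describe fold enrichment only briefly, so the sketch below rests on two assumptions: fold enrichment is taken in its common form, the term frequency in the placental list divided by the term frequency in the background, and the p-value is taken from a one-sided hypergeometric test. All counts are hypothetical.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """Term frequency in the list (k/n) divided by its frequency in the background (K/N)."""
    return (k * N) / (n * K)


def hypergeom_pvalue(k, n, K, N):
    """One-sided P(X >= k) when n proteins are drawn from N, of which K carry the term."""
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts, NOT values from the study:
N = 1000  # background proteins
K = 50    # background proteins annotated with the term
n = 100   # proteins in the placental list
k = 20    # placental proteins annotated with the term

fold = fold_enrichment(k, n, K, N)
p_value = hypergeom_pvalue(k, n, K, N)

# Cutoffs used for Tables 1 and 2 in the text: fold enrichment >= 4 and p-value < 0.05.
passes = fold >= 4 and p_value < 0.05
print(f"fold enrichment = {fold:.2f}, p = {p_value:.2e}, passes = {passes}")
```

Requiring both cutoffs guards against terms that are strongly enriched but supported by very few proteins, as well as terms that are statistically significant but only marginally over-represented.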

We successfully identified 592 unique phosphopeptides that are present in placental tissue using TiO2 enrichment. We found a slightly high correlation with the distribution of gene density on a given chromosome similar to the distribution of glycoproteins (data not shown). As an example, we identified four Chr 13specific phosphopeptides corresponding to four phosphoproteins. The phosphoproteins identified from this experiment will be closely examined for their involvement in any specific disease. For instance, one phosphoprotein, synaptopodin is known to be related to cancer.30 Biological Implications of Identified Placental Proteins

One key feature of the C-HPP is to use integrated “omics” techniques (e.g., genomics, transcriptomics, and proteomics) that may enhance structural and functional characterization of previously unexplored transcripts bearing protein coding sequences; however, to date, no evidence of translation is available. A survey of public databases shows that 31 genes have been found to be specifically up-regulated in human placental tissues (http://www.broadinstitute.org/gsea/). Using this information, we were able to verify protein evidence for 14 of 31 known genes using high-throughput proteomic tools: CYP19A1, EGFR, PAPPA2, ADAM12, SERPINB2, PSG9, GCM1, HMGB3, CHKB, LAGE3, MAGEA10, VGLL1, GALE, and PHLDA2. Among them, LAGE3 (uncharacterized protein) was found to be down-regulated in preeclampsia patient tissue (Supporting Information, Table S3). Furthermore, EGFR (isoform 1 of epidermal growth factor receptor) and PSG9 (pregnancy-specific glycoprotein) were characterized as Nlinked glycoproteins. Interestingly, pregnancy-specific glycoprotein appears to be related to preeclampsia as previously reported.31 KEGG pathway analysis was performed using the identified placental tissue proteins to evaluate pathways that might be significantly represented (p < 0.05). Forty-four different pathways were associated with the identified global placental proteins. Figure 3 shows some representative biological pathways that were linked to those identified proteins. As observed in this figure, many proteins are predicted to be related to cell junctions, major determinants of endothelial integrity and metabolism. 
This result is consistent with the previous observation of the potential role for junctional adhesion proteins in the regulation of the placental endothelial barrier during pregnancy.32 In particular, focal adhesion pathways have been shown to be related to placental proteins that are involved in early placental development and pregnancy.33,34 Given that the placenta serves as a critical channel between the mother and fetus in transporting oxygen and nutrients,35 detection of those proteins involved in various types of metabolism has been well anticipated. For example, several proteins (e.g., apolipoprotein) involved in lipid metabolism were shown to be related to preeclampsia disease as previously reported.9 We next attempted to construct a tissue distribution of identified proteins using the tissue information available in the UniProtKB database (www.uniprot.org/help/ uniprotkb). In total, 4069 proteins from the placenta can be distributed in 81 different tissues that are available in UniProtKB. Figure 4 shows a pie chart of the 20 most significantly enriched tissues for the identified placental proteins. Among them, 34 proteins are predicted to be placenta specific, 1238 proteins are to be shared with other tissues, 2842 proteins belong to those specific for other tissues, and 143
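The fold-enrichment and p-value cut-offs used for the functional categories in this section (≥2 or ≥4; p-value < 0.05) can be illustrated with a short Python sketch. The text does not say which statistical test or background set was used, so the one-sided hypergeometric test and the use of all 20 059 predicted human proteins as background are assumptions made here for illustration. The function names and the category counts in the example (50 annotated proteins, 30 of them identified) are hypothetical.

```python
from math import comb


def fold_enrichment(k: int, n: int, K: int, N: int) -> float:
    """Fraction of the sample in the category divided by the background fraction.

    k: identified proteins annotated to the category
    n: all identified proteins
    K: background proteins annotated to the category
    N: all background proteins
    """
    return (k / n) / (K / N)


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when n proteins are drawn without replacement from N,
    of which K belong to the category (one-sided hypergeometric test)."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def is_enriched(k, n, K, N, min_fold=2.0, alpha=0.05) -> bool:
    """Apply both cut-offs: fold enrichment >= min_fold and p-value < alpha."""
    return (fold_enrichment(k, n, K, N) >= min_fold
            and hypergeom_upper_tail(k, n, K, N) < alpha)


if __name__ == "__main__":
    # 4239 identified proteins and 20 059 predicted proteins are taken from the
    # paper; the category counts (K=50, k=30) are hypothetical.
    k, n, K, N = 30, 4239, 50, 20059
    print(f"fold enrichment = {fold_enrichment(k, n, K, N):.2f}")
    print(f"p-value         = {hypergeom_upper_tail(k, n, K, N):.3g}")
    print("passes >=2 cut-off:", is_enriched(k, n, K, N, min_fold=2.0))
    print("passes >=4 cut-off:", is_enriched(k, n, K, N, min_fold=4.0))
```

Exact integer arithmetic via `math.comb` avoids the rounding problems of factorial-based formulas at this scale. A dedicated statistics library would be a reasonable substitute for larger analyses.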

Proteins Involved in Potential Disease-Related Pathways

During pregnancy, the energy requirements of the fetus impose changes in maternal metabolism. As the fetus grows, insulin resistance becomes more apparent, causing many changes in the expression of cytokines and hormones (e.g., transforming growth factor beta (TGF-β), interleukin, leptin, angiopoietin, resistin, interleukin-1, adiponectin, osteocalcin, irisin, and fibroblast growth factor).36 Identifying the major proteins involved in this transition period would help clarify the potential mechanisms at work during pregnancy. From an initial profiling study on the placenta, we identified differentially expressed regulatory hormones (e.g., TGF-β, various interleukin isoforms, angiopoietin, cytokine receptor-like factor; Table 3). Furthermore, we quantified the differential expression of proteins related to preeclampsia. For example, leptin and TGF-β-induced protein ig-h3 showed approximately 3.3-fold and 1.6-fold up-regulation, respectively, whereas isoform 2 of TGF-β-1 was down-regulated approximately 1.6-fold (Table 3).

In general, to increase blood flow to the fetus, the maternal body tends to reduce the blood supply to peripheral tissues, which raises the pressure in peripheral blood vessels. It is therefore well known that the metabolism of amino acids (i.e., tryptophan, tyrosine, and phenylalanine) involved in the biosynthesis of neurotransmitters (e.g., serotonin and epinephrine) is closely related to pregnancy.26 From our initial mapping, we also identified 21 proteins, including those of the aldehyde dehydrogenase 3 family and oxoglutarate (alpha-ketoglutarate) dehydrogenase, which are involved in the tryptophan metabolism pathway (p-value = 0.015; Table 3).

Given that the placenta is an organ of exchange between the mother and fetus for nutrients, the immune system is usually suppressed. In this regard, we identified 17 proteins, including a complement component isoform, that are known to be involved in the immune system (p-value = 0.0011; Table 3). The molecular mechanism of these proteins during pregnancy warrants future investigation.
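Up- and down-regulation quoted as "n-fold" in this section are easier to compare once they are put on a common scale. The Python sketch below converts the three reported values into abundance ratios and log2 ratios. It assumes that the ratio is preeclampsia over control and that "down-regulated 1.6-fold" means a ratio of 1/1.6; neither convention is stated explicitly in the text, and the function names are hypothetical.

```python
import math


def fold_to_ratio(fold: float, direction: str) -> float:
    """Convert an 'n-fold up/down' statement into an abundance ratio.

    Assumption: ratio = preeclampsia / control, so n-fold down means 1/n.
    """
    if fold < 1:
        raise ValueError("fold change is expected to be >= 1")
    if direction == "up":
        return fold
    if direction == "down":
        return 1.0 / fold
    raise ValueError("direction must be 'up' or 'down'")


def log2_ratio(ratio: float) -> float:
    """Symmetric scale: equal up- and down-regulation differ only in sign."""
    return math.log2(ratio)


# Fold changes as reported in the text (Table 3 of the paper).
reported = {
    "leptin": (3.3, "up"),
    "TGF-beta-induced protein ig-h3": (1.6, "up"),
    "TGF-beta-1, isoform 2": (1.6, "down"),
}

if __name__ == "__main__":
    for name, (fold, direction) in reported.items():
        ratio = fold_to_ratio(fold, direction)
        print(f"{name}: ratio = {ratio:.3f}, log2 ratio = {log2_ratio(ratio):+.2f}")
```

On the log2 scale, the 1.6-fold increase of TGF-β-induced protein ig-h3 and the 1.6-fold decrease of TGF-β-1 isoform 2 have the same magnitude and opposite sign, which is why log ratios are commonly preferred when plotting or thresholding quantitative proteomics data.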


CONCLUSIONS

We believe that high-quality, extensive proteome maps are achievable within a planned 10-year period. With well-established analytical methods, we have accomplished a genome-wide proteomic analysis of placental tissue, resulting in 4239 identified proteins, together with 219 unique N-linked glycopeptides and 592 unique phosphopeptides, with high confidence (FDR < 1%). Moreover, 1331 unique proteins were quantified with high confidence (FDR < 1%). In the next step, glycoproteins, phosphoproteins, and preeclampsia-specific proteins that have been identified from the initial profiling will be subjected to validation in the context of missing proteins that would fill in the predicted 20 059 gene products (Ensembl v69, Oct 2012).19 With our well-established methods, protein evidence for 14 placenta-specific genes was verified. Further studies should focus on the molecular mechanism of those identified placenta-specific proteins involved in various biological systems. The analytical methods applied to this initial profiling would also contribute to the development of a standardized platform for the C-HPP. All raw data produced in the present study will be deposited in a public database and shared with all C-HPP teams.

ASSOCIATED CONTENT

Supporting Information

Supporting tables and figures. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Tel: 82-2-2123-4242. Fax: 82-2393-6589.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This study was supported by the National Research Foundation of Korea (NRF) Grant (No. 2011-0028112 to Y.K.P.) and WCU Program (R31-2008-000-10086-0) funded by the Korean government (MEST), and the National Project for Personalized Genomic Medicine (A111218-11-CP01 to Y.K.P.). We thank Thermo Fisher Scientific and Agilent Technologies for support regarding mass spectrometric analysis.

REFERENCES

(1) Paik, Y. K.; Jeong, S. K.; Omenn, G. S.; Uhlen, M.; Hanash, S.; et al. The Chromosome-Centric Human Proteome Project for cataloging proteins encoded in the genome. Nat. Biotechnol. 2012, 7, 221−223.
(2) Legrain, P.; Aebersold, R.; Archakov, A.; Bairoch, A.; Bala, K.; et al. The human proteome project: current state and future direction. Mol. Cell. Proteomics 2011, 10, 1−5.
(3) Hancock, W. S.; Omenn, G.; Legrain, P.; Paik, Y. K. Proteomics, human proteome project, and chromosomes. J. Proteome Res. 2011, 10, 210.
(4) Paik, Y. K.; Omenn, G. S.; Uhlen, M.; Hanash, S.; Marko-Varga, G.; et al. Standard guidelines for the chromosome-centric human proteome project. J. Proteome Res. 2012, 6, 2005−2015.
(5) Thelen, J. J.; Miernyk, J. A. The proteomic future: where mass spectrometry should be taking us. Biochem. J. 2012, 444, 169−181.
(6) Chen, G.; Pramanik, B. N. Application of LC/MS to proteomics studies: current status and future prospects. Drug Discovery Today 2009, 9−10, 465−471.
(7) Centlow, M.; Hansson, S. R.; Welinder, C. Differential proteome analysis of the preeclamptic placenta using optimized protein extraction. J. Biomed. Biotechnol. 2009, 2010, 458748.
(8) Liu, C.; Zhang, N.; Yu, H.; Chen, Y.; Liang, Y.; et al. Proteomic analysis of human serum for finding pathogenic factors and potential biomarkers in preeclampsia. Placenta 2011, 32, 168−174.
(9) Kolla, V.; Jenö, P.; Moes, S.; Lapaire, O.; Hoesli, I.; et al. Quantitative proteomic (iTRAQ) analysis of 1st trimester maternal plasma samples in pregnancies at risk for preeclampsia. J. Biomed. Biotechnol. 2012, 1−8.
(10) Liu, C.; Zhang, N.; Yu, H.; Chen, Y.; Liang, Y.; Deng, H.; Zhang, Z. Proteomic analysis of human serum for finding pathogenic factors and potential biomarkers in preeclampsia. Placenta 2011, 32, 168−174.
(11) Blumenstein, M.; Michael, T.; McMaster, M.; Black, A.; Wu, S.; Prakash, R.; Cooper, G. J. S.; North, R. A.; et al. A proteomic approach identifies early pregnancy biomarkers for preeclampsia: Novel linkages between a predisposition to preeclampsia and cardiovascular disease. Proteomics 2009, 9, 2929−2945.
(12) Blankley, R. T.; Robinson, N. J.; Aplin, J. D.; Crocker, I. P.; Gaskell, S. J.; et al. A gel-free quantitative proteomics analysis of factors released from hypoxic-conditioned placentae. Reprod. Sci. 2010, 17, 247−257.
(13) Gharesi-Fard, B.; Zolghadri, J.; Kamali-Sarvestani, E. Proteome differences of placenta between pre-eclampsia and normal pregnancy. Placenta 2010, 31, 121−125.
(14) Epiney, M.; Ribaux, P.; Arboit, P.; Irion, O.; Cohen, M. Comparative analysis of secreted proteins from normal and preeclamptic trophoblastic cells using proteomic approaches. J. Proteomics 2012, 75, 1771−1777.
(15) Johnstone, E. D.; Sawicki, G.; Guilbert, L.; Winkler-Lowen, B.; Cadete, V. J.; Morrish, D. W. Differential proteomic analysis of highly purified placental cytotrophoblasts in pre-eclampsia demonstrates a state of increased oxidative stress and reduced cytotrophoblast antioxidant defense. Proteomics 2011, 11, 4077−4084.
(16) Carty, D. M.; Siwy, J.; Brennand, J. E.; Zürbig, P.; Mullen, W.; et al. Urinary proteomics for prediction of preeclampsia. Hypertension 2011, 57, 561−569.
(17) Epiney, M.; Ribaux, P.; Arboit, P.; Irion, O.; Cohen, M. Comparative analysis of secreted proteins from normal and preeclamptic trophoblastic cells using proteomic approaches. J. Proteomics 2012, 75, 1771−1777.
(18) Shin, J. K.; Baek, J. C.; Kang, M. Y.; Park, J. K.; Lee, S. A.; et al. Proteomic analysis reveals an elevated expression of heat shock protein 27 in preeclamptic placentas. Gynecol. Obstet. Invest. 2011, 71, 151−157.
(19) Marko-Varga, G.; Omenn, G. S.; Paik, Y. K.; Hancock, W. S. A first step toward completion of a genome-wide characterization of the human proteome. J. Proteome Res. 2013, 12 (1), 1−5.
(20) Lee, H. J.; Na, K.; Choi, E. Y.; Kim, K. S.; Kim, H.; et al. Simple method for quantitative analysis of N-linked glycoproteins in hepatocellular carcinoma specimens. J. Proteome Res. 2010, 9, 308−318.
(21) Lee, H. J.; Kang, M. J.; Lee, E. Y.; Cho, S. Y.; et al. Application of a peptide-based PF2D platform for quantitative proteomics in disease biomarker discovery. Proteomics 2008, 8, 3371−3381.
(22) Zauner, G.; Deelder, A. M.; Wuhrer, M. Recent advances in hydrophilic interaction liquid chromatography (HILIC) for structural glycomics. Electrophoresis 2011, 32, 3456−3466.
(23) Jadaliha, M.; Lee, H. J.; Pakzad, M.; Fathi, A.; Jeong, S. K.; et al. Quantitative proteomic analysis of human embryonic stem cell differentiation by 8-plex iTRAQ labeling. PLoS One 2012, 7, e38532.
(24) Jeong, S. K.; Lee, H. J.; Na, K.; Cho, J. Y.; Lee, M. J. GenomewidePDB, a proteomic database exploring the comprehensive protein parts list and transcriptome landscape in human chromosomes. J. Proteome Res. 2013, 12, 106−111.
(25) Swaney, D. L.; McAlister, G. C.; Coon, J. J. Decision tree-driven tandem mass spectrometry for shotgun proteomics. Nat. Methods 2008, 5, 959−964.
(26) Hoegh, A. M.; Borup, R.; Nielsen, F. C.; Sørensen, S.; Hviid, T. V. Gene expression profiling of placentas affected by pre-eclampsia. J. Biomed. Biotechnol. 2010, 2010, 787545.
(27) Zhou, L.; Adams, R. M.; Karuna, G.; Chourey, R. B.; Hettich, H. L.; Pan, C. Systematic comparison of label-free, metabolic labeling, and isobaric chemical labeling for quantitative proteomics on LTQ Orbitrap Velos. J. Proteome Res. 2012, 11, 1582−1590.
(28) Alavi, A.; Axford, J. S. Glyco-biomarkers: potential determinants of cellular physiology and pathology. Dis. Markers 2008, 25, 193−205.
(29) Drake, R. R.; Schwegler, E. E.; Malik, G.; Diaz, J.; Block, T.; et al. Lectin capture strategies combined with mass spectrometry for the discovery of serum glycoprotein biomarkers. Mol. Cell. Proteomics 2006, 5, 1957−1967.
(30) Chen, L.; Fang, B.; Giorgianni, F.; Gingrich, J. R.; Beranova-Giorgianni, S. Investigation of phosphoprotein signatures of archived prostate cancer tissue specimens via proteomic analysis. Electrophoresis 2011, 32, 1984−1991.
(31) Okazaki, S.; Sekizawa, A.; Purwosunu, Y.; Farina, A.; Wibowo, N.; Okai, T. Placenta-derived, cellular messenger RNA expression in the maternal blood of preeclamptic women. Obstet. Gynecol. 2007, 110, 1130−1136.
(32) Leach, L.; Lammiman, M. J.; Babawale, M. O.; Hobson, S. A.; Bromilou, B.; Lovat, S.; Simmonds, M. J. Molecular organization of tight and adherens junctions in the human placental vascular tree. Placenta 2000, 21, 547−557.
(33) MacPhee, D. J.; Mostachfi, H.; Han, R.; Lye, S. J.; Post, M.; Caniggia, I. Focal adhesion kinase is a key mediator of human trophoblast development. Lab. Invest. 2001, 81, 1469−1483.
(34) Burghardt, R. C.; Burghardt, J. R.; Taylor, J. D., 2nd; Reeder, A. T.; Nguen, B. T.; Spencer, T. E.; Bayless, K. J.; Johnson, G. A. Enhanced focal adhesion assembly reflects increased mechanosensation and mechanotransduction at maternal-conceptus interface and uterine wall during ovine pregnancy. Reproduction 2009, 137, 567−582.
(35) Myllynen, P.; Vähäkangas, K. Placental transfer and metabolism: An overview of the experimental models utilizing human placental tissue. Toxicol. In Vitro 2013, 27, 507−512.
(36) Kim, H.; Toyofuku, Y.; Lynn, F. C.; Chak, E.; Uchida, T.; et al. Serotonin regulates pancreatic beta cell mass during pregnancy. Nat. Med. 2010, 16, 804−808.
