ARTICLE pubs.acs.org/jpr
Proteomics Studies of Childhood Pilocytic Astrocytoma
Athanasios K. Anagnostopoulos,†,‡ Konstantinos S. Dimas,§,r Chrissa Papathanassiou,‡ Maria Braoudaki,‡,|| Ema Anastasiadou,† Konstantinos Vougas,† Kalliopi Karamolegou,‡ Harry Kontos,^ Neofytos Prodromou,# Fotini Tzortzatou-Stathopoulou,|| and George Th. Tsangaris*,†
† Proteomics Research Unit, Center of Basic Research II, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
§ Pharmacology Division, Center of Basic Research I, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
‡ Hematology/Oncology Unit, First Department of Pediatrics, University of Athens, Aghia Sophia Children’s Hospital, Athens, Greece
|| University Research Institute for the Study and Treatment of Childhood Genetic and Malignant Diseases, University of Athens, Aghia Sophia Children’s Hospital, Athens, Greece
^ “Genomedica” Molecular Diagnostics Laboratory, Piraeus, Greece
# Department of Neurosurgery, Aghia Sophia Children’s Hospital, Athens, Greece
Supporting Information
ABSTRACT: Childhood pilocytic astrocytoma is the most frequent brain tumor affecting children. Proteomics analysis is currently considered a powerful tool for global evaluation of protein expression and has been widely applied in the field of cancer research. In the present study, a series of proteomics, genomics, and bioinformatics approaches were employed to identify, classify, and characterize the proteome content of low-grade brain tumors as it appears in early childhood. Through bioinformatics database construction, protein profiles generated from pathological tissue samples were compared against profiles of normal brain tissues. Additionally, comparative genomic hybridization array experiments were employed to monitor for genetic aberrations and to support the interpretation and evaluation of the proteomic data. The current study confirms the dominant role of the MAPK pathway in the occurrence of childhood pilocytic astrocytoma and reports novel findings regarding ERK-2 expression.
KEYWORDS: proteomics, pilocytic astrocytoma
Received: January 11, 2011. Published: April 06, 2011. © 2011 American Chemical Society
dx.doi.org/10.1021/pr200024m | J. Proteome Res. 2011, 10, 2555–2565

1. INTRODUCTION
Pediatric brain tumors represent a devastating disease affecting thousands of children worldwide each year. Emerging genomics and proteomics technologies are currently transforming the field of cancer research, promising to radically improve our understanding of the underlying molecular basis of this disease type. Astrocytomas are traditionally thought to originate from astrocytes and their precursor cells. The origin of astrocytomas involves a number of characteristic gene alterations, including the activation of proto-oncogenes and inactivation of tumor suppressor genes that play an important role in cellular growth, apoptosis, motility, and invasion pathways.1 On the basis of their histological characteristics and classification guidelines given by the World Health Organization (WHO), astrocytomas are divided into four grades: pilocytic astrocytoma (Grade I), diffuse astrocytoma (Grade II), anaplastic astrocytoma (Grade III), and glioblastoma multiforme (Grade IV).2 Childhood pilocytic astrocytoma (CPA) is the most frequent brain tumor affecting children.3,4 CPA is usually not infiltrating, and progression to higher grades is rare. Although gross total resection may result in cure, recurrence is seen in 19% of cases.5 With more than one-third of the recurrent tumors not being amenable to complete resection, patients who undergo subtotal resection without adjuvant therapy have poor long-term survival.6 Histologically, CPA commonly arises in the cerebellar hemispheres, is usually of low cellularity, and is characterized by elongated or bipolar tumor cells accompanied by Rosenthal fibers and an atypical vascular pattern with hyalinized and sometimes glomeruloid vessels. However, CPA can show a wide morphogenic spectrum, with areas resembling oligodendroglioma or higher grade astrocytomas.2 Cytogenetic analyses in CPA reveal either normal karyotypes or a variety of aberrations that fail to follow a distinct pattern; the lone exception is frequent trisomy 7 and 8 (33% of cases).7

Proteomics analysis is currently considered a powerful tool for global evaluation of protein expression and has been widely applied in analyses of diseases, especially in the field of cancer research.8,9 In addition to known genetic and epigenetic alterations, there are other factors (e.g., molecular changes in translation, posttranslational modification, intracellular mislocalization) involved in tumor initiation and growth, which cannot be detected either by measuring the amount of RNA or by detecting nucleotide sequence variation.10 Therefore, characterization of protein patterns in malignant cells/tissues is at least complementary to cDNA microarrays for identifying molecules involved in the cancer progression process. In contrast to increasing knowledge regarding the genetic aberrations in high grade glioma, relatively little is known about protein expression in CPA.

The current study delivers, for the first time, the analysis of CPA from a proteomic, genomic, and bioinformatics point of view. The analysis establishes the 2D-E reference map of CPA, delivering a global overview of whole tissue protein expression for the human pilocytic astrocytoma in childhood. Characterization and classification of all identified proteins is performed through network and bioinformatics analyses. Furthermore, protein profiles of CPA are compared to profiles generated from normal brain tissues, and the resulting differences are discussed herein. Finally, the proteomic findings are interpreted and evaluated through data obtained by comparative genomic hybridization array (a-CGH) analyses.
2. MATERIALS
Brain tumor tissues were obtained from patients who had undergone tumor resection at the Aghia Sophia Children’s Hospital. All procedures were performed in accordance with human subject guidelines approved by the Ethical Committee of the Athenian University, after informed consent from the patients’ parents. Brain tissue samples diagnosed as nonmalignant were used as normal controls in the analysis (Table 1).

Table 1. Clinical Features of Human Brain Tissue Samples
clinical features            number %
Normal Tissues
  Mean Age                   4 ± 1 (2–9)
  Gender (male/female)       2/1
  Total                      3
Pilocytic Astrocytoma
  Mean Age                   5 ± 1 (2–8)
  Gender (male/female)       4/2
  Total                      6
All samples were collected at initial diagnosis with no prior exposure to chemotherapy or radiation therapy. Each tumor was examined microscopically after staining with hematoxylin and eosin (Figure 1). Sections were stained for GFAP, synaptophysin (SY38), and keratin AE1/AE3 by immunohistochemistry, and final classification was performed in accordance with recent guidelines by the WHO. Proteomics experiments on tumor samples were carried out immediately after completion of surgery, without any intervening freeze–thaw cycle. Two replicate 2D-E gels were prepared for each sample.
3. METHODS
3.1. Sample Preparation
To remove excess blood from the tissues (which may interfere with 2D-PAGE analysis), whole tumors were washed 3 times in sucrose buffer consisting of 20 mM HEPES, pH 7.5, 320 mM sucrose, 1 mM EDTA, 5 mM DTE, and 1 mg/mL of a mixture of protease inhibitors [1 mM PMSF and 1 tablet (Roche Diagnostics) per 50 mL of wash buffer and phosphatase inhibitors (0.2 mM Na3VO4 and 1 mM NaF)]. The tissue was further homogenized in a glass Wheaton (tight) homogenizer in a buffer consisting of 8 M urea, 40 mM Tris-HCl (pH 8.5), 2 M thiourea, 4% CHAPS, 1% dithioerythritol (DTE), 0.2% IPG buffer pH 3–10 (Amersham Biosciences), and 1 mM PMSF. The homogenate was left at room temperature for 1 h and centrifuged at 13 000 rpm for 30 min. Desalting was performed with an Ultrafree-4 centrifugal filter unit (Millipore). The protein content of the supernatant was determined using the Bradford quantification method.

3.2. Two-Dimensional Electrophoresis
Two-dimensional gel electrophoresis was performed essentially as reported.11 Samples of 1 mg total protein were applied on 18 cm, pI 3–10 NL IPG strips (Bio-Rad Lab, Hercules, CA), at their basic and acidic ends, using sample cups. IPG strips had been prepared for IEF by 20 h rehydration in a buffer of 8 M urea, 4% CHAPS, and 1% DTE. First-dimension focusing started at 250 V, and the voltage was gradually increased to 8000 V at 3 V/min, where it was kept constant for 25 h (approximately 150 000 Vh in total). IEF was conducted in a PROTEAN IEF Cell (Bio-Rad). After focusing, IPG strips were equilibrated first in 6 M urea, 50 mM Tris-HCl (pH 8.8), 2% (w/v) SDS, 30% (v/v) glycerol, and 0.5% (w/v) DTE for 15 min, then in the same buffer containing 4% (w/v) iodoacetamide instead of DTE for 15 more minutes. Second-dimension electrophoresis was performed on 12% SDS-polyacrylamide gels (180 × 200 × 1.5 mm) run at 40 mA/gel in PROTEAN II multicell apparatuses (Bio-Rad, Hercules, CA).

Figure 1. Histological demonstration of childhood pilocytic astrocytoma. Hematoxylin and eosin stained micrographs at (A) 200× and (B) 400× magnification of tissue from primary tumor demonstrate the interval increase in cellularity observed in pilocytic astrocytoma.

3.3. Peptide Mass Fingerprinting and Identification of Proteins
Peptide mass fingerprinting analysis was performed essentially as described previously.12 Briefly, all spots on the gels were annotated semiautomatically using the Melanie 4.02 software, excised with a Proteiner SPII robot (Bruker Daltonics, Bremen, Germany), and placed into 96-well microtiter plates. The excised spots were destained using 180 μL of 100 mM ammonium bicarbonate in 30% ACN, and the gel pieces were dried in a speed vacuum concentrator (MaxiDry Plus, Heto, Denmark). Each dried gel piece was rehydrated with 5 μL of 20 μg/mL recombinant trypsin (proteomics grade, Roche Diagnostics, Basel, Switzerland) solution. After 16 h at room temperature, 10 μL of 50% acetonitrile containing 0.3% trifluoroacetic acid was added, and the gel pieces were incubated for 15 min with gentle shaking. Sample application to a target plate and analysis, as well as peptide matching and protein searching, were carried out as described previously.12 Briefly, tryptic peptide mixtures (1 μL) were applied on an anchor chip MALDI plate with 1 μL of matrix solution, consisting of 0.08% CHCA (Sigma) and the internal standard peptides des-Arg-bradykinin (Sigma, 904.4681 Da) and adrenocorticotropic hormone fragment 18–39 (Sigma, 2465.1989 Da) in 65% ethanol, 50% ACN, and 0.1% TFA. Peptide mixtures were analyzed in a MALDI-ToF mass spectrometer (Ultraflex II, Bruker Daltonics). Laser shots (n = 400) of intensity between 40 and 60% were collected and summed, and the peak list was created using the FlexAnalysis v2.2 software (Bruker). Smoothing was applied with the Savitzky-Golay algorithm (width 0.2 m/z, cycle number 1). A signal-to-noise (S/N) threshold ratio of 2.5 was allowed. The SNAP (Bruker) algorithm was used for peak picking. Tryptic autodigest peaks as well as commonly occurring keratin contaminant peaks were filtered out by the software prior to the protein identification process.
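The role of the two internal standards can be illustrated with a minimal sketch. It assumes a simple linear two-point correction of the m/z axis, which is a simplification chosen for illustration and not necessarily the calibration model applied by the instrument software; the function name and the observed values used in the usage note are hypothetical.

```python
# Masses of the two internal standards as quoted in the text (Da).
STD_LOW = 904.4681    # des-Arg-bradykinin
STD_HIGH = 2465.1989  # adrenocorticotropic hormone fragment 18-39


def two_point_calibration(obs_low, obs_high):
    """Return a function mapping observed m/z to calibrated m/z.

    Assumption: a straight line through the two standards is adequate.
    Real MALDI-ToF calibration models are usually more elaborate.
    """
    if obs_high == obs_low:
        raise ValueError("the two standards must be observed at different m/z")
    slope = (STD_HIGH - STD_LOW) / (obs_high - obs_low)
    intercept = STD_LOW - slope * obs_low

    def calibrate(mz):
        return slope * mz + intercept

    return calibrate
```

For instance, if the standards were observed at the hypothetical positions 904.50 and 2465.30, `two_point_calibration(904.50, 2465.30)` would return a function that maps those two readings back onto the quoted standard masses and shifts every peak in between proportionally.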
Peptide matching and protein searches were performed automatically with MASCOT Server 2 (Matrix Science). Peptide masses were compared with the theoretical peptide masses of all available proteins from Homo sapiens in the Swiss-Prot database. Stringent criteria were used for protein identification, with a maximum allowed mass error of 10 ppm and a minimum of four matching peptides. A probability score with p < 0.05 was used as the criterion for affirmative protein identification. Monoisotopic masses were used, and one missed trypsin cleavage site was allowed for proteolytic products.
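The two numerical acceptance criteria stated above (mass error of at most 10 ppm, at least four matching peptides) can be sketched as follows. This is only an illustration of the tolerance arithmetic; it does not reproduce MASCOT's probability-based scoring, and the function names and the masses in the usage note are hypothetical.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def count_matches(observed_peaks, theoretical_masses, tol_ppm=10.0):
    """Count theoretical peptide masses matched by at least one observed peak."""
    matched = 0
    for theo in theoretical_masses:
        if any(abs(ppm_error(obs, theo)) <= tol_ppm for obs in observed_peaks):
            matched += 1
    return matched


def passes_criteria(observed_peaks, theoretical_masses,
                    tol_ppm=10.0, min_peptides=4):
    """Apply the two thresholds quoted in the text (10 ppm, four peptides)."""
    return count_matches(observed_peaks, theoretical_masses, tol_ppm) >= min_peptides
```

With illustrative theoretical masses of 1000.0, 1500.0, 2000.0, 2500.0, and 3000.0 and observed peaks at 1000.005, 1500.010, 2000.030, 2500.020, and 3000.000, four of the five peptides fall within 10 ppm (the peak at 2000.030 is 15 ppm off), so the candidate would meet the four-peptide minimum.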
3.4. Determination of Proteins Differentially Expressed in CPA and Normal Neurons
Routinely, in order to distinguish differentially expressed proteins, spot patterns of 2D-E gels from pathological samples are matched and compared to the ones found in normal/controls.11 However, this “conventional” comparative image analysis procedure may be hampered by several factors, such as moderate reproducibility or even partly suboptimal resolution of 2D-E gels, resulting in difficulties in spot matching across gels. Since these problems affect image-analysis-driven comparative studies, but mostly since the 2D-E gel images of CPA and normal tissue produced in this study were, to say the least, not similar, an alternative strategy to identify differentially expressed proteins was employed. For this purpose, proteins from tumor tissue samples and controls were resolved on 2-DE gels, and all spots from all gels were subjected to MS analysis.13 More specifically, 3 normal samples and 6 CPA samples were resolved on 2-DE gels in duplicate; hence, 18 gels were produced in total. All the detected spots from all the gels were picked and analyzed by MALDI-TOF MS. The identification process resulted in a list of protein entries. Only proteins identified in all samples of a category (CPA or normal) were considered statistically significant and were included in the final CPA and normal databases. Comparison of these databases revealed the proteins that were present in one category and absent from the other (considered as up- or down-regulated, respectively). These proteins are presented as differentially expressed.

3.5. Classification and Functional Clustering
All identified proteins in CPA were assigned their gene symbol via the UniProt Knowledgebase (http://www.uniprot.org/). Protein classification was performed based on their functional annotations using Gene Ontology (GO) for biological process and subcellular localization. Analyses were performed for all identified proteins; when more than one assignment was available, all of the functional annotations were considered in the results.

3.6. Chromosome Distribution and KEGG Analysis
The chromosomal distribution of genes encoding the identified proteins was compared with known proportions of genes on human chromosomes using a standard χ² contingency table.14 The chromosome distribution graph was generated by WebGestalt (WEB-based GEne SeT AnaLysis Toolkit, http://bioinfo.vanderbilt.edu/webgestalt).15 GO hierarchy clustering was performed using the Gene Ontology Tree Machine (GOTM) software.16 Proteins were classified using the GO functional annotations for molecular function. Annotation categories were taken from level 2 in the GO trees by setting a p value of <0.01 and using Fisher’s exact test. GO enrichment analysis was conducted by calculating the probability that the number of annotations in the protein list could have arisen by chance, assuming an underlying hypergeometric distribution.17 Analysis for KEGG-enriched categories was also carried out. All identified proteins were compared against the WebGestalt human proteome database. Using a p value of <0.01, Fisher’s exact test as the statistical method, and 2 genes as the minimum threshold for the analysis, proteins were examined to display all pathways that implicate them and to graphically highlight the input genes within the pathway maps (http://www.genome.jp/kegg/tool/search_pathway.html).

3.7. Array-CGH: Hybridization of Differentially Labeled Tumor and Normal DNA to BAC Clone Microarray
Genomic DNA was isolated by 3-day proteinase K digestion followed by phenol-chloroform-isoamyl alcohol extraction. DNA labeling, hybridization of labeled DNA to a 3379 BAC clone microarray (CytoChip Focus Constitutional v.1.1, BlueGnome Ltd., Cambridge, England), and array analysis were performed according to the manufacturer’s instructions (BlueGnome Ltd.). In brief, 400 ng of genomic DNA was labeled by random priming with Cy3-dCTP (BlueGnome) and cohybridized with 400 ng of sex-mismatched (female) reference DNA (Promega) labeled in the same way with Cy5-dCTP. Test and reference samples were mixed and coprecipitated in the presence of 125 μg Cot-1 DNA (Roche) and resuspended in 21 μL of a hybridization solution containing 50% formamide and 15% dextran sulfate (BlueGnome). An overnight hybridization at 47 °C was performed, followed by four posthybridization wash cycles according to the manufacturer’s instructions. Slides were dried by centrifugation and scanned using an InnoScan 700 Micro-Array Scanner (Innopsys, Carbonne, France). Spot identification and two-color fluorescence intensity measurements were obtained using the Mapix 3.1.0 software (Innopsys), and data were loaded onto the BlueFuse Multi Microarrays v.2.2 software (BlueGnome) for subsequent analysis. Following normalization, the log2-transformed test-over-reference ratios were analyzed for loss and gain of genomic regions.

Figure 2. Two-dimensional reference map of human pilocytic astrocytoma in childhood. The proteins were separated on a 3–10 nonlinear IPG strip, followed by 12% SDS-polyacrylamide gel, as stated under Materials and Methods. The gel was stained with Coomassie blue and the spots were analyzed by MALDI-MS. The proteins which were present in all samples analyzed are designated with their accession numbers. The names of the proteins are listed in Supplementary Table 2 (Supporting Information).

3.8. Western Blot Analysis
Total protein (10 μg) of control (n = 3) and CPA tissues (n = 4) was separated by 10% SDS-PAGE under reducing conditions and electroblotted to Hybond-ECL NC membranes (Amersham Biosciences, Uppsala, Sweden). After blocking with 5% nonfat dried milk in TBST solution (20 mM Tris/pH 7.6, 137 mM NaCl, 0.1% Tween 20) for 1 h at room temperature, membranes were washed with TBST and incubated overnight at 4 °C with the primary antibody against ERK-2 (1:200; sc-65981; Santa Cruz Biotechnology, Santa Cruz, CA). Next, membranes were washed with TBST and incubated with anti-mouse HRP-conjugated secondary antibody (1:5000). After a final wash with TBST solution, proteins were detected with the ECL West Pico (Pierce, Rockford, IL) detection system. Western blots were scanned with a GS-800 calibrated densitometer (Bio-Rad Lab). Band quantification was performed with the Quantity One image processing software (Bio-Rad Lab). Actin-beta was used as internal control to ensure equal sample loading. All antibodies were purchased from Santa Cruz Biotechnology.
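One common way to use a loading control such as actin-beta in band quantification is to divide each target band density by the loading-control density from the same lane and then compare group means. The sketch below illustrates that arithmetic only; the text does not state which normalization was actually applied, and the function names and all density values in the usage note are hypothetical.

```python
from statistics import mean


def normalized_densities(target, loading):
    """Divide each target band density by its lane's loading-control density."""
    if len(target) != len(loading):
        raise ValueError("one loading-control value is required per lane")
    return [t / l for t, l in zip(target, loading)]


def fold_change(tumor_norm, control_norm):
    """Ratio of the mean normalized signal in tumor lanes to control lanes."""
    return mean(tumor_norm) / mean(control_norm)
```

As an example with made-up numbers, three control lanes with ERK-2 densities of 100, 120, and 80 against actin densities of 200, 240, and 160 each normalize to 0.5; four tumor lanes whose ERK-2 and actin densities are equal each normalize to 1.0, so the fold change is 2.0.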
Figure 3. Classification of proteins identified in CPA, as listed in Supplementary Table 2 (Supporting Information), into (A) functional categories and (B) distinct cellular compartments. The distribution frequencies for the specified categories within the given charts are indicated in % of the total number of protein entries. For each classification, a cutoff value of 5% was set, meaning that all of the cellular functions/compartments represented below this threshold are summarized under the subset “other”.
3.9. Network Analysis
All protein identifications, both those solely expressed in CPA tissues and those differentially expressed between CPA and normal brain neurons, were used for pathway analysis. For this purpose, the Swiss-Prot accession numbers were entered into the Ingenuity Pathway Analysis (IPA) software (Ingenuity Systems, Mountain View, CA). This software categorizes gene products based on the location of the protein within cellular components and suggests possible biochemical, biological, and molecular functions. Furthermore, proteins were mapped to genetic networks available in the Ingenuity database and ranked by score. These genetic networks describe functional relationships between gene products based on known interactions in the literature. Through the IPA software, the newly formed networks were associated with known biological pathways.
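The accession lists that feed the pathway analysis come from the presence/absence strategy of Section 3.4: a protein enters a category's database only if it is identified in every sample of that category, and the two databases are then compared. A minimal sketch of that logic with set operations is given below; the function names and the placeholder accessions in the usage note are hypothetical and do not represent results of the study.

```python
def present_in_all(samples):
    """Accessions identified in every sample of one category."""
    sets = [set(s) for s in samples]
    return set.intersection(*sets) if sets else set()


def compare_categories(cpa_samples, normal_samples):
    """Build the two consensus databases and split them by presence/absence."""
    cpa = present_in_all(cpa_samples)
    normal = present_in_all(normal_samples)
    return {
        "cpa_only": cpa - normal,      # present in CPA, absent from normal
        "normal_only": normal - cpa,   # present in normal, absent from CPA
        "shared": cpa & normal,        # present in both databases
    }
```

For example, with placeholder CPA samples `{"A", "B", "C"}`, `{"A", "B", "D"}`, `{"A", "B", "C"}` and normal samples `{"B", "E"}`, `{"B", "E", "F"}`, the CPA consensus is `{"A", "B"}` and the normal consensus is `{"B", "E"}`, so `"A"` is CPA-only, `"E"` is normal-only, and `"B"` is shared. A consequence of this design is that a protein missing from even one sample of a category is dropped from that category's database.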
4. RESULTS
4.1. 2D-E Reference Map of Childhood Pilocytic Astrocytoma
CPA proteins are mostly present in the acidic region (pI 4–7), although theoretical pI values predicted protein accumulation in pI 8–10 as well. The number of protein spots detected in each tissue sample varied, with a mean of 1350 ± 50 spots per gel. For the analysis, more than 24 000 spots were excised from 2D-E gels and analyzed by MALDI TOF-MS. The identification process resulted in a list of 18 000 protein entries (75% identification rate). Out of these, only proteins present in all samples analyzed were included in the final CPA reference map/database. Supplementary Table 2 (Supporting Information) lists these proteins along with their accession number, MW, theoretical pI, and Mascot score used for identification. Altogether, 150 different gene products (proteins) were found to be commonly expressed in all CPA samples. The majority of spots identified in CPA contained single proteins, but in several cases MS analysis indicated mixtures of proteins under the same spot. Protein distribution over multiple spots was observed frequently in the analysis. Isoforms of VIME (P08670), GFAP (P14136), G3P (P04406), and CRYAB (P02511) are located in several areas of the proteome map, suggesting possible post-translational modifications (PTMs) of these proteins in CPA. A characteristic 2D-E master gel image consisting of annotations for all 150 gene products is presented in Figure 2.

4.2. Protein Classification
On the basis of existing gene ontology information, the proteins expressed in CPA (Supplementary Table 2, Supporting Information) were clustered according to their cellular function and localization (Figure 3A and B). The proteins are involved in several biological processes which facilitate tumor-relevant adjustments regarding growth, invasion, and apoptosis as well as inflammation. A significant number of identified proteins (14%) are engaged in cell metabolism, 19% are involved in cell motility, and 12% are important in terms of cell proliferation, while others are connected to signal transduction (9%), stress response (5%), or exert multiple functions (5%). Less frequently represented are proteins with known, but not further categorized, ontology information (4.5%), proteins associated with structural integrity (4.5%) and cellular adhesion (2%), as well as proteins with ion binding characteristics (1%). Thus, categories exhibiting a frequency of