Article

Proteomic Characterization of the Neural Ectoderm Fated Cell Clones in the Xenopus laevis Embryo by High-resolution Mass Spectrometry Aparna B. Baxi, Camille Lombard-Banek, Sally A. Moody, and Peter Nemes ACS Chem. Neurosci., Just Accepted Manuscript • DOI: 10.1021/acschemneuro.7b00525 • Publication Date (Web): 26 Mar 2018 Downloaded from http://pubs.acs.org on March 27, 2018

Proteomic Characterization of the Neural Ectoderm Fated Cell Clones in the Xenopus laevis Embryo by High-resolution Mass Spectrometry Aparna B. Baxi1,2, Camille Lombard-Banek1, Sally A. Moody2, and Peter Nemes1,2*

1 Department of Chemistry & Biochemistry, University of Maryland, College Park, MD 20742;

2 Department of Anatomy & Regenerative Biology, The George Washington University, Washington, DC 20052

*Correspondence to: Department of Chemistry & Biochemistry, University of Maryland, 8051 Regents Drive, College Park, MD 20742. Phone: (1) 301-405-0373. Fax: (1) 301-314-9121. Email: [email protected].

Key Words. Mass spectrometry, proteomics, neural ectoderm, Xenopus laevis

Abbreviations: HRMS, high-resolution mass spectrometry; ESI, electrospray ionization; LC, liquid chromatography; LFQ, label free quantification; NE, neural ectoderm

Invited contribution to the special issue on Model Systems in Neuroscience ACS Chemical Neuroscience

ABSTRACT The molecular program by which embryonic ectoderm is induced to form neural tissue is essential to understanding normal and impaired development of the central nervous system. Xenopus has been a powerful vertebrate model in which to elucidate this process. However, abundant vitellogenin (yolk) proteins in cells of the early Xenopus embryo interfere with protein detection by high-resolution mass spectrometry (HRMS), the technology of choice for identifying these gene products. Here, we systematically evaluated strategies of bottom-up proteomics to enhance proteomic detection from the neural ectoderm (NE) of X. laevis using nano-flow high-performance liquid chromatography (nanoLC) HRMS. From whole embryos, high-pH fractionation prior to nanoLC-HRMS yielded 1,319 protein groups vs. 762 proteins without fractionation (control). Compared to 702 proteins from dorsal halves of embryos (control), 1,881 proteins were identified after yolk platelets were depleted via sucrose-gradient centrifugation. We combined these approaches to characterize protein expression in the NE of the early embryo. To guide microdissection of the NE tissues from the gastrula (stage 10), their precursor (midline dorsal-animal, or D111) cells were fate-mapped from the 32-cell embryo using a fluorescent lineage tracer. HRMS of the cell clones identified 2,363 proteins, including 147 phosphoproteins (without phosphoprotein enrichment), transcription factors, and members from pathways of cellular signaling. In reference to transcriptomic maps of the developing X. laevis, 55 proteins involved in signaling pathways were gene-matched to transcripts with known enrichment in the neural plate. Besides a protocol, this work provides qualitative proteomic data on the early developing NE.

INTRODUCTION

Formation of the vertebrate central nervous system begins with the induction of embryonic cells of the ectoderm to a neural fate. Amphibians have long been used as models for the study of neural ectoderm (NE) induction.1-3 The South African clawed frog (Xenopus laevis) offers particular advantages for basic biological and translational studies on normal and impaired development (reviewed in references 4-7). Embryos of Xenopus laevis develop externally to the mother, are much larger (1‒1.5 mm in diameter) than those of mouse or fish, develop rapidly (~15 h from fertilization to neurulation), and are amenable to surgical and molecular interventions.8 Furthermore, these embryos undergo stereotypical development: the Xenopus laevis NE arises predominantly from the midline dorsal-animal cell of the 32-cell embryo9 (see Fig. 1). While considerable progress has been made in characterizing morphological and molecular processes during NE induction and their significance to development (reviewed in references 10, 11), the comprehensive suite of molecules underpinning neural tissue development is yet to be determined.

Much of our knowledge of neural development comes from decades of innovative embryological manipulations and molecular biological studies, many using X. laevis. Embryonic explant-based screening of growth factors revealed that anti-bone morphogenetic protein (BMP) factors and Wnt (wingless) and fibroblast growth factor (FGF) signaling are involved in neural induction.2 Knock-down and overexpression studies found maternally expressed transcription factors, such as sox11, zic2, foxD5, and geminin, to induce the expression of neural genes later in development.1, 3 More recently, comprehensive transcriptomic and proteomic analyses of ectodermal cells generated from human induced pluripotent stem cells corroborated the involvement of Wnt/β-catenin, the transforming growth factor (TGF-β) superfamily, and nuclear
factor κB (NF-κB) signaling pathways during ectodermal cell differentiation.12 Moreover, there is emerging evidence that small molecules also can impact neural tissue development. For example, purine has been found to trigger the expression of eye field transcription factors and eye development.13 Likewise, the small molecules (metabolites) acetylcholine, methionine, threonine, and histidine have been shown to alter neural vs. epidermal tissue fates in X. laevis.14 These results highlight the importance of studying a broad variety of molecules to understand the intricacy of neural induction, ranging from genes and transcripts to proteins and metabolites. Technological innovations provided new insights into the spatial and temporal organization of these molecules in X. laevis. Using polymerase chain reaction, microarrays, RNA-Seq, and recently, next-generation sequencing, transcriptional changes were found to unfold as the cleavage stage embryo develops to gastrulate15 and then neurulate16, 17. In parallel, shotgun (bottom-up) proteomics by high-resolution mass spectrometry (HRMS) augmented transcriptomic data with information on the expression of thousands of proteins18, 19 and their posttranslational modification (e.g., phosphorylation20) in the early developing whole embryo. For example, iTRAQ-based quantification of ~4,000 proteins captured sudden changes upon mid-blastula transition as the X. laevis progressed from a fertilized egg to stage 22 embryo,21 when many primary organ precursors are formed. Recent studies have found surprising complexity in mRNA-protein dynamics,19, 22 thus calling for the dual use of mRNA sequencing and proteomics by HRMS to better understand embryonic development. Improvements in HRMS sensitivity have ushered proteomic investigations to single cells, providing previously unavailable insights into cell and developmental biology. 
Using nano-flow liquid chromatography (nanoLC) HRMS, nucleocytoplasmic partitioning was probed for ~9,000 different proteins by microdissecting nuclei from X. laevis oocytes, revealing passive retention,
rather than active transport, to be responsible for maintaining the nuclear and cytoplasmic proteomes.23 Additionally, recent advances in single-cell HRMS (reviewed in references 24, 25) uncovered proteomic26, 27 and metabolomic14, 28-30 differences between identified cells that form the dorsal-ventral, animal-vegetal, or left-right axis in 8-, 16-, or 32-cell X. laevis embryos. Remarkably, metabolomic or proteomic cell heterogeneity was only partially detectable at the level of transcripts.31 These single-cell proteomic data provide information on the translation of genes that are important to neural fate determination, including geminin and isthmin.26 Because these molecular cell differences are unaccounted for during traditional whole-embryo measurements, there is a need for new approaches that can provide tissue- or cell-type specific information on molecular changes during development. To address this knowledge gap, we developed a tissue-specific HRMS-based methodology that allows the characterization of protein expression in NE-fated cell clones of the X. laevis gastrula. After classical cell fate mapping of the clones, the labeled tissues were microdissected. The proteins were extracted and detected via a bottom-up (shotgun) approach, whereby proteotypic peptides were generated by tryptic digestion and identified using discovery HRMS. For deeper coverage of the gene-encoded proteome, we systematically tested methodologies to reduce the concentration of abundant proteins extracted from the tissues, which typically challenge identification, and to simplify peptide complexity prior to HRMS. The resulting methodology allowed ~2,300 protein groups and ~150 phosphorylated proteins to be identified from the NE-fated tissue of the early gastrula-stage embryo. By complementing information on mRNA expression, these proteomic data raise possibilities for follow-up studies to better understand molecular mechanisms of NE induction in the vertebrate embryo.

RESULTS AND DISCUSSION

Abundant Yolk Challenges Protein Detection in Xenopus Tissues. The purpose of this study was to characterize proteins that are expressed in the neural ectoderm (NE), which is a prerequisite for the formation of the central nervous system. We selected X. laevis as the model, for which decades of research have identified many genes with important roles in neural tissue induction. For example, induction of the NE depends on an intricate interplay between transcription factors (e.g., sox11, zic2, foxD5, and geminin), secreted proteins (e.g., anti-BMP factors), and signaling pathways, such as Wnt and FGF (reviewed in reference 11). Because of complex dynamics between transcription and translation in developing embryos,19 there is a strong need for the direct analysis of proteins to better understand molecular mechanisms of tissue induction, which we provide in this study.

To characterize protein production in NE-fated tissues, we integrated classical embryological microsurgery with untargeted proteomics by HRMS. Figure 1 illustrates our experimental approach. It begins with the labeling of the NE-fated cell clones by an inert fluorescent tracer in the 32-cell embryo, namely a dextran-conjugated fluorescent dye (Alexa Fluor 488) in this work. After culturing the embryo to stage 10, the labeled cell clones were dissected and processed for bottom-up proteomic analysis: proteins were extracted and digested to peptides, which were detected by nanoLC-HRMS. To improve protein identification, we tested two distinct strategies. High-pH reversed-phase (Hp-RP) fractionation was implemented to reduce peptide complexity prior to nanoLC-HRMS. The other approach used density-gradient centrifugation to reduce the abundance of yolk platelets prior to HRMS.

As a control, a standard bottom-up proteomic workflow was applied to whole embryos (Fig. 2). Proteins were extracted from n = 5 embryos (stage 10) in the lysis buffer, digested, and
1 µg of the resulting peptides (confirmed by a Total Peptide Assay) was analyzed by nanoLC-nanoESI-HRMS (see Methods). A total of 3,592 peptides were identified, which could be assigned to 782 different protein groups against the Xenopus laevis proteome (see Methods). A comprehensive list of the identified proteins is provided in Supplementary Table 1 (Table S1). Calculated LFQ intensities suggested that detected proteins occupied an ~5 log-order dynamic range (Fig. 2A), highlighting significant molecular complexity. In the lower domain of the concentration range, representative proteins included gene products with important roles during development. For example, protein phosphatase 2 regulatory subunit A (Ppp2r1a), a protein involved in the FGF signaling pathway, is essential for embryonic patterning,17 and profilin-1 (Pfn-1), which binds to actin to regulate the structure of the cytoskeleton, is essential for cell migration during development.32 In the middle domain of the concentration range, transcription factors were detected, including staphylococcal nuclease domain containing protein 1 (Snd1) and high mobility group box 2 (Hmgb2), which are involved in cellular proliferation.33, 34 The middle–high domain of the concentration window was populated by common house-keeping proteins, such as glyceraldehyde-3-phosphate dehydrogenase (Gapdh) and fructose-bisphosphate aldolase (Aldoa and Aldoc), which are key to the glycolysis pathway, and ribosomal proteins that regulate protein translation. At the highest concentrations, the different chains of vitellogenin were detected (Vtga1, Vtga2, Vtgb1, and Vtgb2). Based on the calculated LFQ intensities, which approximate protein abundances,35 these yolk proteins accounted for ~87% of the total protein abundance in the embryo. In agreement with these data, yolk proteins are known to account for ~90% of the total protein content of the Xenopus laevis egg, providing a rich source of nutrients for the early developing embryo.36
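The two summary figures quoted above (the yolk share of total abundance and the log-order dynamic range) can both be derived from a table of LFQ intensities. The following is a minimal sketch of that arithmetic, not the authors' analysis pipeline. The function names are hypothetical, and the intensity values are invented placeholders chosen only to illustrate the calculation.

```python
import math

# Gene symbols of the four vitellogenin chains named in the text.
YOLK = {"Vtga1", "Vtga2", "Vtgb1", "Vtgb2"}


def yolk_share(lfq, yolk_ids=YOLK):
    """Return the fraction of the summed LFQ intensity attributed to yolk proteins."""
    total = sum(lfq.values())
    yolk = sum(value for name, value in lfq.items() if name in yolk_ids)
    return yolk / total


def dynamic_range_log10(lfq):
    """Return log10 of the ratio between the highest and lowest nonzero intensity."""
    values = [value for value in lfq.values() if value > 0]
    return math.log10(max(values) / min(values))


# Hypothetical intensities for illustration only (not measured data).
example_lfq = {
    "Vtga1": 3.0e10, "Vtga2": 2.5e10, "Vtgb1": 2.0e10, "Vtgb2": 1.2e10,
    "Gapdh": 8.0e9, "Aldoa": 4.0e9, "Hmgb2": 9.0e8, "Pfn1": 1.0e8,
    "Ppp2r1a": 8.7e5,
}

print(f"yolk share of total LFQ intensity: {yolk_share(example_lfq):.0%}")
print(f"dynamic range: {dynamic_range_log10(example_lfq):.1f} log orders")
```

Because LFQ intensity only approximates abundance, a share computed this way is best read as an estimate rather than an absolute mass fraction.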

These abundant vitellogenin proteins were expected to interfere with the detection of other proteins. In addition to being present at high concentration, yolk proteins are appreciably large, thus giving rise to numerous peptides upon tryptic digestion, each with high abundance. For example, in silico digestion by trypsin predicts Vtga2 (~200 kDa) to yield at least ~400 peptides of >5 amino acids in length, and this peptide complexity is further expanded by variable modifications. Figure 2B evaluates detection of peptide signals in the control sample. The yolk proteins gave rise to a massive spectral background covering the entire span of the separation window: ~15–65% of peptide spectral matches (PSMs) were attributed to Vtga1, Vtga2, Vtgb1, and Vtgb2 as nanoLC separation unfolded. During data-dependent acquisition (DDA) of PSMs, peptide signals from these overabundant proteins were expected to challenge the identification of non-yolk proteins by interfering with ion generation during ESI, saturating the duty cycle during MS or MS/MS events, and/or biasing MS/MS transitions against peptide signals present at lower signal intensity.

High-pH Reversed-phase Fractionation Improved Protein Identification. Improving protein identifications required a revision of the experimental workflow. Encouraged by the success of orthogonal separations in reducing peptide complexity, we adapted high-pH reversed-phase (RP) fractionation prior to low-pH nanoLC. After proteins were extracted from whole X. laevis embryos (stage 10) and digested by trypsin, ~43 µg of peptides (confirmed by a Total Peptide Assay) were sequentially eluted from a C18 column at high pH using increasing concentrations of acetonitrile (7.5%, 15.0%, 22.5%, 30.0%, and 50.0%). Embryos were processed without peptide fractionation as a control. The resulting peptide samples were analyzed (1 µg/run) in technical duplicate using low-pH nanoLC-nanoESI-HRMS, as described earlier.
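The in silico tryptic digestion invoked above for Vtga2 can be illustrated with a short sketch. It assumes the commonly used cleavage rule (cut C-terminal to lysine or arginine unless the next residue is proline), no missed cleavages, and no variable modifications. The function names and the toy sequence are hypothetical; the sequence is not that of Vtga2.

```python
def cleave_trypsin(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def detectable_peptides(sequence, min_length=6):
    """Keep fully cleaved peptides longer than 5 residues, as in the text."""
    return [p for p in cleave_trypsin(sequence) if len(p) >= min_length]


# Toy sequence for illustration only.
toy_protein = "MKTAYIAKQRPLLK"
print(cleave_trypsin(toy_protein))
print(detectable_peptides(toy_protein))
```

Counting the peptides returned by `detectable_peptides` for a real protein sequence gives a lower bound of the kind quoted above, since missed cleavages and modifications would add further peptide forms.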

Detection performance was compared between these approaches (Fig. 3A). Cumulatively, 1,319 proteins were identified across the 5 fractions vs. 767 proteins from the control, representing an ~70% improvement in identifications by fractionation. A comprehensive list of identified proteins is provided in Table S2. Fractionation uncovered nearly all proteins that also were detected in the control, including many transcription factors (e.g., Hmgb2, transcription factor A, or Tfam) as well as proteins involved in translation (e.g., eukaryotic translation initiation factor 3 subunits a and e, or Eif3a and Eif3e, respectively), some with known enrichment in neural tissues (e.g., fascin actin-bundling protein 1, or FSCN1, and proliferation-associated 2G4, or Pa2g4). Importantly, 567 proteins were only detectable by high-pH fractionation, but not in the control. Representative proteins included the high mobility group box 3 (Hmgb3) and purine-rich element binding protein A (Pura), which are known to be expressed in neural tissues (as reported on Xenbase37).

Quantitative performance was assessed based on calculated LFQ intensities. Figure 3B compares protein quantities between equal amounts of protein digest using log10-transformed and mean-normalized LFQ intensities. Proteins that were only quantifiable by fractionation had a significantly lower distribution mean than the control, revealing higher detection sensitivity upon fractionation. We ascribe these qualitative and quantitative improvements to a combination of factors. As shown in Figure 3C, each of the 5 fractions allowed a complementary set of peptides to be identified. Therefore, fractionation efficiently reduced peptide complexity over the temporal domain of separation, which in turn improved the duty cycle of DDA for identifying PSMs. However, while fractionation enhanced protein identification, it could not lower the abundance of yolk proteins.
Between 3% and 18% of the peptides identified in the respective fractions corresponded to yolk proteins (see Fig. 3C). Moreover, LFQ estimated the vitellogenin
proteins (forms a1, a2, b1, and b2) to amount to ~84% of the total protein abundance in the extracted sample, with Vtga2 and Vtgb1 present at ~4 log-order higher concentration than the mean (see Fig. 3B). It follows that abundant yolk proteins likely still interfered with the identification of other proteins.

Yolk Depletion Enhanced Protein Identification. As an alternative, we evaluated depletion of abundant vitellogenin proteins to improve identification (Fig. 4). In Xenopus, yolk is compartmentalized in platelets within each cell of the embryo.36 Therefore, depleting the tissue of these organelles should reduce the concentration of the yolk proteins. We tested this approach by adapting sucrose density-gradient centrifugation into the workflow by modifying a recent protocol (see Methods).18 To prepare the workflow for the NE-fated cell clones, which form in the dorsal-animal half of the embryo and require fluorescent labeling and dissection, we applied this approach to a pool of dorsal-animal halves that were isolated from n = 5 embryos at stage 10. Figure 4A depicts the process of yolk depletion. After processing the tissue in the yolk depletion buffer, the homogenate was centrifuged to obtain 3 layers that were readily discernible by visual inspection: lipids, proteins, and yolk were segregated in agreement with the previous protocol.18 To minimize protein losses, the protein and lipid layers were mixed, and the resulting aliquot was transferred into a separate tube, where proteins were purified by precipitation in chloroform/methanol. As a control, dorsal halves of the embryos were processed in the lysis buffer (without yolk depletion). In follow-up experiments, the control, yolk-depleted sample, and the yolk layer were separately analyzed. A total of 1 µg peptide mixture was measured in technical duplicate from each sample using nanoLC-HRMS. These data were used to test whether yolk depletion aided the detection of non-yolk proteins.
Based on LFQ, the yolk pellet was enriched in vitellogenin proteins to ~97%
abundance. The yolk-depleted supernatant contained ~29% yolk proteins by LFQ as compared to ~87% yolk in the control (Fig. 4A). While not exhaustive, this extent of depletion significantly improved protein identifications. Compared to 702 identified proteins in the control, the yolk-depleted protein sample yielded 1,881 proteins, with 1,215 of these proteins only identifiable upon depletion (Fig. 4B). A comprehensive list of identified proteins is tabulated in Table S3. The mean of the LFQ distribution was significantly lower for the proteins exclusively identified by depletion than in the control (see Fig. 4C). Therefore, yolk depletion significantly improved the sensitivity of protein identification.

Gene ontology annotation by PANTHER provided information to query the molecular roles of the identified proteins. As shown in Figure 4D, 51 of the proteins that were identified after yolk depletion are known members of important signaling pathways, thus providing validation for the biological significance of our proteomics data. The pathways of cytoskeletal regulation by Rho GTPases, Wnt, integrin, and fibroblast growth factor (FGF) included 10–20 proteins/pathway. About 5–10 proteins were assigned to the cadherin (Cadh), Ras, and platelet-derived growth factor (PDGF) pathways. The insulin growth factor (IGF), Notch, bone morphogenetic proteins (BMPs), and transforming growth factor β (TGF-β) signaling pathways were detected with 2–5 proteins/pathway coverage. Notably, coverage of these pathways was significantly improved upon yolk depletion. These proteins included mitogen activated protein kinase (Mapk1), Mapk kinase (Map2k), P21 protein activated kinase 2 (PAK2), and ribosomal protein S6 kinase A3 (Rps6a3), which are essential mediators of growth factor signaling. Moreover, the Notch and BMP pathways were only represented by proteins that were detectable after yolk depletion.
Detection of these developmentally significant pathways demonstrated sufficient sensitivity for assessing protein expression in X. laevis.
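The identification counts reported in the two preceding sections can be compared with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the protein sets are toy examples built from gene symbols mentioned in the text, and the "fold change in non-yolk share" is a derived quantity that the text itself does not report.

```python
def percent_gain(n_treated, n_control):
    """Relative increase in identifications over the control, in percent."""
    return 100.0 * (n_treated - n_control) / n_control


def compare_identifications(treated, control):
    """Partition two sets of protein identifiers into shared and unique groups."""
    return {
        "shared": treated & control,
        "only_treated": treated - control,
        "only_control": control - treated,
    }


# Counts quoted in the text.
fractionation_gain = percent_gain(1319, 767)  # high-pH fractionation vs. control
depletion_gain = percent_gain(1881, 702)      # yolk depletion vs. control

# Yolk share by LFQ fell from ~87% to ~29%, so the non-yolk share rose accordingly.
non_yolk_fold = (1 - 0.29) / (1 - 0.87)

# Toy sets for illustration (not the full identification lists).
groups = compare_identifications(
    treated={"Hmgb2", "Tfam", "Hmgb3", "Pura"},
    control={"Hmgb2", "Tfam"},
)

print(f"fractionation: +{fractionation_gain:.0f}% identifications")
print(f"yolk depletion: +{depletion_gain:.0f}% identifications")
print(f"non-yolk share increased ~{non_yolk_fold:.1f}-fold")
print(sorted(groups["only_treated"]))
```

The set partition mirrors the Venn-style comparison described for the fractionated and control samples; with real data, the inputs would be the protein group identifiers from each sample.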

Protein Detection from NE-fated Clones. Last, we applied this methodology to characterize protein production as the induction of the NE starts in the stage 10 embryo (Fig. 5). The experiments began with lineage tracing of the midline dorsal-animal cell of the 32-cell stage embryo, called the D111 cell (by the Jacobson nomenclature38, also known as the A1 cell by the Nakamura nomenclature39); this cell reproducibly gives rise to the NE in X. laevis.9 To label the NE-fated tissue, the left and right D111 cells were injected with ~1 nL of 0.5% (w/v) fluorescent dextran, and the embryos were cultured to stage 10. Figure 5A shows the fluorescently marked D111 cells and their progeny in the right half of the stage 10 embryo using epifluorescence microscopy supplemented with bright-field imaging. The labeled tissues were microdissected and pooled from n = 5 different embryos for proteomic analysis, yielding ~100 µg of total protein material. To maximize protein identifications in these samples, yolk platelets were depleted, and digested proteins were Hp-RP fractionated. To improve detection by enhancing ionization efficiency and MS/MS sequence coverage via selective chemical modification (reviewed in reference 40), peptides were tagged with tandem mass tags (TMT, Thermo) before measurement of 1 µg of peptide mixture in technical duplicate using nanoLC-HRMS.

The resulting data provided rich information on genes translated in the early NE. A total of 2,363 protein groups were identified in the clones. Based on LFQ, 2,318 of these proteins were quantified across an ~5-log-order range in concentration, revealing a quantitative dynamic range of expression comparably broad to that of the whole embryo. A comprehensive list of the identified proteins is provided in Table S4.
It is worth noting that identification of ~2,400 proteins from 1 µg of peptides using sucrose-gradient based yolk depletion in this study demonstrates a sensitivity improvement over a recently optimized protocol, in which the use of mammalian Cell-PE LB lysing buffer (NP40) and filter-aided protein sample processing (FASP)
enabled the identification of 2,270 proteins from 2 µg of peptide mixture measured in technical triplicate using a similar mass spectrometer.41 Although no phospho-enrichment was employed in this study, the high-resolution MS/MS data also allowed us to detect 147 phosphorylated proteins with 262 phosphorylation sites confidently identified (site probability > 0.9). For example, the high mobility group protein HMGI-C (Hmga2), which is required for neural crest cell specification in X. laevis,42 was detected as phosphorylated at serine 113 and 117 (S113, S117). Moreover, 21 of these phosphoproteins are reported in PhosphoSitePlus to have known mammalian homologs with 26 matching phosphorylation sites to our data (see Table S4).

Other proteins are highlighted in Figure 5B. These gene products represented the Wnt, TGF-β, integrin, and growth factor pathways, which were also detected upon yolk depletion. Notably, the combined approach significantly deepened pathway coverage. In the Wnt pathway, 33 proteins were identified, which included Wnt10A, lymphoid specific helicase (Hells), the Wnt pathway signal transducer catenin beta-1 (Ctnnb1), and histone deacetylase 1 (Hdac1), which promotes neural induction by restricting mesodermal fate.43 Interestingly, eight Wnt pathway proteins were detected to be phosphorylated, including the myristoylated alanine-rich C kinase substrate (MARCKS), which is essential for controlling cell motility during gastrulation.44 Phosphorylation sites for three Wnt pathway proteins were confidently identified from the HRMS data (site-specific probability > 0.9) and could be matched to mammalian protein homologs. Specifically, we detected phosphorylation for MARCKS at threonine 123 (T123), proteasome subunit alpha type-3 (Psma3) at S250, and two splice variants of ribosomal protein lateral stalk subunit P2 (Rplp2) at residues S102, S105 and S120, S123, respectively. Another key protein identified in the NE clones was Smad2, the intracellular signal transducer and
transcriptional modulator activated by TGF-β. In addition, seventeen detected proteins were involved in Rho GTPase and integrin pathways, which are involved in cytoskeletal regulation and cell migration during gastrulation.45, 46 These pathways included the vasodilator-stimulated phosphoprotein (Vasp), myosin heavy chain (Myh9, 10, and 11), integrin beta1 (Itgb1), paxillin (Pxn), and actin related protein complexes (Arpc 2, 3, 4, and 5).

The data also represented components of the FGF signaling pathway, an important player during neural induction,47, 48 thus validating protein identifications. Following GO annotation, nineteen proteins were assigned to this pathway with multiple functional aspects of the signaling cascade, ranging from kinases and phosphatases to adapter proteins. For example, we detected the growth-factor receptor bound protein 2 (Grb2), Grb2 associated protein (Gab1), and 14-3-3 proteins (ζ and ε), which interact with cell surface receptors and signaling pathway mediators to regulate and facilitate signal transduction.49 As shown in Figure 5C, protein-protein interactions can be anticipated based on the canonical FGF pathway50 or regulatory interplays predicted using the STRING51 database.

The dataset was enriched in 43 transcription factors, including Hmgb2 and Hmgb3, which are dynamically expressed in neural stem cells,34 the cell type derived from the D111 cell lineage. Remarkably, Hmgb2 along with seven other transcription factors were found to be phosphorylated. These phospho-TFs included Zic5 (zinc finger protein 5), which acts downstream of the canonical Wnt pathway to regulate cell proliferation during midbrain development in zebrafish.52 Zic5 was phosphorylated at tyrosine 3 (Y3) and threonine 5 (T5); however, these phosphorylation sites are not known to be conserved in the mammalian form of Zic5. Knowledge of these proteins and their phosphorylation is important to understand gene regulatory networks during development.
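The site-probability cutoff of 0.9 used above to call phosphorylation sites as confidently identified can be expressed as a simple filter. This is a minimal sketch under stated assumptions: the record layout and function name are hypothetical, the residue positions are those named in the text, and every probability value (as well as the "ExampleLowConf" entry) is an invented placeholder rather than reported data.

```python
from collections import defaultdict


def confident_sites(site_records, threshold=0.9):
    """Group sites whose localization probability exceeds the threshold by protein."""
    by_protein = defaultdict(list)
    for protein, residue, position, probability in site_records:
        if probability > threshold:
            by_protein[protein].append(f"{residue}{position}")
    return dict(by_protein)


# (protein, residue, position, localization probability)
# Positions follow the text; probabilities are hypothetical placeholders.
records = [
    ("Hmga2", "S", 113, 0.99),
    ("Hmga2", "S", 117, 0.95),
    ("Marcks", "T", 123, 0.97),
    ("Psma3", "S", 250, 0.92),
    ("Zic5", "Y", 3, 0.91),
    ("Zic5", "T", 5, 0.93),
    ("ExampleLowConf", "S", 10, 0.55),  # made-up entry that fails the cutoff
]

sites = confident_sites(records)
n_sites = sum(len(positions) for positions in sites.values())
print(f"{len(sites)} phosphoproteins, {n_sites} confident sites")
```

Counting the keys and the listed positions of the returned dictionary gives per-dataset totals analogous to the phosphoprotein and phosphosite numbers reported above.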

Our data can be used to gain insights into the relationship between mRNA and protein production during the development of the NE. The proteins that we identified in the D111 clones can be cross-referenced to quantitative information on gene expression in embryonic tissues and the developing whole embryo, e.g., via Xenbase.37 Interestingly, a molecular atlas recently mapped spatial and temporal transcription between distinct ectodermal domains of X. laevis in addition to the whole embryo,16 providing an excellent resource for such mRNA-protein comparisons. In theory, this comparison is feasible for any gene product that is common between these datasets, including the 2,363 proteins that were detected in the NE clones derived from the D111 cell in this study. These matches with the mRNA data validate our proteomics data on the NE clones. As proof of concept, we queried transcriptional enrichment of the NE for only a subset of genes. Using an interactive network interface, EctoMap,16 we searched the mRNA enrichment database for the neural plate in the stage 12.5 embryo, which arises from the NE of the stage 10 embryo, the subject of our experiments. The analysis was limited to the 94 genes that we detected as proteins from signaling pathways or as transcription factors. The results are summarized in Table 1. Of the 42 transcription factors that were detected by HRMS, 37 were reported to be transcriptionally enriched in the neural plate compared to the whole embryo. These agreements between transcriptomic data and our proteomic results open the possibility of future quantitative studies on mRNA-protein dynamics in the developing NE. Transcriptional enrichment was also reported for 55 proteins that are members of essential signaling pathways (see earlier).
As an example, Figure 5C overlays transcriptional enrichment for detected proteins that were assigned to the FGF signaling pathway (see earlier), suggesting neural enrichment of regulatory proteins (e.g., serine/threonine protein phosphatases and the 14-3-3 proteins).
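
The cross-referencing described above amounts to a set intersection between the detected gene products and a list of transcriptionally enriched genes. The following Python sketch illustrates the idea under stated assumptions: the function name `enrichment_overlap` and both example gene lists are hypothetical placeholders (they are not data from this study or from EctoMap), and gene symbols are assumed to match after simple case normalization.

```python
def enrichment_overlap(detected_genes, enriched_genes):
    """Return (sorted shared symbols, fraction of detected genes that are enriched)."""
    # Normalize symbols so that "Zic5" and "zic5" are treated as the same gene.
    detected = {g.strip().lower() for g in detected_genes}
    enriched = {g.strip().lower() for g in enriched_genes}
    if not detected:
        raise ValueError("no detected genes supplied")
    shared = detected & enriched
    return sorted(shared), len(shared) / len(detected)


# Hypothetical inputs for illustration only (not measured data).
detected_example = ["Grb2", "Gab1", "Zic5", "Hmgb2", "Hmgb3"]
enriched_example = ["zic5", "hmgb2", "hmgb3", "sox2"]

shared, fraction = enrichment_overlap(detected_example, enriched_example)
print(shared, f"{fraction:.0%}")

# Agreement quoted in the text: 37 of the 42 detected transcription factors.
reported_fraction = 37 / 42
print(f"{reported_fraction:.1%}")
```

In practice, matching would likely need to reconcile X. laevis homeolog suffixes and alternative gene names, which this sketch does not attempt.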

CONCLUSIONS

This study demonstrates that proteomic analysis of identified tissues can facilitate the understanding of molecular mechanisms underlying tissue induction and maintenance during development of vital organs, such as the central nervous system. Xenopus laevis has provided significant insights into cellular and molecular events during normal and impaired embryonic development (reviewed in References 4-7). However, the abundance of yolk proteins (vitellogenins) in cells of the early embryo hinders HRMS detection of proteins, particularly those present in low–medium abundance. To bridge HRMS to cell and developmental biology of Xenopus, we systematically tested analytical strategies to alleviate interferences from abundant yolk signals. The combination of sucrose-gradient based depletion of yolk platelets and high-pH RP fractionation efficiently improved protein identification and quantification in X. laevis using nanoLC-nanoESI-HRMS. Combined with continuous developments in HRMS technology for Xenopus, such as multiplexing quantification23 and ultrasensitive detection26, 53, 54, the methodologies tested here afford possibilities to study developmental processes with progressively deeper molecular coverage, even at the level of single cells26, 27, 55.

Integration of classical embryology and discovery proteomics by HRMS offers previously unavailable molecular insights into the control of embryonic development. Whole-embryo analyses provided important clues about the dynamics of transcription and translation of the genome and post-translational modifications during development (see references 18, 19, 21, 56). Here we demonstrated that improvements in HRMS sensitivity and reproducible cell fates in Xenopus laevis extend these investigations to a defined population of cells. This approach yields molecular information for a given cell or tissue phenotype that would be potentially lost or
difficult to interpret in classical measurements that pool large populations of cells to whole embryos. As proof of principle, we fate-mapped the D111 cell in the 32-cell X. laevis embryo to guide the microdissection of the NE-fated cell clones in the stage 10 embryo. Yolk depletion with high-pH fractionation prior to nanoLC-HRMS allowed 2,363 proteins to be identified in these tissues by analyzing 1 µg of peptides per run in technical duplicate. These performance metrics demonstrate sensitivity improvements over a recently optimized protocol that used NP40-based lysis and filter-aided sample preparation (FASP) to identify 2,270 proteins by analyzing 2 µg of peptides per run in technical triplicate. Proteins that were identified in our study included many transcription factors, which illustrates the sensitivity of the technology. Furthermore, many gene products corresponded to transcripts with known enrichment in the neural ectoderm and to molecular pathways with essential contributions to neural development, thus validating the proteomics data for these clones. The integration of proteomic and transcriptomic technologies, such as HRMS and next-generation sequencing, promises to advance our understanding of normal and impaired biological processes during vertebrate embryonic development.
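
One rough way to read the sensitivity comparison above is in terms of total peptide consumed. The Python sketch below uses only the numbers quoted in the text; the helper name `total_peptide_ug` is hypothetical, and identifications per microgram is an informal yardstick rather than a metric reported by the authors, since identifications do not scale linearly with sample amount.

```python
def total_peptide_ug(ug_per_run, n_runs):
    """Total peptide amount (µg) consumed across technical replicate runs."""
    return ug_per_run * n_runs


# Values quoted in the text for the two protocols.
this_study = {"proteins": 2363, "ug_per_run": 1, "runs": 2}
earlier_protocol = {"proteins": 2270, "ug_per_run": 2, "runs": 3}

this_total = total_peptide_ug(this_study["ug_per_run"], this_study["runs"])
earlier_total = total_peptide_ug(earlier_protocol["ug_per_run"], earlier_protocol["runs"])

# Informal comparison: identifications per µg of peptide consumed.
this_per_ug = this_study["proteins"] / this_total
earlier_per_ug = earlier_protocol["proteins"] / earlier_total

print(f"this study: {this_total} µg total, {this_per_ug:.1f} proteins/µg")
print(f"earlier protocol: {earlier_total} µg total, {earlier_per_ug:.1f} proteins/µg")
print(f"sample consumption ratio: {earlier_total / this_total:.1f}x")
```

Under these numbers, a comparable number of proteins was identified from one third of the total peptide amount.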

METHODS

Materials and Reagents. Unless otherwise noted, LC-MS grade solvents (for HRMS), reagent-grade chemicals, TPCK-modified trypsin (for HRMS), and Alexa Fluor 488 (green fluorescent dextran) were purchased from Fisher Scientific (Pittsburgh, PA). Calcium nitrite, cysteine, Trizma hydrochloride, and Trizma base were purchased from Sigma-Aldrich (Saint Louis, MO).

Solutions. For culturing embryos, 100% (w/v) and 50% (w/v) Steinberg’s solutions (SS) were prepared according to standard protocols.57 The “lysis buffer” was prepared to contain 150
mM NaCl, 20 mM Tris-HCl, 5 mM EDTA, and 1% sodium dodecyl sulfate (SDS). The “yolk depletion buffer” was adapted with modification from Reference 18 to contain 250 mM sucrose, 1% Nonidet P-40 (NP-40), 5 mM EDTA, 20 mM Tris-HCl, 10 µM combretastatin 4A, and 10 µM cytochalasin D; Tris-HCl was chosen to minimize HEPES contamination during nanoLC-HRMS in this work.

Animal Care and Maintenance. Adult male and female frogs (Xenopus laevis) were obtained from Nasco (Fort Atkinson, WI) and maintained in a breeding colony. Protocols regarding the maintenance and handling of Xenopus laevis were approved by the George Washington University Institutional Animal Care and Use Committee (IACUC #A233). Standard protocols were followed to obtain embryos by gonadotropin-induced natural mating of male and female frogs for analytical method development and by in vitro fertilization for tissue dissections.57 Embryos were dejellied using 2% (w/v) cysteine solution and then cultured and staged according to the Nieuwkoop and Faber (NF) nomenclature following established protocols.57 Two-cell embryos in which stereotypical pigmentation marked the future dorsal-ventral axis were selected as described elsewhere38 and cultured to stage 6 (32-cell).9 The D111 cells were labeled with green fluorescent dextran using a microinjector (Warner Instruments, Hamden, CT). The labeled embryos were raised in 100% SS to stage 10. The embryos were monitored with a stereomicroscope under epifluorescence (model SMZ18 with GFP-B 480 filter, Nikon Instruments Inc., Melville, NY). The labeled cells or tissues were dissected in a 2% (w/v) agarose-coated Petri dish containing 50% SS, collected in LoBind Eppendorf tubes, and snap-frozen on dry ice. The samples were stored at −80 °C until further processing.

Bottom-up Proteomic Workflow. Two protein extraction protocols were compared in this work, one with yolk protein depletion18 and one without26 (control). In the control protocol,
embryonic samples were lysed in 300 µL of lysis buffer with sonication for 5 min. In the yolk depletion protocol, embryonic samples were lysed in 300 µL of yolk depletion buffer, incubated on ice for 10 min, and centrifuged at 4,500 × g at 4 °C for 4 min, as described elsewhere.18 The pellet (yolk platelets) and the supernatant (yolk-depleted) were transferred into separate microcentrifuge tubes, and 1% SDS was added to the supernatant. The pelleted yolk was also resuspended in 200 µL of lysis buffer before further processing. Once extracted, proteins from both protocols were processed following standard bottom-up proteomic workflows as follows: proteins were reduced (5 µL of 0.5 M dithiothreitol, 60 °C for 30 min) and alkylated (15 µL of 0.5 M iodoacetamide in the dark at room temperature for 20 min) before the reaction was quenched (5 µL of dithiothreitol). Next, proteins were purified by overnight precipitation in chilled (−20 °C) acetone for the control sample and in a chloroform-methanol mixture for the yolk-depleted sample, followed by centrifugation at 10,000 × g for 10 min at 4 °C. The pellet was rinsed once with chilled acetone for the control sample. The resulting pellets were suspended in 50 mM ammonium bicarbonate, and the total protein concentration was determined by the bicinchoninic acid (BCA) assay (Thermo). Last, proteins were digested with trypsin (enzyme:protein = 1:50) overnight at 37 °C. Peptide concentration was quantified using the Colorimetric Peptide Assay (Thermo). For enhanced peptide detection, peptides extracted from the NE cell clones were tagged with tandem mass tags (TMT, Thermo). To reduce peptide complexity, protein digests were fractionated under high pH in spin columns following vendor recommendations (Pierce Kit #84868, Thermo). A total of ~100 µg of peptide mixture was fractionated using 0.1% (v/v) triethylamine containing acetonitrile at 7.5%, 15%, 22.5%, 30%, and 50% (v/v), and each fraction was dried in a vacuum concentrator. Unfractionated peptide samples were desalted using C18 spin columns (Pierce #89870, Thermo).
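
The reagent amounts in this workflow can be translated into approximate working concentrations. The Python sketch below is illustrative only and rests on several assumptions not stated in the text: the starting volume is taken as the 300 µL lysate, volumes are treated as additive, the volume of added SDS is ignored, and the 1:50 digestion ratio is read as one part trypsin to fifty parts protein by mass. The function names `final_conc_mM` and `trypsin_ug` are hypothetical.

```python
def final_conc_mM(stock_M, added_uL, volume_before_uL):
    """Concentration (mM) after adding `added_uL` of stock to `volume_before_uL`."""
    return stock_M * 1000.0 * added_uL / (volume_before_uL + added_uL)


def trypsin_ug(protein_ug, protein_per_enzyme=50.0):
    """Trypsin mass (µg) for a given protein mass at a 1:50 enzyme:protein ratio."""
    return protein_ug / protein_per_enzyme


lysate_uL = 300.0  # assumed starting volume (lysis step)

# Reduction: 5 µL of 0.5 M dithiothreitol added to the lysate.
dtt_mM = final_conc_mM(0.5, 5.0, lysate_uL)

# Alkylation: 15 µL of 0.5 M iodoacetamide added after the dithiothreitol.
iaa_mM = final_conc_mM(0.5, 15.0, lysate_uL + 5.0)

# Acetonitrile steps (% v/v) used for high-pH fractionation, as listed in the text.
acn_steps_percent = [7.5, 15.0, 22.5, 30.0, 50.0]

print(f"dithiothreitol: ~{dtt_mM:.1f} mM, iodoacetamide: ~{iaa_mM:.1f} mM")
print(f"trypsin for 100 µg protein: {trypsin_ug(100.0):.1f} µg")
print(f"number of high-pH fractions: {len(acn_steps_percent)}")
```

These figures are estimates for orientation; actual concentrations depend on the true sample volume at each step.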

Unlabeled peptides were reconstituted in 2% acetonitrile (0.1% v/v formic acid), whereas labeled peptides were dissolved in 5% acetonitrile (0.1% v/v formic acid).

Liquid Chromatography Mass Spectrometry. A total of 1 µg of peptide mixture (confirmed by the Total Peptide Assay) was separated on a C18 column at 200 nL/min (75 µm inner diameter, 2 µm particle size with 100 Å pores, 50 cm length, Acclaim PepMap 100, Thermo). Separation used a nano-flow liquid chromatography system (Dionex Ultimate 3000 RSLCnano, Thermo) delivering a 120-min gradient of Buffer A (2% ACN, 0.1% FA) and Buffer B (100% ACN, 0.1% FA). Untagged peptides were loaded using 100% Buffer A and eluted via a multi-step gradient that ramped Buffer B as follows: from 2% to 7% in 15 min, to 15% in 35 min, to 40% in 70 min, and to 80% in 5 min, then the eluent composition was held for 2 min, before returning it to 2% in 15 min for equilibration over 15 min. TMT-tagged peptides were separated using a similar multi-step gradient supplying Buffer B as follows: 3% for 10 min, then ramped to 40% in 110 min, to 80% in 5 min, and returned to 3% in 10 min for equilibration over 15 min. Peptides were detected by nanoESI-HRMS. The electrospray was generated at 2.5 kV using a pulled fused silica capillary as emitter (10 µm tip, New Objective, Woburn, MA). Peptides were detected and sequenced using an orbitrap-quadrupole-ion trap tribrid high-resolution mass spectrometer (Orbitrap Fusion, Thermo) executing DDA. Untagged and tagged peptides were detected under slightly different experimental settings. For untagged samples, precursor ions were surveyed at 120,000 FWHM resolution in the orbitrap mass analyzer (MS1) with the following parameters: maximum injection time (IT), 50 ms; automatic gain control (AGC), 4 × 10⁵; microscans, 1. Precursor ions with intensity greater than 1 × 10⁴ counts were selected with a 1.6 Da mass window and fragmented by higher-energy collisional dissociation (HCD) in nitrogen
at 32% normalized collision energy (NCE). Fragment ions were detected in the ion trap: scan rate, rapid; maximum IT, 70 ms; AGC, 1 × 10⁴ counts; microscans, 1. Fragmented ions were dynamically excluded for 60 s before being reconsidered for fragmentation. For the tagged samples, ions were surveyed at 60,000 FWHM resolution in the orbitrap mass analyzer (MS1) with the following settings: maximum IT, 50 ms; AGC, 2 × 10⁵ counts; microscans, 1. Tandem MS was activated with collision-induced dissociation (CID) in helium at 35% NCE. Fragment ions were detected in the ion trap at rapid scan rate with the following settings: maximum IT, 50 ms; AGC, 5 × 10⁴ counts; microscans, 1. Multinotch MS3 fragmentation was triggered on the top 10 MS2 fragments with HCD in nitrogen (65% NCE). MS3 spectra were acquired at 15,000 FWHM in the orbitrap (maximum IT, 120 ms; AGC, 1 × 10⁵ counts; microscans, 1). Fragment ions were dynamically excluded for 60 s before being reconsidered for fragmentation.

Data analysis. Primary mass spectrometry data were processed in MaxQuant 1.5.7.4 running the Andromeda (version 1.5.6.0) search engine.35 The MS–MS/MS data were searched against the SwissProt Xenopus laevis proteome (downloaded from UniProt58 on 10/15/2017) supplemented with the mRNA-derived PHROG database (ver. 1.0) from Reference 18. The following search parameters were applied: digestion, tryptic; number of missed cleavages, maximum 2; minimum number of unique peptides, 1; fixed modification, cysteine carbamidomethylation; variable modification, methionine oxidation; maximum mass deviation for main search of precursor masses, 4.5 ppm; de novo mass tolerance for tandem mass spectra, 0.25 Da; minimum score for modified peptides, 40; minimum delta score for modified peptides, 8. For TMT-tagged peptides, additional search parameters were: sample type, reporter ion MS3; label, TMT.
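
To make the 4.5 ppm precursor tolerance listed above concrete, the following Python sketch shows how a parts-per-million mass deviation is computed and what absolute window it implies. It is a generic illustration, not MaxQuant's or Andromeda's internal implementation; the function names and the example mass values are hypothetical.

```python
def ppm_error(observed, theoretical):
    """Signed mass deviation of an observed value from its theoretical value, in ppm."""
    return (observed - theoretical) / theoretical * 1e6


def ppm_window(theoretical, tol_ppm=4.5):
    """Lower and upper bounds of the acceptance window for a given ppm tolerance."""
    delta = theoretical * tol_ppm * 1e-6
    return theoretical - delta, theoretical + delta


def within_ppm(observed, theoretical, tol_ppm=4.5):
    """True if the observed value lies within the ppm tolerance."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


# Arbitrary example values chosen for illustration only.
theoretical_mass = 1000.0
low, high = ppm_window(theoretical_mass)
print(f"4.5 ppm window at {theoretical_mass}: {low:.4f} to {high:.4f}")
print(within_ppm(1000.004, theoretical_mass))  # deviation of about 4 ppm
print(within_ppm(1000.005, theoretical_mass))  # deviation of about 5 ppm
```

Because the tolerance is relative, the absolute window widens with mass, whereas the 0.25 Da tolerance quoted for tandem mass spectra is a fixed absolute value.
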
In the NE clones, phosphorylated proteins were identified using the search parameters: variable phosphorylation of serine, threonine, and tyrosine; variable oxidation of
methionine; minimum delta score for modified peptides, 8; phosphorylation localization probability filtered to >0.9 and sites probability >0.9. Proteins were quantified using label-free quantification (LFQ) by MaxLFQ.35 Peptide and protein identifications were filtered to