
Article

Proteomic analysis of mouse oocytes identifies PRMT7 as reprogramming factor that replaces SOX2 in the induction of pluripotent stem cells Bingyuan Wang, Martin Johannes Pfeiffer, Hannes C.A. Drexler, Georg Fuellen, and Michele Boiani J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.5b01083 • Publication Date (Web): 26 May 2016 Downloaded from http://pubs.acs.org on May 31, 2016



Journal of Proteome Research

Proteomic analysis of mouse oocytes identifies PRMT7 as reprogramming factor that replaces SOX2 in the induction of pluripotent stem cells

Bingyuan Wang (1), Martin J. Pfeiffer (2), Hannes C. A. Drexler (3), Georg Fuellen (4), Michele Boiani (2, *)

1. Key Laboratory of Farm Animal Genetic Resources and Germplasm Innovation of Ministry of Agriculture, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.

2. Max Planck Institute for Molecular Biomedicine, Röntgenstraße 20, 48149 Münster, Germany.

3. Max-Planck Institute for Molecular Biomedicine, Bioanalytical Mass Spectrometry Facility, Röntgenstraße 20, D-48149 Münster, Germany.

4. Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Rostock, Germany

*. Corresponding author (Email: [email protected] ; Phone +49 0251 70365 330)


Abstract

The reprogramming process that leads to induced pluripotent stem cells (iPSCs) may benefit from adding oocyte factors to Yamanaka’s reprogramming cocktail (OCT4, SOX2, KLF4, with or without MYC; OSK(M)). We previously searched for such facilitators of reprogramming (the reprogrammome) by applying label-free LC-MS/MS analysis to mouse oocytes, producing a catalog of 28 candidates that are (i) able to robustly access the cell nucleus, and (ii) shared between mature mouse oocytes and pluripotent embryonic stem cells. In the present study we hypothesized that our 28 reprogrammome candidates would also be (iii) abundant in mature oocytes, (iv) depleted after oocyte-to-embryo transition, and (v) able to potentiate or replace the OSKM factors. Using LC-MS/MS and isotopic labeling methods, we found that the abundance profiles of the 28 proteins were below those of known oocyte-specific and housekeeping proteins. Of the 28 proteins, only arginine methyltransferase 7 (PRMT7) changed substantially during mouse embryogenesis and promoted the conversion of mouse fibroblasts into iPSCs. Specifically, PRMT7 replaced SOX2 in a factor-substitution assay, yielding iPSCs. These findings exemplify how proteomics can be used to prioritize the functional analysis of reprogrammome candidates. The LC-MS/MS data are available via ProteomeXchange with identifier PXD003093.

Keywords: development, DNMT3A, induced pluripotent stem cells, oocyte proteome, PRMT7, reprogramming, reprogrammome, SILAC, STELLA


Introduction

Metaphase II (MII) oocytes have the unique ability to reprogram somatic cell nuclei to a totipotent state after somatic cell nuclear transfer (SCNT). This ability has been documented by the successful reproductive cloning of various mammalian species 1-6. Reprogramming can also be assessed at an intermediate stage in species that are permissive for the derivation of embryonic stem (ES) cells, including mouse 7, rabbit 8, monkey 9, and human 10. However, it remains unclear which oocyte factors are pivotal to oocyte-mediated reprogramming and which ones might be dispensable.

Reprogramming is also possible without an oocyte if select transcription factors (e.g. OCT4, SOX2, KLF4 and MYC; in brief, OSKM) are delivered to somatic cells, a process that leads to induced pluripotent stem cells (iPSCs) 11, which share many properties with ES cells. iPSC reprogramming requires the master pluripotency regulator OCT4, whereas oocyte-mediated reprogramming does not; it is still fully functional even if maternal OCT4 is lacking 12-14. It has previously been shown that individual factors of the classical reprogramming cocktail are replaceable 15, and the search for factor substitutions and putative reprogramming enhancers is still ongoing. Given the knowledge gleaned from SCNT experiments, it is reasonable to assume that such factors may be found in oocytes. The inclusion of oocyte factors with OSKM factors may benefit the reprogramming process 16, provided the ‘right’ factors are chosen. To choose them, and to design strategies that improve the efficiency of iPSC generation, a better understanding of oocyte-mediated reprogramming is needed.

Our laboratory contributed to the search for reprogramming factors in mouse oocytes using a proteomic approach. The rationale for this approach was that oocyte-mediated reprogramming has a very short latency phase compared to iPSC reprogramming 16, suggesting that the oocyte's factors are poised for action because they are present in suitable amounts and in a ready-synthesized form, i.e. as proteins. A proteomic approach is also justified because protein abundance cannot be reliably predicted from transcript abundance 17, 18. Fast reprogramming is not an exclusive prerogative of the oocyte. It is also enabled by fusion of somatic cells with undifferentiated ES cells, which are separated from the oocyte by only a few days of development. In addition, somatic cells fused with ES cells give rise to pluripotent hybrids, which express pluripotency markers from the somatic alleles 19, 20. Given these outcomes, our group postulated that the proteins shared by mouse MII oocytes and undifferentiated mouse ES cells would include the factors that facilitate nuclear reprogramming.

In previous work, we used SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis) and LC-MS/MS (liquid chromatography coupled with tandem mass spectrometry) to identify 3699 proteins in mouse MII oocytes, which remains the deepest mammalian oocyte dataset published to date. Of the 3699 proteins, 2556 were also found in ES cells. Based on the assumption that candidate reprogramming factors of oocytes should be able to robustly access and actively modify the transplanted genome, the 2556 proteins were searched for proteins that feature 'nuclear localization', 'chromatin modification' and 'catalytic activity' as gene ontologies 21, resulting in the following 28 hits: BAZ1B, BRCC3, CARM1 (PRMT4), CCNB1, CHD4, DNMT1, DNMT3A, EED, EP400, HAT1, HDAC1, HDAC2, HDAC6, HELLS, KDM1A (LSD1), KDM6A (UTX), MLL3, PRMT1, PRMT5, PRMT7, RNF2, RNF20, RUVBL1, RUVBL2, SMARCA4 (BRG1), SMARCA5, SMARCAL1, USP16. We proposed that these proteins might be part of what we called the ‘reprogrammome’ 21.

Byrne and colleagues have also pursued candidate oocyte reprogramming factors (CORFs). Based on the massive nuclear swelling and chromatin decondensation observed after SCNT, the authors looked for factors that are present in oocytes in considerable amounts and that are able to robustly access the transplanted nuclei (22; chapter 14 in 23). They proposed the following 23 CORFs: AFAP1L2, ARID2, ASF1A, ASF1B, BRDT, DPPA3, DPPA5, ERG, FOXK2, H1FOO, HHEX, ING3, KDM6B, LEF1, MLL3, MSL3, NCOA3, NFATC2, NR5A2, POU4F1, RPS6KA5, TADA2L, TAF4B. It is not straightforward to compare Byrne and colleagues’ dataset with our dataset 21, as the former is based on mRNA analysis and the latter on protein analysis, and the selection criteria were similar but not identical. Yet, the two datasets share one factor, namely MLL3.

In the current work we explored the functional significance of the reprogrammome with reference to the pluripotency transitions seen during early development and during somatic reprogramming to iPSCs. To do so, we applied a holistic LC-MS/MS approach to capture as many proteins as possible, before closing the circle on the reprogrammome candidates 21 present in mature mouse oocytes and during embryonic stages; and we applied short overexpression screening to test whether reprogrammome candidates are able to reinforce the induction of pluripotency by OSKM factors. Relative quantifications were performed using an LC-MS/MS proteomic pipeline in which the oocytic or embryonic peptides are quantified against isotopically labeled counterparts from a reference cell line whose lysate is added to the sample in a fixed amount (spike-in standard). The labeling of the proteins in the reference cell line was achieved via stable isotope labeling with amino acids in cell culture (SILAC), and the LC-MS/MS measurements were processed using MaxQuant 24. A subset of selected proteins was subjected to immunofluorescence-based verification. Short overexpression screening of reprogrammome candidates was performed by expressing them retrovirally in mouse embryonic fibroblasts (MEFs) during OSK(M)-mediated reprogramming to iPSCs. Of the reprogrammome candidates tested, PRMT7 stood out as biologically significant; although it did not increase the rate of iPSC formation, it replaced SOX2 in the OSK(M) Yamanaka cocktail, yielding OCT4-KLF4-PRMT7 (OKP7) iPSCs. This study is the first report of PRMT7 replacing SOX2 and provides a resource of tested proteins that expands our understanding of protein roles in cellular and nuclear reprogramming.

Materials and methods

Mice. B6C3F1 females were used for zygote production after mating to CD1 males. For testing of iPSCs for chimerism and germline contribution, CD1 females were used as blastocyst donors and as recipients for embryo transfer after sterile mating to vasectomized CD1 males. Severe combined immunodeficient (SCID) mice of either sex were used for the teratoma assay.

Oocyte and embryo sample collection. Fertilized and unfertilized oocytes were collected after gonadotropin stimulation (5 IU each PMSG and hCG, injected i.p. 48 hours apart) and cervical dislocation of B6C3F1 mice aged 6-8 weeks, as described 25. Fertilized oocytes were recovered from oviducts of B6C3F1 mice after successful mating to CD1 males; the fertilized oocytes were then cultured in KSOM(aa) medium at 37 °C in a humidified atmosphere of 5 % CO2 in air. Developmental stages were sampled at respective time points. Before lysing in SDS buffer, the zona pellucida (approximately 15% of the total oocyte protein 26) was removed using acidic Tyrode’s solution to increase the sensitivity of proteome analysis for the other oocyte proteins.

Protein isolation, fractionation, mass spectrometry and protein identification/quantification. In compliance with the Minimum Information about a Proteomics Experiment (MIAPE) guidelines, we again used our previously established pipeline for the quantitative identification of oocyte proteins, which has been described in detail 25, with a few improvements. Our pipeline is based on an F9 embryonal carcinoma (EC) cell spike-in reference labeled by stable isotope labeling with amino acids in cell culture (SILAC) 25, 27. In brief, proteins from zona-free oocytes or embryos were mixed 1:1 (protein amount) with the heavy F9 carcinoma spike-in cell lysate (Lys8 and Arg10), acetone-precipitated, reduced and alkylated, and then digested with endoproteinase Lys-C (3 h) and trypsin (overnight). Following desalting on Empore 3M C18 discs, peptide mixtures were fractionated offline by RP-HPLC at pH 10.2 (buffer A: 10 mM ammonium formate, pH 10.2; buffer B: 10 mM ammonium formate, 90% acetonitrile, pH 10.2; linear gradient from 0-35% B in 70 min; 35-70% B in 15 min; 70% B for 10 min; Waters XBridge BEH C18, 2.1 x 150 mm). Twenty pools were generated from each sample by concatenated fractionation 28, dried down in a SpeedVac and subsequently analyzed individually by LC-MS/MS either on an LTQ Orbitrap Velos (experiment 0672) or on a Q Exactive (experiment 0616) mass spectrometer (Thermo Scientific, Waltham, MA 02454, USA), both equipped with an Easy nano-LC system and a nano-electrospray source (both from Proxeon, Odense, Denmark) holding 15 cm fused silica capillary emitter columns (New Objective, ID 75 µm) filled with a C18 reversed-phase matrix (ReproSil-Pur C18-AQ, 3 µm; Dr. Maisch, Ammerbuch).

Both mass spectrometers were operated in data-dependent mode (positive ion mode, source voltage 2.1 kV), automatically switching between a survey scan (Orbitrap Velos: mass range m/z = 350-1650; target value = 1 × 10^6; resolution R = 60 K; lock mass set to background ion 445.120025; Q Exactive: mass range m/z = 300-1750; target value = 3 × 10^6; resolution R = 70 K) and MS/MS acquisition of the 15 (Velos) or 10 (Q Exactive) most intense peaks. Fragmentation was by collision-induced dissociation (CID) in the ion trap in the case of the Velos (isolation width m/z = 2.0; normalized collision energy 35%; dynamic exclusion enabled with repeat count 1, repeat duration 30.0, exclusion list size 500 and exclusion duration set to 90 s) or by higher-energy collisional dissociation (HCD) in the case of the Q Exactive (isolation width m/z = 1.6; normalized collision energy 25%; dynamic exclusion enabled and set to 25.0 s); precursor ions with charge states of two and higher were allowed. Gradient conditions for the reversed-phase online separation of peptide mixtures were 2-28% buffer B (80% acetonitrile, 0.1% formic acid; 120 min), 28-98% B (20 min), 98% B (6 min) for the Orbitrap Velos; and 2-30% B (120 min), 30-50% B (30 min), 50-95% B (5 min), 95% B (5 min) for the Q Exactive. Afterwards, columns were re-equilibrated in buffer A (0.1% formic acid).

Raw data were processed by MaxQuant software (version 1.5.3.8) including the built-in Andromeda search engine and the bioinformatics tool Perseus. The search was performed against the UniProt mouse database (release date 7/8/2015) concatenated with reversed sequence versions of all entries and supplemented with common contaminants. Parameters defined for the search were: trypsin as digesting enzyme, allowing two missed cleavages; a minimum length of 7 amino acids; carbamidomethylation at cysteine residues as fixed modification; oxidation at methionine and protein N-terminal acetylation as variable modifications. The maximum allowed mass deviation was 20 ppm for the MS and 0.5 Da for the MS/MS scans. Protein groups were regarded as identified with a false discovery rate (FDR) set to 1% for all peptide and protein identifications; in addition, at least two matching peptides were required and at least one of these peptides had to be unique to the protein group.

SILAC labeling of the F9 spike-in standard with 13C6, 15N4 L-arginine and 13C6, 15N2 L-lysine was specified and accounted for within MaxQuant. Heavy over light (H/L) SILAC ratios for each protein group were calculated by MaxQuant as the median of all SILAC peptide ratios that could be assigned to the protein group. SILAC ratios were normalized by MaxQuant in order to correct for mixing errors of total protein amounts as described by Cox and Mann 24. For all further calculations the normalized H/L SILAC ratios were inverted, resulting in L/H ratios, which facilitate the interpretation of the results since the oocyte or embryo quantity (L) is in the numerator. The MaxQuant output data were exported to Excel for analysis of abundance profiles (see Supporting Information). The proteins’ gene identities were analyzed for gene ontologies using the PANTHER (Protein ANalysis THrough Evolutionary Relationships) classification system 29.
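The ratio handling described above (a protein-group ratio taken as the median of its peptide ratios, followed by inversion of H/L to L/H) can be sketched in a few lines of Python. This is an illustration only and not the MaxQuant implementation: the function names and the example peptide ratios are hypothetical, and MaxQuant's normalization step is not reproduced.

```python
from statistics import median


def protein_hl_ratio(peptide_hl_ratios):
    """Protein-group H/L ratio as the median of its peptide H/L ratios."""
    if not peptide_hl_ratios:
        raise ValueError("at least one peptide ratio is required")
    return median(peptide_hl_ratios)


def invert_ratio(hl_ratio):
    """Convert H/L (spike-in over sample) into L/H (sample over spike-in)."""
    if hl_ratio <= 0:
        raise ValueError("SILAC ratios must be positive")
    return 1.0 / hl_ratio


# Hypothetical peptide H/L ratios for one protein group (not measured data).
example_peptides = [0.5, 0.4, 0.8]
hl = protein_hl_ratio(example_peptides)  # median of the three values
lh = invert_ratio(hl)                    # oocyte/embryo signal in the numerator
print(f"H/L = {hl}, L/H = {lh}")
```

With the sample quantity in the numerator, an L/H value above 1 means the protein is more abundant in the oocyte or embryo sample than in the F9 spike-in, which is the reading the inversion is meant to simplify.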

Protein data repository. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium 30 via the PRIDE partner repository with the dataset identifier PXD003093 (the dataset is currently private; deposited files include the search files “proteinGroups.txt”, “peptides.txt”, “evidence.txt”, “parameters.txt”, “msms.txt”, and 280 raw files. For reviewer access please go to http://www.ebi.ac.uk/pride/archive/projects/PXD003093 using the username: [email protected] and password: 1rYnkq8c).

Microarray-based transcriptomic analysis. RNA was isolated from pools of 20 of each of the following: MII oocytes, fertilized oocytes with pronuclei, 2-cell embryos, 4-cell embryos, 8-cell embryos, morulae (approx. 16 cells), and blastocysts (32 cells or more). The protocol for reverse transcription, amplification, labeling and hybridization to Illumina MouseWG-6 v2.0 Expression BeadChips was the same as used previously 25. Array data were normalized and quantified using GenomeStudio software (Table S-1 as supporting information).

Vectors used for production of OSKM and reprogrammome factors in retroviral form. OSKM vectors were commercially available as retroviral plasmids (pMXs-Oct4, pMXs-Sox2, pMXs-Klf4 and pMXs-c-Myc; Addgene plasmids 13366, 13367, 13370 and 13375, respectively). The coding sequences of reprogrammome candidates were cloned into pMXs-GW vectors in house (pMXs-Brcc3, pMXs-Carm1, pMXs-Ccnb1, pMXs-Chd4, pMXs-Dnmt3a, pMXs-Eed, pMXs-Hat1, pMXs-Hdac1, pMXs-Hdac6, pMXs-Hells, pMXs-Kdm6a, pMXs-Prmt1, pMXs-Prmt5, pMXs-Prmt7, pMXs-Rnf2, pMXs-Rnf20, pMXs-Ruvbl1, pMXs-Ruvbl2, pMXs-Smarcal1, pMXs-Usp16) using the Gateway BP and LR recombination reaction system (Invitrogen 11789 and 11791). All constructs were verified by sequencing at each step.

Retroviral infection of mouse fibroblasts and iPSC induction. Virus-containing supernatant was obtained from 2.2 × 10^6 HEK 293T cells grown in 10-cm dishes. Cells were transfected with a mixture of 5 µg pMXs (containing the gene of interest) and 5 µg pCL-Eco retrovirus packaging vector using Fugene 6 transfection reagent (Roche) following the manufacturer’s instructions. After transfection for 8-10 h, the medium was renewed and then harvested 24 h and 48 h later as virus-containing supernatant, which was filtered (0.45 µm) prior to use. 5 × 10^4 MEFs were seeded per well of a 6-well plate. One day after seeding, 0.2 ml of each virus-containing supernatant was added to the MEFs. Transduction efficiency was aided by the presence of 5 µg/ml protamine sulfate. OG2-MEFs (MEFs with Oct4-GFP reporter) were infected with OSKM plus one of the reprogrammome candidates; secondary MEFs were infected with one of the reprogrammome candidates combined with doxycycline-mediated induction of endogenous OSKM (gift from Marius Wernig). The infection was performed a second time in the same way after approximately 12-24 h; the cells were cultured in MEF-medium (knock-out Dulbecco’s modified Eagle medium (DMEM) containing 15% fetal bovine serum (FBS), 5000 U/ml penicillin/streptomycin (PS), 2 mM L-glutamine, 0.1 mM non-essential amino acids (NEAA) and 0.05 mM β-mercaptoethanol). After the induction of iPSCs, the culture medium was replaced with ES-medium (knock-out DMEM containing 15% knock-out serum replacement (KOSR), 5% FBS, 1000 U/ml leukemia inhibitory factor (LIF), 5000 U/ml PS, 2 mM L-glutamine, 0.1 mM NEAA and 0.05 mM β-mercaptoethanol), which was changed every second day. Colonies arising from OG2-MEFs were inspected for green fluorescence coming from OCT4-GFP. Colonies arising from doxycycline-inducible MEFs were stained for alkaline phosphatase (AP) activity using Fast Red/Naphthol reagent 31.

Validation of factor overexpression and endogenous gene expression by real-time PCR. Total RNA was extracted from cells using QIAGEN RNeasy Mini Kit following manufacturer’s instructions. cDNA synthesis was performed with M-MLV reverse transcriptase (Promega) from 1µg RNA according to the manufacturer’s instructions. Real-time PCR was performed using the ABI 7900 system with SYBR Green PCR Master Mix (Bio-Rad) in 20 µl reaction volumes in triplicate. The running conditions were as follows: 50 °C 2 min, 95 °C 10 min, 95 °C 10 sec and 60°C 1 min for a total of 40 cycles; dissociation step: 95 °C 15 sec, 60°C 15 sec and 95°C 15 sec. Transcript levels of genes were normalized to that of Gapdh using the ΔΔCt method 32. Primer sequences are given in Table S-2 as supporting information.
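The ΔΔCt normalization to Gapdh can be illustrated with a short Python sketch. It assumes the standard form of the method, in which ΔCt = Ct(target) − Ct(Gapdh), ΔΔCt = ΔCt(sample) − ΔCt(control), and the fold change is 2^(−ΔΔCt), which in turn presumes roughly 100% amplification efficiency. The function names and the triplicate Ct values are hypothetical and are not taken from this study.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """Mean Ct of the target gene minus mean Ct of the reference gene."""
    return mean(ct_target) - mean(ct_reference)


def fold_change(ddct):
    """Relative expression, assuming a doubling of product per cycle."""
    return 2.0 ** (-ddct)


# Hypothetical triplicate Ct values (illustration only, not measured data).
sample_target = [23.5, 24.0, 24.5]   # gene of interest, transduced cells
sample_gapdh = [17.5, 18.0, 18.5]    # Gapdh, transduced cells
control_target = [26.5, 27.0, 27.5]  # gene of interest, control cells
control_gapdh = [18.0, 18.0, 18.0]   # Gapdh, control cells

ddct = delta_ct(sample_target, sample_gapdh) - delta_ct(control_target, control_gapdh)
print(f"ddCt = {ddct}, fold change = {fold_change(ddct)}")
```

In this made-up example the target crosses threshold three cycles earlier (after Gapdh normalization) in the transduced cells than in the control, which corresponds to an eightfold higher transcript level under the stated efficiency assumption.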

In vitro and in vivo differentiation of iPSCs with PRMT7 substituted for SOX2. For in vitro testing of differentiation, embryoid bodies (EBs) were produced as follows. iPSC colonies were trypsinized and sedimented, thereby removing the majority of the feeder cells. 750 single iPSCs in 20 µl MEF-medium were placed on the lid of a 15-cm dish and kept as hanging drops for 5 days. Thereafter, 15-50 EBs each were transferred to the wells of a 12-well plate containing differentiation medium and cultured for 2-3 weeks. For endoderm and mesoderm differentiation, EBs were cultured on gelatin-coated plates in KO-DMEM supplemented with 20% (endoderm) or 15% (mesoderm) FBS, 0.275 mM β-mercaptoethanol, 5000 U/ml PS, 2 mM L-glutamine and 0.1 mM NEAA. For ectoderm differentiation, EBs were cultured on Matrigel-coated plates in 50% Neurobasal medium, 50% DMEM/F12, 0.5% N-2, 1% B27, 5000 U/ml PS and 2 mM L-glutamine with activin A inhibitor SB431542. After 2-4 weeks the differentiated EBs were fixed with 4% PFA/PBS for 10 min, permeabilized with 0.2% Triton X-100/PBS for 10 min, blocked with 2% FBS/0.1% PBS-Tween 20 for 1 h, and incubated overnight with primary antibodies for ectoderm (anti-Tuj1, Sigma T8860, 1:2000) and for endoderm (anti-Sox17, R&D AF1924, 1:500). Alexa Fluor 568-conjugated anti-goat and anti-mouse antibodies were used as secondary antibodies (anti-goat A11079 and anti-mouse A11061, Invitrogen). Hoechst 33342 was applied at 5 µg/ml for nuclear staining. Images were taken on a fluorescence microscope and processed in ImageJ 33. Successful mesodermal differentiation was documented by recording videos of beating cardiomyocytes (see supporting information).

For in vivo testing of differentiation, teratomata and germ cells were produced as follows. For teratoma formation, iPSC colonies were trypsinized and a single cell suspension was obtained after double sedimentation. 5 × 10^6 cells were injected subcutaneously into SCID mice in a carrier volume of 500 µl MEF-medium. Four weeks later, mice were inspected for tumor formation. Tumor masses were dissected and fixed in 4% PFA in PBS, followed by paraffin embedding and sectioning. Histological sections were stained with hematoxylin and eosin to identify cell types. For germ cell formation, we followed our previously established protocol 34. Briefly, 10-15 iPSCs derived from OG2-MEFs were injected into the cavities of E3.5 wild-type blastocysts, which were then transferred to pseudopregnant CD1 mice. Germline contribution was assessed at E15.5 by inspecting the fetal gonads for the presence of OCT4-driven GFP-positive cells.

Results and discussion

The 28 reprogrammome candidates are present in the mouse preimplantation proteome at low relative abundance. Modern LC-MS/MS is now capable of analyzing the proteome to considerable depth; more than 3000 proteins have been detected in pools of mouse oocytes 21 and more than 4000 in Xenopus oocytes 35, 36, 37. Yet the list of candidate reprogramming factors in mammalian oocytes is quite limited (Table 1). We investigated the abundance of reprogrammome candidates 21 during the time when the reprogramming ability fades progressively after the oocyte MII stage. We collected unfertilized and fertilized oocytes from the oviduct and cultured the fertilized oocytes in KSOM(aa) medium over 4 days, expecting to see changes of abundance for those candidates that participate in reprogramming. For in-depth quantitative proteomic profiling, the cell lysates from two biological replicates of seven consecutive stages were processed: 1. unfertilized oocyte (MII); 2. fertilized oocyte with pronuclei; 3. two-cell embryo; 4. four-cell embryo; 5. eight-cell embryo; 6. morula (approx. 16 cells); and 7. blastocyst (32 cells or more). Each sample of the seven stages contained 600 biological units (oocytes or embryos), adding to 8400 units in total. We performed our measurements with two replicates, similar to a proteomic study of mouse iPSCs where measurements were also done with two biological replicates 38. In general, more replicates as e.g. in a proteomic study of bovine developmental stages 39 would have been desirable for the sake of more accurate and reliable summary statistics and to better isolate sources of variation in our measurements; however, the benefit of more replicates needs to be weighed against the

15 ACS Paragon Plus Environment

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 49

sacrificing of hundreds (or even thousands 40) of mice to collect sufficient numbers of oocytes and embryos. Two replicates, as in our study, represent a valid compromise and enabled us to at least filter the results and retain for further analysis only those proteins that were detected in both replicates.

To give our study a quantitative dimension, we mixed the cell lysate of each stage with an equal amount of F9 EC cell lysate (spike-in standard), which had been isotopically labeled with Lys8 and Arg10 using the SILAC method 27. The F9 EC cell line was originally isolated by Bernstine et al. 41 as a subline of the teratocarcinoma OTT6050, established by implanting a 6-day-old embryo in the testis of a 129/J mouse. Thus, F9 EC cells have many characteristics of early embryonic mouse cells 42 and are expected to provide a labeled counterpart for a large share of the proteins present in early embryos. Cell lysate / spike-in mixtures were subjected to a tryptic digest, the resulting peptide mixture was fractionated by reversed-phase HPLC at pH 10.2, and pools of fractions were then analyzed on either an LTQ Orbitrap Velos or a Q Exactive LC-MS/MS. Since the chromatographic behavior of peptides is unaffected by their labeling status, 'light' (L) peptides from unlabeled oocytes or embryos co-elute with 'heavy' (H) peptides from labeled F9 cells, the latter being common to all oocyte or embryo samples and serving as an internal reference. Thus, light and heavy peptides appear in the same MS1 scan, and their intensities can be used directly for relative quantification by comparing the L signal (from oocyte or embryo) against the H signal (from the F9 EC cells).

Using the above pipeline, 5207 gene identities were detected and quantified (L/H) on average per single sample (5350 in MII oocytes, 5448 in fertilized oocytes, 5274 in 2-cell embryos, 5146 in 4-cell embryos, 5298 in 8-cell embryos, 5210 in 16-cell embryos, 4717 in 32-cell embryos; Table S-3 as supporting information). An additional thirty-eight proteins were detected only in oocytes or embryos (L) but not in the F9 spike (H), and four proteins were detected only in the spike but not in oocytes or embryos (Table S-4 as supporting information); these proteins were excluded from our analysis. We were able to achieve this depth of analysis, unprecedented for mammalian oocytes, with 600 oocytes per sample, compared to 7000 oocytes per sample required in an independent study to generate proteomes of fewer than 3000 proteins 40, or compared to 1800 oocytes per sample used in our own previous study to identify over 3600 proteins 21. One explanation for the observed increase in identification rates (over our own previous results 21) could be the considerably extended prefractionation of samples by offline reversed-phase chromatography which we included in our sample preparation scheme, resulting in 20 individual measurements per sample, each one with a comparatively long 2.5 h gradient. On the Orbitrap Velos this extended prefractionation resulted in an average identification rate of 4751 +/- 188 proteins with an L/H ratio, which is a major improvement over our previous measurements on roughly the same amount of starting material 25. In addition, we noticed that the enhanced sensitivity of the Q Exactive mass spectrometer as compared to the Orbitrap Velos instrument contributed to a further increase in identification rates: on average 5661 +/- 328 proteins per sample could be identified with an L/H ratio, using the same number of oocytes as starting material, indicating that there is a gain in quantifiable proteins of about 24% with the Q Exactive mass spectrometer.
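The filtering logic described above (keep a protein only if it has both an L and an H signal, and only if it is found in both replicates) can be sketched as follows. This is an illustrative sketch, not the pipeline used in the study: the data layout, the function names and the toy intensities are all assumptions made for the example.

```python
from typing import Dict, Optional, Set, Tuple

# (light, heavy) intensities for one gene identity in one sample.
# None stands for "channel not detected". This layout is an assumption.
Quant = Tuple[Optional[float], Optional[float]]


def quantifiable(sample: Dict[str, Quant]) -> Set[str]:
    """Gene identities that have both an L and an H signal (an L/H ratio exists)."""
    return {gene for gene, (light, heavy) in sample.items() if light and heavy}


def reproducible(rep1: Dict[str, Quant], rep2: Dict[str, Quant]) -> Set[str]:
    """Gene identities quantifiable in both replicates (retained for analysis)."""
    return quantifiable(rep1) & quantifiable(rep2)


def detected_union(rep1: Dict[str, Quant], rep2: Dict[str, Quant]) -> Set[str]:
    """Gene identities quantifiable in at least one replicate."""
    return quantifiable(rep1) | quantifiable(rep2)


# Hypothetical toy data; gene names and intensities are invented.
replicate_1 = {
    "gene_a": (2.0e7, 1.0e7),
    "gene_b": (5.0e6, None),   # L only: no spike-in counterpart, excluded
    "gene_c": (3.0e6, 6.0e6),
}
replicate_2 = {
    "gene_a": (1.8e7, 0.9e7),
    "gene_c": (None, 5.0e6),   # H only in this replicate, excluded here
    "gene_d": (1.0e6, 2.0e6),
}

print(sorted(reproducible(replicate_1, replicate_2)))
print(sorted(detected_union(replicate_1, replicate_2)))
```

In this toy case only gene_a survives the two-replicate filter, while the union also contains gene_c and gene_d.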

Considering all samples together, 7243 gene identities were detected in at least one developmental stage of at least one replicate (union), while 5045 were present in both replicates. Although the overall number of detected proteins (7243) remains far from the number of transcripts present in MII mouse oocytes (estimated at 16000 genes 43), all of the 28 candidates of the reprogrammome, three of the four OSKM factors (all except MYC) and the major maternal-effect proteins known from the literature 44 were detected in the proteome of mouse oocytes and embryos (Table 2). In fact, we were able to detect proteins whose cognate mRNA remained undetectable even though the probe was present on the microarray (Table S-5 as supporting information); these proteins include reprogrammome candidates MLL3 and SMARCA5 (Table 2; Table S-1 as supporting information). This is not the first time that we have detected proteins but not their corresponding mRNAs 21, 25. The semantic composition of these 'orphan' proteins in terms of the most represented ‘biological process’ categories of the gene ontology is similar to that of the full set of detected proteins and detected mRNAs (Figure S-1 as supporting information). Clearly, although the proteome mirrors the transcriptome in general terms, discrepancies do exist that need to be examined on a case-by-case basis since they do not appear to fit any obvious pattern.

We wanted to estimate the relative abundance of the reprogrammome candidates in MII oocytes since they are the optimal recipients for SCNT, whereas subsequent stages harbor only a fraction of the initial reprogramming power 45, 46. Wang and colleagues 40 measured protein abundance based on the number of identified peptides; however, larger proteins tend to yield more tryptic peptides, and thus the number of identified peptides, when taken as a measure of protein abundance, is likely to produce erroneous results. It has been demonstrated that a protein’s abundance as a fraction of the total protein within the cell is reflected by the proportion of its MS signal to the total MS signal corrected for the mass of the protein 47, 48. Therefore, we estimated protein abundance as the intensity of a given protein divided by the summed intensity values of all proteins detected in oocytes, and divided this fraction by the molecular weight of the protein (Table 2). These relative intensities formed a statistical distribution. Proteins known to be abundant in mouse oocytes 44 ranked in the first percentile of the distribution (e.g. PADI6, PLA2G4C, MATER/NLRP5, NUCLEOPLASMIN, DPPA3/PGC7/STELLA). In contrast to these abundant proteins, OSKM factors OCT4 and SOX2 ranked in the 47th and 33rd percentile, respectively. Our reprogrammome candidates featured a median abundance percentile of 41, and only DNMT1, a maintenance DNA methyltransferase, ranked in the first percentile (Table 2).
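The abundance estimate described above (a protein's share of the total MS intensity, divided by its molecular weight, then ranked) can be illustrated with a short sketch. The function names, the protein labels, the intensities and the molecular weights are hypothetical, and the percentile convention used here (rank 1 is the most abundant protein, so small percentiles mean high abundance) is an assumption chosen to match the way percentiles are quoted in the text.

```python
from typing import Dict


def relative_abundance(intensity: Dict[str, float],
                       mw_kda: Dict[str, float]) -> Dict[str, float]:
    """(protein intensity / summed intensity of all proteins) / molecular weight."""
    total = sum(intensity.values())
    return {prot: (value / total) / mw_kda[prot]
            for prot, value in intensity.items()}


def abundance_percentile(scores: Dict[str, float]) -> Dict[str, float]:
    """Percentile rank, with the most abundant protein receiving the lowest value."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    return {prot: 100.0 * (rank + 1) / n for rank, prot in enumerate(ranked)}


# Hypothetical toy values: MS intensities and molecular weights in kDa.
intensity = {"prot_a": 600.0, "prot_b": 300.0, "prot_c": 100.0}
mw_kda = {"prot_a": 60.0, "prot_b": 10.0, "prot_c": 100.0}

scores = relative_abundance(intensity, mw_kda)
percentiles = abundance_percentile(scores)
for prot in sorted(percentiles, key=percentiles.get):
    print(prot, round(percentiles[prot], 1))
```

Note that prot_b ranks above prot_a even though its raw intensity is lower, because the correction for molecular weight removes the advantage that larger proteins gain from yielding more peptide signal.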

Levels of the 28 reprogrammome candidates are stable throughout embryonic cleavage except for PRMT7 and CHD4

In order to better follow the abundance profiles over time, the abundance of reprogramming factors at each stage was normalized to that in the starting material, MII oocytes, according to the formula

$$\frac{L_2/H}{L_1/H} = \frac{L_2}{L_1},$$

where $H$ indicates the EC cell control, $L_2$ indicates any given embryonic stage under study, and $L_1$ indicates the MII oocyte. For this quantification, at least two matching SILAC ratios of the same protein were required. This ratio of ratios resulted in values of 1 for oocytes and for proteins whose levels were unchanged between different developmental stages and oocytes, and values higher and lower than 1 for increased and reduced amounts of the corresponding proteins, respectively.
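As a worked illustration of this ratio of ratios, the sketch below divides each stage's L/H ratio by the L/H ratio of the MII oocyte, so that the common heavy reference cancels out. The function name, the stage labels and the L/H values are hypothetical; stages without a ratio are simply skipped, which is an assumption of this example rather than a rule stated in the text.

```python
from typing import Dict, Optional


def normalize_to_reference(lh_by_stage: Dict[str, Optional[float]],
                           reference: str = "MII") -> Dict[str, float]:
    """Return (L2/H) / (L1/H) for every stage, where L1 is the reference stage."""
    ref_ratio = lh_by_stage.get(reference)
    if not ref_ratio:
        raise ValueError("no L/H ratio available for the reference stage")
    return {stage: ratio / ref_ratio
            for stage, ratio in lh_by_stage.items()
            if ratio is not None}


# Hypothetical L/H ratios of one protein across stages.
lh_ratios = {
    "MII": 0.5,
    "2-cell": 0.5,        # unchanged relative to the oocyte
    "8-cell": 1.5,        # increased
    "blastocyst": 0.25,   # reduced
    "4-cell": None,       # not quantified at this stage
}

profile = normalize_to_reference(lh_ratios)
for stage, value in profile.items():
    print(stage, value)
```

By construction the MII oocyte takes the value 1, unchanged proteins stay at 1, and values above or below 1 indicate increased or reduced abundance.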

Our proteomic approach enabled us to confirm changes of protein abundance as expected from the literature (Table 2) and to assess the reprogrammome candidates (Figure 1). A hallmark of preimplantation development is the degradation of maternal or maternal-effect proteins soon after fertilization 49. Compared to degradation, synthesis and accumulation of new embryonic proteins may be more difficult to appreciate because the subunits of the oocyte ribosomes are stored in lattice structures, which may influence the ribosomes’ translational ability 50. Indeed, oocytes which lacked one of the candidate structural components of the lattices, PADI6, underwent ~50% of the total protein synthesis that is normally measured at the 2-cell stage 51. Our observations match these expectations in regard to protein degradation and synthesis (Table 2). For example, abundance of maternal-effect proteins BMP15 and GDF9 44, 49 declined after the MII stage and became undetectable after the 2-cell stage. Other maternal-effect proteins, such as the members of the subcortical maternal complex (MATER/NLRP5, FILIA/KHDC3, PADI6, TLE6 and FLOPED/OOEP; 52), were present throughout preimplantation development, with a transient decrease at the 1-cell stage (FLOPED/OOEP, MATER/NLRP5). A parabolic abundance profile was observed for another maternal-effect protein, DPPA3/PGC7/STELLA 53 (Figure 2C,C’), which decreased after oocyte fertilization and then rose again at the blastocyst stage, suggestive of functional roles not restricted to primordial germ cells, or suggesting that mouse primordial germ cell induction may start before implantation 54. It may be noted that the mRNAs coding for FLOPED/OOEP, MATER/NLRP5 and TLE6 fell below the detection limit from the 2-cell stage onward 52, yet these proteins were reliably found throughout preimplantation development up to blastocyst. Regardless of the molecular basis (for instance, proteins have a longer half-life and more molecules per cell than mRNAs 18), these discrepancies tell us that predicting the phenotype of a cell on the basis of mRNA alone is fraught with risk. This may be especially true for highly specialized cells like oocytes and for rapidly transforming entities like embryos, which do not exist in a long-term steady state.

In contrast to the maternal proteins, housekeeping proteins were more stably expressed, except for HPRT, which featured a steep increase (approx. 16-fold) at the morula and blastocyst stages compared to the 1-cell stage. The profile of HPRT can be explained, in part, by its template (Hprt mRNA increases approx. 4-fold during preimplantation development; Table S-1 as supporting information) and is not a singular case among proteins; it parallels the global increase of protein synthesis observed in mouse preimplantation development (7-fold overall and 14-fold for tubulin 55). Thus, the case of HPRT emphasizes the need to carefully evaluate reference genes with regard to assumptions of stable expression; probably no housekeeping gene is steadily expressed in all cells at all times.

Abundance of candidate reprogramming factors was stable overall from the MII level onwards (Table 2). Of the four original iPS reprogramming factors, OCT4 and SOX2 were constitutively present, while KLF4 was detected in one of two replicates (hence considered as not reliable, n.r.), and MYC protein was not detected with our method. It may be noted that MYC also went undetected in our previous LC-MS/MS studies of mouse oocytes 21, 25 as well as in the independent oocyte study of Wang et al. 40, an observation that may be due to the labile nature and very short half-life of this transcription factor. In oocytes and early embryos, OCT4 exists in two isoforms, OCT4A and OCT4B 12. Of the 13 peptides assigned to OCT4, three mapped to the NH2-terminal region specific to OCT4A. Hence, the reprogramming-competent isoform OCT4A could be resolved from the pool of OCT4. We compared the abundance profiles of OCT4A and total OCT4 (Table 2), and it appears that the two isoforms are co-present during preimplantation development, irrespective of the higher or lower reprogramming power associated with the different preimplantation stages. Candidate reprogramming factors TPT1, NPM3, and H3F3A (HISTONE H3.3, Table 1) were constitutively present. Two histones that proved effective in iPSC reprogramming when used in combination with NPM2, namely HIST3H2A and HIST1H2BA 56, were not detected in either replicate.

We then examined the 28 reprogrammome candidates we previously proposed 21. Among them, KDM6A and MLL3 were not detected reliably in this study, and the cognate mRNA of MLL3 went undetected as well (Table 2; Table S-1 as supporting information). Only PRMT7 and CHD4 featured a marked peak of relative abundance at the 4- to 8-cell stage (Figure 1). This peak is noteworthy because the mouse 8-cell stage features a mesenchymal-to-epithelial transition (compaction of the blastomeres), which is also observed during iPSC reprogramming 57, 38. Since we were interested in factors that promote reprogramming and CHD4 is already known to be a reprogramming-inhibiting factor 58, we focused on PRMT7 and aimed to clarify its role in reprogramming, but not before confirming the LC-MS/MS profile of PRMT7 in situ by immunofluorescence (Figure 2A,A’). We observed good agreement between immunofluorescence and LC-MS/MS results for PRMT7. We also conducted this immunofluorescence validation for another reprogrammome candidate (DNMT3A; Figure 2B,B’) and for an independent antigen (STELLA; Figure 2C,C’).

Screening of reprogrammome candidates in iPSC induction uncovers a role for PRMT7 as capable of functionally replacing SOX2.

Next, we investigated whether our reprogrammome candidates, particularly PRMT7, can reinforce or replace any of the Yamanaka reprogramming factors during iPSC generation. It was previously observed that PRMT7 is expressed in mouse ES cells and germ cells in a pattern similar to pluripotency markers 59. The mRNAs of PRMT7 and 16 other reprogrammome factors (BRCC3, CCNB1, DNMT1, DNMT3A, EED, HAT1, HDAC2, HDAC6, HELLS, KDM1A, PRMT1, PRMT5, RNF2, RUVBL1, RUVBL2, SMARCAL1) were upregulated during the conversion of mouse fibroblasts to iPSCs 57. Thus, we reprogrammed primary MEFs infected with OSK (we omitted MYC, since it is not necessary 60), and secondary MEFs containing a doxycycline-inducible OSKM cassette. On top of the Yamanaka factors, we added our reprogrammome candidates. We succeeded in testing 20 of the 28 reprogrammome candidates. All were expressed at least at double the control level after 3 days of infection as measured by qPCR (Table 3). The remaining 8 candidates were resistant to molecular cloning into retroviral vectors because either the full coding sequence could not be obtained or it was too long for packaging.

MEFs supplemented with individual reprogrammome candidates, in addition to Yamanaka factors, did not generate more iPSCs (Table 3). Based on the counts of GFP- and AP-positive colonies at day 16, none of the 20 factors tested supported an increase of more than 2-fold in induction efficiency compared to their controls, with the highest increase of 1.43 afforded by HDAC1 (Table 3). Although it is generally assumed that histone deacetylases act as transcriptional co-repressors, Wang and colleagues observed that HDAC1 is more enriched in active genes than in silent genes 61. Kidder and colleagues further found that, in mouse ES cells, HDAC1 occupies mainly active genes including the pluripotency marker genes Oct4, Sox2 and Nanog 62. Furthermore, HDAC1 expression was high in mouse ES cells and decreased during late differentiation 63; Hdac1 was upregulated during reprogramming of MEFs into iPSCs 63. For the remaining factors, which failed to enhance iPSC generation in our hands, the literature contains findings both similar and opposite to ours (e.g. DNMT3 64; CHD4 58; KDM6A also known as UTX 65; PRMT5 66). We note that, in comparison to our results with the reprogrammome, adding AFS1 and GDF9 to OSKM enabled an increase from 6 to 8 iPSC colonies 67; adding DPPA3 brought an increase from 25 to 60 (GFP) or 175 to 200 (AP) colonies 68; and adding GLIS1 to OSKM enabled an increase from 10 to 40 colonies 69. The fold-increase of 1.43 produced by HDAC1 is much closer to the modest increase reported for AFS1 and GDF9, but it is nevertheless significant. Our results are contingent on the extent of overexpression achieved with the pMX vectors used in this study; we cannot exclude that different vectors or higher overexpression would lead factors to induce more colonies.

Compared to the overexpression assays, results were more rewarding when we carried out substitution assays to see if the Yamanaka factors could be replaced by any of our reprogrammome candidates. We used a 6-well-based screening approach to look for reprogrammome candidates that could replace either OCT4, SOX2, or KLF4 in 3-factor-induced 60 OG2-MEF reprogramming. We found that PRMT7 replaced SOX2 efficiently (Figure 3A), leading to OKP7 iPSCs (OCT4 and KLF4 could not be replaced by the proposed reprogrammome factors). PRMT7 is a type II/III enzyme member of the protein arginine methyltransferase family 70, 71. OKP7 yielded more GFP+ colonies (40 ± 10 colonies per 5 × 10^4 MEFs plated) than the negative control (OCT4 and KLF4 without additional factor) or the other factors tested (0-5 colonies per 5 × 10^4 MEFs plated). While OCT4 and KLF4 have been replaced several times with other proteins (with E-CADHERIN 72, NR5A2 15, TET1 73, BMPs 74, GLIS1, PARP1, ESRRB 75), SOX2 has mostly been replaced with small molecules (e.g. inhibitors of TGF beta receptor kinase, of GSK3, of SFK 76-79). Thus PRMT7 (this study) adds to the small set of proteins (BMI-1 80, PRMT5 81) that can replace SOX2 in iPSC reprogramming.

In order to determine the pluripotency of the OKP7 iPS cell lines, those with prevalent euploid karyotype (sublines #1 and #4) were passaged and stably maintained with ES cell morphology (Figure 3C) and marker gene expression (Figure 3B) for several months, prior to being subjected to multilineage differentiation in vitro and in vivo. The pluripotent character of OKP7 iPSCs was validated by in vitro formation and differentiation of embryoid bodies followed by immunostaining using endoderm- (SOX17) and ectoderm- (TUJ1) specific antibodies (Figure 4A; movies of beating cardiomyocytes provided as supporting information); and by in vivo teratoma formation in SCID mice (Figure 4B), as well as germline contribution in chimeric fetuses (Figure 4B).

Concluding remarks.

In many previous studies it was empirically found that adding oocyte components to Yamanaka’s reprogramming cocktail (OSKM) had an effect on the rate of iPSC formation. In the present study, we tested a possible criterion to prioritize the choice of such additional factors by assessing the quantitative profile of candidate reprogramming factors in oocytes and throughout preimplantation stages. The data presented above support the conclusion that, with the exception of PRMT7, known (e.g. OSKM) and candidate reprogramming factors (e.g. the reprogrammome factors 21) are constitutively expressed at low-to-median levels in oocytes and during preimplantation development. Upon functional challenging of these factors, effects on reprogramming efficiencies were mild, if any, as other groups have also observed when they tested promising factors in addition to the Yamanaka cocktail 67-69. In fact, we learned that some reprogrammome candidates may need to be depleted instead of overexpressed, resembling the case of Mbd3 58. Alternatively, reprogrammome candidates may need to be overexpressed in combination and in a concerted manner, rather than individually, thereby posing a requirement for a broader study effort (consortium) to determine the appropriate number, stoichiometry and expression timing of the factors.

As a step in that direction, we have identified and validated PRMT7 as an important member of the reprogrammome. PRMT7 is capable of functionally replacing SOX2. The mechanism by which PRMT7 exerts its function is not known. The activation of common pathways by SOX2 and PRMT7 might be one possible mechanism. In addition, we note that PRMT7 catalyzes histone modifications similar to those catalyzed by PRMT5, which also has been shown to replace SOX2 81. The fact that both PRMT5 and PRMT7 can be selectively inhibited via the same small molecule 82 hints at a common mechanism. In addition, PRMT7 regulates the cellular response to DNA damage 83, which is a known component of iPSC reprogramming 84. Taken together, these data indicate that PRMT7 participates in the reprogramming process mediated by Yamanaka factors, and they underscore the usefulness of a proteome-based approach to tackle the reprogramming factors of mouse oocytes: our 1:20 discovery rate compares favorably with other methods, e.g. the screening of 1437 transcription factors for their ability to replace KLF4 or OCT4 during iPSC generation in mouse 69.

Acknowledgements

This study came into existence thanks to the intellectual stimulus of the Schwerpunktprogramm (Priority Program) no. 1356 of the Deutsche Forschungsgemeinschaft (DFG grants BO2540/3-2 and FU 583/2-2). The experiments were an integral part of the doctoral work of Bingyuan Wang, who is grateful to the Chinese Scholarship Council for supporting her stay in Germany. We thank the Max Planck Institute for Molecular Biomedicine and its Director, Prof. Hans R. Schöler, for infrastructural support. We thank Ellen Casser and Mathias Ernst for help with the immunofluorescence and bioinformatic analysis, respectively. We also thank the personnel of the mouse housing facility, and Annalen Nolte at the Bioanalytical Mass Spectrometry Facility of MPI. Holm Zaehres made valuable comments on the manuscript. Amy Pavlak performed a language and grammar check on the final version of the manuscript.

Author contributions

Bingyuan Wang performed the iPSC work and drafted an early version of this manuscript. Martin Pfeiffer helped Bingyuan Wang with the iPSC work. Hannes Drexler performed the LC-MS/MS work. Georg Fuellen helped with the bioinformatic analysis and critically revised the manuscript. Michele Boiani collected the oocyte and embryo samples, performed the immunofluorescence work, analyzed all data, and wrote the manuscript in its final form.

Conflict of interest disclosure

The authors are not aware of sources of bias or competing financial interests that may prejudice the objectivity of this manuscript.

Abbreviations

AP: alkaline phosphatase
CORFs: candidate oocyte reprogramming factors
EB: embryoid body
EC: embryonal carcinoma (cell)
ES: embryonic stem (cell)
FBS: fetal bovine serum
GFP: green fluorescent protein
hCG: human chorionic gonadotropin
HEK 293: human embryonic kidney 293 cells
iPS: induced pluripotent stem (cell)
KLF4: product of the Kruppel-like factor 4 gene
KOSR: knock-out serum replacement
KSOM(aa): Potassium (K) simplex optimized (medium) containing amino acids
L/H: light/heavy (SILAC peptides’ intensities)
LC-MS/MS: liquid chromatography–tandem mass spectrometry
LIF: leukemia inhibitory factor
MEF: mouse embryonic fibroblasts
MII: metaphase II (oocyte)
MYC: product of the myelocytomatosis oncogene
NEAA: non-essential amino acids
OCT4: product of the POU domain class 5, transcription factor 1 gene
OG2: Oct4-GFP transgene
OKP7: OCT4-KLF4-PRMT7 (iPSCs)
OSKM: OCT4, SOX2, KLF4, MYC
PMSG: pregnant mare’s serum gonadotropin
PRMT7: protein arginine methyltransferase 7
PS: penicillin streptomycin
qPCR: quantitative polymerase chain reaction
SCID: severe combined immunodeficient (mouse)
SCNT: somatic cell nuclear transfer
SILAC: stable isotope labeling with amino acids in cell culture
SOX2: product of the SRY (sex determining region Y)-box 2 gene

ASSOCIATED CONTENT

Supporting Information

All transcripts identified in the developmental stages used for proteomics are listed in Table S-1. Real-time qPCR primers used for validation of factor overexpression or endogenous gene expression are listed in Table S-2. All proteins identified in each sample are listed in Table S-3. All raw L and H intensities of each sample are listed in Table S-4. The proteins without detected mRNA but represented with a probe on the microarray are listed in Table S-5. The most represented (%) ‘biological process’ categories of the gene ontology (GO BP) of proteome and transcriptome are shown in Figure S-1. The successful differentiation of OKP7 iPSCs into beating cardiomyocytes is documented with movies.

References

1. Wilmut, I.; Schnieke, A. E.; McWhir, J.; Kind, A. J.; Campbell, K. H., Viable offspring derived from fetal and adult mammalian cells. Nature 1997, 385, (6619), 810-3.
2. Wakayama, T.; Perry, A. C.; Zuccotti, M.; Johnson, K. R.; Yanagimachi, R., Full-term development of mice from enucleated oocytes injected with cumulus cell nuclei. Nature 1998, 394, (6691), 369-74.
3. Kato, Y.; Tani, T.; Sotomaru, Y.; Kurokawa, K.; Kato, J.; Doguchi, H.; Yasue, H.; Tsunoda, Y., Eight calves cloned from somatic cells of a single adult. Science 1998, 282, (5396), 2095-8.
4. Baguisi, A.; Behboodi, E.; Melican, D. T.; Pollock, J. S.; Destrempes, M. M.; Cammuso, C.; Williams, J. L.; Nims, S. D.; Porter, C. A.; Midura, P.; Palacios, M. J.; Ayres, S. L.; Denniston, R. S.; Hayes, M. L.; Ziomek, C. A.; Meade, H. M.; Godke, R. A.; Gavin, W. G.; Overstrom, E. W.; Echelard, Y., Production of goats by somatic cell nuclear transfer. Nat Biotechnol 1999, 17, (5), 456-61.
5. Polejaeva, I. A.; Chen, S. H.; Vaught, T. D.; Page, R. L.; Mullins, J.; Ball, S.; Dai, Y.; Boone, J.; Walker, S.; Ayares, D. L.; Colman, A.; Campbell, K. H., Cloned pigs produced by nuclear transfer from adult somatic cells. Nature 2000, 407, (6800), 86-90.
6. Chesne, P.; Adenot, P. G.; Viglietta, C.; Baratte, M.; Boulanger, L.; Renard, J. P., Cloned rabbits produced by nuclear transfer from adult somatic cells. Nat Biotechnol 2002, 20, (4), 366-9.
7. Kim, K.; Ng, K.; Rugg-Gunn, P. J.; Shieh, J. H.; Kirak, O.; Jaenisch, R.; Wakayama, T.; Moore, M. A.; Pedersen, R. A.; Daley, G. Q., Recombination signatures distinguish embryonic stem cells derived by parthenogenesis and somatic cell nuclear transfer. Cell Stem Cell 2007, 1, (3), 346-52.
8. Fang, Z. F.; Gai, H.; Huang, Y. Z.; Li, S. G.; Chen, X. J.; Shi, J. J.; Wu, L.; Liu, A.; Xu, P.; Sheng, H. Z., Rabbit embryonic stem cell lines derived from fertilized, parthenogenetic or somatic cell nuclear transfer embryos. Exp Cell Res 2006, 312, (18), 3669-82.
9. Byrne, J. A.; Pedersen, D. A.; Clepper, L. L.; Nelson, M.; Sanger, W. G.; Gokhale, S.; Wolf, D. P.; Mitalipov, S. M., Producing primate embryonic stem cells by somatic cell nuclear transfer. Nature 2007, 450, (7169), 497-502.
10. Tachibana, M.; Amato, P.; Sparman, M.; Gutierrez, N. M.; Tippner-Hedges, R.; Ma, H.; Kang, E.; Fulati, A.; Lee, H. S.; Sritanaudomchai, H.; Masterson, K.; Larson, J.; Eaton, D.; Sadler-Fredd, K.; Battaglia, D.; Lee, D.; Wu, D.; Jensen, J.; Patton, P.; Gokhale, S.; Stouffer, R. L.; Wolf, D.; Mitalipov, S., Human embryonic stem cells derived by somatic cell nuclear transfer. Cell 2013, 153, (6), 1228-38.
11. Takahashi, K.; Yamanaka, S., Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 2006, 126, (4), 663-76.
12. Wu, G.; Han, D.; Gong, Y.; Sebastiano, V.; Gentile, L.; Singhal, N.; Adachi, K.; Fischedick, G.; Ortmeier, C.; Sinn, M.; Radstaak, M.; Tomilin, A.; Scholer, H. R., Establishment of totipotency does not depend on Oct4A. Nat Cell Biol 2013, 15, (9), 1089-97.
13. Frum, T.; Halbisen, M. A.; Wang, C.; Amiri, H.; Robson, P.; Ralston, A., Oct4 cell-autonomously promotes primitive endoderm development in the mouse blastocyst. Dev Cell 2013, 25, (6), 610-22.

14. Le Bin, G. C.; Munoz-Descalzo, S.; Kurowski, A.; Leitch, H.; Lou, X.; Mansfield, W.; Etienne-Dumeau, C.; Grabole, N.; Mulas, C.; Niwa, H.; Hadjantonakis, A. K.; Nichols, J., Oct4 is required for lineage priming in the developing inner cell mass of the mouse blastocyst. Development 2014, 141, (5), 1001-10.
15. Heng, J. C.; Feng, B.; Han, J.; Jiang, J.; Kraus, P.; Ng, J. H.; Orlov, Y. L.; Huss, M.; Yang, L.; Lufkin, T.; Lim, B.; Ng, H. H., The nuclear receptor Nr5a2 can replace Oct4 in the reprogramming of murine somatic cells to pluripotent cells. Cell Stem Cell 2010, 6, (2), 167-74.
16. Hanna, J.; Saha, K.; Pando, B.; van Zon, J.; Lengner, C. J.; Creyghton, M. P.; van Oudenaarden, A.; Jaenisch, R., Direct cell reprogramming is a stochastic process amenable to acceleration. Nature 2009, 462, (7273), 595-601.
17. Nie, J.; An, L.; Miao, K.; Hou, Z.; Yu, Y.; Tan, K.; Sui, L.; He, S.; Liu, Q.; Lei, X.; Wu, Z.; Tian, J., Comparative analysis of dynamic proteomic profiles between in vivo and in vitro produced mouse embryos during postimplantation period. J Proteome Res 2013, 12, (9), 3843-56.
18. Schwanhausser, B.; Busse, D.; Li, N.; Dittmar, G.; Schuchhardt, J.; Wolf, J.; Chen, W.; Selbach, M., Global quantification of mammalian gene expression control. Nature 2011, 473, (7347), 337-42.
19. Matveeva, N. M.; Shilov, A. G.; Kaftanovskaya, E. M.; Maximovsky, L. P.; Zhelezova, A. I.; Golubitsa, A. N.; Bayborodin, S. I.; Fokina, M. M.; Serov, O. L., In vitro and in vivo study of pluripotency in intraspecific hybrid cells obtained by fusion of murine embryonic stem cells with splenocytes. Mol Reprod Dev 1998, 50, (2), 128-38.
20. Flasza, M.; Shering, A. F.; Smith, K.; Andrews, P. W.; Talley, P.; Johnson, P. A., Reprogramming in inter-species embryonal carcinoma-somatic cell hybrids induces expression of pluripotency and differentiation markers. Cloning Stem Cells 2003, 5, (4), 339-54.
21. Pfeiffer, M. J.; Siatkowski, M.; Paudel, Y.; Balbach, S. T.; Baeumer, N.; Crosetto, N.; Drexler, H. C.; Fuellen, G.; Boiani, M., Proteomic analysis of mouse oocytes reveals 28 candidate factors of the "reprogrammome". J Proteome Res 2011, 10, (5), 2140-53.
22. Awe, J. P.; Byrne, J. A., Identifying candidate oocyte reprogramming factors using cross-species global transcriptional analysis. Cell Reprogram 2013, 15, (2), 126-33.
23. Cibelli, J. B., Principles of cloning. Second edition; Elsevier/AP, Academic Press is an imprint of Elsevier: Amsterdam; Boston, 2014; p xxii, 562 pages.
24. Cox, J.; Mann, M., MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 2008, 26, (12), 1367-72.
25. Schwarzer, C.; Siatkowski, M.; Pfeiffer, M. J.; Baeumer, N.; Drexler, H. C.; Wang, B.; Fuellen, G.; Boiani, M., Maternal age effect on mouse oocytes: new biological insight from proteomic analysis. Reproduction 2014, 148, (1), 55-72.
26. Wassarman, P. M.; Jovine, L.; Litscher, E. S., Mouse zona pellucida genes and glycoproteins. Cytogenet Genome Res 2004, 105, (2-4), 228-34.
27. Geiger, T.; Wisniewski, J. R.; Cox, J.; Zanivan, S.; Kruger, M.; Ishihama, Y.; Mann, M., Use of stable isotope labeling by amino acids in cell culture as a spike-in standard in quantitative proteomics. Nat Protoc 2011, 6, (2), 147-57.

Journal of Proteome Research

28. Wang, Y.; Yang, F.; Gritsenko, M. A.; Wang, Y.; Clauss, T.; Liu, T.; Shen, Y.; Monroe, M. E.; Lopez-Ferrer, D.; Reno, T.; Moore, R. J.; Klemke, R. L.; Camp, D. G., 2nd; Smith, R. D., Reversed-phase chromatography with multiple fraction concatenation strategy for proteome profiling of human MCF10A cells. Proteomics 2011, 11, (10), 2019-26.
29. Mi, H.; Muruganujan, A.; Thomas, P. D., PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res 2013, 41, (Database issue), D377-86.
30. Vizcaino, J. A.; Deutsch, E. W.; Wang, R.; Csordas, A.; Reisinger, F.; Rios, D.; Dianes, J. A.; Sun, Z.; Farrah, T.; Bandeira, N.; Binz, P. A.; Xenarios, I.; Eisenacher, M.; Mayer, G.; Gatto, L.; Campos, A.; Chalkley, R. J.; Kraus, H. J.; Albar, J. P.; Martinez-Bartolome, S.; Apweiler, R.; Omenn, G. S.; Martens, L.; Jones, A. R.; Hermjakob, H., ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nat Biotechnol 2014, 32, (3), 223-6.
31. Cox, W. G.; Singer, V. L., A high-resolution, fluorescence-based method for localization of endogenous alkaline phosphatase activity. J Histochem Cytochem 1999, 47, (11), 1443-56.
32. Livak, K. J.; Schmittgen, T. D., Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 2001, 25, (4), 402-8.
33. Schneider, C. A.; Rasband, W. S.; Eliceiri, K. W., NIH Image to ImageJ: 25 years of image analysis. Nat Methods 2012, 9, (7), 671-5.
34. Cavaleri, F. M.; Balbach, S. T.; Gentile, L.; Jauch, A.; Bohm-Steuer, B.; Han, Y. M.; Scholer, H. R.; Boiani, M., Subsets of cloned mouse embryos and their non-random relationship to development and nuclear reprogramming. Mech Dev 2008, 125, (1-2), 153-66.
35. Smits, A. H.; Lindeboom, R. G.; Perino, M.; van Heeringen, S. J.; Veenstra, G. J.; Vermeulen, M., Global absolute quantification reveals tight regulation of protein expression in single Xenopus eggs. Nucleic Acids Res 2014, 42, (15), 9880-91.
36. Sun, L.; Bertke, M. M.; Champion, M. M.; Zhu, G.; Huber, P. W.; Dovichi, N. J., Quantitative proteomics of Xenopus laevis embryos: expression kinetics of nearly 4000 proteins during early development. Sci Rep 2014, 4, 4365.
37. Wuhr, M.; Guttler, T.; Peshkin, L.; McAlister, G. C.; Sonnett, M.; Ishihara, K.; Groen, A. C.; Presler, M.; Erickson, B. K.; Mitchison, T. J.; Kirschner, M. W.; Gygi, S. P., The Nuclear Proteome of a Vertebrate. Curr Biol 2015, 25, (20), 2663-71.
38. Hansson, J.; Rafiee, M. R.; Reiland, S.; Polo, J. M.; Gehring, J.; Okawa, S.; Huber, W.; Hochedlinger, K.; Krijgsveld, J., Highly coordinated proteome dynamics during reprogramming of somatic cells to pluripotency. Cell Rep 2012, 2, (6), 1579-92.
39. Deutsch, D. R.; Frohlich, T.; Otte, K. A.; Beck, A.; Habermann, F. A.; Wolf, E.; Arnold, G. J., Stage-specific proteome signatures in early bovine embryo development. J Proteome Res 2014, 13, (10), 4363-76.
40. Wang, S.; Kou, Z.; Jing, Z.; Zhang, Y.; Guo, X.; Dong, M.; Wilmut, I.; Gao, S., Proteome of mouse oocytes at different developmental stages. Proc Natl Acad Sci U S A 2010, 107, (41), 17639-44.
41. Berstine, E. G.; Hooper, M. L.; Grandchamp, S.; Ephrussi, B., Alkaline phosphatase activity in mouse teratoma. Proc Natl Acad Sci U S A 1973, 70, (12), 3899-903.
42. Alonso, A.; Breuer, B.; Steuer, B.; Fischer, J., The F9-EC cell line as a model for the analysis of differentiation. Int J Dev Biol 1991, 35, (4), 389-97.

43. Ma, J. Y.; Li, M.; Ge, Z. J.; Luo, Y.; Ou, X. H.; Song, S.; Tian, D.; Yang, J.; Zhang, B.; Ou-Yang, Y. C.; Hou, Y.; Liu, Z.; Schatten, H.; Sun, Q. Y., Whole transcriptome analysis of the effects of type I diabetes on mouse oocytes. PLoS One 2012, 7, (7), e41981.
44. Yurttas, P.; Morency, E.; Coonrod, S. A., Use of proteomics to identify highly abundant maternal factors that drive the egg-to-embryo transition. Reproduction 2010, 139, (5), 809-23.
45. Egli, D.; Rosains, J.; Birkhoff, G.; Eggan, K., Developmental reprogramming after chromosome transfer into mitotic mouse zygotes. Nature 2007, 447, (7145), 679-85.
46. Egli, D.; Sandler, V. M.; Shinohara, M. L.; Cantor, H.; Eggan, K., Reprogramming after chromosome transfer into mouse blastomeres. Curr Biol 2009, 19, (16), 1403-9.
47. Wisniewski, J. R.; Hein, M. Y.; Cox, J.; Mann, M., A "proteomic ruler" for protein copy number and concentration estimation without spike-in standards. Mol Cell Proteomics 2014, 13, (12), 3497-506.
48. Wisniewski, J. R.; Friedrich, A.; Keller, T.; Mann, M.; Koepsell, H., The impact of high-fat diet on metabolism and immune defense in small intestine mucosa. J Proteome Res 2015, 14, (1), 353-65.
49. Li, L.; Zheng, P.; Dean, J., Maternal control of early mouse development. Development 2010, 137, (6), 859-70.
50. Bachvarova, R.; De Leon, V.; Spiegelman, I., Mouse egg ribosomes: evidence for storage in lattices. J Embryol Exp Morphol 1981, 62, 153-64.
51. Yurttas, P.; Vitale, A. M.; Fitzhenry, R. J.; Cohen-Gould, L.; Wu, W.; Gossen, J. A.; Coonrod, S. A., Role for PADI6 and the cytoplasmic lattices in ribosomal storage in oocytes and translational control in the early mouse embryo. Development 2008, 135, (15), 2627-36.
52. Li, L.; Baibakov, B.; Dean, J., A subcortical maternal complex essential for preimplantation mouse embryogenesis. Dev Cell 2008, 15, (3), 416-25.
53. Payer, B.; Saitou, M.; Barton, S. C.; Thresher, R.; Dixon, J. P.; Zahn, D.; Colledge, W. H.; Carlton, M. B.; Nakano, T.; Surani, M. A., Stella is a maternal effect gene required for normal early development in mice. Curr Biol 2003, 13, (23), 2110-7.
54. Gerovska, D.; Arauzo-Bravo, M. J., Does mouse embryo primordial germ cell activation start before implantation as suggested by single-cell transcriptomics dynamics? Mol Hum Reprod 2016, 22, (3), 208-25.
55. Abreu, S. L.; Brinster, R. L., Synthesis of tubulin and actin during the preimplantation development of the mouse. Exp Cell Res 1978, 114, (1), 135-41.
56. Shinagawa, T.; Takagi, T.; Tsukamoto, D.; Tomaru, C.; Huynh, L. M.; Sivaraman, P.; Kumarevel, T.; Inoue, K.; Nakato, R.; Katou, Y.; Sado, T.; Takahashi, S.; Ogura, A.; Shirahige, K.; Ishii, S., Histone variants enriched in oocytes enhance reprogramming to induced pluripotent stem cells. Cell Stem Cell 2014, 14, (2), 217-27.
57. Samavarchi-Tehrani, P.; Golipour, A.; David, L.; Sung, H. K.; Beyer, T. A.; Datti, A.; Woltjen, K.; Nagy, A.; Wrana, J. L., Functional genomics reveals a BMP-driven mesenchymal-to-epithelial transition in the initiation of somatic cell reprogramming. Cell Stem Cell 2010, 7, (1), 64-77.
58. Rais, Y.; Zviran, A.; Geula, S.; Gafni, O.; Chomsky, E.; Viukov, S.; Mansour, A. A.; Caspi, I.; Krupalnik, V.; Zerbib, M.; Maza, I.; Mor, N.; Baran, D.; Weinberger, L.; Jaitin, D. A.; Lara-Astiaso, D.; Blecher-Gonen, R.; Shipony, Z.; Mukamel, Z.; Hagai, T.; Gilad, S.; Amann-Zalcenstein, D.; Tanay, A.; Amit, I.; Novershtern, N.; Hanna, J. H., Deterministic direct reprogramming of somatic cells to pluripotency. Nature 2013, 502, (7469), 65-70.
59. Buhr, N.; Carapito, C.; Schaeffer, C.; Kieffer, E.; Van Dorsselaer, A.; Viville, S., Nuclear proteome analysis of undifferentiated mouse embryonic stem and germ cells. Electrophoresis 2008, 29, (11), 2381-90.
60. Wernig, M.; Meissner, A.; Cassady, J. P.; Jaenisch, R., c-Myc is dispensable for direct reprogramming of mouse fibroblasts. Cell Stem Cell 2008, 2, (1), 10-2.
61. Wang, Z.; Zang, C.; Cui, K.; Schones, D. E.; Barski, A.; Peng, W.; Zhao, K., Genome-wide mapping of HATs and HDACs reveals distinct functions in active and inactive genes. Cell 2009, 138, (5), 1019-31.
62. Kidder, B. L.; Palmer, S., HDAC1 regulates pluripotency and lineage specific transcriptional networks in embryonic and trophoblast stem cells. Nucleic Acids Res 2012, 40, (7), 2925-39.
63. Saunders, L. R.; Sharma, A. D.; Tawney, J.; Nakagawa, M.; Okita, K.; Yamanaka, S.; Willenbring, H.; Verdin, E., miRNAs regulate SIRT1 expression during mouse embryonic stem cell differentiation and in adult mouse tissues. Aging (Albany NY) 2010, 2, (7), 415-31.
64. Guo, X.; Liu, Q.; Wang, G.; Zhu, S.; Gao, L.; Hong, W.; Chen, Y.; Wu, M.; Liu, H.; Jiang, C.; Kang, J., microRNA-29b is a novel mediator of Sox2 function in the regulation of somatic cell reprogramming. Cell Res 2013, 23, (1), 142-56.
65. Mansour, A. A.; Gafni, O.; Weinberger, L.; Zviran, A.; Ayyash, M.; Rais, Y.; Krupalnik, V.; Zerbib, M.; Amann-Zalcenstein, D.; Maza, I.; Geula, S.; Viukov, S.; Holtzman, L.; Pribluda, A.; Canaani, E.; Horn-Saban, S.; Amit, I.; Novershtern, N.; Hanna, J. H., The H3K27 demethylase Utx regulates somatic and germ cell epigenetic reprogramming. Nature 2012, 488, (7411), 409-13.
66. Han, C.; Gu, H.; Wang, J.; Lu, W.; Mei, Y.; Wu, M., Regulation of L-threonine dehydrogenase in somatic cell reprogramming. Stem Cells 2013, 31, (5), 953-65.
67. Gonzalez-Munoz, E.; Arboleda-Estudillo, Y.; Otu, H. H.; Cibelli, J. B., Cell reprogramming. Histone chaperone ASF1A is required for maintenance of pluripotency and cellular reprogramming. Science 2014, 345, (6198), 822-5.
68. Xu, X.; Smorag, L.; Nakamura, T.; Kimura, T.; Dressel, R.; Fitzner, A.; Tan, X.; Linke, M.; Zechner, U.; Engel, W.; Pantakani, D. V., Dppa3 expression is critical for generation of fully reprogrammed iPS cells and maintenance of Dlk1-Dio3 imprinting. Nat Commun 2015, 6, 6008.
69. Maekawa, M.; Yamaguchi, K.; Nakamura, T.; Shibukawa, R.; Kodanaka, I.; Ichisaka, T.; Kawamura, Y.; Mochizuki, H.; Goshima, N.; Yamanaka, S., Direct reprogramming of somatic cells is promoted by maternal transcription factor Glis1. Nature 2011, 474, (7350), 225-9.
70. Lee, J. H.; Cook, J. R.; Yang, Z. H.; Mirochnitchenko, O.; Gunderson, S. I.; Felix, A. M.; Herth, N.; Hoffmann, R.; Pestka, S., PRMT7, a new protein arginine methyltransferase that synthesizes symmetric dimethylarginine. J Biol Chem 2005, 280, (5), 3656-64.
71. Zurita-Lopez, C. I.; Sandberg, T.; Kelly, R.; Clarke, S. G., Human protein arginine methyltransferase 7 (PRMT7) is a type III enzyme forming omega-NG-monomethylated arginine residues. J Biol Chem 2012, 287, (11), 7859-70.
72. Redmer, T.; Diecke, S.; Grigoryan, T.; Quiroga-Negreira, A.; Birchmeier, W.; Besser, D., E-cadherin is crucial for embryonic stem cell pluripotency and can replace OCT4 during somatic cell reprogramming. EMBO Rep 2011, 12, (7), 720-6.

73. Gao, Y.; Chen, J.; Li, K.; Wu, T.; Huang, B.; Liu, W.; Kou, X.; Zhang, Y.; Huang, H.; Jiang, Y.; Yao, C.; Liu, X.; Lu, Z.; Xu, Z.; Kang, L.; Chen, J.; Wang, H.; Cai, T.; Gao, S., Replacement of Oct4 by Tet1 during iPSC induction reveals an important role of DNA methylation and hydroxymethylation in reprogramming. Cell Stem Cell 2013, 12, (4), 453-69.
74. Chen, J.; Liu, J.; Yang, J.; Chen, Y.; Chen, J.; Ni, S.; Song, H.; Zeng, L.; Ding, K.; Pei, D., BMPs functionally replace Klf4 and support efficient reprogramming of mouse fibroblasts by Oct4 alone. Cell Res 2011, 21, (1), 205-12.
75. Feng, B.; Jiang, J.; Kraus, P.; Ng, J. H.; Heng, J. C.; Chan, Y. S.; Yaw, L. P.; Zhang, W.; Loh, Y. H.; Han, J.; Vega, V. B.; Cacheux-Rataboul, V.; Lim, B.; Lufkin, T.; Ng, H. H., Reprogramming of fibroblasts into induced pluripotent stem cells with orphan nuclear receptor Esrrb. Nat Cell Biol 2009, 11, (2), 197-203.
76. Maherali, N.; Hochedlinger, K., Tgfbeta signal inhibition cooperates in the induction of iPSCs and replaces Sox2 and cMyc. Curr Biol 2009, 19, (20), 1718-23.
77. Ichida, J. K.; Blanchard, J.; Lam, K.; Son, E. Y.; Chung, J. E.; Egli, D.; Loh, K. M.; Carter, A. C.; Di Giorgio, F. P.; Koszka, K.; Huangfu, D.; Akutsu, H.; Liu, D. R.; Rubin, L. L.; Eggan, K., A small-molecule inhibitor of tgf-Beta signaling replaces sox2 in reprogramming by inducing nanog. Cell Stem Cell 2009, 5, (5), 491-503.
78. Li, W.; Zhou, H.; Abujarour, R.; Zhu, S.; Young Joo, J.; Lin, T.; Hao, E.; Scholer, H. R.; Hayek, A.; Ding, S., Generation of human-induced pluripotent stem cells in the absence of exogenous Sox2. Stem Cells 2009, 27, (12), 2992-3000.
79. Staerk, J.; Lyssiotis, C. A.; Medeiro, L. A.; Bollong, M.; Foreman, R. K.; Zhu, S.; Garcia, M.; Gao, Q.; Bouchez, L. C.; Lairson, L. L.; Charette, B. D.; Supekova, L.; Janes, J.; Brinker, A.; Cho, C. Y.; Jaenisch, R.; Schultz, P. G., Pan-Src family kinase inhibitors replace Sox2 during the direct reprogramming of somatic cells. Angew Chem Int Ed Engl 2011, 50, (25), 5734-6.
80. Moon, J. H.; Heo, J. S.; Kim, J. S.; Jun, E. K.; Lee, J. H.; Kim, A.; Kim, J.; Whang, K. Y.; Kang, Y. K.; Yeo, S.; Lim, H. J.; Han, D. W.; Kim, D. W.; Oh, S.; Yoon, B. S.; Scholer, H. R.; You, S., Reprogramming fibroblasts into induced pluripotent stem cells with Bmi1. Cell Res 2011, 21, (9), 1305-15.
81. Nagamatsu, G.; Kosaka, T.; Kawasumi, M.; Kinoshita, T.; Takubo, K.; Akiyama, H.; Sudo, T.; Kobayashi, T.; Oya, M.; Suda, T., A germ cell-specific gene, Prmt5, works in somatic cell reprogramming. J Biol Chem 2011, 286, (12), 10641-8.
82. Smil, D.; Eram, M. S.; Li, F.; Kennedy, S.; Szewczyk, M. M.; Brown, P. J.; Barsyte-Lovejoy, D.; Arrowsmith, C. H.; Vedadi, M.; Schapira, M., Discovery of a Dual PRMT5-PRMT7 Inhibitor. ACS Med Chem Lett 2015, 6, (4), 408-12.
83. Karkhanis, V.; Wang, L.; Tae, S.; Hu, Y. J.; Imbalzano, A. N.; Sif, S., Protein arginine methyltransferase 7 regulates cellular response to DNA damage by methylating promoter histones H2A and H4 of the polymerase delta catalytic subunit gene, POLD1. J Biol Chem 2012, 287, (35), 29801-14.
84. Marion, R. M.; Strati, K.; Li, H.; Murga, M.; Blanco, R.; Ortega, S.; Fernandez-Capetillo, O.; Serrano, M.; Blasco, M. A., A p53-mediated DNA damage response limits reprogramming to ensure iPS cell genomic integrity. Nature 2009, 460, (7259), 1149-53.
85. Hansis, C.; Barreto, G.; Maltry, N.; Niehrs, C., Nuclear reprogramming of human somatic cells by xenopus egg extract requires BRG1. Curr Biol 2004, 14, (16), 1475-80.

86. Betthauser, J. M.; Pfister-Genskow, M.; Xu, H.; Golueke, P. J.; Lacson, J. C.; Koppang, R. W.; Myers, C.; Liu, B.; Hoeschele, I.; Eilertsen, K. J.; Leno, G. H., Nucleoplasmin facilitates reprogramming and in vivo development of bovine nuclear transfer embryos. Mol Reprod Dev 2006, 73, (8), 977-86.
87. Tani, T.; Shimada, H.; Kato, Y.; Tsunoda, Y., Bovine oocytes with the potential to reprogram somatic cell nuclei have a unique 23-kDa protein, phosphorylated transcriptionally controlled tumor protein (TCTP). Cloning Stem Cells 2007, 9, (2), 267-80.
88. Koziol, M. J.; Garrett, N.; Gurdon, J. B., Tpt1 activates transcription of oct4 and nanog in transplanted somatic nuclei. Curr Biol 2007, 17, (9), 801-7.
89. Yan, X.; Yu, S.; Lei, A.; Hua, J.; Chen, F.; Li, L.; Xie, X.; Yang, X.; Geng, W.; Dou, Z., The four reprogramming factors and embryonic development in mice. Cell Reprogram 2010, 12, (5), 565-70.
90. Miyamoto, K.; Nagai, K.; Kitamura, N.; Nishikawa, T.; Ikegami, H.; Binh, N. T.; Tsukamoto, S.; Matsumoto, M.; Tsukiyama, T.; Minami, N.; Yamada, M.; Ariga, H.; Miyake, M.; Kawarasaki, T.; Matsumoto, K.; Imai, H., Identification and characterization of an oocyte factor required for development of porcine nuclear transfer embryos. Proc Natl Acad Sci U S A 2011, 108, (17), 7040-5.
91. Kong, Q.; Xie, B.; Li, J.; Huan, Y.; Huang, T.; Wei, R.; Lv, J.; Liu, S.; Liu, Z., Identification and characterization of an oocyte factor required for porcine nuclear reprogramming. J Biol Chem 2014, 289, (10), 6960-8.
92. Wen, D.; Banaszynski, L. A.; Liu, Y.; Geng, F.; Noh, K. M.; Xiang, J.; Elemento, O.; Rosenwaks, Z.; Allis, C. D.; Rafii, S., Histone variant H3.3 is an essential maternal factor for oocyte reprogramming. Proc Natl Acad Sci U S A 2014, 111, (20), 7325-30.
93. Ishiuchi, T.; Enriquez-Gasca, R.; Mizutani, E.; Boskovic, A.; Ziegler-Birling, C.; Rodriguez-Terrones, D.; Wakayama, T.; Vaquerizas, J. M.; Torres-Padilla, M. E., Early embryonic-like cells are induced by downregulating replication-dependent chromatin assembly. Nat Struct Mol Biol 2015.
94. David, L.; Polo, J. M., Phases of reprogramming. Stem Cell Res 2014, 12, (3), 754-61.

Table 1. Literature search (chronological publication order) of reprogramming factors thought to account for the oocyte's unique ability to reprogram nuclei after SCNT.

| Factor | Role | Reference |
|---|---|---|
| BRG1 a.k.a. SMARCA4 | depletion of BRG1 protein abolishes the reprogramming ability of Xenopus egg extracts on human somatic cells | 85 |
| NPM3 | pregnancy rates are increased after SCNT | 86 |
| TPT1 a.k.a. TCTP | pretreatment of donor cells with phosphorylated TCTP increases the development of cloned calves by improving transcription of pluripotency genes | 87, 88 |
| OSKM | the four Yamanaka factors are present in mouse oocytes | 89 |
| DJ-1 a.k.a. PARK7 | development is impaired in porcine SCNT embryos that lack Park7 | 90 |
| GLIS1 | iPSC generation efficiency is increased when Npm2 is co-delivered | 69 |
| CORFs | in silico candidates proposed to loosen/open up chromatin structure; abundant as mRNA in mature oocytes of human, rhesus monkey and mouse | 22 |
| VIMENTIN | depletion of maternal product reduces both rate and quality of cloned embryo development | 91 |
| HISTONE H3.3 | pluripotency-associated genes are reactivated in a maternal H3.3-dependent way in SCNT mouse embryos | 92 |
| HISTONES HIST1H2AA and HIST1H2BA | iPSC generation efficiency is increased when Npm2 is co-delivered | 56 |
| ASF1A | iPSC generation efficiency is increased when Gdf9 is co-delivered | 67 |
| CAF1 a.k.a. CHAF1 | reprogramming after SCNT is enhanced when CAF1 is knocked down in nucleus donor cells | 93 |
| DPPA3 / PGC7 / STELLA | increases the proportion of fully reprogrammed, high-grade chimera-forming iPSCs, as compared to partially reprogrammed low-grade chimera-forming iPSCs | 68 |

Table 2. SILAC-based proteomic quantitation of differently staged mouse embryos. Relative abundance of reprogrammome candidates, iPSC reprogramming factors, oocyte-specific or oocyte-enriched proteins, and housekeeping proteins. The protein abundance percentile in MII oocytes was derived from the proportion of a protein's MS signal to the total MS signal, corrected for the mass of the protein. The percentile was assigned by ranking the factors from most to least abundant and setting the whole range to 100%. During development (MII, 1C, 2C, 4C, 8C, 16C, 32C), each value is the average of two embryonic replicates divided by the average abundance at the MII stage, yielding L/H ratios. Factors are listed in alphabetical order. N.d.: not detected at all (in any of the replicates). N.r.: not reliable (the protein was detected in one sample but not in the replicate; hence all stages from MII to 32C are affected). A slash between gene names indicates synonyms. A semicolon between gene names indicates that the protein assignment is uncertain between the two proteins separated by the semicolon (tie). This is the case, for instance, for the TATA-binding proteins TBP and TBPL2, whose amino acid sequences are very similar, at least in the C-terminal region.
* Presence (P) or absence (A) according to microarray analysis (probe verified as present).
** Abundance according to the distribution of relative MS signal intensity: 1st percentile, highest abundance; 100th percentile, lowest abundance.

| Factor | Cognate mRNA * | Protein abundance in MII ** | MII | 1C | 2C | 4C | 8C | 16C | 32C |
|---|---|---|---|---|---|---|---|---|---|
| Reprogrammome | | | | | | | | | |
| BAZ1B | P | 49.54 | 1 | 1.05 | 0.92 | 0.86 | 1.19 | 1.36 | 2.26 |
| BRCC3 | P | 28.72 | 1 | 0.78 | 0.45 | 0.54 | 0.62 | 0.38 | 1.85 |
| CARM1 | P | 43.78 | 1 | 0.67 | 0.78 | 0.68 | 0.66 | 1.14 | 1.08 |
| CCNB1 | P | 39.79 | 1 | 1.57 | 1.65 | 1.19 | 0.92 | 2.20 | 2.32 |
| CHD4 | P | 32.61 | 1 | 1.80 | 1.12 | 5.99 | 1.42 | 1.50 | 1.39 |
| DNMT1 | P | 0.88 | 1 | 0.94 | 1 | 1.18 | 1.04 | 0.88 | 0.46 |
| DNMT3A | P | 67.74 | 1 | 0.95 | 0.39 | 0.19 | 0 | 0 | 0 |
| EED | P | 40.68 | 1 | 0.97 | 1.02 | 1.01 | 1.42 | 1.69 | 3.40 |
| EP400 | P | 79.56 | 1 | 0.76 | 0.90 | 0.62 | 0.77 | 0.55 | 2.52 |
| HAT1 | P | 21.43 | 1 | 1.10 | 0.96 | 0.96 | 1 | 1.28 | 1.76 |
| HDAC1; GM10093 | P | 5.52 | 1 | 0.84 | 0.53 | 0.53 | 0.98 | 1.12 | 1.41 |
| HDAC2 | P | 43.67 | 1 | 1.27 | 0.68 | 1.19 | 1.14 | 1.18 | 2.03 |
| HDAC6 | P | 44.03 | 1 | 0.75 | 0.73 | 0.84 | 0.81 | 0.78 | 0.42 |
| HELLS | P | 22.39 | 1 | 1.26 | 0.73 | 1.14 | 1.29 | 1.11 | 1.92 |
| KDM1A / LSD1 | P | 14.64 | 1 | 1.01 | 1.11 | 0.88 | 1.17 | 1.11 | 2.30 |
| KDM6A / UTX | P | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. |
| MLL3 | A | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. |
| PRMT1 | P | 6.41 | 1 | 1.03 | 1.18 | 1.13 | 1.49 | 1.52 | 1.86 |
| PRMT5 | P | 38.04 | 1 | 0.80 | 0.47 | 0.57 | 0.58 | 1.65 | 1.82 |
| PRMT7 | P | 61.29 | 1 | 1.30 | 2.33 | 6.22 | 2.75 | 2.25 | 1.42 |
| RNF2 | P | 46.92 | 1 | 0.97 | 0.80 | 0.81 | 1.04 | 1.64 | 1.50 |
| RNF20 | P | 71.42 | 1 | 0.80 | 0.77 | 0.62 | 0.98 | 1.10 | 2.32 |
| RUVBL1 | P | 9.59 | 1 | 1.12 | 1.07 | 1.21 | 1.24 | 1.49 | 1.34 |
| RUVBL2 | P | 11.68 | 1 | 1.01 | 0.68 | 0.95 | 1.08 | 1.27 | 1.16 |
| SMARCA4 / BRG1 | P | 50.84 | 1 | 1.02 | 0.92 | 0.83 | 1.33 | 1.51 | 2.13 |
| SMARCA5 | A | 16.56 | 1 | 0.96 | 0.75 | 0.94 | 1.81 | 2.02 | 3.50 |
| SMARCAL1 | P | 78.61 | 1 | 0.70 | 0.67 | 1.21 | 0.64 | 1.19 | 2.55 |
| USP16 | P | 88.16 | 1 | 0.93 | 1.04 | 1.42 | 0.87 | 1.18 | 2.39 |
| iPS - OSKM | | | | | | | | | |
| POU5F1 (A+B) | P | 47.70 | 1 | 1.03 | 0.87 | 0.93 | 1.28 | 2.67 | 4.10 |
| POU5F1 (A) | | | 1 | 1 | 0.32 | 0.46 | 0.49 | 1.31 | 1.64 |
| SOX2 | P | 33.16 | 1 | 0.50 | 0.53 | 0.64 | 0.72 | 0.87 | 0.88 |
| KLF4 | P | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. |
| MYC | P | n.d. | | | | | | | |
| Oocyte-specific or -enriched | | | | | | | | | |
| ASF1A | P | 58.57 | 1 | 1 | 1.28 | 0.73 | 0.56 | 0.77 | 0.93 |
| BMP15 | P | 42.32 | 1 | 1.98 | 0.51 | 0 | 0 | 0 | 0 |
| CHAF1A | P | 80.52 | 1 | 1.89 | 1.33 | 1.91 | 1.41 | 1.44 | 3.75 |
| DPPA3 / PGC7 / STELLA | P | 10.38 | 1 | 0.34 | 0.20 | 0.13 | 0.10 | 0.65 | 1.28 |
| FLOPED / OOEP | P | 0.14 | 1 | 0.09 | 1.58 | 1.55 | 0.14 | 0.03 | 0.04 |
| GDF9 | P | 8.90 | 1 | 2.39 | 2.15 | 0 | 0 | 0 | 0 |
| GLIS1; B1ASP5 | P | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. | n.r. |
| HIST1H2AA | P | n.d. | | | | | | | |
| HIST1H2BA | P | n.d. | | | | | | | |
| H3F3A; H3F3C | P | 47.28 | 1 | 0.90 | 0.89 | 1.80 | 1.48 | 1.24 | 1.72 |
| FILIA / KHDC3 / 2410004A20Rik | P | 0.16 | 1 | 2.25 | 0.53 | 1.75 | 2.44 | 1.90 | 1.09 |
| MATER / NLRP5 | P | 0.29 | 1 | 0.27 | 0.69 | 0.97 | 1.41 | 0.90 | 1.56 |
| NPM2 | P | 0.40 | 1 | 4.52 | 6.33 | 2.50 | 3.63 | 3.82 | 5.42 |
| NPM3 | P | 29.72 | 1 | 0.98 | 1.69 | 1.06 | 1.02 | 1.87 | 1.78 |
| PADI6 | P | 0.04 | 1 | 2.84 | 4.22 | 109.69 | 7.42 | 3.04 | 0.65 |
| PARK7 | P | 1.99 | 1 | 1.10 | 0.97 | 0.88 | 0.83 | 1.14 | 0.74 |
| PLA2G4C | P | 0.58 | 1 | 0.47 | 0.08 | 0.13 | 0.05 | 0.29 | 0.34 |
| TLE6 | P | 0.36 | 1 | 3.72 | 4.93 | 6.51 | 2.76 | 1.88 | 5.12 |
| TPT1 | P | 2.64 | 1 | 1.44 | 1.27 | 1.46 | 3.40 | 2.22 | 1.86 |
| VIM | P | 3.39 | 1 | 0.91 | 0.96 | 1.11 | 0.92 | 0.83 | 0.54 |
| Housekeepers | | | | | | | | | |
| EEF1E1 | P | 35.17 | 1 | 0.76 | 0.94 | 1.91 | 1.04 | 2.59 | 0.9 |
| GAPDH | P | 0.79 | 1 | 1.51 | 1.16 | 1.52 | 1.57 | 0.73 | 1.39 |
| H2AFV; H2AFZ | P | 15.42 | 1 | 1.53 | 1.11 | 1.87 | 2.20 | 2.38 | 1.60 |
| HPRT1 | P | 6.73 | 1 | 0.88 | 0.85 | 3.30 | 6.43 | 16.4 | 1.63 |
| RPS27A | P | 0.07 | 1 | 1.05 | 0.86 | 1.05 | 1.07 | 0.54 | 1 |
| PPIA; GM5160 | P | 0.61 | 1 | 1.12 | 1.09 | 1.23 | 1.22 | 0.81 | 1.09 |
| TBP; TBPL2 | P | 32.70 | 1 | 0.18 | 0.21 | 0.40 | 0.52 | 0.96 | 0.36 |
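
The two quantities defined in the Table 2 legend, the abundance percentile in MII oocytes and the stage value relative to MII, can be illustrated with a short Python sketch. This is not the authors' pipeline. It assumes that "corrected for the mass of the protein" means dividing each protein's share of the total MS signal by its molecular mass, and that the percentile is the rank position scaled to 100. The function names and all input numbers are hypothetical.

```python
from statistics import mean


def abundance_percentiles(ms_signal, mass_kda):
    """Rank proteins by mass-corrected share of the total MS signal.

    Assumption: the mass correction is a simple division by molecular mass.
    Returns a percentile per protein; a small value means high abundance.
    """
    total = sum(ms_signal.values())
    corrected = {p: (ms_signal[p] / total) / mass_kda[p] for p in ms_signal}
    # Most abundant protein first, as in the Table 2 legend.
    ranked = sorted(corrected, key=corrected.get, reverse=True)
    n = len(ranked)
    return {p: 100.0 * (rank + 1) / n for rank, p in enumerate(ranked)}


def relative_to_mii(replicates_by_stage):
    """Average the replicates of each stage, then divide by the MII average."""
    stage_means = {s: mean(v) for s, v in replicates_by_stage.items()}
    reference = stage_means["MII"]
    return {s: m / reference for s, m in stage_means.items()}


# Hypothetical toy data for three proteins and two stages.
percentiles = abundance_percentiles(
    {"A": 300.0, "B": 100.0, "C": 600.0},
    {"A": 30.0, "B": 50.0, "C": 20.0},
)
ratios = relative_to_mii({"MII": [2.0, 4.0], "4C": [5.0, 7.0]})
```

By construction the MII column is always 1, which is why every quantified protein in Table 2 shows 1 at the MII stage.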

Table 3. Efficiency of iPSC generation after overexpression of individual reprogrammome candidates in MEFs. The number of pluripotent colonies is normalized to an OSKM control 16 days after infection. We reprogrammed MEFs containing an Oct4-GFP transgene, and MEFs containing a doxycycline-inducible OSKM cassette. In these cells reprogramming is revealed by GFP fluorescence and by alkaline phosphatase (AP) staining, which correspond to more and less advanced phases of reprogramming, respectively (94). On day 3 after infection of OSKM MEFs, the overexpression of candidates was verified by qPCR (expression is relative to that in the OSKM group, Gapdh as control, 3 replicates). On day 16, the resultant GFP-positive or AP-positive iPSC colonies were scored. The scores based on GFP (driven by the Oct4 promoter) and AP are similar but not identical, because these markers appear with different kinetics during the reprogramming process (94). Replicates = 3. Asterisks (*) mark a statistically significant difference between effect and control within the same column (t-test, p < 0.05).
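
The qPCR readout mentioned in the legend (expression relative to the OSKM group, with Gapdh as the control gene) can be sketched as follows. This is a minimal illustration that assumes the 2^(-ΔΔCt) method of ref 32 was applied in its standard form; the legend itself does not spell out the calculation. The function name and the Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values. The reference gene
    (assumed here to be Gapdh) normalizes for input amount; the control
    group (assumed here to be OSKM-only cells) defines a fold change of 1.
    """
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values, 3 replicates each.
fold = relative_expression(
    ct_target_sample=[20.0, 20.0, 20.0],   # candidate in OSKM + candidate cells
    ct_ref_sample=[18.0, 18.0, 18.0],      # Gapdh in the same cells
    ct_target_control=[25.0, 25.0, 25.0],  # candidate in OSKM-only cells
    ct_ref_control=[18.0, 18.0, 18.0],     # Gapdh in OSKM-only cells
)
```

With these toy numbers the candidate crosses threshold five cycles earlier than in the control after Gapdh normalization, which corresponds to a 32-fold higher expression.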

Figure legends

Figure 1. Expression levels of reprogrammome candidates in mouse preimplantation stages according to LC-MS/MS (MII oocyte set to 1).

Figure 2. Immunofluorescence verification of LC-MS/MS expression levels of PRMT7, DNMT3A and STELLA during mouse preimplantation stages. (A) The top horizontal series shows the DNA counterstaining with YO-PRO-1; the middle series shows PRMT7 (Rabbit-anti-human-PRMT7, sc-98882) detected with a fluorescent dye-labeled goat-anti-rabbit-IgG-647; the lower horizontal series shows the overlay of the YO-PRO-1 and PRMT7 signals with DNA in green and PRMT7 in red (GV, germinal vesicle). (A') Comparison of LC-MS/MS signals and immunofluorescence intensities of PRMT7 measured with ImageJ. (B) Immunofluorescence signal of DNMT3A (Goat-anti-human-DNMT3A, sc-10234) detected with a rabbit-anti-goat secondary antibody Alexa 405 conjugate; nuclear counterstain with DRAQ5. (B') Comparison of LC-MS/MS signals and immunofluorescence intensities of DNMT3A measured with ImageJ. (C) Immunofluorescence signal of PGC7/STELLA/DPPA3 (Rabbit-anti-mouse-STELLA, Abcam 19878) detected with a goat-anti-rabbit secondary antibody Alexa 488 conjugate; nuclear counterstain with DRAQ5. (C') Comparison of LC-MS/MS signals and immunofluorescence intensities of PGC7/STELLA/DPPA3 measured with ImageJ. Scale bar = 50 µm.

Figure 3. Generation of SOX2-substituted iPSCs is most efficient with PRMT7. (A) Rate of iPSC induction after substituting the indicated candidate factors for SOX2 (number of colonies per well ± standard deviation, 3 replicates). (B) mRNA expression of endogenous pluripotency markers OCT4, SOX2, KLF4, and NANOG as measured by qPCR in OKP7 iPS cells, ES cells and MEFs (Gapdh used as reference for qPCR; 3 replicates; error bars = standard deviations). (C) Pictures of representative OKP7 colonies with ES cell-like morphology and expression of Oct4-GFP.

Figure 4. Functional characterization of OKP7 iPSCs and control ESCs in vitro (A) and in vivo (B). A, in vitro differentiation as embryoid bodies (EBs). Two to three weeks after transfer to gelatinized dishes with specific medium for differentiation, EBs were subjected to immunofluorescence for SOX17 (endoderm) and TUJ1 (ectoderm). A1-A3, SOX17 expression. A4-A6, Hoechst counterstain. A7-A9, TUJ1 expression. A10-A12, Hoechst counterstain. Scale bar = 250 µm. B, in vivo differentiation as teratomas after subcutaneous injection into SCID mice, or as germ cells after blastocyst injection and embryo transfer. Four-week-old teratomas contained derivatives of all three germ layers, as detected by hematoxylin and eosin staining. Germ cells derived from OKP7 iPSCs were recognizable by Oct4-GFP expression in fetal gonads on developmental day 15.5. Scale bar = 100 µm. B1-B3, endoderm (ciliated epithelium); B4-B6, mesoderm (striated muscle); B7-B9, ectoderm (neural rosette); B10-B12, green-fluorescent germ cells in fetal gonads.

Graphical abstract. The reprogramming process that leads to induced pluripotent stem cells (iPSCs) may benefit from adding oocyte factors to Yamanaka's reprogramming cocktail (OCT4, SOX2, KLF4, with or without MYC; OSK(M)). We previously searched for such facilitators of reprogramming (the reprogrammome) by applying label-free LC-MS/MS analysis to mouse oocytes, producing a catalog of 28 candidates that are (i) able to robustly access the cell nucleus, and (ii) shared between mature mouse oocytes and pluripotent embryonic stem cells. In the present study we hypothesized that our 28 reprogrammome candidates would also be (iii) abundant in mature oocytes, (iv) depleted after oocyte-to-embryo transition, and (v) able to potentiate or replace the OSKM factors. Using LC-MS/MS and isotopic labeling methods, we found that the abundance profiles of the 28 proteins were below those of known oocyte-specific and housekeeping proteins. Of the 28 proteins, only arginine methyltransferase 7 (PRMT7) changed substantially during mouse embryogenesis and promoted the conversion of mouse fibroblasts into iPSCs. Specifically, PRMT7 replaced SOX2 in a factor-substitution assay, yielding iPSCs. These findings exemplify how proteomics can be used to prioritize the functional analysis of reprogrammome candidates. The LC-MS/MS data are available via ProteomeXchange with identifier PXD003093.

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 2. Immunofluorescence verification of LC-MS/MS expression levels of PRMT7, DNMT3A and STELLA during mouse preimplantation stages. (A) The top horizontal series shows the DNA counterstaining with YOPRO-1; the middle series shows PRMT7 (Rabbit-anti-human-PRMT7, sc-98882) detected with a fluorescence dye labeled goat-anti-rabbit-IgG-647; the lower horizontal series shows the overlay of the YO-PRO-1 and PRMT7 signals with DNA in green, PRMT7 in red (GV, verminal vesicle). (A’) Comparison of LC-MS/MS signals and immunofluorescence intensities of PRMT7 measured with Image-J. (B) Immunofluorescence signal of DNMT3A (Goat-anti-human-DNMT3A, sc-10234) detected with a rabbit-anti-goat secondary antibody Alexa 405 conjugate; nuclear counterstain with DRAQ5. (B’) Comparison of LC-MS/MS signals and immunofluorescence intensities of DNMT3A measured with Image-J. (C) Immunofluorescence signal of PGC7/STELLA/DPPA3 (Rabbit-anti-mouse-STELLA, Abcam 19878) detected with a goat-anti-rabbit secondary antibody Alexa 488 conjugate; nuclear counterstain with DRAQ5. (C’) Comparison of LC-MS/MS signals and immunofluorescence intensities of PGC7/STELLA/DPPA3 measured with Image-J. Scale bar = 50 µm. 176x203mm (300 x 300 DPI)

Figure 3. Generation of SOX2-substituted iPSCs is most efficient with PRMT7. (A) Rate of iPSC induction after substituting the indicated candidate factors for SOX2 (number of colonies per well ± standard deviation, 3 replicates). (B) mRNA expression of the endogenous pluripotency markers OCT4, SOX2, KLF4, and NANOG as measured by qPCR in OKP7 iPS cells, ES cells and MEFs (Gapdh used as reference for qPCR; 3 replicates; error bars = standard deviations). (C) Images of representative OKP7 colonies with ES cell-like morphology and expression of Oct4-GFP.

Figure 4. Functional characterization of OKP7 iPSCs and control ESCs in vitro (A) and in vivo (B). A, in vitro differentiation as embryoid bodies (EBs). Two to three weeks after transfer to gelatinized dishes with specific medium for differentiation, EBs were subjected to immunofluorescence for SOX17 (endoderm) and TUJ1 (ectoderm). A1-A3, SOX17 expression. A4-A6, Hoechst counterstain. A7-A9, TUJ1 expression. A10-A12, Hoechst counterstain. Scale bar = 250 µm. B, in vivo differentiation as teratomas after subcutaneous injection into SCID mice or as germ cells after blastocyst injection and embryo transfer. Four-week-old teratomas contained derivatives of all three germ layers as detected by hematoxylin and eosin staining. Germ cells derived from OKP7 iPSCs were recognizable by Oct4-GFP expression in fetal gonads on developmental day 15.5. Scale bar = 100 µm. B1-B3, endoderm (ciliated epithelium); B4-B6, mesoderm (striated muscle); B7-B9, ectoderm (neural rosette); B10-B12, green-fluorescent germ cells in fetal gonads.

Graphical abstract. The reprogramming process that leads to induced pluripotent stem cells (iPSCs) may benefit from adding oocyte factors to Yamanaka’s reprogramming cocktail (OCT4, SOX2, KLF4, with or without MYC; OSK(M)). We previously searched for such facilitators of reprogramming (the reprogrammome) by applying label-free LC-MS/MS analysis to mouse oocytes, producing a catalog of 28 candidates that are (i) able to robustly access the cell nucleus, and (ii) shared between mature mouse oocytes and pluripotent embryonic stem cells. In the present study we hypothesized that our 28 reprogrammome candidates would also be (iii) abundant in mature oocytes, (iv) depleted after oocyte-to-embryo transition, and (v) able to potentiate or replace the OSKM factors. Using LC-MS/MS and isotopic labeling methods, we found that the abundance profiles of the 28 proteins were below those of known oocyte-specific and housekeeping proteins. Of the 28 proteins, only arginine methyltransferase 7 (PRMT7) changed substantially during mouse embryogenesis and promoted the conversion of mouse fibroblasts into iPSCs. Specifically, PRMT7 replaced SOX2 in a factor-substitution assay, yielding iPSCs. These findings exemplify how proteomics can be used to prioritize the functional analysis of reprogrammome candidates. The LC-MS/MS data are available via ProteomeXchange with identifier PXD003093.
