
An up-to-date workflow for plant (phospho)proteomics identifies differential drought-responsive phosphorylation events in maize leaves
Lam Dai Vu, Elisabeth Stes, Michiel Van Bel, Hilde Nelissen, Davy Maddelein, Dirk Inzé, Frederik Coppens, Lennart Martens, Kris Gevaert, and Ive De Smet
J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.6b00348 • Publication Date (Web): 19 Sep 2016




Journal of Proteome Research

An up-to-date workflow for plant (phospho)proteomics identifies differential drought-responsive phosphorylation events in maize leaves

Lam Dai Vu (a,b,c,d,#), Elisabeth Stes (a,b,c,d,#), Michiel Van Bel (a,b), Hilde Nelissen (a,b), Davy Maddelein (c,d), Dirk Inzé (a,b), Frederik Coppens (a,b), Lennart Martens (c,d), Kris Gevaert (c,d,§), Ive De Smet (a,b,§,*)

(a) Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
(b) Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
(c) Medical Biotechnology Center, VIB, 9000 Ghent, Belgium
(d) Department of Biochemistry, Ghent University, 9000 Ghent, Belgium


ABSTRACT

Protein phosphorylation is one of the most common post-translational modifications (PTMs) and can regulate protein activity and localization, as well as protein–protein interactions, in numerous cellular processes. Phosphopeptide enrichment techniques have enabled plant researchers to gain insight into phosphorylation-controlled signaling networks in various plant species. Most phosphoproteome analyses of plant samples still involve stable isotope labeling and peptide fractionation, and demand a lot of mass spectrometry (MS) time. Here, we present a simple workflow to probe, map and catalogue plant phosphoproteomes, requiring relatively low amounts of starting material, no labeling, no fractionation, and no excessive analysis time. Following optimization of the different experimental steps on Arabidopsis thaliana samples, we transferred our workflow to maize, a major monocot crop, to study signaling upon drought stress. In addition, we included normalization to protein abundance to identify true phosphorylation changes. Overall, we identified a set of new phosphosites in both Arabidopsis thaliana and maize, some of which are differentially phosphorylated upon drought. All data are available via ProteomeXchange with identifier PXD003634; in addition, to provide easy access to our model plant and crop datasets, we created an online database, Plant PTM Viewer (bioinformatics.psb.ugent.be/webtools/ptm_viewer/), where all phosphosites identified in our study can be consulted.

Key Words: Phosphoproteomics, maize, Arabidopsis, drought stress, database


INTRODUCTION

The balanced action of protein kinases and phosphatases determines a proteome’s phosphorylation status. Protein phosphorylation may transiently modify protein properties, such as enzymatic activity, subcellular localization, protein structure and stability, and interactions with other proteins. As such, many cellular signaling processes, such as transmembrane signaling, intracellular amplification of signals and cell cycle control, occur via reversible protein phosphorylation.1 In plants, phosphorylation-mediated signaling is of central importance in various physiological processes, including hormone signaling and stress responses.2 However, only a limited number of plant kinases and phosphatases (and their targets) have been studied, and at varying levels of detail.3 Mass spectrometry (MS)-based proteomics has become an essential tool for studying protein phosphorylation and has enabled the identification of numerous phosphorylation sites on plant proteins.3 Nevertheless, studies of phosphorylation events remain challenging because of their dynamic nature and the sub-stoichiometric levels of phosphorylated proteins. Therefore, at least some level of enrichment for phosphorylation sites is needed, and this is best done at the peptide level to maximize the identification of phosphosites. The most productive approach is based on metal (ion) chelation. By exploiting the interaction between negatively charged phosphate groups and positively charged metal ions or metal oxides, immobilized metal affinity chromatography (IMAC) and metal oxide affinity chromatography (MOAC) methods, respectively, represent efficient ways to enrich phosphopeptides from complex mixtures.
Enrichment with TiO2 beads has become a routine method in plant proteomics studies in recent years.4-12 In an attempt to maximally cover phosphoproteomes, the majority of phosphoproteomics approaches make use of peptide fractionation methods, such as strong cation exchange chromatography, hydrophilic interaction chromatography or reversed-phase chromatography.7, 13-17 These, however, result in far more LC-MS/MS measurement time per sample and also require larger amounts of starting material.

With the ever-increasing number of phosphorylation sites being identified (a phosphosite has been reported for nearly every human cellular protein18), the functionality of these post-translational modifications (PTMs) is questioned. Crowdedness effects are hypothesized to give rise to non-functional transfer of a phosphate group by kinases upon encounter of a random protein.19, 20 Merely profiling phosphorylation sites will hence likely lead to the large-scale identification of non-functional PTMs. To discriminate these ‘noisy’ phosphosites from sites with regulatory significance, it is vital to design experiments in which differential conditions are compared. This requires assessing dynamics in the phosphoproteome via quantitative methods.

Methodologies for the quantitative analysis of phosphoproteomes in plants are most frequently based on stable isotope labeling, such as 15N/14N metabolic labeling of proteins during plant growth4, 11, 16, 21, 22 or post-metabolic labeling of peptides with iTRAQ.9, 12, 23, 24 As labeling imposes limitations on the number of conditions that can be monitored, label-free methods represent a practical alternative. Two label-free methods, spectral counting and precursor ion intensity-based quantification, have been applied in plant phosphoproteome strategies.8, 10, 11, 14, 25-27

However, label-free approaches often suffer from quantitative incompleteness due to stochastic data acquisition (MS/MS sequencing), leading to numerous missing values in the dataset, which can to some extent be avoided by matching data between LC-MS(/MS) runs.

Although missing in most published plant phosphoproteome studies [notwithstanding some exceptions12, 15, 16, 28-30], parallel and in-depth investigation of the overall proteome is recommended for normalization of quantitative PTM studies. To determine whether phosphopeptide changes are the result of true phosphorylation changes or rather of general abundance changes of the phosphoprotein, phosphopeptide levels need to be normalized to overall protein abundances. Ideally, such changes in overall protein levels should be derived from an analysis of non-phosphopeptides of the same sample.31

In recent years, the number of phosphoproteome and proteome studies in monocot crop species under abiotic stresses has been steadily growing.32, 33 Agricultural plants, such as maize, routinely face drought stress, which is one of the worst environmental hazards affecting crop productivity.34, 35 As plants remodel their proteome in response to stress, drought-adaptive traits are likely to be reflected at the proteome level.36 Moreover, as a universal biochemical signal in cells, protein phosphorylation controls stress responses, transmitting stress signals from the cell surface to the nucleus.32 However, parallel phosphoproteome and proteome analyses, while existing,15, 29 are not yet standardized for such monocot crop and model plant studies, and often the change in phospho-status is not corrected against the protein abundance.

Taken together, commonly used strategies in plant phosphoproteomics involve tedious labeling approaches and fractionation steps, which are time-consuming, demand expertise and negatively affect reproducibility, robustness and throughput. Here, we present a label-free quantitative workflow for quick and reproducible phosphoproteome analysis of plant tissue that requires only small sample amounts and no costly expert software for data analysis, and that integrates steps (such as normalization) that are not yet standard in plant phosphoproteomics. We applied our workflow to maize, a major monocot crop, to study signaling upon drought stress. In this case study we identified a set of new phosphosites in maize, some of which are differentially phosphorylated upon drought.

MATERIALS AND METHODS


Plant growth

Seedlings of Arabidopsis thaliana (ecotype Columbia) were grown on vertically held plates with half-strength Murashige and Skoog medium solidified with 0.8% agar at 22°C in continuous light. Four days post germination (dpg), the plants were transferred to plates containing 10 µM 1-naphthaleneacetic acid (NAA). Roots were harvested at 5 dpg. Maize plants (inbred line B104) were grown in soil in a growth chamber with controlled relative humidity (55%) and temperature (24°C), in a 16 h/8 h (day/night) cycle. Drought was induced by lowering the soil water capacity to 62.5% relative to that of the well-watered control plants. Twenty-one days after sowing, the first 4 cm of growing leaf 7 was harvested.

Protein Extraction and Tryptic Digestion

Plant material was harvested in three biological replicates. One gram of fresh-weight material was flash-frozen in liquid nitrogen and manually ground into a fine powder with a mortar and pestle. Proteins were extracted in homogenization buffer containing 50 mM Tris-HCl buffer (pH 8), 0.1 M KCl, 30% sucrose, 5 mM EDTA, and 1 mM DTT in milliQ water, to which the appropriate amounts of the Complete protease inhibitor mixture and the PhosSTOP phosphatase inhibitor mixture (both from Roche) were added. The samples were sonicated on ice and centrifuged at 4°C for 15 min at 2,500×g to remove debris. Supernatants were collected and a methanol/chloroform precipitation was carried out by adding 3, 1 and 4 volumes of methanol, chloroform and water, respectively. Samples were centrifuged for 10 min at 5,000×g, and the aqueous phase was removed. After addition of 4 volumes of methanol, the proteins were pelleted via centrifugation for 10 min at 2,500×g. Pellets were washed with 80% acetone and re-suspended in 6 M guanidinium hydrochloride in 50 mM triethylammonium bicarbonate (TEAB) buffer (pH 8). Alkylation of cysteines was carried out by adding a combination of tris(carboxyethyl)phosphine (TCEP, Pierce) and iodoacetamide (Sigma-Aldrich) to final concentrations of 15 mM and 30 mM, respectively, and the reaction was allowed to proceed for 15 min at 30°C in the dark. Before digestion, the samples were buffer-exchanged on Illustra NAP columns (GE Healthcare Life Sciences) to 50 mM TEAB buffer (pH 8) and the protein concentration was measured using the Bio-Rad Protein Assay. One milligram of protein was pre-digested with EndoLysC (Wako Chemicals) for 4 h, followed by an overnight digestion with trypsin (Promega Trypsin Gold, mass spectrometry grade), both digestions occurring at 37°C at an enzyme-to-substrate ratio of 1:100 (w:w). The digest was acidified to pH ≤ 3 with trifluoroacetic acid (TFA) and desalted with SampliQ C18 SPE cartridges (Agilent) according to the manufacturer’s guidelines. The eluates were split into two and dried in a vacuum centrifuge. One half of each sample served for proteome analysis and was re-dissolved in 30 µL of 2% (v/v) acetonitrile and 0.1% (v/v) TFA right before LC-MS/MS analysis.
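The reagent amounts implied by the ratios in this protocol can be worked out with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the "volumes" of methanol, chloroform and water are expressed relative to the supernatant volume, which the text does not state explicitly.

```python
def precipitation_volumes(sample_ml: float) -> dict:
    """Solvent volumes (mL) for a 3:1:4 methanol/chloroform/water precipitation.

    Assumption: one 'volume' equals the volume of the protein supernatant.
    """
    return {
        "methanol": 3 * sample_ml,
        "chloroform": 1 * sample_ml,
        "water": 4 * sample_ml,
    }


def protease_amount_ug(protein_mg: float, substrate_per_enzyme: float = 100.0) -> float:
    """Protease mass (µg) for an enzyme-to-substrate ratio of 1:substrate_per_enzyme (w:w)."""
    return protein_mg * 1000.0 / substrate_per_enzyme


if __name__ == "__main__":
    # 1 mg of protein at 1:100 (w:w) calls for 10 µg of each protease.
    print(protease_amount_ug(1.0))
    print(precipitation_volumes(2.0))
```

At the stated 1:100 ratio, the 1 mg protein input corresponds to 10 µg of EndoLysC and 10 µg of trypsin.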

Phosphopeptide Enrichment

The dried eluates were re-suspended in 100 µL of loading solvent (80% acetonitrile, 5% TFA) and incubated with 1 mg MagReSyn® Ti-IMAC microspheres (ReSyn Biosciences) for 20 min at room temperature. The microspheres were next washed once with wash solvent 1 (80% acetonitrile, 1% TFA, 200 mM NaCl) and twice with wash solvent 2 (80% acetonitrile, 1% TFA). The bound phosphopeptides were eluted with three volumes (80 µL) of a 1% NH4OH solution, followed immediately by acidification to pH ≤ 3 with formic acid. Prior to MS analysis, the samples were vacuum-dried and re-dissolved in 50 µL of 2% (v/v) acetonitrile and 0.1% (v/v) TFA.


Mass Spectrometry

Each sample was analyzed twice (i.e. in two technical replicates) via LC-MS/MS on an Ultimate 3000 RSLC nano LC (Thermo Fisher Scientific) connected in-line to a Q Exactive mass spectrometer (Thermo Fisher Scientific). The sample mixture was first loaded on a trapping column (made in-house, 100 µm internal diameter (I.D.) × 20 mm, 5 µm beads C18 Reprosil-HD, Dr. Maisch, Ammerbuch-Entringen, Germany). After flushing from the trapping column, the sample was loaded on an analytical column (made in-house, 75 µm I.D. × 150 mm, 3 µm beads C18 Reprosil-HD, Dr. Maisch). Peptides were loaded with loading solvent A (0.1% TFA in water) and separated with a linear gradient from 98% solvent A′ (0.1% formic acid in water) to 55% solvent B′ (0.1% formic acid in water/acetonitrile, 20/80 (v/v)) in 170 min at a flow rate of 300 nL/min. This was followed by a 5 min wash reaching 99% solvent B′. The mass spectrometer was operated in data-dependent, positive ionization mode, automatically switching between MS and MS/MS acquisition for the 10 most abundant peaks in a given MS spectrum. The source voltage was 3.4 kV, and the capillary temperature was 275°C. One MS1 scan (m/z 400-2000, AGC target 3 × 10^6 ions, maximum ion injection time 80 ms) acquired at a resolution of 70,000 (at 200 m/z) was followed by up to 10 tandem MS scans (resolution 17,500 at 200 m/z) of the most intense ions fulfilling predefined selection criteria (AGC target 5 × 10^4 ions, maximum ion injection time 60 ms, isolation window 2 Da, fixed first mass 140 m/z, spectrum data type: centroid, underfill ratio 2%, intensity threshold 1.7 × 10^4, exclusion of precursors with unassigned charge or with charge states 1, 5-8 and >8, peptide match preferred, exclude isotopes on, dynamic exclusion time 20 s). The HCD collision energy was set to 25% Normalized Collision Energy and the polydimethylcyclosiloxane background ion at 445.120025 Da was used for internal calibration (lock mass).
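The data-dependent selection criteria listed above can be illustrated with a simplified precursor picker. This is a sketch and not the instrument's firmware logic. The `Precursor` class and `select_precursors` function are hypothetical names, and matching excluded precursors by m/z rounded to two decimals is an assumption made for brevity (real instruments use a mass tolerance window). Only the charge filter, the intensity threshold, the top-10 limit and the 20 s dynamic exclusion are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

ALLOWED_CHARGES = {2, 3, 4}      # unassigned, 1+, 5-8+ and >8+ are excluded
INTENSITY_THRESHOLD = 1.7e4      # minimum precursor intensity
TOP_N = 10                       # up to 10 MS/MS scans per MS1 scan
DYNAMIC_EXCLUSION_S = 20.0       # seconds a selected precursor stays excluded


@dataclass(frozen=True)
class Precursor:
    mz: float
    charge: Optional[int]        # None means the charge state is unassigned
    intensity: float


def select_precursors(
    peaks: List[Precursor], now_s: float, exclusion: Dict[float, float]
) -> List[Precursor]:
    """Pick up to TOP_N precursors from one MS1 scan and update the exclusion list.

    `exclusion` maps a rounded m/z to the time (s) until which it is excluded.
    """
    candidates = [
        p for p in peaks
        if p.charge in ALLOWED_CHARGES
        and p.intensity >= INTENSITY_THRESHOLD
        and exclusion.get(round(p.mz, 2), float("-inf")) <= now_s
    ]
    chosen = sorted(candidates, key=lambda p: p.intensity, reverse=True)[:TOP_N]
    for p in chosen:
        exclusion[round(p.mz, 2)] = now_s + DYNAMIC_EXCLUSION_S
    return chosen
```

Calling the function on consecutive MS1 scans with a shared `exclusion` dictionary mimics how a recently fragmented precursor is skipped for 20 s, so that lower-abundance ions get a chance to be sequenced.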


Data Analysis

For the Arabidopsis samples, MS/MS spectra were searched against a UniProt database containing A. thaliana sequences (34,509 entries, version November 2014) with the MaxQuant software (version 1.5.3.8). For the maize samples, the searches were done against a Zea mays database downloaded from PLAZA Monocots 3.0 (ref. 37) (39,305 entries, version 2014) with the MaxQuant software (version 1.5.0.30). For all searches, the precursor mass tolerance was set to 20 ppm for the first search (used for nonlinear mass recalibration) and to 4.5 ppm for the main search. Trypsin was selected as the enzyme setting. Cleavages between lysine/arginine-proline residues and up to two missed cleavages were allowed. Carbamidomethylation of cysteine residues was selected as a fixed modification and oxidation of methionine residues was selected as a variable modification. For the samples enriched for phosphopeptides, phosphorylation of serine, threonine and tyrosine residues was set as an additional variable modification. The false discovery rate for peptide and protein identifications was set to 1%, and the minimum peptide length was set to 7. The minimum score threshold for both modified and unmodified peptides was set to 30. The MaxLFQ algorithm allowing label-free quantification38 and the ‘Matching Between Runs’ feature were enabled. All mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE39 partner repository with the dataset identifier PXD003634. For the quantitative maize proteome and phosphoproteome analyses, the ‘ProteinGroups’ and ‘Phospho(STY)sites’ output files, respectively, generated by the MaxQuant search were loaded into Perseus, the data analysis software available in the MaxQuant package.40 Only proteins or phosphosites that were quantified in at least two of the three biological replicates of at least one sample were retained.
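Two of the settings above lend themselves to a short worked example: converting a ppm tolerance into an absolute m/z window, and the rule that a protein or phosphosite is kept only if it is quantified in at least two of three biological replicates of at least one sample. The sketch below is not Perseus code. The function names are hypothetical, and it assumes that missing values are represented as NaN (or None).

```python
import math
from typing import Dict, List, Optional


def ppm_to_mz_window(mz: float, ppm: float) -> float:
    """Absolute tolerance (in m/z units) corresponding to a relative tolerance in ppm."""
    return mz * ppm / 1e6


def n_valid(values: List[Optional[float]]) -> int:
    """Number of replicates with a usable (non-missing) intensity."""
    return sum(1 for v in values if v is not None and not math.isnan(v))


def keep_feature(groups: Dict[str, List[Optional[float]]], min_valid: int = 2) -> bool:
    """True if at least one condition has `min_valid` or more quantified replicates.

    `groups` maps a condition name (e.g. 'control', 'drought') to its
    replicate intensities for a single protein group or phosphosite.
    """
    return any(n_valid(values) >= min_valid for values in groups.values())
```

Under this rule a site that is seen in two control replicates and never under drought is retained, whereas a site seen once in each condition is discarded.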
Log2-transformed protein LFQ intensities or phosphosite intensities were centered by subtracting the median of the entire set of protein/phosphosite intensities per sample. A two-sample test with p […] 0.700 for highly confident interactions (STRING protein-protein interaction prediction is based on data available for genomic homology, gene fusion, occurrence in the same metabolic pathways, co-expression, experiments, databases and text mining. A combined score is calculated from the scores of all the methods that were used for the protein-protein interaction prediction; the higher the score, the more confident the interaction). The results were visualized using the Cytoscape package.
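The centering step at the start of this passage (log2 transformation followed by subtraction of the per-sample median) can be sketched as follows. This is a minimal illustration, not the Perseus implementation. The function name is hypothetical, and treating zero or None intensities as missing values is an assumption.

```python
import math
from statistics import median
from typing import List, Optional


def log2_median_center(intensities: List[Optional[float]]) -> List[float]:
    """Log2-transform one sample's intensities and subtract the sample median.

    Zero or None intensities are treated as missing (NaN) and are ignored
    when the median is computed.
    """
    logged = [math.log2(v) if v else float("nan") for v in intensities]
    sample_median = median(x for x in logged if not math.isnan(x))
    return [x - sample_median for x in logged]


if __name__ == "__main__":
    # Raw intensities 2, 8, 32 have log2 values 1, 3, 5 (median 3).
    print(log2_median_center([2.0, 8.0, 32.0]))
```

After centering, every sample has a median of zero on the log2 scale, which makes intensities comparable across runs that differ in overall loading or signal.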

PubMed search

A PubMed search (www.ncbi.nlm.nih.gov/pubmed/) was performed on 21-22 June 2016 using the queries ‘maize proteomics 2016’, ‘wheat proteomics 2016’ or ‘proteomics arabidopsis 2016’ to identify relevant papers. Only research papers (no reviews or opinions) with the correct focus (namely those that actually investigated the indicated species, that were available, and that actually applied proteomics) were retained.


RESULTS AND DISCUSSION

Optimized quantitative workflow for proteomics and phosphoproteomics in plants

To facilitate efficient proteome analyses of plants, we developed a simple workflow that maximizes the coverage and reproducibility of protein and phosphorylation site quantification in single LC-MS/MS runs, thus without requiring peptide fractionation steps (Figure 1). To optimize the pipeline, we used the fully sequenced and well-annotated model plant Arabidopsis thaliana.

First, we provide a brief overview of the key steps in the protocol and the improvements that were introduced step by step to robustly survey plant proteomes and phosphoproteomes (more details can be found in the Materials and Methods). To reproducibly capture a comprehensive spectrum of proteins, we opted for a protein precipitation approach. Proteins from ground plant material were extracted with a sucrose buffer containing protease and phosphatase inhibitors. The extract was subsequently purified through a chloroform/methanol precipitation step to eliminate compounds (e.g. polysaccharides, phenolic compounds, lipids and secondary metabolites) that may hinder preparation and analysis of proteome samples.45 The pelleted proteins were reconstituted in a buffer containing guanidinium hydrochloride. Cysteine disulfide bonds were reduced with tris(2-carboxyethyl)phosphine hydrochloride (TCEP-HCl), allowing the alkylation reaction with iodoacetamide to take place simultaneously.46 Next, we pre-digested the proteins with endoproteinase-LysC, followed by a full digestion with trypsin. The pre-digestion step was previously shown to substantially improve the proteolytic efficiency of trypsin.47 The resulting peptides were desalted and split into two. One part was used for the proteome analysis, leaving 500 µg of digest material as input for phosphopeptide enrichment. We opted for a Ti4+-IMAC-based method, as it was found to perform extremely well in terms of reproducibility and provides even greater selectivity and sensitivity than the more commonly used TiO2 chromatography.48, 49 Both the proteome and phosphoproteome samples were analyzed with 3-hour gradients on a quadrupole Orbitrap instrument [Q Exactive50].

Second, peptide identification and quantification are important steps following LC-MS/MS analysis. We chose a label-free quantitation approach over a labeling approach, as it is cost-effective, does not restrict the number of samples that can be compared, and can span several orders of magnitude of protein concentrations.38 In most label-free studies of plant phosphoproteomes, the raw data are analyzed by a combination of expensive expert peptide identification software, like Proteome Discoverer or Mascot, and in-house developed algorithms to facilitate label-free quantification10, 11, 26, 27 (Supplementary Table S1). Here, peptide identification was carried out with the freely available and easy-to-use software package MaxQuant.51 Label-free quantitation was carried out simultaneously by MaxQuant in an ion intensity-based manner.38 The missing value issue, due to stochastic peptide sequencing inherent to mass spectrometry, was tackled by using the “match between runs” feature in MaxQuant, which can transfer MS/MS identifications between measurements based on a peptide retention time correlation approach.38

Taken together, compared to published methods for (phospho)proteomics in plants,12, 14-16, 23 we reduced the number of sample preparation steps, the MS time and the data analysis complexity, owing to the absence of labeling, gel-based steps and pre-fractionation steps and to the introduction of MaxQuant in our workflow (Figure 1). The latter facilitates and standardizes data analysis, but does not yet seem to be routinely integrated in plant proteomics (Supplementary Table S1). Further, for differential studies it is also important to correct the phosphosite intensities against the protein abundance to pinpoint protein level-independent phosphorylation events (Figure 1).
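The correction of phosphosite intensities against protein abundance can be pictured as a subtraction of log2 fold changes. The sketch below is one possible reading and not necessarily the exact arithmetic used in this study. The function name and the identifiers in the example are hypothetical, subtraction on the log2 scale is an assumption, and skipping sites whose parent protein was not quantified is also an assumption.

```python
from typing import Dict, Tuple

SiteKey = Tuple[str, str]  # (protein identifier, phosphosite label)


def correct_for_protein_abundance(
    site_log2fc: Dict[SiteKey, float],
    protein_log2fc: Dict[str, float],
) -> Dict[SiteKey, float]:
    """Subtract the protein-level log2 fold change from each phosphosite's log2 fold change.

    Sites whose parent protein has no fold change are left out of the result.
    """
    corrected = {}
    for (protein, site), fold_change in site_log2fc.items():
        if protein in protein_log2fc:
            corrected[(protein, site)] = fold_change - protein_log2fc[protein]
    return corrected


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    sites = {("protA", "S12"): 1.0, ("protB", "T45"): 1.5, ("protC", "S3"): 2.0}
    proteins = {"protA": 1.0, "protB": 0.0}
    print(correct_for_protein_abundance(sites, proteins))
```

In the example, the site on protA doubles only because the protein itself doubles, so its corrected change is zero, whereas the site on protB retains its full change because the protein level is stable.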


Validating the optimized workflow on Arabidopsis thaliana roots

The proteome analysis of the non-enriched samples amounted to a total of 18 hours of MS time, leading to the cumulative identification of 34,216 unique peptides that could be mapped to 4,903 protein groups (Supplementary Table S2). The latter can be defined as protein entries distinguishable on the basis of identified peptides.52 Via the MaxLFQ algorithm, 4,847 of those could be quantified in at least one biological replicate and 2,992 in all replicates. Because label-free methods depend strongly on consistency between replicates, reproducibility of the chromatographic separation must be very high. The data from the replicate experiments show high quantitative reproducibility, with an average Pearson correlation of 0.978 (Supplementary Figure S1A).

A common challenge for plant proteomics studies is the difficulty of isolating proteins from the different subcellular organelles with sufficient efficiency. Membrane proteins represent an additional hurdle, as their large size and hydrophobicity render them difficult to isolate. To obtain a wide-ranging snapshot of cellular signaling processes, it is vital to capture proteins not only from the cytosol, but also from membranes and organelles. GO analysis shows that the applied protocol extracted proteins from the cytosol, nucleus, plasma membrane and other organelles (Figure 2 and Supplementary Table S3). This indicates that our approach is not limited by these experimental difficulties and recovers proteins from the different subcellular membranes and organelles.

Next, we monitored the phosphorylation events in the Arabidopsis samples. The six LC-MS/MS runs of the Ti4+-IMAC-enriched samples resulted in the identification of 1,051 unique phosphopeptides, corresponding to 1,331 phosphosites on 706 protein groups (Supplementary Table S4). The vast majority of these sites occurred on serine and threonine residues (90.3% and 9.1%, respectively), whereas phosphotyrosines accounted for less than 1% of the identified sites, which is in agreement with other reports.9, 13 Accurate site localization (probability > 0.75) was achieved for 799 of these phosphosites on 552 proteins. Of the 1,331 unique phosphosites, we could accurately quantify 1,022 in at least one biological replicate and 711 in all biological replicates. To evaluate the quality of the experiment, we assessed the correlation of all phosphopeptide intensities between the three biological replicates. An average Pearson correlation of 0.818 illustrates the high reproducibility of the phosphopeptide enrichment strategy (Supplementary Figure S1B). All the identified Arabidopsis phosphosites were searched against the PhosPhAt 4.0 full dataset of experimentally identified phosphosites.41 This resulted in 169 phosphosites (13% of the dataset) uniquely identified in our study (Supplementary Table S5). In summary, we have experimental evidence that our workflow successfully detects a large portion of the (phospho)proteome and can thus be applied to understanding biological processes.
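The reproducibility figures quoted above are averages of Pearson correlations between replicate intensity profiles. A minimal sketch of such a calculation follows. The function names are hypothetical, and it assumes that each replicate is a list of log-transformed intensities in the same feature order, with NaN marking missing values that are dropped pairwise.

```python
import math
from itertools import combinations
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation over the positions where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_x * var_y)


def mean_pairwise_pearson(replicates: List[List[float]]) -> float:
    """Average Pearson correlation over all pairs of replicates."""
    correlations = [pearson(a, b) for a, b in combinations(replicates, 2)]
    return sum(correlations) / len(correlations)
```

With three biological replicates, `mean_pairwise_pearson` averages the three pairwise coefficients, which is one way to arrive at a single summary value per dataset.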

Applying the (phospho)proteomics workflow to maize leaves under drought stress

Following the validation of our pipeline in Arabidopsis roots, we applied our workflow to a monocot crop under stress. Given the importance of drought-related research,32, 53-55 we profiled the proteome and phosphoproteome of maize leaves subjected to drought stress. Since the growth zone of the maize leaf determines to a great extent the final leaf length56 and drought affects cell division and cell expansion in the growth zone of the maize leaf,57 we harvested the growth zone of the growing leaf 7 of 21-day-old plants. Drought treatment was applied by withholding irrigation from sowing; once the soil water content reached 62.5% of that of the well-watered controls, the plants were maintained at the respective watering regime by daily watering. At the moment leaf 7 appeared, the effects of the drought were quantified by measuring the final leaf length of the youngest fully grown leaf, leaf 4. This leaf showed a significant length reduction compared to the control (data not shown), supporting that the applied drought affected leaf growth. Proteome and phosphoproteome data were obtained for growth zones of maize leaves as described above, further emphasizing the importance of moving away from gel-based approaches, also in maize and wheat, where this is not yet standard (Supplementary Table S1). All biological samples were analyzed twice by nanoLC-MS/MS using three-hour gradients.

Changes in abundance of stress regulators under drought revealed through proteome analysis

In the non-enriched samples, a total of 22,093 peptides were identified, originating from 4,409 protein groups. Of these, 4,361 protein groups could be accurately quantified, 2,299 of them in at least two of the three biological replicates (Supplementary Table S6). The data from the replicate experiments show quantitative reproducibility with an average Pearson correlation of 0.856 and 0.892 for control and drought samples, respectively (Supplementary Figure S2). Statistical testing (p […] 0.7 (Figure 4). The network is approximately centered on DNA topoisomerase II (GRMZM2G021270/PLAZA identifier ZM05G37510). From the resulting network, we identified different groups of interactions between proteins involved in different cellular processes. These included the categories DNA/chromatin organization, photosynthesis and glucose metabolism, of which the corresponding GO terms were enriched in the dataset.

Photosynthesis is very sensitive to environmental stresses.74, 75 Unexpectedly, while abiotic stresses are known to reduce the photosynthesis rate,75 our data exhibited an upregulation of proteins belonging to the photosynthetic apparatus. This apparent contradiction is in agreement with recent microarray data that showed an upregulation of photosynthetic transcripts.76 While in mature leaves the downregulation or unchanged transcription possibly helps to maintain the fully developed photosynthetic machinery, in leaves developing under drought the increase in photosynthetic proteins might play a role in resuming photosynthetic capacity in the later recovery phase.76

Further, we identified a small cluster of seven proteins involved in protein folding. Four of them, a member of the heat shock protein HSP70 family (GRMZM2G415007/PLAZA identifier ZM04G41380) and three chaperone proteins belonging to the Clp protease family (GRMZM2G110023/PLAZA identifier ZM01G09650; GRMZM2G123922/PLAZA identifier ZM10G15640; GRMZM2G162968/PLAZA identifier ZM09G19730), are upregulated upon drought. It is known that the control of protein folding state is crucial for the survival of plants during abiotic stress.77 The chaperonin subunit alpha 60 (AC215201.3_FG005/PLAZA identifier ZM06G23100), involved in the structural assembly of chloroplasts, was shown to be transcriptionally stimulated by various abiotic stresses,78 and is here also induced at the protein level by drought stress. All in all, our data greatly overlap with the observations in other studies, including transcriptional findings that can now be confirmed at the protein level.

Analysis of the maize leaf phosphoproteome under drought stress

Ti4+ IMAC enrichment of maize phosphopeptides led to the detection of a total of 980 unique phosphosites on 686 phosphopeptides, which could be mapped on 536 phosphoproteins (Supplementary Table S10). The data from the replicate experiments show quantitative reproducibility, with an average Pearson correlation of 0.887 and 0.856 for control and drought samples, respectively (Supplementary Figure S3). The number of identified phosphoproteins lies within the range reported in recent maize phosphoproteomics studies: 282,23 858,9 2,852,14 and 3,557 phosphoproteins.17 It is important to note that the latter two studies fractionated the enriched phosphopeptides via extensive SCX chromatography, thereby greatly increasing the MS analysis time per sample to two days.14, 17 In our work, six hours of analysis time were used per sample (2 technical replicates), which still yielded a relatively high number of phosphoprotein identifications. However, the number of identifications also depends on the experimental design, such as the tissue specificity or the number and efficiency of treatment conditions, which explains the large range of identifications across the studies mentioned above. Overall, the majority (97.4%) of the phosphopeptides were mono-phosphorylated, while around 2.6% carried two phosphorylated residues. Of the identified peptides, 84.0% contained phosphoserine, 15.2% phosphothreonine and 0.8% phosphotyrosine, a distribution similar to that in other maize phosphoproteomics studies.9, 14, 17, 23
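The replicate reproducibility quoted above can be illustrated with a minimal sketch of an average pairwise Pearson correlation. The sketch assumes that intensities are log2-transformed and that only features quantified in both replicates of a pair are compared; the source does not state these details. The function names and the example intensities are hypothetical and are not taken from the study's data or software.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def mean_pairwise_pearson(replicates):
    """Average Pearson r over all replicate pairs.

    `replicates` maps a replicate name to {feature id: positive raw intensity}.
    Assumption: intensities are log2-transformed, and each pair is compared
    only on the features quantified in both replicates.
    """
    if len(replicates) < 2:
        raise ValueError("need at least two replicates")
    rs = []
    for a, b in combinations(replicates.values(), 2):
        shared = sorted(set(a) & set(b))
        xs = [math.log2(a[k]) for k in shared]
        ys = [math.log2(b[k]) for k in shared]
        rs.append(pearson(xs, ys))
    return sum(rs) / len(rs)


# Made-up intensities for three hypothetical control replicates.
control = {
    "rep1": {"P1": 1.0e6, "P2": 4.0e6, "P3": 2.5e5, "P4": 8.0e5},
    "rep2": {"P1": 2.0e6, "P2": 8.0e6, "P3": 5.0e5},
    "rep3": {"P1": 1.2e6, "P2": 3.5e6, "P3": 3.0e5, "P4": 7.0e5},
}
print(round(mean_pairwise_pearson(control), 3))
```

Restricting each pair to shared features avoids imputing missing values, at the cost of basing each correlation on a different subset of features.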

All the identified maize phosphosites were searched against the retrieved set of phosphosites identified in maize seed from the Atlas of Maize Proteotypes and a dataset of phosphosites garnered from different developmental zones of maize leaves.17, 42 This resulted in 359 phosphosites (37% of the dataset) uniquely identified in our study (Supplementary Table S11). Overrepresentation of amino acid motifs surrounding the identified phosphosites was analyzed using Motif-X (Table 1). Phosphorylated tyrosine sites were excluded from the analysis due to their low abundance in the dataset. Similar to other studies in Arabidopsis and other monocots,79-82 [sP] is the most enriched motif in the S-phosphorylation dataset, as is its phosphorylated threonine counterpart [tP] in the T-phosphorylation dataset. Further, 20 peptides are enriched with the proline-rich motif [sxSP]. Peptides containing the proline-directed [sP] and [tP] motifs are suggested to be substrates for MAP kinases (MAPK), sucrose non-fermenting 1-related protein kinase 2 (SnRK2), receptor-like kinases (RLKs), the AGC family protein kinases PKA, PKG and PKC, cyclin-dependent kinases (CDKs), calcium-dependent protein kinases (CDPKs) and STE20-like kinases (SLKs).79 Only one common acidic motif, [sDxE], resulted from the analysis; it belongs to 22 peptides that might be substrates for casein kinase II (CKII) and CDPKs. Further, three basophilic motifs are overrepresented in the dataset: [Rxxs] and its subtype [RSxs], which are recognized by MAPK kinases (MKKs), and [Kxxs], which is targeted by PKA and PKC. No specific protein kinases were found for the T-phosphorylation motif [tS].
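To make the motif notation concrete, the following minimal sketch checks which of the motifs named above occur around a given phosphosite. It is not the Motif-X algorithm, which tests for statistical overrepresentation against a background; it only performs pattern matching. The sketch assumes the notation used here: a lowercase s or t marks the phosphorylated residue, x matches any residue, and uppercase letters are fixed residues. The function names and example sequence windows are hypothetical.

```python
# Motifs reported in the text; lowercase s/t = phosphorylated residue, x = any residue.
MOTIFS = ["sP", "tP", "sxSP", "sDxE", "Rxxs", "RSxs", "Kxxs", "tS"]


def matches_motif(window, center, motif):
    """Return True if `motif` fits `window` with the phosphosite at index `center`.

    `window` is an uppercase amino acid sequence surrounding the site.
    """
    # Position of the phosphorylated residue inside the motif string.
    site = next(i for i, ch in enumerate(motif) if ch in "sty")
    for j, ch in enumerate(motif):
        pos = center + (j - site)
        if pos < 0 or pos >= len(window):
            return False  # motif extends beyond the available sequence
        if ch == "x":
            continue  # wildcard position
        if window[pos].upper() != ch.upper():
            return False
    return True


def classify_site(window, center):
    """List every motif from MOTIFS that matches the phosphosite."""
    return [m for m in MOTIFS if matches_motif(window, center, m)]


# Hypothetical 13-residue windows with the phosphosite at index 6.
print(classify_site("GSARSLSPEKDDE", 6))
print(classify_site("AAAAAATSAAAAA", 6))
```

A single site can match several motifs at once, for example a proline-directed and a basophilic motif, so the function returns a list rather than a single label.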

Differential analysis of phosphorylation sites after correcting against the protein level

Earlier differential phosphoproteomics studies of maize leaf tissue, identifying differences between stress conditions, lack normalization to the protein abundance.9, 15, 23 Here, because of our extensive analysis, we can simultaneously take into account protein and phosphorylation site profiles. In total, 615 phosphosites on 445 phosphoproteins were quantified, of which 536 were quantified in at least two biological replicates of one condition. Because differences in protein levels can influence the outcome of the differential phosphorylation data, we set out to normalize the intensities of the phosphopeptides to the protein intensities. For 224 phosphosites, matching proteins were quantified in the proteome experiment, allowing normalization. This result demonstrated that phosphopeptide enrichment facilitated the identification of low-abundance proteins, of which the non-phosphorylated peptides were likely missed in the proteome scans due to different dynamic ranges and crowdedness. A two-sample test (p