Technical Note

Site-specific Identification of Lysine Acetylation Stoichiometries in Mammalian Cells Tong Zhou, Ying-hua Chung, Jianji Chen, and Yue Chen J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.5b01097 • Publication Date (Web): 03 Feb 2016 Downloaded from http://pubs.acs.org on February 12, 2016

Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Journal of Proteome Research

Site-specific Identification of Lysine Acetylation Stoichiometries in Mammalian Cells Tong Zhou1, 2, Ying-hua Chung1, 2, Jianji Chen1, Yue Chen1 *

1. Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota at Twin Cities, Minneapolis, MN 55455, USA

2. These authors contributed equally to this work

*Correspondence: Dr. Yue Chen (Email: [email protected], Phone: 1-612-626-3340)

Running Title: Acetylation stoichiometry analysis in mammalian cells

Key words: Lysine Acetylation, Stoichiometry, Quantitative Proteomics

1 ACS Paragon Plus Environment

Abstract

Functional characterization of the lysine acetylation pathway requires quantitative measurement of the modification abundance at the stoichiometry level. Here, we developed a systematic workflow for global, untargeted identification of site-specific Lys acetylation stoichiometries in mammalian cells. Our strategy includes an optimized protocol for in vitro chemical labeling of unmodified lysine with stable isotope-encoded acetyl-NHS ester, deep proteomic profiling with a high-resolution mass spectrometer, and a new software tool for quantitative analysis and stoichiometry determination. The workflow was validated using in vitro chemically labeled BSA and synthetic peptides with multiple Lys acetylation at various positions. In a proof-of-concept study, we applied the strategy to analyze the proteome of HeLa cells and determined the stoichiometries of over 600 acetylation sites with good reproducibility. Sodium butyrate treatment induced a significant increase in acetylation stoichiometries in HeLa cells. Analysis of site-specific stoichiometry dynamics revealed the co-regulation of closely positioned acetylation sites on histones H3 and H4 upon treatment.

Introduction

Posttranslational modification (PTM) of proteins represents a dynamic and versatile means to regulate diverse functional processes in the cell. Understanding the biological significance of these PTM pathways in cellular physiology and disease progression is one of the major goals of biochemical studies in the postgenomic era. To date, over 200 different types of protein modifications have been identified, which form a highly complex network of regulatory pathways 1. Lysine (Lys), in particular, has been identified as the target of a number of acylation modifications, largely due to its electron-rich and nucleophilic side chain 2-9. As the first discovered lysine PTM, Lys acetylation is an abundant and highly regulated protein modification that is well known for its role in the epigenetic histone code and transcription regulation 10. Recent development of immunoaffinity-based proteomics technology has enabled global characterization of Lys acetylation substrates and connected Lys acetylation pathways with metabolic and signaling regulatory functions in addition to DNA-templated processes 11.

The cellular abundance of Lys acetylation is determined by a number of factors, including the expression of the regulatory enzymes and the availability of enzyme co-factors such as acetyl-CoA or NAD+. It has been suggested that the abundance of Lys acetylation is sensitive to metabolic conditions such as fed and fasted states and that the modification is involved in the regulation of cellular metabolic flux 12, 13. The dynamics of Lys acetylation are exemplified by the epigenetic regulation of histone lysine ε-amine acetylation for transcriptional control and epigenetic memory 14, 15. Recent studies also suggested that Lys acetylation is dynamically regulated during the cell cycle, signaling pathway activation and proteasome-mediated protein degradation 16-18.

System-wide study of Lys acetylation dynamics has been achieved through the combination of immunoaffinity enrichment of Lys acetylation substrates and quantitative proteomics strategies such as SILAC and isobaric tagging 19, 20. These studies took advantage of stable isotope labeling-based approaches for relative quantification of PTM abundance in different cell populations to identify dynamically regulated Lys acetylation substrates in response to certain treatments or environmental conditions. Although very powerful, relative quantification of modification sites has a potential issue: it lacks a control for the total protein abundance in the different conditions. In addition, relative quantification may not reflect the real physiological significance of a change in PTM abundance. For example, an eight-fold change in the relative abundance of Lys acetylation may be due to an increase of the acetylation fractional abundance (the stoichiometry) of the substrate from 0.1% to 0.8%, or from 10% to 80%. Clearly, these two changes may have different functional significance. Analysis of PTM stoichiometry dynamics, instead of relative PTM quantification, may allow more accurate identification of critical Lys acetylation sites whose abundance changes are more physiologically important.

Stoichiometry analysis of posttranslational modifications is a challenging analytical task, largely due to the differential ionization efficiency of the modified and unmodified peptides in the mass spectrometer. In phosphoproteomic analysis, two strategies have been developed for global stoichiometry analysis. The first strategy applied an elegant mathematical model and calculated stoichiometries based on the SILAC ratios of modified peptides, unmodified peptides and proteins 21. This strategy has been further modified to identify the stoichiometries of Lys succinylation substrates 22.
The second strategy applied an efficient de-phosphorylation method and used SILAC ratios of non-phosphorylated peptides with and without de-phosphorylation to calculate site-specific stoichiometry 23. Both approaches are capable of large-scale, system-wide stoichiometry identification but are not capable of analyzing multiply phosphorylated peptides in which each site is differentially phosphorylated.

Lys acetylation stoichiometry analysis was initially achieved in acetylation studies of purified proteins such as histones, using heavy isotope-coded acetic anhydride to chemically label unmodified Lys 24-27. Such an isotope-balanced approach to PTM analysis equalizes the ionization efficiency between unmodified and modified Lys and therefore allows accurate quantitative calculation of stoichiometry based on the peptide ion intensities. The approach has allowed confident identification of Lys acetylation stoichiometries on histone and non-histone peptides using electrospray or MALDI ionization followed by targeted, multi-stage gas-phase fragmentation. Recently, two studies expanded this approach to proteome-wide identification of Lys acetylation stoichiometries. The first study quantified the stoichiometry based on the acetyllysine immonium ion intensities from heavy and light acetylated lysine in peptide fragmentation, and the other calculated Lys acetylation stoichiometries of E. coli proteins based on the precursor ion intensities of both heavy and light acetylated peptides 28, 29. Another recent study applied partial chemical modification using acetyl phosphate and SILAC-based quantification to estimate lysine acetylation stoichiometries in yeast 30. These approaches are capable of system-wide stoichiometry identification but, similar to the phosphorylation stoichiometry approaches, they are limited to quantitative stoichiometry analysis of singly modified peptides.

We have developed an improved workflow for efficient, system-wide Lys acetylation stoichiometry analysis that allows site-specific, untargeted stoichiometry identification for both singly and multiply acetylated peptides.
Our strategy includes an efficient chemical labeling protocol and in-house developed software for accurate and reliable identification of Lys acetylation stoichiometry. We applied our strategy to unbiased, system-wide analysis of Lys acetylation stoichiometries in HeLa cells and studied the dynamics of Lys acetylation stoichiometries under HDAC inhibitor treatment.

Material and Methods

Materials

Peptides were synthesized by GL Biochem (Shanghai, China). Modified sequencing-grade trypsin was purchased from Promega (Madison, WI). OMIX tips (C18, 100 µl) were purchased from Agilent Technologies. C18 and cation exchange cartridges were purchased from 3M (Saint Paul, MN). NMR spectra were recorded in CDCl3 (99.8 atom % D, Acros) on a Bruker 700 MHz spectrometer. Other reagents were purchased from the following vendors: sodium acetate-13C2d3 (99 atom % 13C, 99 atom % D, Aldrich), acetic acid (Fisher), N-(3-dimethylaminopropyl)-N’-ethylcarbodiimide hydrochloride (Sigma), N-hydroxysuccinimide (Alfa Aesar), tetrahydrofuran (Fisher), petroleum ether (Fisher), acetone (Fisher), sodium butyrate (Acros), nicotinamide (Sigma), tris(2-carboxyethyl)phosphine, 2-iodoacetamide (Thermo), urea (Thermo), sodium hydroxide (Sigma), trifluoroacetic acid (Sigma), acetonitrile (Fisher), acetic anhydride (Fisher), ammonium acetate (Acros), ammonium bicarbonate (Acros), ammonium hydroxide (Fisher).

Synthesis of Acetyl-NHS

Acetic acid (1 equiv) and N-(3-dimethylaminopropyl)-N’-ethylcarbodiimide hydrochloride (1.2 equiv) were dissolved in anhydrous tetrahydrofuran by stirring at room temperature for 2 hours. N-Hydroxysuccinimide (1.2 equiv) was also dissolved in anhydrous tetrahydrofuran and added dropwise to the reaction solution. The resulting mixture was stirred at room temperature for 16 hours. TLC (thin layer chromatography) was used to monitor the reaction process. Upon completion (87.07% yield), the sample was evaporated to dryness on a rotavap and then purified by silica gel column chromatography, eluting with a petroleum ether/acetone gradient. The structure of the product was confirmed as 2,5-dioxopyrrolidin-1-yl acetate (Ac-NHS) by ESI TOF ([M+H]+ 158.0513, calculated for C6H8NO4+ 158.0514), 1H NMR (700 MHz, CDCl3, δ 2.71 (4H, s), 2.25 (3H, s)) and 13C NMR (175 MHz, CDCl3, δ 169.4, 165.7, 25.5, 17.4). 2,5-Dioxopyrrolidin-1-yl acetate-13C2d3 (Ah-NHS) was also synthesized from sodium acetate-13C2d3 (99 atom % 13C, 99 atom % D) using the same procedure.
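As a rough check on the isotope coding, the mass added to an amine by the light acetyl group (Ac) and by the 13C2, D3 heavy acetyl group (Ah) can be computed from monoisotopic atomic masses. The sketch below is illustrative only: the function and dictionary names are hypothetical, and the atomic masses are standard reference values rounded to seven decimals.

```python
# Monoisotopic masses (Da) of the isotopes involved; standard reference values.
MASS = {
    "C12": 12.0,
    "C13": 13.0033548,
    "H1": 1.0078250,
    "D": 2.0141018,
    "O16": 15.9949146,
}


def acetyl_shift(heavy: bool) -> float:
    """Net mass added when an amine N-H becomes N-CO-CH3 (or N-13CO-13CD3).

    Acetylation adds C2H3O and removes one hydrogen from the amine.
    """
    carbon = MASS["C13"] if heavy else MASS["C12"]
    methyl_h = MASS["D"] if heavy else MASS["H1"]
    return 2 * carbon + 3 * methyl_h + MASS["O16"] - MASS["H1"]


light = acetyl_shift(heavy=False)
heavy = acetyl_shift(heavy=True)
print(f"Ac: +{light:.4f} Da, Ah: +{heavy:.4f} Da, difference: {heavy - light:.4f} Da")
```

The two values should agree with the +42.0106 Da and +47.0361 Da modification masses configured for the database search, so each light-for-heavy exchange shifts a peptide by about 5.0255 Da.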

Cell culture and lysis

HeLa cells were maintained in 10 cm dishes in DMEM media with 10% (v/v) FBS. Six hours after passaging, the culture media was replaced with fresh media containing either blank vehicle or 1 mM sodium butyrate; the cells were washed twice with PBS during the media change. Cells were cultured for an additional 16 hours. Upon harvesting, the cells were washed with cold PBS containing 5 mM sodium butyrate, then lysed in 8 M urea with sonication. Protein concentration was determined with the Bradford method. Cysteine reduction and alkylation were performed using tris(2-carboxyethyl)phosphine and 2-iodoacetamide at room temperature in the dark for 30 min.

Chemical labeling with acetyl-NHS and acetic anhydride

A typical Ac-NHS labeling was performed in a PBS solution (pH 7.5) with 8 M urea and 5-10% (v/v) acetonitrile. In detail, 1 µl 10× PBS was added to 10 µl protein (2 µg/µl) in 8 M urea, and the pH was adjusted to 7.5 with 0.4 M sodium hydroxide. Acetyl-NHS was dissolved in acetonitrile at a concentration of 100 mg/ml and 0.5 µl of this solution was added to the protein solution. The sample was kept on a shaker at room temperature for 30 min; a second 0.5 µl of acetyl-NHS solution was added and the sample was kept on the shaker for another 30 min. The reaction was quenched with 0.8 µl 5% (v/v) hydroxylamine at room temperature for 15 min.

A typical AcOAc (acetic anhydride) labeling was done in 0.5 M NH4OAc and 50 mM NH4HCO3 with 8 M urea following a previously published protocol 29. An equal volume of 1 M NH4OAc, 100 mM NH4HCO3 (pH 8.0) was added to 10 µg protein (2 µg/µl) in 8 M urea. 0.2 µl AcOAc was added and the mixture was vortexed at 4 °C for 20 min. The pH was adjusted back to 8.0 with ammonium hydroxide. A second 0.2 µl AcOAc was added and the solution was vortexed at 4 °C for 20 min. The pH was again adjusted back to 8.0 with ammonium hydroxide. Then a third 0.2 µl AcOAc was added and the mixture was vortexed at 4 °C for 20 min.

Synthetic peptides and BSA were labeled in PBS with the Ac-NHS method. The labeling efficiency was determined by spectral counting, as the ratio between the number of spectra identified from peptides with a given variable modification and the total spectrum count in the same sample. Statistical significance analysis was conducted with Student’s t-tests using SAS statistical software 9.3 (SAS Institute Inc.).

Proteolytic digestion and LCMS analysis

The reaction mixture was diluted 10 times with PBS and adjusted to pH 8.0. Then the proteins were digested sequentially with 1:50 trypsin overnight and 1:100 trypsin for 2 hours at room temperature. The digestion was quenched with 1% (v/v) TFA (final concentration). Each sample was desalted with an OMIX C18 tip (Agilent Technologies), dried in a Speed-Vac (Thermofisher Inc.) and then subjected to SCX fractionation as previously described 31. Briefly, the peptides were reconstituted in 5% ACN, 0.1% TFA in water and loaded onto a stage-tip packed with strong cation exchange extraction disk (3M Corp).
The peptides were sequentially eluted with the following six buffers. Fraction 1: 0.1% FA, 20% ACN, 50 mM NH4OAc; fraction 2: 0.1% FA, 20% ACN, 75 mM NH4OAc; fraction 3: 0.1% FA, 20% ACN, 125 mM NH4OAc; fraction 4: 0.1% FA, 20% ACN, 200 mM NH4OAc; fraction 5: 0.1% FA, 20% ACN, 300 mM NH4OAc; fraction 6: 0.1% FA, 20% ACN, 500 mM NH4OAc (all percentage concentrations are v/v).

Peptides were analyzed on a high-resolution Orbitrap Fusion mass spectrometer coupled online with a Proxeon Easy nLC 1000 UPLC (Thermofisher Inc.). An in-house packed 25 cm × 75 µm C18 column (ReproSil-Pur Basic C18, 2.5 µm, Dr. Maisch GmbH) was used for chromatographic separation. The gradient was 5% (v/v) to 30% (v/v) acetonitrile (0.1% formic acid, v/v) over 95 min followed by 25 min of washout and re-equilibration. Full mass spectra (MS1, m/z 300-1600) were acquired in the Orbitrap with a resolution of 120,000 at m/z 200. Monoisotopic precursor selection was disabled. Dynamic exclusion was used to exclude ions for 30 seconds with a mass window of -1.10 m/z and +1.50 m/z surrounding the precursor m/z. Precursor ions were isolated by the quadrupole (isolation window m/z 1.2, m/z offset 0.5) for the Top-12 most intense peaks and fragmented in the HCD cell with 35% energy. MS/MS spectra (MS2) were also acquired in the Orbitrap with a resolution of 15,000 at m/z 200. AGC targets were set at 4.0E5 for full MS spectra and 5.0E4 for MS/MS spectra. The maximum injection time was 50 ms for full MS spectra and 200 ms for MS/MS spectra. Only precursor ions with charge states 2-7 were fragmented for MS/MS analysis.

Peptide identification and calculation of site-specific stoichiometries

All the raw data were analyzed by MaxQuant 1.4.1.2 32, 33. Andromeda was configured with a light acetyl group (Ac, +42.0106 Da) and a heavy labelled acetyl group (Ah, +47.0361 Da) at both lysine and the protein N-terminus. A search against the Uniprot Human database (downloaded on 2014/04/14, with a total of 69078 sequences) was performed with light and heavy acetylation at both the protein N-terminus and lysine, and methionine oxidation, as variable modifications and carbamidomethylation of cysteine as a fixed modification.
To check the labeling efficiency for each sample, trypsin was specified as the enzyme with a maximum of 4 missed cleavages. Typical labeling efficiency for a complex protein mixture can reach above 95% based on the total intensities of identified peptides containing heavy acetyl labeled and unlabeled lysines. To calculate acetylation stoichiometry, Arg-C specificity was used for database searching with a maximum of 2 missed cleavages. The mass tolerance was 4.5 ppm for precursor ions in the Andromeda main search and 20 ppm for MS/MS analysis. The identifications were filtered at a 1% false discovery rate at the peptide, protein and site levels. Modified peptides were required to have a minimum Andromeda score of 40 and a localization probability of 0.75.

MaxQuant search result files in tab-delimited format served as inputs for the in-house developed software StoichAnalyzer, which contains three main analysis modules: a fragment ion quantifier, a precursor ion quantifier, and an integration data processor. First, all peptides identified with either heavy or light acetylation by MaxQuant were compiled to build a library of acetylated peptides with non-redundant peptide sequences. The program then performed a filtering step to eliminate the multiply acetylated peptide sequences whose stoichiometries could not be theoretically calculated due to missing quantitative information from fragment ions between the neighboring acetylation sites. For each multiply acetylated peptide that passed the filtering, the program analyzed all fragment ion groups and calculated fragment ion ratios based on a linear regression model 32. Second, for each acetylated peptide in the library, the program identified the HPLC retention profiles of all acetyl isomers (peptides with the same sequence but different numbers of light acetyl groups) and calculated the corresponding peak areas. For multiply acetylated peptides, the program performed a second filtering step: it removed the peptide if the precursor ion intensities of the acetyl isomers containing mixed populations of heavy and light acetyl groups were detectable (S/N ratio > 3) but MS/MS fragmentation was not acquired.
In the final step, the precursor ion peak areas as well as the fragment ion ratios of all the acetyl isomers for each peptide in the library were input to the data processor for calculating acetylation stoichiometry in a site-specific manner based on the mathematical model for peptides containing up to 4. The software is freely available to the public for download (Supporting information).

Bioinformatic analysis

The acetylation sites from the wild-type cell analysis and the non-overlapping sites from the two replicate analyses were pooled together and the leading proteins were used for the functional classification of sites or proteins with different levels of stoichiometries. For functional annotation enrichment analysis, the acetylation sites were divided into four bins based on their stoichiometries: less than 1%, 1~5%, 5~20% and above 20%. Gene ontology enrichment for each bin was performed through the DAVID online bioinformatics tool with Homo sapiens as background 34. The p values after Benjamini-Hochberg correction were first converted (P = -Log10(p-value)) and then transformed to z-scores. For each ontology category, the annotations were clustered using the z-scores through one-way hierarchical clustering as previously described 35.
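The binning and score transformation described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, the handling of values exactly on a bin boundary is an assumption, and the z-score is assumed to be taken for one annotation across the four stoichiometry ranges using the sample standard deviation.

```python
import math
import statistics


def stoichiometry_bin(percent: float) -> str:
    """Assign a site to one of the four stoichiometry ranges.

    Which bin receives exactly 1%, 5% or 20% is an assumption.
    """
    if percent < 1:
        return "<1%"
    if percent < 5:
        return "1-5%"
    if percent <= 20:
        return "5-20%"
    return ">20%"


def neg_log10(p_adjusted: float) -> float:
    """Convert a Benjamini-Hochberg adjusted p-value to P = -log10(p)."""
    return -math.log10(p_adjusted)


def z_scores(values: list[float]) -> list[float]:
    """Standardize one annotation's scores across the four ranges.

    Uses the sample standard deviation; the source does not state which
    form was used, so this is an assumption.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical adjusted p-values for one GO term in the four ranges.
adjusted_p = [0.5, 0.05, 1e-3, 1e-6]
scores = z_scores([neg_log10(p) for p in adjusted_p])
print([round(s, 2) for s in scores])
```

The resulting z-score rows (one per annotation) are the kind of input that a hierarchical clustering routine would then group.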

Results

A quantitative workflow for site-specific Lys acetylation stoichiometry identification.

We have developed an integrated workflow for proteome-wide, untargeted identification of site-specific Lys acetylation stoichiometries. The strategy includes three essential steps: an efficient approach to globally derivatize the unmodified lysine residues in the whole cell lysate with heavy-isotope labeled acetyl groups, high-resolution MS and MS/MS identification of lysine acetylated peptides, and a new software program for quantitative analysis and calculation of site-specific lysine acetylation stoichiometries (Figure 1).

In the first step, to enable comparative MS analysis of lysine acetylated peptides and their unmodified counterparts, we applied a heavy-acetyl-NHS-based chemical labeling strategy for proteome-wide derivatization of unmodified lysine (Supplementary Figure S1). Such an approach has been widely used in the chemical derivatization of free amino groups in isobaric tagging methods such as TMT and iTRAQ labeling 36, 37. The derivatization equalizes the ionization efficiency of acetylated and unmodified peptides and therefore allows direct comparison and measurement of their peptide ion intensities for stoichiometry analysis. The amino acids Y, S and T are known to undergo side reactions in Lys-targeted labeling strategies. In our analysis, high specificity is critical because side reactions on Y, S or T may lead to false positive identifications when the database searching software erroneously assigns the acetyl group to a nearby Lys. We screened various conditions to optimize the reaction conditions for acetyl labeling in the whole cell lysate (Figure 2). To determine the optimum acetyl-NHS reaction concentration, we performed systematic screening through chemical labeling of the whole cell lysate at different acetyl-NHS concentrations (Figure 2A).
Our data suggested that, at low acetyl-NHS concentrations, Lys labeling was highly selective compared to undesired labeling at Y, S or T, and that it reached a saturation point at about 100 mg/ml for the input acetyl-NHS solution, when the side-reaction labeling was still at a very low level (Figure 2A). This allowed us to determine the optimum concentration for Lys labeling in the whole cell lysate. Using similar principles, we further screened additional chemical reaction conditions, such as reaction buffer composition, organic solvent for dissolving acetyl-NHS, in-solution or in-gel labeling, and reduction-alkylation before or after labeling, and determined the optimum conditions (data not shown). Acetic anhydride is a commonly used reagent in Lys labeling reactions, and we reasoned that acetyl-NHS-based chemical labeling may confer higher specificity compared to acetic anhydride-based labeling 38. To evaluate the two strategies, we compared their labeling specificity in the whole cell lysate. The results showed that the acetyl-NHS method had a higher yield of Lys labeling, fewer side reactions on Y, S or T, and more protein and peptide identifications than the anhydride-based method (Figure 2B-C). In addition to labeling at the protein level, we also tested labeling at the peptide level by using Glu-C as the first digesting enzyme followed by chemical labeling. Our data showed that this strategy led to better labeling specificity and efficiency, but the numbers of protein and peptide identifications were much lower than with direct chemical labeling at the protein level in the whole cell lysate (Figure 2B-C). Taken together, through comprehensive screening, we determined optimized conditions for chemical labeling in whole cell lysate.

Following the chemical derivatization, in the second step, proteins were digested and the tryptic peptides were analyzed by LCMS, acquiring high-resolution MS and MS/MS spectra for identification by the MaxQuant search engine and for quantification. The high-resolution data were required for reliable quantitative analysis.
In the third step, an in-house developed software, StoichAnalyzer, was applied to calculate the site-specific stoichiometries of Lys acetylation sites (Figure 3). It is important to note that in the LCMS chromatogram, the heavy and light acetylated peptides do not completely co-elute due to deuterium-based isotope labeling (Supplementary Figure S2). Therefore, it is necessary to calculate the peak areas of the peptides for stoichiometry analysis. If the peptide contains a single light acetylation site, or it contains multiple light acetylation sites with equal stoichiometry on each site, it is sufficient to calculate site-specific stoichiometry based on the precursor ion intensities of both modified and unmodified peptides. For multiply acetylated peptides with differential acetylation stoichiometries on each site, precursor ion intensity information alone is no longer sufficient. To address this challenge, we derived a mathematical formula and demonstrated that high-resolution fragmentation spectra of each differentially acetylated peptide isoform acquired in data-dependent acquisition could provide sufficient data to calculate site-specific Lys acetylation stoichiometry (Supplementary Note 1). To implement this data analysis workflow, StoichAnalyzer takes the identification text files from the database search engine MaxQuant and the mzXML data files as inputs, and performs peak area extraction and stoichiometry calculation based on the derived formula.

Validation with BSA and synthetic histone H4 peptides.

To validate our quantitative strategies for identifying site-specific stoichiometries in both singly and differentially multiply acetylated peptides, we performed two sets of experiments. First, we prepared two samples of BSA in PBS buffer (1 µg/µl), then applied the light acetyl (Ac-NHS) and heavy acetyl (Ah-NHS) labeling to each sample respectively. After the labeling reaction was quenched, the samples were mixed together at various ratios for trypsin digestion and LCMS analysis. Based on MaxQuant identification, we calculated the experimental stoichiometries using StoichAnalyzer.
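To make the distinction between the precursor-only and the fragment-assisted calculation concrete, the sketch below works through a peptide with one lysine and a peptide with two lysines. It is not the formula from Supplementary Note 1, which is not reproduced here; it is a generic illustration under stated assumptions. All function and parameter names are hypothetical, light acetyl marks endogenous acetylation, heavy acetyl marks lysines that were unmodified before labeling, and the fragment ion ratio is assumed to report the fraction of the singly light isomer that carries its light acetyl group on the first lysine.

```python
def single_site_stoichiometry(area_light: float, area_heavy: float) -> float:
    """Stoichiometry of one lysine from precursor peak areas alone."""
    total = area_light + area_heavy
    if total <= 0:
        raise ValueError("no signal for either acetyl isomer")
    return area_light / total


def two_site_stoichiometry(
    area_no_light: float,
    area_one_light: float,
    area_two_light: float,
    frac_first_site_light: float,
) -> tuple[float, float]:
    """Site-specific stoichiometries for a peptide with two lysines.

    area_no_light, area_one_light, area_two_light: precursor peak areas of the
    acetyl isomers carrying 0, 1 and 2 light acetyl groups.
    frac_first_site_light: fraction of the one-light isomer with the light
    group on the first lysine, taken from site-discriminating fragment ions
    (an assumed input for this sketch).
    """
    total = area_no_light + area_one_light + area_two_light
    if total <= 0:
        raise ValueError("no signal for any acetyl isomer")
    if not 0.0 <= frac_first_site_light <= 1.0:
        raise ValueError("fragment ion fraction must lie between 0 and 1")
    first = (area_two_light + area_one_light * frac_first_site_light) / total
    second = (area_two_light + area_one_light * (1.0 - frac_first_site_light)) / total
    return first, second


# Illustrative numbers only, not measured data.
print(single_site_stoichiometry(10.0, 90.0))
site1, site2 = two_site_stoichiometry(60.0, 30.0, 10.0, 2 / 3)
print(round(site1, 3), round(site2, 3))
```

Without the fragment ion fraction, the precursor areas alone fix only the sum of the two site stoichiometries, which is why precursor information suffices only when the sites are equally occupied.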
Our results showed a very good match with the theoretical values (50%, 10%, 1%) (Figure 4A), which validated our quantification strategy using precursor ions of acetyl isomers. To further evaluate our quantification strategy using both high-resolution MS and MS/MS spectra, we synthesized the histone H4 N-terminal peptide [G4-R17] with acetylation at various positions as a testing model. The normalized peptide mixture was dissolved in PBS and then labelled with Ah-NHS for untargeted LCMS analysis, acquiring high-resolution MS and MS/MS data. Our stoichiometry identifications by StoichAnalyzer showed a very good match with the theoretical values in a range from 1% to 90% at the site-specific level (Figure 4B-C). These data suggested that the integrated strategy we developed calculates site-specific acetylation stoichiometries with high accuracy for both singly and multiply acetylated peptides. We further performed spike-in experiments by first mixing heavy and light acetyl-labeled BSA at different specific ratios and then spiking the mixture into HeLa whole cell lysates at a roughly 1:50 ratio (BSA/HeLa proteins, w/w). Stoichiometry analysis of detectable BSA peptides showed very good agreement with the theoretical values (50%, 10%, 1%) (Supplementary Figure S3), demonstrating that our strategy is able to reliably identify acetylation stoichiometry from a complex mixture.

Improving quantification confidence in the whole proteome analysis.

Having validated our quantitative workflow for Lys acetylation stoichiometry identification, we applied the strategy to analyze global Lys acetylation stoichiometries in HeLa whole cell lysate. Briefly, HeLa cells were lysed and the proteins were labeled with Ah-NHS. Proteins were digested by trypsin and the peptides were fractionated with strong cation exchange into six fractions. LCMS raw data were processed by MaxQuant for peptide identification. Software analysis of these data led to the identification of Lys stoichiometries (nonzero values) for over 2000 acetylation sites (data not shown). Surprised by the results, we carefully analyzed the quantification data and noticed that a large number of these stoichiometries were likely to be false quantifications.
We suspected that the high complexity of the HeLa whole proteome analysis may give rise to the erroneous peak selection and incorrect calculation of acetylation stoichiometries.
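As a rough illustration of the quantity being validated in the spike-in tests above, the sketch below computes a stoichiometry from a pair of peak areas. It assumes that the stoichiometry is the light-labeled area divided by the sum of the light and heavy areas; the function name and the peak areas are hypothetical and are not taken from StoichAnalyzer.

```python
def acetylation_stoichiometry(light_area: float, heavy_area: float) -> float:
    """Return the light-labeled fraction of a site, between 0 and 1.

    Assumption for illustration: stoichiometry = light / (light + heavy).
    """
    if light_area < 0 or heavy_area < 0:
        raise ValueError("peak areas must be non-negative")
    total = light_area + heavy_area
    if total == 0:
        raise ValueError("no signal for either labeled form")
    return light_area / total


# Hypothetical peak areas chosen to mimic the 50%, 10% and 1% spike-in mixtures.
spike_ins = {
    "50% mix": (5.0e6, 5.0e6),
    "10% mix": (1.0e6, 9.0e6),
    "1% mix": (1.0e5, 9.9e6),
}
for name, (light, heavy) in spike_ins.items():
    percent = 100 * acetylation_stoichiometry(light, heavy)
    print(f"{name}: {percent:.1f}%")
```

Because the value is a ratio of two areas measured for the same peptide, scaling both areas by the same factor leaves it unchanged.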

After careful analysis, we determined that the erroneously selected peaks may arise from co-eluting compounds or peptides with very close m/z values (including the isotope peaks of adjacent peptides). We compared the properties of these falsely selected peak pairs with those of the true peak pairs in our standard tests. Based on these observations, we designed three main approaches to remove false peak selections and improve the accuracy of stoichiometry analysis (Supplementary Figure S4). First, a deconvolution step was added to analyze the isotope patterns of each full MS scan from low mass to high mass prior to peak selection, which allowed the software to remove isotope peaks that interfere with stoichiometry calculation and to determine the charge state of each monoisotopic peak. If the candidate peak was interfered with by co-eluting ions in any scan during the elution window, the peptide was discarded from stoichiometry analysis because its peak area could no longer be calculated reliably. Second, the software uses the observed m/z values of the reference peptides (the peptides identified by MaxQuant) in each MS scan as an internal reference to identify the differentially labeled peptide precursor ions within a very narrow mass error window (+/-2.5 ppm). This approach avoids the use of theoretically calculated m/z values for peak selection and allows more confident determination of the target peak for quantification. Third, we filtered the selected peak pairs using all peptides identified by MaxQuant and the retention time window that MaxQuant reports for each peptide, to avoid selecting the precursor ions of other known co-eluting peptides for stoichiometry calculation.

To assess the reproducibility of our quantitative stoichiometry analysis, we performed two biological replicate experiments with HeLa cells and analyzed the data with StoichAnalyzer.
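The second and third filtering approaches described above can be sketched as follows. This is a minimal illustration, not the StoichAnalyzer implementation: the function names, the peak-list layout, the label mass shift and the use of the same ppm tolerance for the co-elution check are all assumptions made for the example. Only the +/-2.5 ppm window and the idea of anchoring on the observed reference m/z come from the text.

```python
def ppm_error(observed_mz: float, expected_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - expected_mz) / expected_mz * 1e6


def find_partner_peak(scan_peaks, reference_mz, charge, n_sites,
                      label_mass_shift, tol_ppm=2.5):
    """Find the differentially labeled partner of an identified peptide in one scan.

    The target m/z is offset from the OBSERVED reference m/z, not from a
    theoretical value. `label_mass_shift` (Da per site) is a hypothetical
    parameter; use a negative value if the reference is the heavier form.
    """
    target = reference_mz + n_sites * label_mass_shift / charge
    candidates = [(mz, intensity) for mz, intensity in scan_peaks
                  if abs(ppm_error(mz, target)) <= tol_ppm]
    if not candidates:
        return None
    # Keep the candidate closest in mass to the target.
    return min(candidates, key=lambda peak: abs(ppm_error(peak[0], target)))


def overlaps_known_peptide(peak_mz, scan_rt, known_peptides, tol_ppm=2.5):
    """True if another identified peptide could explain this peak.

    `known_peptides` holds (mz, rt_start, rt_end) tuples, standing in for the
    identifications and retention time windows reported by the search engine.
    """
    return any(rt_start <= scan_rt <= rt_end
               and abs(ppm_error(peak_mz, mz)) <= tol_ppm
               for mz, rt_start, rt_end in known_peptides)


# Hypothetical scan: (m/z, intensity) pairs; the mass shift is a placeholder value.
scan = [(650.3500, 2.0e6), (651.8600, 1.0e5), (651.8650, 5.0e4)]
partner = find_partner_peak(scan, reference_mz=650.3500, charge=2,
                            n_sites=1, label_mass_shift=3.0188)
print(partner)
```

Anchoring on the observed reference m/z means that any systematic calibration drift in a scan shifts both members of the pair together, which is what makes such a narrow window workable.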
In each experiment, the HeLa cells were cultured separately to similar confluence prior to harvesting. We identified stoichiometries for a total of 452 sites in duplicate experiments. Among the sites with over 1% stoichiometry, about 50% overlapped. Comparing the stoichiometries of overlapping sites showed a good linear correlation, further supporting the confidence of our analysis (Supplementary Figure S5).

Lys acetylation stoichiometries in HeLa cells. Analysis of Lys acetylation stoichiometries of wild-type HeLa cells with our optimized chemical labeling and quantification workflow identified the stoichiometries of a total of 499 singly acetylated peptides and 157 multiply acetylated peptides (including the analysis of wild-type cells as well as the two biological replicates), among which 136 singly acetylated peptides and 86 multiply acetylated peptides have more than 1% stoichiometry (Figure 5A, Table S1; an example of a doubly acetylated peptide is shown in Supplementary Figure S6). Forty sites (4.8%) have stoichiometries over 20% and 53 sites (6.4%) have stoichiometries between 5% and 20% (Figure 5B-D). Considering all non-zero stoichiometry identifications, the average Lys acetylation stoichiometry is about 4.1%. The generally low Lys acetylation stoichiometries in HeLa cells are similar to previously reported Lys acylation stoichiometries in other species and cells 22, 28-30.

It is well established that histone acetylation dynamics are a key epigenetic regulatory mechanism, playing critical roles in maintaining active chromatin structure and in transcriptional regulation. Using untargeted and systematic analysis of whole cell lysate, we were able to identify Lys acetylation stoichiometries of a number of histone acetylation sites in HeLa cells, including the histone H4 N-terminal acetylation sites with 8.9% for K5, 7.1% for K8, 13.2% for K12 and 11% for K16 (Figure 6A-B). The higher abundance of acetylation on K12 and K16 correlates well with existing knowledge of the differential roles of the four acetylation sites in epigenetic regulation 39.
We also identified the stoichiometry for K77 in the globular domain of histone H4, which was reported through studies using extensive affinity enrichment (http://www.phosphositeplus.org). Our data showed that the stoichiometry of K77 is only 0.02%, much lower than those of the N-terminal histone H4 acetylation sites.

Mitochondrial protein acetylation was first reported through large-scale acetylation proteome analysis and has been implicated in key roles in cellular metabolic pathways 11, 12. The reduced activity of Sirt3, a mitochondria-specific lysine deacetylase, leads to hyperacetylation of mitochondrial proteins and metabolic disorders 20, 40. In our dataset, we analyzed the acetylation stoichiometries of mitochondrial, cytosolic and nuclear proteins in HeLa cells (Table S1). Our study showed that nuclear proteins had the highest overall acetylation stoichiometry (4.8% on average). Mitochondrial protein acetylation stoichiometries (2.4% on average) were much lower than nuclear protein stoichiometries and slightly higher than those of cytosolic proteins (2.1% on average) in HeLa cells.

Site-specific stoichiometry analysis allowed us to distinguish the roles of individual Lys acetylation sites on multiply acetylated proteins. For example, programmed cell death 6-interacting protein (PDCD6IP), a protein involved in apoptosis, was identified with two acetylation sites. Stoichiometry analysis showed that K374 has a particularly high stoichiometry of 65.7%, while the stoichiometry of the other site, K48, is below 0.3%, indicating that these two acetylation sites are differentially regulated.

Dynamics of Lys acetylation stoichiometry induced by HDAC inhibitor treatment. HDAC inhibitors such as sodium butyrate (NaBu) are known to induce a global increase in Lys acetylation abundance 41. To test whether our analytical workflow can capture the stoichiometry changes induced by inhibitor treatment, we treated HeLa cells with 1 mM sodium butyrate for 16 hours prior to harvesting.
Our analysis identified 163 sites with above 1% stoichiometry (Table S2). We observed a noticeable increase in the number of Lys acetylation sites with higher stoichiometries upon sodium butyrate treatment (Figure 6C). Histone H4 N-terminal Lys acetylation site stoichiometries were confidently identified as 49.8% for K5, 46.8% for K8, 70.7% for K12 and 70.6% for K16 (Figure 6A-B). This represents an overall increase in Lys acetylation stoichiometry for all four sites. In addition to the histone H4 N-terminal sites, we identified six other histone acetylation sites whose stoichiometries also increased significantly upon sodium butyrate treatment: K18, K23, K56 and K122 on histone H3 (H3F3B), and K31 and K77 on histone H4.

Comparing the site-specific stoichiometry dynamics induced by sodium butyrate treatment, we noticed that Lys acetylation sites located close to each other show similar fold changes between NaBu-treated and wild-type cells (Figure 6A-B). The stoichiometries of the K5 and K12 acetylation sites on histone H4 increased around 5.5-fold, and those of K8 and K16 on histone H4 increased around 6.5-fold, while the stoichiometries of the histone H3 acetylation sites (K18 and K23) increased about 3.7-fold upon sodium butyrate treatment (Figure 6D). The similarity of the stoichiometry fold changes among closely positioned acetylation sites suggests potential co-regulation of these sites through common enzymatic pathways that were affected by the treatment. Interestingly, our data agree with previous studies on histone acetylation dynamics during the cell cycle and during development, which suggest potential co-regulation between K5 and K12 and between K8 and K16 at the histone H4 N-terminus 42, 43. In addition, newly synthesized histones are known to be diacetylated first at K5 and K12 prior to deposition and acetylated at K8 and K16 after deposition in Tetrahymena, Drosophila and human cells 44-46.
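The fold changes quoted for the histone H4 N-terminal sites can be checked directly from the stoichiometries reported in the text for wild-type and NaBu-treated cells. The short sketch below does only that arithmetic; the function and variable names are illustrative.

```python
def fold_changes(before: dict, after: dict) -> dict:
    """Ratio of treated to untreated stoichiometry for each shared site."""
    return {site: after[site] / before[site] for site in before if site in after}


# Stoichiometries (%) of histone H4 N-terminal sites as reported in the text.
wild_type = {"K5": 8.9, "K8": 7.1, "K12": 13.2, "K16": 11.0}
nabu_treated = {"K5": 49.8, "K8": 46.8, "K12": 70.7, "K16": 70.6}

h4_fold = fold_changes(wild_type, nabu_treated)
for site, value in h4_fold.items():
    print(f"H4 {site}: {value:.1f}-fold")

# Average within the two pairs discussed in the text.
pair_means = {
    "K5/K12": (h4_fold["K5"] + h4_fold["K12"]) / 2,
    "K8/K16": (h4_fold["K8"] + h4_fold["K16"]) / 2,
}
print({pair: round(value, 1) for pair, value in pair_means.items()})
```

The per-site ratios come out at roughly 5.6 (K5), 5.4 (K12), 6.6 (K8) and 6.4 (K16), consistent with the approximate 5.5-fold and 6.5-fold values given for the two pairs.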
Genome-wide ChIP-seq data also showed extensive co-acetylation of H4K5 and H4K12 47. These data further corroborate the notion that specific closely positioned histone acetylation sites are likely to be co-regulated by certain mechanisms during cellular processes. Our data also suggest that, in addition to histone H4, similar co-regulation may occur on histone H3 acetylation sites under the current treatment condition. It is important to note that such phenomena can only be observed using site-specific stoichiometry analysis and not with traditional relative quantification analysis.

Bioinformatic and structural analysis of acetylation sites. To understand the functional significance of Lys acetylation substrates with different levels of stoichiometry, we divided the stoichiometry dataset from wild-type HeLa cells into four quantiles (less than 1%, 1% to 5%, 5% to 20% and above 20%) and performed gene ontology annotation enrichment in each quantile using the DAVID bioinformatics resource 34. Our data showed that proteins involved in transcription regulation and DNA damage response were more enriched in the quantiles with higher stoichiometry, while acetylation substrates involved in metabolic processes were more enriched in the quantiles with relatively low stoichiometry in HeLa cells (Figure 5D and Supplementary Figure S7).

To determine whether acetylation sites with high stoichiometry may affect a substrate’s structure, we analyzed the available PDB structures for proteins with high acetylation stoichiometry. In our dataset, K89 on Syntaxin-6 (STX6) was identified with 36% stoichiometry upon sodium butyrate treatment. Structural analysis (PDB# 4J2C) showed that K89 is located on an alpha helix and that the unacetylated side-chain nitrogen of the lysine forms a hydrogen bond with a side-chain oxygen of E72, located at the end of an adjacent alpha helix (Figure 6E). This hydrogen bond likely stabilizes the interaction between the two helices and the overall protein structure. Acetylation of K89 would disrupt the hydrogen bond and potentially lead to a protein conformational change.
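The grouping of sites into the four stoichiometry ranges used for the enrichment analysis can be sketched as below. Note that these are fixed-threshold ranges rather than equal-sized statistical quantiles. How values that fall exactly on a boundary are assigned is not stated in the text, so the choice made here (1%, 5% and 20% fall into the "1-5%", "1-5%" and "5-20%" groups respectively) is an assumption, as are the function and label names. The example values are site stoichiometries quoted elsewhere in the text for wild-type HeLa cells.

```python
from collections import Counter


def stoichiometry_group(percent: float) -> str:
    """Assign a stoichiometry (in %) to one of the four ranges used in the text."""
    if not 0 <= percent <= 100:
        raise ValueError("stoichiometry must be between 0 and 100 percent")
    if percent < 1:
        return "<1%"
    if percent <= 5:      # boundary handling is an assumption
        return "1-5%"
    if percent <= 20:     # boundary handling is an assumption
        return "5-20%"
    return ">20%"


# Wild-type HeLa stoichiometries (%) quoted in the text.
reported = {
    "H4 K5": 8.9,
    "H4 K8": 7.1,
    "H4 K12": 13.2,
    "H4 K16": 11.0,
    "H4 K77": 0.02,
    "PDCD6IP K374": 65.7,
}

groups = {site: stoichiometry_group(value) for site, value in reported.items()}
counts = Counter(groups.values())
for site, group in groups.items():
    print(f"{site}: {group}")
print(dict(counts))
```

Each group's protein list would then be submitted separately for gene ontology enrichment.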

Discussion

In summary, we developed an integrated approach, combining heavy acetyl-NHS based chemical labeling, high-resolution MS analysis and in-house software for quantitative analysis, to identify site-specific Lys acetylation stoichiometries in mammalian cells. Our strategy achieved global site-specific Lys acetylation stoichiometry identification through an untargeted quantitative analysis at both precursor ion and fragment ion levels. The process has been validated with BSA and synthetic peptide standards. During the analysis of mammalian whole cell lysate, we developed systematic strategies to remove false quantification results, which significantly improved the confidence of stoichiometry identification. Application of our strategy led to the identification of stoichiometries of over 600 Lys acetylation sites in HeLa cells and revealed the site-specific stoichiometry dynamics upon sodium butyrate treatment.

Stoichiometry analysis for quantification of PTMs has long been considered technically challenging. Nevertheless, compared with relative PTM quantification, stoichiometry analysis offers a few key advantages. First, the dynamics of PTM stoichiometry is the ultimate indication of activity changes in the corresponding PTM pathway. It is independent of changes in protein abundance and therefore can be systematically compared across various samples and time points. Second, while relative PTM analysis can only compare the dynamics of the same site or peptide under various conditions, stoichiometry analysis enables quantitative comparison between different sites on the same or different proteins, offering a unique functional perspective on site-specific dynamics during changes in cellular processes. Third, PTM stoichiometry analysis will potentially allow the quantitative comparison of different types of PTMs on the same or different sites, and may lead to confident discovery of novel crosstalk between different PTM pathways. Fourth, proteome-wide stoichiometry analysis with the isotope-balanced approach can be achieved without normalizing the total proteins in each lysate or metabolic labeling of proteins. Therefore, it can be easily applied to analyze diverse types of complex samples such as mouse or human tissues.

The stoichiometry analysis strategy developed in this study does not rely on system-wide enrichment of Lys acetylation substrates and therefore offers unbiased acetylation site identification and quantification. However, a potential limitation of the current strategy is its dependence on large-scale, efficient fractionation to reduce the sample complexity and eliminate false quantification results arising from the interference of background ions. We have found that such false quantification could be widespread if the criteria for quantitative analysis were not rigorous. We also anticipate that deep fractionation and high-resolution MS analysis will reduce the complexity and therefore lead to more confident identification and a lower possibility of false quantification results.

The proposed strategy and concept can potentially be applied to system-wide study of the stoichiometries of other protein modifications, including the recently discovered short-chain Lys acylations (SCLA). SCLA pathways include the well-studied Lys acetylation as well as Lys propionylation, butyrylation, crotonylation, succinylation, malonylation, glutarylation and 2-hydroxyisobutyrylation 4-6, 48-51. Each SCLA pathway is closely linked to the activity of different metabolic processes through the involvement of different CoAs as co-factors. System-wide proteomics studies have shown that many of them share common substrates 22, 52-55. Application of stoichiometry analysis to SCLA pathways may allow quantitative identification of Lys acylation pathway crosstalk under different metabolic environments.

Acknowledgement

We would like to thank Tim Griffin, the members of the Chen lab and the Center for Mass Spectrometry and Proteomics for helpful discussion. We greatly appreciate the support from the Center for Mass Spectrometry and Proteomics for LCMS instrument access, the Structural Biology NMR Center for NMR analysis, and the Minnesota Supercomputing Institute for access to Linux workstations. This work is supported by the research start-up fund to Y.C. from the University of Minnesota.

Figure legends

Figure 1. An overview of the experimental and computational workflow for system-wide, site-specific identification of Lys acetylation stoichiometry.

Figure 2. Comparison of chemical-based acetyl labeling strategies. (A) Comparison of the acetyl labeling efficiencies on K, Y, S, T at different Ac-NHS concentrations. The acetyl labeling yields for each modification were calculated by spectral counting comparison between peptides with various acetylation forms (K, Y, S, T) and total identified peptides. (B) Comparison of the labeling efficiency and specificity with four different strategies: labeling with Ac-NHS first followed by trypsin digestion; labeling with AcOAc (acetic anhydride) first followed by trypsin digestion; Glu-C digestion first followed by labeling with Ac-NHS and trypsin digestion; Glu-C digestion first followed by labeling with Ac-NHS. (C) Comparison of the total peptide and protein identifications using the four different strategies described above. Student’s t-tests were performed to evaluate the statistical difference between the labeling efficiency with Ac-NHS labeling followed by trypsin digestion and the labeling efficiencies of the other labeling strategies (*