Large-Scale Identification of Phosphorylation Sites for Profiling Protein

May 28, 2014 - In this study, we tried to identify kinase-selective substrate determinants, including motif sequences, based on large-scale discovery ...

Article

Large-scale Identification of Phosphorylation Sites for Profiling Protein Kinase Selectivity Haruna Imamura, Naoyuki Sugiyama, Masaki Wakabayashi, and Yasushi Ishihama J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/pr500319y • Publication Date (Web): 28 May 2014 Downloaded from http://pubs.acs.org on May 30, 2014


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

Large-scale Identification of Phosphorylation Sites for Profiling Protein Kinase Selectivity

Haruna Imamura1, Naoyuki Sugiyama1, Masaki Wakabayashi1, Yasushi Ishihama1*

1 Graduate School of Pharmaceutical Sciences, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan

* To whom correspondence should be addressed. Yasushi Ishihama, Graduate School of Pharmaceutical Sciences, Kyoto University, 46-29, Yoshida-Shimo-Adachi-Cho, Sakyo-ku, Kyoto 606-8501, Japan. Email: [email protected] Phone: +81-75-753-4555 Fax: +81-75-753-4601


Abstract

Protein kinase selectivity is largely governed by direct binding to the target site(s) on the substrate. Thus, substrate determinants identified from sequences around phosphorylation sites are desirable resources for matching kinases to their substrates. In this study, we tried to identify kinase-selective substrate determinants, including motif sequences, based on large-scale discovery of kinase/substrate pairs. For this purpose, we employed a combination strategy of in vitro kinase reaction followed by LC-MS/MS analysis, and applied it to three well-studied kinases: c-AMP regulated protein kinase A (PKA), extracellular signal-regulated kinase 1 (ERK1), and RAC-alpha serine/threonine-protein kinase (AKT1). Cellular proteins were fractionated, dephosphorylated with thermo-sensitive alkaline phosphatase, phosphorylated with the target kinase, and digested with Lys-C/trypsin, and then phosphopeptides were enriched using TiO2-based hydroxy acid-modified metal oxide chromatography (HAMMOC) and subjected to LC-MS/MS. As a result, 3,585, 4,347 and 1,778 in vitro phosphorylation sites were identified for PKA, ERK1 and AKT1, respectively. As expected, these extensive identifications of phosphorylation sites enabled extraction of both known and novel motif sequences, and this in turn permitted fine discrimination of the specificities of PKA and AKT1, which both belong to the AGC kinase family. Other unique features of the kinases were also characterized, including phospho-acceptor preference (Ser or Thr) and bias ratio of singly/multiply phosphorylated peptides. More motifs were found with this methodology as compared to target kinase phosphorylation of peptides obtained by pre-digestion of proteins with Lys-C/trypsin. Thus, this approach to characterization of kinase substrate determinants is effective for identification of kinases associated with particular phosphorylation sites.

Keywords

Mass spectrometry, phosphoproteome, kinase/substrate, motif sequence, phosphoacceptor, multiply phosphorylated peptide, PKA, ERK1, AKT1


Introduction

Kinase-mediated phosphorylation is a post-translational modification that is associated with the regulation of many cellular processes [1,2]. More than 500 proteins encoded in the human genome have been annotated as members of the protein kinase family on the basis of homologous catalytic domains [3]. Kinases in this large family target a huge variety of substrates, and it has been suggested that more than 70% of all cellular proteins are phosphorylated [4]. Large-scale discovery of phosphorylation sites has mostly relied on shotgun phosphoproteomics, which is based on state-of-the-art mass spectrometry (MS) and highly selective phosphopeptide enrichment methods [5-7].

Kinase substrate recognition is determined by the structural characteristics of the kinase active site, interactions between the kinase and its substrate, and other interactions with scaffolding and adaptor proteins [8]. Kinase active sites are known to have features complementary to the sequences around phosphorylation sites in terms of hydrophobicity, charge and depth, as would be expected from their direct interaction. Thus, peptide sequences around phosphorylation sites provide clues to predict the responsible kinases. Some kinases are known to be selective for particular amino acid sequences, designated as motif sequences [9-11]. Because there is great interest in uncovering kinase/substrate relationships, several prediction tools to identify responsible kinases have been developed based on current motif knowledge [12-15]. Moreover, motifs have been applied in several studies to derive biological information from high-throughput datasets. For example, matching known motif sequences to phosphoproteome data obtained from specific tissues or cell lines allowed identification of in vivo active kinases and/or pathways [5,16,17]. Motifs are also used as filters to screen direct substrates from multiple candidates [18-21]. In order to find intrinsic substrates of a particular kinase, in vivo experiments using selective kinase inhibitors/activators or RNA interference are often used. However, those treatments inevitably have an effect downstream of the targeted kinase, and the results therefore include indirect phosphorylation sites. Among putative substrates, those with a motif can be most reliably identified as direct substrates.

Nevertheless, the number of reported motifs is quite limited. So far, NetPhorest [13] is the largest web-based atlas, although its stored phosphorylation-related motifs and domains correspond to only 179 of 518 human kinases. Further, some of the reported motifs are not sufficiently selective to discriminate substrates of specific kinases. The sensitivity and specificity of motifs are critical for accurate prediction of the responsible kinases. Given that kinase domains are highly conserved, kinases in the same family tend to have similar motif selectivity. However, the small number of phosphorylation sites that have been linked to their responsible kinases still hampers the identification of larger numbers of motifs with high selectivity. So far, about 209,000 phosphorylation sites have been registered in PhosphoSitePlus® [22], one of the largest public databases of phosphorylation, but only ~6.6% (13,751) have been linked to a particular kinase. Hence, an efficient strategy to link kinases and substrates is the key to discovery of large numbers of novel motif sequences.

While the in vivo condition imposes biological constraints such as protein-protein interaction or cellular localization, kinase binding to substrates depends on physical and chemical properties, so in vitro investigation is suitable for exploring motif sequences. Accordingly, synthetic peptide arrays have been utilized for identification of motif sequences with γ-32P-labeled adenosine triphosphate (ATP), affording highly sensitive detection of phosphorylation [23-25]. However, this approach has three disadvantages. First, the variety of sequences on the array is limited. The peptide arrays contain synthetic peptides with a fixed phospho-acceptor residue (S/T/Y) as well as a second fixed residue in one of the flanking positions. This is unlikely to provide sufficient coverage of the wide range of amino acid sequences in cells. Second, the results represent the average of all motifs generated by a particular kinase, and less favored motifs or specific combinations of amino acids may be overwhelmed by highly favored motifs. It is also impossible to determine which combinations of amino acids are the most important.
Finally, depending on surface chemistry and binding materials, the immobilization of the peptides may be affected, causing differences in reaction efficiency [26,27]. Protein arrays are another option to screen motif sequences [28]. Newman et al. recently reported a large-scale study to construct kinase/substrate networks by utilizing a protein array to screen sequence preferences of 289 protein kinases [29]. However, this approach does not overcome the problems described above, and moreover, an additional process is needed to determine phosphorylation sites.

More recently, LC-MS/MS coupled with in vitro kinase reaction has been developed as an alternative method to detect kinase/substrate relationships [30-38]. In addition to the capability of MS to sequence peptides and to localize phosphorylation sites at the same time, this approach has the further advantage that any kind of protein or peptide pool can be used as substrates for purified active kinases to phosphorylate in vitro. Thus, it is applicable to cell extracts that provide kinases with thousands of different peptide sequences, with the same amino acid frequencies as the cellular proteome. However, the limited number of identified phosphorylation sites has been a bottleneck for detailed characterization of motifs.

In this study, with the aim of in-depth identification of in vitro kinase substrates and large-scale discovery of motif sequences, we carried out LC-MS/MS-based phosphoproteome analysis coupled with in vitro kinase reaction, focusing on three well-studied serine-threonine kinases: c-AMP regulated protein kinase A (PKA), extracellular signal-regulated kinase 1 (ERK1), and RAC-alpha serine/threonine-protein kinase (AKT1).
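To make the idea of using motif sequences as filters concrete, the following is a minimal sketch of matching a phosphorylation site against motif patterns. The three patterns are commonly cited textbook consensus motifs (R-R-x-S/T for PKA, R-x-R-x-x-S/T for AKT, S/T-P for ERK) and are assumptions for illustration only; they are not the motifs extracted in this study. The function names, the 13-residue window, and the example sequences are likewise hypothetical.

```python
import re

# Illustrative consensus motifs (assumptions, not results from this study),
# written as regular expressions over a 13-residue window in which the
# phospho-acceptor sits at index 6 (positions -6 .. +6).
MOTIFS = {
    "PKA": re.compile(r"^.{3}RR.[ST]"),    # R at -3, R at -2, S/T at 0
    "AKT1": re.compile(r"^.R.R..[ST]"),    # R at -5, R at -3, S/T at 0
    "ERK1": re.compile(r"^.{6}[ST]P"),     # S/T at 0, P at +1
}


def window(protein_seq, site_index, flank=6, pad="_"):
    """Return residues -flank..+flank around a 0-based site index.

    Positions that fall outside the protein are filled with `pad`.
    """
    padded = pad * flank + protein_seq + pad * flank
    return padded[site_index:site_index + 2 * flank + 1]


def candidate_kinases(protein_seq, site_index):
    """List the kinases whose illustrative motif matches the site."""
    w = window(protein_seq, site_index)
    return sorted(name for name, pattern in MOTIFS.items() if pattern.match(w))


# Hypothetical example sequences; the site index points at the S residue.
print(candidate_kinases("LRRASLG", 4))
print(candidate_kinases("GRPRTSSFAEG", 6))
print(candidate_kinases("AAPLSPGGA", 4))
```

A fixed-width window keeps each pattern anchored to the phospho-acceptor, so a match at an unrelated position in the protein cannot be mistaken for a hit. Real motif sets are usually probabilistic rather than strict patterns, so a regular expression is only the simplest possible stand-in.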

Materials and Methods

Materials

Dulbecco’s modified Eagle’s medium (DMEM), kanamycin, phosphate-buffered saline (PBS), dithiothreitol (DTT), iodoacetamide (IAA), Lys-C, piperidine, and V8 protease were obtained from Wako (Osaka, Japan). Protease inhibitor and triethylammonium bicarbonate (TEAB) were obtained from Sigma-Aldrich (St Louis, MO). Fetal bovine serum was obtained from Gibco®, and the BCA protein assay kit was obtained from Thermo Fisher Scientific (Waltham, MA). Trypsin and thermo-sensitive alkaline phosphatase (TSAP) were obtained from Promega (Madison, WI). Amicon Ultra 10K was obtained from Merck Millipore (Darmstadt, Germany). Recombinant protein kinases were obtained from Carna Biosciences (Kobe, Japan).

Cell culture and protein extraction

HeLa S3 cells were cultured in DMEM containing 10% fetal bovine serum and 100 µg/mL kanamycin. Cells at about 80% confluence in 15 cm dishes were washed with ice-cold PBS and collected with a rubber scraper in PBS, followed by centrifugation at 500 g and 4 °C for 10 min. The pelleted cells were stored at -80 °C until use. To obtain the “supernatant fraction” for in vitro kinase reaction, the cells were lysed with 20 mM HEPES (pH 7.5), 250 mM sucrose, 1.5 mM MgCl2, 10 mM KCl, 0.5% Nonidet P-40, containing 1% protease inhibitor, and left on ice for 5 min. After sonication for 5 min, the lysate was centrifuged at 1,500 g for 10 min at 4 °C and the supernatant was collected as the “supernatant fraction”.

To obtain the “pellet fraction” for in vitro kinase reaction, we followed the protocol of Dignam et al. [39]. Briefly, the cells were lysed with lysis buffer A (10 mM HEPES (pH 7.9), 1.5 mM MgCl2, 10 mM KCl, 0.5 mM DTT) and left on ice for 10 min. After centrifugation at 1,000 g for 10 min at 4 °C, the cells were suspended in lysis buffer A and homogenized with a Dounce homogenizer. Then, the lysate was centrifuged at 1,000 g for 10 min at 4 °C. After removing the supernatant, the pellet fraction was centrifuged again at 25,000 g for 20 min at 4 °C. The resultant pellet was solubilized with lysis buffer B (20 mM HEPES (pH 7.9), 25% (v/v) glycerol, 0.42 M NaCl, 1.5 mM MgCl2, 0.2 mM EDTA, and 0.5 mM DTT containing 1% protease inhibitor), and homogenized with the Dounce homogenizer. After stirring the solution for 10 min on ice, followed by centrifugation at 25,000 g for 30 min at 4 °C, the supernatant was collected as the “pellet fraction”. The buffer for both fractions was replaced with 40 mM Tris-HCl (pH 7.5) by ultrafiltration using an Amicon Ultra 10K at 14,000 g and 4 °C. The protein amount was determined with a BCA protein assay kit, and the proteins were divided into 100 µg aliquots.

Sample preparation for profiling in vitro kinase substrates

For dephosphorylation, 100 µg of proteins was reacted with 1 µL of TSAP (1 MBU/µL) at 37 °C for 1 hour, and TSAP was inactivated by heating at 75 °C for 30 min. For in vitro kinase reaction, each 100 µg aliquot of dephosphorylated proteins (1 µg/µL) from the supernatant or the pellet fraction was reacted with 1 µL of each recombinant kinase (0.5 µg/µL) or distilled water as a control at 37 °C in kinase reaction buffer (40 mM Tris-HCl (pH 7.5), 20 mM MgCl2, 1 mM ATP) for 3 hours. The reaction was stopped by heating at 95 °C for 5 min. The protein aliquot was diluted with 8 M urea in 0.2 mM Tris-HCl (pH 9.0).
After protein reduction/alkylation, Lys-C/trypsin digestion (1/100 w/w) was performed as described previously [40]. Phosphopeptides were enriched by TiO2-based hydroxy acid-modified metal oxide chromatography (HAMMOC) [41,42] with elution by 0.5% piperidine. Phosphopeptides were desalted with StageTips [43] and suspended in the loading buffer (0.5% TFA and 4% ACN) for subsequent nanoLC-MS/MS analyses. The experimental process with either the supernatant or the pellet fraction was repeated five times for each kinase. Sample preparation methods for other experiments are described in the Supplementary Information.

NanoLC-MS/MS system

A self-pulled analytical column (150 mm length × 100 µm i.d.) was prepared with ReproSil-Pur C18-AQ materials (3 µm, Dr. Maisch, Ammerbuch, Germany) [44]. The mobile phases consisted of (A) 0.5% acetic acid and (B) 0.5% acetic acid in 80% acetonitrile. A gradient at a flow rate of 500 nL/min was employed: 5–10% B in 5 min, 10–40% B in 60 min, 40–100% B in 5 min, 100% B for 10 min, and 5% B for 30 min. An Ultimate 3000 pump (Thermo Fisher Scientific, Germering, Germany) and an HTC-PAL autosampler (CTC Analytics, Zwingen, Switzerland) were coupled with a TripleTOF 5600 System (AB Sciex, Foster City, CA). A spray voltage of 2,300 V was applied. The MS scan range was m/z 350–1500. The top 7 precursor ions were selected in the MS scan for subsequent MS/MS scans in high-sensitivity mode. MS scans were performed for 0.25 sec, and subsequent MS/MS scans were performed for 0.143 sec each. To minimize repeated scanning, previously scanned ions were excluded for 12 sec. The CID energy was automatically adjusted by the rolling CID function of Analyst TF 1.6.

Identification of phosphoproteome

The raw data files from the TripleTOF 5600 system were processed with AB Sciex MS Data Converter to create peak lists based on the recorded fragmentation spectra. Paragon on AB Sciex ProteinPilot (v. 4.5) and Mascot v. 2.4 (Matrix Science, London, U.K.) were used to search the SwissProt database (version 2013-03) with the parameters described previously [45], except for a precursor mass tolerance of 10 ppm and a fragment ion mass tolerance of 0.1 Da for the Mascot search. To integrate results from the two search engines, phosphopeptides identified by Mascot were accepted first, and then those not identified by Mascot were accepted from the Paragon results [46]. The numbers of unique phosphopeptides were counted based on the combination of sequence and modification.
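The integration and counting rules described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that "not identified by Mascot" is decided at the level of the phosphopeptide (sequence plus modification) rather than the spectrum, and the `PSM` record, its field names, and the modification encoding are all hypothetical.

```python
from typing import NamedTuple


class PSM(NamedTuple):
    """One peptide identification (hypothetical record layout)."""
    spectrum_id: str
    sequence: str
    modifications: tuple  # e.g. ((5, "Phospho"),) as (position, name) pairs


def peptide_key(psm):
    """A phosphopeptide is unique by its sequence and modification."""
    return (psm.sequence, psm.modifications)


def integrate(mascot_hits, paragon_hits):
    """Accept Mascot identifications first, then add Paragon-only peptides.

    Returns a dict mapping each unique phosphopeptide key to the engine
    that supplied it.
    """
    accepted = {}
    for psm in mascot_hits:
        accepted.setdefault(peptide_key(psm), "Mascot")
    for psm in paragon_hits:
        # setdefault leaves peptides already accepted from Mascot untouched
        accepted.setdefault(peptide_key(psm), "Paragon")
    return accepted


# Hypothetical identifications for illustration only.
mascot = [
    PSM("s1", "LRRASLG", ((5, "Phospho"),)),
    PSM("s2", "LRRASLG", ((5, "Phospho"),)),   # same peptide, second spectrum
]
paragon = [
    PSM("s3", "LRRASLG", ((5, "Phospho"),)),   # already accepted from Mascot
    PSM("s4", "AAPLSPGGA", ((5, "Phospho"),)),
    PSM("s5", "SSPK", ((1, "Phospho"),)),
    PSM("s6", "SSPK", ((2, "Phospho"),)),      # same sequence, different site
]

accepted = integrate(mascot, paragon)
print(len(accepted), "unique phosphopeptides")
```

Because the key combines sequence and modification, the two `SSPK` entries count as different phosphopeptides, while repeated spectra of the same peptide count once.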
Phosphorylation sites were confirmed by calculation of the PTM score (≥0.75) and/or combinations of site-determining b- and y-ions as described previously [41,47,48]. Site-confirmed phosphopeptides were remapped to protein sequences for protein annotation. When multiple proteins were assigned to one peptide sequence, one representative protein was selected. Phosphorylation sites on the kinases themselves were excluded from the final lists, since they cannot be distinguished in this experiment. False discovery rates (FDR) were estimated by searching against a randomized decoy database (