Kinase Substrate Profiling Using a Proteome-wide Serine-Oriented

From the Bench Cite This: Biochemistry XXXX, XXX, XXX−XXX

pubs.acs.org/biochemistry

Kinase Substrate Profiling Using a Proteome-wide Serine-Oriented Human Peptide Library

Karl W. Barber,†,‡ Chad J. Miller,§ Jay W. Jun,∥,⊥ Hua Jane Lou,§ Benjamin E. Turk,§ and Jesse Rinehart*,†,‡,⊥

† Department of Cellular & Molecular Physiology, Yale University, New Haven, Connecticut 06520, United States
‡ Systems Biology Institute, Yale University, West Haven, Connecticut 06516, United States
§ Department of Pharmacology, Yale University, New Haven, Connecticut 06520, United States
∥ Division of Nutritional Sciences, Cornell University, Ithaca, New York 14850, United States
⊥ The Cancer Systems Biology Consortium Research Center, Yale University, West Haven, Connecticut 06516, United States




Supporting Information

ABSTRACT: The human proteome encodes >500 protein kinases and hundreds of thousands of potential phosphorylation sites. However, the identification of kinase−substrate pairs remains an active area of research because the relationships between individual kinases and these phosphorylation sites remain largely unknown. Many techniques have been established to discover kinase substrates but are often technically challenging to perform. Moreover, these methods frequently rely on substrate reagent pools that do not reflect human protein sequences or are biased by human cell line protein expression profiles. Here, we describe a new approach called SERIOHL-KILR (serine-oriented human library−kinase library reactions) to profile kinase substrate specificity and to identify candidate substrates for serine kinases. Using a purified library of >100000 serine-oriented human peptides expressed heterologously in Escherichia coli, we perform in vitro kinase reactions to identify phosphorylated human peptide sequences by liquid chromatography and tandem mass spectrometry. We compare our results for protein kinase A to those of a well-established positional scanning peptide library method, confirming that SERIOHL-KILR can identify the same predominant motif elements as traditional techniques. We then interrogate a small panel of cancer-associated PKCβ mutants using our profiling protocol and observe a shift in substrate specificity likely attributable to the loss of key polar contacts between the kinase and its substrates. Overall, we demonstrate that SERIOHL-KILR can rapidly identify candidate kinase substrates that can be directly mapped to human sequences for pathway analysis. Because this technique can be adapted for various kinase studies, we believe that SERIOHL-KILR will have many new victims in the future.
The human genome encodes >500 protein kinases, which phosphorylate proteins to post-translationally regulate substrate structure and function.1 A combination of high-throughput mass spectrometry experiments and classical biochemical techniques has uncovered >10^5 putative sites of phosphorylation on human proteins.2,3 However, the observation of a phosphorylated protein within a complex mixture provides no direct information about the identity of the kinase responsible for the phosphorylation event. One solution to this problem is to use substrate motif analysis. Kinases have been long recognized to exhibit telltale patterns in substrate preference related to the primary sequence directly surrounding a phosphorylatable amino acid.4 Thus, phosphorylation occurring within a certain amino acid context may provide clues about the identity of the responsible kinase. To experimentally identify the preferred amino acid sequence motif for individual kinases, several in vitro methods have been developed. One common technique makes use of a positional scanning peptide library (PSPL), in which a purified kinase is reacted with an array of peptides harboring a single invariable amino acid located a fixed distance from a central phosphoacceptor, permitting construction of substrate motifs.5,6 However, kinase substrates are not directly identified and can be predicted only by the presence of motif elements in phosphoproteomic data sets. Other techniques that use mammalian lysate-based proteomes as in vitro substrate reagent pools have been established but may be biased by the protein expression profile of the cell type being used, therefore presenting an incomplete set of possible substrates to a kinase.7,8 Recently, we described a heterologous peptide-based representation of the human serine phosphoproteome, in which we were able to demonstrate the production of tens of thousands of phosphorylated or nonphosphorylated human

© XXXX American Chemical Society

Special Issue: Discovering New Tools Received: April 9, 2018 Revised: May 25, 2018


DOI: 10.1021/acs.biochem.8b00410 Biochemistry XXXX, XXX, XXX−XXX


Figure 1. SERIOHL-KILR workflow. (a) All instances of human serine phosphorylation were downloaded from the PhosphoSitePlus database.3 These sites were converted into DNA sequences encoding serine-oriented peptides retaining 15 amino acids N-terminal and C-terminal to the observed site of serine phosphorylation.9 These genes were introduced into a single vector library and transformed into E. coli for expression and subsequent purification of the serine-oriented human peptide library (SERIOHL). (b) The SERIOHL peptides were reacted in vitro with kinases of interest (SERIOHL-KILR). Phosphorylated peptides were enriched using TiO2 and then identified by LC−MS/MS.
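The peptide design summarized in Figure 1a can be illustrated with a minimal sketch. It assumes that each site is given as a 1-based serine position within a protein sequence and that windows are simply truncated at the protein termini; the function name, the truncation behavior, and the toy sequence are illustrative assumptions, not the authors' library-design code.

```python
def serine_window(protein: str, site: int, flank: int = 15) -> str:
    """Return the peptide spanning `flank` residues on each side of a serine.

    `site` is the 1-based position of the serine in `protein`. The window is
    truncated at the protein termini (an assumption of this sketch), so the
    result has at most 2 * flank + 1 residues.
    """
    idx = site - 1
    if not 0 <= idx < len(protein):
        raise ValueError("site lies outside the protein sequence")
    if protein[idx] != "S":
        raise ValueError(f"residue {site} is {protein[idx]!r}, not serine")
    start = max(0, idx - flank)
    end = min(len(protein), idx + flank + 1)
    return protein[start:end]


# Toy sequence (hypothetical, not a real protein); the serine is residue 27.
toy_protein = "MKT" + "A" * 20 + "RRLSVE" + "G" * 20
toy_peptide = serine_window(toy_protein, site=27)
```

With the default flank of 15, an internal site yields a 31-residue peptide whose central residue is the serine, whereas a site near a terminus yields a shorter peptide.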

peptide sequences from a single plasmid library in Escherichia coli.9 These peptides are based on previously observed instances of serine phosphorylation in human proteins. We realized that the nonphosphorylated (or “phosphorylatable”) human peptide library would serve as an ideal reagent for human kinase profiling for identifying candidate substrates for serine kinases of interest. Here, we use the serine-oriented human library of peptides (SERIOHL) to perform in vitro kinase library reactions (SERIOHL-KILR) and use liquid chromatography and tandem mass spectrometry (LC−MS/MS) to identify phosphorylated human substrate peptides of various kinases. We evaluate the results of our platform in parallel with the PSPL technique, demonstrating that SERIOHL-KILR performs like another rigorously established method. Overall, while our library presents some of the same limitations as other in vitro peptide-based methods, SERIOHL-KILR is rooted in common principles of recombinant protein expression and purification, making the technology very tractable for routine kinase substrate identification using techniques already employed in most molecular biology laboratories. Moreover, SERIOHL-KILR is capable of uncovering multiple motifs and combinatorial amino acid preferences in kinase substrate peptides and can be easily customized to encompass other user-defined peptide collections. Finally, the peptides in SERIOHL are based on nonrandom human sequences, thus enabling the direct identification of physiologically relevant candidate kinase substrates and permitting gene network analysis.

METHOD DEVELOPMENT

The design of the SERIOHL substrate reagent pool was described previously,9 but the resulting peptide library was not used in kinase reactions in that study. Briefly, 110139 previously observed instances of human serine phosphorylation were extracted from the PhosphoSitePlus database,3 and the amino acid sequences surrounding the phosphorylated serine residue were converted into ≤31 residue peptides centered around the phosphoacceptor serine (Figure 1a). An oligonucleotide library encoding these peptides was synthesized, amplified by polymerase chain reaction, and introduced into a bacterial expression vector in a single pool, which was shown to contain 94% of the anticipated DNA sequences by next-generation sequencing.9 In this plasmid library, the encoded peptides are fused to an N-terminal glutathione S-transferase (GST) tag and a C-terminal 6xHis tag to facilitate expression and purification (Figure S1). To prepare the SERIOHL peptides, we made use of a recoded strain of E. coli known as C321.ΔA (Addgene 68306) in which all genomic TAG codons have been replaced with TAA.10 Additionally, release factor 1 has been knocked out of this strain, such that UAG codons no longer serve as cues for translational termination (i.e., the UAG codon is functionally “unassigned”). By introducing a plasmid encoding the serine amber suppressor supD tRNA (Addgene 68307), we are able to drive serine incorporation in response to UAG codons at the ribosome.11 Each SERIOHL peptide-encoding DNA sequence contains a TAG codon, corresponding to the previously observed phosphorylated serine position. Although this method of peptide expression using amber suppression is not necessary for the synthesis of human peptides in E. coli, we were able to use the same precursor DNA library that previously enabled the expression of recombinant phosphopeptide collections via genetic code expansion.9 The direct incorporation of phosphoserine into these peptides at UAG codons remains a possibility for control experiments (e.g., to create a phosphorylated peptide as a mass spectrometry


Figure 2. Comparison of SERIOHL-KILR and PSPL techniques. (a) pLogo motif analysis14 of human peptides phosphorylated by PKA and identified by SERIOHL-KILR. Results from triplicate experiments were combined (n = 490 unique, library-mapped phosphopeptides). The red line indicates the p = 0.05 significance threshold with Bonferroni correction. (b) PSPL results using PKA. Log2-transformed selectivity scale shown at the right.
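The significance threshold drawn in such logos rests on a binomial comparison of foreground and background residue frequencies. The sketch below is not the pLogo implementation; it only illustrates a one-sided binomial overrepresentation test with a Bonferroni-corrected cutoff. The count of peptides carrying a given residue, the background frequency, and the number of tests (20 residues at 30 flanking positions) are hypothetical values chosen for illustration; only the foreground size of 490 comes from the caption.

```python
import math


def binom_log_pmf(k: int, n: int, p: float) -> float:
    """Natural log of the Binomial(n, p) probability mass at k (0 < p < 1)."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of seeing k or more by luck."""
    total = sum(math.exp(binom_log_pmf(i, n, p)) for i in range(k, n + 1))
    return min(1.0, total)


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


n_foreground = 490         # unique library-mapped phosphopeptides (Figure 2a)
k_with_residue = 300       # hypothetical count with a given residue at one position
p_background = 0.08        # hypothetical background frequency of that residue
n_tests = 20 * 30          # assumed: 20 residues x 30 flanking positions

p_value = binom_upper_tail(k_with_residue, n_foreground, p_background)
is_significant = p_value < bonferroni_alpha(0.05, n_tests)
```

The log-space probability mass avoids overflow in the binomial coefficient for foreground sets of this size.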

standard or to investigate the biochemical properties of a particular phosphorylated peptide).9,11,12 The SERIOHL plasmid library was first transformed into C321.ΔA with supD tRNA by standard electroporation methods. Serial dilutions of electroporated cells were plated on selective media (LB agar with 100 ng/mL ampicillin for the library plasmid and 25 ng/mL kanamycin for the supD tRNA plasmid), and colonies were counted to ensure that >10^7 transformants were obtained to maintain high peptide library diversity. This number is approximately 100-fold greater than the number of variants in the plasmid library, which was shown previously to be sufficient to produce a peptide library with >56000 unique members that could be observed by mass spectrometry.9 Peptide library expression and sequential purification using Ni-NTA and glutathione resins were described previously.9 Purified GST/6xHis-fusion SERIOHL peptides were concentrated to ∼500 μL using an Amicon Ultra-4 10 kDa molecular weight cutoff spin column (Millipore), transferred to an Amicon Ultra-0.5 10 kDa column (Millipore), and buffer exchanged [50 mM Tris (pH 7), 150 mM NaCl, and 20% glycerol] to a final volume of ∼100 μL. This served as the SERIOHL substrate pool for kinase reactions. We then proceeded to react the SERIOHL peptide pool with various kinases of interest and identify phosphorylated peptides by LC−MS/MS [SERIOHL-KILR (Figure 1b)]. To evaluate the performance of SERIOHL-KILR, we first reacted our peptide library with protein kinase A (PKA), an extensively characterized serine/threonine kinase with hundreds of known substrates.
The human PKA catalytic domain was expressed and purified with an N-terminal 6xHis tag as previously described.13 One microgram (∼670 nM) of the purified PKA catalytic domain was reacted with approximately 20 μg (∼630 nM) of GST-fused SERIOHL peptides in a buffer containing 50 mM Tris (pH 7.4), 150 mM NaCl, 50 μM DTT, 20 mM MgCl2, and 200 μM ATP in a 50 μL final reaction volume. Triplicate kinase reactions were performed at 30 °C for 4 h, chilled on ice, and prepared for LC−MS/MS analysis. Peptides were then digested using trypsin and desalted, and phosphorylated peptides were enriched using titanium dioxide (TiO2) as described previously.9
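Because the phosphorylated peptides are observed as tryptic fragments, it can be helpful to predict which fragment carries the central serine. The following is a minimal in silico digest under a commonly used simplified rule (cleavage C-terminal to Lys or Arg, except before Pro, with no missed cleavages); the rule, the function names, and the example sequences are assumptions for illustration and are not taken from the authors' pipeline.

```python
def trypsin_digest(peptide: str) -> list[str]:
    """Split a peptide after every K or R that is not followed by P."""
    fragments, start = [], 0
    for i, residue in enumerate(peptide):
        next_is_pro = i + 1 < len(peptide) and peptide[i + 1] == "P"
        if residue in "KR" and not next_is_pro:
            fragments.append(peptide[start:i + 1])
            start = i + 1
    if start < len(peptide):
        fragments.append(peptide[start:])
    return fragments


def fragment_with_site(peptide: str, site_index: int) -> str:
    """Return the tryptic fragment containing the 0-based `site_index`."""
    position = 0
    for fragment in trypsin_digest(peptide):
        if position <= site_index < position + len(fragment):
            return fragment
        position += len(fragment)
    raise IndexError("site_index lies outside the peptide")


# Hypothetical sequences: an R-R-x-S context and an R-x-x-S-x-R context.
pka_like = fragment_with_site("AAGLRRASLGEEK", 7)
pkc_like = fragment_with_site("GGRKGSFRVAA", 5)
```

In this toy case the serine flanked by basic residues on both sides ends up in a four-residue fragment, which illustrates how basic-rich motifs can produce fragments that are short for LC−MS/MS identification.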

Dried TiO2-enriched peptides were then resuspended in 0.5 μL of 70% formic acid, 0.3 μL of a 50% acetonitrile/0.1% formic acid mixture, and 6.2 μL of 0.1% trifluoroacetic acid. Five microliters of the sample were injected onto a 50 cm × 75 μm (inside diameter) PicoFrit column (New Objective) packed with 1.9 μm ReproSil-Pur 120 Å C18-AQ (Dr. Maisch) using an ACQUITY UPLC M-Class instrument (Waters) paired with a Q Exactive Plus mass spectrometer (Thermo). Liquid chromatography gradients (for a 290 min method), mass spectrometry operational parameters, and mass spectra search parameters (using MaxQuant version 1.5.1.2) were described previously.9 Tryptic phosphopeptides identified by MS2 were then mapped back to the in silico cDNA database of the encoded SERIOHL peptides using a custom Python script (https://github.com/rinehartlab/synphospho). Only SERIOHL peptides phosphorylated at the central serine residue were considered for further analysis.
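The mapping and filtering step described above can be sketched as follows. This is not the script in the cited repository; it is a simplified illustration that assumes each library entry stores its peptide sequence together with the 0-based index of its designed central serine, and that each identified phosphopeptide is given as a sequence plus the 0-based index of the phosphorylated residue within it. All names and sequences here are hypothetical.

```python
def map_to_library(fragment: str, phospho_index: int,
                   library: dict[str, tuple[str, int]]) -> list[str]:
    """Return IDs of library peptides whose central serine is the phosphosite.

    `library` maps a peptide ID to (sequence, center_index). A fragment
    matches when it occurs in the sequence and its phosphorylated residue
    coincides with the designed central serine.
    """
    hits = []
    for peptide_id, (sequence, center_index) in library.items():
        start = sequence.find(fragment)
        while start != -1:
            if start + phospho_index == center_index:
                hits.append(peptide_id)
                break
            start = sequence.find(fragment, start + 1)
    return hits


# Hypothetical 31-residue library entries, each with a serine at index 15.
toy_library = {
    "pep1": ("G" * 12 + "RRA" + "S" + "LGEEK" + "G" * 10, 15),
    "pep2": ("GGGGGRASLGEEKGG" + "S" + "G" * 15, 15),
}

# Tryptic fragment ASLGEEK phosphorylated on its serine (index 1).
central_hits = map_to_library("ASLGEEK", 1, toy_library)
```

In this toy library the fragment occurs in both entries, but only in the first does the phosphorylated serine coincide with the central position, so only that entry would be retained.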



RESULTS AND METHOD VALIDATION

In total, 519 unique phosphopeptides were identified as PKA substrates in three parallel SERIOHL-KILR replicates (Figure S2). Previously, we had seen evidence of >56000 unique human peptides in SERIOHL as determined by mass spectrometry,9 indicating that PKA phosphorylated approximately 1% of the library across the SERIOHL-KILR replicates. The phosphorylated SERIOHL peptides were then analyzed by pLogo motif analysis,14 using the 110139 theoretical SERIOHL peptides encoded in the plasmid library as background. The resulting sequence analysis revealed a strong −3R/−2R preference, which corresponds to the known, canonical R-R/K-x-S PKA consensus motif (Figure 2a).4 Parallel PSPL analysis with PKA showed a similar pattern as previously reported,5 thus validating the ability of SERIOHL-KILR to accurately identify kinase substrate motifs (Figure 2b). While the −3 and −2 Arg signature is dominant in both methods, the overrepresentation of hydrophobic residues Leu and Val at position +1 is present but slightly less evident in SERIOHL-KILR. We also observe a prominent deselection of proline at position +1 by SERIOHL-KILR, which is also observed by PSPL but at levels comparable to those of acidic amino acids not observed by SERIOHL-KILR. This interesting difference may reflect an improved signal-to-noise ratio gained by direct substrate interrogation/identification in SERIOHL-KILR compared to motif reassembly from the randomized peptides in PSPL.

One of the advantages of SERIOHL-KILR is the ability to investigate combinatorial effects of residues surrounding the central phosphoacceptor Ser. Using motif-x, we observe overrepresentation of −5R/−3R and −3R/−2S dual-residue motifs (Figure S3).15 Though slight preferences for −5R or −2S are also observed by PSPL, this technique cannot infer positional interdependence as the analysis of each position is independent of the analysis of the others. SERIOHL-KILR identified 16 peptides corresponding to bona fide human cellular PKA targets according to the PhosphoSitePlus database.3 While the number of previously identified PKA substrates observed by SERIOHL-KILR is enriched compared to the number for the total SERIOHL reagent library (3.1% of SERIOHL-KILR peptides compared to 0.4% in original library are listed as PKA substrates in PhosphoSitePlus), only 3.6% of known PKA sites were observed by SERIOHL-KILR. However, we note that SERIOHL-KILR performed very favorably in terms of motif element recognition of substrate peptides containing −3R and/or −2R (Figure 3). We found that 26% of phosphopeptides detected by SERIOHL-KILR contained both −3R and −2R, compared with 31% of PKA substrates in PhosphoSitePlus. Our method fares similarly well for the enrichment of peptides containing Arg at either of these positions (69 and 48% of identified sequences contain −3R and −2R, respectively). While SERIOHL-KILR is an effective tool for elucidating important substrate motif elements, subsequent in vitro or in vivo validation using standard biochemical or genetic approaches must be performed to confirm that full-length proteins corresponding to the identified SERIOHL peptide substrates are true kinase substrates.

Figure 3. Enrichment of the canonical PKA motif using SERIOHL-KILR. The total SERIOHL precursor pool refers to all 110139 theoretically encoded human peptides in the plasmid library. SERIOHL-KILR substrates refers to phosphopeptides identified by SERIOHL-KILR using PKA.

We then sought to use SERIOHL-KILR to characterize the effects of certain kinase mutations. Intriguingly, while cancer-associated kinase mutations typically cause hyperactivation, a subset of these mutations map to the catalytic cleft and in some cases appear to change substrate specificity. Recent studies have reported that kinase mutations observed in patient tumor samples can profoundly alter substrate recognition and may therefore contribute to pathogenesis by rewiring downstream signaling cascades.16,17 It was also recently discovered that mutations in protein kinase C β (PKCβ) were common in adult T cell leukemia/lymphoma.18 Several of the most recurrent mutations, including D427N, D470H, and E533K, occur in residues that likely confer kinase substrate specificity.19,20 We reasoned that SERIOHL-KILR could be leveraged to profile cancer-associated PKCβ mutations to better understand how these mutations might alter substrate choice and consequently rewire signaling networks.

We began by deploying SERIOHL-KILR to identify substrate recognition patterns and candidate substrates for wild-type (WT) PKCβ. We performed triplicate SERIOHL-KILR experiments using the same conditions as in the PKA reactions described above, but replacing PKA with 1 μg (∼490 nM) of full-length WT PKCβ and 5 μL of 10× PKC lipid activator (EMD Millipore, 20-133A) per 50 μL reaction mixture (Figure S4). None of the 50 previously identified PKCβ substrate peptides listed in the PhosphoSitePlus database were observed by SERIOHL-KILR, which is unsurprising given the relatively small number of known targets and the comparatively larger SERIOHL peptide population size. By mass spectrometry, we observe a very strong preference for Arg at position −3, hydrophobic residues at position +1, and, to a somewhat smaller extent, Arg at position +2. This matches well to the canonical PKCβ consensus sequence R-x-x-pS-ϕ-R (where ϕ is a hydrophobic residue).21 We also performed PSPL for WT PKCβ, which revealed similar preferences for Arg at positions −3 and +2 and hydrophobic residues at position +1 but also exhibited a global preference for basic residues at all positions (Figure 4a).

We then used SERIOHL-KILR to characterize the D427N, D470H, and E533K mutations in PKCβ. We observed that the D427N and D470H PKCβ mutants completely lost specificity for substrates containing a −3R, seemingly ablating a major determinant of specificity within PKCβ (Figure 4b,c). Interestingly, the E533K mutant also showed a weakened preference for −3R compared to WT PKCβ, but less so in comparison to the other mutants (Figure 4d). By PSPL analysis, we also observed a decrease in selectivity for Arg primarily at position −3 in all mutants (Figure 4b−d). One difference between SERIOHL-KILR and PSPL analysis was the relative prominence of the −2R signature in PSPL for WT PKCβ and its decreased prevalence in the PSPL profile of the E533K mutant. While this effect is still observed to a lesser extent by SERIOHL-KILR, it is one of the most striking differences between the PKCβ variants according to the PSPL analysis. These differences between methods may be due to sequence bias and limited diversity within the SERIOHL substrate pool, which is not a perfectly randomized collection of candidate substrates that can reveal subtleties in substrate preference, as is possible with the PSPL technique. Alternatively, this difference could reflect representation bias in the PSPL peptide mixtures. To provide independent validation of these results, we performed assays using WT and mutant PKCβ with matched pairs of individual peptide substrates with single-residue substitutions at the appropriate positions (Figure S5). As anticipated, we observed reduced selectivity for Arg at position −2 with the D470H and E533K mutants and position −3 with the D427N mutant (Figure S5). These assays further suggest that PKCβ mutations change specificity by reducing the rate of phosphorylation of Arg-containing peptides preferred by the WT kinase, while having activity similar to that of WT on peptides lacking the key Arg residue.
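The motif-element percentages quoted for PKA (for example, the share of identified peptides carrying Arg at −3, at −2, or at both) reduce to a simple positional tally over the substrate peptides. A minimal sketch follows, assuming full-length 31-residue peptides with the phosphoacceptor serine at 0-based index 15; the function name and the four toy peptides are hypothetical and do not reproduce the reported data.

```python
def fraction_with(peptides: list[str], requirements: dict[int, str],
                  center: int = 15) -> float:
    """Fraction of peptides with the required residue at each offset.

    `requirements` maps an offset relative to the central serine
    (e.g. -3) to a one-letter residue code (e.g. "R").
    """
    def matches(sequence: str) -> bool:
        for offset, residue in requirements.items():
            position = center + offset
            if not 0 <= position < len(sequence) or sequence[position] != residue:
                return False
        return True

    if not peptides:
        return 0.0
    return sum(matches(p) for p in peptides) / len(peptides)


def toy_peptide(minus3: str, minus2: str) -> str:
    """Build a hypothetical 31-mer with chosen residues at -3 and -2."""
    return "G" * 12 + minus3 + minus2 + "A" + "S" + "G" * 15


toy_substrates = [
    toy_peptide("R", "R"),
    toy_peptide("R", "A"),
    toy_peptide("A", "R"),
    toy_peptide("A", "A"),
]
share_minus3 = fraction_with(toy_substrates, {-3: "R"})
share_both = fraction_with(toy_substrates, {-3: "R", -2: "R"})
```

Applying the same tally to the full precursor pool and to the identified substrates gives the two quantities that are compared in an enrichment plot of this kind.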


Figure 4. Comparison of SERIOHL-KILR and PSPL for investigating cancer-associated PKCβ mutants. pLogo14 of SERIOHL-KILR results (left) and PSPL data (right) for (a) WT PKCβ (n = 77), (b) PKCβ D427N (n = 323), (c) PKCβ D470H (n = 303), and (d) PKCβ E533K (n = 100). Unique, library-mapped phosphopeptides from triplicate experiments were combined for pLogo. The red line indicates the p = 0.05 significance threshold with Bonferroni correction. Quantified PSPL data were normalized by position and log2 transformed, with the selectivity scale shown to the right of the heat map.


The results obtained by both substrate profiling methods can be rationalized by the kinase−substrate interaction architecture from the published structure of PKA in complex with a peptide inhibitor.22 We noted that the acidic residues at all three tested mutational positions make direct interactions via polar contacts with Arg residues in a substrate peptide (Figure 5). D427 interacts with Arg at position −3, while D470 and E533 interact with −2R. This corresponds well with our SERIOHL-KILR and PSPL data, which show a strong decrease in preference for −3R caused by the PKCβ D427N mutation. This decrease in selectivity for −3R is not as stark in the E533K mutant, which is consistent with the lack of direct contact between these amino acids. PKCβ D470H also seemingly exhibits a decrease in preference for the −3R position by SERIOHL-KILR, which could be explained by changes in electrostatics incurred by the mutations. Because Arg residues are very important in the interaction between the kinase domain and the N-terminal pseudosubstrate sequence in PKCβ,23,24 these kinase mutations have been proposed to decrease the level of PKCβ autoinhibition, leading to activation. Overall, our results provide evidence that these mutations are likely to affect the substrate repertoire of PKCβ, both reducing the level of phosphorylation of native substrates and leading to the acquisition of new ones.

Figure 5. Theoretical rationale for the loss of substrate specificity in PKCβ mutants. Crystal structure of the catalytic domain of PKA complexed with the peptide inhibitor (Protein Data Bank entry 1ATP).22 Aligned positions corresponding to relevant PKCβ mutations (D427, D470, and E533) are shown. Green lines represent hydrogen bonds.

One important benefit of using the SERIOHL peptide collection is that all library members are derived from authentic phosphorylation sites in human proteins. As such, the candidate kinase substrates identified by SERIOHL-KILR can be used for pathway analysis to provide potential mechanistic insights into the effects of the various PKCβ mutations. We used gene ontology (GO) enrichment analysis to identify biological processes correlated with the human genes corresponding to identified phosphorylated SERIOHL peptides [http://geneontology.org/page/go-enrichment-analysis (Figure S6)].25 Genes associated with peptides observed in experiments using WT PKCβ were not substantially enriched for any biological processes. We then looked specifically at SERIOHL peptides that were observed exclusively in mutant samples and not with WT PKCβ and used genes corresponding to all possible encoded SERIOHL peptides as the background. We observed several enriched functions in PKCβ D427N samples (spindle assembly, regulation of cell cycle, regulation of organelle organization, and intracellular signal transduction) and PKCβ D470H samples (cytoskeleton organization and microtubule-based process). Loss of regulation of mitotic spindle/microtubule organization and cell cycle disruption have been previously associated with adult T cell leukemia,26−28 but these functional alterations have not been directly tied to PKCβ mutations. These results provide an interesting starting point for further studies into the implications of kinase mutations on signaling output and present clues about potential molecular underpinnings of cancer pathogenesis related to these specific substitutions within the PKCβ kinase domain. The PKCβ E533K mutant exhibited no significantly overrepresented terms, which could be due to either the limited number of candidate substrates identified by SERIOHL-KILR (Figure S4) or the downstream effects of this specific mutation (Figure 4). The candidate substrates identified in all SERIOHL-KILR experiments appeared to be uniformly enriched in sites predicted to be substrates of PKC family kinases by the NetPhorest algorithm, with no other obvious kinases selectively targeting the identified phosphorylation sites.29

DISCUSSION

Overall, we have shown that SERIOHL-KILR is an effective technique for identifying candidate substrates for serine kinases of interest. SERIOHL-KILR performs like PSPL, identifying similar sequence elements that are essential to kinase−substrate interactions. Head to head, each technique offers several distinct advantages. PSPL reveals highly nuanced, quantitative information about kinase substrate selectivity by surveying individual amino acids surrounding a phosphoacceptor. These same measurements for all 20 natural amino acids at multiple positions cannot be observed by SERIOHL-KILR because of the limited number of substrates present in the reagent library and the nonrandomized distribution of amino acid composition within SERIOHL peptides, which are based on human sequences. Because of the differences in the abundance of various SERIOHL peptides as well as potential differences in peptide ionization, we note that rank ordering or weighing intensities of phosphopeptides observed by mass spectrometry would not be an appropriate analytical technique for this method. On the other hand, because the SERIOHL peptides directly correspond to physiologically relevant protein sequences and in vivo phosphorylation sites, they can be mapped to full-length human proteins for candidate substrate identification and gene pathway analysis. By the same token, the lower diversity of the substrate pool based on all known possible serine kinase substrates may decrease the background of SERIOHL-KILR experiments (i.e., fewer false positives because there are no randomized or nonphysiological candidate substrates present). SERIOHL-KILR can also notably identify multiple multiresidue motifs


these reasons, SERIOHL-KILR may therefore selectively represent peptides that exist at higher concentrations in the reagent peptide library (i.e., increasing the incidence of false negatives, but not necessarily impacting the false positive rate). The use of peptide libraries in which each constituent sequence is more rigorously normalized would therefore improve the performance of this platform as would synthesis of smaller, targeted SERIOHL peptide pools. Still, the biochemical properties of certain phosphopeptides may make them difficult or impossible to observe by LC−MS/MS.9 Certain kinases that prefer substrates containing multiple basic residues, such as WT PKCβ that exhibits a −3R/+2R substrate motif preference, may yield tryptic fragments that are too short to observe by LC−MS/MS, weakening the ability to detect certain candidate substrates by SERIOHL-KILR unless they are not properly cleaved by trypsin. This problem could be remediated by performing partial trypsin digests or by using alternative proteases. This is also a possible reason for why fewer phosphorylated SERIOHL peptides were identified as substrates of WT PKCβ than of the mutant variants (Figure 4). This limitation exists, however, for most MS-based kinase substrate identification methods and is not unique to SERIOHL-KILR. Additionally, only the catalytic domain of PKA was used in our experiments, and SERIOHL peptides correspond to only a small fragment of their corresponding human proteins. These structural simplifications likely eliminate key binding domains or other elements that are necessary for interaction coordination, limiting the scope of the SERIOHL-KILR platform as compared to in vivo systems. Any candidate substrates of interest identified using this technology will need to be further validated using full-length proteins and eukaryotic cellular assays. There are several additional caveats for the SERIOHL-KILR method. 
The SERIOHL peptide reagent pool is very diverse and does not take into account tissue, cellular, or subcellular expression profiles that are encountered in mammalian systems. Therefore, our SERIOHL-KILR data set likely contains proteins that can serve as kinase substrates but that do not come into contact with the tested kinases in intact eukaryotic cells. Future iterations of SERIOHL peptide collections could be constructed on the basis of organ-, tissue-, or organelle-specific kinase/proteome expression profiles to better limit the scope of the identified candidate kinase substrates. The SERIOHL peptides are also not designed to study threonine or tyrosine phosphorylation or other classes of post-translational modifications. However, the same design principles could easily be extended to make peptide libraries that would be more applicable to other kinases or enzyme classes of interest, especially as the cost of large-scale singlestranded DNA library synthesis continues to drop dramatically. We do consider the low cost and ease of SERIOHL peptide isolation to be substantial platform advantages. Bacterial overexpression of the recombinant SERIOHL peptides is a convenient technique for renewably generating a reagent pool for substrate identification reactions. Using the GST-fused SERIOHL peptides, we obtain approximately 1 mg of the library/L of culture, which is enough reagent for 50 SERIOHLKILR experiments. Finally, the reaction conditions for SERIOHL-KILR will likely need to be tailored to individual kinases. Time course SERIOHL-KILR assays could be performed to identify which SERIOHL substrates are preferred by a kinase of interest and to avoid false positives that may arise in long reactions. Labeled internal standards should also

(Figure S3) or uncover preferred sequence elements up to 15 amino acids from the central phosphoacceptor residue. However, PSPL can also uncover the importance of modified residues such as pThr and pTyr in kinase substrate motifs, for which the current SERIOHL-KILR platform does not account. Our new platform also offers several advantages compared to other established kinase−substrate discovery techniques, but there are certain trade-offs. One shortcoming of SERIOHLKILR is that peptide-based representations of proteins fail to capture certain important secondary, tertiary, and quaternary structural elements that may be influential or essential in kinase−substrate recognition. Ideally, kinase−substrate pairings should be uncovered in the context of native systems or human cell lines, therefore retaining as much protein sequence and spatiotemporal information as possible. However, several challenges are presented in eukaryotic platforms. Parsing the role of an individual kinase in eukaryotic cells is difficult, as hundreds of other kinases may be present simultaneously. Chemical genetic and rescue approaches, which enable the selective inhibition and activation, respectively, of the chemically sensitive mutant kinase of interest, exhibit exquisite control over kinase activity30−32 yet may not be amenable to all kinases. Although membrane-associated and transmembrane proteins are notoriously difficult to isolate and study, there are many SERIOHL peptides that correspond to regions of these proteins, enabling their potential identification as candidate kinase substrates. 
Besides PSPL, other in vitro approaches have sought to elucidate kinase−substrate pairings, such as on-bead peptide library screening, which uses partial Edman degradation and mass spectrometry to identify individual substrate sequences from a randomized peptide library.33 This powerful technique can uncover sequence covariance and favored and/or disfavored residues in substrates, but the resulting optimal substrate sequences do not directly correspond to human proteins. Other techniques, such as high-density microchips containing full-length human proteins,34 are difficult to construct in most laboratory settings, are challenging to customize, and fail to offer site-specific information about the location of protein phosphorylation. By contrast, SERIOHL-KILR is simple to perform and customizable, and the sequence of each possible phosphorylation site is genetically preprogrammed and easy to fully identify by mass spectrometry. In vitro reactions using a kinase of interest with intact or protease/phosphatase-treated mammalian lysates have also been very important tools in identifying kinase−substrate pairings;7,8 however, these methods can be biased in favor of the most abundant proteins in cell lines, and endogenous kinases can complicate experimental findings. Finally, other highly effective E. coli-based techniques for kinase substrate profiling have been developed, such as using bacterial surface display of hundreds of human peptides35 or identifying E. coli substrates of human kinases.36 By comparison, our technology offers a very large collection of peptides (tens of thousands) that correspond to human protein sequences. There are several reasons that the results obtained by SERIOHL-KILR may be incomplete or biased.
A likely culprit in the bias of the results is the distribution of peptides within the SERIOHL peptide reagent pool; the most abundant peptides may outcompete peptides that exist at lower concentrations within the mixture during kinase reactions. Highly prevalent phosphopeptides may also mask or suppress the signals of rarer phosphopeptides by mass spectrometry. For

DOI: 10.1021/acs.biochem.8b00410 Biochemistry XXXX, XXX, XXX−XXX


(5) Hutti, J. E., Jarrell, E. T., Chang, J. D., Abbott, D. W., Storz, P., Toker, A., Cantley, L. C., and Turk, B. E. (2004) A rapid method for determining protein kinase phosphorylation specificity. Nat. Methods 1, 27−29.
(6) Miller, C. J., and Turk, B. E. (2016) Kinase Screening and Profiling. Methods Mol. Biol. (N. Y., NY, U. S.) 1360, 203−216.
(7) Kettenbach, A. N., Wang, T., Faherty, B. K., Madden, D. R., Knapp, S., Bailey-Kellogg, C., and Gerber, S. A. (2012) Rapid Determination of Multiple Linear Kinase Substrate Motifs by Mass Spectrometry. Chem. Biol. 19, 608−618.
(8) Cohen, P., and Knebel, A. (2006) KESTREL: a powerful method for identifying the physiological substrates of protein kinases. Biochem. J. 393, 1−6.
(9) Barber, K. W., Muir, P., Szeligowski, R. V., Rogulina, S., Gerstein, M., Sampson, J. R., Isaacs, F. J., and Rinehart, J. (2018) Encoding human serine phosphopeptides in bacteria for proteome-wide identification of phosphorylation-dependent interactions. Nat. Biotechnol., DOI: 10.1038/nbt.4150, Advance online publication.
(10) Lajoie, M. J., Rovner, A. J., Goodman, D. B., Aerni, H.-R., Haimovich, A. D., Kuznetsov, G., Mercer, J. A., Wang, H. H., Carr, P. A., Mosberg, J. A., Rohland, N., Schultz, P. G., Jacobson, J. M., Rinehart, J., Church, G. M., and Isaacs, F. J. (2013) Genomically Recoded Organisms Expand Biological Functions. Science 342, 357−360.
(11) Pirman, N. L., Barber, K. W., Aerni, H. R., Ma, N. J., Haimovich, A. D., Rogulina, S., Isaacs, F. J., and Rinehart, J. (2015) A flexible codon in genomically recoded Escherichia coli permits programmable protein phosphorylation. Nat. Commun. 6, 8130.
(12) Barber, K. W., and Rinehart, J. (2017) Kinase Signaling Networks. Methods Mol. Biol. (N. Y., NY, U. S.) 1636, 71−78.
(13) Chen, C., Ha, B., Thévenin, A. F., Lou, H., Zhang, R., Yip, K. Y., Peterson, J. R., Gerstein, M., Kim, P. M., Filippakopoulos, P., Knapp, S., Boggon, T. J., and Turk, B. E. (2014) Identification of a Major Determinant for Serine-Threonine Kinase Phosphoacceptor Specificity. Mol. Cell 53, 140−147.
(14) O’Shea, J. P., Chou, M. F., Quader, S. A., Ryan, J. K., Church, G. M., and Schwartz, D. (2013) pLogo: a probabilistic approach to visualizing sequence motifs. Nat. Methods 10, 1211.
(15) Schwartz, D., and Gygi, S. P. (2005) An iterative statistical approach to the identification of protein phosphorylation motifs from large-scale data sets. Nat. Biotechnol. 23, 1391−1398.
(16) Creixell, P., Palmeri, A., Miller, C. J., Lou, H., Santini, C. C., Nielsen, M., Turk, B. E., and Linding, R. (2015) Unmasking Determinants of Specificity in the Human Kinome. Cell 163, 187−201.
(17) Creixell, P., Schoof, E. M., Simpson, C. D., Longden, J., Miller, C. J., Lou, H., Perryman, L., Cox, T. R., Zivanovic, N., Palmeri, A., Wesolowska-Andersen, A., Helmer-Citterich, M., Ferkinghoff-Borg, J., Itamochi, H., Bodenmiller, B., Erler, J. T., Turk, B. E., and Linding, R. (2015) Kinome-wide Decoding of Network-Attacking Mutations Rewiring Cancer Signaling. Cell 163, 202−217.
(18) Kataoka, K., Nagata, Y., Kitanaka, A., Shiraishi, Y., Shimamura, T., Yasunaga, J.-i., Totoki, Y., Chiba, K., Sato-Otsubo, A., Nagae, G., Ishii, R., Muto, S., Kotani, S., Watatani, Y., Takeda, J., Sanada, M., Tanaka, H., Suzuki, H., Sato, Y., Shiozawa, Y., Yoshizato, T., Yoshida, K., Makishima, H., Iwanaga, M., Ma, G., Nosaka, K., Hishizawa, M., Itonaga, H., Imaizumi, Y., Munakata, W., Ogasawara, H., Sato, T., Sasai, K., Muramoto, K., Penova, M., Kawaguchi, T., Nakamura, H., Hama, N., Shide, K., Kubuki, Y., Hidaka, T., Kameda, T., Nakamaki, T., Ishiyama, K., Miyawaki, S., Yoon, S.-S., Tobinai, K., Miyazaki, Y., Takaori-Kondo, A., Matsuda, F., Takeuchi, K., Nureki, O., Aburatani, H., Watanabe, T., Shibata, T., Matsuoka, M., Miyano, S., Shimoda, K., and Ogawa, S. (2015) Integrated molecular analysis of adult T cell leukemia/lymphoma. Nat. Genet. 47, 1304−1315.
(19) Zhu, G., Fujii, K., Liu, Y., Codrea, V., Herrero, J., and Shaw, S. (2005) A Single Pair of Acidic Residues in the Kinase Major Groove Mediates Strong Substrate Preference for P-2 or P-5 Arginine in the AGC, CAMK, and STE Kinase Families. J. Biol. Chem. 280, 36372−36379.

be utilized to allow phosphopeptide quantitation in SERIOHL-KILR.37 In this work, we have described a new method for identifying candidate substrates for kinases of interest. We anticipate that this technique will be applicable to many different kinases and may be used to identify potential downstream effects of clinically relevant kinase mutations.
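The time-course and internal-standard suggestions above can be illustrated with a minimal sketch. It is not the analysis used in this work. It assumes hypothetical inputs: for each peptide, the light (sample) phosphopeptide intensity and the intensity of a spiked heavy-labeled standard at several reaction times, with a known spiked amount (`heavy_fmol`). The peptide names, intensity values, function names, and the choice of a through-origin linear fit as a crude initial rate are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical record: (time in minutes, light intensity, heavy standard intensity)
Timepoint = Tuple[float, float, float]


def quantify(light: float, heavy: float, heavy_fmol: float) -> float:
    """Estimate the phosphopeptide amount (fmol) from the light/heavy ratio."""
    if heavy <= 0:
        raise ValueError("heavy standard intensity must be positive")
    return light / heavy * heavy_fmol


def initial_rate(points: List[Timepoint], heavy_fmol: float) -> float:
    """Slope (fmol/min) of a least-squares line forced through the origin."""
    num = sum(t * quantify(light, heavy, heavy_fmol) for t, light, heavy in points)
    den = sum(t * t for t, _, _ in points)
    if den == 0:
        raise ValueError("need at least one nonzero time point")
    return num / den


def rank_substrates(
    data: Dict[str, List[Timepoint]], heavy_fmol: float
) -> List[Tuple[str, float]]:
    """Rank peptides by estimated initial phosphorylation rate, fastest first."""
    rates = {name: initial_rate(points, heavy_fmol) for name, points in data.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Made-up example values; "pep_A" and "pep_B" are placeholders, not library members.
example = {
    "pep_A": [(2, 200, 100), (5, 500, 100), (10, 1000, 100)],
    "pep_B": [(2, 20, 100), (5, 50, 100), (10, 100, 100)],
}

for name, rate in rank_substrates(example, heavy_fmol=10.0):
    print(f"{name}: {rate:.2f} fmol/min")
```

Ranking by an early-phase rate rather than by end-point signal is one way to favor preferred substrates over peptides that only become phosphorylated in long reactions. The linear fit is reasonable only while product formation is still roughly linear in time.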



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.biochem.8b00410.

Supplementary methods (concerning PSPL technique) and figures (replication analysis, motif-x analysis for PKA, peptide phosphorylation assays, and GO enrichment analysis) (PDF)

SERIOHL-KILR results (peptide substrates identified by LC−MS/MS for PKA and all tested PKCβ variants and corresponding gene names) (XLSX)



AUTHOR INFORMATION

Corresponding Author

*Address: 850 W. Campus Dr., Integrated Science and Technology Center, West Haven, CT 06516. E-mail: jesse. [email protected].

ORCID

Karl W. Barber: 0000-0003-0672-8409
Benjamin E. Turk: 0000-0001-9275-4069
Jesse Rinehart: 0000-0003-4839-4005

Funding

K.W.B. is supported by the National Science Foundation Graduate Research Fellowship via Grant DGE-1122492. B.E.T. is supported by the National Institutes of Health (NIH) (GM104047). J.R. is supported by the NIH (GM117230, GM125951, DK017433, and CA209992). J.W.J. is supported by Cancer Systems Biology Consortium (CSBC) and Physical Sciences in Oncology Network (PS-ON) Summer Research Fellowships.

Notes

The authors declare the following competing financial interest(s): K.W.B. and J.R. have filed a provisional patent application with the U.S. Patent and Trademark Office (U.S. Patent Application 62/639,279) related to this work.

ACKNOWLEDGMENTS

The authors thank Shannon Hughes.

REFERENCES

(1) Ubersax, J. A., and Ferrell, J. E., Jr. (2007) Mechanisms of specificity in protein phosphorylation. Nat. Rev. Mol. Cell Biol. 8, 530−541.
(2) Diella, F., Cameron, S., Gemünd, C., Linding, R., Via, A., Kuster, B., Sicheritz-Pontén, T., Blom, N., and Gibson, T. J. (2004) Phospho.ELM: A database of experimentally verified phosphorylation sites in eukaryotic proteins. BMC Bioinf. 5, 79.
(3) Hornbeck, P. V., Zhang, B., Murray, B., Kornhauser, J. M., Latham, V., and Skrzypek, E. (2015) PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 43, D512.
(4) Kemp, B. E., and Pearson, R. B. (1990) Protein kinase recognition sequence motifs. Trends Biochem. Sci. 15, 342−346.


(20) Chen, C., Nimlamool, W., Miller, C. J., Lou, H., and Turk, B. E. (2017) Rational Redesign of a Functional Protein Kinase-Substrate Interaction. ACS Chem. Biol. 12, 1194−1198.
(21) Nishikawa, K., Toker, A., Johannes, F.-J., Songyang, Z., and Cantley, L. C. (1997) Determination of the Specific Substrate Sequence Motifs of Protein Kinase C Isozymes. J. Biol. Chem. 272, 952−960.
(22) Zheng, J., Trafny, E. A., Knighton, D. R., Xuong, N., Taylor, S. S., Ten Eyck, L. F., and Sowadski, J. M. (1993) 2.2 Å refined crystal structure of the catalytic subunit of cAMP-dependent protein kinase complexed with MnATP and a peptide inhibitor. Acta Crystallogr., Sect. D: Biol. Crystallogr. 49, 362−365.
(23) House, C., and Kemp, B. E. (1990) Protein kinase C pseudosubstrate prototope: Structure-function relationships. Cell. Signalling 2, 187−190.
(24) Newton, A. C. (1995) Protein Kinase C: Structure, Function, and Regulation. J. Biol. Chem. 270, 28495−28498.
(25) Mi, H., Muruganujan, A., and Thomas, P. D. (2012) PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res. 41, D377−D386.
(26) Kasai, T., Iwanaga, Y., Iha, H., and Jeang, K.-T. (2002) Prevalent Loss of Mitotic Spindle Checkpoint in Adult T-cell Leukemia Confers Resistance to Microtubule Inhibitors. J. Biol. Chem. 277, 5187−5193.
(27) Sieburg, M., Tripp, A., Ma, J.-W., and Feuer, G. (2004) Human T-Cell Leukemia Virus Type 1 (HTLV-1) and HTLV-2 Tax Oncoproteins Modulate Cell Cycle Progression and Apoptosis. Journal of Virology 78, 10399−10409.
(28) Nejmeddine, M., Negi, V. S., Mukherjee, S., Tanaka, Y., Orth, K., Taylor, G. P., and Bangham, C. R. M. (2009) HTLV-1−Tax and ICAM-1 act on T-cell signal pathways to polarize the microtubule-organizing center at the virological synapse. Blood 114, 1016−1025.
(29) Miller, M., Jensen, L., Diella, F., Jørgensen, C., Tinti, M., Li, L., Hsiung, M., Parker, S. A., Bordeaux, J., Sicheritz-Ponten, T., Olhovsky, M., Pasculescu, A., Alexander, J., Knapp, S., Blom, N., Bork, P., Li, S., Cesareni, G., Pawson, T., Turk, B. E., Yaffe, M. B., Brunak, S., and Linding, R. (2008) Linear Motif Atlas for Phosphorylation-Dependent Signaling. Sci. Signaling 1, ra2.
(30) Shah, K., Liu, Y., Deirmengian, C., and Shokat, K. M. (1997) Engineering unnatural nucleotide specificity for Rous sarcoma virus tyrosine kinase to uniquely label its direct substrates. Proc. Natl. Acad. Sci. U. S. A. 94, 3565−3570.
(31) Knight, Z. A., and Shokat, K. M. (2007) Chemical Genetics: Where Genetics and Pharmacology Meet. Cell 128, 425−430.
(32) Qiao, Y., Molina, H., Pandey, A., Zhang, J., and Cole, P. A. (2006) Chemical Rescue of a Mutant Enzyme in Living Cells. Science 311, 1293−1297.
(33) Trinh, T. B., Xiao, Q., and Pei, D. (2013) Profiling the Substrate Specificity of Protein Kinases by On-Bead Screening of Peptide Libraries. Biochemistry 52, 5645−5655.
(34) Hall, D. A., Ptacek, J., and Snyder, M. (2007) Protein microarray technology. Mech. Ageing Dev. 128, 161−167.
(35) Shah, N. H., Wang, Q., Yan, Q., Karandur, D., Kadlecek, T. A., Fallahee, I. R., Russ, W. P., Ranganathan, R., Weiss, A., and Kuriyan, J. (2016) An electrostatic selection mechanism controls sequential kinase signaling downstream of the T cell receptor. eLife 5, e20105.
(36) Chou, M. F., Prisic, S., Lubner, J. M., Church, G. M., Husson, R. N., and Schwartz, D. (2012) Using Bacteria to Determine Protein Kinase Specificity and Predict Target Substrates. PLoS One 7, e52747.
(37) Kubota, K., Anjum, R., Yu, Y., Kunz, R. C., Andersen, J. N., Kraus, M., Keilhack, H., Nagashima, K., Krauss, S., Paweletz, C., Hendrickson, R. C., Feldman, A. S., Wu, C.-L., Rush, J., Villén, J., and Gygi, S. P. (2009) Sensitive multiplexed analysis of kinase activities and activity-based kinase identification. Nat. Biotechnol. 27, 933−940.
