Review pubs.acs.org/CR

Short Linear Motifs: Ubiquitous and Functionally Diverse Protein Interaction Modules Directing Cell Regulation

Kim Van Roey,† Bora Uyar,† Robert J. Weatheritt,‡ Holger Dinkel,† Markus Seiler,† Aidan Budd,† Toby J. Gibson,† and Norman E. Davey*,†,§

† Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
‡ MRC Laboratory of Molecular Biology (LMB), Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, United Kingdom
§ Department of Physiology, University of California, San Francisco, San Francisco, California 94143, United States

CONTENTS
1. Introduction
2. SLiMs Are a Distinct Class of Protein Interaction Module
3. Properties of SLiMs
3.1. Compactness and Structural Plasticity of SLiMs
3.1.1. SLiMs as Compact Binding Modules
3.1.2. SLiMs as Structurally Flexible Binding Modules
3.2. Evolution of SLiMs
3.2.1. Evidence for De Novo Evolution of SLiMs
3.2.2. Conservation of SLiMs
3.2.3. Evolution of SLiM-Binding Domains
3.3. Binding Determinants of SLiMs
3.3.1. Intrinsic Determinants
3.3.2. Extrinsic Determinants
3.4. Regulation of SLiMs
3.4.1. Pretranslational Regulation of SLiMs
3.4.2. Post-Translational Regulation of SLiMs
4. Functions of SLiMs
4.1. Ligand Motifs
4.1.1. Ligand Motifs and Enzyme Recruitment
4.1.2. Ligand Motifs and Protein Targeting
4.1.3. Ligand Motifs and Protein Complex Assembly
4.1.4. Summary of Ligand Motifs
4.2. Post-Translational Modification Motifs
4.2.1. Post-Translational Modification Motifs and Moiety Addition or Removal
4.2.2. Post-Translational Modification Motifs and Proteolytic Cleavage
4.2.3. Post-Translational Modification Motifs and Structural Modification
4.2.4. Summary of Post-Translational Modification Motifs
5. Strategies of Cooperative SLiM Use
5.1. From Translation to Degradation: The Life Cycle of CDC25
5.1.1. SLiMs and CDC25 Modification
5.1.2. SLiMs and CDC25 Activity
5.1.3. SLiMs and CDC25 Stability
5.1.4. SLiMs and CDC25 Localization
5.1.5. Summary of SLiM-Mediated Regulation of the Life Cycle of CDC25
5.2. Functional Versatility of p21
5.2.1. SLiMs and Inhibition of Cyclin-CDK Activity by p21
5.2.2. SLiMs and Inhibition of DNA Replication by p21
5.2.3. Summary of SLiM-Mediated Functional Flexibility of p21
6. Strategies of Systemic SLiM Use
6.1. Dynamic Signaling Platforms in T-Cell Receptor Signaling
6.1.1. SLiMs and LCK Regulation
6.1.2. ITAM Activation and ZAP70 Recruitment
6.1.3. SLiMs and LAT Complex Assembly
6.1.4. SLiMs and Activation of Downstream Pathways
6.1.5. Summary of SLiMs in T-Cell Receptor Signaling
6.2. Combinatorial Signal Integration in Hypoxia
6.2.1. SLiMs Controlling HIF1A Stability and Transcriptional Activity
6.2.2. SLiMs Inhibiting p53 in Normoxia
6.2.3. SLiMs Activating p53 in Sustained Hypoxia

© 2014 American Chemical Society


Special Issue: 2014 Intrinsically Disordered Proteins (IDPs)
Received: October 17, 2013
Published: June 13, 2014

dx.doi.org/10.1021/cr400585q | Chem. Rev. 2014, 114, 6733−6778

6.2.4. Summary of SLiMs in Combinatorial Signal Integration in Hypoxia
7. SLiMs and Disease
7.1. SLiMs in Infectious Diseases
7.1.1. Host SLiM Mimicry by Pathogens
7.1.2. SLiM Modulation by Pathogens
7.1.3. Malevolent Use of Endogenous SLiMs
7.2. SLiMs in Mendelian Diseases and Cancer
7.2.1. Mutated SLiMs Resulting in Aberrant Local Abundance
7.2.2. Mutated SLiMs Resulting in Defective Activity
7.3. Therapeutic Potential of SLiMs
8. Concluding Comments
Author Information
Corresponding Author
Notes
Biographies
Acknowledgments
References


The most common functional modules within IDRs are composed of short stretches of adjacent amino acids. These compact, linear protein interaction sites are known as short linear motifs (SLiMs), also being referred to as linear motifs (LMs), molecular recognition features (MoRFs), or miniMotifs.15−18 Depending on the partner proteins that recognize them, these sites can facilitate a diverse set of functions including targeting a protein to a specific subcellular location, determining the modification state of a protein, controlling the stability of a protein, and regulating the context-dependent activity of a protein.15,19 Hence, SLiMs are imperative for dynamic and robust control of cell physiology. The current census of SLiMs is split unequally between ligand motifs that mediate protein−protein interactions (see section 4.1) and post-translational modification motifs, which are directly recognized and targeted for post-translational modification (PTM) by regulatory enzymes (see section 4.2). The Eukaryotic Linear Motif (ELM) resource (http://elm.eu.org) is the most comprehensive repository of experimentally validated SLiMs, containing ∼2000 ligand motif instances recognized by ∼100 nonenzymatic globular domain families.15 In contrast, around 100 000 sites recognized and modified by enzymatic globular domains (many of which recognize a modification motif surrounding the modified residue) have been discovered to date.20−22 The disparity between ligand and modification motif counts is a reflection of the lack of methods to discover novel ligand motifs on a proteome-wide scale. Conversely, proteomic approaches for PTM discovery have advanced rapidly, and whole-proteome screens for PTMs are now common practice.23 The lack of high-throughput assays to analyze low-affinity interactions mediated by ligand motifs poses a question. How many ligand SLiMs are there in the human proteome? 
Several pieces of evidence suggest that our current understanding of SLiMs represents only a tiny portion of the true complement. First, approximately one-third of the human proteome consists of IDRs, and the functional role of the vast majority of these regions remains uncharacterized.3,13 As SLiMs are the most common functional modules within IDRs, it is likely that as these regions are explored the number of known SLiMs will rapidly increase. Second, two-thirds of the 12 000 identified globular domain families have no known binding partner, while for over three-quarters of protein−protein interactions no binding mode has been characterized.24,25 A considerable number of these interfaces will undoubtedly involve SLiMs. Furthermore, even within the domain families recognized to bind SLiMs the target peptides are unknown for the majority of the domain instances. Finally, most of the current knowledge about linear motifs comes from a limited number of IDR-containing proteins that are intensively studied and contain numerous SLiM instances, such as members of the p53 family. Extrapolating the SLiM density of these proteins to other IDRs would suggest that a eukaryotic proteome could contain upward of 100 000 ligand motif instances. Although the true size of the cellular complement of SLiMs remains a mystery, it would not be surprising if ligand motifs overtook globular domains as the most numerous nonenzymatic protein interaction modules in the cell. This review attempts to consolidate the key material describing these ubiquitous, elusive, and functionally diverse modules in a manner that will allow uninitiated readers to gain a deeper understanding of the function of SLiMs and their roles in cell regulation. We will briefly compare SLiMs with other


1. INTRODUCTION

The eukaryotic cell is a bustling collection of macromolecules acting cooperatively to mediate the functions required for cell viability. Specific, context-dependent and tightly controlled physical interactions between these cellular components govern the necessary physiological processes, from cell division to cell death. The specificity, conditionality, and regulation of these binding events depend on communication between the interacting molecules and their surroundings. For proteins, most of this communication is mediated by a variety of modules that are embedded within the protein sequence, can bind a wide array of ligands, and have catalytic, regulatory, or scaffolding activity. These functional units enable proteins to sense, integrate, and transmit environmental and cell state indicators and concomitantly instigate cellular decisions based on the information available to the system. The diversity of the cellular functions assigned to proteins is matched by the remarkable variety of their interaction modules, each with recognizably distinct binding properties. Globular domains mediating high-affinity interactions, such as those stabilizing multisubunit molecular machines,1,2 are classically seen as the archetypal protein interaction module, a perception that stems from the obsolete assertion that a protein requires a well-defined, properly folded three-dimensional structure in order to perform its biological function.3 Currently, the well-structured globular domains provide the majority of characterized protein interaction interfaces.
However, the past decade has revealed that unstructured regions of the proteome play a central role in many dynamic cellular processes and that binding sites lacking a stable predefined conformation mediate a significant fraction (perhaps even the majority) of regulatory protein interactions within the cell.3−14 This has led to the development of an exciting new field in molecular biology that is progressively unraveling the role of these intrinsically disordered regions (IDRs), as evidenced by the increasing discovery rate of novel modules within these regions and the functions thereof.14,15 However, despite recent advances and an increasing recognition of their importance, the mechanisms by which IDRs direct and regulate protein and pathway function are not well understood.


Table 1. Generalized Attributes of the Three Major Classes of Protein Interaction Modules

| attribute [a] | globular domains | intrinsically disordered domains (IDDs) | short linear motifs (SLiMs) |
| examples [b] | DNA-binding domain of p53 (ref 29), CA domain of HIV Gag (ref 30), cyclin fold domain of CCNA2 (ref 31) | Smad-binding domain of SARA (ref 32), CDH1 inhibitory domain of ACM1 (ref 33), tetramerization domain of p53 (ref 34) | nuclear receptor-binding NR box motif (ref 35), SH3 domain-binding motif (ref 36), nuclear localization signal (ref 37) |
| length | 50−200 amino acids | 20−50 amino acids | 3−10 amino acids |
| native conformation | folded | disordered | disordered |
| linearity | multipartite (discontinuous) | multipartite (linear) | monopartite |
| affinity range (Kd) [c] | picomolar | nanomolar | low micromolar |
| dynamics [c] | stable | transient | transient |
| buried surface | ∼1150 Å² | ∼1350 Å² | ∼500 Å² |
| binding partner | miscellaneous | globular domains/IDDs | globular domains |
| predominant nonorthologous evolution | divergent | complex | convergent |

[a] Although exceptions exist for each category, these discriminatory attributes hold true for the majority of characterized instances.
[b] References contain solved structures of the example in complex with a binding partner.
[c] These attributes refer to the situation where the module type is interacting with a globular domain.

Figure 1. Classification of protein interaction modules. (A) Structure of the CCNA2-CDK2 holoenzyme in complex with the CDK inhibitor p27Kip1 (PDB 1JSU)31 containing CCNA2 (light gray), CDK2 (dark gray), and the disordered KID region of p27Kip1 (orange). The cyclin substrate-recognition pocket is occupied by the N terminus of the KID (green). The same pocket can function autonomously for SLiM binding, thus acting as a cyclin-box docking site. (B) Cyclin-box docking motifs found in CDK substrates. The top motif is the docking motif in p27Kip1 shown in panel A. Green residues signify residues shared by all instances that are inserted into the SLiM-binding pocket on their cyclin family binding partner. The hydrophobic pocket on cyclins recognizes [RK]xLx{0,1}Φ motifs, requiring a lysine or arginine in the first position, an invariant leucine in the third position, and an additional hydrophobic residue, usually a leucine or phenylalanine, in position 4 or 5, within 2 residues of the key leucine. Regular expression syntax: letters denote a specific amino acid; x denotes any amino acid; square brackets denote a subset of allowed amino acids; round brackets indicate a position targeted for PTM after motif recognition; P denotes a phosphorylation required for binding; curly brackets denote length variability; Φ denotes a hydrophobic residue; −NH2 indicates the amino terminus of the protein; −COOH indicates the carboxy terminus of the protein.
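The motif notation described in the legend of Figure 1 maps closely onto ordinary regular expressions. The following is a minimal sketch of that mapping in Python, applied to the cyclin-binding motif of RB1 listed in Table 2. The function names and the exact residue set used for the hydrophobic class Φ are assumptions made for illustration; the legend states only that the hydrophobic position is usually a leucine or phenylalanine.

```python
import re

# Assumption: the hydrophobic class Φ is approximated as F, I, L, M, V, W, Y.
# The figure legend only says "usually a leucine or phenylalanine".
HYDROPHOBIC = "[FILMVWY]"


def motif_to_regex(motif: str) -> str:
    """Translate the motif notation of Figure 1 into Python regex syntax.

    Two substitutions are made: 'x' (any amino acid) becomes '.', and 'Φ'
    becomes a hydrophobic character class. Square-bracket subsets and
    {m,n} length ranges are already valid regex and are left untouched.
    """
    return motif.replace("x", ".").replace("Φ", HYDROPHOBIC)


CYCLIN_DOCKING = re.compile(motif_to_regex("[RK]xLx{0,1}Φ"))


def find_motifs(pattern, sequence, offset=1):
    """Return (start, end, peptide) tuples using 1-based residue numbering.

    'offset' is the residue number of the first character of 'sequence'.
    Only non-overlapping matches are reported.
    """
    return [
        (m.start() + offset, m.end() + offset - 1, m.group())
        for m in pattern.finditer(sequence)
    ]


# RB1 residues 872-878 as listed in Table 2 (872LKKLRFD878).
print(find_motifs(CYCLIN_DOCKING, "LKKLRFD", offset=872))
# → [(873, 877, 'KKLRF')]
```

A pattern match of this kind only indicates that a peptide is compatible with the motif definition; it says nothing about accessibility, context, or whether the site is functional.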

characterized protein interaction modules (see section 2) and highlight their key properties (see section 3) before discussing in depth the cellular functions mediated by SLiMs (see section 4). We want to emphasize that motifs function cooperatively to mediate molecular switching, both on a molecular level to direct the life cycle of a protein (see section 5) and on a systems level to facilitate complex regulatory mechanisms (see section 6). Finally, motif deregulation in disease and the potential of motif-mediated interactions as therapeutic targets will be discussed (see section 7). Multiple examples of motif-mediated interactions are used throughout the text to illustrate the various concepts and highlight the ubiquity and diverse functionality of SLiMs in cell regulation. Unless specified otherwise, the examples involve human proteins (defined by

their gene name (or in some cases most common name) and full name when first introduced), and many of the motif instances mentioned here, either in the tables or in the text, have been curated in the ELM resource.15

2. SLIMS ARE A DISTINCT CLASS OF PROTEIN INTERACTION MODULE

Protein interaction networks are wired through a variety of autonomous binding modules ranging from large, stably folded domains to compact, unstructured SLiMs.26 No perfect discriminatory properties can unambiguously distinguish the different types of binding elements, and the various interaction interfaces are characterized by a continuum of overlapping binding features.7,27 Consequently, defining a comprehensive


Table 2. Representative Examples of Interactions Mediated by the Three Major Classes of Protein Interaction Module for Which Experimentally Determined Affinity Values Are Available

| interface type | module: protein | module: domain or motif | motif sequence | interactor: protein | interactor: domain | affinity (Kd) |
| motif−globular | cystic fibrosis transmembrane conductance regulator (CFTR) (ref 50) | PDZ-domain binding motif | 1475VQDTRL1480 | SH3 and multiple ankyrin repeat domains protein 2 (SHANK2) | PDZ | 0.056 μM |
| motif−globular | retinoblastoma-associated protein (RB1) (ref 51) | PP1-binding motif | 872LKKLRFD878 | serine/threonine-protein phosphatase PP1-alpha catalytic subunit (PPP1CA) | calcineurin-like phospho-esterase | 2.1 μM |
| motif−globular | retinoblastoma-associated protein (RB1) (ref 51) | cyclin-binding motif | 872LKKLRFD878 | Cyclin-A2 (CCNA2) | cyclin, N-terminal domain | 6.3 μM |
| motif−globular | death domain-associated protein 6 (DAXX) (ref 52) | Sumo-binding motif | 733IIVLSDSD740 | small ubiquitin-related modifier 1 (SUMO1) | ubiquitin-2-like Rad60 SUMO-like | 55.3 μM [a] |
| motif−globular | epidermal growth factor receptor (EGFR) (ref 53) | TKB-binding motif | 1069YSSDPT1074 | E3 ubiquitin-protein ligase CBL (CBL) | adaptor protein Cbl, TKB domain | 1.0 μM [b] |
| motif−globular | regulatory protein SWI6 (SWI6) (ref 54) | nuclear localization signal (NLS) | 161LKKLKI167 | importin subunit alpha (SRP1) | armadillo repeat | 0.045 μM [c] |
| motif−globular | adenomatous polyposis coli protein (APC) (ref 55) | EB-binding motif | 2801RPSQIPTPVNN2811 | microtubule-associated protein RP/EB family member 1 (MAPRE1) | EB1-like C-terminal domain | 5.8 μM [d] |
| IDD−globular | cellular tumor antigen p53 (TP53) (ref 56) | TAZ2-binding domain | | CREB-binding protein (CREBBP) | TAZ zinc finger | 20 nM [e] |
| IDD−globular | cyclin-dependent kinase inhibitor 1B (CDKN1B) (ref 57) | kinase inhibitory domain (KID) | | Cyclin-A2 (CCNA2)/Cyclin-dependent kinase 2 (CDK2) | cyclin, N-terminal domain/protein kinase domain | 3.5 nM |
| globular−globular | erythropoietin (EPO) (ref 58) | short-chain cytokines | | erythropoietin receptor (EPOR) | fibronectin type III | 3.7 pM |
| globular−globular | ribonuclease A (RNASE1) (ref 59) | RNase A-like | | RNase inhibitor (RNH1) | leucine-rich repeat | 0.059 pM |

[a] Phosphorylation of S737 and S739 in DAXX increases affinity to 1.6 μM.
[b] Requires phosphorylation of Y1069 in EGFR.
[c] Phosphorylation of S160 in SWI6 reduces affinity to 0.163 μM.
[d] Phosphorylation of S2789 and S2793 in APC reduces affinity to 17.9 and 32 μM, respectively.
[e] Phosphorylation of T18 and S20 in p53 increases affinity to 2.5 nM.
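The dissociation constants in Table 2 span roughly nine orders of magnitude, which is easier to compare on a free energy scale. The sketch below uses the standard thermodynamic relation ΔG° = RT ln(Kd/c°) with a 1 M standard state. The choice of 298.15 K and the function name are assumptions made for illustration; the review does not report the temperatures at which the affinities were measured.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol for a given Kd in mol/L.

    Uses dG = R*T*ln(Kd / 1 M). The default temperature is an assumption,
    not a value taken from the studies cited in Table 2.
    """
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)


# Kd values copied from Table 2 and converted to mol/L.
examples = {
    "RB1 cyclin-binding motif / CCNA2 (motif-globular)": 6.3e-6,
    "CFTR PDZ-binding motif / SHANK2 (motif-globular)": 0.056e-6,
    "CDKN1B KID / CCNA2-CDK2 (IDD-globular)": 3.5e-9,
    "RNASE1 / RNH1 (globular-globular)": 0.059e-12,
}

for name, kd in examples.items():
    print(f"{name}: Kd = {kd:.3g} M, dG = {binding_free_energy(kd):.1f} kJ/mol")
```

Because the relation is logarithmic, each tenfold decrease in Kd corresponds to the same increment in binding free energy (about 5.7 kJ/mol at the assumed temperature).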

functional term IDD, which refers to an autonomous binding module (i.e., an IDR can contain multiple autonomous functional IDD and SLiM modules). Each of the three classes of protein interaction module can be illustrated using the cyclin-bound CDK (cyclin-dependent kinase) holoenzyme in complex with the cyclin-dependent kinase inhibitor p27Kip1 (cyclin-dependent kinase inhibitor 1B) (Figure 1A).31 Two globular domain modules in CCNA2 (Cyclin-A2) and CDK2 share a large globular interface that buries a large surface area and forms a stable high-affinity interaction. The serpentine IDD of the kinase inhibitory domain (KID) of p27Kip1 buries a large surface area in a multipartite, linear, and continuous interface with nanomolar affinity. The substrate-recognition pocket on the cyclin subunit is occupied by the N terminus of the KID region (Figure 1A, green residues). This same pocket functions autonomously as a binding site for cyclin-binding motifs present in multiple cyclin-CDK substrates (Figure 1B). This example also illustrates how evolution utilizes modules with appropriate binding properties for specific functions. The core of the holoenzyme is a stable cyclin-CDK subcomplex and is consequently stabilized by a globular interface. The inhibitory IDD binds with high affinity while concomitantly allowing conditional regulation of attachment to rapidly activate cyclin-CDK when required.40 Finally, substrates are recruited through low-affinity SLiMs that transiently bind to the substrate-recognition subunit and are tuned to facilitate substrate recruitment but also allow rapid dissociation of the modified substrate. As such, SLiMs play a key role in cell physiology, providing the proteome with an interaction module with unique properties that complement those of the other categories. An important note concerning the examples of SLiM-mediated interactions used throughout this text is that they might give the impression that motifs only bind to globular

classification for interaction sites is a difficult task. A three-category classification of protein interaction modules that have distinct functional, structural, and evolutionary dynamics has been proposed previously: globular domains, intrinsically disordered domains (IDDs), and SLiMs.28 This classification defines a module as a minimal autonomous unit that, through interaction with another biomolecule, performs a regulatory, recognition, or enzymatic function. Each of these classes has a set of defining attributes that allows them to be discriminated from each other, and although exceptions exist for each category, these discriminatory attributes hold true for the majority of instances (Table 1). The three classes are most clearly differentiated by features of their structure and evolution. Globular domains can be discriminated from IDDs and SLiMs by having a stable tertiary structure in the native state.38 Conversely, both SLiMs and IDDs are found in unstructured IDRs of a proteome and in the absence of their binding partner lack stable tertiary structure.28 SLiMs and IDDs, in contrast, can be distinguished on the basis of their evolutionary dynamics. These differences stem from the relatively short length and low sequence complexity of SLiMs compared to IDDs. As a result, novel instances of a SLiM class can easily evolve de novo in multiple proteins (i.e., appear independently at many different positions in the proteome through only one or few point mutations), while IDDs are expected to arise de novo only extremely rarely and show a “divergent” mode of evolution similar to that of globular domains.28,39 Confusingly, IDDs can occupy SLiM-binding sites as part of their multipartite interfaces, as seen for many pseudosubstrate inhibitors of SLiM-mediated interactions.
However, in these cases, the function of the IDD requires the extended interface.31 An additional source of confusion is the interchangeable use of the structural term IDR, which refers to a protein region that is intrinsically disordered, with the


adapt and precisely fit to their different binding partners by coupled folding and binding.60 Optimization of intermolecular interactions between the motif-containing peptide and its binding surface often facilitates induction of secondary structure, and around 60% of structurally solved SLiMs form either a helix or a beta strand when bound, for instance, to mediate binding by beta-sheet augmentation in the latter case.16,61 The coupled folding and binding requires that the protein region containing the SLiM is both flexible and accessible. Consequently, the majority of SLiMs are found in intrinsically disordered regions,16,62 although they can also occur in flexible loops of globular domains.63 The conformational adaptability of SLiMs can allow the same peptide sequences in an IDR to interact with distinct binding partners with high specificity and similar affinities due to alternative packing when bound to the different target domains, as evidenced by the overlapping binding motifs found in proteins such as TP53 (cellular tumor antigen p53).7,10,64 The predominant occurrence of SLiMs in IDRs also has consequences for their binding kinetics and might underlie their ability to mediate very specific and highly dynamic interactions: absolute requirements for proteins involved in signaling and regulation.65 However, IDR binding depends on a complex interplay between different enthalpic and entropic contributions, and different aspects of the energetics of this process have been studied and discussed previously.60,65−69

protein domains. However, there is an acquisition bias toward this type of SLiM interaction in the literature. While the vast majority of known SLiM binding partners are indeed globular domains, several SLiMs that bind nonprotein partners have been characterized. These include nucleotides, as shown by the RGG box (534RGGGGR539) in FMR1 (Fragile X mental retardation protein 1) that recognizes RNA duplex−quadruplex junctions,41 as well as lipids, for example, the basic motif (258SKLFSRLRR266) in some BIN1 (Myc box-dependent-interacting protein 1) isoforms that bind 1-phosphatidyl-1D-myo-inositol 4,5-bisphosphate (PIP2).42 Furthermore, SLiMs have been implicated in formation of noncanonical protein−protein complexes such as hydrogels composing RNA granules where repeated [GS]Y[GS] motifs promote phase transition from a soluble to a hydrogel-like state in a concentration-dependent manner by forming amyloid-like fibers.43 Examples of noncanonical SLiM-mediated interactions such as these are likely to become more common in the future as we gain a better understanding of SLiM functions.

3. PROPERTIES OF SLIMS

Multiple analyses of the available experimentally validated motif instances have determined the major defining attributes of SLiMs.16,44−46 In the following sections these properties will be discussed in detail and, when possible, compared to those characterized for IDD- and globular domain-mediated interfaces.

3.2. Evolution of SLiMs

3.1. Compactness and Structural Plasticity of SLiMs

As mentioned above, SLiM evolutionary dynamics are significantly different from those of other interaction module classes (i.e., IDDs and globular domains). Both globular domains and IDDs have complex sequence constraints and are thus expected to evolve “divergently” (i.e., are expected to only extremely rarely evolve de novo).28,38 Globular domains or IDDs have been shown to convergently acquire similar functionalities, for example, the ability to bind to a particular domain; however, this functionality is generally realized through a novel mode of interaction. For example, IDDs of both HIF1A (hypoxia-inducible factor 1-alpha) and CITED2 (Cbp/p300-interacting transactivator 2) bind to the TAZ1 domain of CBP (CREB-binding protein); however, each module takes a different path along the surface of the domain.70,71 In contrast, typical features of SLiMs (i.e., their short length and small number of essential residues) make it much easier for them to evolve de novo from a random peptide sequence on multiple occasions via point substitutions. Moreover, although they were acquired independently, novel instances of a specific SLiM class bind to the same motif-binding site through a similar mode of interaction. This suggests a model where a rudimentary motif with weak specificity and affinity acquired by point substitutions can be selected by evolutionary pressures either positively, to generate a functional SLiM, or negatively, to eliminate deleterious interactions.39,72,73 As such, SLiMs exhibit an evolutionary plasticity that is unavailable to globular domains and IDDs. Conversely, like most interfaces, a single substitution at a key position is often sufficient to abolish SLiM binding.74 The resulting addition or removal of a SLiM can alter the function of a protein by modifying its regulatory program or prompting the gain or loss of an interaction partner.

3.2.1. Evidence for De Novo Evolution of SLiMs.
Generally, the literature on SLiM evolution is limited to studies that note the presence or absence of a functional SLiM in a

3.1.1. SLiMs as Compact Binding Modules. Of the three classes of interaction module, SLiMs are the most compact (Figure 1A). Most SLiMs are short and monopartite. Their major binding determinants are generally encoded in a single stretch of 3−10 contiguous amino acids,16,47 although binding of a SLiM can be modulated to a varying degree by residues outside this region.44,45 By comparison, globular domains average 100 residues in length,48 while IDDs are generally between 20 and 50 residues in length.28 Consequently, SLiMs allow a larger number of functional modules to be embedded in a given length of polypeptide compared to their less compact counterparts, an attribute referred to as high functional density. SLiMs appreciably bury (on average) 3−4 key residues in the interacting surface of their target protein (Figure 1A, green residues). These key residues are generally shared, with some conservative substitutions, across all peptides recognized by an interacting surface on a peptide-binding domain, and they supply the majority of the specificity and affinity determinants of a SLiM (Figure 1B). As a result of the limited number of residues in a SLiM that directly contact the binding partner, SLiM−globular domain interfaces have smaller buried surface areas (∼500 Å²) than globular domain−globular domain (∼1150 Å²) and IDD−globular domain (∼1350 Å²) interfaces.46 Empirical evidence suggests that SLiMs are the weakest binders of the three categories of interaction modules (Table 2). SLiMs associate with their binding partners with relatively low affinity, typically attaining equilibrium dissociation constants (Kd) in the 1−10 μM range,39 while IDDs typically interact with low nanomolar affinity,28 and globular−globular domain interactions can achieve picomolar Kd values.49

3.1.2. SLiMs as Structurally Flexible Binding Modules. In their unbound state, SLiMs, like IDDs but unlike globular domains, are highly flexible and lack stable tertiary structures.
This intrinsic structural plasticity enables SLiMs to structurally


Figure 2. De novo evolution of SLiMs. (A) Alignment of CRK orthologs showing the de novo evolution of an SH3 domain-binding PxxP motif (69PPVPPSP75) in a large insertion in the SH2 domain of mammalian CRK proteins. Arrows below the alignment indicate beta strands of the CRK SH2 domain. (B) Structure of the human CRK SH3 domain-binding PxxP motif (yellow) bound to the SH3 domain of ABL1 (dark gray). The overlapping structures of the homologous SH2 domains of CRK (light gray) (PDB 1JU5)63 and CRKL (Crk-like protein) (green) (PDB 2EO3) show the novel PxxP motif-containing loop in CRK.
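The argument that short, degenerate motifs readily arise by chance can be made concrete with a back-of-the-envelope calculation. The sketch below estimates how often a motif pattern would match a random peptide, using the KEN box and PKA site patterns given in Table 3. It assumes uniform and independent amino acid frequencies, and the sequence length of one million residues is a hypothetical round number, not a figure from the review. Real disordered regions have biased compositions, so the numbers are only indicative.

```python
from fractions import Fraction


def match_probability(allowed_per_position, alphabet_size=20):
    """Probability that a random peptide matches a motif at a given position.

    'allowed_per_position' lists how many of the 20 amino acids are accepted
    at each motif position. Residues are assumed uniform and independent,
    which is a simplification.
    """
    probability = Fraction(1)
    for allowed in allowed_per_position:
        probability *= Fraction(allowed, alphabet_size)
    return probability


def expected_chance_matches(allowed_per_position, n_residues):
    """Approximate expected number of chance matches in n_residues of sequence.

    Edge effects at sequence ends are ignored.
    """
    return float(match_probability(allowed_per_position) * n_residues)


# KEN core of the KEN box (xKENx): three fixed positions; the flanking
# x positions accept any residue and do not change the probability.
ken_core = [1, 1, 1]
# PKA phosphorylation site [RK][RK]x[ST]: 2, 2, any (20), and 2 residues.
pka_site = [2, 2, 20, 2]

HYPOTHETICAL_LENGTH = 1_000_000  # assumed length of disordered sequence

for name, motif in [("KEN core", ken_core), ("PKA site", pka_site)]:
    p = match_probability(motif)
    n = expected_chance_matches(motif, HYPOTHETICAL_LENGTH)
    print(f"{name}: p = {p} per position, about {n:.0f} chance matches")
```

Under these assumptions a three-residue core matches once per 8000 positions, which illustrates why a single point substitution is often enough to create or destroy a candidate motif, and why additional evidence is needed to call a match functional.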

Table 3. Representative Examples of De Novo Evolution of SLiMs in Nonhomologous Human Proteins To Gain Access to a Complex, Pathway, or Regulatory Program

| motif [a] | description | protein name | gene name | motif sequence [b] |
| endocytic motif (YxxΦ) | tyrosine-based motif in the cytosolic tail of membrane proteins recognized by the endocytic machinery | cytotoxic T-lymphocyte protein 4 | CTLA4 | 201YVKM204 |
| | | epidermal growth factor receptor | EGFR | 998YRAL1001 |
| | | lysosomal acid phosphatase | ACP2 | 413YRHV416 |
| | | neural cell adhesion molecule L1 | L1CAM | 1176YRSL1179 |
| | | transferrin receptor protein 1 | TFRC | 20YTRF23 |
| KEN box (xKENx) | degradation motif recognized by the CDH1 or CDC20 substrate-recognition subunits of the APC/C ubiquitin ligase | aurora kinase B | AURKB | 3QKENS7 |
| | | cell division cycle protein 20 homologue | CDC20 | 96SKENQ100 |
| | | cell division control protein 6 homologue | CDC6 | 80KKENG84 |
| | | M-phase inducer phosphatase 3 | CDC25C | 150NKEND154 |
| | | securin | PTTG1 | 8DKENG12 |
| PKA phosphorylation site ([RK][RK]x[ST]) | phosphorylation motif recognized by basophilic protein kinase A | catenin beta-1 | CTNNB1 | 549RRTS552 |
| | | retinoic acid receptor alpha | RARA | 366RRPS369 |
| | | NADPH oxidase activator 1 | NOXA1 | 169RRGS172 |
| | | transcription factor SOX-9 | SOX9 | 61KKES64 |
| | | single-stranded DNA cytosine deaminase | AICDA | 35RRDS38 |

[a] The syntax used for notation of motif regular expressions is described in the legend of Figure 1.
[b] An extended list of motif instances is available in the ELM resource.15
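As a consistency check on Table 3, each listed peptide can be tested against the pattern of its motif class. The sketch below writes the three patterns directly as Python regular expressions and reports which class each instance matches. The residue set chosen for the hydrophobic position Φ is an assumption, and the dictionary and function names are illustrative only.

```python
import re

# Motif classes from Table 3, written as Python regular expressions.
# Assumption: Φ is approximated as F, I, L, M, V, W, Y.
MOTIF_CLASSES = {
    "endocytic motif (YxxΦ)": re.compile(r"Y..[FILMVWY]"),
    "KEN box (xKENx)": re.compile(r".KEN."),
    "PKA phosphorylation site ([RK][RK]x[ST])": re.compile(r"[RK][RK].[ST]"),
}

# Peptides copied from Table 3, keyed by gene name (residue numbers omitted).
INSTANCES = {
    "CTLA4": "YVKM", "EGFR": "YRAL", "ACP2": "YRHV",
    "L1CAM": "YRSL", "TFRC": "YTRF",
    "AURKB": "QKENS", "CDC20": "SKENQ", "CDC6": "KKENG",
    "CDC25C": "NKEND", "PTTG1": "DKENG",
    "CTNNB1": "RRTS", "RARA": "RRPS", "NOXA1": "RRGS",
    "SOX9": "KKES", "AICDA": "RRDS",
}


def classify(peptide):
    """Return the names of all motif classes whose pattern spans the peptide."""
    return [
        name
        for name, pattern in MOTIF_CLASSES.items()
        if pattern.fullmatch(peptide)
    ]


for gene, peptide in INSTANCES.items():
    print(f"{gene:7s} {peptide:6s} {classify(peptide)}")
```

Every peptide in the table matches exactly one of the three patterns, which is the expected outcome for unrelated proteins that have converged on the same short sequence pattern.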

half of the experimentally validated human SLiM instances are absent in fish and therefore have been gained in the human lineage or lost in the fish lineage.16 Several high-throughput proteomics studies have shown similar trends. For instance, N-glycosylation sites ((N)x[ST]) and canonical CDK phosphorylation sites (([ST])Px[KR]) are often gained (and lost) as species diverge.79,80 Third, evidence accumulated from low-throughput experiments suggests that motifs performing common regulatory tasks such as controlling protein localization or stability or acting as sites to integrate cell state information are ubiquitous and have evolved de novo multiple times (Table 3).81 This is supported by an analysis of the ELM resource, which revealed that the majority of SLiM classes have examples of de novo convergent evolution of instances in two or more unrelated proteins.16 Finally, several examples exist of paralogs that, after duplication, retain the same core functionality but evolve de novo a distinct and diverse set of motifs that differentially control the functionality, localization,

subset of species, suggesting the gain or loss of a motif in a particular lineage.75,76 For example, the adaptor molecule CRK contains a polyproline motif that is recognized by the SH3 domain of the tyrosine-protein kinase ABL1 and promotes an alternative mode of binding for ABL1 recruitment (Figure 2). The motif is present only in mammals and as such represents an example of de novo motif evolution.63 However, in analyses such as this, a direct comparison with an ortholog in species without the motif is rarely carried out. Several threads of evidence derived from different surveys and SLiM curation efforts suggest that addition or removal of a SLiM is common. First, computational analysis of the evolution of protein interaction networks revealed that interactions mediated by SLiMs are the most likely class of interactions to be rewired as interactomes evolve, particularly those involving PDZ, SH2, and SH3 domain-binding motifs.77,78 Second, many functional SLiMs are present in only a limited number of species. For example, analysis showed that approximately one-

dx.doi.org/10.1021/cr400585q | Chem. Rev. 2014, 114, 6733−6778

Chemical Reviews

Review

lineage, in tandem with expansion of phosphotyrosine kinases and, undoubtedly, extensive de novo evolution of SH2-binding motifs (Figure 3A). Specialized phosphotyrosine motif-binding domains have evolved independently on at least four occasions, resulting in the SH2, HYB, C2, and PTB domain families.89 Conversely, the canonical RNA-binding RRM domains90 have evolved the ability to bind SLiMs on at least three separate occasions, each with a unique specificity and binding mode: (i) recognition of PTB RRM2 interacting (PRI) motifs by PTBP1 (polypyrimidine tract-binding protein 1),91 (ii) the polybasic and tryptophan-containing UHM ligand motifs (ULMs) by U2AF (U2 auxiliary factor) family members,92 and (iii) the NBox motif by PUF60 (poly(U)-binding-splicing factor PUF60)93 (Figure 3B). Domains and motif-driven systems can also be lost. For example, the SLiM-mediated secondary peroxisomal import system is missing from the nematode Caenorhabditis elegans.94

or stability of the related but diverging proteins. For instance, members of the cell division control protein 25 (CDC25) family82 (see section 5.1), the CDK inhibitor (CDI) family83 (see section 5.2), and the cell division cycle protein 20 (CDC20)/fizzy family of APC/C (anaphase-promoting complex or cyclosome) substrate-recognition subunits84 have all evolved distinct regulatory programs using unique sets of motifs.

3.2.2. Conservation of SLiMs. As a result of the rapid gain and loss of motifs on an evolutionary time scale, most motifs are not conserved across a large number of species. For example, less than 5% of motifs curated in the ELM resource are conserved (at a given position in a protein sequence) between human and yeast.16 Nonetheless, several functionally important motifs are conserved over large evolutionary distances, such as the C-terminal PCNA (proliferating cell nuclear antigen)-interacting protein (PIP)-box motif of FEN1 (Flap endonuclease 1) (337QGRLDDFFKV346), which is present across the tree of life.85 Despite the evolutionary plasticity of single motif instances, motifs can often be recognized by an orthologous recognition module in a distant relative species, as observed for yeast SRP1 (Importin subunit alpha), which can bind the nuclear localization signal (NLS) of human MYC (Myc proto-oncogene protein) (320PAAKRVKL327).86 This suggests that once a motif has become widely used across a proteome, changes to SLiM-binding sites are constrained.

3.2.3. Evolution of SLiM-Binding Domains. Many SLiM-binding domain families, such as the PDZ, PTB, SH2, SH3, and WW domains, have expanded significantly in higher eukaryotic organisms and are among the most abundant in metazoans (Table 4).87 The evolution of the SH2 domain has been

3.3. Binding Determinants of SLiMs

The binding elements that mediate IDD−IDD, IDD−globular, and globular−globular domain interactions are typically highly specific and usually recognize only a single or limited set of binding partners. The specificity of SLiM-mediated interactions, however, varies depending on the requirements for their biological function, ranging from exclusivity toward a single binding partner to high promiscuity, both on the SLiM and on the domain side of the interface.96,97 The specificity and affinity of SLiM binding is derived from both intrinsic and extrinsic binding determinants that are encoded at their interface and dependent on its context, respectively. These determinants govern SLiM specificity, to correctly identify their cellular binding partners while minimizing the likelihood of deleterious effects of binding to the incorrect partner, and determine a SLiM’s binding affinity, to achieve the interaction strength and dynamics that are optimal to fulfill its biological function. As such, high-fidelity recognition of targets by SLiM-binding domains appears to be distributed among a large number of independent and imperfect specificity mechanisms.98 3.3.1. Intrinsic Determinants. The intrinsic specificity, affinity, and selectivity determinants of a SLiM depend on a complex interplay between core required positions and contextual permissive and nonpermissive positions.99 The ELM resource uses regular expressions to describe these determinants, and a similar notation will be used throughout this review (see the legend of Figure 1).15 The core residues of the motif are required positions that are usually in direct contact with the surface of the domain partner. 
Required positions will tolerate only a limited, often physicochemically similar, subset of amino acids with strong complementarity to the binding surface on the peptide-recognition domain.99 The required positions are frequently used to describe the specificity of the binding pocket, for example, the phosphorylated tyrosine residue and asparagine residue in PYxN motifs binding to the SH2 domain of GRB2 (growth factor receptor-bound protein 2) (Figure 4A). These positions are the major specificity and affinity determinants of binding. Computational studies suggest that on average the core of the motif contributes about 80% of the energy of binding.44 The local context of a motif consists of residues that flank the motif core. These residues can provide secondary contacts between the regions flanking the motif core and the domain surface, or they can indirectly modulate the physical, chemical, or structural compatibility of the peptide with the target domain. Contextual positions can contain

Table 4. Representative Examples of Motif-Binding Globular Domain Families

domain family | frequency (a) | pattern of recognized motif (b)
PDZ (c) domain | 573 domains in 342 proteins | [ST]x[ACVILF]−COOH
Src homology 3 (SH3) domain | 451 domains in 382 proteins | PxxP
Src homology 2 (SH2) domain | 237 domains in 219 proteins | PYxx[IV]
WW domain | 151 domains in 103 proteins | PPxY
phosphotyrosine-binding (PTB) domain | 142 domains in 133 proteins | NPxPY

(a) Frequencies are taken from Pfam for the human proteome.95
(b) Patterns are representative and roughly define the specificity of a subset of the domain in the domain family. The syntax used for notation of motif regular expressions is described in the legend of Figure 1.
(c) PDZ: Postsynaptic density protein (PSD95)-Drosophila disc large tumor suppressor (Dlg1)-Zonula occludens-1 protein (Zo1).
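The pattern notation in Table 4 maps closely onto ordinary regular expressions, so a motif scan can be sketched in a few lines of Python. The snippet below is a minimal illustration, not the ELM implementation: it assumes that `x` means "any residue" and that a trailing `−COOH` anchors the motif at the C-terminus. The function names `motif_to_regex` and `find_motifs` are hypothetical, and the peptide `AAETSV` is an invented test sequence; `PTLPHR` is the PEX14 peptide quoted in section 3.3.2.

```python
import re


def motif_to_regex(pattern: str) -> str:
    """Translate the review-style motif notation into a Python regex.

    Assumed conventions (a simplification, not a full ELM parser):
      x      -> any residue ('.')
      -COOH  -> the motif must end at the C-terminus ('$')
    Bracketed classes such as [ST] are already valid regex syntax.
    """
    regex = pattern.replace("\u2212", "-")  # normalise the typographic minus sign
    anchored = regex.endswith("-COOH")
    if anchored:
        regex = regex[: -len("-COOH")]
    regex = regex.replace("x", ".")
    return regex + ("$" if anchored else "")


def find_motifs(pattern: str, sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched peptide) for every hit, overlaps included."""
    # A lookahead with a capture group reports overlapping matches.
    regex = re.compile(f"(?=({motif_to_regex(pattern)}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # SH3-type PxxP core in the PEX14 peptide quoted in the text.
    print(find_motifs("PxxP", "PTLPHR"))
    # PDZ-type C-terminal pattern on an invented peptide.
    print(find_motifs("[ST]x[ACVILF]\u2212COOH", "AAETSV"))
```

A match found this way only says that the sequence is compatible with the core pattern; as the surrounding text stresses, context, accessibility, and colocalization decide whether an instance is functional.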

extensively studied and as such provides the best insight into SLiM-binding domain evolution.88 The SH2 domains of 111 human proteins can be traced back to the SH2 domain in the ancestral eukaryotic SPT6 (transcription elongation factor SPT6). Interestingly, although the SH2 domain is the archetypal phosphotyrosine-binding domain, the SH2 domain in yeast SPT6 is a phosphoserine-binding domain.88 Following the evolution of phosphotyrosine recognition in pre-metazoa, the number of SH2 domains rapidly expanded in the metazoan


Figure 3. Versatility in SLiM recognition by SLiM-binding domains. (A) Superimposed structures of six SH2 domains bound to their ligands highlighting the similarity of their binding mode (red, peptide from SHC1 (SHC-transforming protein 1) (PDB 1QG1);499 yellow, PYxN-derived peptide (PDB 3OVE);500 light blue, RLNPYAQLWHR peptide (PDB 3TL0);501 light green, peptide from APP (amyloid beta A4 protein) (PDB 3MXC);502 dark blue, peptide from LAT2 (linker for activation of T-cells family member 2) (PDB 3MAZ);503 pink, PY191 peptide from LAT (PDB 1R1Q)354). (B) Superimposed structures of four RRM domains bound to their ligands (red, ULM peptide from SF3B1 (splicing factor 3B subunit 1) (PDB 2PEH);92 dark blue, RNA molecule (PDB 2KXN);90 green, PRI4 peptide ([ILVM]LGxxP) from mouse Raver1 (ribonucleoprotein PTB-binding 1) (PDB 3ZZZ);91 light blue, NBox peptide from FUBP1 (far upstream element-binding protein 1) (PDB 2KXH)93). (C) Superimposed structures of the C-terminal calponin homology (CH) domain of PARVA (alpha-parvin) showing binding of different LD1 motifs (blue and red) from PXN (paxillin) (PDB 2VZD and 2VZG).504 Note how the two peptides bind in an antiparallel orientation. (D) Superimposed structures of phosphatase docking interactions showing the SILK and RVxF docking sites on the catalytic domain of PP1 (red, rat Ppp1cc (serine/threonine-protein phosphatase PP1-gamma catalytic subunit) bound to mouse Ppp1r2 (protein phosphatase inhibitor 2) SILK motif (PDB 2O8A);505 blue, PPP1CB (serine/threonine-protein phosphatase PP1-beta catalytic subunit) bound to RVxF motif-containing chicken PPP1R12A (protein phosphatase 1 regulatory subunit 12A) (PDB 1S70)506). (E) Superimposed structures of eight WD40 repeat domains bound to their ligands (red, EH1 peptide from GSC (Homeobox protein Goosecoid) (PDB 2CE8);108 yellow, C-terminal degron of CCNE1 (G1/S-specific cyclin-E1) (PDB 2OVQ);109 light blue, degron of CTNNB1 (Catenin beta-1) (PDB 1P22);110 dark blue, peptide from fruit fly His4 (histone H4) (PDB 3C9C);111 brown, peptide from mouse Ezh2 (histone-lysine N-methyltransferase EZH2) (PDB 2QXV);112 dark green, peptide from BRCA2 (breast cancer type 2 susceptibility protein) (PDB 3EU7);113 purple, peptide from KMT2A (histone-lysine N-methyltransferase 2A) (PDB 3EMH);114 mint, KEN-box peptide from yeast ACM1 (APC/C-CDH1 modulator 1) (PDB 4BH6);33 orange, ABBA motif peptide from yeast ACM1 (PDB 4BH6);33 cyan, D-box peptide from ACM1 (PDB 4BH6)33).

permissive residues that promote binding. Often, a variety of residues are tolerated at permissive positions; however, not all residues are equally compatible, and, consequently, permissive positions can modulate the affinity, specificity, or selectivity of the SLiM.99 Many motif-mediated interactions also involve contacts with the peptide backbone of the motif, which contribute to the affinity of an interaction without providing much binding specificity.46 Finally, similar peptides without the ability to bind a given domain will often have nonpermissive residues that inhibit binding by introducing physical, chemical, or structural incompatibility with the binding domain.99 Most SLiM classes have simple specificity, affinity, and selectivity determinants. However, as introduced in section 3.2.3, many motif-binding domain families have rapidly expanded in specific lineages.87 Members of these large families often recognize a common core motif (e.g., phosphotyrosine,

polyproline, and hydrophobic C-terminal motifs recognized by the SH2, SH3, and PDZ domain families, respectively (Figure 4B, Table 4)), yet different domain family members bind distinct subsets of SLiM-containing proteins. Evolutionary refinement of the binding surfaces of these domains has permitted divergence of their binding preferences to allow complete separation, partial overlap, or complete overlap of their targets (Figure 4C). A semiquantitative analysis of recognition of phosphotyrosine-containing SLiMs by SH2 domains highlighted the contribution of exclusivity residues in addition to required, permissive, and nonpermissive residues.99 Exclusivity residues are permissive to a subset of members of a domain family but nonpermissive to other family members, allowing recognition by one subset while blocking binding to other homologous domains (Figure 4D). To further complicate the definition of a motif’s binding specificity, higher level paired


Figure 4. Intrinsic specificity determinants for SLiM binding. (A) Representative examples of peptides recognized by the SH2 domain of GRB2, showing the core PYxN motif. All instances are human unless marked otherwise (R for rat, M for mouse). (B) Rough specificities of several representative SH2 domains. (C) Motif space for a set of domains from a domain family with distinctive and overlapping binding specificities. Circles signify peptides that are recognized by the domain family. Enclosing lines signify the sets of peptides recognized by a given domain in the domain family. (D) Specificity map of the CRK and tyrosine-protein kinase HCK SH2 domains illustrating the negative positions (red), permissive positions (green), and exclusive positions (indicated by C with blue background for CRK and H with orange background for HCK). (E) Illustration of higher level paired positions for HCK (top) and CRK (bottom) where residues in one position of the motif can change whether a residue is allowed (green) or disallowed (red) in another position. HCK, for example, tolerates a proline in the +3 position (relative to the PY position) if there is a basic residue in the +4 position but not if a proline is present in the +4 position. Conversely, isoleucine in the +3 position shows the opposite preference for paired positions with the +4 position.

positions also exist that can either compensate or antagonize each other (Figure 4E).99 It should be noted that the plasticity of SLiM-binding domains often precludes the definition of a single motif regular expression for a given domain family, as every binding domain will have some exceptions. Many SLiM-binding domain families bind an experimentally validated SLiM instance that does not follow the classical binding determinants for the binding pocket,33,100,101 multiple examples of peptides binding in antiparallel orientations have been described (Figure 3C),102−107 and several domains with multiple distinct peptide-binding pockets have been characterized (Figure 3D).33,108 Likewise, some domain families, such as the WD40 repeat family, have diverse specificity for a wide range of physicochemically unrelated SLiMs (Figure 3E).108−116

3.3.2. Extrinsic Determinants. Binding specificity and affinity are generally augmented by extrinsic determinants that are encoded outside the interaction interface. Many proteins never meet in the cell, and therefore, most peptides will not be physiologically competitive in vivo. To facilitate this, the local abundance of natively disordered proteins is strictly controlled117,118 and restricted by cell compartmentalization,119 tissue specificity,120 and/or cell cycle phase,121 which allows for spatial and/or temporal separation of potential cross-reactive detrimental domain−peptide pairs.122 For example, the SH3 domain-binding motif of the yeast peroxisomal membrane protein PEX14 (87PTLPHR92) is highly promiscuous in vitro, having the intrinsic ability to bind multiple yeast SH3 domains. Yet, in vivo, PEX13 (peroxisomal membrane protein PAS20), the primary cellular target of this motif, is the only known SH3 domain-containing protein that colocalizes with PEX14 at the peroxisome.97 Furthermore, motifs frequently function cooperatively and hence only function with biologically relevant strength in the context of a specific complex where multivalent cooperative interactions increase the specificity and stability of binding (see sections 4.1.3.1 and 6.1).123

3.4. Regulation of SLiMs

Several recent analyses and reviews have emphasized the dynamic nature of SLiM-mediated interactions, providing strong evidence that SLiM-mediated interfaces are highly regulated both pre- and post-translationally in order to control protein localization, binding, stability, activity, and modification state.7,19,124−131

3.4.1. Pretranslational Regulation of SLiMs. Inclusion or exclusion of a SLiM by pretranslational mechanisms such as alternative splicing, alternative promoter usage, and RNA editing can rewire the interaction network of a protein isoform in a cell state- or tissue-specific manner, thereby altering its subcellular localization, half-life, binding partners, activity, or modification state, hence its function.127 Alternatively, changes in the flanking regions of a SLiM can have more subtle effects on the affinity or specificity of its interactions.125−127,132 Several recent studies have shown that SLiMs are enriched in nonconstitutive exons, suggesting that altering the regulatory potential of a protein by including or excluding a SLiM may be a significant factor in the functional diversity and tissue-specific activity of splice isoforms.124,126,133

3.4.2. Post-Translational Regulation of SLiMs. The intrinsic attributes of SLiMs also predispose them to post-translational regulation. First, due to the relatively weak affinity and limited number of specificity-determining residues of SLiM-containing interfaces, post-translational modification of a


Figure 5. Functional classification of SLiMs. SLiMs are divided into two general groups: ligand motifs that mediate binding events and post-translational modification motifs that are targeted for PTM. These can be further divided into several classes. The ligand motifs consist of regulatory motifs that control the localization, modification state, or stability of proteins and motifs that mediate assembly of macromolecular complexes. These classes are not necessarily mutually exclusive. Modification motifs can function as sites for PTM moiety addition/removal, proteolytic cleavage, or structural modification.

be divided into two general groups15,136 (Figure 5): (i) ligand motifs that mediate binding events (see section 4.1) and (ii) post-translational modification motifs that act as target sites for PTM by modifying enzymes (see section 4.2). These two groups can be further divided into distinct functional subsets of SLiMs that perform specific tasks. These include ligand motifs for docking to modifying enzymes, degradation initiated by recruitment of ubiquitin ligases, trafficking to and anchoring in specific subcellular locations, and molecular complex assembly. Modification motifs can act as sites for PTM moiety addition or removal, proteolytic cleavage, or structural modification. The following sections will describe these groups and subgroups, introducing representative examples to highlight both the diverse functionality and the ubiquity of SLiMs.

motif can have a substantial effect on the structural and physicochemical compatibility of the motif with its binding partner and hence provides the means to easily modulate the activity of a SLiM. Consequently, reversible PTM of motifs is a common mechanism to conditionally and dynamically regulate SLiM-mediated interactions.129,130 Several SLiM classes are intrinsically inactive when unmodified and require addition of a specific PTM moiety to a specific motif residue in order to be recognized by their target domains. For example, SH2, PTB, 14-3-3, FHA, and BRCT domain-binding motifs can only interact after a specific residue has been phosphorylated,134 while binding of HIF (hypoxia-inducible factor) proteins to VHL (von Hippel−Lindau protein) requires hydroxylation of a specific proline residue in their VHL-binding motifs.135 Alternatively, many motif-mediated interactions are inhibited by PTM of a residue in or adjacent to the motif.129,130 Second, the compact footprint of SLiMs facilitates the occurrence of regions with high functional density containing multiple adjacent or overlapping motifs. The distinct motifs can act cooperatively, combining multiple low-affinity interactions to mediate high-avidity binding and promote the assembly of metastable protein complexes. By regulating each motif individually, these interactions can be modulated to varying degrees. Alternatively, overlapping or adjacent SLiMs provide mutually exclusive binding regions that compete for different binding partners, allowing functionally distinct complexes to be assembled in a highly controlled manner. 
The balance of competition can be shifted in favor of a specific interactor by altering the local abundance of the competitors or by modulating the specificity of the motif-containing region toward different binding partners, for instance, by PTM.130 Finally, in contrast to direct modulation of an interaction interface, binding of some SLiMs depends on allosteric mechanisms, which act in response to a perturbation, for instance, a binding or PTM event, at a site that is distinct from the motif-mediated interface.130 To date, a substantial number of conditional SLiM-mediated interactions have been characterized, highlighting the central role that post-transcriptional modulation of SLiMs plays in the regulation of dynamic cellular processes.19 These dynamic binding properties are advantageous to signaling proteins, and consequently, SLiMs are commonly used to mediate context-dependent assembly of dynamic macromolecular complexes (see sections 4.1.3 and 6).
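The phosphorylation-dependent switch described above can be sketched as a motif match that only counts when the critical residue carries the required modification. The example below is a toy model under stated assumptions: the `Protein` class and `sh2_binding_sites` function are hypothetical, the core pattern is the SH2-type `PYxx[IV]` pattern from Table 4 read as a phosphotyrosine followed by `xx[IV]`, and the FAK1 peptide 397PYAEI400 is read as residues Y397 to I400 with Y397 phosphorylated.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Protein:
    """Toy model: a sequence fragment plus the residue numbers that are phosphorylated."""
    name: str
    sequence: str
    offset: int = 1  # residue number of sequence[0]
    phosphosites: set[int] = field(default_factory=set)


def sh2_binding_sites(protein: Protein, core: str = r"Y..[IV]") -> list[int]:
    """Return residue numbers of tyrosines that match the core AND are phosphorylated.

    The unmodified motif is treated as inactive, mirroring the statement that
    SH2-binding motifs are only recognized after phosphorylation.
    """
    sites = []
    for match in re.finditer(f"(?=({core}))", protein.sequence):
        residue_number = protein.offset + match.start()
        if residue_number in protein.phosphosites:
            sites.append(residue_number)
    return sites


if __name__ == "__main__":
    fak1 = Protein("FAK1", "YAEI", offset=397)  # fragment quoted in section 4.1.1.1
    print(sh2_binding_sites(fak1))   # no phosphate: motif is inactive
    fak1.phosphosites.add(397)       # kinase adds the phosphate
    print(sh2_binding_sites(fak1))   # motif is now a candidate SH2 ligand
```

The same structure extends to inhibitory modifications by rejecting a match when a flanking residue is modified, although real switches also depend on affinity changes that a boolean model cannot capture.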

4.1. Ligand Motifs

The group of ligand motifs consists of SLiMs that mediate noncatalytic protein interactions, and by definition, after dissociation of the interacting proteins neither the SLiM nor the SLiM-binding domain is altered chemically or structurally. Functionally, several types of ligand motifs can be defined (Figure 5).15 These classes are partially overlapping and thus not mutually exclusive, meaning that a specific motif can mediate functions associated with more than one class. Several classes of ligand motif play a specific role in regulating protein function by mediating interactions that determine the modification state, stability, or subcellular localization of a protein. For example, modifying enzymes are often recruited to substrates by motifs distinct from the site to be modified. These enzyme recruitment sites (see section 4.1.1), known as docking motifs, are utilized extensively to increase the substrate specificity of modifying enzymes. A specific class of docking motif, known as degradation motifs or degrons, controls protein stability. The second class are targeting motifs (see section 4.1.2) that facilitate the correct localization of proteins by functioning as trafficking motifs, which direct translocation of proteins between subcellular compartments by promoting their recruitment to specific transport pathways, or anchoring motifs, which retain localized proteins at a subcellular location by binding to compartment-specific complexes. Finally, the largest class of ligand motifs covers SLiMs that do not perform a specific regulatory function but rather act as simple binding modules (see section 4.1.3). Often they serve as complex assembly modules, binding in a concerted manner to mediate formation of large macromolecular complexes. By discussing a variety of examples, the following sections will describe each of these ligand motif classes, highlighting their importance for controlling protein functionality and their prevalence in cell

4. FUNCTIONS OF SLIMS

SLiM-mediated interfaces have been characterized in a wide range of cellular roles. On the basis of their function, SLiMs can


Table 5. Representative Examples of Docking Motifs

enzyme | enzyme type | description | motif pattern (a,b)
mitogen-activated protein kinase (MAPK) | kinase | docking site in substrates and regulators of the MAP kinase family | [KR]{0,2}[KR]x{0,2}[KR]x{2,4}ΦxΦ or FxFP
cyclin-dependent kinase (CDK) | kinase | docking site in substrates and inhibitors of cyclin-CDK complexes | [RK]xLx{0,1}Φ
serine-protein kinase ATM (c) | kinase | docking site in substrates phosphorylated by PIKK (d) family members upon DNA damage | [DEN][DEN]x{2,3}Φ[DEN][DEN]L
serine/threonine-protein kinase OSR1 (e) and STE20/SPS1-related proline-alanine-rich protein kinase (SPAK) | kinase | docking site in substrates phosphorylated by OSR1/SPAK kinases | RFx[IV]
serine/threonine-protein phosphatase 2B (PP2B) | phosphatase | docking site in substrates dephosphorylated by Calcineurin phosphatases | PxIx[IV]
serine/threonine-protein phosphatase PP1 (PP1) | phosphatase | docking site in substrates dephosphorylated by protein phosphatase 1 catalytic subunit (PP1c) | [RK]x{0,1}[VIL]x[FW]
Tankyrase | poly[ADP-ribose] polymerase (PARP) | docking site in substrates of the PARP family members Tankyrase 1 and 2 | Rxx[PGAV][DEIP]G

(a) The syntax used for notation of motif regular expressions is described in the legend of Figure 1.
(b) An extended list of docking motifs is available in the ELM resource.15
(c) Ataxia telangiectasia mutated.
(d) Phosphatidylinositol 3-kinase-related kinase.
(e) Oxidative stress-responsive 1 protein.
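As a consistency check, the docking-motif instances quoted in section 4.1.1.1 can be tested against the Table 5 patterns. The sketch below is illustrative only and rests on one labeled assumption: Φ is expanded to a generic hydrophobic class `[ILVMFWY]`, whereas the review defines its own notation in the legend of Figure 1, which may differ. The names `to_regex`, `INSTANCES`, and `check_instances` are hypothetical.

```python
import re

# Assumption: Φ stands for a hydrophobic residue; the exact residue set used by
# the review (legend of Figure 1) is not reproduced here and may differ.
HYDROPHOBIC = "[ILVMFWY]"


def to_regex(pattern: str) -> str:
    """Convert Table 5 notation to a Python regex (x -> any residue, Φ -> hydrophobic)."""
    return pattern.replace("x", ".").replace("Φ", HYDROPHOBIC)


MAPK = "[KR]{0,2}[KR]x{0,2}[KR]x{2,4}ΦxΦ"
CDK = "[RK]xLx{0,1}Φ"

# (enzyme, pattern from Table 5, protein, peptide as quoted in the text)
INSTANCES = [
    ("MAPK", MAPK, "MEF2A", "RKPDLRVVI"),
    ("MAPK", MAPK, "MAP2K1", "KKKPTPIQL"),
    ("MAPK", MAPK, "MKP1", "RRAKGAMGL"),
    ("CDK", CDK, "RB1", "KKLRF"),
    ("CDK", CDK, "E2F1", "RRLDL"),
    ("PP1", "[RK]x{0,1}[VIL]x[FW]", "MYPT1", "KVKF"),
    ("Tankyrase", "Rxx[PGAV][DEIP]G", "SH3BP2", "RSPPDG"),
]


def check_instances(instances):
    """Map (enzyme, protein) to True if the peptide contains the docking pattern."""
    return {
        (enzyme, protein): re.search(to_regex(pattern), peptide) is not None
        for enzyme, pattern, protein, peptide in instances
    }


if __name__ == "__main__":
    for key, matched in check_instances(INSTANCES).items():
        print(key, "matches" if matched else "does not match")
```

Under this reading of Φ, every quoted peptide is compatible with its pattern. A negative result would not prove an instance wrong, since the table patterns are described as representative rather than exhaustive.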

regulation. The final section summarizes the main characteristics of the diverse functionality of ligand motifs and their importance for cell regulation (see section 4.1.4).

4.1.1. Ligand Motifs and Enzyme Recruitment. Docking motifs are ligand motifs that act as recruitment sites for modifying enzymes (Table 5). Although many protein-modifying enzymes preferentially target specific peptides for modification (see section 4.2) depending on the stereochemical requirements of their active site for catalysis, these target sequence constraints are often not stringent enough to explain the substrate specificity that an enzyme exhibits in vivo.137 The enzyme−substrate interactions are often promoted by protein−protein interaction interfaces involving docking motifs that are distinct from the modification target site.138 These docking sites complement the specificity of modification target sites by promoting spatial and temporal colocalization of enzyme and substrate and thereby decrease the likelihood of off-target modification (similar to the scaffolding function described for the AKAP (A-kinase anchor protein) family in section 4.1.3.3).139,140 A specific subset of enzyme-recruitment motifs, known as degrons, targets proteins to ubiquitin ligase complexes, which polyubiquitylate the degron-containing proteins and thereby mark them for proteasomal degradation.131 Therefore, degrons can be considered a functionally distinct type of ligand motif that regulates protein stability. The enzyme recruitment motifs play an important role in cell signaling as they mediate interactions that determine the modification state of a protein and indirectly regulate the function of a protein by modulating its stability, activity, or subcellular localization in a conditional manner. In the following sections, several aspects of docking (see section 4.1.1.1) and degron (see section 4.1.1.2) motifs are discussed and illustrated with examples.

4.1.1.1. Classical Docking Motifs and Enzyme Recruitment. Several modes of docking motif-dependent substrate recruitment have been characterized. Numerous enzymes have a binding site for docking motifs located on their catalytic domain but on a surface that is distinct from the active site. For example, accurate MAPK (mitogen-activated protein kinase) signaling depends on docking motifs that bind to a docking groove on the catalytic domain of MAPKs.141 These docking motifs can be found in many MAPK binding partners, including substrates such as the transcriptional activator MEF2A (myocyte-specific enhancer factor 2A) (269RKPDLRVVI277), but also in activators such as MAP2K1 (MAPK kinase 1) (3KKKPTPIQL11) and in negative MAPK regulators such as MKP1 (MAPK phosphatase 1) (54RRAKGAMGL62).139,142,143 Docking interactions can also be mediated by separate interaction modules that are within the same protein as the modifying catalytic domain. For instance, docking of the SRC (proto-oncogene tyrosine-protein kinase Src) SH2 domain to the 397PYAEI400 motif in FAK1 (focal adhesion kinase 1) is required for SRC-dependent phosphorylation and activation of FAK1.144 Similarly, a polo-box domain in PLK1 (polo-like kinase 1) binds to a phosphorylated motif (49SPSP51) in CDC25B, recruiting PLK1 to phosphorylate several additional sites.145 Finally, the binding domain that recruits the docking motif can be located in a different protein, in which case substrate binding and subsequent modification depend on prior assembly of a complex consisting of the catalytic protein and the substrate-recognition protein. Such a mechanism is used to recruit CDKs to certain substrates whose phosphorylation depends on docking of their cyclin-binding motif to a CDK-associated cyclin molecule.146 For example, the 873KKLRF877 and 90RRLDL94 cyclin-binding motifs of RB1 (retinoblastoma-associated protein) and transcription factor E2F1, respectively, are required for phosphorylation-promoted disassembly of the RB1-E2F1 complex.147 Use of motif-mediated docking interactions to recruit substrates to modifying enzymes is not restricted to kinases. The Tankyrases, members of the PARP (poly ADP-ribose polymerase) family, catalyze attachment of ADP-ribose moieties to their substrate proteins.148 Recognition of specific targets is mediated by docking interactions between the Tankyrase N-terminal ankyrin repeats and Tankyrase-binding motifs present in the substrates, for instance, SH3BP2 (SH3 domain-binding protein 2) (415RSPPDG420).149 Subsequent ribosylation affects the function of the target proteins by direct modulation of their activity, rewiring their interaction network, or altering their stability.148,150,151 Moreover, not only addition but also regulated removal of PTM moieties can be mediated by docking motifs. Phosphatases antagonize kinases by catalyzing removal of phosphate groups from modified proteins; however, these enzymes have very low to no intrinsic substrate specificity.152 Hence, phosphatases recognize a variety of docking motifs to target the proper substrates. For instance,
The Tankyrases, members of the PARP (poly ADP-ribose polymerase) family, catalyze attachment of ADP-ribose moieties to their substrate proteins.148 Recognition of specific targets is mediated by docking interactions between the Tankyrase N-terminal ankyrin repeats and Tankyrase-binding motifs present in the substrates, for instance, SH3BP2 (SH3 domain-binding protein 2) (415RSPPDG420).149 Subsequent ribosylation affects the function of the target proteins by direct modulation of their activity, rewiring their interaction network, or altering their stability.148,150,151 Moreover, not only addition but also regulated removal of PTM moieties can be mediated by docking motifs. Phosphatases antagonize kinases by catalyzing removal of phosphate groups from modified proteins; however, these enzymes have very low to no intrinsic substrate specificity.152 Hence, phosphatases recognize a variety of docking motifs to target the proper substrates. For instance, 6743


Table 6. Representative Examples of Degradation Motifs (Degrons)

ubiquitin ligase complex | substrate-recognition subunit | description | motif pattern (a,b)
E3 ubiquitin-protein ligase Mdm2 (MDM2) | | docking site in substrates of the MDM2 ubiquitin ligase | FxxxWxxΦ
ubiquitin carboxyl-terminal hydrolase 7 (USP7) (c) | | docking site in substrates of the deubiquitylating enzyme USP7 | [PA]xxS
anaphase-promoting complex/cyclosome (APC/C) | cell division cycle protein 20 homologue (CDC20) or Fizzy-related protein homologue (FZR1) | RxxL-based D-box APC/C degron | RxxLxxΦ
anaphase-promoting complex/cyclosome (APC/C) | cell division cycle protein 20 homologue (CDC20) or Fizzy-related protein homologue (FZR1) | KEN-box APC/C degron | KEN
Cullin ring ligase 4 (CRL4) | denticleless protein homologue (DTL) | PCNA-dependent DTL (CDT2) degron | ΦT[DEN]xxxx[KR] (d)
SKP1-CUL1-F-box (SCF) protein | F-box/WD repeat-containing protein 7 (FBXW7) | phospho-dependent FBXW7 degron | PTPxxP[ST]
SKP1-CUL1-F-box (SCF) protein | S-phase kinase-associated protein 2 (SKP2) and cyclin-dependent kinases regulatory subunit 1 (CKS1B) | phospho-dependent SKP2-CKS1B degron | [DE]xPTPxK (e)
SKP1-CUL1-F-box (SCF) protein | F-box/WD repeat-containing protein 1A (BTRC) | phospho-dependent BTRC degron | DPSGx{2,3}P[ST]
SKP1-CUL1-F-box (SCF) protein | protein transport inhibitor response 1 (TIR1) | plant-specific degron, auxin-dependent TIR1 degron | [VLIA][VLI]GWPP[VLI]xxxR

(a) The syntax used for notation of motif regular expressions is described in the legend of Figure 1.
(b) An extended list of degron motifs is available in the ELM resource.15
(c) Although not a degron, the docking site for USP7 regulates protein stability by antagonizing the action of ubiquitin ligases, for instance, ubiquitylation of p53 by MDM2, and thereby prevents proteasomal degradation.
(d) Must overlap PIP box and bind PCNA to be recognized.
(e) Requires formation of a composite binding site by SKP2 and CKS1B.

of the substrate’s degron. These phospho-degrons provide an additional layer of control to confer context-dependent protein degradation and allow decision making based on integration of multiple signals. For instance, BTRC-dependent degradation of CTNNB1 (catenin beta-1) (32DPSGIHPS37) is controlled by priming-dependent phosphorylation (see section 4.2.4) of two sites within its degron by GSK3B (glycogen synthase kinase-3 beta), which occurs in the absence of Wnt signaling.162 The APC/C is a major ubiquitin ligase that drives progression through the cell cycle by targeting several key cell cycle regulatory proteins for proteasomal degradation. Recruitment to the APC/C depends on D-box and KEN-box motifs in its substrates, which include Securin (8DKENG12) (the inhibitor of Separase, a protease responsible for cleavage of sister chromatid cohesion proteins)163 and some major cell cycle kinases such as the cyclin-dependent kinases (by destroying the cyclin subunits, for instance, cyclin-B1 (41PRTALGDIG49)),164,165 Aurora kinases (3QKENS7 in Aurora kinase B),166 and Polo-like kinases (336NRKPLTVLN344 in PLK1).167 The strict temporal regulation of the ubiquitin ligase activity and substrate specificity of the APC/C ensures timely destruction of the correct substrates and restricts APC/ C-mediated degradation to a window between metaphase and late G1 phase of the following cell cycle.84 An additional strategy to regulate context-dependent protein degradation is the use of composite binding sites, where binding of a degron motif depends on preassembly of a molecular complex. Illustrative examples of this mechanism can be found in the control of hormone-induced signaling in plants. 
Bioactive forms of the phytohormone auxin regulate diverse aspects of plant development and growth, in part by inducing a transcriptional response through targeted destruction of a set of transcriptional repressors, the Aux/IAA protein family members.168 Degradation of Aux/IAA proteins depends on TIR1 (transport inhibitor response 1), which functions as a hormone-specific target-recognition subunit of the SCF ubiquitin ligase. Binding of auxin to TIR1 creates a composite binding site for TIR1-binding degrons, for instance, in Arabidopsis thaliana IAA7 (indoleacetic acid-induced protein 7) (82QVVGWPPVRNYRK94).169 This provides a simple yet

many interaction partners of PP1 (serine/threonine-protein phosphatase 1) possess one or more FxxRxR, RVxF, SILK, and/or MyPhoNE motifs.152−156 These docking motif-containing PP1-binding proteins include PP1 regulators such as MYPT1 (myosin phosphatase-targeting subunit 1) (10RNEQLKRW17 and 35KVKF38).153 4.1.1.2. Degron Motifs and Ubiquitin Ligase Recruitment. The number of copies of a protein present in a cell at a specific time or under specific conditions is controlled by a combination of the rate of protein synthesis and degradation. Selective degradation of specific proteins by ubiquitin-mediated proteolysis plays an important role in controlling a wide variety of cellular processes, such as the cell cycle33 and apoptosis,157 and allows proteins to exhibit widely differing half-lives, ranging from a few minutes to several days.158 Proteins can be targeted for proteasomal degradation by polyubiquitylation, which is catalyzed by an E2 ubiquitin-conjugating enzyme in cooperation with an E1 ubiquitin-activating enzyme and an E3 ubiquitin ligase.157 The specificity and selectivity of the degradation machinery for a target protein is mainly determined by the E3 ubiquitin ligases, since these proteins are responsible for substrate recognition. Often E3 ubiquitin ligases recognize substrates by binding to short degradation motifs, a subset of docking motifs known as degrons (Table 6).159 Deubiquitylating enzymes can antagonize the activity of ubiquitin ligases, and the stability of target proteins depends on the regulated interplay between these two types of enzyme. 
For instance, the abundance of p53 is largely determined by the antagonistic actions of the ubiquitin ligase MDM2 (double minute 2 protein) and the deubiquitylating enzyme USP7 (ubiquitin carboxyl-terminal hydrolase 7), both recruited by docking motifs in p53.160 As degrons can target proteins for destruction, tight regulation of their function is of vital importance and a range of mechanisms exist to ensure timely and context-dependent activation of degron motifs.131 Substrate recruitment by several substrate recognition subunits of the SCF (SKP1-CUL1-F-box protein) ubiquitin ligase, including FBXW7 (F-box/WD repeat-containing protein 7)161 and BTRC (F-box/WD repeat-containing protein 1A),162 depends on prior phosphorylation

dx.doi.org/10.1021/cr400585q | Chem. Rev. 2014, 114, 6733−6778

Chemical Reviews

Review

Figure 6. Functionality of targeting motifs. Many processes involved in intracellular trafficking of proteins are mediated by SLiMs (left). The examples discussed in the text include recognition of endocytic motifs in plasma membrane proteins such as CD4 (434QIKRLL439) and EGFR (998YRAL1001) by AP-2 complexes; ER retention/retrieval mechanisms for ER membrane proteins such as WBP1 (427KKTN−COOH) and soluble ER proteins such as ENPL (800KDEL−COOH) and P4HB (505KDEL−COOH); nuclear import of NLS-containing proteins such as SV40 Large T antigen (127KKKRK131), Xenopus laevis Nucleoplasmin (155KRPAATKKAGQAKKKKL171), p53 (305KRALPNNTSSSPQPKKK321), and BRCA1 (503KRKRR507); and nuclear export of NES-containing proteins such as p53 (339EMFRELNEALELKD352) and CTND1 (942ESLEEELDVLVLDDE956). SLiMs also promote the correct localization of proteins by acting as anchors that mediate interactions with location-specific structures or complexes, thereby retaining proteins in a specific location (right). Examples discussed in the text include anchoring of proteins to membranes by protein lipidation, which depends on modification motifs; spatial restriction of PKA by AKAPs; targeting of +TIP proteins to microtubules via SxIP motifs, for instance, in CLASP2 (492KRSKIPRSQGC502 and 515RSSRIPRPSVS525) and APC protein (2801RPSQIPTPVNN2811).

specific and reliable mechanism to quickly and appropriately respond to the presence of a stimulus. A final group of degrons of note are the N-terminal degradation signals or N-degrons, which target proteins for destruction by the N-end rule pathway.170 N-Degrons bind to ubiquitin ligases known as N-recognins, which contain UBR domains.171,172 They can be constitutively active or conditionally regulated by PTM. This includes N-terminal exposure of embedded degrons by proteolytic cleavage or transformation of pro-N-degrons to N-degrons, for example, via arginylation of the N terminus. Other types of PTM such as N-acetylation can block binding of a degron to UBR domains.170,173 Thus, N-terminal degradation signals represent a further example of SLiM-directed spatiotemporal control of protein abundance and clearance in a context-dependent manner.

4.1.2. Ligand Motifs and Protein Targeting. Eukaryotic cells are highly compartmentalized, having both physical and functional subcellular compartments that each contain a specific set of proteins to mediate specific cellular processes.174,175 Segregation of proteins to the correct compartments is vital for their functionality, and incorrect localization often leads to aberrant activity as evidenced by the numerous diseases associated with protein mislocalization (see section 7).176 The view that cytosolic proteins freely diffuse around the cell is being challenged, and the majority of proteins are likely to be actively transported to their destination.177 Understanding the intracellular transport infrastructures of the cell is key to understanding the spatial separation of proteins and the subcompartmentalization of the proteome. The correct localization of proteins is often directed by targeting motifs (Figure 6), which can be divided into two subsets: (i) trafficking motifs that mediate binding to the transport machinery that directs relocalization of proteins to a specific subcellular location (see section 4.1.2.1) and (ii) anchoring motifs that are not directly involved in protein trafficking but are recognized by biomolecules specific to a subcellular location and thereby allow the motif-containing protein to be retained at that location (see section 4.1.2.2). Multiple trafficking and anchoring motifs have been characterized to date (Table 7), several of which are discussed in the following sections, and it is likely that many more remain to be discovered.

4.1.2.1. Ligand Motifs and Protein Trafficking. Vesicular trafficking of proteins in the endocytic and secretory pathways is a multistep process that requires assembly of protein


Table 7. Representative Examples of Targeting Motifs

| class | motif | initial compartment | final compartment | description | motif pattern (a,b) |
|---|---|---|---|---|---|
| trafficking | dibasic ER-retrieval motif | Golgi | endoplasmic reticulum (ER) | motif recognized by COPI coatomer | Rx{0,1}R or Kx{0,1}Kx{2,3}−COOH |
| trafficking | dileucine sorting signal | trans Golgi network (TGN) | lysosome/endosome compartment | motif recognized by the VHS domain of GGA (c) clathrin adaptor | DxxLLx{1,2}−COOH |
| trafficking | ER-retention motif (KDEL motif) | Golgi | endoplasmic reticulum (ER) | motif recognized by KDEL receptors | [KRH]DEL−COOH |
| trafficking | nuclear export signal (NES) | nucleus | cytoplasm | motif recognized by the Exportin 1-like protein domain of Exportins | ΦxxxΦxxΦxΦ |
| trafficking | nuclear localization signals (NLS) | cytoplasm | nucleus | motif recognized by the Armadillo repeats of Importin subunit alpha | core of basic residues |
| trafficking | peroxisomal targeting signal (PTS) | cytoplasm | peroxisome | motif recognized by the TPR domain of Peroxin-5 (PEX5) | SKL−COOH |
| trafficking | QVxP motif | Golgi | cilia membrane | motif for transport from the Golgi to the ciliary membrane | QVxPx−COOH |
| trafficking | tyrosine-based endocytic motif | membrane | intracellular compartments | motif recognized by the endocytic machinery | YxxΦ |
| anchoring | PIP Box | | DNA replication fork | motif recognized by proliferating cell nuclear antigen (PCNA) | QxxΦxx[HFM][FMY] |
| anchoring | SxIP motif | | microtubule ends | motif recognized by the EBH domain of microtubule end-binding (EB) proteins | Sx[IL]P |

a The syntax used for notation of motif regular expressions is described in the legend of Figure 1. b An extended list of targeting motifs is available in the ELM resource.15 c GGA: Golgi-localized, gamma-ear-containing, Arf-binding family of proteins.
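The motif patterns in Table 7 map closely onto ordinary regular expressions, which the following minimal sketch tries to make explicit. It assumes that `x` stands for any residue, that a trailing `−COOH` anchors the motif at the C-terminus, and that Φ can be approximated by the residue set `[FILMVWY]`; that set and the helper name `to_regex` are illustrative assumptions and are not taken from the Figure 1 legend. The test peptides are the examples quoted in the Figure 6 caption.

```python
import re

# Assumption: Φ (hydrophobic residue) is approximated by this residue set.
PHI = "[FILMVWY]"

def to_regex(pattern: str) -> re.Pattern:
    """Translate the table's motif notation into a compiled Python regex."""
    body = pattern.replace("Φ", PHI).replace("x", ".")
    # A trailing "−COOH" (or ASCII "-COOH") anchors the motif at the C-terminus.
    body = re.sub(r"[−-]COOH$", "$", body)
    return re.compile(body)

# (motif pattern from Table 7, example peptide from the Figure 6 caption)
EXAMPLES = [
    ("[KRH]DEL−COOH", "KDEL"),            # ENPL and P4HB C-terminus
    ("Kx{0,1}Kx{2,3}−COOH", "KKTN"),      # WBP1 C-terminus
    ("YxxΦ", "YRAL"),                     # EGFR endocytic motif
    ("Sx[IL]P", "KRSKIPRSQGC"),           # CLASP2 SxIP motif
]

for pattern, peptide in EXAMPLES:
    rx = to_regex(pattern)
    print(pattern, "->", rx.pattern, "| matches", peptide, ":", bool(rx.search(peptide)))
```

The "Rx{0,1}R or Kx{0,1}Kx{2,3}−COOH" entry is an alternation of two patterns, so in this sketch each alternative would be translated separately.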

cell is the process that mediates nuclear entry and exit.186 The nuclear envelope, which provides a boundary between the cytoplasm and the nucleus, is perforated by large macromolecular complexes known as nuclear pore complexes (NPCs) that allow bidirectional transfer of molecules between the nucleus and the cytoplasm. Small molecules can diffuse through these pores, while proteins larger than 40 kDa require active transport to cross the nuclear envelope.187 Several active transport pathways exist, many of which are specialized to facilitate nucleo-cytoplasmic trafficking of specific proteins or under specific conditions. However, the majority of proteins are translocated by the two classical transport systems: Importin alpha/beta (KPNA and KPNB) complexes to transport cargoes into the nucleus and the Exportin-1 (XPO1) machinery to transport them out (Figure 6).186 Cargo selection for transport generally occurs through recognition of SLiMs, and hundreds of import and export motif instances have been discovered to date.81,188 Cargoes selected for nuclear import contain a basic monopartite or bipartite motif known as a nuclear localization signal (NLS). The motifs were originally characterized in Simian virus 40 (SV40) Large T antigen47 and Xenopus laevis Nucleoplasmin;189 however, a multitude of proteins are now known to utilize NLS motifs for nuclear entry, including therapeutically relevant proteins such as the tumor suppressor p53190 and BRCA1 (breast cancer type 1 susceptibility protein).191 Conversely, cargoes for nuclear export contain helical hydrophobic nuclear export signal (NES) motifs recognized by XPO1, for example, p53192 and CTND1 (Catenin delta-1).193

4.1.2.2. Ligand Motifs and Protein Retention by Anchoring. Once proteins are transported to the correct compartment, they can be retained through location-specific interactions, which often depend on SLiMs (Figure 6).
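The distinction between monopartite and bipartite NLS motifs can be sketched with two regular expressions applied to the four NLS examples given in the Figure 6 caption. Both patterns are simplified assumptions chosen only to fit those examples (a run of at least four basic residues, and two basic clusters separated by a 10 to 12 residue linker); they are not the curated ELM definitions, and `classify_nls` is a hypothetical helper.

```python
import re

# Illustrative assumptions, not curated motif definitions:
# monopartite -> a run of at least four basic residues (K or R);
# bipartite   -> two basic residues, a 10-12 residue linker, then >= 3 basic residues.
MONOPARTITE = re.compile(r"[KR]{4,}")
BIPARTITE = re.compile(r"[KR]{2}.{10,12}[KR]{3,}")

def classify_nls(peptide):
    """Label a peptide as 'bipartite', 'monopartite', or None under the toy patterns."""
    if BIPARTITE.search(peptide):
        return "bipartite"
    if MONOPARTITE.search(peptide):
        return "monopartite"
    return None

NLS_EXAMPLES = {
    "SV40 Large T antigen": "KKKRK",
    "Nucleoplasmin": "KRPAATKKAGQAKKKKL",
    "p53": "KRALPNNTSSSPQPKKK",
    "BRCA1": "KRKRR",
}

for protein, peptide in NLS_EXAMPLES.items():
    print(protein, peptide, classify_nls(peptide))
```

The bipartite pattern is tested first because a bipartite signal can also contain a basic run long enough to satisfy the monopartite pattern, as in Nucleoplasmin.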
Many important cellular compartments are enclosed within membranes that separate organelles from the rest of the cell.123 Targeting of soluble cytosolic proteins to membranes, and even to specific

transport complexes, recruitment of specific cargo proteins to these complexes, translocation of the cargo-loaded complexes between compartments, and their delivery to the correct target compartment.178 Cargo proteins often contain SLiMs that are recognized by adaptor coat proteins of the different pathways and act as sorting signals to ensure the cargo is transported to the correct location (Figure 6).179 Endocytic cargo adaptors recognize dileucine- or tyrosine-based internalization motifs that are present in the cytoplasmic portion of integral plasma membrane proteins, such as CD4 (T-cell surface glycoprotein CD4) and EGFR (epidermal growth factor receptor),180,181 and bind to the sigma and mu subunits of AP-2 (adaptor protein complex 2), respectively.182 Transport of proteins between the endoplasmic reticulum (ER) and the Golgi apparatus is also directed by several trafficking SLiMs. Cargo proteins containing a C-terminal dilysine trafficking motif are transported from the Golgi to the ER by coatomer complex COPI-coated vesicles.
These motifs ensure retention in the ER or retrieval from post-ER compartments of ER membrane proteins, such as the yeast WBP1 (oligosaccharyl transferase subunit beta), which is required at the ER for glycosylation of nascent proteins.183 Retrieval of many soluble ER-resident proteins from post-ER compartments depends on a C-terminal KDEL motif that binds to the seven transmembrane KDEL receptors (ER lumen protein retaining receptors), which in turn interact with subunits of the COPI complex.184 Important KDEL motif-containing proteins that are recycled to the ER lumen include chaperones that assist in protein folding, processing, and transport, for instance, ENPL (endoplasmin) and enzymes such as P4HB (protein disulfide-isomerase), which catalyzes rearrangements of disulfide bonds.185 Proteins not destined for secretion or localization along the secretory pathway are synthesized in the cytosol, from where they can be transported to other subcellular compartments. The canonical and best-studied trafficking mechanism in the


Table 8. Representative Examples of Complex Assembly Motifs

| function | ligand | motif | description | motif pattern (a,b) |
|---|---|---|---|---|
| cell cycle | mitotic spindle checkpoint component MAD2 (MAD2) | MAD2-interacting motif (MIM) | motif recognized by the HORMA domain of the mitotic spindle checkpoint protein MAD2 | [KR][IV][LV]xxxxxP |
| cell−cell adhesion | integrin family | RGD motif | short motif in many extracellular matrix proteins recognized by integrin family members | RGD |
| chaperone | 14-3-3 family | 14-3-3-binding motif | phospho-dependent motif recognized by 14-3-3 proteins | Rxxp[ST]xP |
| cytoskeleton | actin family | WH2 motif | hydrophobic helical motif recognized by Actin | [R]xx[ILVMF][ILMVF]xx[ILVM] |
| DNA damage checkpoint | breast cancer type 1 susceptibility protein (BRCA1) | BRCA1-binding motif | phospho-dependent motif recognized by BRCA1 | pSxxF |
| DNA replication | proliferating cell nuclear antigen (PCNA) | PCNA interacting protein (PIP) box | ancient motif recognized by PCNA | Qxx[ILM]x[HFM][FMY] |
| protein folding | stress-induced-phosphoprotein 1 (STIP1) | EEVD motif | C-terminal motif in heat shock proteins that binds the tetratricopeptide repeat (TPR) domains of STIP1 | EEVD−COOH |
| signal transduction | TNF receptor-associated factor 2 (TRAF2) | TRAF2-binding motif | motif recognized by the meprin and TRAF-homology (MATH) domain of TRAF2 | [PSAT]x[QE]E |
| splicing | splicing factor U2AF 65 kDa subunit (U2AF65) | U2AF homology motif (UHM) ligand motif (ULM) | motif recognized by noncanonical RNA recognition motif (RRM) domains of U2AF65 | [KR]{1,4}[KR]x[KR]W |
| telomere maintenance | telomeric repeat-binding factor 1 (TERF1) | TERF1-binding motif | short motif in TERF1-interacting nuclear factor 2 (TINF2) recognized by TERF1 | [FY]xLxP |
| transcription | transducin-like enhancer protein 1 (TLE1) | engrailed homology domain 1 (EH1) motif | motif recognized by the WD40 repeat domains of Groucho/TLE corepressors | [FYH]x[IVM]xx[ILM][ILMV] |
| translation | eukaryotic translation initiation factor 4E (EIF4E) | eIF4E-binding motif | motif recognized by eIF4E involved in the regulation of translation initiation | YxxxL[VILMF] |
| vesicular transport | tumor susceptibility gene 101 protein (TSG101) | PTAP motif | motif recognized by the ubiquitin E2 variant (UEV) domain of the vacuolar sorting protein TSG101 | P[ST]AP |

a The syntax used for notation of motif regular expressions is described in the legend of Figure 1. b An extended list of complex assembly motifs is available in the ELM resource.15

tumor suppressor protein APC (adenomatous polyposis coli protein), function as general microtubule tip localization signals (Figure 6).198 Conversely, proteins can be recruited to localized complexes with distinct subcellular localizations by ligand motifs within these complexes. As the role of the motif is to retain the protein that recognizes these motifs at that location, they can be considered anchoring motifs. For example, members of the AKAP family, described in section 4.1.3.3, spatially restrict cAMP-dependent PKA (protein kinase A) to a limited set of subcellular locations and as such the PKA-binding motifs in the AKAP family members function as anchoring motifs. However, these motifs are also important for the role of AKAPs as scaffolds that increase the specificity and efficiency of PKA signaling by bringing multiple components of this pathway together and as such can also be defined as motifs for docking (see section 4.1.1.1) and complex assembly (see section 4.1.3.3), emphasizing the functional overlap between the different ligand motif classes. 4.1.3. Ligand Motifs and Protein Complex Assembly. Most processes in the cell are not carried out by biomolecules in isolation but in assemblies known as macromolecular complexes.2,11,123 Complex formation is facilitated by the use of different types of interaction interfaces with a wide range of distinct binding attributes (see section 2), and the properties of SLiMs (see section 3) make them ideal binding modules to drive the regulated assembly of large dynamic complexes in a context-dependent manner. Their common use for this purpose is evidenced by the many examples of SLiM-mediated interfaces involved in formation of binary and large multiprotein complexes that have been rapidly accumulating since the discovery of the archetypal SH2 and SH3 domain-binding ligand motifs over 20 years ago (Table 8).199−201 In the

submembrane compartments, can be facilitated by lipid-binding domains, for instance, the PH and C1 domains that bind to phosphatidylinositol lipids and diacylglycerol (DAG) or DAG analogues, respectively.194 However, protein lipidation, which is mediated by specific modification motifs (see section 4.2.1.3), also promotes association of soluble proteins with membranes and hence can act as a targeting signal. Although covalently attached lipids such as myristoyl, isoprenyl, and palmitoyl groups are commonly regarded as a general membrane anchor for proteins, evidence suggests they can function as more specific sorting signals.195 Furthermore, several examples of motifs that directly mediate lipid−protein interactions have also been characterized, for example, the basic motif in BIN1 that can bind PIP2.42 Subcellular compartments also include functional compartments that are not physically separated by a membrane but to which specific processes are restricted. Appropriately targeted proteins can interact with location-tethered macromolecules to be retained at the correct subcellular compartment. Often these proteins are part of larger localized complexes, and binding to such complexes limits the diffusion of the protein. One such essential functional compartment is the microtubule network. The microtubule plus-end-tracking proteins (+TIPs) are important regulators of microtubules, as they control their dynamics and linkage to other cellular structures, and as such play important roles in cell division and migration.196 Targeting of different +TIPs to the growing distal ends of microtubules is mediated by their SxIP motifs, which bind to end-binding homology (EBH) domains of end-binding (EB) proteins that autonomously associate with the microtubule plus ends. As such, SxIP motifs, which can also be found in a variety of structurally and functionally unrelated proteins,197 including cytoplasmic linker-associated proteins (CLASPs) and the


Figure 7. Functionality of ligand motifs. (A) Phosphorylation of two 14-3-3-binding motifs in RAF1 (256RSTpSTP261 and 618RSApSEP623) induces complex assembly of RAF1 with a 14-3-3 dimer, which inhibits the kinase activity of RAF1. (B) The E3 ubiquitin ligase RNF4 uses four adjacent SUMO-binding motifs (36IELVET41, 46IVDLT50, 58VVDLT62, and 67VVIVDE72) to specifically recognize polysumoylated substrates, which are subsequently marked for proteasomal degradation. (C) Multiple TRAF6-binding PxE motifs in IRAK1 trimers (542PQENSY547, 585PVESDE590, and 704PEESDE709) cooperate to oligomerize and activate the E3 ubiquitin ligase TRAF6. (D) Binding of different LxCxE-containing proteins, including KDM5A (1373LFCDE1377), CCND1 (5LLCCE9), and viral oncoproteins such as the E7 protein of human papillomavirus (22LYCYE26), to RB1 allows assembly of functionally distinct complexes at E2F1/DP1-dependent gene promoters. (E) Ligand-dependent assembly of functionally distinct complexes on nuclear receptors. In the absence of a ligand, nuclear receptors bind CoRNR-box motifs (LxxΦΦxxΦΦ) in corepressors, for instance, NCOR1 (2051LADHICQII2059 and 2263LEDIIRKAL2271) and NCOR2 (2143LAQHISEVI2151 and 2350LEAIIRKAL2358). Ligand binding to the receptors allosterically switches their specificity to NR box motifs (LxxLL) in coactivators such as CBP (69QLSELLR75). (F) Modular architecture of the scaffold protein EPN1 showing its PIP2-binding ENTH domain, ubiquitin-interacting motifs (UIM), AP-2-binding motifs, clathrin-binding motif, and EPS15-binding motifs. (G) Modular architecture of the scaffold protein AKAP5 showing its recruitment sites for PKC, calcineurin, and PKA. (H) SLiM-dependent autoinhibition of the tyrosine-protein kinase SRC, which depends on intramolecular interactions between a phosphorylated SH2-binding motif (530pYQPG533) and the SH2 domain as well as an SH3-binding motif (252KPQTQGLA259) and its cognate SH3 domain.
Legend: protein names are given in italics; SLiM-binding domains are represented as blue-bordered gray boxes; other domains are represented as mauve boxes; domain names are given in these boxes; yellow boxes represent motifs; small blue circles represent phosphates; green stars signify an accessible active site of a kinase; red stars signify an inaccessible active site of a kinase.
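The residue numbering used in the caption (a start position, the peptide, and an end position) can be combined with a motif pattern to report where in the full-length protein a motif instance sits. The sketch below does this for the LxCxE and NR box (LxxLL) examples of panels D and E, translating `x` to the regex wildcard `.`; the function name `locate` and the data layout are hypothetical choices made for illustration.

```python
import re

MOTIFS = {
    "LxCxE": re.compile(r"L.C.E"),
    "NR box (LxxLL)": re.compile(r"L..LL"),
}

# (protein, residue number of the first peptide residue, peptide) from the caption.
PEPTIDES = [
    ("KDM5A", 1373, "LFCDE"),
    ("CCND1", 5, "LLCCE"),
    ("HPV E7", 22, "LYCYE"),
    ("CBP", 69, "QLSELLR"),
]

def locate(peptide, first_residue):
    """Return (motif name, start residue, end residue) for every motif hit."""
    hits = []
    for name, rx in MOTIFS.items():
        for m in rx.finditer(peptide):
            start = first_residue + m.start()
            end = first_residue + m.end() - 1
            hits.append((name, start, end))
    return hits

for protein, first_residue, peptide in PEPTIDES:
    print(protein, locate(peptide, first_residue))
```

For CBP the quoted peptide carries one flanking residue on each side of the motif, so the LxxLL match is reported at residues 70 to 74 and not at the peptide boundaries 69 to 75.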

following sections we will discuss a variety of related but distinct roles that SLiMs play in the assembly of functional complexes (Figure 7), more specifically (i) how multiple low-affinity motif-containing interfaces are used cooperatively to build stable complexes (see section 4.1.3.1); (ii) how motifs can direct the assembly of functionally distinct complexes around an invariant core (see section 4.1.3.2); (iii) how the presence of multiple motifs and domains in modular proteins confers scaffolding activity (see section 4.1.3.3); and finally (iv) how

motifs can keep a protein in an autoinhibited state until activation is required (see section 4.1.3.4).

4.1.3.1. Ligand Motifs and Cooperative Complex Assembly. As motifs only provide relatively weak interactions, SLiM-driven assembly of metastable complexes generally requires the cooperative use of multiple SLiMs, which results in high-avidity binding with a more than additive increase in affinity.130 Due to this cooperativity, formation of these complexes depends on multiple factors, which allows for tight control of the assembly process. Several examples of proteins that use multiple motifs in


dissociation of RB1 from the E2F1/DP1 complex promotes expression of CDKs, cyclins, and regulators of DNA replication, thereby driving the cell into S phase.206 A peptide-binding pocket on RB1 can recruit a diverse set of transcriptional modulators containing an LxCxE motif to the promoters of target genes regulated by E2F1/DP1 and as such allows construction of functionally distinct complexes (Figure 7D). To date, a large number of LxCxE motif-containing RB1-binding proteins have been characterized, including histone modifiers such as the demethylase KDM5A,208 the RB1 regulatory cyclin-CDK complex subunit CCND1,209 and viral oncoproteins such as the E7 protein of human papillomavirus.210 Nuclear receptors (NRs) also form functionally distinct complexes around a core set of subunits; however, they do so in a context-dependent manner depending on ligand-bound or ligand-free state (Figure 7E).211 Nuclear receptors are a large family of transcription factors that includes the hormone-activated androgen, estrogen, and thyroid receptors. They conditionally recruit corepressors or coactivators depending on the presence of a ligand.211 In the absence of their ligand, NRs bind NCOR1 and 2 (nuclear receptor corepressor 1 and 2) subunits to form a transcription repression complex. These corepressors contain motifs known as CoRNR (co-repressor nuclear receptor) boxes that bind to a hydrophobic groove on the ligand-binding domains of the NRs. The NCORs then recruit chromatin-modifying corepressor complexes to repress gene transcription at NR-response elements.212,213 Upon ligand binding, the receptors undergo an extensive conformational change that allosterically occludes part of the CoRNR box-binding site, hence favoring recruitment of proteins containing an NR box motif, a shorter two-turn hydrophobic α-helix found in many NR coactivators such as the histone acetyltransferase CBP, thereby promoting expression of hormone-responsive genes.211,214

4.1.3.3. Ligand Motifs and Multiprotein Scaffolding. Scaffolds are a class of proteins that act as multivalent platforms to allow cooperative assembly of transient multiprotein complexes. Most scaffolds interact with numerous proteins and consequently are highly modular. Many scaffolds take advantage of SLiM-mediated interactions and contain multiple SLiMs (e.g., LAT (linker for activation of T-cells family member 1), DOK1 (docking protein 1), AKAP5 (A-kinase anchor protein 5)), two or more copies of SLiM-binding domains (e.g., ZO-1 (Zona occludens protein 1), HOP (Hsp70/Hsp90 organizing protein), GRB2), or a combination of SLiM-binding domains and SLiMs (e.g., BCAR1 (breast cancer antiestrogen resistance protein 1), AKAP10 (A-kinase anchor protein 10)).215−217 Scaffolds commonly function as adaptor proteins to nucleate functionally connected proteins. For example, EPN1 (Epsin-1) is a large multimotif scaffold protein that provides a platform for the assembly of a complex that promotes clathrin-coated vesicle (CCV) formation in receptor-mediated endocytosis (Figure 7F).218 The sole globular domain in the protein, an N-terminal ENTH domain, binds to PIP2, tethering EPN1 to the plasma membrane and promoting membrane invagination.219 The remainder of EPN1 is intrinsically disordered and contains multiple motifs that allow this scaffold to link AP-2- and ubiquitin-dependent cargo recruitment to CCV formation.220 The AP-2 complex contains multiple pockets that can recognize SLiMs: the mu subunit recognizes YxxΦ motifs in cargo proteins, the appendages of both alpha and beta subunits recognize DP[FW] motifs, and the beta appendage platform

tandem, resulting in an increased binding strength, have been characterized. A canonical example is the recruitment of 14-3-3 dimers by cooperative binding of two phospho-dependent 14-3-3-binding motifs. For instance, phosphorylation of two such motifs on either side of the kinase domain in RAF1 (RAF proto-oncogene serine/threonine-protein kinase) induces binding to a 14-3-3 dimer (Figure 7A). The second motif binds with higher affinity than the N-terminal instance, but together they form a high-avidity complex between RAF1 and the 14-3-3 dimer that locks the kinase in an inhibited conformation.202 Another example involves the yeast Aurora kinase IPL1, which recognizes microtubule plus ends through two SxIP motifs (46SKIP49 and 72SKIP75) about 20 amino acids apart that bind to the EB proteins decorating the plus ends of microtubules. The first and second motifs bind with a Kd of 15 and 90 μM, respectively; however, both motifs cooperate to bind with a high-affinity Kd of 0.15 μM.203 Cooperative binding also increases the specificity of interactions, as shown for the E3 ubiquitin ligase RNF4 that targets polysumoylated proteins for degradation (Figure 7B). The specificity for poly- but not mono- or disumoylated substrates is controlled by the cooperative interaction of four adjacent SUMO (small ubiquitin-related modifier)-binding motifs (SBMs) with the SUMO chain.204 While these isolated cases show how multiple motifs in a single protein can mediate cooperative binding, very often SLiMs in different proteins cooperate to drive avidity-based complex assembly. Many pathways are activated by oligomerization of upstream components, and a common mechanism to recognize these changes is to build avidity-driven interfaces that require oligomerization of one or both sides of the interface to produce a biologically relevant interaction.
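The gain from cooperative binding quoted above for the two SxIP motifs of IPL1 can be made concrete with a short calculation. A lower Kd means tighter binding, so the tandem value of 0.15 μM is compared with the tighter of the two single-motif values. The sketch below also converts each Kd into a standard binding free energy using ΔG° = RT ln(Kd / 1 M); the temperature of 298.15 K and the 1 M standard state are assumptions made for illustration and are not taken from the cited study.

```python
import math

R = 8.314       # gas constant, J/(mol*K)
T = 298.15      # temperature in K (assumption: 25 degrees C)

def delta_g_kj(kd_molar):
    """Standard binding free energy in kJ/mol: dG = RT ln(Kd / 1 M)."""
    return R * T * math.log(kd_molar) / 1000.0

# Dissociation constants quoted in the text, converted from micromolar to molar.
kd_first = 15e-6
kd_second = 90e-6
kd_tandem = 0.15e-6

# Fold improvement of the tandem motifs over the tighter single motif.
fold_gain = min(kd_first, kd_second) / kd_tandem

print(f"fold gain over best single motif: {fold_gain:.0f}x")
for label, kd in [("first", kd_first), ("second", kd_second), ("tandem", kd_tandem)]:
    print(f"{label:>6}: Kd = {kd:.2e} M, dG = {delta_g_kj(kd):.1f} kJ/mol")
```

Under these assumptions the tandem arrangement binds roughly 100-fold more tightly than the better of the two isolated motifs, which is the kind of avidity effect the text describes.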
In these complexes, a binary interaction of the components is generally insufficient for a stable interface, but higher stoichiometry permits biologically relevant binding. For example, ligand binding and clustering of Toll-like receptors leads to recruitment and oligomerization of several proteins that serve as a platform for trimerization of IRAK1 (interleukin-1 receptor-associated kinase 1) (Figure 7C). Binding motifs for TRAF6 (TNF receptor-associated factor 6) within the IRAK1 trimers cooperate to recruit and trimerize TRAF6. Oligomerization of TRAF6 is required for its E3 ubiquitin ligase activity and leads to a series of downstream events that culminate in IKKβ (inhibitor of nuclear factor kappa-B kinase subunit beta) activation, IκB (NF-kappa-B inhibitor) degradation, and eventually relocalization of NF-κB (nuclear factor NF-kappaB) to the nucleus.205

4.1.3.2. Ligand Motifs and Assembly of Functionally Distinct Complexes. The clean paradigmatic definition of a complex as a stable invariant assembly of macromolecules understates the dynamic and versatile nature of complexes. Diverse complexes can be built around constitutive core subunits by combining this core with conditional partners, depending on the cellular context. Many SLiM-mediated interactions involved in complex assembly are used to recruit additional nonconstitutive subunits to an invariant core to produce a set of related but functionally distinct complexes. This diversification of core functionality is illustrated by RB1, an important transcriptional regulator and intensively studied tumor suppressor (Figure 7D).206,207 RB1 associates with a heterodimeric transcription factor complex containing E2F1 and DP1 (transcription factor Dp-1) to suppress the transcription of S phase-promoting genes. At the G1-S checkpoint, regulated


kinase activity and also prevents intermolecular interactions of the SH2 and SH3 domains. The kinase domain can be activated by inducing the open conformation, either by dephosphorylation of Y530 in SRC or by competitive binding of other SH2- or SH3-binding motif-containing proteins to the SH2 or SH3 domains of SRC, respectively.234 Many intramolecular inhibitory interactions affecting noncatalytic activities have also been characterized. For example, an intramolecular interaction with an SH3-binding motif in BIN1 (336RKGPPVP342) prevents binding of the BIN1 SH3 domain to the SH3-binding motif of MYC (60PLSPSR65),235 and an intramolecular interaction with a PDZ-binding motif in APBA1 (amyloid beta A4 precursor protein-binding family A member 1) (832EQPVYI−COOH) results in an autoinhibited conformation of APBA1 where binding of ligands containing a PDZ-binding motif such as PSEN1 (presenilin-1) (462FHQFYI−COOH) is abrogated.236

4.1.4. Summary of Ligand Motifs. Cell regulation is governed by large macromolecular complexes that have to be assembled and activated at the correct location, at the right time and/or under the appropriate conditions. This requires tight control of the local abundance and function of the numerous subunits, processes in which regulatory ligand motifs play an important role by regulating the modification, stability, localization, and activity of proteins. The modification state of a protein generally not only depends on the presence of a modification target site in a substrate (see section 4.2) but also involves a separate binding region that assists in recruiting the modifying enzyme. Use of such docking motifs to recruit modifying enzymes complements the (generally weak) substrate specificity of enzymes for PTM target sites and increases the local concentration of a target site in the proximity of a catalytic site (Table 5).
This in turn increases the specificity and efficiency of enzyme-catalyzed protein modification and decreases off-target PTM events. Docking motifs can indirectly modulate a protein’s stability, activity, and/or subcellular localization by controlling its modification state and thereby regulating recruitment of regulatory partners. Furthermore, the degradation motif subclass of docking motifs directly determines the stability of proteins by targeting them to specific substrate-recognition subunits of ubiquitin ligase complexes, which by polyubiquitylation mark their substrates for proteasomal degradation (Table 6). The activity of docking and degron motifs can be regulated by different mechanisms, including PTM, competitive inhibition of binding, or the requirement of complex preassembly to generate a functional motif-binding site, all of which allow for tight control of the modification state and local abundance of a protein in a context-dependent manner. Importantly, docking interactions not only target catalytic domains to specific substrates but also allow recruitment to be regulated conditionally and enzyme activity to be modulated through allosteric mechanisms that depend on binding of the correct substrate.137 Targeting motifs also contribute to the control of local abundance of proteins by acting as trafficking or anchoring signals that facilitate appropriate protein localization (Figure 6, Table 7). Trafficking motifs mediate recruitment of proteins to intracellular transport pathways that target them to their correct location. By acting as sorting signals they direct synthesized proteins to the correct subcellular compartments or translocate proteins between different compartments where they can perform distinct tasks. While trafficking motifs control protein transport, anchoring motifs contribute to the correct subcellular

subdomain can recognize Fxx[FL]xxxR motifs. The central section of EPN1 is dedicated to AP-2 binding and consists of eight DP[FW] motifs and a single Fxx[FL]xxxR motif.221 The recruitment and endocytosis of ubiquitylated cargo proteins is mediated by three ubiquitin-interacting motifs (UIMs) in EPN1.220 In addition, three NPF motifs in the C-terminal region of EPN1 bind to the EH domains of EPS15 (epidermal growth factor receptor substrate 15), an additional endocytic adaptor protein.222 Finally, a clathrin-box motif recruits clathrin to EPN1 to link cargo recognition to CCV formation.223,224 The repetition of the DP[FW] motifs has led to the hypothesis that multiple sites act cooperatively to allow endocytic adaptors to probe cargo concentration.225 Adaptor proteins like EPN1 often function conditionally, and in this model, an avidity-driven stable interaction between a single adaptor and multiple cargo-binding AP-2 complexes is required to recruit the endocytic machinery. Another function of scaffolds is to recruit components from distinct pathways to enable integrative, localized signaling in response to multiple signaling cues.216 Use of scaffold proteins as platforms for signaling complex assembly increases the specificity and efficiency of signal transmission and provides a mechanism to modulate the dynamics of signaling.
The AKAPs, a family of approximately 50 scaffolding proteins, regulate the spatiotemporal activation of PKA signaling.226 Recruitment of PKA occurs via a docking motif, found in most AKAPs, that forms an amphipathic helix and binds to a hydrophobic groove on the surface of the dimerization/docking domain of the regulatory subunits of PKA.140,227 The AKAPs scaffold PKA with other regulators of PKA signaling to allow fine-tuned and contextual control of PKA-dependent responses.228 For example, AKAP5 contains motifs for recruitment of PKA, PKC (protein kinase C), and the Ca2+-dependent phosphatase Calcineurin (Figure 7G).229,230 Furthermore, the localization of the scaffold often depends on motif-mediated recruitment to particular proteins, for instance, a PDZ domain-binding motif in AKAP10 (659STKL−COOH) binds to PDZK1 (Na+/H+ exchange regulatory cofactor NHE-RF3), localizing PKA to SLC34A1 (sodium-dependent phosphate transport protein 2A)-containing complexes.231 This ability of PKA docking motifs to contribute to the control of PKA localization underlines the fuzzy boundary between the complex assembly, docking (see section 4.1.1.1), and targeting motif definition (see section 4.1.2.2). Restricting the localization of PKA promotes phosphorylation of the correct subset of substrates upon activation by colocalizing the enzyme, which has a very weak intrinsic specificity, with the relevant targets. 4.1.3.4. Ligand Motifs and Intramolecular Inhibitory Interactions. The activity of proteins is often restrained through autoinhibitory intramolecular interactions that can be switched off in a context-dependent manner when protein activation is required. Ligand motifs mediate many intramolecular binding events required for distinct functional conformations of a protein. 
For instance, motif-driven autoinhibitory conformations are known to regulate the activity of several classes of kinases and phosphatases.232,233 The canonical example is the phospho-dependent inhibited conformation of SRC kinase (Figure 7H). Phosphorylation of Y530 in the SH2-binding motif of SRC induces an intramolecular interaction with the SH2 domain and promotes an additional intramolecular interaction between the SH3 domain and a KPxx[QK] motif in the linker region between the SH2 and kinase domains.234 This conformation results in inhibition of


Table 9. Representative Examples of Modification Motifs

modification type | enzyme | motif pattern (a,b) | description
lipidation | N-myristoyltransferase | NH2−M(G)xxx[STAGCN] | N-terminal myristoyl lipid attachment site
lipidation | palmitoyltransferase | NH2−MG(C)xxS or G(C)MxxxC | palmitoyl lipid attachment site
lipidation | prenyltransferase | (C)xΦx−COOH | C-terminal prenyl lipid attachment site
sumoylation | SUMO-conjugating enzyme UBC9 | Φ(K)xE | sumoylation site modified by UBC9
phosphorylation | cyclin-dependent kinase (CDK) | ([ST])Px[KR] | phosphorylation site modified by CDK
phosphorylation | protein kinase B (PKB) | RxRxx([ST]) | phosphorylation site modified by PKB
phosphorylation | large tumor suppressor homologue (LATS) kinase | Hx[KR]xx([ST]) | phosphorylation site modified by LATS
phosphorylation | phosphatidylinositol 3-kinase-related kinase (PIKK) | ([ST])Q | phosphorylation site modified by PIKK family members
glycosylation | oligosaccharyltransferase | (N)x[ST] or (N)xC | carbohydrate attachment site
glycosylation | glycosyltransferase | Cx(S)xPC or Cx{3,5}([ST])C | carbohydrate attachment site
cleavage | furin | Rx[KR]R | cleavage site recognized and cleaved by furin peptidase
cleavage | caspase-3/7 | [DSTE]xxD[GSAN] | cleavage site recognized and cleaved by caspase peptidases
cleavage | separin (ESPL1) | E[IMPVL][MLVP]R | cleavage site recognized and cleaved by separin peptidase
peptidyl-prolyl cis−trans isomerization | protein interacting with NIMA 1 (PIN1) | P[ST]P | phospho-dependent isomerization site

(a) The syntax used for notation of motif regular expressions is described in the legend of Figure 1. (b) An extended list of modification motifs is available in the ELM resource.15
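The motif patterns of Table 9 translate almost directly into ordinary regular expressions. The following Python sketch is illustrative only and is not taken from the ELM resource. It assumes that `x` denotes any residue, that parentheses mark the modified residue, that `NH2−` and `−COOH` anchor a pattern to the protein termini, and that Φ stands for a hydrophobic residue, approximated here by the class `[AVILMFWC]`. The names `to_regex`, `find_motifs`, and `MOTIFS` are hypothetical.

```python
import re

# Assumed stand-in for the hydrophobic class written as Φ in Table 9.
HYDROPHOBIC = "[AVILMFWC]"

def to_regex(pattern):
    """Translate a Table 9 style motif pattern into a compiled regex."""
    p = pattern.replace("\u2212", "-")        # normalise the typographic minus sign
    n_term = p.startswith("NH2-")             # pattern anchored at the N-terminus
    c_term = p.endswith("-COOH")              # pattern anchored at the C-terminus
    if n_term:
        p = p[len("NH2-"):]
    if c_term:
        p = p[:-len("-COOH")]
    p = p.replace("(", "").replace(")", "")   # drop the modified-residue markers
    p = p.replace("x", ".")                   # x = any residue
    p = p.replace("Φ", HYDROPHOBIC)
    return re.compile(("^" if n_term else "") + p + ("$" if c_term else ""))

# A few patterns copied from Table 9.
MOTIFS = {
    "CDK phosphorylation": "([ST])Px[KR]",
    "PKB phosphorylation": "RxRxx([ST])",
    "furin cleavage": "Rx[KR]R",
    "sumoylation": "Φ(K)xE",
    "N-myristoylation": "NH2−M(G)xxx[STAGCN]",
}

def find_motifs(sequence):
    """Return the names of all motif classes with at least one match."""
    return [name for name, pat in MOTIFS.items()
            if to_regex(pat).search(sequence)]

# Peptides quoted in section 4.2, with the parentheses removed.
for peptide in ["QTPKK", "RPRSCTWPL", "RKKRT", "IKME", "MGNIFAN"]:
    print(peptide, find_motifs(peptide))
```

A match only shows that a sequence is compatible with a pattern. As the surrounding text stresses, many of these patterns have weak intrinsic specificity, so a hit does not by itself indicate a functional site.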

SLiM and a cognate binding domain in the same protein allows an intramolecular interaction to lock a protein in an autoinhibited conformation. Such SLiM-dependent autoinhibition seems to be a common inhibitory mechanism that provides additional means to ensure that functional complexes are only assembled and activated at the correct location at the right time and/or under the appropriate conditions.

localization of proteins by tethering them to compartment-specific molecular complexes and limiting their diffusion. As such, these anchoring motifs can have additional roles that overlap the functions of other ligand motif classes, for instance, bringing an enzyme and its substrate into close proximity. Multiple regulatory mechanisms exist that modulate binding of trafficking and anchoring motifs to allow controlled relocalization of proteins at specific times or in response to specific signals. The majority of ligand motifs characterized to date do not have a regulatory function but play an important role in complex assembly by providing binding sites that can be easily regulated to mediate formation of functionally distinct and metastable macromolecular assemblies in a context-dependent manner (Figure 7, Table 8). These complex assembly ligand motifs generally bind their target with low affinity, and hence, they are often used cooperatively to provide stability to the assembled complex. Cooperative use of multiple low-affinity interactions for building metastable yet dynamic signaling complexes facilitates combinatorial, contextual, and robust signaling, properties that are essential for a system to be able to conditionally and reliably integrate and propagate signals. The evolutionary plasticity of motifs also provides the means to confer functional diversity on complexes. The ability of SLiMs to easily arise de novo has allowed multiple proteins to acquire the ability to bind to the same surface. This has facilitated the diversification of many complexes that contain nonconstitutive parts, recruited via SLiMs, allowing the same stable core subunits to have a diverse, often opposing, range of functions depending on the additional members recruited to the complex.
These beneficial properties of motifs for regulated complex assembly are exploited by scaffold proteins.216 These highly modular proteins contain multiple SLiMs, SLiM-binding domains or both, which allow them to function as nonenzymatic aggregators that concentrate functionally connected proteins. Scaffolds play an important role in cell signaling as they can provide spatial, temporal, and contextual regulation of enzymes, nucleate functionally connected proteins, or act as adaptors to link pathways. In addition, the co-occurrence of a

4.2. Post-Translational Modification Motifs

The second group of SLiMs consists of modification motifs. These motifs overlap sites targeted by enzymes for post-translational modification and mediate specific binding to the active site of the modifying enzyme to allow subsequent catalytic modification of the target site (Table 9).237 Modification motifs can be divided into three classes. The first class contains sites recognized by enzymes that catalyze addition or removal of a PTM moiety, such as a phosphate, ubiquitin, or lipid group (see section 4.2.1). The second class comprises sites that are recognized by proteases that mediate irreversible proteolytic cleavage at the motif site (see section 4.2.2). The third class consists of structural modification motifs targeted by enzymes that catalyze conversion of cis and trans isomers of proline-containing peptide bonds (see section 4.2.3). These PTMs are often, but not always, reversible and regularly only occur under specific conditions, which makes them ideal mediators for signal transmission and molecular decision making.129,130,238 The different classes of modification motifs are discussed and illustrated with some representative examples in the following sections. In the final section the main aspects of the functionality of modification motifs and their regulatory potential are summarized (see section 4.2.4). 4.2.1. Post-Translational Modification Motifs and Moiety Addition or Removal. Although several hundred types of PTM are known to date, relatively few of these have been extensively studied.239 The most studied and well-known types of PTM involve covalent attachment of a functional group, the PTM moiety, to a specific amino acid residue, a process that is often catalyzed by a particular type of modifying enzyme. Many of these specific residues that are targeted for


membrane association.195 This PTM is frequently found on integral and peripheral membrane proteins, for instance, the GTPase HRas (180G(C)MSCKC186).252 The importance of protein lipidation in signaling can be illustrated by the role of myristoylation and palmitoylation of the tyrosine-protein kinase FYN. These two modifications cooperate to correctly target FYN to plasma membrane lipid rafts, where this kinase colocalizes with T-cell receptor (TCR) complexes and is required for TCR signaling (see section 6.1).253 4.2.1.4. Protein Ligation. Proteins can also be modified by the reversible conjugation of other polypeptides or proteins to specific amino acids. Ubiquitylation is the archetypal example of this class of PTM. However, the site of ubiquitin attachment is generally considered to lack strong specificity determinants, and the specificity of ubiquitin conjugation is mainly driven by docking interactions.131,254 Covalent attachment of SUMO, a small protein of about 100 amino acids, to lysine residues by the SUMO-conjugating enzyme UBC9 is directed by a SUMO modification motif (Table 9).255 Functions of sumoylation include targeting proteins to subnuclear structures, as seen for PML (promyelocytic leukemia protein) (489I(K)ME492),256 and transcriptional regulation, as in the case of p53 (385F(K)TE388).257 Interestingly, the attached ubiquitin and SUMO proteins themselves can recognize ubiquitin-interacting motifs258 (UIMs) and SUMO-binding motifs204 (SBMs), respectively, allowing regulated recruitment of specific SLiM-containing binding partners to modified proteins. 4.2.1.5. Moiety Removal.
The reversibility of PTMs involves the antagonistic action of enzymes that catalyze removal of the PTM moiety, for example, phosphatases and deubiquitylases.154,259 The specificity of PTM moiety removal enzymes is generally believed to depend primarily on docking and scaffolding interactions rather than sequence specificity (see sections 4.1.1.1 and 4.1.3.3).155,260 For instance, phosphatases display strong specificity for targets, yet on a sequence level the modification sites of these targets show little or no similarity.156 However, some phosphatases have been shown to have intrinsic specificity, although these seem to be in the minority. For instance, the yeast dual-specificity tyrosine-protein phosphatase CDC14 promotes mitotic exit by specifically targeting a subset of PSP phosphorylation sites, predominantly cell cycle-related cyclin-CDK-modified sites, that contain a modified serine residue while having weaker activity toward phosphothreonine-containing sites.261 4.2.2. Post-Translational Modification Motifs and Proteolytic Cleavage. Another type of PTM that plays an important role in cell regulation is protein cleavage, proteolytic processing of proteins into smaller polypeptides by protease-catalyzed hydrolysis of specific peptide bonds (Table 9). Similar to addition/removal of a PTM moiety, cleavage motifs act as recognition sites for enzymes; however, in this case these enzymes irreversibly cleave the protein at a specific site within the motif. Well-known functions include cleavage of N-terminal methionine,262 production of active enzymes or peptide hormones from inactive precursors,263 exposure or removal of targeting signal sequences that determine subcellular localization,264 and control of important physiological processes such as the cell cycle and apoptosis.265,266 The subtilisin-like proprotein convertases (SPCs/PCs) proteolytically process latent precursor proteins to produce biologically active products such as neuropeptide hormones.
Members of this family cleave their substrates at motifs with paired basic residues.263 This weak specificity is augmented by a

PTM are embedded within SLiMs that provide specificity for the modifying enzymes (Table 9). Some modifying enzymes have strong intrinsic substrate specificity. However, in many cases, specificity for a particular modification motif is rather weak, in which case reliable target recognition relies on additional mechanisms such as docking interactions (see section 4.1.1.1).137 In the following sections, some of the most common types of PTM for which modification motifs have been characterized are discussed. 4.2.1.1. Phosphorylation. The most familiar types of PTM involve covalent binding of smaller chemical groups to specific amino acids, resulting in altered physicochemical characteristics of protein regions.239 Many such PTMs have been described, including acetylation, methylation, and hydroxylation; however, the best-characterized PTM is the reversible phosphorylation of serine, threonine, or tyrosine residues, and specific modification motifs have been defined for numerous kinases (Table 9). These include CDK phosphorylation sites, found, for instance, in p27Kip1 (186Q(T)PKK190),240 motifs targeted for modification by PKB (protein kinase B), for example, in the human transcription factor FOXO4 (Forkhead box protein O4) (27RPRSC(T)WPL35 and 192RRRAA(S)MDS200),241 and target sites in LATS (large tumor suppressor homologue) kinase substrates such as the protein yorkie (163HSRAR(S)S169), a transcriptional coactivator involved in the Hippo signaling pathway in Drosophila.242 The attached negatively charged phosphate moiety alters the physicochemical nature of a protein region, which can affect protein function in different ways, for instance, by inducing or blocking an interaction (see section 3.4.2).129,130,243 4.2.1.2. Glycosylation. One of the most ubiquitous PTM is protein glycosylation, attachment of a diverse set of sugars, either as monosaccharides or as polysaccharides, to specific residues in a protein. 
Sugars are generally linked to asparagine residues via N-glycosylation or to serine or threonine residues via O-glycosylation.244 Many different functions have been attributed to glycosylation, with effects on protein folding and conformation, localization, stability, and activity.244 Several modification motifs that are targeted for glycosylation have been identified (Table 9). These motifs are commonly found in secreted and cell surface-bound proteins, including the secreted VWF protein (von Willebrand factor) (1147(N)SC1149),245 and the transmembrane EGFR (56(N)NC58),246 which are N-glycosylated, and coagulation factors such as FA7 (110CA(S)SPC115),247 which is O-glycosylated. O-Glycosylation has also been characterized for intracellular components such as transcription factors, for example, FOXO1 (Forkhead box protein O1), whose transcriptional activity is regulated in a glucose-dependent manner.248,249 However, a glycosylation motif has not yet been identified in this protein. 4.2.1.3. Lipidation. Several motifs have been characterized that function as target sites for lipidation, which is frequently used to localize proteins to membranes (Table 9). Multiple lipid PTMs are often combined to strengthen membrane association of a protein by increasing the avidity of the interaction.250 Examples of lipidation include myristoylation, attachment of myristate to an N-terminal glycine residue, and palmitoylation, linkage of a long-chain fatty acid group to a cysteine residue. Myristoylation is irreversible during the working lifetime of a protein and provides a weak membrane anchor for soluble proteins such as ARF1 (ADP-ribosylation factor 1) (NH2−M(G)NIFAN7).251 Conversely, reversible palmitoylation is a more dynamic process and promotes stable


which modulates the conversion between the autoinhibitory closed conformation and the active open conformation.275 The phospho-specific PPIase PIN1 (peptidyl-prolyl cis−trans isomerase NIMA-interacting 1) plays an important role in a wide range of biological processes, including the cell cycle, growth factor-induced signaling, and stress responses.274 It contains a WW domain that recognizes phosphorylated serine/threonine-proline motifs and a rotamase domain that catalyzes the cis−trans isomerization of this peptide bond.274 Hence, substrate recognition depends on the activity of proline-directed kinases such as CDK and MAPK family members. The consequence of PIN1-mediated isomerization varies in different substrates and for different target motifs. For instance, PIN1-induced conformational changes can inhibit or promote dephosphorylation of phosphorylated [ST]P motifs by proline-directed phosphatases, thereby modulating the phosphorylation/dephosphorylation cycles of proteins and hence their activity and function. Such isomer-specific catalysis is shown by the serine/threonine-protein phosphatase PP2A, which only effectively dephosphorylates a P[ST]P motif in CDC25C when the peptide bond is in the trans conformation (45VPRPTPV50).276 This interplay between structural modification and isomer-specific PTM adds an additional layer of regulation to PTM-dependent control of protein function. 4.2.4. Summary of Post-Translational Modification Motifs. Post-translational modification of proteins is a common mechanism to integrate and propagate signals in cells. Many enzymes that modify proteins are guided by SLiMs at the modification site, which act as a recognition signature for the active site of the modifying enzyme (Table 9).
However, these motifs often have weak intrinsic specificity; therefore, other factors including the use of docking sites (see section 4.1.1), subcellular localization of enzyme and substrate (see section 4.1.2), or use of scaffold proteins (see section 4.1.3.3) play an important role in guiding protein modification. Traditionally, modification motifs were regarded as sites for attachment of canonical PTM moieties such as a phosphate. However, protein modification also encompasses addition and removal of a wide variety of PTM moieties, proteolytic cleavage, and structural modification of the peptide backbone.237 Post-translational modification of the target site changes the physical or chemical nature of the peptide. While some PTMs are irreversible, and thus stable and static, others are reversible and transient. These conditional PTMs are ideal mediators of cell signaling, since they can be enzymatically attached and removed in a regulated and context-dependent manner. This allows rapid and dynamic control of protein function by rewiring interaction networks, changing subcellular localization, modulating stability, or altering enzymatic activity, which often results from a PTM-dependent change of the activity of overlapping or adjacent ligand motifs (see section 3.4.2).19,129,130,238 In addition, multiple PTMs can cooperate to produce complex integrative interfaces. Common mechanisms include competition between different types of PTM for the same residue shared by different modification motifs or positive cooperativity between multiple PTMs of the same or a different type.130,136,279 Such higher order use of modification motifs is exemplified by priming-dependent modification. Priming involves cooperativity between two overlapping modification motifs, with modification of the first modification motif resulting in activation of the second. Several enzymes require priming of their target motifs; hence, they are only active

unique spatiotemporal expression pattern for the individual proteases.267 A well-studied SPC is Furin, which predominantly localizes to the trans-Golgi network but can shuttle to the cell surface. Furin processes a variety of cellular substrates such as metalloproteases, for instance, ADAM10 (disintegrin and metalloproteinase domain-containing protein 10) (210RKKRT214),268 and secreted components such as growth factors, including TGFB1 (transforming growth factor beta-1) (275RHRRA279),269 but is also involved in activating viral envelope glycoproteins such as human immunodeficiency virus (HIV) gp160 (508REKRA512).270 Caspases are proteases that drive apoptosis and are activated by apoptotic stimuli such as DNA damage. Caspase-3 and -7 recognize substrates with an acidic motif, cleaving after an essential aspartate residue that is usually followed by a small amino acid.271 These effector caspases cleave a specific set of target proteins to irreversibly commit to cell death. By convergently evolving a motif recognized by effector caspases, proteins come under the regulation of the apoptotic pathway, and to date, hundreds of proteins have been revealed as targets of caspase-3 and -7, although only a few are considered to be key substrates of the apoptotic pathway.271 Cleavage of these substrates, for instance, DFFA (DNA fragmentation factor subunit alpha) (114DETD117 and 221DAVD224), promotes cell death in an organized manner and drives many of the morphological changes characteristic of cells undergoing apoptosis such as DNA fragmentation.272 Additional proteins are targeted to promote the apoptotic process, for example, cleavage of BAD (Bcl2 antagonist of cell death) (11EQED14) and BCL2 (apoptosis regulator Bcl-2) (31DAGD34) induce proapoptotic or inhibit antiapoptotic activity, respectively.271 4.2.3. Post-Translational Modification Motifs and Structural Modification. 
A third motif-mediated protein modification type involves enzyme-catalyzed conformational alterations of the peptide backbone. While peptide bonds between most amino acids predominantly occur in the trans conformation, a relevant fraction of peptide bonds involving a proline residue can also adopt the cis conformation.273 Since the backbone of the cis and trans isomers follows a different path, these two states will have a different structure, at least locally, and thus can have a different function, allowing proline to act as a molecular switch by toggling between distinct functional states of a protein.274 Although interconversion between the cis and the trans isomers is slow due to the large energy barrier between the two states, peptidyl-prolyl cis−trans isomerases (PPIases) can catalyze isomerization of specific prolyl peptide bonds (Table 9).273 Emerging evidence supports an important role of PPIase-mediated cis−trans isomerization in the dynamic control of a variety of biological processes.273−276 A ubiquitous family of PPIases is the Cyclophilins, which are involved in several signaling pathways, including apoptosis, but relatively few of their substrates in this context have been identified so far.277 Studies with peptide libraries to identify motif sequences specific for Cyclophilin A indicate that this PPIase preferentially isomerizes peptide bonds between glycine and proline, with other positions surrounding these two residues also contributing to specificity and affinity.278 One example of a Cyclophilin-specific modification motif-containing protein is the adapter protein CRK (237GP238 in chicken CRK), which is involved in growth factor-induced signaling and cell motility. The activity of CRK is switched by Cyclophilin A-catalyzed isomerization of this glycine-proline peptide bond,


Figure 8. Cooperative use of SLiMs in CDC25 family members. (A) The modular architecture of the three human CDC25 family members (paralogs CDC25A, CDC25B, and CDC25C) with modifying enzymes. CDC25 family members contain several SLiMs: KEN-box degrons (140NKENE144 in CDC25A, 191DKEND195 in CDC25B, 150NKEND154 in CDC25C); βTrCP degrons (76PSSEPSTDPSG83 and 215DDGFVD220 in CDC25A and 268DDGFVD273 in CDC25B); NLS motifs (274KRPERSQEESPPGSTKRRK292 in CDC25A, 349KRRR352 in CDC25B, and 240KVKKK244 in CDC25C); NES motifs (39LSPVTNLTVTMD50 in CDC25A, 52VTTLTQTMHDLAGL65 in CDC25B, and 189EISDELMEFSLKDQE203 in CDC25C); cyclin-binding motifs (11RRLLF15 in CDC25A); and 14-3-3-binding motifs (175RQNPSAP180 and 504KSRPTW508 in CDC25A, 320RSPPSMP325 in CDC25B, 213RSPPSMP218 in CDC25C). (B) Motifs of CDC25 proteins and how these motifs are regulated to dictate the functions of these cell cycle regulators. Distinct degrons mediate degradation of CDC25 by the APC/C and, in a phospho-dependent manner, by the SCFβTrCP ubiquitin ligases. Nucleo-cytoplasmic shuttling of CDC25 proteins is directed by NLS and NES motifs, which can be inhibited by different kinases through phosphorylation of residues in or adjacent to these motifs. Finally, phospho-dependent complex formation with 14-3-3 proteins allows modulation of CDC25 phosphatase activity. Legend: protein names are given in italics; mauve boxes represent globular domains; yellow boxes represent motifs; small blue circles represent phosphates; gray ovals represent motif binding proteins; blue boxes describe the interaction outcome; “+P” signifies addition of a phosphate; “+Ub” signifies addition of a ubiquitin.
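Motif instances in this caption and throughout the text are written as a start position, the residues, and an end position, with a modified residue optionally in parentheses (for example 140NKENE144 or 186Q(T)PKK190). The Python sketch below shows one way such labels could be parsed and sanity-checked. The helper name `parse_instance`, the returned dictionary keys, and the regular expression are assumptions made for illustration, and labels anchored with `NH2−` or `−COOH` are deliberately out of scope.

```python
import re

# Assumed label shape: start position, residues (modified residue optionally
# in parentheses), end position, e.g. "186Q(T)PKK190".
INSTANCE = re.compile(r"^(\d+)([A-Z()]+)(\d+)$")

def parse_instance(label):
    """Split a positional motif label into its parts (hypothetical helper)."""
    m = INSTANCE.match(label)
    if m is None:
        raise ValueError(f"not a positional motif label: {label!r}")
    start, body, end = int(m.group(1)), m.group(2), int(m.group(3))
    residues = body.replace("(", "").replace(")", "")
    # The number of residues before the first "(" gives the offset of the
    # modified residue from the start position.
    modified = start + body.index("(") if "(" in body else None
    return {
        "start": start,
        "end": end,
        "residues": residues,
        "modified_position": modified,
        # True when the numbered span and the residue count agree.
        "span_matches_length": end - start + 1 == len(residues),
    }

print(parse_instance("140NKENE144"))
print(parse_instance("186Q(T)PKK190"))
```

A `span_matches_length` value of `False` does not say which part of a label is wrong. It only flags a label to check against the sequence database, since the disagreement may come from text extraction or from a different numbering convention.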

5. STRATEGIES OF COOPERATIVE SLIM USE

The compactness of SLiMs enables the encoding of multiple functional sites in a relatively short polypeptide sequence. Many proteins contain multiple motifs whose complex interplay can provide distinct functionality and modes of regulation to a protein.10,11 As discussed earlier, hundreds of motif classes, each with a different functional role, have been characterized to date. During its evolution, a protein can acquire de novo a distinct subset of SLiMs to recruit novel binding partners and develop a unique regulatory path from translation to degradation (see section 3.2). In this section, we will introduce two different ways that allow motif use to shape the function of a protein. First, we will review a selection of the regulatory motifs characterized in members of the CDC25 family of phosphatases to illustrate how motifs can regulate the course of a protein from the ribosome to the proteasome by dictating its localization, modification state, activity, and stability (see section 5.1). Second, we will discuss how the motif complement of the cell cycle regulatory protein p21Cip1 (cyclin-dependent kinase inhibitor 1) permits this compact protein to perform two distinct tasks, inhibiting cyclin-CDK holoenzymes

toward their substrate after an initial modification event catalyzed by a different enzyme. Known examples include GSK3B (([ST])xxxP[ST]) and CK1 (casein kinase 1) (PSxx([ST])), which respectively phosphorylate the transcriptional regulator HSF1 (heat shock factor protein 1) (300EPP(S)PPQPS307) after a downstream priming phosphorylation by a MAPK280,281 and p53 (15PSQE(T)FSD21) after an upstream phosphorylation induced by DNA damage.282 Priming-dependent modification is not restricted to kinases but also regulates other types of PTM. For instance, a priming phosphorylation by GSK3B promotes sumoylation of an upstream lysine residue in HSF1 (297V(K)EEPPPS303).283 Examples of cross-regulation between the different classes of modification motifs include the phosphorylation-dependent isomerization of peptidyl-prolyl bonds by PIN1 and phosphorylation-dependent inhibition of caspase-mediated protein cleavage.274,284 Such cross-talk between multiple PTMs enables the true potential of PTMs for combinatorial and robust signaling and underlies their importance for conditional and dynamic signal integration.


of the 130PTP131 peptide are populated. However, PIN1, a peptidyl-prolyl cis−trans isomerase that isomerizes peptide bonds within a structural modification motif, can recognize this phosphorylated peptide in the PLK1 docking motif and catalyze its conversion to the trans isomer. The trans isomer is a better substrate for the PP2A phosphatase, and consequently, PP2A efficiently dephosphorylates the PLK1 docking motif.276 In summary, the peptide centered on T130 is recognized for phosphorylation by a proline-directed kinase, docking by the Polo-box domain of PLK1, isomerization by PIN1, and dephosphorylation by PP2A, conferring highly regulated recruitment of PLK1. As phosphorylation of CDC25C by PLK1 results in activation of the phosphatase, which subsequently activates cyclin-CDK complexes, such tight control of CDC25C modification is important for the temporal regulation of CDK activity and preventing premature entry into mitosis.276 5.1.2. SLiMs and CDC25 Activity. Motifs that modulate the localization and stability of the CDC25 proteins can indirectly affect their activity. However, certain motifs function to directly alter the activity of these phosphatases by promoting the assembly of specific complexes (Figure 8B) (see section 4.1.3). All three human CDC25 paralogs have been shown to contain 14-3-3-binding motifs that recruit 14-3-3 proteins in a phospho-dependent manner to regulate CDC25 function in a variety of ways.285,286,288 For instance, phosphorylation of the 14-3-3-binding motifs of CDC25A by CHK1 induces binding to 14-3-3 proteins. 
This blocks binding of CDC25A to the cyclin-B1-CDK1 holoenzyme, hence inhibiting the recruitment and activation of this cyclin-CDK complex by CDC25A.286 Similarly, phosphorylation of S323 within a 14-3-3-binding motif of CDC25B recruits 14-3-3 and results in obstruction of the phosphatase active site, thereby inhibiting CDK dephosphorylation and activation by CDC25B.289 The phosphatase PP1 can dephosphorylate S323 to deactivate the 14-3-3-binding motif, allowing CDC25B to activate cyclin B-CDK1 and facilitate mitotic entry. Alternatively, phosphorylation of S321 within the 14-3-3-binding motif can also deactivate this motif to abolish binding of 14-3-3 and allow activation of CDC25B.288 The CDC25 family also utilizes motifs to positively modulate CDK dephosphorylation. For example, although cyclin-binding docking motifs are generally used to recruit substrates to cyclin-CDK complexes (see section 4.1.1.1), a classical cyclin-binding motif in CDC25A interacts with cyclin subunits to stabilize the interaction of CDC25A with cyclin E- or cyclin A-CDK2 complexes. This docking interaction promotes specific dephosphorylation of cyclin-associated CDK2 by CDC25A, which is required for activation of the kinase complex and timely progression through the cell cycle.290

5.1.3. SLiMs and CDC25 Stability.
The abundance of the CDC25 paralogs is tightly regulated by transcriptional, translational, and post-translational mechanisms.82 Post-translationally, the local abundance is controlled by ubiquitylation-dependent degradation, promoted by the APC/C and SCF ubiquitin ligase complexes (Figure 8B) (see section 4.1.1.2).291−293 The amount of CDC25A and CDC25B has been shown to vary in a cell cycle-dependent manner.82 Consistent with this, both CDC25A and CDC25B are substrates of the APC/C complex and their timely degradation during mitotic exit and in early G1 phase is dependent on KEN-box degrons.291,292 Interestingly, an alternatively spliced isoform of CDC25B (CDC25B2) lacks the KEN-box degron and is not degraded during mitosis.292 The level of CDC25C

and blocking DNA replication (see section 5.2). These proteins were selected for further discussion because several aspects of their function and regulation are fairly well characterized. However, it should be emphasized that many of the IDR-containing proteins studied in depth so far, including MYC, SMAD (mothers against decapentaplegic homologue) proteins, p53 family members, Epsins, EGFR, DAG1 (Dystroglycan), PXN (Paxillin), SYNJ1 (Synaptojanin-1), and NRIP1 (nuclear receptor-interacting protein 1), use a variety of SLiMs to perform their functions at the right time, in the correct place, and under the appropriate conditions. Many of these motifs have been curated, together with the appropriate references, in the ELM resource.15

5.1. From Translation to Degradation: The Life Cycle of CDC25

Phosphatases of the CDC25 family are highly regulated components of the cell cycle control machinery of the eukaryotic cell.82 These proteins activate CDKs by antagonizing the WEE1 (Wee1-like protein kinase) and PKMYT1 (membrane-associated tyrosine- and threonine-specific cdc2-inhibitory kinase) dependent inhibitory phosphorylation of T14 and Y15 within the CDK activation loop. The human proteome has three CDC25 phosphatases: CDC25A, CDC25B, and CDC25C.82 While CDC25A mainly functions at the G1/S transition, CDC25B and CDC25C predominantly act at the G2/M transition. Although they share a highly conserved C-terminal catalytic domain, each family member has a unique N-terminal regulatory region containing some shared and some distinct motifs that differentially control the modification state, activity, stability, and localization of the three different paralogs (Figure 8A). These motifs cooperate to conditionally activate, degrade, and sequester CDC25 phosphatases depending on the cell state and as such guide these proteins from translation to degradation.

5.1.1. SLiMs and CDC25 Modification. Modification of the CDC25 paralogs depends on distinct sets of docking (see section 4.1.1.1) and modification motifs (see section 4.2). A wide range of enzymes modify the CDC25 proteins at numerous sites; for example, CDC25A is specifically targeted for phosphorylation on multiple modification motifs by the kinases CHK1 and 2 (checkpoint kinase-1 and 2), CK1, CDK1, and PLK3 (polo-like kinase 3).82 These sites often occur adjacent to ligand motifs, allowing the interactions of these motifs to be conditionally modulated depending on specific stimuli. Each modification site has evolved to recruit a particular enzyme or set of enzymes to allow CDC25 proteins to respond to specific cell state information.
For instance, after DNA damage-dependent activation of the cell cycle checkpoint effectors CHK1 and CHK2, these kinases phosphorylate several sites ([KR]xx(S)) on CDC25 family members to inhibit their activity and block cell cycle progression.285,286 Conversely, the cell cycle kinases PLK and CDK phosphorylate the CDC25 paralogs at multiple canonical sites, [DE]x([ST]) and ([ST])Px[KR], respectively, resulting in their activation, which promotes progression through the cell cycle.82 The T130 residue of CDC25C provides an exceptional illustration of the tight control of the modification state of these phosphatases by recruiting multiple distinct enzymes in a regulated manner. Phosphorylation of T130 activates a docking motif for PLK1 that promotes subsequent phosphorylation of CDC25C by PLK1 on multiple sites.287 As this residue is followed by a proline residue, both the cis and the trans isomers


Figure 9. Cooperative use of SLiMs in cyclin-dependent kinase inhibitor family members. (A) The modular architecture of two human CDI family members (paralogs p21Cip1 and p27Kip1). p21Cip1 contains an N-terminal IDD known as the kinase-inhibitory domain (KID) (including a cyclin-binding motif (19RRLF22) in the D1 subdomain) and several SLiMs: a putative Skp2-Cks1 degron (128EGSP131); an NLS (141KRRQTSMTDFYHSKRR156); a PIP-box (144QTSMTDFY151); and a PIP degron (144QTSMTDFYHSKR155). p27Kip1 also contains an N-terminal KID (including a cyclin-binding motif (30RNLF33) in the D1 subdomain), a Skp2-Cks1 degron (184VEQTPK189), and an NLS (152RKRPATDDSSTQNKR166). (B) Motifs of p21Cip1 and how these motifs are regulated to dictate the different functions of this versatile protein. The main functions of p21Cip1 are inhibition of cyclin-CDK activity, which involves its cyclin-binding motif, and inhibition of PCNA activity via its PIP-box and depending on its nuclear localization. Pools of p21Cip1 associated with cyclin-CDK or PCNA complexes can be specifically targeted for degradation via distinct degron motifs. The activity of the ligand motifs of p21Cip1 is positively or negatively regulated by multiple PTM target sites in or adjacent to these motifs. Legend: protein names are given in italics; gray boxes represent IDDs; yellow boxes represent motifs; small blue circles represent phosphates; gray ovals represent motif binding proteins; blue boxes describe the interaction outcome; “+P” signifies addition of a phosphate; “+Ub” signifies addition of a ubiquitin.
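The overlap between the p21Cip1 motifs listed in this caption can be made explicit with a small lookup. The sketch below uses only the coordinates and sequences from the caption; the `P21_MOTIFS` table and the `motifs_covering` helper are hypothetical names introduced for illustration. The queried positions (S130, T145, S153) are modification sites mentioned elsewhere in the text.

```python
# Motif name -> (position of first residue, motif sequence), from the caption.
P21_MOTIFS = {
    "cyclin-binding motif": (19, "RRLF"),
    "Skp2-Cks1 degron (putative)": (128, "EGSP"),
    "NLS": (141, "KRRQTSMTDFYHSKRR"),
    "PIP-box": (144, "QTSMTDFY"),
    "PIP degron": (144, "QTSMTDFYHSKR"),
}


def motifs_covering(position):
    """List (motif name, residue) for every motif that spans `position`."""
    hits = []
    for name, (start, seq) in P21_MOTIFS.items():
        if start <= position < start + len(seq):
            hits.append((name, seq[position - start]))
    return hits


for site in (130, 145, 153):
    print(site, motifs_covering(site))
```

A single modification site can therefore sit inside several motifs at once: position 145, for example, falls within the NLS, the PIP-box, and the PIP degron.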

has been shown to remain constant throughout the cell cycle294 despite containing an experimentally validated KEN-box.293 The abundance of some CDC25 paralogs is also regulated by the BTRC-containing SCF ubiquitin ligase complex. For instance, targeting of CDC25A by SCFβTrCP for degradation during S and G2 phase depends on phosphorylation of a phosphodegron.295 Several kinases have been implicated in the activation of this degron, including PLK3, CHK1, GSK3B, CK1, and NEK11 (never in mitosis A-related kinase 11), although the exact order of phosphorylation is still unclear.296 Phosphorylation of S76 is a rate-limiting step for the CHK1-dependent activation of the BTRC-binding degron, and efficient modification of the site is facilitated by 14-3-3-dependent scaffolding of CHK1 and CDC25A, an interaction that requires the 14-3-3 motif in the C terminus of CDC25A.297 In addition, CDC25A and CDC25B contain nonclassical phospho-independent BTRC degrons, although phosphorylation of flanking residues was shown to modulate SCF recruitment.298

5.1.4. SLiMs and CDC25 Localization. Several targeting motifs have been characterized in CDC25 family members (see section 4.1.2). All three human CDC25 paralogs have an NLS motif to facilitate nuclear import299−301 and an NES motif to

facilitate nuclear export (Figure 8B).299,302,303 Colocalization of CDC25 paralogs and their substrate CDKs is required for cell cycle progression, and consequently, CDC25 localization is often modulated in a modification-dependent manner. For example, phosphorylation of S216 in a phospho-dependent 14-3-3-binding motif in CDC25C by CHK1 recruits a 14-3-3 dimer. The 14-3-3-binding motif flanks the NLS of CDC25C, and the interaction with 14-3-3 occludes recognition of the NLS by the nuclear import machinery, resulting in cytoplasmic sequestration of CDC25C.285 Conversely, phosphorylation of S198 in the NES of CDC25C by PLK1 inhibits binding to XPO1, thus promoting nuclear localization of CDC25C.303

5.1.5. Summary of SLiM-Mediated Regulation of the Life Cycle of CDC25. The activity, stability, localization, and modification state of CDC25 family members are dynamically controlled by their extensive motif content (Figure 8). The evolutionary plasticity of SLiMs has allowed the evolution of a complex regulatory program encoded by cooperative use of numerous modification motifs and ligand motifs. From the moment the nascent CDC25 molecules leave the ribosome these motifs guide the life of these molecules and continue to regulate the proteins until they recruit the ubiquitin ligase machinery required for their timely degradation. The SLiM-


bound cyclin-CDK complex that has catalytic activity and can phosphorylate the T187 residue of p27Kip1.9,311 Phosphorylation of T187 induces binding of a specific phosphodegron motif in p27Kip1 to a composite binding site formed by CKS1 (cyclin-dependent kinases regulatory subunit 1) and the SKP2 subunit of the SCF ubiquitin ligase complex, resulting in ubiquitylation and subsequent proteasomal degradation of p27Kip1.240 The requirement for CDK-bound CKS1 specifically targets CDK-associated p27Kip1 for degradation. Several sources of data indicate that a similar strategy might be used to relieve cyclin-CDK inhibition by p21Cip1, although the molecular details of this process have not yet been fully characterized.304,312 The tyrosine residue that blocks the CDK catalytic site is conserved in p21Cip1 (Y77) and ubiquitylation of p21Cip1 depends on the presence of SKP2, CKS1, CDK, and cyclin subunits. In addition, phosphorylation of the S130 residue of p21Cip1 by CDK further stimulates its ubiquitylation and destabilization. Also, the S130 residue is close to a glutamate residue (E128), which in p27Kip1 is required to bind the composite site formed by SKP2 and CKS1.309,310,313 It should however be mentioned that, in contrast to p27Kip1, phosphorylation of p21Cip1 by CDK is not absolutely required for its ubiquitylation and, in addition, that CKS1 generally shows binding specificity for phosphothreonine-containing motifs.240,309,313 Nevertheless, use of multiple motifs enables complex functionality and signal integration to convert p21Cip1 from a cyclin-CDK inhibitor to a cyclin-CDK substrate to an SCFSkp2 substrate and ultimately to a proteasome substrate.

5.2.2. SLiMs and Inhibition of DNA Replication by p21.
In addition to antagonizing cyclin-CDK function, p21Cip1 also controls cell proliferation by modulating DNA replication through its interaction with PCNA (Figure 9B).314 This activity of p21Cip1 and the regulation thereof is independent from its cyclin-CDK inhibitory function and depends on a distinct set of ligand and modification motifs in p21Cip1. Hence, motifs allow p21Cip1 to have an orthogonal function in inhibiting DNA replication. PCNA is a homotrimer that is loaded onto DNA during S phase or after DNA damage and is required for the control of DNA polymerase processivity.85 Binding to PCNA is mediated by a PCNA-binding PIP-box motif in the C-terminal part of p21Cip1 and has been implicated in the response to DNA damage.314 This interaction masks the recruitment site on PCNA for PIP-box motif-containing proteins involved in DNA replication, resulting in inhibition of DNA synthesis to allow DNA repair. The ability of p21Cip1 to efficiently compete with other PCNA-binding proteins was suggested to result from a higher affinity for PCNA due to more extensive contacts between PCNA and the C-terminal flanking region of the PIP-box motif of p21Cip1.85 Modification sites near the PIP-box modulate the PCNA-binding activity of p21Cip1. For instance, PKB-catalyzed phosphorylation of the T145 residue, located in the PCNA-binding motif of p21Cip1, inhibits the interaction between p21Cip1 and PCNA. Hence, promoting the relief of p21Cip1-mediated inhibition of DNA replication is one of the mechanisms through which PKB can exert its proliferative effects in response to growth factors.315 Similar to its role in cyclin-CDK inhibition, the PCNA-inhibitory activity of p21Cip1 depends on its nuclear localization, which is dictated by an NLS in the C-terminal part of the protein.305,316 Phosphorylation of different residues within the NLS has been shown to result in nuclear exclusion of p21Cip1.
Depending on cellular models and conditions, phosphorylation of S146 by PRKCD (protein kinase C delta type)317 and S153 by PKC family members318 or

mediated regulation of the CDC25 paralogs is vitally important for their functionality and enables these proteins to respond to various cell stimuli and control progression through the cell cycle in an orderly manner.

5.2. Functional Versatility of p21

Due to the unique properties of SLiMs, specifically their small footprint and modular nature, multiple independent binding modules can be encoded in a relatively small protein and by utilizing these modules in different combinations the protein can perform diverse activities. Such functional versatility that emerges from the combinatorial use of motifs is illustrated by members of the CDI, also known as CIP/KIP (CDK interacting protein/kinase inhibitor protein), family of cyclin-dependent kinase inhibitors (CKIs). This family consists of the intrinsically disordered proteins p21Cip1, p27Kip1, and p57Kip2, which play an important role as negative regulators of the cell cycle. Each family member contains a well-conserved N-terminal domain that mediates binding to cyclin-CDK complexes.304 Conversely, their C-terminal regions have diverged and confer distinct functionality and modes of regulation on the different family members.83 For instance, the unique SLiM complement of p21Cip1 specifically allows this protein to conditionally perform two distinct functions in cell cycle regulation. The specific activity of p21Cip1 depends on its cellular context and concomitant interaction networks, which are modulated by SLiM-directed regulation of the localization, abundance, and modification state of p21Cip1.305,306 The following sections discuss how the unique combination of ligand and modification motifs of p21Cip1 enables the versatility and tight control of its functionality.

5.2.1. SLiMs and Inhibition of Cyclin-CDK Activity by p21.
The p21Cip1 protein is best known as a transcriptional target of p53 that mediates cell cycle arrest in the G1 and G2 phases in response to DNA damage by inhibiting the function of cyclin-associated CDK2 and CDK1, respectively.83,305 Binding of p21Cip1 to the cyclin-CDK complexes occurs via its KID region, an intrinsically disordered domain consisting of the D1 subdomain, which contains a cyclin-binding motif, and the D2 subdomain, which binds to and inhibits the kinase subunit (Figure 9A).304 Inhibition of cyclin-CDK complexes by p21Cip1 is achieved by blocking two sites for motif recognition. First, by acting as a pseudosubstrate and hiding the target recognition site on cyclin using a cyclin-binding motif, the D1 subdomain precludes binding of substrates, for instance, RB1, as well as regulators, such as the CDK activator CDC25A, all of which are recruited to cyclin-CDK complexes via similar motifs that bind to a hydrophobic patch on the cyclin subunit.290 Second, the D2 subdomain inhibits the catalytic activity of the CDK subunit by remodeling the active site and occupying the ATP-binding pocket, thereby blocking CDK recognition of target sites in the CDK substrates (Figure 1A). Relief of CKI-mediated inhibition of cyclin-CDK activity depends on specific targeting of the cyclin-CDK-associated inhibitors for proteasomal degradation (Figure 9B). The molecular details of the regulated release of a CKI from a cyclin-CDK complex and subsequent degradation of the inhibitor have been well characterized for p27Kip1,307,308 and a similar mechanism might control the stability of p21Cip1.304,309,310 In the case of p27Kip1, growth factor stimulation was shown to induce nonreceptor tyrosine kinase-mediated phosphorylation of its Y88 residue, which is involved in inhibition of the active site of CDK. This results in a p27Kip1-


Figure 10. SLiMs in T-cell receptor signaling. (A) Control of tyrosine-protein kinase LCK activity by regulated switching between the autoinhibitory and the active conformation. Both LCK (504QpYQP507) and PAG1 (317pYSSV320) contain an SH2 domain-binding motif. (B) Recruitment and activation of tyrosine-protein kinase ZAP70 by ITAM motifs in CD3 coreceptors, which are activated by phosphorylation upon TCR stimulation by ligand. CD3γ, -δ, and -ε each contain one ITAM (157DQLpYQPLKDREDDQpYSHL174, 146DQVpYQPLRDRDDAQpYSHL163, and 185NPDpYEPIRKGQRDLpYSGL202, respectively); CD3ζ contains three ITAMs (69NQLpYNELNLGRREEpYDVL86, 108EGLpYNELQKDKMAEApYSEI126, and 139DGLpYQGLSTATKDTpYDAL156). (C) SLiM-mediated assembly of the LAT signaling complex. SOS1 contains four SH3 domain-binding motifs (1151PPVPPR1156, 1179PPAIPPR1185, 1211PPLLPPR1217, and 1290PPVPPR1296). LAT (Short isoform) contains four SH2 domain-binding motifs (132pYLVV135, 171pYVNV174, 191pYVNV194, and 226pYENL229). (D) Cooperative binding of PLCG1 and downstream pathways activated by LAT-dependent TCR signaling. SLP76 contains three SH2 domain-binding motifs (113pYESP116, 128pYESP131, and 145pYEPP148). Legend: protein names are given in italics; other biomolecule names are underlined. SH2 domain-binding motifs are represented as small yellow boxes; phosphorylated SH2 domain-binding motifs are represented as small yellow boxes with a blue circle; SH3 domain-binding motifs are represented as small green boxes. SLiM-binding domains are represented as blue-bordered gray boxes; other domains are represented as mauve boxes; domain names are given in these boxes. “−P” signifies removal of a phosphate; “+P” signifies addition of a phosphate; green stars signify an accessible active site of a kinase; red stars signify an inaccessible active site of a kinase.
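As a rough check, the six ITAM sequences in this caption can be scanned with the commonly cited ITAM consensus, YxxL/I followed by a 6 to 8 residue spacer and a second YxxL/I. This consensus is not stated in the caption and is used here as an assumption. The sequences below are taken from the caption with the phosphorylation marker before each tyrosine removed; the `ITAM` pattern, the `ITAMS` table, and `itam_tyrosines` are illustrative names.

```python
import re

# Assumed consensus for this sketch: YxxL/I, 6-8 spacer residues, YxxL/I.
ITAM = re.compile(r"Y..[LI].{6,8}Y..[LI]")

# Chain -> (position of first residue, sequence without phospho markers).
ITAMS = {
    "CD3γ": (157, "DQLYQPLKDREDDQYSHL"),
    "CD3δ": (146, "DQVYQPLRDRDDAQYSHL"),
    "CD3ε": (185, "NPDYEPIRKGQRDLYSGL"),
    "CD3ζ ITAM1": (69, "NQLYNELNLGRREEYDVL"),
    "CD3ζ ITAM2": (108, "EGLYNELQKDKMAEAYSEI"),
    "CD3ζ ITAM3": (139, "DGLYQGLSTATKDTYDAL"),
}


def itam_tyrosines(start, peptide):
    """Return positions of the two ITAM tyrosines, or None if no match."""
    match = ITAM.search(peptide)
    if match is None:
        return None
    first = match.start()        # Y of the first YxxL/I
    second = match.end() - 4     # Y of the second YxxL/I
    return (start + first, start + second)


for chain, (start, peptide) in ITAMS.items():
    print(chain, itam_tyrosines(start, peptide))
```

Each sequence yields a pair of tyrosines, which matches the description of an ITAM as two SH2 domain-binding motifs that can be phosphorylated independently.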

chains of the PIP degron for recognition by CDT2.323−325 As the PIP degron is only active when the overlapping PIP-box is bound to chromatin-associated PCNA, degradation of PIP degron-containing proteins specifically occurs in a PCNA-dependent manner during S phase or after DNA damage and allows specific regulation of the PCNA-inhibitory activity of p21Cip1.324

5.2.3. Summary of SLiM-Mediated Functional Flexibility of p21. The cell cycle regulatory protein p21Cip1 is a compact 164 amino acid protein, yet it controls two important tasks by independently exploiting interaction modules under different conditions. The motif complement of p21Cip1 underlies the regulated interplay between its subcellular localization, modification state, and stability to modulate its specificity for a large number of binding partners and hence determine its function.305,326,327 Interestingly, these distinct

DYRK1B (dual-specificity tyrosine-phosphorylation-regulated kinase 1B)319 were reported to localize p21Cip1 to the cytoplasm. Furthermore, BRAP (BRCA1-associated protein) functions as a cytoplasmic retention factor by binding to p21Cip1 and thereby hiding the NLS to sequester p21Cip1 in the cytoplasm.320 The pool of p21Cip1 proteins associated with replication forks can be selectively regulated by PCNA-dependent degradation (Figure 9B). The CDT2 (DTL/denticleless protein homologue) substrate recognition subunit of the CRL4Cdt2 ubiquitin ligase complex recognizes proteins that contain a PIP degron motif and targets them for proteasomal degradation.321,322 The PIP degron overlaps the PCNA-binding PIP-box motif, and binding of CDT2 to its substrates depends on preformation of a complex consisting of the substrate and chromatin-bound PCNA, as binding of the PIP-box to PCNA orientates the side


activities of p21Cip1 can be specifically regulated by targeting separate pools of this protein for proteasomal degradation through distinct mechanisms in response to different signals.

ideal tools for dynamic complex assembly, as evidenced by their extensive use in signaling processes. Among the most highly studied metastable complexes are the cooperatively assembled TCR complexes.333 Activation of these complexes is induced by engagement of peptide-presenting major histocompatibility complexes (MHCs) on antigen-presenting cells, an event that promotes phosphorylation of a membrane-bound complex and induces formation of a large motif-stabilized complex known as the LAT signaling complex.337 This complex acts as a platform for recruitment and activation of multiple pathways involved in cytoskeletal reorganization, cell adhesion, and transcriptional regulation.338,339 The exact temporal and spatial details of the pathway remain to be characterized; however, the composition and interfaces of the various complexes in the pathway are known, as are the mechanisms regulating them.337,340 In the following section we will describe a simplified version of the initial steps of the TCR signaling pathway to highlight the role of motifs in these dynamic signaling events (Figure 10). We will highlight four key steps that regulate, propagate, and amplify the signal elicited by MHC engagement: (i) activation of SRC family tyrosine-protein kinases such as LCK (see section 6.1.1); (ii) activation of tyrosine-protein kinase ZAP70 (see section 6.1.2); (iii) construction of the LAT signaling platform (see section 6.1.3); (iv) recruitment of the activators of downstream signaling (see section 6.1.4).

6.1.1. SLiMs and LCK Regulation. The SRC family tyrosine-protein kinases phosphorylate the TCR complex upon MHC engagement, thereby initiating intracellular TCR signaling. Consequently, in the absence of a stimulus, these kinases are tightly regulated to limit aberrant kinase activity.
The kinase LCK is held in an inactive state by an intramolecular interaction between its SH2 domain and an SH2 domain-binding motif in its C-terminal region, which locks the kinase in an autoinhibited conformation (Figure 10A) (see section 4.1.3.4).341,342 The autoinhibition depends on phosphorylation of the SH2-binding motif at Y505 by CSK (C-terminal Src kinase) and can be relieved by (activating) dephosphorylation of this tyrosine residue by CD45 (receptor-type tyrosine-protein phosphatase C). In unstimulated T cells, inactivation of LCK requires two independent phosphorylation events as well as recruitment of CSK to LCK that is dependent on PAG1 (phosphoprotein associated with glycosphingolipid-enriched microdomains 1). LCK and PAG1 colocalization is regulated by reversible attachment of a membrane-anchoring palmitoyl group on cysteines within the 2GCGC5 and 37CSSC40 motifs of LCK and PAG1, respectively.343 This colocalization enables recruitment of CSK to LCK upon phosphorylation of the Y317 residue in an SH2 domain-binding motif of PAG1. This creates a binding site for the SH2 domain of CSK. As a consequence, CSK phosphorylates the inhibitory Y505 residue in LCK, prompting it to adopt the inactive conformation.344 Conversely, dephosphorylation of both tyrosines is required to activate LCK. First, the phosphatase CD45 dephosphorylates Y505, enabling LCK to switch to the active conformation. To ensure this event is not reversed, CD45 also dephosphorylates the Y317 residue of PAG1, thereby preventing PAG1-dependent colocalization of CSK with LCK. The tight control of the phosphorylation state of the phospho-dependent SH2 domain-binding motifs that mediate the recruitment of kinases, phosphatases, and adaptor proteins strictly regulates the kinase activity of LCK and minimizes the likelihood of stochastic activation.
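The two-tyrosine logic of LCK regulation described above can be summarized as a small state model. This is a deliberately simplified sketch: it ignores palmitoylation-dependent colocalization and kinetics, and the class and method names (`LckSwitch`, `csk_step`, `cd45_step`) are hypothetical labels for the events named in the text.

```python
from dataclasses import dataclass


@dataclass
class LckSwitch:
    """Phosphorylation state of LCK Y505 and PAG1 Y317."""

    lck_y505_phos: bool = False
    pag1_y317_phos: bool = False

    def phosphorylate_pag1_y317(self) -> None:
        # The kinase responsible is not named in the text, so it stays abstract.
        self.pag1_y317_phos = True

    def csk_step(self) -> None:
        # CSK docks via its SH2 domain only when PAG1 Y317 is phosphorylated;
        # once recruited, it phosphorylates the inhibitory Y505 of LCK.
        if self.pag1_y317_phos:
            self.lck_y505_phos = True

    def cd45_step(self) -> None:
        # CD45 dephosphorylates LCK Y505 and PAG1 Y317, which both activates
        # LCK and removes the docking site that would let CSK reverse this.
        self.lck_y505_phos = False
        self.pag1_y317_phos = False

    @property
    def autoinhibited(self) -> bool:
        # Phosphorylated Y505 binds the LCK SH2 domain intramolecularly.
        return self.lck_y505_phos


switch = LckSwitch()
switch.phosphorylate_pag1_y317()
switch.csk_step()
print("after CSK:", switch.autoinhibited)
switch.cd45_step()
switch.csk_step()  # no PAG1 docking site, so CSK cannot re-inhibit LCK
print("after CD45:", switch.autoinhibited)
```

In this model, inactivation needs two phosphorylation events in sequence, whereas a single CD45 step clears both, which mirrors why dephosphorylating PAG1 Y317 prevents reversal of LCK activation.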

6. STRATEGIES OF SYSTEMIC SLIM USE

The previous sections have shown that SLiMs are ubiquitous in eukaryotic proteomes and encode diverse functionality. Many proteins utilize distinct sets of motifs that cooperate to integrate pertinent signals and build unique regulatory programs. The examples of CDC25 and p21Cip1 highlighted the immense regulatory potential of utilizing multiple motifs in a single protein. When several such proteins in a system cooperate, these systems can exhibit highly complex regulation and emergent properties, enabled by simple evolvable components. SLiMs are used ubiquitously for regulation of cellular systems, for instance, those mediating progression through the cell cycle (cyclin-CDK-dependent modulation of cell cycle regulators,80 promotion of mitotic exit by the APC/C33), endocytosis (cargo recruitment and assembly of the clathrin cage328), signaling pathways (ErbB,329 Jak-STAT330 and PI3K-Akt331 signaling), protein trafficking (cargo recruitment in vesicular179 and nuclear transport186 pathways), and transcriptional regulation (the G1 checkpoint,332 steroid-responsive transcription214). In the following section we will illustrate the systemic use of SLiMs by discussing two cellular processes that are well studied and understood. First, we will focus on ligand motif-driven complex assembly by discussing how SLiMs facilitate multiprotein complex formation at multiple stages of T-cell receptor (TCR) signaling activation to propagate and buffer signal transduction. We will emphasize the dynamic, conditional, and reversible nature of each step and highlight the role of multivalent interactions in the SLiM-driven assembly of the TCR signaling platform (see section 6.1). Second, we will focus on regulatory motifs by reviewing how SLiMs modulate the cellular response to hypoxia by conditionally regulating the local abundance of a set of transcription factors and their regulators.
We will emphasize how integration of cellular information through cooperative use of modification motifs and regulatory ligand motifs can facilitate a dynamic response to changes in cell state by modulating the binding specificity and stability of proteins (see section 6.2).

6.1. Dynamic Signaling Platforms in T-Cell Receptor Signaling

For many cellular receptors, such as the T-cell receptors,333 EGF receptors,334 and integrins,335 intracellular signaling is initiated by SLiM-driven formation of large multiprotein complexes that act as signal transduction platforms. Formation of metastable complexes required for signal propagation depends on multiple motifs that act cooperatively to mutually increase the binding strength of SLiM-mediated interactions through high-avidity binding (see section 4.1.3.1). Cooperative use of multiple low-affinity interaction interfaces to mediate complex assembly allows for accurate, efficient, and robust signal transmission.217 Interactions within these oligomeric complexes are mutually promoted by increased local concentrations of binding sites, and simultaneous binding of these sites with multiple partners provides stability to the assembly. The interdependency between the distinct binding events results in a sigmoidal response that is characterized by a sharp transition between the off and the on states and hence reduces biological noise.336 The attributes of SLiMs, specifically their low-affinity binding and predisposition for regulation by PTM, make them


6.1.2. ITAM Activation and ZAP70 Recruitment. The second highly regulated step in TCR activation is the recruitment of ZAP70, which is an example of a robust SLiM-mediated switching mechanism. The T-cell receptor is a multimeric protein complex consisting of two antigen-binding subunits, the TCR-α and -β receptor chains (Figure 10B). These receptors associate with accessory molecules, including CD3γ, -δ, and -ε chains and a homodimer of ζ chains. Upon MHC attachment, LCK phosphorylates these accessory molecules on ITAM (immunoreceptor tyrosine-based activatory motif) peptides, allowing the TCR to recruit and activate the tyrosine-protein kinase ZAP70.343 The exact mechanism of how the accessibility of the ITAMs is controlled is still not fully characterized; however, two models currently exist. The first model suggests a mechanism where the ITAMs of inactivated TCRs are inserted into the plasma membrane, from which they are removed by conformational changes upon ligand binding, making them accessible for phosphorylation.345−347 The second model suggests that upon binding of the MHC complex to the TCR complex the large extracellular domain of the phosphatase CD45 is forced out of the immunological synapse, resulting in the unopposed phosphorylation of the ITAMs by active LCK. The outcome is that the engagement of the peptide-MHC complex by the TCR and accessory CD4 proteins results in activation of LCK and its recruitment to the cytoplasmic tails of the TCR complex, which enables LCK to phosphorylate the ITAMs of the T-cell coreceptors. ITAMs consist of a pair of phospho-dependent SH2 domain-binding motifs, and multiple ITAM copies are present in the TCR complex: CD3γ, -δ, and -ε each contain one ITAM, while CD3ζ contains three ITAMs.345 The SH2 motifs of ITAMs can bind separately with low affinity or cooperatively with high affinity to several SH2 domain-containing proteins (Figure 10B).
For example, ZAP70 binds with approximately 20-fold higher affinity to an ITAM peptide phosphorylated on both tyrosines than to a singly phosphorylated ITAM peptide of CD3ζ1.348 Phosphorylation of both SH2 domain-binding motifs of an ITAM by LCK promotes high-affinity interactions with the tandem SH2 domains of ZAP70 in a two-stage procedure. Binding of the first SH2 domain of ZAP70 to an ITAM results in the rearrangement of ZAP70 to expose the binding pocket on the second SH2 domain, which is then able to bind to the second SH2-binding motif in the ITAM. This rearrangement of ZAP70 promoted by ITAM binding relieves the autoinhibitory conformation of ZAP70, allowing both autophosphorylation and phosphorylation by other kinases, such as LCK, which is required for full activation of ZAP70.349 Thus, at the center of the regulation of TCR pathway activation are two simple binary switches: one to activate LCK and one to activate ZAP70. Yet, this motif-based mechanism has enabled development of a highly resilient system to ensure only sustained peptide binding initiates the TCR signaling pathway.

6.1.3. SLiMs and LAT Complex Assembly. ITAM-dependent activation of ZAP70 enables this kinase to phosphorylate a number of key residues in the membrane-bound scaffolding protein LAT, the protein that forms the core of the TCR activation-dependent multiprotein signaling platform (Figure 10C).337,339 In contrast to the previous sections, multiple phosphorylation-activated SH2-binding motifs act cooperatively to construct a lattice-like signaling platform. This platform, known as the LAT signaling complex, is estimated to measure 20−30 μm in diameter and to contain in excess of 100 protein molecules, forming within a few

seconds after TCR complex activation and persisting for approximately 10 min.350−352 The dynamic yet stable assembly of such a large complex is driven by cooperative use of multiple SLiM-mediated interactions. Of particular importance are four tyrosine residues (Y132, Y171, Y191, and Y226) in SH2 domainbinding motifs in LAT that are phosphorylated by ZAP70 to enable recruitment of two further highly modular proteins: GRB2 (as well as the closely related paralog GADS (GRB2related adapter protein 2)) and PLCG1 (1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase gamma-1).337 The related adaptor proteins GRB2 and GADS share a domain architecture with a central SH2 domain flanked by an SH3 domain on either side. The SH2 domains of GRB2 and GADS preferentially bind LAT at three (of the four) SH2 domain-binding motifs: 171PYVNV174, 191PYVNV194 , and 226PYENL229 (Figure 10D). Both proteins have a considerably weaker affinity (50−100-fold) for the first SH2-binding motif in the LAT sequence (132PYLVV135). Instead, PLCG1 was found to bind, via its N-terminal SH2 domain, to this motif.338,353,354 Studies have shown that affinity alone is not enough to define the specificity of PLCG1 for this motif.337,355 Instead, a number of cooperative interactions are required to localize PLCG1 to LAT and activate its enzymatic function. Recruitment of PLCG1 to LAT requires the initial binding of GADS (and to a lesser extent GRB2) to LAT, followed by recruitment of the adaptor protein SLP76 (SH2 domain-containing leukocyte protein of 76 kDa). 
The cooperative binding between SLP76 and GADS, orchestrated by the SH3-binding motifs in SLP76 and the SH3 domains in GADS,356 stabilizes the interaction between SLP76 and PLCG1 (Figure 10D).337,357 Along with binding of the PH domain of PLCG1 to 1-phosphatidyl-1D-myo-inositol 3,4,5-trisphosphate (PIP3) in the plasma membrane, this cooperativity ensures the stable interaction of PLCG1 with the 132PYLVV135 motif in LAT and activation of this enzyme.337,353,357 As well as the cooperative motif-mediated interactions required for recruiting and activating PLCG1, additional cooperative interactions between the SH3-binding motifs in SOS1 (son of sevenless homologue 1) and the SH3 domains of GRB2 (and GADS) facilitate the clustering of LAT.340 The SH3 domain-binding motifs in SOS1 bind to multiple GRB2 proteins often associated with different LAT proteins, which brings together these adaptor proteins (Figure 10C).340 Formation of this oligomeric LAT signaling complex is vital for further downstream signaling.340 The cooperativity enabled by the presence of repeating motifs in multiple proteins thus facilitates the reversible, dynamic assembly of the LAT signaling complex. Furthermore, the low binding affinity of each motif ensures stable interactions are not formed without cooperative interactions, promoting robustness and fidelity in signal transduction by buffering transient binding events resulting from incomplete pathway activation. 6.1.4. SLiMs and Activation of Downstream Pathways. Recruitment of SLP76 and PLCG1 to the LAT complex enables propagation of downstream signaling.337,339,358 The presence of multiple SH2-domain binding motifs, in particular in the C-terminal tail of SLP76, can act as a scaffold to colocalize multiple proteins, which can then interact to propagate signal transduction (Figure 10D). 
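The fold differences in affinity quoted for TCR signaling (about 20-fold for ZAP70 binding to doubly versus singly phosphorylated ITAMs, and 50−100-fold for GRB2 and GADS at the 132PYLVV135 motif of LAT) can be placed on an energy scale with the standard relation ΔΔG = RT ln(fold). The following is a minimal illustrative sketch only: the temperature of 298.15 K and the function name `fold_to_ddg` are assumptions made here and are not taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_to_ddg(fold_change: float, temperature_k: float = 298.15) -> float:
    """Binding free-energy difference (kcal/mol) for a fold change in affinity.

    Uses ddG = R * T * ln(fold). The default temperature is an assumption.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return R_KCAL * temperature_k * math.log(fold_change)


if __name__ == "__main__":
    # Fold changes mentioned in the text for ITAM and LAT motif binding.
    for fold in (20, 50, 100):
        print(f"{fold:>3}-fold -> {fold_to_ddg(fold):.2f} kcal/mol")
```

Under these assumptions, even a 100-fold difference in affinity corresponds to less than 3 kcal/mol, which is in keeping with the idea that specificity in this system arises from the combination of several individually weak motif-mediated interactions.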
SLP76 links TCR activation to actin polymerization through motif-mediated interactions with both VAV1 (proto-oncogene vav) and NCK1 (cytoplasmic protein NCK1).358 Recruitment of these SH2 domain-containing proteins depends on phosphorylation of


Figure 11. SLiMs in hypoxia response signaling. (A) Modular architecture of HIF1A and p53. Several globular domains, IDDs, and SLiMs are annotated. HIF1A contains two oxygen-dependent degron motifs in the oxygen-dependent domain (ODD) (ODD1, 400LAPAAGDTIISLDF413; ODD2, 562LAPYIPMDDDFQL574) that are hydroxylated by PHD enzymes (at P402 and P564, respectively) and recognized by the VHL ubiquitin ligase; one additional modification motif recognized by FIH1 (802VNAP805) for oxygen-dependent hydroxylation (at N803); two transactivation domains (the NTAD and CTAD IDDs) with a core TAZ-binding motif (792LPQL795) located in the CTAD; two NLS motifs (17RRKEK21 and 717QRKRKM722); and an NES (630DRMEDIKILI639). The p53 protein contains an N-terminal IDR with two transactivation domains (TAD1, 14LSQETFSDLWKLLPENNV31; TAD2, 48DDIEQWFTE56); an MDM2 degron (19FSDLWKLL26) that overlaps with TAD1; and multiple PTM target sites, including modification motifs for kinases of the PIKK family (12PPL(S)QET18 and 34PLP(S)QAM40), CK1 subfamily (15pSQE(T)FSD21), and GSK3 subfamily (30NVL(S)PLPpS37). The C-terminal IDR contains the tetramerization IDD and several SLiMs including an NES (339EMFRELNEALELKD352), an NLS (305KRALPNNTSSSPQPKKK321), and two docking sites for the USP7 deubiquitylase (359PGGSR363 and 364AHSSH368). (B) Major changes affecting the SLiM-dependent stability and transcriptional activity of HIF1A and p53 in normoxia, mild hypoxia, and anoxia. In normoxic conditions, oxygen-dependent hydroxylation of HIF1A by PHDs promotes HIF1A ubiquitylation and subsequently its destruction, MDM2-dependent ubiquitylation of p53 promotes p53 degradation, and general transcription factors outcompete p53 and HIF1A for CBP/p300. In mild hypoxic conditions, PHD inactivation leads to HIF1A accumulation resulting in HIF1A-dependent transcription. MDM2-dependent degradation of p53 continues. 
In anoxic conditions, phosphorylation-dependent inhibition of MDM2 recruitment to p53 and CBP/p300 competition with MDM2 for the TAD1 of p53 results in the accumulation of p53, p53 outcompetes HIF1A for CBP/p300, p53-dependent transcription is activated, and HIF1A is degraded independently of VHL. Legend: protein names are given in italics. SLiMs are represented as yellow boxes; IDDs are represented as gray boxes; globular domains are represented as mauve boxes; module names are given in or above these boxes. Blue circles represent important phosphorylated sites; red circles represent important hydroxylated sites; yellow circles represent important ubiquitylated sites; the modifying enzyme is specified above these sites when relevant.
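The positional motif labels used in this caption (for example, 562LAPYIPMDDDFQL574) encode a start residue number, the motif sequence, and an end residue number. As a minimal sketch, assuming labels follow exactly this start-sequence-end pattern, the helper functions below (hypothetical names `parse_motif` and `is_consistent`, introduced here for illustration) split a label into its parts and check that the numbering agrees with the sequence length. Parentheses marking a modified residue and any lowercase phosphate prefix are ignored when counting residues.

```python
import re


def parse_motif(label: str) -> tuple[int, str, int]:
    """Split a label such as '562LAPYIPMDDDFQL574' into (start, residues, end)."""
    match = re.fullmatch(r"(\d+)([A-Za-z()]+)(\d+)", label)
    if match is None:
        raise ValueError(f"not a positional motif label: {label!r}")
    start, body, end = match.groups()
    # Keep uppercase one-letter residue codes only; parentheses mark the
    # modified residue and a lowercase 'p' (if present) marks a phosphate.
    residues = "".join(ch for ch in body if ch.isupper())
    return int(start), residues, int(end)


def is_consistent(label: str) -> bool:
    """True when end - start + 1 equals the number of residues in the label."""
    start, residues, end = parse_motif(label)
    return end - start + 1 == len(residues)


if __name__ == "__main__":
    # Labels taken from the Figure 11 caption.
    for label in ("562LAPYIPMDDDFQL574", "630DRMEDIKILI639", "12PPL(S)QET18"):
        print(label, parse_motif(label), is_consistent(label))
```

A check of this kind is useful mainly as a sanity test when motif annotations are transcribed by hand or recovered from text extraction.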

three SH2 domain-binding motifs in SLP76.359,360 Subsequently, an SH3 domain in NCK1 can bind an SH3 domain-binding motif (377RSGPLPPPP385) in WASP (Wiskott−Aldrich syndrome protein) to recruit WASP to the complex.358 VAV1, a guanine nucleotide exchange factor, recruits and activates the Rho family G-proteins RAC1 (Ras-related C3 botulinum toxin substrate 1), RHOA (transforming protein RhoA), and CDC42 (cell division control protein 42 homologue). CDC42 can then activate WASP to initiate actin polymerization and cytoskeletal changes.360 SLP76 can also activate cell adhesion-related pathways through its C-terminal SH2 domain-mediated interactions with FYB (FYN-binding protein) and SHB (SH2 domain-containing adapter protein B). An SH2 domain-binding motif (595PYDDV598) in FYB, for example, binds the SH2 domain of SLP76.361−363 The second major downstream activator of the LAT complex is PLCG1, which hydrolyzes PIP2


p300. Consequently, HIF1A must compete to recruit available CBP/p300 to the HIF target genes. The outcome of this competition is regulated by modulating the intrinsic affinity of the binding regions of the various competing transcription factors for CBP/p300, for instance, by PTM, or the local abundance of the different competitors, for instance, by altering their gene expression level, stability, localization, and/or scaffolding. Much of the conditional integration of the cell state indicators that controls the changes in CBP/p300 specificity in response to hypoxia depends on cooperative use of multiple regulatory and modification motifs across several proteins. The following sections discuss how these motifs modulate the stability, activity, localization, and modification state of HIF1A, its regulators and competitors for CBP/p300, and how they encode a pro-survival regulatory program that activates HIF1A in mild hypoxia but under sustained oxygen depletion switches to an apoptotic program mediated by p53 (Figure 11). 6.2.1. SLiMs Controlling HIF1A Stability and Transcriptional Activity. Oxygen-dependent regulation of HIF1A is mediated by conditional modulation of the C-terminal transactivation domain (C-TAD) and the oxygen-dependent degradation domain (ODD) (Figure 11A). These regions contain SLiMs that control the transcriptional activity and stability of HIF1A, respectively. The activity of these motifs is controlled by oxygen-dependent PTM, which allows HIF1A function to be regulated in a context-dependent manner. In normoxic conditions, when the supply of oxygen is sufficient, these motifs cooperate to suppress the hypoxia response by deactivating and degrading HIF1A.366,372 The C-TAD is an extended disordered domain around a core motif that, during hypoxia, mediates binding of HIF1A to the TAZ domains of the transcriptional coactivators CBP/p300. 
In normoxia, however, CBP/p300 recruitment is inhibited by the oxygen-regulated FIH1 (factor inhibiting HIF-1) enzyme, which catalyzes hydroxylation of HIF1A on an asparagine residue adjacent to the core TAZ-binding motif.70 As a result, binding of the HIF1A C-TAD to CBP/p300, an interaction required for HIF transcriptional activity, is inhibited. A second mechanism that inhibits HIF1A activity in normoxia requires degradation of HIF1A to maintain HIF1A abundance at a low level. Two oxygen-dependent degrons within the ODD mediate binding of HIF1A to VHL, which is part of and recruits substrates to the ECS (elongin C/B-Cul2-SOCS) ubiquitin ligase complex. Subsequent polyubiquitylation targets HIF1A for proteasomal degradation. Binding to VHL is dependent on hydroxylation of specific, conserved proline residues within the degrons, which is catalyzed by specific prolyl hydroxylase domain-containing proteins (PHDs) and requires oxygen as a substrate.135 When oxygen levels drop, the PHDs become inactive, the HIF1A VHL-binding degrons are no longer hydroxylated, and their interaction with VHL is disrupted, resulting in stabilization of HIF1A, which is then targeted to the nucleus via its NLS where it can regulate transcription.373 Thus, the presence of oxygen prevents inappropriate induction of the hypoxia response through multiple, complementary safeguards that target SLiMs whose activity can be easily regulated by oxygen-dependent PTM, i.e., by promoting the degradation of cytosolic HIF1A and by inhibiting the transcriptional activity of nuclear HIF1A.366 However, in hypoxic conditions HIF1A is stabilized and inhibition of CBP/p300 binding is relieved, resulting in HIF1A accumulation and transcriptional activity (Figure 11B). Subsequently, HIF induces expression of a diverse set of genes

to produce diacylglycerol (DAG) and inositol-1,4,5-trisphosphate (IP3). DAG activates PKC, and IP3 promotes the release of calcium from intracellular stores, ultimately leading to activation of transcription factors such as NFAT (nuclear factor of activated T cells) proteins that control T cell differentiation, proliferation, and effector response.339 6.1.5. Summary of SLiMs in T-Cell Receptor Signaling. The initial steps of the TCR signaling pathway consist of several motif-based switches that result in the conditional motif-driven construction of metastable signaling platforms. These dynamic and reversible phosphorylation-, competition-, and avidity-dependent switches collaborate to facilitate robust signal transmission by buffering transient TCR-binding events. The TCR signaling pathway is remarkable for the simplicity of its components. The pathway has taken advantage of the evolutionary plasticity of SLiMs and the tractability of SH2 and SH3 domains to build a fine-tuned regulatory system. As such, TCR signaling provides an excellent illustration of the de novo evolution of SH2- and SH3-binding SLiMs and reuse of peptide-binding domains to evolve complex regulatory systems. 6.2. Combinatorial Signal Integration in Hypoxia

Eukaryotic genomes encode hundreds of proteins that sense the cell state or external stimuli and, either directly or indirectly, transmit that information to downstream effectors. These effectors can then adjust their activity to respond to the received information. Often cell state information is amplified and disseminated by proteins that recognize SLiMs, resulting in either PTM or binding events that can conditionally alter protein function. For example, members of the PIKK (phosphatidylinositol 3-kinase-related kinase) family of kinases propagate information concerning cell stresses by phosphorylating ([ST])Q modification motifs,364 while Calmodulin propagates information about calcium levels by Ca2+-dependent recognition and binding to 1-5-10 and 1-8-14 motifs (named for the position of hydrophobic residues in these peptides).365 Proteins that require this information, generally, have convergently evolved the motifs required to receive this information. These motifs will occur in a context that modulates the function of the containing protein, through either gain or loss of a binding partner, inhibition of enzymatic activity, or a change in localization or stability. As motif-mediated interactions can easily be regulated conditionally by reversible PTM or competitive binding (see section 3.4.2), SLiMs are the ideal modules to integrate and transmit changes in cellular conditions. How these mechanisms regulate context-dependent rewiring of SLiM-mediated interactions to alter protein function and control cell fate can be illustrated by the cellular response to low oxygen or hypoxia. Hypoxia plays an important role in mammalian physiology, and consequently, cellular responses to changes in oxygen levels are tightly regulated.366 The transcriptional response to hypoxia is controlled, largely, by the heterodimeric transcription factor HIF (hypoxia-inducible factor). 
In common with many other transcription factors, including STAT1 and 2 (signal transducer and activator of transcription 1 and 2),367 JUN (transcription factor AP-1),368 FOXO3 (Forkhead box protein O3),369 and p53,370 HIF is coactivated by the versatile multifunctional transcriptional coregulators CBP and p300 (CREB-binding protein and histone acetyltransferase p300).70,371 The abundance of CBP/p300 in the cell is limited, and the oxygen-labile HIF alpha subunit HIF1A binds in a mutually exclusive manner with other transcription factors to overlapping sites on CBP/


to stimulate cellular survival, including genes encoding proteins involved in oxygen delivery, such as transferrin and erythropoietin, metabolism, including glucose transporters and hexokinases, and cell proliferation, for example, the CDK inhibitor p21Cip1. 6.2.2. SLiMs Inhibiting p53 in Normoxia. The cellular tumor antigen p53 functions as a transcription factor that regulates expression of numerous genes involved in a wide range of processes, including cell cycle arrest, DNA repair, and apoptosis. The ability of p53 to modulate transcription depends on interactions with the promoter region of target genes, mediated by its globular DNA-binding domain, and interactions with the basal transcriptional machinery and transcriptional coregulators, including CBP/p300, mediated by SLiMs and IDDs within transactivation domains (TADs) located in the extensive N- and C-terminal disordered regions of p53 (Figure 11A).374 Similar to HIF1A, p53 activity is kept low in unstressed cells by regulating its abundance and transcriptional activity. Post-translationally, the abundance of p53 is controlled by ubiquitylation-dependent degradation, promoted by several ubiquitin ligases, including MDM2.160 The SWIB domain of MDM2 recognizes a degron motif in the N-terminal region of p53, recruiting MDM2 to p53 and promoting the ubiquitylation and subsequent degradation of p53.370,375 Binding of MDM2 also promotes cytoplasmic shuttling of p53 in a ubiquitin-dependent manner that requires recognition of the NES in the C-terminal region of p53.190,376,377 Motif-mediated recruitment of USP7 to p53 results in deubiquitylation and counters the MDM2-catalyzed ubiquitylation, thereby partially stabilizing p53, although p53 is still turned over relatively rapidly with a half-life of around 30 min.378 Consequently, in unstressed cells, SLiM-based mechanisms ensure that p53 is diffusely sequestered in the cytoplasm and rapidly destroyed, thereby inhibiting its transcriptional activity (Figure 11B). 
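To give a feel for what a half-life of around 30 min means for p53 abundance, and how this compares with the half-life of over 3 h reported after stabilization (section 6.2.3), the sketch below assumes simple first-order decay and a constant synthesis rate. Both are simplifying assumptions made here for illustration, and the function names are hypothetical.

```python
def fraction_remaining(minutes: float, half_life_min: float) -> float:
    """Fraction of an initial protein pool left after first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    return 0.5 ** (minutes / half_life_min)


def steady_state_fold(half_life_new_min: float, half_life_old_min: float) -> float:
    """Fold change in steady-state level when only the half-life changes.

    With constant synthesis and first-order decay, the steady-state level is
    proportional to the half-life, so the fold change is the half-life ratio.
    """
    return half_life_new_min / half_life_old_min


if __name__ == "__main__":
    # Half-lives from the text: about 30 min (unstressed), over 3 h (stabilized).
    print(fraction_remaining(60, 30))    # pool left after 1 h, unstressed
    print(fraction_remaining(60, 180))   # pool left after 1 h, stabilized
    print(steady_state_fold(180, 30))    # minimum fold increase at steady state
```

Under these assumptions, extending the half-life from 30 min to 3 h would by itself raise the steady-state level about 6-fold, before any change in synthesis or localization is considered.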
6.2.3. SLiMs Activating p53 in Sustained Hypoxia. While HIF1A activation promotes cellular survival in response to a decrease in oxygen availability, prolonged or severe hypoxia results in attenuation and finally termination of the HIF response and will ultimately lead to controlled cell death through apoptotic processes that depend on the accumulation and activation of p53 (Figure 11B).379,380 Under severe hypoxic conditions, several PTM sites in the N-terminal region of p53 are specifically targeted by stress-responsive kinases and used to modulate the stability and transcriptional activity of the protein.381 Due to the interplay between different overlapping modification and ligand motifs in this region, the function of p53 can be regulated based on integration of multiple signals. The residues S15, T18, S33, and S37 are key regulatory phosphorylation sites that mediate the context-dependent responses of p53. The stress-activated PIKK kinases ATM (ataxia telangiectasia mutated) and ATR (ataxia telangiectasia and Rad3-related protein) phosphorylate p53 at S15, a classical PIKK family glutamine-directed ([ST])Q phosphorylation site.382 Phosphorylation of S15 is required for subsequent modification of T18 by CK1 subfamily members, which recognize primed pSxx([ST]) sites.282 Phosphorylation of T18 inactivates the MDM2-binding motif, thereby blocking binding of p53 to MDM2 and inhibiting degradation of p53. This results in p53 stabilization, increasing the half-life of p53 to over 3 h.383 Disruption of MDM2 binding also inhibits MDM2-mediated nuclear export of p53. Consequently, p53 accumulates in the nucleus in an NLS-dependent manner and associates with target genes to promote their transcription.190
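The sequential phosphorylation described above can be illustrated by pattern matching on the p53 TAD1 fragment annotated in Figure 11A (14LSQETFSDLWKLLPENNV31). The sketch below is an illustration only: it encodes the PIKK site as a serine or threonine followed by glutamine and the CK1 site as a serine or threonine three residues C-terminal to an already phosphorylated residue. The function names are hypothetical, and real kinase specificity depends on more than these minimal patterns.

```python
import re

TAD1_START = 14
TAD1 = "LSQETFSDLWKLLPENNV"  # p53 residues 14-31 as annotated in Figure 11A


def pikk_sites(seq: str, start: int) -> list[int]:
    """Residue numbers of S/T immediately followed by Q (the [ST]Q pattern)."""
    return [start + m.start() for m in re.finditer(r"[ST](?=Q)", seq)]


def ck1_primed_targets(seq: str, start: int, primed: list[int]) -> list[int]:
    """Residue numbers of S/T located three residues after a primed site."""
    targets = []
    for pos in primed:
        idx = pos - start + 3  # index of the candidate CK1 target in seq
        if 0 <= idx < len(seq) and seq[idx] in "ST":
            targets.append(start + idx)
    return targets


if __name__ == "__main__":
    primed = pikk_sites(TAD1, TAD1_START)
    print("PIKK [ST]Q sites:", primed)
    print("CK1 targets after priming:", ck1_primed_targets(TAD1, TAD1_START, primed))
```

On this fragment the two patterns recover S15 as the PIKK site and T18 as the primed CK1 target, matching the order of events given in the text.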

Activated p53 also tetramerizes using an oligomerization interface that occludes the NES and blocks nuclear export by inhibiting NES binding by the nuclear export machinery, thereby linking activation to nuclear retention.192 The MDM2-binding motif of p53 overlaps with the TAD1 region that mediates binding to the TAZ domains of CBP/p300 transcriptional coactivators.370 In addition to inhibiting binding to MDM2, phosphorylation of T18 also results in a 2-fold increase in affinity of p53 for these coactivators, thus strongly switching the specificity of this region from MDM2 to CBP/p300 and linking stabilization with activation.128 Further phosphorylation of additional residues in the TAD2 region of p53 additively and gradually increases the affinity of p53 for CBP/p300.56,384 This includes phosphorylation of S37 by the glutamine-directed DNA-PK (DNA-dependent protein kinase), a PIKK family kinase, which creates a primed site recognized by the phospho-dependent kinase GSK3B for further phosphorylation of the S33 residue.385 This gradual increase of affinity for CBP/p300 allows p53 to compete more efficiently with other transcription factors for the limited amount of CBP/p300 expressed in cells as additional sites are phosphorylated in response to prolonged or enhanced stress-induced signaling.56,386 As a result, p53 will sequester CBP/p300 as more and more multiphosphorylated p53 accumulates, tipping the balance of the competition with HIF1A for these essential coactivators in favor of p53, thereby downregulating HIF1A transactivation.379 If these hypoxic conditions endure, further p53 accumulation can promote HIF1A degradation, possibly via formation of a HIF1A-MDM2-p53 complex, resulting in VHL-independent ubiquitylation and degradation of HIF1A.386 6.2.4. Summary of SLiMs in Combinatorial Signal Integration in Hypoxia. 
The transcriptional response to hypoxia is a highly regulated system, where multiple decisions based on SLiM-mediated integration of cell state information allow the cell to decide whether to adapt to hypoxic conditions or, if the stress situation endures, to target the cell for apoptosis. Extensive cooperative use of modification sites and regulatory ligand motifs across multiple proteins underlies the ability of the pathway to dynamically respond to fluctuations in cellular oxygen levels. These SLiMs modulate the competition between multiple transcription factors for the limited amount of available CBP/p300 by two synergistic mechanisms: (i) by controlling the stability and thereby local abundance of HIF1A and p53 and (ii) by modulating the affinity of HIF1A and p53 for CBP/p300 through modification of their CBP/p300-binding modules. As such, the hypoxia response pathway illustrates the true potential of the combinatorial use of molecular motif-based switching mechanisms to accomplish specific, integrative, context-dependent, and robust molecular decision making.
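The two mechanisms summarized above, changing abundance and changing affinity, can be illustrated with a simple equilibrium competition model. The sketch below is a deliberately crude approximation and not a model from the cited literature: it assumes a single shared binding site on a limiting coactivator, competitors in excess so that free and total competitor concentrations are about equal, and arbitrary concentration and Kd units. All numerical values are hypothetical.

```python
def coactivator_shares(concentrations: dict[str, float],
                       kds: dict[str, float]) -> dict[str, float]:
    """Fraction of a limiting coactivator bound by each competitor.

    For mutually exclusive binding at one site, the occupied fraction for
    competitor i is (c_i / Kd_i) / (1 + sum_j c_j / Kd_j).
    """
    weights = {name: concentrations[name] / kds[name] for name in concentrations}
    denom = 1.0 + sum(weights.values())
    return {name: weight / denom for name, weight in weights.items()}


if __name__ == "__main__":
    # Hypothetical values in arbitrary units, chosen only for illustration.
    baseline = coactivator_shares({"HIF1A": 1.0, "p53": 1.0},
                                  {"HIF1A": 1.0, "p53": 1.0})
    # A 2-fold affinity gain for p53 (Kd halved), as reported for T18 phosphorylation.
    phospho = coactivator_shares({"HIF1A": 1.0, "p53": 1.0},
                                 {"HIF1A": 1.0, "p53": 0.5})
    # The same affinity gain combined with a hypothetical 6-fold rise in p53 level.
    stabilized = coactivator_shares({"HIF1A": 1.0, "p53": 6.0},
                                    {"HIF1A": 1.0, "p53": 0.5})
    for label, shares in (("baseline", baseline), ("phospho", phospho),
                          ("stabilized", stabilized)):
        print(label, {k: round(v, 3) for k, v in shares.items()})
```

In this toy setting, a modest affinity gain and an increase in abundance each shift coactivator occupancy toward p53, and together they shift it further, which is the qualitative behavior the section describes.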

7. SLIMS AND DISEASE

SLiMs play a central role in the regulation of many important cellular processes. Accordingly, when SLiMs function aberrantly, the resulting deregulation can have disastrous consequences for the cell. In the following section, we will briefly review the two major ways in which SLiM deregulation can cause disease: (i) pathogenic mimicry of SLiMs in infectious diseases (see section 7.1) and (ii) mutation of SLiMs in cancer and Mendelian diseases (see section 7.2). Finally, we will introduce examples of therapeutics that target SLiM-mediated interactions (see section 7.3).


Table 10. Representative Examples of Manipulation of SLiM-Mediated Interactions by Exogenous Biomolecules to Modulate Cellular Processes

exogenous biomolecule | endogenous binding domain/target | function | motif sequence (a)

viruses
human immunodeficiency virus gag protein | tumor susceptibility gene 101 protein (TSG101) UEV domain | recruitment of cellular abscission machinery to complete viral budding411 | 455PTAP458
human immunodeficiency virus protein Vpu | F-box/WD repeat-containing protein 1A (BTRC) WD40 repeat | recruitment of CD4 to SCFbTrCP ubiquitin ligase to promote CD4 degradation390 | 51DpSGNEpS56
human immunodeficiency virus protein Rev | importin and exportin subunits | recruitment of cellular machinery for nuclear import and export391,392 | 35RQARRNRRRRWRERQRR51 (NLS) and 74QLPPLERLTLDCSE87 (NES)
human papillomavirus protein E7 | retinoblastoma-associated protein (RB1) | deregulation of S phase entry checkpoint210 | 22LYCYE26
Simian immunodeficiency virus envelope glycoprotein gp160 | AP-2 complex subunit mu (AP-2 μ) | recruitment of cellular endocytic machinery412 | 723YRPV726

bacteria
E. coli O157:H7 EspF(U) | neural Wiskott−Aldrich syndrome protein (WASL) GTPase-binding domain | activation of WASL and potent stimulation of actin polymerization396 | 224VAQRLMQHL232
E. coli O127:H6 Tir (b) | cytoplasmic protein NCK1 (NCK1) SH2 domain | formation of actin pedestals397 | 452NPpYAEV457
H. pylori cagA (c) | tyrosine-protein phosphatase non-receptor type 11 (SHP2) SH2 domain | activation of SHP2 phosphatase activity398 | 969EPIpYATI975
H. pylori cagA (c) | serine/threonine-protein kinase MARK2 (MARK2) kinase domain | inhibition of MAP family protein serine/threonine kinases400 | 486LDREKNVT493
P. aeruginosa exoenzyme S | 14-3-3 proteins | induction of cell death through 14-3-3-dependent modification of RAP1 and RAS104 | 422LLDALDLAS430
B. anthracis lethal factor | dual specificity mitogen-activated protein kinase kinase 3 (MAP2K3) | inhibition of MAPK phosphorylation by cleavage of MAPKK substrate docking sites413 | 20KRKKDLRI27

fungi
P. amygdali Fusicoccin | 14-3-3 proteins | stabilization of the SLiM-mediated interaction between the plant plasma membrane H+-ATPase and 14-3-3404 | Fusicoccin (d)

protozoan parasites
T. gondii dense granule protein 24 | mitogen-activated protein kinase 14 (MAPK14) | activation of MAPK14 kinase activity401 | 444RRGVSELPPLYI455 and 523RRGVSELPPLRI534
P. falciparum multiple | | tagging of about 200 proteins for export to host erythrocytes414 | RxLxE

vertebrates
D. jamesoni kaimosae mambin | integrin receptors | blocking of platelet aggregation activation and integrin-dependent cell adhesion415 | 43RGD45

(a) An extended list of viral motifs was published previously.16 (b) Translocated intimin receptor. (c) Cytotoxicity-associated immunodominant antigen. (d) Not a motif but an organic compound.

7.1. SLiMs in Infectious Diseases

7.1.1. Host SLiM Mimicry by Pathogens. The genomes of pathogenic bacteria, fungi, and viruses encode a multitude of diverse proteins that utilize the cellular processes of their hosts. These proteins exploit features of host physiology using a range of interfaces, including novel pathogen-specific interfaces, stolen repurposed genes, and convergently evolved SLiMs.387,388 As discussed in previous sections, SLiMs are, compared to globular domains, easy to evolve de novo and allow access to a diverse set of important cellular pathways. As such, they are a vulnerability of the cell, as mimicry of host SLiMs is a simple and an elegant mechanism for pathogens to disrupt or rewire pathways in host cells. Consequently, a wide range of intracellular pathogens mimic SLiMs to target host processes, thereby facilitating their replication and proliferation (Table 10).388 Viruses do not encode the molecular machinery necessary for their replication and must hijack the functions of cellular proteins in infected cells to complete their life cycle. A recent review provides an extensive list of examples of viral SLiM mimicry from the available viral motif literature and revealed that one-third of all known eukaryotic motif classes have an instance of viral mimicry in at least one viral protein.388 Viruses utilize host motifs in a variety of ways: to disrupt cellular interactions (e.g., the pseudosubstrate FBXW7-binding degron in the large T antigen of SV40 (699PPTPPPE705) is used to stabilize cellular targets of the SCFFbw7 ubiquitin ligase complex389); to repurpose host proteins for novel tasks (e.g., the BTRC-binding degron in the HIV accessory protein Vpu scaffolds SCFbTrCP to CD4 and promotes CD4 degradation390); to utilize the regulatory pathways of the cell (e.g., the NES and NLS motifs in HIV Rev allow this protein to utilize the host nuclear transport machinery to shuttle between the nucleus and the cytoplasm391,392); and to stimulate inactivated host pathways (e.g., multiple DNA viruses utilize RB1-binding motifs to promote S phase progression, thereby activating the DNA replication machinery of the cell required for viral genome replication393).


Table 11. Representative Examples of SLiM Mutations Associated with Human Diseases

protein | disease | description | motif sequence (a)

complex assembly motifs
amiloride-sensitive sodium channel subunit beta (SCNN1B)437−439,444 | Liddle syndrome | constitutive channel activation | 616PPPNY620
cyclin-dependent kinase inhibitor 1C (KIP2)445 | intrauterine growth retardation | disruption of PCNA binding | 270PLISDFFAKRKRS282
cytokine receptor common subunit gamma (IL2RG)446 | X-linked combined immunodeficiency | impaired association with JAK kinases | 286TMPRIPTLK294
homeobox protein TGIF1447 | holoprosencephaly type 4 | impaired interaction with carboxyl terminus-binding protein (CtBP) | 153PLDLS157
insulin receptor (INSR)448 | Rabson−Mendenhall syndrome | impaired interaction with IRSs, decreased response to insulin | 993ASSNPEY999
linker for activation of T cells (LAT)449 | lymphoproliferative disorder in mice | loss of binding to PLCgamma1 | 136YLVV139
RAF proto-oncogene serine/threonine-protein kinase (RAF1)74 | Noonan syndrome type 5 | disrupted autoinhibition of RAF1 via 14-3-3 | 256RSTSTP261
THAP domain-containing protein 1 (THAP1)450−452 | Dystonia type 6 | impaired binding to HCF1 | 134DHNY137
Usher syndrome type-1G protein (SANS)453 | Usher syndrome type 1G | strongly reduced affinity for USH1C | 456LEDTEL461

enzyme recruitment motifs
catenin beta-1431,432 | various cancers | impaired catenin proteasomal degradation leads to enhanced oncogenic activity | 32DSGIHS37
hypoxia-inducible factor 2-alpha (HIF-2α)434,454,455 | familial erythrocytosis type 4 | impaired HIF-2α proteasomal degradation; impaired interaction of HIF-2α with both VHL and EGLN1 | 529LAPYIPMDGEDFQL542
NF-kappa-B inhibitor alpha (IκBα)433 | ectodermal dysplasia anhidrotic with T-cell immunodeficiency autosomal dominant | impaired IκBα proteasomal degradation; enhanced inhibitory capacity of IκBα and impaired NFκB activation | 31DSGLDS36
SH3 domain binding protein 2 (SH3BP2)149,456,457 | cherubism | loss of tankyrase mediated destruction of SH3BP2 | 414QRSPPDGQ421

targeting motifs
ceramide kinase-like protein (CERKL)428,429 | retinitis pigmentosa type 26 | impaired nuclear import | 102KLKRR106
low-density lipoprotein receptor (LDLR)458 | familial hypercholesterolemia | disrupted internalization of LDL receptor | 822NFDNPVYQ829
Rhodopsin (RHO)426,459 | retinitis pigmentosa type 4 | disrupted localization to photoreceptor outer segment | 344QVAPA348
zinc finger transcription factor Trps1 (TRPS1)460 | tricho-rhino-phalangeal syndrome type 1 | impaired nuclear localization | 947RRTRKRLN954

modification motifs
androgen receptor (AR)461−465 | testicular cancer | defective sumoylation results in impaired control of the transcription factor | 381PHARIKLENP390 and 516PTCVKSEMG524
ATP-sensitive inward rectifier potassium channel 1 (KCNJ1)442,443 | Bartter syndrome type 2 | impaired PKA phosphorylation results in reduced channel activity | 216LRKSLLI222
fibrillin-1 (FBN1)466 | Marfan syndrome | disruption of profibrillin processing | 2726RGRKRR2731
fibroblast growth factor 23 (FGF23)467 | autosomal dominant hypophosphataemic rickets | resistance to cleavage by furin | 176RHTR179
insulin receptor (INSR)468 | IRAN type A | failure of processing IR proreceptor to mature IR | 759RKRR762
Junctophilin-2 (JPH2)469 | familial hypertrophic cardiomyopathy type 17 | diminished phosphorylation of target protein results in change of Ca2+ signaling in skeletal muscle | 161TSLSSLRS168
microphthalmia-associated transcription factor (MITF)470 | melanoma | defective sumoylation results in enhanced binding to the HIF1A promoter and increased transcriptional activity | 422IKQE425
period circadian protein homologue 2 (PER2)471 | familial advanced sleep-phase syndrome | loss of phosphorylation by casein kinase; decreased stability | 662SVASLTS668
retinal-specific ATP-binding cassette transporter (ABCA4)472 | Stargardt disease type 1 | impaired glycosylation | 97YNNSIL102

(a) Bold residues of the motif are the major affinity- and specificity-determining residues of the SLiM; underlined residues are mutated in disease phenotypes.

Proteins encoded by pathogenic bacteria and fungi have so far not been shown to interact with their host’s proteome to the same extent as viral proteins. However, many bacterial and fungal pathogens inject or translocate virulence proteins, also known as effector proteins, into the host cell. These proteins often mimic host motifs and utilize similar tactics to viral proteins to deregulate, rewire, and activate cellular pathways to create an environment suitable for the pathogen.394,395 The host cell cytoskeleton is a common target of effector proteins, and several distinctive mechanisms utilized by bacteria to

modulate actin dynamics have been characterized. For example, enterohemorrhagic Escherichia coli secretes several virulence factors into cells of the host intestine. Two effector proteins, EspF(U) (secreted effector protein EspF(U)) and Tir (translocated intimin receptor Tir), cooperate to potently stimulate actin polymerization and promote formation of actin pedestals. A repeated region of the EspF(U) effector protein mimics the autoinhibitory motif of WASL (neural Wiskott−Aldrich syndrome protein), inducing the active conformation of WASL.396 In addition, the SH2 domain-binding motif in Tir

dx.doi.org/10.1021/cr400585q | Chem. Rev. 2014, 114, 6733−6778

Chemical Reviews

Review

high-affinity 14-3-3 interaction locks the autoinhibitory regulatory domain of pma2 to the 14-3-3 dimer, activating pma2 and increasing proton pump activity. 7.1.3. Malevolent Use of Endogenous SLiMs. SLiMs are also utilized by free-living pathogens to perform their own intracellular processes. For example, as previously stated, all life uses the PIP-box motif for PCNA binding during DNA replication. Occasionally, these motifs function in pathways with therapeutic relevance. Plasmodium falciparum, the protozoan responsible for malaria, encodes a protein complex that recognizes a short export motif with the sequence RxLxE known as PEXEL (Plasmodium export element). The motif identifies proteins for export from the parasite into the cytoplasm of the erythrocytes of its human host.405 Approximately 200 P. falciparum effector proteins, which extensively remodel the host erythrocytes, are exported in this way. Another therapeutically relevant SLiM is the high-affinity RGD motif of disintegrins found in viper and rattlesnake venom toxins. Members of the disintegrin protein family have repurposed endogenous extracellular proteases from the ADAM (disintegrin and metalloproteinase domain-containing) subfamily to act as potent anticoagulants.406 Disintegrin proteins block platelet aggregation and inhibit integrin-dependent cell adhesion using a high-affinity RGD motif to outcompete cellular targets of the beta-1 and -3 families of integrins.407−410
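Consensus patterns such as RxLxE and RGD, as quoted above, translate directly into regular expressions. The following minimal Python sketch shows how such a pattern could be scanned against a protein sequence. The function name `find_motif` and the demonstration sequence are hypothetical and chosen only for illustration; the patterns are taken literally from the text and are not curated motif definitions. Because such patterns are short, matches are expected to occur by chance, so a hit is a candidate rather than evidence of a functional motif.

```python
import re

# Patterns as quoted in the text ("x" = any residue). The lookahead wrapper
# lets overlapping occurrences be reported as well.
PEXEL = re.compile(r"(?=(R.L.E))")
RGD = re.compile(r"(?=(RGD))")


def find_motif(sequence, pattern):
    """Return (1-based start position, matched peptide) for every match."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical sequence for illustration only; not a real protein.
demo = "MKTSARLLAEGSSRGDKLRYLQENN"

print(find_motif(demo, PEXEL))  # candidate RxLxE matches
print(find_motif(demo, RGD))    # candidate RGD matches
```

Positions are reported with 1-based numbering to match the residue numbering convention used for motif instances in the tables of this review.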

binds the SH2/SH3 adaptor protein NCK1 to indirectly recruit WASL to the E. coli pedestal.397 Thus, using distinct mechanisms, both proteins activate actin nucleation by modulating WASL function. Similarly, the CagA (cytotoxicity-associated immunodominant antigen) protein of Helicobacter pylori, a bacterium associated with stomach ulcers, also induces morphological changes to the infected cell. However, in the case of CagA these changes are facilitated by stimulating phosphatase activity whereby a phosphotyrosine-containing motif in CagA specifically binds the SH2 domain of SHP2 (tyrosine-protein phosphatase non-receptor type 11) in a phosphorylation-dependent manner and activates MAPK signaling.398,399 CagA also targets signaling pathways by negatively modulating host kinase activity. CagA contains a mimic of a MARK (MAP/microtubule affinity-regulating kinase) family substrate docking motif that acts as a pseudosubstrate inhibitor and allows CagA to block phosphorylation of cellular MARK2 (serine/threonine-protein kinase MARK2) targets.400 Conversely, GRA24 (dense granule protein 24) of Toxoplasma gondii activates host kinases utilizing two high-affinity MAPK docking motifs that bind MAPK14 (mitogen-activated protein kinase 14) to activate the kinase.401 GRA24 is secreted from the parasitophorous vacuole to the host cell nucleus upon host invasion and dimerizes MAPK14 to promote trans-autophosphorylation, activation, and nuclear relocalization. Activated MAPK14 triggers expression of genes that encode growth factors and cytokines/chemokines that work in favor of parasite replication. 7.1.2. SLiM Modulation by Pathogens. Pathogens modulate SLiM-regulated systems in a number of ways that do not involve direct molecular mimicry of a host SLiM. Examples exist of pathogenic recognition of SLiM-binding pockets utilizing novel binding modes.
For example, a peptide from the ExoS (Exoenzyme S) toxin, an ADP-ribosyltransferase, of the bacterium Pseudomonas aeruginosa occupies the binding groove of human 14-3-3 with a noncanonical unphosphorylated peptide, binding in the reverse orientation to the classical phosphorylated peptides. This motif-mediated interaction allows ExoS to specifically ribosylate host GTP-binding Ras and Ras-related proteins in a 14-3-3-dependent manner,104 inducing cell death by apoptosis. Bacillus anthracis has also evolved an innovative mechanism to inhibit intracellular signaling pathways. The Bacillus-produced anthrax toxin LF (lethal factor), a protease, cleaves the N-terminal MAPK docking motif of several MAPK kinases (MAPKK).402 The catalytic domain of LF appears to have evolved specificity to target MAPK docking motifs, allowing B. anthracis to inhibit signaling through the MAPK pathways by disrupting MAPKK binding to MAPK and thereby block MAPKK phosphorylation of MAPK. Several pathogens are known to produce small compounds that inhibit motif-mediated interactions. For example, Staurosporine, a small compound produced by the bacterium Streptomyces staurosporeus, is a promiscuous kinase inhibitor that competes with ATP for the ATP-binding site and thereby blocks modification of phosphorylation motifs.403 The plant pathogen Phomopsis amygdali also produces a small toxic compound, known as the Fusicoccin toxin; however, it is exceptional in that it stabilizes a physiological interaction between two cellular proteins. The toxin increases the affinity of the Nicotiana tabacum 14-3-3-binding peptide (953QSYPTV957) in tobacco proton pump pma2 (plasma membrane H+ ATPase) for 14-3-3 by 90-fold despite associating weakly in the absence of the 14-3-3-binding peptide.404 The stabilized

7.2. SLiMs in Mendelian Diseases and Cancer

The compact and low-affinity interfaces of SLiMs are susceptible to mutations on or near their key affinity- and specificity-determining residues. Knocking out the function of a single motif can have a wide range of phenotypic severity. Some motif mutants are prenatally lethal,416 while conversely, as many motifs act cooperatively, a single motif can be knocked out without an appreciable phenotype.417 Although the majority of experimentally annotated disease-related missense mutations (around 78%) occur in globular protein regions, where they can cause unfolding of the protein, a significant proportion (around 22%) are found within intrinsically disordered regions.418,419 To date, several mutations in SLiM interfaces have been associated with disease phenotypes (Table 11), a representative selection of which are discussed in the following sections. 7.2.1. Mutated SLiMs Resulting in Aberrant Local Abundance. Deregulation of intrinsically disordered proteins is associated with several human diseases.13,420,421 Normally, IDP abundance is strictly controlled by pre- and post-translational mechanisms, exhibiting both increased degradation and decreased translation rates when compared to structured proteins.117,118,422 Intrinsically disordered proteins require tight regulation, both spatially and temporally, as aberrant local abundance or half-life of many motif-containing proteins promotes nonspecific, nonphysiological interactions that can result in deregulation of the newly targeted proteins or rewiring of the source pathways.122 Consequently, mutations that increase the abundance of a protein in the incorrect cellular compartment or at the incorrect time can have severe consequences. Several disease-related motif mutations result in mislocalization of a protein (Table 11).
For example, a ciliary targeting motif (QVxPx−COOH) in RHO (Rhodopsin) mediates vesicular transport to the cilia of eukaryotic cells.423,424 Mutations within this motif lead to defective trafficking of


Table 12. Representative Examples of Drugs and Inhibitors Targeting SLiM-Mediated Interactions

| inhibitor name | ref | disease | inhibited motif binding site/targeted motif-containing protein(s) |
|---|---|---|---|
| 3-hydroxy-methylindole | 487 | - | PDZ domain of MAGI3/phosphatidylinositol 3,4,5-trisphosphate 3-phosphatase and dual-specificity protein phosphatase PTEN (PTEN) |
| BV02 | 488 | chronic myeloid leukemia | 14-3-3/tyrosine-protein kinase ABL1 (ABL1) |
| Cilengitide | 485 | glioblastoma^a | integrin, beta chain, FG-GAP repeat of integrin/vitronectin (VTN); fibronectin (FN1) |
| compound 2c (UCS15A analogue) | 489 | - | SH3 domain of GRB2, SRC, and PLCG1/KH domain-containing, RNA-binding, signal transduction-associated protein 1 (SAM68); Son of sevenless homologue 1 (SOS1) |
| erioflorin | 490 | cancer | WD40 domain of βTrCP1/programmed cell death protein 4 (PDCD4) |
| FJ9 | 491 | cancer (potentially) | PDZ domain of Dishevelled (DSH)/Frizzled-7 (FZD7) |
| hydroxyproline analogue compound 15 | 492 | chronic anemia, acute ischemia, stroke | ODPH degron binding domain of VHL/hypoxia-inducible factor 1-alpha (HIF1A) |
| Imatinib (marketed as Gleevec) | 493 | chronic myelogenous leukemia (CML)^b | ABL kinase active site/- |
| Lacosamide | 494 | epilepsy^b | 14-3-3/ |
| nonhydrolyzable difluoromethylenephosphoserine | 495 | leukemia | 14-3-3/Forkhead box protein O3 (FOXO3) |
| Nutlin-3 | 484 | cancer^c | MATH domain of MDM2/cellular tumor antigen p53 (TP53) |
| Peptidimer-c | 496 | cancer | SH3 domain of GRB2/Son of sevenless homologue 1 (SOS1) |
| steroid receptor coactivator peptide mimetic compounds | 497 | various diseases (potentially) | NRBOX binding pocket of thyroid receptor beta, estrogen receptor alpha and beta/nuclear receptor coactivator 2 (NCOA2) |
| UCS15A | 498 | osteoporosis | SH3 domain of GRB2, SRC and PLCG1/KH domain-containing, RNA-binding, signal transduction-associated protein 1 (SAM68); Son of sevenless homologue 1 (SOS1); GRB2-associated-binding protein 1 (GAB1); tight junction protein ZO-1 (ZO1); tyrosine-protein kinase Fyn (FYN) |

^a Drug, phase-III clinical trials. ^b Drug, approved. ^c Drug, phase-II clinical trials.

HIF2A in cells with sufficient levels of oxygen. Several mutations within the motif abrogate recognition of the degron by VHL, resulting in impaired degradation and aberrant stabilization of HIF2A. Consequently, mutant HIF2A is not degraded under normoxic conditions and can activate its transcriptional targets. The resulting stimulation of the expression of EPO (Erythropoietin), a key regulator of red blood cell production, by stable HIF2A mutants results in the blood disease polycythemia.434 Degrons may also function independently of modifications. For instance, a PTM-independent degron of SCNN1B (amiloride-sensitive sodium channel subunit beta) is important for regulation of its stability by the E3 ubiquitin ligase NEDD4. The WW domain-binding motif of SCNN1B recruits NEDD4, and the resulting ubiquitylation targets these sodium channel subunits for proteasomal degradation.435 Mutations in the WW domain-binding motif of SCNN1B abrogate its binding to the WW domain of NEDD4,436 thereby extending the half-life of SCNN1B and increasing sodium channel activity by greater than 3-fold, ultimately causing increased sodium reabsorption in the distal nephron.437−439 This phenotype is associated with Liddle’s syndrome, a hereditary salt-sensitive hypertension disease.439 7.2.2. Mutated SLiMs Resulting in Defective Activity. Several characterized SLiM-associated diseases are the result of deregulation of a protein, causing protein over- or underactivity (Table 11). For instance, mutations in a 14-3-3-binding motif in RAF1 abrogate an inhibitory SLiM-mediated interaction of RAF1 with 14-3-3 and give rise to an overactive RAF1 mutant.
These mutations lead to increased RAF1-mediated phosphorylation and are associated with Noonan syndrome.74 Similarly, mutations within the synergy control (SC) motifs of AR (androgen receptor) enhance transcription of androgen-sensitive genes by reducing inhibitory sumoylation.440 The overactivity of nonsumoylated AR has been implicated in several male-specific diseases such as prostate and testicular cancer.441 By contrast, motif mutations can also generate

Rhodopsin to the retinal outer segment, resulting in autosomal dominant retinitis pigmentosa (ADRP).425−427 Another example of defective subcellular localization due to motif loss has been observed for CERKL (ceramide kinase-like protein), which functions in the nucleus and requires an intact NLS for nuclear entry. Mutations resulting in a nonfunctional NLS (R105 → A and R106 → S) cause defective localization of the protein and are reportedly causal for autosomal recessive retinal degeneration.428,429 Mislocalization of proteins can also occur via erroneous post-translational modifications. SHOC2 (leucine-rich repeat protein SHOC-2), a key protein in the activation of the ERK-MAPK pathway, presents an interesting case as, contrary to most motif-related disease examples, instead of “knocking out” a functional motif a single mutation “knocks in” an N-myristoylation motif (1MGSSLGK7). The resulting lipidation of SHOC2 causes relocalization to the intracellular membrane instead of the nucleus (upon EGF stimulation) and drives aberrant signal transduction.430 Anomalously N-myristoylated SHOC2 is associated with Noonan-like syndrome, a congenital disorder that results in dwarfism. Several diseases result from SLiM-associated mutations that lead to stabilization of a protein and thereby increase protein abundance (Table 11). For example, stabilizing mutations in a BTRC-binding degron in β-Catenin result in β-Catenin accumulation and constitutive activation of the Wnt signaling pathway and are associated with several types of cancer.431,432 Similarly, IKBA (NF-kappa-B inhibitor alpha), an inhibitor of NF-κB activation, also contains a degron motif recognized by the BTRC subunit of the SCF ubiquitin ligase complex.
An S32 → I mutation in this motif blocks the phosphorylation-dependent degradation of IKBA, and the resulting loss of IKBA-mediated NF-κB inhibition is associated with autosomal dominant anhidrotic ectodermal dysplasia and T cell immunodeficiency.433 Similarly, the prolyl hydroxylation-dependent degron in HIF2A (hypoxia-inducible factor 2-alpha) is necessary for the VHL-mediated degradation of


underactive proteins. For instance, the S219 → R mutation in a PKA-dependent phosphorylation site of KCNJ1 (ATP-sensitive inward rectifier potassium channel 1) decreases channel activity by 50%, thus inhibiting the ability of cells to reabsorb salt, a phenotype linked to Bartter’s syndrome.442,443 Consequently, patients with this genetic kidney disorder suffer from low potassium levels.
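The way a single substitution can knock out a motif, as described in section 7.2 for the S32 → I mutation in the IKBA degron, can be illustrated with a minimal Python sketch. The peptide 31DSGLDS36 is the motif instance listed in Table 11. The regular expression `DSG..S` is a deliberately simplified consensus assumed here for illustration, and the helper `apply_substitution` is hypothetical. The sketch checks sequence only; it does not model the phosphorylation that the degron requires for recognition.

```python
import re

# Simplified degron consensus D-S-G-x-x-S: an assumption for illustration,
# not a curated motif definition.
DEGRON = re.compile(r"DSG..S")


def apply_substitution(peptide, first_residue_number, position, original, replacement):
    """Substitute one residue, addressing it by its number in the full protein."""
    index = position - first_residue_number
    if not 0 <= index < len(peptide):
        raise ValueError(f"position {position} lies outside the peptide")
    if peptide[index] != original:
        raise ValueError(f"expected {original} at {position}, found {peptide[index]}")
    return peptide[:index] + replacement + peptide[index + 1:]


wild_type = "DSGLDS"  # residues 31-36, as listed in Table 11
mutant = apply_substitution(wild_type, 31, 32, "S", "I")  # S32 -> I

for label, peptide in (("wild type", wild_type), ("S32I", mutant)):
    matches = DEGRON.fullmatch(peptide) is not None
    print(f"{label}: {peptide} matches consensus: {matches}")
```

Checking the expected wild-type residue before substituting guards against numbering mistakes, which are easy to make when a mutation is reported against a different isoform or numbering scheme.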

functionality in a protein sequence as they are short, easy to modulate, and evolutionarily plastic. Cooperativity between multiple low-affinity motif-mediated interactions underlies the context-dependent and dynamic assembly of large metastable regulatory complexes, thereby allowing integrative, reliable, and robust signaling in a switch-like manner. As such, many motif-mediated interactions are vitally important for diverse cellular processes encompassing the entire functional repertoire of the cell. By discussing numerous examples, this review emphasized the importance, ubiquity, and versatility of SLiMs to the wider biological community. However, despite the accumulating literature on SLiM function and the continuous characterization of novel motifs and motif instances, many more motif activities, classes, and instances are yet to be discovered. Further elucidation of motif function will very likely prove relevant and beneficial, given the involvement of SLiMs in disease and the potential of motif-mediated interactions as valid targets for drug development.

7.3. Therapeutic Potential of SLiMs

The central role of SLiMs in cell regulation has underscored their potential as desirable drug targets, as has the expanding understanding of their deregulation in viral infection, cancer, and Mendelian disease (Table 12). Kinase inhibitors are by far the most populated and successful class of therapeutics targeting SLiM-mediated interactions. Numerous small molecules that inhibit the ability of kinases to recognize and modify PTM motifs in their substrates are already being used or being tested in clinical trials.473,474 For example, Imatinib, an inhibitor of the oncogenic tyrosine-kinase BCR-ABL gene fusion, has been successfully used to target several cancers including chronic myelogenous leukemia (CML).475 This inhibitor only indirectly interferes with SLiM function as it targets the ATP-binding site on the kinase domain to prevent substrate phosphorylation.476 However, other efforts in drug development aim at directly affecting SLiM-mediated interaction interfaces. In the past decade, several small molecules and designed inhibitory peptides mimicking ligand SLiMs have shown promise as drugs,40,215,477,478 and a handful of SLiM-competing compounds have entered clinical trials.479−483 The two most promising SLiM-mimicking small molecules to date are Nutlin and Cilengitide. Nutlin mimics the helical conformation of the MDM2-binding peptide in the N terminus of p53, thereby blocking the inhibitory interaction with MDM2 and subsequently promoting p53 activity. In cells with wild-type p53, Nutlin can induce cell cycle arrest, promote apoptosis, and inhibit growth.
The anticancer properties of the compound were demonstrated in human tumor xenografts in nude mice,484 and Nutlin analogues have entered clinical trials for various human cancers including retinoblastoma479 and liposarcoma.480 Cilengitide is the product of a ligand-oriented design approach to find novel inhibitors of the RGD−integrin interaction and has been shown to be a potent inhibitor of the angiogenesis- and metastasis-associated αvβ1, αvβ3, and αvβ5 integrin complexes. This cyclic RGD pentapeptide inhibitor was also shown to induce apoptosis in glioma cells485 and has progressed to phase III clinical trials for treatment of glioblastomas483 as well as to advanced stages of development for several other tumors.481,482,486

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

Notes

The authors declare no competing financial interest.

Biographies

Kim Van Roey studied biochemistry and bioinformatics at the Catholic University of Leuven (KUL), Belgium. He received his Ph.D. degree from the Laboratory of Molecular Cell Biology (MCB) at the KUL and the VIB (Flanders Institute for Biotechnology) Department of Molecular Microbiology for working on characterization of mammalian genes that were isolated using a yeast-based screening system and potentially involved in glucose sensing in pancreatic beta cells. Currently, he works as a computational biologist in the group of Toby Gibson at the European Molecular Biology Laboratory (EMBL) in Heidelberg, Germany, where his main focus is integration of cooperative interactions in bioinformatics resources. Part of this work involves exploring the role of short linear motifs in cooperative complex assembly and the mechanisms that regulate motif-mediated interactions.

8. CONCLUDING COMMENTS

Spatial, temporal, and context-dependent control of the numerous molecules in a cell is critical for accurate processing of the barrage of information a cell is constantly receiving. An important question in the signaling field is how does a protein find its correct binding partners in this crowded environment while also preventing spurious and potentially damaging interactions? Evolution has developed a number of elegant solutions to this problem, including sequestering signaling components in specific subcellular compartments, assembling proteins into specific functional complexes, dynamically wiring protein interaction networks by means of PTMs, and tightly controlling the stability and local abundance of proteins. Motifs are the ideal protein interaction modules to encode this


Bora Uyar did his undergraduate studies in the Biological Sciences and Bioengineering program of Sabanci University in Istanbul, Turkey, where he worked with Ugur Sezerman on protein structure prediction and structural motif finding. Later, he was accepted to the CIHR/MSFHR Bioinformatics Graduate Training Program in Vancouver, Canada. There he worked in various laboratories on various projects including transcriptome analysis (with Inanc Birol and Cenk Sahinalp), microRNA target gene prediction (with Marco Marra), and comparative genomics analysis in Caenorhabditis species (with Nansheng Chen in the Department of Molecular Biology, Simon Fraser University). After defending his M.Sc. thesis he moved to Heidelberg, Germany, to start his doctoral studies in the European Molecular Biology Laboratory. Since then, he has been working with Toby Gibson in the Structural & Computational Biology Unit. His main project is the analysis of short linear motifs and their association to human diseases.

Holger Dinkel graduated in biology and received his Ph.D. degree in Bioinformatics from the Friedrich-Alexander-University in Erlangen. He has been studying short linear motifs since 2005 and is currently working in the group of Toby Gibson at the European Molecular Biology Laboratory in Heidelberg. Here, he is maintaining ELM, the Eukaryotic Linear Motif resource, which is one of the most comprehensive databases dedicated to the annotation of short linear motif classes and instances. This resource was established in 2000 and continues to be a valuable resource for bench biologists and bioinformaticians alike.

Markus Seiler received his Ph.D. degree from the Molecular Genome Analysis Division of the German Cancer Center (DKFZ) in Heidelberg, Germany, for his work on fuzzy pattern analysis in protein sequences, during which he received a Klaus-Tschira fellowship for bioinformatics studies. Currently, he holds a postdoctoral fellowship in the lab of Toby Gibson in the European Molecular Biology Laboratory (EMBL) in Heidelberg, where he is involved in the analysis of regulatory protein−protein interactions in stem cells. His further interests include regulatory fine tuning of protein function by translational control mechanisms (analysis of small RNAs) and subcellular targeting (analysis of nuclear localization signals).

Robert Weatheritt is a computational biologist who holds a joint position as a postdoctoral researcher at the MRC Laboratory of Molecular Biology in Cambridge and at the Donnelly Centre in the University of Toronto. He was awarded his Ph.D. degree by the European Molecular Biology Laboratory (EMBL) in Heidelberg for research undertaken in the group of Toby Gibson investigating the role of linear motifs in cell regulation.


Norman Davey received his Ph.D. degree (2009) from the Conway Institute of Biomolecular & Biomedical Research at University College Dublin, Dublin, Ireland, working on short, linear motif discovery methods. He subsequently moved to the European Molecular Biology Laboratory (EMBL), Heidelberg, Germany, as an EIPOD postdoctoral fellow to work on various aspects of motif biology including the prominent role of SLiMs in regulatory decision making, splice isoform-specific functionality, and viral pathogenesis. In 2013, he joined the Department of Physiology at the University of California, San Francisco (UCSF) as a postdoctoral fellow with Professor David O. Morgan characterizing novel motifs in the cell cycle. His research focuses on the role of SLiMs within intrinsically disordered regions in directing cell regulation. He has authored over 20 papers on various aspects of SLiM biology. He continues to utilize evolutionary, proteomic, and genomic data to examine two major open questions about intrinsically disordered regions: (i) what are the modules that are responsible for their functionality and (ii) how do perturbations in the cell modulate the functionality of these modules.

Aidan Budd studied natural sciences at Cambridge University in the United Kingdom, specializing in Zoology. Following his degree, he worked as a research assistant at the Gurdon Institute, Cambridge, UK, for a year, in the lab of Ron Laskey, on biochemical analysis of DNA replication. He followed this by studying for his Ph.D. in the lab of Toby Gibson at EMBL in Heidelberg, Germany, working on protein sequence analysis and phylogenetics of genes with similar histories. His current work focuses on training and community building in bioinformatics, focused on protein sequence analysis and molecular phylogenetics. He manages a bioinformatics community at EMBL and is a founding member of the Heidelberg Unseminars in Bioinformatics community. He organizes and teaches in several practical bioinformatics courses each year.

ACKNOWLEDGMENTS

This work was supported by an FP7 Health grant (no. 242129; SyBoSS) from the European Commission (to K.V.R. and T.J.G.), a Medical Research Council (U105185859) and ERASysBio+ project (GRAPPLE) and a Canadian Institutes of Health Research (CIHR) Postdoctoral Fellowship (to R.J.W.), the NGFN DiGtoP grant (to M.S.), and an EMBL International PhD Programme fellowship (to B.U.). Where possible we have attempted to direct readers to additional reviews on the various motif topics and as such much of the primary literature was not referenced directly. We apologize to our colleagues whose work was not cited due to these constraints.

Toby Gibson is at the European Molecular Biology Laboratory (EMBL) in Heidelberg, Germany. He studied Biology at Edinburgh University and did his Ph.D. studies with Bart Barrell at the LMB, Cambridge. He is a computational biologist, i.e., a biologist who finds computers to be useful adjuncts to biological research. He has collaborated with Des Higgins (UCD) on development of the widely used Clustal series of multiple sequence alignment software. He oversees development of ELM, the Eukaryotic Linear Motif resource (http://elm.eu.org/), devoted to protein sequence motifs involved in cell signaling and regulation. He is currently fascinated by the developing structure−function paradigm for the massively interacting hub proteins such as P53, IRS-1, sundry AKAPs, and many, many more. These are characterized by large natively unstructured protein segments that are repositories of abundant “linear motifs”: short regulatory sites that interact with other proteins.

REFERENCES

(1) Kersse, K.; Verspurten, J.; Vanden Berghe, T.; Vandenabeele, P. Trends Biochem. Sci. 2011, 36, 541. (2) Havugimana, P. C.; Hart, G. T.; Nepusz, T.; Yang, H.; Turinsky, A. L.; Li, Z.; Wang, P. I.; Boutz, D. R.; Fong, V.; Phanse, S.; Babu, M.; Craig, S. A.; Hu, P.; Wan, C.; Vlasblom, J.; Dar, V. U.; Bezginov, A.; Clark, G. W.; Wu, G. C.; Wodak, S. J.; Tillier, E. R.; Paccanaro, A.; Marcotte, E. M.; Emili, A. Cell 2012, 150, 1068. (3) Dunker, A. K.; Lawson, J. D.; Brown, C. J.; Williams, R. M.; Romero, P.; Oh, J. S.; Oldfield, C. J.; Campen, A. M.; Ratliff, C. M.; Hipps, K. W.; Ausio, J.; Nissen, M. S.; Reeves, R.; Kang, C.; Kissinger, C. R.; Bailey, R. W.; Griswold, M. D.; Chiu, W.; Garner, E. C.; Obradovic, Z. J. Mol. Graphics Modell. 2001, 19, 26. (4) Wright, P. E.; Dyson, H. J. J. Mol. Biol. 1999, 293, 321. (5) Uversky, V. N.; Gillespie, J. R.; Fink, A. L. Proteins 2000, 41, 415. (6) Tompa, P. Trends Biochem. Sci. 2002, 27, 527.


(7) Dyson, H. J.; Wright, P. E. Nat. Rev. Mol. Cell Biol. 2005, 6, 197. (8) Xie, H.; Vucetic, S.; Iakoucheva, L. M.; Oldfield, C. J.; Dunker, A. K.; Uversky, V. N.; Obradovic, Z. J. Proteome Res. 2007, 6, 1882. (9) Galea, C. A.; Wang, Y.; Sivakolundu, S. G.; Kriwacki, R. W. Biochemistry 2008, 47, 7598. (10) Gsponer, J.; Babu, M. M. Prog. Biophys. Mol. Biol. 2009, 99, 94. (11) Gibson, T. J. Trends Biochem. Sci. 2009, 34, 471. (12) Tompa, P. Curr. Opin. Struct. Biol. 2011, 21, 419. (13) Babu, M. M.; van der Lee, R.; de Groot, N. S.; Gsponer, J. Curr. Opin. Struct. Biol. 2011, 21, 432. (14) Tompa, P. Trends Biochem. Sci. 2012, 37, 509. (15) Dinkel, H.; Van Roey, K.; Michael, S.; Davey, N. E.; Weatheritt, R. J.; Born, D.; Speck, T.; Kruger, D.; Grebnev, G.; Kuban, M.; Strumillo, M.; Uyar, B.; Budd, A.; Altenberg, B.; Seiler, M.; Chemes, L. B.; Glavina, J.; Sanchez, I. E.; Diella, F.; Gibson, T. J. Nucleic Acids Res. 2014, 42, D259. (16) Davey, N. E.; Van Roey, K.; Weatheritt, R. J.; Toedt, G.; Uyar, B.; Altenberg, B.; Budd, A.; Diella, F.; Dinkel, H.; Gibson, T. J. Mol. Biosyst. 2012, 8, 268. (17) Mohan, A.; Oldfield, C. J.; Radivojac, P.; Vacic, V.; Cortese, M. S.; Dunker, A. K.; Uversky, V. N. J. Mol. Biol. 2006, 362, 1043. (18) Mi, T.; Merlin, J. C.; Deverasetty, S.; Gryk, M. R.; Bill, T. J.; Brooks, A. W.; Lee, L. Y.; Rathnayake, V.; Ross, C. A.; Sargeant, D. P.; Strong, C. L.; Watts, P.; Rajasekaran, S.; Schiller, M. R. Nucleic Acids Res. 2012, 40, D252. (19) Van Roey, K.; Dinkel, H.; Weatheritt, R. J.; Gibson, T. J.; Davey, N. E. Sci. Signal 2013, 6, rs7. (20) Beltrao, P.; Albanese, V.; Kenner, L. R.; Swaney, D. L.; Burlingame, A.; Villen, J.; Lim, W. A.; Fraser, J. S.; Frydman, J.; Krogan, N. J. Cell 2012, 150, 413. (21) Dinkel, H.; Chica, C.; Via, A.; Gould, C. M.; Jensen, L. J.; Gibson, T. J.; Diella, F. Nucleic Acids Res. 2011, 39, D261. (22) Hornbeck, P. V.; Kornhauser, J. M.; Tkachev, S.; Zhang, B.; Skrzypek, E.; Murray, B.; Latham, V.; Sullivan, M. 
Nucleic Acids Res. 2012, 40, D261. (23) Zhao, Y.; Jensen, O. N. Proteomics 2009, 9, 4632. (24) Stein, A.; Ceol, A.; Aloy, P. Nucleic Acids Res. 2011, 39, D718. (25) Schuster-Bockler, B.; Bateman, A. BMC Bioinformatics 2007, 8, 259. (26) Liddington, R. C. Methods Mol. Biol. 2004, 261, 3. (27) Uversky, V. N. Chem. Soc. Rev. 2011, 40, 1623. (28) Tompa, P.; Fuxreiter, M.; Oldfield, C. J.; Simon, I.; Dunker, A. K.; Uversky, V. N. Bioessays 2009, 31, 328. (29) Cho, Y.; Gorina, S.; Jeffrey, P. D.; Pavletich, N. P. Science 1994, 265, 346. (30) Pornillos, O.; Ganser-Pornillos, B. K.; Banumathi, S.; Hua, Y.; Yeager, M. J. Mol. Biol. 2010, 401, 985. (31) Russo, A. A.; Jeffrey, P. D.; Patten, A. K.; Massague, J.; Pavletich, N. P. Nature 1996, 382, 325. (32) Wu, G.; Chen, Y. G.; Ozdamar, B.; Gyuricza, C. A.; Chong, P. A.; Wrana, J. L.; Massague, J.; Shi, Y. Science 2000, 287, 92. (33) He, J.; Chao, W. C.; Zhang, Z.; Yang, J.; Cronin, N.; Barford, D. Mol. Cell 2013, 50, 649. (34) Jeffrey, P. D.; Gorina, S.; Pavletich, N. P. Science 1995, 267, 1498. (35) Vanhooke, J. L.; Benning, M. M.; Bauer, C. B.; Pike, J. W.; DeLuca, H. F. Biochemistry 2004, 43, 4101. (36) Wu, X.; Knudsen, B.; Feller, S. M.; Zheng, J.; Sali, A.; Cowburn, D.; Hanafusa, H.; Kuriyan, J. Structure 1995, 3, 215. (37) Fontes, M. R.; Teh, T.; Toth, G.; John, A.; Pavo, I.; Jans, D. A.; Kobe, B. Biochem. J. 2003, 375, 339. (38) Han, J. H.; Batey, S.; Nickson, A. A.; Teichmann, S. A.; Clarke, J. Nat. Rev. Mol. Cell Biol. 2007, 8, 319. (39) Neduva, V.; Russell, R. B. FEBS Lett. 2005, 579, 3342. (40) Follis, A. V.; Galea, C. A.; Kriwacki, R. W. Adv. Exp. Med. Biol. 2012, 725, 27. (41) Phan, A. T.; Kuryavyi, V.; Darnell, J. C.; Serganov, A.; Majumdar, A.; Ilin, S.; Raslin, T.; Polonskaia, A.; Chen, C.; Clain, D.; Darnell, R. B.; Patel, D. J. Nat. Struct. Mol. Biol. 2011, 18, 796.

(42) Kojima, C.; Hashimoto, A.; Yabuta, I.; Hirose, M.; Hashimoto, S.; Kanaho, Y.; Sumimoto, H.; Ikegami, T.; Sabe, H. EMBO J. 2004, 23, 4413. (43) Kato, M.; Han, T. W.; Xie, S.; Shi, K.; Du, X.; Wu, L. C.; Mirzaei, H.; Goldsmith, E. J.; Longgood, J.; Pei, J.; Grishin, N. V.; Frantz, D. E.; Schneider, J. W.; Chen, S.; Li, L.; Sawaya, M. R.; Eisenberg, D.; Tycko, R.; McKnight, S. L. Cell 2012, 149, 753. (44) Stein, A.; Aloy, P. PLoS One 2008, 3, e2524. (45) Chica, C.; Diella, F.; Gibson, T. J. PLoS One 2009, 4, e6052. (46) London, N.; Movshovitz-Attias, D.; Schueler-Furman, O. Structure 2010, 18, 188. (47) Kalderon, D.; Roberts, B. L.; Richardson, W. D.; Smith, A. E. Cell 1984, 39, 499. (48) Wheelan, S. J.; Marchler-Bauer, A.; Bryant, S. H. Bioinformatics 2000, 16, 613. (49) Kastritis, P. L.; Moal, I. H.; Hwang, H.; Weng, Z.; Bates, P. A.; Bonvin, A. M.; Janin, J. Protein Sci. 2011, 20, 482. (50) Lee, J. H.; Richter, W.; Namkung, W.; Kim, K. H.; Kim, E.; Conti, M.; Lee, M. G. J. Biol. Chem. 2007, 282, 10414. (51) Hirschi, A.; Cecchini, M.; Steinhardt, R. C.; Schamber, M. R.; Dick, F. A.; Rubin, S. M. Nat. Struct. Mol. Biol. 2010, 17, 1051. (52) Chang, C. C.; Naik, M. T.; Huang, Y. S.; Jeng, J. C.; Liao, P. H.; Kuo, H. Y.; Ho, C. C.; Hsieh, Y. L.; Lin, C. H.; Huang, N. J.; Naik, N. M.; Kung, C. C.; Lin, S. Y.; Chen, R. H.; Chang, K. S.; Huang, T. H.; Shih, H. M. Mol. Cell 2011, 42, 62. (53) Sun, Q.; Jackson, R. A.; Ng, C.; Guy, G. R.; Sivaraman, J. PLoS One 2010, 5, e12819. (54) Harreman, M. T.; Kline, T. M.; Milford, H. G.; Harben, M. B.; Hodel, A. E.; Corbett, A. H. J. Biol. Chem. 2004, 279, 20613. (55) Honnappa, S.; John, C. M.; Kostrewa, D.; Winkler, F. K.; Steinmetz, M. O. EMBO J. 2005, 24, 261. (56) Lee, C. W.; Ferreon, J. C.; Ferreon, A. C.; Arai, M.; Wright, P. E. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 19290. (57) Lacy, E. R.; Filippov, I.; Lewis, W. S.; Otieno, S.; Xiao, L.; Weiss, S.; Hengst, L.; Kriwacki, R. W. Nat. Struct. Mol. Biol. 2004, 11, 358. 
(58) Philo, J. S.; Aoki, K. H.; Arakawa, T.; Narhi, L. O.; Wen, J. Biochemistry 1996, 35, 1681. (59) Vicentini, A. M.; Kieffer, B.; Matthies, R.; Meyhack, B.; Hemmings, B. A.; Stone, S. R.; Hofsteenge, J. Biochemistry 1990, 29, 8827. (60) Dyson, H. J. Q. Rev. Biophys. 2011, 44, 467. (61) Remaut, H.; Waksman, G. Trends Biochem. Sci. 2006, 31, 436. (62) Fuxreiter, M.; Tompa, P.; Simon, I. Bioinformatics 2007, 23, 950. (63) Donaldson, L. W.; Gish, G.; Pawson, T.; Kay, L. E.; Forman-Kay, J. D. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 14053. (64) Uversky, V. N.; Oldfield, C. J.; Midic, U.; Xie, H.; Xue, B.; Vucetic, S.; Iakoucheva, L. M.; Obradovic, Z.; Dunker, A. K. BMC Genomics 2009, 10 (Suppl 1), S7. (65) Zhou, H. X. Trends Biochem. Sci. 2012, 37, 43. (66) Dunker, A. K.; Garner, E.; Guilliot, S.; Romero, P.; Albrecht, K.; Hart, J.; Obradovic, Z.; Kissinger, C.; Villafranca, J. E. Pac. Symp. Biocomput. 1998, 473. (67) Papadakos, G.; Housden, N. G.; Lilly, K. J.; Kaminska, R.; Kleanthous, C. J. Mol. Biol. 2012, 418, 269. (68) Dogan, J.; Gianni, S.; Jemth, P. Phys. Chem. Chem. Phys. 2014, 16, 6323. (69) Shammas, S. L.; Travis, A. J.; Clarke, J. J. Phys. Chem. B 2013, 117, 13346. (70) Freedman, S. J.; Sun, Z. Y.; Poy, F.; Kung, A. L.; Livingston, D. M.; Wagner, G.; Eck, M. J. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 5367. (71) Freedman, S. J.; Sun, Z. Y.; Kung, A. L.; France, D. S.; Wagner, G.; Eck, M. J. Nat. Struct. Biol. 2003, 10, 504. (72) Via, A.; Gherardini, P. F.; Ferraro, E.; Ausiello, G.; Scalia Tomba, G.; Helmer-Citterich, M. BMC Bioinformatics 2007, 8, 68. (73) Tan, C. S.; Pasculescu, A.; Lim, W. A.; Pawson, T.; Bader, G. D.; Linding, R. Science 2009, 325, 1686. (74) Pandit, B.; Sarkozy, A.; Pennacchio, L. A.; Carta, C.; Oishi, K.; Martinelli, S.; Pogna, E. A.; Schackwitz, W.; Ustaszewska, A.; Landstrom, A.; Bos, J. M.; Ommen, S. R.; Esposito, G.; Lepri, F.;


(111) Song, J. J.; Garlick, J. D.; Kingston, R. E. Genes Dev. 2008, 22, 1313. (112) Han, Z.; Xing, X.; Hu, M.; Zhang, Y.; Liu, P.; Chai, J. Structure 2007, 15, 1306. (113) Oliver, A. W.; Swift, S.; Lord, C. J.; Ashworth, A.; Pearl, L. H. EMBO Rep. 2009, 10, 990. (114) Song, J. J.; Kingston, R. E. J. Biol. Chem. 2008, 283, 35258. (115) Stirnimann, C. U.; Petsalaki, E.; Russell, R. B.; Muller, C. W. Trends Biochem. Sci. 2010, 35, 565. (116) Krenn, V.; Wehenkel, A.; Li, X.; Santaguida, S.; Musacchio, A. J. Cell Biol. 2012, 196, 451. (117) Gsponer, J.; Futschik, M. E.; Teichmann, S. A.; Babu, M. M. Science 2008, 322, 1365. (118) Vavouri, T.; Semple, J. I.; Garcia-Verdugo, R.; Lehner, B. Cell 2009, 138, 198. (119) Scott, J. D.; Pawson, T. Science 2009, 326, 1220. (120) Bossi, A.; Lehner, B. Mol. Syst. Biol. 2009, 5, 260. (121) Rowicka, M.; Kudlicki, A.; Tu, B. P.; Otwinowski, Z. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 16892. (122) Jones, R. B.; Gordus, A.; Krall, J. A.; MacBeath, G. Nature 2006, 439, 168. (123) Cho, W. Sci. STKE 2006, 2006, pe7. (124) Buljan, M.; Chalancon, G.; Eustermann, S.; Wagner, G. P.; Fuxreiter, M.; Bateman, A.; Babu, M. M. Mol. Cell 2012, 46, 871. (125) Davis, M. J.; Shin, C. J.; Jing, N.; Ragan, M. A. Mol. Biosyst. 2012, 8, 2054. (126) Ellis, J. D.; Barrios-Rodiles, M.; Colak, R.; Irimia, M.; Kim, T.; Calarco, J. A.; Wang, X.; Pan, Q.; O’Hanlon, D.; Kim, P. M.; Wrana, J. L.; Blencowe, B. J. Mol. Cell 2012, 46, 884. (127) Weatheritt, R. J.; Gibson, T. J. Trends Biochem. Sci. 2012, 37, 333. (128) Ferreon, J. C.; Lee, C. W.; Arai, M.; Martinez-Yamout, M. A.; Dyson, H. J.; Wright, P. E. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 6591. (129) Akiva, E.; Friedlander, G.; Itzhaki, Z.; Margalit, H. PLoS Comput. Biol. 2012, 8, e1002341. (130) Van Roey, K.; Gibson, T. J.; Davey, N. E. Curr. Opin. Struct. Biol. 2012, 22, 378. (131) Skaar, J. R.; Pagan, J. K.; Pagano, M. Nat. Rev. Mol. Cell Biol. 2013, 14, 369. (132) Shin, C.; Manley, J. L. Nat. Rev. Mol. 
Cell Biol. 2004, 5, 727. (133) Weatheritt, R. J.; Davey, N. E.; Gibson, T. J. Nucleic Acids Res. 2012, 40, 7123. (134) Jin, J.; Pawson, T. Philos. Trans. R. Soc. London, B: Biol. Sci. 2012, 367, 2540. (135) Hon, W. C.; Wilson, M. I.; Harlos, K.; Claridge, T. D.; Schofield, C. J.; Pugh, C. W.; Maxwell, P. H.; Ratcliffe, P. J.; Stuart, D. I.; Jones, E. Y. Nature 2002, 417, 975. (136) Diella, F.; Haslam, N.; Chica, C.; Budd, A.; Michael, S.; Brown, N. P.; Trave, G.; Gibson, T. J. Front. Biosci. 2008, 13, 6580. (137) Remenyi, A.; Good, M. C.; Lim, W. A. Curr. Opin. Struct. Biol. 2006, 16, 676. (138) Goldsmith, E. J.; Akella, R.; Min, X.; Zhou, T.; Humphreys, J. M. Chem. Rev. 2007, 107, 5065. (139) Biondi, R. M.; Nebreda, A. R. Biochem. J. 2003, 372, 1. (140) Skroblin, P.; Grossmann, S.; Schafer, G.; Rosenthal, W.; Klussmann, E. Int. Rev. Cell Mol. Biol. 2010, 283, 235. (141) Roskoski, R. J. Pharmacol. Res. 2012, 66, 105. (142) Tanoue, T.; Adachi, M.; Moriguchi, T.; Nishida, E. Nat. Cell Biol. 2000, 2, 110. (143) Bardwell, L. Biochem. Soc. Trans. 2006, 34, 837. (144) Ruest, P. J.; Shin, N. Y.; Polte, T. R.; Zhang, X.; Hanks, S. K. Mol. Cell. Biol. 2001, 21, 7641. (145) Lobjois, V.; Froment, C.; Braud, E.; Grimal, F.; Burlet-Schiltz, O.; Ducommun, B.; Bouche, J. P. Biochem. Biophys. Res. Commun. 2011, 410, 87. (146) Cheng, K. Y.; Noble, M. E.; Skamnaki, V.; Brown, N. R.; Lowe, E. D.; Kontogiannis, L.; Shen, K.; Cole, P. A.; Siligardi, G.; Johnson, L. N. J. Biol. Chem. 2006, 281, 23167.

Faul, C.; Mundel, P.; Lopez Siguero, J. P.; Tenconi, R.; Selicorni, A.; Rossi, C.; Mazzanti, L.; Torrente, I.; Marino, B.; Digilio, M. C.; Zampino, G.; Ackerman, M. J.; Dallapiccola, B.; Tartaglia, M.; Gelb, B. D. Nat. Genet. 2007, 39, 1007. (75) Zhang, T.; Prives, C. J. Biol. Chem. 2001, 276, 29702. (76) Hittinger, C. T.; Carroll, S. B. Evol. Dev. 2008, 10, 537. (77) Beltrao, P.; Serrano, L. PLoS Comput. Biol. 2007, 3, e25. (78) Kim, J.; Kim, I.; Yang, J. S.; Shin, Y. E.; Hwang, J.; Park, S.; Choi, Y. S.; Kim, S. PLoS Genet. 2012, 8, e1002510. (79) Zielinska, D. F.; Gnad, F.; Schropp, K.; Wisniewski, J. R.; Mann, M. Mol. Cell 2012, 46, 542. (80) Holt, L. J.; Tuch, B. B.; Villen, J.; Johnson, A. D.; Gygi, S. P.; Morgan, D. O. Science 2009, 325, 1682. (81) Nair, R.; Carter, P.; Rost, B. Nucleic Acids Res. 2003, 31, 397. (82) Boutros, R.; Lobjois, V.; Ducommun, B. Nat. Rev. Cancer 2007, 7, 495. (83) Besson, A.; Dowdy, S. F.; Roberts, J. M. Dev. Cell 2008, 14, 159. (84) Primorac, I.; Musacchio, A. J. Cell Biol. 2013, 201, 177. (85) Bruning, J. B.; Shamoo, Y. Structure 2004, 12, 2209. (86) Conti, E.; Kuriyan, J. Structure 2000, 8, 329. (87) Vogel, C.; Chothia, C. PLoS Comput. Biol. 2006, 2, e48. (88) Liu, B. A.; Nash, P. D. Philos. Trans. R. Soc. London, B, Biol. Sci. 2012, 367, 2556. (89) Kaneko, T.; Joshi, R.; Feller, S. M.; Li, S. S. Cell Commun. Signal 2012, 10, 32. (90) Clery, A.; Jayne, S.; Benderska, N.; Dominguez, C.; Stamm, S.; Allain, F. H. Nat. Struct. Mol. Biol. 2011, 18, 443. (91) Joshi, A.; Coelho, M. B.; Kotik-Kogan, O.; Simpson, P. J.; Matthews, S. J.; Smith, C. W.; Curry, S. Structure 2011, 19, 1816. (92) Corsini, L.; Bonnal, S.; Basquin, J.; Hothorn, M.; Scheffzek, K.; Valcarcel, J.; Sattler, M. Nat. Struct. Mol. Biol. 2007, 14, 620. (93) Cukier, C. D.; Hollingworth, D.; Martin, S. R.; Kelly, G.; Diaz-Moreno, I.; Ramos, A. Nat. Struct. Mol. Biol. 2010, 17, 1058. (94) Motley, A. M.; Hettema, E. H.; Ketting, R.; Plasterk, R.; Tabak, H. F. EMBO Rep.
2000, 1, 40. (95) Finn, R. D.; Bateman, A.; Clements, J.; Coggill, P.; Eberhardt, R. Y.; Eddy, S. R.; Heger, A.; Hetherington, K.; Holm, L.; Mistry, J.; Sonnhammer, E. L.; Tate, J.; Punta, M. Nucleic Acids Res. 2014, 42, D222. (96) Landgraf, C.; Panni, S.; Montecchi-Palazzi, L.; Castagnoli, L.; Schneider-Mergener, J.; Volkmer-Engert, R.; Cesareni, G. PLoS Biol. 2004, 2, E14. (97) Zarrinpar, A.; Park, S. H.; Lim, W. A. Nature 2003, 426, 676. (98) Ubersax, J. A.; Ferrell, J. E. J. Nat. Rev. Mol. Cell Biol. 2007, 8, 530. (99) Liu, B. A.; Jablonowski, K.; Shah, E. E.; Engelmann, B. W.; Jones, R. B.; Nash, P. D. Mol. Cell. Proteomics 2010, 9, 2391. (100) Poy, F.; Yaffe, M. B.; Sayos, J.; Saxena, K.; Morra, M.; Sumegi, J.; Cantley, L. C.; Terhorst, C.; Eck, M. J. Mol. Cell 1999, 4, 555. (101) Saksela, K.; Permi, P. FEBS Lett. 2012, 586, 2609. (102) Bhattacharyya, R. P.; Remenyi, A.; Good, M. C.; Bashor, C. J.; Falick, A. M.; Lim, W. A. Science 2006, 311, 822. (103) Ng, C.; Jackson, R. A.; Buschdorf, J. P.; Sun, Q.; Guy, G. R.; Sivaraman, J. EMBO J. 2008, 27, 804. (104) Ottmann, C.; Yasmin, L.; Weyand, M.; Veesenmeyer, J. L.; Diaz, M. H.; Palmer, R. H.; Francis, M. S.; Hauser, A. R.; Wittinghofer, A.; Hallberg, B. EMBO J. 2007, 26, 902. (105) Garron, M. L.; Arthos, J.; Guichou, J. F.; McNally, J.; Cicala, C.; Arold, S. T. J. Mol. Biol. 2008, 375, 1320. (106) Nomura, M.; Uda-Tochio, H.; Murai, K.; Mori, N.; Nishimura, Y. J. Mol. Biol. 2005, 354, 903. (107) Sahu, S. C.; Swanson, K. A.; Kang, R. S.; Huang, K.; Brubaker, K.; Ratcliff, K.; Radhakrishnan, I. J. Mol. Biol. 2008, 375, 1444. (108) Jennings, B. H.; Pickles, L. M.; Wainwright, S. M.; Roe, S. M.; Pearl, L. H.; Ish-Horowicz, D. Mol. Cell 2006, 22, 645. (109) Hao, B.; Oehlmann, S.; Sowa, M. E.; Harper, J. W.; Pavletich, N. P. Mol. Cell 2007, 26, 131. (110) Wu, G.; Xu, G.; Schulman, B. A.; Jeffrey, P. D.; Harper, J. W.; Pavletich, N. P. Mol. Cell 2003, 11, 1445.


(187) Malik, H. S.; Eickbush, T. H.; Goldfarb, D. S. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 13738. (188) Fu, S. C.; Huang, H. C.; Horton, P.; Juan, H. F. Nucleic Acids Res. 2013, 41, D338. (189) Dingwall, C.; Robbins, J.; Dilworth, S. M.; Roberts, B.; Richardson, W. D. J. Cell Biol. 1988, 107, 841. (190) Liang, S. H.; Clarke, M. F. J. Biol. Chem. 1999, 274, 32699. (191) Chen, C. F.; Li, S.; Chen, Y.; Chen, P. L.; Sharp, Z. D.; Lee, W. H. J. Biol. Chem. 1996, 271, 32863. (192) Stommel, J. M.; Marchenko, N. D.; Jimenez, G. S.; Moll, U. M.; Hope, T. J.; Wahl, G. M. EMBO J. 1999, 18, 1660. (193) van Hengel, J.; Vanhoenacker, P.; Staes, K.; van Roy, F. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 7980. (194) Hurley, J. H.; Misra, S. Annu. Rev. Biophys. Biomol. Struct. 2000, 29, 49. (195) Greaves, J.; Chamberlain, L. H. J. Cell Biol. 2007, 176, 249. (196) Akhmanova, A.; Hoogenraad, C. C. Curr. Opin. Cell Biol. 2005, 17, 47. (197) Jiang, K.; Toedt, G.; Montenegro Gouveia, S.; Davey, N. E.; Hua, S.; van der Vaart, B.; Grigoriev, I.; Larsen, J.; Pedersen, L. B.; Bezstarosti, K.; Lince-Faria, M.; Demmers, J.; Steinmetz, M. O.; Gibson, T. J.; Akhmanova, A. Curr. Biol. 2012, 22, 1800. (198) Honnappa, S.; Gouveia, S. M.; Weisbrich, A.; Damberger, F. F.; Bhavesh, N. S.; Jawhari, H.; Grigoriev, I.; van Rijssel, F. J.; Buey, R. M.; Lawera, A.; Jelesarov, I.; Winkler, F. K.; Wuthrich, K.; Akhmanova, A.; Steinmetz, M. O. Cell 2009, 138, 366. (199) Mayer, B. J.; Jackson, P. K.; Baltimore, D. Proc. Natl. Acad. Sci. U.S.A. 1991, 88, 627. (200) Ren, R.; Mayer, B. J.; Cicchetti, P.; Baltimore, D. Science 1993, 259, 1157. (201) Pawson, T.; Schlessinger, J. Curr. Biol. 1993, 3, 434. (202) Hekman, M.; Wiese, S.; Metz, R.; Albert, S.; Troppmair, J.; Nickel, J.; Sendtner, M.; Rapp, U. R. J. Biol. Chem. 2004, 279, 14074. (203) Buey, R. M.; Sen, I.; Kortt, O.; Mohan, R.; Gfeller, D.; Veprintsev, D.; Kretzschmar, I.; Scheuermann, J.; Neri, D.; Zoete, V.; Michielin, O.; de Pereda, J.
M.; Akhmanova, A.; Volkmer, R.; Steinmetz, M. O. J. Biol. Chem. 2012, 287, 28227. (204) Tatham, M. H.; Geoffroy, M. C.; Shen, L.; Plechanovova, A.; Hattersley, N.; Jaffray, E. G.; Palvimo, J. J.; Hay, R. T. Nat. Cell Biol. 2008, 10, 538. (205) Ye, H.; Arron, J. R.; Lamothe, B.; Cirilli, M.; Kobayashi, T.; Shevde, N. K.; Segal, D.; Dzivenu, O. K.; Vologodskaia, M.; Yim, M.; Du, K.; Singh, S.; Pike, J. W.; Darnay, B. G.; Choi, Y.; Wu, H. Nature 2002, 418, 443. (206) Giacinti, C.; Giordano, A. Oncogene 2006, 25, 5220. (207) Dick, F. A.; Rubin, S. M. Nat. Rev. Mol. Cell Biol. 2013, 14, 297. (208) Fattaey, A. R.; Helin, K.; Dembski, M. S.; Dyson, N.; Harlow, E.; Vuocolo, G. A.; Hanobik, M. G.; Haskell, K. M.; Oliff, A.; Defeo-Jones, D.; Jones, R. E. Oncogene 1993, 8, 3149. (209) Soni, R.; Carmichael, J. P.; Shah, Z. H.; Murray, J. A. Plant Cell 1995, 7, 85. (210) Lee, J. O.; Russo, A. A.; Pavletich, N. P. Nature 1998, 391, 859. (211) Nagy, L.; Schwabe, J. W. Trends Biochem. Sci. 2004, 29, 317. (212) Privalsky, M. L. Annu. Rev. Physiol. 2004, 66, 315. (213) Watson, P. J.; Fairall, L.; Schwabe, J. W. Mol. Cell. Endocrinol. 2012, 348, 440. (214) Heery, D. M.; Kalkhoven, E.; Hoare, S.; Parker, M. G. Nature 1997, 387, 733. (215) Pawson, T.; Warner, N. Oncogene 2007, 26, 1268. (216) Good, M. C.; Zalatan, J. G.; Lim, W. A. Science 2011, 332, 680. (217) Findlay, G. M.; Smith, M. J.; Lanner, F.; Hsiung, M. S.; Gish, G. D.; Petsalaki, E.; Cockburn, K.; Kaneko, T.; Huang, H.; Bagshaw, R. D.; Ketela, T.; Tucholska, M.; Taylor, L.; Bowtell, D. D.; Moffat, J.; Ikura, M.; Li, S. S.; Sidhu, S. S.; Rossant, J.; Pawson, T. Cell 2013, 152, 1008. (218) Wendland, B. Nat. Rev. Mol. Cell Biol. 2002, 3, 971. (219) Ford, M. G.; Mills, I. G.; Peter, B. J.; Vallis, Y.; Praefcke, G. J.; Evans, P. R.; McMahon, H. T. Nature 2002, 419, 361.

(147) Lowe, E. D.; Tews, I.; Cheng, K. Y.; Brown, N. R.; Gul, S.; Noble, M. E.; Gamblin, S. J.; Johnson, L. N. Biochemistry 2002, 41, 15625. (148) Hsiao, S. J.; Smith, S. Biochimie 2008, 90, 83. (149) Guettler, S.; LaRose, J.; Petsalaki, E.; Gish, G.; Scotter, A.; Pawson, T.; Rottapel, R.; Sicheri, F. Cell 2011, 147, 1340. (150) Sbodio, J. I.; Chi, N. W. J. Biol. Chem. 2002, 277, 31887. (151) Zhang, Y.; Liu, S.; Mickanin, C.; Feng, Y.; Charlat, O.; Michaud, G. A.; Schirle, M.; Shi, X.; Hild, M.; Bauer, A.; Myer, V. E.; Finan, P. M.; Porter, J. A.; Huang, S. M.; Cong, F. Nat. Cell Biol. 2011, 13, 623. (152) Roy, J.; Cyert, M. S. Sci. Signal 2009, 2, re9. (153) Hendrickx, A.; Beullens, M.; Ceulemans, H.; Den Abt, T.; Van Eynde, A.; Nicolaescu, E.; Lesage, B.; Bollen, M. Chem. Biol. 2009, 16, 365. (154) Virshup, D. M.; Shenolikar, S. Mol. Cell 2009, 33, 537. (155) Bollen, M.; Peti, W.; Ragusa, M. J.; Beullens, M. Trends Biochem. Sci. 2010, 35, 450. (156) Heroes, E.; Lesage, B.; Gornemann, J.; Beullens, M.; Van Meervelt, L.; Bollen, M. FEBS J. 2013, 280, 584. (157) Vucic, D.; Dixit, V. M.; Wertz, I. E. Nat. Rev. Mol. Cell Biol. 2011, 12, 439. (158) Schwanhausser, B.; Busse, D.; Li, N.; Dittmar, G.; Schuchhardt, J.; Wolf, J.; Chen, W.; Selbach, M. Nature 2011, 473, 337. (159) Nakayama, K. I.; Nakayama, K. Semin. Cell Dev. Biol. 2005, 16, 323. (160) Brooks, C. L.; Gu, W. FEBS Lett. 2011, 585, 2803. (161) Gregory, M. A.; Qi, Y.; Hann, S. R. J. Biol. Chem. 2003, 278, 51606. (162) Winston, J. T.; Strack, P.; Beer-Romero, P.; Chu, C. Y.; Elledge, S. J.; Harper, J. W. Genes Dev. 1999, 13, 270. (163) Zur, A.; Brandeis, M. EMBO J. 2001, 20, 792. (164) Clute, P.; Pines, J. Nat. Cell Biol. 1999, 1, 82. (165) den Elzen, N.; Pines, J. J. Cell Biol. 2001, 153, 121. (166) Nguyen, H. G.; Chinnappan, D.; Urano, T.; Ravid, K. Mol. Cell. Biol. 2005, 25, 4977. (167) Lindon, C.; Pines, J. J. Cell Biol. 2004, 164, 233. (168) Calderon-Villalobos, L. I.; Tan, X.; Zheng, N.; Estelle, M. 
Cold Spring Harbor Perspect. Biol. 2010, 2, a005546. (169) Tan, X.; Calderon-Villalobos, L. I.; Sharon, M.; Zheng, C.; Robinson, C. V.; Estelle, M.; Zheng, N. Nature 2007, 446, 640. (170) Tasaki, T.; Sriram, S. M.; Park, K. S.; Kwon, Y. T. Annu. Rev. Biochem. 2012, 81, 261. (171) Matta-Camacho, E.; Kozlov, G.; Li, F. F.; Gehring, K. Nat. Struct. Mol. Biol. 2010, 17, 1182. (172) Choi, W. S.; Jeong, B. C.; Joo, Y. J.; Lee, M. R.; Kim, J.; Eck, M. J.; Song, H. K. Nat. Struct. Mol. Biol. 2010, 17, 1175. (173) Sriram, S. M.; Kim, B. Y.; Kwon, Y. T. Nat. Rev. Mol. Cell Biol. 2011, 12, 735. (174) Matera, A. G.; Izaguire-Sierra, M.; Praveen, K.; Rajendra, T. K. Dev. Cell 2009, 17, 639. (175) Diekmann, Y.; Pereira-Leal, J. B. Biochem. J. 2013, 449, 319. (176) Hung, M. C.; Link, W. J. Cell Sci. 2011, 124, 3381. (177) Uhlen, M.; Oksvold, P.; Fagerberg, L.; Lundberg, E.; Jonasson, K.; Forsberg, M.; Zwahlen, M.; Kampf, C.; Wester, K.; Hober, S.; Wernerus, H.; Bjorling, L.; Ponten, F. Nat. Biotechnol. 2010, 28, 1248. (178) Derby, M. C.; Gleeson, P. A. Int. Rev. Cytol. 2007, 261, 47. (179) Pandey, K. N. Curr. Opin. Biotechnol. 2010, 21, 611. (180) Kelly, B. T.; McCoy, A. J.; Spate, K.; Miller, S. E.; Evans, P. R.; Honing, S.; Owen, D. J. Nature 2008, 456, 976. (181) Owen, D. J.; Evans, P. R. Science 1998, 282, 1327. (182) Heilker, R.; Spiess, M.; Crottet, P. Bioessays 1999, 21, 558. (183) Eugster, A.; Frigerio, G.; Dale, M.; Duden, R. Mol. Biol. Cell 2004, 15, 1011. (184) Majoul, I.; Straub, M.; Hell, S. W.; Duden, R.; Soling, H. D. Dev. Cell 2001, 1, 139. (185) Munro, S.; Pelham, H. R. Cell 1987, 48, 899. (186) Sorokin, A. V.; Kim, E. R.; Ovchinnikov, L. P. Biochemistry (Mosc.) 2007, 72, 1439.


(255) Geiss-Friedlander, R.; Melchior, F. Nat. Rev. Mol. Cell Biol. 2007, 8, 947. (256) Zhong, S.; Muller, S.; Ronchetti, S.; Freemont, P. S.; Dejean, A.; Pandolfi, P. P. Blood 2000, 95, 2748. (257) Rodriguez, M. S.; Desterro, J. M.; Lain, S.; Midgley, C. A.; Lane, D. P.; Hay, R. T. EMBO J. 1999, 18, 6455. (258) Fisher, R. D.; Wang, B.; Alam, S. L.; Higginson, D. S.; Robinson, H.; Sundquist, W. I.; Hill, C. P. J. Biol. Chem. 2003, 278, 28976. (259) Sarkari, F.; La Delfa, A.; Arrowsmith, C. H.; Frappier, L.; Sheng, Y.; Saridakis, V. J. Mol. Biol. 2010, 402, 825. (260) Sheng, Y.; Saridakis, V.; Sarkari, F.; Duan, S.; Wu, T.; Arrowsmith, C. H.; Frappier, L. Nat. Struct. Mol. Biol. 2006, 13, 285. (261) Bremmer, S. C.; Hall, H.; Martinez, J. S.; Eissler, C. L.; Hinrichsen, T. H.; Rossie, S.; Parker, L. L.; Hall, M. C.; Charbonneau, H. J. Biol. Chem. 2012, 287, 1662. (262) Xiao, Q.; Zhang, F.; Nacev, B. A.; Liu, J. O.; Pei, D. Biochemistry 2010, 49, 5588. (263) Seidah, N. G.; Sadr, M. S.; Chretien, M.; Mbikay, M. J. Biol. Chem. 2013, 288, 21473. (264) Teixeira, P. F.; Glaser, E. Biochim. Biophys. Acta 2013, 1833, 360. (265) Clarke, D. J. Cell Cycle 2002, 1, 233. (266) McIlwain, D. R.; Berger, T.; Mak, T. W. Cold Spring Harbor Perspect. Biol. 2013, 5, a008656. (267) Bergeron, F.; Leduc, R.; Day, R. J. Mol. Endocrinol. 2000, 24, 1. (268) Anders, A.; Gilbert, S.; Garten, W.; Postina, R.; Fahrenholz, F. FASEB J. 2001, 15, 1837. (269) Dubois, C. M.; Blanchette, F.; Laprise, M. H.; Leduc, R.; Grondin, F.; Seidah, N. G. Am. J. Pathol. 2001, 158, 305. (270) Moulard, M.; Decroly, E. Biochim. Biophys. Acta 2000, 1469, 121. (271) Fischer, U.; Janicke, R. U.; Schulze-Osthoff, K. Cell Death Differ. 2003, 10, 76. (272) Liu, X.; Zou, H.; Slaughter, C.; Wang, X. Cell 1997, 89, 175. (273) Lu, K. P.; Finn, G.; Lee, T. H.; Nicholson, L. K. Nat. Chem. Biol. 2007, 3, 619. (274) Liou, Y. C.; Zhou, X. Z.; Lu, K. P. Trends Biochem. Sci. 2011, 36, 501. (275) Sarkar, P.; Saleh, T.; Tzeng, S. 
R.; Birge, R. B.; Kalodimos, C. G. Nat. Chem. Biol. 2011, 7, 51. (276) Zhou, X. Z.; Kops, O.; Werner, A.; Lu, P. J.; Shen, M.; Stoller, G.; Kullertz, G.; Stark, M.; Fischer, G.; Lu, K. P. Mol. Cell 2000, 6, 873. (277) Davis, T. L.; Walker, J. R.; Campagna-Slater, V.; Finerty, P. J.; Paramanathan, R.; Bernstein, G.; MacKenzie, F.; Tempel, W.; Ouyang, H.; Lee, W. H.; Eisenmesser, E. Z.; Dhe-Paganon, S. PLoS Biol. 2010, 8, e1000439. (278) Piotukh, K.; Gu, W.; Kofler, M.; Labudde, D.; Helms, V.; Freund, C. J. Biol. Chem. 2005, 280, 23668. (279) Deribe, Y. L.; Pawson, T.; Dikic, I. Nat. Struct. Mol. Biol. 2010, 17, 666. (280) Chu, B.; Soncin, F.; Price, B. D.; Stevenson, M. A.; Calderwood, S. K. J. Biol. Chem. 1996, 271, 30847. (281) Dajani, R.; Fraser, E.; Roe, S. M.; Young, N.; Good, V.; Dale, T. C.; Pearl, L. H. Cell 2001, 105, 721. (282) Sakaguchi, K.; Saito, S.; Higashimoto, Y.; Roy, S.; Anderson, C. W.; Appella, E. J. Biol. Chem. 2000, 275, 9278. (283) Yang, X. J.; Gregoire, S. Mol. Cell 2006, 23, 779. (284) Torres, J.; Rodriguez, J.; Myers, M. P.; Valiente, M.; Graves, J. D.; Tonks, N. K.; Pulido, R. J. Biol. Chem. 2003, 278, 30652. (285) Peng, C. Y.; Graves, P. R.; Thoma, R. S.; Wu, Z.; Shaw, A. S.; Piwnica-Worms, H. Science 1997, 277, 1501. (286) Chen, M. S.; Ryan, C. E.; Piwnica-Worms, H. Mol. Cell. Biol. 2003, 23, 7488. (287) Elia, A. E.; Cantley, L. C.; Yaffe, M. B. Science 2003, 299, 1228. (288) Astuti, P.; Boutros, R.; Ducommun, B.; Gabrielli, B. J. Biol. Chem. 2010, 285, 34364. (289) Giles, N.; Forrest, A.; Gabrielli, B. J. Biol. Chem. 2003, 278, 28580.

(220) Shih, S. C.; Katzmann, D. J.; Schnell, J. D.; Sutanto, M.; Emr, S. D.; Hicke, L. Nat. Cell Biol. 2002, 4, 389. (221) Edeling, M. A.; Mishra, S. K.; Keyel, P. A.; Steinhauser, A. L.; Collins, B. M.; Roth, R.; Heuser, J. E.; Owen, D. J.; Traub, L. M. Dev. Cell 2006, 10, 329. (222) Morinaka, K.; Koyama, S.; Nakashima, S.; Hinoi, T.; Okawa, K.; Iwamatsu, A.; Kikuchi, A. Oncogene 1999, 18, 5915. (223) Drake, M. T.; Downs, M. A.; Traub, L. M. J. Biol. Chem. 2000, 275, 6479. (224) Dell’Angelica, E. C. Trends Cell Biol. 2001, 11, 315. (225) Schmid, E. M.; Ford, M. G.; Burtey, A.; Praefcke, G. J.; Peak-Chew, S. Y.; Mills, I. G.; Benmerah, A.; McMahon, H. T. PLoS Biol. 2006, 4, e262. (226) Perino, A.; Ghigo, A.; Scott, J. D.; Hirsch, E. Circ. Res. 2012, 111, 482. (227) Sarma, G. N.; Kinderman, F. S.; Kim, C.; von Daake, S.; Chen, L.; Wang, B. C.; Taylor, S. S. Structure 2010, 18, 155. (228) Pidoux, G.; Tasken, K. J. Mol. Endocrinol. 2010, 44, 271. (229) Klauck, T. M.; Faux, M. C.; Labudda, K.; Langeberg, L. K.; Jaken, S.; Scott, J. D. Science 1996, 271, 1589. (230) Li, H.; Pink, M. D.; Murphy, J. G.; Stein, A.; Dell’Acqua, M. L.; Hogan, P. G. Nat. Struct. Mol. Biol. 2012, 19, 337. (231) Gisler, S. M.; Madjdpour, C.; Bacic, D.; Pribanic, S.; Taylor, S. S.; Biber, J.; Murer, H. Kidney Int. 2003, 64, 1746. (232) Sicheri, F.; Kuriyan, J. Curr. Opin. Struct. Biol. 1997, 7, 777. (233) Hof, P.; Pluskey, S.; Dhe-Paganon, S.; Eck, M. J.; Shoelson, S. E. Cell 1998, 92, 441. (234) Xu, W.; Doshi, A.; Lei, M.; Eck, M. J.; Harrison, S. C. Mol. Cell 1999, 3, 629. (235) Pineda-Lucena, A.; Ho, C. S.; Mao, D. Y.; Sheng, Y.; Laister, R. C.; Muhandiram, R.; Lu, Y.; Seet, B. T.; Katz, S.; Szyperski, T.; Penn, L. Z.; Arrowsmith, C. H. J. Mol. Biol. 2005, 351, 182. (236) Long, J. F.; Feng, W.; Wang, R.; Chan, L. N.; Ip, F. C.; Xia, J.; Ip, N. Y.; Zhang, M. Nat. Struct. Mol. Biol. 2005, 12, 722. (237) Han, K. K.; Martinage, A. Int. J. Biochem. 1992, 24, 19. (238) Seet, B.
T.; Dikic, I.; Zhou, M. M.; Pawson, T. Nat. Rev. Mol. Cell Biol. 2006, 7, 473. (239) Jensen, O. N. Curr. Opin. Chem. Biol. 2004, 8, 33. (240) Hao, B.; Zheng, N.; Schulman, B. A.; Wu, G.; Miller, J. J.; Pagano, M.; Pavletich, N. P. Mol. Cell 2005, 20, 9. (241) Obsil, T.; Ghirlando, R.; Anderson, D. E.; Hickman, A. B.; Dyda, F. Biochemistry 2003, 42, 15264. (242) Dong, J.; Feldmann, G.; Huang, J.; Wu, S.; Zhang, N.; Comerford, S. A.; Gayyed, M. F.; Anders, R. A.; Maitra, A.; Pan, D. Cell 2007, 130, 1120. (243) Hunter, T. Philos. Trans. R. Soc. London, B: Biol. Sci. 2012, 367, 2513. (244) Moremen, K. W.; Tiemeyer, M.; Nairn, A. V. Nat. Rev. Mol. Cell Biol. 2012, 13, 448. (245) Titani, K.; Kumar, S.; Takio, K.; Ericsson, L. H.; Wade, R. D.; Ashida, K.; Walsh, K. A.; Chopek, M. W.; Sadler, J. E.; Fujikawa, K. Biochemistry 1986, 25, 3171. (246) Sato, C.; Kim, J. H.; Abe, Y.; Saito, K.; Yokoyama, S.; Kohda, D. J. Biochem. 2000, 127, 65. (247) Hase, S.; Kawabata, S.; Nishimura, H.; Takeya, H.; Sueyoshi, T.; Miyata, T.; Iwanaga, S.; Takao, T.; Shimonishi, Y.; Ikenaka, T. J. Biochem. 1988, 104, 867. (248) Housley, M. P.; Rodgers, J. T.; Udeshi, N. D.; Kelly, T. J.; Shabanowitz, J.; Hunt, D. F.; Puigserver, P.; Hart, G. W. J. Biol. Chem. 2008, 283, 16283. (249) Ozcan, S.; Andrali, S. S.; Cantrell, J. E. Biochim. Biophys. Acta 2010, 1799, 353. (250) Hancock, J. F.; Magee, A. I.; Childs, J. E.; Marshall, C. J. Cell 1989, 57, 1167. (251) Donaldson, J. G.; Honda, A. Biochem. Soc. Trans. 2005, 33, 639. (252) Dudler, T.; Gelb, M. H. J. Biol. Chem. 1996, 271, 11541. (253) van’t Hof, W.; Resh, M. D. J. Cell Biol. 1999, 145, 377. (254) Pickart, C. M.; Eddins, M. J. Biochim. Biophys. Acta 2004, 1695, 55.


(324) Havens, C. G.; Walter, J. C. Mol. Cell 2009, 35, 93. (325) Havens, C. G.; Shobnam, N.; Guarino, E.; Centore, R. C.; Zou, L.; Kearsey, S. E.; Walter, J. C. J. Biol. Chem. 2012, 287, 11410. (326) Child, E. S.; Mann, D. J. Cell Cycle 2006, 5, 1313. (327) Starostina, N. G.; Kipreos, E. T. Trends Cell Biol. 2012, 22, 33. (328) Evans, P. R.; Owen, D. J. Curr. Opin. Struct. Biol. 2002, 12, 814. (329) Hynes, N. E.; Lane, H. A. Nat. Rev. Cancer 2005, 5, 341. (330) Reich, N. C.; Liu, L. Nat. Rev. Immunol. 2006, 6, 602. (331) Vivanco, I.; Sawyers, C. L. Nat. Rev. Cancer 2002, 2, 489. (332) Chan, H. M.; Smith, L.; La Thangue, N. B. Oncogene 2001, 20, 6152. (333) Brownlie, R. J.; Zamoyska, R. Nat. Rev. Immunol. 2013, 13, 257. (334) Olayioye, M. A.; Neve, R. M.; Lane, H. A.; Hynes, N. E. EMBO J. 2000, 19, 3159. (335) Bouvard, D.; Pouwels, J.; De Franceschi, N.; Ivaska, J. Nat. Rev. Mol. Cell Biol. 2013, 14, 430. (336) Wu, H. Cell 2013, 153, 287. (337) Balagopalan, L.; Coussens, N. P.; Sherman, E.; Samelson, L. E.; Sommers, C. L. Cold Spring Harbor Perspect. Biol. 2010, 2, a005512. (338) Wange, R. L. Sci. STKE 2000, 2000, re1. (339) Sommers, C. L.; Samelson, L. E.; Love, P. E. Bioessays 2004, 26, 61. (340) Houtman, J. C.; Yamaguchi, H.; Barda-Saad, M.; Braiman, A.; Bowden, B.; Appella, E.; Schuck, P.; Samelson, L. E. Nat. Struct. Mol. Biol. 2006, 13, 798. (341) Brignatz, C.; Paronetto, M. P.; Opi, S.; Cappellari, M.; Audebert, S.; Feuillet, V.; Bismuth, G.; Roche, S.; Arold, S. T.; Sette, C.; Collette, Y. Mol. Cell. Biol. 2009, 29, 6438. (342) Mayer, B. J. Curr. Biol. 1997, 7, R295. (343) Posevitz-Fejfar, A.; Smida, M.; Kliche, S.; Hartig, R.; Schraven, B.; Lindquist, J. A. Eur. J. Immunol. 2008, 38, 250. (344) Solheim, S. A.; Petsalaki, E.; Stokka, A. J.; Russell, R. B.; Tasken, K.; Berge, T. FEBS J. 2008, 275, 4863. (345) Love, P. E.; Hayes, S. M. Cold Spring Harbor Perspect. Biol. 2010, 2, a002485. (346) Sigalov, A. B.; Aivazian, D. A.; Uversky, V. N.; Stern, L. J. 
Biochemistry 2006, 45, 15731. (347) Xu, C.; Gagnon, E.; Call, M. E.; Schnell, J. R.; Schwieters, C. D.; Carman, C. V.; Chou, J. J.; Wucherpfennig, K. W. Cell 2008, 135, 702. (348) Osman, N.; Lucas, S. C.; Turner, H.; Cantrell, D. J. Biol. Chem. 1995, 270, 13981. (349) Bond, P. J.; Faraldo-Gomez, J. D. J. Biol. Chem. 2011, 286, 25872. (350) Wilson, B. S.; Pfeiffer, J. R.; Surviladze, Z.; Gaudet, E. A.; Oliver, J. M. J. Cell Biol. 2001, 154, 645. (351) Bunnell, S. C.; Singer, A. L.; Hong, D. I.; Jacque, B. H.; Jordan, M. S.; Seminario, M. C.; Barr, V. A.; Koretzky, G. A.; Samelson, L. E. Mol. Cell. Biol. 2006, 26, 7155. (352) Balagopalan, L.; Barr, V. A.; Sommers, C. L.; Barda-Saad, M.; Goyal, A.; Isakowitz, M. S.; Samelson, L. E. Mol. Cell. Biol. 2007, 27, 8622. (353) Houtman, J. C.; Higashimoto, Y.; Dimasi, N.; Cho, S.; Yamaguchi, H.; Bowden, B.; Regan, C.; Malchiodi, E. L.; Mariuzza, R.; Schuck, P.; Appella, E.; Samelson, L. E. Biochemistry 2004, 43, 4170. (354) Cho, S.; Velikovsky, C. A.; Swaminathan, C. P.; Houtman, J. C.; Samelson, L. E.; Mariuzza, R. A. EMBO J. 2004, 23, 1441. (355) Houtman, J. C.; Houghtling, R. A.; Barda-Saad, M.; Toda, Y.; Samelson, L. E. J. Immunol. 2005, 175, 2449. (356) Berry, D. M.; Nash, P.; Liu, S. K.; Pawson, T.; McGlade, C. J. Curr. Biol. 2002, 12, 1336. (357) Beach, D.; Gonen, R.; Bogin, Y.; Reischl, I. G.; Yablonski, D. J. Biol. Chem. 2007, 282, 2937. (358) Barda-Saad, M.; Braiman, A.; Titerence, R.; Bunnell, S. C.; Barr, V. A.; Samelson, L. E. Nat. Immunol. 2005, 6, 80. (359) Wunderlich, L.; Farago, A.; Downward, J.; Buday, L. Eur. J. Immunol. 1999, 29, 1068.

(290) Saha, P.; Eichbaum, Q.; Silberman, E. D.; Mayer, B. J.; Dutta, A. Mol. Cell. Biol. 1997, 17, 4338. (291) Donzelli, M.; Squatrito, M.; Ganoth, D.; Hershko, A.; Pagano, M.; Draetta, G. F. EMBO J. 2002, 21, 4875. (292) Kieffer, I.; Lorenzo, C.; Dozier, C.; Schmitt, E.; Ducommun, B. Oncogene 2007, 26, 7847. (293) Chen, F.; Zhang, Z.; Bower, J.; Lu, Y.; Leonard, S. S.; Ding, M.; Castranova, V.; Piwnica-Worms, H.; Shi, X. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 1990. (294) Garner-Hamrick, P. A.; Fisher, C. Int. J. Cancer 1998, 76, 720. (295) Jin, J.; Shirogane, T.; Xu, L.; Nalepa, G.; Qin, J.; Elledge, S. J.; Harper, J. W. Genes Dev. 2003, 17, 3062. (296) Silverman, J. S.; Skaar, J. R.; Pagano, M. Trends Biochem. Sci. 2012, 37, 66. (297) Kasahara, K.; Goto, H.; Enomoto, M.; Tomono, Y.; Kiyono, T.; Inagaki, M. EMBO J. 2010, 29, 2802. (298) Kanemori, Y.; Uto, K.; Sagata, N. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 6279. (299) Kallstrom, H.; Lindqvist, A.; Pospisil, V.; Lundgren, A.; Rosenthal, C. K. Exp. Cell Res. 2005, 303, 89. (300) Davezac, N.; Baldin, V.; Gabrielli, B.; Forrest, A.; Theis-Febvre, N.; Yashida, M.; Ducommun, B. Oncogene 2000, 19, 2179. (301) Graves, P. R.; Lovly, C. M.; Uy, G. L.; Piwnica-Worms, H. Oncogene 2001, 20, 1839. (302) Uchida, S.; Ohtsubo, M.; Shimura, M.; Hirata, M.; Nakagama, H.; Matsunaga, T.; Yoshida, M.; Ishizaka, Y.; Yamashita, K. Biochem. Biophys. Res. Commun. 2004, 316, 226. (303) Toyoshima-Morimoto, F.; Taniguchi, E.; Nishida, E. EMBO Rep. 2002, 3, 341. (304) Mitrea, D. M.; Yoon, M. K.; Ou, L.; Kriwacki, R. W. Biol. Chem. 2012, 393, 259. (305) Romanov, V. S.; Pospelov, V. A.; Pospelova, T. V. Biochemistry (Mosc.) 2012, 77, 575. (306) Cmielova, J.; Rezacova, M. J. Cell. Biochem. 2011, 112, 3502. (307) Grimmler, M.; Wang, Y.; Mund, T.; Cilensek, Z.; Keidel, E. M.; Waddell, M. B.; Jakel, H.; Kullmann, M.; Kriwacki, R. W.; Hengst, L. Cell 2007, 128, 269. 
(308) Chu, I.; Sun, J.; Arnaout, A.; Kahn, H.; Hanna, W.; Narod, S.; Sun, P.; Tan, C. K.; Hengst, L.; Slingerland, J. Cell 2007, 128, 281. (309) Bornstein, G.; Bloom, J.; Sitry-Shevah, D.; Nakayama, K.; Pagano, M.; Hershko, A. J. Biol. Chem. 2003, 278, 25752. (310) Zhu, H.; Nie, L.; Maki, C. G. J. Biol. Chem. 2005, 280, 29282. (311) Ou, L.; Ferreira, A. M.; Otieno, S.; Xiao, L.; Bashford, D.; Kriwacki, R. W. J. Biol. Chem. 2011, 286, 30142. (312) Yoon, M. K.; Mitrea, D. M.; Ou, L.; Kriwacki, R. W. Biochem. Soc. Trans. 2012, 40, 981. (313) Wang, W.; Nacusi, L.; Sheaff, R. J.; Liu, X. Biochemistry 2005, 44, 14553. (314) Gulbis, J. M.; Kelman, Z.; Hurwitz, J.; O’Donnell, M.; Kuriyan, J. Cell 1996, 87, 297. (315) Rossig, L.; Jadidi, A. S.; Urbich, C.; Badorff, C.; Zeiher, A. M.; Dimmeler, S. Mol. Cell. Biol. 2001, 21, 5644. (316) Rodriguez-Vilarrupla, A.; Diaz, C.; Canela, N.; Rahn, H. P.; Bachs, O.; Agell, N. FEBS Lett. 2002, 531, 319. (317) Ranta, F.; Leveringhaus, J.; Theilig, D.; Schulz-Raffelt, G.; Hennige, A. M.; Hildebrand, D. G.; Handrick, R.; Jendrossek, V.; Bosch, F.; Schulze-Osthoff, K.; Haring, H. U.; Ullrich, S. PLoS One 2011, 6, e28828. (318) Rodriguez-Vilarrupla, A.; Jaumot, M.; Abella, N.; Canela, N.; Brun, S.; Diaz, C.; Estanyol, J. M.; Bachs, O.; Agell, N. Mol. Cell. Biol. 2005, 25, 7364. (319) Mercer, S. E.; Ewton, D. Z.; Deng, X.; Lim, S.; Mazur, T. R.; Friedman, E. J. Biol. Chem. 2005, 280, 25788. (320) Asada, M.; Ohmi, K.; Delia, D.; Enosawa, S.; Suzuki, S.; Yuo, A.; Suzuki, H.; Mizutani, S. Mol. Cell. Biol. 2004, 24, 8236. (321) Abbas, T.; Dutta, A. Cell Cycle 2011, 10, 241. (322) Havens, C. G.; Walter, J. C. Genes Dev. 2011, 25, 1568. (323) Abbas, T.; Sivaprasad, U.; Terai, K.; Amador, V.; Pagano, M.; Dutta, A. Genes Dev. 2008, 22, 2496.


(360) Zeng, R.; Cannon, J. L.; Abraham, R. T.; Way, M.; Billadeau, D. D.; Bubeck-Wardenberg, J.; Burkhardt, J. K. J. Immunol. 2003, 171, 1360. (361) da Silva, A. J.; Li, Z.; de Vera, C.; Canto, E.; Findell, P.; Rudd, C. E. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 7493. (362) Baker, R. G.; Hsu, C. J.; Lee, D.; Jordan, M. S.; Maltzman, J. S.; Hammer, D. A.; Baumgart, T.; Koretzky, G. A. Mol. Cell. Biol. 2009, 29, 5578. (363) Wang, H.; Wei, B.; Bismuth, G.; Rudd, C. E. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 12436. (364) Kim, S. T.; Lim, D. S.; Canman, C. E.; Kastan, M. B. J. Biol. Chem. 1999, 274, 37538. (365) Tidow, H.; Nissen, P. FEBS J. 2013, 280, 5551. (366) Yee Koh, M.; Spivak-Kroizman, T. R.; Powis, G. Trends Biochem. Sci. 2008, 33, 526. (367) Wojciak, J. M.; Martinez-Yamout, M. A.; Dyson, H. J.; Wright, P. E. EMBO J. 2009, 28, 948. (368) Campbell, K. M.; Lumb, K. J. Biochemistry 2002, 41, 13956. (369) Wang, F.; Marshall, C. B.; Yamamoto, K.; Li, G. Y.; Gasmi-Seabrook, G. M.; Okada, H.; Mak, T. W.; Ikura, M. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 6078. (370) Kussie, P. H.; Gorina, S.; Marechal, V.; Elenbaas, B.; Moreau, J.; Levine, A. J.; Pavletich, N. P. Science 1996, 274, 948. (371) Chan, H. M.; La Thangue, N. B. J. Cell Sci. 2001, 114, 2363. (372) Greer, S. N.; Metcalf, J. L.; Wang, Y.; Ohh, M. EMBO J. 2012, 31, 2448. (373) Depping, R.; Steinhoff, A.; Schindler, S. G.; Friedrich, B.; Fagerlund, R.; Metzen, E.; Hartmann, E.; Kohler, M. Biochim. Biophys. Acta 2008, 1783, 394. (374) Beckerman, R.; Prives, C. Cold Spring Harbor Perspect. Biol. 2010, 2, a000935. (375) Wade, M.; Wang, Y. V.; Wahl, G. M. Trends Cell Biol. 2010, 20, 299. (376) Geyer, R. K.; Yu, Z. K.; Maki, C. G. Nat. Cell Biol. 2000, 2, 569. (377) O’Keefe, K.; Li, H.; Zhang, Y. Mol. Cell. Biol. 2003, 23, 6396. (378) Hu, M.; Gu, L.; Li, M.; Jeffrey, P. D.; Gu, W.; Shi, Y. PLoS Biol. 2006, 4, e27. (379) Blagosklonny, M. V.; An, W. G.; Romanova, L. Y.; Trepel, J.; Fojo, T.; Neckers, L. J. Biol. Chem.
1998, 273, 11995. (380) Xenaki, G.; Ontikatze, T.; Rajendran, R.; Stratford, I. J.; Dive, C.; Krstic-Demonacos, M.; Demonacos, C. Oncogene 2008, 27, 5785. (381) Meek, D. W.; Anderson, C. W. Cold Spring Harbor Perspect. Biol. 2009, 1, a000950. (382) Canman, C. E.; Lim, D. S.; Cimprich, K. A.; Taya, Y.; Tamai, K.; Sakaguchi, K.; Appella, E.; Kastan, M. B.; Siliciano, J. D. Science 1998, 281, 1677. (383) Maki, C. G.; Howley, P. M. Mol. Cell. Biol. 1997, 17, 355. (384) Lee, C. W.; Arai, M.; Martinez-Yamout, M. A.; Dyson, H. J.; Wright, P. E. Biochemistry 2009, 48, 2115. (385) Turenne, G. A.; Price, B. D. BMC Cell Biol. 2001, 2, 12. (386) Schmid, T.; Zhou, J.; Kohl, R.; Brune, B. Biochem. J. 2004, 380, 289. (387) Elde, N. C.; Malik, H. S. Nat. Rev. Microbiol. 2009, 7, 787. (388) Davey, N. E.; Trave, G.; Gibson, T. J. Trends Biochem. Sci. 2011, 36, 159. (389) Welcker, M.; Clurman, B. E. J. Biol. Chem. 2005, 280, 7654. (390) Coadou, G.; Gharbi-Benarous, J.; Megy, S.; Bertho, G.; EvrardTodeschi, N.; Segeral, E.; Benarous, R.; Girault, J. P. Biochemistry 2003, 42, 14741. (391) Fischer, U.; Huber, J.; Boelens, W. C.; Mattaj, I. W.; Luhrmann, R. Cell 1995, 82, 475. (392) Malim, M. H.; Bohnlein, S.; Hauber, J.; Cullen, B. R. Cell 1989, 58, 205. (393) Fera, D.; Schultz, D. C.; Hodawadekar, S.; Reichman, M.; Donover, P. S.; Melvin, J.; Troutman, S.; Kissil, J. L.; Huryn, D. M.; Marmorstein, R. Chem. Biol. 2012, 19, 518. (394) Mattoo, S.; Lee, Y. M.; Dixon, J. E. Curr. Opin. Immunol. 2007, 19, 392. (395) Giraldo, M. C.; Valent, B. Nat. Rev. Microbiol. 2013, 11, 800.

(396) Sallee, N. A.; Rivera, G. M.; Dueber, J. E.; Vasilescu, D.; Mullins, R. D.; Mayer, B. J.; Lim, W. A. Nature 2008, 454, 1005. (397) Gruenheid, S.; DeVinney, R.; Bladt, F.; Goosney, D.; Gelkop, S.; Gish, G. D.; Pawson, T.; Finlay, B. B. Nat. Cell Biol. 2001, 3, 856. (398) Higashi, H.; Tsutsumi, R.; Fujita, A.; Yamazaki, S.; Asaka, M.; Azuma, T.; Hatakeyama, M. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 14428. (399) Lee, I. O.; Kim, J. H.; Choi, Y. J.; Pillinger, M. H.; Kim, S. Y.; Blaser, M. J.; Lee, Y. C. J. Biol. Chem. 2010, 285, 16042. (400) Nesic, D.; Miller, M. C.; Quinkert, Z. T.; Stein, M.; Chait, B. T.; Stebbins, C. E. Nat. Struct. Mol. Biol. 2010, 17, 130. (401) Braun, L.; Brenier-Pinchart, M. P.; Yogavel, M.; Curt-Varesano, A.; Curt-Bertini, R. L.; Hussain, T.; Kieffer-Jaquinod, S.; Coute, Y.; Pelloux, H.; Tardieux, I.; Sharma, A.; Belrhali, H.; Bougdour, A.; Hakimi, M. A. J. Exp. Med. 2013, 210, 2071. (402) Duesbery, N. S.; Webb, C. P.; Leppla, S. H.; Gordon, V. M.; Klimpel, K. R.; Copeland, T. D.; Ahn, N. G.; Oskarsson, M. K.; Fukasawa, K.; Paull, K. D.; Vande Woude, G. F. Science 1998, 280, 734. (403) Tanramluk, D.; Schreyer, A.; Pitt, W. R.; Blundell, T. L. Chem. Biol. Drug Des. 2009, 74, 16. (404) Wurtele, M.; Jelich-Ottmann, C.; Wittinghofer, A.; Oecking, C. EMBO J. 2003, 22, 987. (405) Bullen, H. E.; Crabb, B. S.; Gilson, P. R. Curr. Opin. Microbiol. 2012, 15, 699. (406) Juarez, P.; Comas, I.; Gonzalez-Candelas, F.; Calvete, J. J. Mol. Biol. Evol. 2008, 25, 2391. (407) Lu, X.; Lu, D.; Scully, M. F.; Kakkar, V. V. Curr. Med. Chem. Cardiovasc. Hematol. Agents 2003, 1, 189. (408) Reiss, S.; Sieber, M.; Oberle, V.; Wentzel, A.; Spangenberg, P.; Claus, R.; Kolmar, H.; Losche, W. Platelets 2006, 17, 153. (409) Swenson, S.; Ramu, S.; Markland, F. S. Curr. Pharm. Des. 2007, 13, 2860. (410) Walsh, E. M.; Marcinkiewicz, C. Toxicon 2011, 58, 355. (411) Pornillos, O.; Alam, S. L.; Davis, D. R.; Sundquist, W. I. Nat. Struct. Biol. 2002, 9, 812. 
(412) Bowers, K.; Pelchen-Matthews, A.; Honing, S.; Vance, P. J.; Creary, L.; Haggarty, B. S.; Romano, J.; Ballensiefen, W.; Hoxie, J. A.; Marsh, M. Traffic 2000, 1, 661. (413) Bardwell, A. J.; Abdollahi, M.; Bardwell, L. Biochem. J. 2004, 378, 569. (414) Maier, A. G.; Cooke, B. M.; Cowman, A. F.; Tilley, L. Nat. Rev. Microbiol. 2009, 7, 341. (415) Sutcliffe, M. J.; Jaseja, M.; Hyde, E. I.; Lu, X.; Williams, J. A. Nat. Struct. Biol. 1994, 1, 802. (416) Reekmans, S. M.; Pflanzner, T.; Gordts, P. L.; Isbert, S.; Zimmermann, P.; Annaert, W.; Weggen, S.; Roebroek, A. J.; Pietrzik, C. U. Cell. Mol. Life Sci. 2010, 67, 135. (417) Kadaveru, K.; Vyas, J.; Schiller, M. R. Front. Biosci. 2008, 13, 6455. (418) Sunyaev, S.; Ramensky, V.; Bork, P. Trends Genet. 2000, 16, 198. (419) Vacic, V.; Markwick, P. R.; Oldfield, C. J.; Zhao, X.; Haynes, C.; Uversky, V. N.; Iakoucheva, L. M. PLoS Comput. Biol. 2012, 8, e1002709. (420) Uversky, V. N.; Oldfield, C. J.; Dunker, A. K. Annu. Rev. Biophys. 2008, 37, 215. (421) Midic, U.; Oldfield, C. J.; Dunker, A. K.; Obradovic, Z.; Uversky, V. N. BMC Genomics 2009, 10 (Suppl1), S12. (422) Marcotte, E. M.; Tsechansky, M. Cell 2009, 138, 16. (423) Mazelova, J.; Astuto-Gribble, L.; Inoue, H.; Tam, B. M.; Schonteich, E.; Prekeris, R.; Moritz, O. L.; Randazzo, P. A.; Deretic, D. EMBO J. 2009, 28, 183. (424) Emmer, B. T.; Maric, D.; Engman, D. M. J. Cell Sci. 2010, 123, 529. (425) Deretic, D.; Schmerl, S.; Hargrave, P. A.; Arendt, A.; McDowell, J. H. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 10620. (426) Macke, J. P.; Hennessey, J. C.; Nathans, J. Hum. Mol. Genet. 1995, 4, 775.

dx.doi.org/10.1021/cr400585q | Chem. Rev. 2014, 114, 6733−6778


(427) Rakoczy, E. P.; Kiel, C.; McKeone, R.; Stricher, F.; Serrano, L. J. Mol. Biol. 2011, 405, 584. (428) Inagaki, Y.; Mitsutake, S.; Igarashi, Y. Biochem. Biophys. Res. Commun. 2006, 343, 982. (429) Ali, M.; Ramprasad, V. L.; Soumittra, N.; Mohamed, M. D.; Jafri, H.; Rashid, Y.; Danciger, M.; McKibbin, M.; Kumaramanickavel, G.; Inglehearn, C. F. Mol. Vis. 2008, 14, 1960. (430) Cordeddu, V.; Di Schiavi, E.; Pennacchio, L. A.; Ma’ayan, A.; Sarkozy, A.; Fodale, V.; Cecchetti, S.; Cardinale, A.; Martin, J.; Schackwitz, W.; Lipzen, A.; Zampino, G.; Mazzanti, L.; Digilio, M. C.; Martinelli, S.; Flex, E.; Lepri, F.; Bartholdi, D.; Kutsche, K.; Ferrero, G. B.; Anichini, C.; Selicorni, A.; Rossi, C.; Tenconi, R.; Zenker, M.; Merlo, D.; Dallapiccola, B.; Iyengar, R.; Bazzicalupo, P.; Gelb, B. D.; Tartaglia, M. Nat. Genet. 2009, 41, 1022. (431) Chan, E. F.; Gat, U.; McNiff, J. M.; Fuchs, E. Nat. Genet. 1999, 21, 410. (432) Legoix, P.; Bluteau, O.; Bayer, J.; Perret, C.; Balabaud, C.; Belghiti, J.; Franco, D.; Thomas, G.; Laurent-Puig, P.; Zucman-Rossi, J. Oncogene 1999, 18, 4044. (433) Courtois, G.; Smahi, A.; Reichenbach, J.; Doffinger, R.; Cancrini, C.; Bonnet, M.; Puel, A.; Chable-Bessia, C.; Yamaoka, S.; Feinberg, J.; Dupuis-Girod, S.; Bodemer, C.; Livadiotti, S.; Novelli, F.; Rossi, P.; Fischer, A.; Israel, A.; Munnich, A.; Le Deist, F.; Casanova, J. L. J. Clin. Invest. 2003, 112, 1108. (434) Furlow, P. W.; Percy, M. J.; Sutherland, S.; Bierl, C.; McMullin, M. F.; Master, S. R.; Lappin, T. R.; Lee, F. S. J. Biol. Chem. 2009, 284, 9050. (435) Zhou, R.; Patel, S. V.; Snyder, P. M. J. Biol. Chem. 2007, 282, 20207. (436) Abriel, H.; Loffing, J.; Rebhun, J. F.; Pratt, J. H.; Schild, L.; Horisberger, J. D.; Rotin, D.; Staub, O. J. Clin. Invest. 1999, 103, 667. (437) Inoue, J.; Iwaoka, T.; Tokunaga, H.; Takamune, K.; Naomi, S.; Araki, M.; Takahama, K.; Yamaguchi, K.; Tomita, K. J. Clin. Endocrinol. Metab. 1998, 83, 2210. (438) Hansson, J. H.; Schild, L.; Lu, Y.; Wilson, T. A.; Gautschi, I.; Shimkets, R.; Nelson-Williams, C.; Rossier, B. C.; Lifton, R. P. Proc. Natl. Acad. Sci. U.S.A. 1995, 92, 11495.
(439) Furuhashi, M.; Kitamura, K.; Adachi, M.; Miyoshi, T.; Wakida, N.; Ura, N.; Shikano, Y.; Shinshi, Y.; Sakamoto, K.; Hayashi, M.; Satoh, N.; Nishitani, T.; Tomita, K.; Shimamoto, K. J. Clin. Endocrinol. Metab. 2005, 90, 340. (440) Mukherjee, S.; Cruz-Rodriguez, O.; Bolton, E.; Iniguez-Lluhi, J. A. J. Biol. Chem. 2012, 287, 31195. (441) Hay, C. W.; McEwan, I. J. PLoS One 2012, 7, e32514. (442) Xu, Z. C.; Yang, Y.; Hebert, S. C. J. Biol. Chem. 1996, 271, 9313. (443) Simon, D. B.; Karet, F. E.; Rodriguez-Soriano, J.; Hamdan, J. H.; DiPietro, A.; Trachtman, H.; Sanjad, S. A.; Lifton, R. P. Nat. Genet. 1996, 14, 152. (444) Tamura, H.; Schild, L.; Enomoto, N.; Matsui, N.; Marumo, F.; Rossier, B. C. J. Clin. Invest. 1996, 97, 1780. (445) Arboleda, V. A.; Lee, H.; Parnaik, R.; Fleming, A.; Banerjee, A.; Ferraz-de-Souza, B.; Delot, E. C.; Rodriguez-Fernandez, I. A.; Braslavsky, D.; Bergada, I.; Dell’Angelica, E. C.; Nelson, S. F.; Martinez-Agosto, J. A.; Achermann, J. C.; Vilain, E. Nat. Genet. 2012, 44, 788. (446) Schmalstieg, F. C.; Leonard, W. J.; Noguchi, M.; Berg, M.; Rudloff, H. E.; Denney, R. M.; Dave, S. K.; Brooks, E. G.; Goldman, A. S. J. Clin. Invest. 1995, 95, 1169. (447) Melhuish, T. A.; Wotton, D. J. Biol. Chem. 2000, 275, 39762. (448) Longo, N.; Wang, Y.; Smith, S. A.; Langley, S. D.; DiMeglio, L. A.; Giannella-Neto, D. Hum. Mol. Genet. 2002, 11, 1465. (449) Sommers, C. L.; Lee, J.; Steiner, K. L.; Gurson, J. M.; Depersis, C. L.; El-Khoury, D.; Fuller, C. L.; Shores, E. W.; Love, P. E.; Samelson, L. E. J. Exp. Med. 2005, 201, 1125. (450) Sohn, A. S.; Glockle, N.; Doetzer, A. D.; Deuschl, G.; Felbor, U.; Topka, H. R.; Schols, L.; Riess, O.; Bauer, P.; Muller, U.; Grundmann, K. Mov. Disord. 2010, 25, 1982.

(451) Groen, J. L.; Ritz, K.; Contarino, M. F.; van de Warrenburg, B. P.; Aramideh, M.; Foncke, E. M.; van Hilten, J. J.; Schuurman, P. R.; Speelman, J. D.; Koelman, J. H.; de Bie, R. M.; Baas, F.; Tijssen, M. A. Mov. Disord. 2010, 25, 2420. (452) Houlden, H.; Schneider, S. A.; Paudel, R.; Melchers, A.; Schwingenschuh, P.; Edwards, M.; Hardy, J.; Bhatia, K. P. Neurology 2010, 74, 846. (453) Kalay, E.; de Brouwer, A. P.; Caylan, R.; Nabuurs, S. B.; Wollnik, B.; Karaguzel, A.; Heister, J. G.; Erdol, H.; Cremers, F. P.; Cremers, C. W.; Brunner, H. G.; Kremer, H. J. Mol. Med. (Berl.) 2005, 83, 1025. (454) Percy, M. J.; Beer, P. A.; Campbell, G.; Dekker, A. W.; Green, A. R.; Oscier, D.; Rainey, M. G.; van Wijk, R.; Wood, M.; Lappin, T. R.; McMullin, M. F.; Lee, F. S. Blood 2008, 111, 5400. (455) Percy, M. J.; Chung, Y. J.; Harrison, C.; Mercieca, J.; Hoffbrand, A. V.; Dinardo, C. L.; Santos, P. C.; Fonseca, G. H.; Gualandro, S. F.; Pereira, A. C.; Lappin, T. R.; McMullin, M. F.; Lee, F. S. Am. J. Hematol. 2012, 87, 439. (456) Ueki, Y.; Tiziani, V.; Santanna, C.; Fukai, N.; Maulik, C.; Garfinkle, J.; Ninomiya, C.; doAmaral, C.; Peters, H.; Habal, M.; Rhee-Morris, L.; Doss, J. B.; Kreiborg, S.; Olsen, B. R.; Reichenberger, E. Nat. Genet. 2001, 28, 125. (457) Levaot, N.; Voytyuk, O.; Dimitriou, I.; Sircoulomb, F.; Chandrakumar, A.; Deckert, M.; Krzyzanowski, P. M.; Scotter, A.; Gu, S.; Janmohamed, S.; Cong, F.; Simoncic, P. D.; Ueki, Y.; La Rose, J.; Rottapel, R. Cell 2011, 147, 1324. (458) Davis, C. G.; Lehrman, M. A.; Russell, D. W.; Anderson, R. G.; Brown, M. S.; Goldstein, J. L. Cell 1986, 45, 15. (459) Rosas, D. J.; Roman, A. J.; Weissbrod, P.; Macke, J. P.; Nathans, J. Invest. Ophthalmol. Vis. Sci. 1994, 35, 3134. (460) Kaiser, F. J.; Brega, P.; Raff, M. L.; Byers, P. H.; Gallati, S.; Kay, T. T.; de Almeida, S.; Horsthemke, B.; Ludecke, H. J. Eur. J. Hum. Genet. 2004, 12, 121. (461) Hiort, O.; Holterhus, P. M.; Horter, T.; Schulze, W.; Kremke, B.; Bals-Pratsch, M.; Sinnecker, G. H.; Kruse, K. J. Clin. Endocrinol. Metab. 2000, 85, 2810.
(462) Audi, L.; Fernandez-Cancio, M.; Carrascosa, A.; Andaluz, P.; Toran, N.; Piro, C.; Vilaro, E.; Vicens-Calvet, E.; Gussinye, M.; Albisu, M. A.; Yeste, D.; Clemente, M.; Hernandez de la Calle, I.; Del Campo, M.; Vendrell, T.; Blanco, A.; Martinez-Mora, J.; Granada, M. L.; Salinas, I.; Forn, J.; Calaf, J.; Angerri, O.; Martinez-Sopena, M. J.; Del Valle, J.; Garcia, E.; Gracia-Bouthelier, R.; Lapunzina, P.; Mayayo, E.; Labarta, J. I.; Lledo, G.; Sanchez Del Pozo, J.; Arroyo, J.; Perez-Aytes, A.; Beneyto, M.; Segura, A.; Borras, V.; Gabau, E.; Caimari, M.; Rodriguez, A.; Martinez-Aedo, M. J.; Carrera, M.; Castano, L.; Andrade, M.; Bermudez de la Vega, J. A. J. Clin. Endocrinol. Metab. 2010, 95, 1876. (463) Bhangoo, A.; Paris, F.; Philibert, P.; Audran, F.; Ten, S.; Sultan, C. Asian J. Androl. 2010, 12, 561. (464) Ferlin, A.; Vinanzi, C.; Garolla, A.; Selice, R.; Zuccarello, D.; Cazzadore, C.; Foresta, C. Clin. Endocrinol. (Oxford) 2006, 65, 606. (465) Garolla, A.; Ferlin, A.; Vinanzi, C.; Roverato, A.; Sotti, G.; Artibani, W.; Foresta, C. Endocr. Relat. Cancer 2005, 12, 645. (466) Milewicz, D. M.; Grossfield, J.; Cao, S. N.; Kielty, C.; Covitz, W.; Jewett, T. J. Clin. Invest. 1995, 95, 2373. (467) Kato, K.; Jeanneau, C.; Tarp, M. A.; Benet-Pages, A.; Lorenz-Depiereux, B.; Bennett, E. P.; Mandel, U.; Strom, T. M.; Clausen, H. J. Biol. Chem. 2006, 281, 18370. (468) Yoshimasa, Y.; Seino, S.; Whittaker, J.; Kakehi, T.; Kosaki, A.; Kuzuya, H.; Imura, H.; Bell, G. I.; Steiner, D. F. Science 1988, 240, 784. (469) Woo, J. S.; Hwang, J. H.; Ko, J. K.; Weisleder, N.; Kim do, H.; Ma, J.; Lee, E. H. Biochem. J. 2010, 427, 125.
(470) Bertolotto, C.; Lesueur, F.; Giuliano, S.; Strub, T.; de Lichy, M.; Bille, K.; Dessen, P.; d’Hayer, B.; Mohamdi, H.; Remenieras, A.; Maubec, E.; de la Fouchardiere, A.; Molinie, V.; Vabres, P.; Dalle, S.; Poulalhon, N.; Martin-Denavit, T.; Thomas, L.; Andry-Benzaquen, P.; Dupin, N.; Boitier, F.; Rossi, A.; Perrot, J. L.; Labeille, B.; Robert, C.; Escudier, B.; Caron, O.; Brugieres, L.; Saule, S.; Gardie, B.; Gad, S.; Richard, S.; Couturier, J.; Teh, B. T.; Ghiorzo, P.; Pastorino, L.; Puig, S.; Badenas, C.; Olsson, H.; Ingvar, C.; Rouleau, E.; Lidereau, R.; Bahadoran, P.; Vielh, P.; Corda, E.; Blanche, H.; Zelenika, D.; Galan, P.; Aubin, F.; Bachollet, B.; Becuwe, C.; Berthet, P.; Bignon, Y. J.; Bonadona, V.; Bonafe, J. L.; Bonnet-Dupeyron, M. N.; Cambazard, F.; Chevrant-Breton, J.; Coupier, I.; Dalac, S.; Demange, L.; d’Incan, M.; Dugast, C.; Faivre, L.; Vincent-Fetita, L.; Gauthier-Villars, M.; Gilbert, B.; Grange, F.; Grob, J. J.; Humbert, P.; Janin, N.; Joly, P.; Kerob, D.; Lasset, C.; Leroux, D.; Levang, J.; Limacher, J. M.; Livideanu, C.; Longy, M.; Lortholary, A.; Stoppa-Lyonnet, D.; Mansard, S.; Mansuy, L.; Marrou, K.; Mateus, C.; Maugard, C.; Meyer, N.; Nogues, C.; Souteyrand, P.; Venat-Bouvet, L.; Zattara, H.; Chaudru, V.; Lenoir, G. M.; Lathrop, M.; Davidson, I.; Avril, M. F.; Demenais, F.; Ballotti, R.; Bressac-de Paillerets, B. Nature 2011, 480, 94. (471) Toh, K. L.; Jones, C. R.; He, Y.; Eide, E. J.; Hinz, W. A.; Virshup, D. M.; Ptacek, L. J.; Fu, Y. H. Science 2001, 291, 1040. (472) Bungert, S.; Molday, L. L.; Molday, R. S. J. Biol. Chem. 2001, 276, 23539. (473) Noble, M. E.; Endicott, J. A.; Johnson, L. N. Science 2004, 303, 1800. (474) Zhang, J.; Yang, P. L.; Gray, N. S. Nat. Rev. Cancer 2009, 9, 28. (475) Druker, B. J.; Tamura, S.; Buchdunger, E.; Ohno, S.; Segal, G. M.; Fanning, S.; Zimmermann, J.; Lydon, N. B. Nat. Med. 1996, 2, 561. (476) Druker, B. J.; Lydon, N. B. J. Clin. Invest. 2000, 105, 3. (477) Zhao, L.; Chmielewski, J. Curr. Opin. Struct. Biol. 2005, 15, 31. (478) Parthasarathi, L.; Casey, F.; Stein, A.; Aloy, P.; Shields, D. C. J. Chem. Inf. Model. 2008, 48, 1943. (479) Brennan, R. C.; Federico, S.; Bradley, C.; Zhang, J.; Flores-Otero, J.; Wilson, M.; Stewart, C.; Zhu, F.; Guy, K.; Dyer, M. A. Cancer Res. 2011, 71, 4205. (480) Ray-Coquard, I.; Blay, J. Y.; Italiano, A.; Le Cesne, A.; Penel, N.; Zhi, J.; Heil, F.; Rueger, R.; Graves, B.; Ding, M.; Geho, D.; Middleton, S. A.; Vassilev, L. T.; Nichols, G. L.; Bui, B. N. Lancet Oncol. 2012, 13, 1133.
(481) Mas-Moruno, C.; Rechenmacher, F.; Kessler, H. Anticancer Agents Med. Chem. 2010, 10, 753. (482) MacDonald, T. J.; Stewart, C. F.; Kocak, M.; Goldman, S.; Ellenbogen, R. G.; Phillips, P.; Lafond, D.; Poussaint, T. Y.; Kieran, M. W.; Boyett, J. M.; Kun, L. E. J. Clin. Oncol. 2008, 26, 919. (483) Carter, A. J. Natl. Cancer Inst. 2010, 102, 675. (484) Vassilev, L. T.; Vu, B. T.; Graves, B.; Carvajal, D.; Podlaski, F.; Filipovic, Z.; Kong, N.; Kammlott, U.; Lukacs, C.; Klein, C.; Fotouhi, N.; Liu, E. A. Science 2004, 303, 844. (485) Oliveira-Ferrer, L.; Hauschild, J.; Fiedler, W.; Bokemeyer, C.; Nippgen, J.; Celik, I.; Schuch, G. J. Exp. Clin. Cancer Res. 2008, 27, 86. (486) Reardon, D. A.; Neyns, B.; Weller, M.; Tonn, J. C.; Nabors, L. B.; Stupp, R. Future Oncol. 2011, 7, 339. (487) Fujii, N.; Haresco, J. J.; Novak, K. A.; Stokoe, D.; Kuntz, I. D.; Guy, R. K. J. Am. Chem. Soc. 2003, 125, 12074. (488) Corradi, V.; Mancini, M.; Manetti, F.; Petta, S.; Santucci, M. A.; Botta, M. Bioorg. Med. Chem. Lett. 2010, 20, 6133. (489) Oneyama, C.; Agatsuma, T.; Kanda, Y.; Nakano, H.; Sharma, S. V.; Nakano, S.; Narazaki, F.; Tatsuta, K. Chem. Biol. 2003, 10, 443. (490) Blees, J. S.; Bokesch, H. R.; Rubsamen, D.; Schulz, K.; Milke, L.; Bajer, M. M.; Gustafson, K. R.; Henrich, C. J.; McMahon, J. B.; Colburn, N. H.; Schmid, T.; Brune, B. PLoS One 2012, 7, e46567. (491) Fujii, N.; You, L.; Xu, Z.; Uematsu, K.; Shan, J.; He, B.; Mikami, I.; Edmondson, L. R.; Neale, G.; Zheng, J.; Guy, R. K.; Jablons, D. M. Cancer Res. 2007, 67, 573. (492) Buckley, D. L.; Van Molle, I.; Gareiss, P. C.; Tae, H. S.; Michel, J.; Noblin, D. J.; Jorgensen, W. L.; Ciulli, A.; Crews, C. M. J. Am. Chem. Soc. 2012, 134, 4465. (493) Eck, M. J.; Manley, P. W. Curr. Opin. Cell Biol. 2009, 21, 288. (494) Park, K. D.; Kim, D.; Reamtong, O.; Eyers, C.; Gaskell, S. J.; Liu, R.; Kohn, H. J. Am. Chem. Soc. 2011, 133, 11320. (495) Arrendale, A.; Kim, K.; Choi, J. Y.; Li, W.; Geahlen, R. L.; Borch, R. F. Chem. Biol. 2012, 19, 764. (496) Gril, B.; Vidal, M.; Assayag, F.; Poupon, M. F.; Liu, W. Q.; Garbay, C. Int. J. Cancer 2007, 121, 407.

(497) Geistlinger, T. R.; Guy, R. K. J. Am. Chem. Soc. 2003, 125, 6852. (498) Oneyama, C.; Nakano, H.; Sharma, S. V. Oncogene 2002, 21, 2037. (499) Ogura, K.; Tsuchiya, S.; Terasawa, H.; Yuzawa, S.; Hatanaka, H.; Mandiyan, V.; Schlessinger, J.; Inagaki, F. J. Mol. Biol. 1999, 289, 439. (500) Myslinski, J. M.; DeLorbe, J. E.; Clements, J. H.; Martin, S. F. J. Am. Chem. Soc. 2011, 133, 18518. (501) Zhang, Y.; Zhang, J.; Yuan, C.; Hard, R. L.; Park, I. H.; Li, C.; Bell, C.; Pei, D. Biochemistry 2011, 50, 7637. (502) Das, S.; Raychaudhuri, M.; Sen, U.; Mukhopadhyay, D. J. Mol. Biol. 2011, 414, 217. (503) Kaneko, T.; Huang, H.; Zhao, B.; Li, L.; Liu, H.; Voss, C. K.; Wu, C.; Schiller, M. R.; Li, S. S. Sci. Signal 2010, 3, ra34. (504) Lorenz, S.; Vakonakis, I.; Lowe, E. D.; Campbell, I. D.; Noble, M. E.; Hoellerer, M. K. Structure 2008, 16, 1521. (505) Hurley, T. D.; Yang, J.; Zhang, L.; Goodwin, K. D.; Zou, Q.; Cortese, M.; Dunker, A. K.; DePaoli-Roach, A. A. J. Biol. Chem. 2007, 282, 28874. (506) Terrak, M.; Kerff, F.; Langsetmo, K.; Tao, T.; Dominguez, R. Nature 2004, 429, 780.
