Reverse Binding Mode of Phosphotyrosine Peptides with SH2 Protein


Article

Reverse Binding Mode of Phosphotyrosine Peptides with SH2 Protein Rui Wang, Pete Yuk Ming Leung, Feng Huang, Qingzhuang Tang, Tomonori Kaneko, Mei Huang, Zigang Li, Shawn S.C. Li, Yi Wang, and Jiang Xia Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.8b00677 • Publication Date (Web): 09 Aug 2018 Downloaded from http://pubs.acs.org on August 15, 2018



Biochemistry

Reverse Binding Mode of Phosphotyrosine Peptides with SH2 Protein

Rui Wang1, 3, Pete Y. M. Leung2, Feng Huang1, Qingzhuang Tang1, Tomonori Kaneko3, Mei Huang3, Zigang Li4, Shawn S.C. Li3, *, Yi Wang2, *, Jiang Xia1,*

1Department of Chemistry, 2Department of Physics, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China. 3Department of Biochemistry and the Siebens-Drake Medical Research Institute, Schulich School of Medicine and Dentistry, Western University, London, Ontario N6A 5C1, Canada. 4School of Chemical Biology and Biotechnology, Shenzhen Graduate School of Peking University, Shenzhen 518055, China. *Address correspondence to [email protected], [email protected], or [email protected]

Lead contact: Jiang Xia, Department of Chemistry, the Chinese University of Hong Kong, Shatin, Hong Kong SAR, China. Email: [email protected]


ABSTRACT

Discerning the different interaction states during dynamic protein–ligand binding is difficult. Here we apply site-specific cysteine–α-chloroacetyl crosslinking to scrutinize the binding between the Src homology 2 (SH2) domain and phosphotyrosine (pY) peptides, a highly dynamic interaction that is key to cellular signal transduction. Starting from a model SH2 protein and extending to a set of representative SH2 domains, we show that a proximity-induced cysteine–α-chloroacetyl reaction crosslinks two chemical groups that are brought into spatial proximity by the binding interaction and, reciprocally, that information about the interaction states can be deduced from the crosslinked products. To our surprise, we found that SH2 domains can adopt a reverse binding mode with “single-pronged”, “two-pronged” and “half” pY peptides. This finding was further supported by a set of 500-ns molecular dynamics simulations. This serendipitous finding defies the canonical theory of SH2 binding, suggests a possible source of the versatility of SH2 signalling, and sets a model for other protein binding interactions.


INTRODUCTION

The -chloroacetyl group is known to react rapidly with cysteinyl thiol groups that are in the proximity, but very slowly with nucleophiles at distal positions in the same protein or in other proteins or small molecules with a rate often down to an unnoticeable level (Fig. 1A) [1-8]. This proximity effect (a.k.a. proximity-induced or proximity-enabled reactivity) allows the differentiation of a cysteine residue at a proximal location relative to the electrophile versus ones at distal locations on the surface of the same protein. For example, Bai et al utilized the reactivity of a chloroacetylmodified colchicine analog with different cysteine residues of -tubulin to deduce the binding site of colchicine in -tubulin. [1] Spontaneous cysteine–cysteine disulfide crosslinking, another proximal reaction, has also been used to probe the detailed features of protein–protein interactions by covalently locking native interactions while preserving their structural states (for example, resolving the orientation of the coiled coil peptides and locking enzyme–substrate interactions for structural studies). [9-15] One of the differences between the two is that disulfide bond formation requires a redox buffer and is a relatively slow equilibrium, whereas the cysteine–-chloroacetyl reaction produces an irreversible thioester linkage within several minutes and the reaction occurs in common buffers. (Other cysteine-specific proximal reactive groups include haloalkyl or acryloyl groups. [1620]) We reason that the reactivity between a cysteine on a protein (SH2 domain, Src homology 2 domain) and an -chloroacetyl group installed at a specific position of the ligand peptide can serve as an indicator of the spatial closeness of the two residues, based on which different binding states can be covalently locked.


Figure 1. Probing SH2–ligand interaction by proximity-induced thiol-chloroacetyl crosslinking. (A) Diagram depicting the proximity-induced crosslinking strategy used in this study to covalently tether an SH2 domain (shown in surface representation) to a bound peptide (in blue). A covalent bond forms between a surface Cys residue on the SH2 domain and the chloroacetyl group incorporated into the peptide ligand when the two are brought into close proximity by the binding interaction. The chemical structure of the unnatural amino acid X, (2S)-2-amino-3-[(chloroacetyl)-amino]propionic acid, is shown in the inset. (B) The structure of the PLCγ1-c SH2 domain (ribbon) in complex with a pY peptide (PDB id: 4K44), highlighting the binding interface, the orientation of the peptide and the two binding pockets. Residues in the bound peptide are numbered relative to the pY. (C) Crosslinking reaction kinetics for peptide LA+1 with the PLCγ1-c SH2 domain (at a peptide-to-protein ratio of 5:1) at 37 °C in PBS. (D) Covalent crosslinking reactions of the chloroacetyl peptides with the PLCγ1-c SH2 domain. Peptides LA+1 to LB+3, containing the reactive X residue at different locations of the physiological ligands LA or LB in Table 1, were assayed for spontaneous covalent reactions with the SH2 domain. The peptide–SH2 covalent complexes show higher apparent molecular weights than the free SH2 protein on SDS-PAGE. The reactions were quantified by measuring the percentage of peptide-conjugated SH2 domain over the total amount of SH2 domain after 24 h of incubation at 37 °C in PBS. The error bars represent the standard deviation from at least three independent experiments.

Through binding to Tyr-phosphorylated sequences, the SH2 domains play a pivotal role in transducing the signal emanating from a tyrosine kinase to control cell proliferation, differentiation, movement or survival. [21-25] An SH2 domain typically contains two pockets on its surface: a pY-binding pocket formed by the N-terminal half, including the α-helix A (αA) and the strands βA, βB, βC and βD, and a hydrophobic specificity pocket formed by the C-terminal half of the SH2 domain, which accommodates a hydrophobic residue located C-terminal to the pY (i.e., at the pY+3 or pY+4 position) in the peptide ligand (Fig. 1B).[26] The phosphotyrosine residue is indispensable for SH2 binding as it provides approximately half of the total binding energy.[27] This bipartite “two-pronged plug, two-holed socket” binding mode suggests that the peptide adopts only one fixed orientation on the surface of the SH2 domain. However, an unresolved puzzle in SH2 signaling is how the roughly 40,000 phosphotyrosine sites that can simultaneously exist in the cell are recognized by the significantly smaller number of SH2 domains (~120 members). It is conceivable that an SH2 domain may encounter many phosphotyrosine ligands before a productive interaction can occur. Is finding the right binding partner by an SH2 domain akin to fitting a key to a lock, or is the process more dynamic? In other words, do other modes of ligand binding exist in addition to the canonical “two-pronged” mode of SH2–ligand interaction? In the latter regard, while other peptide-binding modules such as the SH3 and WW domains can bind their respective ligands in two opposite orientations, [28] little is known about whether a reverse-orientation binding mode exists for SH2 domains.
Qin et al. showed that the five residues N-terminal to pY, but not any of the C-terminal residues, are important for binding to the SH2 domains of the phosphatases SHP-1 and SHP-2, and concluded for the first time that at least some SH2 domains can bind to pY peptides in an alternative mode by recognizing only the residues N-terminal to pY. [29] Interestingly, it was later found that the Cbl tyrosine kinase binding domain (Cbl-TKB), which contains an “embedded” SH2 domain, binds pY peptides derived from Sprouty 2 and the EGF receptor in the canonical orientation, but a peptide from the Met receptor tyrosine kinase in a non-canonical manner [30]. In the latter case, the Met peptide adopts a reverse binding orientation, with the pY-4 residue occupying the same hydrophobic pocket on the Cbl-TKB that is occupied by the pY+4 residue of the Sprouty or EGFR peptide.[31, 32] This suggests that there is a relatively low energy barrier between the canonical and reverse binding modes, raising the possibility that both binding modes exist. Therefore, given the dominant contribution of the pY residue to SH2 binding, it is possible that an SH2 domain may explore different binding modes using the phosphotyrosine as a pivot point before arriving at a productive interaction. Further evidence of SH2 interaction dynamics was given by Lindfors et al.: using paramagnetic NMR spectroscopy, the authors discovered that an FAK pY peptide assumes multiple orientations in the complex by sampling a significant part of the surface of an Src SH2 domain, despite the tight interaction measured by thermodynamic methods. [32] Paramagnetic NMR has been a powerful biophysical method to probe transient encounter complexes in a variety of protein–protein associations. [32-35]

SH2 domains are also particularly suited for cysteine crosslinking because cysteine occurs frequently in this protein family. Sequence alignment based on the Pfam protein database [36, 37] indicated that about two thirds of SH2 domains have at least one cysteine, and more than 25% have a cysteine at the ligand binding site that is amenable to proximity-induced cysteine crosslinking. This feature makes it possible to capture transient encounter complexes by the cysteine–α-chloroacetyl reaction. In this work, we show that the crosslinking reaction coincides with the spatial closeness between the cysteine in the SH2 protein and the chloroacetyl group in the pY peptide, first in one SH2 protein and then in a set of representative SH2 domains, using “single-pronged”, “two-pronged” and “half” pY peptides. This led to the serendipitous discovery of a reversed binding orientation. To support this finding, we conducted molecular dynamics simulations and indeed observed the pY peptide flipping completely. Using a chemical method based on protein reactions together with computational simulation, this work thereby reveals a previously unappreciated layer of complexity in ligand recognition by the SH2 domain, in agreement with the previous discovery of dynamic conformational ensembles in biomolecular recognition. [38]


MATERIALS AND EXPERIMENTAL DETAILS

Materials

Unless otherwise noted, all reagents were purchased from commercial sources and used without further purification. Rink Amide-ChemMatrix resin and Fmoc-protected amino acids were obtained from GL Biochem Ltd. (Shanghai, China). 5(6)-FAM (fl) was purchased from Life Technologies (USA). Other reagents were purchased from commercial suppliers, including Labscan Limited (Thailand), Meryer Technologies Co., Ltd. (Shenzhen, China), Chem-Impex International Inc. (USA), and Sigma-Aldrich Co. (USA). DNA oligonucleotides were synthesized by Invitrogen (HK). Restriction endonucleases were purchased from Takara Biotech Co., Ltd. (Dalian, China). PCR products and products of restriction digests were purified by gel electrophoresis and extraction using the gel extraction kit from Takara. Plasmid DNA was purified from overnight cultures by using the Takara Plasmid Miniprep Kit. Sequencing reactions were performed and analyzed at BGI (HK). An In-Gel Tryptic Digestion Kit was purchased from Thermo Fisher Scientific (USA).

Plasmids and protein expression

All the DNA sequences of SH2 domains were synthesized by BGI (Shenzhen, China). The DNA encoding each protein was cloned into the pET28a vector (Invitrogen, Hong Kong) between the NdeI and HindIII sites by standard cloning methods and confirmed by DNA sequencing. The plasmids were transformed into E. coli BL21 (DE3) cells, and colonies were grown overnight in LB media supplemented with antibiotics. The overnight starter culture was used to inoculate 600 mL of LB media with antibiotics. IPTG (1 mM) was added when the cell culture reached OD600 ~ 0.6 to induce protein expression. After overnight growth at 16 °C, cells were harvested, resuspended in lysis buffer (20 mM Tris, 500 mM NaCl, 3 mM DTT, 0.1 mM PMSF, pH 7.5), sonicated, and centrifuged to remove cell debris and obtain the supernatant. Recombinant proteins were purified from the lysates on Ni-NTA agarose (Qiagen) with imidazole elution, followed by size-exclusion separation and dialysis. Protein concentration was determined by Bradford assay and confirmed by SDS-PAGE.

Peptide synthesis, purification and characterization


Peptides were synthesized by solid-phase peptide synthesis based on Fmoc/HBTU chemistry. Phosphotyrosine was incorporated using Fmoc-Tyr(PO(OBzl)OH)-OH. Briefly, the Fmoc group was removed with piperidine/DMF (20% v/v) for 30 min, and coupling was carried out with a 5-fold excess of amino acid activated by one equivalent of a 1:1 HBTU/HOBt mixture and two equivalents of DIEA in DMF. After the peptide sequence was completed, 5(6)-carboxyfluorescein (5(6)-FAM, fl) or 5(6)-carboxytetramethylrhodamine (5(6)-TAMRA, tmr) was coupled to the N-terminal amino group in the presence of EDC/HOBt. For the synthesis of peptides containing the unnatural amino acid X, Fmoc-Dap(Mtt)-OH was incorporated. After the sequence was finished, the Mtt group was selectively removed with 1% TFA and 5% TIS in DCM, and chloroacetic acid was coupled in the presence of HBTU/HOBt. Peptides were cleaved from the resin with a cleavage cocktail containing TFA, EDT, water, and TIS (94:2.5:2.5:1, v/v), precipitated in ice-cold ether, pelleted by centrifugation, dissolved in water, purified by semi-preparative reverse-phase HPLC, and lyophilized. The identities of the synthetic peptides were confirmed by MALDI-TOF MS analysis.

Covalent crosslinking reaction

Purified SH2 protein (30 μM) and pTyr peptides (150 μM) were incubated in PBS (pH 7.4) containing 0.5 mM TCEP at 37 °C. SDS loading buffer was added to the reaction mixture, followed by boiling at 100 °C for five min to stop the crosslinking reaction. The samples were then separated on 16.5% Tris-tricine gels by electrophoresis. The gels were imaged on a Typhoon TRIO+ Variable Mode Imager (GE Healthcare, USA) for in-gel fluorescence scanning at the FITC or TRITC channel. The gels were then stained with Coomassie Blue dye and imaged, and the band intensities were quantified using Bio-Rad software (Bio-Rad, USA). Values are reported as the average of at least three independent experiments, with error bars showing the standard deviation.
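The quantification described above (conjugated SH2 band over total SH2, averaged over replicates) can be sketched as follows. This is only an illustration: the function names and the band intensities are hypothetical and are not taken from the study, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def crosslink_percent(conjugate_intensity: float, free_intensity: float) -> float:
    """Percentage of SH2 protein found in the peptide-conjugated band."""
    total = conjugate_intensity + free_intensity
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * conjugate_intensity / total


def summarize(replicates):
    """Mean and sample standard deviation over (conjugate, free) intensity pairs."""
    yields = [crosslink_percent(c, f) for c, f in replicates]
    if len(yields) < 2:
        raise ValueError("at least two replicates are needed for a standard deviation")
    return mean(yields), stdev(yields)


# Hypothetical band intensities (arbitrary units) from three gels; not real data.
example_replicates = [(600.0, 400.0), (620.0, 380.0), (580.0, 420.0)]
avg, sd = summarize(example_replicates)
print(f"crosslinking yield: {avg:.1f} +/- {sd:.1f} %")
```

Dividing by the sum of the two bands, rather than by a loading control, makes each lane self-normalizing, which matches the "percentage of peptide-conjugated SH2 domain over the total amount of SH2 domain" definition used in the figure captions.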

The reaction of peptide LA-3+3Cys was performed in PBS in the presence of 2 mM GSH/1 mM GSSG at 37 °C overnight. A DTT-free loading buffer (62.5 mM Tris-HCl, pH 6.8, 25% glycerol, 1% Bromophenol Blue) was added to the reaction mixture, followed by thermal denaturation at 100 °C for five min to stop the crosslinking reaction. The samples were then separated on 16.5% Tris-tricine gels by electrophoresis. The fluorescent bands and the Coomassie-stained bands were imaged on a Typhoon imager and a Bio-Rad imager, respectively. After staining and imaging, the intensities of the free protein and protein–peptide conjugate bands were quantified to calculate the reaction percentage.

The fluorescein-labeled chloroacetyl pY peptides were incubated with LCK SH2C224G protein in PBS (pH 7.4, 0.5 mM TCEP) at 37 °C for 3 hours. After the reaction was stopped by heating at 100 °C for five min and the samples were separated on 16.5% Tris-tricine gels, the bands were imaged on a Typhoon imager at the FITC or TRITC channel. The Coomassie Blue-stained gels were imaged on a Bio-Rad imager.

Spot peptide array analysis of SH2 domain specificities

The OPAL peptide array was synthesized on a cellulose membrane using an AUTO-Spot MultiPep RSi (Intavis). All steps were carried out at room temperature unless otherwise specified. The OPAL membrane was first blocked with 5% bovine serum albumin in TBST (0.1 M Tris-HCl (pH 7.4), 150 mM NaCl, and 0.1% Tween 20) for 4 h. LCK SH2 protein (3 μg) was then added to the array membrane at a final concentration of 1 μg/ml and the two were incubated overnight. A monoclonal Lck antibody (sc-433, Santa Cruz Biotechnology, 1 μg/ml) was used to detect the bound SH2 domain. The membranes were washed three times with TBST for 10 min and probed with HRP-conjugated anti-mouse IgG, and bound peptide spots were visualized by enhanced chemiluminescence.

MD simulation: system preparation and protocols

The X-ray structure was obtained from the Protein Data Bank (PDB) with the PDB code 1LCK (31). The PDB structure contains three main parts: the SH2 domain, the SH3 domain and the pY peptide with a phosphorylated tyrosine (pY). The SH3 domain was removed and not used in our simulations. The N and C termini of the SH2 domain and of the pY peptide were capped with acetyl and amide groups, respectively. MD simulations were carried out with the SH2 domain and pY peptide in a dodecahedron simulation box filled with water.

All MD simulations were performed using GROMACS version 5.0.6, [39, 40] with the all-atom CHARMM36 force field. [41, 42] Two types of simulation systems with a dodecahedron box were constructed: the smaller system had a box unit vector of ~8 nm (Crystal_sims_1-3) and the bigger system had a unit vector of ~10 nm (Random_sims_1-6) (see Table S4). In both types of systems, the equivalent of 0.1 M NaCl was added and additional ions were introduced to neutralize the systems. Following PROPKA calculation, [43, 44] charged protein side chains were modeled in their default ionic states at neutral pH. Energy minimization was performed using the steepest descent algorithm for 1000 steps to remove any overlapping contacts. Subsequently, a 20-ns NVT (constant number of particles, volume and temperature) equilibration was performed, followed by a 20-ns NPT (constant number of particles, pressure and temperature) equilibration. In both equilibrations, the Cα atoms of the protein and the peptide were restrained with a spring constant of 1000 kJ mol⁻¹ nm⁻² to allow re-equilibration of water around the protein. The position restraints were removed in the subsequent 500-ns NVT production runs (see Table S4). All protein bonds were constrained using the LINCS algorithm [45] and waters were constrained using the SETTLE algorithm [46], allowing a time step of 2 fs to be used. The v-rescale thermostat, with a coupling time constant of 0.1 ps, was used to maintain the system temperature at 300 K [47]. Electrostatic interactions were computed with a cut-off of 1 nm, with interactions beyond this cut-off treated using the smooth particle mesh Ewald method [48]. The van der Waals (vdW) interactions were computed with a cut-off of 1.2 nm and with the long-range dispersion correction turned off. The neighbor list was updated every 20 steps. GROMACS tools were used to determine the conformational properties of the protein and peptide [49]. Other analysis and visualization of the simulation trajectories were performed with VMD [50].
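As a rough illustration of how the production-run settings quoted above map onto GROMACS run parameters, the sketch below collects them in a Python dictionary and renders them in `.mdp` style. It is not the authors' input file: the integrator choice, the single temperature-coupling group and the reading of "all protein bonds" as `all-bonds` are assumptions, and the helper names are hypothetical.

```python
# Sketch only: values marked "assumed" are not stated in the text.

TIME_STEP_PS = 0.002  # 2 fs time step, as stated


def steps_for(duration_ns: float, dt_ps: float = TIME_STEP_PS) -> int:
    """Number of integration steps needed to cover duration_ns."""
    return round(duration_ns * 1000.0 / dt_ps)


production_mdp = {
    "integrator": "md",              # assumed: default leap-frog integrator
    "dt": TIME_STEP_PS,              # ps
    "nsteps": steps_for(500),        # 500-ns production run
    "tcoupl": "V-rescale",           # v-rescale thermostat
    "tc-grps": "System",             # assumed: one coupling group
    "tau-t": 0.1,                    # ps
    "ref-t": 300,                    # K
    "pcoupl": "no",                  # NVT production, no barostat
    "constraints": "all-bonds",      # assumed reading of "all protein bonds"
    "constraint-algorithm": "lincs",
    "coulombtype": "PME",            # particle mesh Ewald beyond the cut-off
    "rcoulomb": 1.0,                 # nm
    "rvdw": 1.2,                     # nm
    "DispCorr": "no",                # long-range dispersion correction off
    "nstlist": 20,                   # neighbor-list update interval
}


def render_mdp(options: dict) -> str:
    """Render the options as 'key = value' lines."""
    return "\n".join(f"{key:<22} = {value}" for key, value in options.items())


print(render_mdp(production_mdp))
```

With a 2-fs step, the 500-ns production run corresponds to 250 million integration steps and each 20-ns equilibration to 10 million, which is what `steps_for` computes.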


RESULTS

Proximity-induced crosslinking of PLCγ1-c SH2 domain with the peptide ligands

Chloroacetyl-containing peptides were designed to react with one of the cysteine-containing SH2 domains, the C-terminal SH2 domain of the PLCγ1 protein (PLCγ1-c SH2 domain). According to the crystal structure of the protein complex, the PLCγ1-c SH2 domain contains a cysteine (Cys715) in close proximity to the pY+1 residue of the bound peptide (Fig. 1B).[51] Two pY peptide ligands of PLCγ1-c SH2 have been reported: a peptide PGFpYVEAN (called Ligand A or LA herein) from the X-Y linker region of the PLCγ1-c SH2 domain phosphorylated at Tyr783, and a doubly Tyr-phosphorylated sequence DTEVpYESPpYAD (LB) from the SYK tyrosine kinase.[52] Chloroacetyl derivatives of LA and LB were prepared in which the reactive amino acid X was incorporated at the +1, +2 or +3 position relative to the pY by replacing the original amino acid (entries 1-7 in Table 1). Peptides LA and LB possess different binding properties towards the PLCγ1-c SH2 domain: LA causes minimal conformational change in the SH2 domain upon binding (with an RMSD of ~0.3 Å), whereas binding of LB induces extensive structural changes based on the crystal structures.[52] Reactive peptide derivatives (a fluorescein molecule is often installed at the N terminus to facilitate quantification and detection, without affecting the binding property and reactions) were incubated with recombinant PLCγ1-c SH2 protein at 37 °C in phosphate buffered saline (PBS, pH 7.4), the reaction was stopped by thermal denaturation at 95 °C for 10 minutes, and the mixture was resolved on SDS-PAGE. Peptide LA+1, in which the Val residue at the pY+1 position in the parental peptide LA was replaced by X, was found to be the most reactive, converting 79% of the SH2 protein to a covalent complex (Fig. 1C). In contrast, peptides LB+2 and LB+3 exhibited weak reactivity, with crosslinking yields of 4.2% and 1.9% respectively (Fig. 1D).
Because the reactive residue X was positioned at the pY+1, pY+2, and pY+3 sites in peptides LA+1, LB+2, and LB+3 respectively, the dominant reactivity with peptide LA+1 suggests that Cys715 of the PLCγ1-c SH2 domain points towards the pY+1 position of the bound peptide. The PLCγ1-c SH2 domain also has a certain conformational flexibility that allows the thiol group of Cys715 to “wiggle” to the pY+2 or pY+3 positions of the high-affinity LB peptide, with a very low probability that is nevertheless clearly noticeable from the reactivity with LB+2 and LB+3 (Fig. 1C). A non-reactive variant LA+1*, with the sequence PGFpYX*EAN, X*=Dap(Ac), was measured to have a KD of 12.1 μM with the PLCγ1-c SH2 domain (Fig. S1), suggesting that replacing Val with X in LA+1 should retain the majority of the binding affinity of the parental peptide LA (the KD value of the WT LA peptide GFpYVEANPM was reported to be 6.45 μM [53]). This demonstrates that the SH2–pY peptide binding complex can be captured by cysteine crosslinking, and that the efficiency of the cysteine–α-chloroacetyl reaction correlates with the distance between the cysteine and the chloroacetyl group in the binding complex.
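To put the two dissociation constants quoted above on an energy scale, the ratio of KD values can be converted into a difference in binding free energy with ΔΔG = RT ln(KD,variant / KD,reference). The sketch below is an illustration only: it assumes a temperature of 298.15 K, treats both values as micromolar, and ignores that the two peptides differ slightly in length and were measured in different studies.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_kcal_per_mol(kd_variant: float, kd_reference: float,
                     temperature_k: float = 298.15) -> float:
    """Binding free-energy difference from a ratio of dissociation constants.

    Both KD values must be in the same unit; a positive result means the
    variant binds more weakly than the reference.
    """
    if kd_variant <= 0 or kd_reference <= 0:
        raise ValueError("dissociation constants must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_variant / kd_reference)


# KD values quoted in the text: 12.1 for LA+1* and 6.45 for the WT LA peptide.
penalty = ddg_kcal_per_mol(12.1, 6.45)
print(f"ddG = {penalty:.2f} kcal/mol")
```

A less than two-fold change in KD corresponds to only a few tenths of a kcal/mol, which is small relative to typical SH2–pY peptide binding energies and is consistent with the statement that the substitution retains most of the affinity.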

Reaction patterns of different SH2 groups

We next extended the crosslinking reaction to a broader selection of SH2 domains. Sequence analysis of the SH2 domains listed in the Pfam protein database (http://pfam.xfam.org/) [36, 37] reveals that 64% of SH2 domains contain at least one cysteine. Based on the relative position of the Cys residue, Cys-containing SH2 domains are classified into five groups (Fig. 2, Table S1). In group I the Cys residue is located within the BC loop of the SH2 domain, close to the pY-1 and pY-2 positions of the bound phosphopeptide. Group II SH2 domains contain a Cys in the βC strand, close to the pY-1, pY+1 or pY+2 positions of the peptide. Group III (including the PLCγ1-c SH2 domain) features a Cys residue in the βD strand near the pY+1, pY+2 or pY+3 positions of the bound peptide. In group IV a Cys is found in the βE strand or the DE/EF loops, close to the pY+2 or pY+3 positions. These four groups encompass 2,500 SH2 domains, or more than ¼ of the SH2 domains collected in the Pfam database (Fig. 2C). Group V domains were collected from the PDB database instead of the Pfam database, because their Cys is located towards the C terminus, in the αB helix or the BG loop (close to the pY+2 to pY+6 positions), a region that is not included in the sequences in the Pfam database.
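The first statistic above, the fraction of SH2 domains containing at least one cysteine, amounts to a simple count over a set of domain sequences. A minimal sketch is shown below; the sequences are made-up fragments for illustration only, not Pfam entries, and the function name is hypothetical. Assigning domains to groups I to V would additionally require the aligned position of each cysteine, which is not attempted here.

```python
def fraction_with_cysteine(sequences) -> float:
    """Fraction of non-empty sequences containing at least one Cys ('C')."""
    cleaned = [seq.upper() for seq in sequences if seq]
    if not cleaned:
        raise ValueError("no sequences supplied")
    return sum("C" in seq for seq in cleaned) / len(cleaned)


# Made-up one-letter sequence fragments for illustration; not real SH2 domains.
toy_sequences = [
    "WYFGKITRRE",
    "WFHGKLSRKEAC",
    "WYHGCLTRDA",
    "WNVGSSNRNK",
]
print(f"{100 * fraction_with_cysteine(toy_sequences):.0f}% contain a cysteine")
```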


Figure 2. Grouping SH2 domains based on the location of surface Cys residues. (A) Diagram of the structural components of a common SH2 domain. Cys-containing SH2 domains were categorized into five groups, I to V, based on the location of the cysteine. (B) Relative position of the Cys residue on SH2 proteins in different groups, together with the pY residue and the N terminus of the peptide. The SH2 domains are shown in ribbon with the Cys residues as black sticks and highlighted in orange balls. The bound pY peptides are shown in thick blue lines with the pY identified as red sticks and the peptide N terminus labeled with blue solid balls. SH2 groups I to V have cysteine residues at the BC loop, βC region, βD region, βE region and BG loop respectively. (C) Statistics of the number of SH2 domains in groups I to IV based on the alignment of SH2 sequences in the Pfam database.

We then examined the crosslinking reactions of Cys-containing SH2 domains with chloroacetyl pY peptides. To this end, we selected 10 human SH2 domains representing members of groups II to V (Table 2 and Table S2). Group I SH2 domains were excluded because the Cys residue points to the pY residue, thereby precluding cysteine crosslinking. The 10 SH2 proteins were purified and tested for reaction with peptides LA+1 to LB+3 (Table 1). Eight of the ten SH2 proteins showed crosslinking reactivity, with different patterns (Fig. 3A, representative SDS-PAGE gel images shown in Fig. S2). The PLCγ1-c SH2 domain (group III), and the NCK1 and APS SH2 domains (group IV) reacted with LB+2. The LCK, VAV1 and CSK SH2 domains (group V) reacted with peptides LA+3 and LB+3 (Fig. 3A). The SRC SH2 domain, which possesses features of both groups II and V, reacted only with LA+1. Close examination of its structure reveals that the reactive Cys238 is located closer to the pY+1 than to the pY+3 position, consistent with the reaction pattern (Fig. 2B). The PIK3R1-c SH2 domain, which contains three cysteines at the ligand binding interface, showed reactivity with peptides LA+1 and LA+3, indicating that these Cys residues were accessible to the pY+1 and pY+3 positions of the peptide (Fig. 2B). Exceptions do exist. The SYK and SH2D1A SH2 domains (group II) exhibited no reactivity towards this panel of chloroacetyl pY peptides. Closer examination of the crystal structures revealed that the cysteine residues in these two proteins point towards the pY-binding site, making them inaccessible to the chloroacetyl group of the reactive peptides that we synthesized (Fig. 2B). Taken together, the pattern of cysteine-chloroacetyl crosslinking reactions is consistent with the structural features of the SH2 proteins in the different groups.


Figure 3. Reactions of the eight SH2 domains with chloroacetyl pY peptides. (A) Reactions with chloroacetyl pY peptides derived from the template peptides LA or LB. (B) Reactions with the “single-pronged” AA-peptide series. Error bars denote standard deviation from three independent experiments.

Reaction patterns of “single-pronged” pY peptides indicated a reverse binding mode

To further probe the dynamic nature of the SH2–pY peptide interaction, we next synthesized a set of “single-pronged” alanine-based phosphopeptides (the AA series, Table 1). The AA-series peptides represent degenerate SH2 ligands without specificity residues at the pY+2, pY+3, or pY+4 position, hence the name “single-pronged” pY peptides. Remarkably, all eight SH2 domains reacted with at least one member of the library (Fig. 3B). Of note, all except the VAV1 SH2 domain crosslinked robustly with peptide AA+1, suggesting that the presence of the pY residue is sufficient to bring the peptide and the SH2 domain into a proximity conducive to crosslinking. Moreover, the PLCγ1-c, PIK3R1-c, LCK and VAV1 SH2 domains could crosslink with multiple AA peptides. Therefore, the reactivity pattern displayed by the AA-series peptides differed significantly from those observed for the LA and LB peptides. We postulate that, in the absence of a specific residue to function as the second anchor (in addition to the pY residue, which provides the first and most common anchor for all peptides), a “single-pronged” pY peptide would explore the binding surface of the SH2 protein more freely than a “two-pronged” peptide.

To our surprise, peptides AA-1 and AA-3, which contain X at the pY-1 or pY-3 position N-terminal to the pY residue, crosslinked with the PLCγ1-c or PIK3R1-c SH2 domain with comparable or better efficiency than peptides bearing the X residue at a C-terminal position (e.g., pY+1). Similarly, peptide AA-3 was more reactive toward the LCK SH2 domain than peptides AA+1 or AA+2. It was intriguing that the PLCγ1-c, PIK3R1-c and LCK SH2 domains were capable of crosslinking with peptides harboring X on either the N- or the C-terminal side of the pY residue. Because the position of the reactive cysteine on an SH2 domain is fixed, this indicates that the bound AA-3 or AA-1 peptides must adopt an orientation opposite to the canonical mode; otherwise, the chloroacetyl group in the AA-3 peptide would be too far removed from the cysteine residue.

Reverse binding mode observed by the “two-pronged” pY peptides


We next revisited the “two-pronged” LA, LB and LC peptides and examined their ability to conjugate with the LCK SH2 domain in the reverse mode. (The LC peptide is derived from the C-terminal tail of LCK, which, when phosphorylated, binds to its own SH2 domain and inhibits kinase activity.) [54] To exclude any ambiguity, Cys224, which is far removed from the reactive site, was replaced by a glycine residue by site-directed mutagenesis, leaving Cys217 as the only reactive site. Using fluorescence polarization (FP), we observed that the mutant LCK SH2 C224G retained the same high-affinity binding to the LC peptide (EGQpYQPQPA; KD = 4.6 µM) as the wild-type domain. [54] The LCK SH2 C224G protein was then used in crosslinking experiments with peptides that contained the chloroacetyl group at either the pY-3 or the pY+3 position of the template peptides LA, LB or LC, to test the preference for binding orientation. Peptide LC is a natural ligand of the LCK SH2 domain (based on PDB 1LCK) that differs from LA and LB in that it does not contain a negatively charged residue at the pY+1 or pY+2 position (Table 1). To estimate whether replacing an amino acid with the chloroacetyl-bearing residue drastically affects binding affinity, we synthesized non-reactive versions carrying an acetyl group in place of the chloroacetyl group at the corresponding positions, generating +3* and -3* pairs. The AA+3*, LA+3* and LB+3* peptides bound to the LCK SH2 C224G protein with affinities similar to those of AA-3*, LA-3* and LB-3*, respectively (Fig. S3). Only the LC+3*/LC-3* pair showed a difference in affinity. Overall, replacing a residue with amino acid X at either the +3 or the -3 position is not expected to drastically affect binding affinity. The chloroacetyl pY peptides were then incubated with the SH2 domain for 3 hours at room temperature to promote crosslinking. As shown in Fig. 4A, all peptides were capable of crosslinking with the SH2 C224G protein, albeit with different efficiencies and different preferences for the canonical or reverse binding mode.
Specifically, the LA+3 and LB+3 peptides preferred the normal binding mode, as they yielded more conjugated product than LA-3 and LB-3 despite similar binding affinities. In contrast, peptides LC-3 and AA-3 favored the reverse binding mode, as evident from their higher conjugation yields compared to peptides LC+3 and AA+3. When the LC-3 and LC+3 peptides were labeled with a red and a green fluorescent dye (TAMRA and FAM, respectively), we found that they gave two conjugation products representing the reverse and normal binding modes (Fig. 4B). Interestingly, the LC-3 peptide appeared to outcompete LC+3, giving rise to more conjugated product when the two were incubated as a 1:1 mixture with LCK SH2 C224G. This observation suggests that, in the context of peptide LC, the LCK SH2 domain is more likely to adopt the reverse binding mode than the canonical one during the binding process, although both modes were sampled.

To simultaneously capture both the normal and reverse binding modes by covalent crosslinking, we next synthesized a bifunctional peptide, LA-3+3Cys, in which the X residue was incorporated at the pY-3 position and a Cys at the pY+3 position of the LA template. We then reacted this peptide with the LCK SH2 domain in the presence or absence of a reducing agent. When peptide LA-3+3Cys is incubated with the LCK SH2 C224G protein in glutathione (GSH) redox buffer, the population of peptide bound in the normal orientation is locked through a disulfide bond that is susceptible to a reducing agent, whereas the population bound in the reverse orientation is expected to crosslink through a thioether bond that is resistant to reduction. Prior to the addition of reducing agent, 45% of the LCK SH2 protein formed a covalent complex with the peptide; addition of dithiothreitol (DTT) decreased the amount of conjugated complex to 20% (Fig. 4C). These data provide compelling evidence that the LCK SH2 domain samples both the canonical and reverse binding modes during the recognition of AA and LA peptides.
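The arithmetic behind the two-step readout can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes that DTT reduces all disulfide-linked conjugate while leaving the thioether-linked conjugate intact, and that yields are expressed as a percentage of total SH2 protein. The function name `partition_binding_modes` is hypothetical.

```python
def partition_binding_modes(total_conjugated_pct, dtt_resistant_pct):
    """Split the covalent conjugate into normal and reverse fractions.

    Assumption: the DTT-sensitive part is the disulfide (normal mode) and
    the DTT-resistant part is the thioether (reverse mode).
    """
    if not 0 <= dtt_resistant_pct <= total_conjugated_pct <= 100:
        raise ValueError("expected 0 <= dtt_resistant <= total <= 100")
    normal = total_conjugated_pct - dtt_resistant_pct
    reverse = dtt_resistant_pct
    share = reverse / total_conjugated_pct if total_conjugated_pct else 0.0
    return {
        "normal_pct_of_protein": normal,
        "reverse_pct_of_protein": reverse,
        "reverse_share_of_conjugate": share,
    }


# Values reported for LA-3+3Cys: 45% conjugate before DTT, 20% after.
result = partition_binding_modes(45.0, 20.0)
print(result)
```

Under these assumptions, 25% of the protein is captured in the normal mode and 20% in the reverse mode. These are captured fractions, which depend on the kinetics of each crosslinking chemistry and need not equal the equilibrium populations of the two modes.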


Figure 4. Reverse binding mode of “two-pronged” pY peptides. (A) Both normal and reverse binding modes were captured for all the peptides. (B) Peptides LC+3 and LC-3 compete for covalent reaction with the LCK SH2 C224G protein. LC+3 and LC-3 were labeled with different fluorescent dyes (FAM and TAMRA) to display green and red fluorescence, respectively. Both peptides reacted with the SH2 protein in a mixture. (C) Estimating the percentages of normal and reverse binding modes using a dual-reactivity peptide, LA-3+3Cys, and a two-step reaction. In redox buffer, the SH2–peptide complex adopting the normal mode is locked by a reversible disulfide bond, and the complex adopting the reverse mode is locked by an irreversible thioether bond. Treating the mixture of complexes with the reducing agent DTT breaks the disulfide bond, leaving only the complex formed through the reverse mode. (D) Adding or removing an acidic amino acid (Glu) at the pY+1 or pY+2 position strongly affects the propensity for the reverse binding mode. (E) Lys179, probed with the K179L mutant, accounts for the interaction with the Glu at the pY+1 or pY+2 position. Error bars denote the standard deviation of at least three independent experiments. See Materials and Methods for detailed experimental procedures.

Electrostatic interaction strongly affects ligand orientation

After demonstrating that both orientations populate the SH2–pY peptide bound states, we next explored the factors that dictate the preference for the normal versus the reverse binding mode. By comparing the sequences of peptides LA and LB, which prefer the normal orientation, with those of peptides LC and AA, which favor reverse binding, we found that the former contain a Glu residue at the pY+1 or pY+2 position. This acidic residue is expected to interact with Lys179 of the LCK SH2 domain. [58, 59] Sequence alignment indicates that this Lys residue is conserved in 37% (44) of SH2 domains (Table S3). This suggests that charge–charge interaction may be a key determinant of binding mode preference. To test this prediction, we synthesized a derivative of peptide LA-3, LA-3-Q, by replacing the Glu residue at the pY+2 position with a neutral Gln residue. We also synthesized charged versions of peptides LC-3 and AA-3 in which a Glu residue was incorporated at the pY+1 position (Table 1). The resulting peptide pairs, LA-3-Q/LA-3, LC-3/LC-3-E and AA-3/AA-3-E, were then each incubated with the LCK SH2 C224G protein. Consistent with our prediction, removing the negative charge at pY+2 from peptide LA-3 increased the reversely bound complex from 6% to 12%, whereas introducing a Glu residue at the pY+1 position in the LC-3 and AA-3 peptides led to a significant decrease in reverse binding (from 51% to 28% for LC-3 and to 25% for AA-3) (Fig. 4D). Furthermore, mutation of Lys179 to Leu in LCK SH2 C224G led to a significant decrease in reverse binding for peptide AA-3 but a significant increase for peptide AA-3-E (Fig. 4E). These results indicate that the electrostatic interaction at the pY+1 or pY+2 position strongly affects the preference of ligand orientation during the surface sampling process.

Reverse binding mode observed by the “half” pY peptides

To further understand the mechanism of reverse binding, we employed the Oriented Peptide Array Library (OPAL) approach to identify residues that are critical to normal (via the C-terminus) and reverse (via the N-terminus) binding. To this end, we synthesized OPAL membranes with the degenerate “half” sequences Z-Z-Z-Z-pY and pY-Z-Z-Z-Z (where Z denotes a mixture of natural amino acids except Cys), which lack the C-terminal or the N-terminal portion, respectively, and probed each with the LCK SH2 protein. The relative importance of different amino acids at a given position was determined by replacing the Z with a defined residue. The LCK SH2 domain showed a strong preference for aliphatic/hydrophobic residues (e.g., Val and Ile) at the pY+3 and pY+4 positions, but no apparent selectivity for N-terminal residues [54, 55] (Fig. 5A). This result is in accordance with the reported specificity of the LCK SH2 domain. [56] Our inability to detect N-terminal specificity from the Z-Z-Z-Z-pY library might be caused by the low basal affinity of the degenerate library. To address this potential limitation of the OPAL approach, we synthesized a permutation peptide array library (Z-EGQpY) based on the N-terminal sequence of peptide LC, a natural ligand of the LCK SH2 domain. Intriguingly, probing this library with the LCK SH2 domain identified Ile as a strongly preferred residue at the pY-3 or pY-4 position (Fig. 5B). Assuming that the corresponding peptide binds in the reverse orientation, this specificity profile agrees with the C-terminal specificity of the LCK SH2 domain when the peptide binds in the normal orientation.
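The layout of such an oriented array can be illustrated with a short enumeration. The sketch below is only an illustration of the "fix one Z at a time" design described above; it assumes that the defined residues are the same 19 non-Cys amino acids used in the degenerate mixture, and the names `DEFINED_RESIDUES` and `opal_spots` are hypothetical.

```python
# 19 natural amino acids, Cys excluded (assumed set of defined residues)
DEFINED_RESIDUES = "ADEFGHIKLMNPQRSTVWY"


def opal_spots(template):
    """Enumerate array spots for a degenerate template.

    Each degenerate position 'Z' is replaced in turn by each defined
    residue while the other positions stay degenerate. Returns tuples of
    (offset relative to pY, defined residue, spot sequence).
    """
    py_pos = template.index("pY")
    spots = []
    for pos, token in enumerate(template):
        if token != "Z":
            continue
        for aa in DEFINED_RESIDUES:
            seq = list(template)
            seq[pos] = aa
            spots.append((pos - py_pos, aa, "-".join(seq)))
    return spots


c_half = opal_spots(["pY", "Z", "Z", "Z", "Z"])  # probes pY+1 to pY+4
n_half = opal_spots(["Z", "Z", "Z", "Z", "pY"])  # probes pY-4 to pY-1
print(len(c_half), c_half[0], n_half[0])
```

Each half library therefore contains 4 × 19 = 76 spots under this assumption, and the signal at a spot such as pY-Z-Z-I-Z reports the preference for Ile at pY+3 averaged over the degenerate positions.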


Figure 5. Revealing the reverse binding mode using “half” pY peptides. (A) Amino acid preference screening using the OPAL method. A randomized library shows that the LCK SH2 protein preferentially recognizes Ile or Val at the pY+3 and pY+4 positions, but binds very weakly to the half peptides lacking the C-terminus. (B) A permutation library of the N-terminal fragment of peptide LC reveals a strong preference for Ile at the pY-3 and pY-4 positions. (C) Covalent reactions of “half” peptides comprising the N-terminal part of LC-3, in comparison to the full-length LC-3 peptides. (D) Acidic amino acids at pY-3 and pY-4 significantly increase the reaction of the half peptides LC-1 and LC-2. Error bars denote the standard deviation from at least three independent experiments.


To obtain further evidence for the reverse binding mode, we carried out crosslinking experiments using a group of LC-derived peptides that either lacked the N- or C-terminal portion entirely or contained an Ile or a Gln residue at the pY-4 position. The reactive residue X was incorporated at the pY-3 or pY+3 position to capture the reverse or normal binding mode, respectively. Intriguingly, the LC+3-c and LC-3-n truncation peptides formed conjugates with the SH2 domain with 78% and 39% efficiency, respectively, compared to 60% for the full-length LC-3 peptide (Fig. 5C). That the LC-3-n peptide (lacking the C-terminal part) crosslinked with the LCK SH2 domain with appreciable efficiency (39%) provides compelling evidence for the existence of the reverse binding mode. It is also remarkable that the LC+3-c truncation peptide reacted with the SH2 domain with much greater efficiency than the full-length peptide LC-3 (Fig. 5C). This suggests that eliminating the reverse binding mode by removing the N-terminus from the LC peptide promoted the orientation, and thereby the conjugation, of the peptide in the normal mode.

To define the role of binding affinity in SH2–ligand crosslinking, we determined the affinities of full-length or truncated LC peptides containing an Ile or Glu residue at the pY-3 and/or pY-4 positions. For the natural LC peptide, truncation of the C-terminal half, but not the N-terminal half, markedly reduced affinity (Fig. S4). Interestingly, inclusion of the Ile-Ile pair at the pY-3/pY-4 positions led to a pronounced increase in affinity for both the full-length LC peptide (KD from 4.6 µM to 1.6 µM) and the C-terminal truncation peptide (KD from 22 µM to 10.4 µM). In contrast, inclusion of the Gln-Gln or Glu-Glu pair at the same positions either decreased binding or had no significant effect (Fig. S4). These results are consistent with the peptide array data suggesting that Ile is favored at the pY-3 and pY-4 positions (Fig. 5B). Does an increase in affinity translate into a similar increase in crosslinking efficiency? To address this question, we incorporated the reactive residue X at the pY-3 position of these peptides and measured their crosslinking to the LCK SH2 domain. Inclusion of either an Ile or a Gln residue at the pY-4 position led to a ~20% increase in crosslinking efficiency for the corresponding peptides LC-3-4I-n and LC-3-4Q-n, compared to LC-3-n. Intriguingly, the same substitutions resulted in significant decreases in crosslinking efficiency for the full-length LC peptide (Fig. 5C). These data suggest that factors other than affinity may play a more important role in dictating peptide binding in the reverse orientation.
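To put the reported affinity changes on an energy scale, the KD values can be converted to a change in binding free energy with the standard relation ΔΔG = RT ln(KD,new/KD,ref). The sketch below is an illustration only; the temperature of 298.15 K is an assumption (the measurement temperature is not stated in this section), and `ddg_from_kd` is a hypothetical helper name.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_kd(kd_ref_um, kd_new_um, temp_k=298.15):
    """Change in binding free energy (kcal/mol) when KD changes.

    Negative values mean tighter binding. Units of the two KD values
    cancel, so they only need to match each other (here µM).
    """
    return R_KCAL * temp_k * math.log(kd_new_um / kd_ref_um)


# Ile-Ile at pY-3/pY-4: KD values quoted in the text above
full_length = ddg_from_kd(4.6, 1.6)   # full-length LC peptide
truncated = ddg_from_kd(22.0, 10.4)   # C-terminal truncation peptide
print(round(full_length, 2), round(truncated, 2))
```

Under the stated temperature assumption, both changes are below 1 kcal/mol in magnitude, which is a modest stabilization and is in line with the observation that affinity alone does not predict crosslinking efficiency in the reverse orientation.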


Because the presence of a C-terminal Glu residue plays a critical role in orienting the LC peptide in the normal mode upon binding to the LCK SH2 domain (Fig. 4), we tested whether inclusion of a Glu at the N-terminus would promote peptide binding in the reverse orientation. To this end, we synthesized peptides that contained a Glu-Glu or Ile-Ile dipeptide at the pY-3 and pY-4 positions and the reactive residue X at either the pY-1 or the pY-2 position. The resulting peptides LC-1-E and LC-2-E dramatically outperformed peptides LC-1-I and LC-2-I in forming conjugates with the LCK SH2 domain in the reverse orientation (Fig. 5D), despite the higher binding affinities of the latter (Fig. S4). We reason that affinity measures the sum of all binding modes [29], whereas covalent conjugation captures only the population that adopts the reverse mode. Based on known structural data, the acidic residues can interact with the basic Lys182/Arg184 residues of the LCK SH2 domain through electrostatic interactions. Therefore, a Glu residue C-terminal to the pY would promote canonical binding; conversely, a Glu residue N-terminal to the pY would promote peptide binding in the reverse orientation.

Reverse binding mode detected by molecular dynamics simulation

To support our findings, we sought to recapitulate the dynamic process of SH2–ligand binding by molecular dynamics (MD) simulation. Starting from the crystal structure of the LCK SH2 domain in complex with peptide LC (PDB ID: 1LCK), [54] we carried out three independent 500-ns MD simulations of the water-solvated SH2–peptide complex (Table S4). In two of the three simulations (Crystal_sim_1 and 3), the peptide remained in the canonical orientation, although considerable conformational fluctuation was recorded for the peptide in Crystal_sim_3 (Fig. 6). In the third simulation (Crystal_sim_2 in Fig. 6B), the peptide underwent a significant change in orientation at around 100 ns and adopted an orientation opposite to that in the crystal structure towards the end of the 500-ns simulation (Fig. 6 and Movie S1). Throughout this process, the pY residue remained tightly bound to the SH2 domain and served as an anchor and pivot point for rotation of the peptide. Indeed, our analysis identified the same SH2 residues, including Arg134, Arg154, Ser156, Glu157 and Ser158, forming salt bridges and hydrogen bonds with the pY in all three simulations, indicating that binding of the pY residue to the SH2 domain was essentially unaffected during the dynamic reorientation of the peptide. We also estimated that the interaction between the SH2 domain and the pY residue is approximately twice as strong as that between the SH2 domain and the remainder of the peptide (Fig. S5). These results agree well with previous reports that the pY residue contributes at least half of the binding free energy of a peptide ligand to an SH2 domain, [27] and also provide independent evidence in support of the reverse binding mode for the SH2–peptide interaction.

Lastly, our MD simulations reveal that the binding of the pY peptide to the SH2 protein is highly dynamic and that the bound peptide adopts a wide variety of conformations, as shown in Crystal_sim_3 (Fig. 6C). This highly dynamic pattern was corroborated by an independent set of 500-ns simulations (Random_sim_1 to 6) in which the starting position of the peptide was randomized (Table S4 and Fig. S6A). We monitored the location of the pY in the simulated complex by its root mean square deviation (RMSD) from its location in the crystal structure, and observed that the pY quickly entered the pY-binding site of the SH2 domain, indicating a valid binding process (Fig. S6B). Similar to the MD simulations based on the crystal structure, these randomized simulations demonstrated that the peptide could adopt both the canonical and the reverse orientation (Fig. 6D-6F). Furthermore, the pY+3 or pY-3 residue of the peptide was found in the vicinity (within 10 Å) of Cys217 of the LCK SH2 domain, offering a basis for proximity-induced crosslinking. Taken together, the MD simulations further corroborate the findings from proximity-induced crosslinking that (1) the pY dominates the binding energetics, (2) peptide binding to the SH2 domain is dynamic, and (3) hinged on the pY-binding site as an anchor, the peptide likely adopts both normal and reverse binding modes, with different ratios in the bound population.
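The two geometric readouts mentioned above, the RMSD of the pY from its crystallographic position and the 10 Å proximity criterion for Cys217, can be sketched in a few lines. This is a generic illustration and not the analysis code used for the simulations: it assumes that all frames have already been superimposed on the SH2 domain, that coordinates are in Å, and it uses toy coordinates; the function names are hypothetical.

```python
import math


def rmsd(coords, ref):
    """RMSD (no fitting) between two equally sized lists of (x, y, z)."""
    if len(coords) != len(ref) or not coords:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    sq = sum((a - b) ** 2 for p, q in zip(coords, ref) for a, b in zip(p, q))
    return math.sqrt(sq / len(coords))


def fraction_within(traj_a, traj_b, cutoff=10.0):
    """Fraction of frames in which two tracked atoms are within the cutoff."""
    if len(traj_a) != len(traj_b) or not traj_a:
        raise ValueError("trajectories must be non-empty and equal in length")
    close = sum(1 for a, b in zip(traj_a, traj_b) if math.dist(a, b) <= cutoff)
    return close / len(traj_a)


# Toy data: two pY atoms displaced by 3 Å along z from the reference
py_ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
py_frame = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
py_rmsd = rmsd(py_frame, py_ref)

# Toy data: per-frame positions of a Cys217 atom and a pY-3 side-chain atom
cys217 = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
py_minus3 = [(0.0, 0.0, 5.0), (0.0, 0.0, 12.0)]
contact_fraction = fraction_within(cys217, py_minus3)
print(py_rmsd, contact_fraction)
```

In practice these quantities would be computed with a trajectory analysis package over every saved frame; the point of the sketch is only that a low pY RMSD together with a high contact fraction for pY-3 (rather than pY+3) is the signature expected for a peptide bound in the reverse orientation.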

Figure 6. MD simulation of the LCK SH2–ligand interaction. (A-C) MD simulation snapshots from Crystal_sim_1 to 3. The pY peptide from the crystal structure of LCK is colored black. The pY peptide from the MD simulations (cartoon, with the C-terminus as a sphere) is shown every 50 ns and colored according to simulation time: red, beginning; white, middle; blue, end of simulation. (D-F) MD simulation snapshots from Random_sim_4 to 6 as representative random simulation results. The pY peptide is shown every 10 ns for the last 100 ns of the simulations.


DISCUSSION

Recently, a two-step model of complex formation has been proposed to account for the structural dynamics of protein–protein interactions. [34, 35] In this model, the formation of a protein complex proceeds through encounter complexes, in which the proteins make few specific interactions and assume many orientations, before the productive complex is achieved. The transient encounter states can be visualized directly by paramagnetic relaxation enhancement measurements using a mutant peptide or protein containing a paramagnetic label. [32-35] In addition, computational simulations have shed light on the formation of encounter complexes. [63] Here, we show for the first time that the structural dynamics of the SH2–pY peptide interaction can be captured by a proximity-induced cysteine reaction.

Although not a model system for the study of encounter complexes, SH2 domains present an unsolved puzzle in the field of signal transduction: how do these modular domains (~120 in the human genome) seek out their binding partners (thousands) in the cell to form specific complexes? [24, 62] How does an SH2 domain find the “right” pY sequence to bind? Previous studies from our labs and others have shown that, in addition to recognizing the pY, different SH2 domains prefer different residues at the pY+2, +3 or +4 positions. [24, 26] While this classification scheme explains the “general” specificity of the SH2 domain family, it is inadequate to account for the exquisite specificity displayed by some SH2 domains in vivo. [63, 64] Do encounter complexes play a role in the recognition of pY peptides by SH2 domains? Data from our crosslinking and MD simulation studies show that the binding of a pY peptide to an SH2 domain is a dynamic process that explores a wide range of peptide conformations and orientations. While the pY residue anchors the peptide, the remainder of the peptide sequence is largely free to explore the binding interface of the SH2 domain until another anchor is found to stabilize the complex. Although residues C-terminal to the pY have been shown to provide the second anchor for most SH2 domains studied to date, we found that N-terminal residues can also fulfill this role, consistent with a previous report [29]. However, for an N-terminal residue to function as the second anchor, the pY peptide has to be presented to the SH2 domain in an orientation opposite to the canonical binding mode. Our work provides compelling evidence that the LCK SH2 domain can engage the C-terminal sequence in the canonical mode and the N-terminal sequence in the reverse mode, and that both binding modes may be used for physiological interactions (such as those mediated by the LC peptide). Our studies also show that exploration of the protein surface is a general phenomenon that occurs with numerous SH2 domains, although c-Cbl-TKB [30] remains the only case with a crystal structure. While it can be argued that the number of potential SH2–peptide interactions far exceeds the number of complex structures determined to date, it is likely that the reverse mode generates productive binding only for specific SH2–peptide pairs. In support of this assertion, we found that the location of a negatively charged residue such as Glu greatly affects the orientation of the peptide in binding to the LCK SH2 domain, because of a potential electrostatic interaction with Lys179 and Lys182/Arg184 of the latter. Since these residues are highly conserved in the SH2 domains of the Src family of non-receptor tyrosine kinases, similar modes of reverse binding dictated by complementary charge–charge interactions may occur with these SH2 domains. Because most tyrosine kinases and phosphotyrosine phosphatases prefer acidic residues upstream of the Tyr/pY, ~50% of all identified pY sites in human proteins contain at least one negatively charged residue (Glu or Asp) in the N-terminal region (pY-1 to pY-5). [64] It is likely that some of these pY sites are capable of binding in the reverse orientation when they encounter an SH2 domain carrying a complementary charge (Lys or Arg) at an optimal location to promote electrostatic interaction. The pivotal role of charge in determining peptide ligand orientation has also been observed in other modular domains, including SH3. SH3 domains may bind peptides containing either the RxxPxxP or the PxxPxR motif in opposite orientations. The orientation of the peptide is dictated by the location of the Arg residue relative to the PxxP core motif, as it interacts with acidic residues on the SH3 domain. [65] A similar mechanism accounts for the interaction of some WW domains with peptide ligands. [66, 67]
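The criterion used above for pY sites that might favor reverse binding, namely at least one Glu or Asp between pY-1 and pY-5, is simple to express in code. The sketch below is an illustration under stated assumptions: sequences are one-letter strings with the phosphotyrosine written as Y at a known index, and `has_upstream_acidic` is a hypothetical helper, not part of any published pipeline.

```python
def has_upstream_acidic(sequence, py_index, window=5):
    """True if Glu (E) or Asp (D) occurs within `window` residues
    N-terminal to the pY at `py_index` (positions pY-1 to pY-window)."""
    if not 0 <= py_index < len(sequence):
        raise ValueError("py_index is outside the sequence")
    start = max(0, py_index - window)
    return any(res in "DE" for res in sequence[start:py_index])


# LC peptide from the text, EGQpYQPQPA, with pY written as Y at index 3
lc_peptide = "EGQYQPQPA"
lc_flag = has_upstream_acidic(lc_peptide, 3)

# Hypothetical control sequence with no acidic residue upstream of the pY
control_flag = has_upstream_acidic("AAAYEAA", 3)
print(lc_flag, control_flag)
```

The LC peptide carries a Glu at pY-3 and is flagged by this criterion. Such a flag identifies candidates only; whether a site actually binds in the reverse orientation also depends on a complementary basic residue being suitably placed on the SH2 domain.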

Lindfors et al. showed that a high-affinity pY peptide (KD < 1 µM) binds to the Src SH2 domain with unexpectedly high structural dynamics [32], suggesting that different modes may commonly exist in peptide–domain interactions within the KD range of 0.1-100 µM. It is also reasonable to assume that at least some SH2 domains are capable of binding peptides in the reverse orientation, because the pY contributes at least half of the binding energy. As numerous peptides may be Tyr-phosphorylated in a cell at a given time, a sizable proportion of these peptides may explore the reverse binding mode, especially when they contain charged residues flanking the pY that may help orient them in a manner opposite to the “cognate” binding mode. Besides electrostatic interactions, specificity encoded by N-terminal residues may be an important factor dictating peptide binding in the reverse orientation. In this regard, the OPAL or permutation array approach described here may be applied systematically to SH2 domains to uncover N-terminal determinants of reverse binding. Comprehensive information on specificity and charge (on both the ligand peptide and the target SH2 domain) would help identify ligands that favor the reverse binding mode. It will also be important to define, in future studies, the physiological relevance of the reverse binding mode in the function of SH2 domains and other modular or enzymatic domains in vivo; this question is beyond the scope of the present work.

The high flexibility of the SH2–peptide interaction also makes covalent crosslinking of the intermediates a difficult task. For example, in another crosslinking experiment, a photoreactive version of the LC+3 peptide containing a photoLeu at the pY+3 position failed to crosslink with the LCK SH2 protein in detectable yield upon illumination (Figure S7). Several reasons may explain this. One possibility is that the high flexibility of the residues flanking the pY site does not give the short-lived reactive carbene on the photoLeu side chain sufficient time to contact the chemical groups of the SH2 protein. This further underscores the need for a relatively “strong” crosslinking reaction, such as the cysteine–chloroacetyl reaction used here, to achieve effective crosslinking. One may ask why, if the reverse binding mode is prevalent in SH2–peptide interactions, most of the crystal structures of SH2–peptide complexes, with one exception, show only one binding mode. We reason that this may be attributed to the fact that crystallization favors one stable state, which is supported by paramagnetic NMR spectroscopic analysis.


AUTHOR CONTRIBUTIONS

J. X. conceived the project. S. S. C. L., Y. W. and J. X. designed experiments and wrote the manuscript, in consultation with Z. L. R. W., P. Y. M. L. and F. H. carried out the experiments. T. K. and M. H. helped analyze the data.

SUPPORTING INFORMATION AVAILABLE

Methods and Materials. Movie S1. Tables S1-S4. Figures S1-S6.

ACKNOWLEDGMENTS

Financial support for this work was provided by GRF grants 14304915 and 14321116, HMRF grant 15140052 and an AoE/M-09/12 grant (to JX and YW), a grant from the National Natural Science Foundation of China (21628201, to JX), and a Canadian Institutes of Health Research grant (to SSL). SSL held a Canada Research Chair in Functional Genomics and Cellular Proteomics.

REFERENCES AND NOTES

1. Bai, R.; Pei, X. F.; Boyé, O.; Getahun, Z.; Grover, S.; Bekisz, J.; Nguyen, N. Y.; Brossi, A.; and Hamel, E. (1996) Identification of cysteine 354 of beta-tubulin as part of the binding site for the A ring of colchicine. J. Biol. Chem. 271, 12639-12645.
2. Nonaka, H.; Tsukiji, S.; Ojida, A.; and Hamachi, I. (2007) Non-enzymatic covalent protein labeling using a reactive tag. J. Am. Chem. Soc. 129, 15777-15779.
3. Nonaka, H.; Fujishima, S.; Uchinomiya, S.; Ojida, A.; and Hamachi, I. (2010) Selective covalent labeling of tag-fused GPCR proteins on live cell surface with a synthetic probe for their functional analysis. J. Am. Chem. Soc. 132, 9301-9309.
4. Lu, Y.; Huang, F.; Wang, J.; and Xia, J. (2014) Affinity-guided covalent conjugation reaction based on PDZ-peptide and SH3-peptide interactions. Bioconjugate Chem. 25, 989-999.
5. Wang, J.; Yu, Y.; and Xia, J. (2014) Short peptide tag for covalent protein labeling based on coiled coils. Bioconjugate Chem. 25, 178-187.
6. Liu, M.; Ji, Z.; Zhang, M.; and Xia, J. (2017) Versatile site-selective protein reaction guided by WW domain–peptide motif interaction. Bioconjugate Chem. 28, 2199-2205.


7. Yu, Y.; Liu, M.; Ng, T. T.; Huang, F.; Nie, Y.; Wang, R.; Yao, Z-P.; Li, Z.; and Xia, J. (2016) PDZ-reactive peptide activates ephrin-B reverse signaling and inhibits neuronal chemotaxis. ACS Chem. Biol. 11, 149-158.
8. Yu, Y.; Nie, Y.; Feng, Q.; Qu, J.; Wang, R.; Bian, L.; and Xia, J. (2017) Targeted covalent inhibition of grb2–sos1 interaction through proximity-induced conjugation in breast cancer cells. Mol. Pharmaceutics 14, 1548−1557.
9. Bass, R. B.; Butler, S.; Chervitz, S. A.; Gloor, S. L.; and Falke, J. J. (2007) Use of site-directed cysteine and disulfide chemistry to probe protein structure and dynamics: Applications to soluble and transmembrane receptors of bacterial chemotaxis. Methods Enzymol 423, 25-51.
10. Harbury, P. B.; Zhang, T.; Kim, P. S.; and Alber, T. (1993) A switch between two-, three-, and four-stranded coiled coils in GCN4 leucine zipper mutants. Science 262, 1401-1407.
11. Zhou, N. E.; Kay, C. M.; and Hodges, R. S. (1993) Disulfide bond contribution to protein stability: positional effects of substitution in the hydrophobic core of the two-stranded α-helical coiled-coil. Biochemistry 32, 3178-3187.
12. Verdine, G. L.; and Norman, D. P. (2003) Covalent trapping of protein-DNA complexes. Annu Rev Biochem 72, 337-366.
13. Huang, H.; Chopra, R.; Verdine, G. L.; and Harrison, S. C. (1998) Structure of a covalently trapped catalytic complex of HIV-1 reverse transcriptase: implications for drug resistance. Science 282, 1669-1675.
14. Yi, C.; Jia, G.; Hou, G.; Dai, Q.; Zhang, W.; Zheng, G.; Jian, X.; Yang, C. G.; Cui, Q.; and He, C. (2010) Iron-catalysed oxidation intermediates captured in a DNA repair dioxygenase. Nature 468, 330-333.
15. Mishina, Y.; and He, C. (2003) Probing the structure and function of the Escherichia coli DNA alkylation repair AlkB protein through chemical cross-linking. J. Am. Chem. Soc. 125, 8730-8731.
16. Lowry, B.; Li, X.; Robbins, T.; Cane, D. E.; Khosla, C. (2016) A Turnstile Mechanism for the Controlled Growth of Biosynthetic Intermediates on Assembly Line Polyketide Synthases. ACS Cent. Sci. 2, 14-20.
17. Xiang, Z.; Lacey, V. K.; Ren, H.; Xu, J.; Burban, D. J.; Jennings, P. A.; Wang, L. (2014) Proximity-enabled protein crosslinking through genetically encoding haloalkane unnatural amino acids. Angew. Chem. Int. Ed. Engl. 53, 2190-2193.
18. Chmura, A. J.; Orton, M. S.; and Meares, C. F. (2001) Antibodies with infinite affinity. Proc. Natl. Acad. Sci. U.S.A. 98, 8480−8484.
19. Mizukami, S.; Watanabe, S.; Hori, Y.; and Kikuchi, K. (2009) Covalent protein labeling based on noncatalytic beta-lactamase and a designed FRET substrate. J. Am. Chem. Soc. 131, 5016−5017.
20. Chen, Z.; Jing, C.; Gallagher, S. S.; Sheetz, M. P.; and Cornish, V. W. (2012) Second-generation covalent TMP-tag for live cell imaging. J. Am. Chem. Soc. 134, 13692−13699.
21. Pawson, T. (2004) Specificity in signal transduction: from phosphotyrosine-SH2 domain interactions to complex cellular systems. Cell 116, 191-203.
22. Pawson, T.; and Kofler, M. (2009) Kinome signaling through regulated protein-protein interactions in normal and cancer cells. Curr Opin Cell Biol 21, 147-153.
23. Songyang, Z.; Shoelson, S. E.; Chaudhuri, M.; Gish, G.; Pawson, T.; Haser, W. G.; King, F.; Roberts, T.; Ratnofsky, S.; Lechleider, R. J.; et al. (1993) SH2 domains recognize specific phosphopeptide sequences. Cell 72, 767-778.
24. Huang, H.; Li, L.; Wu, C.; Schibli, D.; Colwill, K.; Ma, S.; Li, C.; Roy, P.; Ho, K.; Songyang, Z.; Pawson, T.; Gao, Y.; and Li, S. S. (2008) Defining the specificity space of the human SRC homology 2 domain. Mol Cell Proteomics 7, 768-784.
25. Machida, K.; and Mayer, B. J. (2005) The SH2 domain: versatile signaling module and pharmaceutical target. Biochimica et Biophysica Acta 1747, 1-25.
26. Kaneko, T.; Huang, H.; Zhao, B.; Li, L.; Liu, H.; Voss, C. K.; Wu, C.; Schiller, M. R.; and Li, S. S. (2010) Loops govern SH2 domain specificity by controlling access to binding pockets. Sci. Signal 3, ra34.
27. Bradshaw, J. M.; Mitaxov, V.; and Waksman, G. (1999) Investigation of phosphotyrosine recognition by the SH2 domain of the Src kinase. J. Mol. Biol. 293, 971-985.
28. Kaneko, T.; Li, L.; and Li, S. S. (2008) The SH3 domain-a family of versatile peptide- and protein-recognition module. Front Biosci. 13, 4938-4952.
29. Qin, C.; Wavreille, A.-S.; Pei, D. (2005) Alternative mode of binding to phosphotyrosyl peptides by Src homology-2 domains. Biochemistry 44, 12196-12202.
30. Ng, C.; Jackson, R. A.; Buschdorf, J. P.; Sun, Q.; Guy, G. R.; and Sivaraman, J. (2008) Structural basis for a novel intrapeptidyl H-bond and reverse binding of c-Cbl-TKB domain substrates. EMBO J. 27, 804-816.
31. Sun, Q.; Ng, C.; Guy, G. R.; and Sivaraman, J. (2011) An adjacent arginine, and the phosphorylated tyrosine in the c-Met receptor target sequence, dictates the orientation of c-Cbl binding. FEBS Lett. 585, 281-285.
32. Lindfors, H. E.; Drijfhout, J. W.; and Ubbink, M. (2012) The Src SH2 domain interacts dynamically with the focal adhesion kinase binding site as demonstrated by paramagnetic NMR Spectroscopy. IUBMB Life 64, 530-544.
33. Tang, C.; Iwahara, J.; and Clore, G. M. (2006) Visualization of transient encounter complexes in protein–protein association. Nature 444, 383-386.
34. Ubbink, M. (2009) The courtship of proteins: Understanding the encounter complex. FEBS Lett. 583, 1060-1066.


35. Suh, J-Y.; Tang, C.; and Clore, G. M. (2007) Role of electrostatic interactions in transient encounter complexes in protein-protein association investigated by paramagnetic relaxation enhancement. J. Am. Chem. Soc. 129, 12954-12955.
36. Finn, R. D.; Bateman, A.; Clements, J.; Coggill, P.; Eberhardt, R. Y.; Eddy, S. R.; Heger, A.; Hetherington, K.; Holm, L.; Mistry, J.; Sonnhammer, E. L.; Tate, J.; and Punta, M. (2014) Pfam: the protein families database. Nucleic Acids Research 42, D222-230.
37. Finn, R. D.; Coggill, P.; Eberhardt, R. Y.; Eddy, S. R.; Mistry, J.; Mitchell, A. L.; Potter, S. C.; Punta, M.; Qureshi, M.; Sangrador-Vegas, A.; Salazar, G. A.; Tate, J.; and Bateman, A. (2016) The Pfam protein families database: towards a more sustainable future. Nucleic Acids Research 44, D279-285.
38. Boehr, D. D.; Nussinov, R.; and Wright, P. E. (2009) The role of dynamic conformational ensembles in biomolecular recognition. Nat. Chem. Biol. 5, 789-796.
39. Developers, G. (2003) Gromacs 5.0.6 User Manual. 1-19.
40. Berendsen, H. H. (1995) Interactions between 5-hydroxytryptamine receptor subtypes: is a disturbed receptor balance contributing to the symptomatology of depression in humans? Pharmacol. Ther. 66, 17-37.
41. Klauda, J. B.; Venable, R. M.; Freites, J. A.; O'Connor, J. W.; Tobias, D. J.; Mondragon-Ramirez, C.; Vorobyov, I.; MacKerell, A. D., Jr.; and Pastor, R. W. (2010) Update of the CHARMM all-atom additive force field for lipids: validation on six lipid types. J. Phys. Chem. B 114, 7830-7843.
42. Bjelkmar, P.; Larsson, P.; Cuendet, M. A.; Hess, B.; and Lindahl, E. (2010) Implementation of the CHARMM Force Field in GROMACS: Analysis of Protein Stability Effects from Correction Maps, Virtual Interaction Sites, and Water Models. J. Chem. Theory Comput. 6, 459-466.
43. Sondergaard, C. R.; Olsson, M. H.; Rostkowski, M.; and Jensen, J. H. (2011) Improved Treatment of Ligands and Coupling Effects in Empirical Calculation and Rationalization of pKa Values. J. Chem. Theory Comput. 7, 2284-2295.
44. Olsson, M. H. M.; Sondergaard, C. R.; Rostkowski, M.; and Jensen, J. H. (2011) PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pK(a) Predictions. J. Chem. Theory Comput. 7, 525-537.
45. Hess, B. (2008) P-LINCS: A parallel linear constraint solver for molecular simulation. J. Chem. Theory Comput. 4, 116-122.
46. Miyamoto, S.; and Kollman, P. A. (1992) Settle - an Analytical Version of the Shake and Rattle Algorithm for Rigid Water Models. J. Comput. Chem. 13, 952-962.
47. Bussi, G.; Donadio, D.; and Parrinello, M. (2007) Canonical sampling through velocity rescaling. J. Chem. Phys. 126, 014101.


48. Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; and Pedersen, L. G. (1995) A smooth particle mesh Ewald method. J. Chem. Phys. 103, 8577-8593.
49. Van Der Spoel, D. (2005) GROMACS: fast, flexible, and free. J. Comput. Chem. 26, 1701-1718.
50. Humphrey, W.; Dalke, A.; and Schulten, K. (1996) VMD: visual molecular dynamics. J. Mol. Graph. 14, 33-38, 27-38.
51. Hajicek, N.; Charpentier, T. H.; Rush, J. R.; Harden, T. K.; and Sondek, J. (2013) Autoinhibition and phosphorylation-induced activation of phospholipase C-gamma isozymes. Biochemistry 52, 4810-4819.
52. Groesch, T. D.; Zhou, F.; Mattila, S.; Geahlen, R. L.; and Post, C. B. (2006) Structural basis for the requirement of two phosphotyrosine residues in signaling mediated by Syk tyrosine kinase. J. Mol. Biol. 356, 1222-1236.
53. Bunney, T. D.; Esposito, D.; Mas-Droux, C.; Lamber, E.; Baxendale, R. W.; Martins, M.; Cole, A.; Svergun, D.; Driscoll, P. C.; Katan, M. (2012) Structural and functional integration of the PLCγ interaction domains critical for regulatory mechanisms and signaling deregulation. Structure 20, 2062-2075.
54. Eck, M. J.; Atwell, S. K.; Shoelson, S. E.; and Harrison, S. C. (1994) Structure of the regulatory domains of the Src-family tyrosine kinase Lck. Nature 368, 764-769.
55. Bibbins, K. B.; Boeuf, H.; and Varmus, H. E. (1993) Binding of the Src SH2 domain to phosphopeptides is determined by residues in both the SH2 domain and the phosphopeptides. Mol. Cell. Biol. 13, 7278-7287.
56. Songyang, Z.; Carraway, K. L., 3rd; Eck, M. J.; Harrison, S. C.; Feldman, R. A.; Mohammadi, M.; Schlessinger, J.; Hubbard, S. R.; Smith, D. P.; Eng, C.; et al. (1995) Catalytic specificity of protein-tyrosine kinases is critical for selective signalling. Nature 373, 536-539.
57. Brown, M. T.; and Cooper, J. A. (1996) Regulation, substrates and functions of src. Biochimica et Biophysica Acta 1287, 121-149.
58. Cousins-Wasti, R. C.; Ingraham, R. H.; Morelock, M. M.; and Grygon, C. A. (1996) Determination of affinities for lck SH2 binding peptides using a sensitive fluorescence assay: comparison between the pYEEIP and pYQPQP consensus sequences reveals context-dependent binding specificity. Biochemistry 35, 16746-16752.
59. Virdee, S.; Macmillan, D.; and Waksman, G. (2010) Semisynthetic Src SH2 domains demonstrate altered phosphopeptide specificity induced by incorporation of unnatural lysine derivatives. Chem. Biol. 17, 274-284.
60. Sheng, R.; Jung, D. J.; Silkov, A.; Kim, H.; Singaram, I.; Wang, Z. G.; Xin, Y.; Kim, E.; Park, M. J.; Thiagarajan-Rosenkranz, P.; Smrt, S.; Honig, B.; Baek, K.; Ryu, S.; Lorieau, J.; Kim, Y. M.; and Cho, W. (2016) Lipids regulate Lck protein activity through Their Interactions with the Lck Src Homology 2 Domain. J. Biol. Chem. 291, 17639-17650.
61. Park, M. J.; Sheng, R.; Silkov, A.; Jung, D. J.; Wang, Z. G.; Xin, Y.; Kim, H.; Thiagarajan-Rosenkranz, P.; Song, S.; Yoon, Y.; Nam, W.; Kim, I.; Kim, E.; Lee, D. G.; Chen, Y.; Singaram, I.; Wang, L.; Jang, M. H.; Hwang, C. S.; Honig, B.; Ryu, S.; Lorieau, J.; Kim, Y. M.; and Cho, W. (2016) SH2 Domains Serve as Lipid-Binding Modules for pTyr-Signaling Proteins. Mol. Cell 62, 7-20.
62. Bian, Y.; Li, L.; Dong, M.; Liu, X.; Kaneko, T.; Cheng, K.; Liu, H.; Voss, C.; Cao, X.; Wang, Y.; Litchfield, D.; Ye, M.; Li, S. S.; and Zou, H. (2016) Ultra-deep tyrosine phosphoproteomics enabled by a phosphotyrosine superbinder. Nat. Chem. Biol. 12, 959-966.
63. Zhang, L.; and Buck, M. (2013) Molecular simulations of a dynamic protein complex: role of salt-bridges and polar interactions in configurational transitions. Biophys. J. 105, 2412-2417.
63. Findlay, G. M.; Smith, M. J.; Lanner, F.; Hsiung, M. S.; Gish, G. D.; Petsalaki, E.; Cockburn, K.; Kaneko, T.; Huang, H.; Bagshaw, R. D.; Ketela, T.; Tucholska, M.; Taylor, L.; Bowtell, D. D.; Moffat, J.; Ikura, M.; Li, S. S.; Sidhu, S. S.; Rossant, J.; and Pawson, T. (2013) Interaction domains of Sos1/Grb2 are finely tuned for cooperative control of embryonic stem cell fate. Cell 152, 1008-1020.
64. Yasui, N.; Findlay, G. M.; Gish, G. D.; Hsiung, M. S.; Huang, J.; Tucholska, M.; Taylor, L.; Smith, L.; Boldridge, W. C.; Koide, A.; Pawson, T.; and Koide, S. (2014) Directed network wiring identifies a key protein interaction in embryonic stem cell differentiation. Mol. Cell 54, 1034-1041.
64. Songyang, Z.; Cantley, L. C. (1995) Recognition and specificity in protein tyrosine kinase-mediated signalling. Trends Biochem. Sci. 20, 470-475.
65. Li, S. S. C. (2005) Specificity and versatility of SH3 and other proline-recognition domains: structural basis and implications for cellular signal transduction. Biochem. J. 390, 641-653.
66. Macias, M. J.; Wiesner, S.; and Sudol, M. (2002) WW and SH3 domains, two different scaffolds to recognize proline-rich ligands. FEBS Lett. 513, 30-37.
67. Zarrinpar, A.; and Lim, W. A. (2000) Converging on proline: the mechanism of WW domain peptide recognition. Nat. Struct. Biol. 7, 611-613.


FIGURE TITLES AND LEGENDS

Figure 1. Probing SH2–ligand interaction by proximity-induced thiol-chloroacetyl crosslinking. (A) Diagram depicting the proximity-induced crosslinking strategy used in this study to covalently tether an SH2 domain (shown in surface representation) to a bound peptide (in blue). A covalent bond forms between a surface Cys residue on the SH2 domain and the chloroacetyl group incorporated into the peptide ligand when binding brings them into close proximity. The chemical structure of the unnatural amino acid X, (2S)-2-amino-3-[(chloroacetyl)-amino] propionic acid, is shown in the inset. (B) The structure of the PLC1-c SH2 domain (ribbon) in complex with a pY peptide (PDB ID: 4K44), highlighting the binding interface, the orientation of the peptide, and the two binding pockets. Residues in the bound peptide are numbered relative to the pY. (C) Crosslinking reaction kinetics for peptide LA+1 with the PLC1-c SH2 domain (at a peptide-to-protein ratio of 5:1) at 37 °C in PBS. (D) Covalent crosslinking reactions of the chloroacetyl peptides with the PLC1-c SH2 domain. Peptides LA+1 to LB+3, containing the reactive X residue at different locations of the physiological ligands LA or LB in Table 1, were assayed for spontaneous covalent reactions with the SH2 domain. The peptide–SH2 covalent complexes show higher apparent molecular weights than the free SH2 protein on SDS-PAGE. The reactions were quantified by measuring the percentage of peptide-conjugated SH2 domain over the total amount of SH2 domain after 24 h of incubation at 37 °C in PBS. The error bars represent the standard deviation from at least three independent experiments.

Figure 2. Grouping SH2 domains based on the location of surface Cys residues. (A) Diagram of the structural components of a common SH2 domain. Cys-containing SH2 domains were categorized into five groups, I to V, based on the location of the cysteine. (B) Relative positions of the Cys residue on SH2 proteins in different groups, the pY residue, and the N terminus of the peptide. The SH2 domains are shown as ribbons, with the Cys residues as black sticks highlighted with orange balls. The bound pY peptides are shown as thick blue lines, with the pY shown as red sticks and the N terminus of the pY peptide labeled with solid blue balls. SH2 domains in groups I to V have cysteine residues in the BC loop, C region, D region, E region, and BG loop, respectively. (C) Statistics of the number of SH2 domains in groups I to IV based on the alignment of SH2 sequences in the Pfam database.

Figure 3. Reactions of the eight SH2 domains with chloroacetyl pY peptides. (A) Reactions with chloroacetyl pY peptides derived from the template peptides LA or LB. (B) Reactions with the “single-pronged” AA-peptide series. Error bars denote the standard deviation from three independent experiments.

Figure 4. Reverse binding mode of “two-pronged” pY peptides. (A) Both normal and reverse binding modes were captured for all the peptides. (B) Peptides LC+3 and LC-3 compete for covalent reaction with the LCK SH2 C224G protein. LC+3 and LC-3 were labeled with different fluorescent dyes (FAM and TAMRA) to display green or red fluorescence, respectively. Both peptides reacted with the SH2 protein in a mixture. (C) Estimating the percentages of normal and reverse binding modes using a dual-reactivity peptide, LA-3+3Cys, and a two-step reaction. In redox buffer, the SH2–peptide complex adopting the normal mode is locked by a reversible disulfide bond, and the complex adopting the reverse mode is locked by an irreversible thioether bond. Treating the complex mixture with the reducing agent DTT breaks the disulfide bond, leaving only the complex formed in the reverse mode. (D) Adding or removing an acidic amino acid (Glu) at the pY+1 or pY+2 position strongly affects the propensity for the reverse binding mode. (E) K179L accounts for the interaction with the Glu at the pY+1 or pY+2 position. Values are averages, and error bars denote the standard deviation, of at least three independent experiments. See Materials and Methods for detailed experimental procedures.

Figure 5. Revealing the reverse binding mode using “half” pY peptides. (A) Amino acid preference screening using the OPAL method. A randomized library shows that the LCK SH2 protein preferentially recognizes Ile or Val at the pY+3 and pY+4 positions but binds very weakly to the half peptides lacking the C terminus. (B) A permutation library of the N-terminal fragment of peptide LC reveals a strong preference for Ile at the pY-3 and pY-4 positions. (C) Covalent reactions of “half” peptides comprising the N-terminal part of LC-3, in comparison to the full-length LC-3 peptides. (D) Acidic amino acids at pY-3 and pY-4 significantly increase the reaction of the half peptides LC-1 and LC-2. Values are averages, and error bars denote the standard deviation, of at least three independent experiments.

Figure 6. MD simulation of the LCK SH2–ligand interaction. (A-C) MD simulation snapshots from Crystal_sim_1 to 3. The pY peptide from the crystal structure 1LCK is colored in black. The pY peptide from MD simulations (cartoon, with the C-terminus as a sphere) is shown every 50 ns and colored according to the simulation time: red, beginning; white, middle; and blue, end of simulation. (D-F) MD simulation snapshots from Random_sim_4 to 6 as representative random simulation results. The pY peptide is shown every 10 ns for the last 100 ns of the simulations.

Movie S1. Trajectories of MD simulations Crystal_sim_1 to 3. The pY peptide is colored according to simulation time: red, beginning; white, middle; and blue, end of simulation. The C-terminus of the peptide is shown in sphere representation. The pY peptide from the crystal structure 1LCK and its C-terminus are colored in black. SH2 is colored in cyan. The pY residue and residues on SH2 within 0.3 nm are shown in stick representation. Non-polar hydrogen atoms and water are omitted for clarity.
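The quantification described in the legends of Figures 1D and 4C can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis procedure: the function names `percent_conjugated` and `binding_mode_split`, the use of gel band intensities as input, and all numerical values are hypothetical. The split in the second function assumes that the conjugate measured before DTT treatment contains both disulfide-locked (normal) and thioether-locked (reverse) complexes, and that only the thioether-locked complex survives DTT.

```python
from statistics import mean, stdev


def percent_conjugated(conjugate_band, free_band):
    """Percentage of SH2 domain found in the peptide-conjugated band.

    Inputs are hypothetical densitometry intensities of the conjugated
    and free SH2 bands from the same SDS-PAGE lane.
    """
    total = conjugate_band + free_band
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * conjugate_band / total


def binding_mode_split(pct_before_dtt, pct_after_dtt):
    """Split the conjugated population into (normal, reverse) percentages.

    Assumption: before DTT both the disulfide (normal mode) and the
    thioether (reverse mode) complexes are counted; after DTT only the
    thioether complex remains.
    """
    if pct_before_dtt <= 0:
        raise ValueError("no conjugate detected before DTT treatment")
    reverse = 100.0 * pct_after_dtt / pct_before_dtt
    return 100.0 - reverse, reverse


# Hypothetical (conjugate, free) band intensities from three replicate gels.
replicates = [(40.0, 60.0), (45.0, 55.0), (50.0, 50.0)]
yields = [percent_conjugated(c, f) for c, f in replicates]

# Mean and sample standard deviation, as plotted with error bars.
summary = (mean(yields), stdev(yields))

# Hypothetical two-step result: 80% conjugated before DTT, 20% after.
normal_pct, reverse_pct = binding_mode_split(80.0, 20.0)
```

With real data, `replicates` would hold one intensity pair per independent experiment, and the `summary` tuple would supply the bar height and error bar for one peptide.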


TABLES WITH TITLES AND LEGENDS

Table 1. List of chloroacetyl pY peptides used in this study

X represents a derivative of diaminopropionic acid with a reactive chloroacetyl group, (2S)-2-amino-3-[(chloroacetyl)-amino] propionic acid. The peptides were synthesized with carboxyfluorescein at the N terminus to facilitate detection.


Table 2. List of the SH2 domains used in this study.

| SH2 domain | Group | PDB ID | Uniprot ID | Location of Cys | Reference |
|---|---|---|---|---|---|
| SYK-c | II | 1CSY | P43405 | βC5206 (βG3259) | 54 |
| SH2D1A | II | 1D4T | O60880 | βC242/βC544 | 55 |
| PLC1-c | III | 2FCI | Q32PK0 | βD5715 | 51 |
| NCK1 | IV | 2CI9 | P16333 | βE3340 | 56 |
| APS | IV | 1RQQ | Q9Z200 | βE1477 | 57 |
| LCK | V | 1LCK | P06239 | BG-loop217 (βG3224) | 58 |
| VAV1 | V | 2ROR | P15498 | BG-loop753 | 59 |
| CSK | II, V | 2RSY | P32577 | βC5119/βC8122/BG loop164 | 60 |
| PIK3R1-c | II, III | 1PIC | P27986 | βC1656/βC4659/βD5670 | 61 |
| V-SRC | II, V | 1SPS | P00524 | βC3185/BG-loop238 loop245) | 62 |


TOC Figure
