
Article

Engineering improved anti-phosphotyrosine antibodies based on immuno-convergent binding motif
Yun Mou, Xin Zhou, Kevin Leung, Alexander J Martinko, Jiun-Yann Yu, Wentao Chen, and James A. Wells
J. Am. Chem. Soc., Just Accepted Manuscript • DOI: 10.1021/jacs.8b08402 • Publication Date (Web): 06 Nov 2018
Downloaded from http://pubs.acs.org on November 7, 2018



Journal of the American Chemical Society

Engineering improved anti-phosphotyrosine antibodies based on immuno-convergent binding motif

Yun Mou¹,², Xin X Zhou¹, Kevin Leung¹, Alexander J Martinko¹,³, Jiun-Yann Yu⁴, Wentao Chen¹, & James A Wells¹,⁵*

¹Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, USA.
²Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan.
³Chemistry and Chemical Biology Graduate Program, University of California, San Francisco, San Francisco, California, USA.
⁴Department of Electrical, Computer, and Energy Engineering, University of Colorado, Boulder, Colorado, USA.
⁵Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, California, USA.

*To whom correspondence should be addressed: James A. Wells, Ph.D. Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, USA. Phone: +1-415-514-4498 Email: [email protected]

Keywords: phosphotyrosine, antibody, X-ray crystallography, phage display, computational protein design

ABSTRACT

Phosphotyrosine (pY) is one of the most highly studied posttranslational modifications and is responsible for tightly regulating many signaling pathways in eukaryotes. Pan-specific pY antibodies have emerged as powerful tools for understanding the role of these modifications. Nevertheless, structures have not been reported for pan-specific pY antibodies, greatly impeding the further development of tools for integrating this ubiquitous posttranslational modification using structure-guided designs. Here, we present the first crystal structures of two widely utilized pan-specific pY antibodies, PY20 and 4G10. The two antibodies, although developed independently from animal immunizations, have surprisingly similar modes of recognition of the phosphate group, implicating a generic binding structure among pan-specific pY antibodies. Sequence alignments revealed that many pY binding residues are predominant in the mouse V germline genes, which consequently led to the convergent antibodies. Based on the convergent structure, we designed a phage display library by lengthening the CDR-L3 loop with the aid of computational modeling. Panning with this library resulted in a series of 4G10 variants with 4- to 11-fold improvements in pY binding affinity. The crystal structure of one improved variant showed remarkable superposition with the computational model, in which the lengthened CDR-L3 loop creates an additional hydrogen bond that is linked indirectly to the phosphate group via a water molecule. The engineered variants exhibited superior performance in Western blot and immunofluorescence.


INTRODUCTION

Phosphotyrosine (pY) is a critical posttranslational modification (PTM) that modulates numerous signaling events in eukaryotes, including growth factor signaling,¹ cell differentiation,² and T-cell activation.³ Because pY PTMs play pervasive roles in cell biology, their dysregulation has been implicated in many human diseases, such as neurodegeneration⁴ and cancer.⁵,⁶ Collectively, more than 44,000 nonredundant pY sites have been discovered in human and mouse according to PhosphoSitePlus,⁷ and the number is steadily growing along with advances in pY proteomic technologies.⁸ With hundreds of pY kinases and phosphatases involved, understanding the complex network of pY regulation is a significant challenge in the post-genome era.⁹

Unlike the other two common phosphorylation PTMs (pS and pT), which are relatively abundant in cells,¹⁰ pY is far less abundant¹¹ (approximately 0.1% or less of all phosphorylation PTMs). Studying pY PTMs therefore requires very specific and sensitive probes. During the last two decades, many antibodies¹²⁻¹⁵ and pY binding domains¹⁶,¹⁷ have been developed to pan-specifically isolate pY PTMs, with varying degrees of success.⁸,¹⁸ Improving the sensitivity and specificity of these probes is still highly desirable. Structure-guided mutagenesis combined with binding selections has been applied to generate sequence-specific recombinant antibodies for pS and pT, but not yet for pY. Key to these studies was the use of structure-guided phage display from a structurally defined scaffold to focus mutations, leading to improved binding to pS and pT peptides with respect to both the phosphorylation state and the sequence. Despite a long history of developing and employing pY antibodies in many applications (Western blot, immunofluorescence, proteomics, etc.), the structural mechanism by which these antibodies recognize the pY group remains unknown.

Here, we report the first crystal structures of two widely utilized pan-specific pY antibodies, PY20¹⁵ and 4G10.¹⁹,²⁰ Although the two antibodies were generated independently from animal immunizations, to our surprise their mode of molecular recognition of the pY group is highly conserved. Both structures host a deeply buried cationic binding site for the phosphate group that provides multiple hydrogen bonds and salt bridges from the same residues in the antibodies. The tyrosine side chain extends the phosphate group into a binding site that would not accommodate pS or pT, and it does so without π-π stacking or cation-π interactions with the antibody. Surprisingly, many of the pY binding residues are located on β-strand structures in the framework regions, and not in the CDR loops. Sequence analysis revealed that these residues are highly predominant in the mouse V germline genes. Consequently, these V germline genes could be quickly affinity-matured into the structurally convergent pY antibodies upon immunization. Based on the convergent structure, we further optimized the binding affinity with the aid of computational design and phage display selection. These studies provide a recombinant scaffold for structure-guided engineering of pY binders with high affinity and superior performance.

RESULTS

Crystal structures of PY20 and 4G10 Fabs in complex with a sulfate group

To facilitate the engineering of improved pY probes, we first sought to solve the crystal structure of a pY antibody in complex with a phosphopeptide in order to understand the mechanism by which it recognizes pY. Two widely utilized pY antibodies, PY20 and 4G10, were recombinantly expressed in E. coli in a Fab format. Co-complex crystallographic screens were set up with the purified Fab mixed with a synthetic 3-mer peptide, LpYL. Initial attempts were impeded by protein aggregates and failed to generate good-quality crystals, possibly due to the low stability of the antibodies. The 4D5 human antibody scaffold from trastuzumab is very stable (Tm ~82 °C), and we and others have found that antibodies derived from it are very stable.²¹ Thus, we reasoned that we might increase the stability by humanizing PY20 and 4G10, grafting their CDRs onto the 4D5 scaffold²² (Table S1). Indeed, the humanized Fabs PY20_4D5 and 4G10_4D5 (the subscript denotes the scaffold) exhibited high thermostability, with Tm values of 76 °C and 80 °C, respectively, compared with 65 °C for the parental 4G10. Importantly, an ELISA confirmed that both PY20_4D5 and 4G10_4D5 preserved pY binding affinity and specificity (Figure S1). Cocrystallization of PY20_4D5/LpYL and 4G10_4D5/LpYL generated crystals with good diffraction properties (Table S2). Unfortunately, neither crystal structure revealed any electron density for the LpYL peptide, even though the Fab fragments could be well refined. Instead, there was a tetrahedron-like density buried in the CDR pocket at the same position in PY20_4D5 and 4G10_4D5 (Figure 1). We found that all the crystallization hits were grown in buffer containing either phosphate or sulfate salts, which possibly outcompeted the pY peptide for binding. Indeed, a sulfate group modeled into the electron density was well coordinated by multiple hydrogen bonds and salt bridges. Interestingly, the proposed sulfate binding sites are very similar between PY20_4D5 and 4G10_4D5. Both structures use T33_H and H35_H from the stem region of CDR-H1 and R95_H from the stem region of CDR-H3 to bind the sulfate ion (the subscripts H and L denote the heavy and light chains). The major difference is that 4G10_4D5 has an additional interaction with the sulfate ion from R96_L of the CDR-L3 loop. This observation is consistent with our kinetic binding data, in which 4G10_4D5 shows tighter binding to pY than PY20_4D5 (Table 1). To further test our hypothesis that the sulfate ion is a competitive inhibitor of pY binding, we performed binding experiments in the presence of excess sulfate ion and found that the pY peptide no longer bound to the antibody (Figure S2).
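The competition argument can be made quantitative with the standard single-site competitive binding model, in which an inhibitor raises the apparent dissociation constant of the ligand. The sketch below is only an illustration and is not the analysis used in this study: the function name `fraction_bound` is hypothetical, all concentrations and constants are made-up values (the affinity of sulfate for the antibody is not reported in the text), and the model assumes that the free ligand concentration is approximately the total ligand concentration.

```python
def fraction_bound(ligand, kd, inhibitor=0.0, ki=float("inf")):
    """Fraction of antibody sites occupied by the ligand (pY peptide).

    Single-site competitive model: the inhibitor raises the apparent
    dissociation constant to kd * (1 + inhibitor / ki).
    All concentrations and constants must share the same units.
    """
    kd_apparent = kd * (1.0 + inhibitor / ki)
    return ligand / (ligand + kd_apparent)


# Hypothetical numbers for illustration only (not measured values).
peptide = 1e-6      # 1 uM pY peptide
kd_peptide = 1e-6   # assumed peptide KD of 1 uM
ki_sulfate = 1e-3   # assumed sulfate Ki of 1 mM

print(fraction_bound(peptide, kd_peptide))                    # no sulfate
print(fraction_bound(peptide, kd_peptide, 1.0, ki_sulfate))   # 1 M sulfate
```

Under these assumed values, half of the sites are occupied without sulfate, whereas a large excess of sulfate leaves well under 1% occupied, which is the qualitative behavior expected of a competitive inhibitor.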

Figure 1 | Crystal structures of PY20_4D5 and 4G10_4D5 in complex with a sulfate ion show a similar binding mode. (a) PY20_4D5/sulfate complex structure. Two residues from CDR-H1 (T33_H, H35_H) and one residue from CDR-H3 (R95_H) form hydrogen bonds or salt bridges to the sulfate ion. (b) 4G10_4D5/sulfate complex structure. 4G10_4D5 uses the same three residues as PY20_4D5 to bind the sulfate ion, with an additional binding residue, R96_L, from CDR-L3. The omit map (Fo-Fc) of the sulfate group is shown at the 3σ contour level in both (a) and (b).


Table 1 | Kinetic measurements of antibody/pY binding using biolayer interferometry. Antibodies used in this study were tested for their binding kinetics to a pY peptide (GGGpYGGG). *For 4G10_4D5 and its variants, the 93_L position (defined by 4G10_4D5) and its replacement sequences (between 92_L and 94_L) are shown, respectively. The 93_L position defined by 4G10_4D5 is not applicable to PY20_4D5 and D59N PY20_4D5 because their CDR-L3 is one residue longer than that of 4G10_4D5. The sequence motifs XX(T/S)S and XXXG are shown in bold.

Crystal structure of 4G10 Fab in complex with the pY peptide

Since 4G10_4D5 is a superior pY binder to PY20_4D5, we performed a more extensive cocrystallization screen with 4G10_4D5/LpYL in the absence of sulfate or phosphate ions, in hopes of obtaining crystals with LpYL bound. Indeed, we identified such a condition and obtained crystals that diffracted to a resolution of 2.3 Å (Table S2). The crystal structure showed clear electron density for the first two residues of the LpYL peptide bound in the 4G10_4D5 CDR pocket (Figure 2). The phosphate group of pY is deeply buried in the stem region of the CDRs at the same position where the sulfate ion was observed in the previous PY20_4D5 and 4G10_4D5 structures.

Four features of the 4G10_4D5/LpYL structure appear to confer pan-specificity for pY substrates. (i) The residues R96_L, T33_H, H35_H, and R95_H form direct hydrogen bonds or salt bridges to the phosphate group of pY (Figure 2b). (ii) Three water molecules surrounding the pY residue form hydrogen bonds with the phosphate group, and each of them mediates hydrogen-bond interactions between the 4G10_4D5 antibody and the pY residue (Figure 2c). The residues Y91_L and Y100_H, which pack in a parallel-displaced π-π stacking arrangement, each form a hydrogen bond to one water molecule. The third coordinated water molecule lies on top of the tyrosine ring and is stabilized by three hydrogen bonds: one from the phosphate group, one from the backbone carbonyl of the pY residue, and one from N52_H of the antibody. (iii) The residues G55_H and G56_H form a "seat" that is shape-complementary to the backbone of the residue preceding pY (Figure 2b and 2c). G55_H has torsion angles (phi = 95.8°, psi = 15.8°) that are forbidden for all amino acids except glycine. Position 56 can only accommodate a glycine without creating steric clashes with the pY peptide. (iv) The residue preceding pY points its side chain toward the solvent, i.e., it does not interact with the antibody. The residue following pY does not show clear electron density, implying no static interactions with the antibody. This explains the pan-specificity of 4G10: the antibody interacts only with the pY residue and not with the neighboring ones.

Surprisingly, there are no π-π stacking or cation-π interactions between the antibody and the pY residue. The tyrosine simply acts as a bridge to extend the phosphate group into the deep binding pocket. This confers the pY specificity of 4G10_4D5 over pS and pT, because pS and pT are too short to reach the phosphate binding site. Intriguingly, among the nine critical residues mentioned above, only one residue (G55_H) is in a loop, whereas the other eight residues (Y91_L, R96_L, T33_H, H35_H, N52_H, G56_H, R95_H, Y100_H) that mediate phosphate binding are located on β-strands that are either at or very close to the framework regions. Overall, the 4G10_4D5 structure has a fairly deep and sophisticated binding site that provides eight hydrogen bonds and/or salt bridges directly or indirectly bound to the phosphate group.
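Backbone torsion angles such as the phi and psi values quoted above for the glycine at heavy-chain position 55 are signed dihedral angles defined by four consecutive backbone atoms. The sketch below shows one common way to compute such an angle from Cartesian coordinates. It is an illustration under stated assumptions, not the procedure used in this study: the helper names are hypothetical and the coordinates are toy points rather than atoms from the deposited structures.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points.

    For phi pass (C of the previous residue, N, CA, C);
    for psi pass (N, CA, C, N of the next residue).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Toy coordinates (hypothetical, not taken from the crystal structures).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))   # +90 degrees
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, -1, 1)))  # -90 degrees
```

A positive phi, as reported for this glycine, falls in a region of the Ramachandran plot that is sparsely populated by residues carrying a side chain, which is consistent with the requirement for glycine at this position.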

Figure 2 | Crystal structure of the 4G10_4D5/LpYL peptide complex shows a complementary hydrogen-bonding site for binding pY. (a) Overview of the co-complex structure. The light chain is shown in green, the heavy chain in cyan, and the pY peptide in yellow. (b) The Fo-Fc omit map of the pY peptide (contour level = 3σ). The phosphate-binding residues (R96_L, T33_H, H35_H, R95_H) are shown. The residues G55_H and G56_H from CDR-H2 form a shape-complementary conformation that accommodates the backbone of the residue preceding pY. (c) Same as (b), except that the three bound waters and their binding residues (Y91_L, N52_H, Y100_H) are shown.

Structure-guided design to improve the 4G10/pY binding

We next sought to leverage the structural insights gained from our crystallographic data to further engineer 4G10 to improve its properties as a detection reagent. Based on the PY20_4D5/sulfate structure, we designed a single mutant, D59N, which we hypothesized might improve the affinity by replacing the repulsive interaction between D59 and pY with a hydrogen bond between the N59 side-chain amide and pY (Figure S3). The kinetic binding assay showed that the D59N mutant indeed improved the affinity to 1 μM, approximately 2-fold better than PY20_4D5 but still weaker than 4G10_4D5 (Table 1). We next designed single mutants of 4G10_4D5 in an attempt to introduce a π-π stacking or cation-π interaction with the pY residue, guided by the 4G10_4D5/LpYL structure. However, all eight single mutants (N52F, N52K, N52R, N52H, I58F, I58K, I58R, I58H) significantly diminished pY binding (Figure S4). Given the limited space at positions 52_H and 58_H, we speculated that the diminished binding was caused by steric clashes with the pY peptide.

Solvent-accessible surface area (SASA) analysis showed that the pY group is well packed by 4G10_4D5 except that one side of pY is more solvent-exposed (the front side of Figure 2b). We therefore aimed to gain extra affinity from this side by more aggressive engineering. A close inspection of the CDRs revealed that a lengthened CDR-L3 loop might contribute additional interactions to the pY group from the solvent-exposed side. We used computational tools²³ to model various insertions between the 92_L and 94_L residues. We found that an insertion of three or four residues might provide new contacts to the pY group (Figure 3a). For example, based on the modeling, a 3-residue insertion with a serine at the C-terminus could create an additional hydrogen bond to the water molecule that mediates binding to the phosphate (Figure S5). We therefore prepared a 4G10_4D5 phage display library with the CDR-L3 lengthened by 0 to 4 residues using degenerate codons (Table S3). The library was panned against a pY peptide flanked by three glycines on each side (GGGpYGGG). Five rounds of selection with decreasing concentrations of the pY peptide (1 μM, 300 nM, 100 nM, 30 nM, and 10 nM) were performed. Good enrichment ratios, comparing the phage titers recovered on the pY peptide and on its non-phosphorylated counterpart, were obtained in each round (Figure S6). We picked 92 clones from rounds 3 to 5 (~30 clones from each round) and sequenced them. The sequencing results showed that the 3-residue insertion clones were the most predominant (84/92), whereas the 0-, 1-, 2-, and 4-residue insertions accounted for only 0, 0, 1, and 7 clones, respectively.

We then employed phage ELISAs to evaluate the pY binding specificity and affinity. The direct ELISA showed that all clones retained exquisite pY selectivity over non-phosphorylated tyrosine (Figure 3b, left). More than 60 clones showed improved affinity compared to the parental 4G10_4D5 in a competitive ELISA, in which a soluble pY peptide at 200 nM was used as the competitor (Figure 3b, right). Based on the competitive ELISA results, we selected the top 12 clones (all were 3-residue insertions) and expressed them recombinantly as Fabs for affinity measurements. The sequences of the top 12 clones could be grouped into two motifs: XX(T/S)S (6 clones) and XXXG (6 clones) (Figure S7). The biolayer interferometry measurements showed that all 12 clones exhibited better pY binding affinity than the parental 4G10_4D5 (Table 1). We further tested whether the choice of the 4D5 scaffold or the 4G10 scaffold would affect the binding affinity. We picked 6 clones from the top 12 variants and grafted the CDR-L3 sequence back onto the parental 4G10 scaffold. We found that the variants in the 4G10 and 4D5 scaffolds have comparable binding affinities (Table S4). Overall, the 18 4G10 variants we validated showed better pY binding affinity than the parental 4G10_4D5, with improvements ranging from 4.3- to 11.2-fold.
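Two of the numbers in this section lend themselves to a short worked calculation: the distribution of insertion lengths among the 92 sequenced clones, and the binding free energy that corresponds to a given fold improvement in affinity (ΔΔG = RT ln(fold)). The sketch below is illustrative only. The clone counts and fold values are taken from the text, whereas the temperature of 298.15 K and the function name `fold_to_ddg` are assumptions introduced here.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_to_ddg(fold, temperature_k=298.15):
    """Free-energy gain (kcal/mol) for a fold improvement in KD.

    The temperature is an assumption; it is not stated in the text.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold)


# Clone counts by CDR-L3 insertion length, as reported in the text.
clones_by_insertion = {0: 0, 1: 0, 2: 1, 3: 84, 4: 7}
total = sum(clones_by_insertion.values())
for length, count in clones_by_insertion.items():
    print(f"{length}-residue insertion: {count}/{total} ({count / total:.1%})")

# Reported range of affinity improvements over the parental antibody.
for fold in (4.3, 11.2):
    print(f"{fold}-fold -> {fold_to_ddg(fold):.2f} kcal/mol")
```

Under the assumed temperature, the reported 4.3- to 11.2-fold range corresponds to roughly 0.9 to 1.4 kcal/mol of additional binding free energy.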

Figure 3 | Computational modeling of CDR-L3 insertions and ELISA screening of the 4G10_4D5 CDR-L3 loop library generate superior pY antibodies. (a) Based on the 4G10_4D5/pY structure (shown in green), we computationally modeled insertions of various lengths into the CDR-L3 loop between the 92_L and 94_L residues. For each length, 100 loop conformations were modeled. The models are shown in magenta. (b) About 30 colonies each were picked from the Round 3, 4, and 5 selections for Fab phage ELISAs. Data from left to right: Round 3 clones, Round 4 clones, Round 5 clones, and 4 controls (*PY20_4D5, **D59N PY20_4D5, ***4G10_4D5, ****Blank). Left: direct ELISA. Blue and red bars are data using GGGpYGGG and GGGYGGG as the substrates, respectively. Right: competitive ELISA using 200 nM GGGpYGGG peptide as the soluble competitor. The red dashed line indicates the value for 4G10_4D5.

Structural elucidation of the improved pY antibody

In order to better understand the improved affinity of our CDR-L3 lengthened variants, we solved the crystal structure of 4G10-S5_4D5 in complex with the LpYL peptide (Table S2). The 4G10-S5_4D5 structure retains the LpYL binding mode and the three surrounding water molecules seen in 4G10_4D5 (Figure 4a). In addition, the last residue of the LpYL peptide was well defined in the electron density map. Like the residue preceding the pY, the residue following the pY also points its side chain toward the solvent, thus making no interactions with the antibody. The serine at position 95A_L from the lengthened L3 creates a new hydrogen bond to the water molecule that is also hydrogen bonded to Y91_L and the pY phosphate (Figure 4a). Furthermore, the CDR-L3 forms a loop that is structurally fixed by a hydrogen bond between S92_L and T95_L (Figure 4a). This hydrogen bond, although not directly contributing to pY binding, stabilizes the conformation of the CDR-L3 loop and keeps S95A_L in close proximity to the water molecule responsible for pY binding. The 4G10-S5_4D5/pY structure is well predicted by the computational model. In our in silico models, in which we built one hundred 3-residue-insertion CDR-L3 loops, there are two canonical conformations, one of which is much more energetically favorable than the other. Indeed, the 4G10-S5_4D5/pY crystal structure superimposes on the energetically favorable conformation of CDR-L3 in the model (Figure 4b, backbone r.m.s.d. < 0.5 Å).
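The backbone r.m.s.d. quoted for the comparison of model and crystal structure is the root-mean-square distance between matched backbone atoms. The sketch below computes it for two coordinate sets that are assumed to be already superimposed and listed in the same atom order. It is a minimal illustration, not the software used in this study: the function name `rmsd` is hypothetical and the coordinates are toy values.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    Both lists must hold (x, y, z) tuples in the same atom order and are
    assumed to be superimposed already; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for a, b in zip(coords_a, coords_b):
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(total / len(coords_a))


# Toy backbone coordinates in angstroms (hypothetical values).
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0)]
model = [(x + 0.3, y, z) for x, y, z in crystal]  # shifted by 0.3 A along x

print(f"backbone r.m.s.d. = {rmsd(crystal, model):.2f} A")
```

In practice the two structures would first be optimally superimposed, for example on the antibody framework, before the loop r.m.s.d. is evaluated.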

Figure 4 | Crystal structure of the 4G10-S5_4D5/LpYL complex superimposes on the design model. (a) The LpYL peptide shows a clear omit map (Fo-Fc, contour level = 3σ) for all three residues. The serine at position 95A_L in the engineered CDR-L3 (shown in green) forms a new hydrogen bond to the existing bound water in the crystal structure. The residue T95_L in the engineered CDR-L3 forms a hydrogen bond to S92_L. (b) Overlay of the crystal structure (green) and the computational model (green).

Engineered pY antibodies show superior performance in immunoaffinity applications

We compared the performance of the parental 4G10 and our affinity-improved variants in two experimental applications, pY Western blot and pY immunofluorescence. For the pY Western blot experiments, whole-cell lysate of Jurkat cells with or without pervanadate (a phosphotyrosine phosphatase inhibitor) pretreatment was used. Consistent with previous observations, and owing to the tight regulation of pY modifications in cells, hardly any pY signals were detected in the lysate without pervanadate pretreatment, reflecting dominant tyrosine phosphatase activity in cells (Figure S8). The lack of signal verifies that the 4G10 antibody and the engineered variants show very little general protein cross-reactivity in the whole-cell lysate. Conversely, the lysates pretreated with pervanadate show multiple bands across a wide range of molecular weights, reflecting the pY pan-specificity of 4G10 and its engineered variants (Figure 5a). Notably, many bands that are hardly detected by 4G10 are readily detected by the engineered variants, particularly in the range of ~25-55 kD. The signal improvement in this molecular weight range is up to 7.2-fold for 4G10-S5_4G10 compared to the parental 4G10 (Figure 5b). In terms of the overall signal intensity and number of bands detected, the best variants for the Western blot application are 4G10-G1_4G10, 4G10-G6_4D5, and 4G10-S5_4G10, with the overall intensity improved by ~200% compared to the parental 4G10.

We next sought to compare the immunofluorescence sensitivity of the parental 4G10 and the 3 best variants from the Western blot results. HeLa cells with or without pervanadate pretreatment were fixed and immunostained. Similar to the Western blot experiments, the cells without pervanadate pretreatment did not show any detectable immunofluorescence (data not shown), indicating tight pY regulation under physiological conditions and the absence of non-specific antibody binding. For cells pretreated with pervanadate, the parental 4G10 showed moderate immunofluorescence in the cytoplasmic region, whereas two engineered variants, 4G10-G6_4D5 and 4G10-S5_4G10, exhibited as much as 250% brighter signals in comparison (Figure 5c). The 4G10-G1_4G10 variant, however, showed only comparable intensity to the parental 4G10, despite its superior performance in the Western blot experiment. Collectively, these data illustrate that our engineered variants exhibit superior performance to the parental 4G10 in both pY Western blot and pY immunofluorescence experiments. One particular variant, 4G10-S5_4G10, showed the best signals in both applications.

Figure 5 | Comparison of 4G10 and the engineered variants shows improved performance in pY applications. (a) pY Western blot of Jurkat whole-cell lysate. The engineered variants showed significantly stronger signals than 4G10, especially in the range of MW 25-55 kD (framed in the red box). (b) Quantification of the pY Western blot signals in (a). (c) pY immunofluorescence of HeLa cells. The pY signal is pseudo-colored in red, and the nuclear staining is pseudo-colored in blue.

DISCUSSION


We have successfully grafted two widely utilized pan-specific pY antibodies, PY20 and 4G10, onto the highly stable 4D5 scaffold and solved their crystal structures to facilitate further engineering and understanding of pY molecular recognition. Surprisingly, PY20 and 4G10 share nearly the same mode of binding for the phosphate group despite being raised from independent animal immunizations. This remarkable example of convergence could represent a canonical pY binding structure shared by other naturally occurring antibodies. Another intriguing feature revealed by the structures is that all the binding residues, including those that have direct or indirect interactions with the pY residue, are located in the stem regions (mostly in β-strand structure) of the CDRs, and not in the middle of the CDR loops. Therefore, the binding site is rather static and does not require a recognition-induced conformational change for binding. The CDRs around the binding site are relatively short and adopt common canonical conformations²⁴,²⁵ (Table S5) with no steric hindrance for the pY substrate. Overall, the structure harbors a deep but well-exposed and electrostatically complementary binding site for pY. Surprisingly, there are no π-π or cation-π interactions with the aromatic ring of pY. The pY specificity comes from the depth of the binding site, which cannot be reached by pS or pT because of their short side chains. Figure 6 illustrates the structural explanation of pY antibody specificity over pS/pT and Y. Our structures also reveal that the residues neighboring pY do not directly interact with the antibody, thereby conferring pY pan-specificity.

This is in contrast to other natural pY binding proteins, such as SH2 domains, which recognize their pY substrates with strong sequence preference. SH2 domains have two binding pockets, namely the pY binding pocket and the peptide pocket. Similar to the 4G10 paratope, the SH2 pY binding pocket contains cationic residues (typically two arginines and one histidine) critical for phosphate coordination. However, binding to this pocket alone is relatively weak and contributes only roughly half of the substrate binding energy.²⁶ The other half of the binding energy is provided by the extended peptide groove, which recognizes the residues C-terminal to the pY group. This pocket is highly variable among SH2 domains in order to achieve different peptide sequence specificities. Inspired by the SH2 domains, we previously produced generalized recombinant antibody scaffolds for pS, pT, and pY peptide binding using CDR-H2 as the phosphate pocket and CDR-L3/CDR-H3 as the peptide pocket.¹⁴ While the pS and pT scaffolds were highly specific for their cognate peptides, the pY scaffold did not show a strong preference for pY over non-phosphorylated tyrosine. The structure showed little burial of the pY side chain in CDR-H2.

Figure 6 | Structural elucidation of pY antibody binding specificity. Left: The pY residue bound in the pY antibody. The pY residue is well packed in the binding site with many hydrogen bonds and salt bridges to the antibody, as shown in Figure 2. Middle: The pS residue modeled in the pY antibody. The top cartoon schematic shows that pS is too short for its phosphate group to reach the binding pocket. The bottom figure models the pS phosphate group at the binding site, which results in steric clashes between the pS backbone and the antibody (red dashed circle). Right: A tyrosine residue modeled in the pY antibody. It lacks the phosphate group needed to interact with the antibody (magenta dashed circle).

Ruff-Jamison et al.12 performed a systematic comparison of several antibodies generated by mouse immunization with pY antigens. Based on sequence similarity, it was deduced that two pairs of antibodies, PY2/PY54 and PY20/PY69, each arose from a common ancestral clone and diverged later through somatic mutation. PY42 and 129 IgM likely arose from independent clones. Across these six clones, they found common and distinct features in all six CDRs at the sequence level. It was not clear whether the sequence consensus implies a similar pY binding structure shared by all six antibodies, and if so, which consensus residues are responsible for pY binding. With the PY20 and 4G10 structures solved here, we re-analyzed the sequence alignment and discovered that most residues critical for pY binding in the structures are indeed conserved across the PY clones and 129 IgM (Table 2). The conserved residues can be structurally categorized into three types: (i) residues that bind pY directly (magenta); (ii) residues forming a water-mediated hydrogen bond network around pY (yellow); and (iii) the GG motif in CDR-H2 (green). All antibodies have at least two residues of each type, with the exception of 129, which lacks the residues that bind water molecules. Consistently, 129 is the weakest pY antibody, with a binding affinity 2-3 orders of magnitude lower than that of the PY clones. The conservation across different types of binding residues implies that these antibodies likely adopt the same structure for pY binding in spite of their different clonal origins.
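To put the 2-3 orders of magnitude affinity gap between 129 and the PY clones in energetic terms, the standard relation between dissociation constants and binding free energy can be applied. This is only an illustrative conversion: the temperature of 298 K is an assumption, and the superscripts 129 and PY are labels introduced here rather than notation from the source.

$$
\Delta\Delta G = RT \ln\frac{K_D^{129}}{K_D^{\mathrm{PY}}}, \qquad RT \approx 0.592\ \mathrm{kcal\,mol^{-1}}\ \text{at } 298\ \mathrm{K}
$$

$$
\Delta\Delta G \approx 0.592 \times \ln(10^{2}) \approx 2.7\ \mathrm{kcal\,mol^{-1}} \quad \text{to} \quad 0.592 \times \ln(10^{3}) \approx 4.1\ \mathrm{kcal\,mol^{-1}}
$$

Under these assumptions, the missing water-binding residues in 129 would correspond to a loss of roughly 3-4 kcal/mol of binding free energy.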


Table 2 | Sequence alignments of pY antibodies. The sequence alignments reveal that PY2, PY20, PY42, PY54, PY69, 129, and 4G10 might share similar pY binding modes, as the critical binding residues are conserved. Magenta: phosphate binding; yellow: water binding; green: the GG motif for binding the pY-proceeding residue.

The structural convergence among multiple independent pY antibodies is rather surprising given that VDJ recombination and somatic hypermutation are highly variable and random processes. Since the immuno-converged pY binding site is composed of many residues from the β-strands in the framework region, and not the CDR loops, we suspected that some of the residues critical for pY binding might pre-exist in the mouse V germline genes prior to somatic hypermutation. Indeed, sequence alignments to the mouse V germline database (ABG database)27 revealed that six (Y91L, T33H, H35H, N52H, G55H, G56H) of the nine residues critical for pY binding in 4G10 were predominant in the consensus sequences of mouse VH and Vk germline genes. The Y91L residue was found in 9 of 67 Vk germline genes, and the T33H, H35H, N52H, G55H, and G56H residues were found in 10, 83, 54, 135, and 40 of 185 VH germline genes, respectively. Furthermore, 2 VH germline genes simultaneously contain all 5 of these critical residues, whereas 14 and 33 VH germline genes contain 4 and 3 of the 5 critical residues, respectively (Table 3). Therefore, we believe that the immuno-convergence of pY binding residues arose from pre-existing mouse germline genes, rather than from independent solutions found through somatic hypermutation.

This immuno-convergence, in which immunization with a particular antigen induces similar antibody sequences from diverse genetic backgrounds, is a longstanding observation in both humans and mice28. For instance, it has long been known that immunization with simple hapten antigens, such as arsonate (Ars)29 or 4-hydroxy-3-nitrophenylacetyl (NP)30, induces essentially identical antibodies in different mice. The molecular mechanism behind this convergence was not clear, but it is generally assumed that the antibody repertoire has very limited solutions for this kind of antigen. Our structural data and sequence analysis provide an alternative explanation. We found several pre-existing V germline genes encoding a set of residues outside the CDR region that are favorable for pY antigen binding. These genes would then be quickly affinity-matured, leading to the convergent antibodies. We believe that pre-disposed binding residues in the V germline genes are more likely to occur for simple antigens, such as pY and small haptens. In addition to haptens, simple epitopes from complex antigens, such as immunodeficiency virus, dengue virus, and influenza virus, have also resulted in convergent immunoglobulins.31,32 The convergence would be especially pronounced if many germline genes share consensus residues that contribute to antigen binding, as in the case of pY.
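The germline tallies reported above (per-residue counts and the number of genes carrying several critical residues at once) can be reproduced in spirit with a short script. The following is a minimal sketch, not the authors' pipeline: the counts in `REPORTED` are copied from the text, while `TOY_GENES`, the gene names, and the use of plain integers for residue positions are hypothetical simplifications (real antibody numbering schemes include insertion codes).

```python
from collections import Counter

# Counts quoted in the text: residue -> (germline genes containing it, genes examined).
REPORTED = {
    "Y91L": (9, 67),     # Vk germline genes
    "T33H": (10, 185),   # VH germline genes
    "H35H": (83, 185),
    "N52H": (54, 185),
    "G55H": (135, 185),
    "G56H": (40, 185),
}

# The five VH residues considered in the co-occurrence analysis (position -> amino acid).
REQUIRED_VH = {33: "T", 35: "H", 52: "N", 55: "G", 56: "G"}

# Hypothetical toy genes, each reduced to the residues at the positions of interest.
TOY_GENES = {
    "VH_toy_A": {33: "T", 35: "H", 52: "N", 55: "G", 56: "G"},
    "VH_toy_B": {33: "A", 35: "H", 52: "N", 55: "G", 56: "G"},
    "VH_toy_C": {33: "A", 35: "S", 52: "N", 55: "G", 56: "D"},
}


def fraction_table(counts):
    """Convert (hits, total) pairs into fractions of germline genes."""
    return {res: hits / total for res, (hits, total) in counts.items()}


def tally_co_occurrence(genes, required):
    """Count genes by how many of the required residues they carry.

    Returns a Counter mapping 'number of matching residues' -> 'number of genes'.
    """
    tally = Counter()
    for residues in genes.values():
        matches = sum(residues.get(pos) == aa for pos, aa in required.items())
        tally[matches] += 1
    return tally


if __name__ == "__main__":
    for res, frac in fraction_table(REPORTED).items():
        print(f"{res}: {frac:.1%} of germline genes")
    print(dict(tally_co_occurrence(TOY_GENES, REQUIRED_VH)))
```

Applied to a real aligned germline set, `tally_co_occurrence` would yield the kind of breakdown summarized in Table 3 (genes carrying 5, 4, or 3 of the 5 residues), provided the alignment uses the same numbering as the structures.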

Table 3 | Comparison of the convergent pY binding residues with the mouse V germline genes. Top: number of V germline genes that contain the convergent pY binding residues. Bottom: number of VH germline genes that contain multiple convergent pY binding residues.

Traditional in vitro antibody affinity maturation often relies on random or rational mutagenesis of the CDRs without altering the loop lengths. Here, we adopted a different approach, in which the CDR-L3 loop was lengthened based on the 4G10/pY co-complex structure, and successfully created affinity-improved variants. Our approach leverages the combined power of computational loop modeling and experimental phage display selection. One of the improved variants was structurally validated by X-ray crystallography, and its structure was virtually superimposable on the design model. Advances in computational tools have enabled accurate loop modeling in several examples, including building long loops with high resolution33, altering enzyme specificity34, and predicting antibody hypervariable loop conformations35. We showed that loop modeling can also be a useful tool for antibody affinity maturation. The affinity-improved variants performed better than the parental 4G10 in two pY applications, Western blot and immunofluorescence. For comparison, Mayer and coworkers profiled the pY proteome using a wide variety of SH2 domains.36 Unlike the pan-specific PY20, 4G10, and 4G10 variants, SH2 domains showed a strong preference toward their native pY substrates. A judicious combination of particular SH2 domains with improved 4G10 variants may maximize the depth of pY profiling.

Our work provides, for the first time, structural insights into the mechanism of pY recognition by the most widely used pY antibodies in biological research. We found that pY antibodies derived from independent immunizations share a convergent pY binding site.


Based on the convergent pY antibody structures, we propose a hypothesis that explains the long-observed convergent immunoglobulin responses: the mouse V germline genes share predominant antigen-binding residues, and therefore little affinity maturation through V gene hypermutation is required to raise high-affinity antibodies. Furthermore, we demonstrate how the convergent pY antibody structures can be harnessed to build improved next-generation tools for better pan-specific detection and isolation of the pY proteome. To extend the applications enabled by the pan-specific pY antibody structures we solved, our future work will focus on developing sequence-specific pY antibodies against important pY targets in physiological or pathological states. Finally, the high-resolution structures of pY antibodies also provide a good starting point for engineering antibodies against other structurally related PTMs, such as 1- or 3-phosphohistidine and sulfotyrosine.

ONLINE METHODS

Construct preparation, protein expression, and purification. All the Fabs were constructed in a dual-expression vector that expresses the light chain and the heavy chain with the pelB and stII signal peptides, respectively, for periplasmic expression. A C-terminal 6xHis tag was placed on the heavy chain. The CDR grafting was achieved by Kunkel cloning. The Fabs were expressed in the C43(DE3) E. coli strain at 30 °C for approximately 20 hours with 1 mM IPTG induction21. The cells were harvested by centrifugation and lysed using B-PER lysis buffer. For the 4D5 scaffold Fabs, the lysate was incubated at 60 °C for 20 minutes and centrifuged to remove the inclusion bodies. The Fabs were purified on Ni2+-NTA resin and buffer-exchanged into TBS buffer for further characterization.


X-ray crystallography. The Fabs were concentrated to 15 mg/mL using 30 kDa MWCO Amicon Ultra centrifugal filter units. The HPLC-purified LpYL peptide was purchased from Sino Biological. The Fab and the peptide were mixed at a 1:5 molar ratio and screened for crystallization conditions using hanging-drop vapor diffusion at room temperature. The PY204D5 crystals were grown in 20% PEG4000 and 0.336 M ammonium sulfate, pH 6.0. The 4G104D5 crystals were grown in 20% PEG4000, 0.2 M ammonium sulfate, and 0.1 M sodium acetate at pH 5.5. The 4G104D5/LpYL co-complex crystals were grown in 20% PEG3350, 0.2 M potassium/sodium tartrate, and 0.1 M Bis-Tris propane at pH 6.5. The c310/LpYL co-complex crystals were grown in 26% PEG3350, 0.1 M Bis-Tris, and 0.16 M ammonium acetate at pH 6.0. All crystals were soaked in well solution containing 15% glycerol and flash-cooled in liquid nitrogen. Diffraction data were collected at Beamline 8.3.1 of the Advanced Light Source at Lawrence Berkeley National Laboratory. iMosflm37 and XDS38 were used for data processing. Phases were obtained by molecular replacement using PHENIX39, initially with the CDR-trimmed Fab structure40 from PDB 1BJ1 as the search model and later with the 4G104D5 structure. Ellipsoidal truncation41 of the PY204G5 dataset was performed on the Diffraction Anisotropy Server (University of California, Los Angeles) to correct for the strong anisotropy along the b direction, which significantly improved the electron density map. Further refinement and real-space adjustment were done with PHENIX39 and Coot42, respectively. The data statistics are listed in Table S2.

Computational modeling. The Triad computational protein design suite23 was employed for loop modeling of the 4G104D5 CDR-L3 loop. The 4G104D5/LpYL co-complex structure was used as input. The 92L and 94L residues were fixed as anchor points, and the 93L residue was replaced with 1, 2, 3, or 4 alanines or glycines. During the loop modeling, the nearby residues were repacked, and the overall structure was energy-minimized. For each loop length, the 100 lowest-energy conformations were output. Based on the Ala/Gly loops, the loop sequence was designed to maximize the binding affinity to the LpYL peptide.

Phage library construction. The L3 library was constructed from randomized oligonucleotides containing degenerate codons. The phagemid containing the 4G104D5 gene was double-digested with PstI and KpnI, whose sites flank CDR-L3. Five pairs of forward and reverse oligonucleotides encoding L3 loops of different lengths were purchased from Integrated DNA Technologies. The oligonucleotides were 5' phosphorylated by T4 polynucleotide kinase at 37 °C for 1 hour. An equimolar mixture of the forward and reverse oligonucleotides was heated to 95 °C for 10 min and gradually cooled to room temperature. The annealed dsDNA was ligated into the PstI/KpnI-digested phagemid and transformed into SS320 electrocompetent cells. After one hour of recovery at 37 °C, the cells were expanded into 500 mL 2xYT with 50 µg/mL carbenicillin. The M13KO7 helper phage was added to the culture at MOI = 10 when OD600 reached 0.6. The culture was grown for approximately 20 hours with shaking at 250 rpm at 37 °C. The next day, the cells were pelleted by centrifugation. The phage was precipitated from the supernatant by adding 1/5 volume of 20% PEG 8000 and 2.5 M NaCl. The phage library was resuspended in TBS buffer with 50% glycerol and 2 mM EDTA and stored at -20 °C.

Phage display selection. All phage selections were done according to previous protocols21. Briefly, the L3 phage library was first enriched on protein L magnetic beads to deplete phage displaying no Fab or truncated Fab. The enriched library was then selected against a biotinylated peptide, GGGpYGGG, immobilized on streptavidin-coated magnetic beads. A biotinylated peptide, GGGYGGG, was used for library clearance and enrichment tests. In total, five rounds of selection were performed with decreasing amounts of GGGpYGGG peptide (1 µM, 300 nM, 100 nM, 30 nM, and 10 nM). After each round, the phage titer was determined according to standard protocols. Briefly, the acid-eluted phage was serially diluted and added to mid-log phase XL1-Blue E. coli. The infected culture was incubated for 20 min at room temperature on an orbital shaker. The cells were spotted on LB-agar plates with carbenicillin and incubated overnight at 37 °C. The enrichment was determined by comparing the phage titers of the GGGpYGGG and GGGYGGG selections.

Phage ELISAs. ELISAs were performed according to standard protocols. Briefly, a single colony from the phage selection was picked into 1 mL 2xYT with 50 µg/mL carbenicillin, 5 µg/mL tetracycline, and 10^10 M13KO7 helper phage particles. The culture was grown for approximately 20 hours at 37 °C in a shaking incubator. The overnight culture was spun down to pellet the cells. The phage supernatant was diluted 5-fold in TBS buffer with 0.05% Tween-20 and 0.2% BSA for ELISA. 384-well Maxisorp plates were coated with NeutrAvidin (5 µg/mL) overnight at 4 °C. Biotinylated GGGpYGGG or GGGYGGG (200 nM) was captured by NeutrAvidin in the Maxisorp plate for 30 min at room temperature. The plate was washed three times and loaded with diluted phage supernatant for 30 min. In the case of competitive ELISA, the diluted phage supernatant was first incubated with 200 nM biotinylated GGGpYGGG before loading on the plate. The plates were washed three times and loaded with anti-M13 horseradish peroxidase (HRP) conjugate for 15 min. The plate was washed three times and detected with TMB substrate at 450 nm.

Binding kinetics. Biolayer interferometry data were measured using an Octet RED384 (ForteBio). Biotinylated peptides GGGpYGGG or GGGYGGG were immobilized on streptavidin (SA) biosensors. Purified Fabs were used as the analyte in solution. TBS with 0.05% Tween-20 and 0.2% BSA was used for all diluents and buffers. A 1:1 monovalent binding model was used to fit the kinetic parameters (kon and koff).

Western blot. Jurkat cells were pre-treated with 0.1 mM freshly activated pervanadate for 15 minutes at 37 °C. The cells were harvested and lysed using RIPA buffer containing 1 mM EDTA and protease inhibitor cocktails. The lysate was briefly sonicated and quantified by BCA assay. 10 µg of the protein lysate was loaded into each well and run on a 4-20% SDS-PAGE gel. The proteins were transferred to a PVDF membrane. The membrane was briefly washed and blocked with Odyssey TBS blocking buffer at 4 °C overnight. The membrane was then incubated with the pY Fab at 200 nM at room temperature for 30 min, followed by a 6xHis tag antibody conjugated with DyLight 680. The membrane was imaged with a LI-COR system.

Immunofluorescence. HeLa cells were pre-treated with 0.1 mM freshly activated pervanadate for 15 minutes at 37 °C. The cells were fixed by adding -20 °C methanol for 5 minutes, followed by air-drying. The cells were rehydrated in TBS with 200 nM pY Fab at room temperature. After washing three times with TBS, fluorescence staining was performed by adding a 6xHis tag antibody conjugated with DyLight 680 and DAPI. The immunofluorescence images were taken using a Zeiss microscope with oil immersion objectives.
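The 1:1 monovalent binding model named under Binding kinetics can be written out explicitly. The sketch below uses the textbook 1:1 (Langmuir) rate equations and is an assumption about the functional form only: the function names, the `r_max` parameter, and the example rate constants are hypothetical and are not values or software from this study.

```python
import math


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D (M) from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on


def association_response(t, conc, k_on, k_off, r_max):
    """Sensor response at time t (s) during association at analyte concentration conc (M).

    For a 1:1 interaction the signal approaches its equilibrium value
    r_eq = r_max * conc / (conc + K_D) with observed rate k_obs = k_on * conc + k_off.
    """
    k_obs = k_on * conc + k_off
    r_eq = r_max * conc / (conc + k_off / k_on)
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r0, k_off):
    """Sensor response at time t (s) after moving the sensor to buffer, starting from r0."""
    return r0 * math.exp(-k_off * t)


if __name__ == "__main__":
    # Hypothetical rate constants chosen only to illustrate the model.
    k_on, k_off = 1e5, 1e-2
    kd = dissociation_constant(k_on, k_off)
    print(f"K_D = {kd:.2e} M")
    for t in (0, 30, 60, 120):
        print(t, round(association_response(t, kd, k_on, k_off, r_max=1.0), 4))
```

In practice, kon and koff are obtained by fitting these curves to the association and dissociation phases recorded at several analyte concentrations; the instrument's own analysis software typically performs that fit.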


Data availability. Atomic coordinates and structure factors have been deposited in the Protein Data Bank under accession codes PDB 6DEZ (PY204G5 complex with sulfate), 6DF0 (4G104D5 complex with sulfate), 6DF1 (4G104D5 complex with pY peptide), and 6DF2 (4G10-S54D5 complex with pY peptide).

Supporting information
Figure S1 | Binding specificity measured by ELISA
Figure S2 | Biolayer interferometry measurements of 4G104D5/pY
Figure S3 | Structure-guided design of PY204D5
Figure S4 | Biolayer interferometry measurements of 4G104D5 mutants
Figure S5 | Computational modeling of CDR-L3 insertions
Figure S6 | The phage enrichment test of the CDR-L3 library selection
Figure S7 | Sequence consensus of the top-12 hits
Figure S8 | pY Western blot of Jurkat whole-cell lysate
Table S1 | Humanization of PY20 and 4G10 onto the 4D5 scaffold
Table S2 | Crystallography statistics of the four structures in this study
Table S3 | The CDR-L3 phage display library
Table S4 | Biolayer interferometry measurements of CDR-L3 variants with the 4D5 and 4G10 scaffolds
Table S5 | The canonical conformations of the CDRs of the pY antibodies
Table S6 | The IgBlast alignments of PY20 and 4G10 sequences to the IMGT mouse germline

Acknowledgments. We thank Christopher Waddling at the UCSF crystallography facility for technical assistance with X-ray crystallography. We thank Beamline 8.3.1 at the Advanced Light Source, which is operated by the University of California Office of the President, Multicampus Research Programs and Initiatives grant MR-15-328599, the National Institutes of Health (R01 GM124149 and P30 GM124169), Plexxikon Inc., and the Integrated Diffraction Analysis Technologies program of the US Department of Energy Office of Biological and Environmental Research. The Advanced Light Source (Berkeley, CA) is a national user facility operated by Lawrence Berkeley National Laboratory on behalf of the US Department of Energy under contract number DE-AC02-05CH11231, Office of Basic Energy Sciences. This work was supported by a P41 grant from the NCI (P41 CA196276).

Competing financial interests. Y.M. and J.A.W. declare that they have competing financial interests. A provisional patent application related to this work is under consideration.

References

(1) Gotoh, N. Cancer Sci. 2008, 99, 1319.

(2) Kratchmarova, I.; Blagoev, B.; Haack-Sorensen, M.; Kassem, M.; Mann, M. Science 2005, 308, 1472.

(3) Smith-Garvin, J. E.; Koretzky, G. A.; Jordan, M. S. Annu. Rev. Immunol. 2009, 27, 591.

(4) Koffie, R. M.; Hyman, B. T.; Spires-Jones, T. L. Mol. Neurodegener. 2011, 6, 63.

(5) Rikova, K.; Guo, A.; Zeng, Q.; Possemato, A.; Yu, J.; Haack, H.; Nardone, J.; Lee, K.; Reeves, C.; Li, Y.; Hu, Y.; Tan, Z.; Stokes, M.; Sullivan, L.; Mitchell, J.; Wetzel, R.; Macneill, J.; Ren, J. M.; Yuan, J.; Bakalarski, C. E.; Villen, J.; Kornhauser, J. M.; Smith, B.; Li, D.; Zhou, X.; Gygi, S. P.; Gu, T. L.; Polakiewicz, R. D.; Rush, J.; Comb, M. J. Cell 2007, 131, 1190.

(6) Hunter, T. Keio J. Med. 2002, 51, 61.

(7) Hornbeck, P. V.; Kornhauser, J. M.; Tkachev, S.; Zhang, B.; Skrzypek, E.; Murray, B.; Latham, V.; Sullivan, M. Nucleic Acids Res. 2012, 40, D261.


(8) Sharma, K.; D'Souza, R. C.; Tyanova, S.; Schaab, C.; Wisniewski, J. R.; Cox, J.; Mann, M. Cell Rep. 2014, 8, 1583.

(9) Li, L.; Tibiche, C.; Fu, C.; Kaneko, T.; Moran, M. F.; Schiller, M. R.; Li, S. S.; Wang, E. Genome Res. 2012, 22, 1222.

(10) Khoury, G. A.; Baliban, R. C.; Floudas, C. A. Sci. Rep. 2011, 1, 90.

(11) Hunter, T. Curr. Opin. Cell Biol. 2009, 21, 140.

(12) Ruff-Jamison, S.; Campos-Gonzalez, R.; Glenney, J. R. J. Biol. Chem. 1991, 266, 6607.

(13) Tinti, M.; Nardozza, A. P.; Ferrari, E.; Sacco, F.; Corallino, S.; Castagnoli, L.; Cesareni, G. Nat. Biotechnol. 2012, 29, 571.

(14) Koerber, J. T.; Thomsen, N. D.; Hannigan, B. T.; Degrado, W. F.; Wells, J. A. Nat. Biotechnol. 2013, 31, 916.

(15) Glenney, J. R., Jr.; Zokas, L.; Kamps, M. P. J. Immunol. Methods 1988, 109, 277.

(16) Bian, Y.; Li, L.; Dong, M.; Liu, X.; Kaneko, T.; Cheng, K.; Liu, H.; Voss, C.; Cao, X.; Wang, Y.; Litchfield, D.; Ye, M.; Li, S. S.; Zou, H. Nat. Chem. Biol. 2016, 12, 959.

(17) Jones, R. B.; Gordus, A.; Krall, J. A.; MacBeath, G. Nature 2006, 439, 168.

(18) Blagoev, B.; Ong, S. E.; Kratchmarova, I.; Mann, M. Nat. Biotechnol. 2004, 22, 1139.

(19) Cutler, R. L.; Liu, L.; Damen, J. E.; Krystal, G. J. Biol. Chem. 1993, 268, 21463.

(20) Esinger, D.; Stiles, L.; Lamarche, A.; Jelinek, T. US patent US 6,824,989 B1, Sep 1, 2000.

(21) Hornsby, M.; Paduch, M.; Miersch, S.; Saaf, A.; Matsuguchi, T.; Lee, B.; Wypisniak, K.; Doak, A.; King, D.; Usatyuk, S.; Perry, K.; Lu, V.; Thomas, W.; Luke, J.; Goodman, J.; Hoey, R. J.; Lai, D.; Griffin, C.; Li, Z.; Vizeacoumar, F. J.; Dong, D.; Campbell, E.; Anderson, S.; Zhong, N.; Graslund, S.; Koide, S.; Moffat, J.; Sidhu, S.; Kossiakoff, A.; Wells, J. Mol. Cell. Proteomics 2015, 14, 2833.

(22) Carter, P.; Presta, L.; Gorman, C. M.; Ridgway, J. B. B.; Henner, D.; Wong, W. L. T.; Rowland, A. M.; Kotts, C.; Carver, M. E.; Shepard, H. M. Proc. Natl. Acad. Sci. USA 1992, 89, 4285.


(23) Mou, Y.; Yu, J. Y.; Wannier, T. M.; Guo, C. L.; Mayo, S. L. Nature 2015, 525, 230.

(24) North, B.; Lehmann, A.; Dunbrack, R. L., Jr. J. Mol. Biol. 2011, 406, 228.

(25) Adolf-Bryfogle, J.; Xu, Q.; North, B.; Lehmann, A.; Dunbrack, R. L., Jr. Nucleic Acids Res. 2015, 43, D432.

(26) Liu, B. A.; Engelmann, B. W.; Nash, P. D. FEBS Lett. 2012, 586, 2597.

(27) Almagro, J. C.; Hernandez, I.; Ramirez, M. D.; Vargas-Madrazo, E. Immunogenetics 1998, 47, 355.

(28) Dunand, C. J. H.; Wilson, P. C. Philos. Trans. R. Soc. B 2015, 370, 20140238.

(29) Wysocki, L. J.; Gridley, T.; Huang, S.; Grandea, A. G., 3rd; Gefter, M. L. J. Exp. Med. 1987, 166, 1.

(30) Bothwell, A. L.; Paskind, M.; Reth, M.; Imanishi-Kari, T.; Rajewsky, K.; Baltimore, D. Cell 1981, 24, 625.

(31) Strauli, N. B.; Hernandez, R. D. Genome Med. 2016, 8, 60.

(32) Parameswaran, P.; Liu, Y.; Roskin, K. M.; Jackson, K. K. L.; Dixit, V. P.; Lee, J. Y.; Artiles, K. L.; Zompi, S.; Vargas, M. J.; Simen, B. B.; Hanczaruk, B.; McGowan, K. R.; Tariq, M. A.; Pourmand, N.; Koller, D.; Balmaseda, A.; Boyd, S. D.; Harris, E.; Fire, A. Z. Cell Host Microbe 2013, 13, 691.

(33) Hu, X.; Wang, H.; Ke, H.; Kuhlman, B. Proc. Natl. Acad. Sci. USA 2007, 104, 17668.

(34) Murphy, P. M.; Bolduc, J. M.; Gallaher, J. L.; Stoddard, B. L.; Baker, D. Proc. Natl. Acad. Sci. USA 2009, 106, 9215.

(35) Marcatili, P.; Rosi, A.; Tramontano, A. Bioinformatics 2008, 24, 1953.

(36) Machida, K.; Thompson, C. M.; Dierck, K.; Jablonowski, K.; Karkkainen, S.; Liu, B.; Zhang, H.; Nash, P. D.; Newman, D. K.; Nollau, P.; Pawson, T.; Renkema, G. H.; Saksela, K.; Schiller, M. R.; Shin, D. G.; Mayer, B. J. Mol. Cell 2007, 26, 899.

(37) Battye, T. G.; Kontogiannis, L.; Johnson, O.; Powell, H. R.; Leslie, A. G. Acta Crystallogr. D Biol. Crystallogr. 2011, 67, 271.

(38) Kabsch, W. Acta Crystallogr. D Biol. Crystallogr. 2010, 66, 133.

(39) Adams, P. D.; Afonine, P. V.; Bunkoczi, G.; Chen, V. B.; Davis, I. W.; Echols, N.; Headd, J. J.; Hung, L. W.; Kapral, G. J.; Grosse-Kunstleve, R. W.; McCoy, A. J.; Moriarty, N. W.; Oeffner, R.; Read, R. J.; Richardson, D. C.; Richardson, J. S.; Terwilliger, T. C.; Zwart, P. H. Acta Crystallogr. D Biol. Crystallogr. 2010, 66, 213.

(40) Muller, Y. A.; Chen, Y.; Christinger, H. W.; Li, B.; Cunningham, B. C.; Lowman, H. B.; de Vos, A. M. Structure 1998, 6, 1153.

(41) Strong, M.; Sawaya, M. R.; Wang, S.; Phillips, M.; Cascio, D.; Eisenberg, D. Proc. Natl. Acad. Sci. USA 2006, 103, 8060.

(42) Emsley, P.; Cowtan, K. Acta Crystallogr. D 2004, 60, 2126.

