
Article

RPtag as an orally bioavailable, hyperstable epitope tag and generalizable protein binding scaffold Jennifer R. DeRosa, Brandon S. Moyer, Ellie Lumen, Aaron J. Wolfe, Meegan B. Sleeper, Anthony H. Bianchi, Ashleigh Crawford, Connor McGuigan, Danique Wortel, Cheyanne Fisher, Kelsey J. Moody, and Adam R Blanden Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.8b00170 • Publication Date (Web): 03 May 2018 Downloaded from http://pubs.acs.org on May 3, 2018



RPtag as an orally bioavailable, hyperstable epitope tag and generalizable protein binding scaffold ‡



Jennifer R. DeRosa1,2, Brandon S. Moyer1,2, Ellie Lumen1,2, Aaron J. Wolfe1,2, Meegan B. Sleeper1,2, Anthony H. Bianchi1,2, Ashleigh Crawford1,2, Connor McGuigan1,2, Danique Wortel1,2, Cheyanne Fisher1,2, Kelsey J. Moody1,2, Adam R. Blanden1,2,*

1 Ichor Therapeutics, Inc., 2521 US-11, Lafayette, NY 13084

2 RecombiPure, Inc., 2521 US-11, Lafayette, NY 13084

*Corresponding author. E-mail: [email protected]. Address: Ichor Therapeutics, Inc., 2521 US-11, Lafayette, NY 13084


ABSTRACT

Antibodies are the most prolific biologics in research and clinical environments because of their ability to bind targets with high affinity and specificity. However, antibodies also carry liabilities. A significant portion of the life science reproducibility crisis is driven by inconsistent performance of research-grade antibodies, and clinical antibodies are often unstable and require costly cold chain management to reach their destinations in active form. In biotechnology, antibodies are also limited by difficulty integrating them in many recombinant systems due to their size and structural complexity. A switch to small, stable, sequence-verified binding scaffolds may overcome these barriers. Here we present such a scaffold, RPtag, based on a ribose-binding protein (RBP) from extremophile Caldanaerobacter subterraneus. RPtag binds an optimized peptide with pM affinity, is stable to extreme temperature, pH, and protease treatment, readily refolds after denaturation, is effective in common laboratory applications, was rationally engineered to bind bioactive PDGF-β, and was formulated as a gut-stable orally bioavailable preparation.


INTRODUCTION

Since the FDA approval of the first clinical monoclonal antibody in 1986, monoclonal antibodies have skyrocketed in use, currently representing ~1/2 of the total biopharmaceuticals market with 4-5 new approvals every year and over 300 products in clinical development1,2. The near ubiquitous application of antibodies in routine research applications, including western blot, fluorescence microscopy, ELISA, immunoprecipitation, and protein purification, is driven by the ability of antibodies to recognize and bind target antigens with high affinity and specificity. However, despite the ubiquity of their applications, antibodies carry with them significant, perhaps insurmountable, liabilities. For example, it is widely recognized that a significant portion of the current life-science reproducibility crisis is driven by inconsistent performance of research-grade antibodies3–8. Indeed, some estimates indicate that less than half of commercially available antibodies are able to recognize their targets with their claimed specificity9,10, and less than 25% are usable for their indicated application7,11, causing some to suggest a move toward sequence-verified recombinant products12. While moving to recombinant, sequence-verified products in general would be a useful first step away from the largely empirical process currently used for routine antibody production, it still does not solve the problem of antibody fragility or quality control after production. Antibodies are often only stable for weeks to months in aqueous solution even when stored at 4 °C11,13,14, and require costly cold-chain management for appropriate transportation, increasing cost and decreasing access in the developing world15. Additionally, when antibodies are immobilized for such applications as IP pulldowns or protein chromatography, they are often incompatible with commonly required reducing agents (e.g., DTT) and column cleaning protocols. 
Antibodies themselves are also large and structurally complex, involving multiple
protein domains and complex networks of disulfide bonds for proper conformation (MW ~150 kDa for IgG, 14-17 disulfide bonds)16,17. This substantially limits the potential applications of antibodies in engineered polypeptide biomolecules, with smaller, simpler, more robust scaffolds being far preferable. This ultimately leads to decreased usability and increased cost. An alternative solution is to switch to recombinant, sequence-verified binding proteins based on a non-antibody scaffold whose affinity and specificity can rival those of antibodies, but with reduced size and increased mechanical and thermodynamic stability. Such issues have been identified over the past decade or so, and correspondingly there are a handful of non-antibody scaffolds currently in use and under commercial development that seek to address them, e.g. Kunitz domains18,19, monobodies20–22, DARPins23–26, anticalins27–30, nanobodies24, affibodies31,32, and others (for an overview of the clinical development of non-antibody binding scaffolds, the reader is directed to the review by Vazquez-Lombardi et al.16). While these scaffolds do provide smaller, more stable, and readily manufactured backbones on which to build, to our knowledge none of these molecules have sufficient pH or protease stability to survive the GI-tract, and are thus precluded from oral administration for medical, diagnostic, or research purposes. Here we present a new scaffold, RPtag, based on a ribose-binding protein (RBP) from extremophile Caldanaerobacter subterraneus33,34. This protein has been shown to be an excellent substrate for protein engineering because of its high thermostability (Tm ~102 ˚C), tolerance to mutation, and innate ligand binding capacity (i.e. D-ribose)33–35. In this study, we split the protein at an irregular C-terminal pair of β-sheets to generate a large protein fragment (RPtag(large)) and a small target peptide (RPtag(small)). 
We demonstrate that the two fragments bind one another at antibody-like affinity and specificity, solve the kinetic binding mechanism,
mutationally characterize the binding pocket, and engineer the sequence of RPtag(small) to bind RPtag(large) with pM affinity. We further demonstrate the efficacy of RPtag(large) and small in many common laboratory applications typically dominated by antibodies and other epitope tags under a remarkable range of conditions. Finally, we rationally engineer RPtag(large) to bind both a minimally modified synthetic peptide sequence and bioactive PDGF-β-dimer, and demonstrate its ability to survive the GI-tract and be orally bioavailable as a vitamin B12 conjugate. Together, our data indicate that RPtag is among the most robust scaffolds reported, and merits further study as both an epitope tag, and a substrate for engineering in binding activity.


EXPERIMENTAL DETAILS

Reagents

All genes and expression plasmids were codon optimized, synthesized, manufactured, and sequence-verified by GenScript (Piscataway, NJ) with the exception of the pTagRFP expression vector, which was purchased from Axxora, LLC (Farmingdale, NY). Rhodamine-labeled peptides were synthesized by solid-phase synthesis by Thermo Fisher Scientific (Waltham, MA), with a purity ≥90% confirmed by HPLC-MS. They were obtained as a dry TFA salt, reconstituted in dry DMSO, and frozen at -20 °C until use. Unlabeled peptides were obtained from either Thermo Fisher Scientific (Waltham, MA) or GenScript (Piscataway, NJ), both with a purity ≥90% confirmed by HPLC-MS, with equivalent results. TEV protease was expressed and purified in-house. Fluorescein isothiocyanate (FITC) and propylene glycol (PG) were purchased from Sigma-Aldrich (St. Louis, MO). Mouse-anti-6xHis (4E3D10H2/E2), mouse-anti-alpha tubulin (DMA1), and HRP-conjugated goat-anti-mouse (62-6520) were purchased from Thermo Fisher Scientific (Waltham, MA). Mouse monoclonal anti-FLAG M2-Peroxidase (HRP) antibody was purchased from Sigma-Aldrich (St. Louis, MO). SulfoLink™ Coupling Resin, EZ-Link™ maleimide activated horseradish peroxidase, SuperSignal™ ELISA Femto Substrate, 1-Step™ TMB-Blotting Substrate Solution, Pierce™ C-Myc-Tag IP/Co-IP Kit, White Pierce™ Maleimide Activated Plates, Lipofectamine™ 3000 transfection reagent, and standard western blot supplies were purchased from Thermo Fisher Scientific (Waltham, MA). Chemically competent BL21(DE3) cells were obtained from Thermo Fisher Scientific (Waltham, MA) and New England Biolabs (Ipswich, MA) with similar results. Isopropyl β-D-1-thiogalactopyranoside (IPTG) was purchased from Gold Biotechnology (St. Louis, MO). Cell culture media, fetal bovine serum (FBS), plates, and other standard cell culture reagents as well as untreated 96 and
384 well plates were purchased from Corning (Corning, NY). The following detergents were purchased from Anatrace (Maumee, OH): n-undecyl-β-D-maltopyranoside (UM), n-dodecyl-N,N-dimethylamine-N-oxide (DDAO), n-decyl-β-D-maltopyranoside (DM), n-octyl-β-D-thioglucopyranoside (OTG), n-octyl-β-D-glucopyranoside (OG), n-dodecyl-β-D-maltopyranoside (DDM), and CHAPS. Cell lines were purchased from ATCC (Manassas, VA) and maintained in the indicated media with 5% CO2 at 37 ˚C in a standard humidified thermostated cell culture incubator. Chromatography resins were purchased from GE Healthcare Bio-Sciences (Marlborough, MA), except for Ni-NTA agarose, which was purchased from Qiagen (Valencia, CA). All other chemicals were purchased from Millipore Sigma (Burlington, MA) and were reagent grade or better. Unless otherwise specified, experiments were conducted in 50 mM Tris pH 8.0, 0.005% Tween 20.

Optical Measurements

Absorbance measurements were taken using a Thermo Fisher Evolution 201 UV-Vis spectrophotometer with 1 cm quartz cuvettes. Fluorescence, fluorescence anisotropy, and luminescence measurements were all made using a Molecular Devices SpectraMax i3 plate reader equipped with fluorescein (λex/λem = 485 nm/535 nm) and rhodamine (λex/λem = 535 nm/595 nm) fluorescence polarization modules with an assumed G-factor of 1. Luminescence measurements were taken using white polystyrene 96-well plates, and fluorescence measurements using black polystyrene 96- or 384-well plates.

Protein Expression and Purification


RPtag(large) and its mutants were cloned into pET-28a (+) expression plasmids. All variants included an N-terminal 6xHis tag and a single N-terminal Cys to enable rapid purification and site-specific immobilization (sequences are available in Table S2; all mutant numbering in this manuscript is in reference to the first sequence under "PROTEINS", including all N-terminal tags). Two protocols were used to express protein. For low density shaker flask growths, chemically competent BL21(DE3) E. coli were transformed with 50 ng expression plasmid, streaked onto Luria Broth (LB) + 50 mg/L kanamycin agar plates and grown at 37 ˚C overnight. Five colonies per mutant were then picked and grown in Fernbach flasks in LB + 50 mg/L kanamycin at 30 ˚C with continuous shaking at 225 RPM until OD600 = 0.6. Cultures were then induced with 0.1 mM IPTG and grown for an additional 18 h. For high density growths, picked colonies were inoculated into 50 mL LB + 50 mg/L kanamycin and grown to OD600 = 0.6 at 37 ˚C with continuous 225 RPM shaking; the cells were then pelleted by centrifugation, resuspended in LB + 40% (v/v) glycerol, flash frozen on dry ice, and stored at -80 ˚C. Frozen stocks were then used to inoculate 10-L New Brunswick Bioreactors filled with LB supplemented with 10 g/L glucose, 0.6 g/L MgSO4, 0.1 mL/L Antifoam 204, and 100 mg/L kanamycin (800 RPM agitation, 8 SLPM 0.2 µm-filtered room air, 30 ˚C). The pH was maintained between 6.85 and 7.15 via the addition of 30% NH4OH (base) or a mixture of 50% glucose/1.5% MgSO4 (acid) by peristaltic pump. Fed-batch cultures were induced at OD600 = 6 with 1 mM IPTG, and grown for an additional 18 h. After growth, cells were pelleted by centrifugation, resuspended in 50 mM Tris pH 8.0, 300 mM NaCl, 10 mM β-mercaptoethanol (β-ME) and 10 mM imidazole, and frozen at -80 ˚C until purification. 
After thawing, cells were lysed enzymatically (lysozyme, DNaseI, 5 mM MgSO4, 1 h on ice), cell debris was pelleted by centrifugation, and the clarified supernatant was loaded onto a Ni-NTA column equilibrated with the
lysis buffer. The protein was then eluted with a step gradient of imidazole (10 mM - 500 mM), and protein-containing fractions were pooled and dialyzed against 50 mM Tris, 150 mM NaCl, 10 mM β-ME, 1 mM EDTA. Protein concentrations were determined using calculated ε280 nm = 4,470 M-1cm-1 except for D126W (ε280 nm = 9,970 M-1cm-1), H122W (ε280 nm = 9,970 M-1cm-1), and K129Y (ε280 nm = 5,960 M-1cm-1). TagRFP variants were made with an 8xHis tag on the opposite terminus of the indicated RPtag sequence, and were purified as above except the temperature was dropped to 25 ˚C before induction, and concentration was determined using ε555 = 100,000 M-1cm-1. Mature PDGF-β sequence lacking signal peptide was cloned downstream of an N-terminal 8xHis tag and TEV-protease site with a terminal serine such that, after cleavage, only the native PDGF-β sequence remains. The coding sequence was then subcloned into a pET-28a (+) expression plasmid and transformed into BL21(DE3) as above and grown on LB agar + 50 mg/L kanamycin. After overnight growth at 37 ˚C, colonies were picked and grown in 10 L benchtop bioreactors in LB + 50 mg/L kanamycin until OD600 = 0.6 (800 RPM agitation, 8 SLPM air, 37 ˚C) and induced with 0.1 mM IPTG for 18 h. Cells were harvested by centrifugation and resuspended in 50 mM Tris pH 8.0, 300 mM NaCl, 10 mM β-ME, 10 mM imidazole and frozen at -20 ˚C. Cells were then thawed and lysed enzymatically with a few crystals of lysozyme and DNaseI + 5 mM MgSO4 for 1 h at room temperature. Solid guanidine hydrochloride was added to the mixture to a final concentration of 6 M and the pH re-adjusted to 8.0 with NaOH. Remaining cell debris was pelleted by centrifugation, and the resulting supernatant incubated with Ni-NTA resin pre-equilibrated with the same buffer with guanidine. After incubation, resin was allowed to settle, the supernatant was poured off, and the resin was washed in 20 column volumes of the same buffer with guanidine. 
After washing, protein was eluted with the same
buffer + 0.5 M imidazole. Protein-containing fractions were pooled and spiked with 10 mM dithiothreitol and incubated at room temperature for 1 h to help reduce any disulfide bonds. The protein was then refolded by rapid dilution into 50 mM phosphate, 150 mM NaCl pH 7.4 (10-fold dilution), and exhaustively dialyzed against the same buffer for 48 h at room temperature in the presence of 0.01 mg/mL TEV protease to cleave off the tags. Precipitate was removed by centrifugation and 0.2 µm filtering, and the protein was further purified by a 0.15-1 M NaCl gradient on an SP-Sepharose column, followed by size exclusion chromatography on a Superdex S200 (prep grade) column. The resultant protein was a homogeneous band on SDS-PAGE, migrated at the expected weight of the dimer when boiled without reducing agent and of the monomer when boiled with 10% β-ME, and enhanced the proliferation of serum-starved IMR-90 fibroblasts (Fig. S9). After purification, proteins were all flash frozen on dry ice and stored at -80 ˚C until use. All proteins were >90% pure by SDS-PAGE stained with Coomassie Brilliant Blue R250.
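The extinction coefficients quoted above convert an A280 reading into a molar concentration through the Beer-Lambert law (A = ε·c·l). The following is a minimal sketch of that arithmetic, assuming the 1 cm path length stated under Optical Measurements; the function name, the dictionary, and the absorbance reading are hypothetical and chosen only for illustration.

```python
# Molar extinction coefficients at 280 nm (M^-1 cm^-1) as quoted in the text.
EPSILON_280 = {
    "RPtag(large)": 4470,
    "D126W": 9970,
    "H122W": 9970,
    "K129Y": 5960,
}


def molar_concentration(a280, variant="RPtag(large)", path_cm=1.0):
    """Return concentration in mol/L from absorbance, using A = epsilon * c * l."""
    return a280 / (EPSILON_280[variant] * path_cm)


# Hypothetical reading: A280 = 0.447 for wild-type RPtag(large) in a 1 cm cuvette.
conc = molar_concentration(0.447)
print(f"{conc * 1e6:.1f} uM")
```

Because the Trp and Tyr mutants absorb more strongly, the same absorbance corresponds to a lower molar concentration for those variants, which is why the variant-specific coefficients matter.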

Fluorescent Labeling of Proteins

RPtag(large) and its mutants were desalted into 50 mM phosphate, 150 mM NaCl pH 7.5 using a PD10 desalting column, and incubated at 0.1-1 mM with 1 mM FITC for 1-24 h at room temperature. Excess dye was removed by desalting 2 additional times using a PD10 column, and concentration and labeling stoichiometry were determined by absorbance using ε494 = 70,000 M-1cm-1 and an A494/A280 correction factor of 0.3. Stoichiometries were 0.05-0.1 labels/protein.
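The labeling stoichiometry calculation can be sketched as follows. This assumes the common convention in which the dye's contribution to A280 is subtracted as 0.3 × A494 before the protein concentration is computed; the text gives the correction factor but not the exact formula, so that convention, the function name, and the example readings are assumptions.

```python
def labeling_stoichiometry(a494, a280, eps_protein_280=4470,
                           eps_dye_494=70000, correction=0.3, path_cm=1.0):
    """Return (labels per protein, protein concentration in mol/L).

    Assumed convention: A280 is corrected for dye absorbance by
    subtracting correction * A494 before applying Beer-Lambert.
    """
    dye_molar = a494 / (eps_dye_494 * path_cm)
    protein_molar = (a280 - correction * a494) / (eps_protein_280 * path_cm)
    return dye_molar / protein_molar, protein_molar


# Hypothetical readings for a FITC-labeled RPtag(large) sample.
ratio, protein = labeling_stoichiometry(a494=0.007, a280=0.00657)
print(f"{ratio:.2f} labels/protein at {protein * 1e6:.2f} uM protein")
```

With these illustrative readings the ratio falls at the upper end of the 0.05-0.1 labels/protein range reported above.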

HRP-Conjugation of Proteins

Cys-containing RPtag(large) (10 mg) was mixed with 5 mg of lyophilized EZ-Link™ maleimide activated horseradish peroxidase in 50 mM phosphate, 150 mM NaCl, 1 mM EDTA, and
allowed to react at room temperature for 1 h. The resulting mixture was then purified by gel filtration on a S200 (prep grade) column. Protein concentration was determined using ε403 nm = 100,000 M-1cm-1 for HRP.

Binding Affinity Measurements

Rhodamine labeled peptides or FITC labeled proteins were incubated with increasing concentrations of unlabeled binding partner for 1 h at room temperature, and the fluorescence anisotropy measured. Resultant curves were fit with the general single-site binding equation:

Eqn1: rmeasured = r0 + (rmax - r0)*(Ptot + x + Kd - sqrt((Ptot + x + Kd)^2 - 4 * Ptot * x))/(2 * Ptot)

where rmeasured is the measured anisotropy, r0 is the baseline anisotropy, rmax is the maximum anisotropy, Ptot is the fixed total concentration of labeled peptide/protein, x is the variable concentration of unlabeled binding partner, and Kd is the measured Kd. Unless otherwise stated, the concentration of the labeled species was 1-2 nM.
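Eqn1 can be written directly as a function, which makes its limiting behavior easy to check. The sketch below is illustrative only: the function name and the example parameter values (other than the concentrations and Kd taken from the text) are hypothetical, and no curve fitting is performed.

```python
import math


def anisotropy(x, kd, p_tot, r0, r_max):
    """Eqn1: anisotropy expected for a single-site binding equilibrium.

    x and kd share the units of p_tot (e.g. nM).
    """
    s = p_tot + x + kd
    # Fraction of the labeled species that is bound (quadratic solution).
    fraction_bound = (s - math.sqrt(s * s - 4.0 * p_tot * x)) / (2.0 * p_tot)
    return r0 + (r_max - r0) * fraction_bound


# Hypothetical baseline and plateau anisotropies; 1 nM labeled peptide, Kd = 0.21 nM.
for x in (0.0, 0.1, 1.0, 10.0, 100.0):
    print(x, round(anisotropy(x, kd=0.21, p_tot=1.0, r0=0.05, r_max=0.25), 4))
```

The quadratic form matters here because the labeled species (1-2 nM) is not dilute relative to a sub-nanomolar Kd, so the simpler hyperbolic approximation, which assumes negligible ligand depletion, would not hold.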

Kinetics Measurements

Binding kinetics for the native RPtag(large) and RPtag(small) were measured by mixing increasing concentrations of rhodamine labeled RPtag(small) (0-10 nM) with a fixed saturating concentration of unlabeled RPtag(large) (1000 nM), as well as a fixed concentration of labeled RPtag(small) (10 nM) and increasing concentrations of unlabeled RPtag(large) (0-10,000 nM), and monitoring the fluorescence anisotropy over time. Resultant anisotropy curves were converted to concentration of bound RPtag(large)/(small) complex using the equation:
Eqn2. [L-S] = (rmeasured – r0)/(rmax – r0) * [S]total

where [L-S] is the concentration of LS complex, rmeasured is the measured anisotropy, r0 is the anisotropy in the absence of any unlabeled binding partner, and rmax is the anisotropy in the presence of saturating binding partner. Unbinding kinetics were evaluated by pre-complexing rhodamine labeled RPtag(small) (0-10 nM) with a fixed saturating concentration of unlabeled RPtag(large) (1000 nM), adding a large excess of unlabeled native RPtag(small) (0.1 mM), and observing the fluorescence anisotropy over time. LS complex concentrations were determined as above.
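The conversion in Eqn2 is a linear rescaling of the anisotropy signal between its free and saturated limits. A minimal sketch follows; the function name and the example anisotropy values are hypothetical.

```python
def complex_concentration(r_measured, r0, r_max, s_total):
    """Eqn2: concentration of LS complex, in the units of s_total."""
    return (r_measured - r0) / (r_max - r0) * s_total


# Hypothetical readings: halfway between free (0.05) and saturated (0.25)
# anisotropy with 10 nM total labeled RPtag(small).
print(complex_concentration(0.15, r0=0.05, r_max=0.25, s_total=10.0))
```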

Initial rates were determined by linear fits of the first 2 min of the reactions, and full curves were fit with the single exponential equation:

Eqn3: y = y0 + A*exp(-k*t)

where y is the curve value at time t, y0 is the curve value at t=0, A is the amplitude of the curve, k is the first-order rate constant, and t is the time. Because of the excellent agreement between the values from the single exponentials and linear fits of the initial rates, all other variants were measured using a single concentration of labeled peptide (2 nM) and unlabeled RPtag(large) (1000 nM). All other peptides were competed off with 0.1 mM unlabeled Nd2,P5A,E18A (tight) peptide, as it is the tightest binding peptide for RPtag(large) yet identified. The resultant kinetics mechanism was simulated in Tenua 2.1.
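The agreement between initial-rate fits and single-exponential fits can be illustrated on synthetic data. The sketch below takes Eqn3 exactly as printed, so the curve starts at y0 + A and relaxes toward y0, and the slope at t = 0 is -k·A. The rate constant, amplitude, time units (minutes), and sampling interval are hypothetical values chosen for illustration, not measurements from this work.

```python
import math


def single_exponential(t, y0, amplitude, k):
    """Eqn3 as printed: y = y0 + A * exp(-k * t)."""
    return y0 + amplitude * math.exp(-k * t)


def initial_rate(times, values, window=2.0):
    """Least-squares slope through the points with time <= window."""
    pts = [(t, y) for t, y in zip(times, values) if t <= window]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    den = sum((t - mean_t) ** 2 for t, _ in pts)
    return num / den


# Synthetic binding trace: complex rises from 0 toward 10 nM with k = 0.01 per min.
times = [i * 0.1 for i in range(601)]  # 0 to 60 min
values = [single_exponential(t, y0=10.0, amplitude=-10.0, k=0.01) for t in times]

slope = initial_rate(times, values)    # nM per min over the first 2 min
analytic = -0.01 * -10.0               # -k * A, the slope at t = 0
print(slope, analytic)
```

When the reaction is slow relative to the 2 min window, the linear slope closely tracks -k·A; for faster reactions the curvature within the window would make the linear fit underestimate the true initial rate.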


Protein/Peptide Resin Immobilization

Either RPtag(large) (40 mg/mL) with its sole N-terminal Cys or the indicated RPtag(small) peptide with a C-terminal GGC sequence (4 mg/mL) was desalted or dissolved into 50 mM phosphate pH 8.0, 150 mM NaCl, 10 mM EDTA. SulfoLink™ resin was then poured into a column, washed with 10 column volumes of the same buffer, and the protein or peptide solution added to the resin at 1 mL solution/mL of settled resin volume. The slurry was mixed by gentle rocking for 1-2 h at room temperature in the dark. The column was then drained, mixed with 1 column volume of 50 mM L-cysteine, and rocked in the dark for another 1 h at room temperature. The column was then drained, washed with 2 column volumes of 6 M guanidine hydrochloride, washed with 5 column volumes of 50 mM phosphate pH 7.2, 150 mM NaCl, 10 mM EDTA, and the resin stored at 4 °C until use. Methods used to synthesize resins specifically for the capacity testing presented in Table 1 can be found in Supplemental Methods.

Protease Resistance Test

RPtag(large) and BSA were mixed at 2 mg/mL each in either 50 mM Tris pH 8.0 with increasing concentrations of trypsin, or 50 mM glycine pH 3.0 with increasing concentrations of pepsin. Samples were incubated at room temperature for 1 h, then immediately mixed with SDS-PAGE sample buffer + 10% β-ME and boiled. Remaining protein was assessed by SDS-PAGE stained with Coomassie Brilliant Blue R250.

Temperature Resistance Test

Either 1 µM RPtag(large) or 1 µM anti-6x-His antibody in 50 mM phosphate pH 7.4, 150 mM NaCl was sequentially heated to 95 °C for 5 min, rescued for 1 min on ice, then an aliquot was taken
and mixed with rhodamine labeled binding peptide (GHHHHHH for the his tag antibody and native RPtag(small) for RPtag(large)) to final concentrations of 100 nM protein/100 nM peptide, incubated for 1 h at room temperature, and the fluorescence anisotropy measured. Fraction binding was calculated from the anisotropy value before any boiling and the anisotropy value in the absence of any binding partner. For the autoclave test, 5 µM samples of each protein in 50 mM phosphate pH 7.4, 150 mM NaCl were placed in open micro-centrifuge tubes covered in aluminum foil and subjected to a standard 15 minute liquid sterilization cycle with slow exhaust to prevent boiling (15 min at 121 °C, ~60 min > 100 °C) and the Kd for rhodamine labeled target peptide measured by fluorescence anisotropy.

IP/Pulldown

HEK293T cells were seeded onto a tissue culture treated 6-well plate at a density of 600,000 cells per well and allowed to attach overnight in high glucose DMEM + 10% FBS. The following day, Lipofectamine-plasmid DNA complexes were prepared according to the manufacturer’s instructions. For each treatment, 2,500 ng of both labeled (Nd8Cd4 (fast) and c-myc) α-tubulin in pcDNA3.1(+) and pTagRFP DNA was combined with P3000 reagent and either full or half-dose Lipofectamine 3000 in serum-free media and incubated at room temperature for 15 minutes. Meanwhile, spent media in the 6-well plate was removed, cells were washed once with phosphate buffered saline (Corning), and fresh media was added. After incubation, Lipofectamine-DNA complexes were added directly into experimental wells of the 6-well plate. Cells were incubated with Lipofectamine-DNA complexes for 72 h under standard cell culture conditions. After one PBS wash and media change, fluorescence was measured at 555/585 nm (λex/λem) to ensure transfection was successful. Cells were then lysed with 50 mM
phosphate pH 7.4, 150 mM NaCl, 1 mM EDTA, 1% Triton X-100 for 1 h at room temperature, the supernatant clarified by centrifugation, and incubated overnight with either 2.5 µL anti-c-myc-tag resin from the Pierce™ C-Myc-Tag IP/Co-IP Kit or 2.5 µL immobilized RPtag(large) agarose with continuous shaking at 4 °C. Resin was then washed with 3 x 500 µL washes of 50 mM phosphate pH 7.4, 150 mM NaCl, and the samples were eluted by heating the resin to 95 °C for 5 min with 15 µL 2x SDS-PAGE sample buffer + 10% β-ME, then run on SDS-PAGE and assayed by western blot against α-tubulin.

Cell Proliferation Assay Low passage (1.3 g/L and a solubility limit of >100 mg/mL in buffer. The production time was ~36-72 h from bacterial freezer-stock to purified protein and requires only standard bacterial culture media, making it orders of magnitude faster and less expensive than antibody production36. We measured a Kd between RPtag(large) and small of 0.21 nM, 33-times tighter than the 7 nM measured for an anti-His tag antibody and 6x-His peptide (Fig. 1b, Table S2). We found no binding between RPtag(small) and BSA up to 10,000 nM, and a Kd of ~26,000 nM for RPtag(large) and BSA, yielding a molar-selectivity of >100,000-fold and mass-selectivity of >3 million-fold. As a stress test, we repeatedly boiled RPtag(large) and anti-6xHis antibody in 5 min intervals and assayed the residual binding activity (Fig. 1c). Whereas the antibody lost >90% of its activity in a single cycle, RPtag(large) retained >50% of its activity until 20 cycles. We then autoclaved the proteins and found the treatment completely destroyed antibody binding, but RPtag retained measurable, albeit reduced, binding (Fig. S1). To test protease resistance, we
incubated RPtag(large) and BSA with increasing concentrations of pepsin or trypsin (Fig. 1d). We found RPtag(large) 1,000-10,000-fold more resistant to pepsin at pH 3. For trypsin at pH 8, we found similar resistance to protease cleavage, but note that at the highest concentration of trypsin a substantial fragment of RPtag(large) remains, whereas no fragment of BSA remains. This may indicate the presence of an even more protease-stable core protein fragment within our RPtag(large) sequence used here -- perhaps RPtag(large) excluding some number of unstructured N-terminal residues. The extreme expression, solubility, and stability of RPtag(large) caused us to hypothesize that it could function as a fusion protein epitope tag. The top 5 pBLAST hits in the Landmark model organism database have low sequence identity to RPtag(large) (Fig. 1e), and the aligned C-termini share little conservation to native RPtag(small) excepting K3, L14, and N19 (Fig. 1f), indicating that there should be little interference from endogenous proteins in commonly used model organisms. We immobilized RPtag(large) to agarose beads and applied rhodamine-labeled RPtag(small) to a packed column, and observed it bind as a tight red band that did not diffuse significantly for >2 weeks at room temperature (Fig. 1g). The band readily eluted with glycine pH 1.5, and the resin could be re-used immediately upon equilibration with Tris pH 8.0. The same was true after repeated washing with 6 M guanidine hydrochloride (Fig. S2). We then fused RPtag(large) and small to the N- or C- terminus of the red-orange fluorescent protein TagRFP (RFP), immobilized RPtag(large) and small to agarose beads, packed the beads into FPLC compatible columns, and applied all 4 proteins to both columns (Fig. 1h-i). Regardless of the location of the tag, when the protein and column complemented (large/small), there was significant observable binding, and when they did not complement (large/large or small/small),
there was little to no binding, indicating that RPtag can be used as a protein tagging system in principle.


Figure 1

Figure 1. Proof of concept for RPtag. a) Cartoon representation of RPtag(large) (cyan) and RPtag(small) (magenta) segments (PDB: 2ioy). b) Representative binding curves (n ≥ 3) by fluorescence anisotropy of RPtag(large)/(small) against nonspecific controls (BSA) and antibody/6xHis tag interaction. In all curves the target peptide is labeled with rhodamine, except BSA/RPtag(large), where RPtag(large) is labeled with FITC. c) Fraction of binding activity remaining after successive 5 min treatments at 95 °C. Samples are mean ± SEM (n = 2). d) BSA or RPtag(large) remaining after 1 h treatment with the indicated amount of protease at the indicated pH, by SDS-PAGE. e) Identity table for the top 5 similar sequences in the Landmark model organism database by pBLAST using RBP from C. subterraneus as the query. Cyan box shows identities relative to C. subterraneus RBP. f) Sequence logo of the aligned C-termini of the proteins shown in (e) relative to native RPtag(small) (above the line in black). Height of each letter is proportional to its frequency at that position. g) Photographs of a column of immobilized RPtag(large) agarose with rhodamine-labeled RPtag(small) before applying peptide (start), after applying 10 mL 1 µM peptide and washing with 10 mL buffer (binding), during elution with 0.1 M glycine pH 1.5 (elution), and after elution (post elution). h) Cartoon depiction of RFP constructs and columns with RPtag(large) and RPtag(small) used in panel i. i) Photographs of columns packed with RPtag(large)- and RPtag(small)-immobilized agarose after 25 mL of 1 µM RFP with the indicated tag (large or small) on the indicated terminus (N or C) was applied at 1 mL/min and the columns were washed with 10 mL buffer.


Optimization of RPtag(small) for laboratory applications

To establish a baseline for identifying beneficial and detrimental mutations in RPtag(small), we solved the kinetic binding mechanism. The initial rate of complex (LS) formation increased linearly with RPtag(small) concentration (Fig. 2a, c), but not with RPtag(large) concentration (Fig. 2b-c), indicating a first-order rate-limiting step consistent with a unimolecular process in RPtag(small), i.e., a conformational change from a "non-binding" (S) to a "binding" (S*) state. Additionally, there was a significant missing amplitude in the kinetics, which represents the proportion of the reaction that occurred in the dead time of the instrument (~1 min), and therefore the amount of RPtag(small) present as S* at the start. From the missing amplitude and total amplitude, we can calculate the equilibrium constant between S and S* (Keq = 1.6) and correct for the true concentration of S in our initial rate plots (Fig. 2c). Using the linear fit from the initial rate plot, we calculate a rate constant for the conversion of S to S* of kf = 0.091 min⁻¹. In good agreement, when we fit the association curves to a first-order rate law, we find kf = 0.092 min⁻¹. Knowing Keq and kf, we can calculate a reverse rate constant kr = 0.056 min⁻¹. We then measured the dissociation rate by pre-forming multiple concentrations of labeled LS complex, adding a large excess of unlabeled RPtag(small), and observing the kinetics (Fig. 2d). The initial dissociation rates were linear with respect to LS complex concentration (Fig. 2e). When fit with a line, this yields a rate constant koff = 0.0091 min⁻¹. In good agreement, when we fit dissociation curves to a first-order rate law, we find koff = 0.012 min⁻¹. Knowing koff and Kd, we can calculate the association rate constant for S* binding, kon = 4.3 × 10⁷ M⁻¹ min⁻¹. We simulated the mechanism shown in Fig. 2f, and found excellent agreement between the simulation, our data, and the theoretical fits.
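The scheme in Fig. 2f (S ⇌ S*, then S* + L ⇌ LS) can be reproduced numerically from the four rate constants reported above. The following is a minimal sketch, not the authors' simulation code. It assumes a fixed-step fourth-order Runge-Kutta integrator, RPtag(small) pre-equilibrated between S and S* at time zero, and hypothetical total concentrations (100 nM RPtag(small), 1 µM RPtag(large)) chosen only for illustration. Keq is recomputed as kf/kr (about 1.6) so that the starting populations are consistent with the rate constants.

```python
"""Sketch of the two-step binding scheme: S <-> S*, then S* + L <-> LS."""

# Rate constants quoted in the text (min^-1, except K_ON in M^-1 min^-1).
K_F = 0.091     # S  -> S*
K_R = 0.056     # S* -> S
K_ON = 4.3e7    # S* + L -> LS
K_OFF = 0.0091  # LS -> S* + L


def derivatives(state):
    """Time derivatives of (S, S*, L, LS), concentrations in M."""
    s, s_star, l, ls = state
    conv = K_F * s - K_R * s_star          # net S -> S* flux
    bind = K_ON * s_star * l - K_OFF * ls  # net S* + L -> LS flux
    return (-conv, conv - bind, -bind, bind)


def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step of size dt (min)."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, dt / 2))
    k3 = derivatives(shifted(state, k2, dt / 2))
    k4 = derivatives(shifted(state, k3, dt))
    return tuple(
        y + dt / 6 * (a + 2 * b + 2 * c + d)
        for y, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(s_total, l_total, t_end, dt=0.005):
    """Return a list of (time, (S, S*, L, LS)) from mixing until t_end (min)."""
    keq = K_F / K_R  # [S*]/[S] at equilibrium before mixing (assumption)
    state = (s_total / (1 + keq), s_total * keq / (1 + keq), l_total, 0.0)
    trace = [(0.0, state)]
    for i in range(1, round(t_end / dt) + 1):
        state = rk4_step(state, dt)
        trace.append((i * dt, state))
    return trace


if __name__ == "__main__":
    S_TOTAL, L_TOTAL = 1e-7, 1e-6  # hypothetical concentrations (M)
    for t, (_, _, _, ls) in simulate(S_TOTAL, L_TOTAL, t_end=60.0)[::2000]:
        print(f"t = {t:5.1f} min   LS / S_total = {ls / S_TOTAL:.3f}")
```

Under these assumed concentrations, the S* population present at time zero binds within a fraction of a minute, which corresponds to the missing amplitude, and the remaining signal develops at a rate limited by kf.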


Figure 2

Figure 2. Kinetic and thermodynamic characterization of binding mechanism. a) Representative binding kinetics traces (n = 3) for native RPtag(large) and RPtag(small) at increasing RPtag(small) concentrations (black dots) with single exponential fits (red solid line) and simulation results (dashed cyan lines). b) Representative binding kinetics traces (n = 3) for native RPtag(large) and RPtag(small) at increasing RPtag(large) concentrations (black dots) with single exponential fits (red solid line) and simulation results (dashed cyan lines). c) Linear fits of initial rate of complex formation as a function of RPtag(large) (L, red) and RPtag(small) (S, black) concentration represented in panels a and b. The dashed line shows RPtag(small) concentration corrected for the S-S* Keq shown in panel f. Results are mean ± SEM (n = 3). d) Representative unbinding kinetics traces (n = 3) for native RPtag(large) and RPtag(small) at increasing RPtag(large)/(small) complex concentrations (black dots) with single exponential fits (red solid line) and simulation results (dashed cyan lines). e) Linear fit of unbinding kinetics as a function of RPtag(large)/(small) complex concentration represented in panel d. Results are mean ± SEM (n = 3). f) Proposed mechanism of RPtag(large) and RPtag(small) binding, simulated in cyan in panels a, b, and d. g) Binding affinities from alanine-scanning mutagenesis of RPtag(small). Residue 10 was excluded as it is natively alanine. Data are mean ± SEM (n = 4). Log-transformed values were compared by one-way ANOVA with Holm-Sidak post-hoc analysis relative to native control. ***p […]

[…] 100 mg mL⁻¹ and yield of 1.3 g L⁻¹ for RPtag(large) using our methods. Additionally, we are aware of no other scaffold that has been demonstrated to survive the GI tract, let alone be orally bioavailable when conjugated with an absorption tag. An affinity comparison between RPtag(large) and antibodies as well as non-antibody binding scaffolds is also favorable. RPtag(large) was able to bind an optimized target 19-mer peptide with a Kd = 4.7 × 10⁻¹¹ M. This is much tighter than the bulk of mouse monoclonal antibodies (Kd typically ~10⁻⁸–10⁻⁹ M), and is similar to the much tighter binding rabbit monoclonal antibodies (typically ~10⁻¹⁰–10⁻¹¹ M)³⁷. While DARPins²³, FN3 monobodies⁴⁸, anticalins⁴⁹, and other scaffolds have variants that have been reported to bind with a Kd ~10⁻¹⁰ M, they often bind with Kd ~10⁻⁷–10⁻⁹ M⁵⁰. These values span a range similar to that of RPtag(large) binding its optimized 19-mer target (Kd = 4.7 × 10⁻¹¹ M) and the engineered RPtag(large) H122L,A253R variant binding PDGF-β (Kd = 7.5 × 10⁻⁸ M). These comparisons are perhaps imperfect, as antibodies, monobodies, DARPins, anticalins, and other scaffolds have been shown to be extensively generalizable through either


endogenous immunity or processes such as phage and yeast display, whereas RPtag(large) has only been generalized to a single biologically relevant non-native target. Indeed, the affinities we report for RPtag here should be interpreted with the caveat that we began all experiments with a target that bound RPtag(large) natively, rather than engineering a de novo binding interaction. This raises the question of how generalizable RPtag(large) is. Structurally, the binding pocket of RPtag(large) is non-traditional compared with those of antibodies and many non-antibody binding scaffolds. Typically, antibodies, DARPins, nanobodies, and FN3 monobodies mediate their contacts with targets through variable loop regions more or less disconnected from the core structural components of the proteins. Presumably, this allows these loops to adopt the wide variety of sequences necessary to complement targets and enable broad-spectrum binding. However, this is not universally true for these scaffolds. For example, the Koide group solved the crystal structure of an FN3 monobody (HA4) complexed with the Abl SH2 domain, which reveals contact between the two through one of the loops but also substantial contact through the core β-sheet region (PDB: 3K2M)⁵⁰. The binding mode for other scaffolds can be completely different as well. A crystal structure of an engineered anticalin (US7) complexed with amyloid-β peptide (1-40) shows a primarily β-sheet contact surface (PDB: 4MVI)⁴⁹, and affibodies mediate their contacts almost exclusively through α-helices (PDB: 3MZW)¹⁶,⁵¹. Thus, there is no reason in principle that any of the common secondary structure elements cannot be used to engineer binding surfaces, provided the scaffold can tolerate mutations at those positions. The modeled structure of the RPtag(large) binding cleft shows 32 residues within 5 Å of the native RPtag(small) peptide, almost all of which are found in α-helices and β-sheets (Fig. 6).
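A contact count such as the 32 residues within 5 Å of the peptide can be obtained with a simple distance filter over the model coordinates. The sketch below is one way to do this and is not the authors' procedure. It assumes a PDB-format coordinate file, an any-atom distance criterion, and that the scaffold and peptide are stored as separate chains; the chain identifiers used in the usage note are hypothetical.

```python
import math


def parse_atoms(pdb_text):
    """Return (chain, residue number, residue name, x, y, z) per ATOM record.

    Fields are read from the fixed columns of the PDB format.
    """
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            atoms.append((
                line[21],             # chain identifier (column 22)
                int(line[22:26]),     # residue sequence number (columns 23-26)
                line[17:20].strip(),  # residue name (columns 18-20)
                float(line[30:38]),   # x (columns 31-38)
                float(line[38:46]),   # y (columns 39-46)
                float(line[46:54]),   # z (columns 47-54)
            ))
    return atoms


def residues_near(atoms, scaffold_chain, peptide_chain, cutoff=5.0):
    """List scaffold residues with any atom within `cutoff` Å of any peptide atom."""
    peptide_xyz = [a[3:] for a in atoms if a[0] == peptide_chain]
    near = set()
    for chain, resnum, resname, *xyz in atoms:
        if chain != scaffold_chain:
            continue
        if any(math.dist(xyz, p) <= cutoff for p in peptide_xyz):
            near.add((resnum, resname))
    return sorted(near)
```

For a model whose scaffold is chain A and whose peptide is chain B (hypothetical identifiers), calling residues_near(parse_atoms(text), "A", "B") on the file contents returns the sorted list of contacting residues, and its length gives the contact count.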
We mutated 8 of these 32 residues in this study and were able to generate every mutation attempted at those positions, as well as demonstrate functional changes in binding as a result of those mutations. Thus, RPtag is tolerant to mutation in the putative binding cleft, and such mutations can alter its affinity and specificity for its targets. The Regan group curated and analyzed a data set of protein-protein interactions with experimentally determined Kd values and high-quality crystal structures in the Protein Data Bank, and found that even the largest protein-protein interfaces in their data set had a total buried surface area of