
Letter

ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.7b00852 • Publication Date (Web): 29 Nov 2017


ACS Chemical Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Design and Application of a DNA-Encoded Macrocyclic Peptide Library

Zhengrong Zhu*, Alex Shaginian, LaShadric C. Grady, Thomas O’Keeffe, Xiangguo E. Shi, Christopher P. Davie, Graham L. Simpson, Jeffrey A. Messer, Ghotas Evindar, Robert N. Bream, Praew P. Thansandote, Naomi R. Prentice, Andrew M. Mason, Sandeep Pal

ABSTRACT

A DNA-encoded macrocyclic peptide library of 2.4 × 10^12 members, composed of 4-20 natural and non-natural amino acids, was designed and synthesized. Affinity-based selection was performed against two therapeutic targets, VHL and RSV N protein. Based on the selection data, several peptides were resynthesized without the DNA tag, and their activity was confirmed.

Therapeutics fall into two major classes: small-molecule chemical compounds (molecular weight in the range of 100 to 1000 Da) and biologics or proteins (molecular weight greater than 5000 Da). Each class has its pros and cons. Small-molecule drugs are often suitable for oral delivery and may have good cell permeability, and thus can be applied to many classes of therapeutic targets. However, small molecules are limited to protein targets with a lipophilic binding pocket and are more prone to off-target side effects, which limits their utility. Biologic drugs generally have fewer off-target side effects, can be longer-acting in the body, and are better suited to certain cell surface receptors and protein-protein interactions. However, biologics such as monoclonal antibodies cannot be delivered orally and require a cold chain for their distribution, which restricts their availability largely to developed countries. Because the molecular weight of peptides of 5-50 amino acids falls between these two major categories, peptide drugs offer the opportunity to combine the advantages of both biologics and small molecules as therapeutics (1, 2).


Peptide drugs may have high specificity and potency similar to biologics, but they are chemically synthesized and, depending on their composition and molecular properties, can be cell permeable and so gain access to intracellular drug targets. Recently, cell-permeable peptides have been discovered that can carry other molecules into cells (3), as have cyclic cell-penetrating peptides with biological activity (4, 5). However, linear peptides made from natural amino acids are subject to proteolysis and are rapidly degraded in vivo. Cyclization of peptides and incorporation of non-natural amino acids can stabilize the backbone, increasing resistance to proteolysis (6, 7). In addition, cyclization reduces conformational freedom and forces peptides to adopt a more ordered secondary structure, often leading to higher binding affinity and specificity. Using non-natural amino acids also widens the chemical diversity of a library and thus increases the chance of identifying hits in biological assays (8).

Among the therapeutic target classes to which peptide drugs have been applied, protein-protein interactions (PPIs) are of particular interest. PPIs have long been a challenging target class for drug discovery (9, 10). Because the interaction interface between two proteins is flat and featureless, it has proven difficult to design small-molecule drugs that fit in the interface and disrupt the interaction. Because many PPI targets are located in the cytoplasm or nucleus of cells, it is also very challenging to develop traditional biologic drugs against them. However, as macrocyclic peptides can often mimic protein interfaces effectively and can still enter cells, macrocyclic peptide-based drugs have achieved some success against this target class in recent years (11, 12). In addition, these macrocyclic peptides can be valuable tools to study the protein surface, probe the interaction between proteins (13) and, in some cases, allow the design of small-molecule “peptidomimetics” (14, 15).

Encoded library technology (ELT) provides a strategy for identifying small-molecule compounds that bind protein targets using DNA-tagged combinatorial libraries (16-21). Each molecule in an ELT library comprises a drug-like warhead attached to a double-stranded DNA coding region through an adapter module (the DNA headpiece). Each cycle of synthetic chemistry to construct the warhead is encoded by ligation of a short double-stranded DNA tag that identifies the building block added. Using split/mix methods, chemical diversity on the order of 10^6 to 10^9 drug-like warheads can be readily achieved, orders of magnitude more than with other lead discovery technologies. The libraries are then screened by affinity-based selection. Binders are separated from non-binders by multiple rounds of capture of the target protein on an affinity matrix followed by heat elution. The chemical structures of the binders are then identified by translation of the DNA tag sequences. Because of the high sensitivity and throughput of modern DNA sequencers, nanomole quantities of input library and microgram amounts of target protein are sufficient for selection experiments. The low material consumption allows multiple selection conditions to be tested in parallel. Since detection is based on affinity binding, no understanding of the mechanism of action is required.
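The translation of DNA tag sequences into chemical structures can be illustrated with a minimal sketch. The tag sequences, the tag length, and the three-cycle layout below are hypothetical placeholders chosen for brevity (the tags actually used are listed in Supplementary Table 2), and the sketch assumes fixed-length tags concatenated in cycle order.

```python
from collections import Counter

# Hypothetical per-cycle codebooks (tag sequence -> building block).
# Real ELT tags, tag lengths, and cycle counts differ; illustration only.
CODEBOOKS = [
    {"ACGT": "Gly", "TGCA": "Ala"},    # cycle 1
    {"GATC": "Hyp", "CTAG": "null"},   # cycle 2
    {"AATT": "Phe", "GGCC": "Leu"},    # cycle 3
]
TAG_LENGTH = 4


def decode_read(read):
    """Translate a concatenated tag region into a tuple of building blocks.

    Returns None if the read has the wrong length or contains a tag that
    is not in the codebook for its cycle.
    """
    if len(read) != TAG_LENGTH * len(CODEBOOKS):
        return None
    blocks = []
    for cycle, codebook in enumerate(CODEBOOKS):
        tag = read[cycle * TAG_LENGTH:(cycle + 1) * TAG_LENGTH]
        if tag not in codebook:
            return None
        blocks.append(codebook[tag])
    return tuple(blocks)


def count_compounds(reads):
    """Count how often each decodable compound appears in a list of reads."""
    decoded = (decode_read(r) for r in reads)
    return Counter(d for d in decoded if d is not None)
```

Calling `count_compounds` on the sequencing reads of a selection output gives the copy count per encoded compound, which is the raw material for the enrichment analysis described in the Methods.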

Chemical reactions used to build ELT libraries must be compatible with the DNA tags, whose functionality includes phosphodiesters, ribose with anomeric linkages, nitrogenous heterocyclic bases, and exocyclic amines. Moreover, all chemical reactions have to be carried out in water to adequately solvate the DNA tags. This does, however, allow convenient purification of the reaction products by ethanol precipitation. Sequences of DNA tags are randomly selected to encode building blocks.

In order to cover the diverse chemical space of macrocyclic peptides, a macrocyclic library with six cycles of chemistry was made from natural and non-natural amino acids, dipeptides, and tripeptides, resulting in ring sizes of 4 to 20 amino acids and a library size of 2.4 × 10^12. A linear peptide library of the same size was also made as a control. This library is ~1000 times larger than the largest previously reported and necessitated a concomitant increase in DNA sequencing coverage of the selection output. The design and synthetic strategy for the macrocyclic DNA-encoded library (DEL-1) and the corresponding linear control (DEL-2) are outlined in Figure 1. An affinity-based selection was performed to separate ELT molecules bound to the target protein from unbound molecules. The selection output was then PCR amplified and sequenced. Based on the sequence information of the DNA tags, the chemical structures of the encoded warheads were determined.
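The stated library size can be checked with a short calculation. The sketch below takes the per-cycle building block counts from Figure 1 (20 amino acids in cycles 1 and 6; 276 amino acids/peptides plus 2 nulls in cycles 2-5) and assumes that each null counts as a distinct encoded option; that counting convention is our assumption, not something the text states.

```python
import math

# Encoded options per cycle, as read from Figure 1.
# Assumption: each of the 2 nulls in cycles 2-5 is a distinct encoded option.
cycle_options = [20, 276 + 2, 276 + 2, 276 + 2, 276 + 2, 20]

# In a split/mix library the number of members is the product of the
# number of options available at each cycle.
library_size = math.prod(cycle_options)

print(f"{library_size:.2e}")  # prints 2.39e+12
```

Under these assumptions the product is about 2.39 × 10^12, which agrees with the reported 2.4 × 10^12 members.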

Von Hippel–Lindau tumor suppressor (VHL) is an E3 ubiquitin ligase that is involved in the ubiquitination and degradation of hypoxia-inducible factor (HIF), a transcription factor that plays a central role in the regulation of gene expression by oxygen. VHL is associated with many diseases (22-24). Ligands of VHL can be used in proteolysis targeting chimera (PROTAC) technology, a new approach for therapeutic intervention (25). Peptides containing hydroxyproline are known to bind VHL (26). Analysis of the macrocyclic peptide ELT library showed that 1.4% of the library members contained the hydroxyproline monomer. Encouragingly, following ELT selection, 92% of hits contained the hydroxyproline monomer, which validated this newly made peptide ELT library, although no preference for any specific sequence was observed. Two putative hits are shown in Figure 2 as examples. Because hydroxyproline-containing peptides are well-known VHL binders, these peptides were not synthesized off-DNA to confirm activity, but they could potentially serve as useful tools to probe the VHL:HIF protein-protein interaction.
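To put the two hydroxyproline percentages on a common scale, one can express them as a fold enrichment. This ratio is our illustration and not a metric reported by the authors; the function name is hypothetical.

```python
def fold_enrichment(fraction_in_hits, fraction_in_library):
    """Ratio of a monomer's frequency among hits to its frequency in the library."""
    return fraction_in_hits / fraction_in_library


# Values taken from the text: 92% of hits vs 1.4% of library members.
hyp_enrichment = fold_enrichment(0.92, 0.014)
print(round(hyp_enrichment, 1))
```

With the values quoted above, hydroxyproline-containing members are roughly 66-fold over-represented among the hits relative to the starting library.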

The interaction between the respiratory syncytial virus (RSV) N protein and P protein plays a vital role in the replication of RSV (27). Compounds disrupting this protein-protein interaction may offer a treatment for RSV infection (28). An ELT selection with both the macrocyclic and linear peptide ELT libraries was run against RSV N protein. Hits were analyzed using the ELT selection data together with predicted cell permeability and solubility (29). Because of concern that the cyclization reaction may not have gone to completion for some macrocyclic peptides, four macrocyclic peptides with a 10-fold preference for cyclized over linear peptides and with good predicted permeability and solubility were selected for resynthesis without the DNA tag. The peptides were synthesized on Fmoc-Rink Amide-polystyrene resin using the corresponding Fmoc-amino acids and cyclized using modified Cu(II)-click chemistry according to Figure 1. Their binding to RSV N was detected by an AS-MS (affinity selection-mass spectrometry) assay. As shown in Table 1, the macrocyclic off-DNA peptides showed better binding affinity than the corresponding linear peptides, reproducing the binding pattern of the on-DNA peptides in the ELT selection. The difference in binding between macrocyclic and linear peptides may be due to the reduced rotational freedom and more rigid structure imposed by cyclization. In a TR-FRET (time-resolved fluorescence resonance energy transfer) assay detecting disruption of the binding interaction between RSV N and P proteins, the functional activity of these macrocyclic peptides showed good correlation with binding affinity to RSV N (Figure 3).

In this study we have designed and synthesized a macrocyclic ELT library of 2.4 × 10^12 cyclic peptides composed of 4-20 natural and non-natural amino acids. As far as we know, it is the largest DNA-encoded library made to date. Testing with two protein-protein interaction targets demonstrates that active macrocyclic peptides can be identified from this large macrocyclic library (confirmed for RSV N by off-DNA synthesis and testing). This work points to a new direction in discovering active compounds for traditionally intractable targets and helps fill the gap between small-molecule and biologics drug discovery.

METHODS

Building block validation. A total of 352 Fmoc-amino acid building blocks (BBs) were validated as described herein, and product yield was one of the key factors in selecting the 276 that were incorporated into library synthesis. Fmoc-amino acids (including Fmoc-protected di- and tri-peptide BBs) were validated using AOP-Headpiece-L-propargylglycine as the substrate. Di- and tri-peptide BBs underwent a 2-step validation: 1) acylation of the BB onto AOP-Headpiece-L-propargylglycine using standard DMTMM conditions and 2) Fmoc deprotection. Product yields of the 40 di- and 9 tri-peptide BBs incorporated into the library ranged from 65 to 100% based on negative ion mass spectrometry (HPLC/ESI-MS; Thermo Fisher Scientific LCQ Advantage or Bruker uTOF). The remaining BBs underwent a 4-step validation: 1) acylation of the BB onto AOP-Headpiece-L-propargylglycine using standard DMTMM conditions; 2) Fmoc deprotection; 3) acylation of the product with Fmoc-Phe using standard DMTMM conditions; and 4) Fmoc deprotection. The 4-step product yields of the 227 BBs incorporated into the library ranged from 50 to 95% based on negative ion mass spectrometry. BB validation results are summarized in Supplementary Table 1.

DNA tag validation. Sequences of DNA tags were randomly selected to encode building blocks (Supplementary Table 2). All DNA tags were purchased from Biosearch Technologies (Novato, CA). The average purity of the DNA tags was 80% (minimum purity for QC: 60%), as determined by UPLC/MS. After library synthesis was completed, PCR was performed to generate DNA sequences compatible with Illumina sequencing flowcells from the DNA-encoded molecules in the library. The PCR output was purified using Agencourt AMPure XP SPRI beads according to the manufacturer’s instructions and then quantitated on an Agilent BioAnalyzer using a high sensitivity DNA kit. The final concentration of amplicons for each sample was between 3 and 40 nM. Portions of this material were loaded to generate 20 million clusters on an Illumina GAII or HiSeq platform. The resulting sequence information was used to ensure that all DNA tags in the library were intact and evenly represented (what we refer to as ‘dilute library sequencing’ (DLS) QC).
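One way such a representation check could be scripted is sketched below. The text does not describe how the DLS QC was computed, so the metrics here (a list of tags with zero reads and the coefficient of variation of per-tag read counts) and the function name are our assumptions, shown only to make the idea of "intact and evenly represented" concrete.

```python
from collections import Counter
from statistics import mean, pstdev


def tag_representation(observed_tags, expected_tags):
    """Summarize how evenly the expected tags appear in sequencing reads.

    observed_tags: tag sequences extracted from reads for one cycle.
    expected_tags: list of tags that should be present for that cycle.
    Returns (missing_tags, coefficient_of_variation).
    """
    expected = set(expected_tags)
    counts = Counter(t for t in observed_tags if t in expected)
    per_tag = [counts.get(t, 0) for t in expected_tags]
    missing = [t for t in expected_tags if counts.get(t, 0) == 0]
    average = mean(per_tag)
    # A CV of 0 means perfectly even representation; larger means more skew.
    cv = pstdev(per_tag) / average if average else float("inf")
    return missing, cv
```

A tag that never appears in the reads would show up in the first return value, and a large coefficient of variation would flag uneven representation; any pass/fail threshold would have to be chosen by the user.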

Synthesis of macrocyclic peptide DNA-encoded library. As shown in Figure 1, following installation of Fmoc-L-propargylglycine (Fmoc-Pra-OH) onto the AOP-Headpiece, six encoded cycles incorporating Fmoc-amino acids and Fmoc-protected di- and tri-peptides, as well as encoded nulls (deletions) in cycles 2-5, were carried out using DMTMM-promoted acylation and Fmoc-deprotection conditions analogous to those previously published (16) (scale ranging from 5 µmol per building block for cycle 1 to 0.5 µmol per building block for cycle 6). A portion of this post-cycle 6 material was converted to macrocyclic DEL-1 via acylation with azidoacetic acid and subsequent intramolecular click cyclization. DEL-2, the non-cyclized control library, was prepared by acetylation of the terminal amine with acetic anhydride. Standard click macrocyclization conditions used in the synthesis of DEL-1: to a microtube containing a solution of on-DNA alkyne/azide substrate (1 mM in 250 mM pH 9.4 sodium borate buffer), 4 equivalents of CuSO4·5H2O (200 mM in water) were added, followed by 4 equivalents of sodium ascorbate (200 mM in water). The reaction mixture was briefly vortexed and then heated at 60 °C for 30 minutes. When azidoacetic acid is attached to the C-terminal amide group, we found that the side chains of Asn and Gln did not need to be protected.

ELT selection. ELT selection was performed as previously described with some modifications (16, 30, 31). His-tagged VHL or RSV N protein (5 µg) was immobilized on 5 µl of IMAC resin (Phynexus). DEL-1 or DEL-2 library (2.5 nmol, 100 copies per member in this library of 2.4 × 10^12 members) in 60 µl of selection buffer (50 mM Tris-HCl (pH 7.5), 150 mM NaCl, 0.1% Tween-20, 0.1 mg/ml sheared salmon sperm DNA (Ambion)) was incubated with the immobilized protein for 1 hr at room temperature, and the resin was then washed five times with 100 µl of selection buffer to remove unbound DEL molecules. To elute bound molecules, the resin was incubated in 60 µl of selection buffer at 80 °C for 10 min. The eluate was incubated with 5 µl of IMAC resin for 22 min to remove denatured protein before the next round of selection. This process was repeated two additional times. The same procedure was followed for the no-target control, except that naked IMAC resin was used instead of IMAC resin with immobilized protein. After the copy number of the selection output was determined by qPCR, an appropriate number of PCR cycles was selected to add DNA sequences compatible with Illumina sequencing flowcells. The PCR output was purified using Agencourt AMPure XP SPRI beads according to the manufacturer’s instructions and then quantitated on an Agilent BioAnalyzer using a high sensitivity DNA kit. The final concentration of amplicon for each sample was between 3 and 40 nM. Portions of the products were loaded to generate 20 million clusters on an Illumina GAII or HiSeq platform. Chemical structures were obtained from the sequence information of the DNA tags.

Analysis of ELT selection data. Enriched features were identified as described previously (32). Background subtraction was performed by removing from the DEL-1 or DEL-2 selections any feature that contained one or more features enriched in the no-target control. Full macrocycle structures were selected that had 3 or more copy counts and contained one or more features with a 10-fold preference for cyclized over linear.
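The filtering logic can be sketched in a few lines. This is a simplified illustration and not the authors' pipeline: it applies the copy-count and preference thresholds to whole compounds rather than to features, the record field names are hypothetical, and a linear count of zero is treated as one to avoid division by zero (our assumption).

```python
def select_macrocycles(records, min_copies=3, min_preference=10.0):
    """Return names of compounds that pass the background, copy-count,
    and cyclized-over-linear preference filters.

    Each record is a dict with hypothetical keys:
      name, cyclic_count (DEL-1 selection), linear_count (DEL-2 selection),
      enriched_in_no_target_control (bool).
    """
    selected = []
    for rec in records:
        # Background subtraction: drop anything enriched without target.
        if rec["enriched_in_no_target_control"]:
            continue
        # Require at least `min_copies` sequence counts in the cyclic library.
        if rec["cyclic_count"] < min_copies:
            continue
        # Assumption: treat a linear count of 0 as 1 so the ratio is defined.
        preference = rec["cyclic_count"] / max(rec["linear_count"], 1)
        if preference >= min_preference:
            selected.append(rec["name"])
    return selected
```

For example, a compound seen 30 times in the DEL-1 selection and twice in the DEL-2 selection has a 15-fold preference and passes, whereas one seen 30 and 5 times (6-fold) does not.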


Further selection of peptides was done by 3D conformational search using LowMode MD (CCG MOE software) and is summarized in Supplementary Table 3 and Supplementary Figure 1. The solvation energy was calculated using Poisson-Boltzmann calculations with two different dielectric constants, 80 and 4, to mimic the water and membrane environments, respectively. Among all the 3D conformations given by LowMode MD, the conformation of a peptide with the lowest penalty in diffusing from the water to the membrane environment was chosen (svmChromLogD 4-6) (33). The peptides were also ranked based on solvation energy and the more traditional LogD/CMR permeability measurements (34). To identify active peptides with a balance between permeability and solubility, the cyclic peptides in Table 1 were selected based on the following criteria:

1) SVMchromlogD in the range of 4 to 6 (33)

2) Most compounds lie near the discrimination line (34)

3) Solvation energy is less than -40 kcal/mol, so that compounds are predicted to be water soluble (34)

4) Most compounds lie between the chromlogD/CMR and chromlogD/solubility lines (the golden triangle) (34)
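The two numeric criteria above (1 and 3) lend themselves to a simple filter, sketched here. Criteria 2 and 4 depend on the discrimination and golden-triangle lines, whose equations are not given in the text, so they are not modeled. The class and field names are hypothetical, and the example values are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PeptideProperties:
    name: str
    svm_chromlogd: float               # predicted SVMchromlogD
    solvation_energy_kcal_mol: float   # Poisson-Boltzmann solvation energy


def passes_numeric_criteria(peptide, logd_range=(4.0, 6.0),
                            max_solvation_energy=-40.0):
    """Check criterion 1 (SVMchromlogD within range) and
    criterion 3 (solvation energy below the cutoff)."""
    low, high = logd_range
    logd_ok = low <= peptide.svm_chromlogd <= high
    solvation_ok = peptide.solvation_energy_kcal_mol < max_solvation_energy
    return logd_ok and solvation_ok


# Invented example values, not data from this work.
candidates = [
    PeptideProperties("example-1", 4.8, -55.0),
    PeptideProperties("example-2", 6.7, -60.0),
    PeptideProperties("example-3", 5.1, -30.0),
]
shortlist = [p.name for p in candidates if passes_numeric_criteria(p)]
```

Here only the first invented candidate satisfies both cutoffs; the second fails on SVMchromlogD and the third on solvation energy.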

Based on consideration of chemical diversity, four cyclic peptides were selected for synthesis. Minor modifications were made to the structures because of building block availability and concerns about synthetic feasibility. In addition, the corresponding linear peptides were synthesized as controls (Table 1).

Synthesis of off-DNA linear and macrocyclic peptides. Linear and cyclic peptides 3-10 (Table 1) were synthesized using the following general method. Linear peptides were synthesized on a Liberty 1 Microwave Peptide Synthesizer using Rink-AM resin (200-400 mesh, Merck Millipore, 0.2 mmol scale) and standard Fmoc-amino acid coupling conditions with Fmoc-AA (0.5 mmol, 0.2 M in NMP), HCTU (0.5 mmol, 0.5 M in DMF) and DIPEA (1 mmol, 1 M in DMF). Half of the peptide material was cleaved from the resin using TFA-TIPS-H2O (2 ml, 95:2.5:2.5) to provide the linear peptides 4, 6, 8 and 10 as the trifluoroacetate salts after purification by RP-HPLC (0.1% TFA, MeCN/H2O). The remaining resin was transferred to a microwave vial and cyclized by adding a solution of water:tert-butanol (1:1, 3 ml), copper(II) sulfate pentahydrate (1.500 mmol) and sodium L-ascorbate (1.500 mmol). The resin was filtered, washed with water, DCM and DMF, and dried, and the peptide material was cleaved using TFA-TIPS-H2O (2 ml, 95:2.5:2.5). The macrocyclic peptides 3, 5, 7 and 9 were obtained as the trifluoroacetate salts after purification by RP-HPLC (0.1% TFA, MeCN/H2O). Characterization of the synthesized peptides is summarized in Supplementary Table 4.

AS-MS assay. We developed an AS-MS assay in house, similar to the SpeedScreen and SECTID technologies reported in the literature (35, 36). Mixtures containing peptide and RSV N protein were prepared by combining 2 µl of 100 µM peptide solution with 20 µl of 10 µM RSV N protein in a 384-well assay plate. Duplicate samples thus prepared were incubated for 60 min at room temperature, then chilled to 4 °C prior to AS-MS analysis. Control experiments were conducted for each compound or mixture to confirm that any unbound ligand was trapped by the stationary phase and only protein-bound ligand was eluted for analysis (i.e., no chromatographic breakthrough occurred). Bio-Rad P10 resin was swollen in water at 4 °C over 12 hours. The 384-well filter plates (cat. MZHVN0W10; EMD Millipore, Billerica, MA) were loaded with 130 µl of P10 resin slurry per well and spun at 1000 g for 2 min. The plates were then rinsed three times with selection buffer (20 mM Tris-HCl, 200 mM NaCl, pH 7.4). Samples (18 µl) were transferred from the assay plate using a Biomek FX (Beckman Coulter) equipped with a 384-well head. SEC assemblies (SEC plate and collection plate) were then spun for 2 min at 1000 g. Then 9 µl of a 1:1 mixture of DMSO and acetonitrile was added to each well in the collection plate, giving a total volume of 27 µl per well. SEC plates were discarded and collection plates were prepared for LC-MS detection. Ligands were then dissociated from the complex, desalted, and eluted into a Thermo LTQ linear trap mass spectrometer (Thermo, Billerica, MA). The sample volume injected was 6 µl. Chromatography used a C18 guard column and an analytical column (Kinetex C18, 1.7 µm, 100 Å, 2.1 × 50 mm, Phenomenex, Torrance, CA) eluted with a gradient of A (0.1% formic acid in water) versus B (100% methanol) at a flow rate of 1 mL/min. The peak area of the molecule, obtained from the mass spectrometer, was used as the signal. MS analysis was performed in positive ionization mode using a standard nebulized ESI source with the capillary at 3.5 kV, a desolvation temperature of 180 °C, a source temperature of 100 °C, and 30 V “cone” and 3 V extraction lens settings.
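The concentrations in the incubation mixture follow from the stated stock concentrations and volumes. The short calculation below assumes the two volumes are simply additive; the helper name is ours.

```python
def mixed_concentration(stock_conc, added_volume, total_volume):
    """Concentration of a component after dilution into the final mixture
    (same units as stock_conc; volumes in any consistent unit)."""
    return stock_conc * added_volume / total_volume


peptide_ul, protein_ul = 2.0, 20.0      # volumes combined per well
total_ul = peptide_ul + protein_ul      # 22 µl, assuming additive volumes

peptide_final_uM = mixed_concentration(100.0, peptide_ul, total_ul)
protein_final_uM = mixed_concentration(10.0, protein_ul, total_ul)
```

Both components end up at roughly 9.1 µM, so under this assumption peptide and RSV N protein are incubated at an approximately 1:1 molar ratio.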

TR-FRET assay. The assay was run in buffer containing 20 mM HEPES (pH 7.5), 50 mM KCl, 2 mM CHAPS, 15 mM DTT, and 0.05% BSA. To a Greiner Black FIA 384-well plate (cat. 784076) containing compound (100 nl in 100% DMSO), 5 µl of 24 nM RSV N protein/14 nM d2-anti-His Ab (CisBio cat. 61HisDLB) was added, followed by 5 µl of 6 nM biotin-RSV P/6 nM Eu-streptavidin (Perkin Elmer cat. AD0063). After incubation for 1 hr at room temperature, plates were read on a PerkinElmer Viewlux using the following settings: measurement time = 20 s, excitation filter = 340/10 nm, emission filters = 618/8 nm and 671/8 nm, delay time = 50 µs, and read time = 354 µs. Data were reported as the ratio of the d2 counts (at 660 nm) to the Eu counts (610 nm). IC50 values were determined using the XC50 module of ActivityBase (Supplementary Figure 2).
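The readout arithmetic can be sketched as follows. The acceptor/donor ratio is as described above; the percent-inhibition normalization against an uninhibited control and a background control is a common convention that we assume here, since the controls are not described in the text, and the pIC50 conversion uses the standard definition (negative base-10 logarithm of the IC50 in molar). The curve fitting itself was done in ActivityBase and is not reproduced. All function names are hypothetical.

```python
import math


def fret_ratio(acceptor_counts, donor_counts):
    """Ratio of d2 (acceptor) counts to Eu (donor) counts for one well."""
    return acceptor_counts / donor_counts


def percent_inhibition(ratio, ratio_uninhibited, ratio_background):
    """Normalize a well's ratio between two controls (assumed convention):
    0% at the uninhibited control, 100% at the background control."""
    window = ratio_uninhibited - ratio_background
    return 100.0 * (ratio_uninhibited - ratio) / window


def pic50_from_ic50(ic50_molar):
    """pIC50 is the negative base-10 logarithm of the IC50 in mol/L."""
    return -math.log10(ic50_molar)
```

For instance, an IC50 of 1 µM (1e-6 M) corresponds to a pIC50 of 6, which is the scale used in Table 1.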

ACKNOWLEDGEMENTS

We thank C. Phelps and C. Arico-Muendel for helpful insights, K. Valko and H. Dexter for help with peptide synthesis, and A. Goetz for help with assay data.

Supporting Information Available: SI Figure 1, SI Figure 2, SI Table 1, SI Table 2, SI Table 3, SI Table 4. This material is available free of charge via the internet at http://pubs.acs.org.


[Figure 1 scheme. Conversion to DEL-1: 1) azidoacetic acid (2 × 100 equiv.), DMTMM (2 × 100 equiv.), pH 9.4 sodium borate buffer/DMA, RT, overnight; 2) CuSO4·5H2O (4 equiv.), sodium ascorbate (4 equiv.), pH 9.4 sodium borate buffer, 60 °C, 30 min. Conversion to DEL-2 (linear control): acetic anhydride (100 equiv.), pH 9.4 sodium borate buffer/DMA, RT, 6 h. Building blocks: cycles C1 and C6, 20 amino acids; cycles C2-C5, 276 amino acids/peptides (the same 20 as in C1/C6 plus an additional 256) + 2 nulls. Scaffold examples: amino acid derived heterocycles (X = O, S), di- and tri-peptides, unnatural amino acids.]

Figure 1. DNA-encoded libraries (DELs). The DEL-1 macrocycle was formed by installation of azidoacetic acid and subsequent click cyclization; a linear control library (DEL-2) was prepared by acylation of the terminal amine with acetic anhydride. These six-cycle libraries contain 2.4 × 10^12 members, with each member having 4-20 natural and non-natural amino acids (including two glycines at the two termini).

Figure 2. Two examples of macrocyclic peptides containing hydroxyproline bound to VHL were identified through ELT selection.


[Figure 3 plot. y-axis: % inhibition of RSV N-P binding; x-axis: % bound to RSV N; series: cyclic peptide, linear peptide.]

Figure 3. Correlation between binding to RSV N protein and inhibition of RSV N-P binding (at 10 µM peptide) for macrocyclic and linear peptides. Binding to RSV N protein was determined by AS-MS assay and inhibition of RSV N-P binding was determined by TR-FRET assay.


Table 1. Structure, binding affinity to RSV N protein, and inhibition of RSV N-P binding of macrocyclic and linear peptides.

Columns: Peptide; Structure; % Bound to RSV N protein; pIC50 in RSV N-P TR-FRET assay

57.3

5.6

7.2