
Letter pubs.acs.org/acscombsci

Analysis of Current DNA Encoded Library Screening Data Indicates Higher False Negative Rates for Numerically Larger Libraries

Alexander L. Satz,* Remo Hochstrasser, and Ann C. Petersen

Roche Pharmaceutical Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche, Ltd., Grenzacherstrasse 124, CH-4070 Basel, Switzerland


Supporting Information

ABSTRACT: To optimize future DNA-encoded library design, we have attempted to quantify the library size at which the signal becomes undetectable. To accomplish this we (i) have calculated that percent yields of individual library members following a screen range from 0.002 to 1%, (ii) extrapolated that ∼1 million copies per library member are required at the outset of a screen, and (iii) from this extrapolation predict that false negative rates will begin to outweigh the benefit of increased diversity at library sizes >10⁸. The above analysis is based upon a large internal data set comprising multiple screens, targets, and libraries; we also augmented our internal data with all currently available literature data. In theory, high false negative rates may be overcome by employing larger amounts of library; however, we argue that using more than currently reported amounts of library (≫10 nmoles) is impractical. The above conclusions may be generally applicable to other DNA encoded library platforms, particularly those platforms that do not allow for library amplification.

KEYWORDS: DNA-encoded libraries, screening, drug discovery, molecular diversity, combinatorial chemistry

© 2017 American Chemical Society

DNA-encoded libraries (DELs) are commonly used to discover small molecules that interfere with the activity of pharmaceutically relevant proteins.1−14 DELs consist of complex mixtures where each library member possesses a small-molecule moiety covalently linked to a DNA oligomer. The sequence of the DNA oligomer encodes the chemical structure of the attached small molecule. The contents of a DEL are readily determined via high throughput sequencing, and libraries may be rapidly and inexpensively screened against protein targets to find small-molecule ligands. A correlation between library size and productivity has been reported for combinatorial phage display libraries,15 and it is tempting to believe that a similar correlation might exist for DNA-encoded libraries. Extremely large DELs are readily produced via split-and-pool chemistry, and libraries containing a trillion unique chemical structures have been reported.2,16,17 Our laboratory recently reported the successful screening of 16 DELs (numeric sizes ranged from 10⁶ to 10¹¹ small-molecule structures per DEL) against 2 different protein targets, resulting in the discovery of 34 structurally distinct clusters; however, we observed no correlation between library size and productivity13 or ligand potency (Figures S1−S2). In the seminal report by Clark et al.,1 it was observed that signal from an 800 million member library was ∼100-fold weaker than signal resulting from a ∼100-fold numerically smaller library; it was therefore hypothesized that larger DNA encoded libraries produce weaker signals. However, the library size at which signal becomes undetectable was never determined.

Despite the similarities between a DEL screen and a simple purification, no literature report explicitly provides yields of library members following synthesis and screening. However, rough estimates of yields may be calculated from reported data. For instance, 35 million copies of a positive control were spiked into a library and screened against a protein target, after which the positive control was observed 971 times;1 simple division gives a recovery of ∼0.003%. Extremely large libraries (>10¹⁰) possess only 1000s of copies of each unique library member at the outset of a screen, and thus a low percent recovery would explain their underwhelming productivity. A survey of relevant literature reveals a range of crudely estimated yields from 0.00005 to 0.08% (Table S1).1,10,14 These estimates indicate that current library production and screening protocols result in low yields; however, they may be inaccurate as they are based solely on observed sequences. A more accurate method for estimating yields is described herein and applied to a data set previously reported by our laboratory.13 Employing a combination of internal and literature data, we attempt to validate and quantify the hypothesis of Clark et al. that larger libraries provide weaker signals.
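The rough literature-based recovery for the spiked positive control (35 million copies added, 971 observations) can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and, like the estimate in the text, it treats the number of times a sequence is observed as a stand-in for the number of copies recovered.

```python
def percent_recovery(copies_added: float, times_observed: float) -> float:
    """Crude percent recovery: observed sequence count divided by copies added."""
    return 100.0 * times_observed / copies_added


def expected_copies_recovered(copies_at_outset: float, recovery_percent: float) -> float:
    """Average number of copies expected to survive at a given percent recovery."""
    return copies_at_outset * recovery_percent / 100.0


# Positive control from Clark et al. as quoted in the text:
# 35 million copies spiked in, observed 971 times.
control_recovery = percent_recovery(35_000_000, 971)
print(f"positive control recovery: {control_recovery:.4f}%")  # 0.0028%

# Hypothetical library member present at 1000 copies at the outset
# (1000 is an illustrative value, not a figure from the source).
print(expected_copies_recovered(1_000, control_recovery))
```

At that recovery, a member that starts with only thousands of copies is expected to leave well under one copy on average, which is the intuition behind the low productivity argued for very large libraries.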

Received: January 30, 2017 Revised: March 10, 2017 Published: March 13, 2017

DOI: 10.1021/acscombsci.7b00023 ACS Comb. Sci. 2017, 19, 234−238

ACS Combinatorial Science

Table 1. Yield of a Library Member That Binds the Protein Target Ranges from 0.002−1% Following 2 Rounds of Screening [a]

| cmpd ID [b] | pIC50 [b] | DEL size (millions) [c] | DEL ID [b,c] | theoretical quantity of library member, molecules (millions) [d] | screen ID [e] | unique reads (m_i) [f] | library members after screen (n_i) [g] | % yield [h] |
|---|---|---|---|---|---|---|---|---|
| 33 | 5.6 | 1.2 | 7 | 3.5 | 3 | 38 | 110 | 0.004 |
| 37 | 5.6 | 3.8 | 16 | 1 | 3 | 42 | 122 | 0.01 |
| 39 | 7.2 | 3.8 | 16 | 1 | 3 | 79 | 228 | 0.02 |
| 41 | 5.2 | 3.8 | 16 | 1 | 3 | 10 | 29 | 0.002 |
| 42 | 6.6 | 3.8 | 16 | 1 | 3 | 37 | 107 | 0.01 |
| 43 | 5.6 | 9.7 | 11 | 0.5 | 3 | 115 | 331 | 0.06 |
| 45 | 5.4 | 1.2 | 7 | 3.5 | 3 | 26 | 76 | 0.002 |
| 46 | 5.9 | 3.8 | 16 | 1 | 3 | 33 | 96 | 0.01 |
| 47 | 6.7 | 470 | 5 | 0.01 | 3 | 0 | 0 | 0 |
| 47 | 6.7 | 470 | 5 | 0.05 | 2 | 23 | 23 | 0.04 |
| 51 | 5.4 | 3.8 | 16 | 1 | 3 | 14 | 41 | 0.004 |
| 55 | 6.1 | 3.8 | 16 | 1 | 3 | 50 | 145 | 0.01 |
| 5 | 9 | 470 | 5 | 0.1 | 1 | 57 | 913 | 1.0 |
| 19 | 7.6 | 100 | 1 | 0.35 | 1 | 92 | 1873 | 0.6 |

[a] See Supporting Information section-1 for a glossary.
[b] Compounds have been previously reported by Eidam and Satz;13 the same ID numbers are used.
[c] Generic schema provided in Table S2. Values are rounded.
[d] Theoretical quantity of material added to the PCR mix prior to amplification. See Supporting Information section-4 for exemplar calculation. Values are rounded.
[e] See Table S3.
[f] Experimentally observed unique reads for the particular (ith) library member.
[g] Total number of amplifiable ith library members in the PCR mix prior to amplification (this value is calculated as described in the Supporting Information section-7).
[h] 100 × (column-8/column-5). Values are rounded.
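Footnote h of Table 1 defines percent yield as 100 × (column-8/column-5). The sketch below applies that definition to the rows of Table 1; the tuple layout and function name are assumptions made for illustration. Because the tabulated inputs are themselves rounded, the recomputed values agree with the printed column only approximately.

```python
# (cmpd ID, screen ID, theoretical quantity in millions of molecules, n_i),
# transcribed from Table 1, columns 1, 6, 5, and 8.
TABLE_1 = [
    (33, 3, 3.5, 110), (37, 3, 1.0, 122), (39, 3, 1.0, 228), (41, 3, 1.0, 29),
    (42, 3, 1.0, 107), (43, 3, 0.5, 331), (45, 3, 3.5, 76), (46, 3, 1.0, 96),
    (47, 3, 0.01, 0), (47, 2, 0.05, 23), (51, 3, 1.0, 41), (55, 3, 1.0, 145),
    (5, 1, 0.1, 913), (19, 1, 0.35, 1873),
]


def percent_yield(n_after_screen: float, theoretical_millions: float) -> float:
    """Footnote h: 100 x (library members after screen / theoretical quantity)."""
    return 100.0 * n_after_screen / (theoretical_millions * 1e6)


yields = [percent_yield(n_i, qty) for _, _, qty, n_i in TABLE_1]

for (cmpd, screen, _, _), y in zip(TABLE_1, yields):
    print(f"cmpd {cmpd:>2} (screen {screen}): {y:.4f}%")
```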

Note that the Supporting Information includes a glossary (section-1), a description of library encoding (section-2), and a protocol for DNA encoded library affinity screens (section-3). The theoretical quantity of the ith library member added at the outset of a screen is calculated as described in the Supporting Information section-4. In the case of compound 33 (Table 1), which is derived from DEL-7, 3.5 million copies per DEL-7 library member were added at the outset of screen-3. The quantity of each library member recovered after the screen is determined by employing eq 1,18 where m is the number of unique DNA sequences observed via sequencing (referred to as unique reads), x is the total number of sequence reads (referred to as total reads), and n is the number of unique and amplifiable DNA sequences in the PCR mix prior to amplification (eq 1 is validated in Supporting Information section-5). Because of inclusion of a randomized sequence1 during library construction, only an insignificant percentage of library members collected following a screen will possess identical sequences.

$$m = n\left[1 - \left(1 - \frac{1}{n}\right)^{x}\right] \qquad (1)$$

The fraction of nonunique sequences observed via sequencing (1 − m/x) can be used to estimate the total number of amplifiable DNA sequences in the PCR mix prior to amplification (see Supporting Information section-6). In the case of screen 3, 21 318 804 unique reads (m) and 25 211 327 total reads (x) are observed. The ratio (m/x) is employed to estimate that 72 million amplifiable DNA sequences were added to the PCR mix prior to amplification. The precision of the estimate increases as the ratio m/x decreases (Figure S3) and is expected to provide near perfect results when x ≫ n. Estimations correlate with experimental (nonquantitative) PCR results (Figure S4) and correctly rank-order samples with 2-fold differences in DNA quantity (Figure S5). Note that the accuracy of estimated yields provided in Table 1 is entirely dependent upon observed mi values and therefore the quality of the sequencing data. In the case of screen 3, the DNA sequence corresponding to compound 33 had 38 unique reads (m33 = 38) (Table 1, row 1).

The total number of amplifiable library members corresponding to compound 33 is then back-calculated (according to eq 1) to be 110 molecules (n33 = 110) (Table 1, row 1). Calculations are detailed in Supporting Information section-7. The percent yield of the library member corresponding to compound 33 is 0.004% (110 divided by 3.5 million). Percent yields for all 13 nontruncated compounds reported by Eidam and Satz13 are listed in Table 1. The percent yields listed in Table 1 are less than 100% for numerous reasons, including synthetic yield during multistep small-molecule synthesis and/or DNA degradation during library production.19,20 The numerically largest library employed in screen-3 was not productive; DEL-6 has a numeric size of ∼81 billion.13 Each DEL-6 library member possessed 10² molecules at the outset of the screen (a theoretical quantity of 4.1 × 10¹² total molecules per library). A similar ratio (molecules divided by numeric size) is commonly employed in phage display; for instance, Weber et al.21 screened a 40 billion combinatorial phage display library starting with >10¹² phage particles. However, in phage display the library is amplified following each round of screening. In the case of DEL-6, a library member that binds the protein target is estimated to result in only ∼0.002−1 amplifiable molecule following two rounds of screening (based on the range of yields listed in Tables 1 and S1). This implies that DEL-6 requires ∼150 nmol of library per screening condition to detect a library member that binds the protein target (assuming ∼1 million copies per library member are required at the outset of the screen). However, employing such a quantity of library is challenging, as current library production protocols yield only 3−5 μmol of DEL.1,10 The reported 1 trillion member libraries2,16,17 are predicted to require >1.5 μmol of DEL per screening condition.

Library pools used herein require two successive rounds of screening to observe acceptable signal-to-noise; in contrast, literature indicates that screens employing 10 000-fold smaller amounts of library require only a single round.22,23 The ability to employ extremely large quantities of library at the outset of a screen may be challenging, as yields of the desired library members that bind the target protein may decrease with each additional round of screening. Screens 1a, 1b, and 1c were run successively, in a manner previously reported.1−13 Screens 1a, 1b, and 1c detect 30, 13 441, and 4566 library members with enrichment > 10, respectively (Table 2). The high false negative rate for screen 1a may be attributed to low sequencing depth (x/n ≈ 0.01) because of a large quantity of recovered library. Each successive screen (1b−1c) results in 10-fold less recovered library (Table 2), resulting in vastly fewer false negatives for screen 1b. Screen 1c, however, has more false negatives than screen 1b; as suggested above, we assume loss of material due to repeated handling and purifications is unavoidable.

We attempt to further validate the hypothesis of Clark et al.1 that larger libraries provide weaker signals. First, we directly compare library members from the same library and screen. Compounds 5 and 29 are both derived from DEL-5 screen 1 and have reported pIC50 values of 9 and 7, respectively (information regarding libraries and screens is provided in Tables S2 and S3, and compound properties and biological activities were previously reported by Eidam and Satz13). The library member corresponding to 29 is a truncate and has a ∼500-fold greater theoretical quantity at the outset of the screen, and a corresponding 500-fold greater enrichment, than 5 (Figure 1). Figures 2 and S11 compare enrichments between all observed parent and truncated library members for screens 1 and 2, respectively; the numerically smaller truncated libraries provide stronger signals than the larger parent libraries. Second, we plot maximum observed enrichment for all screening data (including relevant literature data) versus library numeric size; a moderate correlation is apparent despite the data arising from numerous laboratories, libraries, and targets (Figure 3). (Note that our rationale for using “maximum observed enrichment” as the dependent variable is provided in the Supporting Information section-9.)

Table 2. Sequencing Output for Screens 1a−1c [a]

| screen ID | rounds of screening | total reads (x), millions [b] | unique reads (m), millions [b] | amplifiable library members added to the PCR mix prior to amplification (n), millions [c] | % nonunique sequences (1 − m/x) | library members with enrichment > 10 [d] | cycles of PCR [e] |
|---|---|---|---|---|---|---|---|
| 1a | 1 | 36 | 35 | 2400 | 1 | 30 | ∼13 |
| 1b | 2 | 41 | 39 | 330 | 6 | 13 441 | ∼24 |
| 1c | 3 | 38 | 17 | 22 | 55 | 4566 | 30 |

[a] See Supporting Information section-1 for a glossary.
[b] Experimentally observed total and unique reads.
[c] This value is calculated as described in Supporting Information section-6.
[d] Number of library members detected with enrichment > 10 (see Supporting Information section-8 for the definition of enrichment).
[e] Cycles of PCR required for the collected library material (following the screen) to be observed via electrophoresis on an ethidium bromide gel (see Supporting Information section-3).

Figure 1. Library member with ∼500-fold greater theoretical yield possesses ∼500-fold greater enrichment. Enrichment of two DEL-5 (Table S2) library members from screen-1c (Table S3), as a function of total reads (x). The value of total reads is reduced via random sampling of the experimental sequencing data prior to recalculation of enrichment values (see R script ‘sample_sequencing_data.R’ in Supporting Information).
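Equation 1 gives m in terms of n and x but cannot be rearranged algebraically for n, so estimating n from observed read counts needs a numerical solve. The following is a minimal sketch, assuming plain bisection on eq 1; the function names are hypothetical, and it is not the procedure of Supporting Information section-6, which is not reproduced here.

```python
import math


def expected_unique_reads(n: float, x: float) -> float:
    """Eq 1: m = n * [1 - (1 - 1/n)**x], written with log1p/expm1 for large n."""
    return -n * math.expm1(x * math.log1p(-1.0 / n))


def estimate_templates(m: float, x: float, rel_tol: float = 1e-6) -> float:
    """Invert eq 1 for n by bisection, given unique reads m and total reads x."""
    if not 1 < m < x:
        raise ValueError("need 1 < m < x; if m == x the estimate is unbounded")
    lo = hi = float(m)  # n cannot be smaller than the number of unique reads
    # Eq 1 increases with n, so double the upper bound until it brackets m.
    while expected_unique_reads(hi, x) < m:
        hi *= 2.0
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if expected_unique_reads(mid, x) < m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Screen 3 read counts quoted in the text.
n_screen3 = estimate_templates(m=21_318_804, x=25_211_327)
print(f"estimated amplifiable sequences: {n_screen3:.3g}")
```

For the screen 3 read counts this lands at roughly 7 × 10⁷, in line with the ∼72 million quoted in the text; small differences are to be expected given that the authors' exact procedure may differ from this sketch.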

Figure 2. Histogram comparing phosphodiesterase (screen 1) enrichment between truncated and nontruncated (parent) library members; truncated libraries are numerically smaller and provide a stronger signal than larger parent libraries. Note that no parent library members possess enrichment > 100. Aggregation of data from DELs 1, 2, 4, and 5; only library members with enrichment > 25 are included. Truncated libraries are generally 2−3 orders of magnitude numerically smaller than the nontruncated (parent) libraries from which they are derived. A similar result is observed for screen-2, which employed a kinase target protein (Figure S11).
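Figure 3 reports an ordinary least-squares line, y = −0.42x + 4.77, relating maximum observed enrichment to library size. The sketch below evaluates that line for a few library sizes. It assumes that x is log10 of the numeric library size and y is log10 of the maximum observed enrichment; the caption states the log scale only for enrichment, so the x-axis scaling is an assumption, as is the function name.

```python
import math

# Coefficients of the least-squares line reported in the Figure 3 caption.
SLOPE = -0.42
INTERCEPT = 4.77


def predicted_max_enrichment(library_size: float) -> float:
    """Evaluate the reported fit, assuming both axes are log10-scaled."""
    log10_enrichment = SLOPE * math.log10(library_size) + INTERCEPT
    return 10.0 ** log10_enrichment


for size in (1e6, 1e8, 1e10):
    print(f"library size {size:.0e}: predicted max enrichment "
          f"{predicted_max_enrichment(size):.1f}")
```

Under these assumptions, each 10-fold increase in library size lowers the predicted maximum enrichment by a factor of 10^0.42, or about 2.6.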

Figure 3. Maximum observed enrichment correlates with library size. Ordinary least-squares results in the equation y = −0.42x + 4.77. The maximum observed enrichment (Log10) is simply the most enriched library member for any given library and screen. For screens 1−3, all libraries are included that possess at least one library member with enrichment > 4. All applicable literature data is included; for literature data, enrichment is assumed equivalent to the number of times the most enriched library member was observed as listed in Table S1, column 4.

Observed enrichment is also influenced by the ratio of the total number of amplifiable library members added to the PCR mix prior to amplification (n) relative to the total reads (x). This relationship can be readily quantified by random sampling of the sequencing data from screen-1c; enrichment values for the library members corresponding to compounds 5 and 29 are recalculated after removing increasing amounts of sequencing data (Figure 1). A 10-fold decrease in total reads (x) results in a near complete loss of signal for the library member corresponding to compound 5.

To optimize future DNA encoded library design, we have attempted to quantify the numeric library size at which signal becomes undetectable. To accomplish this, we (i) have calculated that percent yields of individual library members following a DEL screen range from 0.002 to 1% (Table 1), (ii) extrapolated that ∼1 million copies per library member are required at the outset of a screen, and (iii) from this extrapolation predict that false negative rates will begin to outweigh the benefit of increased chemical diversity at numeric library sizes >10⁸. Additionally, we have further validated the hypothesis of Clark et al.1 that larger libraries will have weaker signals, and demonstrated that larger libraries will disproportionately suffer from false negatives upon undersampling (Figures 1−3).

The above conclusions may be considered applicable to other DNA encoded library platforms with the following caveats. As is true of most reported DEL screens, our conclusions are drawn solely from sequencing data, and a lack of infinite sequencing depth may bring about inaccuracies. Our investigated data set may be an outlier, or we may employ screening protocols significantly different from others. Also, we assume that using more than currently reported amounts of library (≫10 nmoles) at the outset of a screen (to increase copies per library member) is impractical. Lastly, DNA encoded library material is not fully characterized before or after a screen, making interpretation of yields difficult.

We attempted to ameliorate the above concerns by calculating percent yields for a large number of molecules derived from different screens, targets, and libraries (Table 1); note that the screens themselves were extremely productive, with high enrichment of more than 34 structurally distinct clusters.13 Additionally, we augmented our internal data set with data reported by two external laboratories (Table S1), and observe consistent results (Figure 3). Of course, there is no expectation that our conclusions would be applicable to highly divergent methodologies, including dynamic combinatorial libraries,24 technologies offering the potential for library amplification,25,26 or one-bead one-compound approaches.27

The yields herein may be interpreted in the same way as for most other reported syntheses; a theoretical yield is determined based upon quantities of starting materials used, and the resulting yield encompasses both the synthesis and purification. However, unlike most other syntheses, desired products are synthesized within complex mixtures, quantities of starting materials are crudely estimated by spectrophotometric analysis, and purified products may only be characterized via sequencing.
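The library quantities quoted for DEL-6 (∼150 nmol) and for trillion-member libraries (>1.5 μmol) follow from the assumption of ∼1 million copies per library member at the outset of a screen. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the only inputs are the library sizes and copy number stated in the text plus the Avogadro constant.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def required_library_nmol(library_size: float, copies_per_member: float = 1e6) -> float:
    """Amount of library (nmol) needed so each member is present at the given copy number."""
    total_molecules = library_size * copies_per_member
    return total_molecules / AVOGADRO * 1e9


# Library sizes quoted in the text: DEL-6 (~81 billion) and a 1 trillion member library.
for name, size in (("DEL-6", 8.1e10), ("1 trillion member library", 1e12)):
    print(f"{name}: {required_library_nmol(size):.0f} nmol per screening condition")
```

This gives about 135 nmol for DEL-6 and about 1.7 μmol for a trillion-member library, the same order as the rounded figures in the text.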



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acscombsci.7b00023.

Python and R scripts (TXT)

Glossary, encoding schema, detailed calculations, supplemental data tables, and supplemental figures (PDF)



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone: +41616874118.

ORCID

Alexander L. Satz: 0000-0003-1284-1977

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

We thank the drug discovery teams in Roche Pharmaceutical Research and Early Development for allowing us to use their data for this analysis.



REFERENCES

(1) Clark, M. A.; Acharya, R. A.; Arico-Muendel, C. C.; Belyanskaya, S. L.; Benjamin, D. R.; Carlson, N. R.; Centrella, P. A.; Chiu, C. H.; Creaser, S. P.; Cuozzo, J. W.; Davie, C. P.; Ding, Y.; Franklin, G. J.; Franzen, K. D.; Gefter, M. L.; Hale, S. P.; Hansen, N. J. V.; Israel, D. I.; Jiang, J.; Kavarana, M. J.; Kelley, M. S.; Kollmann, C. S.; Li, F.; Lind, K.; Mataruse, S.; Medeiros, P. F.; Messer, J. A.; Myers, P.; O’Keefe, H.; Oliff, M. C.; Rise, C. E.; Satz, A. L.; Skinner, S. R.; Svendsen, J. L.; Tang, L.; van Vloten, K.; Wagner, R. W.; Yao, G.; Zhao, B.; Morgan, B. A. Design, synthesis and selection of DNA-encoded small-molecule libraries. Nat. Chem. Biol. 2009, 5, 647−654.
(2) Goodnow, R. A., Jr.; Dumelin, C. E.; Keefe, A. D. DNA-encoded chemistry: enabling the deeper sampling of chemical space. Nat. Rev. Drug Discovery 2016, 16, 131−147.
(3) Kollmann, C. S.; Bai, X.; Tsai, C.-H.; Yang, H.; Lind, K. E.; Skinner, S. R.; Zhu, Z.; Israel, D. I.; Cuozzo, J. W.; Morgan, B. A.; et al. Application of encoded library technology (ELT) to a protein-protein interaction target: Discovery of a potent class of integrin lymphocyte function-associated antigen 1 (LFA-1) antagonists. Bioorg. Med. Chem. 2014, 22, 2353−2365.
(4) Soutter, H. H.; Centrella, P.; Clark, M. A.; Cuozzo, J. W.; Dumelin, C. E.; Guie, M.-A.; Habeshian, S.; Keefe, A. D.; Kennedy, K. M.; Sigel, E. A.; Troast, D. M.; Zhang, Y.; Ferguson, A. D.; Davies, G.; Stead, E. R.; Breed, J.; Madhavapeddi, P.; Read, J. A. Discovery of cofactor-specific, bactericidal Mycobacterium tuberculosis InhA inhibitors using DNA-encoded library technology. Proc. Natl. Acad. Sci. U. S. A. 2016, 113, E7880−E7889.
(5) Mandal, P.; Berger, S. B.; Pillay, S.; Moriwaki, K.; Huang, C.; Guo, H.; Lich, J. D.; Finger, J.; Kasparcova, V.; Votta, B.; Ouellette, M.; King, B. W.; Wisnoski, D.; Lakdawala, A. S.; DeMartino, M. P.; Casillas, L. N.; Haile, P. A.; Sehon, C. A.; Marquis, R. W.; Upton, J.; Daley-Bauer, L. P.; Roback, L.; Ramia, N.; Dovey, C. M.; Carette, J. E.; Chan, F. K.; Bertin, J.; Gough, P. J.; Mocarski, E. S.; Kaiser, W. J. RIP3 induces apoptosis independent of pronecrotic kinase activity. Mol. Cell 2014, 56, 481−495.
(6) Deng, H.; Zhou, J.; Sundersingh, F. S.; Summerfield, J.; Somers, D.; Messer, J. A.; Satz, A. L.; Ancellin, N.; Arico-Muendel, C. C.; Sargent Bedard, K. L.; Beljean, A.; Belyanskaya, S. L.; Bingham, R.; Smith, S. E.; Boursier, E.; Carter, P.; Centrella, P. A.; Clark, M. A.; Chung, C. W.; Davie, C. P.; Delorey, J. L.; Ding, Y.; Franklin, G. J.; Grady, L. C.; Herry, K.; Hobbs, C.; Kollmann, C. S.; Morgan, B. A.; Pothier-Kaushansky, L. J.; Zhou, Q. Discovery, SAR, and X-ray Binding Mode Study of BCATm Inhibitors from a Novel DNA-Encoded Library. ACS Med. Chem. Lett. 2015, 6, 919−924.
(7) Encinas, L.; O’Keefe, H.; Neu, M.; Remuinan, M. J.; Patel, A. M.; Guardia, A.; et al. Encoded library technology as a source of hits for the discovery and lead optimization of a potent and selective class of bactericidal direct inhibitors of Mycobacterium tuberculosis InhA. J. Med. Chem. 2014, 57, 1276−1288.
(8) Disch, J. S.; Evindar, G.; Chiu, C. H.; Blum, C. A.; Dai, H.; Jin, L.; Schuman, E.; Lind, K. E.; Belyanskaya, S. L.; Deng, J.; Coppo, F.; Aquilani, L.; Graybill, T. L.; Cuozzo, J. W.; Lavu, S.; Mao, C.; Vlasuk, G. P.; Perni, R. B. Discovery of Thieno[3,2-d]pyrimidine-6-carboxamides as potent inhibitors of SIRT1, SIRT2, and SIRT3. J. Med. Chem. 2013, 56, 3666−3679.
(9) Gilmartin, A. G.; Faitg, T. H.; Richter, M.; Groy, A.; Seefeld, M. A.; Darcy, M. G.; Peng, X.; Federowicz, K.; Yang, J.; Zhang, S. Y.; Minthorn, E.; Jaworski, J. P.; Schaber, M.; Martens, S.; McNulty, D. E.; Sinnamon, R. H.; Zhang, H.; Kirkpatrick, R. B.; Nevins, N.; Cui, G.; Pietrak, B.; Diaz, E.; Jones, A.; Brandt, M.; Schwartz, B.; Heerding, D. A.; Kumar, R. Allosteric Wip1 phosphatase inhibition through flap-subdomain interaction. Nat. Chem. Biol. 2014, 10, 181−187.
(10) Litovchick, A.; Dumelin, C. E.; Habeshian, S.; Gikunju, D.; Guie, M.-A.; Centrella, P.; Zhang, Y.; Sigel, E. A.; Cuozzo, J. W.; Keefe, A. D.; Clark, M. A. Encoded library synthesis using chemical ligation and the discovery of sEH inhibitors from a 334-million member library. Sci. Rep. 2015, 5, 10916.
(11) Wu, Z.; Graybill, T. L.; Zeng, X.; Platchek, M.; Zhang, J.; Bodmer, V. Q.; Wisnoski, D. D.; Deng, J.; Coppo, F. T.; Yao, G.; Tamburino, A.; Scavello, G.; Franklin, G. J.; Mataruse, S.; Bedard, K. L.; Ding, Y.; Chai, J.; Summerfield, J.; Centrella, P. A.; Messer, J. A.; Pope, A. J.; Israel, D. I. Cell-Based Selection Expands the Utility of DNA-Encoded Small-Molecule Library Technology to Cell Surface Drug Targets: Identification of Novel Antagonists of the NK3 Tachykinin Receptor. ACS Comb. Sci. 2015, 17, 722−731.
(12) Concha, N.; Huang, J.; Bai, X.; Benowitz, A.; Brady, P.; Grady, L. C.; Kryn, L. H.; Holmes, D.; Ingraham, K.; Jin, Q.; Pothier Kaushansky, L. P.; McCloskey, L.; Messer, J. A.; O’Keefe, H.; Patel, A.; Satz, A. L.; Sinnamon, R. H.; Schneck, J.; Skinner, S. R.; Summerfield, J.; Taylor, A.; Taylor, J. D.; Evindar, G.; Stavenger, R. A. Discovery and Characterization of a Class of Pyrazole Inhibitors of Bacterial Undecaprenyl Pyrophosphate Synthase. J. Med. Chem. 2016, 59 (15), 7299−7304.
(13) Eidam, O.; Satz, A. L. Analysis of the productivity of DNA encoded libraries. MedChemComm 2016, 7, 1323−1331.
(14) Cuozzo, J. W.; Centrella, P. A.; Gikunju, D.; Habeshian, S.; Hupp, C. D.; Keefe, S. A.; Sigel, E. A.; Soutter, H. H.; Thomson, H. A.; Zhang, Y.; Clark, M. A. Discovery of a Potent BTK Inhibitor with a Novel Binding Mode by Using Parallel Selections with a DNA-Encoded Chemical Library. ChemBioChem 2017, DOI: 10.1002/cbic.201600573.
(15) Griffiths, A. D.; Williams, S. C.; Hartley, O.; Tomlinson, I. M.; Waterhouse, P.; Crosby, W. L. Isolation of high affinity human antibodies directly from large synthetic repertoires. EMBO J. 1994, 13 (14), 3245−3260.
(16) Mullard, A. DNA-encoded drug libraries come of age. Nat. Biotechnol. 2016, 34 (5), 450−451.
(17) Mullard, A. DNA tags help the hunt for drugs. Nature 2016, 530 (5), 367−369.
(18) Henry (http://math.stackexchange.com/users/6460/henry) probability distribution of coverage of a set after X independently, randomly selected members of the set (version 2011-04-13). http://math.stackexchange.com/q/32800 (accessed March 2, 2017).
(19) Satz, A. L. DNA Encoded Library Selections and Insights Provided by Computational Simulations. ACS Chem. Biol. 2015, 10, 2237−2245.
(20) Satz, A. L. Simulated Screens of DNA Encoded Libraries: The Potential Influence of Chemical Synthesis Fidelity on Interpretation of Structure-Activity Relationships. ACS Comb. Sci. 2016, 18 (7), 415−424.
(21) Weber, M.; Bujak, E.; Putelli, A.; Villa, A.; Matasci, M.; Gualandi, L.; Hemmerle, T.; Wulhfard, S.; Neri, D. A Highly Functional Synthetic Phage Display Library Containing over 40 Billion Human Antibody Clones. PLoS One 2014, 9 (6), e100000.
(22) Mannocci, L.; Melkko, S.; Buller, F.; Molnàr, I.; Gapian Bianké, J.-P.; Dumelin, C. E.; Scheuermann, J.; Neri, D. Isolation of Potent and Specific Trypsin Inhibitors from a DNA-Encoded Chemical Library. Bioconjugate Chem. 2010, 21, 1836−1841.
(23) Franzini, R. M.; Ekblad, T.; Zhong, N.; Wichert, M.; Decurtins, W.; Nauer, A.; Zimmermann, M.; Samain, F.; Scheuermann, J.; Brown, P. J.; Hall, J.; Gräslund, S.; Schüler, H.; Neri, D. Identification of structure-activity relationships from screening a structurally compact DNA-encoded chemical library. Angew. Chem., Int. Ed. 2015, 54, 3927−3931.
(24) Reddavide, F. V.; Lin, W.; Lehnert, S.; Zhang, Y. DNA-Encoded Dynamic Combinatorial Chemical Libraries. Angew. Chem., Int. Ed. 2015, 54, 7924−7928.
(25) Weisinger, R. M.; Wrenn, S. J.; Harbury, P. B. Highly Parallel Translation of DNA Sequences into Small Molecules. PLoS One 2012, 7, e28056.
(26) Krusemark, C. J.; Tilmans, N. P.; Brown, P. O.; Harbury, P. B. Directed Chemical Evolution with an Outsized Genetic Code. PLoS One 2016, 11 (8), e0154765.
(27) MacConnell, A. B.; McEnaney, P. J.; Cavett, V. J.; Paegel, B. M. DNA-encoded solid-phase synthesis: encoding language design and complex oligomer library synthesis. ACS Comb. Sci. 2015, 17 (9), 518−534.
