
Letter

Analysis of Current DNA Encoded Library Screening Data Indicates Higher False Negative Rates for Numerically Larger Libraries

Alexander L. Satz, Remo Hochstrasser, and Ann C. Petersen

ACS Comb. Sci., Just Accepted Manuscript • DOI: 10.1021/acscombsci.7b00023 • Publication Date (Web): 13 Mar 2017

Downloaded from http://pubs.acs.org on March 14, 2017



Keywords: DNA encoded libraries, screening, drug discovery, molecular diversity, combinatorial chemistry.

Alexander L. Satz, Remo Hochstrasser, and Ann C. Petersen

Roche Pharmaceutical Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Grenzacherstrasse 124, CH-4070 Basel, Switzerland

[email protected], +41616874118


Abstract: To optimize future DNA encoded library design, we have attempted to quantify the library size at which signal becomes undetectable. To accomplish this, we (i) calculated that percent yields of individual library members following a screen range from 0.002-1%, (ii) extrapolated that ~1 million copies per library member are required at the outset of a screen, and (iii) from this extrapolation predict that false negative rates will begin to outweigh the benefit of increased diversity at library sizes >10^8. The above analysis is based upon a large internal data set comprising multiple screens, targets, and libraries; we also augmented our internal data with all currently available literature data. In theory, high false negative rates may be overcome by employing larger amounts of library; however, we argue that using more than currently reported amounts of library (>>10 nmol) is impractical. The above conclusions may be generally applicable to other DNA encoded library platforms, particularly those platforms which do not allow for library amplification.


DNA encoded libraries (DELs) are commonly used to discover small molecules that interfere with the activity of pharmaceutically relevant proteins (1-14). DELs consist of complex mixtures in which each library member possesses a small-molecule moiety covalently linked to a DNA oligomer. The sequence of the DNA oligomer encodes the chemical structure of the attached small molecule. The contents of a DEL are readily determined via high-throughput sequencing, and libraries may be rapidly and inexpensively screened against protein targets to find small-molecule ligands.

A correlation between library size and productivity has been reported for combinatorial phage display libraries (15), and it is tempting to believe that a similar correlation might exist for DNA encoded libraries. Extremely large DELs are readily produced via split-and-pool chemistry, and libraries containing a trillion unique chemical structures have been reported (2,16-17). Our laboratory recently reported the successful screening of 16 DELs (numeric sizes ranged from 10^6 to 10^11 small-molecule structures per DEL) against 2 different protein targets, resulting in the discovery of 34 structurally distinct clusters; however, we observed no correlation between library size and productivity (13) or ligand potency (Figures S1-S2). In the seminal report by Clark et al. (1), signal from an 800 million member library was observed to be ~100-fold weaker than signal from a ~100-fold numerically smaller library; it was therefore hypothesized that larger DNA encoded libraries produce weaker signals. However, the library size at which signal becomes undetectable was never determined.

Despite the similarities between a DEL screen and a simple purification, no literature report explicitly provides yields of library members following synthesis and screening. However, rough estimates of yields may be calculated from reported data.
For instance, 35 million copies of a positive control were spiked into a library and screened against a protein target, after which the positive control was observed 971 times (1); simple division gives a recovery of ~0.003%. Extremely large libraries (>10^10) possess only 1000s of copies of each unique library member at the outset of a screen, and thus a low percent recovery would explain their underwhelming productivity. A survey of relevant literature reveals crudely estimated yields ranging from 0.00005-0.08% (Table S1) (1,10,14). These estimates indicate that current library production and screening protocols result in low yields; however, they may be inaccurate because they are based solely on observed sequences. A more accurate method for estimating yields is described herein and applied to a data set previously reported by our laboratory (13). Employing a combination of internal and literature data, we attempt to validate and quantify the hypothesis of Clark et al. that larger libraries provide weaker signals. Note that the supporting information includes a glossary of terms (Section-1), a description of library encoding (Section-2), and a protocol for DNA encoded library affinity screens (Section-3).
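The "simple division" behind the ~0.003% recovery can be written out as a short Python sketch. The two input numbers come from the text; the function name and the 1000-copy illustration (standing in for "1000s of copies") are assumptions made here for illustration only.

```python
# Rough percent recovery from the spike-in experiment quoted in the text.
copies_spiked = 35_000_000   # positive-control copies added to the library
times_observed = 971         # times the positive control was seen after the screen


def percent_recovery(observed: float, added: float) -> float:
    """Percent of the added copies that were later observed by sequencing."""
    return 100.0 * observed / added


recovery = percent_recovery(times_observed, copies_spiked)
print(f"estimated recovery: {recovery:.4f}%")  # about 0.003%

# Illustration (assumed value): a member that starts with only 1000 copies
# would, at the same recovery, be expected to yield far less than one molecule.
expected_copies = 1000 * recovery / 100.0
print(f"expected recovered copies from 1000 starting copies: {expected_copies:.3f}")
```

At this recovery, a library member present at only a few thousand copies would on average contribute less than one observable molecule, which is the concern raised for libraries larger than 10^10.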


The theoretical quantity of the ith library member added at the outset of a screen is calculated as described in the supporting information Section-4. In the case of compound 33 (Table 1), which is derived from DEL-7, 3.5 million copies per DEL-7 library member were added at the outset of screen-3. The quantity of each library member recovered after the screen is determined by employing EQ-1 (18), where m is the number of unique DNA sequences observed via sequencing (referred to as unique reads), x is the total number of sequence reads (referred to as total reads), and n is the number of unique and amplifiable DNA sequences in the PCR mix prior to amplification (EQ-1 is validated in supporting information Section-5). Due to inclusion of a randomized sequence (1) during library construction, only an insignificant percentage of library members collected following a screen will possess identical sequences.

$$ m = n\left[1 - \left(1 - \frac{1}{n}\right)^{x}\right] \qquad \text{(EQ-1)} $$
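EQ-1 has a simple reading: if each of the x reads is drawn uniformly at random (with replacement) from n distinct templates, a given template is missed by every read with probability (1 - 1/n)^x, so the expected number of templates seen at least once is n[1 - (1 - 1/n)^x]. The Python sketch below evaluates that expression; the function name is hypothetical, and the uniform-sampling assumption is part of the model rather than something this sketch verifies.

```python
import math


def expected_unique_reads(n: float, x: float) -> float:
    """EQ-1: expected unique sequences m after x reads drawn from n templates."""
    # -expm1(x * log1p(-1/n)) equals 1 - (1 - 1/n)**x, but stays accurate
    # when n is in the millions and 1/n is tiny.
    return n * -math.expm1(x * math.log1p(-1.0 / n))


# A single read always yields exactly one unique sequence.
print(expected_unique_reads(n=10, x=1))

# With far more reads than templates (x >> n), nearly every template is seen.
print(expected_unique_reads(n=100, x=10_000))

# With far fewer reads than templates (x << n), almost every read is unique.
print(expected_unique_reads(n=1e9, x=1e6))
```

The two limits matter later in the text: when x >> n the unique-read count saturates at n, and when x << n nearly all reads are unique and m carries little information about n.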

The fraction of non-unique sequences observed via sequencing (1 - m/x) can be used to estimate the total number of amplifiable DNA sequences in the PCR mix prior to amplification (see supporting information Section-6). In the case of screen-3, 21’318’804 unique reads (m) and 25’211’327 total reads (x) are observed. The ratio m/x is employed to estimate that 72 million amplifiable DNA sequences were added to the PCR mix prior to amplification. The precision of the estimate increases as the ratio m/x decreases (Figure S3), and the estimate is expected to be near perfect when x >> n. Estimates correlate with experimental (non-quantitative) PCR results (Figure S4) and correctly rank-order samples with 2-fold differences in DNA quantity (Figure S5). Note that the accuracy of the estimated yields provided in Table 1 is entirely dependent upon the observed m_i values and therefore the quality of the sequencing data.

In the case of screen-3, the DNA sequence corresponding to compound 33 had 38 unique reads (m_33 = 38) (Table 1, row 1). The total number of amplifiable library members corresponding to compound 33 is then back-calculated (according to EQ-1) to be 110 molecules (n_33 = 110) (Table 1, row 1). Calculations are detailed in supporting information Section-7. The percent yield of the library member corresponding to compound 33 is 0.004% (110 divided by 3.5 million). Percent yields for all 13 non-truncated compounds reported by Eidam & Satz (13) are listed in Table 1. The percent yields listed in Table 1 are less than 100% for numerous reasons, including synthetic yield during multi-step small-molecule synthesis and/or DNA degradation during library production (19-20).
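One way to see how n can be recovered from m and x is to invert EQ-1 numerically. The sketch below does this by bisection for the screen-3 read counts quoted above. It is an illustration only: the supporting-information procedure (Section-6) is not reproduced here and may differ in detail, and the function names are hypothetical.

```python
import math


def expected_unique_reads(n: float, x: float) -> float:
    """EQ-1: expected unique sequences m after x reads drawn from n templates."""
    return n * -math.expm1(x * math.log1p(-1.0 / n))


def estimate_templates(m: float, x: float, tol: float = 1.0) -> float:
    """Solve EQ-1 for n by bisection, given unique reads m and total reads x."""
    if not 1 < m < x:
        raise ValueError("need 1 < m < x for EQ-1 to have a finite solution")
    lo = hi = float(m)            # n can never be smaller than m
    # For fixed x, EQ-1 rises with n toward x, so double hi until it brackets m.
    while expected_unique_reads(hi, x) < m:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_unique_reads(mid, x) < m:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Screen-3 read counts from the text.
n_hat = estimate_templates(m=21_318_804, x=25_211_327)
print(f"estimated amplifiable templates: {n_hat / 1e6:.0f} million")
```

Run on the screen-3 numbers, this should land in the low 70 millions, close to (though not necessarily identical to) the 72 million quoted in the text.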


Table 1. The yield of a library member that binds the protein target ranges from 0.002-1% following 2 rounds of screening (see supporting information Section-1 for a glossary of terms).

Columns: (1) Cmpd ID (a); (2) pIC50 (a); (3) DEL size, in millions (b); (4) DEL ID (a,b); (5) theoretical quantity of library member, in molecules (millions) (c); (6) Screen ID (d); (7) unique reads, m_i (e); (8) number of library members after screen, n_i (f); (9) % yield (g).

(1)    (2)    (3)    (4)    (5)     (6)    (7)    (8)     (9)
33     5.6    1.2    7      3.5     3      38     110     0.004
37     5.6    3.8    16     1       3      42     122     0.01
39     7.2    3.8    16     1       3      79     228     0.02
41     5.2    3.8    16     1       3      10     29      0.002
42     6.6    3.8    16     1       3      37     107     0.01
43     5.6    9.7    11     0.5     3      115    331     0.06
45     5.4    1.2    7      3.5     3      26     76      0.002
46     5.9    3.8    16     1       3      33     96      0.01
47     6.7    470    5      0.01    3      0      0       0
47     6.7    470    5      0.05    2      23     23      0.04
51     5.4    3.8    16     1       3      14     41      0.004
55     6.1    3.8    16     1       3      50     145     0.01
5      9      470    5      0.1     1      57     913     1.0
19     7.6    100    1      0.35    1      92     1873    0.6

(a) Compounds have been previously reported by Eidam & Satz (13); the same ID numbers are used.
(b) Generic schema provided in Table S2. Values are rounded.
(c) Quantity of material added to the PCR mix prior to amplification. See supporting information Section-4 for exemplar calculation. Values are rounded.
(d) See Table S3.
(e) Experimentally observed unique reads for the particular (ith) library member.
(f) Total number of amplifiable ith library members in the PCR mix prior to amplification (this value is calculated as described in the supporting information Section-7).
(g) 100*(column-8/column-5). Values are rounded.
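Footnote g defines the percent yield as 100 × (column 8 / column 5). A minimal Python sketch of that arithmetic, applied to three rows of Table 1, is given below. Because the table inputs are themselves rounded, recomputed values are only expected to approximate the listed yields, not to match them digit for digit; the function and variable names are hypothetical.

```python
# (compound ID, screen ID, theoretical quantity in millions of molecules,
#  library members recovered n_i), copied from Table 1.
rows = [
    ("45", 3, 3.5, 76),
    ("39", 3, 1.0, 228),
    ("5", 1, 0.1, 913),
]


def percent_yield(n_recovered: float, theoretical_millions: float) -> float:
    """Footnote g: 100 * (column 8 / column 5), with column 5 given in millions."""
    return 100.0 * n_recovered / (theoretical_millions * 1e6)


yields = {cmpd: percent_yield(n_i, qty) for cmpd, _screen, qty, n_i in rows}
for cmpd, value in yields.items():
    print(f"compound {cmpd}: {value:.4f}% yield")
```

For these rows the recomputed values fall near the listed 0.002%, 0.02%, and 1.0%.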


The numerically largest library employed in screen-3 was not productive; DEL-6 has a numeric size of ~81 billion (13). Each DEL-6 library member possessed 10^2 molecules at the outset of the screen (a theoretical yield of 4.1 × 10^12 total molecules per library). A similar ratio (molecules divided by numeric size) is commonly employed in phage display; for instance, Weber et al. (21) screened a 40 billion member combinatorial phage display library starting with >10^12 phage particles. However, in phage display the library is amplified following each round of screening. In the case of DEL-6, a library member that binds the protein target is estimated to result in only ~0.002-1 amplifiable molecule following two rounds of screening (based on the range of yields listed in Tables 1 and S1). This implies that DEL-6 requires ~150 nmol of library per screening condition to detect a library member that binds the protein target (assuming ~1 million copies per library member are required at the outset of the screen). However, employing such a quantity of library is challenging, as current library production protocols yield only 3-5 µmol of DEL (1,10). The reported 1 trillion member libraries (2,16-17) are predicted to require >1.5 µmol of DEL per screening condition. Library pools used herein require two successive rounds of screening to observe acceptable signal-to-noise; in contrast, the literature indicates that screens employing 10’000-fold smaller amounts of library require only a single round (22-23). Employing extremely large quantities of library at the outset of a screen may be challenging, as yields of the desired library members that bind the target protein may decrease with each additional round of screening.

Screens 1a, 1b, and 1c were run successively, in a manner previously reported (112,13). Screens 1a, 1b, and 1c detect 30, 13’441, and 4’566 library members with enrichment >10, respectively (Table 2).
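As a check on the library amounts quoted above for DEL-6 and for trillion-member libraries, the required quantity follows from the numeric library size, the assumed ~1 million copies per member, and Avogadro's number. The Python sketch below carries out that conversion; the function name is hypothetical, and the results are order-of-magnitude checks rather than the authors' own calculation.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def library_amount_nmol(numeric_size: float, copies_per_member: float) -> float:
    """Amount of library (nmol) needed to supply every member at the given copy number."""
    total_molecules = numeric_size * copies_per_member
    return total_molecules / AVOGADRO * 1e9


# DEL-6: ~81 billion members at ~1 million copies each.
del6_nmol = library_amount_nmol(81e9, 1e6)
print(f"DEL-6: {del6_nmol:.0f} nmol")

# A 1 trillion member library at the same copy number.
trillion_nmol = library_amount_nmol(1e12, 1e6)
print(f"1 trillion members: {trillion_nmol / 1000:.2f} µmol")
```

The first value comes out at roughly 130-140 nmol, the same order as the ~150 nmol quoted, and the second exceeds 1.5 µmol, which is a substantial fraction of the 3-5 µmol that a library production run is reported to yield.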
The high false negative rate for Screen-1a may be attributed to low sequencing depth (x/n ~ 0.01) due to a large quantity of recovered library. Each successive screen (1b-1c) results in roughly 10-fold less recovered library (Table 2), resulting in vastly fewer false negatives for Screen-1b. Screen-1c, however, has more false negatives than Screen-1b; as suggested above, we assume that loss of material due to repeated handling and purifications is unavoidable.

We attempt to further validate the hypothesis of Clark et al. (1) that larger libraries provide weaker signals. First, we directly compare library members from the same library and screen. Compounds 5 and 29 are both derived from DEL-5 Screen-1, and have reported pIC50 values of 9 and 7, respectively (information regarding libraries and screens is provided in Tables S2 and S3, and compound properties and biological activities were previously reported by Eidam & Satz (13)). The library member corresponding to 29 is a truncate and has a ~500-fold greater theoretical quantity at the outset of the screen, and a corresponding ~500-fold greater enrichment, than 5 (Figure 1). Figures 2 and S9 compare enrichments between all observed parent and truncated library members for screen-1 and screen-2, respectively; the numerically smaller truncated libraries provide stronger signals than the larger parent libraries. Second, we plot maximum observed enrichment for all screening data (including relevant literature data) versus library numeric size; a moderate correlation is apparent despite the data arising from numerous laboratories, libraries, and targets (Figure 3). (Note that our rationale for using ‘maximum observed enrichment’ as the dependent variable is provided in the supporting information Section-9.) Observed enrichment is also influenced by the ratio of the total number of amplifiable library members added to the PCR mix prior to amplification (n) relative to the total reads (x).
This relationship can be readily quantified by random sampling of the sequencing data from Screen-1c; enrichments for the library members corresponding to compounds 5 and 29 are recalculated after removing increasing amounts of sequencing data (Figure 1). A 10-fold decrease in total reads (x) results in a near complete loss of signal for the library member corresponding to compound 5.

To optimize future DNA encoded library design, we have attempted to quantify the numeric library size at which signal becomes undetectable. To accomplish this, we (i) calculated that percent yields of individual library members following a DEL screen range from 0.002-1% (Table 1), (ii) extrapolated that ~1 million copies per library member are required at the outset of a screen, and (iii) from this extrapolation predict that false negative rates will begin to outweigh the benefit of increased chemical diversity at numeric library sizes >10^8. Additionally, we have further validated the hypothesis of Clark et al. (1) that larger libraries will have weaker signals, and demonstrated that larger libraries will disproportionately suffer from false negatives upon undersampling (Figures 1-3).

The above conclusions may be considered applicable to other DNA encoded library platforms with the following caveats. As is true of most reported DEL screens, our conclusions are drawn solely from sequencing data, and finite sequencing depth may introduce inaccuracies. Our investigated data set may be an outlier, or we may employ screening protocols significantly different from others. Also, we assume that using more than currently reported amounts of library (>>10 nmol) at the outset of a screen (to increase copies per library member) is impractical. Lastly, DNA encoded library material is not fully characterized before or after a screen, making interpretation of yields difficult.
We attempted to ameliorate the above concerns by calculating percent yields for a large number of molecules derived from different screens, targets, and libraries (Table 1); note that the screens themselves were extremely productive, with high enrichment of more than 34 structurally distinct clusters (13). Additionally, we augmented our internal data set with data reported by two external laboratories (Table S1), and observe consistent results (Figure 3). Of course, there is no expectation that our conclusions would be applicable to highly divergent methodologies, including dynamic combinatorial libraries (24), technologies offering the potential for library amplification (25-26), or one-bead one-compound approaches (27).

The yields herein may be interpreted in the same way as for most other reported syntheses; a theoretical yield is determined based upon quantities of starting materials used, and the resulting yield encompasses both the synthesis and purification. However, unlike most other syntheses, desired products are synthesized within complex mixtures, quantities of starting materials are crudely estimated by spectrophotometric analysis, and purified products may only be characterized via sequencing.


Table 2. Sequencing output for Screens 1a-1c (see supporting information Section-1 for a glossary of terms).

Columns: (1) Screen ID; (2) rounds of screening; (3) total reads (x), ×10^-6 (a); (4) unique reads (m), ×10^-6 (a); (5) total number of amplifiable library members added to the PCR mix prior to amplification (n), ×10^-6 (b); (6) % non-unique sequences (1 - m/x); (7) library members with enrichment >10 (c); (8) cycles of PCR (d).

(1)    (2)    (3)    (4)    (5)      (6)    (7)       (8)
1a     1      36     35     2’400    1      30        ~13
1b     2      41     39     330      6      13’441    ~24
1c     3      38     17     22       55     4’566     30

(a) Experimentally observed total and unique reads.
(b) This value is calculated as described in supporting information Section-6.
(c) Number of library members detected with enrichment >10 (see supporting information Section-8 for the definition of enrichment).
(d) Cycles of PCR required for the collected library material (following the screen) to be observed via electrophoresis on an ethidium bromide gel (see supporting information Section-3).
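The Table 2 entries can be cross-checked against EQ-1. The Python sketch below computes the sequencing depth x/n, the non-unique fraction 1 - m/x, and the unique-read count predicted from n and x for each screen. It uses the large-n limit (1 - 1/n)^x ≈ exp(-x/n), which is accurate for n in the millions; the variable names are hypothetical, and because the table values are rounded only approximate agreement should be expected.

```python
import math

# (screen, total reads x, unique reads m, amplifiable templates n), all in millions,
# copied from Table 2.
table2 = [
    ("1a", 36, 35, 2400),
    ("1b", 41, 39, 330),
    ("1c", 38, 17, 22),
]

results = {}
for screen, x, m, n in table2:
    depth = x / n                              # sequencing depth x/n
    non_unique_pct = 100.0 * (1.0 - m / x)     # % non-unique sequences
    predicted_m = n * -math.expm1(-x / n)      # EQ-1 in the large-n limit
    results[screen] = (depth, non_unique_pct, predicted_m)
    print(f"{screen}: depth={depth:.3f}  non-unique={non_unique_pct:.1f}%  "
          f"predicted m={predicted_m:.1f} million (observed {m})")
```

Screen-1a is sampled at a depth of only about 0.015 reads per template, consistent with the low sequencing depth cited as the cause of its high false negative rate, whereas Screen-1c is sampled at more than one read per template.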


Figure 1. A library member with ~500-fold greater theoretical yield possesses ~500-fold greater enrichment. Enrichment of two DEL-5 (Table S2) library members from Screen-1c (Table S3), as a function of total reads (x). The value of total reads is reduced via random sampling of the experimental sequencing data prior to recalculation of enrichment values (see R script ‘sample_sequencing_data.R’ in supporting information).
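The random-sampling step described in the caption can be illustrated with a small Python sketch. This is not the authors' R script: the read counts are invented toy data, the function name is hypothetical, and the enrichment calculation itself (defined in supporting information Section-8) is not reproduced. The sketch only shows how read counts for an abundant and a rare library member behave as reads are discarded at random.

```python
import random
from collections import Counter


def downsample_counts(read_counts: dict, keep: int, seed: int = 0) -> Counter:
    """Keep `keep` reads chosen at random without replacement; recount per member."""
    reads = [member for member, count in read_counts.items() for _ in range(count)]
    if keep > len(reads):
        raise ValueError("cannot keep more reads than were sequenced")
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    return Counter(rng.sample(reads, keep))


# Toy data (assumed): one abundant member, one rare member, and background reads.
toy_counts = {"abundant_member": 500, "rare_member": 5, "background": 9495}
total = sum(toy_counts.values())

for fraction in (1.0, 0.1, 0.01):
    sub = downsample_counts(toy_counts, keep=round(total * fraction))
    print(f"{fraction:>5}: abundant={sub['abundant_member']}  rare={sub['rare_member']}")
```

As the retained fraction shrinks, the rare member's count approaches zero well before the abundant member's does, which is the qualitative behavior shown for compounds 5 and 29.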


Figure 2. Histogram comparing phosphodiesterase (screen-1) enrichment between truncated and non-truncated (parent) library members; truncated libraries are numerically smaller and provide a stronger signal than larger parent libraries. Note that no parent library members possess enrichment >100. Aggregation of data from DELs -1,-2,-4, and -5; only library members with enrichment > 25 are included. Truncated libraries are generally 2-3 orders of magnitude numerically smaller than the non-truncated (parent) libraries from which they are derived. A similar result is observed for screen-2, which employed a kinase target protein (Figure S11).


Figure 3. Maximum observed enrichment correlates with library size. Ordinary least squares results in the equation y = -0.42x + 4.77. The maximum observed enrichment (Log10) is simply the enrichment of the most enriched library member for any given library and screen. For screens 1-3, all libraries are included that possess at least one library member with enrichment >4. All applicable literature data are included; for literature data, enrichment is assumed equivalent to the number of times the most enriched library member was observed, as listed in Table S1, column 4.
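To get a feel for the fitted line, the Python sketch below evaluates y = -0.42x + 4.77 at a few library sizes. It assumes that x is log10 of the numeric library size and y is log10 of the maximum observed enrichment; the caption states the Log10 scale only for enrichment, so the scale of x is an assumption made here. The function name is hypothetical, and the fit describes a moderate correlation across heterogeneous data, so the outputs are rough trend values rather than predictions for any particular screen.

```python
import math

SLOPE = -0.42      # from the Figure 3 caption
INTERCEPT = 4.77   # from the Figure 3 caption


def fitted_max_enrichment(library_size: float) -> float:
    """Trend-line value of maximum enrichment, assuming x = log10(library size)."""
    log_enrichment = SLOPE * math.log10(library_size) + INTERCEPT
    return 10.0 ** log_enrichment


for size in (1e6, 1e8, 1e10):
    print(f"library size {size:.0e}: fitted maximum enrichment ~{fitted_max_enrichment(size):.0f}")
```

Under these assumptions the trend line drops by a factor of about 7 in maximum enrichment for every 100-fold increase in library size.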


Supporting Information: Glossary, encoding schema, detailed calculations, supplemental data tables, supplemental figures, Python and R scripts.

Acknowledgements: We thank the drug discovery teams in Roche Pharmaceutical Research and Early Development for allowing us to use their data for this analysis.


References

1. Clark, M.A.; Acharya, R.A.; Arico-Muendel, C.C.; Belyanskaya, S.L.; Benjamin, D.R.; Carlson, N.R.; Centrella, P.A.; Chiu, C.H.; Creaser, S.P.; Cuozzo, J.W.; Davie, C.P.; Ding, Y.; Franklin, G.J.; Franzen, K.D.; Gefter, M.L.; Hale, S.P.; Hansen, N.J.V.; Israel, D.I.; Jiang, J.; Kavarana, M.J.; Kelley, M.S.; Kollmann, C.S.; Li, F.; Lind, K.; Mataruse, S.; Medeiros, P.F.; Messer, J.A.; Myers, P.; O’Keefe, H.; Oliff, M.C.; Rise, C.E.; Satz, A.L.; Skinner, S.R.; Svendsen, J.L.; Tang, L.; van Vloten, K.; Wagner, R.W.; Yao, G.; Zhao, B.; Morgan, B.A. Design, synthesis and selection of DNA-encoded small-molecule libraries. Nat. Chem. Biol. 2009, 5, 647-654.
2. Goodnow, R.A., Jr.; Dumelin, C.E.; Keefe, A.D. DNA-encoded chemistry: enabling the deeper sampling of chemical space. Nat. Rev. Drug Discovery 2016, 16, 131-147.
3. Kollmann, C.S.; Bai, X.; Tsai, C-H.; Yang, H.; Lind, K.E.; Skinner, S.R.; Zhu, Z.; Israel, D.I.; Cuozzo, J.W.; Morgan, B.A. Application of encoded library technology (ELT) to a protein-protein interaction target: Discovery of a potent class of integrin lymphocyte function-associated antigen 1 (LFA-1) antagonists. Bioorg. Med. Chem. 2014, 22, 2353-2365.
4. Soutter, H.H.; Centrella, P.; Clark, M.A.; Cuozzo, J.W.; Dumelin, C.E.; Guie, M.-A.; Habeshian, S.; Keefe, A.D.; Kennedy, K.M.; Sigel, E.A.; Troast, D.M.; Zhang, Y.; Ferguson, A.D.; Davies, G.; Stead, E.R.; Breed, J.; Madhavapeddi, P.; Read, J.A. Discovery of cofactor-specific, bactericidal Mycobacterium tuberculosis InhA inhibitors using DNA-encoded library technology. Proc. Natl. Acad. Sci. U. S. A. 2016, 113, E7880-E7889.
5. Mandal, P.; Berger, S.B.; Pillay, S.; Moriwaki, K.; Huang, C.; Guo, H.; Lich, J.D.; Finger, J.; Kasparcova, V.; Votta, B.; Ouellette, M.; King, B.W.; Wisnoski, D.; Lakdawala, A.S.; DeMartino, M.P.; Casillas, L.N.; Haile, P.A.; Sehon, C.A.; Marquis, R.W.; Upton, J.; Daley-Bauer, L.P.; Roback, L.; Ramia, N.; Dovey, C.M.; Carette, J.E.; Chan, F.K.; Bertin, J.; Gough, P.J.; Mocarski, E.S.; Kaiser, W.J. RIP3 induces apoptosis independent of pronecrotic kinase activity. Mol. Cell 2014, 56, 481-495.
6. Deng, H.; Zhou, J.; Sundersingh, F.S.; Summerfield, J.; Somers, D.; Messer, J.A.; Satz, A.L.; Ancellin, N.; Arico-Muendel, C.C.; Bedard, K.L.; Beljean, A.; Belyanskaya, S.L.; Bingham, R.; Smith, S.E.; Boursier, E.; Carter, P.; Centrella, P.A.; Clark, M.A.; Chung, C.W.; Davie, C.P.; Delorey, J.L.; Ding, Y.; Franklin, G.J.; Grady, L.C.; Herry, K.; Hobbs, C.; Kollmann, C.S.; Morgan, B.A.; Pothier-Kaushansky, L.J.; Zhou, Q. Discovery, SAR, and X-ray Binding Mode Study of BCATm Inhibitors from a Novel DNA-Encoded Library. ACS Med. Chem. Lett. 2015, 6, 919-924.
7. Encinas, L.; O’Keefe, H.; Neu, M.; Remuinan, M.J.; Patel, A.M.; Guardia, A. Encoded library technology as a source of hits for the discovery and lead optimization of a potent and selective class of bactericidal direct inhibitors of Mycobacterium tuberculosis InhA. J. Med. Chem. 2014, 57, 1276-1288.
8. Disch, J.S.; Evindar, G.; Chiu, C.H.; Blum, C.A.; Dai, H.; Jin, L.; Schuman, E.; Lind, K.E.; Belyanskaya, S.L.; Deng, J.; Coppo, F.; Aquilani, L.; Graybill, T.L.; Cuozzo, J.W.; Lavu, S.; Mao, C.; Vlasuk, G.P.; Perni, R.B. Discovery of Thieno[3,2-d]pyrimidine-6-carboxamides as potent inhibitors of SIRT1, SIRT2, and SIRT3. J. Med. Chem. 2013, 56, 3666-3679.
9. Gilmartin, A.G.; Faitg, T.H.; Richter, M.; Groy, A.; Seefeld, M.A.; Darcy, M.G.; Peng, X.; Federowicz, K.; Yang, J.; Zhang, S.Y.; Minthorn, E.; Jaworski, J.P.; Schaber, M.; Martens, S.; McNulty, D.E.; Sinnamon, R.H.; Zhang, H.; Kirkpatrick, R.B.; Nevins, N.; Cui, G.; Pietrak, B.; Diaz, E.; Jones, A.; Brandt, M.; Schwartz, B.; Heerding, D.A.; Kumar, R. Allosteric Wip1 phosphatase inhibition through flap-subdomain interaction. Nat. Chem. Biol. 2014, 10, 181-187.
10. Litovchick, A.; Dumelin, C.E.; Habeshian, S.; Gikunju, D.; Guie, M-A.; Centrella, P.; Zhang, Y.; Sigel, E.A.; Cuozzo, J.W.; Keefe, A.D.; Clark, M.A. Encoded library synthesis using chemical ligation and the discovery of sEH inhibitors from a 334-million member library. Scientific Reports 2015, 5, 10916.
11. Wu, Z.; Graybill, T.L.; Zeng, X.; Platchek, M.; Zhang, J.; Bodmer, V.Q.; Wisnoski, D.D.; Deng, J.; Coppo, F.T.; Yao, G.; Tamburino, A.; Scavello, G.; Franklin, G.J.; Mataruse, S.; Bedard, K.L.; Ding, Y.; Chai, J.; Summerfield, J.; Centrella, P.A.; Messer, J.A.; Pope, A.J.; Israel, D.I. Cell-Based Selection Expands the Utility of DNA-Encoded Small-Molecule Library Technology to Cell Surface Drug Targets: Identification of Novel Antagonists of the NK3 Tachykinin Receptor. ACS Comb. Sci. 2015, 7, 722-731.
12. Concha, N.; Huang, J.; Bai, X.; Benowitz, A.; Brady, P.; Grady, L.C.; Kryn, L.H.; Holmes, D.; Ingraham, K.; Jin, Q.; Kaushansky, L.P.; McCloskey, L.; Messer, J.A.; O’Keefe, H.; Patel, A.; Satz, A.L.; Sinnamon, R.H.; Schneck, J.; Skinner, S.R.; Summerfield, J.; Taylor, A.; Taylor, J.D.; Evindar, G.; Stavenger, R.A. Discovery and Characterization of a Class of Pyrazole Inhibitors of Bacterial Undecaprenyl Pyrophosphate Synthase. J. Med. Chem. 2016, 59 (15), 7299-7304.
13. Eidam, O.; Satz, A.L. Analysis of the productivity of DNA encoded libraries. Med. Chem. Commun. 2016, 7, 1323-1331.
14. Cuozzo, J.W.; Centrella, P.A.; Gikunj, D.; Habeshian, S.; Hupp, C.D.; Keefe, S.A.; Sigel, E.A.; Soutter, H.H.; Thomson, H.A.; Zhang, Y.; Clark, M.A. Discovery of a Potent BTK Inhibitor with a Novel Binding Mode by Using Parallel Selections with a DNA-Encoded Chemical Library. ChemBioChem 2017, DOI:10.1002/cbic.201600573.
15. Griffiths, A.D.; Williams, S.C.; Hartley, O.; Tomlinson, I.M.; Waterhouse, P.; Crosby, W.L.; et al. Isolation of high affinity human antibodies directly from large synthetic repertoires. EMBO 1994, 13 (14), 3245-3260.
16. Mullard, A. DNA-encoded drug libraries come of age. Nature Biotechnol. 2016, 34 (5), 450-451.
17. Mullard, A. DNA tags help the hunt for drugs. Nature 2016, 530 (5), 367-369.
18. Henry (http://math.stackexchange.com/users/6460/henry), probability distribution of coverage of a set after `X` independently, randomly selected members of the set, URL (version: 2011-04-13): http://math.stackexchange.com/q/32800 (accessed March 2, 2017).
19. Satz, A.L. DNA Encoded Library Selections and Insights Provided by Computational Simulations. ACS Chem. Biol. 2015, 10, 2237-2245.
20. Satz, A.L. Simulated Screens of DNA Encoded Libraries: The Potential Influence of Chemical Synthesis Fidelity on Interpretation of Structure-Activity Relationships. ACS Comb. Sci. 2016, 18 (7), 415-424.
21. Weber, M.; Bujak, E.; Putelli, A.; Villa, A.; Matasci, M.; Gualandi, L.; Hemmerle, T.; Wulhfard, S.; Neri, D. A Highly Functional Synthetic Phage Display Library Containing over 40 Billion Human Antibody Clones. PLoS ONE 2014, 9 (6), e100000.
22. Mannocci, L.; Melkko, S.; Buller, F.; Molnàr, I.; Bianké, J.-P.G.; Dumelin, C.E.; Scheuermann, J.; Neri, D. Isolation of Potent and Specific Trypsin Inhibitors from a DNA-Encoded Chemical Library. Bioconj. Chem. 2010, 21, 1836-1841.
23. Franzini, R.M.; Ekblad, T.; Zhong, N.; Wichert, M.; Decurtins, W.; Nauer, A.; Zimmermann, M.; Samain, F.; Scheuermann, J.; Brown, P.J.; Hall, J.; Gräslund, S.; Schüler, H.; Neri, D. Identification of structure-activity relationships from screening a structurally compact DNA-encoded chemical library. Angew Chem Int Ed Engl. 2015, 54, 3927-3931.
24. Reddavide, F.V.; Lin, W.; Lehnert, S.; Zhang, Y. DNA-Encoded Dynamic Combinatorial Chemical Libraries. Angew Chem Int Ed Engl. 2015, 54, 7924-7928.
25. Weisinger, R.M.; Wrenn, S.J.; Harbury, P.B. Highly Parallel Translation of DNA Sequences into Small Molecules. PLoS ONE 2012, 7, e28056.
26. Krusemark, C.J.; Tilmans, N.P.; Brown, P.O.; Harbury, P.B. Directed Chemical Evolution with an Outsized Genetic Code. PLoS ONE 2016, 11 (8), e0154765.


27. MacConnell, A.B.; McEnaney, P.J.; Cavett, V.J.; Paegel, B.M. DNA-encoded solid-phase synthesis: encoding language design and complex oligomer library synthesis. ACS Comb. Sci. 2015, 17 (9), 518–534.
