8 Computer-Assisted Synthetic Analysis in Drug Research P. GUND, J. D. ANDOSE, and J. B. RHODES
Downloaded by CORNELL UNIV on August 27, 2016 | http://pubs.acs.org Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0061.ch008
Merck Sharp & Dohme Research Laboratories, Dept. of Scientific Information, and Corporate Management Information Systems, MSDRL Systems and Programming Dept., Merck & Co., Inc., Rahway, N.J. 07065
It was recognized at Merck some time ago (1) that the analytical and data-handling capabilities of a computer could facilitate organic synthesis design, just as spectroscopic methods have revolutionized structure determination. Indeed, the chemist's approach to synthetic design resembles a computer program (2). As outlined in Figure 1, the chemist begins by defining his synthetic problem; he collects relevant knowledge about chemical reactions (note that he never sequentially searches through all available literature); and he designs a synthesis. If the synthesis fails or if he wants additional possibilities, he iterates (repeats the process) to generate additional syntheses. If initial attempts are unsatisfactory, the chemist may enlarge his store of relevant knowledge, e.g., by reading the literature or talking to an expert; or he may redefine the problem - i.e., find new keys so that more of his knowledge of reactions becomes relevant. In fact, chemists have been known to search exhaustively for a way to implement a preferred route, only to finally re-analyze their problem and find an entirely different - and ultimately successful - one. If sufficiently desperate, the chemist may browse (i.e., perform a random search) through the literature for ideas. This method has occasionally succeeded when more rational approaches failed, and might be taken as an indication that our reaction classification and retrieval methods are imperfect.
We may envision two levels of computer support of the chemist's analytical process. One approach - which we may call a reaction retriever - organizes and retrieves relevant reaction information. The other, which we here call a synthesizer, simulates a large part of the synthetic process, as shown by the frame in Figure 1. Both computer approaches require a data base of chemical reactions, and it is conceivable that they could share the same data base.
We will return to this point; but first, we should consider whether either method would find use by the practicing chemist.
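The iterative process outlined for Figure 1 can be sketched in code. The following is only an illustrative sketch, not a program described in this chapter: the function `design_loop`, its callable parameters, and the toy data are all hypothetical, and a dictionary lookup stands in for the chemist's recall of relevant reactions.

```python
def design_loop(keys, knowledge, propose, more_knowledge, new_keys, max_rounds=5):
    """Hypothetical sketch: propose a route, else enlarge knowledge or redefine keys."""
    for _ in range(max_rounds):
        # Collect only the reactions relevant to the current keys; nothing here
        # scans all available literature sequentially.
        relevant = [r for k in sorted(keys) for r in knowledge.get(k, [])]
        route = propose(relevant)
        if route is not None:
            return route
        # First fallback: enlarge the store of relevant knowledge.
        extra = more_knowledge(keys)
        if extra:
            for key, reactions in extra.items():
                knowledge.setdefault(key, []).extend(reactions)
            continue
        # Second fallback: redefine the problem by finding new keys.
        redefined = new_keys(keys)
        if redefined is None or redefined == keys:
            return None
        keys = redefined
    return None


# Toy data (assumed for illustration only).
knowledge = {"amide": ["acylation of an amine"]}
route = design_loop(
    keys={"beta-lactam"},
    knowledge=knowledge,
    propose=lambda reactions: reactions[0] if reactions else None,
    more_knowledge=lambda keys: {},
    new_keys=lambda keys: {"amide"} if keys == {"beta-lactam"} else None,
)
print(route)
```

In this toy run the first pass finds nothing under the original key, no new knowledge is offered, and the problem is redefined once before a route is proposed.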
Wipke and Howe; Computer-Assisted Organic Synthesis ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Figure 1. Organic synthesis: Problem analysis. [Flow diagram not recoverable from this copy; legible labels include CONCEPTS, CHEMIST, "retriever", and "synthesizer".]
Categories of Pharmaceutical Synthetic Analysis
We identify four types of pharmaceutical synthesis: (1) synthesis of analogs of a "lead" compound, (2) synthesis of natural products, (3) process development, and (4) new reaction discovery (Figure 2). A lead analog program normally attempts to find short syntheses of a series of related compounds, often from a single intermediate which can be obtained in large quantities - for example, construction of various side chains starting from penicillanic acid. Occasionally, however, a desired analog must be made by quite different chemistry - for example, preparation of dethiacephalosporins (3). Also occasionally, analogs will be made by applying straightforward chemistry to an available starting material. Thus, if we differentiate three classes of synthetic analyses - (a) starting material and product specified (SM → P); (b) product specified (? → P); and (c) starting material specified (SM → ?) - then lead analog syntheses may belong to any of the classes.
Synthesis Type        Synthesis Class   Computer Aid
Analogs of "Lead"     SM → P            Retriever
                      ? → P             Synthesizer for Difficult Compounds; Retriever
                      SM → ?            Retriever
Natural Product       ? → P             Synthesizer; Retriever
                      SM → P            Retriever
Process Development   ? → P             Synthesizer; Retriever
                      SM → P            Retriever
New Reactions         SM → ?            Retriever; Forward Operating Synthesizer
Figure 2. Computer-assisted synthetic analysis types and classes
When a natural product with interesting biological activity is isolated, synthesis is used to confirm the structure and to obtain sufficient material for biochemical and structure modification studies. Synthetic analysis is generally of the "product specified" (? → P) type, except when a related material is known and available (SM → P synthesis). For process development, analysis tends to be exhaustive, since the optimal commercial synthesis often is different from the "best" laboratory method. When a cheap related compound is available, the analysis may be of the SM → P type. Finally, chemists apply new reactions to known compounds in order to generate new drug leads; this requires synthetic
analyses of the SM → ? or, occasionally, SM → P types. A reaction retriever program is applicable to all four types of syntheses. A synthesizer program applies primarily to the ? → P class of problem, although a forward working synthesizer would apply to SM → ? problems. The applicability of these programs by synthesis type is summarized in Figure 2. As a generalization, a synthesizer is most useful for "creative" syntheses, while a reaction retriever may identify optimal conditions after a reaction pathway has been chosen. Therefore both methods should be valuable.
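The applicability rules stated above can be written down compactly. The sketch below encodes only what the text says: a retriever applies to every class, a synthesizer primarily to ? → P, and a forward working synthesizer to SM → ?. The names `ProblemClass`, `CLASSES_BY_TYPE`, and `computer_aids` are hypothetical and chosen for illustration.

```python
from enum import Enum


class ProblemClass(Enum):
    SM_TO_P = "SM -> P"   # starting material and product specified
    TO_P = "? -> P"       # product specified
    FROM_SM = "SM -> ?"   # starting material specified


# Classes of analysis mentioned in the text for each synthesis type.
CLASSES_BY_TYPE = {
    "lead analogs": [ProblemClass.SM_TO_P, ProblemClass.TO_P, ProblemClass.FROM_SM],
    "natural products": [ProblemClass.TO_P, ProblemClass.SM_TO_P],
    "process development": [ProblemClass.TO_P, ProblemClass.SM_TO_P],
    "new reactions": [ProblemClass.FROM_SM, ProblemClass.SM_TO_P],
}


def computer_aids(problem_class):
    """Return the computer aids the text associates with one class of analysis."""
    aids = ["retriever"]  # applicable to all four synthesis types
    if problem_class is ProblemClass.TO_P:
        aids.append("synthesizer")
    elif problem_class is ProblemClass.FROM_SM:
        aids.append("forward-working synthesizer")
    return aids


for synthesis_type, classes in CLASSES_BY_TYPE.items():
    for cls in classes:
        print(f"{synthesis_type:20s} {cls.value:8s} {', '.join(computer_aids(cls))}")
```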
Program Descriptions
A reaction retriever program permits organizing and enlarging the store of reaction knowledge available to each individual. Chemists have long used manual systems for organizing reaction knowledge, such as individual card files, Theilheimer's famous series of volumes, Reactiones Organicae, and recently the Derwent Chemical Reactions Documentation Service. Computer organization of such collections enables retrieval by reactants, products, reaction type, reaction conditions, and/or mechanism (4). Creation of such a computer program is generally considered an information retrieval application (4).
A synthesizer program traditionally begins with a target structure and applies chemical rules to generate and evaluate potential precursors. It offers the capability of fast, exhaustive, unbiased synthetic analyses. As indicated in many of the other contributions to this symposium, this approach is usually considered an artificial intelligence application.
Since we had concluded that both program types were desirable, we wondered if the same reaction data base could serve for both computer approaches, as the flow diagram of Figure 1 suggests. We therefore embarked upon a feasibility study to test this dual use concept.
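The dual use concept can be illustrated with a small sketch in which one set of reaction records serves both purposes. Everything here is an assumption made for illustration: the `Reaction` record, the `retrieve` and `precursors` functions, and the two toy entries are hypothetical, and exact matching on compound-class names stands in for the structure matching a real system would need.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Reaction:
    name: str
    reactants: Tuple[str, ...]
    products: Tuple[str, ...]
    reaction_type: str
    conditions: str = ""
    mechanism: str = ""


# Toy data base at the level of compound classes (illustrative only).
DB = [
    Reaction("esterification", ("carboxylic acid", "alcohol"), ("ester",),
             "condensation", conditions="acid catalyst"),
    Reaction("amide formation", ("acid chloride", "amine"), ("amide",),
             "acylation"),
]


def retrieve(db, *, reactant=None, product=None, reaction_type=None):
    """Retriever use: filter stored reactions by any combination of fields."""
    hits = []
    for rxn in db:
        if reactant is not None and reactant not in rxn.reactants:
            continue
        if product is not None and product not in rxn.products:
            continue
        if reaction_type is not None and rxn.reaction_type != reaction_type:
            continue
        hits.append(rxn)
    return hits


def precursors(db, target):
    """Synthesizer use: read the same records backwards, from target to reactants."""
    return [(rxn.name, rxn.reactants) for rxn in db if target in rxn.products]


print([r.name for r in retrieve(DB, reaction_type="acylation")])
print(precursors(DB, "ester"))
```

The point of the sketch is only that the retrieval query and the backward step read the same records; evaluating and ranking the generated precursors, which a synthesizer also does, is left out.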
Proposed Computerized Chemical Reaction Collection
We conceived a system where reactions coded by Merck chemists could serve three purposes - current awareness, reaction retrieval, and synthesizer input (Figure 3). We identified the following system development stages: (I) development of reaction coding sheet; (II) description of reactions on sheets by chemists (continuing); (III) translation to computer readable reaction information by information scientists and typists (continuing); (IV) development of software for computer storage and retrieval of reaction information. In the feasibility study, we actually carried out phases I and II, and performed a limited systems analysis of phases III and IV.
We designed a reaction coding sheet to contain most of the information needed by both reaction retriever and synthesizer programs (Figure 4). Data on reactants, products, intermediates, reagents, conditions, yields, work-up procedures, mechanism, and
Figure 3. Reaction file development. [Flow diagram; legible boxes: DEVELOP REACTION CODING SHEET; CHEMISTS WRITE REACTION DESCRIPTIONS; CODERS INPUT REACTIONS; REACTION RETRIEVAL SERVICE; INPUT FOR SYNTHESIZER PROGRAM.]
[Reaction coding sheet, completed by hand; the handwritten entries are largely illegible in this copy. Legible printed field labels: 1. CODER; 2. LOCATION; 3. DATE; 4. REACTION NAME; 5. SOURCE; 6. REFERENCES; 7. AFFILIATION; RATING; 13. REACTION CONDITIONS; 14. WORKUP PROCEDURES.]