Chemoinformatics in Drug Discovery - Journal of Chemical Information

Wendy A. Warr. Wendy Warr & Associates, Cheshire, England. 08/04/2007. J. Chem. Inf. Model., 2007, 47 (5), pp 1995–1996. DOI: 10.1021/ci7001999...
BOOK REVIEWS

J. Chem. Inf. Model., Vol. 47, No. 5, 2007 1995

Chemoinformatics in Drug Discovery. Edited by Tudor I. Oprea. From the series Methods and Principles in Medicinal Chemistry. Edited by R. Mannhold, H. Kubinyi, and G. Folkers. Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, Germany. 2005. xxii + 494 pp. ISBN 3-527-30753-2. Hardcover U.S. $215, Euro 172.50. According to the publisher, in this book “chemoinformatics experts demonstrate what can be achieved today by harnessing the power of chemoinformatics algorithms for the drug discovery process”. The book aims to be “an invaluable resource for drug developers and medicinal chemists in academia and industry”. It contains 17 chapters divided into four parts: virtual screening; hit and lead discovery; databases and libraries; and chemoinformatics applications. Eight chapters are contributed by computational chemists in the pharmaceutical industry, five by academics, two by mixed teams, and two by software developers.
The introductory chapter by Garland Marshall is, the author himself admits, a very personal view of the historical evolution of chemoinformatics. It starts by showing the importance of chemoinformatics by estimating that the enalapril patent (no number given) encompasses at least 59 trillion structures. This is a readable chapter with some “homely” wisdom: I noted the term “diversity fetish”, and I particularly liked the observation that Mother Nature never shaved with Occam’s razor. Unfortunately there are a few spelling mistakes and grammatical infelicities too: perhaps no one wanted to interfere too much with the great man’s wisdom. Reference 46 must be wrong; if it is a “recent” review by Oprea and Marshall, I suspect that it is the following reference: Oprea, T. I.; Marshall, G. R. Receptor-Based Prediction of Binding Affinities. Perspect. Drug Discovery Des. 1998, 9/10/11, 35–61.
Tudor Oprea starts his own chapter with another example of the size of the chemoinformatics and drug discovery problem, namely Dave Weininger’s estimate that chemical space might encompass 10^29 structures. A useful contribution to the debate on the definition of chemoinformatics is given on page 26. The chapter summarizes Oprea’s well-known research on drugs, leads, and “leadrugs”. There is little that is new in here (indeed, by intent, there is little that is really new in the whole book), but Oprea’s work on “leadlikeness” has made a significant contribution to the drug discovery endeavor and fits in well as the first chapter in the virtual screening section.
Mike Hann and his colleagues at GlaxoSmithKline follow with a practical chapter on computational chemistry (Mike has never much liked the term “chemoinformatics”), molecular complexity, and screening set design. This shows how a major pharmaceutical company actually tackles the problem of virtual screening. The discussion on complexity relates to some of Oprea’s ideas in the preceding chapter.
Matthias Rarey and his colleagues give a masterly overview of algorithmic engines in virtual screening, classifying and describing algorithms for molecular docking, structural alignment, molecular similarity, and pharmacophore mapping, drawing attention to the pros and cons of each method. I would have described this chapter as “weighty” except that such a description might unfairly deter the general reader. In fact, very few equations are presented, and the chapter ends by describing some successful applications of virtual screening. The authors, who are not native English speakers, are to be congratulated on producing one of the best written chapters in the whole book, backing up their arguments with 298 literature references.
The last chapter in the virtual screening section, by Dragos Horvath of CNRS and co-workers at Cerep, compares similarity-based and hypothesis-based pharmacophore approaches and demonstrates that, despite the problems outlined, high-quality pharmacophore-based models can be used effectively for the virtual search of cyclooxygenase-2 inhibitors. This chapter is not an easy read; I found the extensive use of quotation marks irritating, but the scientific content is valid.
There are three chapters in the hit and lead discovery section. The first, by a team at Wyeth, is a good contribution on enhancing hit quality and diversity within assay throughput constraints. It discusses noise, false negatives, and so on and outlines alternatives to the “top X” method of selecting a small percentage of the total number of actives. It is unfortunate that in the references, the Journal of Chemical Information and Computer Sciences is consistently abbreviated as “J. Comput. Chem. Inf. Sci.”. The next chapter is a very good review, by Cullen Cavallaro and colleagues at BMS, of molecular diversity methods for designing discovery and focused libraries. The partnering of chemistry and design is emphasized. I feel that the next chapter, on in silico lead optimization, is too much of a sales pitch for the software package RACHEL; there is no comparison with other de novo design approaches, and only 12 references are cited.
Four chapters on databases and libraries follow. The first concerns Oprea’s own “baby”, the World of Molecular Bioactivity (WOMBAT) database. This is actually a novel contribution: although, using SciFinder, I could find five papers describing research uses of WOMBAT, this chapter is the only place in the literature where the construction and content of WOMBAT are described. It is interesting to note how many errors Oprea’s team has found in the medicinal chemistry literature. Three authors from Metaphorics next describe the chemical and biological informatics network (Cabinet) methodology. This is an information resource that consists of a set of servers providing information using HyperText Transfer Protocol (HTTP) and a set of libraries for generation of servers that can collaborate within the Cabinet framework. Each Cabinet server can communicate with other Cabinet servers to provide integrated access to diverse information resources. Integration by federation is briefly compared with integration by unification, but the chapter is biased toward federation. The chapter is written rather carelessly, and many abbreviations or jargon terms are either not defined or are defined at the wrong point. Sections 10.34–10.36, in particular, should have been spotted by the editors and rewritten.
This chapter seems to have been written in 2003 and not updated. I question whether the material on computer science implementation and performance is appropriate for this book. The concluding section, however, has some interesting philosophical observations about print and about intellectual property. Peter Kenny and Jens Sadowski describe two AstraZeneca tools for dealing with practical issues such as protonation, formal charges, tautomerism, and nitrogen inversion to improve database search results. Three authors from ChemDiv cover rational design of GPCR-specific combinatorial libraries using their proprietary software, ChemoSoft. A few grammatical and spelling errors should have been detected, and trademark superscripts should have been removed in this chapter. The remaining five chapters of the book cover chemoinformatics applications. Compound collections play a crucial role in the search for new leads at large pharmaceutical companies, and it is essential that these collections be augmented to ensure that chemical space is adequately covered, thus enhancing the chance that new and interesting leads will be identified through high throughput screening. Gerry Maggiora, and colleagues who worked with him at Pharmacia, describe the approaches that they used. This is an excellent contribution to the book: I could not fault it except to say that, were this book to be used as a textbook, cross-referencing of this chapter to parts of other chapters (e.g., Chapter 6 by the Wyeth authors) would have been a useful enhancement. Karl-Heinz Baringhaus and Hans Matter of Sanofi-Aventis review methods for simultaneously optimizing affinity, selectivity, and pharmacokinetics of leads and report on successful uses of these approaches at their own company and others. This review is well-written and extensively and carefully referenced. 
Robert Goodnow and colleagues at Roche give a user’s perspective on tools for library design and the hit-to-lead process and, in particular, their integrated methods, called “Roche adaptive drug design and refinement” (RADDAR). Again, the chapter is extensively referenced. Alex Tropsha demonstrates how QSAR models with variable selection can be used for virtual screening, for database mining, or for chemical library design, provided that the QSAR model is properly validated and used within its applicability domain. The book ends with Donald Abraham’s personal account of successful discovery of potential drugs in an academic setting. Unfortunately, he cannot yet give the story a “happy ending”: efaproxiral has been granted orphan drug status by the FDA, but the results of a key clinical trial are not due until summer 2007.
The book is written with the user rather than the developer of chemoinformatics software in mind. In terms of the learned chemoinformatics literature, it is up to date to about 2004, but the fact that new algorithms are not reported may not necessarily matter in terms of the aims of the book and its target readership. Not much is said about docking and scoring, but docking is perhaps too complex for the intended reader and is covered well by other books. Indeed, it is a moot point whether docking is really a computational chemistry technique rather than a chemoinformatics technique. Glide is not in the index although GOLD is. Synthetic accessibility in de novo drug design is not covered, but, again, de novo design is probably not very relevant to the target audience. PubChem, and other databases such as ZINC on the Internet, have had a big impact in academia since the book was written. The database section of Oprea’s work would benefit from an extra chapter or two were a second edition to be produced. Visualization is covered only briefly, in Chapter 15 by the Roche authors. I could not find the popular tool, Spotfire, mentioned anywhere in this book. MOGA/MoSelect for multiobjective optimization is mentioned in two chapters but not in the one on multiobjective optimization. None of these terms is indexed, although multidimensional optimization is. The inadequacies of the index made it difficult for me to check which topics might be missing. There is a subject index, but its cross-referencing is inadequate, and it is peppered with errors.
Worst of all, it seems that the page numbering of the book changed after the page numbers had been set in the index. Multiauthor books are often criticized for redundancy, omissions, and lack of coherence. The choice of topics may be dictated by the editor and the choice of the editor’s cronies as authors. Works of this type require strong editorial control and good copy-editing. I am pleased to say that this multiauthor book can be recommended as a coherent and usable tool for users of chemoinformatics software. It clearly demonstrates the way in which computation may meet the challenges presented by lead discovery and optimization, and it deserves its place in Mannhold, Kubinyi, and Folkers’ series of practice-oriented monographs. It is not designed as a textbook, but it could be used as a supplementary one in postgraduate courses if caveats were given about the chapters that are exemplary rather than full reviews of available software in any given subfield. Mainstream computational chemists are not going to learn much that is new from the work, but they would find it an interesting read.

Wendy A. Warr Wendy Warr & Associates, Cheshire, England CI7001999 10.1021/ci7001999 Published on Web 08/04/2007

Graph Theoretical Matrices in Chemistry. By Dušanka Janežič, Ante Miličević, Sonja Nikolić, and Nenad Trinajstić. Mathematical Chemistry Monographs MCM-3, Series Editor Ivan Gutman. Faculty of Science, University of Kragujevac: Kragujevac, Serbia. 2007. vi + 205 pp. ISBN 978-96-8182972-1. Hardcover, U.S. $95.00. The four authors point out in their Preface that mathematical chemistry has a long tradition and that one of the main tools for storing and handling information on chemical structure is based on graph-theoretical matrices. The historical beginning of chemically interesting matrices may be considered to be the paper by H. Poincaré, published in 1900, on vertex-edge incidence matrices. During the last 25 years, more than 100 new graph-theoretical matrices have been introduced, and until now they had been reviewed only partially, either as articles or as book chapters. Now for the first time there exists a book devoted exclusively to graph-theoretical matrices, and this fact will certainly attract the attention of theoretical chemists, mathematicians, and computational and informatics specialists. Since one of the major uses of graph-theoretical matrices is for constructing molecular descriptors, which in turn are employed in QSAR, QSPR, and screening of combinatorial libraries for drug design, the whole community of pharmaceutical and medicinal chemists should take an interest in this field. One should bear in mind that the traditional Hansch approach cannot be employed for the screening of huge combinatorial libraries.
After a brief introduction, the 130 graph-theoretical matrices that are presented in the book are grouped into five chapters: (i) adjacency matrices and related matrices (17 headings), (ii) incidence matrices (6 headings), (iii) distance matrices and related matrices (28 headings), (iv) special matrices (18 headings), and (v) graphical matrices (5 headings). Each heading discusses several closely related matrices. The final part of the book consists of concluding remarks (1 page), 365 references (33 pages), and a subject index (8 pages).
Most readers of the Journal of Chemical Information and Modeling will be familiar with the square symmetrical adjacency matrices and distance matrices, because these two classes of matrices have been used to generate most of the topological indices needed for chemical information and drug design. Unlike the adjacency and graph distance matrices, which are in a one-to-one association with the corresponding graphs, some of the related matrices such as the detour matrix (with entries Dij being the maximum topological distances between vertices i and j), or the edge-adjacency matrix, do not uniquely determine the corresponding graphs (the latter aspect is connected with the fact that there exist nonisomorphic graphs having the same line graph). Incidence matrices are nonsymmetrical and list associations between different pairs of graph constituents (vertices, edges, cycles, paths). Special matrices combine two or more matrix elements into one entry or assign various mathematical operations to matrix elements. Graphical matrices have subgraphs (or numerical invariants of subgraphs) as elements. Throughout, the matrices are exemplified with accompanying illustrations of the corresponding graphs. The authors mention briefly the applications of important graph-theoretical matrices such as the widely used topological indices, the enumeration techniques based on matrices, the encoding of chemical structures, and the all-pervading field of chemical and biochemical informatics. This book should be at hand for all those who deal with such applications.

Alexandru T. Balaban Texas A&M University at Galveston CI700278S 10.1021/ci700278s Published on Web 09/08/2007

Nontraditional Careers for Chemists: New Formulas in Chemistry. By Lisa M. Balbes. Oxford University Press, New York, NY. 2007. xi + 307 pp + bibliography and indexes. ISBN 0-19-518366-5, hardcover, U.S. $74.50. ISBN 0-19-518367-3, softcover, U.S. $27.95. It is probably a good assumption that most readers of this journal are chemists who are working in a “nontraditional” career. A nontraditional career in chemistry (some would say “alternative”) can be defined as a career that uses one’s background in chemistry but does not primarily involve laboratory work. “Nontraditional” is preferable to “alternative” for those of us who still consider ourselves chemists first and foremost and who mildly resent implications that we have “bought the farm” and bailed out of chemistry altogether. The author aims this book at anyone contemplating a nonlaboratory career in chemistry, from students to those already out in the workforce. I would also encourage its use by those mentoring for careers, especially for students, from high school on up. Career mentoring is a very important but often neglected endeavor by educators. An education in chemistry provides excellent training and background for a number of careers, and the author provides background information on a number of career categories with which many chemists may be unfamiliar. Twelve career categories are described, including several subcategories within each. For each category, three or four chemists present a brief biography covering their background, their current work, and how they got there. Rather than detailed career planning, a lot of chance and luck was involved. The mission of the book is to help smooth the path for those who follow.
In the chapter on communications, technical writing and presentation are described, including skills and requirements. Personal histories include those with careers in software customer support, journal publishing, TV production, and freelance journalism. Careers in information science include special librarianship, information science/information specialist, abstracting and indexing, and database development. Personal contributions include academic special librarianship at various schools, corporate information center management, database administration (including chemical nomenclature), and professional society management. The chapter on chemistry and patents covers patent examiners, patent searchers, patent liaison, registered patent agents, and patent or intellectual property attorneys. Personal vitae include a patent searcher, a patent agent, and patent attorneys for both a pharmaceutical company and a law firm. Sales and marketing in the chemical field is described broadly. Personal histories include those in chemical technical service, small business management, chemical software management, and management consultancy in both small and large firms. The chapter on business development describes careers further along the business cycle than sales and marketing. Personal descriptions include pharmaceutical entrepreneurship, pharmaceutical venture capital and investing, nanotechnology, pharmaceutical information technology support, and management of small business and entrepreneurship support. Careers in regulatory affairs, both government and nongovernment, are described.
Vignettes include management consultancy for health services (both in small and large firms), governmental drug analysis, and product safety. Public policy careers inevitably involve the world of politics, on either side of the table. Personal histories include working for science advisory groups, a law firm, and a health care firm. In addition to the category title, safety careers may be found in the areas of health, environmental issues, and industrial hygiene. Biographies include safety officers in both government and corporate laboratories and two independent consultants. The “people” category describes careers in human resources (HR). Vignettes include management of a corporate HR group, a recruiting agency, and corporate personnel recruiting. The chapter on computers acknowledges that almost everyone uses computers in their professional life. However, computational chemistry is now an established field. Personal descriptions include two examples of corporate computer, modeling, and software support as well as the software supply business. Education describes the unique requirements for teaching at both the high school and collegiate level. Personal profiles include high school and community college teachers, a high school science department chair, and the vice president of research for a university. “Chemistry and Everything Else” describes careers in imaging, material, and art. Personal contributions include a photographer and a self-labeled weaver/writer/teacher.
In all cases, both from the author and the contributors, the value of a chemical education and background is stressed. This book reinforces the concept that professionals should be prepared to provide career mentorship, including on an occasional, as-needed basis. Therefore, every chemical professional and not just teachers should have access to this book (it is available in both paperback and hardcover). It belongs at least in the library of every college and university and on the shelves of every collegiate science department or department of chemistry. Of course, everyone regularly involved in career mentorship should have a personal copy as well. From personal experience, this reviewer has found enhanced interest among high school chemistry teachers in career awareness or mentoring, so availability of this book to that audience is also highly recommended.

Robert E. Buntrock Buntrock Associates, Orono, Maine CI700279P 10.1021/ci700279p Published on Web 08/18/2007