Pattern recognition combats data explosion - Chemical & Engineering

Eng. News, 1972, 50 (35), pp 14–15. DOI: 10.1021/cen-v050n035.p014. Publication Date: August 28, 1972. Copyright © 1972 American Chemical Society...

Pattern recognition combats data explosion

Teamed with a computer, the technique can help chemists cull irrelevant data when dealing with large quantities of data

Systematic methods of determining which of millions of chemical candidates would actually perform as miracle drugs, or as catalysts for industrial reactions, or for many other applications have been sorely lacking. In the cancer field alone, at least half a million compounds have been tested for anticancer activity in order to find a few that work, since the screening system has amounted mainly to a methodical testing of almost everything in sight.

Pattern recognition techniques may offer one way for the chemist to escape the tidal waves of data with which he is being inundated by modern instrumentation as he conducts such screening operations as well as other chemical research problems. The goal of pattern recognition is to detect or predict obscure but perhaps highly desirable properties of substances by discerning patterns among seemingly unrelated or very indirect and multitudinous data.

A good way of accomplishing that goal, according to a number of investigators, is for the scientist, with his uniquely human pattern recognition capabilities, to interact with a computer that is capable of comprehending n-dimensional problems. Bruce Kowalski of Colorado State University and Charles Bender of the University of California's Lawrence Livermore Laboratory have worked up a broad introduction to pattern recognition techniques [J. Amer. Chem. Soc., 94, 5632 (1972)]. And Stanford's Herman Chernoff has programed a computer to present chemical data by drawing droll faces, taking advantage of the fact that human beings are used to noting differences in facial characteristics whereas they often find rows and columns of numbers confusing.
A basic thrust of the pattern recognition approach is aimed at avoiding inundation by data that are either totally irrelevant to the problem at hand or unnecessary to its solution, so as to become more receptive to patterns among the pertinent data. A computer can be programed to evaluate contributions that various data make to the solution of a problem, to observe whether it has wasted its time, and to compile useful information into n-dimensional patterns that relate to properties that may not be directly measurable. Investigators in the field say that computers can perform in a manner analogous to the telephone operator who, when given a person's name, address, hair color, job description, and hobby, will quickly offer the feedback observation that the name is always necessary, the address is necessary if the name is common, and the other information is irrelevant.

Dr. Kowalski and Dr. Bender cite a number of recent applications of one particular pattern recognition method, involving the linear learning machine, to the analysis of various types of spectroscopic data. For example, they say, computers have become 95% accurate in mass spectroscopic determination of hydrocarbon structure using only five to 10 parameters, compared to some 500 parameters typically examined in conventional techniques.
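The article does not describe how the linear learning machine works internally. As a rough illustration only, the sketch below assumes the common textbook form: a linear threshold classifier whose weight vector is corrected each time it misclassifies a training pattern. The function names, the toy two-parameter patterns, and the plain additive correction rule are all assumptions made for this example, not details taken from Kowalski and Bender's work.

```python
def train_linear_learning_machine(patterns, labels, max_passes=100):
    """Error-correction training of a linear threshold classifier.

    patterns: list of equal-length lists of numeric parameters
    labels:   list of +1 / -1 class labels, one per pattern
    Returns the weight vector; the last entry is the bias weight.
    """
    n_parameters = len(patterns[0])
    weights = [0.0] * (n_parameters + 1)
    for _ in range(max_passes):
        errors = 0
        for x, y in zip(patterns, labels):
            augmented = list(x) + [1.0]  # constant 1.0 feeds the bias weight
            score = sum(w * v for w, v in zip(weights, augmented))
            if y * score <= 0:
                # Misclassified (or on the boundary): move the weights
                # toward the correct side for this pattern.
                weights = [w + y * v for w, v in zip(weights, augmented)]
                errors += 1
        if errors == 0:
            break  # every training pattern is classified correctly
    return weights


def classify(weights, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    augmented = list(x) + [1.0]
    score = sum(w * v for w, v in zip(weights, augmented))
    return 1 if score > 0 else -1


# Hypothetical toy data: two measured parameters per sample, two classes.
toy_patterns = [[2.0, 1.0], [3.0, 0.5], [0.5, 2.0], [0.2, 3.0]]
toy_labels = [1, 1, -1, -1]
toy_weights = train_linear_learning_machine(toy_patterns, toy_labels)
```

Training of this kind is only guaranteed to stop with zero errors when the two classes can be separated by a straight line (or hyperplane); otherwise the loop simply ends after `max_passes`.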

Another example of the technique's applicability illustrates spectacular savings in dollars as well as in time. At Lawrence Livermore Laboratory, large blocks of very expensive high explosives, which were to be machined into special shapes and used to generate shock waves for physics research, were cracking, apparently because of some compositional defect. Chemists tried unsuccessfully to relate the substance's cohesive properties to several physical and chemical characteristics. Finally the pattern recognition team was called in on the problem. Since the blocks were cracking at the rate of $500,000 worth per month, the pattern recognition team was told to take whatever steps seemed necessary, with the assumption that the time required to solve the problem might be in the man-year order of magnitude. That was on a Friday. On Saturday, the computer found relationships be-

Mineral analysis represented in facial features

"Funny faces" like these represent computerized mineral analysis data from core samples drilled from a Colorado mountainside in a program devised by Stanford's Dr. Herman Chernoff. Significant changes in data are more quickly seen when the variables are represented by facial features, he says, as illustrated by the more drastic changes beginning in faces 6 and 10. In this example they show where the core divides into three major zones of mineral content. Each face represents 12 different percentages of minerals found in core samples taken at regular intervals. Length and curve of the mouth show two different percentages, for example, as do the sizes, positions, and shapes of the rest of the facial features. Altogether, 18 variables can be shown in each face drawn by the computer's plotter.
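The caption's idea of assigning each measured variable to one facial feature can be sketched in a few lines. This is a minimal illustration, not Dr. Chernoff's program: the feature names in `FACE_FEATURES`, the rescaling of every variable to the range 0 to 1, and the function names are assumptions made for the example, and the drawing step is left out.

```python
# Hypothetical facial features; only the mouth length and mouth curve
# are named in the caption, the rest are placeholders.
FACE_FEATURES = [
    "mouth_length", "mouth_curve",
    "eye_size", "eye_position", "eye_shape",
    "nose_size", "brow_position", "face_width",
]


def scale_columns(rows):
    """Rescale each variable (column) to the range 0..1 across all samples."""
    scaled_columns = []
    for column in zip(*rows):
        low, high = min(column), max(column)
        span = high - low
        # A variable that never changes gets a neutral mid-range value.
        scaled_columns.append(
            [0.5 if span == 0 else (value - low) / span for value in column]
        )
    return [list(row) for row in zip(*scaled_columns)]


def faces_from_samples(rows, features=FACE_FEATURES):
    """Map each sample's variables onto named facial-feature settings."""
    if len(rows[0]) > len(features):
        raise ValueError("more variables than available facial features")
    return [dict(zip(features, row)) for row in scale_columns(rows)]


# Hypothetical percentages of two minerals in three successive core samples.
toy_samples = [[10.0, 5.0], [20.0, 5.0], [30.0, 5.0]]
toy_faces = faces_from_samples(toy_samples)
```

Each dictionary in `toy_faces` would then drive a plotting routine, so that a jump in one mineral's percentage shows up as a visibly different mouth or eye from one face to the next.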


