Mathematics and Computers Prologue to Automation
Engineers concern themselves with transformations of energy. What about scientists? A sufficiently broad point of view is that scientists are concerned primarily with acquiring and organizing information. Utility is no concern of basic science. Yet, time and again, scientific achievements which at first appeared most impractical have later turned out to have most utilitarian applications. Pure curiosity can be a strong motivation in scientific work, but many discoveries have resulted from a mixture of motives strongly tinged with practical and commercially profitable aims.
Models
We shall not explore the composition of this mixture, but instead focus our attention on one salient feature of the scientific process: organization of knowledge through the use of a “model.” In essence, the scientist is forever making models of nature, and testing how well his models agree with nature. If he becomes so interested in the models that he loses interest in how well they correspond with nature, he becomes a pure mathematician. Then someone else, perhaps an applied mathematician, may later find ready-made a conceptual system which models some aspect of nature never considered by the pure mathematician.

Consider first a nonchemical example, old and well known. Ancient peoples, observing the stars, accumulated a mass of data about them. At first, this was organized by models known as “constellations.” This organization enabled them to identify “wanderers” or planets, as distinguished from fixed stars. Ptolemy’s model of the universe permitted something like modern “data reduction” on observations of planetary positions. Then Kepler spent something like 13 years on what we would describe as “empirical curve fitting,” seeking a better model. Kepler’s eventual success with ellipses achieved an enormous economy of description. At that point, a few geometric parameters could replace not only all the data already collected, but all that might have been collected in the past or could be collected in the future. Although Kepler also derived certain laws relating one planetary motion to another, it remained for Newton to make an even grander model. His Law
of Gravitation and his Laws of Motion explained, or modeled, much more than Kepler’s ellipses, including not only apples but modern-day missiles.

This example is a classic one. Such examples have led many to identify the idea of a model with a set of equations. Often these equations are differential equations. In chemistry as well as physics you find many such models. However, we may profitably enlarge the concept of mathematical model to include much more: the idea that any sort of economical description that organizes information on some aspect of nature is essentially a mathematical model.

Take, for example, the Periodic Table of the elements. Probably Mendeleef never thought of his work as mathematical. Nonetheless, the Periodic Table gives us an economy of description. This model suggested, and still suggests, pertinent questions for further investigation, as not all elements seem to have the properties expected from their position in the table. Prout’s hypothesis, and many others in the domain of chemistry, are clearly within our meaning of mathematical model, and from such models the science of chemistry has evolved.

Our business as scientists, then, whether we be working in chemistry, physics, astronomy, biology, or whatever, is to model systems, and if these models have any virtue to them at all, those which we use should come within our extended meaning of the term “mathematical model.”

Such a model has, of course, more than mere economy of description. Not only does it describe the observations you have, but it predicts or suggests what you might expect from experiments not yet performed. Here are the clues to further investigation, to further understanding of nature. A set of “stupid theorems” recently stated some obvious but often neglected rules.
One most appropriate to our present topic is: “An experiment which turns out exactly as you expected it to has increased your confidence, but not your knowledge.” We often learn the most, if we will, from experiments which turn out “badly.” Models that fail are the stepping-stones of scientific progress.
Measures of Organization
If organization of knowledge is our business, can we measure the degree of organization we have achieved? Already we know of some very nice measures of organization in special fields. In thermodynamics, entropy is often regarded as a measure of this sort. In recent years, the study of communication through noisy channels has been developed by Shannon and others, leading to an entropy-like expression in information theory. However, the information theory of the communication specialist is not yet prepared to answer all the problems in communication engineering, still less those of the scientist in organizing the data he wrests from nature. (After all, thermodynamics does not solve all problems in chemistry.)

Scientists have been prolific in their development of tools for observation and measurement, for the acquisition of new data to test their models of nature. Today the scientist has a new tool, of an entirely different sort. The modern automatic electronic computer is not primarily a measuring device, but an information-processing system of extreme flexibility. It is a general-purpose tool for the main job common to all scientific fields: the testing of models. With such a tool, how far can we go toward the “automation” of scientific work?
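The entropy-like expression of information theory mentioned in this section is not written out in the text. Its standard form is H = -Σ p_i log2 p_i, where p_i is the probability of the i-th symbol. The following is a minimal sketch in Python, assuming that observed symbol frequencies may stand in for the probabilities; the function name `shannon_entropy` is hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Entropy in bits per symbol, H = -sum(p * log2(p)).

    Assumption: the relative frequency of each symbol in `symbols`
    is used as the estimate of its probability p.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # A perfectly ordered message carries no information per symbol;
    # a message using four symbols equally often carries two bits.
    print(shannon_entropy("AAAAAAAA"))
    print(shannon_entropy("ABCDABCD"))
```

A message made of one repeated symbol scores zero, and the score rises as the symbols become more evenly mixed, which is the sense in which such an expression measures organization.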
Automation
Automation in factory production is older than the word itself. Office automation has so far dealt chiefly with routine clerical tasks, such as payroll operations, inventory keeping, and invoicing. Where, in scientific work, can we use an information-processing robot to save time and money? No one else can answer such a question better than the scientist himself, for no one else knows his work as well as he does. Before he can give a good answer, however, he must learn more about this new tool than many scientists now know. All too often, their attitude is more wistful than aggressive and enquiring. “I wish I could get a computer to do this job for me, but of course it is impossible.” “We don’t have one here, and I understand they are very hard to program.” Often these are excuses based on ignorance, and those who should be doing creative work continue to condemn themselves to computational drudgery, when the same time might better be spent in finding out how easily the new tool can be made their slave.

VOL. 51, NO. 1, JANUARY 1959

Long-winded data reduction is a fit job for the computer. Another obvious application is in the tabulation of special functions which are often needed. Curve fitting, by least squares or some other criterion, is a natural application, as are statistical jobs, such as correlation and regression, or analysis of variance. A more ambitious job might be computing the specific heat of a gas from spectroscopic data.

These are the drudge jobs we can get out from under with a computer. Once freed of these, we will have more time to be creative. However, this does not mean that we turn our back to the computer as we seek to be creative. Part of our time should be spent in looking for ways to use the computer in something more than data reduction and routine analysis of observational results.

Model testing is not all measurement and data reduction. Over and over again, models have been proposed and never adequately tested, just because there was no convenient way to get numbers that could be compared with experimental results. Any scientist can write down in a few minutes an equation which might take hours, or weeks, or even years, to “solve” (by hand). Many of the mathematical models we can formulate can be tested against nature only by resorting to some sort of numerical treatment, such as step-by-step integration. Until crucial tests of a model are carried out, the model can be plausible and interesting, but sterile.

Using a general-purpose computer, you are not bound to any one method of attack. Cleverness pays off, with or without a computer, but with a computer the simplest and most straightforward approach is often preferable. In some cases, exhaustive consideration of all possible combinations is quicker than trying to find a more sophisticated solution. When complete enumeration takes too much computer time, a random sampling, or Monte Carlo, technique can often produce useful estimates.
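The contrast between complete enumeration and random sampling can be illustrated with a small sketch in Python. The problem chosen here, the probability that the total of several fair dice reaches a given target, is an assumption made purely for illustration and does not come from the article; the function names are likewise hypothetical. With three dice there are only 216 combinations and enumeration gives the exact answer; with twenty dice there are 6^20 combinations, and sampling becomes the practical route.

```python
import itertools
import random


def exact_probability(n_dice, target):
    """Exhaustive enumeration: visit every combination of n_dice fair dice
    and return the exact probability that their total is >= target."""
    outcomes = itertools.product(range(1, 7), repeat=n_dice)
    hits = sum(1 for roll in outcomes if sum(roll) >= target)
    return hits / 6 ** n_dice


def monte_carlo_probability(n_dice, target, trials, seed=0):
    """Monte Carlo: estimate the same probability from `trials` random rolls.
    The fixed seed only makes the sketch reproducible."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        total = sum(rng.randint(1, 6) for _ in range(n_dice))
        if total >= target:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Small case: both methods are feasible and can be compared.
    print(exact_probability(3, 15))
    print(monte_carlo_probability(3, 15, trials=20_000))
    # Large case: enumeration is out of reach, sampling is not.
    print(monte_carlo_probability(20, 80, trials=20_000))
```

The sampled estimate carries a statistical error that shrinks roughly as the inverse square root of the number of trials, so the small case serves as a check on the method before it is trusted on the large one.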
You may start with a model in which the parameters are simply guessed at, vary the parameters in any way you wish, and learn from such “numerical experiments” how to steer the model toward a better correspondence with nature. This method may also give insight into the physical interpretation of the model and its parameters.

Up to this point, we have considered only the arithmetic abilities of our electronic tool. Probabilities, regressions, algebraic or differential equations, all are reduced to the arithmetic combination of numbers in the computer. Office automation depends heavily on the ability of the data-processing equipment to sort, file, select, collate, and otherwise manipulate data which need not be numerical. These functions use what are termed the “logical abilities” of the computer. What can the scientist do with these?

INDUSTRIAL AND ENGINEERING CHEMISTRY

Nonnumeric Application in Scientific Progress
Literature Searching. An outstanding problem, which the Division of Chemical Literature of the ACS has long recognized, is that of literature reference searching. Although many people are attempting to solve such problems through the use of electronic equipment, the advance toward a satisfactory system is not as rapid as some would have you think. The trouble is the lack of a good mathematical model. We might hope, with an appropriate mathematical model for literature searching, to measure the efficiency of various proposed systems, but we are not far enough along to do this yet. The theory, when it evolves, will be a statistical one, of course, like the kinetic theory of gases or Shannon’s information theory for communication.

Communication “through time,” by means of recorded results in libraries, was already a serious problem when Vannevar Bush first proposed a mechanized solution many years ago. Abstracting systems which have kept us going until now are beginning to bog down, as the literature increases at an exponential rate. More and more time that should be spent in better ways is being taken up either by inefficient searching or by inefficient repetition of both experimental and theoretical work to which reference was lacking.

In 10 or 20 years mathematical models will give us excellent guidance on the methodology of literature reference searching, and electronic equipment will be assisting us in a way which we cannot now foresee. The scientists (and other library users) of that future time will find themselves unable to understand how we in 1958 ever managed without such a system.

Translations. In using our imagination about the application of computers, we have so far assumed that the computer itself would exercise no imagination. Inevitably, the challenge must come: Are our own creative talents sufficient to endow our computers with imagination and creativity?
The work now in progress on “learning systems” and “heuristic programming” points in this interesting direction. Newspaper accounts suggest that computer translations from Russian to English can be turned out like thermodynamic tables. A computer-stored dictionary and a program with the rules of grammar will do the trick. Human languages, however, don’t yield that easily. But more can be done with some much more restricted languages. Algebra, for instance, is a restricted language. What you would like to do
with a computer, for many scientific purposes, can be expressed in a suitably restricted problem language. Automatic translation from such languages into “computer machine language” is now accomplished through “compiling systems.” This makes it easy for an engineer or a scientist to direct the computer to do his computations, after spending less than a day in learning the rules of the game.

This is an important first step, but only a first step, toward the much more difficult problem of translating from one human language to another. Much that has been learned in making general-purpose compilers to convert one restricted language into another is, we believe, applicable to the broader problem. However, significant progress in dealing with human languages is dependent upon how well we succeed in programming a computer so that it will “learn as it goes.” There are exciting projects now blazing new trails in this field. Experimental work ranges from chess playing to proving theorems in logic or geometry. We shall eventually see polished machine translations produced by equipment which learned how by experience. It will be more practical to give the machine a learning program than to discover for ourselves the complex rules which we would have to supply to obtain the equivalent ability.

The Perceptron. The organization of the Perceptron (Cornell Aeronautical Laboratory) is thought to be analogous to parts of living perception-brain systems. Whether this turns out to be so or not, learning experiments with Perceptron-like devices will provide a strong stimulus to work on machines which will be able to exhibit pattern-response and “concept recognition.” At present, the elements of the Perceptron are gross and cumbersome compared to the neat little cells with which the living brain operates. It remains for the chemists to change all that. Just recently deoxyribonucleic acid (DNA) has been synthesized.
It is really not unreasonable to expect that in years to come our physical, chemical, and biological sciences will provide synthetic counterparts of living brain cells, which can be organized into efficient data-processing systems by methods only vaguely foreshadowed by the Perceptron of today.

It is probable that everything I have touched on will be old and uninteresting in 25 years. Then it will be time for another look ahead.
J. W. MAUCHLY Remington Rand Univac, Philadelphia 29, Pa.