Math Is Cheaper than Physics - American Chemical Society

Focus

Math Is Cheaper than Physics

Analytical experiments can provide more information about chemical systems than ever before. However, extracting information from the data generated in those experiments can be a painstaking process. In the past few years a number of "chemometric" techniques have been developed to help researchers turn data into real knowledge about chemical systems. However, few chemists have had access to these techniques up to now. That is why there is a move afoot to put chemometrics right into the instruments used by analytical chemists. In fact, there are some who predict that this development is on the same order of importance for the coming decade as the incorporation of microprocessors into instruments was for the last.

Of course dedicated software for analytical instrumentation has been available for some time. But chemometricians like to emphasize that most current instruments are really incredibly dumb relative to the current state of the art of information processing. According to Bruce Kowalski of the University of Washington, "There are no instruments out there right now that can do anything beyond multiple regression analysis, as far as I know." And Dave Duewer at Monsanto says, "I haven't seen anything capable of multivariate statistics yet. One company has come out with an eight-channel UV spectrometric liquid chromatography detector. Now if they can kick that up to 128 channels, we can start to do some real data analysis.

"The CAT scanners, whole-body NMRs, and PET scanners are beautiful applications of multivariate analysis," says Duewer. "They take signals with little meaning and apply some rather sophisticated number crunching to produce some beautiful information. I'm disappointed that this concept has taken so long to catch on with the analytical instruments, perhaps because there's less money in them compared to the tomography market."

0003-2700/82/A351-1379$01.00/0 © 1982 American Chemical Society

Alice Harper of the University of Utah tells us that the presence of a big computer alone will not be the major selling point for the analytical instrument of the future: "The direction of automation for the next 10 years is going to be the inclusion of chemometric tools in the data systems."

The direction of automation for the next 10 years is going to be the inclusion of chemometric tools in the data systems. - Alice Harper

What are some of the chemometric tools Kowalski, Duewer, and Harper are referring to, and what can they do for the analytical chemist? In general, these tools help the chemist design experiments and extract as much information as possible from complex data sets.

If natural phenomena could all be described as simple two-dimensional plots, perhaps we would not need chemometrics. The problem, according to Kowalski, is that "life and nature are not two-dimensional, but multivariate. In complex systems, many things are varying at the same time. It's amazing what you can extract from a data matrix and a system under study when you look at several things varying simultaneously.

"When we have two variables we have a two-dimensional plot," Kowalski continued. "If there are three variables, a plot can still be represented on paper. But what happens if there are 4 variables, 10 variables, or even 100? The plots are still there; we just can't see them."

As an example of how this multidimensional information can be manipulated, Kowalski tells of a group of biomedical researchers at the University of Washington that was monitoring the concentrations of six blood enzymes in a liver study. "I said, 'Let's make a six-dimensional plot.' When they heard the words 'six-dimensional plot,' they went away for a while, but they came back." The structure of such complex data (how the points are spread around and their relationships to one another) cannot normally be seen, so Kowalski used a chemometrics technique to project the six-dimensional data down to two dimensions with minimal loss of information. The biomedical researchers could now see 75% of the experimental information displayed in only two dimensions, and they went away again, this time happy instead of confused.

Then there is the tale of the bad whiskey. Some forensic scientists in the West had a terrible problem: Malefactors were putting bad whiskey in good whiskey bottles. The forensic scientists needed a simple instrument that could distinguish bad whiskey from good. It also had to be portable enough to be carried in the trunk of a police car. Running chromatograms of good and bad whiskeys on a simple GC, Kowalski and co-workers consistently ended up with 17 peaks per sample. They then searched for an area of 17-dimensional space that contained the expensive whiskeys and excluded the cheap whiskeys. "The problem was easily solved," he says. "In fact, we found we could reduce the problem down to two dimensions. Only two of the peaks, isoamyl alcohol and acetaldehyde, were needed to make a complete distinction." Kowalski also found a bargain, an inexpensive scotch whiskey lying very close to the high-quality domain, but he adamantly refuses to divulge the identity of this beverage.

ANALYTICAL CHEMISTRY, VOL. 54, NO. 13, NOVEMBER 1982 · 1379 A

Another application of chemometrics involves curve resolution software to deconvolute fused peaks. These peaks occur when more than one component elutes at the same time. This problem can be solved in liquid chromatography (LC), for instance, by simultaneously monitoring a series of absorbance wavelengths to generate a multivariate data set. By examining the variation at n + 1 spectral windows across the peak, the principal component analysis algorithm can determine whether or not up to n components are present under any peak. The individual contributions of up to two components to any fused peak can then be calculated. In effect, the components are resolved with software rather than hardware.

"Two LC companies will introduce this curve resolution technology at the next Pittsburgh Conference for sure," Kowalski predicts. "The company Infometrix, of which I was a cofounder,
is putting that software into those instruments. It won't be very long until all the LC companies have multichannel spectral detectors to take advantage of this multivariate curve resolution technology."

Two LC companies will introduce this curve resolution technology at the next Pittsburgh Conference for sure. - Bruce Kowalski

"We can tell the analyst how many components are under each fused peak," explains Gerry Erickson, president of Infometrix (Seattle, Wash.). "At present, we cannot quantitatively resolve the individual contributions of more than two components. But we are working to extend the applicability of the technique to more than two."

Erickson is also interested in putting his principal component analysis software into gas chromatography/mass spectrometry (GC/MS) systems, where the information-rich detector is capable of generating a great deal of multidimensional data on each chromatographic peak. With GC/MS, his software system would not only be able to resolve the individual chromatographic contributions of up to two components for each peak, but would also be able to generate the best mathematical solution for the individual mass spectra of those components.

Infometrix, which provides consulting services and software licensing in multidimensional data analysis, was strictly a basement/garage type of operation when it was cofounded by Erickson and Kowalski in 1978. For the first 2 1/2 years, Erickson was the only full-time employee. Now there are four employees, including James Koskinen, who recently came on board to focus on chemometrics software development. "We've been providing consulting services to groups trying to come up to speed in pattern recognition and multidimensional data analysis on fairly complex problems," explains Erickson. "We think there are several mathematical techniques that could have very broad application, that could become very useful tools for the analytical chemist."

There is no doubt that chemometrics has much to offer. But up to now the field has had little impact outside of academia. "Before information processing will be used in industry," says Duewer of Monsanto, "it will have to be built into the machines. You would almost never do a Fourier transform analysis unless the software were inside the instrument. Nobody ever worries now about the inversion of a Fourier signal, because that is somehow taken care of within a black box.
For a multicomponent statistical technique to be really accepted by bench chemists, it has to be made invisible. It has to be like the Fourier transform package."

For a multicomponent statistical technique to be really accepted by bench chemists, it has to be made invisible. It has to be like the Fourier transform package. - Dave Duewer

"The best approach we have seen in the near term for implementing curve resolution and making it available to scientists is to actually put the module within specific instrument systems and to, in effect, make the mathematics a part of the instrument," says Erickson. "Analytical chemists in general probably don't need to know or want to know the particulars of the detailed mathematical steps involved. All they need to know is if the software is performing according to specifications."

Mike Parsons of Arizona State University agrees that "the instrument manufacturers are moving in that direction. I also think it would be natural for them to start thinking about computer graphics for pattern recognition." With computer graphics it has become possible to look at three-dimensional projections of multidimensional data sets, enabling researchers to "make use of the uniquely human ability to recognize meaningful patterns in the data" (Kolata, G. "Computer Graphics Comes to Statistics," Science 1982, 217, 919-20).

Unfortunately, no one is expecting computer graphics to appear in analytical instruments any time soon. "There are quite a few people who are into 3-D graphics software," says Alice Harper. "But very few are designing graphics systems for instrumentation. Until now, I really don't think the huge demand has actually been there."

But some of the nongraphical chemometrics software should be commercially available soon. "I think most of the major manufacturers are beginning to look at multivariate statistics," says Monsanto's Duewer. Wade Fite, president of Extranuclear Labs in Pittsburgh, who has been following some of Alice Harper's chemometrics research with interest, says he has been "very much impressed at how some real information can be extracted from such complex data. With the advances going on, why, math is getting cheaper than physics."

Stuart A. Borman
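Two of the calculations described in this article can be illustrated with a minimal Python sketch: projecting six measured variables onto two dimensions, and counting the components under a fused peak. The article does not say which algorithm Kowalski used for the liver-enzyme projection. Principal component analysis by singular value decomposition is assumed here. The function names `pca_project` and `count_components`, the tolerance `rel_tol`, and all of the data are hypothetical, synthetic stand-ins, not values from the studies described.

```python
import numpy as np


def pca_project(data, n_components=2):
    """Project the rows of `data` onto the leading principal components.

    Returns (scores, retained), where `retained` is the fraction of the
    total variance captured by the components that were kept.
    """
    centered = data - data.mean(axis=0)  # remove each variable's mean
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T  # coordinates in reduced space
    variance = s ** 2
    retained = variance[:n_components].sum() / variance.sum()
    return scores, retained


def count_components(spectra, rel_tol=1e-8):
    """Estimate how many independent components underlie `spectra`.

    Counts singular values larger than `rel_tol` times the largest one.
    The matrix is deliberately not mean-centered.
    """
    s = np.linalg.svd(spectra, compute_uv=False)
    return int((s > rel_tol * s[0]).sum())


rng = np.random.default_rng(0)

# Synthetic stand-in for six enzyme concentrations in 40 samples,
# generated from two hidden factors plus noise (an assumption).
latent = rng.normal(size=(40, 2))
mixing = rng.normal(size=(2, 6))
enzymes = latent @ mixing + 0.3 * rng.normal(size=(40, 6))
scores, retained = pca_project(enzymes)
print(scores.shape, f"variance retained: {retained:.0%}")

# Synthetic fused peak: two co-eluting components seen at 8 wavelengths.
t = np.linspace(0.0, 1.0, 50)
profiles = np.stack(
    [np.exp(-((t - 0.45) / 0.1) ** 2), np.exp(-((t - 0.55) / 0.1) ** 2)],
    axis=1,
)  # elution profiles, shape (50, 2)
component_spectra = rng.uniform(0.1, 1.0, size=(2, 8))
fused = profiles @ component_spectra  # noise-free data matrix, shape (50, 8)
print("components under the peak:", count_components(fused))
```

On measured data, noise makes every singular value nonzero, so the threshold in `count_components` would have to be chosen relative to the instrument's noise level. The tiny default suits only the noise-free example.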