
additional columns, molecular species of PI are resolved. The results obtained with the mixed phases can be improved by the use of the switching system; in this case the separation of PG from the solvent front can be achieved with a retention time lower than 4 min. This column-switching system keeps the advantages of the single silica column (high efficiency) for most solutes and resolves the remaining complex solvent front (chloroform, phosphatidylglycerol, phosphatidylinositol). Rapid determination of phosphatidylglycerol and phosphatidylinositol, in less than 6 min, was carried out (Figure 12). Simultaneously, the other phospholipids are isocratically eluted with a complete run time of 18 min and a sensitivity higher than in a gradient mobile-phase elution.

Application of the Coupled System: Pulmonary Surfactant Analysis. The separation of phospholipids with the switching device is applied to the analysis of ox pulmonary surfactant. The compositions of phosphatidylglycerol (5%) and phosphatidylethanolamine (15%) have been determined and differ from the data obtained by Dethloff (23) for a material of equivalent origin (phosphatidylglycerol = 15% and phosphatidylethanolamine = 5%). With the chromatographic system used in this report, an increased resolution for phosphatidylglycerol and other phospholipids is achieved and permits better accuracy in the quantitative analysis. In addition, the separation of phosphatidylinositol molecular species gives a new chromatographic profile of the lung surfactant. An automated version of the switching system will allow the elution of all phospholipid classes by heart-cut techniques with different mobile phases. It could be suitable for routine analysis of phospholipids, especially to investigate fetal pulmonary maturity in the amniotic fluid.

ACKNOWLEDGMENT

We are indebted to Martin Czok and Emmanuelle Brialix for helpful discussions and technical assistance.

REFERENCES

(1) Denizot, A. B.; Tchoreloff, P. C.; Bonanno, L. M.; Proust, J. E.; Lindenbaum, A.; Dehan, M.; Puisieux, F. Med. Sci. 1991, 7, 37-42.


(2) Goerke, J. Biochim. Biophys. Acta 1974, 344, 241-281.
(3) Possmayer, F.; Yu, S. H.; Weber, J. M.; Harding, P. G. R. Can. J. Biochem. Cell Biol. 1984, 62, 1121-1131.
(4) Jobe, A.; Ikegami, M. Am. Rev. Respir. Dis. 1987, 136, 1258-1275.
(5) King, R. J.; Carmichael, M. C.; Horowitz, P. M. J. Biol. Chem. 1983, 258, 10872-10880.
(6) Whitsett, J. A.; Hull, W.; Ross, G.; Weaver, T. Pediatr. Res. 1985, 19, 501-508.
(7) Crawford, S. W.; Mecham, R. P.; Sage, H. Biochem. J. 1986, 240, 107-114.
(8) Whitsett, J. A.; Ohning, B. L.; Ross, G.; Meuth, J.; Weaver, T.; Holm, B. A.; Shapiro, D. L.; Notter, R. H. Pediatr. Res. 1986, 20, 480-487.
(9) Yu, S. H.; Possmayer, F. Biochem. J. 1986, 236, 85-89.
(10) Ross, G. F.; Notter, R. H.; Meuth, J.; Whitsett, J. A. J. Biol. Chem. 1986, 261, 14283-14291.
(11) Kiuchi, K.; Ohta, T.; Ebine, H. J. Chromatogr. 1977, 133, 228-230.
(12) Yandrasitz, J. R.; Berry, G.; Segal, S. J. Chromatogr. 1981, 225, 319-328.
(13) Marion, D.; Douillard, R.; Gandemer, G. Etud. Rech. 1988, 3, 229-234.
(14) Hundrieser, K.; Clark, R. M. J. Dairy Sci. 1988, 71, 81-87.
(15) Weaver, T. E.; Whitsett, J. A. Semin. Perinatol. 1988, 12, 213-220.
(16) Geurts van Kessel, W. S. M.; Hax, W. M. A.; Demel, R. A.; De Gier, J. Biochim. Biophys. Acta 1977, 486, 524-530.
(17) Hax, W. M. A.; Geurts van Kessel, W. S. M. J. Chromatogr. 1977, 142, 735-741.
(18) Jungalwala, F. B.; Evans, J. E.; McCluer, R. H. Biochem. J. 1976, 155, 55-60.
(19) Chen, S. S. H.; Kou, A. Y. J. Chromatogr. 1982, 227, 25-31.
(20) Briand, R. L.; Harold, S. J. Chromatogr. 1981, 223, 277-284.
(21) Paton, R. D.; McGillivray, A. I.; Speir, T. F.; Whittle, M. J.; Whitfield, C. R.; Logan, R. W. Clin. Chim. Acta 1983, 133, 97-110.
(22) Andrews, A. G. J. Chromatogr. 1985, 336, 139-150.
(23) Dethloff, L. A.; Gilmore, L. B.; Garyer, H. J. Chromatogr. 1986, 382, 79-87.
(24) Heinze, T.; Kynast, G.; Dudenhausen, J. W.; Saling, E. J. Perinat. Med. 1988, 16, 53-80.
(25) Smith, M.; Jungalwala, F. B. J. Lipid Res. 1981, 22, 697-704.
(26) Patton, G. M.; Fasulo, J. M.; Robins, S. J. J. Lipid Res. 1982, 23, 190-197.
(27) Cantafora, A.; Di Biase, A.; Alvaro, D.; Angelico, M.; Marin, M.; Attili, A. F. Clin. Chim. Acta 1987, 134, 281-295.
(28) Floyd, T. R.; Hartwick, R. A. In High Performance Liquid Chromatography: Advances and Perspectives; Horvath, Cs., Ed.; Academic Press: New York, 1986; Vol. 4, pp 50-53.
(29) Giddings, J. C. Dynamics of Chromatography: Principles and Theory; M. Dekker: New York, 1965; Part I.
(30) Folch, J.; Lees, M.; Sloane Stanley, G. H. J. Biol. Chem. 1957, 226, 497-509.

RECEIVED for review April 17, 1991. Accepted October 31, 1991.

Counter-Propagation Neural Networks in the Modeling and Prediction of Kovats Indices for Substituted Phenols

Keith L. Peterson

Division of Science and Mathematics, Wesleyan College, Macon, Georgia 31297

Counter-propagation neural networks are applied to the problem of modeling and predicting the Kovats indices of a set of substituted phenols from their nonempirical structural descriptors. The results are compared to those obtained from quantitative structure-chromatographic retention relationships in the form of multivariate linear regression equations. I find that the neural networks are significantly better at modeling the data, typically giving root mean square errors in Kovats indices between 0 and 10, whereas linear regression equations typically give root mean square errors between 50 and 150. The predictions of Kovats indices with neural networks are better than predictions from regression equations by a factor of about 1.25 when the correlation coefficient between the structural descriptors and retention index is low. However, when the correlation coefficient is high, the regression predictions are better than the neural network predictions by factors between 1.5 and 2.0.

I. INTRODUCTION

There have been many attempts at obtaining quantitative structure-chromatographic retention relationships (QSRR's) for given classes of compounds (1). Typically, either empirical physicochemical parameters or nonempirical structural descriptor parameters have been used in order to obtain quantitative or semiquantitative relationships which allow the prediction of the retention behavior of an individual solute of a given class. The QSRR's are often expressed in the form of a linear equation whose independent variables are the parameters mentioned above and whose dependent variable is a Kovats index. The equation is obtained by performing a multivariate linear regression on data for compounds of a given class on a particular stationary phase. In this paper the use of counter-propagation neural networks as an alternative to linear multivariate QSRR's will be introduced.
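For reference, the kind of multivariate linear QSRR just described can be written down in a few lines. The sketch below is my own illustration rather than anything from this paper; the descriptor values and variable names are hypothetical placeholders, and it simply fits the linear equation by ordinary least squares and reports a root mean square error of the kind quoted in the abstract.

```python
import numpy as np

# Hypothetical descriptor matrix (one row per phenol, one column per
# nonempirical structural descriptor) and measured Kovats indices.
# All numbers are illustrative placeholders, not data from the paper.
X = np.array([[2.1, 28.2],
              [3.4, 33.0],
              [1.0, 35.7],
              [4.2, 30.1]])
I_obs = np.array([1350.0, 1480.0, 1455.0, 1520.0])

# Augment with a column of ones for the intercept and solve the
# least-squares problem min ||A b - I_obs||^2.
A = np.hstack([np.ones((X.shape[0], 1)), X])
b, *_ = np.linalg.lstsq(A, I_obs, rcond=None)

# Root mean square error of the fit, and a prediction for a new phenol.
rmse = np.sqrt(np.mean((A @ b - I_obs) ** 2))
I_new = np.array([1.0, 2.8, 31.5]) @ b   # [intercept term, descriptor 1, descriptor 2]
print(rmse, I_new)
```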


modeling and prediction of Kovats indices from nonempirical structural descriptors will be studied for a series of substituted phenols. The neural network results will be compared to those obtained from linear multivariate regression equations.

Back-propagation neural networks have been used to predict the products of electrophilic aromatic substitution reactions for a set of 45 monosubstituted benzenes (2) and to predict minimum effective doses and optimal doses of a set of 39 carboquinones and of a set of 60 benzodiazepines (3). Of these two studies the work in ref 3 is more nearly analogous to the work reported here in that the minimum effective and optimal doses had been previously studied with multivariate linear regression. While the back-propagation results reported in refs 2 and 3 are quite good, I have chosen to use counter-propagation networks in the present work because they are simpler to use; they generally require less trial and error adjustment of parameters in order to achieve useful results. This remark will be explained in section II.

The remainder of the paper is organized as follows. Section II provides an introduction to neural networks in general and counter-propagation neural networks in particular. Section III describes the application of counter-propagation neural networks to a set of substituted phenols and compares the results to those obtained from multivariate QSRR's.

II. COUNTER-PROPAGATION NEURAL NETWORKS

In brief terms, a neural network is a computer model that matches the functionality of the brain in a very simplified manner. The neuron is the fundamental cellular unit of the brain. Its nucleus is a simple processing unit which receives and combines signals from other neurons via input paths. If the combined signal is strong enough, the neuron "fires", producing an output signal which is sent along an output path. This output path divides and connects to other neurons through junctions referred to as synapses. The amount of output signal that is transferred across a junction depends on the synaptic strength of the junction. When the brain learns, it is the synaptic strengths which are modified.

In a neural network a processing element (PE) plays the role of a neuron and has many input paths. The combined input to the PE is passed to its output path by a transfer function. (The transfer function is a function which maps the combined input to some output value; this will be explained in detail shortly.) The output path is connected to input paths of other PE's through weights which are analogous to synaptic strengths. When the neural net learns, it is the weights which are modified.

A neural net consists of many PE's joined by input and output paths. Typically, PE's are organized into a sequence of layers. (See Figure 1.) The first layer is the input layer, with one PE for each variable or feature of the data. The last layer is the output layer, consisting of one PE for each variable to be recognized. In between are a series of one or more hidden layers consisting of a number of PE's which are responsible for learning. PE's in any layer are fully or randomly connected to PE's of a succeeding layer. Each connection is represented by a number called a weight.

In this paper I will be concerned with networks undergoing supervised learning, wherein the network is provided with a set of inputs (in our case a set of pattern vectors, each describing a given phenol by a set of nonempirical structural descriptors) and a set of desired outputs. Each output represents a Kovats index for a given phenol. In this case the inputs are different from the outputs; the network is heteroassociative. Upon repeatedly presenting the input and output sets to the network, the weights between PE's are gradually adjusted so that finally the network will generate the correct output when given a certain input.

Figure 1. Sample uniflow counter-propagation neural network, with layers labeled input, normalizing, competitive, and output. The input layer consists of two PE's, one for each input feature (dipole moment squared, D2, and molecular refractivity, MR; see text). The normalizing layer contains one more PE than the input layer. The competitive layer, containing six PE's, would be appropriate for a data set consisting of six phenols. The output layer contains one PE for the predicted Kovats index value, I.
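To make Figure 1 concrete, the short sketch below (mine, not the author's code) lays out the kind of training set such a network would see: six phenols, each described by the two input features named in the caption and paired with a measured Kovats index, together with the layer sizes of the figure. All numerical values are invented placeholders.

```python
import numpy as np

# Six phenols, two nonempirical descriptors each: dipole moment squared (D2)
# and molecular refractivity (MR). Values are illustrative placeholders only.
descriptors = np.array([
    [1.9, 28.1],   # phenol 1: [D2, MR]
    [2.4, 32.8],   # phenol 2
    [3.1, 30.5],   # phenol 3
    [0.8, 35.2],   # phenol 4
    [2.9, 33.9],   # phenol 5
    [1.5, 29.7],   # phenol 6
])
kovats_indices = np.array([1360.0, 1425.0, 1470.0, 1490.0, 1515.0, 1550.0])

# Layer sizes matching Figure 1: 2 input PE's, 3 normalizing PE's (one more
# than the input layer), 6 competitive PE's (one per phenol), 1 output PE.
n_input, n_normalizing, n_competitive, n_output = 2, 3, 6, 1
```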

After this training or learning phase the network can be presented with an input whose corresponding output is unknown. When this input is fed through the successive PE layers, it is transformed into a predicted output (Kovats index). This prediction process can also be used with inputs whose outputs are known in order to test the training level of the network. If the network is fully trained, the predicted output should match the known output.

I now give a brief description of the operation of a typical PE in a general neural network, with particular attention being given to points which are relevant to counter-propagation networks. For further details about neural networks the reader is referred to the excellent books on the subject, some of which are given in refs 5-9. Counter-propagation networks were invented by Hecht-Nielsen (10), who noted that they were well suited to statistical analysis and function approximation; they should therefore have potential usefulness in the present application.

During the operation of a network a PE will receive inputs from other PE's to which it is connected. These inputs are the outputs of the connected PE's and are combined via a summation function. I will be concerned with two types of such functions. The first is the weighted sum

I_i = Σ_j W_ij x_j    (1)

where I_i is the result of the summation function, W_ij is the weight of the connection from processing element j to processing element i, and x_j is the output of the PE from which the connection originates. The second summation function is a normalization function whereby the input vector to the network (x_1, x_2, ..., x_n) is replaced by an augmented vector (x_0, x_1, ..., x_n), where x_0 is chosen such that the augmented vector is normalized.

The result of the summation function for a PE is transformed to an output of the PE with a transfer function. I will be concerned with only one transfer function

T_i = I_i    (2)

This is a linear transfer function where I_i is, for example, obtained from eq 1. I note in passing that back-propagation networks typically use a sigmoidal transfer function

T_i = [1 + exp(-I_i G)]^-1    (3)

where G is an adjustable parameter, the optimum value of which must usually be found by trial and error. In some cases using the wrong value of G can lead to a back-propagation network which fails to converge to the desired outputs.
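The two summation functions and the two transfer functions just introduced can be summarized in code. This is my own sketch, not the paper's implementation; in particular it assumes the raw inputs have been pre-scaled so that a real x_0 exists, since the text only states that x_0 is chosen to normalize the augmented vector.

```python
import numpy as np

def weighted_sum(W_i, x):
    """Eq 1: I_i = sum over j of W_ij * x_j for a single processing element."""
    return np.dot(W_i, x)

def normalize_augment(x):
    """Normalization summation function: prepend x_0 so the augmented vector
    (x_0, x_1, ..., x_n) has unit length.  Assumes the raw inputs are already
    scaled so that sum(x_j^2) <= 1."""
    x = np.asarray(x, dtype=float)
    x0 = np.sqrt(max(0.0, 1.0 - np.sum(x ** 2)))
    return np.concatenate(([x0], x))

def linear_transfer(I_i):
    """Eq 2: T_i = I_i, the only transfer function used in this work."""
    return I_i

def sigmoid_transfer(I_i, G):
    """Eq 3: T_i = 1 / (1 + exp(-I_i * G)); G is the adjustable gain that
    back-propagation networks must tune by trial and error."""
    return 1.0 / (1.0 + np.exp(-I_i * G))
```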


Other transfer functions commonly used with back-propagation networks also contain adjustable parameters. Thus, one potential advantage of counter-propagation networks over back-propagation networks is the absence of an adjustable parameter in its transfer function.

The result of the transfer function is then acted upon by an output function. Most often, the output function merely passes along the value obtained from the transfer function. (This is referred to as a direct output function.) However, in counter-propagation networks one of the hidden layers utilizes an output function which allows for competition among the PE's of the layer. Each PE in the hidden layer will have a number associated with it as a result of the transfer function. The PE with the highest number is said to win. Use of a "one-highest" output function allows the value from the winning PE to be passed to PE's in the next layer but prevents values from all other PE's in the layer from being passed. A consequence of competition is that only weights associated with the winning PE can change. In a competitive layer of a counter-propagation network, the weights change according to the Kohonen (7, 9) learning rule

W'_ij = W_ij + C(x_ij - W_ij)    (4)

where W_ij is the weight (see after eq 1) before learning has occurred, x_ij is the jth input to the ith PE in the layer, C is a learning coefficient, and W'_ij is the weight after learning. The learning coefficient satisfies 0 < C

The first hidden layer is the normalizing layer, which contains one more PE than the input layer. This layer ensures that every input vector has the same length, i.e., that the vectors are normalized. This layer utilizes the normalization summation function, a linear transfer function, a direct output function, and no learning rule. The second hidden layer is the competitive layer, which acts as a nearest-neighbor classifier. The PE's in this layer learn with the Kohonen learning rule. During training each PE competes with the others in the layer and is equally likely to win for any randomly chosen input. For a given input vector, however, only one PE in this layer wins. The PE that wins is the PE which has the highest output. Equivalently, the winning PE is the PE whose weight vector (i.e., the set of numbers W_ij) is closest to the input vector. This is what is meant by the phrase "nearest-neighbor classifier". The competitive layer utilizes a weighted sum summation function, a linear transfer function, a one-highest output function, and the Kohonen learning rule. The output of the winning competitive layer PE is the input to the output layer, which in our case contains one PE. This PE will output the predicted Kovats index for a given input vector (i.e., a phenol) by using the Widrow-Hoff learning rule to interpret the output from the competitive layer. For a given input to the network there will be only one output from the competitive layer, that output coming from the winning PE. The Widrow-Hoff rule enables the output PE to associate a Kovats index (i.e., desired network output) with each competitive layer
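Putting the pieces of this section together, the sketch below shows one possible implementation of training and prediction for a uniflow counter-propagation network of the kind described here. It is my own reading of the text, not the author's program: the competitive PE with the highest output wins, its weight vector moves toward the normalized input by the Kohonen rule (eq 4), and a Widrow-Hoff-style update associates the desired Kovats index with that winning PE. Learning coefficients and epoch counts are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize_augment(x):
    # Same normalization as in the earlier sketch (x_0 makes the vector unit length).
    x = np.asarray(x, dtype=float)
    return np.concatenate(([np.sqrt(max(0.0, 1.0 - np.sum(x ** 2)))], x))

def train_counterprop(X, I_desired, n_epochs=200, C=0.1, C_out=0.1):
    """X: (n_samples, n_features) descriptors pre-scaled so each row has norm <= 1;
    I_desired: the corresponding Kovats indices.  Returns (W_comp, w_out)."""
    Xn = np.array([normalize_augment(x) for x in X])
    n_samples, n_aug = Xn.shape
    n_comp = n_samples                       # one competitive PE per phenol, as in Figure 1
    W_comp = rng.normal(size=(n_comp, n_aug))
    W_comp /= np.linalg.norm(W_comp, axis=1, keepdims=True)
    w_out = np.zeros(n_comp)                 # Kovats index associated with each competitive PE

    for _ in range(n_epochs):
        for x, I_d in zip(Xn, I_desired):
            i = int(np.argmax(W_comp @ x))   # winner: highest output (nearest neighbor for unit vectors)
            W_comp[i] += C * (x - W_comp[i])         # Kohonen learning rule, eq 4
            w_out[i] += C_out * (I_d - w_out[i])     # Widrow-Hoff-style output association
    return W_comp, w_out

def predict(x, W_comp, w_out):
    """Predicted Kovats index: the value stored at the winning competitive PE."""
    i = int(np.argmax(W_comp @ normalize_augment(x)))
    return w_out[i]
```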