Ind. Eng. Chem. Res. 1989,28, 1619-1628
1619
PROCESS ENGINEERING AND DESIGN Pattern Recognition in Computer-Aided Synthesis of Industrial Chemical Reactions Daniel S. Tow and Dale F. Rudd* Chemical Engineering Department and Mathematics Research Center, University of Wisconsin-Madison, Madison, Wisconsin 53706
We report on the beginnings of a program that bridges the interface between the science and the economics of industrial chemistry. The general objective is as follows: given an industrial economy and a well-developed base of industrial chemical reactions, synthesize a set of new chemical reactions that hold the promise of being both chemically feasible and economically viable. Using well-known principles, we have prepared a chemical reaction synthesis program to generate hypothetical chemical reactions that have some degree of chemical and economic reality. Using new principles based on vector space pattern recognition, we explore the properties of a pattern recognition program that is capable of recognizing, with a 2 order of magnitude improvement in discrimination, hypothetical reactions that have similar chemical and economic characteristics to the known industrial reactions. It is suggested that combination of the two programs can lead to a powerful tool to explore for reactions that have attractive chemistry and easily can be incorporated into an industrial economy. 1. Introduction
In the last decade, we have focused our attention on the interface between the science and the economics of industrial chemistry. On both sides of this interface, industrially useful developments have been reported. Economic Assessment Program. These programs determine the economic potential in the world economy of given chemical reactions (Rudd et al., 1981). In a recent proprietary study, the economic promise was assessed in the United States, Western Europe, and Japan of new catalytic reactions that are under development (Staff, 1985). Catalyst Design Advisory Program. These programs aid the chemist in the interpretation of experiments in heterogeneous catalysis (Dumesic et al., 1987). For example, a link has to be made between molecular orbital calculations of surface bond energies, spectroscopic studies of model catalyst surfaces, microkinetic mechanisms of chemical reactions, and the rates and selectivity of industrial catalysts. A wide variety of industrial applications have been reported (Dumesic et al., 1989). In this paper, we report on the beginnings of a program that bridges the interface between the science and the economics of industrial chemistry (Tow, 1986). The general objective is as follows: given an industrial economy and a well-developed base of industrial chemical reactions, synthesize a set of new chemical reactions that hold the promise of being both chemically feasible and economically viable. We have prepared a chemical reaction synthesis program to generate hypothetical chemical reactions that have some degree of chemical and economic reality. We do not seek to reproduce the elaborate programs already in application; rather, we require access to a wide variety of hypothetical chemical reactions upon which to explore methods of pattern recognition. In this work, the compound set is restricted to simple compounds, inorganics and organics 0888-5885/89/2628-1619$01.50/0
having four or fewer carbons, with no carbon chain having more than three carbons. Only a very small fraction of the reactions synthesized have significant industrial potential. In one test case, the program generated 15392 chemical reactions, only 54 of which were identified as established industrial chemical technology. We have prepared a pattern recognition program that is capable of examining known industrial chemical reactions and training itself to recognize hypothetical reactions that have similar chemical and economic characteristics to the known industrial reactions. From the set of 15392 hypothetical chemical reactions, the pattern recognition program selected a set of the 100 most-promising chemical reactions. This set contained 48 of the previously identified industrial reactions and 52 new chemical reactions that are chemically and economically similar to the industrial reactions. The use of pattern recognition theory is the original contribution of this work. A combination of the chemical reaction synthesis program and the pattern recognition program can form a powerful tool to explore reactions that have attractive chemistry and easily can be incorporated into an industrial economy. 2. Generation of Hypothetical Reactions The generation of the hypothetical reactions begins wit1 a statement of the industrial chemical economy into which new industrial chemical reactions are to be integrated. In this work, the economy is similar to the commodity chemical industry of the United States in 1983. This information establishes a preliminary pricing and production structure for the variety of commodity chemicals, a structure that is used to direct chemical reaction synthesis toward large volume products. Next a set of commodity chemicals is selected to define the area of chemistry within which the hypothetical reI
0 1989 American Chemical Society
1620 Ind. Eng. Chem. Res., Vol. 28, No. 11, 1989 actions are to be synthesized. Combinations of these chemicals form possible reactants, intermediates, and products of the hypothetical reactions. Two general areas are identified. Market-Limited Chemicals. Chemical reactions that produce commodity chemicals are limited by the available supply of feedstocks and existing demands for the products. The market limits are determined by the chemical economy under consideration. Utility and Fuels. Chemical reactions can be valuable for the production of heat and fuels. The production of heat and fuel is not limited by the defined chemical economy, as they may be sold outside the chemical industry. The chemical reaction synthesis program assembles a wide variety of chemical reactions that seem to be feasible chemically, that is, reactions that have favorable thermodynamics, that involve plausible rearrangements of chemical structure, and that are driven by the breaking and forming of defined chemical bond types. Also favored are reactions that lead toward reasonable costs of separation of the products and positive venture profits. The hope is to avoid the incredible combinatorial problems associated with combining chemicals at random and calling the result a chemical reaction. We seek to synthesize hypothetical reactions that appear superficially to be chemically and economically reasonable on the industrial scale. By use of 97 chemicals combined into sets of reactants, intermediates, and products, the chemical reaction synthesis program generates 15 392 hypothetical reactions, among which 54 are identified as existing chemical technologies. This is the discrimination accuracy expected at this initial stage of reaction generation. The computer-aided synthesis of chemical reactions is a well-advanced field of research and certain useful basic principles are available (Govind and Powers, 1981; Rotstein et al., 1982; Fornari et al., 1989). In the next several sections, we apply these principles to the development of the chemical reaction synthesis program. It is important to outline the development of this program; the program screens out a large fraction of the thermodynamically infeasible and structurally unpromising chemical reactions. This reduces the scope of properties that must be included in the pattern recognition program. Thus, although well-known principles are used in the synthesis program, it is necessary to spend some time describing how those principles are used. The nature of the pattern recognition problem is directly dependent on the class of chemical reactions prepared by the synthesis program. 2.1. Data Filling. The initial stages of the program form data fies on all the chemicals to be used in generating the chemical reactions. 2.1.1. Chemical Formulas. The chemical data filer prompts for the stoichiometric formulas of the chemicals. The formulas are entered as [ni,ai]where i = 1, ..., m denotes the different elements in the compound, niis the number of atoms of the ith element, and aiis the atomic number of the ith element. Regardless of the order in which the elements are arrayed in the input formula, the program standardizes the formula for ascending atomic numbers. These standardized formulas are necessary in the subsequent efficient balancing of the stoichiometric reactions. As the formulas are fded, the program also generates and files two lists, a list of all the elements occurring in the formulas of the data set and a reference list of the compounds having each element as the highest numbered elements in the compound. The latter list is useful in
efficiently compiling the groups of compounds to be combined for the stoichiometric reactions. 2.1.2. Chemical Bonds. The next stage of the chemical data filer prompts for the chemical bonds in the chemicals. A bond t y p e is determined by the elements on each end of the bond and by the multiplicity of the bond. A double bond is treated as two bonds, a first (single) bond and a second bond, while a triple bond is treated as three bonds, a first, a second, and a third bond. Ionic bonds, easily broken in an aqueous solution, are all recorded separately. Regardless of the order by which the bond types are entered, the program standardizes the data for descending bond-type code numbers. This standardized order is necessary in the efficient analysis of bonds broken and formed in the reactions. 2.1.3. Thermodynamic Data. The next stage of the chemical data filer prompts for the thermodynamic data for the chemicals. Approximate expressions for the standard Gibb’s free energy and enthalpy of formation as a function of temperature can be derived from the filed data on the standard free energy and enthalpy at 298 K and temperatures and enthalpies of melting and boiling. Decomposition points are also needed to avoid generating infeasible reactions that only appear to be thermodynamically feasible at temperatures that would decompose the reactants or the products before they could react. For example, it is meaningless to consider a reaction involving glycerin at 1000 K since any glycerin reactant will decompose at 563 K, before it reaches the temperature of reaction, and any glycerin product at 1000 K will decompose before it can be cooled to stability. Though a few compounds have sharp decomposition points, most compounds gradually grow less stable over a broad range of temperatures. Thus,for most compounds, the decomposition point is poorly defined. For the purpose of this work, the approximate temperatures used are adequate. Where there is doubt about the decomposition points, temperatures erring on the high side are used since sometimes the reactions sought are the decomposition reactions, which would be excluded if the decomposition points were set too low. Apart from their use in evaluating thermodynamic feasibdities, the melting, boiling, and decomposition points and the enthalpies of boiling are used to estimate the costs of separation of coproducts in the economic evaluation of the reactions. Certain chemicals lack melting or boiling points. Since the model uses a thermodynamic analysis of conditions at 1atm, a compound that sublimes without melting at 1atm has no relevant melting point. Data are then supplied, indicating a false melting point at 1K less than the sublimation temperature. This false melting point is paired with a 0 kcal/mol enthalpy of melting so that the pseudomelting point has no effect on the later thermodynamic analyses. The sublimation temperature and the enthalpy of sublimation replace the boiling point and the enthalpy of boiling. Both melting and boiling points can be meaningless if thermal decomposition occurs in the solid or the liquid phase. Here, any nonexistent temperature of phase transition should be set at some convenient point, say 2000 K, well above the decomposition point supplied. Dummy data of 0 kcal/mol are supplied for the corresponding enthalpies of phase transition, All data for phase transitions occurring above the decomposition temperature are ignored by later thermodynamic analyses. 2.2. Generation of Candidate Reactions. The synthesis and evaluation of new reactions begins with the
Ind. Eng. Chem. Res., Vol. 28, No. 11, 1989 1621 generation of large numbers of stoichiometricallybalanced reactions. Because it is not possible to examine every conceivable reaction individually in even the most cursory detail, it is necessary to narrow the search to a manageably small subset of the more promising reactions. This subset should be delimited by properties that can be known of the reactions even before the reaction stoichiometries have been found. 2.2.1. Structural Similarity. The reactions generated in this search are limited primarily by the choice of chemicals to be used in the reaction stoichiometry. Structurally complex products are most often synthesized from structurally similar precursors using narrowly defined reaction types, such as aldol condensation and Grignard reactions, involving breaking and forming of defined chemical bonds. Given the low probability of finding any kinetically reasonable mechanism to produce a structurally complex product without using a known, well-tested reaction type, structurally driven synthesis is most appropriate for generation of reactions involving structurally complex compounds. In this work, the compound set is restricted to simple compounds, inorganics and organics having four or fewer carbons, with no carbon chain having more than three carbons. 2.2.2. Industrial Scale. This work seeks new reactions useful to the chemical industry as a whole. These reactions should not only offer a significant profit but should also be reactions that can be profitable on a large scale. To be profitable on a large scale, a reaction must either produce only waste products and heat, as, for example, the combustion of coal by utilities, or a reaction must produce products in high demand. A product having high demand may be in high demand from outside the chemical industry or may be a high-volume precursor for other products in high demand. It is possible to synthesize clusters of new reactions that can have high-volume intermediates that are not now in high demand. These intermediates are produced and consumed in equal amounts and are thereby largely independent of the external market and of the remainder of the chemical industry. The synthesis of reaction clusters to achieve preselected net reactions has been studied (May and Rudd, 1976). Clusters of reactions involving at least one intermediate that is not produced or consumed in high volume in the current industry require development of at least two new high-volume reactions. One new reaction must efficiently produce the intermediate in large amounts exactly matched by the consumption of the intermediate by the other new reaction. Of the 15392 reactions generated in this work, 54 (roughly 1in 300) are existing industrial reactions. Thus, an average new reaction has a low a priori likelihood of succeeding industrially. The probability of simultaneously introducing two or more new reactions for industry is much lower still. The chemicals for reaction in this work are limited to chemicals with the highest demand and availability. In this way, most of the reactions generated hold the promise of being profitable on a large scale without depending on the unlikely development of any complementary new reactions needed to exactly balance a negligible net production and consumption of a low-volume compound. 2.2.3. Stoichiometric Conventions. In balancing a chemical reaction, one stoichiometric coefficient is fixed arbitrarily, effectively normalizing the reaction. With one
coefficient set arbitrarily and with n adjustable coefficients, an appropriate set of n + 1 chemicals can exactly satisfy n element balances. Occasionally, one of the n + 1chemicals in the balanced reaction has a coefficient of zero and thus drops out, leaving only n compounds involved in the final reaction. Very rarely, two or more compounds can drop out of a reaction with several element balances. Most reactions (about 98% in this work), though, show n 1 chemicals involved to give n element balances. As will be demonstrated later, new reactions involving more than four chemicals have a very low a priori probability of success in industry. A few existing industrial reactions do have four and even five element balances, but owing to the combinatoric nature of the problem, vastly more hypothetical reactions would have to be examined with four element balances, while only a few additional useful reactions with four element balances could be found. Therefore, to efficiently generate only reactions with four or fewer chemicals involved, the reactions generated are restricted to only two or three element balances. 2.2.4. Three-Element Limitation. Clearly, since only two or three element balances are allowed in this search, no reactions can contain any compound with more than three elements. A few four-element compounds are nevertheless included in the data set simply to test the program’s ability to handle any of these compounds, should they be included in future data sets. 2.2.5. Degrees of Freedom. A large class of reactions have at least 1hypothetical degree of freedom, which in turn offers the attractive potential for widely adjustable thermodynamics. A degree of freedom occurs when the ratio between some pair of stoichiometric coefficients need not be fixed to obtain a balanced reaction. Generally, these reactions have at least n + 2 chemicals to satisfy only n element balances. There are important drawbacks, however, to these reactions. We shall refer to reactions with no degrees of freedom as irreducible reactions. A reaction having 1 or more degrees of freedom is reducible. In every case, a reducible reaction can be reduced to the sum of at least two simpler, irreducible reactions. For example, the reaction 4C + 0 2 + 2C02 6CO could also be balanced with the coefficients 3C + 0 2 + CO2 4CO
+
-
-+
Since the ratio between the coefficients for C and O2is unfixed, the reaction has a degree of freedom and is reducible. Notice that, with n = 2, n + 2 chemicals are involved in this reaction, having only n element balances. Note, too, that the reaction can be written as the sum of the two irreducible reactions: 2c + 0 2 2co 2C + 2C02 --* 4CO If both irreducible reactions are thermodynamically feasible, the reducible reaction is not necessary, as it can be replaced by independent operation of the irreducible reactions; the adjustable thermodynamics are simply not needed. However, if one irreducible reaction is required to drive the other forward (as for the example at moderate temperatures), the reducible reaction is kinetically unlikely. In the first place, it will be shown that reactions with only a few chemicals involved have the highest a priori probability of success, and even for only two-element balances, a reducible reaction must have at least four chemicals involved. In the second place, for one reaction to drive another forward, conditions are required such that the
-
1622 Ind. Eng. Chem. Res., Vol. 28, No. 11, 1989
reducible net reaction is kinetically feasible (despite the number of chemicals involved) and both of the simpler irreducible reactions are kinetically hindered. This search, therefore, only includes irreducible reactions, reactions uniquely satisfying the element balances. The search avoids almost all reducible reactions by simply not attempting to use more than n + 1 chemicals to achieve n element balances. A very few reducible reactions can be put together with only n 1 chemicals for n element balances. For example, the three-element-balance reaction 3CH20 CO + H, + CH3COOH can be reduced to CH2O CO + H, plus 2CH20 CH3COOH When a linear algebra program detects a problem lacking a unique solution, the reaction is rejected. Of the 97 existing industrial reactions, only one, 0 2 + 2CH3COOH + 2CH2CH2 -+ 2CH3COOCHCH2 2H2 has a hypothetical degree of freedom, further indicating the generally low degree of promise in these reactions. Furthermore, the reaction mechanism almost certainly consumes the degree of freedom in requiring a one-to-one reaction between CH,COOH and CHzCH2so that, even for this unusual example of an industrial reaction with a degree of freedom, the thermodynamic properties are not so adjustable as they first appear. 2.2.6. Order Groups of Reactions. To lower the computer core memory needed for handling the reactions, the reactions are generated and filed in ordered groups. Within the program, each chemical is referred to by one of a set of consecutive integers. The ordered groups for generation and filing are defined by the highest numbered chemical involved in the reaction. Thus, for example, all reactions involving acetic acid and only lower numbered chemicals are generated and filed together. To further order the files, reactions consuming the highest numbered chemical (referred to as forward reactions) are filed separately from reactions producing the same compound (referred to as reverse reactions). The highest numbered chemical in a reaction will be referred to as the reference chemical. 2.2.7. Reaction Synthesis. For the generation of reactions in each file, a list of the elements in the reference chemical is compiled. These elements are called the fixed elements. The rest of the elements occurring in chemicals numbered lower than the reference chemical are called the variable elements. The stoichiometric reaction generator begins with reactions of three element balances, followed by reactions of two element balances, provided the number of fixed elements does not exceed the number of element balances. Zero, one, or two variable elements may need to be added to the fixed elements to reach the desired element balances. For each reference chemical file, all the possible combinations of the variable elements needed for the element balances are generated exactly once. The fixed elements are combined with each variable element combination to produce, exactly once, each ordered list of three or two element balances capable of generating a reaction involving the reference chemical. A matrix M is generated, holding the formulas of all compounds numbered lower than the reference chemical and only having elements on the element balance list. The
+
--+
-+
+
+
program the generates, exactly once, each ordered combination of n chemicals in M to form A. A unique solution is impossible if the rank of A is less than the number of element balances, n. Very rarely, for rank A < n, will there be infinitely many solutions. As already mentioned, these are reducible reactions having at least 1 degree of freedom, and they are unpromising as potential industrial reactions. In either case, the reaction generator simply proceeds to the next combination of chemicals without further consideration of the infeasible or unpromising combination of chemicals. 2.3. Thermodynamic Screening. Each stoichiometric reaction generated is evaluated for its thermodynamic properties. The thermodynamic evaluation covers the promising range of reaction temperatures from 298 K to T,, where T,, is the minimum decomposition temperature of the chemicals involved in the reaction or 1500 K, whichever is smaller. 2.3.1. Free Energy. The thermodynamic evaluation begins with the calculation of the standard Gibb’s free energy, enthalpy, and entropy of reaction at 298 K. The thermodynamic evaluation uses the approximation that the specific heat differences between reactants and products may be neglected because the energies of phase change and reaction are much larger. Under this approximation, the enthalpy and entropy of reaction are constant except at temperatures of a phase change of some chemical involved in the reaction. At phase changes of the chemicals involved, the free energy of reaction is continuous over all temperatures but has a discontinuous slope at each phase transition. To calculate the free energy for the full range of temperatures the program finds the constant enfrom 298 K to T,, thalpies and entropies between each pair of phase changes. A negative standard free energy shows that the reaction can go forward at standard conditions. (Standard conditions are pure solid, pure liquid, or pure ideal gas at 1 atm.) With a high degree of recycle, it is certainly possible for a thermochemical process to achieve a reaction with a low conversion that results from a positive standard free energy of reaction. Positive free-energy reactions may also be achieved by electrolytic processes. Most feasible industrial processes, however, are built around processes that show negative standard free energies at some point in the temperature range examined. Of the existing reactions in the industry model, 54 show negative standard free energies, while only 8 failed to be rediscovered owing to unfavorable standard free energies. Of these, two reactions represent electrolytic processes, five are processes operated above T-, and the last produces a dilute product. All reactions failing to show negative standard free energies of reaction at any temperature in the range examined are discarded. For this simple screening, the approximate thermodynamic analysis used is probably adequate. 2.3.2. Temperature Range. Reactions with negative standard free energies at some temperature in the range examined will be referred to as thermodynamically favorable reactions. Thermodynamic properties are calculated and filed for further evaluation of the reactions in the temperature range Tminto T-, the minimum and maximum temperatures that allowed a negative standard free energy of reaction in the temperature range examined. 2.4. Economic Evaluation. Each thermodynamically favorable reaction is evaluated for its local potential profit P. The estimate of potential profitability has three parts: the difference between product and reactant values, the
Ind. Eng. Chem. Res., Vol. 28, No. 11, 1989 1623 value or cost of heat liberated or needed, and the roughly estimated separation costs, including charges for capital investment. The value or cost of the heat liberated or needed per short-ton-mole reacted is referred to as the heat value of the reaction. The heat value of the reaction is negative for endothermic reactions since these reactions require heat. In general, the cost of providing a given quantity of heat at a given temperature is much greater than the value that could be recovered from the same heat released at the same temperature. This economic inefficiency in heat recovery results from the capital costs of heat transfer that must be paid regardless of the direction of heat transfer. Moderate-temperature heat supplied to or recovered from a process will have a lower absolute value than the same quantity of heat supplied or recovered at a higher temperature. The value of heat is about zero near room temperature, climbing roughly linearly to high values at high temperatures. The function describing the approximate value of heat was derived empirically from the costs of low-temperature and high-temperature steam and from consideration of the costs of fuels (coal and methane) commonly used to prvide heat for utilities. Of course, in industry, the value of heat varies widely with circumstances. However, since the value of heat is usually small in comparison with the chemical value of a reaction, the heat value needs only to be approximate in this context. Sometimes, particularly when products of a reaction are close boiling, the separation costs are important to the economics of a process. In general, the evaluation of separation costs requires a detailed design of the proposed chemical process. Such detailed design could not reasonably be done for all 15392 reactions generated in this search. For a rough, practical first approximation of separation costs, severe simplifications were made. Usually, the separation costs, like the heat values, are small enough that the rough estimates are adequate for this context, particularly since, as will be shown, the most promising reactions tend to have only one product, not requiring any separations. The estimation of separation penalties uses the assumption that only products require separating because conditions of temperature, pressure, and kinetic efficiency allow the nearly complete conversion of reactants. Where a reaction yields only one product or yields two or more products with widely separated melting or boiling points, the separation cost is set at or near zero. Where a reaction yields two or more products with close boiling points, the separation cost is estimated with an automatic, simplified distillation design. In a few cases, distillation is infeasible owing to product decomposition points lying below the potential column operating temperatures. Such separations must be designed individually to find realistic costs of separation by crystallization, extraction, etc. For want of any systematic alternative, a simple cost penalty proportional to the amounts of products to be separated is applied to these special separation cost estimates. The shortcoming of the estimate of gross reaction profitability lies in the costs associated with the reaction kinetics. Slow reactions can require prohibitively expensive reactors, and unfavorable kinetics can allow unwanted side reactions that eliminate any potential for profit. Costs associated with the reaction kinetics can range from near zero costs, as for a simple flame reaction, to very high costs, as for a kinetically infeasible reaction. By neglecting the kinetic costs, the program finds an optimistic upper bound
to the potential reaction profitability. In practice, even an efficient industrial reaction will yield a profit well below this upper bound, but the upper bound does give a useful estimate of how much potential there is to absorb kinetic costs while still yielding some profit. To allow an objective comparison of potential profitability, all costs are normalized for the reaction written, having absolute values of the stoichiometric coefficients summing to one. The complete estimate of local potential profitability is given by P = A V c + vh-c, where P is the potential profitability per short-ton-mole for the normalized reaction, AVc is the chemical value of the reaction, v h is the heat value of the reaction, and c, is the estimated cost, if any, of separating the products of the reaction. P is stored as a reaction property to be used later in estimating potential profitability per year and to be considered for use as a property for pattern recognition of promising reactions. 3. Pattern Recognition The chemical reaction synthesis program generates a set H of hypothetical chemical reactions that superficially appear to be industrially attractive. The chemical reactions are stoichiometrically balanced, have favorable free-energy changes in industrially reasonable temperature ranges, are chemically reasonable as they involve structurally similar compounds in which defined chemical bonds are formed and broken, and appear to lead toward positive local venture profit. These several properties are far from sufficient to ensure commercial feasibility. Methods of pattern recognition must be developed further to screen the large number of reactions generated by a synthesis program. Found in this set, H, are chemical reactions that have already been commercially developed. These known industrial reactions form the set T. We suspect that there are other promising reactions in H waiting to be found. The set of known and unknown industrially attractive reactions is denoted by I. We use vector-space pattern recognition to develop methods to detect the properties of the set I that distinguish the industrially promising reactions in the much larger set of hypothetical reactions H. Pattern recognition is an area of artificial intelligence devoted to methods of training a computer to recognize members of defined categories by example. In this work, the defined category is potentially industrially significant chemical reactions I and the term by example connotes use of the chemical reactions in H that have already been identified as commercial technology. The set of identified commercial processes defines the training set T. We seek to train the computer to recognize new chemical reactions that have the same characteristics for success as those reactions that have already been proven commercially. 3.1. Decision Theory. Reactions belong either to category I , promising industrial reactions, or to a category U , unpromising industrial reactions. Let the property vectors of the reactions in category I be called I vectors and the corresponding vectors for category U be called U vectors. If categories I and U are unambiguously defined, and the properties described by the vectors are linked to membership in I or U , then the set of I vectors and the set of U vectors are disjoint. In this case, it may be possible to find a surface S in the vector space completely dividing
1624 Ind. Eng. Chem. Res., Vol. 28, No. 11, 1989
the I vectors from the U vectors. If such a surface is found, recognition of I vectors and U vectors can be perfect. Normally, the properties used for pattern recognition do not completely determine membership in I or U. The regions of vector space containing I vectors and U vectors overlap, thus making perfect recognition of membership in I or U impossible. When perfect recognition is impossible, a Bayesian approach is useful (Jurs, 1974). Training uectors, known to be I vectors and U vectors, are used to estimate the distributions of all the I vectors and U vectors. With these distributions, one can estimate the probabilities that any unknown vector belongs to I or U. Bayes’s criterion for decision makiig yields the formula, if DIIDU > LI/Lu,then postulate category I. Otherwise, postulate category U. DI and Du are, respectively, the I and U distributional densities (weighted for the a priori probabilities for I and U categories), and LIand Lu are the losses that would result if a postulation of category I or a postulation of category U , respectively, was wrong. The purely theoretical properties of the hypothetical reactions generated will not suffice to fully distinguish the industrially promising reactions from the unpromising. Clearly, at best the theoretical properties of reactions can only find the reactions likely to prove industrially useful. We use vector-space pattern recognition in this work to find the reaction property combinations that tend to promise industrial usefulness. We use set T of 54 existing industrial reactions as training sets to approximate the unknowable underlying distribution of the set of industrially promising reactions. To approximate the underlying distribution of promising reaction properties, the density in the space between individual training vectors is interpolated using a sum of interpolating functions that smoothly fall toward zero with the distance from the training vectors. Since the optimum underlying distribution is unknowable, there is no one ideal means of interpolating. Specht (1969) describes a particularly convenient and physically reasonable interpolation for the densities as follows: m
DA(XI,..., Xp) = Cg(di) t=l
g(di) = (2~)-p/~a-p exp[-(di/2a2)]
where p is the dimensionality of the vetor space, m is the number of training vectors in category A , DA is the interpolated density of the A vectors weighted by the number of training vectors, g is the interpolation function, a multivariate gaussian with the covariance matrix equal to a21(where in this formula alone Z is the identity matrix), u is an adjustable “smoothing parameter”, and di is the Euclidean distance in p space from the vector X = (XI, ...,X p ) to the A-category training vector XA, = ( X A , ~...,, XA. ) * ‘he parameter a is adjusted to large values to obtain the smooth density functions and is kept small to obtain densities closest to the density of the training vectors. Specht shows that if infinitely many training vectors are available the interpolated density, DA, can be made arbitrarily close to the training vector density. In this work, we have at the most 54 training vectors, corresponding to the 54 known industrial chemical reactions out of the 15 329 chemical reactions generated. Our hope is to create sufficiently accurate estimates of the distribution function DIfor all promising reactions,
known and unknown, from training set T of 54 industrial chemical reactions. 3.1.1. Approach Taken. We immediately run into a problem. We have no way of identifying members of the unpromising reaction set, U. We can only identify members of the known industrial reactions T to be used as the training set of approximate the properties of the industrially promising reactions I. Somehow we must work with only the two available sets of reactions: T , the set of known industrial reactions, and H, the set of all hypothetical reactions. This line of thought is taken. If we select a single commercially proven chemical reaction Ti in training set T, the pattern recognition program can be trained to select that single chemical reaction from the set Hi, the set of all hypothetical reactions minus the single reaction Ti.The pattern recognition parameters are adjusted to maximize F = D,/D,z Unfortunately the pattern recognition parameters so selected will only efficiently identify the single technology
Ti. However, if we require that a single set of pattern recognition parameters maximize simultaneously the sum of the criteria for the selection of all of the Ti in the training set, a pattern recognition system is obtained that is biased toward the training set T m
max C D T J D W i=l
Fa,, =
m where m is the number of members of the training set. With the pattern recognition program so trained, it will recognize chemical reactions that are similar to the known set of industrial reactions. 3.2. Pattern Recognition Vector Space. Vector-space pattern recognition uses numerical properties as the basis for recognition. Each reaction under consideration corresponds to a point in vector space determined by selected properties of the reaction. These several reaction properties were examined to isolate those few reaction properties that are most efficient identifiers of promising reactions. The properties selected include those compiled by the computer during the course of the synthesis of the reactions. Pattern recognition in reaction synthesis is dependent on the method of synthesis, and it is to be expected that specially tailored pattern recognition programs will need to be developed for each specific reaction path synthesis program. 3.2.1. Stoichiometric Properties. Simple stoichiometries are thought to favor feasible reaction kinetics. Combinations of these stoichiometric properties may be useful to identify reactions that are easily executed industrially. The stoichiometric properties are as follows: the number of chemicals involved in the reaction, the number of element balances required, the number of reactants (as opposed to products), and the average magnitude of the integral stoichiometric coefficient. For example, the first chemical reaction with a value of 1would seem to be structurally simpler S + C3HB H2S + C3H5 than the second reaction with a value of 5.75 10N02 + 4H20 8HN03 + N2 3.2.2. Thermodynamic Properties. All of the reactions generated have favorable free-energy changes. The magnitudes of these favorable energetics and the tem-+
-
Ind. Eng. Chem. Res., Vol. 28, No. 11, 1989 1625 peratures a t which they are favorable may be important in pattern recognition. Also, the phases of the reactants are thought to influence the engineering of the reaction. The thermodynamic properties are as follows: the maximum temperature of the negative Gibb’s free-energy range, the minimum temperature of the negative Gibb’s freeenergy range, the maximum minus the minimum of the temperature range, the largest value of the negative free energy, the heat of reaction, the minimum number of phases present at all feasible reaction temperatures, and the number of solids present at the highest feasible reaction temperature. 3.2.3. Chemical Bond Properties. The chemical bonds that must be broken and formed to execute a reaction ought to be an indicator of chemical feasibility. For example, in the reaction HzCCH2 + H2O C2H40H
-
on the left-hand side, there are four C-H bonds, one C = C bond, and two O-H bonds, and on the right-hand side, there are five C-H bonds, one C-C bond, one O-H bond, and one 0-C bond. With no additional information, we can state that at a minimum one of the two bonds in C=C and one O-H bond must be broken and one 0-C bond and one C-H bond formed. However, the reaction CH2OHCH20H + C2H6 2CHSCH20H +
could be executed by two mechanisms: breaking two C-C bonds and forming two other C-C bonds; breaking and forming C-0 and C-H bonds. This ambiguity can only be resolved by knowing the detailed mechanism of the chemistry, a property that is known for only a few wellstudied chemical reactions, or by assuming that two hypothetical technologies exist, each with one of the two postulated bond activity assumptions. The net gain and loss of each bond type is referred to as the minimum bond reaction. In most reactions, this gives a clear picture of bond-type involvement and, in addition, the total number of bonds involved, the number of bond types, and the number of single bonds. Single, double, and triple covalent bonds differ greatly in reactivity and stability; the absence of involvement of bond types might be an indicator of the upper bound of reactivity, and for this reason, the bond type least likely to be involved may be an important indicator. The chemical bond properties are as follows: the total number of bonds active, the total number of bond types active, the total number of single bonds active (this is particularly useful, as the involvement of double and triple bonds does not require the breaking of the structure of a molecule, which only occurs with single bond activity), the bond types least likely to be involved, the bond types least likely to be formed and the bond types least likely to be broken. 3.2.4. Economic Properties. The economic properties are as follows: the local profitability of the reaction and the improvement in profitability of the industry as a whole. Fathi-Afshar and Rudd (1981)developed a simple criterion for the economic promise of hypothetical technology, based on the impact that the technology has on the industry into which it is to be integrated. Preconditioning of the training vectors occurs for each component Xj: Xj= ajRj + b, where R . is a function of the property (number of chemicals involved, free energy of reaction, bond ranking number, etc.) chosen so that arithmetic differences have meaning. The parameters aj and bj are used to center X j roughly
around zero and to spread the points in the jth dimension to a standard deviation value of unity. The dimensionality and the reaction properties used to define the vector space strongly influence the convergence of the pattern recognition algorithm. All else being equal, the introduction of added dimensions tends to increase the distance between the neighboring training vectors. This can result in an insufficiently smooth distribution unless the smoothing parameter u in the interpolation function is set high. However, if u is set too high, the information contained in the most useful reaction properties will be smoothed away. We let u = 0.5. The chemical reaction properties used in pattern recognition were considered individually, combined in pairs and then triplets and so forth, leading to 49 individual combinations of reaction properties. For each combination, the optimum Fave was determined, consistent with reasonable convergence. Three chemical reaction properties survived this screening operation. T h e number of chemicals involved in the reactions. This property is useful because only about 2% of the reactions generated to form set H involved only three chemical, yet in training set T , half of the reactions only involved three chemicals. Thus, reactions involving only three chemicals are favored a priori by about 50-fold over four chemical reactions. T h e number of reactants involved i n the reactions. All but one four-chemical reaction of the reactions in training set T have exactly two reactants and two products, and three-chemical reactions are 5 times as likely to involve two reactants and one product than one reactant and two products. Reactions aA + bB cC are favored over reactions aA bB + CC Bond type least likely to be active in the reactions. Owing to the combinatorics of the chemical reaction generation program, 84% of the reactions in H involve single carbon-carbon bonds. Only 2% of the industrial reactions in training set T involve single carbon-carbon bonds. Although single carbon-carbon bond activity occurs in the training set, reactions not involving single carbon-carbon bonds are favored by 130-fold. With these three reaction properties defining the property-vector space, we found that for the reactions in T evaluated as if they were unknown, Fa, = 0.093. If the properties used in the pattern recognition had no relevance, we would expect that Favewould equal the ratio of the number of known reactions to the total number of reactions generated, 54/15 392 = 0.0035. Thus, the pattern recognition program yields an average of 26 times higher estimate of successful selection of the correct technology than random selection. The sum of the a priori probabilities of success for all of the hypothetical reactions is 54, the number of known successful reactions. An unbiased evaluation of success by pattern recognition should also yield a sum of probabilities in the neighborhood of 54. If the sum is much higher than 54, the F values are too high, probably due to poor convergence with the pattern recognition program. This sum was found to be 44.3, reasonably close to, but somewhat lower than, the a priori estimate. This means that the probabilities of success are somewhat conservative.
-
-
4. Discussion of Results
Ninety-two chemicals were filed to initiate the reaction generator. Of these, five are compounds of four elements
1626 Ind. Eng. Chem. Res., Vol. 28, No. 11, 1989
and are ineligible for use in three-element balance reactions. Therefore, only 87 chemicals were used to generate the 15392 reactions that showed at least nominal promise. Ninety-seven existing industrial reactions were filed as models of useful reactions. Of these, 54 existing reactions appeared among the 15 392 reactions generated. 4.1. Known Reactions Not Synthesized. As a check of the reaction generator and of the strategy that, of necessity, narrowed the set of reactions generated, we examine the reasons that 43 existing industrial reactions failed to be rediscovered by the reaction generator. Twenty reactions were missed owing to requiring four or five element balances. Since the reaction generator fails to rediscover these 20 reactions, it undoubtedly misses several new four-element-balance reactions that would be useful to industry. Yet to find these few useful new reactions, we would need to somehow cull almost all the hundreds of thousands of reactions that would result if this search were extended to four element balances. Though this extended search may be possible, it is beyond this current study. Apart from the 20 reactions excluded for having too many element balances, two more reactions were excluded for their stoichiometries. One reaction was excluded for containing two minor market chemicals, and the other is reducible. Thirteen reactions were discarded for economics that were calculated to be unfavorable for general industrial use. The final eight discarded model reactions were calculated to have unfavorable thermodynamics. Of these eight reactions, two are electrolytic, five occur at temperatures above 7'-, and the last produces only dilute HN03 (while the thermodynamics are calculated based on pure materials). 4.2. Expected Economic Impact. The relative ranking of chemical reactions on the F scale measures relative success as perceived by the pattern recognition program. Should a chemical reaction be successfully introduced into the chemical economy, the change in the total annual economic return from the economy, A$/year, can be estimated by the solution of standard industrial economic models. A significant ranking scale is the economic expectation, E , of the chemical reaction:
E = FA$/year The economic expectations, E, are intended primarily for comparison of the relative amounts of money worth investing in attempts to develop these reactions. The lowest ranked reactions might not justify a substantial research effort, unless the researcher believes that it is significantly underrated in its probability of success or its potential yield. The top five ranked reactions are utility reactions with expectations, E , in the range 5.2 X lo9 < E < 6.9 X lo9. The highest ranked market-limited reaction is the important already used reaction for the synthesis of ammonia: N2 + 3Hz
-
2NH3
rated with E = 2.3 X l@. The next highest ranked reaction is new, a potentially very high volume process for singlestep production of sulfuric acid using HZS + 202 H2S04 rated with E = 2.1 X lo8. As we proceed down the ranks, the rated economic expectations quickly drop to lower values, reaching E = 7.6
Table I. The 100 Most-Promising Reactions: Utility and Fuel Reactions' rank reaction 1 2c + 0 2 2 c o 2* c + o2 cop 2Hz + 0 2 2H20 3* 4 2H2 + C CHI 5* 8
9 13* 14 17 19
14 "The symbol
-s + - sop + + + - + + + + + so2 + - s + + + 02
2C 2Hz0 CH, COZ 2CH4 302 2 c 0 4H2O CH, 202 HzO COB 4Hz 3C C3Hs 3H2 2C CzHs 2c 2co CaO C Oz CaC03
* denotes existing technology in the training set.
Table 11. The 100 Most-Promising Reactions: Market-LimitedReactions in the Rediscovered Existing Technology Training Set rank reaction 6 3Hz + N2 2NH3 10 2C2H4 + O2 P(ethy1ene oxide) 18 eo + Cl2 COCl2 21 2S02 + O2 2S03 22 2N0 + 0 2 2N02 23 SO3 + H 2 0 H2S04 24 2CzH4 + 0 2 -+ 2CHSCHO 21 A1203.3HzO A1203 + 3Hz0 32 CHZCHCH, + H2O --+ CH3CHOHCH3 34 C3HB Hz + CH2CHCH3 36 CO + 2Hz CHBOH 43 C2H4 + H2O CHSCHZOH 47 C2H* + Cl2 CHzClCHzCl 50 PCHZCHCHO + 02 BCHZCHCOOH 52 CHzCICHzCl CHzCHCl + HC1 79 propylene oxide + HzO propylene glycol 89 CHzCHCl + HCI CH3CHC12
---
-
--- - +
lo6 for the 100th best reaction. The list of the most-promising 100 reactions generated is divided into three sublists: utility and fuel reactions, Table I; existing market-limited reactions that were part of the training set, Table 11; new market-limited reactions, Table 111. Four of the 12 utility and fuel reactions are already widely used industrially, including the burning of sulfur X
-
s + 0 2 so2 which qualified as a utility reaction because it appears to be economical to operate for its released heat even if there is no demand for the sulfur dioxide product in the economic model of the economy. This ranking is probably false since the economic evaluation neglects the pollution problems that are so important to the economics of this reaction. Inspection shows that several of the utility reactions are no more than indirect means of burning coal and methane. The economic expectations of the utility reactions are somewhat arbitrarily limited by the operation limit of 2 X lo8 short-ton-moll year imposed on nonmarket operations. Therefore, the order in which these reactions are ranked is only approximate. To show that the method can discover useful reactions, existing industrial reactions were evaluated as if they had not yet been discovered. In the top 100 reactions, 21 were existing industrial reactions, including 18 in the top 50. Thus, in the top 50 reactions, the fraction of existing reactions is 100 times higher than the fraction of existing reactions in the full 15 392 reactions generated. This is even more than the rated probabilities of success would lead one to expect, owing to the industrial reactions having
Ind. Eng. Chem. Res., Vol. 28, No. 11, 1989 1627 Table 111. The 100 Most-Promising Reactions: Market-Limited Hypothetical Reactions' rank reaction H9S + 209 H ~ S O I 7 11 c& + 0; CHZOHCH@H 12 2s + 3O2 2S03 2CHzCHCH3+ Oz B(propy1ene oxide) 15 2CH4 + O2 2CH30H 16 C,H, + 0,- CHXOOH 20 25 C-+ko, l 92 c o " 2C3H8+ O2 2CH3CHOHCH3 26 2C9Ha + 0, 2CHqCHiOH 28 diithilene glycol ---CH;OHCHzOH + CH3CH0 29 2CzH6 + 02 -+ 2CzH4 - 2Hz0 30 C3H8+ O2 propylene glycol 31 33 CzHa + 02 CzH4 + HZOZ 2HzS + 3oz 2Hz + 2S03 35 2(vinyl acetate) + Oz 2(acetic anhydride) 31 2C3H8+ Soz 2(glycerin) 38 2NH3 + 30, 2Hz + 2HN03 39 propylene oxide + COz acetic anhydride 40* 41* CHzO + CHSCOCH3 2CH3CHO CzH6 + S H2S + C2H4 42 diethylene glycol 2Hz + acetic anhydride 44 2CHqCOCH.q CIHn 45 " - + CHzCHCOOH H2 +- Oz HzOz 46 C1, + CHXHCl CH,CCl, 48 " " CH3CHOkCH3 + Oz glycerin 49 2CH3CHOHCH3+ Oz 2(ptopylene glycol) 51 4NH3 + 302 2Hz + 2NHdN03 53 2CzH6+ 302 O(ethy1ene oxide) + 2H202 54 2CHzCHCH3 + 02 2CH3COCH3 55 NH3 + 7N02 5 N 0 + 3HNO3 56 2CHSCHO CH4 + CHZCHCOOH 51* 58 SZClz+ 3O2 2S03+ Clz CzHB+ Oz ethylene oxide + Hz 59 C& + CHzOHCH20H 2CHSCHZOH 60 7N02 + NH4NO3 5 N 0 + 4HNO3 61 62 HzS + 2O2 SO3 + HzO 2(ethylene oxide) CHI + CHzCHCOOH 63* 2CHZCHCl+ CC1,CClZ 3CHzCClZ 64 diethylene glycol 2CH3CH0 + HzO 65 vinyl acetate CHzO + CHzCHCHO 66* HZ + CH3COCH3 CH3CHOHCH3 67 NH, + 2O2 HN03 + HzO 68 4NHq + 7NO9 5 N 0 + 3NHiNOq 69 ~" CzH4-+ Clz CH3CHClz IO CzH4 + CClzCC12 2CHzCClz 71 2HNO.q + 2HiO9 2NH.q + 50, 72 CH30H + dH3CHOHCH3--26H3CHz0H 13* 2NH4NO3 + 502 4NNO3 + 2H20z 75 H, + vinyl acetate 2CH3CHO 76 77 NH,N03 + 2O2 2HN03 + HzO I8 B(propy1ene glycol) + 02 2(glycerin) 2CO + C C 4 2C0C12 + C 80 2C2H4+ Clz Hz + 2CHzCHCl 81 2c12 + c + c02 2coc12 82 2NH3 + 2Nz + 9 0 z 6HN03 83 2NH4N03t 2Nz + go2 8HN03 84 2C12 t C CCl, 85 CzHG + Clz Hz + CHzClCHzCl 86 Clz + CHzCHCl CHZClCHC12 81 CH3COCH3+ H20z glycerin 88 ethyl acetate CHI + CHzCHCOOH 90* 2CHZCHCl- CzH4 + CHzCClz 91 4Hz + 2N2 + 302 2NHdNO3 92 93* 2C9Hn CH, + CqHn 94 cci, i- CH, 2cHzCiZ C + 2C12 + C3Ha 2CHzClCHzCl 95* 96* CCld + C3H8 2CH3CHClZ 97 CHzCHCl + 2CClZCC1, 3CHClCClz 98 NH4N03+ HzO 2NH3 + 20, 99 CCl, + 3CH, 4CH3Cl 100 C2H4 + 3CClzCC12 4CHClCCl2
-- --- -- - --
----- - ------ +
--
-
--
- -
-+
-+
-
-t
-- --
"The symbol * denotes C-C single bond activity.
high potential profitability as well as high F values. Four of the industrial reactions were already listed as utility
reactions. The 17 market-limited industrial reactions are listed in Table 11. Finally, the most promising 57 new market-limited reactions are listed in Table 111. This is the desired end product of computer-aided synthesis of industrial chemical reactions. Several of these reactions have been recognized previously as promising and have been investigated experimentally. With a refinement of the data base and improvements in the synthesis and pattern recognition program, such a listing should attract the attention of industrial chemists interested in investigating new reactions. The list is meant to be used critically, combined with the chemist's intuition, experience, knowledge of factors that this evaluation neglects, and knowledge of the research organization's needs and resources. By assigning very low probabilities, the pattern recognition system effectively discards almost all reactions, showing carbon-carbon single bond activity involved in the minimum bond reactions. Examination of the carbon chain lengths on each side of the reactions shows that 10 of the top 100 reactions must require both breaking and forming of carbon-carbon single bonds.
5. Conclusions and Recommendations It is clear that computer-aided synthesis of industrial chemical reactions is feasible. It has been demonstrated that a large number of plausible chemical reactions can be synthesized. Pattern recognition methods can be developed to select a much smaller list of the reactions that are recognized to be similar to reactions that are known to be useful industrially. The expected economic ranking of the reactions can give some idea of the dollar payoff to be expected should the new reactions be commercialized. With extensions of the data base, integration with the available extensive models of the chemical economy, and improvements in the reaction synthesis and pattern recognition programs, industrially significant problems can be attacked. The integration of pattern recognition into the reaction synthesis program can reduce the combinatorial problems when chemistry involving larger molecules is considered. However, we believe that greater advances can be made with a deeper understanding of what is meant by chemical similarity. Throughout the study, attempts were made to identify the parameters required to compare the chemistry of similar reactions. Of particular interest are parameters that identify similar chemical structures, that measure similar activity in the forming and breaking of chemical bonds, and that point to promising catalyst selections. It is necessary to define some chemical similarity class into which to place each reaction synthesized. This is currently being approached by Dumesic et al. (1989) using the microkinetics of heterogenous catalysis. Acknowledgment This work was performed at the Department of Chemical Engineering and the Mathematics Research Center of The National the University of Wisconsin-Madison. Science Foundation Fellowship Program,the D. C. Slichter Professorship, and the Mathematics Research Center provided financial support. Professors J. A. Dumesic, S. H. Langer, and the late R. R. Hughes provided insight into the formulation of the research problem. Literature Cited Dumesic, J. D.; et al. Ind. Eng. Chem. Res. 1987, 26, 1399-1407. Dumesic, J. A.; Trevino, A. A.; Rudd, D. F. Microkinetics in Catalysis Research; Shanahan Valley Associates: Madison, WI, 1989.
Ind. Eng. Chem. Res. 1989,28, 1628-1639
1628
Fathi-Afshar, S.; Rudd, D. F. Chem. Eng. Sci. 1981,36, 1421-1425. Fornari, T.; Rotstein, E.; Stephanopoulos, G . Chem. Eng. Sci. 1986, in press. Govind, R.; Powers, G. J. AIChE J. 1981,27, 429-442. J u n , P. C. Chemical Data Interpretation Using Pattern Recognition Techniques. Computer Representation and Manipulation of Chemical Information; Wipke, W. T., et al., Eds.; Wiley and Sons: 1974. May, D.; Rudd, D. F. Chem. Eng. Sci. 1976, 31, 59. Rotstein, E.; Resasco, D.; Stephanopoulos, G. Chem. Eng. Sci. 1982, 37, 1337. Rudd, D. F.; Fathi-Afshar, S.; Trevino, A. A.; Stadtherr, M. A. Petrochemical Technology Assessment; Wiley-Interscience: New
York, 1981. Specht, D. F. Polynomial Discriminant Functions for Pattern Recognition. In Pattern Recognition; L. N., Kanel, Ed.; Thompson Book Company: Washington, DC 1969. Staff of Shanahan Valley Associates and Catalytica Associates. Impact of New Catalyst Developments on the Chemical Industry. Mountain View, CA, 1985. Tow, Daniel Stuart. Ph.D. Thesis, University of WisconsinMadison, 1986.
Received for review April 12, 1989 Revised manuscript received July 25, 1989 Accepted August 15, 1989
Accurate Solution of Different ial-Algebraic 0ptimization Problems Jeffery S. Logsdon and Lorenz T. Biegler* Department of Chemical Engineering, Carnegie-Mellon University, Pittsburgh, Pennsylvania 15213
Differential-algebraic optimization problems arise often in chemical engineering processes. Current numerical methods for differential-algebraic optimization problems rely on some form of approximation in order to pose the problem as a nonlinear program. Here we explore an appropriate discretization and formulation of this optimization problem by considering the stability and error properties of implicit Runge-Kutta (IRK) methods for differential-algebraic equation (DAE)systems. From these properties, we are able to enforce appropriate error constraints and method orders in a collocation-based nonlinear programming (NLP) formulation. After demonstrating the IRK properties on a small DAE system, we show from variational conditions that optimal control problems can have the same difficulties as higher index DAE systems. This is illustrated for a number of small chemical engineering optimization examples that exhibit higher index characteristics. For these cases, the NLP formulation in this paper yields efficient and accurate solutions. 1. Introduction
The determination of optimal control profiles is of major importance for process applications. Examples within chemical engineering include problems in reactor design, process start-up, batch process operation, etc. However, the solution of optimization problems with differential and algebraic equation models still remains a difficult problem. At present, optimization problems with nonlinear algebraic equations can be solved in a straightforward way as nonlinear programs. On the other hand, unconstrained problems with differential equation models can be handled through the calculus of variations. However, models that combine both of these features are currently optimized by imposing some level of approximation to the problem. The purpose of this paper is to develop and discuss a nonlinear programming formulation that leads to the accurate solution (within an E tolerance) of the general differentialalgebraic optimal control problem. Current methods for handling these problems either apply an approximation to the control variable profile or to both the state and control profiles. A straightforward approach adopted by Sargent and Sullivan (1977) is to parameterize the control profile (e.g., piecewise constant) over variable-length finite elements and to solve the differential equations with this parameterization. A nonlinear programming algorithm is then applied to the control parameters in an outer calculation loop. Similar strategies have been proposed by Ray (1981) and Morshedi (1986). This “feasible path” approach requires the repeated and expensive solution of the differential-algebraic equations. Also, state variable inequality constraints cannot be handled in a straightforward way. Finally, the quality of the
* Author to whom correspondence should be addressed. 0888-5885/89/2628-1628$01.50/0
solution is strongly dependent on the parameterization of the control profile. Early studies with the second approach, parameterization of both the state and control profiles, were reported by Neuman and Sen (1973),Tsang et al. (1975),and Lynn et al. (1979). Here state and control profiles and the differential equations were parameterized by use of some method of weighted residuals (e.g., orthogonal collocation). This leads to a large nonlinear program (NLP) with algebraic equality constraints. However, since NLP algorithms were less developed at that time, this approach either was inefficient when compared to feasible path methods or was restricted to specialized (e.g., linear) problems. With advances in NLP methods through the development of Successive Quadratic Programming (SQP) and MINOS, these NLP’s could be solved more efficiently and could handle nonlinear state and control profile Constraints in a straightforward manner. Biegler (1984) demonstrated this approach on a small batch reactor problem. Renfro et al. (1987) solved much larger problems with orthogonal collocation on finite elements and piecewise constant approximations to the control profile. In order to obtain accurate finite element solutions, however, Cuthrell and Biegler (1987, 1989) imposed additional constraints in the NLP formulation in order to enforce accurate state profiles. They classified the role of finite elements in terms of knot locations (over which the error was equidistributed, hence minimized) and breakpoints that allowed for control profile discontinuities. This led to a formulation that enforced the accurate solution of the differential equations and allowed for a general description of the control profile. In this paper, we explore the theoretical development of these finite element constraints and present a formulation that leads to arbitrarily accurate state variable and control 0 1989 American Chemical Society