Prediction of Solubility of Solid Organic Compounds in Solvents by

Martin D. Ellegaard , Jens Abildskov , and John P. O'Connell .... Bush, and Charles A. Eckert , Timothy C. Frank, Sumnesh Gupta, and James D. Olson ...
0 downloads 0 Views 390KB Size
5114

Ind. Eng. Chem. Res. 2002, 41, 5114-5124

Prediction of Solubility of Solid Organic Compounds in Solvents by UNIFAC Sandra Gracin,† Tore Brinck,‡ and Åke C. Rasmuson*,† Department of Chemical Engineering and Technology, and Department of Chemistry, Royal Institute of Technology, S-100 44 Stockholm, Sweden

Predictions of solubility of nine different solid organic fine chemical compounds in water and organic solvents of relevance to industrial processing are examined. UNIFAC interaction parameters are taken from standard reference literature, extracted from liquid-vapor equilibria. For most systems, predicted solubilities deviate more than 15% from experimental values. Deviations are due to uncertainties in the estimation of the activity of the pure solid as well as to deficiencies in the estimation of activity coefficients in the solution. By comparison with results from ab initio quantum chemical calculations of the elecrostatic potential on the molecular surface of the solutes, it can be shown that a key assumption of the UNIFAC approach is not necessarily fulfilled. The properties of a functional group may depend significantly on the properties of the rest of the molecule. Introduction The solubility of solid compounds in solvents and solvent mixtures plays a key role in all crystallization processes. Crystallization is a crucial step in the manufacturing of many pharmaceuticals, agrochemicals, dyestuffs, catalysts, zeolites, proteins, and food products. The solubility at the conditions of the process, determines the production rate and the yield, as well as the method by which supersaturation is generated in the process and how the supersaturation varies during the course of operation. In the production of fine chemicals and pharmaceuticals, the need for rapid process development and scaling-up is stronger than ever, and hence there is an increasing need for methods that can replace tedious laboratory work by computer simulations. Predictive methods can also contribute to finding more optimal solutions in the overall processing of different compounds. In the area of vapor-liquid equilibria, significant advances have been made in the prediction capability by a reasonable combination of theoretical modeling and parameter determination from experimental data. The development of group contribution methods for prediction of activity coefficients, especially the UNIFAC method, has been of great importance in design work of chemical separation processes. The UNIFAC approach by Fredenslund et al.1 allows the predicition of activity coefficients in a liquid mixture within a Raoults law framework. In the model, molecules are broken down into functional groups and the mixture is treated as a mixture of groups. The properties of each group are assumed to be independent of the rest of the molecule to which it is attached. On the other hand, by this approach an unlimited number of molecules can be represented by perhaps 30-40 different functional groups. Data for the groups are determined by regression of experimental data collected in large data banks. * Corresponding author. Telephone: +46 8 7908227. Fax: +46 8 105228. E-mail: [email protected]. † Department of Chemical Engineering and Technology. ‡ Department of Chemistry.

Over the years, modifications of the UNIFAC method have been proposed to overcome limitations, for example, in the applicable temperature range, representation of temperature effects and accuracy for systems where molecules are very unequal in size. Weidlich and Gmehling2 and Larsen et al.3 introduce a modified combinatorial part and temperature-dependent group interaction parameters. Abildskov et al.4 proposed the addition of a second-order term in order to improve the capability to handle isomers and difficulties with proximity effects. While quite extensive work has been done in developing and proposing different models and modifications, less work has been devoted to validation. This makes it very difficult to assess the actual improvement of each new procedure over the previous one. Wu and Sandler5,6 showed that ab initio molecular orbital calculation can be used to compute the charge on atoms and groups of atoms in molecules. On the basis of the principles of unchanging geometry and approximate group electroneutrality, they were able to identify functional groups for use in group contribution methods. Complete testing of any proposed new group, or other modifications of the UNIFAC model, can only be done within the context of a refitting of all model parameters using a large bank of experimental equilibrium data. In the prediction of vapor-liquid equilibria, the UNIFAC method has been in use for almost two decades, and its capabilities and weaknesses are known. Much less attention has been given to the prediction by UNIFAC of solubility of organic solids. Gmehling et al.7 found a satisfactory prediction of the solubility of typical nonelectrolyte solids (naphthalene, anthracene, phenanthrene) in different solvents. Kan and Tomson8 used UNIFAC to predict the aqueous and nonaqueous solubilities of a vast number of environmentally important chemicals (chlorinated benzene, polycyclic aromatic hydrocarbons, substituted phenols, polychlorinated biphenyls, organohalide insecticides). Good agreement was observed for some groups of compounds, but for chlorinated alkenes, phthalates, and long-chain alkanes, the results were less successful. According to the authors, the primary reason for the discrepancy appears to be the uncertainty of group interaction parameters.

10.1021/ie011014w CCC: $22.00 © 2002 American Chemical Society Published on Web 09/06/2002

Ind. Eng. Chem. Res., Vol. 41, No. 20, 2002 5115

Gupta and Heidemann9 and Pihno et al.10 assign a new UNIFAC group in order to predict the solubility of amino acids in water, but it was only useful for a limited number of amino acids and peptides. Kuramochi et al.11 proposed a modification of Larsen’s UNIFAC model and introduced three new groups to accurately describe the activity coefficients of amino acids and peptides in water. In a further extension,12 11 new groups were introduced. Gabas and Laguerie13 correlated the solubility data of sugars in water by introducing an aldohexose cyclic skeleton group. Little has been done on organic molecules containing polar and nonpolar groups, that is, molecules typical for the fine chemicals and pharmaceuticals area. In particular, little has been done on these compounds when dissolved in organic solvents typically used in production. One exception is the work of Ochner and Sokoloski14 in which UNIFAC was used to predict the solubility of 4-hexylresorcinol in ethyl acetate, ethyl myristate, hexane, and their binary and ternary mixtures. The predicted solubilities were within 10% of the experimental ones in 18 out of 21 solutions. UNIFAC was shown to be able to account for positive and negative deviations from ideality in systems forming hydrogen bonds between molecules having different sizes. Ness15 applied UNIFAC to generate a list of candidate solvents for cooling crystallization of a new photographic chemical compound, and Frank et al.16 reviewed a number of methods to predict the solubility of nonionic organic solids for the same purpose. In the present work the ability of the original UNIFAC method to predict solubilities of very different solid organic fine and specialty chemicals, organic intermediates, and pharmaceutically important compounds in 15 industrially important solvents is examined. Many fine chemicals and pharmaceutical compound molecules are more complex than the molecules normally encountered in gas-liquid equilibria. In addition, these molecules sometimes contain functional groups for which UNIFAC parameters are unavailable. We use the original UNIFAC model, since for this version the range of parameters is the widest. Theory Equilibrium Relationship. Solubility denotes the solute concentration in a solution that is in thermodynamic equilibrium with the solute in the solid state, that is, the solute concentration of the compound denoted by i when

µsi ) µsat i

(1)

The superscript s denotes the solid phase, assumed to be pure, and the superscript sat denotes the saturated solution. The chemical potential of the solute in the saturated solution can be written

µsat i

) µ°i + RT

sat ln(γsat i xi )

sat sat asi ) asat i ) γi xi

(3)

Equation 3 shows that the solubility mole fraction of a compound can be predicted if the activity of the pure solid and the activity coefficient of the solute in the saturated solution can be predicted. Obviously, the standard state of the definition of the solution activity coefficient must be the same as that of the activity of the solid state. UNIFAC Activity Coefficients. Fredenslund et al.1 first developed UNIFAC as a method for predicting liquid-phase activity coefficients for systems in vaporliquid equilibrium. The method is based on the traditional Raoult’s law standard state; that is, the standard state is the pure component as a liquid at the temperature of interest. If the temperature is below the melting point of the pure compound, this standard state becomes a pure supercooled melt. Its properties can be calculated with fair accuracy provided that the solution temperature is not too far away from the melting point of the solute. The UNIFAC approach for prediction of activity coefficients is based on the UNIQUAC model. The activity coefficient is expressed in terms of two types of contribution:

ln γi ) ln γCi + ln γRi

(4)

The combinatorial part, upper index C, reflects the excess entropy of mixing due to differences in size and shape, and the residual part, upper index R, reflects the excess enthalpy of mixing resulting from interaction energies. Details of the UNIFAC model can be found in many of the references given, for example, Fredenslund et al.1 Activity of the Solid. Equations 1-3 allow us to write

ln

asi

µsi - µ°i

)

RT

)

gsi - g°i RT

)-

∆gfus RT

(5)

where ∆gfus is the change in the partial molar Gibbs free energy of i in going from the solid state to the reference state at constant temperature and pressure. In the present study we only consider compound i, and hence the subscript is neglected in the further derivation. Since the reference state is a supercooled melt at the same temperature and pressure:

(

)

∂∆gfus/T ∂T

)-

P

∆hfus T2

(6)

Integration from the melting point temperature to temperature T leads to fus

∫TT d∆gT m

)

∆gfus )T

fus

∫TT ∆hT2 m

dT

(7)

since the free energy change at the melting point is equal to zero. Hence, we arrive at

(2)

where ° denotes a standard state used as reference and having the same temperature. If an equal standard state is used for the solid state and the dissolved state, the activity of the dissolved solute in the saturated solution equals that of the pure solid:

ln as )

fus

∫TT ∆h RT2 m

dT

(8)

which shows that the activity of the pure solid phase can be determined by integration of the enthalpy of fusion from the atmospheric melting point to the

5116

Ind. Eng. Chem. Res., Vol. 41, No. 20, 2002

Table 1. Test Substancesa

Equation 10 is primarily used in the present evaluation, but the approximations behind it are addressed in the discussion.

temperature in question. The activity coefficient is predicted by the UNIFAC method, and the pure solid activity is predicted by eq 10. Predicted solubilities of nine different compounds in a variety of solvents are compared with experimental data. Experimental data are partly retrieved from the Beilstein database17 and are partly from published18 and unpublished19 data produced in our laboratory. Activity coefficient calculations are done by a computer program written in Visual Fortran 5.0, which incorporates the equations of Fredeslund et al.1 and the interaction parameters of Hansen et al.20 and Fredeslund and Sorensen.21 The solubility mole fraction is calculated iteratively by eq 10, since the activity coefficient depends on concentration. Two guesses for the composition are initially required (possible minimum and maximum values, i.e., 0 and 1). The algorithm then employs the method of Van Wijangaarden-Dekker-Brent22 to determine a new value, which is then compared to the original values. The iteration continues until the difference between successive concentration values is less than a specified tolerance. Test Substance Selection. Test substances are listed in Table 1 (data in the table is retrieved from the Beilstein database17), and their structures are shown in Figure 1. They were chosen among organic fine and specialty chemicals, organic intermediates, and pharmaceutically important compounds, for their industrial relevance and/or structural interest. Almost all of the test substances are mono- or disubstituted (para position) benzenes with polar groups attached directly to the benzene ring or via nonpolar groups. Solvent Selection. Two criteria were used for solvent selection in the present work. The first is their industrial relevance, especially in the fine chemicals and the pharmaceutical industries. The second criterion is their polarity measured by the empirical polarity parameter, ENT (the normalized Reichard-Dimroth parameter, ET (30)23,24). Solvents were chosen to reflect a wide range of polarities.

Evaluation

Results

The solubilities of solid organic compounds at 293 or 298 K in various solvents are estimated by prediction (1) of the activity coefficient of the solute in the solution, γsat, and (2) of the activity of the pure solid, as, at the

Results for each solute are presented in Figure 2, except for the case of paracetamol, which is presented in Figure 3. The functional groups in benzoic acid are identified as 5 AcH, 1 AC, and 1 COOH. No clear

M

Hm

Tm

test substance

CAS-RN

g/mol

J/mol

°C

benzoic acid p-hydroxybenzoic acid p-aminobenzoic acid phenylacetic acid p-hydroxyphenylacetic acid p-aminophenylacetic acid paracetamol ibuprofen trans-stilbene

65-85-0 99-96-7 150-13-0 103-82-2 156-38-7 1197-55-3 103-90-2 15687-27-1 103-30-0

122.12 138.12 137.14 136.15 152.15 151.16 151.17 206.28 180.25

17 982 31 443 24 032 16 000 28 384 43 938 27 100 25 500 27 690

122 214 189 77.8 151 204 170.5 77 125

a

CAS registry numbers are supplied by the author.

temperature of the saturated solution. Since dh ) Cp dT, we can write

(

)

∆hfus m 1 1 s ln a ) + R Tm T

∫T

T

∫TT (Cmp - Csp) dT m

RT2

m

dT (9)

According to eq 9, the heat capacity of the solid needs to be integrated from T to the melting point, and the heat capacity of the melt needs to be integrated from the melting point down to temperature T. Since T often is far below the melting point, the heat capacity of the “hypothetical” supercooled melt cannot be measured but must be obtained by extrapolation of the data taken above the melting point. Very few data are available in s the literature for (Cm p - Cp) ) ∆Cp. In the chemical engineering literature, it is commonly assumed that ∆Cp ) 0 or that the corresponding term in eq 9 is negligible, which results in s

sat sat

a )x γ

[

(

∆hfus m 1 1 ) exp R Tm T

Figure 1. Test substances.

)]

(10)

Ind. Eng. Chem. Res., Vol. 41, No. 20, 2002 5117

Figure 2. Predicted versus experimental mole fraction solubility for each solute (mibk ) 4-methyl-2-pentanone).

systematic deviation between predicted and experimental values is observed. A mean relative deviation over n |[((xi,exp - xi,calc)/xi,exp) × all solvents is defined as ∑i)1 100]/n| and amounts to 31% for the solubility of benzoic acid.

The functional groups in phenylacetic acid are identified as 5 AcH, 1 AcCH2, and 1 COOH. There is a tendency for predictions to be too low. The mean relative deviation for the solubility of phenylacetic acid is 32%.

5118

Ind. Eng. Chem. Res., Vol. 41, No. 20, 2002

Figure 4. Overall relative deviation distribution.

Figure 3. Predicted versus experimental solubility for paracetamol (cases 1 and 2).

The functional groups in p-hydroxybenzoic acid are identified as 4 AcH, 1 AC, 1 AcOH, and 1 COOH. Predicted values are usually lower than the experimental values. The mean relative deviation for the solubility of p-hydroxybenzoic acid is 73%. The relative deviation is particularly high in chloroform (783%) and in water (293%), two solvents in which the solubility is low. The functional groups in p-hydroxyphenylacetic acid are identified as 4 AcH, 1 AcOH, 1 AcCH2, and 1 COOH. Predictions tend to systematically underestimate higher solubilities. The mean relative deviation calculated excluding chloroform is 43%. In chloroform the relative deviation is over 1500%, while in all other solvents the relative deviation is in the range from 20 to 65%. The functional groups in p-aminobenzoic acid are identified as 4 AcH, 1 AC, 1 COOH, and 1 AcNH2. Most predicted values are lower than experimental ones. Acetone and chloroform values are clear exceptions, with major overprediction. The mean relative deviation for the solubility of p-aminobenzoic acid is 578%. High relative deviations are found for chloroform (517%), hexane (700%), benzene (435%), and acetone (250%). The functional groups in p-aminophenylacetic acid are identified as 4 AcH, 1 AcNH2, 1 AcCH2, and 1 COOH. There is a tendency for predicted values to be low. The mean relative deviation excluding chloroform is 57%. For solubility in chloroform the relative deviation is 582%. The functional groups in ibuprofen are identified as 3 CH3, 1 CH, 4 AcH, 1 AcCH2, 1 AcCH, and 1 COOH. No systematic deviation is observed. The mean relative deviation is only 17%. The functional groups in trans-stilbene are identified as 10 AcH, 2 AC, and 1 CHdCH. The solubility in essentially all solvents is overestimated. The mean relative deviation for the solubility of trans-stilbene is 40%.

The functional groups in paracetamol are identified as 5 AcH, 1 AcOH, and 1 CH3CONH or 4 AcH, 1 AcOH, 1 AcNH, and CH3CO. However, since required parameters are not available, some approximations have been done. Either (case 1) the functional groups in paracetamol are identified as 4 AcH, 1 AcOH, 1 AcNH2, and CH3CO or (case 2) they are identified as 4 AcH, 1 AcOH, 1 AcCH2, and CH3CO. Results are presented in Figure 3. For case 1, the mean relative deviation is 176% mostly because of the large relative deviation for solubility in ethyl acetate (757%) and in water (557%), but it is also high in acetone (109%). The mean relative deviation calculated without ethyl acetate and water is 40%. The mean relative deviation for case 2 is 64% including all solvents. Especially in chloroform (260%) and in acetone (104%), the deviation is high. No systematic deviation is found. In summary, either there is no strong systematic deviation or there is a tendency for the predicted solubilities to be too low. The case of trans-stilbene is an exception. In Figure 4 the relative deviation distribution is presented. All together, 104 systems have been evaluated if both sets of paracetamol data are included. Only for 23 systems is the deviation below 15%, out of which 7 systems contain trans-stilbene, 4 contain ibuprofen, 3 contain benzoic acid, 3 contain paracetamol (set 1) and 3 contain paracetamol (set 2), 2 contain phenylacetic acid, and 1 contains p-aminophenylacetic acid. For 50 systems the relative deviation exceeds 45%. Results for each solvent are presented in Figure 5. In water, which is the most polar solvent in this study, the solubility is overall low and no clear systematic deviation between predicted and experimental values is observed. The mean relative deviation over all solutes in water is 128%, because of very large deviations for the paracetamol case 1 (557%) and for p-hydroxybenzoic acid (293%). In methanol, the solubility tends to be systematically underestimated with relative deviations often on the order of 50% for all the solutes except for ibuprofen, for which the prediction is very accurate. The mean relative deviation is 45%. In ethanol, the solubility tends to be systematically underestimated, except for good predictions for p-aminophenylacetic acid and paracetamol. The mean relative deviation is 37%.

Ind. Eng. Chem. Res., Vol. 41, No. 20, 2002 5119

Figure 5. Predicted versus experimental solubility for each solvent.

Also in 2-propanol, the solubility is mostly underestimated with a mean relative deviation of 40%. For the other solvents there is no strong systematic deviation between predicted and experimental values, as shown in Figure 5. The mean relative deviation is 80% for solubility in acetone, to which large deviations for p-aminobenzoic acid (251%) and for paracetamol contribute. The mean relative deviation is 41% for solubility in 4-methyl-2-pentanone (mibk), 129% for solubility in ethyl acetate, and 41% for solubility in toluene. Even though the predictions in choroform are quite adequate in absolute terms (Figure 5), the mean relative deviation is very high: 943% because there are very large relative deviations for p-aminobenzoic acid

(5172%), p-hydroxyphenylacetic acid (1553%), p-hydroxybenzoic acid (783%), and p-aminophenylacetic acid (583%). All of these solutes have similar molecular structures and have low solubility in chloroform. Also, in toluene, these four compounds have a low solubility, but the solubility is, in comparison, quite correctly predicted. Discussion The accuracy in the prediction of the solubility of solid organic compounds in solvents by the method adopted in this work depends on the accuracy of the UNIFAC prediction of the activity coefficient of the solute in the

5120

Ind. Eng. Chem. Res., Vol. 41, No. 20, 2002

Table 2. “Sensitivity Analysis” on Solubility Prediction for p-Aminobenzoic Acid in Different Alcohols solubility (mol/mol) at T ) 298 K solvent

x expl

x predicted

AcNH2-OH ) -17.4 OH-AcNH2 ) -52.39

ethanol 1-propanol octanol

0.0455 0.0314 0.0209

0.0148 0.0128 0.0074

67.47 59.30 64.55

AcNH2-OH ) -34.8 OH-AcNH2 ) -104.78

ethanol 1-propanol octanol

0.0455 0.0314 0.0209

0.0177 0.0152 0.0086

61.00 51.53 58.62

AcNH2-OH ) -52.2 OH-AcNH2 ) -157.17

ethanol 1-propanol octanol

0.0455 0.0314 0.0209

0.0219 0.0187 0.0104

51.78 40.54 50.34

AcNH2-OH ) -69.6 OH-AcNH2 ) -209.56

ethanol 1-propanol octanol

0.0455 0.0314 0.0209

0.0272 0.0230 0.0125

40.25 26.91 40.26

AcNH2-OH ) -104.4 OH-AcNH2 ) -314.34

ethanol 1-propanol octanol

0.0455 0.0314 0.0209

0.0425 0.0353 0.0184

6.56 -12.46 12.02

AcNH2-CH3OH ) -118.1 CH3OH-AcNH2 ) 489.7 AcNH2-CH3OH ) -118.1 CH3OH-AcNH2 ) 122.42

methanol

0.0501

0.0214

57.23

methanol

0.0501

0.0502

-0.14

saturated solution, as well as on the accuracy of the estimation of the activity of the pure solid. Estimation of Activity Coefficients. The accuracy of the prediction of the activity coefficient depends on the validity of the basic assumptions of the UNIFAC model and on the quality of the parameter values that are used. The key assumption of the UNIFAC approach is that molecules can be broken down into functional groups that can be treated as being independent of the rest of the molecule. Interaction parameters between groups are determined by correlation of the model to experimental data. Parameters for about 30 different functional groups are determined simultaneously in an extensive optimization where the total deviation between the UNIFAC model and huge amounts of experimental data is minimized. Because of the complexity of this optimization, parameter values are primarily correlation values rather than describing the actual physics in each particular interaction, and of course this limits the possibility to extrapolate the use into new situations. Furthermore, the values of the UNIFAC parameters used in the present work have been determined from vapor-liquid equilibrium data, because these are the most extensive data that are available. From a physical point of view these interaction parameters should be equally useful for solid-liquid equilibria. However, because of deviations from the underlying assumptions and because of the complexity of the parameter determination, the extrapolation to solid-liquid systems is not necessarily valid. Despite the reservations above, we still expect that parameter values should at least qualitatively describe the nature of the interaction between groups. Comparison of the interaction parameters of AcNH2-OH and OH-AcNH2 with other interaction parameters in UNIFAC tables, while taking into consideration the real nature of this interaction (strong hydrogen bonding), indicates that the parameters should be more negative. “Sensitivity analysis” has been done on those two parameters (Table 2), and results show that it is possible to get a much better fit with larger negative values for these parameters (or, for the methanol case, a less positive value for the CH3OH-AcNH2 parameter). Adjustments in UNIFAC parameter values can influence significantly the predicted value of the activity

rel dev/%

coefficient and hence the predicted solubility. The calculations indicate that the UNIFAC model is fairly sensitive to changes in its parameters. The validity of the basic assumption of the UNIFAC model has been examined in some detail by quantum chemical calculations of the electrostatic potential of the molecular surface of selected compounds. Optimized geometries have been computed at the ab initio HF/6-31G* level of theory using the Gaussian 9425 suite of programs. The electrostatic potential was computed at the same level of theory on molecular surfaces defined in accordance with ref 26 by the 0.001 au contour of the electron density. These computations were performed using a locally developed code (hs95). Results are illustrated in Figure 6 for some of the compounds. First of all, it can be noted that the substituents affect the electrostatic potential of the aromatic ring. The aromatic ring is considerably more negative in aniline than in benzoic acid. The reason is that the NH2 substituent is electron donating, while the C(O)OH substituent is electron withdrawing. In addition, it can be seen in Figure 6 that the two substituents in p-aminobenzoic acid influence each other. For example, the NH2 group is more negative in aniline than in p-aminobenzoic acid. The reason is that in the latter the C(O)OH substituent withdraws electrons from the NH2 group. In the same way the electron donating power of the NH2 group results in more negative oxygens in p-aminobenzoic acid compared to benzoic acid. These observed changes in the electrostatic potential directly reflects the changes in the hydrogen bond donating and accepting tendencies of the different functional groups due to the substitution.27 From Figure 6, we can conclude that the C(O)OH substituent increases the hydrogen bond donating ability of the NH2 group and decreases its hydrogen bond accepting ability. This figure also shows that the NH2 group increases the hydrogen accepting ability of the C(O)OH group. It is clear that the basic assumption in UNIFAC that the different functional groups of a molecule can be taken to be independent of each other in many cases does not sufficiently describe the reality and may lead to serious errors in the predicted interaction energies. The quantum chemical calculations can be used as a guide for selecting substitution parameters when pa-

Ind. Eng. Chem. Res., Vol. 41, No. 20, 2002 5121

Figure 6. Calculated ab initio HF/6-31+G* electrostatic potentials [V(r)] on the molecular surfaces of (a) p-aminobenzoic acid, (b) benzoic acid, and (c) aniline. Color ranges in kcal/mol: red for V(r) > 30; yellow for 10 < V(r) < 30; green for -10 < V(r) < 10; blue for -30 < V(r) r 10; purple for V(r) r 30.

rameters for a specific functional group are not available in UNIFAC. As described in the results section, we were not able to find parameters for the AcNH group in paracetamol. As substitution parameters for this group, we tested parameters for the AcNH2 and AcCH2 groups. In Figure 7, we show the computed electrostatic potential on the molecular surfaces of paracetamol and of the two molecules that are obtained when the NH group is substituted for NH2 and CH2, respectively. Since an sp3 hybridized nitrogen only can have three bonds, the COCH3 has to be removed when NH is substituted for NH2 in the calculations of the electrostatic potential. In the following, please notice that we are comparing the electrostatic potential on the surfaces of the nitrogen and of the hydrogens that are attached to the nitrogen. In the UNIFAC simulations, the COCH3 group is retained even when, for example, the AcNH group is replaced by an AcNH2 group, and hence the strong negative potential of the carbonyl oxygen that is observed in Figure 7a should not be mixed up with the negative surface of the nitrogen in Figure 7b. As can be seen from the figure, the surface of the NH2 group has areas of strongly negative potential which have no counterparts for the NH group. On the other hand, the surface of the hydrogen in the NH group is considerably more positive than the corresponding surface area over the hydrogens in the NH2 group. The figure clearly shows that the AcNH2 group is not a good substitute for the AcNH group. The CH2 group lacks the strongly negative areas of the NH2 group, and in that sense, it resembles the NH group rather well. However, the hydrogens of the CH2 group are much less positive than the hydrogen of the NH group. Thus, the quantum chemical results show that neither AcNH2 nor AcCH2 is a good substitute group for AcNH, but that AcCH2 should perform slightly better than AcNH2. This is also in agreement with the results we found when we tested the two groups for prediction of solubilities of paracetamol (see Results section). Dissociation, if it occurs, is not explicitly accounted for by the UNIFAC model even though it may be included in parameter values. p-Aminobenzoic acid is an amino acid and appears as zwitterions when the pH is close to the isoelectric point. Far away from the isoelectric point, we may find anionic and cationic forms. When p-aminobenzoic acid is dissolved in water, the pH becomes 3.6, which is close to the isoelectric point 3.5. However, at the isoelectric point in water, the solution contains only 10.5% zwitterions28 and the rest is neutral molecules. In ethanol-water mixtures containing 50% and 75% ethanol, the corresponding zwitterion content is very low (0.27% and 0.015%, respectively). Hence, we do not believe that dissociation significantly influences the results of the present work, since the prediction of the solubility in water is quite good compared to those for many of the other solvents, and the evaluation for p-aminobenzoic acid includes 11 additional solvents. Estimation of Solid-State Activity. The most significant problem of estimating the activity of the solid phase relates to the fact that the reference state is a supercooled melt, the properties of which in general cannot be determined experimentally but have to rely on extrapolation of data from the melting point and above (eq 9). In our predictions, we use eq 10, in which it has been assumed that ∆Cp ) 0. Hilderbrand and Scott29 ob-

5122

Ind. Eng. Chem. Res., Vol. 41, No. 20, 2002

Figure 7. Calculated ab initio HF/6-31+G* electrostatic potentials [V(r)] on the molecular surfaces of (a) paracetamol, (b) paracetamol with NHCOCH3 replaced by NH2, and (c) paracetamol with NH replaced by CH2. Color ranges in kcal/mol: red for V(r) > 30; yellow for 10 < V(r) < 30; green for -10 < V(r) < 10; blue for -30 < V(r) r 10; purple for V(r) r 30. Table 3. Comparison between Experimental and Predicted Solubilities of Benzoic Acid Calculated with Solubility Eqs 10, 13, and 14 solubility (mol/mol) at T ) 298.15 K and atmospheric pressure water methanol ethanol octanol acetone chloroform tetrahydrofuran diethyl ether dioxane hexane benzene

exp

UNIFAC (eq 10)

rel dev/%

UNIFAC (eq 13)

rel dev/%

UNIFAC (eq 14)

rel dev/%

0.0005 0.1151 0.1389 0.1122 0.1944 0.1064 0.2434 0.1985 0.2109 0.0129 0.0627

0.0004 0.1030 0.1048 0.1154 0.2013 0.1764 0.1852 0.1085 0.1438 0.0112 0.0723

20.00 10.55 24.55 -2.85 -3.55 -65.79 23.91 45.34 31.82 13.18 -15.31

0.0006 0.1543 0.1607 0.1543 0.2935 0.2520 0.2494 0.1519 0.2865 0.0159 0.1067

-14.55 -34.02 -15.66 -37.48 -50.97 -136.81 -2.44 23.47 -35.84 -23.10 -70.25

0.0005 0.1308 0.1427 0.1399 0.2662 0.2236 0.2249 0.2272 0.2584 0.0135 0.0930

-5.30 -13.68 -2.72 -24.72 -36.92 -110.15 7.58 -14.45 -22.52 -4.28 -48.27

served a linear relationship between ln as and ln T, and they suggested another approximate equation for solubility. If we rewrite eq 9 as the first partial derivative with respect to ln T

(

)

∆hfus ∂ ln a m ) ∂ ln T RT s

(11)

and take the second derivative, we obtain

(

) [ ( )]

fus ∂2 ln as ∂ ∆hm ) T ∂T RT ∂(ln T)2

)

∆Cp ∆hfus m R RT

(12)

fus fus At the melting point, ∆hfus m ) Tm∆sm , and if ∆sm ) ∆Cp, this second derivative is zero. Hence, authors suggested that ∆Cp could be approximated with entropy of fusion, and this is claimed29 to be at least as good approximation as was the assumption that ∆Cp ) 0. This results in

ln as ) ln(xsatγsat) )

[

]

∆sfus m T ln R Tm

(13)

s If the difference (Cm p - Cp) ) ∆Cp is relatively insensitive to temperature,30,32,34 eq 9 can be integrated into

s sat sat asat i ) xi γi ) a ) exp

[

(

)

∆hfm 1 1 R Tm T

(

)]

Tm ∆Cp ln - Tm/T + 1 R T

(14)

Many workers, especially in the chemical engineering field, have favored eq 10, and some other workers, especially in the pharmaceutical field, have suggested eq 13 to be a better approximation. During the past decade, published results indicate that estimation of the activity of the solid phase requires a fair estimate of ∆Cp. Mishra and Yalkowsky33 found that, in terms of ideal solubility, eq 10 predicts a value closer to that of eq 14 than did eq 13. Neau et al.34-36 found that ∆Cp was not negligible and was closer to ∆sfus m than to zero, but neither of these two approximations were valid for accurately estimating the activity of the solid phase as given by eq 14. As shown in Figure 8, for the general case, we cannot safely assume that ∆Cp is constant, but it may increase at decreasing temperature. The literature contains few data on the heat capacities of solid and liquid organic compounds. However, for some of the substances of Table 1, data are available in the Beilstein database16 and have been used for comparison of eqs 10, 13, and 14. For benzoic acid, the

Ind. Eng. Chem. Res., Vol. 41, No. 20, 2002 5123

Figure 8. Heat capacities of paracetamol (experimental data from ref 34). Table 4. Comparison between Experimental and Predicted Solubilities of Paracetamol Calculated with Solubility Eqs 10 and 14 solubility (mol/mol) at T ) 298.15 K and atmospheric pressure exp

UNIFAC (eq 10)

rel dev/%

UNIFAC (eq 14)

rel dev/%

water methanol ethanol 1-propanol 2-propanol 1-butanol acetone ethyl acetate toluene

Paracetamol (Parameter Set 1) 0.0018 0.0116 -557.40 0.0825 0.0658 0.0297 54.82 0.1139 0.0601 0.0528 12.21 0.1559 0.0453 0.0401 11.54 0.1219 0.0460 0.0403 12.24 0.1225 0.0392 0.0310 21.09 0.0951 0.0369 0.0772 -109.16 0.1791 0.0055 0.0469 -756.71 0.1405 0.0002 0.0001 55.69 0.0002

-4560.90 -73.22 -159.25 -169.10 -166.65 -142.35 -385.51 -2467.20 16.50

water methanol ethanol 1-propanol 2-propanol 1-butanol acetone ethyl acetate toluene

Paracetamol (Parameter Set 2) 0.0018 0.0004 76.57 0.0001 0.0658 0.0349 46.93 0.0729 0.0601 0.0526 12.51 0.1031 0.0453 0.0500 -10.33 0.0969 0.0460 0.0502 -9.35 0.0974 0.0392 0.0463 -18.15 0.0895 0.0369 0.0754 -104.28 0.1291 0.0055 0.0035 35.45 0.2310 0.0002 0.0008 -259.68 0.0029

94.35 -10.84 -71.53 -114.06 -112.00 -128.20 -249.95 -4122.64 -1197.03

difference in the calculated solid-state activities of eqs 10 and 14 is 13% and this corresponds to differences in predicted solubilities of up to 50%. The difference in the calculated solid-state activities of eqs 13 and 14 is 5.4%, resulting in a 30% difference in predicted solubilities (Table 3). For paracetamol, neither of the two simplifications is valid for accurate estimation of the solidstate activity, since ∆Cp is as high as 99.8 J/mol‚K at the melting point (Neau et al.36). The difference in the calculated solid-state activities of eqs 10 and 14 is 44%, resulting in a difference higher than 100% in predicted solubilities in most of the cases (Table 4). For both tested substances, eq 14 predicts higher solubilities than eq 10. Hence, the neglect of the ∆Cp term in eq 10 may at least partly explain that the predicted solubilities for some compounds tend to be underestimated. For example, we may note that p-hydroxybenzoic acid and p-hydroxyphenylacetic acid have high ∆Cp values.19 Conclusions In the present work, the prediction of the solubilities of organic compounds in water and in organic solvents is explored. Overall, predictions are not sufficiently accurate to allow for a reasonably accurate design of a crystallization process or for selection of a suitable

solvent. The predictions may perhaps serve as a first rough guide in the selection of solvents. The reason for the discrepancies can be related to uncertainties in the prediction of activity coefficients by UNIFAC, as well as to uncertainties in the estimation of the activity of the solid state. Since UNIFAC is a simple additive model, it would not be expected to yield a highly accurate prediction in cases where properties are nonadditive. In the present case, the properties of a functional group have been shown to depend on the rest of the molecule, and this is a violation against a key assumption of the model. Unfortunately, most of the systems that we are interested in are of this kind, and this suggests that UNIFAC in its present form is not quite satisfactory for solubility prediction. Since we are interested in very different compounds, which are impossible to classify as one group of compounds, some simple modification of the model is not expected to be sufficient in our case. In the prediction of the activity of the pure solid, uncertainties are due to the fact that the standard state is a pure melt at a temperature far below the melting point. Hence, its properties can only be accessed by extrapolation. In addition, the data needed, that is, the heat capacities of the pure solid and of the pure melt, respectively, are normally not available. It is shown in the present work that neglect of the heat capacity term can influence the predicted solubility significantly. Notation a ) activity Cp ) molar heat capacity at the given pressure, J/mol‚K Cm p ) heat capacity of the supercooled pure melt Csp ) heat capacity of the pure solid s ∆Cp ) difference in heat capacity, ) Cm p - Cp gi ) partial molar Gibbs free energy of species i, J/mol ∆gfus ) molar Gibbs free energy change upon fusion, J/mol h ) partial molar enthalpy, J/mol ∆hfus ) molar enthalpy change upon fusion, J/mol ∆hm ) enthalpy of melting, J/mol n ) number of solvents in summation R ) gas constant ∆sfus ) partial molar entropy upon fusion, J/mol‚K T ) temperature, K Tm ) melting point, K x ) solubility, mol/mol Greek Symbols γi ) activity coefficient of component i based on deviation from Raoult’s law µi ) chemical potential of component i, J/mol Upper Index C ) combinatorial part R ) residual part s ) solid state sat ) saturated solution ° ) at standard state Lower Index i ) component i m ) at melting point s ) solid

Note Added after ASAP Posting This article was released ASAP on 9/6/02 with the captions for Figures 4 and 5 reversed. The corrected version was posted on 9/10/02.

5124

Ind. Eng. Chem. Res., Vol. 41, No. 20, 2002

Literature Cited (1) Fredeslund, A.; Gmehling, J.; Rasmussen, P. Vapor-Liquid Equilibria Using UNIFAC; Elsevier: Amsterdam, 1977. (2) Weidlich, U.; Gmehling, J. A Modified UNIFAC Model. 1. Prediction of VLE, HE, and Gamma-infinity. Ind. Eng. Chem. Res. 1987 26 (7), 1372-1381. (3) Larsen, B. L.; Rasmussen, P.; Fredeslund, A. A Modified UNIFAC group-contribution model for prediction of phase equilibrium and heat of mixing. Ind. Eng. Chem. Res. 1987, 26, 22742286. (4) Abildskov, J.; Constantinou, L.; Gani, R. Towards the development of a second-order approximation in activity coefficient models based on group contribution. Fluid Phase Equilib. 1996, 118, 1-12. (5) Wu, H. S.; Sandler, S. I. Use of ab Initio Quantum Mechanics Calculation in Group Contribution Methods. 1. Theory and Basis for Group Identifications. Ind. Eng. Chem. Res. 1991, 30, 881-889. (6) Wu, H. S.; Sandler, S. I. Use of ab Initio Quantum Mechanics Calculation in Group Contribution Methods. 2. Test of new groups in UNIFAC. Ind. Eng. Chem. Res. 1991, 30, 889-897. (7) Gmehling, J. G.; Andersson, T. F.; Prausnitz, J. M. Solidliquid equilibria using UNIFAC. Ind. Eng. Fundam. 1978, 17 (4), 269. (8) Kan, A. T.; Tomson, M. B. Unifac Prediction of Aqueous and Nonaqueous Solubilities of Chemicals with Environmental Interest. Environ. Sci. Technol. 1996, 30, 1369-1376. (9) Gupta, R. B.; Heidesmann, R. A. Solubility models for amino acids and antibiotics. AIChE J. 1990, 36 (3), 333-341. (10) Pinho, S. P.; Silva, C. M.; Macedo, E. A. Solubility of amino acids: A Group-Contribution Model involving phase and chemical equlibria. Ind. Eng. Chem. Res. 1994, 33, 1341-1347. (11) Kuramochi, H.; Noritomi, H.; Hoshino, D.; Nagahama, K. Measurements of solubility of two amino acids in water and prediction by the UNIFAC model. Biotechnol. Prog. 1996, 12, 371379. (12) Kuramochi, H.; Noritomi, H.; Hoshino, D.; Nagahama, K. Representation of activity coefficient of fundamental biochemicals in water by the UNIFAC model. Fluid Phase Equilib. 1997, 130, 117-132. (13) Gabas, N.; Lague´rie, C. Predictions with UNIFAC of liquid-solid phase diagrams: applications water-sucrose-glucose, water-sucrose-fructose and water-xylose-mannose. J. Cryst. Growth 1993, 128, 1245-1249. (14) Ochsner, A. B.; Sokoloski, T. D. Prediction of Solubility in Nonideal Multicomponent Systems using the UNIFAC Group Contribution Model. J. Pharm. Sci. 1985, 74, 634-637. (15) Nass, K. K. Rational solvent selection for cooling crystallization. Ind. Eng. Chem. Res. 1994, 33, 1580-1584. (16) Frank, T. C.; Downey, J. R.; Gupta, S. K. Quickly screen solvents for organic solids. Chem. Eng. Prog. 1999, 95 (12), 4161. (17) Beilstein Database. (18) Granberg, R. A.; Rasmuson, Å. C. Solubility of paracetamol in pure solvents. J. Chem. Eng. Data 1999, 44, 1391-1395. (19) Gracin, S.; Rasmuson, Å. C. Solubility of organic compounds in pure solvents. Submitted for publication in J. Chem. Eng. Data. (20) Hansen, H. K.; Rasmussen, P.; Fredeslund, A. Vapor-liquid equilibria by UNIFAC group contribution. 5. Revision and extension. Ind. Eng. Chem. Res. 1991, 30, 2352-2355.

(21) Fredeslund, A.; Sorensen, J. M. In Group-Contribution estimation methods, Models for thermodynamic and phase equilibria calculations; Sandler, S. I., Ed.; Marcel Dekker: New York, 1994; Chapter 4, pp 287-361. (22) Press, W. H.; Flannery, B. T.; Teukolsky, S. A.; Vatterling, T. Van Wijngaarden-Dekker-Brent Method. Numerical Recipes: The Art of Scientific Computing; Cambridge University Press: Cambridge, 1986; pp 251-254. (23) Dimorth, K.; Reichardt, C. Lo¨sungsmittel und empirische parameter zur charakteriserung ihrerpolarita¨t. Top. Curr. Chem. 1968, 1-73. (24) Reichardt, C. Solvents and solvent effects in organic chemistry; VCH Verlagsgesellschaft: Weinheim, 1990. (25) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Gill, P. M. W.; Johnson, B. G.; Robb, M. A.; Cheeseman, J. R.; Keith, T. A.; Petersson, G. A.; Montgomery, J. A.; Raghavachari, K.; Al-Laham, M. A.; Zakrzewski, V. G.; Ortiz, J. V.; Foresman, J. B.; Peng, C. Y.; Ayala, P. Y.; Chen, W.; Wong, M. W.; Andres, J. L.; Replogle, E. S.; Gomperts, R.; Martin, R. L.; Fox, D. J.; Binkley, J. S.; Defrees, D. J.; Baker, J.; Gonzalez, C.; Stewart, J. J. P.; HeadGordon, M.; Pople, J. A. Gaussian 94, Rev. B.3; Gaussian, Inc.: Pittsburgh, PA, 1995. (26) Bader, R. F. W.; Carroll, M. T.; Cheeseman, J. R.; Chang, C. J. Am. Chem. Soc. 1987, 109, 7968. (27) Brinck, T. The Use of the Electrostatic Potential for Analysis and Prediction of Intermolecular Interactions. In Theoretical Organic Chemistry; Parkanyi, C., Ed.; Elsevier Science B. V.: Amsterdam, 1998; Vol. 5, pp 51-93. (28) van de Graaf, B.; Hoefnagel, A. J.; Wepster, B. M. Substituent effects. 7. Microscopic dissociation constants of 4-aminoand 4-(dimethylamino)benzoic acid. J. Org. Chem. 1981, 46 (4), 653-7. (29) Hildebrand, J. H.; Scott, R. L. Regular Solutions; Van Nostrand Reinhold: New York, 1970. (30) Prausnitz, J. M.; Lihtentaler, R. N.; de Avezedo, E. D. Molecular thermodynamics of fluid-phase equilibria; Prentice Hall PTR: Upper Saddle River, NJ, 1999. (31) Walas, S. M. Phase Equilibria in Chemical Engineering; Butterworth: London, 1985. (32) Grant, D. J. W.; Higuchi, T. Solubility behavior of organic compounds; Wiley: New York, 1990. (33) Mishra, D. S.; Yalkowsky, S. H. Ideal solubility of a solid solute: Effect of heat capacity assumotions. Pharm. Res. 1992, 9 (7), 958-959. (34) Neau, S. H.; Flynn, G. N. Solid and liquid heat capacities of n-alkyl para-aminobenzoates near the melting point. Pharm. Res. 1990, 7 (11), 1157-1163. (35) Neau, S. H.; Flynn, G. N.; Yalkowsky, S. H. The influence of heat capacity assumptions on the estimation of solubility parameters from solubility data. Int. J. Pharm. 1989, 49, 223229. (36) Neau, S. H.; Bhandarkur, S. V.; Helmuth, E. W. Differential molar heat capacities to test ideal solubility estimations. Pharm. Res. 1997, 14 (5), 601-605.

Received for review December 17, 2001 Revised manuscript received June 11, 2002 Accepted June 24, 2002 IE011014W