Modeling of the Critical Micelle Concentration (CMC) of Nonionic


Article

Modeling of the Critical Micelle Concentration (CMC) of Nonionic Surfactants with an Extended Group-Contribution Method Michele Mattei, Georgios Kontogeorgis, and Rafiqul Gani Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/ie4016232 • Publication Date (Web): 01 Aug 2013 Downloaded from http://pubs.acs.org on August 3, 2013


Industrial & Engineering Chemistry Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Industrial & Engineering Chemistry Research

ACS Paragon Plus Environment


Modeling of the Critical Micelle Concentration (CMC) of Nonionic Surfactants with an Extended Group-Contribution Method

Michele Mattei a,b, Georgios M. Kontogeorgis b, Rafiqul Gani a,*

a Computer Aided Process-Product Engineering Center (CAPEC), Department of Chemical and Biochemical Engineering, Technical University of Denmark, Søltofts Plads, Building 229, DK-2800, Kgs. Lyngby, Denmark

b Center for Energy Resources Engineering (CERE), Department of Chemical and Biochemical Engineering, Technical University of Denmark, Søltofts Plads, Building 229, DK-2800, Kgs. Lyngby, Denmark

* Corresponding author: Rafiqul Gani; e-mail address: [email protected]; phone number: +45 45 25 28 82; fax number: +45 45 93 29 06

ABSTRACT

A group-contribution (GC) property prediction model for estimating the critical micelle concentration (CMC) of nonionic surfactants in water at 25 °C is presented. The model is based on the Marrero and Gani GC-method. A systematic analysis of the model performance against experimental data is carried out using data for nonionic surfactants covering a wide range of molecular structures. As a result of this procedure, new third order groups based on the characteristic structures of nonionic surfactants are defined and included in the Marrero and Gani GC-model. In this way, those compounds that exhibit larger correlation errors (based only on first and second order groups) are assigned more detailed molecular descriptions, so that better correlations of critical micelle concentrations are obtained. The group parameter estimation has been performed using a data-set of 150 experimental measurements covering a large variety of nonionic surfactants: linear, branched and phenyl alkyl ethoxylates, alkanediols, alkyl mono and disaccharide ethers and esters, ethoxylated alkyl amines and amides, fluorinated linear ethoxylates and amides, polyglycerol esters and carbohydrate derivative ethers, esters and thiols. The model developed consists of linear group contributions, and the critical micelle concentration is estimated using the molecular structure of the nonionic surfactant alone. Compared to other models used for the prediction of the critical micelle concentration, and in particular the quantitative structure-property relationship models, the developed GC-model provides an accurate correlation and allows for an easier and faster application in computer-aided molecular design techniques, thus facilitating chemical process and product design.

KEYWORDS: Critical micelle concentration, group-contribution method, surfactants

1. INTRODUCTION

Critical micelle concentration (CMC) is defined as “the limit below which virtually no micelles are detected and the limit above which virtually all additional surfactant molecules form micelles. Many properties of surfactant solutions, if plotted against the concentration, appear to change at a different rate above and below this range”.1 This is considered to be a fundamental property of surfactants, not only because a number of interfacial phenomena such as detergency can take place due to the presence of micelles in solution, but also because it affects other phenomena, such as the reduction of surface tension, that are not directly influenced by the formation of micelles.2 It has been demonstrated that only polar solvents with two or more potential hydrogen-bonding centers are capable of showing micelle formation, while in nonpolar solvents clusters of surfactants may form but their behavior is not comparable to that of micelles in aqueous media.3

The experimental determination of the value of the critical micelle concentration can be made by indirect use of different physical properties that depend on the size or number of particles in solution. Most often, electrical conductivity, surface tension, light scattering and fluorescence spectroscopy have been used for this purpose.4 As the value of the critical micelle concentration is obtained by extrapolating the loci of any of the above-mentioned properties from above and below the range in which the rate of change shifts until they cross, and since the values obtained using different properties are not identical, the critical micelle concentration is understood as a range of concentration rather than a unique single value.1 Depending on the specific application, it is sometimes desired to be as close as possible to the critical micelle concentration but below it, so that micelles are not formed (e.g. reduction of the surface tension of a blend), or well above the critical micelle concentration (e.g. in the design of emulsion-based products). Thus, a model that can provide reliable estimates of critical micelle concentrations for a wide range of surfactants and conditions is needed, so that a proper selection of surfactants can be performed.

Various methods have been proposed for correlating available measurements of critical micelle concentration and, on that basis, for interpolating or extrapolating to perform predictions. These methods can be divided into the following categories: simple correlations, molecular simulations, activity coefficient models, equations of state, and quantitative structure-property relationship (QSPR) approaches.
The authors are not aware of any current GC-based method developed for the calculation of the critical micelle concentration of surfactants. It is in fact generally accepted that the critical micelle concentration in aqueous media is affected by the molecular structure of the surfactant considered, particularly in terms of alkyl chain length. The critical micelle concentration is known to decrease as the number of carbon atoms in the hydrophobic moiety increases to about 16. However, when the number of carbon atoms approaches 16 the critical micelle concentration no longer decreases rapidly, and when the chain exceeds 18 carbon atoms it may remain substantially unchanged with further increase of the chain length.5 Moreover, the replacement of the hydrocarbon-based hydrophobic moiety with a fluorocarbon-based one with the same number of carbon atoms appears to cause a decrease of the critical micelle concentration.6 Therefore quantitative structure-property relationship models as well as group-contribution based models may be suitable for the modeling of this property.

Empirical correlations have been developed in order to relate the critical micelle concentration to the various structural units of the surfactant. A linear relation between the logarithm of the critical micelle concentration and the number of carbon atoms in the hydrophobic chain was found by Klevens7, with parameters depending on the different classes of nonionic surfactants considered. This kind of correlation may be of limited accuracy for most nonionic surfactants, but it provides a good and fast first estimate. More recently, efforts have been made to model the critical micelle concentration through molecular simulation8,9 but these simulations require extremely long computational times for complex surfactant systems. Other approaches use the representation of the Gibbs energy of mixing as an indicator of the self-assembly of surfactants into micelles. The Gibbs energy of mixing of these aqueous systems is often estimated through activity coefficient models such as NRTL10 and UNIFAC11, with the latter needing the definition of a new group (CH2CH2O) to account for the peculiar behavior of the ethoxylate groups.
Both models provide accurate predictions and can estimate critical micelle concentrations at different temperatures, but their application range is not as broad as that of other methods, as it is usually limited to the class of alkyl ethoxylates. However, the UNIFAC-based model may allow extrapolation because of the predictive nature of UNIFAC; it may therefore be identified as an option for the prediction of the critical micelle concentration. Different equations of state have also been employed to estimate the behavior of aqueous surfactant solutions, such as surface equations of state (Gibbs, Langmuir and Volmer)12 and SAFT13. Results are encouraging, but in order to estimate the model parameters, experimental data (surface pressure vs. surfactant concentration and vapor pressure vs. temperature, respectively) are needed, while extrapolation for predictions is not recommended. Several models have also been proposed based on theoretical considerations, such as those by Molyneux14 and, more recently, by Zdziennicka15. The accuracy of these models is high, due to their strong theoretical basis, but their use as predictive models is questionable. Their application is in fact limited to a few families of surfactants, for which the parameters of the model have been regressed based on experimental measurements of density, viscosity, conductivity and light scattering of aqueous surfactant solutions.

Finally, the largest family of models only recently developed for the prediction of the critical micelle concentration of nonionic surfactants is that of quantitative structure-property relationship (QSPR) models. They are based on the molecular structure of the chemicals and they may easily be used for extrapolation because of the nature of the parameters (descriptors) used. They originate from large numerical regressions and they are usually characterized by high statistical indices of correlation performance. An extensive review of the QSPR models recently developed for the prediction of several properties, including the critical micelle concentration, of nonionic and ionic surfactants is given by Hu et al.16, while the method of Zhu et al.17 is among the latest models of this type to be published. The property model for the critical micelle concentration of nonionic surfactants developed here, instead, belongs to the family of group-contribution methods.
These methods apply very well to chemical process and product design since they can provide accurate predictions without being computationally demanding and can be used in computer-aided molecular design, as they employ the same building blocks for molecular representation. Only the molecular structural information of the surfactant is needed as input information. Another mixture property which has been modeled in a similar manner using only molecular structure information of the compound is the octanol/water partition coefficient.18 The application of the developed model is highlighted through illustrative examples.

2. MODELING OF CMC WITH THE ORIGINAL MARRERO AND GANI GC-METHOD

2.1 DATABASE

The data-set used in the present work contains 161 nonionic surfactants, covering the most common families of nonionic surfactants: linear, branched and phenyl alkyl ethoxylates; alkanediols; alkyl mono and disaccharide ethers and esters; ethoxylated alkyl amines and amides; fluorinated linear ethoxylates and amides; polyglycerol esters and carbohydrate derivative ethers, esters and thiols. In this paper, the data are classified according to the molecular description of each nonionic surfactant and divided into different classes, as given in Table 1. This classification becomes particularly important during the introduction of the unique third order groups (see section 2.3).

INSERT TABLE 1

2.2 MODEL DEVELOPMENT

When developing a QSPR model, the data-set is often divided into training and validation sets. This is not necessarily the case for GC-models, since the formation of a randomly selected validation set may exclude some of the GC-model parameters, limiting the application range of the model itself. Moreover, considering as many data as possible to regress the parameters of the GC-model results in lower uncertainties of the estimated model parameters and consequently lower uncertainties (and better reliability) of the predicted property values.20 Therefore all the available measurements from Katritzky et al.19 have been considered eligible to be used in the parameter estimation step. The Marrero and Gani21 GC-method for property prediction has the form:



$$ f(X) = \sum_i N_i C_i + \sum_j M_j D_j + \sum_k O_k E_k \qquad (1) $$




The function f(X) is a function of the property X and it may contain additional adjustable model parameters (universal constants) depending on the property involved. Ci is the contribution of the first order group of type i that occurs Ni times, Dj is the contribution of the second order group of type j that occurs Mj times and Ek is the contribution of the third order group of type k that occurs Ok times. For determining the contributions Ci, Dj, and Ek, Marrero and Gani suggested a multilevel estimation approach, so that the contributions of the higher levels act as corrections to the approximations of the lower levels.21 As an alternative to this step-wise regression method, a simultaneous regression method can be applied, in which all the terms containing first, second and third order groups are considered in a single regression step. The definition of f(X) is specific to each property X, and the most appropriate form is selected by analyzing the behavior of a certain class of pure compounds as their carbon number increases.21,22 For critical micelle concentration, the most suitable definition of f(X) is:



$$ f(X) = -\log(\mathrm{CMC}) \qquad (2) $$
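As a minimal sketch of how equations (1) and (2) combine in practice, the following Python snippet sums occurrence-weighted contributions over first, second and third order groups and converts the result back to a concentration. It assumes that the logarithm in equation (2) is base 10 and that f(X) is the plain sum of the three levels. The function names are hypothetical, and the contribution values are arbitrary placeholders, not the regressed parameters of Table S2.

```python
from typing import Mapping


def weighted_sum(occurrences: Mapping[str, int],
                 contributions: Mapping[str, float]) -> float:
    """Return sum(count * contribution) over the groups present in a molecule."""
    total = 0.0
    for group, count in occurrences.items():
        if group not in contributions:
            # A missing parameter should fail loudly rather than count as zero.
            raise KeyError(f"no contribution available for group {group!r}")
        total += count * contributions[group]
    return total


def f_of_x(first, second, third, c, d, e) -> float:
    """Right-hand side of equation (1): first + second + third order terms."""
    return weighted_sum(first, c) + weighted_sum(second, d) + weighted_sum(third, e)


def cmc_from_f(f_x: float) -> float:
    """Invert equation (2), assuming a base-10 logarithm (an assumption)."""
    return 10.0 ** (-f_x)


# Placeholder contributions for illustration only (NOT values from Table S2).
C = {"CH3": 0.5, "CH2": 0.25, "OH": -0.5}
D = {}
E = {}

# Hypothetical molecule: one CH3, ten CH2, one OH, no higher-order groups.
f_x = f_of_x({"CH3": 1, "CH2": 10, "OH": 1}, {}, {}, C, D, E)
cmc = cmc_from_f(f_x)
print(f_x, cmc)
```

Raising an error for a group without a parameter, instead of silently treating it as zero, mirrors the point made later in the text that missing contributions have to be supplied by another route such as the GC+ approach.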



Usually the most suitable form of f(X) is chosen by observing the trend of the experimental data of the property to be estimated as a function of the number of CH2 groups for alkanes. This analysis cannot be performed here: alkanes are not surfactants and do not aggregate into micelles, and therefore they do not have a critical micelle concentration. However, a similar approach can be used by considering the largest family of nonionic surfactants: the linear alkyl ethoxylates. This family of nonionic surfactants is characterized by a repetitive structure which may be simplified as follows:

$$ \mathrm{CH_3(CH_2)}_{n-1}\mathrm{(OCH_2CH_2)}_{m}\mathrm{OH} $$


This description is written in short form as CnEOm. Plotting different values of the critical micelle concentration as a function of the number of carbon atoms in the hydrophobic chain of the surfactants, the trend shown in Figure 1 is obtained.

INSERT FIGURE 1

As seen in Figure 1, the dependence of the critical micelle concentration of linear alkyl ethoxylates on the number of carbon atoms of the hydrophobic chain is not linear. If, on the other hand, the same analysis is carried out by plotting the negative logarithm of the critical micelle concentration (the negative sign is taken in order to obtain positive values) against the number of carbon atoms in the carbon chain of the surfactant, the trend shown in Figure 2 is obtained.

INSERT FIGURE 2

As shown in Figure 2, the dependence of the logarithm of the critical micelle concentration on the number of carbon atoms of the hydrophobic tail is linear. This justifies the choice of the form of f(X) as in equation (2).

In the Marrero and Gani GC-method, the molecular structure of a compound is considered as a collection of three types of groups, named first order groups (which are simple functional groups), second order groups and third order groups. The aim of such a multi-level estimation scheme is to obtain high accuracy and reliability while maintaining a wide application range of the model for a number of pure component properties. As an example, the representation of the molecular structure of octyl glucoside (n-octyl-β-D-glucoside, CAS number 29836-26-8), a common nonionic surfactant frequently used to dissolve integral membrane proteins for studies in biochemistry, in terms of first, second and third order groups of the Marrero and Gani GC-method is given in Table 2.

INSERT TABLE 2
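The argument made with Figures 1 and 2, that the critical micelle concentration itself is not linear in the carbon number while its negative logarithm is, can be checked numerically with a straight-line fit. The sketch below is only illustrative: the series is synthetic, constructed to lie exactly on a line in logarithmic space, and does not reproduce any measurement from the data-set. The helper names are hypothetical and a base-10 logarithm is assumed.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Coefficient of determination of the straight-line fit."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic series for illustration only (NOT experimental data).
carbon_numbers = [8, 10, 12, 14, 16]
neg_log_cmc = [2.0, 3.0, 4.0, 5.0, 6.0]      # constructed as 0.5 * n - 2
cmc = [10.0 ** (-v) for v in neg_log_cmc]    # same points on a linear scale

slope, intercept = fit_line(carbon_numbers, neg_log_cmc)
r2_log = r_squared(carbon_numbers, neg_log_cmc)  # fit in logarithmic space
r2_raw = r_squared(carbon_numbers, cmc)          # fit on the raw values
print(slope, intercept, r2_log, r2_raw)
```

For this constructed series the logarithmic fit is exact, whereas the same straight-line model applied to the raw concentrations fits markedly worse, which is the behavior that motivates a logarithmic f(X).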


When using GC-based property models, the contribution of one or more groups needed to represent the molecular structure of the compound whose properties are to be estimated may be missing for various reasons; for example, the parameters may not have been regressed because the necessary experimental data were lacking. To tackle this problem, the atom connectivity index (CI) method has been integrated with the GC-method to predict the missing contributions. This approach is known as the group-contribution+ (GC+) approach and details on the procedure have been given by Gani et al.23.

Before applying the Marrero and Gani GC-method to the data-set chosen for the parameter estimation step, it is necessary to analyze the matrix of group occurrences to make sure that each group describes at least two of the surfactants present in the data-set. A single occurrence would distort the performance of the model, leading to a perfect match for the compounds with those groups but providing uncertain extrapolation capability. Nine such nonionic surfactants have been removed from the data-set. Also, two of the nonionic surfactants have been removed from the data-set because they are isomers and their group descriptions with the Marrero and Gani GC-method are identical. Since it is not appropriate to choose one of the two values for the same group description, both data points are excluded from the parameter estimation step. The data-set originally containing 161 nonionic surfactants is thus reduced to 150 for the parameter regression step. To represent these compounds, 30 first order groups and 11 second order groups are needed. The results of the parameter estimation step performed through the step-wise regression method are given in terms of statistical indices in Table 3 and illustrated in terms of a parity plot in Figure 3.

INSERT TABLE 3

INSERT FIGURE 3

2.3 MODEL IMPROVEMENT USING THIRD ORDER GROUPS


The results in Table 3 indicate that the accuracy of the Marrero and Gani GC-method using only first and second order groups is rather low compared with the performance of the same method for other properties17. In order to improve the accuracy of GC-based methods, Hukkerikar et al.24 provide the following recommendations:

o Add new data, if available, to allow a more comprehensive coverage of the molecule types;
o Check the consistency and uncertainty of the data;
o Identify those compounds characterized by the largest correlation errors;
o Analyze the group descriptions of the above-mentioned compounds;
o Identify opportunities for the introduction of unique third order groups for specific classes of components;
o Regress the new group contributions.

In this work, the size of the data-set is considered satisfactory for describing the entire family of nonionic surfactants. Regarding the second point, since the source of the experimental data did not provide any information on measurement uncertainty, this information could not be included. In this work, the following analysis has been performed:

o Estimation of the critical micelle concentration using the Marrero and Gani GC-model for all the compounds present in the data-set and analysis of the differences between the experimental values and the calculated ones, in order to identify the compounds with the largest prediction errors;


o Analysis of the molecular structures of the “problematic” compounds identified in the previous step and inclusion of new unique third order groups in the Marrero and Gani GC-model to improve the prediction performance through better correlation of data;
o Parameter estimation after the inclusion of the new third order groups in the GC-model to obtain the corresponding contributions.

The step-by-step procedure is repeated until satisfactory statistical indices of performance, such as the coefficient of determination (R²), the standard deviation (SD) and the average absolute deviation (AAD), are obtained. The newly included third order groups and the corresponding parameters are then considered final.

The first step has been addressed by calculating the critical micelle concentration for all the compounds present in the data-set with the original set of groups of the Marrero and Gani GC-model. The statistical indices of performance are then calculated for each compound and the results, divided per surfactant class in order to identify the chemicals with the largest prediction errors, are given in Table S1 in the Supporting Information. In general, the model needs improvement for almost all families, the largest errors being for polyglycerol esters and the lowest errors for branched alkyl ethoxylates. It must be emphasized, however, that a few very large errors have influenced the statistical indices of some of the surfactant families. For example, in the case of phenyl alkyl ethoxylates the large majority of the relative errors are below 5%, but the large errors of C8Ph2, C9Ph2 and C9Ph12, well above 10%, affect the average performance of the whole sub-set. As a solution, a set of new, unique third order groups has been added to balance these prediction errors, improving the overall performance of the model. Particular attention is given to the linear alkyl ethoxylates, which are oligomer-like molecules characterized by two repetitive units, sometimes present more than 15 times per molecule. Similarly, branched alkyl ethoxylates show poor statistical indices, which may be improved through the addition of new third order groups. Alkanediols, polyglycerol esters and several of the carbohydrate derivative compounds also show deficiencies when experimental values are compared to the model predictions. Their fundamental functional group (the hydroxyl group) is repeated several times, and new groups need to be defined to account for this.

The definition of new third order groups is based on the “similarity criteria” approach. This approach compares the molecular structures of the compounds characterized by high prediction errors in order to identify sets of molecules which are “similar” in nature. In order to be “similar”, two or more compounds need to have one or more consecutive first order groups in common. For example, all the linear alkyl ethoxylates are “similar” since all of them have at least one CH2 group consecutive to a CH2O group in their structure. According to this criterion, “similar” compounds may be collected together and the third order group characterizing those similar molecules can be defined. On the basis of the above-mentioned analysis, 15 third order groups have been defined.

Once the new set of groups has been identified, a final parameter regression is performed, where all the group contributions are estimated simultaneously. The results of this parameter regression are reported in Table S2 in the Supporting Information. The performance statistics for the developed model, overall and divided per class, compared to those of the same model before the addition of dedicated third order groups, are summarized in Table 4, while a parity plot is presented in Figure 4. Table 4 also provides a comparison of the developed GC-based method with two different QSPR-based models for the estimation of the critical micelle concentration. The relative statistical indices have been calculated based on the modeled data reported by Katritzky et al.19.

INSERT TABLE 4
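The statistical indices used throughout this section (R², SD and AAD) are not written out explicitly in the text. The sketch below uses commonly adopted definitions and should be read as an assumption about the exact formulas rather than as the authors' implementation: R² as one minus the ratio of the residual to the total sum of squares, SD as the root of the mean squared residual, and AAD as the mean absolute residual. The function name and the input values are hypothetical and are not taken from Table 4.

```python
import math


def performance_indices(experimental, predicted):
    """Return R2, SD and AAD for paired experimental and predicted values."""
    n = len(experimental)
    residuals = [x - p for x, p in zip(experimental, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_exp = sum(experimental) / n
    ss_tot = sum((x - mean_exp) ** 2 for x in experimental)
    return {
        "R2": 1.0 - ss_res / ss_tot,                 # coefficient of determination
        "SD": math.sqrt(ss_res / n),                 # standard deviation of residuals
        "AAD": sum(abs(r) for r in residuals) / n,   # average absolute deviation
    }


# Made-up values of -log(CMC) for illustration only (NOT from the data-set).
exp_values = [2.0, 3.0, 4.0, 5.0]
pred_values = [2.5, 3.0, 3.5, 5.0]

indices = performance_indices(exp_values, pred_values)
print(indices)
```

Computing the same indices per surfactant class, as done for Table S1 and Table 4, would amount to calling the function once for each class subset.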


INSERT FIGURE 4

It can be observed that the statistical indices are much improved for each class of nonionic surfactants. In particular, considerable improvements have been achieved for polyglycerol esters compared to when only first and second order groups were used. To be fair to the authors of the QSPR-based models, we acknowledge that while ours is strictly a correlation error, theirs may or may not be. Some prediction errors which are still observed in some cases can be attributed to the insufficient availability of data for these nonionic surfactant families. When data are lacking, the “similarity” criteria cannot be applied to all those compounds with large errors, and thus unique third order groups cannot be defined. A future improvement is the inclusion of more compounds in the data-set: when more data are available, more third order groups can be defined for the remaining surfactants with large prediction errors.

3. APPLICATION EXAMPLE

The application of the developed GC-model to estimate the critical micelle concentration of nonionic surfactants is illustrated by considering the example of tetraglycerol monostearate. Table 5 shows the groups needed for the description of tetraglycerol monostearate, their occurrences and their contributions in the case that no new third order groups have been defined. Table 6 provides the same information after the inclusion of the new third order groups. Comparing the predictions with the experimental value clearly shows the remarkable effect of the introduction of new third order groups on the performance of the method.

INSERT TABLE 5

INSERT TABLE 6

4. VALIDATION OF THE MODEL


It has already been pointed out that GC-methods usually do not split the data-set into a training set and a validation set, since a randomly chosen validation set may exclude some parameters from the regression, or may distort the performance of the model if a contribution is regressed on a single data point. However, an a posteriori validation is needed in order to test the reliability of the model. For this reason, a subset of 30 data points, corresponding to 20% of the data-set used for the parameter regression, has been randomly chosen and the parameter regression has been performed again excluding the selected subset. To avoid the above-mentioned problems, the random choice of the subset is limited to those compounds whose exclusion would not remove any group from the set to be regressed. The selected subset is highlighted in Table 1, and the predictions obtained with the parameters regressed on the full data-set and on the reduced data-set are reported in Table S4 in the Supporting Information. Since the predictions obtained with the group parameters regressed without the validation subset are very similar to the correlations obtained with the group parameters regressed on the whole data-set of 150 compounds, both sets of model parameters are considered acceptable, and the parameter set reported in Table S2 in the Supporting Information is recommended for use.

5. CONCLUSIONS

A group-contribution property model based on the Marrero and Gani GC-method has been developed for the estimation of the critical micelle concentration of nonionic surfactants. In order to achieve high estimation accuracy, a property-data-model analysis has been carried out and the molecules having the largest prediction errors have been identified. Using this information and applying the “similarity” criteria, appropriate third order groups have been defined.
These new groups and their respective parameters give more detailed structural information about the compounds and thus the


overall prediction performance of the model is significantly improved. The developed model requires only the molecular structure of the nonionic surfactant and provides reliable and accurate values of the critical micelle concentration. The model has a wide range of application, since the well-known classes of nonionic surfactants have been investigated. All the first, second and third order groups needed to describe the nonionic surfactants have been considered and their contributions have been determined. In those cases where the structure of a nonionic surfactant is represented by groups whose parameters could not be regressed in the present work, the connectivity index (CI)-based group-contribution+ (GC+) approach is applied.

Note that in this work the critical micelle concentration has been modeled as a primary property, based only on the molecular structure of the surfactants involved. Therefore, it has been necessary to consider a fixed temperature for the CMC value. In addition, only very few data for the CMC at different temperatures are available in the literature. The addition of a temperature-dependent term to the model is nevertheless considered a relevant future extension of the work. Current and future work should also focus on the extension of the data-set, possibly by considering measurements obtained indirectly through the observation of the same physical property. In this way, the uncertainty connected with the model used for the estimation of the experimental critical micelle concentration would be removed.

ACKNOWLEDGMENTS

Financial support from the Technical University of Denmark is gratefully acknowledged.

SUPPORTING INFORMATION

Table S1: Statistical indices of performance of the correlation using the Marrero and Gani GC-method with only first and second order groups.
Table S2: Marrero and Gani group definitions and contributions.
Table S3: Comparison of the statistical indices of performance of the correlation


using the Marrero and Gani GC-method before and after the addition of dedicated third order groups.
Table S4: Comparison of the calculated values of 30 critical micelle concentrations obtained with the total data-set and with the reduced data-set considered for the validation.
This material is available free of charge via the Internet at http://pubs.acs.org.

LITERATURE CITED

(1) IUPAC. Compendium of Chemical Terminology, 2nd ed. (the "Gold Book"). Compiled by A. D. McNaught and A. Wilkinson. Blackwell Scientific Publications, Oxford (1997). XML on-line corrected version: http://goldbook.iupac.org (2006-) created by M. Nic, J. Jirat, B. Kosata; updates compiled by A. Jenkins. ISBN 0-9678550-9-8. doi:10.1351/goldbook
(2) Rosen, M. J.; Surfactants and Interfacial Phenomena - Third Edition; John Wiley & Sons, Inc. 2004, ISBN 0-471-47818-0
(3) Ray, A.; Solvophobic Interactions and Micelle Formation in Structure Forming Nonaqueous Solvents; Nature (London) 1971, 231, 313
(4) Mukerjee, P.; Mysels, K. J.; Critical Micelle Concentration of Aqueous Surfactant Systems; NSRDS-NBS 36, U.S. Dept. of Commerce, Washington, DC 1971
(5) Greiss, W.; Über die Beziehungen zwischen der Konstitution und den Eigenschaften von Alkylbenzolsulfonaten mit jeweils einer geraden oder verzweigten Alkylkette bis zu 18 Kohlenstoff-Atomen II; Fette, Seifen, Anstrichmittel 1955, 57, 168-172
(6) Shinoda, K.; Hirai, T.; Ionic Surfactants Applicable in the Presence of Multivalent Cations. Physicochemical Properties; J. Physical Chemistry 1977, 81, 1842


(7) Klevens, H. B.; Structure and Aggregation in Dilute Solutions of Surface Active Agents; J. American Oil Chemists’ Society 1953, 30, 74
(8) Sanders, S. A.; Sammalkorpi, M.; Panagiotopoulos, A. Z.; Atomistic Simulations of Micellization of Sodium Hexyl, Heptyl, Octyl, and Nonyl Sulfates; J. Physical Chemistry B 2012, 116, 2430-2437
(9) LeBard, D. N.; Levine, B. G.; Mertemann, P.; Barr, S. A.; Jusufi, A.; Sanders, S.; Klein, M. L.; Panagiotopoulos, A. Z.; Self-assembly of Coarse-grained Ionic Surfactants Accelerated by Graphics Processing Units; Soft Matter 2012, 8, 2385-2397
(10) Chen, C. C.; Molecular Thermodynamic Model for Gibbs Energy of Mixing of Nonionic Surfactant Solutions; AIChE J. 1996, 42, 3231
(11) Voutsas, E. C.; Flores, M. V.; Spiliotis, N.; Bell, G.; Halling, P. J.; Tassios, D. P.; Prediction of Critical Micelle Concentration of Nonionic Surfactants in Aqueous and Nonaqueous Solvents with UNIFAC; Industrial & Engineering Chemistry Research 2001, 40, 2362-2366
(12) Viades-Trejo, J.; Abascal-Gonzalez, D. M.; Gracia-Fadrique, J.; Critical Micelle Concentration of Poly(Oxy-1,2-Ethanediyl), alpha-Nonyl Phenol-omega-Hydroxy Ethers (C9H19C6H4Ei=6,10.5,12,17.5) by Surface Equations of State; J. Surfactants and Detergents 2012, 15, 637-645
(13) Li, X. S.; Lu, J. F.; Li, Y. G.; Study on Ionic Surfactant Solutions by SAFT Equation incorporated with MSA; Fluid Phase Equilibria 2000, 168, 107-123
(14) Molyneux, P.; Rhodes, C. T.; Swarbrick, J.; Thermodynamics of Micellization of N-Alkyl Betaines; Transactions of the Faraday Society 1965, 61, 1043


(15) Zdziennicka, A.; Szymczyk, K.; Krawczyk, J.; Jańczuk, B.; Critical Micelle Concentration of Some Surfactants and Thermodynamic Parameters of their Micellization; Fluid Phase Equilibria 2012, 322, 126-134
(16) Hu, J.; Zhang, X.; Wang, Z.; A Review on Progress in QSPR Studies for Surfactants; International Journal of Molecular Sciences 2010, 11, 1020-1047
(17) Zhu, Z. C.; Wang, Q.; Jia, Q. Z.; Tang, H. M.; Ma, P. S.; Quantitative Structure-Property Relationship of the Critical Micelle Concentration of Different Classes of Surfactants; Acta Physico-Chimica Sinica 2013, 29, 30-34
(18) Marrero, J.; Gani, R.; Group-contribution-based Estimation of Octanol/Water Partition Coefficient and Aqueous Solubility; Industrial & Engineering Chemistry Research 2002, 41, 6623-6633
(19) Katritzky, A. R.; Pacureanu, L. M.; Slavov, S. H.; Dobchev, D. A.; Karelson, M.; QSPR Study of Critical Micelle Concentration of Nonionic Surfactants; Industrial & Engineering Chemistry Research 2008, 47, 9687-9695
(20) Hukkerikar, A.; Sarup, B.; Ten Kate, A.; Abildskov, J.; Sin, G.; Gani, R.; Group-contribution+ (GC+) Based Estimation of Properties of Pure Components: Improved Property Estimation and Uncertainty Analysis; Fluid Phase Equilibria 2012, 321, 25-43
(21) Marrero, J.; Gani, R.; Group-contribution Based Estimation of Pure Component Properties; Fluid Phase Equilibria 2001, 183-184, 183-208
(22) Constantinou, L.; Gani, R.; New Group-contribution Method for Estimation Properties of Pure Components; AIChE J. 1994, 40, 1697-1710


(23) Gani, R.; Harper, P. M.; Hostrup, M.; Automatic Creation of Missing Groups Through Connectivity Index for Pure-component Property Prediction; Industrial & Engineering Chemistry Research 2005, 44, 7262-7269
(24) Hukkerikar, A. S.; Meier, R. J.; Sin, G.; Gani, R.; A Method to Estimate the Enthalpy of Formation of Organic Compounds with Chemical Accuracy; Fluid Phase Equilibria 2013, 348, 23-32

LIST OF FIGURES

Figure 1: Dependence of the experimental critical micelle concentration on the number of carbon atoms in the hydrophobic tail of linear ethoxylates (vertical axis: (CMC)exp; horizontal axis: nr of C atoms)


Figure 2: Dependence of the logarithm of the experimental critical micelle concentration on the number of carbon atoms in the hydrophobic tail of linear ethoxylates (vertical axis: -log(CMC)exp; horizontal axis: nr of C atoms)
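The near-linear trend that Figure 2 describes can be illustrated with a small least-squares fit. The sketch below is only an illustration, not the authors' procedure: it uses the -log(CMC)exp values listed in Table 1 for the CnE6 series, and the helper name `fit_line` is hypothetical.

```python
# Minimal sketch: straight-line fit of -log(CMC)exp against the number of
# carbon atoms in the tail, using the CnE6 entries of Table 1.
# `fit_line` is a hypothetical helper, not part of the paper.

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Tail length n and -log(CMC)exp for C4E6 ... C16E6 (values from Table 1)
n_carbons = [4, 6, 8, 10, 12, 14, 16]
neg_log_cmc = [0.110, 1.164, 2.004, 3.046, 4.060, 5.000, 5.780]

slope, intercept = fit_line(n_carbons, neg_log_cmc)
print(f"-log(CMC) ~ {slope:.3f} * n + {intercept:.3f}")
```

A positive slope means that each additional carbon atom in the tail lowers the CMC by a roughly constant factor within this series, which is why the logarithm rather than the CMC itself is the natural quantity for an additive group-contribution model.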

Figure 3: Parity plot relative to the correlation of 150 data-points regarding critical micelle concentration of nonionic surfactants using the Marrero and Gani GC-method with only first and second order groups (vertical axis: -log(CMC)mod; horizontal axis: -log(CMC)exp)


Figure 4: Parity plot relative to the correlation of 150 data-points regarding critical micelle concentration of nonionic surfactants using the Marrero and Gani GC-method after the addition of dedicated third order groups (vertical axis: -log(CMC)mod; horizontal axis: -log(CMC)exp)

LIST OF TABLES

Table 1: List of the original data-set of experimental critical micelle concentrations of nonionic surfactants at 25 °C (19), together with the molecular description of each of them. The distinction in classes is based only on the molecular structure of the surfactants. Compounds highlighted in grey have been excluded from the parameter regression step. Compounds marked with (v) have been chosen for the validation step.

Code            -log(CMC)exp    Code            -log(CMC)exp    Code            -log(CMC)exp

Linear alkyl ethoxylates (43 compounds)
CnEm: CnH2n+1O(C2H4O)mH
C4E1            0.009           C10E5 (v)       3.100           C12E12          3.854
C4E6            0.110           C10E6           3.046           C12E14          4.260
C6E3            0.980           C10E7           3.015           C13E8           4.569
C6E4 (v)        1.032           C10E8           3.000           C14E6           5.000
C6E5            1.017           C10E9           2.886           C14E8           5.046
C6E6            1.164           C11E8 (v)       3.523           C14E9 (v)       5.046
C8E1 (v)        2.310           C12E1           4.638           C15E8 (v)       5.456
C8E3            2.125           C12E2           4.481           C16E6           5.780
C8E4            2.063           C12E3           4.284           C16E7           5.770
C8E5            1.959           C12E4 (v)       4.194           C16E8           5.921
C8E6            2.004           C12E5           4.194           C16E9           5.678
C8E9            1.886           C12E6           4.060           C16E10 (v)      5.699
C9E8 (v)        2.520           C12E7           4.086           C16E12          5.638
C10E3           3.222           C12E8           4.000
C10E4           3.167           C12E9           4.000

Phenyl alkyl ethoxylates (15 compounds)
CnPhEm: CnH2n+1(C6H4)O(C2H4O)mH
C8PhE1          4.305           C8PhE6          3.678           C8PhE30 (v)     3.959
C8PhE2          4.116           C8PhE7          3.602           C8PhE40         4.119
C8PhE3          4.013           C8PhE8          3.553           C9PhE2          3.377
C8PhE4          3.886           C8PhE9          3.523           C9PhE5 (v)      3.328
C8PhE5 (v)      3.824           C8PhE10         3.481           C9PhE12         3.301

Branched alkyl ethoxylates (5 compounds)
ICnEm: (CH2Cn/2-2Hn-4)2CHCH2O(C2H4O)mH
IC4E6           0.049           IC8E6           1.670           IC10E9          2.526
IC6E6           1.016           IC10E6 (v)      2.547

Alkanediols (5 compounds) C8GLYCER CnDIOL C10DIOL CnDIOL C11DIOL (v)

2.237

C8H17OCH2CH(OH)CH2OH

(n=10,12) 2.638

C12DIOL

(n=11,15) 2.638

Cn-2H2n-3CH(OH)CH2OH 3.745

Cn-3H2n-5CH(OH)CH2CH2OH C15DIOL

4.886

Alkyl mono and disaccharide ethers and esters (7 compounds) CnGLUC

CnH2n+1O(C6H11O5)

C8GLUC

1.602

C12DELAC

3.222

C12H25NH(C6H12O4)O(C6H11O5) (first ring open)

C10GLUC (v)

2.658

C12GLUC

C12MALT

3.620

C12H25(C6H10O4)O(C6H11O5)

C12SUCR

3.469

C11H23C(O)O(C6H10O4)O(C6H11O5)

C18SUCR

5.292

C8H17CH=CHC7H14C(O)O(C6H10O4)O(C6H11O5)

3.721

Ethoxylated amines and amides (12 compounds) C11CONEO

3.585

CnCONEmE

C11H23C(O)N(C2H4OH)2 CnH2n+1C(O)N[(C2H4O)NCH3]2

C9CONE3E

2.299

C11CONE2E

3.398

C9CONE4E

2.193

C11CONE3E (v)

3.292

C12ALAE4

3.413

C12H23NHCHCH3C(O)O(C2H4O)4H

C12GLYE4

3.474

C12H23NHCH2C(O)O(C2H4O)4H

C12SARE4

3.533

C12H23NCH3CH2C(O)O(C2H4O)4H

C12AMEn C12AME3

C11CONE4E

3.611

C12AME9

3.125

C12H25CON(CH3)CH2CH2O(C2H4O)nH 3.292

C12AME6 (v)

3.187

Fluorinated linear ethoxylates and amides (20 compounds)


CFnCONEm CF6CONE3

CnF2n+1CH2C(O)N[(C2H4O)mCH3]2 3.260

CF8CONE3 (v)

4.921

CF10CONE

6.523

CF6SE7

4.319

CF6SE3SE

4.469

CF6SEn

C6F13C2H4SC2H4O(C2H4O)nH

CF6SE2

4.602

CF6SE3

4.553

CF6SEnSEm CF6SESE2


CF6SE5

4.432

C6F13C2H4(SC2H4OC2H4)n(SC2H4OC2H4)mOH 4.638

HnE3

CF6SE2SE (v)

4.585

H(CF2)nCH2(C2H4O3)CH3 H4E3

2.097

FnE3

H6E3

3.523

F(CF2)nCH2(C2H4O3)CH3 F4E3

2.699

F6E3 (v)

4.097

F(CF2)nCH2CH2CHNHCO(OC2H4)mCH3

FnC3NCOEm F4C3NCOE2

2.009

F6C3NCOE2

3.824

F8C3NCOE2

4.620

F4C3NCOE2

2.854

F6C3NCOE3 (v)

4.046

F8C3NCOE3

4.959

GLY10OL-1

4.676

GLY10LA-1

4.549

Polyglycerol esters (11 compounds) C17H33COO(CH2CHOHCH2O)mH

GLYnOL-1 GLY4OL-1

4.484

4.402

GLY6-LA-1 (v)

4.446

C17H35COO(CH2CHOHCH2O)mH

GLYnST-1 GLY4ST-1

4.562

C11H21COO(CH2CHOHCH2O)mH

GLYnLA-1 GLY4LA-1

GLY6-OL-1 (v)

4.650

GLY6ST-1

SORB-LA-1

4.440

C6O4H11OCOC12H23

SORB-OL-1

4.578

C6O4H11OCOC17H33

SORB-OL-3

4.944

C6O4H9(OCOC17H33)3

4.553

Carbohydrate-derivate esters, ethers and thiols (43 compounds) CnH2n+1COOC12H23O10

Cn-LACTOSE C8-LACTOSE

2.580

C12-LACTOSE

3.370

C16-LACTOSE

5.020

3.370

C16-LACTITOL

5.120

CnH2n+1COOC12H21O10

Cn-LACTITOL C8-LACTITOL

2.561

C12-LACTITOL (v)

N-C12-MPYR (v)

3.740

C12H23OC12O10H21 CnH2n+1OCOC5H11O4

Cn-OCO-XYL C4-OCO-XYL

0.921

C6-OCO-XYL

2.000

C8-OCO-XYL

2.357

C5-OCO-XYL

1.237

C7-OCO-XYL (v)

1.745

C9-OCO-XYL

2.745

CnH2n+1OC5H11O4

Cn-O-XYL C4-O-XYL

1.237

C7-O-XYL

2.036

C10-O-XYL

3.092

C5-O-XYL

1.420

C8-O-XYL

2.174

C11-O-XYL

3.523

C6-O-XYL (v)

2.027

C9-O-XYL (v)

2.678

Cn-S-XYL

CnH2n+1SC5H11O4


C4-S-XYL

0.745

C5-S-XYL (v)

1.337

C6-S-XYL

1.796

3.854

C18-OCO-GLU

3.699

CnH2n+1OCOC6H11O5

Cn-OCO-GLU C8-OCO-GLU

2.796

C12-OCO-GLU

3.638

C16-OCO-GLU

C12-O-MALT

3.482

C12CONE4

3.301

C12H25O(C6H10O4)O(C6H11O5) C12H25CONH(C2H4O)4H

C8TGLUPYR

2.071

C8H17SC6H11O5 (CnH2n+1NC3H6NCOC5H11O5)2(CH2)2

BIS(CnGA) BIS(C8GA)

4.174

BIS(C12GH)

5.284

BIS(C12GA) (v)

(CnH2n+1NC3H6NCOC11H21O11)2(CH2)2

BIS(CnLA) BIS(C8LA) GLUPYR-n

5.420

(C12H25NC3H6NCOC6H13O6)2(CH2)2

3.886

BIS(C12LA)

(n=1,2)

5.051

[CH3(CH2)3OC6O5H10CO)]2(CH2)n+1

GLUPYR-1

2.143

[CH3(CH2)3OC6O5H10CO)](CH2)2[CH3(CH2)3OC6O5H10CO)]

GLUPYR-2

1.883

[CH3(CH2)3OC6O5H10CO)](CH2)3[CH3(CH2)3OC6O5H10CO)]

GLUPYR-3

2.669

[CH3(CH2)3OC6O5H10CO)](CH2)2[CH3(CH2)3OC6O5H10CO)]

GLUPYR-4

2.509

[CH3(CH2)3OC6O5H10CO)](CH2)3[CH3(CH2)3OC6O5H10CO)]

GLUPYR-5

1.801

[CH3(CH2)3OC6O5H10CO)]C6H4[CH3(CH2)3OC6O5H10CO)]

GLUPYR-6

0.959

CH3(CH2)3OC6O5H11

GLUPYR-7

1.886

[CH3(CH2)3OC6O5H10CO)](CH2)2[CH3(CH2)3OC6O5H10CO)]
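The compounds marked (v) in Table 1 form the validation subset, which the text says was drawn at random under the constraint that no group disappears from the regression set. One possible way to implement such a constrained draw is sketched below. This is an assumption about how the constraint could be enforced, not the authors' procedure; the function name `pick_validation_subset` and the toy group sets are hypothetical.

```python
# Minimal sketch of a constrained random validation split: a compound may be
# moved to the validation subset only if every group it contains still occurs
# in at least one compound left in the training set.
# The function name and the toy data are hypothetical illustrations.
import random
from collections import Counter

def pick_validation_subset(compounds, k, seed=0):
    """compounds maps a compound code to the set of groups describing it."""
    rng = random.Random(seed)
    # Number of training compounds in which each group currently occurs
    coverage = Counter(g for groups in compounds.values() for g in groups)
    candidates = list(compounds)
    rng.shuffle(candidates)
    chosen = []
    for name in candidates:
        if len(chosen) == k:
            break
        # Eligible only if no group would be left without a training compound
        if all(coverage[g] > 1 for g in compounds[name]):
            chosen.append(name)
            for g in compounds[name]:
                coverage[g] -= 1
    return chosen

# Toy data (not from Table 1): compound "D" is the only one containing "aCH"
toy = {
    "A": {"CH3", "CH2", "OH"},
    "B": {"CH3", "CH2", "OH", "CH2O"},
    "C": {"CH3", "CH2", "OH", "CH2O"},
    "D": {"CH3", "CH2", "aCH"},
    "E": {"CH3", "CH2", "OH"},
}
validation = pick_validation_subset(toy, k=2)
training_groups = set().union(*(g for n, g in toy.items() if n not in validation))
print(validation, sorted(training_groups))
```

In this toy case compound "D" can never be selected, because removing it would leave the "aCH" contribution without any data point to regress on.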

Table 2: Example of the decomposition in groups (first, second and third order) of a nonionic surfactant (Octyl glucoside), according to the Marrero and Gani GC-method

Molecular structure: octyl glucoside [structure drawing]

First order groups and occurrences      Second order groups and occurrences     Third order groups and occurrences
1 - CH3                                 1 - CHcyc-CH2                           Not used
7 - CH2                                 3 - CHcyc-OH
4 - OH                                  1 - CHcyc-O
1 - CH2O
5 - CHcyc
1 - Ocyc
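A quick consistency check on a first order decomposition such as the one in Table 2 is an atom balance: the first order groups must account for every atom of the molecule exactly once. The sketch below assumes the elemental composition of each group as read from its label and compares the total with the molecular formula of octyl glucoside, C14H28O6. The dictionary names are hypothetical.

```python
# Minimal sketch: atom balance for the first order groups of octyl glucoside
# (Table 2). Group compositions are assumed from the group labels.
from collections import Counter

# Elemental composition of each first order group (assumed from its label)
GROUP_ATOMS = {
    "CH3":   {"C": 1, "H": 3},
    "CH2":   {"C": 1, "H": 2},
    "OH":    {"O": 1, "H": 1},
    "CH2O":  {"C": 1, "H": 2, "O": 1},
    "CHcyc": {"C": 1, "H": 1},
    "Ocyc":  {"O": 1},
}

# Occurrences of the first order groups as listed in Table 2
first_order = {"CH3": 1, "CH2": 7, "OH": 4, "CH2O": 1, "CHcyc": 5, "Ocyc": 1}

totals = Counter()
for group, count in first_order.items():
    for atom, n_atoms in GROUP_ATOMS[group].items():
        totals[atom] += count * n_atoms

print(dict(totals))
```

Second and third order groups are not included in the balance, since they describe larger fragments that overlap the first order groups rather than adding new atoms.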

Table 3: Statistical indexes for the correlation of 150 data-points regarding critical micelle concentration of nonionic surfactants using the Marrero and Gani GC-method with only first and second order groups

Data-points for the regression    Correlation coefficient (R2)    Standard deviation (SD)    Average absolute deviation (AAD)    Maximum absolute deviation (AADmax)
150                               0.8964                          0.4018                     0.3115                              1.5082

Residual distribution plot: [histogram of the residuals; horizontal axis from -2 to 2]
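The statistical indexes reported in Table 3 can be computed from paired experimental and modeled values. The sketch below uses commonly adopted definitions, which are assumptions here because this section does not restate the formulas: R2 = 1 - SSres/SStot, SD as the root of the mean squared residual, AAD as the mean absolute residual and AADmax as the largest absolute residual. The function name and the sample numbers are hypothetical and are not taken from the data-set.

```python
# Minimal sketch of the performance indexes of Table 3 under assumed
# definitions (the paper's exact formulas are not restated in this section).
import math

def regression_stats(y_exp, y_mod):
    """Return R2, SD, AAD and AADmax for paired experimental/model values."""
    n = len(y_exp)
    residuals = [e - m for e, m in zip(y_exp, y_mod)]
    mean_exp = sum(y_exp) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "SD": math.sqrt(ss_res / n),
        "AAD": sum(abs(r) for r in residuals) / n,
        "AADmax": max(abs(r) for r in residuals),
    }

# Hypothetical -log(CMC) values, for illustration only
y_exp = [1.0, 2.0, 3.0, 4.0]
y_mod = [1.1, 1.9, 3.2, 3.8]
stats = regression_stats(y_exp, y_mod)
print({name: round(value, 4) for name, value in stats.items()})
```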


Table 4: Statistical indexes for the correlation of 150 data-points regarding critical micelle concentration of nonionic surfactants using the Marrero and Gani GC-method before and after the addition of third order groups, compared with two different QSPR models (19)

Method                                      Data-points for the regression    Correlation coefficient (R2)    Standard deviation (SD)    Average absolute deviation (AAD)    Maximum absolute deviation (AADmax)
GC-based method without 3rd order groups    150                               0.8964                          0.4018                     0.3115                              1.508
GC-based method with 3rd order groups       150                               0.9824                          0.1735                     0.1274                              0.669
QSPR model 1                                161                               0.8876                          0.4210                     0.3602                              1.646
QSPR model 2                                161                               0.9460                          0.2949                     0.2546                              0.889

Residual distribution plot: [one histogram of the residuals per method]


Table 5: Prediction of the critical micelle concentration of tetraglycerol monostearate before the introduction of new third order groups. The predicted value is compared with the experimental value and with the predictions of the QSPR models.

Tetraglycerol monostearate
Molecular structure: [structure drawing]
Molecular formula: C30H60O10

First order groups                Occurrences    Group contribution
CH3                               1              0.129
CH2                               19             0.360
OH                                1              -0.302
CH2CO                             1              -0.345
OCH2CHOH                          4              -0.290

Second order groups               Occurrences    Group contribution
CHOH                              3              0.182
CHm(OH)CHn(OH) (m,n in 0..2)      1              0.292

Third order groups                Occurrences    Group contribution
No third order groups are involved

-log(CMC)mod = 5.775
QSPR models from Table 4 (19): -log(CMC)QSPR = 5.329; 5.125
-log(CMC)exp = 4.650

Table 6: Prediction of the critical micelle concentration of tetraglycerol monostearate after the introduction of new third order groups. The predicted value is compared with the experimental value and with the predictions of the QSPR models.

Tetraglycerol monostearate
Molecular structure: [structure drawing]
Molecular formula: C30H60O10

First order groups                Occurrences    Group contribution
CH3                               1              -0.223
CH2                               19             0.434
OH                                1              -0.892
CH2CO                             1              -0.324
OCH2CHOH                          4              -0.297

Second order groups               Occurrences    Group contribution
CHOH                              3              -0.145
CHm(OH)CHn(OH) (m,n in 0..2)      1              0.012

Third order groups                Occurrences    Group contribution
(CH2)n-CH2CO (n=15)               1              -1.229

-log(CMC)mod = 4.609
QSPR models from Table 4 (19): -log(CMC)QSPR = 5.329; 5.125
-log(CMC)exp = 4.650

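The additive core of a group-contribution estimate, as tabulated in Tables 5 and 6, is a sum of occurrences multiplied by contributions over the first, second and third order groups. The sketch below computes only that weighted sum from the tabulated values. The complete model equation is not restated in this section, so any constant term or transformation it contains is not included here, and the printed sums should not be expected to coincide with the tabulated -log(CMC) predictions. The names `gc_sum`, `table5` and `table6` are hypothetical.

```python
# Minimal sketch: weighted sum of group contributions for tetraglycerol
# monostearate, using the occurrences and contributions of Tables 5 and 6.
# Only the additive part is computed; any constant or transformation in the
# full model equation is NOT included (it is not given in this section).

def gc_sum(groups):
    """Sum occurrences * contribution over a list of (name, occ, contrib)."""
    return sum(occ * contrib for _name, occ, contrib in groups)

# Table 5: parameters regressed before the new third order groups
table5 = [
    ("CH3", 1, 0.129), ("CH2", 19, 0.360), ("OH", 1, -0.302),
    ("CH2CO", 1, -0.345), ("OCH2CHOH", 4, -0.290),      # first order
    ("CHOH", 3, 0.182), ("CHm(OH)CHn(OH)", 1, 0.292),    # second order
]

# Table 6: parameters regressed after adding the new third order groups
table6 = [
    ("CH3", 1, -0.223), ("CH2", 19, 0.434), ("OH", 1, -0.892),
    ("CH2CO", 1, -0.324), ("OCH2CHOH", 4, -0.297),      # first order
    ("CHOH", 3, -0.145), ("CHm(OH)CHn(OH)", 1, 0.012),   # second order
    ("(CH2)n-CH2CO (n=15)", 1, -1.229),                  # third order
]

sum_before = gc_sum(table5)
sum_after = gc_sum(table6)
print(f"weighted sum, Table 5 parameters: {sum_before:.3f}")
print(f"weighted sum, Table 6 parameters: {sum_after:.3f}")
```

Keeping the three orders in one list reflects the additive structure of the method: a third order group such as (CH2)n-CH2CO enters as a correction term on top of the lower order contributions.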