Prediction of Indices of Refraction and Glass Transition Temperatures

J. Phys. Chem. B 2002, 106, 1501-1507


Prediction of Indices of Refraction and Glass Transition Temperatures of Linear Polymers by Using Graph Theoretical Indices

R. García-Domenech*
Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultat de Farmàcia, Universitat de València, Spain

J. V. de Julián-Ortiz
J. V. D. J. O. Investigación de Nuevos Productos, Valencia, Spain

Received: June 20, 2001; In Final Form: October 30, 2001

Graph-theoretical indices were used exclusively to predict the indices of refraction, n, and glass transition temperatures, Tg, of a group of addition polymers. Models with 10 variables were selected for the prediction of n (r = 0.981, SEE = 0.0147) and Tg/M (r = 0.946, SEE = 0.439). The average errors in the predictions were 0.69% and 12.7% for n and Tg, respectively. The descriptors involved in these models were calculated from the structures of the monomers.

Introduction

The aim of this study was to obtain graph-theoretical models for the structure-property relationship of refractive indices and glass transition temperatures for a broadly representative group of high molecular weight addition polymers. The functionalities present in the side chains included halides, cyanides, carboxylates, acetates, amides, ethers, alcohols, hydrocarbon chains, aromatic rings, and nonaromatic rings.

The refractive index (n), also called the index of refraction, is a measure of the bending of a ray of light when passing from one medium into another. It is also equal to the velocity c of light of a given wavelength in empty space divided by its velocity v in a substance, or n = c/v. The prediction of this property in polymers is valuable due to its application to the design of new optical materials. For example, the total reflection of light at the boundary between two media of different optical properties is of the utmost importance in the manufacture of waveguides, films, or optical fibers. In addition, n is related to dielectric functions and other electromagnetic properties,1 such as the energy gap in semiconductors.2

The prediction of refractive indices for polymers has been carried out using several approaches.3-5 Some of them are outlined here. Mekenyan et al.3 used a modified Wiener path number, which provided finite values in the case of infinite polymer chains within homologous series (topological extrapolation method). Askadskii4 and Jenecke and co-workers6 have developed successful group contribution methods, on the basis of the additivity principle. Katritzky et al.5 have applied the CODESSA (COmprehensive DEscriptors for Structural and Statistical Analysis) method to the structures of the repeating unit of the polymers, end-capped by hydrogen atoms.

The plastic behavior of polymers is influenced by the arrangement of their molecules on a large scale. Polymer morphologies can be considered either amorphous or crystalline.
Amorphous molecules are arranged randomly and are intertwined, whereas crystalline molecules are arranged closely and in a discernible order. A particular thermoplastic material retains its molded shape up to a certain temperature, known as the glass transition temperature (Tg). Below Tg, the molecules of a polymer material are frozen in what is known as the glassy state; there is little or no movement of molecules, and the material is stiff and even brittle. Above Tg, the amorphous parts of the polymer show plasticity due to the increase in the rate of molecular motions, which causes an increase in the distances between polymer chains. Molecules display increased mobility and the material enters the rubbery state. The coefficient of thermal expansion and the heat capacity vary dramatically when Tg is surpassed.

Tg is one of the most important properties of amorphous polymers and, despite the proliferation of additives, largely determines the uses of a given polymer, as shown by Chipara.7 This property is difficult to determine experimentally because the transition takes place over a comparatively wide temperature range and depends on conditions such as the method of measurement, the duration of the experiment, and the pressure.8-10 The discrepancies in reported values of Tg in the literature can be quite large.

Zutty and Whitworth11 related the glass transition temperature of polyvinyls to the heats of vaporization and boiling points of low molecular weight model compounds. Mekenyan et al.3 also applied their topological extrapolation method to this problem. Several predictive models have appeared that make use of additive methods.8,10,12 Other formalisms, in which the concept of flexibility plays an important role, have been developed from conformational analysis. They comprise the Hopfinger and Koehler method13 and the EVM (energy, volume, mass) model.14,15 Again, Katritzky et al.8 have applied the CODESSA method with an approach similar to the one used in the prediction of n, but calculating the indices for representative oligomeric structures.

* Corresponding author. E-mail: [email protected].

If the polymer chain length is large enough, the two properties n and Tg have no relationship to the molecular mass of the polymer. Since all the properties depend on the chemical structure of the polymer molecule, and this structure is conditioned by the monomer structure, all the descriptors used in the present work were calculated for the molecular monomer

10.1021/jp012360u CCC: $22.00 © 2002 American Chemical Society Published on Web 01/18/2002

1502 J. Phys. Chem. B, Vol. 106, No. 6, 2002

García-Domenech and de Julián-Ortiz

TABLE 1: Symbols for Topological Indices and Their Definitions

index symbol            definition                                                      ref
$^{m}\chi_{p}$          path connectivity index of order m = 0-4                        28
$^{m}\chi^{v}_{p}$      path valence connectivity index of order m = 0-4                28
$^{m}\chi_{c}$          cluster connectivity index of order m = 0-4                     28
$^{m}\chi^{v}_{c}$      cluster valence connectivity index of order m = 0-4             28
$^{m}\chi_{pc}$         path-cluster connectivity index of order m = 0-4                28
$^{m}\chi^{v}_{pc}$     path-cluster valence connectivity index of order m = 0-4        28
$G_{m}$                 topological charge index of order m = 1-5                       29
$J_{m}$                 bond topological charge index of order m = 1-5                  29
$G^{v}_{m}$             valence topological charge index of order m = 1-5               29
$J^{v}_{m}$             valence bond topological charge index of order m = 1-5          29
$\kappa$                kappa index                                                     31
$S^{T}(i)$              atom-type E state indices                                       32
$H_{max}$               maximum hydrogen E state index                                  32
$V_{4}$                 number of vertices of degree 4                                  30
$W$                     Wiener path number                                              33

TABLE 2: Statistics of the Equation Selected for the Prediction of Refractive Indices^a

before the chain-reaction polymerization. For example, the descriptors calculated for the molecule of ethylene were used to represent the macromolecule of polyethylene.

A graph-theoretical description of the molecular structure is exclusively assumed in the present work. This paradigm has the advantage of fast calculation for almost any conceivable organic molecule, in addition to its strong predictive ability. Different graph-theoretical indices have proved successful in QSAR and QSPR analyses to predict diverse physical, chemical, and biological properties in several groups of compounds.16-20 The Randić-Kier-Hall indices are the most widely used, and although they do not yet have an unambiguous interpretation, some theoretical articles have related them to orbital energies.21,22 A different graph-theoretical approach, based on the cluster-expansion method, has previously been used in the estimation of molar refractive indices for organic compounds.23,24

Materials and Methods

A number of structurally heterogeneous linear addition polymers were selected. The reported experimental data have been taken from articles by Katritzky et al.5,8

In chemical graph theory, molecular structures are normally represented as hydrogen-depleted graphs, whose vertices and edges stand for atoms and covalent bonds, respectively. Chemical structural formulas can then be treated as undirected, finite multigraphs with labeled vertices, commonly known as molecular graphs. Graph-theoretical indices, also known as topological indices, are descriptors that characterize molecular graphs and are able to account for their structural properties.

To calculate the descriptors, the polymers were represented by their corresponding monomers. A set of graph-theoretical descriptors was calculated for each monomer molecule by using Hall’s MOLCONN-Z25 and Etopo1126 programs.
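As a rough illustration of the monomer representation described above, the sketch below computes the first-order connectivity index (the sum of (δi·δj)^(-1/2) over all edges, where δ is the vertex degree) for a few hydrogen-depleted monomer graphs. The vertex numbering, the edge lists, and the function name `randic_index` are assumptions made for this example; this is not the MOLCONN-Z implementation, and multiple bonds are counted as a single edge when computing degrees.

```python
from math import sqrt


def randic_index(edges):
    """First-order connectivity index of a hydrogen-depleted graph.

    edges: list of (a, b) vertex pairs, one per bond between non-H atoms.
    """
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Each edge contributes (delta_a * delta_b) ** -1/2.
    return sum(1.0 / sqrt(degree[a] * degree[b]) for a, b in edges)


# Hypothetical vertex numbering for four monomers (carbon skeletons only).
monomers = {
    "ethylene": [(0, 1)],
    "propylene": [(0, 1), (1, 2)],
    "1-butene": [(0, 1), (1, 2), (2, 3)],
    "isobutene": [(0, 1), (1, 2), (1, 3)],
}

for name, edges in monomers.items():
    print(f"{name}: {randic_index(edges):.4f}")
```

For the two four-carbon monomers, the branched isobutene gives a smaller value than the linear 1-butene, which matches the usual reading of this index as a branching descriptor.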
The indices calculated belonged to the following families (Table 1): Randić-Kier-Hall subgraph connectivity indices mχt 27 up to order 4 and their corresponding valence indices,28 topological charge indices,29 topological geometric indices,30 kappa indices,31 atom-type electrotopological (E) state indices,32 and Wiener index.33 A description of the indices used is given in the Appendix. The physical properties were linearly correlated with the aforementioned descriptors to obtain several connectivity functions. The Furnival-Wilson algorithm34 was used to obtain

variable   coeff B   SE(B)   t-Student   p-level

constant 1χ 0χv κ3 Hmax ST(-CH3) ST(aCHa) ST(dC

0.8. It showed r² = 0.844, but SEE = 0.0282, that is, a standard error of the estimates 92% greater than the one obtained with the selected 10-variable equation. The simplest equation with r² > 0.9 had six variables, namely 0χv, κ3, ST(aCHa), ST(-F), ST(-CH3), and 1χ. Its SEE = 0.0206, that is, 40% greater than that of the finally selected equation. The increase in the number of variables seems justified by the lowering of SEE, until an optimum given by the lowest Cp.

It is noteworthy that several atom-type electrotopological state indices appear in the equations. These parameters account for the influence of certain atoms on the value of n. The presence of fluorine makes the value of n smaller, which is reflected in the negative sign of the ST(-F) term. The presence of the Randić index 1χ in the equation reflects the influence of branching on the value of n: the more branched the monomer unit, the smaller 1χ is and, therefore, the larger n is.

Table 3 shows the predicted results obtained for each of the compounds in the training group (column 3). The mean error in predictions is MRE = 0.69%, which reveals the quality of the selected model for the prediction of the refractive index. Only two polymers, poly(p-methoxystyrene) and poly(vinylidene chloride), present errors in prediction greater than 2% (2.18 and 2.62%, respectively). Similar results were obtained in the cross-validation study (column 5, Table 3), with a mean error in the prediction of 0.82% and SEE(CV) = 0.0169. The plots of the residuals obtained with the selected equation versus the residuals of the cross-validation (Figure 1a) and of the correlation coefficient, r², versus the prediction coefficient, r²(CV) (Figure 1b), illustrate the quality of the selected model. The discrepancies between the two sets of residuals are small for most of the studied polymers. Table 4 shows the results of prediction for the test group.
The mean error is 1.95%, which confirms the validity of the proposed equation.

Glass Transition Temperature. The multilinear regression study carried out with Tg shows differences in relation to the previously studied property, n. Greater errors were obtained in general in all the correlations. As pointed out in the Introduction, this might be due to the difficulty of experimentally quantifying this property. In a first attempt that used Tg directly as the dependent variable, poor regressions were obtained even when many descriptors were used in the equation (r < 0.85). Since Bicerano has obtained successful results in the prediction of Tg/M, that is, Tg divided by the molecular weight of the repeating unit,10 this approach was attempted. Table 5 shows the statistical parameters for the best regression equation obtained with 10 variables and using the ratio Tg/M as the dependent variable. As with n, all the descriptors in the equation are highly significant, and a good correlation was obtained (r = 0.9455). Likewise, equations with fewer


Figure 1. Validation of the mathematical model obtained with the index of refraction. (a) Residuals obtained with the best regression versus residuals obtained by cross-validation. (b) Correlation coefficients, r², versus prediction coefficients, r²(CV), obtained by randomization test.

TABLE 4: Prediction of Indices of Refraction, n, in the Test Set of a Group of Addition Polymers

polymer                                  n(expt)^a   n(calc)^b   residual   error (%)
poly(oxymethylene)                       1.4800      1.5002       0.0202    1.36
poly(oxyethylene)                        1.4563      1.4733       0.0170    1.17
poly(ethylene terephthalate)             1.5750      1.5476      -0.0274    1.74
poly[oxy(2,6-dimethyl-1,4-phenylene)]    1.5750      1.5398      -0.0352    2.24
poly(p-xylylene)                         1.6690      1.6355      -0.0335    2.01
poly(vinylbutyral)                       1.4850      1.4814      -0.0036    0.24
poly[oxy(methylphenylsilylene)]          1.5330      1.6399       0.1069    6.97
poly(styrene sulfide)                    1.6568      1.6341      -0.0227    1.37
poly(chloro-p-xylylene)                  1.6290      1.6373       0.0083    0.51
poly[oxy(methyl-n-hexylsilylene)]        1.4450      1.5192       0.0742    5.14
poly(propylene oxide)                    1.4570      1.4634       0.0064    0.44
poly(3-butoxypropylene oxide)            1.4580      1.4449      -0.0131    0.90
poly(3-hexoxypropylene oxide)            1.4590      1.4400      -0.0190    1.30
poly(1-phenylethyl methacrylate)         1.5487      1.5777       0.0290    1.87

a From ref 5. b From the mathematical model (Table 2).
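The error columns of Table 4 follow directly from the experimental and calculated values. The short sketch below recomputes the percentage error of each test polymer and their mean from the tabulated n(expt) and n(calc); the variable and function names are illustrative choices, and the absolute relative error is assumed to be the error measure reported in the table.

```python
# (polymer, n_expt, n_calc) rows transcribed from Table 4
TABLE_4 = [
    ("poly(oxymethylene)", 1.4800, 1.5002),
    ("poly(oxyethylene)", 1.4563, 1.4733),
    ("poly(ethylene terephthalate)", 1.5750, 1.5476),
    ("poly[oxy(2,6-dimethyl-1,4-phenylene)]", 1.5750, 1.5398),
    ("poly(p-xylylene)", 1.6690, 1.6355),
    ("poly(vinylbutyral)", 1.4850, 1.4814),
    ("poly[oxy(methylphenylsilylene)]", 1.5330, 1.6399),
    ("poly(styrene sulfide)", 1.6568, 1.6341),
    ("poly(chloro-p-xylylene)", 1.6290, 1.6373),
    ("poly[oxy(methyl-n-hexylsilylene)]", 1.4450, 1.5192),
    ("poly(propylene oxide)", 1.4570, 1.4634),
    ("poly(3-butoxypropylene oxide)", 1.4580, 1.4449),
    ("poly(3-hexoxypropylene oxide)", 1.4590, 1.4400),
    ("poly(1-phenylethyl methacrylate)", 1.5487, 1.5777),
]


def percent_error(n_expt, n_calc):
    """Absolute relative error of a prediction, in percent."""
    return abs(n_calc - n_expt) / n_expt * 100.0


errors = {name: percent_error(n_expt, n_calc) for name, n_expt, n_calc in TABLE_4}
mean_error = sum(errors.values()) / len(errors)
worst = max(errors, key=errors.get)

print(f"mean error = {mean_error:.2f}%")
print(f"largest error: {worst} ({errors[worst]:.2f}%)")
```

The mean reproduces the 1.95% quoted in the text, and the two silicon-containing polymers account for the largest individual errors.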

variables and greater Cp values showed greater errors of estimates. The best equation having five variables (1χv, ST(-O-), ST(-F), J1, W) was the simplest one with r² > 0.8 (r² = 0.837). It showed SEE = 0.5257, i.e., 20% greater than the one obtained with 10 variables. The presence in the equation of the variables ST(-O-) and ST(-F) illustrates the influence of these atoms on the value of the property. Likewise, the presence of branching in the polymer, indicated through 1χv and V4, diminishes the Tg/M value

n and Tg Values of Linear Polymers


TABLE 5: Statistics of the Equation Selected for the Prediction of Tg/M^a

variable                 coeff B    SE(B)     t-Student   p-level
constant                  7.5327    0.2769     27.2       0.000000
$^{1}\chi^{v}$           -3.3939    0.3995     -8.5       0.000000
$^{2}\chi^{v}$            1.2841    0.2490      5.2       0.000002
$^{3}\chi^{v}_{p}$        1.1482    0.3740      3.1       0.003003
$^{1}\kappa_{\alpha}$     0.2952    0.0914      3.2       0.001865
$S^{T}$(-O-)             -0.1740    0.0292     -6.0       0.000000
$S^{T}$(-F)              -0.0382    0.0073     -5.3       0.000001
$W$                       0.0070    0.0013      5.2       0.000001
$J_{1}$                  -4.8992    0.8133     -6.0       0.000000
$J_{2}$                   2.2515    0.6047      3.7       0.000384
$V_{4}$                  -0.2590    0.0971     -2.7       0.009430

a N = 84; r = 0.94549; r² = 0.89395; r²(CV) = 0.84022; SEE = 0.43856; SEE(CV) = 0.53833; F(10,73) = 61.5; p < 0.00000.

because the two terms are negative. The importance of intramolecular charge transfer for the Tg value is apparent from the presence of J1 and J2 in the equation. Figure 2 shows the results of the cross-validation and randomization studies obtained with the model. As with n, the discrepancies between the two sets of residuals are small for most of the studied polymers. This indicates that the model is adequate for the prediction of Tg of linear polymers.
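Several of the statistics quoted for the two models can be cross-checked with simple arithmetic. The sketch below recomputes the t values of Table 5 as B/SE(B), squares the reported r, and evaluates the relative increases in SEE quoted in the text for the smaller equations. The ASCII descriptor labels and the helper name `relative_increase` are assumptions introduced for this example only.

```python
# (coefficient B, standard error SE(B)) pairs transcribed from Table 5.
# The ASCII labels are illustrative stand-ins for the descriptor symbols.
TABLE_5 = {
    "constant": (7.5327, 0.2769),
    "1chi_v": (-3.3939, 0.3995),
    "2chi_v": (1.2841, 0.2490),
    "3chi_v_p": (1.1482, 0.3740),
    "kappa_alpha_1": (0.2952, 0.0914),
    "ST(-O-)": (-0.1740, 0.0292),
    "ST(-F)": (-0.0382, 0.0073),
    "W": (0.0070, 0.0013),
    "J1": (-4.8992, 0.8133),
    "J2": (2.2515, 0.6047),
    "V4": (-0.2590, 0.0971),
}

# Student t of each term is the coefficient divided by its standard error.
t_values = {name: b / se for name, (b, se) in TABLE_5.items()}


def relative_increase(see_smaller_model, see_selected_model):
    """Percentage by which the smaller model's SEE exceeds the selected one's."""
    return (see_smaller_model / see_selected_model - 1.0) * 100.0


r = 0.94549
print(f"r^2 = {r ** 2:.5f}")
for name, t in t_values.items():
    print(f"{name}: t = {t:.1f}")

# Tg/M: five-variable equation versus the selected 10-variable equation
print(f"{relative_increase(0.5257, 0.43856):.0f}%")
# n: simplest equations with r^2 > 0.8 and r^2 > 0.9 versus the selected one
print(f"{relative_increase(0.0282, 0.0147):.0f}%")
print(f"{relative_increase(0.0206, 0.0147):.0f}%")
```

The recomputed t values agree with the tabulated ones to within the rounding of the published coefficients and standard errors.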

Table 6 lists the values of Tg predicted for each of the studied polymers using the selected equation. The MRE values obtained for the training set and for the cross-validation were 12.7% and 15%, respectively. These values are acceptable if we keep in mind the uncertainty that accompanies the experimental determination of Tg in each case.

For both properties, the predictive quality is high enough to make unnecessary either the introduction of additional descriptors (such as electronic or quantum mechanical ones, which afford more information) or the use of more sophisticated or nonlinear discriminant analyses. The results obtained exclusively using graph-theoretical descriptors, chosen from a pool of about 200, are comparable in quality to the ones obtained by Katritzky et al.5,8 for the same properties and polymer set by using the CODESSA approach. This fact confirms the power of graph-theoretical indices as a tool for the structural characterization and prediction of physicochemical properties of polymers.

The CODESSA algorithm searches for linear correlations of a given property by using more than 600 descriptors of several classes: constitutional, geometric, topological, electrostatic, quantum-chemical, and thermodynamic ones. For the prediction of indices of refraction, a correlation of 0.970 was obtained for an equation containing five descriptors: four quantum-chemical

TABLE 6: Prediction of the Glass Transition Temperature, Tg, of a Group of Addition Polymers

compound                                    M      Tg^a   Tg^b   Tg^c
poly(ethylene)                              28     195    178    172
poly(acrylic acid)                          72     379    369    368
poly(methyl acrylate)                       86     281    353    362
poly(ethyl acrylate)                        100    251    296    299
poly(vinyl chloride)                        62.5   348    296    290
poly(acrylonitrile)                         53     378    335    312
poly(vinyl acetate)                         86     301    302    303
poly(styrene)                               104    373    365    365
poly(2-chlorostyrene)                       138    392    465    472
poly(2-methylstyrene)                       118    409    394    392
poly(propylene)                             42     233    203    200
poly(ethoxyethylene)                        72     254    254    254
poly(n-butyl acrylate)                      128    219    274    276
poly(vinyl hexyl ether)                     128    209    216    216
poly(1,1-dimethylethylene)                  56     199    203    204
poly(methyl methacrylate)                   100    378    370    369
poly(ethyl methacrylate)                    114    324    295    293
poly(isopropyl methacrylate)                128    327    275    272
poly(2-chloroethyl methacrylate)            148    365    306    302
poly(phenyl methacrylate)                   162    393    309    297
poly(tetrafluoroethylene)                   56     228    245    246
poly(chlorotrifluoroethylene)               116.   373    432    454
poly(oxymethylene)                          30     218    212    208
poly(oxyethylene)                           44     206    188    181
poly(butylenethylene)                       84     220    284    290
poly(ethylene terephthalate)                192    345    541    636
poly(vinyl n-octyl ether)                   156    194    168    166
poly(vinyl n-decyl ether)                   184    197    142    129
poly(vinyl n-pentyl ether)                  114    207    238    240
poly(vinyl 2-ethylhexyl ether)              156    207    174    168
poly(vinyl n-butyl ether)                   100    221    253    255
poly(vinyl isobutyl ether)                  98     251    263    265
poly(vinyl sec-butyl ether)                 100    253    233    231
poly(isobutyl methacrylate)                 96     348    333    331
poly(n-hexyl methacrylate)                  170    268    232    230
poly(n-butyl methacrylate)                  142    293    264    262
poly(4-methyl-1-pentene)                    83     302    279    276
poly(vinyl chloroacetate)                   120    304    323    324
poly(n-propyl methacrylate)                 128    306    276    274
poly(cyclohexylethylene)                    110    363    329    325
poly(3-chlorostyrene)                       138    363    397    400
poly(4-chlorostyrene)                       138    389    413    415
poly(3-methylstyrene)                       118    374    338    335
poly(4-methylstyrene)                       118    374    351    349
poly(4-fluorostyrene)                       122    379    347    345
poly(1-pentene)                             70     220    270    274
poly(tert-butyl acrylate)                   128    315    271    263
poly(1,1-dichloroethylene)                  97     256    354    376
poly(1,1-difluoroethylene)                  64     233    244    247
poly(ethyl chloroacrylate)                  134    366    350    349
poly(tert-butyl methacrylate)               142    380    282    255
poly(oxytrimethylene)                       58     195    230    238
poly(oxytetramethylene)                     72     190    238    246
poly(oxyoctamethylene)                      128    203    139    126
poly(oxyhexamethylene)                      100    204    211    212
poly(n-octyl acrylate)                      184    208    222    224
poly(n-octyl methacrylate)                  198    253    236    232
poly(n-heptyl acrylate)                     170    213    226    226
poly(n-nonyl acrylate)                      198    216    233    239
poly(n-hexyl acrylate)                      156    216    238    239
poly(1-heptene)                             98     220    285    293
poly(n-propyl acrylate)                     114    229    284    286
poly(pentafluoroethylethylene)              146    314    313    313
poly(2,3,3,3-tetrafluoropropylene)          114    315    312    312
poly(3,3-dimethylbutyl methacrylate)        170    318    432    496
poly(n-butylacrylamide)                     127    319    363    369
poly(vinyl trifluoroacetate)                140    319    349    355
poly(3-methyl-1-butene)                     70     323    295    293
poly(n-butyl α-chloroacrylate)              162    330    304    303
poly(sec-butyl methacrylate)                142    330    261    257
poly(heptafluoropropylethylene)             196    331    309    287
poly(3-pentyl acrylate)                     142    257    209    203
poly(5-methyl-1-hexene)                     98     259    311    317
poly(oxy-2,2-dichloromethyltrimethylene)    127    265    346    386
poly(vinyl isopropyl ether)                 86     270    227    223
poly(p-(n-butyl)styrene)                    160    279    268    266
poly(2-methoxyethyl methacrylate)           144    293    300    301
poly(3,3,3-trifluoropropylene)              96     300    261    249
poly(3-cyclopentyl-1-propene)               110    333    319    316
poly(3-phenyl-1-propene)                    118    333    348    349
poly(n-propyl α-chloroacrylate)             148    344    323    321
poly(sec-butyl α-chloroacrylate)            162    347    302    299
poly(3-cyclohexyl-1-propene)                124    348    308    298
poly(sec-butyl acrylate)                    128    253    266    266

a Tg experimental from ref 8. b Tg calculated from the mathematical model. c Tg calculated from cross-validation.
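Because the regression was fitted to Tg/M rather than to Tg, it can help to see both scales side by side. The sketch below takes three rows of Table 6 and reports the experimental Tg/M, the residual on the Tg/M scale, and the percentage error in Tg. It assumes that the tabulated calculated Tg is the predicted Tg/M multiplied by M, so that the relative error is the same on either scale; the function name `summarize` and the choice of rows are illustrative.

```python
# (polymer, M, Tg_expt, Tg_calc) rows transcribed from Table 6 (a small subset)
ROWS = [
    ("poly(ethylene)", 28, 195, 178),
    ("poly(styrene)", 104, 373, 365),
    ("poly(ethylene terephthalate)", 192, 345, 541),
]


def summarize(m, tg_expt, tg_calc):
    """Return (Tg/M experimental, residual in Tg/M, percent error in Tg)."""
    ratio_expt = tg_expt / m
    residual = (tg_calc - tg_expt) / m  # residual on the fitted Tg/M scale
    pct_error = abs(tg_calc - tg_expt) / tg_expt * 100.0  # M cancels here
    return ratio_expt, residual, pct_error


for name, m, tg_expt, tg_calc in ROWS:
    ratio_expt, residual, pct_error = summarize(m, tg_expt, tg_calc)
    print(f"{name}: Tg/M = {ratio_expt:.3f}, "
          f"residual = {residual:+.3f}, error = {pct_error:.1f}%")
```

A residual of a given size on the Tg/M scale translates into a larger absolute error in Tg for repeating units with a large M, which is one way to read the large deviation of poly(ethylene terephthalate).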

ones and the relative number of F atoms.5 Glass transition temperature (Tg/M) was predicted through a model with a correlation of 0.972, consisting of five variables: four chemicophysical ones plus one graph-theoretical, the second-order Kier shape index.8 The models obtained by Katritzky et al. show similarities to the ones reported in this paper, for example, the presence of fluorine indicators in the n models or of kappa indices in the Tg/M equations. This suggests an equivalency between the two approaches and shows that the three-dimensional information contained in the physicochemical descriptors can be implicitly coded by graph-theoretical indices, which normally imply a quicker calculation process.

Figure 2. Validation of the mathematical model obtained with the glass transition temperature. (a) Residuals obtained with the best regression versus residuals obtained by cross-validation. (b) Correlation coefficients, r², versus prediction coefficients, r²(CV), obtained by randomization test.

Conclusions

Two successful correlation models for the prediction of the properties n and Tg of a variety of high molecular weight polymers are presented. All the molecular descriptors are graph-theoretical. The mathematical models employed in this work retain the main structural features that condition the correlated physical properties and, therefore, can be applied to regular linear polymers of a great variety of chemical structures, as suggested by the validation tests.

Acknowledgment. R.G.-D. acknowledges the Generalitat Valenciana for financial support (Project No. GV99-91-1-12).

Appendix

Molecular graphs can be analytically represented by matrices, from which all the descriptors used in this work may be derived.

Randić-Kier-Hall Subgraph Connectivity Indices. As is well-known, the $\chi$ indices may be derived from the adjacency matrix, and they are defined as

$$ {}^{m}\chi_{t} = \sum_{j=1}^{N_m} {}^{m}S_{j} $$

where $m$ is the subgraph order, that is, the number of edges in the subgraph; $N_m$ is the number of type $t$ order $m$ subgraphs within the whole graph; and ${}^{m}S_{j}$ is a factor defined for each subgraph as

$$ {}^{m}S_{j} = \prod_{i=1}^{m+1} (\delta_i)_j^{-1/2} $$

where $j$ denotes the particular set of edges that constitutes the subgraph and $\delta_i$ is the degree of vertex $i$, that is, its number of edges. Valence connectivity indices are defined similarly, substituting $\delta_i$ by $\delta_i^{v}$, defined as

$$ \delta_i^{v} = \frac{Z^{v} - h_i}{Z - Z^{v} - 1} $$

where $Z$ is the atomic number of atom $i$, $Z^{v}$ is its number of valence electrons, and $h_i$ is the number of H atoms attached to it.

Topological Charge Indices Gk and Jk. For a given graph, $G_k$ and $J_k$ are defined as follows:

$$ G_k = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} |c_{ij}| \, \delta(k, d_{ij}) $$

and

$$ J_k = \frac{G_k}{N-1} $$

where $N$ is the number of vertices in the chemical graph representing the molecular structure, that is, the number of atoms different from hydrogen in the molecule. $c_{ij}$ is the charge term between the vertices $i$ and $j$. It is defined as $c_{ij} = m_{ij} - m_{ji}$, where $m_{ij}$ and $m_{ji}$ are elements of the $N \times N$ matrix $M$, obtained as the product of two matrices: $M = A \cdot Q$. Thus

$$ m_{ij} = \sum_{h=1}^{N} a_{ih} q_{hj} $$


Matrix A is called the connectivity or adjacency matrix. Its elements $a_{ih}$ represent the bonds between the atoms corresponding to vertices i and h in the graph. Element $a_{ih}$ takes value 0 either if i = h or if i is not linked to h; it takes value 1 if i is bonded to h by a single bond, value 1.5 if the bond is aromatic, 2 for double bonds, and 3 for triple ones. Matrix Q is known as the Coulombian matrix. Its elements $q_{hj}$ take value 0 for h = j. Otherwise $q_{hj} = 1/d_{hj}^{2}$, where $d_{hj}$ is the number of bonds (the topological distance) between the vertices h and j. δ represents the Kronecker delta symbol: δ(α,β) = 1 if α = β; δ(α,β) = 0 if α ≠ β.
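The definitions of A, Q, M = A·Q, and the charge terms can be followed step by step in a short numerical sketch. The code below builds the matrices for two small hydrogen-depleted graphs, propane and propylene, and evaluates Gk and Jk. It is a minimal illustration under stated assumptions: topological distances are taken as shortest path lengths counted in bonds regardless of bond order, the vertex numbering and function names are choices made for this example, and the code is not the program used by the authors.

```python
import numpy as np


def topological_distances(adjacency):
    """Shortest path lengths in bonds (Floyd-Warshall), ignoring bond order."""
    n = adjacency.shape[0]
    dist = np.where(adjacency > 0, 1.0, np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(n):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist


def charge_indices(adjacency, k):
    """Return (G_k, J_k) for a graph given its bond-order adjacency matrix."""
    a = np.asarray(adjacency, dtype=float)
    n = a.shape[0]
    dist = topological_distances(a)

    # Coulombian matrix: q_hj = 1 / d_hj^2 off the diagonal, 0 on it.
    q = np.zeros_like(dist)
    off_diagonal = ~np.eye(n, dtype=bool)
    q[off_diagonal] = 1.0 / dist[off_diagonal] ** 2

    m = a @ q          # M = A . Q
    c = m - m.T        # charge terms c_ij = m_ij - m_ji

    # Sum |c_ij| over pairs i < j whose topological distance equals k.
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)
    g_k = np.abs(c[upper & (dist == k)]).sum()
    return g_k, g_k / (n - 1)


# Bond orders: 1 for a single bond, 2 for a double bond.
propane = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
propylene = [[0, 2, 0], [2, 0, 1], [0, 1, 0]]

print(charge_indices(propane, 1))
print(charge_indices(propylene, 1))
print(charge_indices(propylene, 2))
```

With these inputs the double bond in propylene raises G1 relative to propane and produces a nonzero G2, which illustrates how bond orders in A feed into the charge terms.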

$d_{ij}$ is the topological distance between vertices i and j, that is, the minimum number of bonds, of any order, that separates atoms i and j. Thus, $G_k$ represents the overall sum of the absolute values of the $c_{ij}$ charge terms for every pair of vertices i and j at a topological distance k.

Electrotopological State Indices. Developed by Kier and Hall, they include electronic structural information for each atom of the graph, as well as information with respect to its topological environment. The intrinsic state I of an atom in a chemical graph is defined as

$$ I = \frac{\delta^{v} + 1}{\delta} $$

where $\delta^{v}$, as previously defined, represents the number of σ electrons + π electrons + lone-pair electrons, and $\delta$ represents the number of σ electrons. Therefore, $\delta^{v} - \delta$ = π electrons + lone-pair electrons, which is known as the Kier-Hall electronegativity. The intrinsic state of an atom in a chemical graph reflects its electronic and topological environment in the absence of interaction with the rest of the molecule. To quantify this influence, the electrotopological state of an atom i is defined as

$$ S(i) = I_i + \Delta I_i $$

where $\Delta I_i$ represents the electronic and topological influences on the atom i by the rest of the molecule:

$$ \Delta I_i = \sum_{j} \frac{I_i - I_j}{r_{ij}^{2}} $$

$r_{ij}$ corresponds to the number of vertices between the atoms i and j. Atom-type electrotopological state indices, ST(i), are obtained by summing the electrotopological states over each type of atom present in the molecule. For instance, ST(aCHa) represents the sum of the electrotopological state indices for all the aromatic CH carbons, and ST(dC