Energy & Fuels 2001, 15, 943-948


Establishing Quantitative Structure-Property Relationships (QSPR) of Diesel Samples by Proton-NMR & Multiple Linear Regression (MLR) Analysis

G. S. Kapur,*,† A. Ecker,‡ and R. Meusinger§

Indian Oil Corporation Limited, Research & Development Centre, Sector-13, Faridabad, 121 007, Haryana, India, The Austrian Institute for Petroleum ARCS/FICHTE, A-1030 Vienna, Faradaygasse 3, Austria, and Institute of Organic Chemistry, University of Mainz, Duesbergweg 10-14, 55099 Mainz, Germany

Received January 30, 2001. Revised Manuscript Received April 23, 2001

Relationships between the structural parameters of diesel samples, as observed by 1H NMR spectroscopy, and their physicochemical properties have been established using multiple linear regression (MLR). About 60 commercial diesel samples were studied, and their physicochemical properties were determined using the standard IP, DIN, or ISO methods. The properties included density, cetane number, cetane index, viscosity, sulfur content, final boiling point, and the percentage of volume distillable up to 350 °C. The 1H NMR spectra of the diesel samples were correlated with the properties using stepwise multiple linear regression, which resulted in high correlations. The method allows the determination of about seven properties of a commercial diesel sample from a single 1H NMR spectrum. When intended for predictive purposes, the present model is restricted to similar types of diesel samples. The presence of low levels of diesel additives (such as cetane improvers or pour point depressants), which are not detectable by NMR, also poses a limitation to the method, as emerged from the present feasibility study.

Introduction

The minimum requirements for diesel fuel quality are defined in standards or specifications with the purpose of ensuring trouble-free running of the engine. The most important properties that determine the suitability of a diesel fuel are the density, viscosity, cetane number, cetane index, sulfur content, and low-temperature flow properties such as pour point or cold filter plugging point (CFPP).1,2 An optimum value of the density is desired because density has a considerable effect on engine performance in terms of power output and exhaust gas emissions. Similarly, too low a viscosity can lead to wear in the injection pump, whereas a viscosity higher than specified impairs injection. The cetane number is another very important property, as it describes the ignition quality of diesel fuels. A higher cetane number benefits ignition and starting behavior and reduces smoke and noise. The cetane index also provides an indication of the ignition quality; however, it is calculated by means of a formula that considers the density and the distillation range of a diesel fuel. A higher sulfur content causes higher exhaust gas emissions and corrosion of the engine. Poor low-temperature flow properties may lead to clogging of the fuel filter and could finally lead to engine stoppage.

All the above properties depend primarily on the composition of diesel fuels, which consist of thousands of individual components belonging to the aromatic, paraffinic (normal, branched, cyclic), olefinic, and polar hydrocarbon classes. However, none of these classes alone can impart all the desired properties to a diesel fuel. An optimum level of the various classes/components (or sometimes low levels of fuel additives) is required so that all the specifications can be met. For example, a high normal-paraffin content imparts a high cetane number to the diesel fuel but at the same time results in a high pour point.

The specification properties of a diesel fuel are determined by standard test methods as described by ASTM, IP, or DIN. These standard methods are normally time-consuming and laborious, but unavoidable under strict quality control conditions. Hence, there is always a need for much quicker but equally reliable methods for obtaining the same information, particularly when large numbers of samples have to be tested within a stipulated time period.

* Author to whom correspondence should be addressed. E-Mail: [email protected].
† Indian Oil Corporation Limited.
‡ The Austrian Institute for Petroleum ARCS/FICHTE.
§ University of Mainz.
(1) Drews, A. W. ASTM Manual on Hydrocarbon Analysis, 4th ed.; ASTM: PA, 1989.
(2) Ullmann's Encyclopedia of Industrial Chemistry; VCH, 1990; Vol. A16, p 719.

There have been many reports describing methods for establishing property-composition relationships and predictive equations for calculating the values of fuel properties. These methods are generally devised for convenient estimation of the fuel properties and utilize simply measured properties

10.1021/ef010021u CCC: $20.00 © 2001 American Chemical Society Published on Web 06/08/2001


Energy & Fuels, Vol. 15, No. 4, 2001

Kapur et al.

Table 1. Details of Various Properties of the Diesel Samples Used in This Work

property | method | variation | specification
density (kg/m3) | DIN 51757 | 823.3-845.5 | 820-860
viscosity at 40 °C (mm2/s) | DIN 51562 | 2.15-3.15 | 2.00-4.50
cetane number | IQT | 48.4-55.1 | min. 49
cetane index | IP 380 | 49.6-53.9 | min. 46
sulfur (ppm) | EN 24260 | 190-480 | max. 500
% volume distillable up to 350 °C | ISO 3405 | 86.5-97.7 | min. 85
final boiling point (°C) | ISO 3405 | 354.0-383.9 | -

DIN: Deutsches Institut für Normung (German standard).
IQT: Ignition Quality Tester, a combustion-based device for determining the ignition delay, which correlates with cetane number (initially developed by Southwest Research Institute (SwRI) and referred to as the Constant Volume Combustion Apparatus; further developed by Advanced Engine Technology Ltd.).
IP: Institute of Petroleum (British standards).
EN: EuroNorm (European standard).
ISO: International Organization for Standardization.

or parameters from molecular spectroscopic techniques (say infrared or nuclear magnetic resonance) to rapidly predict other fuel properties.

Nuclear magnetic resonance (NMR) spectroscopy is one of the most powerful methods in analytical chemistry for the identification of structural groups, and it has established itself as a highly useful technique for the analysis of petroleum products, especially those of the fuel range. The main feature of this technique is that the chemical shift measured by NMR spectroscopy depends strongly on the chemical environment of the investigated nuclei. Besides, the technique is very fast, highly versatile with no restriction on the boiling point of the sample, and highly quantitative. NMR spectroscopy has also been used for predicting fuel properties, where NMR-derived structural/compositional parameters have been correlated with the fuel properties using various statistical methods. This approach of relating chemical structural information (say from IR or NMR) to the properties of a fuel sample is called chemometrics.

For gasoline-range samples, for example, there have been many reports3-5 detailing the successful prediction of RON values using correlations between the NMR (1H/13C) measured compositional parameters and the RON value. Muhl et al. developed two separate relationships between the composition (determined by 1H NMR) and the octane number of fluid catalytic cracking gasoline3 and of reformed gasoline.4 The correlation obtained between the actual and the predicted values of octane number was reported as high (R = 0.792 for cracked gasolines and R = 0.86 for reformed gasolines). Meusinger et al. have also described correlations between the 1H NMR chemical shift regions and the RON values of commercial gasoline samples using the statistical techniques of multiple linear regression6 and hierarchical cluster analysis.7 In their work using multiple linear regression analysis,6 separate regression equations were developed for predicting the RON and MON of unleaded gasoline samples. A very high correlation was obtained between the actual and the predicted values of RON (R = 0.9507) and MON (R = 0.9503).

The majority of the work using NMR spectroscopy and various statistical tools has been reported for predicting the RON of gasoline samples. In contrast, there are few reports of similar work on diesel samples. Gulder et al.8 reported a correlation in terms of carbon groups (measured from 1H NMR spectra) to predict the cetane number of diesel fuels. A very high correlation (R = 0.992) was obtained between the actual and the predicted values of cetane number. However, the relationship developed using multiple regression was a purely empirical expression. Similarly, in an extensive series of papers, Cookson et al.9-11 described simple linear property-composition correlations between the 13C NMR measured compositional parameters and various properties of jet and diesel fuels. The correlation coefficients achieved between the actual and predicted values for various properties were of the order of 0.9 and above. However, the authors aimed to obtain information that can be interpreted in simple chemical terms, rather than to develop tools for predicting the fuel properties.

(3) Muhl, J.; Srica, V. Fuel 1987, 66, 1146.
(4) Muhl, J.; Srica, V.; Jednacak, M. Fuel 1989, 68, 201.
(5) Hiller, W. G.; Abu-Dagga, F.; Al-Tahou, B. J. Prakt. Chem. 1992, 334, 691.
(6) Meusinger, R.; Schindlbauer, H. GIT Fachz. Lab. 1994, 38, 115.
(7) Meusinger, R. Fuel 1996, 75, 1235, and references therein.
(8) Gulder, O. J.; Glavincevski, B. Ind. Eng. Chem. Prod. Res. Dev. 1986, 25, 156.

In the present work, various physicochemical properties of commercial diesel fuels have been correlated directly with their proton NMR spectral intensities using multiple linear regression (MLR) analysis. The developed model equations, relating diesel properties to the 1H NMR signal intensities, have been cross-validated on another set of similar diesel samples. The method allows estimation of at least seven properties of diesel fuels from a single 1H NMR spectrum of the diesel sample.

Experimental Details

Samples. All the diesel samples used in this study (60 in number) were of commercial grades, collected from various service stations in Vienna (Austria) during the summer and winter. Various properties of the samples were measured by the standard methods and are included in Table 1.

NMR Recording. The proton NMR spectra of all the samples were recorded on a 400 MHz NMR spectrometer. The following methodology was used while recording the NMR spectra in order to take care of even minor mismatches of the chemical shifts from sample to sample.
Concentration dependence of the chemical shift for various hydrocarbon groups, particularly aromatics and nearby groups, was avoided by using exactly the same concentration for all the diesel samples (50 µL of diesel in 500 µL of CDCl3) when recording an NMR spectrum. NMR tubes of the same type (grade) from the same manufacturer were used for all the samples. All 60 samples were recorded with an automatic sample changer during an overnight run, using similar acquisition parameters. The important parameters used for recording the free induction decay (FID) signals were the following: spectral width = 5787 Hz (acquisition time = 2.83 s), spectral size = 32 K, recycle delay > 10 s, and number of scans = 32. The Fourier transformation of the raw FIDs was done separately using similar processing parameters such as zero filling, exponential multiplication (line broadening factor lb = 0.3 Hz), and manual phase correction of the signals. For all the spectra, a manual baseline correction routine was applied using a very high expansion on the y-axis. This was done separately for the aliphatic and aromatic regions of the spectra (see the well at around 6 ppm; Figure 1).

(9) Cookson, D. J.; Lloyd, C. P.; Smith, B. E. Energy Fuels 1987, 1, 438.
(10) Cookson, D. J.; Smith, B. E. Energy Fuels 1990, 4, 152.
(11) Cookson, D. J.; Iliopoulos, P.; Smith, B. E. Fuel 1995, 74, 70.

Figure 1. 400 MHz 1H NMR spectrum of a typical diesel sample along with the integral regions (A to R) used for correlations.

The last and most critical step was the integration of the various regions, as these integral intensities are to be correlated with the properties of the diesel samples. This was achieved by creating an integral file (containing 18 regions) for one of the samples and storing it in memory. The first step toward this is the exact referencing of the spectra. For this purpose, one of the signals from the sample itself was used as a secondary reference. First, the TMS signal was set equal to 0 ppm, followed by setting another signal (shown as SR; Figure 1) at exactly the same ppm value in all the samples. This takes care of small shifts in the field from sample to sample. After this is done, the stored integral file is simply recalled, and the corresponding intensities of all 18 regions are obtained. The intensities so obtained were normalized to a total of 1000 in each case. The total time of NMR measurement, including sample preparation, shimming of the magnet, acquiring the FID, processing, and integration, was typically 30 min for one sample.

Multiple Linear Regression (MLR) Analysis. The multiple linear regression was performed on a PC using statistical software (STATGRAPHICS) to develop statistically valid model regression equations.12,13 The data for the 60 diesel samples were divided into two sets.
One set of the data (around 40 samples) was used for building the statistically valid model equations. This set is called the model data set. Using this data set, various properties of the diesel samples (Table 1) were correlated with the NMR integral intensity values of various spectral regions by multiple linear regression (MLR) analysis. This resulted in model regression equations for the different properties (Table 3). The NMR data of the remaining samples (around 20 in number, called the validation data set) were used later to validate the developed equations. For these samples, the NMR intensities of the various regions were substituted into the developed equations, and the values of the properties were predicted. None of the samples from this validation data set was used for the initial model building, so this set can also be regarded as unseen data.

(12) Jobson, J. D. Applied Multivariate Data Analysis; Springer-Verlag, 1991.
(13) Devore, J. L.; Farnum, N. R. Applied Statistics for Engineers and Scientists; Duxbury Press, 1999.

Table 2. The Various Integral Regions of the Proton NMR Spectrum of the Diesel Sample Used for Developing Structure-Property Correlations

intensity | region (ppm)
A | 8.997 - 8.200
B | 8.200 - 7.551
C | 7.551 - 7.182
D | 7.182 - 7.130
E | 7.130 - 6.972
F | 6.972 - 6.785
G | 6.785 - 6.425
H | 4.184 - 3.406
I | 3.406 - 2.883
J | 2.883 - 2.641
K | 2.641 - 2.292
L | 2.292 - 2.040
M | 2.040 - 1.963
N | 1.963 - 1.570
O | 1.570 - 1.391
P | 1.391 - 1.115
Q | 1.115 - 0.941
R | 0.941 - 0.254
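The integration and normalization step described in the experimental section can be sketched in a few lines of Python. This is only a minimal sketch, not the software used by the authors: the function name `integrate_regions`, the plain summation of data points (which assumes a uniformly sampled spectrum), and the example numbers are assumptions made for illustration. Only the region limits are taken from Table 2.

```python
# Region limits in ppm (downfield limit, upfield limit), as listed in Table 2.
REGIONS = {
    "A": (8.997, 8.200), "B": (8.200, 7.551), "C": (7.551, 7.182),
    "D": (7.182, 7.130), "E": (7.130, 6.972), "F": (6.972, 6.785),
    "G": (6.785, 6.425), "H": (4.184, 3.406), "I": (3.406, 2.883),
    "J": (2.883, 2.641), "K": (2.641, 2.292), "L": (2.292, 2.040),
    "M": (2.040, 1.963), "N": (1.963, 1.570), "O": (1.570, 1.391),
    "P": (1.391, 1.115), "Q": (1.115, 0.941), "R": (0.941, 0.254),
}


def integrate_regions(ppm, intensity, regions=REGIONS):
    """Sum the spectrum points falling in each region, then scale to a total of 1000.

    Assumption: the points are uniformly spaced, so a plain sum is
    proportional to the integral. Points outside all regions are ignored.
    """
    raw = {name: 0.0 for name in regions}
    for shift, value in zip(ppm, intensity):
        for name, (high, low) in regions.items():
            if low < shift <= high:
                raw[name] += value
                break
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no signal inside the integral regions")
    return {name: 1000.0 * area / total for name, area in raw.items()}


# Hypothetical six-point "spectrum", for illustration only.
example_ppm = [7.3, 5.0, 2.5, 1.3, 1.2, 0.9]
example_intensity = [10.0, 999.0, 20.0, 50.0, 50.0, 70.0]
normalized = integrate_regions(example_ppm, example_intensity)
print({name: round(v, 1) for name, v in normalized.items() if v > 0})
```

In this toy input the point at 5.0 ppm lies outside all 18 regions and is therefore ignored, while the remaining intensities are rescaled so that they add up to 1000, as in the protocol above.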

Results and Discussion

Table 1 includes the details of the properties of the diesel samples measured in this work along with their respective variation. As all the samples were of commercial grades, the variation in the properties was very small. Figure 1 shows a representative 1H NMR spectrum of a diesel sample. The expanded spectrum in the range 2-9 ppm is also shown in the same figure. The spectrum is very complex and consists of a large number of signals due to different hydrocarbon types and molecular structures. Certain regions, particularly 0.4-2.0 ppm, are highly overlapped and not well resolved. However, the presence of a large number of signals in the spectrum is advantageous: it provides a sort of fingerprint of the diesel sample and may be used to distinguish diesel samples whose properties differ because of a change in composition.

As the NMR spectrum of the sample is highly overlapped and crowded, it is not possible to assign each of the signals to specific molecular structural types. However, it is possible to tentatively assign the spectrum in terms of various structural groups (i.e., aromatics, side chains linked to aromatics, naphthenes, and paraffins). The assignments of the signals in terms of hydrocarbon classes such as paraffins, naphthenes, olefins, and aromatics are straightforward and have been made previously.14 In the present study, for the purpose of correlating the spectral features (i.e., the integral intensities) with the diesel properties, the NMR spectra of the samples were divided into 18 smaller regions, each of them determined quantitatively by its integral intensity. For simplicity, the respective integral intensities have been designated by the letters A to R (Figure 1). This division of the spectra into 18 regions is based on the resolution obtained in the spectrum (expanded spectrum in Figure 1) and on the tentative assignment of the regions. For example, the region of aromatic ring protons (6-9 ppm) has been divided into 7 smaller regions (assigned to tri-ring, di-ring, mono-ring, highly substituted mono-ring structures, etc., as one moves upfield from region A to G). Similarly, the region due to groups directly attached to aromatic rings (2-4 ppm) has been divided into 5 smaller regions, assigned to bridged CH2 groups in fluorene types (H), α-CH (I), α-CH2 (J), α-CH2 + α-CH3 (K), and α-CH3 groups (L). The region M is due to olefinic groups. The regions N to Q are due to CH and CH2 groups of naphthenes and paraffins (normal and iso). The region R is due to CH3 groups only. Table 2 gives the exact division of the spectra in terms of the integral regions used in this study. However, in chemometric studies of this type, where spectral features are statistically related to the fuel properties, an unambiguous or detailed assignment of the NMR spectrum is not a prerequisite.

For the multiple linear regression analysis, the normalized integral intensities (A to R) or their linear combinations were used as the ‘independent variables’ and were correlated with the various properties of the diesel samples, which served as the ‘dependent variables’.

Table 3. Various Model Equations Developed for Different Properties Using the NMR Integral Intensities of Different Regions (Figure 1), along with the Values of the Squared Multiple Correlation (R2) and Standard Error of Estimation (SE)

Density: 5.092(A+B) + 2.537(H+I) + 1.732(J) + 1.083(L) + 4.279(M) + 1.704(O) + 0.830(P) + 2.380(Q) + 0.426(R); R2 = 0.998; SE = 1.11
Viscosity: -0.097(D+E+F+G) - 0.435(H) + 0.333(I) - 0.0446(L) + 0.0073(P) + 0.091(Q) - 0.0245(R); R2 = 0.998; SE = 0.09
Cetane Number: -0.272(A+B+C+D) + 0.570(H+I) + 0.653(J) - 0.712(L) + 3.001(M) - 0.408(N) + 0.0886(P) + 0.152(R); R2 = 0.998; SE = 0.40
Cetane Index: -0.713(C+D) + 0.528(K) - 1.268(M) + 0.075(P) + 0.240(Q); R2 = 0.998; SE = 0.27
Sulfur: 11.01(A+B+C+D+E+F+G) + 37.46(H+I) - 43.8(J) - 9.959(L) + 1.358(P) - 12.639(Q) + 2.864(R); R2 = 0.992; SE = 24
%Vol up to 350 °C: 0.72(A+B+C+D+E+F+G) + 1.052(J) - 0.916(K) + 0.437(L) - 5.065(M) + 0.719(O) + 0.031(P) [sign illegible] 0.434(Q) + 0.215(R); R2 = 0.992; SE = 0.63
FBP: -3.249(C+D) - 2.903(E+F+G) + 3.661(K) + 7.828(M) + 0.453(P) + 1.771(Q) + 0.146(R); R2 = 0.998; SE = 1.46
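To make the role of the ‘independent variables’ concrete, the following sketch evaluates one model equation as a linear combination of normalized region intensities. It is an illustration under stated assumptions rather than the authors' implementation: the coefficients are transcribed from the cetane index row of Table 3, whereas the helper name `evaluate_model` and the sample intensities are hypothetical.

```python
# Cetane index model: each term is (coefficient, regions whose intensities are summed).
# Coefficients are transcribed from the cetane index row of Table 3.
CETANE_INDEX_TERMS = [
    (-0.713, ("C", "D")),
    (0.528, ("K",)),
    (-1.268, ("M",)),
    (0.075, ("P",)),
    (0.240, ("Q",)),
]


def evaluate_model(terms, intensities):
    """Evaluate a no-intercept model: sum of coefficient * (sum of region intensities)."""
    return sum(
        coefficient * sum(intensities[region] for region in regions)
        for coefficient, regions in terms
    )


# Hypothetical normalized intensities (only the regions used by this model are listed).
sample = {"C": 24.0, "D": 6.0, "K": 28.0, "M": 3.0, "P": 440.0, "Q": 120.0}
cetane_index = evaluate_model(CETANE_INDEX_TERMS, sample)
print(round(cetane_index, 2))
```

For these made-up intensities the equation gives a cetane index of about 51.4, which falls inside the 49.6-53.9 range reported in Table 1. A real sample would supply all 18 intensities from the integration step.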
In the present study, the total sum of all the ‘independent variables’ (i.e., the total integral intensity of an individual spectrum) is constant and equal to 1000 for all 60 samples. The regression model is therefore called a ‘mixture model’ and is fitted without a constant (intercept) term. As mentioned earlier, for developing a model equation for property prediction, around two-thirds of the samples (around 40 in number) were used for building the model. The developed model equation was tested for its validity on the remaining one-third of the samples by comparing the actual value of the property with the predicted value. The above protocol was used for all seven properties.

(14) Bansal, V.; Kapur, G. S.; Sarpal, A. S.; Kagdiyal, V.; Jain, S. K.; Srivastava, S. P. Energy Fuels 1998, 12, 1223.

To obtain the reduced models, the number of ‘independent variables’ was reduced by stepwise multiple linear regression. A backward elimination procedure was employed: it starts with all the independent variables in the equation, eliminates the least significant variable at the first step, and continues until no ‘insignificant’ variables remain.12,13 In a reduced model equation so obtained, the regression coefficients of the different ‘independent variables’ were assessed on the basis of their significance levels (a probability (p) cutoff value of 0.05 was used) and the analysis of variance (ANOVA) table. The F-ratios from the ANOVA table were compared with the tabulated values of the F-ratio for the corresponding degrees of freedom of the model and error terms. The coefficients and the overall model were found to be highly significant in all cases.

Table 3 lists the model regression equations so obtained for the individual properties along with the statistical data. The value of the squared multiple correlation (R2) for the developed model equations was very high (around 0.99) for all the properties. The value of the standard error of estimate (SE) was within acceptable limits for all the properties. However, as the model regression equations were fitted without an intercept or constant term, the value of R2 computed by the statistical software package has a different meaning. For such regression through the origin (no intercept), R2 represents the proportion of variability about the origin that is explained, not the variability about the mean. This value cannot be compared with the R2 value obtained when an intercept is included in the regression equation.
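The point about R2 for a regression through the origin can be illustrated numerically. The sketch below is a toy example under stated assumptions, not a reproduction of the authors' analysis: it uses only two hypothetical regions whose intensities sum to 1000 (a two-component ‘mixture model’), invented density-like values, and a closed-form solution of the 2 x 2 normal equations. The function name `fit_through_origin` is likewise an assumption.

```python
def fit_through_origin(x1, x2, y):
    """Least-squares fit of y = b1*x1 + b2*x2 with no intercept (2x2 normal equations)."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    t1 = sum(a * v for a, v in zip(x1, y))
    t2 = sum(b * v for b, v in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (t1 * s22 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2


# Hypothetical data: two region intensities that always add up to 1000,
# and a density-like property that varies only slightly between samples.
aromatic = [50.0, 60.0, 70.0, 80.0, 90.0]
aliphatic = [1000.0 - a for a in aromatic]
density = [834.0, 833.6, 835.3, 835.1, 836.6]

b1, b2 = fit_through_origin(aromatic, aliphatic, density)
fitted = [b1 * a + b2 * p for a, p in zip(aromatic, aliphatic)]
sse = sum((v - f) ** 2 for v, f in zip(density, fitted))

# R2 about the origin: the reference is the raw sum of squares of y.
r2_origin = 1.0 - sse / sum(v * v for v in density)

# R2 about the mean: the reference is the variability of y around its mean.
mean_y = sum(density) / len(density)
r2_centered = 1.0 - sse / sum((v - mean_y) ** 2 for v in density)

print(f"R2 about origin = {r2_origin:.6f}, R2 about mean = {r2_centered:.3f}")
```

Because the property values are large compared with their spread, the R2 about the origin is almost exactly 1 even though the fit explains only about 80% of the variation around the mean. This is the reason the text turns to the correlation between actual and fitted values as the measure of fit.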
Therefore, in the present cases, the computed value of R2 cannot be used as a measure of goodness of fit. Once the regression equation has been obtained, one can compare the actual value of the dependent variable (i.e., the property) with the corresponding fitted value, obtained when the values of the independent variables (the NMR intensities) are substituted into the regression equation. A scatter plot of actual versus fitted values of a property gives a visual impression of how strongly these two values are related. Figure 2 shows the plots of actual versus fitted values for the model data set for all seven properties of the diesel samples. The quantitative assessment of the relationship between actual and fitted values is obtained by calculating the Pearson correlation coefficient (R). This correlation coefficient was found to be high, varying from 0.890 to 0.985 for the different properties for the model data set (Table 4). The correlation coefficients for the various properties are reasonably close to the values obtained in earlier literature studies carried out with similar objectives.3-11

Figure 2. The plots of actual versus fitted value (for the model data set) and actual versus predicted values (for validation data set) for various properties using equations in Table 3.

Table 4. Pearson Correlation Coefficients (R) between the Actual and the Fitted Values for the Model Data Set, and between the Actual and the Predicted Values for the Validation Data Set

property | model set | validation set
density | 0.985 | 0.962
viscosity | 0.933 | 0.963
cetane number | 0.969 | 0.790
cetane index | 0.970 | 0.910
sulfur | 0.890 | 0.945
% volume distillable up to 350 °C | 0.964 | 0.832
final boiling point (FBP) | 0.969 | 0.940

Once the regression model equations have been developed, it is necessary to cross-validate them before such models can be used for predictive purposes. Therefore, cross-validation of the developed models was done on the remaining data set of around 20 diesel samples (called the validation or unseen data set). The NMR integral intensity values of these samples were substituted into the regression equations (Table 3), and the values of the various properties were predicted. The predicted values were then compared with the actual values of the properties. Table 4 also shows the correlation coefficients between the actual and the predicted values of the properties. The corresponding scatter plots of actual versus predicted values for the validation data set are shown in Figure 2. The value of the correlation coefficient (R) for the different properties for the validation data set varies from 0.790 to 0.963. The plots in Figure 2 and the data in Table 4 indicate that the properties of a diesel sample considered in this study can be estimated from a single 1H NMR spectrum in a short span of time with reasonable accuracy, although with some limitations.

It is evident from Figure 2 as well as Table 4 that the model equation for cetane number (CN) estimation showed a relatively poor correlation (0.79) between the actual and the predicted values. The reason for this relatively poor correlation is that one batch of the diesel samples was collected in the months of extreme winter. These samples were expected to contain some cetane/ignition improver. Cetane/ignition improvers are normally added at the ppm level, and it is not possible to observe their presence in a diesel sample using NMR spectroscopy. Many such suspect samples were eliminated from both the model and the validation data sets while developing the model for cetane number estimation. The final model was developed on a smaller set of 25 samples and validated on another 17 samples. However, during the model development it was realized that a few of the samples used for model building might still contain a cetane improver. One way of improving the correlations would be to develop separate model equations for the different batches of samples; this was not tried because of the small number of samples available in each batch. In contrast, the model for the cetane index (CI) showed a much higher correlation (0.91) between the actual and the predicted values. The cetane index, unlike the cetane number, is independent of the presence of ignition improvers. This shows a limitation of such correlation procedures, which may lead to wrong estimates if diesel additives are present to enhance a particular property.

This was much more clearly observed for the values of the cold filter plugging point (CFPP). The values of CFPP varied from approximately -1 °C for one batch (30 samples) to -21 °C for the second batch of samples (30 in number), collected in the months of extreme winter. The very low CFPP values for the winter batch were due to the presence of some flow improver. No attempt was made to develop an NMR-based model for predicting the CFPP values.

Figure 3. Plots of the residuals versus the estimated values of the various properties.

Regression Diagnostics

Cross-validation of the developed model equations has shown the reliability of the model.
However, before a model equation is used for estimating or predicting a particular property, it is also important to test whether the assumptions underlying linear regression are reasonable, and whether the data appear to be sampled from a population that meets these assumptions. The sample analogues of the errors in the population model are the residuals, i.e., the differences between the actual and the fitted (predicted) values of the dependent variable. The behavior of the residuals was examined in order to check that the assumptions underlying the linear regression model are reasonably met. Figure 3 shows the plots of the residuals versus the estimated values for the different properties. In each case, the residuals were found to have constant variance and were arranged in a horizontal band, almost evenly distributed above and below the zero line. This behavior is expected from such plots if the assumptions underlying linear regression are being met.12

Conclusions

The present work has shown the feasibility of estimating various properties of a diesel sample from a single proton NMR spectrum in a very short time. The approach described here for correlating diesel fuel properties with their 1H NMR spectra is simple and may be used as a predictive tool. The developed model equations are not universal and hence will work best for similar types of diesel samples, as is always the case for correlation procedures of this type. None of the developed equations predicted a false negative value for an unseen sample (i.e., one from the validation data set). However, it is quite possible to extend the applicability of the present method/approach. This will require fine-tuning of the models, although the methodology and protocol will remain the same. For this, it will be necessary to establish fresh correlations, i.e., regression model equations (by redoing the multiple regression analysis), including a larger number of samples with a broader range of properties in the model data set. Including more types of diesel samples may result in somewhat different equations relating the properties to the NMR intensities. However, the protocol for doing this will remain the same as that followed in the present study.
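The two checks used in the discussion above, the Pearson correlation between actual and fitted (or predicted) values and the inspection of residuals against the estimated values, can be sketched as follows. This is a minimal illustration with invented numbers; the function names `pearson_r` and `residuals` are assumptions and do not refer to the statistical package used in the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def residuals(actual, fitted):
    """Residual = actual value minus fitted (or predicted) value."""
    return [a - f for a, f in zip(actual, fitted)]


# Hypothetical actual and fitted cetane index values, for illustration only.
actual = [50.1, 51.0, 52.2, 53.1]
fitted = [50.0, 51.2, 52.0, 53.0]

r = pearson_r(actual, fitted)
res = residuals(actual, fitted)

# Text version of a residual plot: residuals listed against increasing fitted value.
for estimate, error in sorted(zip(fitted, res)):
    print(f"{estimate:6.2f}  {error:+.2f}")
print(f"R = {r:.3f}")
```

If the regression assumptions hold, the residuals listed this way should scatter around zero with no trend and no change in spread as the fitted value increases, which corresponds to the horizontal band described for Figure 3.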