Determination of Saturates, Aromatics, and Polars in Crude Oil by 13C

Feb 1, 2016
Article pubs.acs.org/EF

Determination of Saturates, Aromatics, and Polars in Crude Oil by 13C NMR and Support Vector Regression with Variable Selection by Genetic Algorithm

Paulo R. Filgueiras,†,‡ Natália A. Portela,† Samantha R. C. Silva,† Eustáquio V. R. Castro,† Lize M. S. L. Oliveira,§ Julio C. M. Dias,§ Alvaro C. Neto,† Wanderson Romão,† and Ronei J. Poppi*,‡

† Department of Chemistry, Laboratory of Research and Development of Methodologies for Analysis of Oils - LABPETRO, Federal University of Espírito Santo, Av. Fernando Ferrari, 514, Goiabeiras, Vitória, 29075-910 Espírito Santo, Brazil
‡ Institute of Chemistry, State University of Campinas, C. P. 6154, 13084-971 Campinas, Brazil
§ CENPES/PETROBRAS, Av. Jequitiba 950, Rio de Janeiro 21941-598, Brazil

ABSTRACT: The contents of saturates, aromatics, and polars in crude oil were determined using carbon-13 nuclear magnetic resonance spectroscopy (13C NMR) combined with support vector regression (SVR) and a genetic algorithm (GA) for the simultaneous selection of spectral variables and SVR model parameters. The developed models gave prediction errors of 4.4% (w/w) for saturates, 4.3% (w/w) for aromatics, and 3.7% (w/w) for polars. These results are acceptable for the petroleum industry, considering that the error of the standard methodology is 5% (w/w), which is the maximum variation allowed in SARA analysis. The proposed methodology requires only a small amount of sample (approximately 2 mL) and a relatively short time (approximately 2 h).



INTRODUCTION

Crude oil consists predominantly of hydrocarbons and minor amounts of oxygen, nitrogen, and sulfur compounds.1−3 Hydrocarbons present in the oil can range from simple molecules, with few carbon atoms, to complex molecules with high molecular weight. Because of this complexity, crude oil composition is usually evaluated by groups of compounds with similar chemical properties, such as saturates, aromatics, and polars (SAP); the latter comprises asphaltenes and resins. The saturates class includes normal-chain (normal paraffins), branched (isoparaffins), and cyclic (naphthenic) alkanes; the aromatics class comprises aromatic and aromatic-cycloalkane (naphthenic aromatic) compounds and, usually, cyclic sulfur compounds.4 Finally, the polars class (resins and asphaltenes) is formed by polycyclic aromatic compounds of high molecular weight containing larger amounts of heteroatoms. The physicochemical properties of oils vary considerably depending on the constituent substances in each class. The petroleum industry therefore needs feasible analytical methodologies that determine such properties rapidly and with small amounts of sample.

Due to the predominance of hydrocarbons in crude oils, proton nuclear magnetic resonance spectroscopy (1H NMR) has been used for the prediction of several physicochemical properties of crude oil and its derivatives.5−10 However, for SAP determination in crude oils, carbon-13 nuclear magnetic resonance (13C NMR) can produce superior results relative to 1H NMR because 13C NMR detects signals from the internal carbons of the polyaromatic structures of the resins and asphaltenes. The maximum extraction of quantitative information from NMR spectra is achieved with multivariate statistical methods. Partial least-squares regression (PLS)11,12 is the multivariate calibration method most commonly used in chemistry. Due to the complexity of crude oil, the use of newer multivariate calibration methods with superior generalization ability, such as those based on support vector regression (SVR),13−15 can provide better results, enabling the development of analytical methodologies for quantitative analysis using NMR.16−18

However, 13C NMR spectra contain a large number of variables (up to 65000). Multivariate calibration with a large number of variables per sample can incur a high computational cost, a loss of chemical interpretability, and low prediction ability due to the introduction of a large quantity of noninformative variables. It is therefore important to use variable selection methods to generate more parsimonious models without irrelevant variables. Deterministic methods for selecting variables, such as the use of spectral ranges and combinations of ranges, are indicated when the sample spectrum is continuous, as in infrared spectroscopy. NMR spectra have variables with discrete character, for which the probabilistic selection of variables by the genetic algorithm (GA)19 is more appropriate. A distinguishing feature of the GA approach is the possibility of selecting a subset of spectral variables simultaneously with the parameters of the SVR model.20

In this paper, the saturates, aromatics, and polars contents in crude oil were determined using 13C NMR spectroscopy combined with support vector regression, using the genetic algorithm for the simultaneous selection of spectral variables and parameters of the SVR model.

Received: October 9, 2015
Revised: January 27, 2016

DOI: 10.1021/acs.energyfuels.5b02377 Energy Fuels XXXX, XXX, XXX−XXX



DATA ANALYSIS

Support Vector Regression. The support vector machine is a machine learning method originally developed by Cortes and Vapnik14,21 for binary classification, which searches for the hyperplane that maximizes the distance between two classes of samples. The binary classification can be converted into a regression problem by a simple trick: for each sample xi, a positive constant d is added to and subtracted from the corresponding property of interest yi (quantitative value), thereby forming two groups.15 An important stage in building the SVR model consists of mapping the data to a high-dimensional feature space by a kernel function. In this work, the radial basis function (RBF) was used:

$$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\gamma \,\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^2\right) \tag{1}$$

where γ is a tuning parameter that controls the width of the Gaussian function and must be optimized during the model development. The optimal separation hyperplane (OSH) is the regression model (yi = w·ϕ(xi) + b), which should be as smooth as possible and admit errors within a certain tolerance limit ε. Samples with errors above the tolerance (ξi, ξi* ≥ 0) are weighted by a constant C > 0, according to the equations:22

$$\text{minimize: } \frac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \tag{2}$$

$$\text{subject to: } \begin{cases} y_i - \mathbf{w}\cdot\phi(\mathbf{x}_i) - b \le \varepsilon + \xi_i \\ \mathbf{w}\cdot\phi(\mathbf{x}_i) + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i, \xi_i^* \ge 0 \end{cases} \tag{3}$$

where the constant C weights the errors (ξi, ξi* ≥ 0) that exceed the tolerance ε, as described by the ε-insensitive loss function. The constants C and ε are optimized during the construction of the model.

Genetic Algorithm. The GA is a probabilistic search and mathematical optimization technique inspired by the Darwinian principle of the evolution of species. The optimization process is based on the principles of survival of the fittest individuals, reproduction, and mutation.19 This principle motivated the construction of algorithms for finding numerical solutions to problems with large numbers of variables. In chemistry, the genetic algorithm is used for variable selection in spectroscopy.23,24 In this paper, the GA was used for the simultaneous selection of 13C NMR spectrum variables and parameters of the SVR model. In general, the GA consists of these steps:

1. Coding of Variables. Each artificial chromosome is represented by a sequence of binary codes (0's and 1's). The parameters of the SVR model were coded using a 15-digit binary sequence for each constant to be optimized. Each 13C NMR spectrum variable (chemical shift) was represented by one binary digit, where 0 means that the variable was not selected and 1 means that the variable was selected for model development (Figure 1).

2. Creation of an Initial Population. A population of individuals is randomly created to start the numerical optimization process. Each individual is a possible answer to the proposed problem.

3. Fitness Calculation. A metric is used to determine how well each individual solves the proposed problem (fitness). In multivariate calibration problems, the most common objective is the minimization of the root-mean-square error of cross-validation (RMSECV), given by the equation:25

$$\mathrm{RMSECV} = \sqrt{\frac{\sum_{i=1}^{n_{cal}} \left(y_{cal,i} - \hat{y}_{cal,i}\right)^2}{n_{cal}}} \tag{4}$$

where y_cal,i and ŷ_cal,i are, respectively, the reference value and the value estimated by the cross-validation procedure for sample i, and n_cal is the number of samples in the calibration set.

4. Convergence and Reproduction. If a solution satisfies the convergence criterion, the algorithm ends; otherwise, the individuals with the best fitness are selected and reproduction is initiated.

5. Mutation. In mutation, a specific bit is changed (0 to 1 or 1 to 0). This procedure helps the algorithm escape local minima during execution. In this paper, a mutation rate of 1% was adopted.

Figure 1. Coding of variables: each artificial chromosome is represented by a sequence of binary codes (0's and 1's). The parameters of the SVR model were coded using a 15-digit binary sequence for each constant to be optimized.
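The five GA steps can be illustrated with a minimal sketch in Python. It is not the authors' Matlab implementation: the truncation selection, single-point crossover, fixed number of generations, and the names `run_ga` and `toy_fitness` are assumptions made for illustration, and the toy fitness stands in for the RMSECV of an SVR model. Only the binary coding and the 1% mutation rate are taken from the text.

```python
import random

def run_ga(fitness, n_bits, pop_size=20, n_generations=30,
           mutation_rate=0.01, seed=0):
    """Minimal binary GA; lower fitness is better. Returns the best chromosome."""
    rng = random.Random(seed)
    # Step 2: random initial population of bit strings (step 1 is the coding itself).
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(n_generations):  # assumed stopping rule: fixed generation count
        # Step 3: rank individuals by fitness (RMSECV in the paper).
        ranked = sorted(population, key=fitness)
        if fitness(ranked[0]) < fitness(best):
            best = ranked[0][:]
        # Step 4: keep the better half as parents (assumed selection scheme).
        parents = ranked[:pop_size // 2]
        children = []
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)  # single-point crossover (assumed)
            child = mother[:cut] + father[cut:]
            # Step 5: flip each bit with probability mutation_rate.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = children
    return min(population + [best], key=fitness)

# Toy problem: recover a hidden 12-bit mask (hypothetical stand-in for RMSECV).
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1]

def toy_fitness(chromosome):
    """Number of bits that differ from the hidden mask."""
    return sum(a != b for a, b in zip(chromosome, TARGET))

best = run_ga(toy_fitness, n_bits=len(TARGET))
```

In the paper the population (1024 individuals) and the number of generations (up to 200) are much larger, and each fitness evaluation requires training an SVR model by cross-validation.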



EXPERIMENTAL SECTION

Crude Oil Samples. In this study, 65 samples of crude oil were used, with widely differing physicochemical characteristics, ranging from light to heavy oils. The saturates, aromatics, and polars contents were determined by preparative column chromatography. A representative separation of the saturated, aromatic, and polar fractions of petroleum was performed according to a modified version of ASTM D2549-02,26 in which the particle size of the silica and the solvents were changed. Silica gel with a particle size of 230 to 400 mesh was used; 20 g of silica gel was activated at 120 °C for 12 h and poured into a 25 cm glass column. A silica tablet loaded with 0.2 g of petroleum was transferred to the top of the column, and successive elutions were performed with 200 mL of hexane, 200 mL of hexane:dichloromethane (1:1), and 200 mL of methanol to separate the saturated, aromatic, and polar fractions, respectively. Due to inaccuracy in the separate quantification of the resins and asphaltenes contents, they were determined as a single property, named polars. Thus, the calibration models were constructed for the determination of three properties: saturates, aromatics, and asphaltenes + resins (polars). The contents of saturates, aromatics, and polars determined for the 65 petroleum samples are given in Table 1S.


Table 1. Parameters of the SVR Model

                                     optimized values
parameter   range                    saturates      aromatics      polars
C           10^-4 to 10^4            37             630            22
e           10^-4 to 10              2.0 × 10^-3    1.0 × 10^-3    2.0 × 10^-3
γ           10^-4 to 1.0             1.6 × 10^-4    1.2 × 10^-4    3.9 × 10^-4
ε           10^-4 to 5.0 × 10^-2     2.3 × 10^-2    7.1 × 10^-3    1.4 × 10^-2
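To make the roles of γ and ε in Table 1 concrete, the short Python sketch below evaluates the RBF kernel of eq 1 and the ε-insensitive loss for toy inputs, using the values optimized for saturates. The function names and the example vectors are hypothetical, and the property values are written on an arbitrary 0 to 1 scale because the text does not state how the reference values were scaled.

```python
import math

def rbf_kernel(x_i, x_j, gamma):
    """RBF kernel of eq 1: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * sq_dist)

def eps_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the tolerance tube of half-width epsilon, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

GAMMA_SATURATES = 1.6e-4    # from Table 1
EPSILON_SATURATES = 2.3e-2  # from Table 1

# Two toy "spectra" with three variables each (made-up numbers).
spectrum_a = [0.10, 0.80, 0.30]
spectrum_b = [0.20, 0.60, 0.35]

similarity = rbf_kernel(spectrum_a, spectrum_b, GAMMA_SATURATES)
inside_tube = eps_insensitive_loss(0.50, 0.51, EPSILON_SATURATES)
outside_tube = eps_insensitive_loss(0.50, 0.60, EPSILON_SATURATES)
```

A small γ gives a wide Gaussian, so even dissimilar spectra keep a kernel value close to 1. Residuals smaller than ε contribute nothing to the objective in eq 2, and only the excess beyond ε is penalized through C.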

13C NMR Measurements. The 13C NMR spectra were obtained at 27 °C using a Varian VNMRS-400 instrument operating at a magnetic field of 9.4 T with a 10 mm broadband 15N-31P {1H-19F} probe. The samples were prepared by dissolving the oil (corresponding to 40% of the final volume of the solution) in deuterated chloroform (CDCl3) containing 0.05 mol/L of Cr(acac)3 (chromium(III) acetylacetonate), to a total volume of 3 mL. The instrumental conditions were as follows: frequency, 100.51 MHz for the 13C nucleus; spectral window, 25510.2 Hz; acquisition time, 1.285 s; waiting time, 7 s; pulse, 90° (14.2 μs); number of transients, 1000; and decoupling mode, nny. The decoupler was turned off during the pulse and the waiting time and was switched on only during data acquisition. This procedure suppresses the nuclear Overhauser enhancement of the signal intensities, as required for a quantitative experiment.

Model Development. The petroleum samples were split into two sets using the Kennard-Stone algorithm:27 45 samples for calibration (model development) and 20 samples for prediction. Prior to the construction of the GA-SVR model, the spectra were aligned using the icoshift algorithm.28 A problem in using the GA for optimization is its possible convergence to local minima. Different selections of spectral variables and SVR parameter values can be obtained when running GA-SVR several times with the same calibration samples. To minimize this effect, 100 GA-SVR models were built, and only the variables most frequently selected by the GA across these models were retained, because important spectral variables tend to be selected repeatedly, whereas less important variables are selected at random. Although the optimization of the SVR model parameters and the selection of variables were conducted simultaneously, the choice of the optimum values was evaluated separately. For the optimization of the model parameters, a set of 15 binary digits was used to represent each SVR model parameter.
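As an illustration of this coding, the Python sketch below decodes a 15-bit gene into a parameter value and separates the parameter genes from the variable-selection mask. The text does not say how the 15-bit integer is mapped onto the search range or how the genes are ordered, so the log10 mapping, the layout (parameter genes first, mask last), and the function names are assumptions.

```python
import math

N_BITS = 15  # bits per SVR parameter, as stated in the text

def decode_parameter(bits, low, high):
    """Map a 15-bit gene onto [low, high] on a log10 scale (assumed mapping)."""
    if len(bits) != N_BITS:
        raise ValueError("expected a 15-bit gene")
    integer = int("".join(str(b) for b in bits), 2)  # 0 .. 2**15 - 1
    fraction = integer / (2 ** N_BITS - 1)           # 0.0 .. 1.0
    log_low, log_high = math.log10(low), math.log10(high)
    return 10 ** (log_low + fraction * (log_high - log_low))

def split_chromosome(chromosome, n_params):
    """Return the parameter genes and the indices of the selected variables."""
    genes = [chromosome[i * N_BITS:(i + 1) * N_BITS] for i in range(n_params)]
    mask = chromosome[n_params * N_BITS:]
    selected = [idx for idx, bit in enumerate(mask) if bit == 1]
    return genes, selected

# Toy chromosome: two parameter genes followed by a three-variable mask.
chromosome = [0] * 15 + [1] * 15 + [1, 0, 1]
genes, selected = split_chromosome(chromosome, n_params=2)
c_value = decode_parameter(genes[0], 1e-4, 1e4)  # search range of C in Table 1
```

A logarithmic mapping is a common choice when a range spans several orders of magnitude, as the range of C does; a linear mapping would place almost all of the 2^15 codes in the top decade.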
Table 1 presents the ranges of values evaluated for each parameter (C, e, γ, and ε) in this work. For each parameter, a frequency distribution plot of the values optimized in the 100 GA-SVR runs was evaluated, and the final SVR model parameters were chosen as the median of this distribution. For spectral variable selection, each spectrum initially had 5530 variables, and each variable was represented by one binary digit, which took the value 0 if the variable was not selected for model development and 1 if it was. RMSECV was then estimated as a function of the frequency with which the variables were selected across the 100 GA-SVR runs, using the variables selected in 30% to 80% of the runs. The GA was configured with an initial population of 1024 individuals, a maximum of 200 generations, a mutation rate of 1%, initialization with 15% of the total spectral variables, and optimization by 5-fold cross-validation. The SVR models were developed using the libsvm package for Matlab, freely available at https://www.csie.ntu.edu.tw/~cjlin/libsvm,29 and the genetic algorithm was implemented in software developed in Matlab by the authors. PLS models11,12 were also built with the same variable selection technique (genetic algorithm) employed for SVR.
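The aggregation over repeated GA runs can be sketched in Python as follows. The sketch uses four toy runs instead of 100, made-up values of C, and hypothetical function names. Treating "selected in 35% of the runs" as "selected in at least 35% of the runs" is also an assumption.

```python
from statistics import median

def selection_frequency(masks):
    """Fraction of GA runs in which each variable was selected."""
    n_runs = len(masks)
    n_vars = len(masks[0])
    return [sum(run[j] for run in masks) / n_runs for j in range(n_vars)]

def variables_above(frequency, threshold):
    """Indices of variables selected in at least `threshold` of the runs."""
    return [j for j, f in enumerate(frequency) if f >= threshold]

# One 0/1 mask per GA run (toy data: 4 runs, 5 spectral variables).
masks = [
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1],
    [1, 0, 1, 0, 0],
]
frequency = selection_frequency(masks)
kept = variables_above(frequency, 0.5)

# SVR parameter: take the median of the values optimized in each run.
c_per_run = [30, 37, 41, 650]
c_final = median(c_per_run)
```

In the procedure described above, the threshold would be scanned from 30% to 80%, a model refitted at each value, and the threshold giving the lowest RMSECV retained. The median makes the final parameter insensitive to occasional runs that converge to extreme values, such as the 650 in the toy list.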

RESULTS AND DISCUSSION

The API gravity is an important physical property for the petroleum industry. Samples with low API gravity (heavy oils) typically have a higher content of resins and asphaltenes than light oil samples (high API gravity).30 The 65 oil samples used in this work had API gravity values ranging from 10 to 55, comprising examples of light to heavy oils.

Before the construction of calibration models, the 13C NMR spectra need to be aligned. The shifts of the signals in the 13C NMR spectra are related to changes in the chemical environment of the 13C nucleus. These shifts can be generated by small variations in the acidity of the medium or even by the density.28 They must be corrected prior to construction of the models because even small shifts can impair the optimization of quantitative models. Figure 2a shows the 13C NMR spectra characteristic of the crude oil samples, with an emphasis on two regions where the shift of the signals is evident. In Figure 2b, the spectra are displayed after alignment with the icoshift algorithm, with the same two regions highlighted (δ = 78−76 and δ = 32−29 ppm).

Figure 2. 13C NMR spectra of the crude oils (a) before and (b) after alignment using the icoshift program.

In the multivariate calibration of spectroscopic data, such as infrared spectra, there is generally a high correlation between the spectral variables. Thus, for the selection of variables, deterministic methods, such as interval partial least-squares (iPLS) or a combination of intervals (siPLS, siSVR), are preferably used. These methods select predetermined spectral regions, and regression models are developed with the selected variables.30 However, in mass spectrometry and nuclear magnetic resonance spectroscopy (1H and 13C), in many cases the spectra have variables with less collinearity.31 In these cases, probabilistic methods for the selection of spectral variables, such as the genetic algorithm, can generate better results.

The parameters used in the final GA-SVR model were chosen based on the median of the frequency distribution of the values optimized in each run, as presented in Figure 3. The optimized parameters are presented in Table 1. The most discrepant values among the three properties were those of the C parameter. In general, a higher value of C implies a less smooth optimized function. The model for the aromatics had the highest C value, indicating that this model is less robust and may lead to larger prediction errors.

Figure 3. Histogram of the SVR model parameters optimized by the genetic algorithm for determining the contents of saturates, aromatics, and polars. The red vertical line is the median value of the distribution.

Figure 4 presents a plot of the cross-validation error as a function of the frequency of variable selection by the GA. The optimal number of variables selected in the final model for each oil property was chosen based on the minimum RMSECV value. For saturates and aromatics, the minimum RMSECV was reached using the variables selected in 35% of the GA optimization runs. This equal selection frequency for saturates and aromatics does not imply that the same variables were chosen, because different spectral variables can be selected in each run of the GA optimization. The final GA-SVR model for the polars content was obtained using the variables selected in 45% of the GA optimization runs (Figure 4).

Figure 4. Plot of RMSECV against the frequency of the spectral variables selected.

The average spectrum of the calibration samples is shown in Figure 5a, which highlights the region where there is a predominance of carbon in aromatic groups and carbons linked to heteroatoms, such as nitrogen and oxygen, common in crude oil samples. The chemical shifts of 129−137 ppm, 123.5−126 ppm, and 128.5−136 ppm are attributed to aromatic carbons substituted by methyl, aromatic carbons in three aromatic ring junctions, and aromatic carbons in two aromatic ring junctions, respectively.32 The region just below 80 ppm is due to the solvent CDCl3. The region below 40 ppm is characteristic of aliphatic carbon groups, which are related to saturates. Although the chemical shift region from 140 to 125 ppm (Figure 5a) has lower intensity signals, it is of great importance in modeling the saturates and aromatics contents (Figure 5, panels b and c) because many of its variables were selected by the GA optimization procedure. For modeling the polars content, this region was less important (Figure 5d). The final GA-SVR model for saturates was built with only 331 variables (chemical shifts): 132 variables from the aromatic carbon region (δ = 140−125 ppm) and 199 from the paraffinic carbon region (δ = 42−10 ppm) (Figure 5b). The GA-SVR model for aromatics was built with 363 variables: 214 from the aromatic region and 149 from the paraffinic region (Figure 5c). The GA-SVR model for polars had 153 variables: 84 from the aromatic part and 69 from the paraffinic part of the 13C NMR spectrum (Figure 5d).

Figure 5. (a) 13C NMR average spectrum of the calibration samples with the predominance of the aromatic carbons region highlighted on the right. Frequency of variables selected for (b) saturates, (c) aromatics, and (d) polars. The horizontal line represents the frequency selection for the construction of the GA-SVR model.

It was expected that the region below 40 ppm would contribute significantly to modeling the saturates content (Figure 5b) because this region is related to aliphatic carbon groups, which constitute the saturated petroleum fraction. However, a higher contribution was observed from the region where aromatic compounds predominate. This can be related to the reduction of the signal of aromatic compounds in crude oil samples rich in saturated compounds. During SVR modeling, the variables are normalized to the range from 0 to 1, which gives the same initial importance to all variables. Thus, the spectral regions selected by the GA are directly related to the property of interest, regardless of the absolute intensity of the NMR signals. The variables selected for modeling the aromatics (Figure 5c) lie predominantly in the aromatic region of the 13C NMR spectrum, as expected. However, the modeling of the polars content also selected variables in this region (Figure 5d) because of the predominance of heteroatoms associated with high molecular weight compounds, which, in the case of oils, correspond to the aromatic groups that constitute the resins and asphaltenes.

The GA-SVR calibration model for the saturates content presented an average error of 3.0% (w/w) for cross-validation and 4.4% (w/w) for the prediction samples (Table 2). Figure 6 presents the plot of the measured SAP contents versus the values estimated by the GA-SVR model. In the plot for saturates (Figure 6a), good linearity was achieved, expressed by high values of the determination coefficient: 0.9444 for cross-validation and 0.9078 for the prediction set. For aromatics content modeling, the error was 1.9% (w/w) for cross-validation and 4.3% (w/w) for the prediction samples (Table 2). Due to the lower content of aromatics in the oils compared to saturates, these errors, despite having the same magnitude, correspond to lower coefficients of determination: 0.8974 for cross-validation and 0.7298 for the prediction samples (Table 2). Figure 6b shows poorer agreement between the ASTM reference values and the predicted values than the model for the saturates content (Figure 6a). In general, the results for SAP prediction in crude oil samples were satisfactory for practical applications that need a rapid assessment of this property.

Table 2. Results of the Statistical Parameters of the GA-SVR and GA-PLS Models

model     parameter         saturates   aromatics   polars
GA-SVR    RMSECV (% w/w)    3.0         1.9         2.8
          R2cv (a)          0.944       0.897       0.899
          RMSEP (% w/w)     4.4         4.3         3.7
          R2p (b)           0.908       0.730       0.774
GA-PLS    RMSECV (% w/w)    5.6         2.4         4.6
          R2cv (a)          0.779       0.749       0.706
          RMSEP (% w/w)     6.2         5.3         4.0
          R2p (b)           0.847       0.564       0.778

(a) Determination coefficient calculated from cross-validation. (b) Determination coefficient calculated from prediction samples.

For the determination of saturates and aromatics, the GA-SVR model showed better results than the GA-PLS method (Table 2): the coefficients of determination (R2p) were higher and the prediction errors (RMSEP) lower. The randomization test for comparing the predictive accuracy of models33 was applied to the data and indicated significant differences in the accuracy of the GA-SVR and GA-PLS models for saturates (p-value 0.049) and aromatics (p-value 0.037). The evaluation of nonlinearity in the data set, using the methodology based on augmented partial residual plots (APARP) described by Centner et al.,34 was inconclusive. It is therefore possible to infer that the superior performance of GA-SVR is due to the complexity of the data set. In the modeling of the polars content (Figure 6c), the results for GA-SVR and GA-PLS were similar (based on the randomization test), with coefficients of determination of 0.8990 for cross-validation and 0.774 for the prediction samples (Table 2) for GA-SVR. The errors had the same magnitude as in the two previous calibration models: 2.8% (w/w) and 3.7% (w/w) for cross-validation and prediction, respectively.

The results obtained for the three estimated properties were acceptable for the petroleum industry, considering that the prediction errors were less than 5% (w/w), which is the maximum variation allowed in SAP analysis (saturates, aromatics, and polars) by the methodology used in this work. Another positive point is the strong linear relationship shown by the models, with R2 higher than 0.70 for all prediction sets. The better predictions for saturates may be related to the experimental procedure.
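The comparison of GA-SVR and GA-PLS by a randomization test can be illustrated with the Python sketch below. It is a generic paired sign-flip randomization test on squared prediction errors, written in the spirit of ref 33. The number of permutations, the two-sided form, the function name, and the toy residuals are assumptions, not details reported by the authors.

```python
import random

def randomization_test(errors_a, errors_b, n_perm=999, seed=0):
    """Two-sided sign-flip randomization test on paired squared-error differences.

    errors_a and errors_b are the prediction residuals of two models on the
    same samples. Returns an estimated p-value for equal predictive accuracy.
    """
    rng = random.Random(seed)
    diffs = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    hits = 0
    for _ in range(n_perm):
        # Under the null hypothesis the two models are exchangeable, so each
        # paired difference is equally likely to carry either sign.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(flipped) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy residuals for 20 prediction samples (made-up numbers, % w/w).
residuals_model_a = [0.1 * (-1) ** i for i in range(20)]
residuals_model_b = [3.0 * (-1) ** i for i in range(20)]
p_value = randomization_test(residuals_model_a, residuals_model_b)
```

A small p-value indicates that the difference in mean squared error between the two models is unlikely to arise by chance. When the two residual vectors are identical, every permutation matches the observed statistic and the function returns 1.0.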



Figure 6. Plot of the SAP contents measured versus the values estimated by the GA-SVR model: (a) saturates, (b) aromatics, and (c) polars (resins and asphaltenes).

CONCLUSIONS

The prediction errors for the calibration of the contents of saturates, aromatics, and polars were approximately 4% (w/w). The magnitudes of these errors were satisfactory and below the level accepted by the petroleum industry (5% w/w). The determination of these constituents in petroleum is important for monitoring oil quality, and few methods described in the literature can perform this determination using small amounts of sample (2 mL) in a relatively short time (approximately 2 h). The genetic algorithm gave satisfactory results for the selection of spectral variables and the simultaneous optimization of the parameters of the support vector regression model. Evaluating several models (100 in this case) to guard against convergence of the algorithm to a local minimum, and then combining their results to select the variables, was fundamental to the successful use of the GA.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.energyfuels.5b02377.

Contents of saturates, aromatics, and polars in the 65 oil samples used in the modeling (PDF)

AUTHOR INFORMATION

Corresponding Author

*Tel.: +55 019 35213126. Fax: +55 019 35213023. E-mail: [email protected].

Author Contributions

P.R.F. was responsible for developing the genetic algorithm, statistical analysis of the results, and conclusions. N.A.P. was responsible for 13C NMR analysis and results interpretation. S.R.C.S. was responsible for 13C NMR analysis and results interpretation. E.V.R.C. was responsible for developing the genetic algorithm and statistical analysis of the results. L.M.S.L.O. and J.C.M.D. were responsible for saturates, aromatics, and polars analysis in the 65 crude oil samples and results interpretation. A.C.N. was responsible for 13C NMR analysis and results interpretation. W.R. was responsible for 13C NMR analysis and results interpretation. R.J.P. was responsible for developing the genetic algorithm, statistical analysis of the results, and conclusions.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

The authors would like to thank LABPETRO-UFES and Petróleo Brasileiro S.A. (PETROBRAS/CENPES) for providing the crude oil samples; FAPES, CAPES, and CNPq for their financial support; and NCQP/UFES for the analyses.

REFERENCES

(1) Speight, J. G. Handbook of Petroleum Product Analysis; Wiley Interscience: Hoboken, NJ, 2002; pp 46−48.
(2) McCain, W. D. Properties of Petroleum Fluids; Pennwell Publishing Company: Tulsa, OK, 1990; pp 257−299.
(3) Lyons, W. C.; Plisga, G. J. Standard Handbook of Petroleum & Natural Gas Engineering; Elsevier: Amsterdam, Netherlands, 2005; pp 242−243.
(4) Sanchez-Minero, F.; Ancheyta, J.; Silva-Oliver, G.; Flores-Valle, S. Fuel 2013, 110, 318−321.

(5) Monteiro, M. R.; Ambrozin, A. R. P.; Liao, L. M.; Boffo, E. F.; Pereira-Filho, E. R.; Ferreira, A. G. J. Am. Oil Chem. Soc. 2009, 86, 581−585.
(6) Ramos, P. F. D.; de Toledo, I. B.; Nogueira, C. M.; Novotny, E. H.; Vieira, A. J. M.; Azeredo, R. B. D. Chemom. Intell. Lab. Syst. 2009, 99, 121−126.
(7) Zheng, X. Y.; Jin, Y. Q.; Chi, Y.; Ni, M. J. Energy Fuels 2013, 27, 5787−5792.
(8) Masili, A.; Puligheddu, S.; Sassu, L.; Scano, P.; Lai, A. Magn. Reson. Chem. 2012, 50, 729−738.
(9) Daniel, M. V.; Uribe, U. N.; Murgich, J. Energy Fuels 2007, 21, 1674−1680.
(10) Barbosa, L. L.; Kock, F. V. C.; Silva, R. C.; Freitas, J. C. C.; Lacerda, V., Jr.; Castro, E. V. R. Energy Fuels 2013, 27, 673−679.
(11) Höskuldsson, A. J. Chemom. 1988, 2, 211−228.
(12) Wold, S.; Sjöström, M.; Eriksson, L. Chemom. Intell. Lab. Syst. 2001, 58, 109−130.
(13) Sebald, D. J.; Bucklew, J. A. IEEE Trans. Signal Process. 2000, 48, 3217−3226.
(14) Cortes, C.; Vapnik, V. N. Mach. Learn. 1995, 20, 273−297.
(15) Li, H.; Liang, Y.; Xu, Q. Chemom. Intell. Lab. Syst. 2009, 95, 188−198.
(16) Filgueiras, P. R.; Loureiro, A. R.; Santos, M. F. P.; Castro, E. V. R.; Dias, J. C. M.; Poppi, R. J.; Sad, C. M. S. Fuel 2014, 116, 123−130.
(17) Henriques, C. B.; Alves, J. C. L.; Poppi, R. J.; Maciel Filho, R.; Bueno, M. I. M. S. Energy Fuels 2013, 27, 3014−3021.
(18) Filgueiras, P. R.; Alves, J. C. L.; Poppi, R. J. Talanta 2014, 119, 582−589.
(19) Cong, P.; Li, T. Anal. Chim. Acta 1994, 293, 191−203.
(20) Niazi, A.; Leardi, R. J. Chemom. 2012, 26, 345−351.
(21) Vapnik, V. N. IEEE Trans. Neural Networks 1999, 10, 988−999.
(22) Smola, A. J.; Schölkopf, B. Stat. Comput. 2004, 14, 199−222.
(23) Leardi, R. J. Chemom. 2000, 14, 643−655.
(24) Xin, N.; Gu, X.; Wu, H.; Hu, Y.; Yang, Z. J. Chemom. 2012, 26, 353−360.
(25) Mevik, B. H.; Cederkvist, H. R. J. Chemom. 2004, 18, 422−429.
(26) ASTM D2549-02 Standard Test Method for Separation of Representative Aromatics and Nonaromatics Fractions of High-Boiling Oils by Elution Chromatography. Annual Book of ASTM Standards; American Society for Testing Materials: Philadelphia, PA, 1989.
(27) Kennard, R. W.; Stone, L. A. Technometrics 1969, 11, 137−148.
(28) Savorani, F.; Tomasi, G.; Engelsen, S. B. J. Magn. Reson. 2010, 202, 190−202.
(29) Chang, C. C.; Lin, C. J. ACM Trans. Intell. Syst. Technol. 2011, 2, 1−27.
(30) Tozzi, F. C.; Sad, C. M. S.; Bassane, J. F. P.; dos Santos, F. D.; Silva, M.; Filgueiras, P. R.; Dias, H. P.; Romão, W.; de Castro, E. V. R.; Lacerda, V., Jr. Fuel 2015, 159, 607−613.
(31) Terra, L. A.; Filgueiras, P. R.; Tose, L. V.; Romão, W.; de Souza, D. D.; de Castro, E. V. R.; de Oliveira, M. S. L.; Dias, J. C. M.; Poppi, R. J. Analyst 2014, 139, 4908−4916.
(32) Gillet, S.; Rubini, P.; Delpuech, J. J.; Escalier, J. C.; Valentin, P. Fuel 1981, 60, 221−225.
(33) Van der Voet, H. Chemom. Intell. Lab. Syst. 1994, 25, 313−323.
(34) Centner, V.; de Noord, O. E.; Massart, D. L. Anal. Chim. Acta 1998, 376, 153−168.
