

Energy & Fuels 2007, 21, 998-1005

Estimation of Properties of Crude Oil Residual Fractions Using Chemometrics

Sriram Satya,*,† Richard M. Roehner,‡ Milind D. Deo,† and Francis V. Hanson†

Department of Chemical Engineering, University of Utah, 50 South Central Campus Drive, 3290 MEB, Salt Lake City, Utah 84112, and Anvil Corporation, 1675 West Bakerview Road, Bellingham, Washington 98226

Received April 3, 2006. Revised Manuscript Received December 26, 2006

Knowledge of certain properties of a crude oil such as saturates, aromatics, resins, and asphaltenes (SARA) contents, Conradson carbon residue (CCR), ultimate analysis (CHNS), density, and molecular weight (MW) is useful for the characterization of the oil. Multivariate statistics combined with near-infrared (NIR) spectroscopy can be a powerful tool to rapidly and accurately predict these properties. Twenty-two crude oil fractions from the Alaska North Slope (ANS), the western United States (Utah, Colorado, and Wyoming), and Venezuela were used in this study. Eleven of these samples were C25+ residual fractions while the rest were C12+ residual fractions. The objective was to develop chemometric prediction models to predict the properties of unknown fractions using a single NIR spectrum. The SARA components (HPLC), molecular weights, densities, hydrogen-to-carbon (H/C) ratios, weight percent (wt %) nitrogen, weight percent sulfur (from CHNS analysis), and weight percent Conradson carbon residue (CCR) were measured. The NIR spectra for these fractions were obtained at 20 °C. Principal component analysis (PCA) and partial least-squares (PLS) techniques were used to analyze and correlate the spectra to the measured properties. Linear correlations with R² values greater than 0.99 were obtained for all properties studied. The uncertainty in experimental measurements for all the properties studied was comparable with the uncertainty in predictions by the models of the respective property. Furthermore, the models were tested using samples that did not belong to the calibration set. The properties predicted for these samples were within the range described by the experimental error for the respective property.

Introduction

The demand for petroleum and its products increased in the last 25 years of the 20th century. This trend will continue for at least a few decades into the future. Petroleum products are widely used in transportation, in space heating applications, in the chemical process industries, and in the cosmetics and other industries. The rise in demand in terms of quantity and quality for petroleum products renders the opportunities for application of chemometrics in the petroleum industry endless. Chemometric methods can be effectively used for property sensing applications in production, transport, and refining in various scenarios. Properties such as molecular weight (UOP676-84), density (ASTM D 5002-99), and weight percent (wt %) Conradson carbon residue (CCR) (ASTM D-189) are required as input parameters in several correlations for the prediction of other petroleum properties.¹ Other characterization data such as the weight percent saturates, aromatics, resins, and asphaltenes (SARA) and the weight percent paraffin, naphthenes, and aromatics (PNA) are required as input parameters for modeling such phenomena as the wax precipitation temperature and the amount of wax precipitated. These characterization experiments are generally time-consuming and expensive and, as a result, hard to implement in an in-situ environment and for real-time analysis. An indirect and rapid estimation of these properties, when accurate, can be used to expedite the characterization procedure.

Fourier transform infrared (FTIR) spectroscopy has been used in conjunction with multivariate statistical analysis to estimate several properties of middle distillate fuels such as diesel and jet fuels² and to determine the weight percent asphaltenes in crude oils.³ Similarly, near-infrared (NIR) spectroscopy has been used to estimate the properties of hydrocarbon mixtures,⁴ petroleum products,⁵ petroleum crude,⁶ and gas-oil ratios of live crude oils.⁷ These methods are as accurate as the traditional methods and much faster. The suitability of NIR spectroscopy and multivariate statistical analysis for property prediction has been discussed by Kelly and Callis.⁸

* To whom correspondence should be addressed. Tel./Fax: (801) 5816591/(801) 585-9291. E-mail: [email protected].
† University of Utah.
‡ Anvil Corporation.

(1) Altgelt, K. H.; Boduszynski, M. M. Composition of Heavy Petroleum. 3. An Improved Boiling Point-Molecular Weight Relation. Energy Fuels 1992, 6 (1), 68-72.
(2) Fodor, G. E.; Kohl, K. B. Analysis of Middle Distillate Fuels by Midband Infrared. Energy Fuels 1993, 7, 598-601.
(3) Wilt, B. K.; Welch, W. T. Determination of Asphaltenes in Petroleum Crude Oils by Fourier Transform Infrared Spectroscopy. Energy Fuels 1998, 12, 1008-1012.
(4) Honigs, D. E.; Hirschfeld, T. B.; Hieftje, G. M. Near-Infrared Determination of Several Physical Properties of Hydrocarbons. Anal. Chem. 1985, 57, 443-445.
(5) Swarin, S. J.; Drumm, C. A. Prediction of Gasoline Properties with Near-Infrared Spectroscopy and Chemometrics. SAE Trans. 1991, 100, 1110-1118; Sect. 4, 912390.
(6) Aske, N.; Kallevik, H.; Sjoblom, J. Determination of Saturate, Aromatic, Resin, and Asphaltenic (SARA) Components in Crude Oils by Means of Infrared and Near-Infrared Spectroscopy. Energy Fuels 2001, 15, 1304-1312.
(7) Mullins, O. C.; Daigle, T.; Crowell, C.; Groenzing, H.; Joshi, N. Gas-Oil Ratio of Live Crude Oils Determined by Near-Infrared Spectroscopy. Appl. Spectrosc. 2001, 55 (2), 197-201.
(8) Kelly, J. J.; Callis, J. B. Nondestructive Analytical Procedure for Simultaneous Estimation of the Major Classes of Hydrocarbon Constituents of Finished Gasolines. Anal. Chem. 1990, 62 (14), 1444-1451.

10.1021/ef0601420 CCC: $37.00 © 2007 American Chemical Society Published on Web 03/01/2007

Chemometrics of Crude Oil Residual Fractions

Even though quantitative analysis can be done using both the FTIR and NIR techniques, there are some advantages of using an NIR spectrometer. An NIR spectrum primarily consists of overtones and combination bands that are much more subtle and 10-100 times weaker in intensity than the fundamental bands in the mid-IR region. The weakness in intensity enables direct analysis of samples without dilution and also does not require a short optical path length. There is little to no sample preparation required, and the noise is significantly lower. Also, present-day NIR spectrometers can be made suitable for in-situ installations.

The objective of this study was to develop chemometric prediction models for several crude oil properties. Furthermore, the goal was to be able to predict all the appropriate properties by measuring one NIR scan of a crude oil sample, thereby reducing the analysis time. Twenty-two crude oils were used in this initial study. The crude oils were obtained from Alaska, Wyoming, Colorado, and Utah in the U.S.A. and from Venezuela.

Experimental Methods

Sample Set Description. The chemometric models in this study were developed using 22 samples obtained by distillation of different crude oils obtained from different geographical locations, as identified above. For clarity, the sample set was split into samples of types A and B. Samples of type A were the C25+ fractions of crude oils. These oils were distilled by Core Laboratories to 402 °C (756 °F) per ASTM D-2892.⁹ There were 11 type A samples from the Alaska North Slope crude oils or Trans Alaska Pipeline System (TAPS) mix blends. All the samples of type A were black and opaque. They were highly viscous and did not flow at room temperature. Some were waxy, while others had a tarlike appearance. The remaining 11 samples were C12+ fractions that were distilled in a Petroleum Research Center (PERC) laboratory at the University of Utah.
The distillation was carried out at 196 °C (385 °F) and atmospheric pressure. This procedure was adopted so that the oil could be “topped” by removing the volatiles. The C12+ resids were black and opaque at room temperature and exhibited varying flow properties. Some samples flowed at room temperature and some did not. Experiments summarized in the subsequent sections were performed using these 22 residual fractions. The residual fractions will henceforth be called resids.

SARA Analysis. Asphaltene Separation Using n-Heptane. The asphaltene fraction of a crude oil is insoluble in n-alkanes. This property was used to separate the asphaltene fraction from the maltene fraction of the resids by n-heptane-induced precipitation. Asphaltene precipitation was carried out by treating the resids with n-heptane at a 1:40 resid-to-n-heptane volume ratio. The mixture was agitated using an ultrasonic bath, and the precipitated asphaltenes were filtered by vacuum filtration. The solvent was removed from the filtrate by temperature-controlled heating under low pressure in a rotary evaporator. The recovered solvent-free sample was the maltene fraction. The filtered asphaltenes were dried in an oven at 100 °C. This method of precipitation is known to yield stable asphaltenes.¹⁰ The weight percent asphaltenes was calculated based on the initial weight of the sample and the amount of asphaltenes recovered.

HPLC Analysis. Approximately 1 g of the maltene fraction was mixed with 10 mL of n-heptane to prepare the sample for the HPLC experiment. A preparative-scale Waters HPLC system was used to separate the resids into their SAR fractions. The HPLC system consisted of a multisolvent delivery system (Waters 600 controller)

(9) Standard Test Method for Distillation of Crude Petroleum. Annual Book of ASTM Standards; ASTM: West Conshohocken, PA, D-2892-03, Vol. 05.01.
(10) Speight, J. G.; Long, R. B.; Trowbridge, T. D. Factors Influencing the Separation of Asphaltenes from Heavy Petroleum Feedstock. Fuel 1984, 63, 616-620.

Energy & Fuels, Vol. 21, No. 2, 2007 999

Figure 1. HPLC chromatogram showing retention times for saturates, aromatics, and resins.

that was used to control the flow of heptane, the mobile phase. An autosampler,¹¹ in which the sample vials were loaded, was programmed to inject the samples automatically onto the column. The system consisted of three silica columns connected in series, each 300 mm long with a 7.8 mm inside diameter. The columns were pressurized to 1000 psi using a methanol/water mixture and were equilibrated in flowing n-heptane. The autosampler was programmed to inject 300 µL of the sample (loaded in vials) onto the column. The n-heptane flow rate was maintained at 20 mL/min throughout the run. The SAR fractions were identified based on the relevant peaks appearing in the refractive index (RI) and photodiode array (PDA) signals. The saturates eluted first at about 5.5 min and were detected by the RI detector (Waters 2410 RI detector). The aromatics and the resins were detected by the PDA detector (Waters 996 PDA detector). The aromatics eluted at 8.5 min, and the resins were collected by back flushing the column; they eluted at 115 min. The volume of fluid flowing into the detector was 1% of the total volume of the fluid flowing through the column. The remaining portion was sent to a fraction collector (Waters fraction collector II) where the individual fractions were collected. The fractions were recovered later by evaporating the solvent in a rotary vacuum evaporator. The weight percent saturates, aromatics, and resins were calculated by gravimetric analysis of the fractions collected from the fraction collector after rotary evaporation. An example of an HPLC chromatogram for the separation of the SAR fractions is presented in Figure 1. The small spike at 65 min, observed in the figure, indicated the change in the direction of flow after the aromatics were completely eluted.

Molecular Weight Measurement. A Knauer K-7000 vapor pressure osmometer (VPO) was used to measure the molecular weights of the crude oil fractions.
The working principle of the osmometer for the measurement of molecular weights can be found in the literature (Knauer manual¹¹). Benzil was used as the standard to calibrate the VPO. The calibration was done using four concentrations of benzil in pyridine solvent. The concentrations for the calibration standard varied from 20 to 50 g solute/kg pyridine. The experimental error associated with this experiment was obtained by repeated measurements (three times) of the molecular weight for randomly selected samples.

CCR Measurement. Carbon residue in crude oils has a tendency to form coke when combusted under certain limited oxygen conditions. The estimation of the carbon residue is an important measurement to decide on the quality of the feedstock. Refiners usually prefer feedstocks containing low carbon residue since its deposition affects heat transfer during distillation. The weight percent CCR was determined using the ASTM D-189¹² method. The sample mass used for the experiments varied from 5 to 10 g depending on the CCR content of the sample. More details on the sample masses, the apparatus setup, and the method used for the determination of weight percent CCR can be found in the ASTM document. A weighed quantity of the sample was placed in a crucible and subjected to destructive distillation. The residue underwent cracking and coking during the fixed period of severe heating. The crucible containing the remaining carbonaceous material was cooled and weighed to determine the weight percent CCR. A repeat run was performed for all the samples. The weight percent CCR was determined only for 19 samples as a part of the chemometric database because sufficient sample was not available for the 20th sample. The weight percent CCR measurement using the D-189 method is a simple technique. However, small variations in the experimental setup can cause unacceptable differences in the weight percent CCR numbers.

(11) Vapor Pressure Osmometer K-7000, User Manual; Knauer: 1999; V7109.
(12) Standard Test Method for Conradson Carbon Residue of Petroleum Products. Annual Book of ASTM Standards; ASTM: West Conshohocken, PA, 2004; Vol. 05.01, D 189-01.

Density Measurement. The densities of the samples were measured using a DMA512 digital density meter manufactured by Anton Paar following the ASTM D-5002¹³ method for the measurement of density. The DMA512 digital density meter has a high-pressure external cell with a body and a U-tube made of stainless steel. It uses the principle of oscillation of the U-tube to measure the density of the sample. The oscillation frequency depends on the density of the sample. It was calibrated using ambient air and double-distilled water as recommended in the user’s manual. The calibration was done for a range of temperatures from 15 to 60 °C in intervals of 5 °C. The sample density was measured for all these temperatures, and a linear trend was observed with change in temperature. The linear fit was greater than 0.98 in all the cases. About 1 mL of sample was used for each measurement. Repeat measurements were done for several samples at different temperatures. The densities of the samples at 20 °C were used to develop the chemometric model.

CHNS Elemental Analysis. The ultimate analyses of the resids were determined using a CHNS 932 Leco furnace. The accuracy of this instrument is ±0.01% for the measurement of C, H, N, and S. It was calibrated with an acetanilide standard for carbon, hydrogen, and nitrogen and with a sulfamethazine standard for sulfur.
The results from these experiments were normalized to 100% without considering the presence of oxygen. The standards were analyzed with 1 ± 0.1 mg of a weighed sample loaded in silver capsules. After a stable measurement was established for the standards, the samples were analyzed with 1 ± 0.1 mg of sample. The average of five measurements of each sample was reported as the CHNS measurement of each sample. The atomic H/C ratio for each sample was obtained using the weight percent hydrogen and carbon values.

NIR Spectroscopy. Near-infrared spectroscopy was used in this study to develop chemometric models to predict multiple properties of the resids. A Quantum 1200 Plus near-infrared spectrometer, manufactured by LT Industries, was used to obtain the NIR spectra of the resids in a temperature-controlled environment. The path length in all the measurements was 6 mm. The raw NIR spectra of 20 samples used in this study are presented in Figure 2. The combination and the first and the second overtone bands for C-H stretch + C-H bend¹⁴ are observed in the 1200-2400 nm region. Most of the results related to this study were expected to be based on the variations of spectral intensities in this region. Furthermore, the spectral range for data acquisition was limited by the instrument setting. All other settings provided data in a narrower range. Hence, the raw spectra were collected at 20 °C in the 1200-2400 nm wavelength range. The sample was poured into a 15 mL glass beaker, and the NIR probe was immersed into it. A type J thermocouple was immersed in the sample by the side of the probe in such a way that it did not interfere with the NIR beam. The sample temperature was controlled by immersing this setup in a water bath.

Data Analysis. Sometimes it is difficult to extract useful information about the samples from raw spectra. It is common to pretreat them before any data analysis. In this study, the raw spectra were converted to absorbance spectra and imported into Matlab using the “spc-reader” function in the PLS toolbox. The first- and

(13) Standard Test Method for Density and Relative Density of Crude Oils by Digital Density Analyzer. Annual Book of ASTM Standards; ASTM: West Conshohocken, PA, 2004; Vol. 05.01, D 5002-1999.
(14) Mullins, O. C. Asphaltenes in Crude Oils: Absorbers and/or Scatterers in the Near-Infrared Region. Anal. Chem. 1990, 62, 508-514.

Satya et al.

Figure 2. Raw NIR spectra of the 20 samples.
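The derivative pretreatment applied to spectra such as those in Figure 2 can be illustrated with a minimal Python sketch. It is not the authors' Matlab/PLS Toolbox code: the forward-difference scheme, the function name `derivative`, the 2 nm grid, and the absorbance values are all assumptions chosen for illustration. The sketch only shows why differentiation suppresses a constant baseline shift.

```python
def derivative(values, step, order=1):
    """Finite-difference derivative of evenly spaced spectral values.

    Assumption: a plain forward difference is used here; the source does not
    state which differentiation algorithm was applied.
    """
    result = list(values)
    for _ in range(order):
        # Each pass computes (y[i+1] - y[i]) / step and shortens the list by one.
        result = [(b - a) / step for a, b in zip(result, result[1:])]
    return result


# Hypothetical absorbance values on an evenly spaced 2 nm wavelength grid.
step_nm = 2.0
spectrum = [0.10, 0.14, 0.22, 0.31, 0.27, 0.18]
# The same spectrum with a constant baseline offset added.
shifted = [a + 0.50 for a in spectrum]

d1 = derivative(spectrum, step_nm)            # first-order derivative
d1_shifted = derivative(shifted, step_nm)     # first-order derivative of shifted copy
d2 = derivative(spectrum, step_nm, order=2)   # second-order derivative

# The constant offset cancels in the first derivative (up to rounding error).
max_gap = max(abs(u - v) for u, v in zip(d1, d1_shifted))
print("largest difference after differentiation:", max_gap)
```

Differencing also amplifies point-to-point noise, so in practice a smoothing derivative filter is often preferred over the raw difference used in this sketch.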

second-order differentiated values were computed for subsequent analyses. First-order differentiation helped to reduce possible baseline shift. The disadvantage of this technique is the possible reduction of the signal-to-noise ratio.⁶

Principal component analysis (PCA) allows the identification of groups of variables that are interrelated via phenomena that cannot be directly observed. ASTM defines PCA as a mathematical procedure for resolving sets of data into orthogonal components whose linear combinations approximate the original data to any desired degree of accuracy. As successive components are calculated, each component accounts for the maximum possible amount of residual variance in the set of data. PCA is a technique that is used to reduce the dimensionality of the data. It is especially used to describe multivariate data that are generally hard to visualize. It uses a key result from matrix algebra wherein a p × p symmetric, nonsingular matrix can be reduced to a diagonal matrix L by premultiplying it by the transpose of a particular orthonormal matrix U and postmultiplying it by U. There are several definitions for PCA found in the literature.¹⁵⁻²² The mathematical concept of PCA, the interpretation of the principal components, and the definitions for scores and loadings are cited in several articles in the literature.²³,²⁴

Partial least-squares regression (PLSR) is a technique that combines the features from PCA and multiple linear regression (MLR). It is especially used to predict dependent variables from a large set of independent variables. It can be used for fitting a model to observed data. It is also widely used to predict the response of unknown samples using regression.²⁵ The application of PLS

(15) Anderson, T. W. An Introduction to Multivariate Statistical Analysis; John Wiley and Sons: New York, 1958; p 282.
(16) Enslein, K.; Ralston, A.; Wilf, H. S. Statistical Methods for Digital Computers; John Wiley and Sons: New York, 1977; p 306.
(17) Kennedy, W. J.; Gentle, J. E. Statistical Computing; Marcel Dekker: New York, 1980; p 566.
(18) Kramer, R. Chemometric Techniques for Quantitative Analysis; Marcel Dekker: New York, 1998.
(19) Martens, H.; Wold, S.; Martens, M. A Layman's Guide to Multivariate Analysis. In Food Research and Data Analysis; Russwurm, H., Ed.; Applied Science Publishers: New York, 1983; p 482.
(20) Naes, T.; Martens, H.; Irgens, C. Comparison of Linear Statistical Methods for Calibration of NIR Instruments. Appl. Stat. 1986, 35 (2), 195-206.
(21) Osborne, B. G.; Fearn, T. Near Infrared Spectroscopy in Food Analysis; John Wiley and Sons: New York, 1986; p 108.
(22) Gnanadesikan, R. Methods for Statistical Data Analysis of Multivariate Observation, second ed.; Wiley Series in Probability Statistics; John Wiley and Sons: New York, 1997.
(23) Burns, D. A.; Ciurczak, E. W. Handbook of Near-Infrared Analysis, second ed., revised and expanded; Ciurczak, E. W., Ed.; Marcel Dekker, Inc.: New York, 2001.
(24) Sharaf, M. A.; Illman, D. L.; Kowalski, B. R. Chemometrics; John Wiley and Sons: New York, 1986.
(25) http://www.galactic.com/algorithms/pls.htm (accessed 2002).
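The matrix result behind PCA described in this section (a symmetric matrix reduced to a diagonal matrix L by an orthonormal matrix U) can be sketched in plain Python. This is a hedged illustration, not the PLS Toolbox algorithm: power iteration with deflation is a technique swapped in here for readability, and the function names and the small data matrix are hypothetical. The eigenvalues returned are the diagonal entries of L, the eigenvectors are the columns of U (loadings), and the projections of the centered data are the scores.

```python
import math


def mean_center(rows):
    """Subtract the column means from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (p x p, symmetric) of mean-centered rows."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]


def leading_eigenpair(S, iterations=1000):
    """Power iteration: largest eigenvalue and unit eigenvector of symmetric S."""
    p = len(S)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(S[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * S[i][j] * v[j] for i in range(p) for j in range(p))
    return eigenvalue, v


def pca(rows, n_components):
    """Return eigenvalues (diagonal of L), loadings (columns of U), scores,
    and the fraction of total variance explained by each component."""
    centered = mean_center(rows)
    S = covariance(centered)
    p = len(S)
    total_variance = sum(S[i][i] for i in range(p))
    eigenvalues, loadings = [], []
    for _ in range(n_components):
        l, u = leading_eigenpair(S)
        eigenvalues.append(l)
        loadings.append(u)
        # Deflation: remove the component just found so the next pass finds the next one.
        S = [[S[i][j] - l * u[i] * u[j] for j in range(p)] for i in range(p)]
    scores = [[sum(x * ui for x, ui in zip(row, u)) for u in loadings]
              for row in centered]
    explained = [l / total_variance for l in eigenvalues]
    return eigenvalues, loadings, scores, explained


# Hypothetical data: 5 samples (rows) by 3 variables (columns).
data = [[2.0, 4.1, 1.0],
        [3.0, 6.2, 1.2],
        [4.0, 7.9, 0.9],
        [5.0, 10.1, 1.1],
        [6.0, 12.0, 1.0]]

eigenvalues, loadings, scores, explained = pca(data, n_components=2)
print("fraction of variance explained by PC1 and PC2:", explained)
```

Production code would normally obtain all components at once from a singular value decomposition; the iterative version is used here only to keep the sketch dependency-free.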


Table 1. Sample Type and Properties Measured
(columns satᵃ through CCR are in weight percent, wt %)

no.  type  density  satᵃ   aromᵃ  resins  asphᵃ  N     sulfur  CCR    MW   H/C
1    A     0.9475   56.43  25.27  18.30   0.00   0.63  0.65    5.55   496  1.714
2    A     0.9951   35.08  32.50  24.26   8.16   0.82  2.12    14.02  666  1.573
3    A     0.9655   36.03  33.67  27.44   2.86   0.71  1.81    10.61  609  1.638
4    A     0.9763   38.18  35.63  25.35   0.84   0.75  1.70    10.20  525  1.614
5    B     0.9410   43.07  40.87  13.34   2.72   0.54  1.76    NA     372  1.723
6    A     0.9986   33.59  34.17  22.94   9.30   0.85  2.35    15.07  739  1.545
7    A     0.9705   38.16  39.18  21.95   0.72   0.68  1.75    9.79   549  1.634
8    A     0.9317   34.53  37.70  23.31   4.47   0.79  1.97    13.50  641  1.560
9    B     0.8662   59.05  34.65  6.27    0.03   0.38  0.35    0.95   308  1.854
10   A     0.9268   50.29  29.65  19.99   0.08   0.49  0.47    3.75   460  1.775
11   B     0.8906   40.94  41.33  15.63   2.10   0.34  0.92    4.05   333  1.859
12   A     0.9806   37.99  34.79  26.64   0.58   0.82  1.82    12.10  626  1.594
13   A     0.9843   36.20  36.42  24.56   2.82   0.78  2.15    12.30  633  1.618
14   B     0.9296   44.53  37.30  17.87   0.30   0.55  1.28    7.10   352  1.766
15   B     0.9311   46.67  37.05  15.00   1.28   0.53  1.33    7.40   355  1.770
16   B     0.9249   47.98  35.21  16.10   0.70   0.56  1.24    6.55   407  1.729
17   B     0.8548   70.28  18.51  10.37   0.84   0.35  0.36    2.85   382  1.920
18   B     0.9812   50.73  29.48  15.74   4.05   0.67  2.16    9.97   417  1.742
19   A     0.9812   36.75  31.94  24.78   6.53   0.79  1.84    12.00  666  1.604
20   B     0.8748   66.09  23.59  10.06   0.26   0.34  0.83    2.20   338  1.889
21   B     0.8763   65.83  21.98  12.02   0.17   0.00  0.37    2.80   354  1.881
22   B     0.8600   70.19  21.36  8.35    0.10   0.00  0.14    0.10   286  1.895

ᵃ Here, sat = saturates, arom = aromatics, and asph = asphaltenes.

modeling as a multivariate statistical tool for quantitative analyses using several spectroscopic techniques has been discussed in the literature.²⁶ PLSR is probably the least restrictive of the various multivariate extensions of the multiple linear regression models.²⁵,²⁷ This flexibility allows it to be used in situations where the use of traditional multivariate methods is severely limited, such as when there are fewer observations than predictor variables. In this study, the responses are the properties listed as column headings in Table 1.
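The case of fewer observations than predictor variables can be made concrete with a short Python sketch of single-response PLS (PLS1) using the NIPALS deflation scheme. This is an illustrative implementation under stated assumptions, not the PLS Toolbox routine used in the study: the function names `fit_pls1` and `predict_pls1`, the two synthetic "pure component" spectra, the concentrations, and the linear property are all hypothetical. Five samples with eight wavelengths each are fitted with two latent variables.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_pls1(X, y, n_lv):
    """Fit a single-response PLS model with up to n_lv latent variables."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                              # y residuals
    weights, loadings, coefs = [], [], []
    for _ in range(n_lv):
        w = [dot(col, f) for col in zip(*E)]                 # direction X'y
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:                                     # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                       # scores
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in zip(*E)]            # X loadings
        q = dot(f, t) / tt                                   # y loading
        E = [[v - ti * pj for v, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [v - q * ti for v, ti in zip(f, t)]
        weights.append(w)
        loadings.append(p)
        coefs.append(q)
    return {"x_mean": x_mean, "y_mean": y_mean,
            "weights": weights, "loadings": loadings, "coefs": coefs}


def predict_pls1(model, x):
    """Predict the response for one new spectrum by repeating the deflation."""
    e = [v - m for v, m in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p, q in zip(model["weights"], model["loadings"], model["coefs"]):
        t = dot(e, w)
        y_hat += q * t
        e = [v - t * pj for v, pj in zip(e, p)]
    return y_hat


# Hypothetical two-component system: every spectrum is a mixture of s1 and s2.
s1 = [0.1, 0.3, 0.6, 0.9, 0.6, 0.3, 0.1, 0.0]
s2 = [0.0, 0.1, 0.2, 0.3, 0.5, 0.8, 0.5, 0.2]
conc = [(1.0, 0.5), (2.0, 1.5), (3.0, 0.7), (4.0, 2.0), (5.0, 1.0)]


def spectrum(c1, c2):
    return [c1 * a + c2 * b for a, b in zip(s1, s2)]


def prop(c1, c2):
    # Hypothetical property that is linear in the two concentrations.
    return 10.0 * c1 + 3.0 * c2 + 2.0


X = [spectrum(c1, c2) for c1, c2 in conc]   # 5 observations, 8 predictors
y = [prop(c1, c2) for c1, c2 in conc]

model = fit_pls1(X, y, n_lv=2)
prediction = predict_pls1(model, spectrum(2.5, 1.2))
print("predicted:", prediction, "true:", prop(2.5, 1.2))
```

Because the synthetic spectra are noise-free and contain exactly two sources of variation, two latent variables are enough for this example; real spectra would need the number of latent variables chosen by validation.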

Results and Discussion

The molecular weight, density, CCR, SARA, H/C atomic ratio, weight percent nitrogen, and weight percent sulfur were measured for the 22 crude oil fractions using the techniques described above. The results of these measurements are presented in Table 1. The authors recognize that the sample set is relatively small at this point in time; however, additional crude oils are being acquired to expand the overall size of the sample set and the range of properties. Two validation samples were selected to avoid reducing the size of the calibration set significantly. Samples 4 and 22 were randomly excluded from the sample set before calibration was performed for all the models. They were used as cross validation samples to test the predictive capability of the models. Wide ranges of values for all the properties are observed in Table 1 indicating that the samples are significantly different from one another. It is a coincidence that samples 4 and 22 do not have the maximum or the minimum values of any of the 10 properties.

Principal Component Analysis. Principal component analysis was performed using PLS Toolbox 3.0 in Matlab 6.5 for a dataset consisting of 20 samples (samples 4 and 22 were excluded) in order to understand the relationship between the samples. The first two principal components (PC) described 96% of the variation in the dataset. The second principal component (PC2) split the dataset into two distinct groups. Samples 1 and 9 were identified as potential outliers from an examination of the Hotelling T² ellipse based on a 95% confidence interval, shown as a dotted line in Figure 3. Note that Figure 3 was used to understand if there are specific groupings or trends in the dataset. The chemistry data from these two samples indicated no similarity between them other than very low asphaltene content. However, each sample was among the lightest (low density), containing a high saturates content and one of the highest H/C ratios within its type. (Sample 1 was type A, and sample 9 was type B.) It was found upon closer examination of the spectra that the optical densities between 1500 and 1700 nm for these two samples were significantly lower compared to the other samples. The difference in optical densities between the two samples was interpreted to be a true difference due to the chemistry (rather than some random difference) since the spectra were reproduced repeatedly upon several attempts. It was inferred, at this stage, that samples 1 and 9 simply belonged to a different sample group thereby adding variety to the dataset rather than being outliers. Spectra of other samples were also reproduced indicating that true differences may not have been captured by the suite of properties that were studied. Since there was not sufficient evidence for samples 1 and 9 being true outliers, they were included in the dataset for PLS regression. The outliers for each model were chosen based on the scores plot for each model. Two distinct groups were indicated by the

(26) Haaland, D. M.; Thomas, E. V. Partial Least-Square Methods for Spectral Analyses. 1. Relation to Other Quantitative Calibration Methods and the Extraction of Qualitative Information. Anal. Chem. 1988, 60, 1193-1202.
(27) http://www.statsoftinc.com/textbook/stpls.html#basic (accessed 2003).

Figure 3. Scores plot from PCA model for the 20 NIR spectra.
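The Hotelling T² ellipse drawn on a two-component scores plot such as Figure 3 can be approximated with the following Python sketch. Several points are assumptions rather than facts from the source: the limit formula 2(n − 1)/(n − 2) · F(2, n − 2) is one common convention and may differ from the one implemented in PLS Toolbox, the scores are treated as uncorrelated PCA scores, and the 20 score pairs are an artificial toy set (not the study's data). The F quantile uses the closed form that exists when the first degree of freedom equals 2.

```python
def f_quantile_2(prob, dof2):
    """Quantile of the F distribution with (2, dof2) degrees of freedom.

    For a numerator degree of freedom of 2 the CDF is
    1 - (dof2 / (dof2 + 2x)) ** (dof2 / 2), which inverts in closed form.
    """
    return (dof2 / 2.0) * ((1.0 - prob) ** (-2.0 / dof2) - 1.0)


def t2_limit(n_samples, prob=0.95):
    """Assumed Hotelling T^2 limit for two components and n_samples samples."""
    return (2.0 * (n_samples - 1) / (n_samples - 2)
            * f_quantile_2(prob, n_samples - 2))


def t2_values(scores):
    """T^2 per sample: sum over components of (centered score)^2 / score variance."""
    n = len(scores)
    columns = list(zip(*scores))
    means = [sum(c) / n for c in columns]
    variances = [sum((v - m) ** 2 for v in c) / (n - 1)
                 for c, m in zip(columns, means)]
    return [sum((v - m) ** 2 / s for v, m, s in zip(row, means, variances))
            for row in scores]


# Artificial (PC1, PC2) scores for 20 samples; the last one lies far from the rest.
scores = ([(1.0, 0.0)] * 6 + [(-1.0, 0.0)] * 6 + [(0.0, 1.0)] * 3
          + [(0.0, -1.0)] * 3 + [(0.0, 0.0)] + [(5.0, 0.0)])

limit = t2_limit(len(scores))
# 1-based sample numbers whose T^2 falls outside the 95% ellipse.
flagged = [i + 1 for i, t2 in enumerate(t2_values(scores)) if t2 > limit]
print("T2 limit:", round(limit, 3), "samples outside the ellipse:", flagged)
```

A sample outside the ellipse is only a candidate outlier; as the discussion of samples 1 and 9 shows, the chemistry and the reproducibility of the spectra still have to be examined before a sample is excluded.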


Table 2. PLS Statistics and Percent Relative Standard Deviation Data for All Models

model              cal range       X-explained  Y-explained  no. LVs  R²     samples excluded
molecular weight   286 - 739       92.19        99.43        3        0.996  none
wt % CCR           0.1 - 15.07     97.45        100          4        0.998  1, 11
H/C ratio          1.53 - 1.90     98.93        99.93        5        0.999  9, 11
wt % nitrogen      0 - 0.82        99.53        99.23        5        0.992  9, 11
wt % sulfur        0.05 - 2.35     94.74        99.9         2        0.999  9, 11
density            0.86 - 0.9986   98.71        100          7        0.999  1, 11
wt % saturates     33.59 - 70.28   89.99        99.53        1        0.996  1, 11
wt % aromatics     18.51 - 41.33   76.63        99.84        1        0.996  11
wt % resins        6.67 - 26.64    80.33        99.98        1        0.999  9, 11
wt % asphaltenes   0 - 9.3         98.16        99.5         5        0.998  none

scores plot. Samples 11, 17, and 20 lay within the ellipse but were not clearly a part of either of the two groups. Each group had a mixture of samples belonging to the two types of resids. No separation was found based on the resid types; that is, the two distinct groups observed in Figure 3 do not correspond with the resid type.

Partial Least-Squares Analysis. PLS regression was performed using PLS Toolbox 3.0 in Matlab 6.5 to correlate the NIR spectra with the measured properties. Separate regression analysis was performed for each property with the same set of NIR spectra. Different combinations of preprocessing techniques and first or second derivatives were tried for each model, and the best combination based on predictive capability and other factors was used for each model. The NIR spectral dataset was used as the X data, and each measured property set was used as the response for the PLS regressions. The calibration range of each model, the variance explained in the X and Y data, the R² value, the number of latent variables (LVs) used for developing the models, and the samples excluded as potential outliers are presented in Table 2. The number of LVs for all the models was chosen based on the minimum value of root mean squared error of cross validation (RMSECV) in the RMSECV plots. The RMSECV plots used for developing the CCR and the weight percent saturates models are presented in Figures 4 and 5. These plots are not presented for the other properties. Independent NIR spectral scans of the samples present in the calibration set were used for cross validation, in addition to the two samples that were initially set aside. The independent measurements were referred to as the internal cross validation, and the two samples (not included for modeling) were referred to as the external cross validation. The actual versus predicted plots for all the models are presented in Figures 6-15.

Choice of LVs.
The choice of the number of LVs for a model, as mentioned earlier, is based on the minima of the RMSECV plot of each model. However, this is not strictly the case.
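The RMSECV criterion mentioned above can be written out as a short Python sketch. The measured values and the cross-validated predictions below are invented for illustration, and the helper names `rmsecv` and `rmsecv_curve` are hypothetical; in the study these quantities came from PLS Toolbox. The sketch computes the RMSECV for each candidate number of LVs and reports where the curve reaches its minimum, which serves only as the starting point for the choice.

```python
import math


def rmsecv(measured, cv_predicted):
    """Root mean squared error of cross validation for one model size."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, cv_predicted)) / n)


def rmsecv_curve(measured, cv_predictions_by_lv):
    """Map each number of LVs to its RMSECV."""
    return {n_lv: rmsecv(measured, predicted)
            for n_lv, predicted in cv_predictions_by_lv.items()}


# Hypothetical measured property values for four samples.
measured = [5.0, 10.0, 15.0, 2.0]
# Hypothetical cross-validated predictions, keyed by the number of LVs in the model.
cv_predictions_by_lv = {
    1: [6.0, 9.0, 13.0, 3.0],
    2: [5.5, 10.5, 14.5, 2.5],
    3: [5.5, 9.0, 15.0, 2.5],
}

curve = rmsecv_curve(measured, cv_predictions_by_lv)
# Number of LVs at the minimum of the RMSECV curve.
lv_at_minimum = min(curve, key=curve.get)
print("RMSECV by number of LVs:", curve)
print("minimum at", lv_at_minimum, "LVs")
```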

Discretion may be used to reduce the number of factors compared to the proposed minima from the RMSECV plot. For instance, the RMSECV attains a minimum value at seven LVs in Figure 4, for the modeling of weight percent CCR. However, only four LVs were chosen to develop the CCR model. This conclusion was arrived at by carefully testing the model in the following manner. Different models were developed using a different number of LVs ranging from seven to three. The linear fit for calibration, the root mean squared error of prediction (RMSEP), the potential outliers, etc., were tested for each model. When the number of LVs was reduced to three, it was observed that the prediction errors were much higher and the calibration had a poor linear fit (