

Fossil Fuels

Rapid Determination of Physical and Chemical Parameters of Reformed Gasoline by NIR Combined with Monte Carlo Virtual Spectrum Identification Method Jingyan Li, and Xiaoli Chu Energy Fuels, Just Accepted Manuscript • Publication Date (Web): 04 Jun 2018 Downloaded from http://pubs.acs.org on June 4, 2018



Energy & Fuels

Rapid Determination of Physical and Chemical Parameters of Reformed Gasoline by NIR Combined with Monte Carlo Virtual Spectrum Identification Method

Jingyan Li, Xiaoli Chu*

(Research Institute of Petroleum Processing, SINOPEC, Beijing 100083, China)

ABSTRACT: A fast analytical tool based on near-infrared (NIR) spectroscopy and a Monte Carlo virtual spectrum identification method is presented for the rapid prediction of key gasoline properties. A total of 542 reformed gasoline samples were collected to establish an NIR spectral database for the determination of research octane number and hydrocarbon groups. The prediction accuracy of the Monte Carlo method is slightly lower than that of PLS, but the method does not require modeling or model maintenance. The results show that the standard deviations of prediction for paraffin, isoparaffin, olefins, aromatics, naphthenes and octane number are 0.31%, 0.47%, 0.21%, 0.43%, 0.67% and 0.25, respectively, which meets the requirements of fast assessment. This method can significantly reduce the maintenance effort of traditional multivariate calibration.

KEYWORDS: NIR; reformed gasoline; identification; database; chemometric

* Corresponding author. E-mail address: [email protected] (X.-L. Chu)

1 INTRODUCTION

Reformed gasoline (reformate) is a premium blending stock for high-octane gasoline. Research octane number (RON) and chemical composition are its critical parameters. Although well-developed ASTM methods exist for both, the procedures are time consuming, require expensive and maintenance-intensive equipment, and are not well suited to on-line determination. More effective and efficient analytical technologies are therefore needed to provide timely data for gasoline blending and real-time optimization of the reforming unit.

Compared with conventional methods, near-infrared (NIR) spectroscopy based on multivariate calibration presents many advantages in petroleum analysis, such as high speed, low cost, efficiency and non-destructive measurement[1,2]. In recent years, NIR combined with chemometric algorithms has been widely used for in-line or on-line monitoring and control of refinery processes, such as crude distillation, catalytic cracking, naphtha cracking and gasoline blending[3,4]. Many critical parameters of petroleum and petroleum products have been predicted accurately and precisely, including octane number, cetane number, distillation properties and chemical composition[5-9].

The establishment and maintenance of robust calibration models is at the core of the NIR method. Partial least squares (PLS) is the most popular method for building multivariate calibrations[10,11]. Building and maintaining PLS calibrations takes considerable time and effort and requires specialized expertise, because several key parameters for PLS regression must be chosen optimally. Furthermore, each property requires at least one PLS model, and the maintenance of such a large number of PLS models has been the bottleneck for the wide implementation of NIR in refineries.

Based on pattern recognition, several alternative non-parametric modeling methods have been proposed[12,13]. These methods rest on the concept that two samples with the same spectra have the same chemical and physical properties. Their “Add sample to spectral library” facility allows users to update the models easily without specific expertise. Among them, the moving window correlation coefficient method is an effective one; it has been used to obtain detailed assay data for an unknown crude oil from its NIR or MIR spectrum, provided that the assay database and spectral library contain the same crude oil[14,15].

Unlike crude oils, there are countless reformed gasoline samples. It is hard to find identical reformed gasoline samples because the same feed can produce many different reformed gasolines under different reaction conditions. Therefore, the moving window correlation coefficient method described above is not directly suited to reformed gasoline samples. In this paper, a strategy is proposed in which thousands of virtual samples are generated for an unknown sample from spectral library samples with the Monte Carlo method. Combined with the moving window correlation coefficient method, virtual samples identical to the unknown sample can be recognized, and the corresponding properties can be calculated rapidly and accurately.

2 CHEMOMETRIC METHODS

In this study, partial least squares and the moving window correlation coefficient method were used for modeling. Before calibration, wavelength selection and spectral pretreatment were applied in order to obtain an optimal calibration model. The wavelength range selection is based on chemical knowledge and the correlation level between properties and NIR absorbance. The spectral pretreatment methods used in this research include mean centering and first and second derivatives.

2.1 PLS algorithm

The PLS regression algorithm was employed to build quantitative calibration models for the RON and hydrocarbon composition of gasoline. The PLS process has two steps, calibration and validation. Calibration performance is assessed by the root mean squared error of cross-validation (RMSECV) on the calibration set, and the root mean squared error of prediction (RMSEP) on the validation set evaluates the prediction performance of the resulting quantitative model. They are calculated by:

$$\mathrm{RMSECV} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_{i,\mathrm{actual}} - y_{i,\mathrm{predicted}} \right)^{2}}{n-1}} \qquad (1)$$

$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{m} \left( y_{i,\mathrm{actual}} - y_{i,\mathrm{predicted}} \right)^{2}}{m-1}} \qquad (2)$$

where n and m are the numbers of samples in the calibration and validation sets, and $y_{i,\mathrm{actual}}$ and $y_{i,\mathrm{predicted}}$ are the RON or hydrocarbon composition values measured by the reference method and predicted by the NIR method, respectively.
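The error measure of Eqs. (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Matlab code: the function name `rmse_n_minus_1` and the three RON values are hypothetical, and a square root over the (n − 1)-normalized sum is assumed, as the name "root mean squared error" implies.

```python
import math

def rmse_n_minus_1(actual, predicted):
    """Root mean squared error with an (n - 1) denominator, as in Eqs. (1) and (2)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) < 2:
        raise ValueError("at least two samples are needed for an (n - 1) denominator")
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared_error / (len(actual) - 1))

# Hypothetical RON values, for illustration only (not data from this study).
reference_ron = [100.0, 101.0, 102.0]
nir_ron = [100.5, 100.5, 102.0]
rmsep = rmse_n_minus_1(reference_ron, nir_ron)
print(rmsep)  # 0.5
```

The same function serves for RMSECV when `predicted` holds the cross-validated predictions of the calibration samples.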

2.2 Moving window correlation coefficient method

The traditional correlation coefficient method is often used to compare the similarity of two spectra over all spectral variables or over selected spectral regions. It can be expressed as follows:

$$R_{ij} = \frac{\sum_{k=1}^{n} (x_{ik} - \bar{x}_{i})(x_{jk} - \bar{x}_{j})}{\sqrt{\sum_{k=1}^{n} (x_{ik} - \bar{x}_{i})^{2} \, \sum_{k=1}^{n} (x_{jk} - \bar{x}_{j})^{2}}} \qquad (3)$$

where $\bar{x}_{i}$ is the mean absorbance of spectrum i, $\bar{x}_{j}$ is the mean absorbance of spectrum j, and n is the number of wavelength channels. A correlation coefficient of 1 indicates that the two spectra are identical. The higher the correlation coefficient, the more similar the spectra.

The moving window correlation coefficient (MWCC) method selects a spectral window that starts at the kth spectral channel and ends at the (k+w−1)th spectral channel (the window width is w). The window is moved successively along the equally spaced spectral data to construct a series of moving windows (a total of n−w+1 windows), and the correlation coefficient in each moving window is calculated using formula (3). The moving window correlation coefficients of the first (w−1)/2 and the last (w−1)/2 data points cannot be calculated in the process. A vector of correlation coefficients with n−w elements is obtained by this method. If the two spectra compared are identical, the elements of the correlation coefficient vector are all equal to 1. If the two spectra differ in a certain wavelength region, the moving window correlation coefficients in this region drop markedly. Compared with the single correlation coefficient obtained over the full spectral region, this method provides more detailed information about two very similar spectra with very subtle differences, which gives more accurate search results.

For the MWCC method, the window width is a critical parameter. A narrow moving window can identify tiny differences between two spectra, but there is a risk that replicate spectra of the same sample collected on different dates are not identified as identical because of spectral measurement errors and instrumental variations. A wide moving window reduces the influence of external test conditions such as temperature and humidity, but two different samples with very subtle differences may not be effectively distinguished. Thus, the window width needs to be optimized in practical applications according to the differences between samples and the spectral measurement conditions. For reformed gasoline, the window width was set to 11 points.

When the NIR spectra of unknown reformed gasoline samples are identified automatically from the NIR spectral library with the MWCC method, one identification parameter is the Q value, calculated as the sum of the squared correlation coefficients R obtained in all moving windows. Based on the statistics of the measurement error of six replicate NIR spectra, the threshold of Q (Qt) for identifying an identical reformed gasoline is given as follows[14]: Qt = n − w − 0.15. Another identification parameter is the MWR of the correlation coefficient in every moving window. The threshold of MWR (MWRt) is set to 0.9990 accordingly. When an unknown reformed gasoline is identified by its NIR spectrum, both identification parameters must be greater than the corresponding thresholds. Otherwise, the unknown sample does not match any reformed gasoline in the NIR spectral library.
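The moving window comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the spectra are synthetic, every full window of w consecutive channels is evaluated (n − w + 1 values, whereas the text counts n − w elements), and MWR is taken to be the smallest window correlation coefficient, which the text does not state explicitly.

```python
import math

def pearson(a, b):
    """Correlation coefficient of two equal-length sequences, as in Eq. (3)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("a window with constant absorbance has no defined correlation")
    return cov / math.sqrt(var_a * var_b)

def moving_window_correlation(spec_a, spec_b, width=11):
    """Correlation coefficient in every window of `width` consecutive channels."""
    if len(spec_a) != len(spec_b):
        raise ValueError("spectra must have the same number of channels")
    if not 2 <= width <= len(spec_a):
        raise ValueError("window width must be between 2 and the number of channels")
    return [pearson(spec_a[k:k + width], spec_b[k:k + width])
            for k in range(len(spec_a) - width + 1)]

def identification_parameters(spec_a, spec_b, width=11):
    """Return (Q, MWR): Q is the sum of squared window correlations.

    MWR is ASSUMED here to be the minimum window correlation.
    """
    r = moving_window_correlation(spec_a, spec_b, width)
    return sum(v * v for v in r), min(r)

# Synthetic 60-channel spectrum and a copy with one small local difference.
spectrum = [math.sin(0.3 * k) + 0.01 * k for k in range(60)]
perturbed = list(spectrum)
perturbed[30] += 0.05

q_same, mwr_same = identification_parameters(spectrum, spectrum)
q_diff, mwr_diff = identification_parameters(spectrum, perturbed)
print(q_same, mwr_same)   # Q equals the number of windows, MWR equals 1 (up to rounding)
print(q_diff, mwr_diff)   # both drop, and only windows containing channel 30 contribute
```

Only the windows that contain the perturbed channel lose correlation, which is the localization property that distinguishes the moving window comparison from a single full-spectrum coefficient.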

2.3 Monte Carlo Virtual Spectrum Identification Method

For an unknown sample, if the MWCC method cannot find the same sample in the NIR spectral library, virtual samples are generated by the Monte Carlo method from the library samples around the unknown sample. The Monte Carlo virtual spectrum identification (MCVS) method proceeds as follows. First, the MWCC method is used to identify the N library samples that are most similar to the unknown reformed gasoline sample. Second, M virtual samples are generated around the N samples by the Monte Carlo method, and the corresponding chemical and physical properties are calculated. Finally, the virtual reformed gasoline samples that are identical to the unknown sample, i.e. those whose identification parameters Q and MWR both exceed their thresholds, are identified with the MWCC method. The properties and hydrocarbon composition of the unknown reformed gasoline are then obtained from the identified virtual samples.
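The three steps can be sketched end to end as below. This is a hedged illustration rather than the authors' program. The names `q_and_mwr`, `mcvs_predict` and `mixture` are hypothetical and the two-component "library" is synthetic. Ranking the neighbours by Q and taking MWR as the minimum window correlation are both assumptions. The virtual samples are built as random convex combinations of the neighbours, following the linear addition with column-normalized uniform random numbers described in Section 4.3, and the prediction is the average property of the accepted virtual samples.

```python
import math
import random

def q_and_mwr(a, b, width):
    """Q (sum of squared window correlations) and MWR (ASSUMED: their minimum)."""
    r = []
    for k in range(len(a) - width + 1):
        wa, wb = a[k:k + width], b[k:k + width]
        ma, mb = sum(wa) / width, sum(wb) / width
        cov = sum((x - ma) * (y - mb) for x, y in zip(wa, wb))
        va = sum((x - ma) ** 2 for x in wa)
        vb = sum((y - mb) ** 2 for y in wb)
        r.append(cov / math.sqrt(va * vb))
    return sum(v * v for v in r), min(r)

def mcvs_predict(unknown, library, n_nearest, m_virtual, width, q_t, mwr_t, rng):
    """library is a list of (spectrum, property) pairs.

    Returns the mean property of the accepted virtual samples, or None if
    no virtual sample passes both thresholds.
    """
    # Step 1: keep the N library samples most similar to the unknown (largest Q).
    nearest = sorted(library, key=lambda s: q_and_mwr(unknown, s[0], width)[0],
                     reverse=True)[:n_nearest]
    accepted = []
    for _ in range(m_virtual):
        # Step 2: one virtual sample as a random convex combination of the neighbours.
        raw = [rng.random() for _ in nearest]
        total = sum(raw)
        weights = [u / total for u in raw]
        spectrum = [sum(w * s[0][k] for w, s in zip(weights, nearest))
                    for k in range(len(unknown))]
        prop = sum(w * s[1] for w, s in zip(weights, nearest))
        # Step 3: accept the virtual sample only if it matches the unknown.
        q, mwr = q_and_mwr(unknown, spectrum, width)
        if q > q_t and mwr > mwr_t:
            accepted.append(prop)
    return sum(accepted) / len(accepted) if accepted else None

# Synthetic two-component library, for illustration only (not data from the study).
CHANNELS = 40

def mixture(f):
    """Spectrum of a hypothetical blend with fraction f of component 1."""
    return [f * math.sin(0.5 * k) + (1.0 - f) * math.cos(0.5 * k)
            for k in range(CHANNELS)]

library = [(mixture(f), 90.0 + 10.0 * f) for f in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)]
unknown = mixture(0.5)   # not in the library; its property would be 95.0
width = 11
q_t = (CHANNELS - width + 1) - 0.15   # number of windows minus 0.15
prediction = mcvs_predict(unknown, library, n_nearest=4, m_virtual=500,
                          width=width, q_t=q_t, mwr_t=0.9990, rng=random.Random(0))
print(prediction)
```

In this toy case no library sample matches the unknown blend, yet virtual samples close to it are found among the random combinations, so the averaged property lands near the value of the unknown.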

3 EXPERIMENTAL

3.1 Reformed gasoline spectral library

A total of 542 reformed gasoline samples were collected by the Research Institute of Petroleum Processing (RIPP) from November 2009 to August 2016. All samples were sealed in 20 mL vials and stored in a freezer at 0 °C. The reformed gasoline spectral library was constructed from these samples. The physical and chemical properties of each sample, including n-paraffin, i-paraffin, olefin, naphthene, aromatics and research octane number (RON), were determined according to standard analytical methods by the Oil Analysis Department of RIPP. The octane number of gasoline was determined on a bench test machine, and the hydrocarbon group composition was determined by chromatography. Statistics on the properties and composition are shown in Table 1.

Table 1. Statistics of RON and compositions of reformed gasoline in spectral library

Statistics   iP/%    nP/%    O/%    N/%    A/%     RON
Min.         3.02    5.51    0.36   0.39   58.08   93.7
Max.         12.89   26.52   2.30   5.50   89.91   105.3
Mean         7.14    14.76   0.88   1.05   75.84   100.5
SD           2.09    4.27    0.29   0.62   6.59    2.4

iP: i-paraffin; nP: n-paraffin; O: olefin; N: naphthene; A: aromatics

3.2 Near infrared spectroscopy

NIR spectra of gasoline were collected with a Thermo Antaris II FT-NIR spectrometer equipped with a temperature-controlled transmission accessory. Samples were loaded in 0.7 mL cylindrical disposable glass vials with plugs, and the optical path was 6.5 mm. The near-infrared transmission spectra were recorded in the range of 10000 to 6200 cm−1. The resolution of the collected spectra was 8 cm−1, and each spectrum was an accumulation of 128 scans. A background spectrum of the empty cuvette was collected before acquiring each gasoline sample spectrum. The spectral files were converted to ASCII format. The raw NIR absorbance spectra of gasoline in the range of 10000 to 6200 cm−1 are shown in Figure 1.

Figure 1. Raw NIR absorbance spectra of gasoline samples in the range of 10000 to 6200 cm−1 (y-axis: absorbance/AU; x-axis: wavenumber/cm−1)

3.3 Data processing and the NIR spectral library

The raw NIR spectra of all reformed gasoline samples were converted to second derivative spectra in order to enhance the spectral features and reduce baseline variation. The gasoline NIR spectral library was built from the second derivative spectra of the 542 gasoline samples, which are shown in Figure 2.

To establish a robust calibration model, the spectral regions related to the specific properties should be chosen, because spectral intervals that lack significant absorption characteristics or overlap with the absorption peaks of other functional groups reduce the prediction ability of the calibration model. Choosing the most useful spectral information not only improves the prediction ability of the model but also eliminates weak spectral regions, reduces the amount of spectral data and improves the calculation speed. Based on chemical knowledge and the correlation level between properties and NIR absorbance, the wavenumber ranges selected to build the reformed gasoline NIR spectral library are 7400 to 7000 cm−1 and 8600 to 8100 cm−1.

All data processing programs used in this article, including the spectral pretreatment methods, the Monte Carlo method and the moving window correlation coefficient method, were written in Matlab.
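The pretreatment described in this section can be sketched as below. The article does not state how the second derivative was computed, so the three-point central difference used here is only a stand-in (smoothing derivative filters are common in practice). The function names and the 4 cm−1 grid are hypothetical, and only the two wavenumber regions come from the text.

```python
import math

def second_derivative(absorbance):
    """Three-point central second difference; the two end points are dropped."""
    return [absorbance[k - 1] - 2.0 * absorbance[k] + absorbance[k + 1]
            for k in range(1, len(absorbance) - 1)]

def select_regions(wavenumbers, values, regions):
    """Keep the values whose wavenumber lies inside any (low, high) region."""
    return [v for wn, v in zip(wavenumbers, values)
            if any(low <= wn <= high for low, high in regions)]

# Regions selected in the text, in cm-1.
REGIONS = [(7000.0, 7400.0), (8100.0, 8600.0)]

# Hypothetical grid from 10000 down to 6200 cm-1 in steps of 4 cm-1.
wavenumbers = [10000.0 - 4.0 * k for k in range(951)]

# Synthetic spectrum: sloping baseline plus one absorption band (illustration only).
absorbance = [0.1 + 1e-5 * (10000.0 - wn) + 0.2 * math.exp(-((wn - 8300.0) / 50.0) ** 2)
              for wn in wavenumbers]

derivative = second_derivative(absorbance)
# The derivative has no value for the first and last channel, so trim the axis to match.
library_vector = select_regions(wavenumbers[1:-1], derivative, REGIONS)
print(len(library_vector))  # 227
```

A linear baseline has a zero second difference, which is why the derivative suppresses baseline offset and slope while keeping the band structure.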

Figure 2. Second derivative spectra of reformed gasoline samples (x-axis: wavenumber/cm−1)

4 RESULTS AND DISCUSSION

4.1 Partial least squares (PLS)

The PLS method was used to establish the calibration models. Before model development, all samples were divided into a calibration set and a validation set: 40 reformed gasoline samples were selected from the reformed gasoline library as the validation set, and the remaining 502 samples constituted the calibration set. The number of principal factors was determined by PRESS in cross-validation. The models were evaluated by the root mean squared error of cross-validation (RMSECV) and the root mean squared error of prediction (RMSEP). The RMSECVs of paraffin, isoparaffin, olefins, aromatics, naphthenes and RON by PLS for reformed gasoline are 0.23%, 0.39%, 0.10%, 0.23%, 0.52% and 0.22, respectively. Table 2 presents the prediction results for the validation set samples. It can be seen from Table 2 that the RMSECV and RMSEP are basically consistent, which indicates that the octane number and hydrocarbon composition of reformed gasoline samples can be analyzed quickly and accurately by the PLS method.

Table 2. The RMSEP of different chemometric methods

Method   iP/%   nP/%   O/%    N/%    A/%    RON
PLS      0.27   0.43   0.15   0.28   0.55   0.24
MWCC     1.39   3.87   0.38   1.09   6.08   2.73
MCVS     0.31   0.47   0.21   0.43   0.67   0.25

iP: i-paraffin; nP: n-paraffin; O: olefin; N: naphthene; A: aromatics

The validation results are shown in Figure 3, which plots the NIR predictions against the reference values for the research octane number and hydrocarbon composition; almost all points fall on or close to the unity line. The PLS prediction models for the research octane number and hydrocarbon composition are good, with RMSEP values more or less the same as the given experimental uncertainty. A few samples show large deviations because of underfitting in Figure 3 (c) and (d), which indicates that some unknown samples differ from the calibration set samples.

Figure 3. Scatter plots showing correlations between NIR prediction values and reference values for several physico-chemical properties for reformed gasoline on PLS models (panels (a) to (f): measured versus predicted iP, nP, O, N and A in %, and RON)

4.2 MWCC method

Based on the reformed gasoline NIR spectral library, the MWCC method was used to find library samples that are highly similar to an unknown sample and thereby provide accurate prediction results. Table 2 presents the prediction results for the validation set samples. The thresholds of Q and MWR were given in Section 2.2: Qt is 212.85 and MWRt is 0.9990. If the identification parameters exceed the corresponding thresholds, correct prediction results are given. If the identification parameters are lower than the corresponding thresholds, the data of the most similar library sample are used as the identification result.
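The two-threshold test stated in Section 2.2 can be written as a short check. This is a sketch: the function name `is_identified` is hypothetical, the thresholds are the values quoted above, and the two (Q, MWR) pairs are taken from Table 3 only as sample inputs. The last lines work backwards from Qt = n − w − 0.15 to the implied number of spectral channels, assuming that formula and w = 11.

```python
Q_THRESHOLD = 212.85     # Qt from the text
MWR_THRESHOLD = 0.9990   # MWRt from the text
WINDOW_WIDTH = 11        # w from Section 2.2

def is_identified(q, mwr, q_t=Q_THRESHOLD, mwr_t=MWR_THRESHOLD):
    """True only when both identification parameters exceed their thresholds."""
    return q > q_t and mwr > mwr_t

# Two (Q, MWR) pairs taken from Table 3.
print(is_identified(212.9929, 0.9995))  # True
print(is_identified(212.9747, 0.9982))  # False

# Qt = n - w - 0.15 implies n - w = 213, so n = 224 channels for w = 11.
n_minus_w = round(Q_THRESHOLD + 0.15)
n_channels = n_minus_w + WINDOW_WIDTH
print(n_minus_w, n_channels)  # 213 224
```

The rule is a logical AND, so a high Q alone is not sufficient: the second pair passes the Q test but fails on MWR.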

Table 3. RON prediction results by MWCC

Predicted   Measured   Bias   Q          MWR      Identification result
98.7        98.6       0.1    212.9929   0.9995   TRUE
104.4       103.3      1.1    212.9961   0.9998   TRUE
103.8       100.0      3.8    212.9747   0.9982   FALSE
103.6       101.8      1.8    212.9945   0.9997   TRUE
99.2        101.7      -2.5   212.9804   0.9989   FALSE
103.4       102.8      0.6    212.9876   0.9993   TRUE
105.1       103.3      1.8    212.9899   0.9995   TRUE
102.2       101.2      1.0    212.9862   0.9993   TRUE
104.7       102.4      2.3    212.9973   0.9981   FALSE
104.0       103.7      0.3    212.9942   0.9998   TRUE
95.1        101.6      -6.5   212.9911   0.9986   FALSE
104.5       103.0      1.5    212.9933   0.9996   TRUE
104.3       100.2      4.1    212.9953   0.9986   FALSE
104.7       98.7       6.0    212.9897   0.9982   FALSE
103.7       104.4      -0.7   212.9974   0.9999   TRUE
102.1       101.8      0.3    212.9988   0.9999   TRUE
103.2       103.0      0.2    212.9946   0.9996   TRUE
102.0       100.6      1.4    212.9976   0.9999   TRUE
99.0        98.6       0.4    212.9810   0.9989   TRUE
105.0       101.4      3.6    212.9980   0.9979   FALSE
101.9       101.1      0.8    212.9894   0.9993   TRUE
104.0       98.5       5.5    212.9928   0.9987   FALSE
103.9       103.0      0.9    212.9758   0.9984   TRUE
103.8       103.8      0.0    212.9963   0.9997   TRUE
102.9       104.4      -1.5   212.9916   0.9996   TRUE
102.9       102.3      0.6    212.9949   0.9996   TRUE
102.9       103.3      -0.4   212.9911   0.9994   TRUE
104.0       103.9      0.1    212.9878   0.9994   TRUE
102.9       103.6      -0.7   212.9595   0.9971   TRUE
104.0       102.7      1.3    212.9652   0.9981   TRUE
102.9       102.4      0.5    212.9740   0.9989   TRUE
103.9       103.6      0.3    212.9878   0.9993   TRUE
104.4       103.4      1.0    212.9947   0.9996   TRUE
101.8       101.3      0.5    212.9800   0.9987   TRUE
103.9       101.6      2.3    212.9950   0.9987   FALSE
101.9       104.0      -2.1   212.9973   0.9989   FALSE
103.0       95.1       7.9    212.9972   0.9988   FALSE
103.8       102.5      1.3    212.9938   0.9996   TRUE
95.0        100.1      -5.1   212.9974   0.9988   FALSE
101.7       103.9      -2.2   212.9939   0.9986   FALSE

It can be seen from Table 2 that the RMSEPs of the MWCC method are much larger than those of PLS. Table 3 shows the RON prediction results for the unknown samples in the validation set; the samples with false identification show obvious deviations between the predictions and the reference data. About one-third of the validation samples cannot be recognized successfully by the MWCC method because of the limited number of reformed gasoline samples in the NIR spectral library. To obtain prediction results similar to or better than those of the PLS models while retaining the advantage of minimal model maintenance, the MWCC method should be improved by expanding the number of samples in the spectral library.
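The figures quoted in this paragraph can be checked directly against the Bias and Identification result columns of Table 3. The sketch below is only a consistency check: it applies Eq. (2) with the (m − 1) denominator to the 40 listed biases, counts the falsely identified samples, and compares the mean absolute bias of the two groups. The variable and function names are illustrative.

```python
import math

# (bias = predicted - measured RON, identification result) for the 40 rows of Table 3.
TABLE3 = [
    (0.1, True), (1.1, True), (3.8, False), (1.8, True), (-2.5, False),
    (0.6, True), (1.8, True), (1.0, True), (2.3, False), (0.3, True),
    (-6.5, False), (1.5, True), (4.1, False), (6.0, False), (-0.7, True),
    (0.3, True), (0.2, True), (1.4, True), (0.4, True), (3.6, False),
    (0.8, True), (5.5, False), (0.9, True), (0.0, True), (-1.5, True),
    (0.6, True), (-0.4, True), (0.1, True), (-0.7, True), (1.3, True),
    (0.5, True), (0.3, True), (1.0, True), (0.5, True), (2.3, False),
    (-2.1, False), (7.9, False), (1.3, True), (-5.1, False), (-2.2, False),
]

def mean_abs(values):
    """Mean absolute value of a list of biases."""
    return sum(abs(v) for v in values) / len(values)

biases = [b for b, _ in TABLE3]
rmsep = math.sqrt(sum(b * b for b in biases) / (len(biases) - 1))  # Eq. (2)
n_false = sum(1 for _, identified in TABLE3 if not identified)
mae_true = mean_abs([b for b, identified in TABLE3 if identified])
mae_false = mean_abs([b for b, identified in TABLE3 if not identified])

print(f"RMSEP = {rmsep:.2f}")                      # RMSEP = 2.73
print(f"unidentified = {n_false}/{len(TABLE3)}")   # unidentified = 13/40
print(f"mean |bias|: identified {mae_true:.2f}, unidentified {mae_false:.2f}")
```

The RMSEP of 2.73 matches the MWCC entry for RON in Table 2, and 13 of 40 samples (about one-third) carry a FALSE identification result, with a clearly larger mean absolute bias than the identified samples.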

4.3 Monte Carlo Virtual Spectrum Identification Method

The main concept of the Monte Carlo virtual spectrum identification (MCVS) method is to improve the identification success rate for unknown samples by using virtual samples generated by the Monte Carlo method from the library samples around the unknown sample.

For an unknown reformed gasoline sample, if the MWCC method fails to find the same sample in the NIR spectral library, the N library samples that are closest to the unknown sample are selected. Because the virtual samples are generated from those N library samples, N is a key parameter of the MCVS method. A larger N generates more diverse virtual samples, and a smaller N generates more similar virtual samples; both situations may lead to poor prediction results. There is an optimal N for a given sample type. Figure 4(a) shows the RMSEP of RON for different values of N for the unknown reformed gasoline samples in the validation set; the best value of N is 10. The spatial distribution of a group of 10 unknown samples and, for each of them, the 10 most similar (nearest) samples identified from the spectral library (calibration set) is shown in Figure 5 in terms of the first three principal component analysis (PCA) scores.

Figure 4. Determination of N and M by using Monte Carlo virtual spectrum identification method: (a) the RMSEP of RON for different values of N; (b) the RMSEP of RON for different values of M

After the 10 nearest samples were chosen for the unknown reformed gasoline sample, a random number matrix (N × M) with a uniform distribution on (0, 1) was produced by a mathematical recursive formula and normalized by column. The M virtual spectra and their chemical and physical properties were calculated with this random number matrix. The number of virtual samples M is another important parameter. In theory, the larger M is, the better the prediction results obtained by the linear addition method. However, a larger M requires more calculation time. Figure 4(b) shows the RMSEP of RON for M from 5000 to 50000. The results indicate that the RMSEP decreases slightly as the number of virtual samples increases. When M is above 10000, the effect of M on the RMSEP is negligible. Therefore, in this study, M was set to 10000.
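The linear addition step can be written out explicitly. The formulation below is one plausible reading of "normalized by column", assuming that each column of the random matrix is scaled to unit sum; the symbols $u_{km}$, $w_{km}$, $\mathbf{x}_k$ and $y_k$ are introduced here for illustration and do not appear in the original text.

```latex
% u_{km}: entry of the N x M random matrix, drawn uniformly on (0, 1)
% w_{km}: column-normalized weight of neighbour k in virtual sample m (ASSUMED: unit column sum)
% x_k, y_k: second derivative spectrum and property of the k-th nearest library sample
\begin{aligned}
w_{km} &= \frac{u_{km}}{\sum_{l=1}^{N} u_{lm}}, \qquad u_{km} \sim U(0, 1), \\
\mathbf{x}^{\mathrm{virt}}_{m} &= \sum_{k=1}^{N} w_{km}\, \mathbf{x}_{k}, \qquad
y^{\mathrm{virt}}_{m} = \sum_{k=1}^{N} w_{km}\, y_{k}, \qquad m = 1, \dots, M .
\end{aligned}
```

Under this reading the weights are non-negative and sum to one, so every virtual property lies between the smallest and largest property values of the N neighbours.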

Figure 5. Spatial distribution of 10 unknown samples and their 10 nearest samples in calibration set (axes: PCA scores PC1, PC2 and PC3; legend: identified calibration samples, unknown samples)

Using the MWCC method, the virtual reformed gasoline samples identical to the unknown sample, i.e. those whose identification parameters Q and MWR both exceed their thresholds, are recognized. The spatial distribution of 10 unknown samples and their corresponding identical virtual samples is shown in Figure 6. The second derivative spectra of an unknown gasoline sample and its nearest virtual sample are shown in Figure 7(a), and the residual spectrum between them is shown in Figure 7(b). It can be seen from Figure 7 that there is almost no difference between the unknown gasoline sample and the nearest virtual sample.

Figure 6. Spatial distribution of 10 unknown samples and their 30 nearest virtual samples (axes: PCA scores PC1 and PC2; legend: virtual samples, unknown samples)

Figure 7. Spectra of an unknown gasoline sample and its nearest virtual sample: (a) second derivative spectra; (b) residual spectrum between the unknown gasoline sample and the nearest virtual sample (x-axis: wavenumber/cm−1)

The chemical and physical properties of all validation set samples, treated as unknown samples, were predicted by the Monte Carlo virtual spectrum identification method. The final prediction for an unknown sample was obtained as the average over all identified virtual samples. The statistical prediction results of the method are shown in Table 2. The RMSEPs of the Monte Carlo virtual spectrum identification method are slightly higher than those of the PLS models, but the differences are very small and the two are basically at the same level. Figure 8 shows the validation results, which indicate that the predictions based on NIR combined with Monte Carlo virtual spectrum identification accord well with the reference values for the research octane number and hydrocarbon composition. The calculation time for one unknown sample is about 2 seconds. Because the virtual samples are generated randomly, the prediction results may differ between runs for the same spectrum of an unknown sample. Five repeated predictions for the same unknown sample are listed in Table 4, and the repeatability of the results is good. The Monte Carlo virtual spectrum identification method is a qualitative method, so there is no need to establish a complex calibration model. Operators do not need to master modeling and model maintenance knowledge; the model is maintained simply by adding new samples to the calibration set.

Figure 8. Correlations between NIR prediction values and reference values for several physico-chemical properties for gasoline on virtual spectrum identification method (panels (a) to (f): measured versus predicted iP, nP, O, N and A in %, and RON)

Table 4. The repeatability of Monte Carlo virtual spectrum identification method

Run   iP/%   nP/%    O/%    N/%    A/%     RON
1     6.01   15.60   0.87   1.70   75.73   99.0
2     6.02   15.59   0.85   1.71   75.74   98.9
3     6.02   15.59   0.86   1.73   75.72   99.0
4     6.04   15.62   0.88   1.72   75.65   98.9
5     6.03   15.62   0.86   1.70   75.70   99.0
SD    0.01   0.02    0.01   0.01   0.04    0.1
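The SD row of Table 4 can be reproduced from the five repeated predictions. The short check below assumes the sample standard deviation (n − 1 denominator), which is what `statistics.stdev` computes; the dictionary name `REPEATS` is illustrative.

```python
import statistics

# Five repeated MCVS predictions of the same unknown sample (Table 4).
REPEATS = {
    "iP": [6.01, 6.02, 6.02, 6.04, 6.03],
    "nP": [15.60, 15.59, 15.59, 15.62, 15.62],
    "O": [0.87, 0.85, 0.86, 0.88, 0.86],
    "N": [1.70, 1.71, 1.73, 1.72, 1.70],
    "A": [75.73, 75.74, 75.72, 75.65, 75.70],
    "RON": [99.0, 98.9, 99.0, 98.9, 99.0],
}

# Sample standard deviation of each property over the five runs.
sd = {name: statistics.stdev(values) for name, values in REPEATS.items()}

for name, value in sd.items():
    digits = 1 if name == "RON" else 2   # RON is reported to one decimal place
    print(name, round(value, digits))
```

Rounded to the precision used in the table, the computed values agree with the reported SD row (0.01, 0.02, 0.01, 0.01, 0.04 and 0.1).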

5 CONCLUSION

In this study, a new method for predicting the chemical and physical properties of reformed gasoline, named the Monte Carlo virtual spectrum identification method, was proposed based on near-infrared spectroscopy and the moving window correlation coefficient method. Compared with the classical PLS regression model, this method minimizes model maintenance and does not require specific expertise. The key parameters of the proposed method were discussed, and their optimal values for reformed gasoline were determined. The statistical results indicate that, combined with the moving window correlation coefficient method, unknown samples can be predicted accurately and rapidly using virtual samples generated from the nearest samples around them. The prediction accuracy of the Monte Carlo virtual spectrum identification method is comparable to that of PLS models, and the predictions agree well with the reference values. The RMSEPs of paraffin, isoparaffin, olefins, aromatics, naphthenes and RON by the proposed method for reformed gasoline are 0.31%, 0.47%, 0.21%, 0.43%, 0.67% and 0.25, respectively.

ACKNOWLEDGMENT

This project has been supported by the National Natural Science Foundation of China (Grant No. 21365008).

REFERENCES

(1) Chung, H. Appl. Spectrosc. Rev. 2007, 42 (3), 251–285.
(2) Valleur, M. Pet. Tech. Q. 1999, 4 (4), 81–85.
(3) Lambert, D.; Descales, B.; Bages, S.; Bellet, S.; Llinas, J. R. Hydrocarbon Process. 1995, 74 (74), 103–108.
(4) Singh, A.; Vermeer, P. J.; Woo, S. S.; Forbes, J. F. J. Process Control 2000, 10 (1), 43–58.
(5) Pasquini, C.; Bueno, A. F. Fuel 2007, 86 (12), 1927–1934.
(6) Macho, S.; Larrechi, M. S. Trends Anal. Chem. 2002, 21 (12), 799–806.
(7) Laxalde, J.; Ruckebusch, C.; Devos, O.; Caillol, N.; Wahl, F. Anal. Chim. Acta 2011, 705 (1–2), 227–234.
(8) Iob, A.; Ali, M. A.; Tawabini, B. S.; Anabtawi, J. A.; Ali, S. A. Fuel 1995, 74 (2), 227–231.
(9) Lee, S.; Choi, H.; Cha, K.; Chung, H. Microchem. J. 2013, 110 (9), 739–748.
(10) Geladi, P.; Kowalski, B. R. Anal. Chim. Acta 1986, 185 (86), 1–17.
(11) Wold, S.; Sjostrom, M.; Eriksson, L. Chemom. Intell. Lab. Syst. 2001, 58 (2), 109–130.
(12) Davies, A.; Fearn, T. J. Near Infrared Spectrosc. 2006, 14 (1), 1003–1004.
(13) Dambergs, R.; Cozzolino, D.; Cynkar, W.; Gishen, M. J. Near Infrared Spectrosc. 2006, 14 (1), 71–79.
(14) Chu, X. L.; Xu, Y. P.; Tian, S. B.; Wang, J.; Lu, W. Z. Chemom. Intell. Lab. Syst. 2011, 107 (1), 44–49.
(15) Li, J. Y.; Chu, X. L.; Tian, S. B. Spectrochim. Acta, Part A 2013, 112 (8), 457–462.