Process Systems Engineering

Dynamic Nonlinear PLS Modeling Using Gaussian Process Regression
Hongbin Liu, Chong Yang, Bengt Carlsson, S. Joe Qin, and ChangKyoo Yoo
Ind. Eng. Chem. Res., Just Accepted Manuscript • DOI: 10.1021/acs.iecr.9b00701 • Publication Date (Web): 07 Aug 2019

Industrial & Engineering Chemistry Research

Dynamic Nonlinear PLS Modeling Using Gaussian Process Regression

Hongbin Liu1,3,‡, Chong Yang1,‡, Bengt Carlsson2, S. Joe Qin3*, ChangKyoo Yoo4*

1. Co-Innovation Center of Efficient Processing and Utilization of Forest Resources, Nanjing Forestry University, Nanjing 210037, China
2. Division of Systems and Control, Department of Information Technology, Uppsala University, Uppsala 75105, Sweden
3. Mork Family Department of Chemical Engineering and Materials Science, University of Southern California, Los Angeles, California 90089, United States
4. Department of Environmental Science and Engineering, College of Engineering, Kyung Hee University, Yongin 446701, Korea

Corresponding author: [email protected]

ACS Paragon Plus Environment

ABSTRACT: A dynamic Gaussian process regression based partial least squares (D-GPR-PLS) model is proposed to improve on the estimation ability of conventional nonlinear PLS. Because GPR is well suited to nonlinear process modeling, it is used to build a nonlinear regression between each pair of latent variables in the partial least squares model. In addition, augmented matrices are embedded into the D-GPR-PLS model to obtain better prediction accuracy in nonlinear dynamic processes. To evaluate the modeling performance of the proposed method, two simulated cases and a real industrial case, all based on wastewater treatment processes (WWTPs), are considered. The simulated cases use data from two high-fidelity simulators: benchmark simulation model no. 1 and its long-term version. The industrial case uses data from a real biological wastewater treatment process. The results show the superiority of D-GPR-PLS in modeling performance on both the simulated and the real data sets. More specifically, for the prediction of effluent chemical oxygen demand in the real WWTP data, the root mean square error is decreased by 31%, 16%, and 52% in comparison with linear PLS, quadratic PLS, and least squares support vector machine based PLS, respectively.

KEYWORDS: Dynamic processes; Gaussian process regression; partial least squares; nonlinear process modeling; wastewater treatment processes

1. INTRODUCTION

In recent years, process modeling has gained significant attention in chemical and biological processes as a means of accurately predicting key variables such as product quality and pollutant emission.1,2 As an essential part of multivariate statistical methods, latent variable methods (LVMs) are well suited to the high dimensionality and collinearity of industrial process data: they extract latent variables that are then used to construct regressions between the latent space and the output space.3 Among the different kinds of LVMs, partial least squares (PLS)4 is a powerful one that takes both the variance structure and the correlation between the inputs and the outputs into consideration, and it has been widely used for regression.5,6 However, when the data are strongly nonlinear, the modeling accuracy of conventional linear PLS can decrease significantly.7,8

To handle this problem, nonlinear extensions of PLS have been proposed, which can be roughly divided into two categories.9 The first is kernel-based PLS (KPLS); the second builds a nonlinear regression between each pair of latent variables of PLS. A KPLS model is constructed in two steps. First, the input variables are transformed into a high-dimensional feature space by a nonlinear mapping. Then, a linear PLS model is built between the high-dimensional variables and the output variables.10,11 This work focuses on the second type of nonlinear PLS. The quadratic PLS (QPLS), which replaces the inner relation between input and output latent variables with a quadratic polynomial, was first proposed in 1989.12 Qin and McAvoy13 proposed another approach that applies neural networks to PLS (NN-PLS) modeling to build a nonlinear framework. Subsequently, Bang et al.14 replaced the inner relationship of PLS with a Takagi-Sugeno-Kang fuzzy model and showed that it regresses better than linear PLS. Recently, Lv et al.9 applied the least squares support vector machine to PLS
(LSSVM-PLS) and demonstrated the superiority of error-based LSSVM-PLS over PLS, QPLS, and NN-PLS models in an industrial process modeling case.

According to the studies reviewed above, the estimation capacity of a nonlinear PLS model depends primarily on its inner regression method. Gaussian process regression (GPR), a powerful nonlinear method, can describe nonlinear systems without prior knowledge of kernel functions and quantifies prediction uncertainty through the variance of the estimate.15 In recent years, GPR has been successfully applied to nonlinear modeling in many engineering fields.1,3,6 Jin et al.1 applied online ensemble GPR to a fed-batch chlortetracycline fermentation process to estimate difficult-to-measure variables online. Grbić et al.3 proposed an adaptive mechanism based on a mixture of GPR models and evaluated its efficiency on the Tennessee Eastman process and two real industrial examples. Using composite covariance functions, Liu et al.6 demonstrated that GPR predicts subway indoor air quality data better than PLS, back-propagation artificial neural networks, and LSSVM. Thus, GPR is an appropriate candidate for the inner structure of a PLS model.

Because the aforementioned models do not take the dynamic properties of industrial process data into consideration, it is reasonable to try to improve the modeling performance of any nonlinear PLS by using dynamic techniques. Conventional dynamic methods are usually hybrid models built from different types of dynamic techniques, which mainly include recursive algorithms,16-18 time differences,2,19,20 just-in-time learning,21-23 and augmented matrices.24-27 Among these, the augmented matrices method is simple but powerful. The standard data matrices used in PLS are extended with lagged variables, which capture the time-varying information between process and quality data. This procedure enlarges the matrices involved and hence increases the dimensionality of the basic model; PLS, however, provides an effective dimensionality reduction. For example, Ku24 improved the
monitoring performance of static PCA by integrating augmented matrices with conventional models as early as 1995. Subsequently, this dynamic technique was applied to the non-Gaussian ICA method to reveal more useful information in the Tennessee Eastman process and in indoor air quality data from subway systems.25,26 Serdio et al.28 applied PLS with lags to fault detection for condition monitoring in multi-sensor networks, and their experimental results showed that the model performed best in that study. Liu et al.27 recently combined augmented matrices with concurrent kernel canonical analysis for the comprehensive fault diagnosis of a continuous annealing process.

In this paper, a GPR model is proposed to describe the nonlinearity between each pair of latent variables in PLS. In addition, augmented matrices are embedded into the GPR-based PLS to obtain a dynamic GPR-PLS (D-GPR-PLS) model. The goal is to achieve better prediction performance in the presence of the dynamics and nonlinearities of industrial process data. Section 2 introduces the D-GPR-PLS algorithm in detail. To evaluate the modeling performance of D-GPR-PLS, two cases based on data from a simulated benchmark of a biological process are studied in Section 3. In Section 4, the proposed model is applied to a real industrial wastewater treatment process. In both sections, D-GPR-PLS is compared with several other PLS techniques. Finally, Section 5 gives the conclusions.

2. NONLINEAR MODELING BASED ON GPR-PLS

2.1. GPR Modeling Method. The training data set is assumed to consist of an input matrix $\mathbf{X} \in \mathbb{R}^{n \times m}$ and an output vector $\mathbf{y} \in \mathbb{R}^{n \times 1}$, where m denotes the number of variables and n the number of samples. Let $\mathbf{x}_i$ be the i-th sample (row) of X and $y_i$ the i-th element of y, where $i = 1, 2, \ldots, n$. To obtain a nonlinear model between x and y, a Gaussian process regression model is built such that the regression function $y = f(\mathbf{x})$ has a Gaussian prior distribution with zero mean:15

$$ \mathbf{y} = [f(\mathbf{x}_1), f(\mathbf{x}_2), \ldots, f(\mathbf{x}_n)] \sim GP(0, \mathbf{C}) \tag{1} $$

where C is the covariance matrix, which usually takes the form of the squared-exponential function:1,15

$$ C(\mathbf{x}_i, \mathbf{x}_j) = \sigma_f^2 \exp\left\{ -\frac{1}{2} (\mathbf{x}_i - \mathbf{x}_j)^T \mathbf{M}^{-1} (\mathbf{x}_i - \mathbf{x}_j) \right\} + \delta_{ij} \sigma_n^2 \tag{2} $$

where $\sigma_f^2$ and $\sigma_n^2$ are the signal variance and the noise variance, respectively; $\mathbf{M} = \mathrm{diag}(l^2)$, where l is the kernel width; and $\delta_{ij} = 1$ when $i = j$, otherwise $\delta_{ij} = 0$. Thus, the hyperparameters can be expressed as $\theta = (l, \sigma_f, \sigma_n)$. For the test data $\mathbf{x}_{\mathrm{new}}$, the predicted output $y_{\mathrm{new}}$ also follows a Gaussian distribution, whose mean and variance are

$$ y_{\mathrm{new}} = \mathbf{K}^T(\mathbf{x}_{\mathrm{new}}) \mathbf{C}^{-1} \mathbf{y} \tag{3} $$

$$ \sigma_{\mathrm{new}}^2 = C(\mathbf{x}_{\mathrm{new}}, \mathbf{x}_{\mathrm{new}}) - \mathbf{K}^T(\mathbf{x}_{\mathrm{new}}) \mathbf{C}^{-1} \mathbf{K}(\mathbf{x}_{\mathrm{new}}) \tag{4} $$

where $\mathbf{K}(\mathbf{x}_{\mathrm{new}}) = [C(\mathbf{x}_{\mathrm{new}}, \mathbf{x}_1), C(\mathbf{x}_{\mathrm{new}}, \mathbf{x}_2), \ldots, C(\mathbf{x}_{\mathrm{new}}, \mathbf{x}_n)]^T$.
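The prediction step in equations (2)-(4) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the default hyperparameter values, and the sine test data are assumptions made for the example, and a single kernel width is shared by all input variables. It also assumes that C(x_new, x_new) includes the noise term, since the Kronecker delta equals 1 when both arguments coincide.

```python
import numpy as np

def sq_exp_cov(A, B, length=1.0, sigma_f=1.0):
    """Squared-exponential part of equation (2), without the noise term.

    A is (n_a, m) and B is (n_b, m); the result is (n_a, n_b).
    """
    A = np.asarray(A, dtype=float) / length
    B = np.asarray(B, dtype=float) / length
    # Squared Euclidean distance between every row of A and every row of B.
    d2 = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return sigma_f**2 * np.exp(-0.5 * np.maximum(d2, 0.0))

def gpr_predict(X, y, X_new, length=1.0, sigma_f=1.0, sigma_n=0.1):
    """Predictive mean (equation 3) and variance (equation 4)."""
    n = X.shape[0]
    C = sq_exp_cov(X, X, length, sigma_f) + sigma_n**2 * np.eye(n)
    K = sq_exp_cov(X, X_new, length, sigma_f)      # column k is K(x_new_k)
    mean = K.T @ np.linalg.solve(C, y)
    # Assumption: C(x_new, x_new) = sigma_f^2 + sigma_n^2 (delta_ij = 1 for i = j).
    prior = sigma_f**2 + sigma_n**2
    var = prior - np.sum(K * np.linalg.solve(C, K), axis=0)
    return mean, var

# Hypothetical one-dimensional example: learn sin(x) from 30 samples.
X = np.linspace(-3.0, 3.0, 30)[:, None]
y = np.sin(X).ravel()
mean, var = gpr_predict(X, y, np.array([[0.5]]))
```

Solving the linear system with `np.linalg.solve` avoids forming the explicit inverse of C that appears in the equations, which is the usual numerical choice.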

2.2. Nonlinear GPR-PLS Modeling. PLS aims at finding latent variables (LVs) that capture the data variance and maximize the correlation between input and output variables. It consists of two outer relations and one inner relation. Assume that the input matrix X has the same structure as in Section 2.1 and that the output matrix $\mathbf{Y} \in \mathbb{R}^{n \times p}$ has p columns. The outer models for X and Y are then

$$ \mathbf{X} = \mathbf{t}_1 \mathbf{p}_1^T + \mathbf{t}_2 \mathbf{p}_2^T + \cdots + \mathbf{t}_v \mathbf{p}_v^T + \mathbf{E} = \mathbf{T}\mathbf{P}^T + \mathbf{E} \tag{5} $$

$$ \mathbf{Y} = \mathbf{u}_1 \mathbf{q}_1^T + \mathbf{u}_2 \mathbf{q}_2^T + \cdots + \mathbf{u}_v \mathbf{q}_v^T + \mathbf{F} = \mathbf{U}\mathbf{Q}^T + \mathbf{F} \tag{6} $$

where t and u are latent score vectors, p and q are loading vectors, v is the number of latent variables, and E and F are residuals.

For the linear PLS, the inner model relating the score vectors t and u is expressed by

$$ \mathbf{u} = b\,\mathbf{t} + \mathbf{r} \tag{7} $$

where b is the coefficient determined by minimizing the residual r. For the nonlinear extension of PLS, the inner relationship between the score vectors $\mathbf{t}_g$ and $\mathbf{u}_g$ ($g = 1, 2, \ldots, v$ for v latent variables) can be written as

$$ \mathbf{u}_g = f_g(\mathbf{t}_g) + \mathbf{e}_g \tag{8} $$

where $f_g(\cdot)$ represents the nonlinear function and $\mathbf{e}_g$ is the regression error. Thus, the nonlinear GPR-PLS model can be formulated as the block diagram shown in Figure 1, and the detailed procedure is as follows:

(1) Standardize X and Y.

(2) Assign the initial values of $\mathbf{E}_g$ and $\mathbf{F}_g$ ($g = 1$) using X and Y, respectively.

(3) Select one column of $\mathbf{F}_g$ as $\mathbf{u}_g$.

(4) Perform nonlinear iterative partial least squares (NIPALS):29

$$ \mathbf{w}_g^T = \frac{\mathbf{u}_g^T \mathbf{E}_g}{\mathbf{u}_g^T \mathbf{u}_g} \tag{9} $$

$$ \mathbf{w}_{g0} = \frac{\mathbf{w}_g}{\|\mathbf{w}_g\|} \tag{10} $$

$$ \mathbf{t}_g = \mathbf{E}_g \mathbf{w}_{g0} \tag{11} $$

$$ \mathbf{c}_g^T = \frac{\mathbf{t}_g^T \mathbf{F}_g}{\mathbf{t}_g^T \mathbf{t}_g} \tag{12} $$

$$ \mathbf{c}_{g0} = \frac{\mathbf{c}_g}{\|\mathbf{c}_g\|} \tag{13} $$

$$ \mathbf{u}_g = \mathbf{F}_g \mathbf{c}_{g0} \tag{14} $$

Repeat equations (9)-(14) until $\mathbf{u}_g$ converges.

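(5) Fit a GPR model between the latent variables and obtain the predicted output score:

$$ \hat{\mathbf{u}}_g = f_g(\mathbf{t}_g) \tag{15} $$

(6) Calculate the loading vectors:

$$ \mathbf{p}_g^T = \frac{\mathbf{t}_g^T \mathbf{E}_g}{\mathbf{t}_g^T \mathbf{t}_g} \tag{16} $$

$$ \mathbf{q}_g^T = \frac{\hat{\mathbf{u}}_g^T \mathbf{F}_g}{\hat{\mathbf{u}}_g^T \hat{\mathbf{u}}_g} \tag{17} $$

(7) Deflate $\mathbf{E}_g$ and $\mathbf{F}_g$:

$$ \mathbf{E}_{g+1} = \mathbf{E}_g - \mathbf{t}_g \mathbf{p}_g^T \tag{18} $$

$$ \mathbf{F}_{g+1} = \mathbf{F}_g - \hat{\mathbf{u}}_g \mathbf{q}_g^T \tag{19} $$

(8) Update g ($g = g + 1$), then go back to step (3) until all of the latent variables have been found.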

Figure 1. Inner structure of GPR-PLS.
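The training procedure in steps (2)-(8) can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the inner GPR uses fixed hyperparameters instead of optimized ones, the convergence tolerance is arbitrary, and the data are synthetic and standardized by hand (step 1).

```python
import numpy as np

def fit_gpr_inner(t, u, length=1.0, sigma_f=1.0, sigma_n=0.1):
    """Inner model u = f(t): GPR predictive mean with fixed (assumed) hyperparameters."""
    def cov(a, b):
        return sigma_f**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / length**2)
    alpha = np.linalg.solve(cov(t, t) + sigma_n**2 * np.eye(t.size), u)
    return lambda t_new: cov(np.asarray(t_new, dtype=float), t) @ alpha

def nipals_scores(E, F, tol=1e-10, max_iter=500):
    """Equations (9)-(14): iterate until the output score u stops changing."""
    u = F[:, [0]].copy()                          # step (3): one column of F
    for _ in range(max_iter):
        w = E.T @ u / (u.T @ u)                   # (9)
        w = w / np.linalg.norm(w)                 # (10)
        t = E @ w                                 # (11)
        c = F.T @ t / (t.T @ t)                   # (12)
        c = c / np.linalg.norm(c)                 # (13)
        u_new = F @ c                             # (14)
        converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
        u = u_new
        if converged:
            break
    return w, t, u

def fit_gpr_pls(X, Y, n_lv):
    """Steps (2)-(8): extract n_lv latent variables with a GPR inner relation."""
    E, F = X.copy(), Y.copy()                     # step (2)
    W, P, Q, inner = [], [], [], []
    for _ in range(n_lv):
        w, t, u = nipals_scores(E, F)             # steps (3)-(4)
        f = fit_gpr_inner(t.ravel(), u.ravel())   # step (5)
        u_hat = f(t.ravel())[:, None]             # (15)
        p = E.T @ t / (t.T @ t)                   # (16)
        q = F.T @ u_hat / (u_hat.T @ u_hat)       # (17)
        E = E - t @ p.T                           # (18)
        F = F - u_hat @ q.T                       # (19)
        W.append(w); P.append(p); Q.append(q); inner.append(f)
    return np.hstack(W), np.hstack(P), np.hstack(Q), inner

# Hypothetical data: one nonlinear output driven by two of four inputs.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))
Y = np.tanh(X[:, :1] + 0.5 * X[:, 1:2]) + 0.05 * rng.standard_normal((100, 1))
X = (X - X.mean(axis=0)) / X.std(axis=0)          # step (1)
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
W, P, Q, inner = fit_gpr_pls(X, Y, n_lv=2)
```

Passing a different `fit_*_inner` function (for example a quadratic fit) would give the QPLS variant with the same outer loop, which is why the inner model is kept as a separate function here.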

2.3. Dynamic GPR-PLS Modeling. Based on the input matrix X, the augmented matrix $\mathbf{X}_a$ can be defined as

$$ \mathbf{X}_a = [\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_m] \tag{20} $$

in which

$$ \mathbf{X}_i = \begin{bmatrix} x_i(r) & x_i(r-1) & \cdots & x_i(r-a+1) \\ x_i(r+1) & x_i(r) & \cdots & x_i(r-a+2) \\ \vdots & \vdots & \ddots & \vdots \\ x_i(r+n-1) & x_i(r+n-2) & \cdots & x_i(r-a+n) \end{bmatrix} \tag{21} $$

where $i = 1, 2, \ldots, m$, r is one of the sample times, and a is the number of lags for the process variables. The lagged values added by this data expansion allow the dynamic features of X to be captured efficiently. A suitable number of lags can be selected by the method proposed in reference 24, and experience shows that a lag of 1 or 2 is usually suitable for the dynamic extension of multivariate statistical methods. After replacing the input matrix (X) with the augmented one ($\mathbf{X}_a$) in Figure 1, the dynamic GPR-PLS model can be constructed. Although $\mathbf{X}_a$ can have many columns, the optimal number of latent variables remains small. Thus, the higher dimensionality of $\mathbf{X}_a$ is not an obstacle to the inner relation of PLS modeling.
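The data expansion of equations (20) and (21) amounts to stacking delayed copies of each variable side by side. A minimal sketch, assuming the time series is stored row-wise in a NumPy array and using a hypothetical function name; following equation (21), each block holds a columns, namely x_i(r), x_i(r-1), ..., x_i(r-a+1):

```python
import numpy as np

def augment_with_lags(X, a):
    """Build X_a = [X_1, ..., X_m] from an (N, m) array of N consecutive samples.

    Block X_i has a columns: variable i delayed by 0, 1, ..., a-1 samples.
    The first a-1 samples are dropped because their lagged values are missing.
    """
    X = np.asarray(X, dtype=float)
    N, m = X.shape
    if not 1 <= a <= N:
        raise ValueError("a must satisfy 1 <= a <= number of samples")
    rows = N - a + 1
    blocks = []
    for i in range(m):
        # Column j of the block is variable i delayed by j samples.
        cols = [X[a - 1 - j : a - 1 - j + rows, i] for j in range(a)]
        blocks.append(np.column_stack(cols))
    return np.hstack(blocks)

# Five samples of two variables, expanded with a = 2.
X_demo = np.arange(10.0).reshape(5, 2)
Xa_demo = augment_with_lags(X_demo, 2)
```

Because the first a-1 rows are lost, the output matrix has to be trimmed in the same way (for example `Y[a - 1:]`) so that inputs and outputs stay aligned.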

2.4. Implementation Procedure of D-GPR-PLS Prediction. After the D-GPR-PLS model has been trained, the projection vectors (corresponding to v latent variables) can be collected in matrices, i.e., $\mathbf{W}_g = [\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_v]$, $\mathbf{P}_g = [\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_v]$, and $\mathbf{Q}_g = [\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_v]$. Then, for the test input matrix $\mathbf{X}_{\mathrm{new}}$, the predicted matrix $\hat{\mathbf{Y}}$ can be calculated as follows:

(1) Expand the input data with the lagged values to obtain $\mathbf{X}_{\mathrm{new},a}$.

(2) Standardize the augmented test matrix $\mathbf{X}_{\mathrm{new},a}$.

(3) Calculate the input score matrix of the test data:

$$ \mathbf{T}_{\mathrm{new}g} = \mathbf{X}_{\mathrm{new},a} \mathbf{W}_g (\mathbf{P}_g^T \mathbf{W}_g)^{-1} \tag{22} $$

(4) Predict each output score using the GPR model:

$$ \hat{\mathbf{u}}_{\mathrm{new}g} = f_g(\mathbf{t}_{\mathrm{new}g}), \quad g = 1, 2, \ldots, v \tag{23} $$

where $\mathbf{t}_{\mathrm{new}g}$ is a column of $\mathbf{T}_{\mathrm{new}g}$.

(5) Predict $\hat{\mathbf{Y}}$:

$$ \hat{\mathbf{Y}} = \hat{\mathbf{U}}_{\mathrm{new}g} \mathbf{Q}_g^T \tag{24} $$

where $\hat{\mathbf{U}}_{\mathrm{new}g} = [\hat{\mathbf{u}}_{\mathrm{new}1}, \hat{\mathbf{u}}_{\mathrm{new}2}, \ldots, \hat{\mathbf{u}}_{\mathrm{new}v}]$.

(6) Rescale $\hat{\mathbf{Y}}$ according to the mean and variance of the training data Y.

(7) Measure the prediction accuracy using the root mean square error (RMSE) and the coefficient of determination ($R^2$), based on the measured variable y and the predicted variable $\hat{y}$:

$$ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}} \tag{25} $$

$$ R^2 = 1 - \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{26} $$

where $\bar{y}$ is the mean value of y, n denotes the number of test samples, and $\hat{y}_i$ and $y_i$ are the predicted and measured values, respectively.
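The score projection of equation (22) and the two accuracy measures of equations (25) and (26) translate directly into NumPy. The sketch below is illustrative only; the function names and the four-sample example are assumptions made for this illustration.

```python
import numpy as np

def project_scores(X_new_a, W, P):
    """Equation (22): T_new = X_new_a W (P^T W)^{-1}."""
    return X_new_a @ W @ np.linalg.inv(P.T @ W)

def rmse(y, y_hat):
    """Equation (25): root mean square error over the test samples."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

def r_squared(y, y_hat):
    """Equation (26): coefficient of determination."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y_hat - y) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical measured and predicted values.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
err = rmse(y_true, y_pred)       # sqrt(0.10 / 4)
fit = r_squared(y_true, y_pred)  # 1 - 0.10 / 5.0
```

Both measures are computed on the rescaled predictions of step (6), so that RMSE is expressed in the physical units of the output variable.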

3. MODELING USING DATA FROM BSM1

3.1. BSM1 Data Set. The benchmark simulation model no. 1 (BSM1), a widely used simulation platform for testing new control strategies30 and new monitoring methods31 in WWTPs, was used in this work to evaluate the modeling performance of D-GPR-PLS. To make a fair comparison, PLS, DPLS, QPLS, D-QPLS, LSSVM-PLS, D-LSSVM-PLS, and GPR-PLS were also used for the modeling. BSM1 provides a platform with diverse
control strategies under independent simulation environments: dry weather, rainy weather (dry weather during the first week and a long rain event during the second), and storm weather (dry weather in the first week and two storms in the second). Figure 2 illustrates the block diagram of the BSM1 simulation benchmark, which is composed of five activated sludge reactors and a secondary clarifier. To control the concentration of effluent nitrate, two anoxic reactors for predenitrification and three aerobic reactors for nitrification are arranged in the order shown in the diagram. Further information on the BSM1 benchmark can be found in reference 32.

Figure 2. Process layout for the BSM1 simulation benchmark.

Ten variables from BSM1 were used for the modeling study. The eight input variables are the influent flow rate, the influent ammonia concentration, the nitrate concentration of the second reactor, the dissolved oxygen concentrations of the third and fourth reactors, the total suspended solids concentration of the fourth reactor, the oxygen transfer coefficient of the fifth reactor, and the internal recycle rate. The two output variables are the effluent ammonia concentration (SNHeff) and the effluent nitrate concentration (SNOeff). In this case, 1345 samples were collected at 15 min intervals over 14 days of dry weather; the first 672 samples were used for training the model and the remaining 673 samples for testing.

3.2. Results and Discussion in terms of BSM1. The number of latent variables for PLS can be determined by analyzing the variance captured by each LV. As listed in Table 1, the variances captured by the third LV are 15.64% and 11.60% for the input and output data,
respectively, whereas the variances extracted by the fourth LV are just 1.67% and 6.25%. The cumulative variance clearly does not increase significantly after the third LV. Meanwhile, the cumulative variances of the first three LVs (95.15% for input variables and 84.73% for output variables) are large enough to represent the information in the data. Therefore, three is a reasonable number of LVs for PLS modeling. Because the dynamic technique and the different inner-loop functions of the nonlinear PLS models affect this choice, the optimal number of LVs for each of the other models is tuned in the same way.

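Table 1. Variance of Latent Variables in PLS

| LV | Input data (X): Variance (%) | Input data (X): Accumulative variance (%) | Output data (Y): Variance (%) | Output data (Y): Accumulative variance (%) |
|---|---|---|---|---|
| 1 | 66.1 | 66.1 | 58.0 | 58.0 |
| 2 | 13.4 | 79.5 | 15.2 | 73.2 |
| 3 | 15.6 | 95.1 | 11.6 | 84.8 |
| 4 | 1.7 | 96.8 | 6.3 | 91.1 |
| 5 | 1.6 | 98.4 | 1.4 | 92.5 |
| 6 | 0.9 | 99.3 | 1.2 | 93.7 |
| 7 | 0.6 | 99.9 | 0.7 | 94.4 |
| 8 | 0.1 | 100.0 | 0.6 | 95.0 |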
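The accumulative columns of Table 1 are running sums of the per-LV variances, and the choice of three LVs can be reproduced with a simple cutoff rule. The sketch below uses the Table 1 values; the 95% and 80% thresholds and the function name are assumptions for illustration, since the text selects the number of LVs by inspecting where the cumulative variance levels off rather than by a fixed threshold.

```python
import numpy as np

# Per-LV captured variance (%) from Table 1.
var_x = np.array([66.1, 13.4, 15.6, 1.7, 1.6, 0.9, 0.6, 0.1])
var_y = np.array([58.0, 15.2, 11.6, 6.3, 1.4, 1.2, 0.7, 0.6])

cum_x = np.cumsum(var_x)   # accumulative variance for the input data
cum_y = np.cumsum(var_y)   # accumulative variance for the output data

def n_lvs_for(cum, threshold):
    """Smallest number of LVs whose cumulative variance (%) reaches the threshold."""
    reached = np.nonzero(cum >= threshold)[0]
    if reached.size == 0:
        raise ValueError("threshold is never reached")
    return int(reached[0]) + 1

n_x = n_lvs_for(cum_x, 95.0)   # hypothetical cutoff for X
n_y = n_lvs_for(cum_y, 80.0)   # hypothetical cutoff for Y
```

With these cutoffs both data blocks point to three latent variables, in line with the choice made in the text.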

For the nonlinear part of PLS, the parameters of LSSVM are selected by grid search and ten-fold cross validation. For the GPR part of the PLS model, the squared-exponential covariance function in equation (2) is used and rewritten as

$$ C(\mathbf{x}_i, \mathbf{x}_j) = \exp\left\{ p_f - \frac{1}{2} (\mathbf{x}_i - \mathbf{x}_j)^T \mathbf{M}^{-1} (\mathbf{x}_i - \mathbf{x}_j) \right\} + \delta_{ij} \sigma_n^2 \tag{27} $$

where $p_f = 2\ln(\sigma_f)$ is a parameter derived from the signal variance $\sigma_f^2$, so that $\exp(p_f) = \sigma_f^2$. In the hyperparameter set of GPR, both the kernel width l and the noise parameter $\sigma_n$ are set to 1. The parameter $p_f$ is then determined by scaled conjugate gradient optimization. In addition, the number of lags is set to 2 for all dynamic methods. Table 2 lists the optimal numbers of LVs for the eight types of models and their prediction results after tuning. The other parameters of each inner relation in the D-LSSVM-PLS and D-GPR-PLS models can be found in Table S1 of the Supporting Information (SI) file.

Figure 3 shows the score plots of the first latent variable in the training phase of the linear PLS, QPLS, LSSVM-PLS, and GPR-PLS models. According to Figure 3b, 3c, and 3d, the QPLS, LSSVM-PLS, and GPR-PLS models represent the nonlinearity to different degrees, whereas the linear PLS model in Figure 3a fits poorly. The nonlinear models perform better than the linear one because the BSM1 data set is clearly nonlinear. Compared with LSSVM-PLS and GPR-PLS, however, the QPLS model tracks the nonlinearity imperfectly for t(1) values between -5 and 0 in Figure 3b, which stems from the intrinsic limitation of the quadratic function. In addition, the nearly identical nonlinear curves in Figure 3c and 3d indicate that both the LSSVM and the GPR functions capture the nonlinear characteristics of the process data efficiently.
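The role of the single free parameter p_f can be illustrated with a small sketch. Two points are assumptions here and are not taken from the text: the objective is taken to be the negative log marginal likelihood, which is the usual choice for GPR, and a plain grid search is swapped in for the scaled conjugate gradient optimizer. The kernel width and noise parameter are fixed to 1 as in the text; the function names and the data are hypothetical.

```python
import numpy as np

def neg_log_marginal_likelihood(p_f, t, u, length=1.0, sigma_n=1.0):
    """Negative log marginal likelihood of a GPR with signal variance exp(p_f)."""
    d2 = (t[:, None] - t[None, :]) ** 2 / length**2
    C = np.exp(p_f - 0.5 * d2) + sigma_n**2 * np.eye(t.size)
    L = np.linalg.cholesky(C)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, u))   # C^{-1} u
    # 0.5 u^T C^{-1} u + 0.5 log|C| + 0.5 n log(2 pi)
    return float(0.5 * u @ alpha + np.log(np.diag(L)).sum()
                 + 0.5 * t.size * np.log(2.0 * np.pi))

def fit_p_f(t, u, grid=np.linspace(-4.0, 6.0, 201)):
    """Grid search over p_f (a stand-in for scaled conjugate gradient)."""
    values = [neg_log_marginal_likelihood(p, t, u) for p in grid]
    return float(grid[int(np.argmin(values))])

# Hypothetical pair of latent scores with a nonlinear relation.
t_demo = np.linspace(-2.0, 2.0, 40)
u_demo = 2.0 * np.sin(t_demo)
p_f_hat = fit_p_f(t_demo, u_demo)
sigma_f_hat = float(np.exp(0.5 * p_f_hat))   # since p_f = 2 ln(sigma_f)
```

Optimizing p_f rather than sigma_f directly keeps the search unconstrained, because exp(p_f) is positive for any real p_f.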

Figure 3. Scatter plots of the first latent variable for BSM1 data in (a) linear PLS model, (b) QPLS model, (c) LSSVM-PLS model, and (d) GPR-PLS model.

According to the RMSE and $R^2$ values in Table 2, all three static nonlinear models predict SNHeff better than PLS does. GPR-PLS, with an RMSE of 0.727 and an $R^2$ of 0.942, has the best prediction capacity of the static methods, and LSSVM-PLS gives similar results. This can be explained by plots c and d of Figure 3. For SNOeff, the RMSE and $R^2$ values of PLS are 0.715 and 0.798, respectively. Unlike for SNHeff, QPLS and LSSVM-PLS do not improve the modeling accuracy in this case. However, the RMSE and $R^2$ values of GPR-PLS are 0.574 and 0.870, respectively, which are improvements of 19.72% and 9.02% over those of PLS. These results show the strong and stable prediction ability of the GPR-PLS model on complex nonlinear data.

After the dynamic technique is applied to the static methods, almost all of the results listed in Table 2 show an improvement in prediction accuracy, illustrating that the dynamic method is beneficial for modeling the WWTP data. It can also be observed that D-QPLS, D-LSSVM-PLS, and D-GPR-PLS give almost the same prediction results for SNHeff. This is probably because of the regular structure of the simulation data, which makes all of
the three types of dynamic nonlinear methods improve the estimation accuracy to a similar degree. For SNOeff, the best prediction results come from D-GPR-PLS, with $\mathrm{RMSE} = 0.421$ and $R^2 = 0.930$. D-QPLS and D-LSSVM-PLS are slightly less accurate than D-GPR-PLS, but both still outperform DPLS. Given the good modeling performance of the dynamic GPR-PLS, the detailed regression results on the validation data are provided in Figure 4, which shows that the data points are well fitted.

Table 2. Comparison of Different Modeling Methods for SNHeff and SNOeff on BSM1 Test Data

| Models | No. of LVs | SNHeff RMSE | SNHeff R2 | SNOeff RMSE | SNOeff R2 |
|---|---|---|---|---|---|
| PLS | 3 | 0.863 | 0.918 | 0.715 | 0.798 |
| DPLS | 3 | 0.877 | 0.915 | 0.563 | 0.846 |
| QPLS | 2 | 0.835 | 0.923 | 0.844 | 0.718 |
| D-QPLS | 4 | 0.715 | 0.944 | 0.520 | 0.893 |
| LSSVM-PLS | 3 | 0.734 | 0.940 | 0.724 | 0.793 |
| D-LSSVM-PLS | 4 | 0.716 | 0.943 | 0.504 | 0.900 |
| GPR-PLS | 4 | 0.727 | 0.942 | 0.574 | 0.870 |
| D-GPR-PLS | 4 | 0.718 | 0.943 | 0.421 | 0.930 |
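The 19.72% and 9.02% improvements quoted for GPR-PLS over PLS on SNOeff follow from the Table 2 entries as relative changes with respect to the PLS baseline. A short check, with a hypothetical helper name:

```python
def relative_change(baseline, value):
    """Absolute change relative to the baseline, in percent."""
    return abs(value - baseline) / baseline * 100.0

# SNOeff columns of Table 2: PLS versus GPR-PLS.
rmse_gain = relative_change(0.715, 0.574)   # reduction in RMSE
r2_gain = relative_change(0.798, 0.870)     # increase in R2

print(f"RMSE reduced by {rmse_gain:.2f}%, R2 increased by {r2_gain:.2f}%")
```

The same helper can be applied to any other pair of rows in Table 2 or Table 4.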

Figure 4. Regression results of D-GPR-PLS using BSM1 validation data for (a) SNHeff and (b) SNOeff.

The running times of the different models are given in Table 3. All of the models except GPR-PLS and D-GPR-PLS have training times of less than 1 s. For GPR-PLS and D-GPR-PLS, the training times are much longer, at 58.920 s and 61.451 s, respectively. This is mainly because finding the parameter $p_f$ by scaled conjugate gradient optimization is time-consuming. In terms of prediction time, PLS, DPLS, QPLS, and D-QPLS are highly efficient, with running times far below 0.1 s. This is because PLS is a linear model and the quadratic function has a simpler structure than the LSSVM and GPR methods. Consequently, the prediction times of the nonlinear PLS models with LSSVM and GPR inner relations are relatively longer.

Although LSSVM and GPR use the same kernel function in this work, their prediction times differ considerably. The prediction times of the LSSVM-PLS and D-LSSVM-PLS models are 0.101 s and 0.113 s, respectively, roughly an order of magnitude lower than those of GPR-PLS (1.290 s) and D-GPR-PLS (1.192 s). This indicates that the Gaussian-distribution-based prediction in equation (3) is more time-consuming than the LSSVM prediction.

Table 3. Comparison of Execution Time for BSM1 Data Set

| Models | Training time (s) | Prediction time (s) |
|---|---|---|
| PLS | 0.030 | 0.004 |
| DPLS | 0.080 | 0.004 |
| QPLS | 0.090 | 0.007 |
| D-QPLS | 0.110 | 0.012 |
| LSSVM-PLS | 0.080 | 0.101 |
| D-LSSVM-PLS | 0.122 | 0.113 |
| GPR-PLS | 58.920 | 1.290 |
| D-GPR-PLS | 61.451 | 1.192 |

3.3. BSM1 Long-Term Data Set. As an extension of BSM1, the benchmark simulation model no. 1 long-term (BSM1_LT) was also used to validate the performance of the proposed methods. A constant influent data file was used to bring the BSM1_LT system to a steady state over 200 days. Then, a dynamic simulation was run for 609 days at a 15 min interval using the dynamic influent data. The variables in BSM1_LT were the same as those in BSM1, and the samples were stored every six hours. After data pretreatment, a total of 2257 samples were retained for BSM1_LT; the first 1129 samples constituted the training data and the remaining 1128 the test data. Figure 5 provides the detailed values of SNHeff and SNOeff, which clearly shows the dynamics derived from the
BSM1_LT dynamic influent data. SNHeff concentrations reach a high level in cold weather and become low during warm periods (Figure 5(b)). For SNOeff concentrations, the situation is reversed (Figure 5(c)). A detailed explanation of the process dynamics can be found in reference 33.
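Because the BSM1_LT samples form a time series, the training and test sets described in Section 3.3 are taken in chronological order rather than shuffled. A minimal sketch of that split, using placeholder arrays of the stated size (the array contents and the function name are hypothetical; eight inputs and two outputs are assumed from the BSM1 variable list):

```python
import numpy as np

def chronological_split(X, Y, n_train):
    """Use the first n_train samples for training and the rest for testing."""
    return (X[:n_train], Y[:n_train]), (X[n_train:], Y[n_train:])

# Placeholder data with the BSM1_LT dimensions: 2257 samples.
X_all = np.zeros((2257, 8))
Y_all = np.zeros((2257, 2))
(X_train, Y_train), (X_test, Y_test) = chronological_split(X_all, Y_all, 1129)

# Storing one sample every six hours gives 4 samples per day,
# i.e. 609 * 4 = 2436 samples before the data pretreatment.
raw_samples = 609 * (24 // 6)
```

Keeping the split chronological also keeps the lagged rows of the augmented matrix contiguous within each subset.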

Figure 5. BSM1_LT data for (a) influent temperature, (b) SNHeff, and (c) SNOeff.

3.4. Results and Discussion in terms of BSM1_LT. With the model structures described in Section 3.2, the parameters of the different models can be determined in the same
way. The detailed parameters of D-LSSVM-PLS and D-GPR-PLS models are listed in Table S2. Besides, the number of LVs and the values of RMSE and R2 can be found in Table 4. Due to the irregular peak depicted in the red rectangle of Figure 5(b), the prediction results of SNHeff for BSM1_LT are worse than those for BSM1 (Table 4). Although the D-GPR-PLS model shows the highest prediction accuracy ( RMSE  5.227 and R 2  0.723 ) for SNHeff, there are still many samples with prediction values deviating from the real-value trajectory (Figure 6). Compared with the prediction of SNHeff, all of the models perform better in terms of SNOeff. According to the results listed in Table 4, D-QPLS provides the highest prediction accuracy ( RMSE  0.585 and R 2  0.894 ), which is slightly higher than that in D-GPR-PLS model.

Figure 6. Regression results of D-GPR-PLS using BSM1_LT validation data for SNHeff.

Table 4. Comparison of Different Modeling Methods for SNHeff and SNOeff on BSM1_LT Test Data

Models         No. of LVs   SNHeff RMSE   SNHeff R2   SNOeff RMSE   SNOeff R2
PLS            4            7.326         0.455       0.652         0.868
DPLS           3            5.627         0.679       0.679         0.869
QPLS           4            7.346         0.452       0.643         0.872
D-QPLS         4            5.330         0.712       0.585         0.894
LSSVM-PLS      5            6.664         0.549       0.668         0.861
D-LSSVM-PLS    4            5.290         0.716       0.626         0.878
GPR-PLS        5            6.637         0.553       0.657         0.866
D-GPR-PLS      4            5.227         0.723       0.603         0.887
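For readers who want to reproduce the two metrics reported in Table 4, the following is a minimal sketch that assumes the common definitions: RMSE as the root of the mean squared error, and R2 as one minus the ratio of the residual sum of squares to the total sum of squares. The second definition is an assumption, chosen because it can become negative, which matches the remark in Section 4.2 that some R2 values fall below 0. The paper's own formulas may differ in detail, and the toy data are invented purely for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SSE/SST (assumed definition).

    The value is negative when the model predicts worse than the mean
    of the measured values.
    """
    mean = sum(y_true) / len(y_true)
    sse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    sst = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - sse / sst


# Toy data (hypothetical, not taken from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
good = [1.1, 1.9, 3.2, 3.8]   # close to the measurements
bad = [4.0, 3.0, 2.0, 1.0]    # reversed trend, worse than the mean

print(rmse(y_true, good), r2(y_true, good))
print(rmse(y_true, bad), r2(y_true, bad))
```

The second toy prediction shows how a model that misses the trend entirely produces a negative R2 under this definition.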

On the whole, the dynamic methods display better prediction performance than their static counterparts. As shown in Figure 7(a), the score points of the first latent variable in the static case are scattered when the t(1) value is close to -5 but clustered when it is close to 5. In the dynamic case, the first-latent-variable score points are more evenly distributed and show a clear nonlinear trend (Figure 7(b)-(d)). The information in the BSM1_LT data is therefore easier to extract with the dynamic methods. In addition, the nonlinear curves in Figure 7(b), Figure 7(c), and Figure 7(d) show how D-QPLS, D-LSSVM-PLS, and D-GPR-PLS, respectively, capture the dynamic and nonlinear features. For the D-QPLS model, the curve in Figure 7(b) is relatively simple, yet it captures the nonlinear characteristic of the BSM1_LT data without the risk of over-fitting. In Figure 7(c) and Figure 7(d), both curves fit the data well for t(1) values from -5 to 0, mainly owing to the strong nonlinear modeling capacity of LSSVM and GPR. However, over-fitting remains a problem at the two ends of the curves, especially for the D-LSSVM-PLS model (Figure 7(c)). This may explain why the D-GPR-PLS model performs slightly better than D-LSSVM-PLS when predicting SNOeff (Table 4).


For the execution times in the BSM1_LT case, there are no significant changes for the PLS, DPLS, QPLS, and D-QPLS models (Table S3). For the LSSVM-PLS, D-LSSVM-PLS, GPR-PLS, and D-GPR-PLS models, the running times increase to varying degrees. Owing to the larger sample size, the training times of GPR-PLS and D-GPR-PLS increase significantly.
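One plausible reading of this increase, offered as background rather than as the authors' analysis: in standard GPR (ref 15), the hyper-parameters are usually tuned by maximizing the log marginal likelihood, and each evaluation of that quantity requires factorizing an n × n matrix, at a cost of order n^3 in the number of training samples n. The expression below uses assumed notation (t for the input scores, u for the target scores, K_θ for the kernel matrix with hyper-parameters θ, and σ_n^2 for the noise variance), which may differ from the symbols used elsewhere in the paper.

```latex
% Standard GPR log marginal likelihood (background, see ref 15).
% Notation is assumed for illustration: t = input scores, u = target scores,
% n = number of training samples, K_\theta = n-by-n kernel matrix,
% \sigma_n^2 = noise variance.
\[
\log p(\mathbf{u} \mid \mathbf{t}, \theta)
  = -\tfrac{1}{2}\,\mathbf{u}^{\top}\bigl(K_{\theta} + \sigma_n^{2} I\bigr)^{-1}\mathbf{u}
    -\tfrac{1}{2}\,\log\bigl|K_{\theta} + \sigma_n^{2} I\bigr|
    -\tfrac{n}{2}\,\log 2\pi
\]
% Each evaluation factorizes the n-by-n matrix K_\theta + \sigma_n^2 I,
% which costs O(n^3) operations, so training time grows quickly with n.
```

Under this reading, the gradient-based search over θ repeats the factorization many times, which would make training far more sensitive to sample size than prediction.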

Figure 7. Scatter plots of the first latent variable for BSM1_LT data in (a) static case, (b) D-QPLS model, (c) D-LSSVM-PLS model, and (d) D-GPR-PLS model.

4. MODELING USING INDUSTRIAL DATA

4.1. Industrial WWTP Data Set. Industrial process data collected from a biological WWTP were used to assess the generalization ability of the proposed D-GPR-PLS method. As shown in Figure 8, the process (ref 34) contains four biological reactors (i.e., denitrification, anaerobic, anoxic, and aerobic processes), two clarifiers, a thickener tank, and a dewatering system. The input variables are flow rate, biological oxygen demand, total suspended solids, total nitrogen, total phosphorus, and influent chemical oxygen demand (COD). The effluent COD (CODeff) is used as the output variable. The data contain daily average values from approximately one year of operation. After data pretreatment using Jolliffe's three parameters (ref 23), 346 samples remained for modeling. The first 232 samples were used for training, and the remaining 114 samples were used for testing.

Figure 8. Process layout for industrial WWTP.

4.2. Results and Discussion. As in Section 3.2, the parameters of the different PLS-based models are determined in the same way. The number of time lags is 2. The details of the WWTP data and the tuned parameters are provided in Figure S1 and Table S4, respectively. The prediction results for effluent COD, together with the number of latent variables for each PLS-based model, are listed in Table 5, where R2 values that are close to 0 or less than 0 are replaced by the letter "F". As shown in Table 5, only QPLS, D-QPLS, D-LSSVM-PLS, and D-GPR-PLS perform well among the eight models tested.
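To make the role of the time lags concrete, the following sketch builds a lag-augmented input matrix in the usual dynamic PLS manner, where each row holds the current inputs followed by the inputs of the previous time steps. It is an assumption that only the input variables are lagged; the paper's augmented matrices are defined in an earlier section and may be constructed differently. The function names `augment_with_lags` and `align_targets` are hypothetical.

```python
def augment_with_lags(X, n_lags):
    """Build rows [x(k), x(k-1), ..., x(k-n_lags)] for k = n_lags .. N-1.

    X is a time-ordered list of samples, each a list of input values.
    The first n_lags samples are dropped because they lack a full history.
    """
    rows = []
    for k in range(n_lags, len(X)):
        row = []
        for lag in range(n_lags + 1):
            row.extend(X[k - lag])
        rows.append(row)
    return rows


def align_targets(y, n_lags):
    """Drop the first n_lags outputs so y lines up with the augmented rows."""
    return y[n_lags:]


# Toy data with two input variables and four time steps (hypothetical).
X = [[1, 10], [2, 20], [3, 30], [4, 40]]
y = [0.1, 0.2, 0.3, 0.4]

X_aug = augment_with_lags(X, n_lags=2)
y_aug = align_targets(y, n_lags=2)

for row, target in zip(X_aug, y_aug):
    print(row, target)
```

Under this assumption, the six input variables of the industrial data set with two time lags would give 6 × 3 = 18 columns, and two samples would be lost at the start of each series.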

Table 5. Comparison of Different Modeling Methods for Effluent COD on Test Data

Models         No. of LVs   RMSE     R2
PLS            2            1.417    F
DPLS           2            1.346    F
QPLS           2            1.170    0.320
D-QPLS         3            1.086    0.413
LSSVM-PLS      2            2.032    F
D-LSSVM-PLS    2            1.060    0.442
GPR-PLS        2            1.878    F
D-GPR-PLS      2            0.984    0.519

Compared to the BSM1 simulation data, the score points of the first latent variable for the industrial WWTP data in Figure 9 are more scattered. As shown in Figure 9, the linear PLS (plot a) clearly cannot trace the nonlinear features of the latent variables. Although the LSSVM-PLS and GPR-PLS models possess comprehensive nonlinear modeling abilities in theory, they do not perform well (plots c and d) on such an irregular distribution of score points (especially for t(1) values from 0 to 4). In contrast, the simpler inner structure of QPLS in plot b seems to be more suitable for modeling the industrial WWTP data. This is probably because the nonlinear characteristics of the score points shown in Figure 9 are less obvious than those of the simulation data shown in Figure 3. It is therefore understandable that the prediction results of effluent COD using QPLS are better than those of LSSVM-PLS and GPR-PLS in Table 5.


Figure 9. Scatter plots of the first latent variable for industrial WWTP data in (a) linear PLS model, (b) QPLS model, (c) LSSVM-PLS model, and (d) GPR-PLS model.

As shown in Figure 10, the dynamic PLS (DPLS) is not suitable for modeling the score points of the first latent variable either. Because time-lagged variables are used for data expansion, the density of score points is slightly higher than that in Figure 9, and the nonlinearity is easier to trace. It therefore follows that D-QPLS, D-LSSVM-PLS, and D-GPR-PLS can capture the nonlinearity between latent variables, and the RMSE and R2 values in Table 5 confirm the effective modeling performance of these three dynamic nonlinear PLS models. Moreover, the D-GPR-PLS model has the best prediction results (RMSE = 0.984 and R2 = 0.519). Compared with the prediction results of D-QPLS and D-LSSVM-PLS, the RMSE value of D-GPR-PLS is decreased by 9.39% and 7.17%, respectively, and the R2 value is increased by 20.42% and 14.84%. Compared with the static methods PLS, QPLS, LSSVM-PLS, and GPR-PLS, the RMSE value of D-GPR-PLS is decreased by 30.56%, 15.90%, 51.57%, and 47.60%, respectively. Because the number of samples is smaller, all of the models run efficiently in this case. However, the relative ordering of execution times is still similar to that in Table 3. The running times of the different models can be found in Table S5.
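The percentages quoted above can be checked directly against the values in Table 5. The sketch below assumes that each RMSE decrease is normalized by the RMSE of the model being compared against. The quoted R2 increases of 20.42% and 14.84% appear to be reproduced only when the difference is divided by the D-GPR-PLS value (0.519) rather than by the baseline value; that normalization is inferred from the numbers and is not stated in the text. The function names are hypothetical.

```python
# RMSE and R2 values copied from Table 5.
RMSE = {
    "PLS": 1.417, "QPLS": 1.170, "LSSVM-PLS": 2.032, "GPR-PLS": 1.878,
    "D-QPLS": 1.086, "D-LSSVM-PLS": 1.060, "D-GPR-PLS": 0.984,
}
R2 = {"D-QPLS": 0.413, "D-LSSVM-PLS": 0.442, "D-GPR-PLS": 0.519}
BEST = "D-GPR-PLS"


def rmse_reduction(model):
    """Percentage decrease in RMSE of D-GPR-PLS relative to `model`."""
    return 100.0 * (RMSE[model] - RMSE[BEST]) / RMSE[model]


def r2_increase(model):
    """Percentage increase in R2, divided by the D-GPR-PLS value.

    This normalization is an inference from the quoted figures.
    """
    return 100.0 * (R2[BEST] - R2[model]) / R2[BEST]


for name in ["D-QPLS", "D-LSSVM-PLS", "PLS", "QPLS", "LSSVM-PLS", "GPR-PLS"]:
    print(name, "RMSE reduction (%):", round(rmse_reduction(name), 2))

for name in ["D-QPLS", "D-LSSVM-PLS"]:
    print(name, "R2 increase (%):", round(r2_increase(name), 2))
```

Dividing the R2 differences by the baseline values instead would give larger percentages, so the choice of normalization is worth keeping in mind when comparing these figures with other studies.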

Figure 10. Scatter plots of the first latent variable for industrial WWTP data in (a) DPLS model, (b) D-QPLS model, (c) D-LSSVM-PLS model, and (d) D-GPR-PLS model.

According to the case studies, it is clear that both the LSSVM-PLS and GPR-PLS models can model the nonlinearity of WWTP data fairly well. For the real industrial data, however, the nonlinear features are too unclear to be accurately modeled by the nonlinear PLS. After the aforementioned dynamic technique is embedded into the nonlinear PLS models, the dynamic relations among the process and quality data can be captured by the lagged variables. The lagged variables also change the data structure in a way that favors nonlinear modeling and lead to better prediction performance of the dynamic nonlinear PLS models.

5. CONCLUSIONS

In this paper, a dynamic nonlinear prediction method denoted D-GPR-PLS has been developed for industrial processes. First, augmented matrices were used as the dynamic technique to optimize the data structures. Then, the GPR model was used to establish the inner nonlinear relation of PLS while keeping the original outer PLS framework. Based on the WWTP case studies, D-GPR-PLS showed a better overall prediction ability than the other tested methods (PLS, DPLS, QPLS, D-QPLS, LSSVM-PLS, D-LSSVM-PLS, and GPR-PLS). However, there is still much room for improvement in the execution efficiency of the GPR part, especially in the training step. Future work will focus on developing a time-saving algorithm for obtaining the hyper-parameters.

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at http://pubs.acs.org/.

AUTHOR INFORMATION

Corresponding Authors
*H.L. E-mail: [email protected]; Tel: +86-13022525070
**S.J.Q. E-mail: [email protected]
***C.Y. E-mail: [email protected]

ORCID
Hongbin Liu: 0000-0001-9645-686X

Notes
The authors declare no competing financial interest.

Author Contributions
‡ These authors contributed equally.

ACKNOWLEDGMENTS

We would like to thank Dr. Ulf Jeppsson for allowing us to use his BSM1 Simulink model. This study was supported by the Foundation of Nanjing Forestry University (No. GXL029) and the National Research Foundation of Korea (NRF) grant funded by the Ministry of Science and ICT (No. 2017R1E1A1A03070713).

REFERENCES

(1) Jin, H.; Chen, X.; Wang, L.; Yang, K.; Wu, L. Adaptive soft sensor development based on online ensemble Gaussian process regression for nonlinear time-varying batch processes. Ind. Eng. Chem. Res. 2015, 54, 7320-7345.
(2) Shi, H.; Kim, M. J.; Liu, H.; Yoo, C. K. Process modeling based on nonlinear PLS models using a prior knowledge-driven time difference method. J. Taiwan Inst. Chem. E. 2016, 69, 93-105.
(3) Grbić, R.; Slišković, D.; Kadlec, P. Adaptive soft sensor for online prediction and process monitoring based on a mixture of Gaussian process models. Comput. Chem. Eng. 2013, 58, 84-97.
(4) Wise, B. M.; Gallagher, N. B. The process chemometrics approach to process monitoring and fault detection. J. Process Contr. 1996, 6, 329-348.
(5) Ronen, D.; Sanders, C. F. W.; Tan, H. S.; Mort, P. R.; Doyle, F. J. Predictive dynamic modeling of key process variables in granulation processes using partial least squares approach. Ind. Eng. Chem. Res. 2011, 50, 1419-1426.
(6) Liu, H.; Yang, C.; Huang, M.; Wang, D.; Yoo, C. Modeling of subway indoor air quality using Gaussian process regression. J. Hazard. Mater. 2018, 359, 266-273.
(7) Qin, S. J. Statistical process monitoring: Basics and beyond. J. Chemom. 2003, 17, 480-502.
(8) Yoo, C.; Lee, I.-B. Integrated framework of nonlinear prediction and process monitoring for complex biological processes. Bioproc. Biosyst. Eng. 2006, 29, 213-228.
(9) Lv, Y.; Liu, J.; Yang, T. Nonlinear PLS integrated with error-based LSSVM and its application to NOx modeling. Ind. Eng. Chem. Res. 2012, 51, 16092-16100.
(10) Wang, M.; Yan, G.; Fei, Z. Kernel PLS based prediction model construction and simulation on theoretical cases. Neurocomputing 2015, 165, 389-394.
(11) Huang, X.; Luo, Y.-P.; Xu, Q.-S.; Liang, Y.-Z. Incorporating variable importance into kernel PLS for modeling the structure–activity relationship. J. Math. Chem. 2018, 56, 713-727.
(12) Wold, S.; Kettaneh-Wold, N.; Skagerberg, B. Nonlinear PLS modeling. Chemometr. Intell. Lab. 1989, 7, 53-65.
(13) Qin, S. J.; McAvoy, T. J. Nonlinear PLS modeling using neural networks. Comput. Chem. Eng. 1992, 16, 379-391.
(14) Bang, Y. H.; Yoo, C. K.; Lee, I.-B. Nonlinear PLS modeling with fuzzy inference system. Chemometr. Intell. Lab. 2002, 64, 137-155.
(15) Rasmussen, C. E.; Williams, C. K. I. Gaussian processes for machine learning; The MIT Press: Cambridge, Massachusetts, 2006.
(16) Liu, J.; Chen, D.-S.; Shen, J.-F. Development of self-validating soft sensors using fast moving window partial least squares. Ind. Eng. Chem. Res. 2010, 49, 11530-11546.
(17) Ni, W.; Tan, S. K.; Ng, W. J.; Brown, S. D. Localized, adaptive recursive partial least squares regression for dynamic system modeling. Ind. Eng. Chem. Res. 2012, 51, 8025-8039.
(18) Liu, H.; Kang, O.; Kim, M.; Oh, T.; Lee, S.; Kim, J. T.; Yoo, C. Sustainable monitoring of indoor air pollutants in an underground subway environment using self-validating soft sensors. Indoor Built Environ. 2013, 22, 94-109.
(19) Kaneko, H.; Funatsu, K. Development of soft sensor models based on time difference of process variables with accounting for nonlinear relationship. Ind. Eng. Chem. Res. 2011, 50, 10643-10651.
(20) Kaneko, H.; Funatsu, K. Maintenance-free soft sensor models with time difference of process variables. Chemometr. Intell. Lab. 2011, 107, 312-317.
(21) Liu, Y.; Gao, Z.; Li, P.; Wang, H. Just-in-time kernel learning with adaptive parameter selection for soft sensor modeling of batch processes. Ind. Eng. Chem. Res. 2012, 51, 4313-4327.
(22) Liu, Y.; Huang, D.; Li, Y. Development of interval soft sensors using enhanced just-in-time learning and inductive confidence predictor. Ind. Eng. Chem. Res. 2012, 51, 3356-3367.
(23) Liu, H.; Yoo, C. A robust localized soft sensor for particulate matter modeling in Seoul metro systems. J. Hazard. Mater. 2016, 305, 209-218.
(24) Ku, W.; Storer, R. H.; Georgakis, C. Disturbance detection and isolation by dynamic principal component analysis. Chemometr. Intell. Lab. 1995, 30, 179-196.
(25) Lee, J. M.; Yoo, C. K.; Lee, I. B. Statistical monitoring of dynamic processes based on dynamic independent component analysis. Chem. Eng. Sci. 2004, 59, 2995-3006.
(26) Lee, S.; Liu, H.; Kim, M.; Kim, J. T.; Yoo, C. Online monitoring and interpretation of periodic diurnal and seasonal variations of indoor air pollutants in a subway station using parallel factor analysis (PARAFAC). Energ. Buildings 2014, 68, 87-98.
(27) Liu, Q.; Zhu, Q.; Qin, S. J.; Chai, T. Dynamic concurrent kernel CCA for strip-thickness relevant fault diagnosis of continuous annealing processes. J. Process Contr. 2018, 67, 12-22.
(28) Serdio, F.; Lughofer, E.; Pichler, K.; Buchegger, T.; Pichler, M.; Efendic, H. Fault detection in multi-sensor networks based on multivariate time-series models and orthogonal transformations. Inform. Fusion 2014, 20, 272-291.
(29) Geladi, P.; Kowalski, B. R. Partial least-squares regression: A tutorial. Anal. Chim. Acta 1986, 185, 1-17.
(30) Liu, H.; Yoo, C. Performance assessment of cascade controllers for nitrate control in a wastewater treatment process. Korean J. Chem. Eng. 2011, 28, 667-673.
(31) Liu, Y.; Liu, B.; Zhao, X.; Xie, M. A mixture of variational canonical correlation analysis for nonlinear and quality-relevant process monitoring. IEEE T. Ind. Electron. 2018, 65, 6478-6486.
(32) Gernaey, K. V.; Jeppsson, U.; Vanrolleghem, P. A.; Copp, J. B. Benchmarking of control strategies for wastewater treatment plants; IWA Publishing: London, 2014.
(33) Gernaey, K. V.; Rosén, C.; Jeppsson, U. WWTP dynamic disturbance modelling–an essential module for long-term benchmarking development. Water Sci. Technol. 2006, 53, 225-234.
(34) Liu, H.; Huang, M.; Yoo, C. A fuzzy neural network-based soft sensor for modeling nutrient removal mechanism in a full-scale wastewater treatment system. Desalin. Water Treat. 2013, 51, 6184-6193.
