Integrated Online Virtual Metrology and Fault Detection in Plasma

Modern semiconductor fab tools are equipped with fault diagnostic sensors that generate large amounts of real-time data. These trace data have been sh...
0 downloads 10 Views 497KB Size
Subscriber access provided by LAURENTIAN UNIV

Article

Integrated Online Virtual Metrology and Fault Detection in Plasma Etch Tools Bo Lu, John Stuber, and Thomas F. Edgar Ind. Eng. Chem. Res., Just Accepted Manuscript • Publication Date (Web): 16 Oct 2013 Downloaded from http://pubs.acs.org on October 22, 2013

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Industrial & Engineering Chemistry Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

Integrated Online Virtual Metrology and Fault Detection in Plasma Etch Tools Bo Lu,∗,† John Stuber,‡ and Thomas F. Edgar† The University of Texas at Austin, and Texas Instruments E-mail: [email protected]

Abstract This paper presents an integrated virtual metrology (VM) and quality excursion detection framework for on-line implementation on a plasma etch tool. Traditional external metrology have inherent delays and are sparingly performed only on the most critical steps of manufacturing. To reduce metrology delay and limit the impact of quality excursion wafers, VM models can be used predict relevant quality outputs using process variables such as chamber pressure, gas flow rate, power level and other plasma measurements. Although existing VM methods work well using off-line datasets, they face on-line implementation challenges, such as outliers and process drifts that degrade model accuracy. To develop an efficient and reliable framework for online VM monitoring, we combine data pretreatment using total partial least squares projection, model updating with moving window PLS, and model feature selection using PLS variable importance in projections core filtering. The combined model can be implemented with low memory footprint and is also shown to be robust against drifts and disturbances. Modeling results from industrial datasets are shown to demonstrate the effectiveness of our approach. ∗

To whom correspondence should be addressed Department of Chemical Engineering, The University of Texas at Austin, Austin, TX ‡ Texas Instruments, Dallas, TX †

1 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Introduction Traditional semiconductor manufacturing factories (fabs) have relied on run-to-run process control with external metrology. Only the most critical processes have metrology on every wafer; typically a dynamic sampling algorithm will adjust the sampling frequency between one wafer per lot and one wafer every three lots. But even in cases where metrology data is collected on every wafer, excursions in product quality are not detected until many more wafers have been processed. The inherent delays in external metrology prolong product cycle time and introduce additional complexity in run-to-run controller tuning. Modern semiconductor fab tools are equipped with fault diagnostic sensors that generate large amounts of real-time data. These trace data have been shown to correlate with quality relevant outputs 1–4 . As a result, virtual metrology (VM) aims to predict end-of-batch outputs using trace data and other end-point sensor readings. Having a VM signal would allow the fault diagnostic system to intervene and limit the quantity of wafers processed under excursion conditions. This also reduces false alarms by calling for interventions only in cases that directly impact wafer quality. In addition to excursion prevention, a reduction in the average lot’s cycle time can be achieved by replacing physical metrology with VM on some fraction of the lots, and the run-to-run system can adjust some recipe set-points to compensate for variations in product quality that do not reach the excursion limit. Inferential sensors, soft sensors or virtual sensors are used synonymously in the process industry. VM is equivalent to soft sensors applied in a semiconductor manufacturing context. Kadlec et al. provided a comprehensive review of existing soft sensor approaches 5 ; other survey papers are also available on extensions of soft sensor approaches 6,7 . There have been several successful applications of VM in chemical mechanical polishing and chemical vapor deposition 8,9 . However, research efforts on predicting critical dimensions and sidewall angles in plasma etch processes are still on-going 3,10 . Plasma etch is a nonlinear multi-step process due to its complex physical chemistry 11 . As a result, current VM approaches rely on data-driven approaches. Some notable work include Lynn et al., who applied Gaussian Process Regression (GPR) and showed that it performed neural network and PLS models 11 . Zeng compared performances of the popular partial 2 ACS Paragon Plus Environment

Page 2 of 30

Page 3 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

least squares (PLS) method against back-propagating artificial neural networks (BPNN) 10 . Other VM approaches include support vector machine regression, radial basis neural networks and PLS regression trees 2,11–14 . All of these VM model research focuses on modeling performance evaluation using pre-treated offline data. However, for practical application purposes, VM must be implemented online, able to adapt to process drifts, and fault-tolerant. Many of these process drifts and faults are due to faulty wafer, tool environment degradation and preventive maintenance, which cannot be measured easily at runtime. Therefore, this paper proposes combining Total Projection to Latent Structure (TPLS) decomposition process monitoring and moving-window PLS model update to create a robust plasma etch VM model. In addition, model simplification through feature selection of developed models will also be discussed. Moving-window PLS formulation allows for an online implementation that can be updated recursively to ensure resiliency against process drifts. In addition, TPLS monitoring scheme was tested against the conventional Hotelling T 2 and squared prediction (SPE) error metrics. Lastly, PLS variable selection techniques are used to simplify the PLS model. The reduced complexity and model size have the benefits of faster execution, easier implementation and more robust against inherent system process noises. This paper is structured as follows: First, we introduce the theory and methods in latent variable PLS modeling, TPLS based process monitoring, and adaptive PLS. Second, we introduce variable selection techniques for PLS models to simplify the obtained models. Third, we introduce an integrated framework composed of the above-mentioned components. Finally, we present modeling results and assess our performance against industry standard techniques.

Theory and Methods Online Framework Overview The integrated VM framework comprises of preprocessing, model development, and model maintenance to ensure that the VM models constructed are suitable for online adaption. Key problems 3 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

that this setup aims to address are process drifts and outliers during model update. Figure 1 shows the key information flow in the integrated online framework and gives an overview of the proposed configuration. The two major components of the framework are the VM prediction module and the VM maintenance module. In the VM prediction module, an existing PLS model obtained either off-line or from previous online runs will be used in conjunction with T-PLS multivariate monitoring. The T-PLS monitoring statistics serves as a prediction quality alarm that will alert users of possible prediction quality issues. In the VM maintenance module, additional metrology data will be synchronized with the corresponding fault detection trace data to carry out the model maintenance and update tasks. The same T-PLS monitoring statistics will be used to screen out potential outlier batches that will negatively affect the model update process. Afterwards, the model simplification step removes redundant variables and uninformative variables to reduce memory foot-print. The T-PLS fault detection limits are updated for the new PLS model. The VM maintenance module will run asynchronously from the VM prediction module, the speed of the model update will depend on the availability of metrology data. However, one can easily implement logic to force an model update in cases where excessive alarms are raised in the VM prediction module.

VM Methods Multiway Partial Least Squares (MPLS) Regression Latent variable methods such as partial least squares (PLS) and principal component analysis (PCA) project high-dimensional data onto low dimensional latent subspaces. These methods are widely used to analyze large data sets such as those encountered in plasma etching ( 10,14,15 ) A typical plasma etch data set contains over 30 sensor measurements, which include gas flow, chamber conditions, and RF circuitry readings. When sensor signals are unfolded, these sensor readings result in over 3000 variables. Unfolding transforms a three-dimensional data array into a two-dimensional matrix for the purpose of performing PCA and PLS 16 . The feasibility of using a multi-way approach (unfolding) for PLS/PCA analysis was demonstrated by Wold et al. and Nomikos et al. in process industry monitoring 17,18 . Figure 2 visualizes the transformation of a 4 ACS Paragon Plus Environment

Page 4 of 30

Page 5 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

sDWƌĞĚŝĐƚŝŽŶ

&ĂƵůƚ ĞƚĞĐƚŝŽŶ ĂƚĂ

dŽŵŽŶŝƚŽƌŝŶŐŽƌ &ƐLJƐƚĞŵ



WƌĞƉƌŽĐĞƐƐŝŶŐ

dͲW>^DŽŶŝƚŽƌŝŶŐ

W>^WƌĞĚŝĐƚŝŽŶ



sDDĂŝŶƚĞŶĂŶĐĞ

DĞƚƌŽůŽŐLJ ĂƚĂ



ĂƚĂƵĨĨĞƌ

dͲW>^KƵƚůŝĞƌ ZĞŵŽǀĂů

DtͲW>^DŽĚĞů hƉĚĂƚĞ

dͲW>^DŽĚĞů hƉĚĂƚĞ

DŽĚĞů ^ŝŵƉůŝĨŝĐĂƚŝŽŶ

dŽsDWƌĞĚŝĐƚŝŽŶ

Figure 1: Proposed integrated VM framework using T-PLS monitoring and moving window model updates three-dimensional data cube into a two-dimensional matrix. The unfolded data is then analogous to data from a continuous dataset and can be used directly in PLS modeling. The two mainstream PLS algorithm are the Nonlinear Iterative PLS (NIPALS) and the SIMPLS 19,20 . Seven additional PLS algorithms are evaluated and reviewed by Andersson 21 . Both PLS algorithms decompose the mean-centered matrices into the following form: X = TPT +E

(1)

Y = UQT +F where T ∈ Rn×A and U ∈ Rn×A are the X and Y scores respectively, P ∈ Rm×A and Q ∈ Rp×A are the loadings for X and Y, respectively. The number of components in the PLS model is typically determined through cross-validation or through information criterion such as the Akaike Information Criterion (AIC) 22 . The PLS algorithm maximizes the covariance between the X-scores and Y-scores; this property leads to PLS requiring fewer components when compared to principal component regression models 22 .

5 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 30

To apply the PLS latent structures in regression, given unfolded input data matrix x0 , the output predictions yˆ0 can be calculated linearly using y ˆ0 = x0 βˆpls . The βˆpls can be expressed as a function of the latent variables as follows: ( ) βˆpls = R TT Y = RRT XY

(2)

where R is the weight matrix in the SIMPLS algorithm. In the NIPALS method, the weight matrix R can also be calculated iteratively. PLS faces challenges when the input and output relationship is nonlinear. However, multiple options exist depending on the type of nonlinearity encountered. Qin and McAvoy developed neural network PLS (NNPLS) that approximates the latent relationship between the input and output scores using a feed forward neural network 23 . Using this approach, nonlinearity in the latent score space can be approximated to arbitrary precision with increasingly complex neural network structure. Kernel PLS maps original X space data into a nonlinear feature space F, where the relationship can be represented linearly 24 . However, selecting the right kernel function and requirement for large amount of training data (n ≫ m × p) reduces the practical effectiveness of this method 21 . Another method to approximate nonlinearity is by constructing multiple models for each operating regime. The appropriate operating regime can then be selected at run-time based on classification algorithms such as clustering or decision-trees. However, this method also relies on a large amount of training data due to its need to train multiple models. Due to limited training data in a high-mix semiconductor fab, only the NNPLS model is implemented and used in comparison.

Figure 2: Unfolding of a three-dimension data array(Rn×m×p ) into a two-dimensional matrix (Rn×mp )

6 ACS Paragon Plus Environment

Page 7 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

Moving Window Update The moving window and recursive PLS are two adaptation mechanisms to compensate for process drifts in PLS models. Recursive PLS was first introduced by Helland 25 and then improved by Qin 26 . An application of recursive PLS in VM was demonstrated by Khan et al. 12 using simulated data. However, the drawback of recursive PLS is that the multivariate control monitoring indices are more difficult to adapt. Moving window PLS is an alternative formulation to recursive PLS that allows for easy update of the monitoring indices. In MW-PLS, the PLS algorithm itself remains unchanged. As new data are acquired, the window of training data slides along the data-set. The two tuning parameters in moving window PLS are the window size and the step size; these two tuning parameters are graphically illustrated in Figure 3.

X1,Y1

X2,Y2

….

XN,YN

...

EƐƚĞƉ EƐĂŵƉůĞ

X, Y Figure 3: Moving window data partition of a dataset using two parameters: sample size(Nsample ) and step size (Nstep ) In addition, to ensure consistency and resiliency against process noise and disturbance, it might be more desirable to update the model in batches instead of as soon as new data points are available. Therefore, we use a block-wise moving window update approach. Considerations to be taken for a MW-PLS scheme are (1) ensuring incoming data are fault free, and (2) optimal selection of update parameters such as moving window size and update step size.

7 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 30

Total PLS Decomposition Outlier Filtering To our best knowledge, current VM literature lacks an in-depth examination of outlier removal in VM implementations. This could be justified since outliers in offline model building can be readily identified and removed using a combination of prior knowledge and automatic screening. However, outlier removal and process monitoring are required online for both the real-time trace data and the overall batch data. There are two levels of outlier removal: the first level focuses on removing spikes/dips temporally; the second level removal focuses on excluding abnormal batch runs from the training set. First level temporal spike removal is done routinely by replacing points outside of the 3σ range with a locally estimated average or median. The second level outlier detection is particularly important for PLS based methods due to its least-squares penalty on residual, where a large deviation in X and Y can cause excessive bias in model parameters. To detect these abnormal runs that affect VM output, we propose using total projected latent structure (T-PLS) monitoring 27 inspired by its application in industrial chemical processes. For multiple-output PLS regression models, an improved version of T-PLS called concurrent projection to latent structure (CPLS) 28 could also be used as a drop-in replacement for the T-PLS method proposed here. Total projection to latent structure (T-PLS) decomposition further decomposes the principal and residual subspaces in normal PLS monitoring. After decomposition, the quality relevant subspaces are separated from the orthogonal components as follows: X = Ty PTy + To PTo + Tr PTr + Er Y=

Ty QTy

(3)

+F

where the subscripts y denote output-relevant, o denote output-orthogonal, r denote output-residual. This decomposition offers a more intuitive interpretation of the PLS decomposed subspaces: Output relevant(spanned by Ty PTy ): Ty represent variations related only to Y from the original PLS model. Output orthogonal(spanned by To PTo ): To represent variations orthogonal to Y from the original T. 8 ACS Paragon Plus Environment

Page 9 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

Output residual(spanned by Tr PTr ): To is the majority of the original residual E and Er is the left over residual. Through this decomposition, the subspaces that are not relevant to process monitoring (To PTo and Tr PTr ) are isolated from the output relevant subspaces. The resulting monitoring indices are as follows: Ty2 = tTy Λ−1 y ty ∼

Ay (n2 − 1) FA ,n−Ay n (n − Ay ) y

S 2 Qr = ∥˜ xr ∥ ∼ χ 2 2µ 2µ /s

(4)

2

where ty and x ˜r are the corresponding T-PLS decomposed output-relevant scores and residual, respectively; S and µ are the sample variance and sample mean of Qr . Qr follows a χ2 distribution and Ty2 follows a Fisher distribution, which can be then used to calculate the respective control limits given significance level α. By monitoring only the quality relevant subspaces shown in Equation 4, we can effectively reduce the fraction of false outliers removed to maximize data recovery during training; example illustrating the results of T-PLS compared against traditional Hotelling T 2 is shown in Figure 4. Comparing Figure 4(a) with the top two subplots in Figure 4(b), we can see that the original Hotelling T 2 appears to be a superposition of the output-relevant and output-orthogonal Hotelling T 2 scores; yet, most of the outliers detected by the conventional Hotelling T 2 are outliers that are orthogonal to output quality. As a result, using T-PLS will reduce the number of outliers removed while ensuring that the remaining outliers do not affect PLS regression performance.

Variable Selection The need for variable selection in VM modeling arises due to the implementation and maintenance burden of large and complex VM models. In the case for plasma etch, original raw PLS models using unfolded data have over 3000 parameters, most of these variables are non-informative and non-relevant to the quality variable. Leardi suggested using genetic algorithms in PLS regression feature selection, where a nonlin9 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 30

Traditional PLS T2 70

60

50

40

30

20

10

0

0

50

100

150

200

250

300

350

(a) Y-space T2

Orthogonal Space T2

30

60

20

40

10

20

0

0

100

200

300

400

0

0

Residual Space T2

100

200

300

400

norm of Er, Q2 statistics

300

8 6

200

4 100

0

2

0

100

200

300

400

0

0

100

200

300

400

(b)

Figure 4: (a) PLS Hotelling T 2 monitoring index, (b) PLS decomposed subspace process monitoring indices, top-left: output-relevant T 2 , top-right: output-orthogonal T 2 , bottom-left: outputorthogonal residual T 2 , bottom-right: left-over residual Q2r , red dashed line is the 95% significance level ear heuristic search try to find the best model from a pool of candidate models containing different variable sets 29 . Hoskuldsson presented alternative methods to variable selection, where the input variable and output variable correlation coefficients are used as a metric in determining interval

10 ACS Paragon Plus Environment

Page 11 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

of inclusion in subsequent MPLS modeling 30 . In recent chemical engineering literature, the use of expert knowledge and contribution scores are primary methods to evaluate the significance of regression parameters 31 32 . In 2009, Zeng and Spanos presented a plasma etch specific variables selection scheme; their methods include regression-coefficient based filtering, Pearson correlation coefficient filtering, random modeling and genetic algorithms based filtering, and step-wise regression selection 1 . Since variable selection methods are application specific, we compared correlation coefficient filtering, beta coefficient filtering, PLS variable importance in projection filtering and competitive adaptive weighted sampling variable selection methods to determine the most suitable method for plasma etching. Correlation coefficient filtering Hoskuldsson first suggested using correlation coefficient between input and output to filter out uncorrelated variables prior to PLS modeling 30 . The correlation coefficient for each xi can be calculated as (j)

Σx

(j)

Σy

Σxi yi − iN j rj = √( )( 2 (j) 2 Σxi − (ΣxNi ) Σyi 2 −

(Σyi )2 N

)

(5)

where x(j) ∈ RN ×1 , y ∈ RN ×1 , and N is the sample size. Lower correlation coefficients values correspond to variables of less importance. A cut-off threshold can then be applied to remove these variables. If a range of cut-off thresholds are all evaluated, then optimum cut-off threshold can be determined by finding the cut-off threshold that yields the best fitting performance. Beta coefficient based filtering Beta coefficients in a linear regression model also is indicative the variable importance. For PLS models, the beta coefficients can be calculated using Equation 2. It is important to note that X used in PLS modeling need to be centered and auto-scaled. This way, variables corresponding to coefficients with small magnitudes are uninformative and can be eliminated. A similar cut-off

11 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 30

threshold elimination method can then be utilized to determine the optimal number of variables to keep in the model. PLS variable importance in projection (VIP) based filtering From the PLS decomposed loadings and weights vectors, variable importance in projection (VIP) can be calculated. The VIP value for each variable represents a weighted contribution of the particular variable in the decomposed subspace 33 . Contrary to the first two filtering techniques, filtering based on the VIP reduces the impact of variable elimination on the existing PLS structure. The VIP value for each variable can be calculated as follows: √ ∑ / A 2 vj = p a=1 [SSa (waj /wa )] ∑A SSa a=1

(6)

where A is the total number of components, SSa is the sum of squares explained by the ath component, the

waj ∥wa ∥2

represents the importance of the variable j for component a.

Modeling Results Dataset description The combined data set contains 1200 wafers with metrology measurements of Develop-Inspect Critical Dimension of the resist pattern (DICD) and Final Inspect Critical Dimension (FICD). Modeled outputs could be expressed as either the difference between these measurements (etch bias) or the etch bias divided by the trim etch time as the etch rate. The metrology measurements are the response variables in the model, and trace data are the predictor variables. All of the wafer data are from the same manufacturing thread, but the tool did not produce all the wafers continuously. Actual trace data consist of plasma etch equipment RF circuitry readings, gas flow rates, and chamber condition readings at 0.5 Hz frequency. Miscellaneous information such as recipe step, process time and EWMA-estimated disturbances are also appended to the original data set. In total, batch trajectories from 39 measurements were used as predictor variables in attempt

12 ACS Paragon Plus Environment

Page 13 of 30

to predict the average etch rate. An example plot of the raw trace data is shown in Figure 5. In addition, the batch trajectory can be naturally divided into seven phases according to the trace trajectory ”recipe number”, which is used by the etch rate controller in controlling the etching process. Drifts and disturbances experienced in the quality variables of plasma etching can be caused by many factors. For example, having multiple production threads, meaning that each plasma tool is used to manufacture different products using different recipe settings, could contaminate the current production thread with residual deposits from another product. The inherent process nonlinearity of the batch-type dynamic trajectories in plasma etching also makes it difficult to fully model the plasma etching process on different time-scales. As a result, slower time-scale dynamics could then manifest as a drifting bias in the etch rate. Preventive maintenance done routinely to clean the plasma chamber of deposits and residues also introduces step change to etch rates that must be accounted for. Lastly, errors from upstream processes could also propagate and affect the etch profile in the plasma etching process. In practice, it is difficult to identify the source of drifts and disturbances, as a result, our goal is to account for the slow drifts in the process through moving window model update and to account for the spike disturbances by selectively removing them according to T-PLS quality relevant process monitoring. Batch id 1 − 2042830−1

Variable 1

Variable 2

200

200

100

100

0 0

50 100 Variable 3

0 0

150

2

100

0

50

−2 0

50 100 Variable 5

150

0 0

20

20

10

10

0 0

50 100 Samples

150

0 0

1000 800 600

50 100 Variable 4

150

Value

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

400 200 0

50 100 Variable 6

−200 150

150

40

100

30 20

50 10

50 100 Samples

Time

150

(a)

0

0

Variables

(b)

Figure 5: Raw data plot showing (a) the temporal profiles of six variables for five wafer batches and (b) A three-dimensional representation of trajectories from a single batch

13 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Preprocessing Each data set was first screened for missing or redundant data. Redundant and irrelevant variables were removed by manual inspection. Also excluded from the models were variables with more than 50% data being blank such as status indicators. Temporal outliers such as spikes in sensor readings were removed by checking against the confidence limits of corresponding variables. After initial cleaning, the data set was then unfolded into a 2-D array using batch-wise unfolding. The unfolded data then underwent another outlier removal round to remove batch-wise outliers. Process engineers recommended using recipe number as a guide to verify data integrity, so the recipe number trajectory of every run was compared against a prescribed reference trajectory; large deviations in the reference trajectory raised alarms that label the batch as abnormal. A more in-depth batch trajectory alignment method can also be applied to dynamically warp trajectories based on certain indicator variables such as RF power factor; the dynamic time warping method based on filtered derivative warping by Zhang et al. 34 has been found to be particularly effective. Univariate scaling and mean-centering were then applied to the unfolded input-output data to create the X and Y matrices.

PLS Modeling Both PLS algorithms (NIPALS and SIMPLS) were used to model the input-output relationships. The NIPALS PLS results confirm the solution obtained through SIMPLS. Subsequent modeling was done in SIMPLS because of its superior computational efficiencies. Wafer data are arranged chronologically to place validation wafers after training wafers. This arrangement mimics real production scenario where prediction data will always be after the initial training data. The number of latent components for PLS models were chosen by cross-validation. The mean squared residual error of prediction was minimized. The resulting number of components ranged from 4 to 10 depending on the size of the predictor matrix X. Figure 6 shows the training performance of the base case PLS model without performing any T-PLS outlier removal or variable selection. We note that training data predicted etch rates fall along the diagonal line, indicating 14 ACS Paragon Plus Environment

Page 14 of 30

Page 15 of 30

good agreement. Subplot (b) of Figure 6 shows the cross-validated mean squared error used to determine the optimal number of components in the model. Etch Rate Measured vs Predicted plot,R2=0.99358

Total X Error:1549.8764, Total Y Error:1.0072 0.75

1.3

X Y

0.7 0.65 0.6

1.25 Normalized Error

Predicted Etch Rate

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

1.2

0.55 0.5 0.45 0.4

1.15

0.35 0.3

1.1 1.1

1.15 1.2 1.25 Measured Etch Rate

0.25 0

1.3

(a) Training vs predicted for PLS training set

2

4 6 # of Components

8

10

(b) Cross-validated mean squared error of prediction

Figure 6: Base case PLS training results showing (a) the predicted versus measured plot and (b) the cross-validated mean squared error of prediction in X and Y

Variable Selection Four variable selection methods introduced previously were compared correlation based filtering, beta coefficient based filtering, PLS VIP based filtering, and competitive adaptive re-weighted sampling. These methods are compared for calculation speed and the performance of simplified PLS models. Table 1 shows the final PLS model sizes and model performances after variable selection. Additional details of the variable selection are provided as supporting information. The number of variables in each model are determined by optimizing for the cross-validated prediction performance of the reduced models. The optimal model comparison was chosen over a fixed variable comparison because of variabilities in the number of variables selected by CARS-PLS. The CARS-PLS method uses Monte-Carlo random sampling to generate weighted samples of variable 15 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 30

combinations. As a result, the number of variables selected in CARS-PLS varies from run to run. In addition, the CARS-PLS variable selection algorithm is also computationally expensive. In this comparison, the CARS-PLS variable selection is performed offline and is intended to be a benchmark to validate the selection results of other simpler methods. The selected variables using PLS VIP filtering and beta coefficient filtering from multiple runs are compared against the results from CARS-PLS. The PLS VIP filtering was determined to be a viable method for online implementation because of its comparable prediction performance, selection robustness and the fast computational speed. Table 1: Summary of the variable selection performance at the optimal Q2 using four different selection methods Selection Method Correlation coefficient filtering Beta coefficient filtering PLS VIP filtering CARS-PLS selection

Model Size 250 85 75 70

Cross-validated Q2 0.72 0.77 0.79 0.76

16 ACS Paragon Plus Environment

Computation cost low low low high

Page 17 of 30

Moving Window Update with T-PLS TPLS decomposition of existing PLS models were developed using methods discussed previously. The resulting monitoring indices for the training data are plotted in Figure 7. The figure shows that two wafer samples are out of the control limit for Ty2 ; this outlier can be interpreted as data batches exhibiting abnormal trends in predictor variables that are mostly correlated to response variables. As a result, removing them from the training set improves the PLS model performance. This is a useful property in developing moving window update schemes for PLS models using non-simulated industrial data. Since the data could be contaminated with noise and disturbances, being able to predict them and reduce their impact in subsequent modeling makes moving window models more robust against noise. 1.28

15

1.26

PLS Predicted Measured TPLS Outliers

10 T2y

1.24

5

1.22 Etch Rate

1.2

0 0

50

100

150 200 Wafer Samples

250

300

50

100

150 200 Wafer Samples

250

300

1.18 60

1.16 1.14

40 Q2r

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

1.12

20

1.1 1.08 0

50

100

150

200

0 0

250

Samples

(a)

(b)

Figure 7: Time series plot of the training data at the 1st moving window plus the T-PLS fault diagnostics for the training data Figure 8(a) shows the comparison between the MW-PLS and the base case PLS models. In the base case PLS model, the first 200 wafer samples are used to train the model. In the MW-PLS model, a PLS model with 200 wafer samples is first initialized; subsequent update models are updated at five lot intervals and the maximum model training size is fixed at 300 wafer samples. Comparing the moving window update model results with the fixed PLS model, the MW-PLS model handles drifts and disturbances much better and is able to track the actual metrology measurements much better than the fixed PLS model, especially after 800 wafer samples where the 17 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

fixed PLS model showed larger residuals. In Figure 8(b), the normalized beta coefficient values for the top four variables and the bias for the moving window PLS model samples has been plotted. The top four beta coefficients remained relatively constant, implying that the correlation between the input and output variables remained unchanged. The bias term exhibited the biggest change, which indicates that the process has undergone a slow drift that is unable to be captured using a stationary PLS model.

Integrated Framework Results The exact steps for implementing the integrated VM framework are summarized in Figure 9. The framework consists of two phases. In the initialization phase, existing training data is used to create a PLS model and its corresponding outlier detection limits. Following the initialization, the prediction/execution phase of the integrated framework relies only on future data and can be run online. The integrated framework is robust against metrology delays since the TPLS outlier detection component will provide abnormal data or prediction error warnings prior to collection of actual metrology results. Once metrology results are collected, the data can then be used to update the model. Implementing the integrated VM framework produces the time series plot of predicted and measured results shown in Figure 10. As shown in Figure 10, the simulated integrated framework is able to track the measured etch rate consistently. In regions with strong outlier presence around wafer samples 380, TPLS detection indices identified the abnormal runs and excluded the abnormal data from future training data. Exclusion of abnormal runs is effective at improving prediction quality when the incoming data quality is poor.

Comparison with other methods Principal component regression (PCA), back-propagated neural network (BPNN) , neural network PLS (NNPLS), and moving window partial least squares without TPLS detection (MW-PLS) were implemented on the same data set to compare the results with our proposed framework. Table 2 18 ACS Paragon Plus Environment

Page 18 of 30

Page 19 of 30

(a) 0.5 Bias Top Variable 1 Top Variable 2 Top Variable 3 Top Variable 4

0.4 Normalized BetaCoefficients

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

0.3 0.2 0.1 0 −0.1 −0.2 −0.3 −0.4 0

2

4 6 8 Moving Window PLS model samples

10

12

(b)

Figure 8: (a) Effect of moving window PLS update on predicting etch rate (bottom) compared against PLS without moving window updates (top), (b) Normalized regression coefficient of top four variables and the bias over the moving window range lists the comparison results, the three metrics used to evaluate model performances are parameter size, cross-validated coefficient of determination (R2 ), and cross validated root mean squared prediction error (RMSPECV). 19 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Table 2: Model size, coefficient of determination, and cross-validated root mean squared prediction error(RMSPECV) of the plasma etch data set with 200 wafers as initial training data and 1300 wafers as subsequent testing and model updating data. Method PCA BPNN NNPLS MW-PLS MW-PLS w/ TPLS

Model size 2711 532 528

R2 0.425 0.72 / 0.32 0.56 / 0.25 0.142 0.436

RMSPECV 0.0495 0.0823 0.0623 0.0998 0.0483

The principal component analysis model used the full dataset (without variable filtering) and was found to be the most effective with 10 components. The back-propagated neural network model took a reduced set of unfolded variables as inputs (650 inputs). The reduced set was chosen using the correlation filtering variable selection method. A two-layer structure with ten neurons in the first layer was adopted to allow for efficient back-propagation. This network configuration resulted in 6210 weights in the BPNN model. Performance degradation was observed for the BPNN model. The testing R2 in the BPNN model was around 0.72 for about 200 wafers initially, but it decreased quickly to around 0.32 afterwards. For NN-PLS, 10 outer components was used. The inner neural network size varied between 3 to 6, and was trained using the Levenberg-Marquadt back-propagation. The NN-PLS was able to provide reasonable approximation initially but also experienced relatively fast degradation in performance similar to BPNN; the drop in performance indicates that these two models suffer from over-fitting.The PLS with moving window update model also took the same set of reduced inputs as the BPNN model. Simply applying moving window update scheme to PLS models resulted in a much lower R2 due to the effects of outlier and abnormal runs in future training data. Lastly, the proposed PLS-TPLS with moving window update results was shown. Overall, the principal component analysis regression model and the PLS-TPLS with moving window update model had the best performance across 1300 testing wafers. The BPNN was able to give excellent regression fit but degrades quickly overtime. Alternatively, PLS with moving window update had poor model fit due to impacts of outlier runs that affected the model results. Using the proposed integrated framework, the resulting model was able to maintain

20 ACS Paragon Plus Environment

Page 20 of 30

Page 21 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

its prediction performance across maintenance boundary and abnormal wafers. These properties are valuable in an online metrology environment where accuracy and certainty of measurement data cannot be guaranteed.

Figure 9: Integrated VM metrology framework

21 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1.6 VM Measured Data RPLS Predicted T−PLS outliers

1.5 1.4 Etch Rates

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 22 of 30

1.3 1.2 1.1 1 0.9 0.8 0

500

1000

1500

Wafer Samples

Figure 10: Integrated Framework Moving Window PLS Model Prediction and TPLS Outlier Detection

Conclusion In this paper, we address the issue of process drifts and disturbances in developing VM models for plasma etch tools in semiconductor manufacturing. We have shown that T-PLS outlier detection, data preprocessing and appropriate variable selection methods can be combined to form a robust VM framework that can track drifting disturbances. Utilizing quality-relevant monitoring can be an effective tool to predict prediction quality offsets and screen out bad batch data during online update. The proposed framework have been shown to work better than individual methods available in literature. Due to limited data, we still need to test the framework against multiple preventive maintenance events over longer periods. In addition, the current proposed framework does not attempt to perform outlier reconstruction, which could be another area for future improvement.

22 ACS Paragon Plus Environment

Page 23 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

Acknowledgement The authors would like to gratefully acknowledge the financial and technical support received from Texas Instruments during the course of this research.

23 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Supporting Information Figures 11,12 and 13 show the cut-off threshold criterion for correlation coefficient, beta coefficient and VIP variable selection methods respectively. The Competitive Adaptive Re-weighted Sampling (CARS) method 35 is used as a benchmark reference to compare the filtering based techniques and the results are shown in 14. Results obtained from CARS is shown in Figure 14. CARS utilizes Monte-Carlo sampling to search for the set of variables with optimal cross-validated prediction performance. From the top figure, we see that the number of selected variables drops exponentially for successive samples. The middle subplot shows the cross-validated residual for each sampling run and can be used to determine the optimal set of variables selected. In this case, the optimal variable selection is achieved in 23 sampling runs and returns about 200 variables instead of the previously given 4000. The identified key contribution phases from the CARS method matched that of the VIP filtering and beta coefficient filtering. However, due to computation times, CARS is not suitable for online implementation. The PLS filtering method is therefore selected as the preferred method of variable selection.

Figure 11: Variables selection results in the PLS models using correlation coefficients as the threshold criterion This information is available free of charge via the Internet at http://pubs.acs.org. Disclaimer 24 ACS Paragon Plus Environment

Page 24 of 30

Page 25 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

Figure 12: Variable selection results in the PLS models using beta coefficient as the threshold criterion

Figure 13: Variable selection in the PLS models using PLS VIP as the evaluation criterion

25 ACS Paragon Plus Environment

Page 26 of 30

4000 2000 0

0

10

20 30 Number of sampling runs

40

50

0

10

20 30 Number of sampling runs

40

50

0

10

20 30 Number of sampling runs

40

50

0.4

RMSECV Regression coefficients path

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Number of sampled variables

Industrial & Engineering Chemistry Research

0.3 0.2

0.2 0 −0.2

Figure 14: Competitive Adaptive Re-weighted Sampling (CARS) variable selection results as reference comparison using the methods outlined in Li 35 Material presented in this publication reflects opinions of the authors, and not of their institutional affiliations.

26 ACS Paragon Plus Environment

Page 27 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

References 1. Zeng, D.; Tan, Y.; Spanos, C. J. Dimensionality reduction methods in virtual metrology. Proceedings of SPIE 2008, 6922, 692238–692238–11. 2. Cheng, F.-t.; Member, S.; Huang, H.-c.; Kao, C.-a. Dual-Phase Virtual Metrology Scheme. 2007, 20, 566–571. 3. Khan, A.; Moyne, J.; Tilbury, D. An approach for factory-wide control utilizing virtual metrology. . . . , IEEE Transactions on 2007, 20, 364–375. 4. Lynn, S. A.; Ringwood, J.; Member, S.; Macgearailt, N. Global and Local Virtual Metrology Models for a Plasma Etch Process. IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING 2012, 25, 94–103. 5. Kadlec, P.; Gabrys, B.; Strandt, S. Data-driven Soft Sensors in the process industry. Computers & Chemical Engineering 2009, 33, 795–814. 6. Kano, M.; Fujiwara, K. Virtual Sensing Technology in Process Industries: Trends and Challenges Revealed by Recent Industrial Applications. Journal of Chemical Engineering of Japan 2013, 46, 1–17. 7. Kadlec, P.; Gabrys, B. Adaptive on-line prediction soft sensing without historical data. Neural Networks (IJCNN), The 2010 . . . 2010, 8. Kang, P.; Lee, H.-j.; Cho, S.; Kim, D.; Park, J.; Park, C.-K.; Doh, S. A virtual metrology system for semiconductor manufacturing. Expert Systems with Applications 2009, 36, 12554– 12561. 9. Hung, M.-h.; Member, S.; Lin, T.-h.; Member, S.; Cheng, F.-t.; Lin, R.-c. A Novel Virtual Metrology Scheme for Predicting CVD Thickness in Semiconductor Manufacturing. 2007, 12, 308–316.

27 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

10. Zeng, D.; Spanos, C. J. Virtual Metrology Modeling for Plasma Etch. IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING 2009, 22, 419–431. 11. Lynn, S.; Ringwood, J. Virtual metrology for plasma etch using tool variables. . . . , 2009. ASMC’09. . . . 2009, 143–148. 12. Khan, A. a.; Moyne, J.; Tilbury, D. Virtual metrology and feedback control for semiconductor manufacturing processes using recursive partial least squares. Journal of Process Control 2008, 18, 961–974. 13. Besnard, J.; Toprac, A. Wafer-to-wafer virtual metrology applied to run-to-run control. 2006. 14. B, G. Development of Virtual Metrology in Semiconductor Manufacturing. Semiconductor Manufacturing 2011, 15. Cherry, G. A. Methods for Improving the Reliability of Semiconductor Fault Detection and Diagnosis with Principal Component Analysis. Ph.D Thesis 2006, 16. Qin, S. J. Survey on data-driven industrial process monitoring and diagnosis. Annual Reviews in Control 2012, 36, 220–234. ¨ 17. Wold, S.; Geladi, P.; Esbensen, K.; Ohman, J. Multi-way principal components-and PLSanalysis. Journal of Chemometrics 1987, 1, 41–56. 18. Nomikos, P.; MacGregor, J. F. Multi-way partial least squares in monitoring batch processes. Chemometrics and Intelligent Laboratory Systems 1995, 30, 97–108. 19. de Jong, S. SIMPLS: an alternative approach to partial least squares regression. Chemometrics and Intelligent Laboratory Systems 1992, 251–263. 20. H¨oskuldsson, A.; Hoskuldsson, A. PLS Regression Methods. Journal of Chemometrics 1988, 2, 211–228.

28 ACS Paragon Plus Environment

Page 28 of 30

Page 29 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Industrial & Engineering Chemistry Research

21. Andersson, M. A comparison of nine PLS1 algorithms. Journal of Chemometrics 2009, 23, 518–529. 22. Kr¨amer, N.; Sugiyama, M. The degrees of freedom of partial least squares regression. Journal of the American Statistical . . . 2011, 106, 697–705. 23. Qin, S. J.; McAvoy, T. J. Nonlinear {PLS} modeling using neural networks. Computers & Chemical Engineering 1992, 16, 379–391. 24. Rosipal, R.; Kr¨amer, N. Overview and recent advances in partial least squares. Subspace, Latent Structure and Feature Selection 2006, 34–51. 25. Helland, K.; Berntsen, H. E.; Borgen, O. S.; Martens, H. Recursive algorithm for partial least squares regression. Chemometrics and Intelligent Laboratory Systems 1992, 14, 129–137. 26. Qin, S. J. Recursive PLS algorithms for adaptive data modeling. Computers & Chemical Engineering 1998, 22, 503–514. 27. Zhou, D.; Li, G.; Qin, S. Total projection to latent structures for process monitoring. AIChE Journal 2010, 56. 28. Qin, S.; Zheng, Y. Quality-relevant and process-relevant fault monitoring with concurrent projection to latent structures. AIChE Journal 2013, 59. 29. Leardi, R.; Lupi´an˜ ez Gonz´alez, A. Genetic algorithms applied to feature selection in PLS regression: how and when to use them. Chemometrics and Intelligent Laboratory Systems 1998, 41, 195–207. 30. Hoskuldsson, A. Variable and subset selection in PLS regression. Chemometrics and Intelligent Laboratory Systems 2001, 55, 23–38. 31. Garc, S.; Kourti, T.; Macgregor, J. F. Troubleshooting of an Industrial Batch Process Using Multivariate. Industrial & Engineering Chemistry Research 2003, 3592–3601. 29 ACS Paragon Plus Environment

Industrial & Engineering Chemistry Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

32. Kassidas, A.; Macgregor, J. F.; Taylor, P. A. Synchronization of Batch Trajectories Using Dynamic Time Warping. AIChE Journal 1998, 44, 864–875. 33. Wold, S.; Sjostrom, M. PLS-regression : a basic tool of chemometrics. 2001, 109–130. 34. Zhang, Y.; Lu, B.; Edgar, T. Batch Trajectory Synchronization with Robust Derivative Dynamic Time Warping. Industrial & Engineering Chemistry Research 2013, 35. Li, H.; Liang, Y.; Xu, Q.; Cao, D. Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration. Analytica chimica acta 2009, 648, 77–84.

30 ACS Paragon Plus Environment

Page 30 of 30