Random Sampling-Based Automatic Parameter Tuning for Nonlinear Programming Solvers

ARTICLE pubs.acs.org/IECR

Random Sampling-Based Automatic Parameter Tuning for Nonlinear Programming Solvers
Weifeng Chen,† Zhijiang Shao,*,† Kexin Wang,† Xi Chen,† and Lorenz T. Biegler‡

† Department of Control Science and Engineering, Zhejiang University, Hangzhou 310027, People's Republic of China
‡ Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States

ABSTRACT: Nonlinear programming solvers play important roles in process systems engineering. The performance of a nonlinear programming solver is influenced significantly by the values of its parameters. Hence, tuning these parameters can enhance the performance of the nonlinear programming solver, especially for hard problems and time-critical applications such as real-time optimization and nonlinear model predictive control. A random sampling (RS) algorithm is utilized to tune the nonlinear programming solver for solving hard problems. By introducing an iterated search technique, heuristic rules, and advanced termination criteria, an enhanced random sampling algorithm is developed to determine parameter configurations that work significantly better than the defaults. These random sampling-based methods can handle all kinds of solver parameters (e.g., categorical, integer, and continuous). Numerical results with parameter configurations from the proposed random sampling-based methods show remarkable performance improvements compared with the defaults. Future work toward the development and improvement of an automatic parameter-tuning tool for nonlinear programming solvers is also outlined.

1. INTRODUCTION
Nonlinear programming solvers play important roles in process systems engineering. They are applied extensively in the design of chemical processes and in process operations. A number of state-of-the-art nonlinear programming solvers, such as IPOPT,1-3 CONOPT,4 and SNOPT,5 are available for such applications. However, the performance of these solvers is not always satisfactory across all problem instances. This study considers improving the performance of existing nonlinear programming solvers without developing or introducing new solution strategies. For each of these solvers, many parameters are usually set by developers based on "rules of thumb". Although these parameters can be adjusted by users, developers provide few guidelines for doing so. Undeniably, the values of these parameters can significantly impact the performance of nonlinear programming solvers. The success of the tuning tool of CPLEX (a commercial mixed integer programming solver, tuning tool introduced in version 11), of STOP,6 and of ParamILS7 indicates the necessity and feasibility of tuning the parameters of nonlinear programming solvers to improve their performance. To investigate this issue, this study is applied to the interior point solver IPOPT (version 3.5.4). IPOPT implements a primal-dual interior point method that avoids the combinatorial problem of identifying active sets of inequalities. Instead, bound constraints are replaced with a logarithmic barrier term added to the objective function, and a sequence of equality-constrained optimization problems is solved with a decreasing or adaptively updated barrier parameter. A filter-based line search is utilized to guarantee global convergence. IPOPT is an excellent nonlinear programming solver, especially for large-scale problems with tens of thousands of bound constraints.
It has been embedded into many modeling environments, including gPROMS, GAMS, AMPL, AIMMS, and Aspen Plus.8
© 2011 American Chemical Society

algorithmic parameters in IPOPT are generally divided into three categories: continuous parameters, integer parameters, and categorical parameters. Specific information on the parameters selected from IPOPT is listed in the Appendix.9 To facilitate the observation, two test problems (dallass and palmer1 from the CUTE test set) are considered along with only two tuning parameters (barrier_tol_factor and mu_linear_decrease_factor). When a tested example is solved successfully, the value of the performance function is set to the actual CPU time cost; otherwise, it is set to a maximum value (10 s for dallass and 15 s for palmer1). As shown in Figure 1, the performance function surface varies drastically over the coordinate plane determined by these two parameters. Proper values for these two parameters can enhance solver performance, suggesting that parameter tuning is worthwhile. In fact, parameter tuning itself can be formulated as an optimization problem. As seen in the figure, the performance function surface is not smooth. Hence, gradient-based approaches cannot be used to solve automatic parameter tuning problems, and it is difficult and time-consuming to identify an optimal parameter configuration. However, there are at least two application scenarios that justify large-scale computational efforts for determining an appropriate parameter configuration.
• Scenario 1 (Hard problems): when large-scale and complex problems cannot be solved by nonlinear programming solvers using their default parameter configuration, they can often be addressed by parameter adjustments.
• Scenario 2 (Improvement needed): in Real Time Optimization (RTO) or Nonlinear Model Predictive Control
Received: April 6, 2010
Accepted: January 28, 2011
Revised: January 22, 2011
Published: March 07, 2011

dx.doi.org/10.1021/ie100826y | Ind. Eng. Chem. Res. 2011, 50, 3907–3918

Industrial & Engineering Chemistry Research


Figure 1. Relationship between performance of optimization solver and parameter configurations.

(NMPC) applications, problem instances with similar structures are formulated. In most cases, optimization problems are solved repeatedly with only minor changes in model parameters (e.g., the flow rate of a feed or the concentration of components). The efficiency of nonlinear programming solvers in these applications is crucial, so it makes sense to save "online" resources through a better parameter configuration at the expense of additional "offline" resources.
Several studies on parameter tuning for optimization algorithms appear in the literature. Parameter tuning based on intelligent sampling of settings throughout the parameter space has been researched by Baz et al.,6 where each parameter is assumed to take a small discrete set of values. Audet and Orban10 utilized mesh adaptive direct search methods to minimize performance measures of the algorithm being tuned, but they did not take discrete parameters into account. Birattari et al.11 described a racing procedure for exploring the configuration of metaheuristics. It was designed for tuning parameters with discrete values; continuous parameters needed to be discretized in advance. Diaz and Laguna12 proposed an approach for tuning parameters by combining fractional experimental design and local search, but it could only handle numerical parameters. Hutter et al.13,14 presented an iterated local search approach, and Ansotegui et al.15 proposed a gender-based genetic algorithm (GGA) for tuning all kinds of parameters, including categorical, continuous, and integer parameters. The pattern search algorithm for mixed variable programming16 could also be introduced to tune all kinds of parameters of optimization algorithms.
The aim of this work is to develop an automatic parameter-tuning tool to enhance the performance of nonlinear programming solvers, as applied in process systems engineering.
A suitable parameter configuration is sought throughout a parameter space consisting of categorical, continuous, and integer parameters. Although ParamILS13,14 and the GGA15 can handle all kinds of parameters, their implementations are more complicated, and they might not work well for the first scenario (hard problems), where the feedback information only indicates whether a solution is found or not. In this work, a random sampling method is utilized to tune the nonlinear programming solver for the first application scenario, after which an enhanced random sampling method is proposed for the second application scenario. This paper is structured as follows. The next section describes the random sampling-based methods for tuning the nonlinear

programming solvers. Section 3 presents experimental results with optimization problems from the CUTE collection, a dynamic optimization problem for a crystallizer, and a nonlinear model predictive control problem for a system of continuous stirred-tank reactors (CSTRs). Conclusions and directions for future work are given in Section 4.

2. RANDOM SAMPLING-BASED TUNING IN PARAMETER SPACE
Automatic parameter tuning is the process of seeking a better parameter configuration for a nonlinear programming solver, whose performance is evaluated based on the statistics returned by the solver for a specific problem instance (or a group of similar instances). There is typically no explicit analytical expression describing the relationship between the parameter configuration and the performance of the nonlinear programming solver; it can only be regarded as a black-box model. Moreover, it is expensive to evaluate the performance function. Usually, CPU time is utilized to measure the performance of the nonlinear programming solver (since the fluctuation of CPU time for solving the same problem with the same parameter configuration is minor, it is treated as deterministic). When tuning the nonlinear programming solver for hard problems, the performance function is only set to "successful" or "failed". To aid in describing the automatic parameter tuning problem, the formulation is given as follows:

    min_θ  f(A, θ, Ins)
    s.t.   θ ∈ Ω                                                      (1)

The objective function f is the performance of the nonlinear programming solver; A is a nonlinear programming solver; Ins is a problem instance or a class of similar problem instances; θ is a parameter configuration for solver A; and Ω is the feasible region of valid parameter configurations. For the first application scenario mentioned above, the objective function of formulation (1) is set to 0 (successful) or 1 (failed). Once the hard problem is solved successfully, the search procedure is terminated. There is no feedback to guide the search process, and the number of possible parameter configurations could be infinite. The Random Sampling (RS) algorithm is conducted to generate a probable typical representative subset of


the entire population, so it is considered a proper choice for tuning the nonlinear programming solver in the first application scenario. The algorithm for this first application scenario is shown in Figure 2. Figure 1 shows that there are many local optima in the parameter space. Finding a locally optimal parameter configuration is usually sufficient for improving the performance of a nonlinear programming solver in practical applications. In such cases, the global optimum is not essential for the second application scenario, so random sampling is still a suitable choice. On the other hand, the random sampling method in Figure 2 cannot be utilized straightforwardly to enhance the performance of the nonlinear programming solver. Two issues need to be addressed. The first is how to guide the search procedure to explore the parameter space using the value of the performance function and other information. The second is how to terminate the search procedure properly and efficiently. For the first issue, an iterated search technique is introduced along with heuristic rules. Random sampling is applied to each neighborhood to find the best parameter configuration. When the current neighborhood is considered sufficiently explored, a new neighborhood is constructed. The best recorded parameter configuration is used as the central point for the new neighborhood if it was updated in the current neighborhood; otherwise, a random parameter configuration is selected. The radius of the neighborhood is kept unchanged throughout. If some of the best recorded parameters remain essentially unchanged over a certain number of neighborhoods, they are considered beneficial to seeking the optimum, and the values of these parameters are fixed.
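The neighborhood construction described above can be sketched as follows. This is an illustrative simplification, not the authors' implementation: the radius rule here is a fixed fraction of each numeric range, whereas the paper fixes the radius relative to each parameter's default value, and the parameter names are hypothetical examples.

```python
import random

def sample_in_neighborhood(center, space, rng=random):
    """Draw a configuration from the neighborhood of `center`: numeric
    parameters are perturbed within a fixed radius (here 10% of the range,
    an illustrative choice) and clipped to their bounds; categorical
    parameters are re-drawn from their full choice set."""
    config = {}
    for name, spec in space.items():
        if isinstance(spec, list):                  # categorical: resample
            config[name] = rng.choice(spec)
        else:
            kind, lo, hi = spec
            radius = 0.1 * (hi - lo)                # kept unchanged throughout
            val = center[name] + rng.uniform(-radius, radius)
            val = min(max(val, lo), hi)             # clip to the feasible box
            config[name] = int(round(val)) if kind == "int" else val
    return config
```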
Heuristic rules are applied based on the latest N best recorded parameter configurations {θ1, θ2, ..., θN}:
• Categorical parameter: if θi(categorical) = θj(categorical) for all i, j ∈ {1, ..., N}, this categorical parameter is fixed;
• Numerical parameter: if R^-1·θj(numerical) < θi(numerical) < R·θj(numerical) (R > 1) for all i, j ∈ {1, ..., N}, the numerical parameter is fixed at θN(numerical).
It is important to design an appropriate termination criterion to judge whether an adequate parameter configuration has been obtained. A good termination criterion makes a good trade-off between the quality of the parameter configuration and the computational effort. Here we apply the concept of convergence depth control for nonlinear programming algorithms.17 An iteration sequence of the search procedure is constructed as follows:

    f_k = { f(A, θ0, Ins),                      if k = 0
          { f(A, θ, Ins)    and k ← k + 1,      if θ is better than θopt        (2)
          { f(A, θopt, Ins) and k ← k + 1,      if θopt is not updated in Ω

where {f_k} is the function sequence, θ0 is the default parameter configuration, θ is the current parameter configuration, θopt is the best recorded parameter configuration, and Ω is the current neighborhood. The termination criterion is designed based on the monotonically nonincreasing iteration sequence {f_k}, and |f_{k-1} - f_k| is utilized to describe the progress of the search procedure. The total improvement obtained so far is defined by |f_0 - f_k|. The k-th degree of progress is defined as follows:

    σ_k = { |f_{k-1} - f_k| / |f_0 - f_k|,      if f_0 ≠ f_k                    (3)
          { 0,                                  if f_0 = f_k

Figure 2. RS for tuning the nonlinear programming solver when dealing with hard problems.
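The RS loop of Figure 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the parameter space, its ranges, and the `solve` callback are hypothetical placeholders; a real run would invoke IPOPT on the hard instance and report success (0) or failure (1).

```python
import random

# Hypothetical mixed parameter space: a list denotes categorical choices;
# a tuple denotes a typed numeric range (type, low, high).
PARAM_SPACE = {
    "mu_strategy": ["monotone", "adaptive"],        # categorical
    "max_soc": ("int", 0, 40),                      # integer
    "barrier_tol_factor": ("float", 0.01, 100.0),   # continuous
}

def sample_configuration(space, rng=random):
    """Draw one parameter configuration uniformly from the mixed space."""
    config = {}
    for name, spec in space.items():
        if isinstance(spec, list):                  # categorical
            config[name] = rng.choice(spec)
        elif spec[0] == "int":                      # integer
            config[name] = rng.randint(spec[1], spec[2])
        else:                                       # continuous
            config[name] = rng.uniform(spec[1], spec[2])
    return config

def random_sampling(solve, space, max_evals=1000):
    """Scenario 1: return the first configuration for which `solve`
    reports success (f = 0), or None if the budget is exhausted."""
    for _ in range(max_evals):
        theta = sample_configuration(space)
        if solve(theta) == 0:                       # 0 = successful, 1 = failed
            return theta
    return None
```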

Figure 3. ERS for improving the performance of a nonlinear programming solver by tuning parameters.

If σk < σ0 (where σ0 is a predefined number greater than zero and less than 1) is satisfied for M consecutive iterations, the performance is considered unable to improve further and the search procedure is terminated. The search procedure is guaranteed to terminate under the proposed criterion, because σk converges to zero under the hypothesis that the objective function fk is bounded below.
Remark 1: If f0 = fk, it implies that f0 = f1 = ... = fk-1 = fk. This is consistent with σk = 0.
The Enhanced Random Sampling (ERS) algorithm for the second application scenario is shown in Figure 3.
Remark 2: In practical applications, the performance of the nonlinear programming solver can be measured by the overall computational time, the overall number of function calls, the number of failures, the total number of iterations, or the agreement of the algorithm with some prescribed behavior.10 It is reasonable to assume that the objective function fk is bounded below; then fk has a finite limit according to the Monotone Convergence Theorem (a bounded monotonic sequence is convergent).
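The convergence-depth termination test built from eqs 2 and 3 can be sketched as follows (an illustration, with the monotone sequence {fk} supplied as a list):

```python
def should_terminate(f_history, sigma0=0.1, M=3):
    """Convergence-depth termination: stop once the k-th degree of progress
    sigma_k = |f_{k-1} - f_k| / |f_0 - f_k| has stayed below sigma0 for M
    consecutive iterations. `f_history` is the monotone sequence {f_k}."""
    if len(f_history) <= M:
        return False
    f0 = f_history[0]
    below = 0                                   # consecutive small-progress count
    for k in range(1, len(f_history)):
        fk, fk_prev = f_history[k], f_history[k - 1]
        # Remark 1: if f0 == fk the whole sequence is flat, so sigma_k = 0.
        sigma = 0.0 if f0 == fk else abs(fk_prev - fk) / abs(f0 - fk)
        below = below + 1 if sigma < sigma0 else 0
    return below >= M
```

With σ0 = 0.1 and M = 3 (the settings used in Section 3), a sequence whose last three steps each contribute less than 10% of the total improvement triggers termination.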


Remark 3: The technique of "adaptive capping"14 can be used to enhance the efficiency of the ERS as applied to the automatic parameter tuning problem. During a performance function evaluation, if the current CPU time cost exceeds the recorded best, the current run (i.e., solving the NLP problem) may be terminated immediately, because completing the run cannot make the current parameter configuration acceptable.
In the next section, the RS and ERS are shown to perform better than previous parameter-tuning work, i.e., parameter autotuning (PAT),18 which is implemented based on a combined metaheuristic and direct search method.
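A hedged sketch of adaptive capping: the evaluation is aborted as soon as its elapsed time exceeds the incumbent best, since finishing the run cannot help the current configuration. The step-wise `solve_step` interface is an assumption for illustration; a real implementation would instead pass a CPU-time limit directly to the solver.

```python
import time

def evaluate_with_cap(solve_step, best_time, poll=0.001):
    """Adaptive capping (sketch): run the solver in small steps and abort the
    evaluation once the elapsed time exceeds the incumbent best `best_time`.
    `solve_step` returns True when the solve has finished."""
    start = time.perf_counter()
    while not solve_step():
        if time.perf_counter() - start > best_time:
            return float("inf")        # capped: cannot beat the incumbent
        time.sleep(poll)               # yield before polling again
    return time.perf_counter() - start
```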

3. NUMERICAL RESULTS
The examples in the first numerical experiment are from the CUTE collection. In the second numerical experiment, a dynamic optimization problem for a crystallizer is studied. The last numerical experiment is based on nonlinear model predictive control for a system of two nonisothermal CSTRs in series. The parameters used for tuning are listed in Table 16 in the Appendix. There are 5 categorical parameters, 13 continuous parameters, and 2 integer parameters; parameters not considered in this work can be added easily. For IPOPT, the desired convergence tolerance is 1.0×10^-8; the desired threshold for dual infeasibility is 1.0×10^-4; the desired threshold for constraint violation is 1.0×10^-4; and the desired threshold for the complementarity conditions is 1.0×10^-4. The maximum number of iterations is 3000. The RS terminates when the number of performance function evaluations exceeds 1000 or the total CPU time exceeds 24 h. The radius is set to 10 times the default value for each integer and continuous parameter. For the termination condition of the ERS, σ0 is set to 0.1 and M is set to 3. For the heuristic rules, N is set to 3 and R is set to 1.5. A neighborhood is updated after it has been evaluated more than 50 times. The radius is set to the default value for each integer and continuous parameter. All numerical experiments are carried out on a Dell Precision T3400 running Windows XP with an Intel(R) Core(TM) 2 Duo CPU E8200 (2.66 GHz) and 2.0 GB RAM.
3.1. Examples from CUTE. 3.1.1. Hard Problem Test. In the following, RS seeks a suitable parameter configuration for problems that the nonlinear programming solver fails to solve with the default parameter configuration. Each resulting configuration is specific to the single case study. Two examples, Cresc50 and Polak3, are selected from the CUTE collection.
Table 1 gives detailed descriptions of these examples. The columns of Table 1 represent the number of variables (#var.), bounds on variables (#bdd), equality constraints (#eq), inequality constraints (#ineq), nonzero elements of the Jacobian matrix of the constraints (#nzJac), and nonzero elements of the Hessian matrix of the Lagrange function (#nzHess). IPOPT (version 3.5.4) with the default parameter configuration fails to solve either of them, as indicated in Table 2 and Table 4.
I. Cresc50 Problem. RS is run 20 times. With an average CPU time of 113.75 s and an average of 8.3 performance function evaluations, RS finds a suitable parameter configuration for IPOPT and Cresc50 is solved successfully. Table 2 shows the numerical results for both the default and tuned parameter configurations for Cresc50. The columns of Table 2 represent the number of iterations (iter), CPU time (CPU), value of the objective function (objective), constraint violation


Table 1. Detailed Information on Problems from the CUTE Collection

problem    #var.   #bdd   #eq   #ineq   #nzJac   #nzHess
Cresc50    6       4      0     100     550      17
Polak3     12      0      0     10      120      11

(constraints), dual infeasibility (dual), returned status (status) of the solver, and the converged flag (O for "successful" and × for "failed"). Table 3 lists the tuned values of the parameters. The default values are listed in Table 16 in the Appendix.
II. Polak3 Problem. RS is run 20 times. With an average CPU time of 15.75 s and an average of 1.9 performance function evaluations, RS finds a suitable parameter configuration for IPOPT and Polak3 is solved successfully. Table 4 shows the numerical results for both the default and tuned parameter configurations for Polak3. Table 5 lists the tuned values of the parameters. The default values are listed in Table 16 in the Appendix.
These experiments show that the hard problems can be solved successfully using the parameter configurations proposed by RS. Compared with the default parameter configuration, most parameters in Tables 3 and 5 have been modified; however, only a few key parameters play an important role in solving the hard problems. Using RS to tune a single parameter at a time, we identify which parameter is critical for the hard problems demonstrated earlier. The findings show that the parameter obj_scaling_factor is crucial for both hard problems. For Cresc50, changing obj_scaling_factor to 0.271584 leads to a successful solution; for Polak3, setting obj_scaling_factor to 2.93036 achieves a similar result. These results, and the corresponding results for PAT,18 confirm that obj_scaling_factor is a key element, as this parameter influences the gradient of the objective function and the Hessian matrix of the Lagrange function. Poor scaling can lead to an ill-conditioned Hessian, which deteriorates the search procedure of the nonlinear programming solver. There are other hard problems in the CUTE collection that cannot be solved with the default parameter configuration.
Tuning the key parameter obj_scaling_factor to a particular value solves these hard problems; Table 6 lists the numerical results. However, how to define and detect a hard problem instance that is a product of poor scaling remains an open issue.
3.1.2. Improvement Test. This subsection demonstrates how ERS can tune IPOPT for solving a class of similar problem instances. IPOPT is tuned on a set of training instances that are, in some sense, representative of the whole set of instances. Based on these training instances, a suitable parameter configuration is generated by ERS, which is expected to perform better over the whole set of instances than the default parameter configuration. The instances solved by IPOPT are derived from Palmer1 (selected from the CUTE collection), which is reformulated as follows:

    min  Σ_{i=1}^{31} [ B_i - ( A_i^2·x_1 + x_2 / (x_3 + A_i^2·x_4^-1) ) ]^2      (4)

    s.t.  x_2 ≥ 0.00001,  x_3 ≥ 0.00001,  x_4 ≥ 0.00001


Table 2. Numerical Results for Cresc50 with the Default and Tuned Parameter Settings

statistics   iter   CPU (s)   objective          constraints    dual           status                                    flag
default      635    2.50      -3.08440×10^-14    3.757×10^-1    2.606          converged to a locally infeasible point   ×
tuned        128    0.53      7.86247×10^-1      2.997×10^-11   2.093×10^-10   optimal solution found                    O

Table 3. Tuned Values of Parameters for Solving Cresc50

parameter                 tuned value            parameter                            tuned value
alpha_for_y               safe_min_dual_infeas   mu_linear_decrease_factor            5.78632×10^-1
alpha_for_y_tol           66.8755                mu_max                               1.00751×10^6
barrier_tol_factor        7.06992                mu_max_fact                          7.11356×10^2
bound_frac                8.78771×10^-2          mu_min                               9.35203×10^-11
bound_mult_init_method    constant               mu_oracle                            loqo
bound_mult_init_val       9.43898                mu_strategy                          adaptive
bound_push                1.08164×10^-2          mu_superlinear_decrease_power        1.78333
fixed_mu_oracle           quality-function       nlp_scaling_max_gradient             4.30338×10^2
max_soc                   36                     obj_scaling_factor                   10.1483
mu_init                   9.58769×10^-2          quality_function_max_section_steps   24

Table 4. Numerical Results for Polak3 with the Default and Tuned Parameter Settings

statistics   iter   CPU (s)   objective     constraints   dual           status                                  flag
default      3000   10.32     3.9975×10^2   1.044×10^3    3.991×10^7     maximum number of iterations exceeded   ×
tuned        169    0.343     5.9330        0.000         5.149×10^-10   optimal solution found                  O

Table 5. Tuned Values of Parameters for Solving Polak3

parameter                 tuned value       parameter                            tuned value
alpha_for_y               min               mu_linear_decrease_factor            3.48966×10^-1
alpha_for_y_tol           36.7226           mu_max                               9.50511×10^5
barrier_tol_factor        87.7462           mu_max_fact                          3.70348×10^3
bound_frac                2.98575×10^-2     mu_min                               5.59014×10^-11
bound_mult_init_method    constant          mu_oracle                            quality-function
bound_mult_init_val       1.93097           mu_strategy                          monotone
bound_push                1.27702×10^-2     mu_superlinear_decrease_power        1.17622
fixed_mu_oracle           average_compl     nlp_scaling_max_gradient             5.49681×10^2
max_soc                   35                obj_scaling_factor                   0.815088
mu_init                   9.06702×10^-1     quality_function_max_section_steps   7

where A_i, B_i, i = 1, ..., 31, are constant parameters. A group of similar instances is created by altering the parameters A_1 and B_1. Specifically, A_1 is varied from (1 - 5%)a_1 to (1 + 5%)a_1 in steps of (1%)a_1, where a_1 = -1.788963 is the original value of A_1. Next, B_1 is varied in the same way (b_1 = 78.596218). This yields 121 instances (including the original) distinguished by different values of A_1 and B_1. The nine instances corresponding to the elements of the Cartesian product of the set {0.95a_1, a_1, 1.05a_1} for A_1 and the set {0.95b_1, b_1, 1.05b_1} for B_1 are used as training instances to tune the solver. The performance function is the infinity norm of the CPU time cost vector (i.e., the maximal CPU time over these 9 instances). ERS is run 20 times. With an average CPU time of 16.5 s and an average of 143.0 performance function evaluations, a proper parameter configuration is produced. The tuned parameter values are listed in Table 7, while the default values are listed in Table 16 in the Appendix. The numerical results over the whole set of instances using the default and tuned parameter configurations are compared.
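The instance-generation scheme above can be sketched as follows (the `solve_time` callback is a hypothetical stand-in for running IPOPT on one perturbed instance and timing it):

```python
from itertools import product

a1, b1 = -1.788963, 78.596218   # original Palmer1 values of A1 and B1

# 121 instances: A1 and B1 each varied from -5% to +5% in 1% steps.
instances = [(a1 * (1 + i / 100.0), b1 * (1 + j / 100.0))
             for i in range(-5, 6) for j in range(-5, 6)]

# 9 training instances: Cartesian product of the -5% / 0% / +5% levels.
training = list(product((0.95 * a1, a1, 1.05 * a1),
                        (0.95 * b1, b1, 1.05 * b1)))

def performance(solve_time, configuration):
    """Performance of a configuration over the training set: the infinity
    norm (maximum) of the CPU-time vector across the 9 training instances.
    `solve_time(A1, B1, configuration)` is assumed to return CPU seconds."""
    return max(solve_time(A1, B1, configuration) for A1, B1 in training)
```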

Table 8 shows the numerical comparison for the original Palmer1 instance. With the same objective value and satisfied constraints, the CPU time cost decreases drastically; this is also the case for the other instances. Figure 4 shows the comparison between the default and tuned parameter configurations for all instances. Clearly, the performance function surface (i.e., CPU time cost) generated with the tuned parameter configuration is always better than that with the default. The average CPU time for solving the 121 instances is 1.0671 s with the default parameter configuration, versus 0.0216 s with the tuned configuration. The computational effort to solve these instances is reduced remarkably.
3.2. Dynamic Optimization Problem for the Crystallizer. In general, crystallization models involve population balance and moment equations. In addition to the generic moment equations, a nucleation rate equation is included for the batch crystallization process.19 In order to maximize crystal length, a dynamic


Figure 4. Numerical results of the optimization solver for solving all instances derived from Palmer1 by using the default and tuned parameter configuration (logarithmic scale is applied to the vertical axis).

Table 6. Numerical Results for Key Parameter Tuning

                     Nonmsqrt                           Palmer5a                          Sineali
statistics           default          tuned             default          tuned            default          tuned
obj_scaling_factor   1.0              6.28773×10^-1     1.0              2.29420          1.0              4.21979×10^-1
iter                 3000             171               3000             35               3000             56
CPU (s)              13.22            0.39              6.64             0.09             6.61             0.13
objective            7.51800×10^-1    7.51803×10^-1     3.91605×10^-2    2.12809          -1.90096×10^3    -1.87336×10^3
constraints          0                0                 0                0                0                0
dual                 1.79406×10^-5    1.49153×10^-8     4.66628×10^1     2.16371×10^-8    3.43806×10^-2    1.91184×10^-8
flag                 ×                O                 ×                O                ×                O

Table 7. Tuned Values of Parameters for Solving Similar Instances Derived from Palmer1

parameter                 tuned value        parameter                            tuned value
alpha_for_y               dual-and-full      mu_linear_decrease_factor            4.51557×10^-2
alpha_for_y_tol           12.5727            mu_max                               2.00781×10^4
barrier_tol_factor        3.00410×10^-1      mu_max_fact                          1.11139×10^3
bound_frac                6.98836×10^-3      mu_min                               1.98859×10^-12
bound_mult_init_method    mu-based           mu_oracle                            quality-function
bound_mult_init_val       2.75359            mu_strategy                          adaptive
bound_push                1.51228×10^-2      mu_superlinear_decrease_power        1.77049
fixed_mu_oracle           quality-function   nlp_scaling_max_gradient             172.690
max_soc                   4                  obj_scaling_factor                   5.66501×10^-1
mu_init                   4.57442×10^-2      quality_function_max_section_steps   11

Table 8. Numerical Results for the Original Instance by Using the Default and Tuned Parameter Settings

statistics   iteration   CPU (s)   objective      constraints   status                   flag
default      697         1.41      1.17546×10^4   0.0           optimal solution found   O
tuned        9           0.02      1.17546×10^4   0.0           optimal solution found   O


Table 9. Parameters and Constants of the Crystallization Model

parameters and constants                           value        parameters and constants                       value
initial crystal size: L0                           5×10^-5 m    mass of seeds added: Ws0                       2×10^-3 kg
mean size of the seeds: Ls0                        5×10^-4 m    true specific gravity of the crystals: ρ       1.58 kg/L
shape factor for area of the crystals: α           0.2          shape factor for volume of the crystals: β     1.2
constant: Kg                                       0.00418      constant: Bn                                   385
total mass in the crystallizer: w                  2.025 kg     mean heat capacity: Cp                         0.4 kcal/(kg·°C)
specific melting latent heat of the product: Kc    35 kcal/kg   heat-transfer coefficient × area: KE           377 kcal/(°C·h)

Table 10. Simultaneous Approach Compared with Previous Work20

                  number of finite elements   collocation technique   number of collocation points   tool       solver
this work         25                          Radau                   3                              OCC21      IPOPT (version 3.5.4)
previous work20   50                          Radau                   3                              DynoPC22   Reduced-space IPOPT

optimization problem20 is formulated as follows:

    max_{z(t), u(t), p}  Ls(tf)

    s.t.  dLs/dt = Kg·Ls^0.5·ΔT^1.1
          dN/dt  = Bn·ΔT^5.72
          dL/dt  = N·(dLs/dt) + L0·(dN/dt)
          dA/dt  = 2α·N·Ls·(dLs/dt) + L0^2·(dN/dt)                                (5)
          dVc/dt = 3β·N·Ls^2·(dLs/dt) + L0^3·(dN/dt)
          dM/dt  = 3·(Ws0/Ls0^3)·Ls^2·(dLs/dt) + ρ·V·(dVc/dt)
          dC/dt  = -(dM/dt)/V
          dT/dt  = (Kc/(w·Cp))·(dM/dt) - (KE/(w·Cp))·(T - Tj)
          φ(C) ≤ T,  Tj ∈ [10 °C, 100 °C]

where Ls (m) is the mean crystal size, N (10^6/L) is the number of nuclei per liter of solvent, L (10^6/L) is the total length of crystals per liter of solvent, A (m^2/L) is the total surface area of crystals per liter of solvent, Vc (m^3/L) is the total volume of crystals per liter of solvent, C (kg/L) is the solute concentration, M (kg) is the total mass of crystals, V (L) is the volume of the solvent, and ΔT (°C) is the positive deviation from the equilibrium temperature. The function φ(C) is calculated from a polynomial relationship between concentration and equilibrium temperature. The parameters and constants of the crystallization model are listed in Table 9.20 In this work, the simultaneous approach is applied to the crystallization problem and the generated NLP is solved with IPOPT. Some differences from a previous work20 on the simultaneous approach can be observed in the formulation; the details are given in Table 10.
Remark 4: In this work, the dynamic optimization problem is discretized with OCC,21 after which a set of FORTRAN files (bounds.f, constraints.f, Hessian.f, jacobian.f, objective.f, start.f, and svalues.inc) is generated. These files are used to create a


Dynamic Link Library exposing necessary methods which can be called by IPOPT. Unfortunately, this NLP formulation cannot be solved successfully by IPOPT (version 3.5.4) with the default parameter configuration. RS is utilized to tune the parameters of IPOPT in order to solve the crystallization problem; 20 tuning runs are performed. At 3.11-h average CPU time cost and 140.8 times average evaluation of the performance function, IPOPT is finally well-tuned and the crystallization problem is successfully solved. The numerical results for both the default and tuned parameter configuration are listed in Table 11. The optimal profiles of the mean crystal size and the jacket temperature are shown in Figure 5. The suggested parameter configuration is listed in Table 12. The default values are listed in Table 16 in the Appendix. In order to verify its validity, the results obtained from the tuned parameter configuration are compared with previous work.20 To facilitate the comparison, the optimal cooling and crystal size profiles for the crystallizer obtained by the previous work20 is shown as Figure 6. The model information of the generated NLPs and the corresponding results are listed in Table 13. As presented by Table 13, Figure 5, and Figure 6, there is only a slight difference in the objectives and the crystal size profile is almost the same. 3.3. Nonlinear Model Predictive Control for System of CSTRs. A system of two CSTR in series with an irreversible reaction AfB is considered.23,24 The mass and energy balances produce the following nonlinear differential equations dV1 ¼ qF - q1 dt   dðV1 CA, 1 Þ E - q1 CA, 1 ¼ qF CA, F - k0 CA, 1 V1 exp RT1 dt   ΔHk0 CA, 1 V1 dðV1 T1 Þ E exp - q1 T1 - V1 Qc ¼ qF T F þ Fcp dt RT1

dV2/dt = q1 - q2
d(V2 CA,2)/dt = q1 CA,1 - k0 CA,2 V2 exp(-E/(R T2)) - q2 CA,2    (6)
d(V2 T2)/dt = q1 T1 + [ΔH k0 CA,2 V2 / (ρ cp)] exp(-E/(R T2)) - q2 T2

The manipulated variables involve the valve position at the outlet of the second reactor, u1, and the heat transfer rate from the first

dx.doi.org/10.1021/ie100826y |Ind. Eng. Chem. Res. 2011, 50, 3907–3918


Figure 5. Optimal cooling and crystal size profiles for the crystallizer.

Figure 6. Optimal cooling and crystal size profiles for the crystallizer from the previous work.20

Table 11. Numerical Results for the Crystallization Problem by Using the Default and Tuned Parameter Settings

statistics   iter   CPU (s)   objective         constraints       dual              flag   status
default      515    38.67     -5.000 × 10^-4    1.077 × 10^1      1.000             ✗      restoration failed
tuned        669    23.09     -4.372 × 10^-3    2.062 × 10^-11    2.385 × 10^-15    O      optimal solution found

Table 12. Tuned Values of Parameters for Solving the Crystallization Problem

parameter               tuned value        parameter                            tuned value
alpha_for_y             dual-and-full      mu_linear_decrease_factor            4.27137 × 10^-1
alpha_for_y_tol         62.9276            mu_max                               6.97290 × 10^5
barrier_tol_factor      50.6677            mu_max_fact                          7.26060 × 10^3
bound_frac              2.70409 × 10^-2    mu_min                               7.70305 × 10^-11
bound_mult_init_method  mu-based           mu_oracle                            loqo
bound_mult_init_val     9.68505            mu_strategy                          monotone
bound_push              2.22538 × 10^-2    mu_superlinear_decrease_power        1.04311
fixed_mu_oracle         quality-function   nlp_scaling_max_gradient             720.957
max_soc                 11                 obj_scaling_factor                   6.88998
mu_init                 8.10389 × 10^-1    quality_function_max_section_steps   16

Table 13. Model Information on the Discretized Crystallization Problem and Corresponding Results

                  #var.   #eq    objective        solver
this work         1050    975    4.372 × 10^-3    IPOPT (version 3.5.4) tuned by RS
previous work20   1900    1750   4.326 × 10^-3    Reduced-Space IPOPT


Figure 7. Profiles of manipulated and controlled variables.

Figure 8. Numerical results for solving the optimal control problems on each sampling instance by using the default and tuned parameter configuration (logarithmic scale is applied to the vertical axis).

Table 14. Parameters and Constants of the System of CSTRs

parameters and constants               value        parameters and constants                value
constant: c1                           10           pre-exponential factor: k0              7.2 × 10^10
constant: c2                           48.1909      activation energy/gas constant: E/R     1.0 × 10^4 K
feed flow rate: qF                     100 L/min    density of the fluid: ρ                 1000 g/L
concentration of A in the feed: CA,F   1.0 mol/L    heat capacity of the fluid: cp          0.239 J/(g·K)
feed temperature: TF                   350 K        energy of reaction: ΔH                  4.78 × 10^4 J/mol

reactor, u2. There are six state variables for the two reactors: the volume of liquid (V1, V2), the concentration of species A (CA,1, CA,2), and the temperature (T1, T2). In addition, there are three algebraic equations:

q1 = c1 √(V1 - V2)
q2 = c1 √(V2) u1    (7)
Qc = c2 u2

where q1 and q2 are the flow rates at the outlets of the first and second reactors, respectively, and Qc is the rate of cooling from the jacket. The model parameters and some constants are listed in Table 14.24 The system is at steady state with u1 = 1.0 and u2 = 1.0 before t = 5 min. The initial volume of liquid, concentration of species A, and temperature are 200 L, 0.0357 mol/L, and 446.471 K in the first reactor and 100 L, 0.0018 mol/L, and 453.2585 K in the second reactor.
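As a concreteness check, the two-CSTR model (6)-(7) can be sketched as an ordinary-differential right-hand-side function. This is an illustrative sketch, not code from the paper: the function and variable names are ours, the parameter values are those of Table 14, and the sign conventions follow the equations as printed.

```python
import math

# Parameters and constants from Table 14 (units as listed there)
k0 = 7.2e10    # pre-exponential factor
E_R = 1.0e4    # activation energy / gas constant, K
qF = 100.0     # feed flow rate, L/min
CAF = 1.0      # concentration of A in the feed, mol/L
TF = 350.0     # feed temperature, K
rho = 1000.0   # density of the fluid, g/L
cp = 0.239     # heat capacity of the fluid, J/(g K)
dH = 4.78e4    # energy of reaction, J/mol
c1, c2 = 10.0, 48.1909

def cstr_rhs(x, u):
    """Right-hand side of model (6) with algebraic equations (7).

    x = (V1, CA1, T1, V2, CA2, T2) and u = (u1, u2); the returned
    derivatives are those of V1, V1*CA1, V1*T1, V2, V2*CA2, V2*T2,
    i.e., of the products differentiated in the printed equations."""
    V1, CA1, T1, V2, CA2, T2 = x
    u1, u2 = u
    # Algebraic equations (7)
    q1 = c1 * math.sqrt(V1 - V2)
    q2 = c1 * math.sqrt(V2) * u1
    Qc = c2 * u2
    # Arrhenius reaction terms
    r1 = k0 * CA1 * math.exp(-E_R / T1)
    r2 = k0 * CA2 * math.exp(-E_R / T2)
    return (
        qF - q1,                                                  # dV1/dt
        qF * CAF - r1 * V1 - q1 * CA1,                            # d(V1*CA1)/dt
        qF * TF + dH * r1 * V1 / (rho * cp) - q1 * T1 - V1 * Qc,  # d(V1*T1)/dt
        q1 - q2,                                                  # dV2/dt
        q1 * CA1 - r2 * V2 - q2 * CA2,                            # d(V2*CA2)/dt
        q1 * T1 + dH * r2 * V2 / (rho * cp) - q2 * T2,            # d(V2*T2)/dt
    )
```

At the reported pre-disturbance steady state, x = (200, 0.0357, 446.471, 100, 0.0018, 453.2585) with u = (1.0, 1.0), both outlet flows equal the 100 L/min feed, so the two volume derivatives vanish exactly.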


Table 15. Tuned Values of Parameters for Solving the Optimal Control Problems

parameter               tuned value        parameter                            tuned value
alpha_for_y             min                mu_linear_decrease_factor            3.19089 × 10^-1
alpha_for_y_tol         19.3283            mu_max                               5.26676 × 10^4
barrier_tol_factor      24.3315            mu_max_fact                          1.73389 × 10^3
bound_frac              5.63503 × 10^-3    mu_min                               5.30381 × 10^-12
bound_mult_init_method  constant           mu_oracle                            quality-function
bound_mult_init_val     1.93985            mu_strategy                          monotone
bound_push              1.55382 × 10^-3    mu_superlinear_decrease_power        1.39973
fixed_mu_oracle         probing            nlp_scaling_max_gradient             70.3439
max_soc                 2                  obj_scaling_factor                   1.85659
mu_init                 1.39956 × 10^-1    quality_function_max_section_steps   11

Table 16. Specific Information on Parameters Selected from IPOPT

1. alpha_for_y. Meaning: method to determine the step size for constraint multipliers. Default: primal. Bound/set: primal, bound_mult, min, max, full, primal-and-full, min_dual_infeas, safe_min_dual_infeas, dual-and-full. Type: categorical.
2. alpha_for_y_tol. Meaning: tolerance for switching to full equality multiplier steps. Default: 10.0. Bound/set: (0, +∞). Type: continuous.
3. barrier_tol_factor. Meaning: factor for mu in barrier stop test. Default: 10.0. Bound/set: (0, +∞). Type: continuous.
4. bound_frac. Meaning: desired minimum relative distance from the initial point to bound. Default: 0.01. Bound/set: (0, 0.5]. Type: continuous.
5. bound_mult_init_method. Meaning: initialization method for bound multipliers. Default: constant. Bound/set: constant, mu-based. Type: categorical.
6. bound_mult_init_val. Meaning: initial value for the bound multipliers. Default: 1. Bound/set: (0, +∞). Type: continuous.
7. bound_push. Meaning: desired minimum absolute distance from the initial point to bound. Default: 0.01. Bound/set: (0, +∞). Type: continuous.
8. fixed_mu_oracle. Meaning: oracle for the barrier parameter when switching to fixed mode. Default: average_compl. Bound/set: probing, quality-function, average_compl, loqo. Type: categorical.
9. max_soc. Meaning: maximum number of second order correction trial steps at each iteration. Default: 4. Bound/set: [0, +∞). Type: integer.
10. mu_init. Meaning: initial value for the barrier parameter. Default: 0.1. Bound/set: (0, +∞). Type: continuous.
11. mu_linear_decrease_factor. Meaning: determines linear decrease rate of barrier parameter. Default: 0.2. Bound/set: (0, 1). Type: continuous.
12. mu_max. Meaning: maximum value for barrier parameter. Default: 1.0 × 10^5. Bound/set: (0, +∞). Type: continuous.
13. mu_max_fact. Meaning: factor for initialization of maximum value for barrier parameter. Default: 1.0 × 10^3. Bound/set: (0, +∞). Type: continuous.
14. mu_min. Meaning: minimum value for barrier parameter. Default: 1.0 × 10^-11. Bound/set: (0, +∞). Type: continuous.
15. mu_oracle. Meaning: oracle for a new barrier parameter in the adaptive strategy. Default: quality-function. Bound/set: probing, loqo, quality-function. Type: categorical.
16. mu_strategy. Meaning: update strategy for barrier parameter. Default: monotone. Bound/set: monotone, adaptive. Type: categorical.
17. mu_superlinear_decrease_power. Meaning: determines superlinear decrease rate of barrier parameter. Default: 1.5. Bound/set: (1, 2). Type: continuous.
18. nlp_scaling_max_gradient. Meaning: maximum gradient after NLP scaling. Default: 100. Bound/set: (0, +∞). Type: continuous.
19. obj_scaling_factor. Meaning: scaling factor for the objective function. Default: 1. Bound/set: (0, +∞). Type: continuous.
20. quality_function_max_section_steps. Meaning: maximum number of search steps during the direct search procedure determining the optimal centering parameter. Default: 8. Bound/set: [0, +∞). Type: integer.

The controlled variables in this application are the concentration of species A in the second reactor (CA,2) and the temperature in the second reactor (T2). The objective of the control is to track the set-point values for CA,2 and T2, which are 3.838024 × 10^-6 mol/L and 550.0003 K, respectively. At t = 5 min, the NMPC strategy is applied. It is assumed that there are no disturbances and no model mismatch in this NMPC simulation. The sampling time is 15 s. The prediction and control horizons both have 30 sampling intervals. The length of the simulation horizon is 100 sampling intervals; the objective is to make the system track the new set point


and reach a new steady state over the time interval [5, tf], tf = 30 min. The open-loop optimal control problem for each sampling instant is formulated as follows:

min  ∫_{tk-1}^{tk} [1.0 × 10^10 (CA,2 - Csp)^2 + (T2 - Tsp)^2 + (u1 - u1,k-1)^2 + (u2 - u2,k-1)^2] dt
s.t. DAE model (6)-(7)
     V1 > 0 L, V2 > 0 L, V1 > V2
     0 < CA,1 < 1.0, 0 < CA,2 < 1.0
     T1 > 350 K, T2 > 350 K
     u1 > 0, u2 > 0                                      (8)

where the subscript "sp" denotes the set point, and k-1 and k denote the two end points of the kth sampling interval. This optimal control problem (8) can be solved with simultaneous strategies by fully discretizing both the state and control variables. The prediction horizon (30 sampling intervals of 15 s) is divided into 60 elements. Three Radau collocation points per finite element are used to discretize the DAE equations with OCC.21 For the discretized system, the number of variables is 2340; the number of equality constraints is 1980; the degrees of freedom are 360; the number of nonzero elements of the Jacobian of the constraints is 10599; and the number of nonzero elements of the Hessian is 4140.

The numerical results of NMPC applied in this application using IPOPT with the default parameter configuration are shown in Figures 7 and 8. As shown in Figure 7, the set point can be tracked by adjusting the valve position at the outlet of the second reactor and the heat transfer rate from the first reactor. Meanwhile, Figure 8 demonstrates that, for many sampling instants, the CPU time required by IPOPT to solve the corresponding optimal control problems with the default parameter configuration is very high and even exceeds the sampling time. Thus, there is considerable room for enhancing the efficiency of IPOPT on these optimal control problems, namely, by tuning the parameters automatically.

In the NMPC simulation, 100 optimal control problems need to be solved. As mentioned in Subsection 3.1.2, the first 10 optimal control problems are utilized as training instances to tune IPOPT. The performance function is constructed as the maximal CPU time consumed in solving the training instances. ERS is run 20 times. With an average CPU time of 905.7 s and an average of 134.85 evaluations of the performance function, ERS proposes a parameter configuration. The tuned values of the parameters are listed in Table 15; the default values are listed in Table 16 in the Appendix.
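The tuning loop described above can be illustrated with a minimal random-sampling sketch that scores each configuration by the maximal CPU time over the training instances, treating failures as infinitely bad. The three-parameter space and the solve(instance, config) callback are hypothetical stand-ins of our own; the actual ERS adds iterated search, heuristic rules, and advanced termination criteria on top of this skeleton.

```python
import random
import time

# Hypothetical parameter space: (name, kind, domain), following Table 16.
PARAM_SPACE = [
    ("mu_strategy", "categorical", ["monotone", "adaptive"]),
    ("mu_init", "continuous", (1e-4, 1.0)),
    ("max_soc", "integer", (0, 20)),
]

def sample_configuration(rng):
    """Draw one configuration uniformly at random; RS handles
    categorical, integer, and continuous parameters uniformly."""
    config = {}
    for name, kind, domain in PARAM_SPACE:
        if kind == "categorical":
            config[name] = rng.choice(domain)
        elif kind == "integer":
            config[name] = rng.randint(*domain)
        else:
            config[name] = rng.uniform(*domain)
    return config

def performance(config, instances, solve):
    """Performance function: maximal CPU time over the training
    instances (lower is better); failed runs count as infinite."""
    times = []
    for inst in instances:
        t0 = time.process_time()
        ok = solve(inst, config)
        dt = time.process_time() - t0
        times.append(dt if ok else float("inf"))
    return max(times)

def random_sampling_tune(instances, solve, budget, seed=0):
    """Plain RS: keep the best of `budget` random configurations."""
    rng = random.Random(seed)
    best_cfg, best_perf = None, float("inf")
    for _ in range(budget):
        cfg = sample_configuration(rng)
        perf = performance(cfg, instances, solve)
        if perf < best_perf:
            best_cfg, best_perf = cfg, perf
    return best_cfg, best_perf
```

In practice, solve(inst, config) would write the configuration to an IPOPT options file, run the solver on the training instance, and report success or failure.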
The profiles of the manipulated and controlled variables obtained with the tuned parameter configuration are the same as those from the default parameter configuration. The computational resources used for both the default and the tuned parameter configurations are presented in Figure 8. A comparison with the runs using the default parameter configuration shows that the tuned parameter configuration decreases the total number of iterations from 39577 to 5082 and the total CPU time from 1655.00 s to 184.64 s. These substantial savings in computational resources are very favorable for the implementation and application of NMPC, in which computational expense is always a key issue.

4. CONCLUSIONS
The performance of a nonlinear programming solver is influenced significantly by the values of its solver parameters. These parameters can be tuned automatically to improve the performance of nonlinear programming solvers. The relation between the performance of the nonlinear programming solver and the parameter values is too complicated to be expressed analytically. Moreover, the performance function may be neither smooth nor continuous, so gradient-based approaches cannot be utilized to handle automatic parameter tuning problems. In this work, random sampling-based methods (RS and ERS) are presented to find parameter configurations that are better than the defaults, and they can handle all kinds of parameters (e.g., categorical, continuous, and integer). The experiments are demonstrated with problems from the CUTE collection, a dynamic optimization problem for the crystallizer, and a nonlinear model predictive control problem for a system of two CSTRs in series. All of these tests demonstrate two points: (1) problems that cannot be solved with the default parameter configuration may be solved successfully with a tuned parameter configuration, and (2) the tuned parameter configuration produced by the proposed method, which uses a small number of instances as training cases, generally outperforms the default parameter configuration in solving the entire set of instances. However, we note that these tunings are specific to particular problem cases and cannot easily be generalized. In the future, we also intend to investigate the robustness of these parameter tunings.
The search efficiency of RS can be improved. Because of the nature of random sampling, each parameter configuration is selected independently, and as such, the parameter configurations can be evaluated concurrently. Hence, one of the plans for future research is to improve RS with parallel computation technology.
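The concurrent-evaluation idea above can be sketched as follows. The helper name is ours; threads are adequate when each evaluation launches the solver as an external process, whereas in-process evaluations would need a process pool to sidestep the GIL.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_concurrently(configs, perf_fn, workers=4):
    """Evaluate independently sampled parameter configurations in
    parallel and return the best (lowest performance value) one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves input order, so perfs[i] scores configs[i]
        perfs = list(pool.map(perf_fn, configs))
    best = min(range(len(configs)), key=perfs.__getitem__)
    return configs[best], perfs[best]
```

For example, evaluate_concurrently([cfg1, cfg2, cfg3], performance_fn) returns the configuration with the smallest performance value together with that value.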
In this work, only two states ("successful" or "failed") are used when tuning IPOPT for solving hard problems. The reasons for failure should be investigated carefully (e.g., an infeasible problem is detected, the search direction becomes too small, the restoration phase fails, etc.); RS can be further improved by using this information. Also, in Section 3.1.1, it is shown that the parameter obj_scaling_factor is a key element in solving those hard problems. For future consideration, an adaptive objective scaling strategy can be designed to enhance the performance of IPOPT. Another aspect of future work is to combine the random sampling-based methods and IPOPT closely and internally.

’ APPENDIX
Specific information on the parameters selected from IPOPT can be found in Table 16.

’ AUTHOR INFORMATION Corresponding Author

*E-mail: [email protected].

’ ACKNOWLEDGMENT This research was supported by the 973 Program of China (2009CB320603) and the 863 Program of China (2007A04Z192, 2008AA042902).


’ REFERENCES
(1) Wächter, A.; Biegler, L. T. Line Search Filter Methods for Nonlinear Programming: Motivation and Global Convergence. SIAM J. Optimiz. 2005, 16 (1), 1.
(2) Wächter, A.; Biegler, L. T. Line Search Filter Methods for Nonlinear Programming: Local Convergence. SIAM J. Optimiz. 2005, 16 (1), 32.
(3) Wächter, A.; Biegler, L. T. On the Implementation of an Interior Point Filter Line Search Algorithm for Large Scale Nonlinear Programming. Math. Program. 2006, 106 (1), 25.
(4) Drud, A. S. CONOPT—A Large Scale GRG Code. ORSA J. Comput. 1994, 6, 207.
(5) Gill, P. E.; Murray, W.; Saunders, M. A. SNOPT: An SQP Algorithm for Large-Scale Constrained Optimization. SIAM J. Optimiz. 2002, 12 (4), 979.
(6) Baz, M.; Hunsaker, B.; Brooks, J. P.; Gosavi, A. Automated Tuning of Optimization Software Parameters; Technical Report 2007-7, 2007.
(7) Hutter, F.; Hoos, H.; Leyton-Brown, K. Automated Configuration of Mixed Integer Programming Solvers. In Proceedings of the Seventh International Conference on Integration of Artificial Intelligence and Operations Research Techniques in Constraint Programming, Bologna, Italy, June 14-18, 2010; Springer: Berlin, 2010; p 186.
(8) Chen, W. F.; Shao, Z. J.; Qian, J. X. Interfacing IPOPT with Aspen Open Solvers and CAPE-OPEN. In Proceedings of the Tenth International Symposium on Process Systems Engineering, Salvador, Brazil, August 16-20, 2009; Elsevier: Amsterdam, 2009; p 201.
(9) Wächter, A. http://www.coin-or.org/Ipopt/documentation/ (accessed Nov. 10, 2009).
(10) Audet, C.; Orban, D. Finding Optimal Algorithm Parameters Using Derivative-Free Optimization. SIAM J. Optimiz. 2006, 17, 642.
(11) Birattari, M.; Stützle, T.; Paquete, L.; Varrentrapp, K. A Racing Algorithm for Configuring Metaheuristics. In Proceedings of the Genetic and Evolutionary Computation Conference, San Francisco, CA, July 9-13, 2002; Morgan Kaufmann: San Francisco, 2002; p 11.
(12) Adenso-Díaz, B.; Laguna, M. Fine-Tuning of Algorithms Using Fractional Experimental Designs and Local Search. Oper. Res. 2006, 54 (1), 99.
(13) Hutter, F.; Hoos, H.; Stützle, T. Automatic Algorithm Configuration Based on Local Search. In Proceedings of the Twenty-Second National Conference on Artificial Intelligence, Vancouver, Canada, July 22-26, 2007; AAAI: Menlo Park, 2007; p 1152.
(14) Hutter, F.; Hoos, H.; Leyton-Brown, K.; Stützle, T. ParamILS: An Automatic Algorithm Configuration Framework. J. Artif. Intell. Res. 2009, 36, 267.
(15) Ansótegui, C.; Sellmann, M.; Tierney, K. A Gender-Based Genetic Algorithm for the Automatic Configuration of Solvers. In Proceedings of the 15th International Conference on Principles and Practice of Constraint Programming, Lisbon, Portugal, September 20-24, 2009; Springer: Berlin, 2009; p 142.
(16) Audet, C.; Dennis, J. E. Pattern Search Algorithms for Mixed Variable Programming. SIAM J. Optimiz. 2000, 11 (3), 573.
(17) Chen, W. F.; Shao, Z. J.; Wang, K. X.; Chen, X.; Biegler, L. T. Convergence Depth Control for Interior Point Methods. AIChE J. 2010, 56 (12), 3146.
(18) Chen, W. F.; Shao, Z. J.; Wang, K. X.; Chen, X.; Biegler, L. T. Automatic Parameter Tuning for Optimization Solver Based on a Combined Metaheuristic and Direct Search Method; Technical Report, 2010.
(19) Lang, Y. D.; Cervantes, A. M.; Biegler, L. T. Dynamic Optimization of a Batch Cooling Crystallization Process. Ind. Eng. Chem. Res. 1999, 38, 1469.
(20) Biegler, L. T. An Overview of Simultaneous Strategies for Dynamic Optimization. Chem. Eng. Process. 2007, 46, 1043.
(21) Jockenhövel, T.; Biegler, L. T.; Wächter, A. Dynamic Optimization of the Tennessee Eastman Process Using the OptControlCenter. Comput. Chem. Eng. 2003, 27, 1513.


(22) Lang, Y. D.; Biegler, L. T. A Software Environment for Simultaneous Dynamic Optimization. Comput. Chem. Eng. 2007, 31 (8), 931.
(23) Hahn, J.; Edgar, T. F. An Improved Method for Nonlinear Model Reduction Using Balancing of Empirical Gramians. Comput. Chem. Eng. 2002, 26 (10), 1379.
(24) Hedengren, J. Reactor 4 in http://www.hedengren.net/research/models.htm (accessed Dec. 2, 2009).
