Evolutionary Algorithm for the Determination of Optimal Mode of

Sep 2, 2011 - 1796 dx.doi.org/10.1021/ie2006142 |Ind. Eng. Chem. Res. 2012, 51, 1796-1808. ARTICLE pubs.acs.org/IECR. Evolutionary Algorithm for the ...
0 downloads 0 Views 1MB Size
ARTICLE pubs.acs.org/IECR

Evolutionary Algorithm for the Determination of Optimal Mode of Bioreactor Operation Aravinda R. Mandli and Jayant M. Modak* Department of Chemical Engineering, Indian Institute of Science, Bangalore 560012, India ABSTRACT: Stirred tank bioreactors, employed in the production of a variety of biologically active chemicals, are often operated in batch, fed-batch, and continuous modes of operation. The optimal design of bioreactor is dependent on the kinetics of the biological process, as well as the performance criteria (yield, productivity, etc.) under consideration. In this paper, a general framework is proposed for addressing the two key issues related to the optimal design of a bioreactor, namely, (i) choice of the best operating mode and (ii) the corresponding flow rate trajectories. The optimal bioreactor design problem is formulated with initial conditions and inlet and outlet flow rate trajectories as decision variables to maximize more than one performance criteria (yield, productivity, etc.) as objective functions. A computational methodology based on genetic algorithm approach is developed to solve this challenging multiobjective optimization problem with multiple decision variables. The applicability of the algorithm is illustrated by solving two challenging problems from the bioreactor optimization literature.

1.0. INTRODUCTION Bioreactors are used for the production of a large number of products in the pharmaceutical (monoclonal antibodies, antibiotics) and specialty chemical (biocatalysts) industries. These products are generally produced in low quantities but have high manufacturing costs. Hence, improvements in the operation of the bioreactor can drastically reduce the cost of production.1 Consequently, optimization of bioreactor operation has received a great amount of attention over the past three decades.2 Bioreactors are commonly operated in three modes, viz, batch, fed-batch, and continuous. A typical bioreactor optimization problem considered in the literature consists of specifying the mode of operation of bioreactor a priori and then determining the trajectories of control variables that maximize a desired objective. Some of the control variables considered include inlet feed rates,314 temperature,15 and pH.16 Some of the desired objectives considered include cell productivity,3 amount of product,410 or a profit function.1113 A challenging problem in the field of bioreactor optimization is to first determine which among the three modes of operation is best for a given fermentation process and then determine the suitable optimal control trajectories. Such a problem has received very little attention in the literature. Previous studies on the determination of optimal mode of operation have considered objectives such as the maximization of cell mass productivity,17,18 maximization of the amount of product,19 and maximization of the product yield and productivity,20 using flow rates as control variables. A common approach in all of these studies is to obtain analytical solution to the greatest extent possible. As a result, they are restricted in scope, because they are applicable only to processes of very simple kinetics or objective function. For bioprocesses having complicated kinetics, obtaining analytical solutions, even if they exist, is a very complex task and numerical solutions are inevitable. Furthermore, all the above studies consider a single objective function, i.e., maximizing any one of yield, productivity, etc. However, many real-world problems r 2011 American Chemical Society

require simultaneous optimization of multiple, often conflicting, objectives. In this work, we propose a computational algorithm for determining the optimal mode of bioreactor operation with single/multiple desired objectives. Computational algorithms for the determination of the trajectories of control variables can be broadly classified into two categories: deterministic and stochastic. Deterministic techniques use either Pontryagin’s maximum principle21 or convert the original optimization problem into a nonlinear programming problem (NLP).7 Application of Pontryagin’s maximum principle results in two-point boundary value problems that are difficult to solve. Also, the NLPs resulting from the transformation of optimization problems require utilization of global optimization techniques to obtain globally optimal solutions, which are currently limited by several barriers regarding problem requirements and computational effort.2 Stochastic algorithms use random choice during their search to progress toward optimal solutions22 and can locate the solutions in the vicinity of global optimal solutions.2 Genetic algorithms (GAs) belong to the class of stochastic search techniques and are based on the principles of natural evolution (and, hence, they are called evolutionary algorithms). GAs have been used to determine the optimal feeding policy5 and also initial conditions and time period of operation3 of fed-batch reactors. In these studies, time is discretized into a finite number of intervals and constant or linear approximations of the feeding policies are used between the intervals. GAs have also been used to determine the initial conditions and time period of operation for periodic chemical processes,23 using a constant approximation of feed Special Issue: Nigam Issue Received: March 28, 2011 Accepted: September 2, 2011 Revised: August 30, 2011 Published: September 02, 2011 1796

dx.doi.org/10.1021/ie2006142 | Ind. Eng. Chem. Res. 2012, 51, 1796–1808

Industrial & Engineering Chemistry Research

ARTICLE

Table 1. Modes of Operation of a Bioreactor mode of operation

Figure 1. Schematic diagram of a stirred-tank bioreactor.

trajectory in the discretized time intervals. Sarkar and Modak6 proposed a novel GA solution representation that incorporates knowledge from optimal control theory about the nature of the feeding policies. This representation of the solution avoids the various approximations of flow rates used in previous studies and yields the optimal feeding policy. GAs are also well-suited for handling multiobjective optimization problems. These problems give rise to the solutions in the form of Pareto front, which are characterized by the degradation in the performance of one of the objectives when the other objective is improved. Classical methods generate the Pareto front by transforming the multiobjective optimization problem into a series of single objective optimization problems. On the other hand, evolutionary algorithms can generate multiple solutions on the Pareto front directly.24 The elitist nondominated sorting genetic algorithm (NSGA-II)25 is an elegant evolutionary algorithm to generate the solutions on the Pareto front. In the bioreactor optimization literature, Jadot et al.26 have characterized the tradeoff between yield and productivity by solving several parameter optimization problems. Logist et al.27 have used normal boundary intersection and normalized normal constraint methods for generating the Pareto front characterizing the yieldproductivity conflict for fed-batch lysine fermentation. Maiti et al.28 have used the ε-constraint method for maximizing two conflicting objectives, viz, product concentration and product per unit cost of media for producing a glycopeptide antibiotic. Chen and Wang29 have used hybrid differential evolution for maximizing overall productivity and conversion in a multistage integrated extractive fermentation process. Sarkar and Modak30 and Lee et al.31 have adopted the NSGA-II framework for solving several bioreactor optimization problems considering multiple objectives. In the present study, we propose a novel computational algorithm for bioreactor optimization, which simultaneously addresses three important issues: (i) determination of best mode of operation, (ii) optimal control strategies for the best mode, and (iii) consideration of two or more objectives. The algorithm is based on the GA framework developed by Sarkar and Modak.6,30 The effectiveness of the algorithm in determining the optimal mode of operation is first demonstrated by comparing the results obtained by the algorithm for the optimal mode of periodic operation of bioreactors with the analytical results obtained by Modak and Lim20 for the single objectives of yield and productivity. Then, multiobjective optimization is carried out using NSGA-II to identify the best mode of operation when there is a tradeoff between yield and productivity. Finally, the algorithm is applied to determine the optimal mode of operation for the maximization of secreted protein production.4

2.0. OPTIMIZATION PROBLEM We consider a typical fermentation process involving utilization of substrate S by microorganisms X to produce product P in

conditions

batch

F1 = 0, F2 = 0

semibatch

F1 6¼ 0, F2 = 0

continuous

F1 6¼ 0, F2 6¼ 0, F1 = F2

a stirred-tank bioreactor. It is assumed that inlet feed consists of only substrate and is free of cell mass and product (see Figure 1). The governing equations for the variable volume bioreactor can be written as20 d F1 X ðXÞ ¼ μX  dt V

ð1Þ

d F1 ðSF  SÞ ðSÞ ¼  σX dt V

ð2Þ

d F1 P ðPÞ ¼ πX  dt V

ð3Þ

d ðV Þ ¼ F1  F2 dt

ð4Þ

subject to the constraints 0 e F1 , F2 e Fmax 0 e V ðtÞ e Vmax

ð5Þ

where X, S, and P are concentrations of cells, substrate, and product, respectively. V is the volume of the contents in the reactor. F1 and F2 are, respectively, the inlet and outlet volumetric flow rates. The parameters μ, σ, and π represent the specific cell growth, substrate consumption, and product formation rates, respectively, and are assumed to be functions of concentrations of cells, substrate, and products. Fmax and Vmax are the maximum volumetric flow rate and maximum working volume of the reactor, respectively. The above model emulates the three modes of operation, viz, batch, fed-batch, and continuous, with suitable choice of F1 and F2, as summarized in Table 1. For these modes, the initial conditions, with regard to the cell, substrate, and product concentrations, and the volume of the bioreactor, are fixed a priori. Xðt0 Þ ¼ X0 , Sðt0 Þ ¼ S0 , Pðt0 Þ ¼ P0 , V ðt0 Þ ¼ V0

ð6Þ

In addition to these modes, bioreactors are also operated in several other cyclic (periodic) modes of operation (e.g., repeated batch, repeated fed-batch). Equations 14 emulate the periodic mode of operation with suitable choice of flow rates F1(t) and F2(t) and additional boundary constraints given by eq 7: Xðt0 Þ ¼ XðτÞ, Sðt0 Þ ¼ SðτÞ, Pðt0 Þ ¼ PðτÞ, V ðt0 Þ ¼ V ðτÞ

ð7Þ where τ is the time period of operation. The performance of the bioreactor is assessed in terms of one or more desired objectives IP = [IP1, ..., IPm]T where m is the number of objectives. The objectives may be yield (IP1), the amount of product produced per unit amount of substrate fed into the reactor, productivity (IP2), the amount of product produced per unit time per maximum working volume of the 1797

dx.doi.org/10.1021/ie2006142 |Ind. Eng. Chem. Res. 2012, 51, 1796–1808

Industrial & Engineering Chemistry Research

ARTICLE

For periodic operation, the boundary conditions for the equality of initial and final states are given by fi ðxð0Þ, xðτÞÞ ¼ 0

i ¼ 1, 2; :::; ne

ð14Þ

where ne is the number of equality constraints. Without any loss of generality, the multiobjective optimization problem can now be stated as follows: determine (i) optimal inlet and outlet flow rate trajectories F1(t) and F2(t) in the time interval 0 e t e τ; (ii) initial conditions x0; and (iii) period of operation τ that simultaneously maximizes a set of objectives while satisfying the constraints given by eqs 1214. This can be described mathematically as max

F1 ðtÞ;F2 ðtÞ;τ;x0

IPðxðτÞÞ ¼

max

F1 ðtÞ;F2 ðtÞ;τ;x0

½IP1 ðxðτÞÞ, :::, IPm ðxðτÞÞT

ð15Þ subject to

Figure 2. Typical trajectories of the inlet and outlet flow rates.

Equations 1114 reactor, or the total amount of product produced (IP3). Z τ F2 P dt þ ½PV τ  ½PV 0 0 max IP1 ¼ Z τ F1 , F2 , τ;X0 ;S0 ;P0 ;V0 SF F1 dt þ ½SV 0  ½SV τ

ð8Þ

0

Z max

F1 , F2 , τ;X0 ;S0 ;P0 ;V0

IP2 ¼

τ 0

F2 P dt þ ½PV τ  ½PV 0 Z τ Vmax dt

ð9Þ

0

Z max

F1 , F2 , τ, X0 , S0 , P0 , V0

τ

IP3 ¼ 0

F2 P dt þ ½PV τ  ½PV 0

ð10Þ

The objectives given by eqs 810 are fairly general and reduce to various objectives considered in the literature when suitable problem-specific knowledge is incorporated. For example, eq 9 reduces (up to a multiplicative constant) to the expression for productivity considered by Logist et al.27 when the problemspecific details of noncyclic fed-batch operation and no initial product are incorporated. Similarly, eqs 8 and 9 reduce to the expressions for yield and productivity considered by Modak and Lim20 for periodic operation. Generally, we consider the dynamic bioreactor operation described by the following vector differential equation: x_ ¼ a þ b1 F1 þ b2 F2

ð11Þ

where x is the state vector consisting of n states. The flow rates are bounded by Fi, min e Fi e Fi, max

i ¼ 1, 2

ð12Þ

The inequality constraints on state variables are given by gi ðxðtÞÞ e 0

i ¼ 1, 2; :::; ni

where ni is the number of inequality constraints.

ð13Þ

3.0. GENETIC ALGORITHM FOR THE DETERMINATION OF THE OPTIMAL MODE OF OPERATION The computational algorithm developed in the current study is based on genetic algorithms (GAs). GAs belong to the family of evolutionary algorithms and use the principles of natural evolution. GAs essentially have the following important steps. First, a population of potential solutions (also called chromosomes) is initialized randomly. The objective value of the each of the chromosomes in the initial population is computed, and this objective value is used for the computation of fitness. Then, selection of chromosomes having better fitness is carried out. The selected chromosomes undergo crossover (reproduction) and mutation operations to create the next generation of chromosomes. Solutions with best fitness (called elites) are preserved in the next generation by explicitly introducing them into the next generation. The above process is repeated until a prespecified stop criterion is met. The GA outlined above is only capable of generating solutions to single-objective optimization problems. In order to obtain solutions of multiobjective optimization problems with a good distribution of solutions on the Pareto front, Deb et al.25 have proposed the elitist nondominated sorting genetic algorithm (NSGA-II). NSGA-II differs from GAs mainly in the way the selection operation works. NSGA-II classifies all the solutions in the population into various nondominated fronts and assigns a rank to each of the fronts. Solutions on a single nondominated front are characterized by a decrease in one of the objectives when the other objective is increased. In addition, there exists at least a single solution on a nondominated front with a lower rank that is better in both objectives corresponding to a solution on a nondominated front with a higher rank. NSGA-II selection operator assigns a higher fitness to solutions on nondominated front with a lower rank and, hence, explicitly preserves multiple nondominated solutions. The details of the algorithm can be found in the work of Deb.24 3.1. Chromosome Representation. Determining the solution of the general optimization problem given by eq 15 involves determining the trajectories of inlet and outlet flow rates, the time period of operation, and the initial conditions of state variables. To determine these decision variables, the solution 1798

dx.doi.org/10.1021/ie2006142 |Ind. Eng. Chem. Res. 2012, 51, 1796–1808

Industrial & Engineering Chemistry Research

ARTICLE

representation of Sarkar and Modak6 is adapted in the present study and the chromosome is encoded using the procedure outlined below. The optimization problem defined by eq 15 is a singular control problem, because both control variables F1 and F2 appear linearly in the governing differential equations. Rigorous application of optimal control theory shows that optimal flow rate of singular control problems consists of intervals of maximum, minimum, and singular flow rates.21 During the singular interval, the flow rates are manipulated to maintain the substrate concentration that maximizes a specific rate, product yield, or combination thereof, depending on the objective function.8 A typical example of the optimal flow rate trajectories is shown schematically in Figure 2. These flow rate trajectories are completely characterized by the sequence of maximum, minimum, and singular flow rates (referred to henceforth as “pattern”); the duration of each of the intervals; and a suitable expression for the singular flow rate. The pattern part of the chromosome is encoded using integers while rest of the chromosome is encoded using real numbers. In the pattern, integers 0, 1, and 2 are used to represent the minimum, singular, and maximum flow rates, respectively. The pattern of both the inlet and outlet flow rates is evaluated using GA. The times at which the flow rates switch from one interval to the other are not known a priori and are evaluated using the GA. If nI and nO are the number of intervals in the inlet and outlet flow rates, the inlet and outlet flow rates switch intervals nI  1 and nO  1 times, respectively. If the time period of operation of the bioreactor is not specified a priori, then it is also encoded in the chromosome as a decision variable. Hence, a total of nI + nO  2 and nI + nO  1 switching times are to be determined for fixed and free final time problems, respectively, using the GA. An expression for the flow rates in the singular interval is not available and, hence, the singular inlet and outlet flow rates are to be approximated appropriately. Based on the observation of Modak and Lim11 that the substrate concentration is maintained constant for a certain class of fermentations in the singular interval, Sarkar and Modak6 have proposed a nonlinear feedback control law for the singular inlet flow rate. The expression for the singular inlet flow rate is slightly modified in the present study and is written as F1s ¼

σXV Δσ 1 XV þ SF  S SF  S

ð16Þ

The first term in the above equation is the inlet flow rate, which maintains substrate concentration at a constant level (eq 2), and the second term is a correction term that is added because the substrate concentration may not be maintained constant during the singular interval. Δσ1 acts as correction for the specific substrate consumption rate on the singular interval of the optimal operation. The singular outlet flow rate also needs to be approximated appropriately. In the differential equations governing the operation of bioreactor (eqs 14), F2 appears only in total mass balance (eq 4). F2 influences the concentrations of cell, substrate, and product indirectly by influencing the volume of the reactor. Based on this observation, singular flow rate of F2 is approximated in the present study as F2s ¼ F1 þ

Δσ2 XV SF  S

ð17Þ

The first term in the above equation maintains volume at a constant level and the second term is a correction term. The correction term, similar in form to the correction term of singular inlet flow rate, is added because the volume may not be maintained constant when F2 is singular. The correction terms to the specific substrate consumption rate on the singular interval, Δσ1 and Δσ2 can, in general, be approximated by a general nonlinear function of the state variables as Δσ ¼ f ðxÞ

ð18Þ

The function f(x) is chosen to be a function of state variables alone, to obtain the singular flow rate as a feedback control law, in terms of state variables alone. The function f(x) can emulate different nonlinear characteristics, depending on the number of parameters that are used for its parametrization. Sarkar and Modak6 have parametrized the correction terms as power law functions of the state variables and reported that the algorithm is insensitive to the form of the correction term. Since Δσ1 and Δσ2 represent corrections in specific substrate consumption rate, they are assumed to be dependent on the state variables that influence the specific substrate consumption rates. In general, Δσ1 and Δσ2 can be functions of cell, substrate, and product concentrations. However, the problems considered in this study have specific substrate consumption rates, as a function of substrate concentration alone; hence, Δσ1 and Δσ2 are approximated as Δσ 1 ¼ aSb Δσ 2 ¼ cSd

ð19Þ

Parameters a, b, c, and d are the variables that determine the correction terms of the singular flow rates and are evaluated using GA. During periodic operation, the initial conditions satisfy the constraint described by eq 14 but are not explicitly specified. Hence, the initial conditions are also encoded as part of the chromosome. For problems where the initial conditions are specified, the initial conditions are not evaluated using GA. To summarize, each chromosome has four different parts: pattern, parameters of correction terms of singular flow rates, switching times, and initial conditions for each of the state variables. The novel features of this chromosomal representation include (a) emulation of optimal control theory based inlet and outlet flow rates and (b) consideration of initial conditions. This chromosomal representation is illustrated with a typical optimal policy (Figure 2), as given below.

In Figure 2, the inlet flow rate has the sequence maximum flow rate (2), minimum flow rate (0), singular flow rate (1), and minimum flow rate (0). The switching times between these intervals are 0.65, 2.43, and 10.99 h, respectively. The expression for the singular inlet flow rate is characterized by the first two correction terms (0.23, 0.70). Similarly, the outlet flow rate has the sequence minimum flow rate (0) and maximum flow rate (2). The switching time between these intervals is 11.02 h. The expression for singular outlet flow rate is characterized by the last two correction terms (1.20, 0.50). The time period of operation generally is not fixed a priori and is evaluated using GA (12.68 h). If the initial conditions are not specified, they are 1799

dx.doi.org/10.1021/ie2006142 |Ind. Eng. Chem. Res. 2012, 51, 1796–1808

Industrial & Engineering Chemistry Research

ARTICLE

Figure 3. Variation of instantaneous product yield with substrate concentration for various bioprocess kinetics considered by Modak and Lim.20

Figure 4. Optimal trajectories for maximizing yield of Type C process. (Legend: X, cell concentration; S, substrate concentration; P, product concentration; F1, inlet flow rate; and F2, outlet flow rate.)

also included in the chromosome (0.18, 0.33, 0.90, 0.99). Generally, the number of decision variables that are to be determined for optimal periodic mode of operation are 2(nI + nO)  1 + n + p, where p is the number of parameters used for characterizing the correction terms. For nonperiodic operation with fixed and free final time, the number of parameters that are to be determined are 2(nI + nO)  2 + p and 2(nI + nO)  1 + p. 3.2. Constraint Handling Methodology. The optimal solution should satisfy the path constraints on volume and flow rates (eqs 12 and 13) and terminal constraints on state variables (eq 14) in the case of periodic operation. The constraints on volume are handled by forcing the outlet flow rate to be equal to

the inlet flow rate when the volume is at its lower or upper bound. The constraints on flow rate can only be violated during the singular interval because the maximum and minimum flow rates are feasible. The constraint violation occurring during the singular interval is handled by equating the constraint violating flow rate to the nearest feasible flow rate. The constraints on the equality of states (eq 14) during periodic operation are handled using the adaptive penalty approach developed by Barbosa and Lemonge.32 This approach penalizes the constraint violating solutions, depending on the amount of constraint violation of the solution and the constraint violation of all the solutions of the particular generation. The fitness of the 1800

dx.doi.org/10.1021/ie2006142 |Ind. Eng. Chem. Res. 2012, 51, 1796–1808

Industrial & Engineering Chemistry Research

ARTICLE

Table 2. Comparison of the Results Obtained Using GA with the Analytical Results of Modak and Lim20 Analytical Results of Modak and Lim20 kinetics

objective

S0

optimal mode

Type A

yield

repeated batch

0.004

Type B

yield

continuous

0

Type C

yield

repeated fed-batch

0.003

Type D

yield

any repeated batch

0

Type A

productivity

Type B

productivity

Type C

productivity

Type D

productivity

continuous continuous

1.667 2.500

V0

Optimal Results (from GA)

objective value

optimal mode

S0

V0

objective value

2

0.3900

repeated batch

0.003

2

0.3905

20

0.4050

continuous

0.003

6.012

0.4059 0.3414

2

0.3408

repeated fed-batch

0.005

2

any value

0.4050

repeated batch

0.500

5.846

0.4050

continuous

2.514

19.999

0.3126

20 20

0.1875 0.3164

continuous

1.664

19.999

0.1877

continuous

1.687

19.999

0.1841

continuous

2.504

19.999

0.3167

state variable xi. Various GA operators used in the simulations are given in Appendix A.

4.0. RESULTS AND DISCUSSION 4.1. Case Study 1: Optimal Mode of Operation for Periodic Bioreactor Operation. Modak and Lim20 considered the pro-

Figure 5. Evolution of flow rate trajectories for maximizing the yield of a Type C process. Generation: (A) 50, (B) 75, (C) 100, (D) 125, (E) 150, and (F) 200. V = volume (in liters), F1 = inlet flow rate (in units of L/h), and F2 = outlet flow rate (in units of L/h).

chromosomes is calculated in this approach as Fitness ¼ IP 

∑i ki yi

ð20Þ

where IP is the objective value of a maximization problem, yi is the scaled value of the constraint violation, and ki is the penalty factor. The parameters yi and ki are calculated as follows:   xiτ  x0 yi ¼ abs ð21Þ ximax ki ¼ absðhIPiÞ

∑j

hyi i   ½ yj  2

ð22Þ

In the above equation, Æyiæ is the average of yi for all the solutions in a particular generation and ximax is used for normalizing the

blem of determining the optimal mode of operation for periodically operated bioreactors. They classified bioprocesses as belonging to either Type A, B, C, or D, depending on whether the ratio of specific product production rate to specific substrate consumption rate (instantaneous product yield) increases, decreases, goes through a maximum, or remains constant (see Figure 3). They have considered two single objective optimization problems of maximizing yield (eq 8) and maximizing productivity (eq 9) separately. We apply the computational algorithm outlined in the previous section to these problems to demonstrate its effectiveness in determining the optimal mode of operation. 4.1.1. Yield Optimization. Modak and Lim20 solved the problem of determining the optimal mode of operation for bioprocesses of Type A, B, C, and D kinetics analytically and reported the optimal modes of operation as repeated batch (Type A), continuous (Type B), repeated fed-batch (Type C), and any repeated batch (Type D). Among these kinetics, Type C kinetics presents an interesting case study. Since instantaneous product yield attains a maximum value of 0.3448 g product/g substrate at a substrate concentration of 0.371 g/L, maintaining a continuous operation at this value may be expected to deliver optimal yield. However, as overall yield is defined based on the amount of substrate fed to the bioreactor (eq 8), overall yield for a continuous process is reduced as the substrate in the outflow is not utilized for generating product. Hence, the results obtained by GA for this kinetics are presented first and, subsequently, the results for other bioprocess kinetics are presented. The optimal result obtained by GA for Type C kinetics is shown in Figure 4. The optimal mode of operation obtained is a repeated fed-batch operation (as can be seen in Figure 4C) and is in good agreement with the analytical result. The optimal inlet feed rate consists of an instantaneous fill interval during which the substrate concentration is raised to 0.36 g/L, singular interval during which substrate level is maintained constant and a batch interval where the substrate is completely consumed. The outlet flow rate is at its minimum throughout the operation except during the last interval wherein bioreactor contents are removed to satisfy the equality constraint on initial and final volume levels. 1801

dx.doi.org/10.1021/ie2006142 |Ind. Eng. Chem. Res. 2012, 51, 1796–1808

Industrial & Engineering Chemistry Research

ARTICLE

Figure 6. Evolution of yield and constraint violations of the best solution for maximizing the yield of a Type C process.

The chromosome of the best solution obtained is Flow pattern : 211110 001021 Correction terms : 0:000074 2:000000 1:995675 0:095895 Switching times for F1 : 0:0153 11:0824 14:9956 35:2595 48:7845 80:0000 Switching times for F2 : 7:9545 50:9037 71:7490 78:2010 79:9990 Initial conditions : 0:6139 0:0045 1:7093 2

At the start of the optimal operation, F1 is at its maximum while F2 is zero until 0.0153 h. This operation results in an instantaneous fill of substrate into the reactor, wherein the substrate level is increased to 0.36 g/L, which is very similar to the substrate concentration at which instantaneous yield is highest (0.371 g/L). The time period during which this operation takes place is very short; hence, flow rate F1 is almost like a pulse input. It should be noted that our GA is capable of capturing this feature with a very fine refinement of switching times, which is a task that is considered difficult for other numerical algorithms. During the interval of 0.015348.7845 h, flow rate F1 is singular, whereas F2 is zero. The parameter a for F1 in the singular interval is 0.000074; hence, the correction term does not significantly contribute to the singular flow rate and results in constant substrate concentration during this interval. During the interval of 48.784578.2010 h, a batch operation is maintained where in the substrate concentration decreases and attains a zero value. Although F2 switches from minimum to singular flow rate and then back to minimum during this period, the correction term for F2 evaluates it to a negative value and, hence, the outlet flow rate is forced to be zero, even during the singular interval of F2. The optimal solution has a batch interval from 53.3175 h to 78.2010 h, wherein both the inlet and outlet flow rates are zero, the substrate concentration is zero, and no apparent activity is visible. The presence of this interval can be attributed to the lack of dependence of yield on the time period of operation. After the batch interval, reactor contents are quickly emptied during the period from 78.2010 h to 79.999 h. The optimal initial conditions and objective values obtained by GA are given in Table 2. It is observed from the table that GA

Figure 7. Evolution of average constraint violations (yi) and penalty factors (Ki) for i = X, S, P, V (X: cell concentration, S: substrate concentration, P: product concentration, and V: volume).

Table 3. Influence of Population Size, Random Seed, and Permitted Constraint Violation on Performance of GA for Maximization of Yield of Bioprocesses with Type C Kinetics permitted

yield as defined

population

random

constraint

no.

size

seed

violation

yield

and Lim20

1

500

0.1

0.001

0.3408

0.3409

2

500

0.4

0.001

0.3413

0.3414

3 4

250 250

0.1 0.4

0.001 0.001

0.3414 0.3414

0.3411 0.3413

5

500

0.1

0.0001

0.3400

0.3400

6

500

0.1

0.01

0.3414

0.3383

by Modak

obtains a yield value of 0.3414 g product/g substrate, which is better than the analytical result of 0.3408 g product/g substrate given in the work of Modak and Lim.20 The value of yield using the expression used in the work of Modak and Lim20 for this solution is 0.3411 g product/g substrate. The reason for the higher value is the conversion of constraints on the equality of initial and final states into soft constraints and also the limited numerical accuracy of the results reported in the analytical study. GA was run with a population size of 500 for a maximum of 1000 generations to generate the optimal solution presented above. It is interesting to analyze the evolution of the optimal solution during the progress of GA. The evolution of volume and flow rates of the optimal solutions is shown in Figure 5. The evolution is gradual. First, GA finds a solution that follows fedbatch operation for a short duration. This solution also has the typical characteristic of maintaining the outlet flow rate at a minimum value during the fed-batch operation, followed by removal of reactor contents at the maximal rate until the 1802

dx.doi.org/10.1021/ie2006142 |Ind. Eng. Chem. Res. 2012, 51, 1796–1808

Industrial & Engineering Chemistry Research

ARTICLE

Figure 8. Inlet and outlet flow rate trajectories for optimizing yield of various bioprocess kinetics of Modak and Lim.20

Figure 9. Pareto-Optimal front for maximizing yield and productivity for the Type C process.

volume of the bioreactor reaches its minimum level. GA then progressively develops this solution into the optimal solution by adjusting the flow rates and the initial conditions. The evolution of yield and sum of constraint violations of the best solution is shown in Figure 6. GA quickly finds a feasible solution and then rapidly improves the feasible solution to the optimal feasible solution. The evolution of average constraint violations and penalty parameters is shown in Figure 7. GA reduces the average constraint violation of solutions in the population to very low values right at the beginning of the iterations and then searches for better solutions without drastically increasing the constraint violation. The influence of population size, random seeds, and permitted constraint violation of state variables on the performance of the

GA is summarized in Table 3. GA consistently finds solutions within 1% of the optimal solutions with different random seed values and population sizes. Also, a permitted constraint violation of 0.001 on the equality of initial and final states results in objective values that are identical, up to three decimal places, with the expression for yield given in the work of Modak and Lim.20 GA is now used for maximizing the yield of bioprocesses of Types A, B, and D kinetics and the optimal flow rate trajectories are summarized in Figure 8. For Type A kinetics, the optimal operation consists of filling the reactor to the maximum extent, operating it in batch mode, and then emptying it. Hence, it is a repeated batch operation and agrees well with the analytical result. For Type B kinetics, neither feed is added nor bioreactor content removed from the bioreactor. The analytical result suggests that, for processes of Type B kinetics, optimal operation maintains substrate at zero concentration throughout the operation, which degenerates into a continuous steady-state process with zero flow rate. These sort of results are physically not meaningful and represent the limits of the optimal solutions. GA correctly predicts that the operation should be continuous, with the substrate concentration being maintained at zero level. For Type D kinetics, the bioreactor is maintained in batch mode for a certain time, then the bioreactor content is removed and, subsequently, fresh substrate is added. Hence, the operation is indicative of a repeated batch mode of operation. Since the feed is not added up to the maximum volume of the bioreactor, any repeated batch operation produces the optimal operation, which is in good agreement with the analytical result. The optimal initial conditions and objective values are summarized in Table 2 and are in good agreement with the optimal analytical results. 4.1.2. Productivity Optimization. Modak and Lim20 analyzed the optimal mode of operation for the maximization of productivity (eq 9) and concluded that continuous operation maximizes productivity for processes of constant cell yield (Type B and Type D kinetics). They do not derive analytical results for the bioprocesses of Type A and Type C kinetics. GA is first applied 1803

dx.doi.org/10.1021/ie2006142 |Ind. Eng. Chem. Res. 2012, 51, 1796–1808

Industrial & Engineering Chemistry Research

ARTICLE

Figure 10. Trajectories of solution B (marked in Figure 9) of Type C process. (X: cell concentration; S: substrate concentration; P: product concentration; F1: inlet flow rate; F2: outlet flow rate.)

for maximizing the productivity of bioprocesses with Type B and Type D kinetics and the optimal modes, initial conditions and productivities are summarized in Table 2. The results show that a continuous steady-state process at maximum volume is optimal for both Type B and Type D kinetics. The optimal initial conditions and productivities are also in very good quantitative agreement with the analytical results. GA is then applied for Type A and Type C kinetics and the results, summarized in Table 2, indicate that continuous steady-state operation is the optimal mode for both these kinetics. Using the continuous mode of operation, the optimal productivities can be calculated analytically for Type A and Type C kinetics as 0.3124 g L1 h1 and 0.1839 g L1 h1, respectively. The productivities obtained by GA are 0.3126 g L1 h1 and 0.1841 g L1 h1 for Type A and Type C kinetics, respectively, and are in very good quantitative agreement with the calculated analytical results. We note here that all the solutions for productivity use the minimum allowed period of operation of 20 h. Explicit consideration of time in the expression for productivity (eq 9) results in a high selective pressure, leading the solutions to approach the minimum allowable period of operation. 4.2. Case Study 2: Multiobjective Optimization of Yield and Productivity. For Type C kinetics, the solution with the highest yield (0.3413 g product/g substrate) has a productivity of 0.0192 g L1 h1, whereas the solution with the highest productivity (0.1841 g L1 h1) has a yield of 0.1747 g product/g substrate. Also, the solution for maximizing yield uses repeated fed-batch operation while the solution for maximizing productivity uses continuous steady-state operation. This suggests an inherent tradeoff between the objectives and their modes of operation. To explore the issue further, the problem of multiobjective optimization of yield and productivity is solved using NSGA-II.25 The best Pareto-Optimal front obtained using NSGA-II is presented in Figure 9. The solution with the highest productivity has a continuous operation with a productivity of 0.1841 g L1 h1, which is within 0.1% of the optimal

productivity. The solution with highest yield has a repeated fed-batch operation with a yield of 0.3412 g product/g substrate, also within 0.1% of optimal yield. These two extreme solutions are qualitatively and quantitatively similar to those obtained using the single objective optimization studies. The intermediate solutions between these two extreme solutions clearly show a tradeoff between yield and productivity. For these solutions, as yield increases, productivity decreases and vice versa. The Pareto front shown in Figure 9 can be broadly classified into four regions. The front in the region I is essentially flat. In this region, yield can be increased from 0.1741 g product/g substrate to 0.2032 g product/g substrate (a 16.71% increase) at a minimum reduction in productivity from 0.1841 g L1 h1 to 0.1802 g L1 h1 (a 2.11% decrease). Similarly, the Pareto front in region IV is practically vertical. In this region, productivity can be increased from 0.0192 g L1 h1 to 0.0349 g L1 h1 (an 81.77% increase) with a minute reduction of yield from 0.3412 g product/g substrate to 0.3401 g product/g substrate (a 0.32% decrease). A close examination of the optimal solutions reveals that the feeding pattern—that is, the sequence of maximum, minimum, and singular flow rates—is identical but the time period of operation is different. The objective productivity imposes a penalty on time and, hence, exerts a selective pressure on the algorithm to obtain solutions with high yield in minimal time. However, in both regions II and III, only a modest increase in productivity (from 0.1368 g L1 h1 to 0.1575 g L1 h1, a 15.1% increase) is obtained with a modest decrease in yield (from 0.2752 g product/g substrate to 0.2503 g product/g substrate, a 9% decrease). Hence, the Pareto front obtained by the proposed computational algorithm captures the entire spectrum of solutions for the multiple objectives of maximizing yield and productivity and presents a diverse set of alternatives to a decision maker from which a suitable operation can be chosen. To investigate the different modes of operation on the ParetoOptimal front, two solutions on the Pareto front (marked “A” 1804

dx.doi.org/10.1021/ie2006142 |Ind. Eng. Chem. Res. 2012, 51, 1796–1808

Industrial & Engineering Chemistry Research

ARTICLE

Figure 11. Optimal trajectories for the maximizing secreted protein production with fixed initial conditions. (P: secreted protein; PT: total protein; X: cell concentration; S: substrate concentration; F1: inlet flow rate; F2: outlet flow rate.)

and “B” in Figure 9) are investigated further. Solution A maintains a continuous steady-state operation at a substrate concentration of 0.73 g/L and has a yield of 0.2829 g product/g substrate and productivity of 0.1291 g L1 h1. Solution B, plotted in Figure 10, has a yield of 0.3241 g product/g substrate and a productivity of 0.0570 g L1 h1. This solution is a concatenation of fed-batch and continuous modes of operation. The fed-batch operation occurs from 0 h to 34.65 h; continuous operation occurs from 34.65 h to 69.89 h. The last stage of the operation is used for emptying the contents of the reactor, to meet the constraint of equality on initial and final volume. On the Pareto-Optimal front (Figure 9), the solutions utilize continuous mode of operation for high productivities (regions I and II), combination of fed-batch and continuous modes of operation for intermediate productivity and yield (region III), and repeated fed-batch mode of operation for high yields (region IV). One interesting feature observed among all these solutions is that solutions with high yield have minimum volume as their initial condition while the solutions with high productivity have maximum volume as the initial condition. Hence, solutions with high yield require at least a part of the operation to be in the fed-batch mode, whereas solutions with high productivity use the continuous mode of operation. 4.3. Case Study 3: Optimization of Secreted Protein Production. In order to test the performance of GA for morecomplex problems, it is used to determine the optimal mode of operation for producing recombinant protein by yeast. Park and Ramirez4 have specified the mode of operation as fed-batch a priori and obtained the optimal feeding policy for maximizing the total quantity of protein secreted by the yeast. Banga et al.2 suggested using this model as a benchmark problem for computational optimization algorithms. The model used by Park and Ramirez4 is modified in the present study to consider the problem of determination of mode of operation and is given in Appendix C. The optimal result obtained when the initial conditions are fixed equal to that used in the work of Park and

Ramirez4 is shown in Figure 11. The optimal mode of operation is a fed-batch operation consisting of singular, minimum, singular and maximum intervals for inlet and minimum flow rate for the outlet and is identical to the optimal result obtained by Park and Ramirez4 and Sarkar and Modak.6 The protein produced in this operation is 32.729 g, which is in close agreement with the amount of protein (32.724 g) obtained by Sarkar and Modak.6 The novel feature of the problem considered here is that the mode of operation is not assumed a priori, contrasting with that of Park and Ramirez4 and Sarkar and Modak,6 wherein fed-batch mode of operation is prespecified. GA is then applied to determine the optimal mode of operation for maximizing the yield of secreted protein. The solution obtained represents a continuous steady-state operation at zero substrate concentration and is similar to the solution of yield optimization of Type B kinetics. Investigation of kinetics for this problem reveals that the instantaneous total protein yield is similar in shape to Type B kinetics. Hence, GA correctly finds the optimal mode of operation for maximizing the secreted protein yield. GA is further applied to determine the optimal mode of operation for maximizing the productivity of secreted protein. The solution obtained represents a continuous steady-state operation, wherein the inlet and outlet flow rates are maintained almost constant at a value of 1.4 L/h. The value of productivity of the optimal solution is 0.799 g L1 h1. For a continuous mode of operation, the optimal productivity for this process can be calculated analytically to be 0.795 g L1 h1. Hence, GA is able to predict quantitatively that continuous mode optimizes the productivity of secreted protein production. We note here that the analytical results of Modak and Lim20 hold, even for the complicated kinetics of this process.

5.0. CONCLUSIONS The problem of determining the optimal mode of operation of a bioreactor has been addressed in this paper. This has been 1805

dx.doi.org/10.1021/ie2006142 |Ind. Eng. Chem. Res. 2012, 51, 1796–1808

Industrial & Engineering Chemistry Research

ARTICLE

accomplished by formulating an optimization problem that considers an unsteady model of the bioreactor and various physical constraints and determines the trajectories of the flow rates, initial conditions and time period of operation. The formulated optimization problem can possibly contain either single or multiple objectives. A computational algorithm based on genetic algorithms (GAs) has been developed by extending the work of Sarkar and Modak6 for the solution of this problem. The effectiveness of the developed algorithm is demonstrated by solving an optimization problem for which analytical results were already available in the literature. The algorithm correctly predicts the optimal mode of operation for various bioprocess kinetics considered by Modak and Lim20 for the objectives of yield and productivity. The algorithm is then used in the NSGAII framework to obtain a spectrum of solutions of the multiobjective optimization problem, considering the objectives of yield and productivity simultaneously. Analysis of the obtained solutions reveals the tradeoff that the objectives induce on the optimal modes. The developed algorithm is further applied to the challenging problem of optimizing secreted protein production in yeast.4 The results so obtained indicate that the results of Modak and Lim20 may have wider applicability.

population. Each of the elites of different static penalty constants, the elite from the penalty approach of Barbosa and Lemonge,32 and nine of their corresponding mutants are propagated into future generations, to ensure the survival of the best solutions. A.6. Insert/Delete Time Operator. A new operator is introduced in the simulations wherein a random switching time in the chromosome is selected with a certain probability. A random amount of time is added or removed from this point onward with equal probability. The probability of insertion/deletion and the amount of time added/removed is obtained using trial and error. probability of insertion=deletion ¼ ð0:15  0:1Þ  genno=maxgen time interval

¼ 010 h for kinetics of Modak and Lim20 ¼ 02 h for secreted protein production problem4 with fixed initial conditions ¼ 0–5 h for secreted protein production problem for periodic boundary conditions

Since it has already been reported that the number of intervals does not affect the performance of the GA,6 without making any a priori assumptions, the number of intervals is fixed as six in the present study. The equality constraints on state variables during periodic operation are converted to soft constraints by taking the constraint violation value of 0 when the normalized constraint violation is