J. Phys. Chem. B 2008, 112, 5905-5909
Parallel Search of Long Circular Strands: Modeling, Analysis, and Optimization†
Iddo Eliazar,*,‡ Tal Koren,§ and Joseph Klafter§
Department of Technology Management, Holon Institute of Technology, Holon 58102, Israel, and School of Chemistry, Raymond & Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
Received: July 1, 2007; In Final Form: December 3, 2007
We introduce and explore a model of an ensemble of agents searching, in parallel, a long circular strand for a target site. The agents performing the search combine local-scanning (conducted by a one-dimensional motion along the strand) and random relocations on the strand. The agent-ensemble search-durations are analyzed, their limiting probability distributions are obtained in closed-form, and the optimal relocation strategies are derived. The results encompass the cases of parallel and massively parallel searches, taking place in the presence of either finite-mean or heavy-tailed relocation durations. The results are applicable to a wide spectrum of local-scans, including linear motions, Brownian motions, subdiffusive motions, fractional Brownian motions, and fractional Lévy motions.
1. Introduction

Search models in which agents search a space for a target site are widespread across many fields of science [1-6]. The scientific literature offers a plethora of search models: the space searched being physical, geometric, or combinatorial; the search being single (conducted by a single agent) [7], parallel (conducted by an ensemble of agents) [8], or massively parallel (conducted by a large ensemble of agents) [9, 10]; the search-method being deterministic or stochastic; and so forth.

In this paper we consider the parallel and massively parallel search of a long circular strand, embedded in a higher-dimensional domain. This setting is prevalent in molecular biology, where enzymes search a circular DNA strand (plasmid) for a particular binding-site. The enzymatic search is conducted in a process referred to as facilitated diffusion [11-14]. Enzymes alternate between a one-dimensional (1D) motion along the strand and a confined three-dimensional (3D) motion within the cellular medium surrounding the strand. An enzyme performing a 1D motion along the strand occasionally disengages from it and commences the confined motion. The confined motion “lands” the enzyme on a new location on the strand, from which the enzyme resumes the 1D motion. The target search takes place only during the 1D motion phase. A schematic illustration of the “facilitated diffusion algorithm” is depicted in Figure 1.

The process of facilitated diffusion turns out to be a highly effective and efficient search-mechanism, and has attracted considerable attention from both experimentalists and theoreticians in recent years [13-18] (see also refs 19 and 20 for higher-dimensional extensions). From an abstract perspective, the two key features of the facilitated diffusion process are (i) local-scanning, performed by the 1D motion along the strand, and (ii) relocation, caused by the confined 3D motion in the surrounding medium.
Motivated by the “biological solution” of the DNA search problem, we introduce and explore a general “facilitated diffusion algorithm” for the search of a long circular strand
† Part of the “Attila Szabo Festschrift”. * Corresponding author. ‡ Holon Institute of Technology. § Tel Aviv University.
Figure 1. An illustration of the facilitated diffusion algorithm for a single agent searching a circular strand. The x-axis denotes time; the y-axis denotes the agent’s location along the strand. (a) Local-scanning performed by a linear 1D motion. (b) Local-scanning performed by a random 1D motion. The vertical bars represent the disengagement epochs; the horizontal gaps following the vertical bars represent the relocation durations.
embedded in a higher-dimensional domain. The proposed algorithm combines the two key biological features of local-scanning and relocation. Yet, being general, it considers arbitrary local-scanning and relocation mechanisms, including anomalous mechanisms. Existing models of facilitated diffusion usually (i) consider a single searcher, (ii) consider specific 1D and 3D regular motions, and (iii) provide results regarding the mean search durations. The model proposed in this letter is far more general and robust. Nonetheless, the model turns out to be remarkably tractable:
10.1021/jp075113k CCC: $40.75 © 2008 American Chemical Society Published on Web 02/05/2008
5906 J. Phys. Chem. B, Vol. 112, No. 19, 2008
Closed-form results for the limiting distributions of the search durations are explicitly obtained, with the results covering the cases of parallel (including single-agent) and massively parallel searches, conducted in the presence of either regular or anomalous local-scanning and relocation mechanisms.

2. Modeling

Consider an agent-ensemble of size $m$ searching, in parallel, a long circular strand of length $n + l$ for a target site of length $l$. The target site’s length is far smaller than the strand’s length ($n \gg l$). The agents operate simultaneously and independently, each agent practicing the following “facilitated diffusion” search-algorithm, which combines local-scanning and relocation: (i) Begin from an arbitrary initial location on the strand. If the initial location is within the target site, stop. Otherwise, (ii) initiate a local-scan, performed by a 1D motion along the strand, and set an exponential timer. If the local-scan traces the target site before the timer expires, stop. Otherwise, (iii) relocate to a new random position on the strand, and start the search anew.

The local-scanning mechanism is performed by an arbitrary 1D motion with continuous sample-path trajectories. The local-scanning mechanism is characterized by its scan-function $\Psi(\cdot)$, a positive-valued function defined on the positive half-line, which is contingent upon the process-distribution of the underlying 1D motion [21]. Examples of various local-scanning mechanisms (linear, Brownian, self-similar, and subdiffusive), and their corresponding scan-functions, are given in Section 5 below.

The relocation mechanism is induced by an arbitrary mixing motion in the embedding domain. The relocation mechanism is characterized by its relocation rate $\lambda$, the rate of the underlying exponential timer, and by its relocation duration $R$, the random time it takes to relocate to a new random position on the strand.
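The three steps of the search-algorithm can be sketched as a short Monte Carlo routine for a single agent. This is only an illustrative sketch, not the simulation code behind the figures: the function name `simulate_search`, the uniform landing sites, the one-directional linear scan at speed `v`, and the exponentially distributed relocation durations are all assumptions made here for concreteness.

```python
import random


def simulate_search(n, l, v, lam, mean_reloc, rng):
    """Return one realization of a single agent's search duration.

    Assumptions (not taken from the paper): landing sites are uniform on
    the strand, scanning is linear in one direction at speed v, and
    relocation durations are exponential with mean `mean_reloc`.
    """
    circumference = n + l              # the target occupies the arc [0, l)
    elapsed = 0.0
    while True:
        x = rng.uniform(0.0, circumference)    # steps (i)/(iii): random landing site
        if x < l:                              # landed inside the target site
            return elapsed
        time_to_target = (circumference - x) / v   # scan time needed to reach the target
        timer = rng.expovariate(lam)               # step (ii): exponential timer, rate lam
        if time_to_target <= timer:                # the scan traces the target in time
            return elapsed + time_to_target
        # Timer expired first: add the scan time and one relocation duration.
        elapsed += timer + rng.expovariate(1.0 / mean_reloc)


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [simulate_search(1995, 5, 5.0, 0.5, 10.0, rng) for _ in range(2000)]
    print("sample mean search duration:", sum(runs) / len(runs))
```

An ensemble of $m$ independent agents could be handled, under the same assumptions, by taking the minimum of $m$ such realizations.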
Henceforth, $\langle R \rangle$ denotes the mean relocation duration, and the relocation duration is referred to as heavy-tailed if it is asymptotically Pareto-distributed with infinite mean: $P(R > t) \sim a/t^{\alpha}$ as $t \to \infty$, with exponent $0 < \alpha < 1$ ($a$ being a positive constant). (Heavy-tailed relocation times arise in cases where the confined 3D motion is subdiffusive [22, 23].)

3. Analysis

3.1. Parallel Search. Let $T^{m}_{n}$ denote the agent-ensemble search duration, i.e., the time elapsing until the first “discovery” of the target site by any of the $m$ searching agents. As indicated above, the target site’s length is far smaller than the strand’s length ($n \gg l$). This suggests an asymptotic investigation of the random variables $T^{m}_{n}$ in the limit $n \to \infty$. The results of the asymptotic analysis are given by the following stochastic limit-law [21]:

Proposition 1: The scaled search durations $T^{m}_{n} / n^{\nu}$ converge, in law (as $n \to \infty$), to a stochastic limit $T_m$ where the following apply:

(i) If the relocation duration possesses a finite mean, then the scaling exponent is $\nu = 1$, and the stochastic limit is exponentially distributed:

$$P(T_m > t) = \exp\left\{ -m \, \frac{l + \Psi(\lambda)}{\langle R \rangle + 1/\lambda} \, t \right\} \tag{1}$$

(ii) If the relocation duration is heavy-tailed, then the scaling exponent is $\nu = 1/\alpha$, and the stochastic limit is asymptotically Pareto-distributed:

$$P(T_m > t) \underset{t \to \infty}{\sim} \left( \frac{l + \Psi(\lambda)}{P(R > t)} \right)^{-m} \underset{t \to \infty}{\sim} \left( \frac{l + \Psi(\lambda)}{a} \, t^{\alpha} \right)^{-m} \tag{2}$$
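The two limit-laws of Proposition 1 are simple enough to evaluate directly. The sketch below does so under stated assumptions: the function names are hypothetical, `psi_lam` stands for the numerical value of $\Psi(\lambda)$ supplied by the caller, and the heavy-tailed formula is only the large-$t$ form of eq 2 with $P(R > t)$ replaced by $a/t^{\alpha}$.

```python
import math


def search_rate(l, psi_lam, lam, mean_reloc):
    """(l + Psi(lam)) / (<R> + 1/lam), the rate appearing in eq 1."""
    return (l + psi_lam) / (mean_reloc + 1.0 / lam)


def survival_finite_mean(t, m, l, psi_lam, lam, mean_reloc):
    """Eq 1: P(T_m > t) for finite-mean relocation durations."""
    return math.exp(-m * search_rate(l, psi_lam, lam, mean_reloc) * t)


def survival_heavy_tailed(t, m, l, psi_lam, a, alpha):
    """Large-t form of eq 2, using P(R > t) ~ a / t**alpha."""
    return ((l + psi_lam) * t ** alpha / a) ** (-m)


def min_agents_for_finite_mean(alpha):
    """Smallest integer m with m > 1/alpha (Pareto tail exponent m*alpha > 1)."""
    return math.floor(1.0 / alpha) + 1


if __name__ == "__main__":
    # Illustrative numbers only: l = 5, Psi(lam) = 10, lam = 0.5, <R> = 10.
    print(survival_finite_mean(1.0, 2, 5.0, 10.0, 0.5, 10.0))
    print(survival_heavy_tailed(16.0, 2, 5.0, 10.0, 1.0, 0.75))
    print(min_agents_for_finite_mean(0.75))
```

Since the Pareto limit in eq 2 has tail exponent $m\alpha$, its mean is finite exactly when $m\alpha > 1$, which is what the last helper encodes.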
The scaling in Proposition 1 is analogous to the central limit theorem (CLT) scaling for positive-valued random variables [24], being of the order $O(n)$ when the random variables possess a finite mean, and being of the order $O(n^{1/\alpha})$ when the random variables are heavy-tailed with exponent $\alpha$. Note that, in the case of heavy-tailed relocation durations (eq 2), the addition of searching agents results in the addition of converging moments to the stochastic limit $T_m$. In particular, once the number $m$ of searching agents exceeds the threshold level $1/\alpha$ ($2/\alpha$), then the stochastic limit $T_m$ attains a finite mean (variance).

3.2. Massively Parallel Search. The search becomes massively parallel when the agent-ensemble is large ($m \gg 1$). In this case, an asymptotic investigation of the random variables $T^{m}_{n}$ in the double limit $n, m \to \infty$ is required. Assuming that the “agents concentration” $m/n$ tends to a positive limit $\kappa$ (namely, $\lim_{n,m \to \infty} (m/n) = \kappa$), the results of the asymptotic analysis are given by the following stochastic limit-law [21]:

Proposition 2: The search durations $T^{m}_{n}$ converge, in law (as $n, m \to \infty$), to a stochastic limit $T$ where the following apply:

(i) If the relocation duration possesses a finite mean, then the stochastic limit is asymptotically exponentially distributed:

$$P(T > t) \underset{t \to \infty}{\sim} \exp\left\{ -\kappa \, \frac{l + \Psi(\lambda)}{\langle R \rangle + 1/\lambda} \, t \right\} \tag{3}$$

(ii) If the relocation duration is heavy-tailed, then the stochastic limit is asymptotically Weibull-distributed [the Weibull distribution is often referred to as “stretched exponential”]:

$$P(T > t) \underset{t \to \infty}{\sim} \exp\left\{ -\kappa_{\alpha} \, \frac{l + \Psi(\lambda)}{P(R > t)} \right\} \underset{t \to \infty}{\sim} \exp\left\{ -\kappa_{\alpha} \, \frac{l + \Psi(\lambda)}{a} \, t^{\alpha} \right\} \tag{4}$$
where $\kappa_{\alpha} = \kappa \sin(\pi\alpha)/(\pi\alpha)$. The stochastic limit-law obtained in the case of heavy-tailed relocation durations (eq 4) resembles the survival probability obtained in the “target problem” [9, 10].

3.3. Discussion. Numerical simulations of the search durations are presented in Figures 2 and 3. The asymptotic results of Propositions 1 and 2 are summarized in Table 1. We emphasize three important issues implied by Propositions 1 and 2:

Scaling. Moving from parallel to massively parallel searches has a dramatic effect on the time scales of the search durations $T^{m}_{n}$, reducing them from the orders $O(n)$ and $O(n^{1/\alpha})$ to the order $O(1)$.

Relocation. Shifting from finite-mean to heavy-tailed relocation durations has a marked effect on the limiting distributions obtained: in the case of parallel search, they change from exponential to asymptotically Pareto, and in the case of massively parallel search, they change from asymptotically exponential to asymptotically Weibull.

Universality. In the presence of finite-mean relocation durations, the limiting distributions turn out to be universal with respect to the relocation, being contingent on the relocation duration’s mean $\langle R \rangle$ alone, rather than on the relocation duration’s distribution. This universality is vividly exemplified by the numerical simulations presented in Figure 2. Universality, however, fails to hold in the presence of heavy-tailed relocation durations, in which case the limiting distributions are contingent on the relocation duration’s survival probability $P(R > t)$.

Figure 2. Simulations of a single agent ($m = 1$) searching a circular strand of length $n + l = 2000$. (a) Local-scanning performed by a linear motion with velocity $V = 5$; the length of the target site is $l = 5$; number of realizations: $2 \times 10^{5}$. (b) Local-scanning performed by a Brownian motion with diffusion parameter $D = 1$; the length of the target site is $l = 10$; number of realizations: $5 \times 10^{4}$. The empirical probability density function $p(t)$ of the scaled search durations is depicted on a logarithmic plot: $\log(p(t))$ vs $t$. The simulations were carried out for finite-mean relocation times, with mean $\langle R \rangle = 10$, drawn from various probability laws: deterministic, log-normal, exponential, gamma, and Weibull (stretched exponential). The solid line depicts the theoretical exponential limit predicted by eq 1, to which the simulations closely fit. The simulations also demonstrate the phenomenon of universality with respect to the relocation mechanism.
Figure 3. Simulations of a massively parallel search of a circular strand of length $n + l = 2000$, conducted by an agent-ensemble of size $m = 100$ performing linear scanning with velocity $V = 5$. The length of the target site is $l = 5$, and the “agents concentration” is $\kappa \approx 0.05$. The relocation times are heavy-tailed, drawn from a one-sided Lévy distribution with exponent $\alpha = 0.75$. Number of realizations: $4 \times 10^{4}$. The empirical survival probability $P_{>}(t)$ of the search durations is depicted on (a) a standard plot: $P_{>}(t)$ vs $t$; (b) a logarithmic plot: $\log(-\log(P_{>}(t)))$ vs $\log(t)$. The solid line depicts the theoretical Weibull (stretched exponential) limit predicted by eq 4, to which the simulations closely fit.
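The logarithmic plot of Figure 3b works because the Weibull law of eq 4 becomes a straight line of slope $\alpha$ in the coordinates $(\log t, \log(-\log P_{>}(t)))$. The sketch below illustrates this with hypothetical helper names; `psi_lam` is the value of $\Psi(\lambda)$ supplied by the caller, and the constant `a` is an assumed input (its value for the one-sided Lévy law of Figure 3 is not given in the text).

```python
import math


def kappa_alpha(kappa, alpha):
    """kappa_alpha = kappa * sin(pi*alpha) / (pi*alpha), as used in eq 4."""
    return kappa * math.sin(math.pi * alpha) / (math.pi * alpha)


def survival_weibull(t, kappa, l, psi_lam, a, alpha):
    """Large-t form of eq 4 for heavy-tailed relocation durations."""
    return math.exp(-kappa_alpha(kappa, alpha) * (l + psi_lam) * t ** alpha / a)


def weibull_plot_point(t, survival):
    """Map (t, P) to (log t, log(-log P)), the axes of Figure 3b."""
    return math.log(t), math.log(-math.log(survival))


def slope(p, q):
    """Slope of the straight line through two plot points."""
    return (q[1] - p[1]) / (q[0] - p[0])


if __name__ == "__main__":
    # Illustrative numbers only: kappa = 0.05, l = 5, Psi(lam) = 10, a = 1, alpha = 0.75.
    args = (0.05, 5.0, 10.0, 1.0, 0.75)
    p = weibull_plot_point(10.0, survival_weibull(10.0, *args))
    q = weibull_plot_point(1000.0, survival_weibull(1000.0, *args))
    print("slope on the Weibull plot:", slope(p, q))
```

The slope recovered this way equals the exponent $\alpha$, and the intercept carries the prefactor $\kappa_{\alpha}(l + \Psi(\lambda))/a$.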
TABLE 1: Searching Long Circular Strands ($n \gg l$): Classification of the Asymptotic Probabilistic Limit-Laws of the Search Durations (a)

| | finite-mean relocation time | heavy-tailed relocation time ($\alpha$) |
|---|---|---|
| parallel search ($m$ agents) | scaling: $O(n)$; limit: exponential | scaling: $O(n^{1/\alpha})$; limit: Pareto ($m\alpha$) |
| massively parallel search | scaling: $O(1)$; limit: exponential | scaling: $O(1)$; limit: Weibull ($\alpha$) |

(a) The rows indicate the search type: parallel ($m$ denoting the number of searching agents), or massively parallel. The columns indicate the type of the relocation time: finite-mean, or heavy-tailed (in the heavy-tailed column, the quantities in parentheses are the values of the corresponding exponents). The table summarizes the asymptotic results of eqs 1-4: the appropriate scaling of the search durations, and the emerging probabilistic limit-laws.

4. Optimization

Considering the case of finite-mean relocation durations, we now turn to address the issue of search-optimization. From eqs 1 and 3 it is evident that the maximization of the function

$$F(\lambda) = \frac{l + \Psi(\lambda)}{\langle R \rangle + 1/\lambda} \tag{5}$$

($\lambda > 0$) renders the stochastic limits $T_m$ (in the case of parallel search) and $T$ (in the case of massively parallel search) minimal. Sufficient conditions for the function $F(\lambda)$ to be unimodal and attain a global maximum are given by the following proposition [21]:

Proposition 3: Assume that the scan-function decays to zero ($\lim_{\lambda \to \infty} \Psi(\lambda) = 0$) and that the function $\Phi(\lambda) = \lambda \Psi(\lambda)$ initiates at the origin ($\Phi(0) = 0$), is concave ($\Phi''(\lambda) < 0$), and grows to infinity ($\lim_{\lambda \to \infty} \Phi(\lambda) = \infty$). Then, the function $F(\cdot)$ is unimodal. It initiates at the origin ($F(0) = 0$), increases monotonically to a global maximum, and thereafter decreases monotonically to the asymptotic level $\lim_{\lambda \to \infty} F(\lambda) = l/\langle R \rangle$. The global maximum is attained at the unique positive solution of the nonlinear equation

$$\Psi(\lambda) + \lambda \left( 1 + \langle R \rangle \lambda \right) \Psi'(\lambda) + l = 0 \tag{6}$$
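When eq 6 has no convenient closed-form solution, its root can be located numerically. The sketch below uses plain bisection, relying on the fact that the left-hand side of eq 6 has the same sign as $F'(\lambda)$ (positive before the maximum, negative after it) whenever the assumptions of Proposition 3 hold. The function names, the bracket defaults, and the example scan-function $\Psi(\lambda) = \sqrt{2D/\lambda}$ with $D = 10$, $l = 2$, $\langle R \rangle = 10$ are illustrative choices, not part of the original analysis.

```python
import math


def eq6_residual(lam, psi, dpsi, l, mean_reloc):
    """Left-hand side of eq 6; it has the same sign as F'(lam)."""
    return psi(lam) + lam * (1.0 + mean_reloc * lam) * dpsi(lam) + l


def optimal_relocation_rate(psi, dpsi, l, mean_reloc, lo=1e-9, hi=1.0):
    """Bisection for the positive root of eq 6 (assumes Proposition 3 holds)."""
    def g(lam):
        return eq6_residual(lam, psi, dpsi, l, mean_reloc)

    if g(lo) <= 0.0:
        raise ValueError("residual is not positive near zero")
    for _ in range(200):               # grow the bracket until the sign flips
        if g(hi) < 0.0:
            break
        hi *= 2.0
    else:
        raise ValueError("no sign change found; Proposition 3 may not apply")
    for _ in range(200):               # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def search_performance(lam, psi, l, mean_reloc):
    """F(lam) of eq 5."""
    return (l + psi(lam)) / (mean_reloc + 1.0 / lam)


if __name__ == "__main__":
    D, l, mean_reloc = 10.0, 2.0, 10.0                 # illustrative values
    psi = lambda lam: math.sqrt(2.0 * D / lam)         # example scan-function
    dpsi = lambda lam: -0.5 * math.sqrt(2.0 * D) * lam ** -1.5
    lam_opt = optimal_relocation_rate(psi, dpsi, l, mean_reloc)
    print(lam_opt, search_performance(lam_opt, psi, l, mean_reloc))
```

The explicit error for a missing sign change is deliberate: for scan-functions that violate the assumptions of Proposition 3, the residual may never change sign and no interior optimum exists.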
Examples of classes of local-scanning mechanisms with scan-functions satisfying the conditions of Proposition 3 are given in Section 5.

Proposition 3 enables us to compute the asymptotically optimal relocation rates, but is the relocation mechanism at all beneficial? While relocating, the agents are not searching for the target site and thus “waste time”. So, does the very presence of the relocation mechanism improve or worsen the overall search performance? One way of answering this question is to compare the mean search duration of a single searching agent between two scenarios: (i) with relocation and (ii) without relocation.
Figure 4. Search-performance $F(\lambda)$ (defined in eq 5) as a function of the relocation rate $\lambda$, in the presence of finite-mean relocation durations (with mean $\langle R \rangle = 10$). (a) Linear local-scanning, the monotone-increasing scenario $\langle R \rangle < l/V$. (b) Linear local-scanning, the monotone-decreasing scenario $\langle R \rangle > l/V$. (c) Brownian and subdiffusive local-scanning (with parameters $l = 2$ and $D = 10$). (d) Self-similar local-scanning (with parameters $l = 10$ and $c = 15$).
The mean search duration with relocation is given by [21]

$$\langle T^{1}_{n} \rangle \underset{n \to \infty}{\sim} \frac{1}{F(\lambda)} \, n \tag{7}$$
(the function $F(\cdot)$ given by eq 5). Let us denote by $\langle S^{1}_{n} \rangle$ the mean search duration of a single searching agent without relocation. In order to evaluate the mean-impact of the relocation mechanism, one has to compare the asymptotic behavior (as $n \to \infty$) of the sequences $\langle T^{1}_{n} \rangle$ and $\langle S^{1}_{n} \rangle$. When the sequence $\langle T^{1}_{n} \rangle$ grows slower than the sequence $\langle S^{1}_{n} \rangle$, the presence of the relocation mechanism, from a mean-performance perspective, is indeed beneficial. Examples of classes of local-scanning mechanisms for which the presence of the relocation mechanism is beneficial are given in Section 5.

5. Examples

5.1. Linear Local-Scanning. Consider the case where the 1D motion along the strand is deterministic and linear, advancing at a constant velocity $V$ ($V > 0$). In this case, the scan-function is harmonic: $\Psi(\lambda) = V/\lambda$. This scan-function does not satisfy the conditions of Proposition 3. Rather, in the presence of finite-mean relocation durations, the following three scenarios are possible:

• $\langle R \rangle = l/V$. In this scenario, $F(\lambda) \equiv V$, and the search performance is independent of the relocation rate.

• $\langle R \rangle < l/V$. In this scenario, the function $F(\cdot)$ increases monotonically from the initial level $F(0) = V$ to the asymptotic level $\lim_{\lambda \to \infty} F(\lambda) = l/\langle R \rangle$ (see Figure 4a). Consequently, the asymptotically optimal relocation strategy is to relocate as often as possible.

• $\langle R \rangle > l/V$. In this scenario, the function $F(\cdot)$ decreases monotonically from the initial level $F(0) = V$ to the asymptotic level $\lim_{\lambda \to \infty} F(\lambda) = l/\langle R \rangle$ (see Figure 4b). Consequently, the asymptotically optimal relocation strategy is to relocate as rarely as possible.
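The three scenarios above follow from a one-line computation, spelled out here as a worked step (it uses only eq 5 and the harmonic scan-function $\Psi(\lambda) = V/\lambda$):

```latex
\begin{align*}
F(\lambda) &= \frac{l + V/\lambda}{\langle R \rangle + 1/\lambda}
            = \frac{l\lambda + V}{\langle R \rangle \lambda + 1}, \\[4pt]
F'(\lambda) &= \frac{l\,(\langle R \rangle \lambda + 1) - \langle R \rangle\,(l\lambda + V)}
                   {(\langle R \rangle \lambda + 1)^{2}}
             = \frac{l - V \langle R \rangle}{(\langle R \rangle \lambda + 1)^{2}} .
\end{align*}
```

The sign of $F'(\lambda)$ is therefore the same for all $\lambda$: it is zero when $\langle R \rangle = l/V$, positive when $\langle R \rangle < l/V$, and negative when $\langle R \rangle > l/V$, with $F(0) = V$ and $F(\lambda) \to l/\langle R \rangle$ as $\lambda \to \infty$ in every case.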
The mean search duration of a single searching agent, without relocation, is given by $\langle S^{1}_{n} \rangle \sim n/(2V)$ ($n \to \infty$), being of the same order ($O(n)$) as the mean search duration with relocation $\langle T^{1}_{n} \rangle \sim n/F(\lambda)$ ($n \to \infty$).

5.2. Brownian Local-Scanning. Consider the case where the 1D motion along the strand is random and diffusive, conducted by a Brownian motion with diffusion parameter $D$ ($D > 0$). In this case, the scan-function is given by $\Psi(\lambda) = \sqrt{2D/\lambda}$. This scan-function satisfies the conditions of Proposition 3 (see Figure 4c), and the optimal relocation rate is given by

$$\lambda_{\mathrm{opt}} = \frac{1}{\langle R \rangle} + \frac{l^{2}}{D \langle R \rangle^{2}} \left( 1 + \sqrt{1 + \frac{2 D \langle R \rangle}{l^{2}}} \right) \tag{8}$$
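For readers who wish to check eq 8, the following worked step derives it from eq 6; the substitution $x = \sqrt{\lambda}$ is a device introduced here, not taken from the original text.

```latex
\begin{align*}
&\Psi(\lambda) = \sqrt{2D/\lambda}
  \;\Longrightarrow\; \Psi'(\lambda) = -\frac{\Psi(\lambda)}{2\lambda}, \\[4pt]
&\text{eq 6:}\quad
  \Psi(\lambda)\left[ 1 - \tfrac{1}{2}\left( 1 + \langle R \rangle \lambda \right) \right] + l = 0
  \;\Longleftrightarrow\;
  \sqrt{2D}\,\bigl( \langle R \rangle \lambda - 1 \bigr) = 2 l \sqrt{\lambda}, \\[4pt]
&x = \sqrt{\lambda}:\quad
  \langle R \rangle x^{2} - \frac{2l}{\sqrt{2D}}\,x - 1 = 0
  \;\Longrightarrow\;
  x = \frac{l}{\langle R \rangle \sqrt{2D}}
      \left( 1 + \sqrt{1 + \frac{2D\langle R \rangle}{l^{2}}} \right), \\[4pt]
&\lambda_{\mathrm{opt}} = x^{2}
  = \frac{1}{\langle R \rangle}
    + \frac{l^{2}}{D \langle R \rangle^{2}}
      \left( 1 + \sqrt{1 + \frac{2D\langle R \rangle}{l^{2}}} \right).
\end{align*}
```

Only the positive root of the quadratic is admissible, since $x = \sqrt{\lambda} > 0$; the last line uses $(1 + s)^{2} = 2(1 + s) + 2D\langle R \rangle / l^{2}$ for $s = \sqrt{1 + 2D\langle R \rangle / l^{2}}$.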
The mean search duration of a single searching agent, without relocation, is given by $\langle S^{1}_{n} \rangle \sim n^{2}/(6D)$ ($n \to \infty$), being an order of magnitude greater than the mean search duration with relocation $\langle T^{1}_{n} \rangle \sim n/F(\lambda)$ ($n \to \infty$). Hence, in the case of Brownian local-scanning, the presence of the relocation mechanism markedly improves the search performance.

5.3. Self-Similar Local-Scanning. A process $(Z(t))_{t \ge 0}$ is said to be self-similar with Hurst exponent $H$ ($H > 0$) if, for any positive scale $s$, the process $(Z(st))_{t \ge 0}$ is equal, in law, to the process $(s^{H} Z(t))_{t \ge 0}$ [25]. Examples of self-similar processes with continuous sample-path trajectories include fractional Brownian motions ($0 < H < 1$) and fractional Lévy motions (whose Hurst exponent lies between the reciprocal of the motion’s Lévy exponent and 1, the Lévy exponent itself lying between 1 and 2) [25]. Fractional Brownian motions with Hurst exponent $1/2 < H < 1$ and fractional Lévy motions are processes with long-ranged memory [26, 27].

Consider the case where the 1D motion along the strand is self-similar, conducted by a self-similar process with continuous sample-path trajectories and Hurst exponent $H$ ($H > 0$). In this case, the scan-function is a power-law $\Psi(\lambda) = c_{1}/\lambda^{H}$ ($c_{1}$ being a positive constant). If the Hurst exponent is in the range $0 < H < 1$, then the scan-function satisfies the conditions of Proposition 3 (see Figure 4d).
The mean search duration of a single searching agent, without relocation, is given by $\langle S^{1}_{n} \rangle \sim c_{2} n^{1/H}$ ($n \to \infty$; $c_{2}$ being a positive constant). If the Hurst exponent is in the range $0 < H < 1$, then the mean $\langle S^{1}_{n} \rangle$ grows at a greater order than the mean search duration with relocation $\langle T^{1}_{n} \rangle \sim n/F(\lambda)$ ($n \to \infty$). Hence, in the case of self-similar local-scanning with Hurst exponent $0 < H < 1$, the presence of the relocation mechanism markedly improves the search performance.

5.4. Subdiffusive Local-Scanning. Consider the case where the 1D motion along the strand is random and subdiffusive, conducted by a Brownian motion with diffusion parameter $D$ ($D > 0$), which is occasionally interrupted by heavy-tailed random halts: the durations of the random halts being asymptotically Pareto-distributed with exponent $\beta$ ($0 < \beta < 1$) [22]. Subdiffusive motions are processes with long-ranged memory [26, 27]. In this case, the scan-function is given by $\Psi(\lambda) = \sqrt{2D/(\lambda + b\lambda^{\beta})}$ ($b$ being a positive constant). Since $0 < \beta < 1$, the term $b\lambda^{\beta}$ dominates the denominator for small $\lambda$ and the term $\lambda$ dominates for large $\lambda$. The subdiffusive scan-function therefore coincides with the self-similar scan-function, with Hurst exponent $H = \beta/2$, in the limit $\lambda \to 0$, and coincides with the Brownian scan-function in the limit $\lambda \to \infty$. The subdiffusive scan-function satisfies the conditions of Proposition 3 (see Figure 4c).

The mean search duration of a single searching agent, without relocation, is infinite ($\langle S^{1}_{n} \rangle = \infty$). The infinite mean is a consequence of the heavy-tailed random halts. Hence, in the case of subdiffusive local-scanning, the presence of the relocation mechanism is essential: relocation bails out the searching agents from being trapped in long subdiffusive halts.

6. Conclusions

We introduced and explored a stochastic model of an ensemble of agents searching a long circular strand for a target site. The agents operate in parallel and independently, following a “facilitated diffusion” search algorithm combining local-scanning and random relocation. Asymptotic analysis of the search durations was conducted, yielding closed-form formulas for the corresponding limiting probability distributions. These formulas, in turn, enabled the derivation of the asymptotically optimal relocation rates. The theory developed encompasses the cases of both parallel and massively parallel searches, taking place in the presence of
either finite-mean or heavy-tailed relocation durations. Moreover, the theory is applicable to a wide spectrum of local-scanning mechanisms, including linear motions, Brownian motions, subdiffusive motions, fractional Brownian motions, and fractional Lévy motions.

References and Notes

(1) Levandowsky, M.; Klafter, J.; White, B. S. Bull. Mar. Sci. 1988, 43, 758.
(2) Stone, L. D. Theory of Optimal Search; Operations Research Society of America: Arlington, VA, 1989.
(3) Viswanathan, G. M. Nature (London) 1999, 401, 911.
(4) Bénichou, O. Phys. Rev. Lett. 2005, 94, 198101.
(5) Bénichou, O.; Coppey, M.; Moreau, M.; Voituriez, R. Europhys. Lett. 2006, 75, 349.
(6) Shlesinger, M. F. Nature (London) 2006, 443, 281.
(7) Coppey, M. Biophys. J. 2004, 87, 1640.
(8) Lomholt, M. A.; Ambjörnsson, T.; Metzler, R. Phys. Rev. Lett. 2005, 95, 260603.
(9) Shlesinger, M. F.; Montroll, E. W. Proc. Natl. Acad. Sci. U.S.A. 1984, 81, 1280.
(10) Blumen, A. G.; Zumofen, G.; Klafter, J. Phys. Rev. B 1984, 80, 5379.
(11) Riggs, A. D.; Bourgeois, S.; Cohn, M. J. Mol. Biol. 1970, 53, 401.
(12) von Hippel, P. H.; Berg, O. G. J. Biol. Chem. 1989, 264, 675.
(13) Gowers, D. M.; Wilson, G. G.; Halford, S. E. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 15883.
(14) Widom, J. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 16909.
(15) Slutsky, M.; Mirny, L. A. Biophys. J. 2004, 87, 4021.
(16) Sokolov, I. M. Biophys. J. 2005, 89, 895.
(17) Hu, T.; Grosberg, A. Y.; Shklovskii, B. I. Biophys. J. 2006, 90, 2731.
(18) Oshanin, G.; Wio, H. S.; Lindenberg, K.; Burlatsky, S. F. J. Phys.: Condens. Matter 2007, 19, 065142.
(19) Bénichou, O.; Loverdo, C.; Moreau, M.; Voituriez, R. Phys. Rev. E 2006, 74, 020102(R).
(20) Bénichou, O.; Moreau, M.; Suet, P. H.; Voituriez, R. J. Chem. Phys. 2007, 126, 234109.
(21) Eliazar, I.; Koren, T.; Klafter, J. J. Phys.: Condens. Matter 2007, 19, 065140.
(22) Eliazar, I.; Klafter, J. Physica D 2004, 187, 30.
(23) Lomholt, M. A.; Zaid, I. M.; Metzler, R. Phys. Rev. Lett. 2007, 98, 200603.
(24) Gnedenko, B. V.; Kolmogorov, A. N. Limit Distributions for Sums of Independent Random Variables; Addison-Wesley: London, 1954.
(25) Embrechts, P.; Maejima, M. Self-Similar Processes; Princeton University Press: Princeton, NJ, 2002.
(26) Taqqu, M., Oppenheim, G., Eds. Theory and Applications of Long-Range Dependence; Birkhäuser: Boston/Basel/Berlin, 2002.
(27) Rangarajan, G., Ding, M., Eds. Processes with Long-Range Correlations: Theory and Applications; Lecture Notes in Physics, 621; Springer-Verlag: New York, 2003.