ARTICLE pubs.acs.org/crystal

Novel Genetic Algorithm-Inspired Concept for Macromolecular Crystal Optimization

Emmanuel Saridakis*

Laboratory of Structural and Supramolecular Chemistry, Institute of Physical Chemistry, National Centre for Scientific Research “Demokritos”, P.O. Box 60228, Aghia Paraskevi 15310, Athens, Greece

ABSTRACT: A novel concept is presented for optimizing crystallization conditions, based on the principles of Genetic Algorithm stochastic optimization methods. The concept was tested with three model proteins as well as in a real problem situation involving crystallization of a DNA oligonucleotide. In these tests, initial microcrystalline suspensions or clusters of microcrystals could be optimized rapidly and with a minimal number of experiments.

Macromolecular crystallization conditions are usually discovered by screening against more or less standard sets of candidate conditions, known as “crystallization screens”. One or more of these conditions may show some promise, most often in the form of microcrystals, clusters, or microcrystalline suspension: such conditions are referred to as “hits”. The subsequent optimization step generally consists of fine-tuning these promising conditions by changing the values of the various parameters, such as concentrations and pH, in small increments, until useful crystals are obtained. Very often, this obvious approach fails. In such cases, recombination and mutation of the initial “hits” using a simple Genetic Algorithm-inspired approach can be an effective alternative route.

Received: March 2, 2011. Revised: April 28, 2011. Published: May 4, 2011.
Crystal Growth & Design. dx.doi.org/10.1021/cg200263u | Cryst. Growth Des. 2011, 11, 2993–2998

INTRODUCTION

When attempting to crystallize a macromolecule, a multidimensional parameter space is initially explored using a set of candidate conditions, referred to as a screen. The relevant parameters include the type and concentration of precipitating agent, the concentration of protein, the type and concentration of a secondary precipitating agent and/or of an additive, the pH, the temperature, and possibly others.1,2 Crystallization screens can rely on various principles, ranging from a random generation of conditions3 starting from a pool of known precipitating agents, buffers, and additives (at a range of pH and temperature that are assumed not to destroy the macromolecule) to more systematically designed screens.4–6 The most commonly used type of screen is, however, the sparse-matrix design,7,8 in which the conditions that have been most successful for previously crystallized proteins are selected. This leads to a biased search of the parameter space, weighted according to past success.

Whichever screen type is chosen (often a selection of screens based on different design principles is used),5 the experimenter is at this stage looking for a “hit” in one or more of the tried conditions. This hit can, in rare, lucky cases, be a well-diffracting crystal, but more often consists of tiny crystals, “showers” of microcrystals, clusters of microcrystals (often in the shape of sea urchins or spherulites), or even microcrystalline suspension. In some cases, none of the above can even be reached and the experimenter is forced to rely on phase separation or light, “shiny” precipitate as starting points in the search for crystals.

Unless well-diffracting crystals are immediately obtained, an optimization must then be conducted, based on these initial hits. The first optimization step usually consists of setting up a fine grid of conditions centered at each of those “promising points” of the parameter space. Each crystallization variable is sampled in small increments around the hit condition; a set of additives or even a different crystallization setup may also be tried at this stage. If this approach fails to produce better crystals, as it very often does, one must reach for a different screen. In the face of repeated failure, the protein must be modified or the project may even be abandoned.

An alternative method of optimization of initial hits is proposed here, based on the principles of Genetic Algorithms (GA). The original GAs are stochastic multiparameter optimization techniques used in computational environments.9 They can be used to numerically solve equations when analytical solutions are not possible.10 They have also been used in various areas of computer-aided industrial design as well as in telecommunications and traffic routing problems,11 when a multitude of variables must be simultaneously optimized for the design to be viable. GAs are so called because they mimic the evolutionary “optimization” (more correctly, adaptation) of living organisms, mediated by genetic recombination, mutations, and selective pressure on successive population generations.9

The parameters to be optimized can be thought of as genetic loci on a virtual chromosome. Each value of the parameter is an allele. The whole “chromosome” is thus a full set of parameters with specified values (in our case, a crystallization condition). A few “successful chromosomes” (crystallization hits) are selected from a “parent generation” (a crystallization screen) and their alleles (parameter values) are recombined to form the next “generation” of “chromosomes” (candidate optimization conditions). From that second generation, the most successful conditions are again selected and the process is reiterated. Sometimes a “mutation” is introduced, that is, a parameter is randomly selected and its value randomly changed to a completely new value, ideally one that was not present in the original screen at all. Mutations can be simple or multiple, or they can be mixed with recombinations.

Figure 1. Simplified schematic illustration for one cycle of a Genetic Algorithm-inspired procedure as applied to only two “hits” from a standard crystallization screen, used to generate three 2nd generation conditions. A mutation regarding the setup technique is also introduced.

For the protein crystallization case, a chromosome may be specified as follows:

$$C_{1a} = \{[\mathrm{protein}]_i,\ \mathrm{precipitant}_k,\ [\mathrm{precipitant}]_l,\ \mathrm{temperature}_m,\ \mathrm{pH}_n,\ \mathrm{additive}_o,\ [\mathrm{additive}]_p,\ [\mathrm{ligand}]_q,\ \ldots\}$$

where i, k, l, ... are the different discrete or continuous, numerical or descriptive, values that the respective parameters may take and the square brackets signify concentration. Thus a particular condition may for instance be:

$$C_{1a} = \{[\mathrm{protein\ X}]_{20\ \mathrm{mg/mL}},\ \mathrm{precipitant}_{\mathrm{NaCl}},\ [\mathrm{precipitant}]_{4\%\ (w/v)},\ \mathrm{temperature}_{20\ ^{\circ}\mathrm{C}},\ \mathrm{pH}_{4.5},\ \mathrm{additive}_{\mathrm{PEG\ 4000}},\ [\mathrm{additive}]_{2\%\ (w/v)},\ \ldots\}$$

Another condition may be:

$$C_{1b} = \{[\mathrm{protein\ X}]_{20\ \mathrm{mg/mL}},\ \mathrm{precipitant}_{\mathrm{amm.\ phosphate}},\ [\mathrm{precipitant}]_{1.5\ \mathrm{M}},\ \mathrm{temperature}_{4\ ^{\circ}\mathrm{C}},\ \mathrm{pH}_{6.5},\ \mathrm{additive}_{\mathrm{KCl}},\ [\mathrm{additive}]_{0.1\ \mathrm{M}},\ \ldots\}$$

Assuming that the above two conditions are hits (neither of which can be optimized by conventional fine-tuning of the variables), one of the possible recombinations is:

$$C_{2a} = \{[\mathrm{protein\ X}]_{20\ \mathrm{mg/mL}},\ \mathrm{precipitant}_{\mathrm{amm.\ phosphate}},\ [\mathrm{precipitant}]_{1.5\ \mathrm{M}},\ \mathrm{temperature}_{20\ ^{\circ}\mathrm{C}},\ \mathrm{pH}_{4.5},\ \mathrm{additive}_{\mathrm{PEG\ 4000}},\ [\mathrm{additive}]_{2\%\ (w/v)},\ \ldots\}$$

where the subscripts 1 and 2 denote the successive generations. A few such “recombinant” (or “mutated”, or mixed) conditions are randomly generated, by hand or with the help of a computer depending on the number and complexity of the hits, and some or all are set up. Since it is easier to design a great number of second generation conditions than to actually set them up, the experimenter’s intuition may be used to select the ones that are actually going to be set up. Too much use of intuition might nevertheless lead to missing unlikely but successful conditions (see also Discussion). The second generation trials are inspected and the procedure may stop there if interesting crystals are found in one or more drops, or the process can be reiterated to form a new generation of conditions. A simplified schematic illustration for one cycle of such a procedure is shown in Figure 1.

Four “model” proteins, catalase, thaumatin, papain, and lysozyme, have been selected to test this proposed concept, using as “parent generation” the most widely used, commercially available standard protein crystallization screens. Although crystallization conditions are known for these proteins, we pretended ignorance and followed the results of the screens as if we were dealing with unknown targets. We have also successfully applied the method to a real problem case, involving the crystallization of a DNA oligomer in the presence of octakis(6-guanidino-6-deoxy)-γ-cyclodextrin (gguanCD).12
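The recombination and mutation steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure used in this work (where the conditions were generated by hand): the dictionary encoding, the function names `recombine` and `mutate`, and the values in `MUTATION_POOL` are hypothetical. The two parents are the imaginary hits C1a and C1b from the text.

```python
import random

# Imaginary first-generation hits C1a and C1b from the text, encoded as
# "chromosomes": each key is a locus (parameter), each value an allele.
C1A = {
    "protein_conc": "20 mg/mL",
    "precipitant": "NaCl",
    "precipitant_conc": "4% (w/v)",
    "temperature": "20 C",
    "pH": 4.5,
    "additive": "PEG 4000",
    "additive_conc": "2% (w/v)",
}
C1B = {
    "protein_conc": "20 mg/mL",
    "precipitant": "ammonium phosphate",
    "precipitant_conc": "1.5 M",
    "temperature": "4 C",
    "pH": 6.5,
    "additive": "KCl",
    "additive_conc": "0.1 M",
}

# Hypothetical pool of alleles absent from both parents (assumption made
# for this example only); used exclusively for mutations.
MUTATION_POOL = {
    "pH": [5.5, 7.5, 8.5],
    "temperature": ["12 C", "16 C"],
    "additive": ["MPD", "glycerol"],
}


def recombine(parents, rng):
    """Build a child by drawing each locus from a randomly chosen parent."""
    return {locus: rng.choice(parents)[locus] for locus in parents[0]}


def mutate(condition, pool, rng):
    """Return a copy with one randomly chosen locus set to a new allele."""
    child = dict(condition)
    locus = rng.choice(sorted(pool))
    child[locus] = rng.choice(pool[locus])
    return child


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Three pure recombinations plus one mixed recombination/mutation.
second_generation = [recombine([C1A, C1B], rng) for _ in range(3)]
second_generation.append(mutate(recombine([C1A, C1B], rng), MUTATION_POOL, rng))

for condition in second_generation:
    print(condition)
```

Note that this uniform recombination treats every locus independently, so it can pair a precipitant with the other parent's concentration, whereas the example C2a in the text keeps each compound together with its own concentration.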

MATERIALS AND METHODS

Bovine liver catalase (C9322), thaumatin from Thaumatococcus daniellii (T7638), papain from Carica papaya (76218), and hen egg white lysozyme (L6876) were purchased from Sigma-Aldrich, Germany. The self-complementary κB DNA 17-mer oligonucleotide13 5′-CGCTGGAAATTTCCAGC-3′ was purchased in purified and lyophilized form from Eurogentec S.A., Belgium. The Crystal Screen (cat. no. HR2110) and the Nucleic Acid Mini Screen (cat. no. HR2118) were purchased from Hampton Research (Aliso Viejo, CA). Structure Screen 2 (cat. no. MD12) was purchased from Molecular Dimensions Ltd., U.K. The polyethylene glycols (except PEG 6000), buffers, salts, 2-methyl-2,4-pentanediol (MPD), spermine, and n-octyl-β-D-glucoside (β-OG) were purchased from Sigma-Aldrich. PEG 6000 was purchased from Merck, Germany, and tert-butanol from Riedel-de Haën AG, Germany.

The stock macromolecular solutions were made as follows. Catalase was dissolved to 20 mg/mL in 25 mM HEPES buffer at pH 7.0. Thaumatin was dissolved to 40 mg/mL in 25 mM HEPES at pH 6.8. Papain was dissolved to 12 mg/mL in 20 mM NaCl and 20 mM sodium acetate at pH 4.9 (earlier attempts to prepare a solution in HEPES at pH 6.8 resulted in very incomplete dissolution). The slightly cloudy papain solution was then centrifuged for 2 min at 12 000 g and the clear yellowish supernatant was used. Lysozyme was dissolved to 50 mg/mL in 25 mM sodium acetate at pH 4.9. All the buffers had been sterile filtered through 0.22 μm syringe filters (Millipore, Bedford, MA). A 4 mM solution of the κB DNA oligonucleotide was prepared in deionized water and mixed at a 1:1 molar ratio with an octakis(6-guanidino-6-deoxy)-γ-cyclodextrin (gguanCD) solution, prepared as described in ref 12.

All crystallization trials were performed in vapor diffusion hanging drops using 24-well XRL crystallization plates (MD311, Molecular Dimensions Ltd.) and siliconized glass coverslips (HR3231, Hampton Research). All protein trials were performed at 16 °C. Oligonucleotide trials were performed at various temperatures (see Results).

The X-ray diffraction of the thaumatin and κB DNA crystals was tested at 100 K on the in-house system: a Rigaku RU-H3R rotating anode X-ray generator equipped with an R-AXIS IV image plate detector and an Oxford Cryosystems cryostream.

RESULTS

Catalase gave hits at four of the 50 conditions of the Hampton Research Crystal Screen, namely conditions #33, #34, #36, and #37. These conditions are detailed in Table 1. The best condition was #34, which gave clusters of microcrystals. The others gave microcrystalline suspensions. Eleven second generation conditions were set up, of which seven were pure recombinations, three were simple mutations, and one was a mixed recombination/mutation (Table 1). Two second generation conditions gave single crystals, whereas all other conditions resulted in precipitation. Small crystals were obtained in 8% PEG 6000, 100 mM Na acetate at pH 4.6, which is a mutated condition #37 (PEG 4000 was mutated to PEG 6000). Larger crystals (ca. 0.18 mm × 0.1 mm × 0.05 mm) were obtained in 8% PEG 4000, 100 mM Tris at pH 8.5, which is a recombination of screen conditions #36 and #37.

Thaumatin is well-known to crystallize easily from solutions containing tartrate ions. Such conditions (#2 and #29) were therefore excluded from the Crystal Screen. Of the remaining 48 conditions, four gave hits, namely conditions #4, #15, #20, and #47. These conditions are detailed in Table 1. Conditions #4, #15, and #47 yielded microcrystalline suspensions, whereas condition #20 gave clusters of microcrystals after several weeks. Fourteen second generation conditions were generated, of which seven were pure recombinations, three were simple mutations, and four were mixed recombination/mutations (Table 1). All second generation conditions gave precipitates or microcrystalline suspensions except one, which gave a large single crystal within a few days. A good quality data set to 1.8 Å was collected from that crystal at the in-house data collection system. The successful condition was 25% (w/v) PEG 8000, 0.2 M ammonium sulfate, 100 mM Tris at pH 8.5, that is, a recombination of conditions #4, #15, and #20.

A search in the Protein Data Bank and in the Biological Macromolecular Crystallization Database found no reported thaumatin crystallization conditions similar to these. The crystallographic data of the new crystal are: space group P4₁2₁2, a = b = 58.1 Å, c = 149.0 Å. The same space group and similar cell dimensions are reported in the Protein Data Bank (entry 1THW) for crystals grown from tartrate, which resulted in a 1.75 Å resolution structure,14 and in ref 15 for crystals grown with ammonium sulfate without tartrate.

Papain gave no hits in the Crystal Screen and was therefore screened against the Molecular Dimensions Structure Screen 2, an extension8 of the original Jancarik and Kim7 screen. Five conditions gave hits, namely conditions #6, #8, #17, #27, and #32. These conditions are detailed in Table 1. All of these gave microcrystalline suspensions; however, conditions #8 and #17 looked marginally better. It was therefore decided to over-represent those in the second generation. Sixteen second generation conditions were generated, of which two were simple mutations, three were mixed recombination/mutations, and the rest were pure recombinations (Table 1). Small single crystals were obtained within three days in 50% MPD, 0.2 M ammonium dihydrogen phosphate, and 100 mM HEPES at pH 7.5 (which is a recombination of conditions #8 and #17), and a larger plate within one week in 35% tert-butanol, 0.2 M ammonium dihydrogen phosphate, 100 mM Tris at pH 8.5 (which is a recombination of conditions #8 and #32). The other conditions variously gave clear drops, precipitation, microcrystalline suspensions, or phase separation. Eventually, more than five weeks after setup, crystals as good as those of the optimized condition also appeared at the original screen condition #8.

In retrospect, the results of the Crystal Screen were submitted to closer scrutiny, to detect indications of potentially successful conditions that had been initially classified as not worth pursuing, and to compare those with our final optimized conditions. Although no crystalline material had been obtained from the Crystal Screen, two of the four MPD-containing conditions of that screen gave phase separation and gel-like precipitation. Other conditions that gave either gel-like or light, apparently nondenatured precipitate contained PEG 400 (at pH 7.5 and 8.5), sodium/potassium tartrate (at pH 7.5), as well as higher molecular weight PEGs (4000 and 8000) throughout the pH range (4.6–8.5) and in combination with a wide range of salts including ammonium salts and potassium phosphate (but no ammonium phosphate as in the optimized conditions). These results confirm that it may be possible to detect useful leads (in the present case MPD and ammonium salts, as well as a preference for higher pH) even from screens that at first sight contain no hits and would be discarded. This is particularly important if it is paramount to minimize the number of initial screening experiments. However, in those cases optimization, either as presented here or by other means, may become a more protracted and riskier undertaking. The decision of whether to optimize from borderline hits or to proceed to further screening before homing in on promising conditions therefore depends on the individual case and its needs.

The standard benchmark protein lysozyme gave, as expected, many hits at diverse conditions in the Crystal Screen. Thirteen conditions were hits, seven of which were already large single crystals ready for data collection. Lysozyme was therefore discarded as being too unproblematic for this method.

Table 1. First Generation Hits and Second Generation Conditions for the Three Model Proteins (a)

Catalase (20 mg/mL in 25 mM HEPES buffer, pH 7.0)
  1st generation hit conditions:
    #33: 4 M Na formate
    #34: 2 M Na formate, 0.1 M NaOAc pH 4.6
    #36: 8% PEG 8K, 0.1 M Tris pH 8.5
    #37: 8% PEG 4K, 0.1 M NaOAc pH 4.6
  Crystal-yielding 2nd generation conditions (shaded grey in the original table):
    8% PEG 6K, 0.1 M NaOAc pH 4.6 (M)
    8% PEG 4K, 0.1 M Tris pH 8.5 (R)

Thaumatin (40 mg/mL in 25 mM HEPES buffer, pH 6.8)
  1st generation hit conditions:
    #4: 2 M amm. sulf., 0.1 M Tris pH 8.5
    #15: 30% PEG 8K, 0.2 M amm. sulf., 0.1 M Na cacodylate pH 6.5
    #20: 25% PEG 4K, 0.2 M amm. sulf., 0.1 M NaOAc pH 4.6
    #47: 2 M amm. sulf., 0.1 M NaOAc pH 4.6
  Crystal-yielding 2nd generation condition (shaded grey in the original table):
    25% PEG 8K, 0.2 M amm. sulf., 0.1 M Tris pH 8.5 (R)

Papain (12 mg/mL in 20 mM NaCl, 20 mM sodium acetate, pH 4.9)
  1st generation hit conditions:
    #6: 1 M Li sulf., 10 mM NiCl2, 100 mM Tris pH 8.5
    #8: 50% MPD, 0.2 M amm. phosphate, 100 mM Tris pH 8.5
    #17: 70% MPD, 100 mM HEPES pH 7.5
    #27: 25% PEG 550 MME, 10 mM ZnSO4, 100 mM MES pH 6.5
    #32: 35% tert-butanol, 100 mM Na citrate pH 5.6
  Crystal-yielding 2nd generation conditions (shaded grey in the original table):
    50% MPD, 0.2 M amm. phosphate, 0.1 M HEPES pH 7.5 (R)
    35% tert-butanol, 0.2 M amm. phosphate, 0.1 M Tris pH 8.5 (R)

(a) R, recombination; M, mutation; NaOAc, sodium acetate; amm., ammonium; sulf., sulfate; MPD, 2-methyl-2,4-pentanediol; MME, monomethyl ether; MES, 2-(N-morpholino)ethanesulfonic acid.

A real problem situation occurred when trying to cocrystallize a 17-mer DNA oligonucleotide of the κB DNA family with an octakis(6-guanidino-6-deoxy)-γ-cyclodextrin (gguanCD).12 The specific DNA sequences collectively known as κB DNA are binding sites for transcription factors which regulate DNA transcription.13 This oligonucleotide has already been crystallized in the presence of protein v-Rel, the oncogenic version of transcription factor c-Rel (the oligonucleotide crystals were a result of a failed attempt to crystallize a v-Rel/oligonucleotide complex) and its structure has been solved to 1.60 Å. The published conditions were standard hanging-drop vapor diffusion at 18 °C of 12 mg/mL v-Rel/DNA complex at a 1:1.1 molar ratio, mixed at equal volumes and equilibrated against wells containing 15–20% (w/v) PEG 3350, 0.2–0.4 M CaCl2, 5 mM spermine, 0.05% (w/v) β-octyl-glucoside (β-OG), and 100 mM Tris pH 7.5.13

In our hands, and in the presence of an equimolar amount of gguanCD instead of v-Rel, these conditions gave almost completely nondiffracting crystals. Extensive attempts at conventional optimization, including trials at various temperatures, pH values, and concentrations of all the ingredients, gave at best stacks of thin plates that could not be separated and that yielded multiple diffraction spots to 3.1 Å. The two best conditions resulting from conventional optimization consisted of 2 mM oligonucleotide/gguanCD solution mixed at equal volumes and equilibrated at 4 °C against wells containing 5 mM spermine, 0.05% (w/v) β-OG, 100 mM Tris at pH 7.5 and (condition i) 11% (w/v) PEG 3400, 0.4 M CaCl2; (condition ii) 12% PEG 3500, 0.5 M CaCl2.

These conditions were therefore abandoned and a Hampton Research Nucleic Acid Mini-Screen (24 conditions) was set up in hanging drops with 1.5 mM oligonucleotide/gguanCD stock solution, at 16 °C. The best results consisted of rather large single crystals that gave very few diffraction spots. The two best conditions were:

#18: 10% MPD, 12 mM spermine, 80 mM NaCl, 12 mM KCl, 20 mM MgCl2, 40 mM Na cacodylate at pH 7.0.
#23: 10% MPD, 12 mM spermine, 40 mM LiCl, 80 mM SrCl2, 40 mM Na cacodylate at pH 7.0.

Extensive conventional optimization trials around these conditions yielded even worse crystals. It was therefore decided to use the above four conditions, together with the original published conditions, as first generation conditions for the GA-inspired method.

Sixteen second generation conditions, consisting of recombinations and recombination/mutation mixes, were set up and large single crystals were obtained at two conditions. The best X-rayed single crystal, diffracting to at least 1.7 Å at the rotating anode laboratory beam, grew in a recombination consisting of 20% MPD, 0.4 M CaCl2, 5 mM spermine, 0.05% (w/v) β-OG, and 100 mM Na cacodylate at pH 7.0, from a 1 mM oligonucleotide/gguanCD solution.
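To give a feel for how small the recombination space is when the hits contain only one or two ingredients, the following sketch enumerates the pure recombinations of the four catalase hits of Table 1. The two-locus encoding (precipitant with its concentration, and buffer with its pH) and the helper name `pure_recombinants` are assumptions made for this illustration; Table 1 reports seven pure recombinations for catalase, so this encoding should not be read as the one actually used.

```python
from itertools import product

# First-generation catalase hits from Table 1 (Hampton Crystal Screen),
# encoded with two loci: (precipitant, buffer). None means no buffer.
CATALASE_HITS = {
    33: ("4 M Na formate", None),
    34: ("2 M Na formate", "0.1 M NaOAc pH 4.6"),
    36: ("8% PEG 8K", "0.1 M Tris pH 8.5"),
    37: ("8% PEG 4K", "0.1 M NaOAc pH 4.6"),
}


def pure_recombinants(hits):
    """All precipitant/buffer pairings that are not already a parent hit."""
    parents = set(hits.values())
    precipitants = sorted({p for p, _ in parents})
    buffers = sorted({b for _, b in parents}, key=str)
    return [c for c in product(precipitants, buffers) if c not in parents]


recombinants = pure_recombinants(CATALASE_HITS)
print(len(recombinants), "pure recombinations under this encoding")
for precipitant, buffer in recombinants:
    print(precipitant, "+", buffer or "no buffer")
```

Under this encoding, the successful recombination 8% PEG 4K with 0.1 M Tris pH 8.5 is among the enumerated candidates, whereas the 8% PEG 6K condition is not, since PEG 6000 appears in none of the hits and can only be reached by mutation.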

DISCUSSION

In a purely computational/virtual context, hundreds or thousands of recombinations and mutations can be tried in silico and the number of generations can be arbitrarily large. It has been shown that in that case, if the algorithm is well-designed, the parameters can in principle be optimized to the desired level, given enough generations and a large enough population size per generation.9,16 When adapting GA principles to crystallization condition optimization, however, each successive generation must actually be set up and assessed by an experimenter (or possibly by an automated image-recognition and scoring system). Limitations set on the number of experiments will decide the number of members (the population size) in each generation and the overall number of generations, which may both be rather limited. The possibilities are thus more restricted than in the virtual case and there is certainly no guarantee of convergence to an optimum. Furthermore, various sophisticated strategies for chromosome selection, scoring, and others9,10 are unavailable, impracticable, or possibly meaningless in a situation like crystal optimization. Our results, however, indicate that even in this extremely simplified form, we have what appears to be a robust optimization tactic that may in some cases lead to quick success with a minimal number of generations and trials per generation. It was surprising to see that in all cases, one round of recombinations and mutations, that is, the second generation, was sufficient to optimize what were quite poor initial screening hits.

The Genetic Algorithm-inspired methodology presented here is made particularly flexible by the availability of various options that can be used to tailor it to the situation at hand.9 These options include the following:

(i) Parameters that are assumed to be strongly correlated may be placed “close” on the “chromosome”; in other words, their values can be kept together in a higher proportion of next generation conditions than in a random recombination case. For instance, in the imaginary example of the Introduction, the type and concentration of precipitant as well as the type and concentration of additive were purposefully kept together in the second generation condition, as it was assumed that each concentration was well-suited to the particular compound. This, however, is a decision that can be overturned, as was done here for thaumatin, where the successful second generation condition contained 25% PEG 8000 (starting from 25% PEG 4000 in one of the first generation hits and 30% PEG 8000 in another). A more sophisticated version of this could be implemented for the more complex cases, where different recombination probabilities may be assigned to the various pairs of parameters.

(ii) The experimenter can generate a large next generation of conditions on paper (or in silico), but then choose only a subset of those for actual setup. According to the results, he can then decide whether to proceed to the next generation or to try another subset of conditions from that same generation.

(iii) The experimenter can be more or less “conservative”, by introducing as many random mutations as he desires, or even none at all. When the first generation hits are few and each of the hit conditions happens to consist of only one or two ingredients, the possible recombinations are quickly exhausted and mutations are then the only possible way forward. Mutations provide the chance to use any prior knowledge on the macromolecule that had been ignored in setting up the initial screens, as well as the experimenter’s intuition. This can include information on the possibly best pH (acquired, for example, from Dynamic Light Scattering experiments) or on whether the protein is more or less soluble (in which case the precipitant concentration can be mutated to something considerably higher or, respectively, lower than in the starting screen hit condition). We may term these “directed”, rather than random, mutations. Excessive use of such prior knowledge, however, can bias the optimization in a way that may prevent it from reaching the desired result. In the examples presented here, recombinations in fact proved to be more useful than mutations.

(iv) The experimenter can reintroduce the most successful members of previous generations into subsequent ones (leading to “generational promiscuity”) so as not to “lose” those conditions, or he can decide to be more strict with his generational pattern and take more risks.

In the present context, no attempt was made to design or to use existing software for generating recombinant and mutant conditions, since the initial hits were few and crystals were already obtained at the second generation stage. Possibilities were therefore rather limited and it was more expedient to generate the recombinant/mutant conditions by hand. A high-throughput laboratory might design and implement an automated version of this method, including all the possibilities described above, or possibly modify already existing GA software. A number of computer programs already exist that can stochastically (randomly) generate sets of what are here termed second generation trials, based on ingredients and concentrations/pH levels determined from the initial hits. These include the pioneering program CRYSTOOL,17,18 and the more recent commercial programs CrystalTrak (Rigaku Corporation) and Rock Maker 2.0 (Formulatrix Inc., Waltham, MA). The initial screen could also very well be a true random screen such as generated by CRYSTOOL, instead of the sparse matrix type used here.
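As a rough illustration of how options (i) and (iv) of the Discussion might look in an automated version, the sketch below inherits linked loci as a block from a single parent and optionally carries the parent hits over into the next generation. The names `LINKAGE_GROUPS`, `recombine_linked`, and `next_generation` are hypothetical, and the encoding of thaumatin hits #4, #15, and #20 (with 0.2 M ammonium sulfate treated as an additive) is an assumption made only for this example.

```python
import random

# Assumed linkage groups: loci in the same tuple are inherited together
# from a single parent, as in option (i).
LINKAGE_GROUPS = [
    ("precipitant", "precipitant_conc"),
    ("buffer", "pH"),
    ("additive", "additive_conc"),
]

# Thaumatin hits #4, #15 and #20 from Table 1 (hypothetical encoding).
THAUMATIN_HITS = [
    {"precipitant": "ammonium sulfate", "precipitant_conc": "2 M",
     "buffer": "Tris", "pH": 8.5, "additive": None, "additive_conc": None},
    {"precipitant": "PEG 8K", "precipitant_conc": "30%",
     "buffer": "Na cacodylate", "pH": 6.5,
     "additive": "ammonium sulfate", "additive_conc": "0.2 M"},
    {"precipitant": "PEG 4K", "precipitant_conc": "25%",
     "buffer": "NaOAc", "pH": 4.6,
     "additive": "ammonium sulfate", "additive_conc": "0.2 M"},
]


def recombine_linked(parents, groups, rng):
    """Draw each linkage group, as a block, from one randomly chosen parent."""
    child = {}
    for group in groups:
        donor = rng.choice(parents)
        for locus in group:
            child[locus] = donor[locus]
    return child


def next_generation(parents, groups, size, rng, keep_parents=True):
    """Generate up to `size` distinct conditions; optionally keep the parents
    in the new generation, as in option (iv)."""
    generation = list(parents) if keep_parents else []
    attempts = 0
    while len(generation) < size and attempts < 100 * size:
        attempts += 1  # bounded loop in case few distinct children exist
        child = recombine_linked(parents, groups, rng)
        if child not in generation:  # skip duplicates and unchanged parents
            generation.append(child)
    return generation


rng = random.Random(1)  # fixed seed so the sketch is reproducible
generation = next_generation(THAUMATIN_HITS, LINKAGE_GROUPS, size=8, rng=rng)
for condition in generation:
    print(condition)
```

With these linkage groups, a condition containing 25% PEG 8K (the successful thaumatin condition) can never arise, because PEG 8K always travels with its 30% concentration. This is the sense in which keeping parameters together is a decision that may need to be overturned.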

AUTHOR INFORMATION

Corresponding Author
*Tel: +30-210-6503793. Fax: +30-210-6511766. E-mail: esaridak@chem.demokritos.gr.

ACKNOWLEDGMENT

The author thanks Mr. Eduard Baquero Salazar for assisting him in setting up the initial crystallization trials and Drs. Irene M. Mavridis and Carlo Knupp for useful discussions. Drs. Chrysie Aggelidou and Konstantina Yannakopoulou are also acknowledged for providing the octakis(6-guanidino-6-deoxy)-γ-cyclodextrin. An anonymous referee is thanked for a useful suggestion.

REFERENCES

(1) Bergfors, T. In Protein Crystallization; Bergfors, T. M., Ed.; International University Line: La Jolla, CA, 1999; Chapter 8, pp 71–76.
(2) Chayen, N. E.; Saridakis, E. Nat. Methods 2008, 5, 147–153.
(3) Rupp, B. J. Struct. Biol. 2003, 142, 162–169.
(4) Brzozowski, A. M.; Walton, J. J. Appl. Crystallogr. 2001, 34, 97–101.
(5) Newman, J.; Egan, D.; Walter, T. S.; Meged, R.; Berry, I.; Ben Jelloul, M.; Sussman, J. L.; Stuart, D. I.; Perrakis, A. Acta Crystallogr., Sect. D 2005, 61, 1426–1431.
(6) Gorrec, F. J. Appl. Crystallogr. 2009, 42, 1035–1042.
(7) Jancarik, J.; Kim, S. H. J. Appl. Crystallogr. 1991, 24, 409–411.
(8) Cudney, R.; Patel, S.; Weisgraber, K.; Newhouse, Y.; McPherson, A. Acta Crystallogr., Sect. D 1994, 50, 414–423.
(9) Baeck, T. Evolutionary algorithms in theory and practice: evolution strategies, evolutionary programming, genetic algorithms; Oxford University Press: Oxford, 1996.
(10) Haupt, R. L.; Haupt, S. E. Practical Genetic Algorithms, Second edition; Wiley: Hoboken, NJ, 2004.
(11) Ceylan, H.; Bell, M. G. H. Transp. Res. B 2004, 329–342.
(12) Mourtzis, N.; Eliadou, K.; Aggelidou, C.; Sophianopoulou, V.; Mavridis, I. M.; Yannakopoulou, K. Org. Biomol. Chem. 2007, 5, 125–131.
(13) Huang, D.-B.; Phelps, C. B.; Fusco, A. J.; Ghosh, G. J. Mol. Biol. 2005, 346, 147–160.
(14) Ko, T.-P.; Day, J.; Greenwood, A.; McPherson, A. Acta Crystallogr., Sect. D 1994, 50, 813–825.
(15) van der Wel, H.; van Soest, T. C.; Royers, E. C. FEBS Lett. 1975, 56, 316–317.
(16) Greenhalgh, D. SIAM J. Comput. 2000, 30, 269–282.
(17) Segelke, B. W.; Rupp, B. American Crystallographic Association Annual Meeting, July 18–23, 1998, Arlington, VA. ACA Meeting Series 25, 78.
(18) Segelke, B. W. J. Cryst. Growth 2001, 232, 553–562.