Anal. Chem. 2004, 76, 6743-6752
Electrostatic Contributions to Protein Retention in Ion-Exchange Chromatography. 1. Cytochrome c Variants
Yan Yao and Abraham M. Lenhoff*
Department of Chemical Engineering, University of Delaware, Newark, Delaware 19716
Among the factors that modulate protein interactions, several protein structural properties, such as size, shape, and charge distribution, may play significant roles. In this work, we investigate the influence of protein structure on binding in ion-exchange chromatography, in which electrostatic interactions are dominant. Chromatographic experiments show that separation of cytochrome c variants with a limited number of sequence differences is feasible. To probe the molecular basis for this behavior, protein-adsorbent electrostatic interactions were modeled in the context of continuum electrostatics accounting for the full 3D protein structure. Protein retention was modeled by averaging over all protein-adsorbent configurations using the full accessible surface of the protein. The electrostatic interaction free energy distribution shows that configurations in which numerous positive protein charges are close to the cation exchanger functional groups produce the most favorable binding. The calculated binding equilibrium constant, found by averaging over the full 3D configurational space, captures the chromatographic differentiation of closely related cytochrome c variants. Calculations of interaction free energies at short protein-adsorbent separation distances, or of protein surface potentials, obviate the need for full sampling of protein configurations and were found to yield reasonable semiquantitative descriptions of the retention trends.
Ion-exchange chromatography (IEC) is one of the most widely used operations in the separation and purification of biomolecules.1 In broad terms, protein retention in IEC results from attraction between oppositely charged functional groups on the adsorbent and the protein, and separation occurs due to binding differences among the proteins under specific operating conditions. There are, however, many factors that determine the separation performance, including the nature of the proteins, adsorbents, and process conditions.
All of these factors ultimately affect the protein-adsorbent interactions, with electrostatic interactions generally accepted to be dominant. The solution pH, which affects the charge state of the biomolecules, and the salt concentration of the elution buffer are the main experimental variables for fulfilling particular separation requirements by modulating the strength of the electrostatic interactions. However, the molecular basis for separation remains incompletely understood quantitatively, and additional progress is needed to describe the dependence of protein adsorption on various controlling factors. Modeling of protein-adsorbent surface electrostatic interactions can help in shedding light on protein adsorption mechanisms and thus provide further guidance for efficient process design. There have been a large number of modeling studies of protein-adsorbent interactions. The stoichiometric displacement model (SDM),2 which describes protein adsorption as a strict ion-exchange process between the charged protein and ions in the solution, has been further extended to consider the localized binding sites on the protein and the counterions in the solution.3,4 These models have been applied to protein ion-exchange chromatography in analysis of the effects of salt concentration on the protein retention factor. Although the SDM offers a simplified analysis of chromatographic behavior, the model is not strictly mechanistic, so the model parameters do not give a clear representation of the roles of properties of physical significance. An alternative approach, in terms of the continuum colloidal theory based on the Poisson and Poisson-Boltzmann equations, has been examined, with different representations of protein shape, charge distribution, and other structural features. Simplified models of the protein, e.g., a charged planar surface5,6 or a sphere with a centered net charge,7 have been applied in the study of the relationship between retention behavior and system properties.
* Corresponding author. Fax: (302) 831-4466. E-mail: [email protected].
(1) Bonnerjea, J.; Oh, S.; Hoare, M.; Dunnill, P. Bio-Technology 1986, 4, 954-958.
10.1021/ac049327z CCC: $27.50 Published on Web 10/12/2004
© 2004 American Chemical Society
The significance of charge heterogeneity on the protein molecule for protein adsorption has been highlighted by the experimental observations of chromatographic separation of similarly charged proteins8 and attractive interactions between like-charged protein and adsorbent,3,9,10 which cannot be easily explained by a model reducing the protein charge distribution to a single net charge or charge density. Protein modeling with the full 3D structure of the proteins has therefore been performed, which provides a more detailed description of the relevant molecular events in protein adsorption.7,11-17 Atomistic representations of the adsorbent surface that consider the adsorbent topography and charge distribution have also been employed in the modeling of protein-adsorbent interactions.13,14 Due to the complex and diverse physicochemical characteristics of typical adsorbents, various simplified representations of the surface have also been utilized, specifically a uniformly charged surface,7,12,16,17 a planar surface carrying discrete charges,13,17 and an end group model,17 which describes the functional group as a centrally charged sphere on the base matrix. The end group model reflects the fact that the functional groups on the adsorbents are attached through a spacer arm rather than residing directly on flat surfaces. It was found to capture the local effects of electrostatically driven protein adsorption reasonably well.17 For those models with a detailed molecular representation of the protein structures, previous computational results have shown that there are some groups of charged amino acid residues identified as attractive regions that display significantly more favorable protein-adsorbent interactions than others.13,17 Such behavior has been inferred from experimental observations as well.18 From a statistical standpoint, the probability distribution of protein conformations on the surface correlates with the corresponding free energies of interaction.19 The macroscopically observable retention, characterized by the retention factor, can be derived from the distribution of protein-adsorbent interaction free energies via a Boltzmann average over the configurational space of the protein-surface system.
(2) Boardman, N. K.; Partridge, S. M. Biochem. J. 1955, 59, 543-552.
(3) Kopaciewicz, W.; Rounds, M. A.; Fausnaugh, J.; Regnier, F. E. J. Chromatogr. 1983, 266, 3-21.
(4) Melander, W. R.; El Rassi, Z.; Horváth, C. J. Chromatogr. 1989, 469, 3-27.
(5) Ståhlberg, J.; Jönsson, B.; Horváth, C. Anal. Chem. 1991, 63, 1867-1874.
(6) Ståhlberg, J.; Jönsson, B.; Horváth, C. Anal. Chem. 1992, 64, 3118-3124.
(7) Roth, C. M.; Lenhoff, A. M. Langmuir 1993, 9, 962-972.
(8) Chicz, R. M.; Regnier, F. E. Anal. Chem. 1989, 61, 2059-2066.
(9) Rounds, M. A.; Regnier, F. E. J. Chromatogr. 1984, 283, 37-45.
(10) Lesins, V.; Ruckenstein, E. Colloid Polym. Sci. 1988, 266, 1187-1190.
Analytical Chemistry, Vol. 76, No. 22, November 15, 2004 6743
The most attractive configurations, which usually involve apposition of oppositely charged areas between the protein molecule and the adsorbent,17 are the controlling contributors to the overall retention behavior. Thus, the overall configurational space used in deriving the retention factor can be simplified to these charged-residue configurations to obtain a rough estimate of macroscopic behavior. For comparative studies of proteins with subtle structural variations, this approach might smear the slight differences incurred by the minor local structural changes over the coarse patches defined by the charged residues and thus be inadequate for differentiating closely related proteins. If this is so, a more finely grained approach would be necessary to give a better description of the adsorption process. Such calculations can be important for chromatographic separations of protein variants with very similar structures. To understand further the local molecular contributions to protein interactions, protein variants or proteins with similar adsorption-determining properties can serve as suitable model systems for controlled studies of structural effects on protein binding. Experimental data are available for some such systems, including gradient retention of subtilisin variants8 and displacement chromatography of bovine and horse cytochrome c,20 both on cation exchangers. Such chromatographic separations of closely related proteins suggest that protein adsorption is tuned on the scale of individual residues and that individual residues can play a nontrivial role in determining the overall adsorption behavior.
In this work, we use a continuum electrostatic retention model to examine whether detailed calculations of electrostatic interactions between the protein and the adsorbent are able to capture experimental retention trends. The main goal is to investigate electrostatic effects on retention behavior within the protein-surface system and to define the properties governing retention, considering the protein fine structure. The ability of retention models to capture experimental observations in such systems represents a demanding test, which can shed light on retention mechanisms. In a companion paper, we use this approach to investigate the influence of local molecular properties on protein affinity to ion exchangers for three protein system comparisons, but our most detailed examination is presented in this paper, for cytochrome c variants from horse, sheep, dog, and rat. Favorable binding regions of cytochrome c, determined by chromatographic retention of chemically modified proteins on cation exchangers, include lysines in the strongly basic patches and the hydrophobic surface containing the exposed heme edge and residues 81-83.21 Xu et al.22 examined the adsorption of cytochrome c using an adsorbent that labeled the lysines in the protein contact regions with succinate residues. They concluded that two domains on a single face of horse cytochrome c, including residues 5, 7, 8, 13, 25, 27, 73, 79, and 86-88, dominate adsorption on cation exchangers.
(11) Lu, D. R.; Park, K. J. Biomater. Sci., Polym. Ed. 1990, 1, 243.
(12) Yoon, B. J.; Lenhoff, A. M. J. Phys. Chem. 1992, 96, 3130-3134.
(13) Roush, D. J.; Gill, D. S.; Willson, R. C. Biophys. J. 1994, 66, 1290-1300.
(14) Noinville, V.; Vidal-Madjar, C.; Sébille, B. J. Phys. Chem. 1995, 99, 1516-1522.
(15) Roth, C. M.; Lenhoff, A. M. Langmuir 1995, 11, 3500-3509.
(16) Juffer, A. H.; Argos, P.; De Vlieg, J. J. Comput. Chem. 1996, 17, 1783-1803.
(17) Asthagiri, D.; Lenhoff, A. M. Langmuir 1997, 13, 6761-6768.
(18) Regnier, F. E. Science 1987, 238, 319-323.
(19) McQuarrie, D. A. Statistical Mechanics; Harper & Row: New York, 1975.
Cytochrome c variants from different species share extensive similarities in amino acid sequences, including a considerable number of basic and hydrophobic residues.23 These variants with similar structures but a small number of residue differences, including ionizable and uncharged residues, serve as good case studies for exploring contributions of individual residues to protein-adsorbent affinity.
THEORY AND COMPUTATIONAL METHODS
Electrostatic Modeling. Protein adsorption in IEC is generally thought to occur due to the predominantly electrostatic force between the protein and the charged functional moieties covalently linked to the ion-exchange matrix. The system modeled here includes a protein molecule and the active portion of the adsorbent surface, the functional end group extending from the base support. The role of stationary-phase structural parameters such as the ligand density is accounted for as described in the next section, while the discussion here centers on calculation of the protein-end group interaction. The geometry and charge distribution of the protein, based on the crystallographic structure, are explicitly included in the model.
(20) Kundu, A.; Barnthouse, K. A.; Cramer, S. M. Biotechnol. Bioeng. 1997, 56, 119-129.
(21) Brautigan, D. L.; Ferguson-Miller, S.; Margoliash, E. J. Biol. Chem. 1978, 253, 130-139.
(22) Xu, W. S.; Zhou, H.; Regnier, F. E. Anal. Chem. 2003, 75, 1931-1940.
(23) Margoliash, E. Proc. Natl. Acad. Sci. U.S.A. 1963, 50, 672-679.
(24) Connolly, M. L. J. Mol. Graphics 1993, 11, 139-143.
The Connolly molecular representation24 is utilized to describe the complex protein geometry, and the locations of ionizable amino acid residues determine the charge
distribution. While the chemistry and physical characteristics of the adsorbent affect protein chromatographic retention, the structural complexity and lack of complete adsorbent characterization defy accurate representation. Thus, a simplified end group model17 is used here in which the adsorbent functional group is described as a sphere (1.7 Å in radius) with a centered net charge. The end group representation determines the dielectric boundary and the protein molecule’s accessibility to the end group, which affect the charge-charge Coulombic interaction and solvation energies. However, a slight change in the end group size did not result in significant differences in the calculated interaction energies. The protein molecule and the charged adsorbent end group are immersed in an electrolyte solution with mobile ions that form electrical double layers around the respective charged entities. The interaction calculations are performed at the continuum level by solution of the Poisson and Poisson-Boltzmann equations within the interacting moieties and the medium, respectively.25-28 The electrostatic potential φi in the interior of a solute (protein or end group) is described by the Poisson equation
$$\nabla^2 \phi_i = -\frac{\rho_i}{\epsilon_i \epsilon_0} \tag{1}$$
in which ρi is the local charge density, εi is the dielectric constant of the solute interior, and ε0 is the permittivity of free space. The charge density comprises a set of discrete point charges within the protein molecule and a single point charge at the center of the end group. In the electrolyte solution, the electrostatic potential φe is described by the linearized Poisson-Boltzmann equation
$$\nabla^2 \phi_e = \kappa^2 \phi_e \tag{2}$$
in which κ is the Debye parameter representing the screening effect of the mobile ions in the solution. Both the protein and end group interiors are assigned a low dielectric constant of 4,26 and the solution has a dielectric constant of 80. After the potential distribution φ is found in 3D by solving the Poisson and Poisson-Boltzmann equations, the interaction free energy is calculated by the superposition of the contributions from all the fixed charges in the system, viz. those on the protein and that on the end group,17,29
$$G = \frac{1}{2} \sum_k q_k \phi_k \tag{3}$$
The interaction free energy between the protein and the adsorbent is then expressed as the difference between the free energy in a given configuration and that at an infinite separation distance. For the linear chromatography of interest here, the surface coverage of protein on the porous stationary phases is assumed to be low enough to exclude the effect of lateral protein-protein interactions on chromatographic binding. Protein interactions are also fairly short-ranged given the moderate salt concentrations used. Thus, the interactions are reduced to pairwise additive single protein-end group interactions, and the superposition of these individual effects leads to estimation of the overall macroscopic behavior, as follows.
Calculation of Chromatographic Retention. Chromatographic retention is characterized by the retention factor k′, expressed as the normalized difference of the retention volume of the adsorbing solute, VR, and that of an unbound solute, V0,
$$k' = \frac{V_R - V_0}{V_0} \tag{4}$$
To correlate the retention factor to the interaction free energy between a protein molecule and an end group, we follow earlier analyses5,30 in terms of the distribution factor K, which describes the equilibrium partitioning of solute between the pores and the interstitial space,5
$$K = \frac{c_p}{c_i} \tag{5}$$
where cp is the average protein concentration in the pore space and ci is that in the interstitial volume. This approach yields a relationship for k′ in the form30
$$k' = (K - 1)\frac{V_p}{V_0} = \frac{\int_{V_p} \left(e^{-\Delta G(V_p)/RT} - 1\right) dV_p}{V_0} \tag{6}$$
where Vp is the accessible pore volume and ∆G(Vp) is the free energy difference between the protein at location Vp in the pore volume and in the bulk. The free energy differences that we calculate are for binary interactions between protein (1) and end group (2) that depend on the orientation Ω1 of the protein relative to the end group, as described in the previous section. Therefore, it is in the evaluation of the integral in eq 6 that the specifics of the approach lie. The contribution of a single end group, j, to the retention factor is found by accounting only for interaction with that group, for a protein molecule in any orientation
$$k'_j = (K_j - 1)\frac{V_{pj}}{V_0} = \frac{\int_{r_{1j}} \int_{\Omega_1} \left(e^{-\Delta G_{1j}(\Omega_1, r_{1j})/RT} - 1\right) d\Omega_1 \, dr_{1j}}{V_0} \tag{7}$$
(25) Gilson, M. K.; Sharp, K. A.; Honig, B. H. J. Comput. Chem. 1988, 9, 327-335.
(26) Harvey, S. C. Proteins 1989, 5, 78-92.
(27) Davis, M. E.; McCammon, J. A. Chem. Rev. 1990, 90, 509-521.
(28) Sharp, K. A. Curr. Opin. Struct. Biol. 1994, 4, 234-239.
(29) Zhou, H. X. Biophys. J. 1993, 65, 955-963.
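The short range of the screened interaction invoked in this development can be made concrete by estimating the Debye length 1/κ that appears in eq 2. The sketch below is only an illustration: it uses the standard expression κ² = 2 N_A e² I / (ε ε0 k_B T) with the solution dielectric constant of 80 quoted above, and it assumes a temperature of 298.15 K, which is not stated in the text. The function name `debye_length_nm` is hypothetical.

```python
import math

# Physical constants in SI units.
E_CHARGE = 1.602176634e-19    # elementary charge, C
K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
N_AVOGADRO = 6.02214076e23    # Avogadro constant, 1/mol
EPS_0 = 8.8541878128e-12      # permittivity of free space, F/m


def debye_length_nm(ionic_strength_molar, eps_r=80.0, temperature_k=298.15):
    """Return the Debye length 1/kappa in nm for a given ionic strength (mol/L).

    temperature_k = 298.15 K is an assumption made for this sketch.
    """
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    kappa_squared = (2.0 * N_AVOGADRO * E_CHARGE ** 2 * ionic_strength_si
                     / (eps_r * EPS_0 * K_BOLTZMANN * temperature_k))
    return 1e9 / math.sqrt(kappa_squared)  # m -> nm


if __name__ == "__main__":
    for salt in (0.01, 0.1, 0.5):
        print(f"I = {salt:4.2f} M -> 1/kappa = {debye_length_nm(salt):.2f} nm")
```

Under these assumptions the screening length at 0.1 M ionic strength is slightly below 1 nm, which is small compared with protein dimensions and is consistent with treating the protein-end group interaction as short-ranged.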
(30) Tessier, P. M.; Lenhoff, A. M.; Sandler, S. I. Biophys. J. 2002, 82, 1620-1631.
In view of the short-range nature of the protein-end group interaction, the integrand is nonzero only in the immediate vicinity of the end group. Furthermore, given its attachment and proximity to the base matrix, the end group is accessible to the protein only in about half of the surrounding solid angle. Thus, the integral over the pore volume can be written in spherical coordinates as
$$k'_j = (K_j - 1)\frac{V_{pj}}{V_0} = \frac{\int_0^{2\pi} \int_0^{\pi} \int_{r_0}^{\infty} \left(e^{-\Delta G_{1j}/RT} - 1\right) r^2 \sin\theta \, dr \, d\theta \, d\beta}{2V_0} \tag{8}$$
with the two angular variables, θ and β, defining the orientation of the protein relative to the end group. The factor of 2 in the denominator accounts for the partial accessibility of the end group to the protein, while r0 is the center-to-center distance at contact of the two groups, which also depends on orientation. The overall contribution to protein adsorption from all the accessible end groups is accounted for by summation of the k′j, assuming that a protein molecule does not interact simultaneously with multiple end groups. Since all end groups are regarded as equivalent, it is adequate just to account for the number of functional groups, which are assumed to be uniformly distributed across the full internal pore surface. However, only those in sufficiently large pore spaces are actually accessible to protein solutes, so the interaction is characterized in terms of ρs, the ionic capacity of the stationary phase, i.e., the number of end groups per unit particle volume; A0, the total surface area of the pore space; and A, the pore surface area accessible to the protein-sized solute
$$k' = \frac{A}{A_0} \rho_s V_{\mathrm{particle}} \frac{\int_0^{2\pi} \int_0^{\pi} \int_{r_0}^{\infty} \left(e^{-\Delta G_{1j}(r,\theta,\beta)/RT} - 1\right) r^2 \sin\theta \, dr \, d\theta \, d\beta}{2V_0} \tag{9}$$
Here Vparticle is the total particle volume in a packed column. This development rests on the key assumptions that retention is controlled by electrostatic interactions and that a protein molecule interacts with a single end group. The value of ∆G1j in eq 9 can be perturbed by the contributions of other interactions, e.g., van der Waals or hydration, and by the effects of other nearby ligands. The former contributions are difficult to calculate realistically at short range, whereas the latter depend sensitively on the precise distribution of ligands on the surface, information that is not readily accessible. The most pronounced effect would arise from multipoint attachment, where the protein is bound strongly to more than one ligand. This would depend on an ideal ligand spacing, which is feasible for numerous ligand pairs if the ligands are randomly distributed and at sufficiently high density, especially on longer spacer arms. However, such an ideal fit would be encountered in only a minute fraction of the overall angular configurational space, so it is difficult to determine how significant an effect it may be likely to have. For the study presented here, therefore, we restrict the analysis to simple pairwise interactions, but in the companion paper, we specifically examine a system in which the stationary-phase properties have a profound effect on retention behavior, possibly due to the effects neglected here.
COMPUTATIONAL METHODS
Protein Structure Homology Modeling. The structure of cytochrome c from horse heart (1hrc.pdb) was obtained from the Protein Data Bank (PDB).31 The 3D structures of the other
cytochrome c variants are not available from the PDB, and they were thus constructed using MODELLER,32 a comparative protein structure modeling software package. The available structure of the horse variant serves as a suitable template for homology modeling of other variants with small numbers of residue replacements. Modeled structures using templates with high sequence identity have been found to have good stereochemistry and structures close to those found crystallographically.33 A validation package for macromolecular structures, Biotech Validation Suite for Protein Structures,34 was used to evaluate the modeled variant structures. In addition, a reverse mutation step was performed and the resulting structure of the horse cytochrome c was compared with the crystallographic one for additional structure validation. Solution of Poisson-Boltzmann Equation. Macroscopic Electrostatics with Atomic Detail (MEAD), a program for macroscopic electrostatic modeling that uses a finite difference algorithm to solve the Poisson and Poisson-Boltzmann equations, was used to compute the potentials.35,36 The grids are defined in a series of sizes and resolutions, the coarsest level extending far enough into the solvent to get accurate enough boundary potentials. A resolution of 0.1 Å was used as the finest grid spacing to achieve satisfactory accuracy, especially for the part of the protein molecule facing the end group. Grid artifacts may be introduced into the calculation depending on the grid dimensions used, but for comparative purposes, these contributions are negligible as long as the grid parameters are consistent in the calculations. The calculated electrostatic interaction energy of the protein-adsorbent pair includes direct screened intermolecular interactions between the charges on the protein and the end group and electrostatic solvation energy changes due to the replacement of the surrounding solvent by the low-dielectric binding partners. 
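To illustrate what a finite difference solution of the linearized Poisson-Boltzmann equation involves, the following sketch solves the one-dimensional analogue of eq 2, φ″ = κ²φ, on a uniform grid and compares it with the exact decaying solution. It is not MEAD and makes no attempt to reproduce its 3D grids, dielectric boundaries, or focusing scheme; the function name, grid parameters, and the value of κ (roughly that for 0.1 M salt) are assumptions chosen for the example.

```python
import math


def solve_linearized_pb_1d(kappa, length, n_intervals, phi_left, phi_right):
    """Solve phi'' = kappa^2 * phi on [0, length] with fixed end potentials.

    Central differences give, at each interior node i,
        phi[i-1] - (2 + (kappa*h)^2) * phi[i] + phi[i+1] = 0,
    a tridiagonal system solved here with the Thomas algorithm.
    """
    h = length / n_intervals
    n = n_intervals - 1                      # number of interior unknowns
    diag = -(2.0 + (kappa * h) ** 2)         # main diagonal; off-diagonals are 1
    rhs = [0.0] * n
    rhs[0] -= phi_left                       # known boundary values move to the RHS
    rhs[-1] -= phi_right

    # Forward elimination.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = 1.0 / diag
    d_prime[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - c_prime[i - 1]
        c_prime[i] = 1.0 / denom
        d_prime[i] = (rhs[i] - d_prime[i - 1]) / denom

    # Back substitution.
    phi = [0.0] * n
    phi[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        phi[i] = d_prime[i] - c_prime[i] * phi[i + 1]
    return [phi_left] + phi + [phi_right]


if __name__ == "__main__":
    kappa = 1.0 / 0.97       # 1/nm, assumed value for about 0.1 M salt
    length, n = 5.0, 500     # 5 nm domain, 0.01 nm grid spacing
    phi = solve_linearized_pb_1d(kappa, length, n, 1.0, math.exp(-kappa * length))
    h = length / n
    worst = max(abs(p - math.exp(-kappa * i * h)) for i, p in enumerate(phi))
    print(f"max deviation from exp(-kappa x): {worst:.2e}")
```

Because the right-hand boundary value is taken from the exact solution, the deviation reported reflects only the discretization error of the grid, which is the kind of grid dependence discussed in the text.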
The charges were assigned on the basis of the positions of the ionizable groups obtained from the crystal structure, using standard pK values.17
Calculation of k′. We accounted in our calculations for electrostatic interactions over the full surface of the protein molecules accessible to an end group. Most of the charged residues are distributed near the surface of the protein molecules, with the charge locations heterogeneously distributed on the surface. Charge clusters can give rise to “sticky” areas on the proteins to which binding is preferred,22 but in general, a priori identification of preferred binding “sites” is not possible. The Connolly molecular surface package24 was used to obtain a full representation of the accessible surface by rolling a water-sized probe over the van der Waals surface of the protein. The resulting Connolly surface is made up of connected areas defined as contact and reentrant faces.
Figure 1. Amino acid sequences of cytochrome c from horse, sheep, dog, and rat. Mutated residues are shown in boldface type.
(31) Bernstein, F. C.; Koetzle, T. F.; Williams, G. J. B.; Meyer, E. F.; Brice, M. D.; Rodgers, J. R.; Kennard, O.; Shimanouchi, T.; Tasumi, M. J. Mol. Biol. 1977, 112, 535-542.
(32) Sali, A. Mol. Med. Today 1995, 1, 270-277.
(33) Sali, A.; Potterton, L.; Yuan, F.; van Vlijmen, H.; Karplus, M. Proteins 1995, 23, 318-326.
(34) Biotech Validation Suite, http://biotech.embl-ebi.ac.uk:8400/.
(35) Bashford, D.; Gerwert, K. J. Mol. Biol. 1992, 224, 473-486.
(36) Bashford, D. In Scientific Computing in Object-Oriented Parallel Environments; Ishikawa, Y., Oldehoeft, R. R., Reynders, J. V. W., Tholburn, M., Eds.; ISCOPE97; Springer: Berlin, 1997; Vol. 1343, pp 233-240.
(37) Lin, S. L.; Nussinov, R.; Fischer, D.; Wolfson, H. J. Proteins 1994, 18, 94-101.
Lin et al.37 developed a simplified
molecular surface representation using critical points on the surface to reduce the computational costs while retaining the overall shape characteristics for applications using the Connolly surface. The critical points are generated by projecting the centers of gravity of the Connolly contact and reentrant faces onto the molecular surface. Each critical point on the Connolly surface and its associated surface normal were taken as the reference frame to define the relative orientations and separation distances between the protein and the adsorbent for a particular orientation. Each critical point is referred to in terms of an individual atom on the surface of the protein, represented in the standard PDB notation. Due to the finite size of the system and the irregular nature of the molecular surface, overlap of the end group with another atom on the protein may occur when the end group is positioned near the nominally interacting group on the protein. For these cases, the accessibility of the protein to these orientations was set to zero. The MEAD calculation includes full protein structural information to allow systems with similar structures to be modeled, but such computations are expensive. To accomplish the detailed configurational exploration with a reasonable tradeoff between modeling accuracy and computational efficiency, the approach taken here was to calculate the interaction free energies at all the orientations defined by the critical points, but only at four discrete separation distances to provide representative information on the interaction free energy distribution over the accessible space. The interaction free energies on a more finely divided grid in the 3D configurational space were then obtained by interpolation using the NAG38 routine e01tgf. To calculate the retention factor, the multidimensional integration in eq 9 was implemented using the NAG Monte Carlo integration routine d01gbf. 
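The Monte Carlo evaluation of the configurational integral can be sketched in a few lines. The example below is a plain uniform-sampling estimator written with the Python standard library; it stands in for, and does not reproduce, the NAG routine d01gbf. The interaction free energy `toy_delta_g_over_kt` is a hypothetical screened, orientation-dependent function invented for illustration (it is not a MEAD result), lengths are in nm, and the infinite upper limit in r is truncated at an assumed cutoff.

```python
import math
import random


def mc_integrate_spherical(integrand, r_min, r_max, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the integral of integrand(r, theta, beta) * r^2 * sin(theta)
    over r in [r_min, r_max], theta in [0, pi], beta in [0, 2*pi]."""
    rng = random.Random(seed)
    box_volume = (r_max - r_min) * math.pi * (2.0 * math.pi)
    total = 0.0
    for _ in range(n_samples):
        r = rng.uniform(r_min, r_max)
        theta = rng.uniform(0.0, math.pi)
        beta = rng.uniform(0.0, 2.0 * math.pi)
        total += integrand(r, theta, beta) * r * r * math.sin(theta)
    return box_volume * total / n_samples


def toy_delta_g_over_kt(r, theta, beta, r0=2.0, depth=3.0, kappa=1.0):
    """Hypothetical interaction free energy in units of kT (illustration only)."""
    angular = 0.5 * (1.0 + math.cos(theta))   # most attractive at theta = 0
    return -depth * angular * (r0 / r) * math.exp(-kappa * (r - r0))


def boltzmann_excess(r, theta, beta):
    """Integrand of eqs 8 and 9 without the volume element: exp(-dG/kT) - 1."""
    return math.exp(-toy_delta_g_over_kt(r, theta, beta)) - 1.0


if __name__ == "__main__":
    # Sanity check: a unit integrand over the unit ball should give about 4*pi/3.
    print(mc_integrate_spherical(lambda r, t, b: 1.0, 0.0, 1.0))
    # Numerator of eq 8 including the factor of 1/2, truncated at r = 12 nm.
    print(0.5 * mc_integrate_spherical(boltzmann_excess, 2.0, 12.0))
```

Dividing the second result by V0 would give the single end group contribution k′j of eq 8 for this toy potential; in the actual calculation the integrand would come from interpolated MEAD free energies, with r0 depending on orientation.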
EXPERIMENTAL MATERIALS AND METHODS
Materials and Sample Preparation. Sodium chloride and sodium phosphate (both ACS grade) were obtained from Fisher Scientific. All buffers were prepared using deionized water from a Millipore Milli-Q system (>18.2 MΩ cm), and the pH was further adjusted by adding HCl or NaOH. The buffers were filtered through 0.22-µm Gelman VacuCap bottle-top filters (Pall Corp., Ann Arbor, MI). A range of salt concentrations for specific elution requirements was obtained by mixing a 10 mM sodium phosphate buffer with a buffer of 1 M NaCl in 10 mM sodium phosphate, both at pH 7.
(38) The Numerical Algorithms Group, Inc., Downers Grove, IL. The NAG Fortran Library.
Cytochrome c variants (from horse C-7752, sheep C-2136, dog C-4013, and rat C-7892) were purchased from Sigma-Aldrich (St. Louis, MO) and stored below 0 °C before use. The amino acid sequences of the cytochrome c variants are shown in Figure 1. All proteins were used as received without further purification. The proteins were dissolved in 10 mM phosphate buffer at pH 7, and the solutions were stored at 4 °C for a maximum of 1 day before use. Before injection onto the column, all protein solutions were filtered through 0.22-µm Millipore Millex-GV filters to remove possible aggregates formed during preparation or storage. Three cation exchangers, Toyopearl SP-650 C and SP-550 C (Tosoh Biosep, Montgomeryville, PA) and CM Sepharose FF (Amersham Biosciences, Piscataway, NJ), were used in the retention experiments. Toyopearl SP materials have sulfonated propyl functional groups on a methacrylate copolymeric base matrix, with SP-650 C having a larger mean pore size than SP-550 C. CM Sepharose FF is a highly cross-linked agarose matrix derivatized with weak cation-exchange groups.
Instrumentation. All the chromatography experiments were carried out on an ÄKTA Purifier chromatography system (Amersham Biosciences) equipped with a multiwavelength UV-visible detector (UV-900). Buffer lines A and B were fed with the two buffers, which were mixed to attain the required salt concentrations for elution. An autosampler (A-900) was utilized for protein injection. The injection volume was 50 µL. Protein detection was performed by UV absorbance at 280, 254, and 215 nm.
Column Packing. Deionized water was used to wash residual solvent from the storage solution for the adsorbent. A 1 M NaCl solution in 10 mM phosphate buffer was used as the packing buffer. The adsorbent was allowed to settle in the high-salt buffer, and the supernatant was removed. After the adsorbent was equilibrated with the packing buffer, the slurry was further diluted to a suspension concentration of ∼50%.
Helium sparging was used to degas the slurry prior to column packing. AP minicolumns (0.5 cm i.d. × 5 cm length) were purchased from Waters (Milford, MA). The column was mounted vertically, and an 8-mL packing funnel was used to hold all the required slurry at one time. After the bed had settled via gravity, excess solvent was removed from the top of the bed without agitating the bed formed. The bed was then further flow-packed at a flow rate of 4 mL/min.
Chromatographic Retention. Isocratic retention of the protein variants was measured at 0.1 M NaCl, pH 7, as the electrostatic modeling focused on this solution condition, which is reasonable for protein binding in the IEC mode. In addition to the isocratic measurements, five linear gradient runs with different elution parameters (Table 1) were performed for every protein. Gradient runs require shorter elution times than those in the isocratic mode and mitigate band spreading, which can be severe in isocratic elution of strongly retained proteins. The chromatographic behavior in multiple gradient runs can provide additional information on relative retention extents of these variants over a wider range of salt concentrations. Every set of conditions was run in duplicate, and the order of the experiments was randomized to avoid confounding of systematic errors. The resulting retention times are the arithmetic averages of the individual data points.
Table 1. Parameters of the Gradient Runs(a)
run    Ii (M)    If (M)    VG (mL)
1      0.01      0.50      25
2      0.01      0.50      40
3      0.01      0.75      25
4      0.01      0.75      40
5      0.1       0.75      25
(a) Ii and If are the initial and final salt concentrations, respectively, and VG is the total gradient volume.
RESULTS AND DISCUSSION
Experimental Results. The amino acid sequences of the cytochrome c variants (Figure 1)39-43 show a maximum of six residue changes, involving substitutions including ionizable, uncharged polar, and hydrophobic residues. The chromatographic characteristics of the proteins result from the complicated interplay among these residue changes, which alter the charge distribution and the overall molecular geometry. Three of the variants studied (horse, dog, rat) have very similar calculated net charges of 9.2 at pH 7, while the sheep variant has one positive charge less. At 0.1 M salt concentration, the cytochrome c variants show clear retention discrimination on SP-650 C and CM Sepharose FF, two cation exchangers with relatively weak retention, on which isocratic elution at low salt concentrations is feasible (Table 2). The sheep variant has the shortest retention time, consistent with its lower net charge at pH 7. The other three variants have very similar net charge values and share over 90% sequence identity. Nevertheless, the dog variant shows a low retention similar to that of the sheep variant, with additional sequence changes of the dog relative to the sheep being K88T, E92A, and N103K.
The rat variant consistently displays a longer retention time than the horse variant, despite their equal net charges, which result from the loss of one acidic and one basic residue from horse to rat. A number of gradient runs with various gradient parameters also consistently yield the same elution order, i.e., rat > horse > dog > sheep, on all three adsorbents studied (Table 3), although the differences vary. The unchanged retention pattern across the salt conditions and the adsorbents explored suggests that protein properties play the key determining role in the elution trend of these variants, which is the focal point of our modeling.

(39) Moore, G. R.; Pettigrew, G. W. Cytochromes C, Evolutionary, Structural and Physicochemical Aspects; Springer-Verlag: New York, 1990.
(40) Margoliash, E.; Smith, E. L. Nature 1961, 192, 1121-1125.
(41) Smith, E. L.; Margoliash, E. Fed. Proc. 1964, 23, 1243-1247.
(42) McDowall, M. A.; Smith, E. L. J. Biol. Chem. 1965, 240, 4635-4647.
(43) Carlson, S. S.; Mross, G. A.; Wilson, A. C.; Mead, R. T.; Wolin, L. D.; Bowers, S. F.; Foley, N. T.; Muijsers, A. O.; Margoliash, E. Biochemistry 1977, 16, 1437-1442.

Modeling. The calculated interactions are angle dependent, as expected from the irregular geometry and anisotropic charge distribution of the protein molecules. In the presentation of results below, the location of the end group is defined with respect to a reference atom on the protein and specified by the local normal on that atom and a separation distance. Because of the geometrical complexity of the protein, the end group may in fact be closer to a different part of the protein surface in the vicinity than to the reference atom, but the representation still depicts the orientational distribution of interactions and captures the overall contribution to retention. A slight change in the end group size was found to cause only small changes in the calculated interaction free energies (IFEs), and the landscape of IFEs in the configurational space remains similar, so results are presented for only one end group radius, namely 1.7 Å. All the electrostatic calculations were performed for 0.1 M ionic strength, for which k′ values are of order 10 (Table 2); i.e., adsorption is moderately strong.

The distribution of IFEs on horse cytochrome c is presented as a color map on the molecule (Figure 2) using Visual Molecular Dynamics (VMD),44 demonstrating the variation of interaction strength over the protein. The red region on what we refer to as the front face (Figure 2a) corresponds to favorable orientations with IFEs less than -2 kT. It is surrounded by lysines 5, 7, 8, 13, 25, 27, 73, 79, and 86-88, which have been experimentally determined to participate actively in adsorption on cation exchangers.22 Apart from these basic residues at the perimeter, this favorable region also contains a considerable number of uncharged residues that have highly negative IFE values.
This indicates that it is not necessary to have an oppositely charged patch on the protein surface directly facing the adsorbent to produce strong attraction. The charge microenvironment,8 especially the charges in the vicinity of an uncharged portion of the protein, can affect the interaction strength at this reference orientation. Most of the acidic groups are located on the opposite side of the molecule and correspond to orientations with comparatively repulsive interactions with the adsorbent (Figure 2b). Lysines 22, 39, 55, 60, 99, and 100 are located in the acidic residue-rich face of the molecule. These lysines, which were found experimentally to be less likely to be involved in binding,22 have relatively larger IFE values than the lysines on the front face. Besides the suggested steric effects,22 negative charges in the vicinity of these lysines may be another factor contributing to the decreased binding propensity. The distribution of IFEs on cytochrome c agrees with the speculation that concentrated basic patches on the front of the molecule give rise to more favorable orientations, with the back face of the molecule containing more acidic residues being less involved in binding.21

The dog and rat variants represent the two extremes in retention time among the three species with the same net charge. The most favorable interaction configurations at a separation distance of 2 Å calculated for these two variants, namely, those with IFE < -2.5 kT, are plotted in Figure 3, along with the values for the corresponding orientations on the other protein. The comparison of the most favorable orientations for dog and rat cytochrome c with the same level of orientational tessellation shows that the configurations in the more strongly bound rat species are, in general, more attractive. The rat variant has a larger number of favorable orientations with IFE < -3 kT, which can contribute significantly to the overall retention as a result of Boltzmann averaging of the IFEs over the orientational space (eq 9). It is also notable that these favorable orientations include a significant number of configurations in which the end group is adjacent to an uncharged amino acid. The distribution of IFEs and significant differences between the two variants at some of these uncharged residue-referenced orientations further highlight the need to consider local regions of the protein surface other than simply the patches defined by charged residues in an effort to capture interaction variations.

The F82 N-referenced orientation provides the most attractive free energy in rat cytochrome c (-4.0 kT), with the Boltzmann weighting making this a particularly large contribution to the protein affinity. Only ionic groups were accounted for in the calculations, so the positive partial charge on the N atom did not contribute. As discussed above, the irregularity of the molecular shape and charge distribution makes it unsurprising for an uncharged residue to correspond to the most favored location. The F82-referenced orientations display extremely negative IFEs in all four cytochromes c studied. Experimental studies of chemically modified cytochrome c also found F82 to be important in the interaction with cytochrome c oxidase and a CM-cellulose adsorbent.21,45 It has been proposed that a favorable surface charge distribution near residues 81-83 plays a role in the significant contribution of these residues to adsorption,21 and our calculations appear to confirm this.

(44) Humphrey, W.; Dalke, A.; Schulten, K. J. Mol. Graphics 1996, 14, 33-38.

Table 2. Measured Isocratic Retention Volumes, VR (mL), of Cytochrome c Variants on SP-650 C and CM Sepharose FF at pH 7, 0.1 M NaCl

variant    SP-650 C        CM Sepharose FF
horse      11.7 ± 0.05     60.2 ± 0.02
sheep       6.3 ± 0.08     38.7 ± 0.04
dog         7.1 ± 0.10     56.6 ± 0.04
rat        13.7 ± 0.06     72.0 ± 0.06

Table 3. Average Retention Volumes (in mL) of Cytochrome c Variants in Five Different Gradient Runs (Table 1)

adsorbent          run   horse          sheep          dog            rat
SP-550 C           1     15.0 ± 0.02    14.3 ± 0.20    14.6 ± 0.09    15.7 ± 0.03
                   2     21.2 ± 0.11    20.2 ± 0.01    20.8 ± 0.02    22.5 ± 0.02
                   3     11.1 ± 0.01    10.7 ± 0.02    10.9 ± 0.02    11.6 ± 0.01
                   4     15.6 ± 0.15    14.9 ± 0.06    15.2 ± 0.02    16.6 ± 0.07
                   5      8.7 ± 0.01     8.2 ± 0.02     8.4 ± 0.01    10.8 ± 0.01
SP-650 C           1      8.8 ± 0.06     8.3 ± 0.02     8.5 ± 0.02     9.0 ± 0.05
                   2     12.2 ± 0.01    11.2 ± 0.01    12.0 ± 0.03    12.9 ± 0.09
                   3      6.8 ± 0.01     6.4 ± 0.02     6.7 ± 0.01     7.1 ± 0.01
                   4      9.0 ± 0.01     8.5 ± 0.01     8.9 ± 0.03     9.7 ± 0.01
                   5      3.8 ± 0.01     3.3 ± 0.03     3.4 ± 0.06     4.2 ± 0.01
CM Sepharose FF    1     12.3 ± 0.02    11.4 ± 0.16    11.6 ± 0.04    14.1 ± 0.14
                   2     16.3 ± 0.08    15.6 ± 0.06    16.0 ± 0.02    17.1 ± 0.04
                   3      9.0 ± 0.03     8.7 ± 0.09     8.9 ± 0.04     9.3 ± 0.02
                   4     12.2 ± 0.04    11.7 ± 0.03    12.0 ± 0.00    12.7 ± 0.01
                   5      6.2 ± 0.01     6.0 ± 0.02     6.1 ± 0.01     6.7 ± 0.11
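The effect of Boltzmann weighting on the orientational average (eq 9) can be illustrated with a small numerical sketch. The code below evaluates the integrand of the orientational integral, exp(-ΔG/kT) - 1, for a handful of orientations and reports each orientation's share of the total. The IFE values, the equal weights, and the function names `mayer_weight` and `orientational_average` are assumptions made for illustration only; they are not data or code from this work.

```python
import math


def mayer_weight(ife_kT):
    """Integrand of the orientational integral: exp(-dG/kT) - 1, with dG in units of kT."""
    return math.exp(-ife_kT) - 1.0


def orientational_average(ifes_kT, weights):
    """Weighted sum of the integrand over a discrete set of orientations.

    `weights` are the fractional areas (or solid angles) assigned to each
    orientation; their normalization is an assumption of this sketch.
    """
    if len(ifes_kT) != len(weights):
        raise ValueError("need one weight per orientation")
    return sum(w * mayer_weight(g) for g, w in zip(ifes_kT, weights))


# Hypothetical IFEs in kT for five equally weighted orientations (not data from the paper).
ifes = [-4.0, -2.9, -1.0, 0.0, 0.5]
weights = [1.0 / len(ifes)] * len(ifes)

total = orientational_average(ifes, weights)
for g, w in zip(ifes, weights):
    share = w * mayer_weight(g) / total
    print(f"IFE = {g:+.1f} kT contributes {share:6.1%} of the integral")
```

With these illustrative numbers, the single -4.0 kT orientation accounts for roughly three-quarters of the total, which is the sense in which a few highly favorable orientations can dominate the calculated retention.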
Figure 2. Distribution of electrostatic IFEs between horse cytochrome c and an end group at a distance of 2 Å, calculated for 0.1 M salt concentration and a charge distribution corresponding to pH 7. The molecular surface is colored according to the local IFEs using a red-green-blue color scale bar. Red regions correspond to more negative IFEs, and the transition to the blue end to weaker attraction or to repulsion (positive IFEs). (a) shows what we refer to as the front face of the molecule, which contains a large number of favorable orientations. The lysines defined as the binding domains by Xu et al.22 are labeled at the NZ atoms. Most of the acidic residues, labeled at the OE1 atoms, are located on the back face of the molecule (b), where less attractive interactions between protein and adsorbent were found by the MEAD calculations. The lysines that were found to be less involved in binding22 are labeled at the NZ atoms in (b).
(45) Ferguson-Miller, S.; Brautigan, D. L.; Margoliash, E. J. Biol. Chem. 1978, 253, 149-159.

Figure 3. Most favorable calculated interactions at a 2-Å separation distance between the local protein surface and the end group (orientations with IFE < -2.5 kT) in rat (■) and dog (□) cytochrome c, in comparison with corresponding orientations in the other protein. The orientations are defined by the atom; e.g., K13 CD refers to the atom CD in residue K13. Atoms with an IFE value for only one protein correspond to orientations that are not accessible in both proteins because of steric effects.

The distances from the nearby charges in the protein molecule to the end group for some of the favorable orientations were calculated to help elucidate the relation between charge distribution and IFE. Panels a and b of Figure 4 show the charge-to-charge distances of positive and negative charges to the end group charge for the end group in two specific orientations (F82 N and F46 C, respectively) and suggest how the overall charge environment is involved in determining the binding energy. For the F82 N orientation, for instance, the distance maps show that for both the rat and dog variants the end group is in a region of high positive charge density, with a large number of basic residues in the vicinity and many fewer acidic residues located at greater distances. Each protein has one positive charge very close to the end group, which may be the main contributor to the overall strong retention. A more concentrated positive charge distribution is observed in the rat variant, calculated to have stronger binding in this orientation. Although the rat variant has negative charges located at shorter distances than for the dog variant in the F82 N orientation, the contributions from those acidic residues at distances of more than 10 Å are expected to be small considering the short-range nature of screened electrostatic interactions. Other orientations and related charge maps, not shown here, also support the local charge distribution as the main factor determining the IFE in a given orientation. The charge microenvironment argument also helps explain the more negative IFEs at certain orientations for dog than rat cytochrome c (Figure 3).

The range of electrostatic interactions in the solution is given by the Debye length (∼10 Å at 0.1 M), and the impact of more remote charges would be expected to decrease with increasing ionic strength. However, the short-range nature of the interactions of interest, and the low dielectric constant of the protein interior, would be expected to attenuate this effect to some extent. Nevertheless, information from the charge distance maps at different orientations indicates that a configuration with a single positive charge within ∼4 Å of the end group can result in an IFE as low as -4 kT at 0.1 M. Although the microenvironment is complicated, it is always the charges in the vicinity of the end group that are predominant in differentiating among the variants, and these charges contribute disproportionately to the local interaction energy.
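The Debye length quoted above (∼10 Å at 0.1 M) can be checked with the standard expression for the screening length in terms of ionic strength, κ⁻¹ = [ε_r ε_0 k_B T / (2 N_A e² I)]^(1/2). The sketch below assumes water at 25 °C with a relative permittivity of 78.5; neither value is stated in the text, and the function names are hypothetical.

```python
import math

# Physical constants (SI units)
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
K_B = 1.380649e-23             # Boltzmann constant, J/K
N_A = 6.02214076e23            # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19     # elementary charge, C


def debye_length_angstrom(ionic_strength_molar, temperature_k=298.15, eps_r=78.5):
    """Debye screening length in angstroms for a given ionic strength (mol/L).

    temperature_k and eps_r are assumed values for water at 25 C.
    """
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    kappa_sq = 2.0 * N_A * E_CHARGE**2 * ionic_strength_si / (
        eps_r * EPSILON_0 * K_B * temperature_k
    )
    return 1e10 / math.sqrt(kappa_sq)  # m -> angstrom


def screening_factor(distance_angstrom, debye_angstrom):
    """Exponential attenuation exp(-r / Debye length) of a screened interaction."""
    return math.exp(-distance_angstrom / debye_angstrom)


debye = debye_length_angstrom(0.1)
print(f"Debye length at 0.1 M: {debye:.1f} A")
print(f"Screening factor at 10 A: {screening_factor(10.0, debye):.2f}")
```

Because the Debye length scales as the inverse square root of ionic strength, quadrupling the salt concentration halves the screening length, consistent with the expectation that remote charges matter less at higher ionic strength.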
The IFEs can be used to calculate the contributions of different configurations to retention via eq 9, which includes integration over both orientational and translational variables. The orientational integral, IΩ, at a constant separation distance r

$$I_\Omega = \int \left[ e^{-\Delta G(\Omega, r)/kT} - 1 \right] d\Omega \qquad (10)$$

can be calculated by integration over orientations defined by Connolly critical points on the molecule. As electrostatic interactions described by the Poisson-Boltzmann equation decay roughly exponentially with distance, IΩ at a fairly small protein-end group separation distance should provide a representative indication of the retention trends of proteins if adsorption is indeed dominated by electrostatic interactions. The IΩ values over the configuration layer at three distances for the four cytochrome c variants are shown in Table 4. The most strongly retained protein, rat cytochrome c, has the greatest contribution from the orientational integrals. The orientational integrals at a short separation distance follow the same order as that of the experimental data and display clear differentiation among the proteins. This suggests that a comparison of IΩ values of different proteins in this set at a short protein-adsorbent separation distance can provide a predictive indication of the retention trend.

Table 4. Orientational Integral IΩ at Protein-Adsorbent Separation Distances of 2, 4, and 6 Å

variant    2 Å     4 Å     6 Å
horse      3.40    2.37    1.61
sheep      2.96    2.15    1.57
dog        3.09    2.17    1.60
rat        3.50    2.39    1.65

To obtain the overall retention factor k′, Boltzmann-weighted averaging over the full range of possible states must be performed (eq 9). Again the electrostatic contribution is the only one considered here, and although it is expected to be dominant in IEC, the omission of other contributions, including those from simultaneous interaction with multiple ligands, is expected to lead to underestimation of k′ values and of the differentiation among variants. This is indeed found to be the case (Figure 5). However, the calculated retention factors for the four cytochrome c variants show the same trend in retention variation found experimentally, with the binding strength of the cytochrome c variants in the order rat > horse > dog > sheep. As the adsorbent properties enter the relation for k′ only as a multiplicative factor, retention differences among different adsorbents are not expected to be captured.

Figure 4. Distances of nearby positive and negative charges on rat (■) and dog (□) cytochrome c to the end group charge at the orientations defined by the reference atoms (a) F82 N in rat (IFE = -4.00 kT); F82 N in dog (IFE = -2.90 kT); (b) F46 C in rat (IFE = -0.98 kT); F46 CZ in dog (IFE = -3.15 kT) (F46 C in dog not accessible to the adsorbent end group).

Figure 5. Calculated retention factor k′ of cytochrome c variants on SP-650 C at pH 7 and ionic strength 0.1 M, compared with the measured k′.

Continuum electrostatic modeling of protein-adsorbent interactions with 3D protein structural information thus gives a reasonable account of chromatographic differentiation of the cytochrome c variants, in which a maximum of six residues change between any pair of proteins. Since most of the molecule is unaffected, it is mainly in the vicinity of the mutated residues that differences must be considered. The agreement with experiment in the calculated retention trends despite the omission of other effects suggests that the electrostatic contribution is the major one among the molecular forces that contribute to retention in these configurations. Although the retention trend of the closely related cytochrome c variants has been quantitatively captured, the underestimation of k′ values and of the resulting quantitative differentiation among the variants suggests that other molecular forces are needed for a more complete description of protein adsorption. This is seen more dramatically in a companion paper studying comparisons among other proteins.46

Although the approach described here is promising for modeling protein interactions in ion exchange, the computational effort required is quite substantial, because of the need to calculate interactions in a large number of protein-end group configurations. As seen above, the IFE in a given configuration is related to the local charge environment, and developing this into a more quantitative correlation appears to have the potential for computational economy. The local surface potential on the protein, even in the absence of the end group, might be expected to be such an appropriate correlating parameter; indeed, ion-exchange retention times appear to be correlated with the mean protein surface potential.47 The potentials on the Connolly subsurfaces of the cytochrome c variants, defined by the critical point approach, were calculated using MEAD. Comparisons of the local protein surface potentials and the corresponding IFEs (Figure 6) show a reasonable correlation between the protein surface potentials and IFEs. Favorable orientations with more negative IFEs are related to protein faces with larger surface potentials, and most points on the plot of IFE versus surface potential fall in the vicinity of a well-defined trend line. A small number of points deviate from the trend line more significantly, the reason being that, at certain protein-end group orientations, desolvation effects due to the presence of the end group contribute considerably to the calculated IFE. A large number of subsurfaces on the rat variant, the most strongly retained protein, have potentials greater than 4 kT/e, which correspond to favorable adsorption with IFEs that are highly negative.
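The trend-line comparison between local surface potential and IFE can be sketched as an ordinary least-squares fit followed by a residual check that flags points lying far from the line, as might occur where desolvation contributes strongly. The data pairs, the tolerance, and the helper names `fit_line` and `outliers` are hypothetical; the fitting procedure is an assumption of this sketch, not a description of how Figure 6 was prepared.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    x_mean, y_mean = mean(x), mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def outliers(x, y, slope, intercept, tol):
    """Indices of points whose absolute residual from the line exceeds tol."""
    return [
        i
        for i, (xi, yi) in enumerate(zip(x, y))
        if abs(yi - (slope * xi + intercept)) > tol
    ]


# Hypothetical pairs: surface potential (kT/e) and IFE (kT); not data from the paper.
potentials = [0.5, 1.0, 2.0, 3.0, 4.0, 4.5]
ifes = [-0.4, -0.9, -1.8, -1.0, -3.6, -4.1]

slope, intercept = fit_line(potentials, ifes)
flagged = outliers(potentials, ifes, slope, intercept, tol=1.0)
print(f"slope = {slope:.2f} kT per (kT/e), intercept = {intercept:.2f} kT")
print("points far from the trend line:", flagged)
```

A negative slope corresponds to the behavior described in the text, in which larger positive surface potentials go with more negative IFEs on a cation exchanger.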
For the variants that display weaker binding, such as the sheep and dog variants, the distribution of local surface potentials shifts to a lower range. This suggests that the distribution of protein local surface potentials is another way to obtain a rough evaluation of adsorption extent when electrostatic interactions are dominant.

Figure 6. Comparison of the IFEs and the corresponding local surface potentials of cytochrome c variants: (a) horse; (b) sheep; (c) dog; (d) rat.

CONCLUSIONS

We have shown that electrostatic modeling with molecular details can account for the small differences in retention trends among cytochrome c variants with limited structural variations. For those substitutions that add or remove charges, the general retention trend can be well characterized qualitatively by changes in the net charge, consistent with classical ideas of ion-exchange retention. For other variants bearing similar net charge values and small differences in the charge distribution, a fine-grained representation of protein structure is required for satisfactory comparative modeling. The interaction energies versus the distance maps between the charged residues and the adsorbent confirm that the charge distribution on the molecule is an important determinant of the electrostatic interactions. For proteins with similar structures, the main difference in retention due to minor structural changes is reflected in fine-tuning of the electrostatic effect. Apart from the approach that explores the full 3D configurational space to evaluate the contribution of protein-adsorbent interactions to k′, the orientational integral at a short separation distance is quantitatively representative of the retention extent, and the local protein surface potential offers a more computationally efficient route to correlating this.

ACKNOWLEDGMENT

We are grateful to Dr. Donald Bashford for making the MEAD package available and to Dr. Dilip Asthagiri for helpful discussions. This work was supported by the National Science Foundation (Grants CTS-9977120 and CTS-0350631).

(46) Yao, Y.; Lenhoff, A. M., submitted for publication.
(47) Haggerty, L.; Lenhoff, A. M. J. Phys. Chem. 1991, 95, 1472-1477.
Received for review May 7, 2004. Accepted August 24, 2004. AC049327Z