
J. Phys. Chem. B 2007, 111, 6534-6543

A Minimalist Model for Exploring Conformational Effects on the Electrospray Charge State Distribution of Proteins

Lars Konermann*

Department of Chemistry, The University of Western Ontario, London, Ontario, N6A 5B7, Canada

Received: January 27, 2007; In Final Form: March 30, 2007

The electrospray ionization (ESI) charge state distribution of proteins is highly sensitive to the protein structure in solution. Unfolded conformations generally form higher charge states than tightly folded structures. The current study employs a minimalist molecular dynamics model for simulating the final stages of the ESI process in order to gain insights into the physical reasons underlying this empirical relationship. The protein is described as a string of 27 beads (“residues”), 9 of which are negatively charged and represent possible protonation sites. The unfolded state of this bead string is a random coil, whereas the native conformation adopts a compact fold. The ESI process is simulated by placing the protein inside a solvent droplet with a 2.5 nm radius consisting of 1600 Lennard-Jones particles. In addition, the droplet contains 14 protons which are modeled as highly mobile point charges. Disintegration of the droplet rapidly releases the protein into the gas phase, resulting in average charge states of 4.8+ and 7.4+ for the folded and unfolded conformation, respectively. The protonation probabilities of individual residues in the folded state reveal a characteristic pattern, with values ranging from 0.2 to 0.8. In contrast, the protonation probabilities of the unfolded protein are more uniform and cover the range from 0.8 to 1.0. The origin of these differences can be traced back to a combination of steric and electrostatic effects. Residues exhibiting a small accessible surface area are less likely to capture a proton, an effect that is exacerbated by partial electrostatic shielding from nearby positive residues. Conversely, sites that are sterically exposed are associated with electrostatic funnels that greatly increase the likelihood of protonation. Unfolding enhances the steric and electrostatic exposure of protonation sites, thereby causing the protein to capture a greater number of protons during the droplet disintegration process.

* To whom correspondence should be addressed. E-mail: konerman@uwo.ca. Phone: (519) 661-2111 ext. 86313. Fax: (519) 661-3022. Internet: http://publish.uwo.ca/~konerman/.

Introduction

Electrospray ionization (ESI) mass spectrometry (MS) has evolved into a highly versatile tool for studies on proteins and many other biological macromolecules.1-3 Detailed insights into protein structure and dynamics can be obtained by combining ESI-MS with hydrogen/deuterium exchange,4 covalent crosslinking,5 proteolytic digestion,6 and rapid on-line mixing.7,8

The ESI process commences when analyte solution is sprayed from a metal capillary to which high voltage has been applied. Droplets of analyte solution emanate from the tip of a Taylor cone at the capillary outlet. In the positive ion mode, these droplets carry excess positive charge due to protons or other cationic species (such as NH4+) that are evenly spread over the droplet surface, while the droplet interior is electrically neutral.9 Solvent evaporation increases the charge density until Coulombic repulsion balances the cohesive forces of the liquid.10,11 The number of charges, zR, at this so-called Rayleigh limit can be calculated according to12,13

$$z_R e = 8\pi(\varepsilon_0 \gamma R^3)^{1/2} \qquad (1)$$

where e is the elementary charge, ε0 is the permittivity of vacuum, γ is the surface tension of the solvent, and R is the droplet radius. At or slightly below the Rayleigh limit, the droplets become unstable, and smaller offspring droplets are

formed by jet fission.14,15 Subsequent evaporation and fission events ultimately lead to the formation of nanometer-sized droplets from which multiply protonated [M + zH]z+ ions are produced.10 Experimental16,17 and computational18 studies suggest that the biomolecular macroions generated in this way can retain significant aspects of their solution structure.

This study addresses a particularly intriguing phenomenon, namely, the fact that the ESI charge state distributions of proteins are highly sensitive to the polypeptide conformation in solution. Proteins adopting tightly folded structures generally show relatively low charge states, whereas unfolding in solution greatly enhances the degree of protonation during ESI.19,20 On the basis of this empirical relationship, ESI-MS has become a widely used method for probing protein conformational changes in solution.1,8,21-25

The physical mechanisms underlying the correlation between polypeptide conformation and ESI charge states continue to be a hotly debated topic. Generally, the charge states seen in ESI-MS do not coincide with those of a protein in bulk solution.26 Early attempts to explain the observed relationship invoked differences in the steric accessibility of protonation sites and changes of the corresponding pKa values.19,20 It has also been argued that more extended conformations can accommodate a higher number of protons, because an increased charge spacing reduces the extent of Coulombic repulsion.27 Similarly, the enhanced conformational flexibility of unfolded proteins might facilitate the intramolecular solvation of charged sites.28 A related possible determinant of ESI charge states is the difference

10.1021/jp070720t CCC: $37.00 © 2007 American Chemical Society Published on Web 05/19/2007

in gas-phase basicity between protein and solvent.29 Grandori30-32 and others33 have put forward hypotheses that involve conformation-dependent charge neutralization effects.

Two general mechanisms have been proposed to describe the formation of gas-phase ions from nanometer-sized droplets close to the Rayleigh limit.34 The ion evaporation model (IEM) suggests that analyte ions can be ejected from the surface of an intact, highly charged droplet.35 Fenn and co-workers proposed that unfolded proteins generate higher charge states because their increased surface area spans a larger number of charges at the air/liquid interface as the protein is desorbed.9,36 However, it was later argued that the desorption of macromolecular ions by such a mechanism would result in protonation states much lower than those observed experimentally.13 An alternative framework, the charged residue model (CRM), stipulates that solvent evaporation to dryness releases the protein, which retains some of the droplet’s charge.37 The experimental observation that the charge states of proteins electrosprayed from nondenaturing aqueous solutions are close to the zR values of protein-sized water droplets (eq 1) provides strong support for the notion that these ionic species are indeed formed via the CRM.13,32,38-41 Noncompact protein conformers might increase the size of the final nanodroplets, possibly by inducing nonspherical shapes, such that these droplets can accommodate a larger number of charges at the Rayleigh limit.38 However, this view is not undisputed,9 partly because eq 1 may not be applicable to nanometer-sized droplets.13,42 In particular, the expected dependence of the protein charge states on the surface tension γ cannot always be confirmed experimentally.30,43 Also, changes in the number of ionizable residues can lead to alterations of the observed charge states that are difficult to reconcile with the CRM.31 Measurements carried out in
negative ion mode result in charge states significantly below zR.41

The difficulties in developing a mechanistic understanding of the relationship between protein conformation and charge state distribution are rooted in the limited understanding of the final steps of the ESI process. In contrast to early fission events of microdroplets, the behavior of ion-producing nanometer-sized solvent clusters is difficult to monitor experimentally.14,15 It appears that computer modeling strategies might offer an interesting alternative. Consta and co-workers42,44-46 have pioneered the use of molecular dynamics (MD) simulations for studying the disintegration of aqueous nanodroplets charged with metal ions.

In this work, we chose a related but more simplistic approach to gain insights into the final stages of protein ion formation during ESI. The rationale behind the strategy used here is that minimalist models are often successful in capturing fundamental properties of highly complex systems.47-49 In addition, atomistic simulations of the ESI process would represent an extremely challenging computational problem.18 The MD model employed in this work reproduces the experimental observation that proteins electrosprayed in unfolded conformations generate higher charge states than compact conformers. The minimalist nature of the underlying framework makes it relatively straightforward to identify the factors responsible for the observed relationship.

The Model

ESI Protonation Sites. We initially consider a hypothetical protein in bulk solution. In its native state, the polypeptide chain adopts a globular fold that has all ionizable sites in contact with the solvent, whereas hydrophobic residues are mostly buried inside the protein core. This native structure is stabilized by a number of covalent cross-links (e.g., disulfide bridges). Disruption of these cross-links induces unfolding and the transition to a random coil. For reasons of simplicity, it is assumed that the protein carries equal numbers of positive and negative charges at neutral pH, regardless of its conformation. Positive charges reside on all Arg (pKa ∼12) and Lys (pKa ∼11) side chains and on the N-terminus (pKa ∼8). Conversely, all Asp and Glu side chains and the C-terminus are negatively charged (pKa ∼4).50 The protein does not possess any His residues, which would unnecessarily complicate the model due to their pKa value in the neutral range. In short, all basic sites are positively charged, and all acidic sites are negatively charged, a situation that is reasonably close to that of real proteins under a wide range of solvent conditions.50

Let us now consider the changes in this charging pattern that have to take place upon transforming the neutral solution-phase protein into an [M + zH]z+ gas-phase ion. Positive charges on electrosprayed proteins are known to be located mainly on the most basic sites, i.e., Arg and Lys side chains (also His, for proteins containing this type of residue), as well as on the N-terminus.29,51 Surprisingly, previous discussions in the literature often appear to focus on protonation events of these basic sites that supposedly have to occur during ESI. This notion seemingly overlooks the fact that, due to their pKa values, the majority of these basic moieties are already protonated in bulk solution.50 More correctly, the formation of [M + zH]z+ species during ESI has to be ascribed largely to the protonation of negatively charged (acidic) residues, as these carboxylate-containing groups are the sites that will most readily accommodate H+ ions.27,31,33 Accordingly, the key to understanding the relationship between conformation and ESI charge state distribution is the question, how many of the negative charges on the protein (Glu⁻, Asp⁻, C-terminus⁻) become neutralized during the ionization process?

J. Phys. Chem. B, Vol. 111, No. 23, 2007 6535

Figure 1. Native structure of the minimalist protein model used in this work, consisting of positively charged (blue), negatively charged (red), and solvophobic (green) residues. Also indicated is the residue numbering used throughout this work. The bead diameters shown are approximately 2/3 of the Lennard-Jones σ value used in the simulations.
Implementation. The current study employs a minimalist model that does not distinguish among different types of acidic or basic sites. Instead, the protein is described as a simple bead string representing 27 spherical “residues” that are linked by harmonic springs. The spring constant was arbitrarily chosen to be 93 kJ mol-1 Å-2. With this value, the bonds were


Figure 2. Disintegration of a charged nanodroplet initially consisting of a folded protein surrounded by 1600 solvent particles and 14 protons. The 4 panels correspond to the following time points and protein charge states: (A) 3 ps, 1+; (B) 76 ps, 5+; (C) 96 ps, 5+; (D) 104 ps, 5+. Protein residues are shown with the same color coding as in Figure 1, close to their actual size of σ = 0.3 nm. Solvent particles are shown at a fraction of their size. Proton locations are marked as white spheres. The two images in panel D refer to the same time point, showing the protein in “ball and stick” representation (top) and as a “stick-only” model which reveals the bound protons (bottom).

sufficiently stiff to limit the extent of large-scale conformational fluctuations of the native protein within the droplet. The center-to-center equilibrium bond length is 0.4 nm, a number that roughly corresponds to the distance between adjacent Cα atoms in a polypeptide chain.50 The sequence starts with a positively charged bead (+e, residue 1), followed by a solvophobic neutral (residue 2) and a negatively charged residue (-e, residue 3). This triad is repeated 9 times, resulting in a positively charged “N-terminus” and a negatively charged “C-terminus” (residue 27). Each residue has a mass of 100 g mol-1. The native conformation exhibits a cubic structure, consisting of a solvophobic layer that is sandwiched between two layers of charged residues with alternating polarity (Figure 1). To prevent unfolding of the native state during MD simulations, additional spring connections (not shown in Figure 1) are inserted between all neighboring residues,

with physical characteristics identical to that of the backbone bonds. Calculations on the unfolded state were carried out in the absence of these additional linkages. The temperature was chosen to be 100 °C (373 K), which reflects the fact that most ESI mass spectrometers use heating elements in the ion source region to enhance the rate of solvent evaporation. A strong correlation between protein conformation and ESI charge state distribution has been observed experimentally not only in aqueous solution, but also in mixtures containing large fractions of various organic cosolvents,52,53 as well as additives that significantly affect the surface tension.13,30,43 This implies that the specific properties of the solvent play a secondary role for the question being investigated here. Within our model, the solvent is described as spherical particles (mass 18 g mol-1) that have the same dimensions as the protein residues. The


interactions among solvent particles are described by a Lennard-Jones potential

$$V_{LJ}(r) = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right] \qquad (2)$$

which mimics both the short-range repulsion, as well as attractive van der Waals interactions due to dipole-dipole interactions, as a function of the center-to-center distance r.54 The parameter σ represents a measure of the particle diameter and is chosen to be 0.3 nm. The depth of the potential well, ε, was estimated from the surface tension of water at 100 °C (γ = 0.058 N m-1)55 based on the relationship56 γ = 2ε/(πσ²), resulting in ε = 5 kJ mol-1. This value is close to the average kinetic energy of the solvent particles, 3/2 kBT = 4.6 kJ mol-1. Interactions of the solvent with charged residues are modeled in the same way as interactions among solvent particles, regardless of protonation status (see below). In contrast, solvophobic residues interact with the solvent through a short-range repulsive potential V′(r) that is obtained by transforming eq 2 according to V′(r) = V(r) + ε and by truncation at its minimum, which is located at r = 2^(1/6) σ. The same repulsive potential is applied to residue-residue interactions, resulting in a self-avoiding polymer chain.

Protons are represented as point charges (+e) with a mass of 1 g mol-1. They do not occur as free entities, but only bound to solvent particles or protein residues. In order to ensure a high proton mobility, a Lorentzian-shaped potential well of the form

$$W(r) = -\alpha\left(\frac{\sigma^2/2}{r^2 + \sigma^2/2}\right) \qquad (3)$$

was assigned to all solvent particles and protein residues (with the exception of solvophobic residues which cannot be protonated). The depth of the well is α = 30 kJ mol-1, and r in eq 3 represents the proton distance from the center of the host particle. When neighboring particles are in van der Waals contact, the superposition of their potential wells creates a thermally accessible transition state for proton hopping from one particle to the next. Due to the dynamic nature of the solvent packing, each proton trajectory resembles a three-dimensional random walk on an energy landscape with fluctuating barriers. These proton trajectories, however, are also affected by electrostatic effects, i.e., repulsion among protons and positively charged residues, as well as attractive forces resulting from negatively charged residues. All of these interactions are described by Coulombic potentials with a dielectric constant equal to that of water (κe = 80). Capture of a proton by a negatively charged residue leads to irreversible trapping. It is recognized that the choice of eq 3 to describe the dynamic behavior of protons is guided by heuristic principles. Nonetheless, the framework employed here accounts for some basic hallmarks of “real” protons, i.e., point charges that possess a high mobility within a closely packed solvent matrix, and strong binding to individual solvent particles once contact to the bulk is lost.

Simulation Details. MD runs were carried out on the basis of Fortran code developed in-house, employing a leapfrog algorithm54 with a time increment of 3 fs and a simulation window of 150 ps. Center-of-mass translations and nonzero contributions to the overall angular momentum were eliminated during every iteration. The potentials defined in eqs 2 and 3 were truncated for r > 2.5 σ and r > σ, respectively. Coulombic interactions were modeled without a long-range cutoff, but truncation was used for center-to-center distances of less than

Figure 3. Time profiles for several parameters associated with the protein ionization process depicted in Figure 2. (A) Number of solvent particles in the droplet. (B) Protein charge state. (C) Radial position of three selected protons relative to the protein/solvent cluster center of mass. Arrows in panels B and C indicate the point where one of the protons (solid line in panel C) binds to the protein, resulting in a charge state increase from 3+ to 4+. The brackets (top) labeled (1) and (2) indicate the two regimes of the droplet fragmentation process as outlined in the text.

0.1 σ. ESI charge state distributions were derived from 50 independent simulations for each set of conditions, using different random values for initial particle positions and velocities for each run.

ESI solvent droplets were generated by initially surrounding the protein by a spherical low-density cloud of solvent, having the protein center-of-mass at its midpoint. Protons were assigned to individual solvent particles at random. The resulting system was then exposed to a radial trapping potential and slowly cooled from 300 to 50 K, resulting in a closely packed spherical droplet. Subsequently, the system was heated; the t = 0 time of the simulation coincides with the point where the final temperature of 373 K is reached. Up until this time point, all protein residues were given properties identical to those of the solvent, i.e., residue charges and hydrophobicities were only assigned at t = 0. Also, the spherical trapping potential was removed at t = 0, resulting in the onset of solvent evaporation. This evaporation tends to cool down the ESI droplet, thus requiring continuous heating to maintain a constant temperature for the continuously shrinking nanocluster.

Results and Discussion

MD simulations were used to explore the formation of gas-phase protein ions from electrically charged nanodroplets, a process that occurs during the final stages of ESI. In these droplets, an initially neutral 27-residue protein is surrounded by 1600 solvent particles, thereby forming a spherical cluster with a radius of R ≈ 2.5 nm. According to eq 1, droplets of this size can accommodate 14 protons. Our objective is to use the minimalist model outlined in the previous sections to gain insights into the physical basis underlying the relationship between protein conformation in solution and ESI charge state distribution. We will initially discuss results obtained for the folded protein, focusing on typical features that were observed for the majority of the simulation runs.
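Several of the numerical parameters quoted for the model can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the in-house Fortran code: it evaluates the Rayleigh limit of eq 1 for R = 2.5 nm, the Lennard-Jones well depth from the relationship γ = 2ε/(πσ²), the mean kinetic energy 3/2 kBT at 373 K, and the number of 3 fs steps in a 150 ps window. The function names are illustrative, and using the surface tension of water at 100 °C (0.058 N m-1) in eq 1 is an assumption made here.

```python
import math

# Physical constants in SI units
E_CHARGE = 1.602176634e-19   # elementary charge e, C
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def rayleigh_limit(radius_m, gamma):
    """Number of elementary charges z_R at the Rayleigh limit (eq 1)."""
    return 8.0 * math.pi * math.sqrt(EPS_0 * gamma * radius_m**3) / E_CHARGE


def lj_well_depth(gamma, sigma_m):
    """Well depth in kJ/mol, solving gamma = 2*eps/(pi*sigma^2) for eps."""
    eps_joule = gamma * math.pi * sigma_m**2 / 2.0
    return eps_joule * N_A / 1000.0


def mean_kinetic_energy(temperature_k):
    """Average kinetic energy 3/2 k_B T per particle, in kJ/mol."""
    return 1.5 * K_B * temperature_k * N_A / 1000.0


GAMMA = 0.058        # N/m, water at 100 degrees C (assumed to apply to eq 1 too)
RADIUS = 2.5e-9      # m, initial droplet radius
SIGMA = 0.3e-9       # m, Lennard-Jones particle diameter
TEMPERATURE = 373.0  # K

z_r = rayleigh_limit(RADIUS, GAMMA)
epsilon = lj_well_depth(GAMMA, SIGMA)
e_kin = mean_kinetic_energy(TEMPERATURE)
n_steps = round(150e-12 / 3e-15)  # 150 ps window with a 3 fs time increment

print(f"z_R = {z_r:.2f}, epsilon = {epsilon:.2f} kJ/mol, "
      f"3/2 kT = {e_kin:.2f} kJ/mol, steps = {n_steps}")
```

Under these assumptions the results come out close to the 14 protons, 5 kJ mol-1, and 4.6 kJ mol-1 quoted in the text, and one run corresponds to 50 000 integration steps.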
For early time points, the droplet


Figure 4. Disintegration of a charged nanodroplet containing an unfolded protein. The time points and charge states are (A) 3 ps, 2+; (B) 87 ps, 7+; (C) 97 ps, 7+; (D) 126 ps, 8+. Further explanations are given in the caption of Figure 2.

maintains a largely spherical shape. The protons, while being highly mobile, are preferentially located at radial positions ca. 2 nm from the droplet center (Figure 2A). This situation is the result of two opposing factors, namely, charge repulsion, which tends to push the protons to the droplet surface, and the trend to maximize proton solvation. The resulting scenario where the protons are located close to, but not directly at, the droplet surface resembles the situation of small water clusters charged with atomic ions.42,46 Solvent evaporation continuously reduces the number of particles (Figure 2B). The time range around 70 ps marks the onset of a large-scale disintegration which lasts for about 25 ps, resulting in the disruption of virtually all solvent-solvent and solvent-protein interactions (Figure 2C). Eventually, these processes produce a desolvated protein in the gas phase that remains bound to a number of protons. Stretching and bending motions of the protein are strongly damped as long as it is surrounded by solvent. Upon desolvation, the protein structure becomes more dynamic, and the residues undergo

large-scale oscillations that lead to distortions of the native structure (Figure 2D). Figure 3A illustrates the two regimes of the droplet disassembly process, characterized by (1) gradual loss of solvent particles at an almost constant rate and (2) large-scale disintegration, which ultimately leads to the formation of a completely desolvated protein ion in the gas phase. The final protein charge state, 5+ for the example depicted here, is typically reached during the first phase of the disassembly process. Protonation of negative residues occurs in a stepwise manner (Figure 3B). These protonation events are irreversible, i.e., neither transfer back to the solvent nor proton loss in the gas phase were observed in any of the simulation runs. Figure 3C depicts some typical proton trajectories, plotted as radial position vs time. Following rapid fluctuations around an average value of ca. 2 nm, individual protons are either captured by a negative site on the protein (Figure 3C, solid line) or released during the second phase of the disintegration process in a solvent-bound


Figure 6. Protonation energy (potential energy difference) associated with the transition between individual charge states for the folded and the unfolded protein. The x-axis refers to protonation events, e.g., “1” represents 0 → 1+, and so forth. The data shown for each charge state represent the arithmetic mean of 200 calculations as described in the text; standard deviations are shown as error bars.

Figure 5. Charge state distributions obtained for the folded (A) and unfolded (B) protein, based on the results of 50 independent MD runs for each of the 2 conformations. The direction of the x-axis has been inverted in order to emphasize the analogy of these data with experimental ESI mass spectra. The y-axis represents the number of times each charge state was observed. Error bars correspond to the square root of counts for each charge state.

form (Figure 3C, dotted line). Loss of solvated protons can also occur at earlier time points (Figure 3, dashed line), but these events are rare, typically affecting no more than 1 out of the 14 protons. The sequence of events for droplets containing an unfolded protein is very similar to that described above for the folded state (Figure 4). In particular, the time scale of droplet breakdown with two more or less distinct phases, as depicted in Figure 3A, is virtually independent of protein conformation (data not shown). Both forms of the protein show a propensity to move toward the droplet surface during the second half of the disintegration process (Figures 2B,C and 4B). This effect is mainly attributed to the tendency of the droplet interior to maintain electrical neutrality, resulting in forces that push any excess positive charges to the surface.9 The mechanism of protein ionization in our model, therefore, exhibits elements of both the CRM and the IEM. On the one hand, highly asymmetric situations such as that depicted in Figure 2C are reminiscent of protein ejection from the droplet surface, a key feature of the IEM.35 On the other hand, the protein release is closely coupled with the complete disintegration of the droplet, a characteristic element of the CRM.37 As an interesting side aspect, therefore, our data suggest that a strict differentiation between the two classical limiting scenarios (IEM vs CRM) may not be possible when discussing the ESI process for large biomolecular species. The most important result in the context of this study is a clear difference in the degree of protonation for the folded and unfolded conformations after the protein has been released into the gas phase (Figure 5). Simulations carried out for the folded structure result in an ESI charge state distribution ranging from 4+ to 7+, with a dominant maximum at 5+. In contrast, the

distribution obtained for the unfolded protein ranges from 6+ to 9+, with 7+/8+ being the most intense peaks. The model used here, therefore, reproduces a key experimental observation, namely, a strong dependence of the ESI charge state distribution on the solution-phase structure of the protein. Unfolded proteins become more extensively protonated than tightly folded conformations. We will now proceed to identifying the physical reason(s) underlying this relationship within the framework of our model. The recombination of a negatively charged site with a proton will normally be an energetically favorable process. However, the amount of energy released should decrease with every successive protonation step as a result of Coulombic repulsion, caused by the gradually accumulating net positive charge on the protein. It would be expected that this Coulombic repulsion is more pronounced for the folded conformation due to the close proximity of protonation sites, perhaps even leading to unfavorable (positive) protonation energies for the highest charge states. In order to evaluate the magnitude of this effect, the potential energy of the protein was studied as a function of charge state. For these calculations, all charge-charge interactions, as well as deformations of interresidue bonds, were taken into account. For each charge state, the potential energy of the protein (embedded in a droplet of 1600 solvent particles) was calculated as an average of 200 different conformations and proton distribution patterns. From the resulting data, the energy release associated with subsequent protonation events was determined (Figure 6). For the unfolded protein, the protonation energies were found to be around -50 kJ mol-1, virtually independent of charge state. In contrast, the folded protein exhibits a decline in protonation energy, from -50 to about -36 kJ mol-1. 
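The step from averaged potential energies to protonation energies amounts to taking differences between successive charge states. The sketch below shows one way this bookkeeping could be done; the function name and the input layout are assumptions, and the sample numbers are hypothetical placeholders rather than data from this work.

```python
from statistics import mean


def protonation_energies(samples_by_charge):
    """Energy change for each protonation step (z-1) -> z.

    samples_by_charge maps a charge state z to a list of potential
    energies (kJ/mol) sampled over conformations and proton patterns.
    A negative result means that the step releases energy.
    """
    averages = {z: mean(values) for z, values in samples_by_charge.items()}
    return {
        z: averages[z] - averages[z - 1]
        for z in sorted(averages)
        if z - 1 in averages
    }


# Hypothetical placeholder energies (kJ/mol), NOT results from the simulations.
example = {
    0: [-100.0, -102.0, -98.0],
    1: [-150.0, -151.0, -149.0],
    2: [-195.0, -197.0, -193.0],
}

steps = protonation_energies(example)
for z, delta in steps.items():
    print(f"{z - 1}+ -> {z}+: {delta:.1f} kJ/mol")
```

In the paper each average is taken over 200 conformations and proton distribution patterns; the three values per charge state used here only keep the example short.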
Notably, however, these data show that protonation of the folded protein remains highly favorable up to the maximum charge state of 9+. It can be ruled out, therefore, that energetic factors of the type discussed in this paragraph represent a major cause of the lower protonation states seen for the folded state. Interesting clues come from the protonation probabilities pi of individual residues. The folded conformation reveals a characteristic pattern, where high probabilities around 0.8 are observed for residues 3, 9, 21, and 27 (Figure 7A). Intermediate values of ca. 0.4 are seen for sites 6, 12, 18, and 24. Residue 15 shows the lowest of all probabilities, pi ≈ 0.2. This pattern is strikingly different from the pi progression of the unfolded


Figure 7. Protonation probabilities pi of negatively charged residues 3, 6, ..., 27 as determined from the MD simulations for the folded (A) and unfolded (B) proteins. Panels C,D: Solvent-exposed surface area Ai of each protonation site i, determined using a closely packed structure of 3 × 3 × 3 cubes for the folded protein (C) and a linear array of cubes for the unfolded protein (D). Panels E,F: Average radial positions r̄i of individual protonation sites within the droplet. Panels G,H: Estimated protonation probabilities p_i^est, calculated from eq 4. The normalization constant C in eq 4 has been chosen such that p_27^est = 1 for the unfolded conformation, thereby ensuring that the p_i^est values (G,H) are directly comparable with the pi values in panels A,B.

protein, which exhibits high values around 0.8 for residues 3 to 24. Residue 27 was protonated during every single simulation run, corresponding to a probability of unity (Figure 7B). It is important to realize that the sum of these probabilities, ∑pi = 4.8+ and ∑pi = 7.4+, reflects the average charge state for the folded and unfolded protein, respectively. Any attempts to explain the different protonation behavior of the two conformations, therefore, have to account for the probability patterns in Figure 7A,B.

To rationalize the observed pi progressions, we will first consider geometrical factors, before including Coulombic proton-protein interactions. Naively, it can be stated that for a protonation event to occur a proton has to “collide” with a protonation site. To a first approximation, the probability of

any such collision is proportional to the exposed surface area Ai of each site. For the folded conformation, estimates of these surface areas are most easily obtained by switching from the bead structure depicted in Figure 1 to one that consists of 3³ closely packed cubes, where every cube represents one residue. Accordingly, the relative exposed surface areas of the protonation sites follow the sequence A3 = 3, A6 = 2, A9 = 3, A12 = 2, A15 = 1, A18 = 2, A21 = 3, A24 = 2, A27 = 3 (Figure 7C). Similarly, when regarding the unfolded protein as a linear structure consisting of 27 cubes, it is found that the exposed surface areas are A3 = ... = A24 = 4 and A27 = 5 (Figure 7D). The same result would be obtained for nonlinear unfolded structures, assuming that consecutive cubes are attached face to face.
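The face-counting argument above can be reproduced by placing unit cubes on a lattice and counting, for every cube, the faces that do not touch another cube. The sketch below is an illustration under stated assumptions: the lattice coordinates are arbitrary, the function name is hypothetical, and the assignment of residue numbers to cells of the folded cube (Figure 1) is not reproduced, so only the three classes of surface sites are checked.

```python
from itertools import product

# Offsets to the six face-sharing neighbors of a lattice cube
NEIGHBOR_OFFSETS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def exposed_faces(cells):
    """Map each occupied lattice cell to its number of uncovered faces."""
    occupied = set(cells)
    return {
        (x, y, z): sum(
            (x + dx, y + dy, z + dz) not in occupied
            for dx, dy, dz in NEIGHBOR_OFFSETS
        )
        for (x, y, z) in occupied
    }


# Folded model: 3 x 3 x 3 block of cubes
folded = exposed_faces(product(range(3), repeat=3))

# Unfolded model: 27 cubes in a straight line, residue n at x = n - 1
unfolded = exposed_faces((i, 0, 0) for i in range(27))

print("corner:", folded[(0, 0, 0)])       # class of sites 3, 9, 21, 27
print("edge:", folded[(1, 0, 0)])         # class of sites 6, 12, 18, 24
print("face center:", folded[(1, 1, 0)])  # class of site 15
print("chain interior:", unfolded[(2, 0, 0)])
print("chain terminus:", unfolded[(26, 0, 0)])
```

Because the count depends only on how many neighbors share a face, any self-avoiding chain of face-connected cubes whose cubes touch only their sequence neighbors gives 4 for interior residues and 5 for the termini.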


Other factors that have to be considered are the locations of protonation sites within the droplet, and the likelihood that protons will be present at these positions. As seen earlier (Figure 3C), the protons preferentially reside at a radial distance of ca. 2 nm. Thermally activated fluctuations can cause individual protons to move closer to the center of the droplet for short amounts of time. Examples of these events can be seen in Figure 3C, e.g., for t ≈ 10 ps (dotted line). Protonation sites with average radial positions r̄i close to 2 nm are more likely to undergo a random collision with a proton than those located deeper in the droplet interior. Assuming that the protein center of mass remains close to the midpoint of the droplet during protonation, the r̄i values for the folded conformation can be estimated directly from Figure 1, e.g., r̄15 = 0.4 nm, r̄18 = √2 × 0.4 nm, and r̄3 = √3 × 0.4 nm (Figure 7E). For the unfolded state, the r̄i values were calculated on the basis of 200 random coil conformations. This procedure reveals that residues at the chain termini are located at greater distances from the droplet center (e.g., r̄27 = 1.2 nm) than residues in the middle of the protein sequence (r̄12 ≈ r̄15 ≈ 0.6 nm, Figure 7F).

The estimated probability p_i^est that a proton collides with site i can be determined by considering both the exposed surface area Ai and a Boltzmann factor B(r) that reflects the likelihood of a proton being present at r̄i according to

$$p_i^{\mathrm{est}} = \frac{1}{C} \times A_i \times B(\bar{r}_i) \tag{4}$$

where C is a normalization constant and

$$B(r) = \exp\left[-\frac{V(r)}{k_B T}\right] \tag{5}$$
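As a hedged illustration, eqs 4 and 5 can be evaluated numerically along the following lines. This Python sketch is not the procedure used to generate the published estimates, and several inputs are assumptions made here: the potential V(r) defaults to a flat profile (V = 0 everywhere) as a placeholder for the actual profile of Figure 8A; the normalization constant C is set to 1, so the output is relative rather than absolute; and the radial positions of the folded sites are assigned by area class (0.4 nm, $\sqrt{2} \times 0.4$ nm, and $\sqrt{3} \times 0.4$ nm for sites exposing 1, 2, and 3 faces), extrapolating from the three examples quoted in the text. The value of $k_B T$ follows from the thermal energy of 4.6 kJ mol$^{-1}$ quoted in the text for $\frac{3}{2} k_B T$.

```python
import math

# The text quotes (3/2) k_B T = 4.6 kJ/mol, hence k_B T = 4.6 / 1.5 kJ/mol.
KBT_KJ_PER_MOL = 4.6 / 1.5


def boltzmann_factor(r_nm, potential=lambda r: 0.0, kbt=KBT_KJ_PER_MOL):
    """Eq 5: B(r) = exp[-V(r) / (k_B T)], with V(r) in kJ/mol."""
    return math.exp(-potential(r_nm) / kbt)


def p_est(areas, radii_nm, potential=lambda r: 0.0, c_norm=1.0):
    """Eq 4: p_i^est = A_i * B(r_i) / C for every protonation site i."""
    return {
        site: areas[site] * boltzmann_factor(radii_nm[site], potential) / c_norm
        for site in areas
    }


# Relative exposed surface areas of the folded state, as listed in the text.
A_FOLDED = {3: 3, 6: 2, 9: 3, 12: 2, 15: 1, 18: 2, 21: 3, 24: 2, 27: 3}

# Assumption: radial position assigned by area class (see lead-in).
R_BY_AREA_NM = {1: 0.4, 2: math.sqrt(2) * 0.4, 3: math.sqrt(3) * 0.4}
R_FOLDED = {site: R_BY_AREA_NM[area] for site, area in A_FOLDED.items()}

# Flat potential (placeholder) and C = 1: values are relative, not probabilities.
relative = p_est(A_FOLDED, R_FOLDED)
for site in sorted(relative):
    print(f"site {site:2d}: relative p_est = {relative[site]:.2f}")
```

With a flat potential the relative values simply reproduce the $A_i$ pattern. A realistic V(r) can be passed as the `potential` argument, and C can be supplied through `c_norm` once its value is known.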

V(r) in eq 5 represents the potential energy of a proton as a function of radial distance from the droplet center (Figure 8). Inspection of the Boltzmann profile in Figure 8B reveals that B(r) does not change dramatically in the range of 0.4 to 1.2 nm. Thus, the $p_i^{\mathrm{est}}$ values of individual residues are predominantly determined by their exposed surface areas $A_i$. Although eq 4 relies on a number of simplifying assumptions, the resulting $p_i^{\mathrm{est}}$ values (Figure 7G,H) are qualitatively consistent with several key results of the MD simulations. The characteristic $p_i$ progression of the folded state (Figure 7A) is reproduced quite well by the $p_i^{\mathrm{est}}$ pattern in Figure 7G. Similarly, the relatively uniform $p_i^{\mathrm{est}}$ distribution for the unfolded protein with a maximum at residue 27 (Figure 7H) resembles that of Figure 7B. Most importantly, the predicted charge state of the folded protein, $\sum p_i^{\mathrm{est}} = 3.2+$, is lower than that of the unfolded conformation, $\sum p_i^{\mathrm{est}} = 6.3+$. These observations imply that the exposed surface areas $A_i$ of the protonation sites and, to a lesser degree, their average radial positions $\bar{r}_i$ are important factors for determining the overall appearance of the ESI charge state distribution.

Figure 8. (A) Coulombic energy of a single proton at distance r from the droplet center, assuming the presence of 13 other protons that are evenly spaced at r = 2 nm. The profile has been rescaled such that V(r) = 0 for r = 2 nm. (B) Boltzmann factor B(r) from eq 5, using the V(r) profile depicted in panel A. The double-headed arrow indicates the range of average radial positions $\bar{r}_i$ that is relevant for the calculation of $p_i^{\mathrm{est}}$ values in Figure 7G,H.

We will now expand these considerations to take into account electrostatic proton-protein interactions. Figure 9A depicts the Coulombic energy landscape experienced by a positive test charge (+e) that moves along the right-hand side of the native protein structure (y = -0.4 nm in Figure 1). The figure shows four negatively charged sites (3, 9, 21, and 27) as readily accessible minima. Protonation of these residues is promoted by potential gradients that provide an electrostatic funneling effect for proton-residue distances of less than ca. 0.25 nm.57 At this distance, the potential drop due to Coulombic trapping roughly matches the thermal energy of $\frac{3}{2} k_B T = 4.6$ kJ mol$^{-1}$. Residue 15 is surrounded by an electrostatic cage caused by the positive charges on sites 4, 10, 16, and 22. Obviously, this shielding will hinder access of protons to residue 15, an effect that is enhanced further once some of the easily accessible sites 3, 9, 21, and 27 have been protonated (Figure 9B). The residues 6, 12, 18, and 24 located on the left-hand side of the native structure (y = 0.4 nm in Figure 1) represent an intermediate scenario, where protonation is somewhat obstructed due to partial shielding, but not as much as for residue 15 (Figure 9C). On the basis of these considerations, the degree of exposure, i.e., the lack of electrostatic shielding, for all sites in the folded protein can be categorized as low (15), intermediate (6, 12, 18, 24), and high (3, 9, 21, 27). An electrostatic energy plot for the unfolded protein (Figure 9D) shows that residues 3, 6, ..., 24 are exposed to a degree closely resembling that of the "high" residues in the folded state. Residue 27 in the unfolded state is the most exposed (and hence most readily protonated) of all residues, because it is not directly adjacent to any positively charged sites (Figure 9D).

The qualitative categorization of protonation sites according to their electrostatic exposure (low, intermediate, and high) results in a pattern that strongly resembles the protonation probabilities determined from the MD simulations (Figure 7A,B). This correlation reflects the fact that electrostatically exposed residues are protonated more readily than those that are shielded. The electrostatic properties of individual residues and the accessible surface areas discussed earlier are two factors that cannot be considered independently of one another. The smallest values of $A_i$ go along with the highest degree of electrostatic shielding, as exemplified by residue 15 in the folded conformation. Conversely, sites with the largest accessible surface area (residue 27 in the unfolded protein) are the ones that are electrostatically most exposed. Thus, electrostatic and steric effects act in concert; together, they are responsible for the relationship between protein conformation and ESI charge state distribution in our model.

Experimental studies have shown that protein unfolding in solution increases not only the protonation states observed in ESI-MS, but also the width of the charge state distribution.58 The latter effect is not seen for the model used here. The current study treats the unfolded protein as a genuine random coil,

whereas many denatured polypeptide chains in solution appear to retain varying degrees of residual structure.59-61 It seems likely that the increased width of experimental ESI charge state distributions could be a manifestation of this residual structure. This interesting possibility will be addressed in future studies.

Figure 9. Two-dimensional Coulombic energy maps of a positive test charge (+e) interacting with positively (red/orange maxima) and negatively charged sites (blue minima) on the protein. (A) Folded conformation without protonation; test charge located at y = -0.4 nm [see Figure 1 for comparison]. (B) Same as in panel A, but after protonation of residues 3 and 27. Note how this change enhances the shielding of residue 15. (C) Folded conformation without protonation; test charge located at y = 0.4 nm. (D) Unfolded conformation without protonation; the test charge and all residues are positioned at y = 0. Numbers indicate the identity of charged residues.

Conclusions

This work employs a minimalist MD model for simulating the formation of multiply protonated gas-phase proteins from highly charged solvent droplets. The results obtained in this way provide insights into the mechanism underlying the relationship between solution-phase conformation and ESI charge state distribution. The model readily reproduces the experimental observation that unfolded proteins become more extensively protonated than tightly folded conformations. The major factors responsible for this behavior are (i) the solvent-exposed surface areas of individual protonation sites and (ii) the degree to which these sites are electrostatically shielded by positively charged residues. Unfolding of the protein increases both the steric accessibility and the electrostatic exposure of protonation sites. In essence, these conclusions are consistent with earlier proposals put forward on the basis of experimental studies.19,20,27,31,43 The results of this work, therefore, support the commonly held notion that the ESI charge state distribution represents a probe of the overall “compactness” of the protein structure in solution.7,58 The current work represents an initial step toward a better understanding of the final stages of the ESI process. The minimalist nature of the framework used allows the physical basis of the observed phenomena to be readily identified.

However, the relevance of the conclusions reached in this way for the behavior of real systems has to be solidified in future studies. In particular, it will be important to carry out MD simulations employing a more realistic description of the protein, the solvent, and the proton dynamics. For example, instead of modeling all electrostatic interactions with the same bulk dielectric constant, polarization effects should be taken into account explicitly. Possible deformations and charge asymmetry of the ESI droplets caused by external electrostatic fields in the ion source could play a role.62 Also, the size of the final fission-incompetent droplets could be different for folded and unfolded proteins.38 In any case, it appears that computational approaches are well-suited for studies in this fascinating area.

Acknowledgment. We thank Paul Kebarle, Martin H. Müser, and Styliani Consta for stimulating discussions. This work was financially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), The University of Western Ontario, and by the Canada Research Chairs Program.

References and Notes

(1) Kaltashov, I. A.; Eyles, S. J. Mass Spectrometry in Biophysics; John Wiley and Sons, Inc.: Hoboken, NJ, 2005.
(2) Fenn, J. B. Angew. Chem., Int. Ed. 2003, 42, 3871-3894.
(3) Aebersold, R.; Mann, M. Nature (London) 2003, 422, 198-207.
(4) Englander, S. W. J. Am. Soc. Mass Spectrom. 2006, 17, 1481-1489.
(5) Sinz, A. J. Mass Spectrom. 2003, 38, 1225-1237.
(6) Wales, T. E.; Engen, J. R. Mass Spectrom. Rev. 2006, 25, 158-170.
(7) Konermann, L.; Simmons, D. A. Mass Spectrom. Rev. 2003, 22, 1-26.
(8) Pan, J. X.; Rintala-Dempsey, A.; Li, Y.; Shaw, G. S.; Konermann, L. Biochemistry 2006, 45, 3005-3013.
(9) Fenn, J. B.; Rosell, J.; Meng, C. K. J. Am. Soc. Mass Spectrom. 1997, 8, 1147-1157.
(10) Kebarle, P.; Peschke, M. Anal. Chim. Acta 2000, 406, 11-35.
(11) Cole, R. B. J. Mass Spectrom. 2000, 35, 763-772.
(12) Rayleigh, L. Philos. Mag. 1882, 14, 184-186.
(13) Iavarone, A. T.; Williams, E. R. J. Am. Chem. Soc. 2003, 125, 2319-2327.
(14) Gomez, A.; Tang, K. Phys. Fluids 1994, 6, 404-414.
(15) Duft, D.; Achtzehn, T.; Muller, R.; Huber, B. A.; Leisner, T. Nature (London) 2003, 421, 128.
(16) Ruotolo, B. T.; Robinson, C. V. Curr. Opin. Chem. Biol. 2006, 10, 402-408.
(17) Tesic, M.; Wicki, J.; Poon, D. K. Y.; Withers, S. G.; Douglas, D. J. J. Am. Soc. Mass Spectrom. 2007, 18, 64-73.
(18) Patriksson, A.; Marklund, E.; van der Spoel, D. Biochemistry 2007, 46, 933-945.
(19) Chowdhury, S. K.; Katta, V.; Chait, B. T. J. Am. Chem. Soc. 1990, 112, 9012-9013.
(20) Katta, V.; Chait, B. T. J. Am. Chem. Soc. 1991, 113, 8534-8535.
(21) Vis, H.; Heinemann, U.; Dobson, C. M.; Robinson, C. V. J. Am. Chem. Soc. 1998, 120, 6427-6428.
(22) Grandori, R. Protein Sci. 2002, 11, 453-458.
(23) Pan, X. M.; Sheng, X. R.; Zhou, J. M. FEBS Lett. 1997, 402, 25-27.

(24) Konermann, L.; Douglas, D. J. Biochemistry 1997, 36, 12296-12302.
(25) Yan, X.; Watson, J.; Ho, P. S.; Deinzer, M. L. Mol. Cell. Proteomics 2004, 3, 10-23.
(26) Wang, G.; Cole, R. B. Org. Mass Spectrom. 1994, 29, 419-427.
(27) Grandori, R. J. Mass Spectrom. 2003, 38, 11-15.
(28) Wu, J.; Lebrilla, C. B. J. Am. Soc. Mass Spectrom. 1995, 6, 91-101.
(29) Schnier, P. D.; Gross, D. S.; Williams, E. R. J. Am. Soc. Mass Spectrom. 1995, 6, 1086-1097.
(30) Samalikova, M.; Grandori, R. J. Am. Chem. Soc. 2003, 125, 13352-13353.
(31) Samalikova, M.; Grandori, R. J. Mass Spectrom. 2003, 38, 941-947.
(32) Nesatyy, V. J.; Suter, M. J.-F. J. Mass Spectrom. 2004, 39, 93-97.
(33) Prakash, H.; Mazumdar, S. J. Am. Soc. Mass Spectrom. 2005, 16, 1409-1421.
(34) Nguyen, S.; Fenn, J. B. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 1111-1117.
(35) Iribarne, J. V.; Thomson, B. A. J. Chem. Phys. 1975, 64, 2287-2294.
(36) Fenn, J. B. J. Am. Soc. Mass Spectrom. 1993, 4, 524-535.
(37) Dole, M.; Mack, L. L.; Hines, R. L.; Mobley, R. C.; Ferguson, L. D.; Alice, M. B. J. Chem. Phys. 1968, 49, 2240-2249.
(38) de la Mora, F. J. Anal. Chim. Acta 2000, 406, 93-104.
(39) Felitsyn, N.; Peschke, M.; Kebarle, P. Int. J. Mass Spectrom. Ion Processes 2002, 219, 39-62.
(40) Kaltashov, I. A.; Mohimen, A. Anal. Chem. 2005, 77, 5370-5379.
(41) Heck, A. J. R.; Van den Heuvel, R. H. H. Mass Spectrom. Rev. 2004, 23, 368-389.
(42) Consta, S.; Mainer, K. R.; Novak, W. J. Chem. Phys. 2003, 119, 10125-10132.
(43) Samalikova, M.; Grandori, R. J. Mass Spectrom. 2005, 40, 503-510.
(44) Consta, S. THEOCHEM 2002, 591, 131-140.
(45) Consta, S. Theor. Chem. Acc. 2006, 116, 373-382.
(46) Ichiki, K.; Consta, S. J. Phys. Chem. B 2006, 110, 19168-19175.
(47) Das, P.; Matysiak, S.; Clementi, C. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 10141-10146.
(48) Nymeyer, H.; Garcia, A. E.; Onuchic, J. N. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 5921-5928.
(49) Dill, K. A.; Chan, H. S. Nat. Struct. Biol. 1997, 4, 10-19.
(50) Creighton, T. E. Proteins; W. H. Freeman & Co.: New York, 1993.
(51) Loo, J. A.; Edmonds, C. G.; Udseth, H. R.; Smith, R. D. Anal. Chem. 1990, 62, 693-698.
(52) Babu, K. R.; Moradian, A.; Douglas, D. J. J. Am. Soc. Mass Spectrom. 2001, 12, 317-328.
(53) Pan, J. X.; Wilson, D. J.; Konermann, L. Biochemistry 2005, 44, 8627-8633.
(54) Hinchliffe, A. Molecular Modelling; Wiley: Chichester, U.K., 2003.
(55) Atkins, P. Physical Chemistry, 6th ed.; W. H. Freeman & Co.: New York, 1998.
(56) Dill, K. A.; Bromberg, S. Molecular Driving Forces; Garland: New York, 2003.
(57) Konermann, L. Proteins 2006, 65, 153-163.
(58) Kaltashov, I. A.; Eyles, S. J. Mass Spectrom. Rev. 2002, 21, 37-71.
(59) Plaxco, K. W.; Gross, M. Nat. Struct. Biol. 2001, 8, 659-660.
(60) Shortle, D.; Ackerman, M. S. Science 2001, 293, 487-489.
(61) Jha, A. K.; Colubri, A.; Freed, K. F.; Sosnick, T. R. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 13099-13104.
(62) Zhou, S.; Cook, K. D. J. Am. Soc. Mass Spectrom. 2001, 12, 206-214.