J. Phys. Chem. B 2008, 112, 8165–8173
8165
Flexibility of Human Cytochromes P450: Molecular Dynamics Reveals Differences between CYPs 3A4, 2C9, and 2A6, which Correlate with Their Substrate Preferences Josef Skopalı´k,† Pavel Anzenbacher,*,‡ and Michal Otyepka*,† Department of Physical Chemistry and Center for Biomolecules and Complex Molecular Systems, Palacky UniVersity, Trida SVobody 26, 771 46, Olomouc, Czech Republic, and Department of Pharmacology, Faculty of Medicine and Dentistry, Palacky UniVersity, HneVotinska 3, 775 15 Olomouc, Czech Republic ReceiVed: January 13, 2008; ReVised Manuscript ReceiVed: April 21, 2008
Molecular dynamics (MD) simulations at normal and high temperature were used to study the flexibility and malleability of three microsomal cytochromes P450 (CYPs): CYP3A4, CYP2C9, and CYP2A6. Comparison of B-factors (describing the atomic fluctuations) between X-ray and MD data shows that the X-ray B-factors are significantly lower in the regions where the crystal contacts occur than for other regions. Consequently, the conclusions about CYP flexibility based solely on the X-ray data might be misleading. Comparison of flexibility patterns of the three CYPs enabled common features and variations in flexibility and malleability of the studied CYPs to be identified. The previously described pattern of flexibility in topological elements of microsomal CYPs (a rigid heme binding core, a malleable distal side and intermediately flexible proximal side) was confirmed. These topological features provide an important combination of high stereo- and regiospecificity (mediated by the relative rigidity in the neighborhood of the heme), together with high substrate promiscuity due to the more flexible active site and the malleability of the distal side. The data acquired here show that the malleability of the three studied CYPs correlates with their substrate specificity: CYP2A6 has a narrow substrate range and is the most rigid, CYP3A4 is the most promiscuous CYP known and is the most malleable, and CYP2C9 is intermediate in terms of both its substrate specificity and malleability. Thus, the malleability of CYPs is probably a major determinant of their substrate specificity. Introduction Cytochromes P450 (CYP) are heme-containing monooxygenases that are ubiquitous in both eukaryotes and prokaryotes.1 Human liver microsomal CYP enzymes, discussed here, play major roles in the first phase of metabolism of xenobiotics.2–5 The mechanisms whereby the microsomal CYPs accommodate various substrates, yet oxidize them in a stereospecific and regiospecific manner, is an intriguing aspect of the biochemistry of CYPs, with significant fundamental and potentially practical implications. Recent progress in the knowledge of microsomal CYP X-ray structures has enabled the identification of their common structural features and variations. Notably, the preferred access/egress paths to buried CYPs’ active sites, degree of flexibility, size of the active site, and water network inside the active site have been identified as the most important variations among described CYPs.6,7 Flexibility, one of the key modulators of CYP specificity, seems to play important roles in the substrates’ entry and accommodation in the active site, and also in the release of the products.7–9 Recent results have shown that X-ray analysis does not provide direct evidence regarding protein flexibility. Therefore, conclusions about enzyme flexibility are based mainly on temperature factors derived from X-ray data and comparisons of ligand-free and ligand-bound structures.8,10,11 In addition, the motions observed in these studies may not represent the degree of flexibility required to accommodate some especially large * Corresponding author. Phone/fax +420 585634756; e-mail:
[email protected] (M.O.). Phone +420 585632569; e-mail: anzen@ tunw.upol.cz (P.A.). † Department of Physical Chemistry and Center for Biomolecules and Complex Molecular Systems. ‡ Department of Pharmacology, Faculty of Medicine and Dentistry.
substrates. Such data from X-ray experiments are valuable, but these data should be considered with care because they may be significantly affected by shortcomings of the X-ray crystallography, e.g., by nonphysiological conditions (high concentrations of salts, presence of cryoprotectants, extremely low temperature, etc.) and by crystal packing causing non-native contacts. Therefore, conclusions about flexibility based on X-ray crystal structures should be verified by other independent techniques. A promising technique for this is molecular dynamics (MD) simulation, which allows structural features of a molecule, typically a protein, to be explored in silico (in a computer). MD experiments provide atomic-level simulations of structural fluctuations (at nanosecond timescales) of the molecule of interest immersed in a periodic box of water molecules and counterions simulating physiological conditions. MD simulations can thus provide important indications about (inter alia) the folding, unfolding, conformation, and interactions of proteins. Here, we show that they can also provide valuable information about the stability of proteins, specifically human liver microsomal CYPs. This work compares the flexibility of three human microsomal CYPs: 3A4, 2C9, and 2A6. CYP3A4 is the major P450 enzyme in human liver and intestines, involved in the metabolism of more than half of the marketed drugs with known metabolism. CYP2C9 is the major CYP2C form found in human liver, and is responsible for the biotransformation of many endogenous compounds as well as weakly acidic drugs. CYP2A6 metabolizes about 1% of the drugs metabolized by CYPs, its typical substrates being coumarin and steroids. It also contributes to the metabolism of nicotine and tobacco-specific procarcinogens.12,13 These three CYPs were selected for this study because CYP2A6 has a small active site13 and relatively narrow substrate
10.1021/jp800311c CCC: $40.75 2008 American Chemical Society Published on Web 06/17/2008
8166 J. Phys. Chem. B, Vol. 112, No. 27, 2008
Skopalı´k et al.
Figure 1. Left: Structure of the heme residue used in the MD simulations. The heme is modeled with a modified Cys residue (in thiolate form) and a water molecule in the second heme iron axial position (carbon atoms - green, nitrogens - blue, ogyxen atoms - red, sulfur yellow, hydrogen - white, and iron - orange). Right: Topology of cytochrome P450 with labeled secondary structure elements and highlighted position of the heme (represented in sticks).
TABLE 1: Summary of MD Simulations. normal temperature CYP 2A6 2C9ch 2C9 3A4
initial PDB 1Z10 1OG2 1OG2e 1TQN
a
b
t (ns)
Rg (Å)
5.0 5.0 8.5 5.0
22.63 ( 0.05 22.83 ( 0.06 22.67 ( 0.05 22.73 ( 0.06
high temperature rmsd
(Å)c
1.37 ( 0.08 1.78 ( 0.08 1.81 ( 0.09 1.33 ( 0.06
t
(ns)d
Rg (Å)b
rmsd (Å)c
10.0
22.74 ( 0.15
3.00 ( 0.29
10.0 10.0
22.58 ( 0.11 23.24 ( 0.15
3.79 ( 0.23 4.09 ( 0.22
a Duration of MD simulations under NT conditions (T ) 298.15 K, p ) 1 atm). b Rg is the mean radius of gyration of CR, C, and N atoms ( SMD calculated from the last 3 ns of the respective simulations. c rmsd is the mean root-mean-square deviation ( SMD of CR, C, and N atoms from the starting X-ray structure calculated from the last 3 ns of the respective simulations. d Duration of MD simulations under HT conditions (T ) 398.15 K, p ) 1 atm). e Modified structure, cf. Methods.
specificity,1,3,14 CYP3A4 is the most promiscuous CYP known and has a large active site,1,3 while CYP2C9 is intermediate in these respects.8,15–18 Methods The MD simulations presented here describe the behavior of the enzymes’ resting states,19 a characteristic feature of which is a low-spin hexacoordinated heme FeIII, in which the iron coordination sphere is formed by four nitrogen atoms of the porphyrin ring, with the Cys sulfur anion and oxygen of water molecule located in axial positions. We have developed force field parameters of the nonstandard heme residue to model the enzymes’ resting state following the Cornell et al. procedure.21 The heme cofactor is hence represented in all simulations with a Cys residue covalently bound to the heme iron and a water molecule in the second axial position (Figure 1). The starting geometry of the heme residue heavy atoms for QM calculation was taken from the X-ray structure 1OG2, and missing hydrogen atoms were added using Titan 6.0 software. Tilting geometry19 of the axial water molecule was used in the heme residue starting geometry. The system described was optimized at the UHF/ LACVP* level as a dianion in the doublet state using the Titan 6.0 (with Jaguar 3.5) software. The ESP charges were calculated over the optimized geometry by Gaussian 03 software20 using the UHF/6-31G* method, and further RESP charges were developed from the ESP charges.21 The nonstandard force field parameters were taken from the literature.22–24 A 10 ns long NpT (T ) 298.15 K, p ) 1 atm) simulation of the nonstandard residue capped by acetyl and N-methyl residues documents that the heme geometry is well maintained. The Supporting Information contains the parameters for the heme used in the MD simulations described below.
MD simulations of all CYPs studied (Table 1) were carried out using the AMBER 8.0 package25 with the parm99 force field.26 Starting structures of CYP2C9ch (1OG2),15 CYP3A4 (1TQN),18 and CYP2A6 (1Z10)13 were taken from the PDB database. The 1OG2 structure of CYP2C9 represents a chimera (CYP2C9ch) because seven additional mutations in the F/Gloop (K206E, I215V, C216Y, S220P, P221A, I222L, and I223L) have been inserted in the primary sequence to enhance crystallization.15 The inserted mutations, however, reportedly do not significantly affect the catalytic activity of CYP2C9 and it is believed that they do not significantly alter its structure.15,27 On the other hand, our classical MD simulations show small but significant differences in the access path preferences, which could be significant in the context of this work (see below). Therefore we simulated both CYP2C9ch and CYP2C9 (wt sequence) structures. The N-terminal transmembrane helices were truncated in all structures. We used well-established MD simulation protocols28 as follows. Initially, the protonation states of all histidines were checked by WHATIF29 to create an optimal H-bond network. Then, all hydrogens were added using the Leap program from the AMBER 8.0 package. The structures were neutralized by adding six Cl- (CYP2A6), three Na+ (CYP2C9) and six Cl- (CYP3A4) counterions. Each system was inserted in a rectangular water box in which every protein atom was at least 10 Å from the nearest edge of the box. Then, each system was minimized prior to the production part of the MD run in the following way. The protein was frozen, and the solvent molecules with counterions were allowed to move during a 1000-step minimization followed by a 2 ps long MD run under NpT conditions (i.e., p ) 1 atm, T ) 298.15 K). Then, the sidechains were relaxed by several consequent minimizations with decreasing force constants applied to the backbone atoms.
Flexibility of Human Cytochromes P450 After the relaxation, the system was heated to 250 K for 10 ps and then up to 298.15 K for 40 ps. The particle-mesh Ewald (PME) methods for treating electrostatic interaction were used. All simulations were run under periodic boundary conditions in the NpT ensemble at normal (298.15 K) or high (398.15 K) temperatures and a constant pressure of 1 atm using a 2 fs time integration step. The SHAKE algorithm with a tolerance of 10-5 Å was applied to fix all bonds containing hydrogen atoms. The 9 Å cutoff was applied to treat the nonbonding interactions. Coordinates were stored every two picoseconds. Each of the studied systems (CYP2C9ch, CYP2C9, CYP2A6 and CYP3A4) comprises ∼50 000 atoms. Table 1 summarizes details about the production phases of the studied systems. The starting structures for the high temperature (HT, 398.15 K) simulations were taken from the equilibrated simulations at the normal temperature (NT, 298.15 K). MD trajectories were analyzed by an essential dynamics (ED) technique,30 which separates large concerted structural rearrangements from irrelevant fluctuations in a multiparticle system. From the MD trajectory the covariance matrix of the positional fluctuations of the CR carbon atoms is built up and diagonalized. The procedure yielded new axes (eigenvectors), representing the directions of the concerted motions, and eigenvalues for their respective displacement eigenvectors.30 The eigenvectors are directions in a 3N-dimensional space (where N is the number of atoms), and the motion along a single eigenvector corresponds to concerted displacement of groups of atoms in the Cartesian space. This characterizes the essential subspace of the internal dynamics of a protein. The motions in the essential space are linked to the biological function of the protein. Temperature B-factors were calculated from MD simulation using the ptraj module of the AMBER package. ED analyses of MD trajectories were performed using the GROMACS suite of programs.31 Positions of the structural water molecules were identified by the method described by Krˇízˇ et al.,32 as implemented in RETINAL software (Petr Kulha´nek, NCBR Brno, http://troll.chemi.muni.cz/whitezone/development/root/). The CAVER33 and MOLE34 tools were used to monitor flexibility and preferences for channels connecting the buried active sites with surrounding solvent. These software tools have been developed to explore, automatically, molecular channels in static structures and MD trajectories. Wade’s nomenclature was adopted to describe the found access/egress channels.7 Results MD Simulations at Normal Temperature (NT, 298.15 K). Flexibility. B-factors (temperature factors), calculated from the positional variations (in the last 3 ns), reflect the flexibility of a studied system. B-factors calculated from MD simulations include both rapid thermal atomic fluctuations as well as longrange motion. Fortunately, the latter can be separated by using techniques such as ED. To distinguish between these two scales of flexibility, we use the terms “fluctuation” and “malleability” when discussing B-factors and long-range motions, respectively. CYP2A6 fluctuations are localized mainly to the G′/G, G/H, H/I loops, in accordance with deductions from X-ray analysis.13 However, the MD simulations indicate that the fluctuations in the N-terminal loops and F′/G′ region are greater than the X-ray data suggest (Figure 2). This appears to be due to extensive contacts in this and adjacent regions lowering the fluctuations in the crystallized protein. CYP2C9 fluctuations are spread over the B′/C, C/D, G′/G, G/H, H/I loops, the meander region, and the β31/ β32 loop (residues 463-466). These flexible regions (FRs) are similar to those identified from X-ray analysis15
J. Phys. Chem. B, Vol. 112, No. 27, 2008 8167 (Figure 2). CYP3A4 fluctuates significantly in the D/E, E/F, G′/G, F/H, β21/β22 (residue, 381-385) and β31/β32 (residues 468-470) loops and the meander. As in the previous cases, these regions agree well with malleable regions identified from the X-ray analysis,18 except for the F′, G′, and adjacent regions (namely the N-terminal loops), where crystal contacts lower the flexibility of these regions in the crystallized protein (Figure 2). In general, the structural core of the protein formed by helices D, E, I, J, K is rigid in all of the studied systems. The proximal side (namely the C helix, the Cys-pocket binding the heme cofactor, the N-terminal part of the K helix, and the loop preceding it) of the CYPs’ structures, where interactions with redox-partners are expected,35 is also less flexible than the other solvent-exposed regions. From the comparison of these three systems, one can identify 10 common FRs with enhanced fluctuations (Figure 3). FR1 is formed by the N-terminal domain loops (residues 30-93, excluding rigid β1 core strands and residues 375-390 contributing to the β1 sheet, according to CYP2C9 residue numbering), which is expected to be partially immersed in the microsome membrane.36,37 FR2, FR3, and FR4 correspond to the B′/C, C/D and D/E loops, respectively. FR5 is spread over residues (E/F and G/H loops) joining the entire F/G segment (F, F′, G′, and G helices with connecting loops) to the rest of the structure. FR6 coincides with the G′/G loop. FR7 comprises the highly fluctuating H/I loop. FR8 is formed by the J/J′ loop. FR9 corresponds to the meander and adjacent residues (residues 395-421). FR10 corresponds to the loop following the β31 strand (residues 460-475). The essential motions occur in the distal side regions that were identified as the malleable regions (Supporting Information, Figure S1). ActiWe Site SolWation. The active sites of CYP2C9 and CYP3A4 are sufficiently large to accommodate tens of water molecules, but they contain only a few of them in their crystal structures (six in CYP2C9 and 17 in CYP3A4), while the CYP2A6 active site contains 14, most of which are located in the access path, e.g., eight along the pw2a channel (cf. Supporting Information, Figure S2). The lower numbers of water molecules than expected inside the CYP2C9 and CYP3A4 active sites are due to the low quality of electron maps, which do not allow more water molecules to be modeled in them. The MD simulations show that water molecules readily enter the active sites, and the number of active site water molecules becomes constant after ∼1.5 ns in all three CYPs. The active sites of the CYP2C9ch, CYP2C9, and CYP3A4 systems each accommodate ∼27-29 water molecules, and CYP2A6 accomodates ∼22. These numbers have been identified by visual inspection since it is difficult to clearly define the active site size (cf. refs 6 and 38). The number of active site water molecules does not significantly differ among the 2C9 and 3A4 systems, in general accordance with the recent study by Rydberg et al.38 On the other hand, these authors concluded from their MD simulations that the CYP2C9 active site cavity contains 58 waters, and the CYP3A4 cavity houses 57 water molecules; almost twice as many as we identified. It should be emphasized that these differences are due to differences in the definition of the active site cavities. Nonetheless, these studies clearly show that the CYP2C9 and CYP3A4 active sites are capable of housing large numbers of water molecules and that water molecules can readily enter and fill the active site, preferably via the solvent channel, as described below. The numbers of the active site water molecules also reflect the relative sizes of the active site volumes of the CYP2C9, CYP3A4, and CYP2A6 systems.
8168 J. Phys. Chem. B, Vol. 112, No. 27, 2008
Skopalı´k et al.
Figure 2. Thermal fluctuations mapped onto structures of discussed CYPs. The thick, red tubes represent regions with the highest B-factors, while thin, blue regions are rigid. Structures based on X-ray data and MD simulations at normal (298 K) and high (398 K) temperatures are shown in the left, middle, and right columns, respectively. The thermal fluctuations from NT and HT MD simulations are not comparable in absolute values because two different scales are used for NT and HT B-factors. The HT B-factors are about 1 order of magnitude higher than the NT B-factors. Some important secondary structure elements are labeled by capital letters.
Figure 3. Left: Five plastic regions (PR1-PR5) identified by Zhao et al.9 from crystal structures of CYP2B4 projected onto the structure of CYP3A4 (1TQN). Right: Ten FRs (FR1-FR10) identified in the MD simulations; the FR3 region is hidden behind the heme represented by cyan spheres.
The MD simulations presented here show that the water molecules readily enter the active site not only during the equilibration (∼1.5 ns), but also throughout the entire simulations, in which intensive exchange of water molecules between the active site and bulk solvent occurs (through the solvent
channel). Analysis of the last 2.5 ns of each simulation shows that 26 water molecules enter and 20 leave the CYP2C9ch active site, while 16 enter and 18 leave the CYP2C9 active site through the solvent channel. Thirteen additional water molecules enter and 19 leave the CYP2C9ch active site via the pw2e channel
Flexibility of Human Cytochromes P450 (see ref 7 for the nomenclature used), while eight enter and four leave the CYP2C9 active site via the pw2c channel. In CYP3A4, 16 water molecules enter and four leave the CYP3A4 active site via the solvent channel, while seven enter and two leave the active site via the pw2e channel during the equilibration. During the last 2.5 ns, exchange of water molecules occurs via the pw2e (10 in, eight out), pw2a (six in, five out) and solvent (two in, two out) channels. These differences are due to a shift of the F/F′ loop and reconformation of the L352 side chain, leading to a reduction in the radius of the solvent channel. In the CYP2A6 molecule, which is expected to be inaccessible to X-ray analysis,7 inspection of the water molecules’ traffic indicates that the solvent channel is sufficiently open to enable exchange between the active site and the bulk solvent (five water molecules in, five out). The opening of the solvent channel is caused by a shift of the F209 side chain (cf. Supporting Information, Figure S3). Dynamics of the Access/Egress Path to the ActiWe Site. The CYP active sites are occluded cavities in the enzymes’ interiors that are connected to the outside environment by several access/ egress paths, the nomenclature (cf. Supporting Information, Figure S2) and biological relevance of which have been recently reviewed by Cojocaru et al.7 Information about the accessibility of the active sites originates from crystal structure analyses, which offer, unfortunately, only a static view. In this study, therefore, the channels connecting the buried active site to the bulk solvent were extensively analyzed using MD simulations. The preferred CYP2C9ch routes are the solvent channel followed by the pw2e channel (with 90% and 10% probabilities, respectively, of being the channel used, i.e., the most geometrically convenient path; for details see ref 33). For the CYP2C9 system, the solvent channel is the preferred one (73%), followed by the pw2c (26%) and pw2e (1%) channels. The differences in the channel profiles between CYP2C9ch and CYP2C9 are not surprising, since residues forming the abovementioned channels are close to the F/G-loop region where the mutations facilitating the CYP2C9 crystallization were inserted.15,16 Some differences in channel profiles are also apparent from the static X-ray crystal structures with (1OG2 and 1OG5) and without (1R9O) mutations. The solvent channel of CYP2C9ch has a characteristic bottleneck formed by the I205, S209, E300, and V479 residues, with a radius of 1.75 Å (1OG2) or 1.93 Å (1OG5). The bottleneck radius of the CYP2C9 (1R9O) solvent channel is equal to 2.15 Å. In CYP2C9ch, pw2c is the second most preferred channel, after the solvent channel, with a 1.57 Å (1OG2) or 1.50 Å (1OG5) bottleneck formed by the N107, R108, V237, and K241 residues. In contrast, for CYP2C9, the second most preferred channel is pw2e (F100, L102, E104, and F114, r ) 1.89 Å), while the pw2c channel is narrower, with a 1.36 Å radius. It should be noted here that the pw2e channel is narrower than the pw2c channel in the CYP2C9ch X-ray crystal structures (1OG2 and 1OG5), but it becomes larger than pw2c during the MD simulation. An opposite situation occurs in the MD simulation of CYP2C9, since pw2e is significantly larger than pw2c in the X-ray crystal structure, but the pw2c channel is larger than pw2e in the MD simulation. These findings show that channel preferences deduced from static structures may differ from those drawn from MD simulations, which should theoretically better reflect the real situation. Further notable findings are the differences in preferences of the pw2e vs pw2c channels in the CYP2C9ch and CYP2C9 structures caused by the mutations in the F/G region. In CYP3A4, after equilibration of the structure, the pw2e channel is the preferred path (71%) followed by the pw2a
J. Phys. Chem. B, Vol. 112, No. 27, 2008 8169 channel (27%), in contrast to the situation at the beginning of the MD simulation and predictions based on the X-ray structures, which indicate that the solvent channel is the preferred path. This appears to be due to the significant reduction in the solvent channel radius due to the conformational changes described above (shift of the F/F′ loop and reconformation of the L352 side chain). The CYP2A6 X-ray structure seems to be closed7 because the bottleneck diameters of the access paths identified by MOLE are smaller than 1.4 Å. MOLE found channel 4 to be the most convenient path, with a bottleneck diameter of 1.4 Å (formed by F107, I208, and F209). The second most preferred path was the solvent channel, but it was quite narrow, with a bottleneck (formed by F209, T305, and I366) diameter of 0.7 Å. However, during the MD simulation, the solvent channel opens sufficiently to enable solvation of the active site, as described above. The solvent channel is also the most preferred path (97%) during the entire MD simulation, followed by a minor contribution from the pw2e channel (3%). MD Simulations at High Temperature (HT, 398.15 K). The MD simulations at 398.15 K were used for more extensive conformational sampling of the studied CYPs. The need for thorough sampling of CYP conformational spaces has been highlighted as a prerequisite for understanding CYP-ligand interactions and for modeling studies of high predictive power.9 The increase in the temperature allows energetic barriers to be overcome and simultaneous monitoring of conformational changes, which would not be observable under NT at available simulation timescales. Such HT simulations are routinely used for enhanced conformational sampling, e.g., in protein (un)folding studies.39 Conformational Changes. Comparison of the root-meansqaure deviations (rmsds) of the backbone (CR, C, N) atoms between structures at the beginning (X-ray-based) and end of the simulations shows (Table 1) that minor, moderate, and more pronounced conformational changes occur in the CYP2A6 (rmsd ) 3.0 Å); CYP2C9 (rmsd ) 3.8 Å), and CYP3A4 (rmsd ) 4.1 Å) systems. The overall fold of the CYP2A6 structure is well preserved after 10 ns of the HT simulation, so no unfolding of the structure is initiated within this time. As shown in Figure 4, the largest conformational changes, with standard deviations exceeding 4 Å (an arbitrarily selected threshold), of CYP2A6 occur in the N-terminus, B′/C, C/D, D/E, E/F, G′/G, G/H, H/I loops, the loop following the K helix, and the β31/β32 loop (Figure 4). These regions agree well with the FRs identified from the MD simulation at NT. The observed conformational changes also affect the preferred egress paths found by the MOLE software. The solvent channel is the preferred path in the CYP2A6 starting structure, but, at the end of the simulation, the pw2c channel has become the preferred egress path, while the solvent channel is closed due to a shift of the β31/β32 loop (residues 464-487). At the end of the HT MD simulation, the overall fold of the CYP2C9 structure is well preserved, except for the first nine N-terminal residues, which are shifted by ∼12 Å. The changes in the CYP2C9 structure are distributed in the N-terminal domain, C/D, E/F, F/F′, G′/G, G/H, H/I loops, the loop following the K helix, the β21/β22 loop, and the meander (Figure 4). As in the previous case, these regions correspond well to the identified FRs, and the conformational changes affect the egress path preferences. In the starting structure, the solvent channel is preferred, and the pw2b channel is closed, but the shifts of the F helix C-terminus and F/F′ loop close the solvent channel
8170 J. Phys. Chem. B, Vol. 112, No. 27, 2008
Skopalı´k et al.
Figure 4. Plots of distances of corresponding CR atoms between initial and terminal structures taken from the HT simulations for each structure, displayed in the panels beside them. The secondary structure elements are depicted by black boxes above the x-axis; higher boxes represent R-helices (labeled by uppercase letters), and the smaller ones represent β-strands. Bars at the top of each plot represent the FR regions (cf. Figure 3); the red FRs are split into two fragments of primary structure). The right column includes data for the CYP structures discussed (in similar orientation to that used in Figures 1–3). Thick, red tubes represent regions with high distance differences between CR atoms in the initial and terminal structures, while thin, blue lines indicate regions with smaller differences.
and the pw2b opens, due to a shift (∼5 Å) of the loop between residues 68-72, during the simulation. The CYP3A4 structure exhibits the largest conformational changes among the systems studied during the HT simulations. The rigid protein core is well preserved, but the FRs are highly malleable. The conformational changes occur mainly in the N-terminus, B/B′, D/E loops, middle section of the D-helix, the entire F-G segment, the H/I, J/J′, β21/β22, β31/β32 loops, and the loop between residues 418 and 430 (Figure 4). These identified regions agree with FR regions identified from the NT MD simulations, except for the surprisingly low malleability of FR3 (B′/C loop) and FR4 (D/E) in comparison to the
flexibility identified from NT simulations. On the other hand, the malleability of the F/G segment appears to be far greater in the HT simulation than expected according to the flexibility deduced from the NT simulation. In addition to the great malleability of the entire E/G segment of CYP3A4 (beginning at the E/F and ending at the G/H loop), another interesting feature of the HT simulation is the shift of the F/F′ loop, resembling the rearrangement observed in the CYP3A4 structure when erythromycin binds (Figure 5). The E/G-segment, where the largest conformational changes occur, agrees well with the structurally FR identified from X-ray structures by Ekroos and Sjo¨gren.8 During the simulation, a shift
Flexibility of Human Cytochromes P450
Figure 5. The structure of the CYP3A4 F/G-region according to X-ray analysis of the relaxed (starting) state of the region of substrate-free CYP3A4 (1TQN, light blue), the structure taken after 2 ns of the HT MD simulation (gray), and the X-ray structure of CYP3A4 with bound erythromycin (2J0D, dark blue). The conformational shift in the F/F′ loop induced by the presence of the large ligand erythromycin (light blue vs dark blue) is similar to a conformational change observed in the HT simulation (arrowed), which leads (in both cases) to a dramatic increase in the active site volume.
of the F/F′ loop was observed from the conformation of 1TQN18 (starting structure) to the conformation resembling the F/F′ loop conformation proposed in CYP3A4 structures with bound ligands (2J0D).8 This motion reflects the high plasticity of the region and the shifts that can be induced either by substrates or by higher temperature. In all three cases the smallest conformational changes occurred in the enzyme heme binding cores and the proximal sides of the enzyme surfaces (namely the C helix, the hemebinding Cys pocket, the N-terminal part of the K helix, and the loop preceding it), where interactions with redox partners are likely to occur.35 Discussion Ekroos and Sjo¨gren recently suggested that the flexibility of microsomal CYP structures may correlate with their substrate promiscuity,8 on the basis of conformational differences between the X-ray structures of CYP3A4 and CYP2B4. The suggestion is also consistent with earlier experimental data indicating that these enzymes have rather flexible active sites.40 The X-ray structures of microsomal CYPs have considerably advanced our understanding of their structural features, and their effects on the enzymes’ functions. However, X-ray structures cannot give a complete picture of CYP flexibility because X-ray structural analysis is inherently limited by shortcomings that may severely affect the interpretation of molecular flexibility. The results obtained here clearly show that crystal contacts affect the molecular flexibility indices. The comparison of B-factors demonstrates that, in two cases (CYPs 2A6 and 3A4), the flexibility data obtained from X-ray analyses of the region that is probably important for substrate entry (the N-terminal domain and the F′/G′ region) provide lower estimates of flexibility than those estimated from MD simulations, because of the presence of large crystal contacts in those regions. The comparison of flexibilities of the three studied systems enabled 10 common FRs and the rigid core to be identified (Figure 3). The identified FRs redefine, broaden, and more precisely delimit the set of plastic regions (PRs) proposed by Zhao et al.9 from an X-ray structure analysis of rabbit CYP2B4 structures (cf. Figure 3). Both FR1 and PR1 span the N-terminal part of the CYP structure, where interactions with membranes are expected to occur. FR1 is somewhat larger than PR1, probably because the identification and delimitation of PR1 was affected by crystal contacts, as discussed above. In this region, lower flexibility has been found by Ekroos and Sjo¨gren8 from
J. Phys. Chem. B, Vol. 112, No. 27, 2008 8171 their X-ray analyses of CYP3A4 structures. However, this conflicts with the results of the MD simulations presented here, although it should be emphasized that a large proportion of crystal contacts occur in the region, which may significantly reduce its flexibility. On the other hand, the flexibility of this region calculated from the MD simulations can differ from flexibility under physiological conditions because of the model used where the N-terminal helix (membrane anchor) is missing and no explicit interaction with a membrane is considered. The PR2 region (including the B′ helix, B′/C loop, C helix, and C/D loop) has been split here into two FRs, designated FR2 and FR3, as a result of the low flexibility and malleability of the C helix in the middle of the original PR2 region. The low flexibility of the C helix has been also reported by Ekroos and Sjo¨gren from their X-ray analyses of CYP3A4 structures.8 The PR3 region, containing the C-terminus of the E helix, is part of FR5, which involves the residues forming the hinge of the F/G segment, i.e., the E/F and G/H loops. PR4 includes the entire F/G segment, up to the N-terminal part of the I helix. We feel it is appropriate to split and redefine this region into three FRs: FR5 (the E/F and G/H loops); FR6, including the G′ helix and G′/G loop; and FR7, containing the H/I loop. This division has been prompted by findings that the flexibility and malleability of the F, F′ and G helices are not common features of all studied CYPs. The last PR, PR5, is identical to FR10. In general, the identified FRs are spread over solvent-exposed loops, mainly on the distal CYP side. In contrast to the FRs, the protein core formed by helices D, E, I, J, and K, the loop following the K helix, and the L helix, constitute a relatively rigid core. The proximal side of all CYPs structures studied (namely the C helix, the heme-binding Cys pocket, the Nterminal part of the K helix, and the loop preceding it) is where interactions with redox partners are expected to occur, and is generally less flexible than the distal side. The HT (398 K) MD simulations were used to study the conformational flexibility and malleability of the three studied CYPs. The malleable regions identified from the conformational changes during the course of the HT simulation agree well with the FRs identified from the MD simulations at NT. This implies that analyses based on NT simulations can be used to identify potential regions where large conformational changes may occur. The MD simulations at both temperatures showed that the CYP2A6 system is the least flexible of the studied systems, CYP2C9 is more flexible, and CYP3A4 is both the most flexible and the most malleable. These observations may be related to their substrate promiscuities; CYP2A6 having the narrowest substrate specificity, CYP3A4 being the most promiscuous of known liver microsomal CYPs, and CYP2C9 being intermediate in this respect. We can conclude that the structural malleability of hepatic microsomal CYPs (i.e., their readiness to undergo conformational changes) may significantly influence their spectrum of metabolized substrates. It should be pointed out that the price of large flexibility is instability of the structure, which goes hand-in-hand with high susceptibility to denaturation (as demonstrated for the CYP3A4 system 40). CYP3A4 shows enhanced malleability throughout the entire F/G region and its adjacent loops. The first X-ray structures of CYP3A4 17,18 prompted questions about how the active site could accommodate large substrates (e.g., cyclosporine, which has an Mr of 1.203 kDa) since the active site volume was smaller than expected. This question has been answered by publication of the X-ray structure of CYP3A4 bound to the large ligand, erythromycin, by Ekroos and Sjo¨gren8 (Mr ) 734 Da), which shows that CYP3A4 undergoes a dramatic conformational
8172 J. Phys. Chem. B, Vol. 112, No. 27, 2008
Skopalı´k et al. channel serves as a water highway to/from the active site. The HT simulations also show that the solvent channel is easily closed by the conformational changes described above, and the pw2 channels then open widely. In general, the access path preferences and their profiles are strongly affected by the conformational flexibility of the F/G segment and adjacent regions. It should be emphasized that these results come form MD simulations of unliganded CYP forms and, in principle, ligands can induce channel opening and also distinguish channels according their chemical nature. Clearly, further studies of possible substrate access paths and product release to/from the CYP active sites are warranted, using specialized simulation techniques, such as steered MD42 or random expulsion MD.43 Conclusions
Figure 6. CYP3A4 structure dissected perpendicularly to the heme to show the flexibility of the protein shell. The CYP shell is colored according to decreasing flexibility from red, through orange, yellow and green to blue. The heme is depicted in magenta, and the PS and DS labels stand for the proximal and distal side, respectively. The sites of expected immersion of the CYP into the membrane and its interactions with redox partners (CPRs), as well as the most likely routes of (lipophilic) substrate entrance and (hydrophilic) product release are schematically shown according to the notation of Wade et al.44
change in the above-mentioned F/G region. This conformational change results in an >80% increase in the volume of the active site. The HT MD simulation presented here confirms the high malleability of the F/G region and also shows a conformational change in the F/F′ loop resembling the above-mentioned change identified in the X-ray structure of the CYP3A4-erythromycin complex. The movement in this region observed in the HT MD simulations also leads to a significant increase in the active site volume (Figure 5). The active site volumes estimated by CASTp41 changed from ∼1000 Å3 to ∼4000 Å3. This change is almost certainly an overestimate due to the difficulties discussed above in defining the active site boundaries.38 However, it provides at least qualitative evidence of a large increase in the active site volume induced by the conformational shift of the F/F′ loop. MD data show that, topologically, CYPs have a rigid region and other regions with both low and high malleability (Figure 6). The CYP heme-binding core, where the chemical reaction takes place, is relatively rigid. The proximal side, where the interaction with redox partners occurs, is moderately malleable, i.e., sufficiently malleable to accommodate the redox partners, but sufficiently rigid to enable electron transfer. The distal side, where interaction with the membrane, substrates, and probably other enzymes occurs, is highly malleable as it also contributes to the substrate specificity of the CYPs. These topological features allow easy substrate entrance and the release of products to/from the active site, due to the malleability of the distal side (without disrupting the active site core arrangement9 and/or interaction with the redox partners). Thus the organization of three regions with major differences in malleability allows both the broad substrate promiscuity and the high stereo- and regiospecificity of CYPs. An important question to address is how substrates enter and products leave the buried CYP active site. Hints are provided by analyses of the active site access paths7 by the recently developed tool MOLE.34 The NT MD simulations of the unliganded CYPs helped to identify the solvent channel as the most strongly preferred (i.e., geometrically most convenient) path from the active sites, followed by the pw2e and pw2c paths. Analysis of the water molecules’ traffic shows that the solvent
All atomic MD simulations in explicit solvent at NT and HT were used to investigate the flexibility and malleability of three pharmacologically important human hepatic CYPs: CYP3A4, CYP2C9, and CYP2A6. The obtained data were compared with available experimental information. Comparison of MD-based B-factors (describing the atomic fluctuations) with X-ray data shows that the crystal contacts lowers the B-factors in the regions where they occur. Consequently, the conclusions about CYP flexibility based solely on the X-ray data might be misleading. The MD simulations provided also new insight into the flexibility and malleability of the systems studied. Comparison of flexibility patterns of the three CYPs enabled to identify the common features and variations in flexibility of the studied CYPs. The previously described pattern of flexibility in topological elements of microsomal CYPs, i.e., a rigid hemebinding core, a malleable distal side, and intermediately flexible proximal side, was confirmed. Such a topological feature provides an important combination of high stereo- and regiospecificity, which is mediated by the relative rigidity in the neighborhood of the heme, together with high substrate promiscuity due to the more flexible active site and the malleability of the distal side. The data acquired here show that the malleability of the three studied CYPs correlates with their substrate specificity: CYP2A6 has a narrow substrate range and is the most rigid, CYP3A4 is the most promiscuous CYP known and is the most malleable, and CYP2C9 is intermediate in terms of both its substrate specificity and malleability. Thus, the malleability of CYPs is probably a major determinant of their substrate specificity. Acknowledgment. Support through the GACR Grant No. 305/06/P139, and the MSMT (Ministry of Youth, Sports and Education, Czech Republic) Grants LC512, 1P05OC050 (COST 861.001) and MSM6198959216 is gratefully acknowledged. Supporting Information Available: (1) Parameters for the heme residue, (2) essential motions of studied CYPs, (3) the nomenclature for CYP’s access paths, and (4) changes in the CYP2A6 solvent channel. This material is available free of charge via the Internet at http://pubs.acs.org. References and Notes (1) Ortiz de Montellano, P.R., Ed. Cytochrome P450: Structure, Mechanism, and Biochemistry, 3rd ed.; Kluwer Academic: New York, 2005. (2) Evans, W. E.; Relling, M. V. Science 1999, 286, 487–491. (3) Anzenbacher, P.; Anzenbacherova, E. Cell. Mol. Life Sci. 2001, 58, 737–747. (4) Guengerich, F. P. Chem. Res. Toxicol. 2001, 14, 611–650. (5) Rendic, S. Drug. Metab. ReV. 2002, 34, 83–448. (6) Otyepka, M.; Skopalik, J.; Anzenbacherova, E.; Anzenbacher, P. Biochim. Biophys. Acta 2007, 1770, 376–389.
Flexibility of Human Cytochromes P450 (7) Cojocaru, V.; Winn, P. J.; Wade, R. C. Biochim. Biophys. Acta 2007, 1770, 390–401. (8) Ekroos, M.; Sjo¨gren, T. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 13682–13687. (9) Zhao, Y. H.; White, M. A.; Muralidhara, B. K.; Sun, L.; Halpert, J. R.; Stout, C. D. J. Biol. Chem. 2006, 281, 5973–5981. (10) Scott, E. E.; He, Y. A.; Wester, M. R.; White, M. A.; Chin, C. C.; Halpert, J. R.; Johnson, E. F.; Stout, C. D. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 13196–13201. (11) Muralidhara, B. K.; Negi, S.; Chin, C. C.; Braun, W.; Halpert, J. R. J. Biol. Chem. 2006, 281, 8051–8061. (12) Malaiyandi, V.; Sellers, E. M.; Tyndale, R. F. Clin. Pharmacol. Ther. 2005, 77, 145–158. (13) Yano, J. K.; Hsu, M. H.; Griffin, K. J.; Stout, C. D.; Johnson, E. F. Nat. Struct. Mol. Biol. 2005, 12, 822–823. (14) Sansen, S.; Hsu, M.-H.; Stout, C. D.; Johnson, E. F. Arch. Biochem. Biophys. 2007, 464, 197–206. (15) Williams, P. A.; Cosme, J.; Ward, A.; Angova, H. C.; Vinkovic, D. M.; Jhoti, H. Nature 2003, 424, 464–468. (16) Wester, M. R.; Yano, J. K.; Schoch, G. A.; Yang, C.; Griffin, K. J.; Stout, C. D.; Johnson, E. F. J. Biol. Chem. 2004, 279, 35630–35637. (17) Williams, P. A.; Cosme, J.; Vinkovic, D. M.; Ward, A.; Angove, H. C.; Day, P. J.; Vonrhein, C.; Tickle, I. J.; Jhoti, H. Science 2004, 305, 683–686. (18) Yano, J. K.; Wester, M. R.; Schoch, G. A.; Griffin, K. J.; Stout, C. D.; Johnson, E. F. J. Biol. Chem. 2004, 279, 38091–38094. (19) Meunier, B.; de Visser, S. P.; Shaik, S. Chem. ReV. 2004, 104, 3947–3980. (20) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Zakrzewski, V. G.; Montgomery, J. A.; Stratmann, R. E.; Burant, J. C.; Dapprich, S.; Millam, J. M.; Daniels, A. D.; Kudin, K. N.; Strain, M. C.; Farkas, O.; Tomasi, J.; Barone, V.; Cossi, M.; Cammi, R.; Mennucci, B.; Pomelli, C.; Adamo, C.; Clifford, S.; Ochterski, J.; Petersson, G. A.; Ayala, P. Y.; Cui, Q.; Morokuma, K.; Malick, D. K.; Rabuck, A. D.; Raghavachari, K.; Foresman, J. B.; Cioslowski, J.; Ortiz, J. V.; Stefanov, B. B.; Liu, G.; Liashenko, A.; Piskorz, P.; Komaromi, I.; Gomperts, R.; Challacombe, M.; Gill, P. M. W.; Johnson, B. G.; Chen, W.; Wong, M. W.; Andres, J. L.; Head-Gordon, M.; Replogle, E. S.; Pople, J. A., Gaussian03., Gaussian, Inc.: Pittsburgh, PA, 2004. (21) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Kollman, P. A. J. Am. Chem. Soc. 1993, 115, 9620–9631. (22) Autenrieth, F.; Tajkhorshid, E.; Baudry, J.; Luthey-Schulten, Z. J. Comput. Chem. 2004, 25, 1613–1622. (23) Oda, A.; Yamaotsu, N.; Hirono, S. J. Comput. Chem. 2005, 26, 818–826.
J. Phys. Chem. B, Vol. 112, No. 27, 2008 8173 (24) Favia, A. D.; Cavalli, A.; Masetti, M.; Carotti, A.; Recanatini, M. Proteins: Struct., Funct., Bioinf. 2006, 62, 1074–1087. (25) Case, D. A.; Cheatham, T. E.; Darden, T.; Gohlke, H.; Luo, R.; Merz, K. M.; Onufriev, A.; Simmerling, C.; Wang, B.; Woods, R. J. J. Comput. Chem. 2005, 26, 1668–1688. (26) Wang, J. M.; Cieplak, P.; Kollman, P. A. J. Comput. Chem. 2000, 1049–1074. (27) Seifert, A.; Tatzel, S.; Schmid, R. D.; Pleiss, J. Proteins: Struct., Funct., Bioinf. 2006, 64, 147–155. (28) Otyepka, M.; Bartova, I.; Kriz, Z.; Koca, J. J. Biol. Chem. 2006, 281, 7271–7281. (29) Vriend, G. WHAT IF, 5.0; EMBL: Heidelberg, 1997. (30) Amadei, A.; Linssen, A. B. M.; Berenden, H. J. C. Proteins: Struct., Funct., Genet. 1993, 17, 412–425. (31) Spoel, D. V. D.; Buuren, A. R. V.; Apol, E.; Meulenhoff, P. J.; Tieleman, P. D.; Sijbers, A. L. T. M.; Hess, B.; Feenstra, K. A.; Lindhal, E.; Drunen, R. V.; Berendsen, H. J. C. GROMACS, University of Groningen: Groningen, 19912002. (32) Krˇı´zˇ, Z.; Otyepka, M.; Bartova´, I.; Koca, J. Proteins: Struct., Funct., Genet. 2004, 55, 258–274. (33) Petrek, M.; Otyepka, M.; Banas, P.; Kosinova, P.; Koca, J.; Damborsky, J. BMC Bioinformatics 2006, 7, 316. (34) Petrek, M.; Kosinova, P.; Koca, J.; Otyepka, M. Structure 2007, 15, 1357–1363. (35) Bridges, A.; Gruenke, L.; Chang, Y. T.; Vakser, I. A.; Loew, G.; Waskell, L. J. Biol. Chem. 1998, 273, 17036–17049. (36) Cosme, J.; Johnson, E. F. J. Biol. Chem. 2000, 275, 2545–2553. (37) Williams, P. A.; Cosme, J.; Sridhar, V.; Johnson, E. F.; McRee, D. E. Mol. Cell 2000, 5, 121–131. (38) Rydberg, P.; Rod, T. H.; Olsen, L.; Ryde, U. J. Phys. Chem. B 2007, 111, 5445–5457. (39) Daggett, V. Chem. ReV. 2006, 106, 1898–1916. (40) Anzenbacherova, E.; Bec, N.; Anzenbacher, P.; Hudecek, J.; Soucek, P.; Jung, C.; Munro, A. W.; Lange, R. Eur. J. Biochem. 2000, 267, 2916–2920. (41) Binkowski, T. A.; Naghibzadeh, S.; Liang, J. Nucleic Acids Res. 2003, 31, 3352–3355. (42) Li, W. H.; Liu, H.; Luo, X. M.; Zhu, W. L.; Tang, Y.; Halpert, J. R.; Jiang, H. L. Drug. Metab. Dispos. 2007, 35, 689–696. (43) Ludemann, S. K.; Lounnas, V.; Wade, R. C. J. Mol. Biol. 2000, 303, 797–811. (44) Wade, R. C.; Motiejunas, D.; Schleinkofer, K.; Sudarko; Winn, P. J.; Banerjee, A.; Kaniakin, A.; Jung, C. Biochim. Biophys. Acta 2005, 1754, 239–244.
JP800311C