Theory and Procedures for Finding a Correct Kinetic Model for the

The deduced complete kinetic model for the BR photocycle is thus either a ..... as FSIT model), as in The closer the τ's become, the greater will be ...
0 downloads 0 Views 117KB Size
J. Phys. Chem. B 2001, 105, 3319-3328

3319

Theory and Procedures for Finding a Correct Kinetic Model for the Bacteriorhodopsin Photocycle Richard W. Hendler,*,† Richard I. Shrager,‡ and Salil Bose† Laboratory of Cell Biology, National Heart, Lung, and Blood Institute, and Mathematical and Statistical Computing Laboratory, CIT, National Institutes of Health, Bethesda, Maryland, 20892 ReceiVed: July 1, 2000; In Final Form: October 4, 2000

In this paper, we present the implementation and results of new methodology based on linear algebra. The theory behind these methods is covered in detail in the Supporting Information, available electronically (Shrager and Hendler). In brief, the methods presented search through all possible forward sequential submodels in order to find candidates that can be used to construct a complete model for the BR-photocycle. The methodology is limited only to forward sequential models. If no such models are compatible with the experimental data, none will be found. The procedures apply objective tests and filters to eliminate possibilities that cannot be correct, thus cutting the total number of candidate sequences to be considered. In the current application, which uses six exponentials, the total sequences were cut from 1950 to 49. The remaining sequences were further screened using known experimental criteria. The approach led to a solution which consists of a pair of sequences, one with 5 exponentials showing BR* f Lf f Mf f N f O f BR and the other with three exponentials showing BR* f Ls f Ms f BR. The deduced complete kinetic model for the BR photocycle is thus either a single photocycle branched at the L intermediate or a pair of two parallel photocycles. Reasons for preferring the parallel photocycles are presented. Synthetic data constructed on the basis of the parallel photocycles were indistinguishable from the experimental data in a number of analytical tests that were applied.

Introduction For a general introduction to the history and contemporary status of kinetic studies and attempts to define a precise model for the BR-photocycle, we refer the reader to two excellent review articles that have appeared recently.1,2 Although a variety of different types of kinetic model have been considered and proposed, none has met all of the requirements of a true representation of the actual model. For reasons listed in the Discussion, we found it necessary to try to find a model that best describes the kinetic steps that make up the BR-photocycle at neutral pH. Three basic types of kinetic schemes were considered: (1) A single linear sequence of the form A* f B f C... f A, where A* indicates a photon-activated initial state. (2) Parallel cycles consisting of separate sequences as described in kinetic scheme 1. Two sequences are considered most likely because we find two separate L f M transitions, two forms of M, one of which appears to decay directly to O and the other which does not, and because the relative amounts of the two types of apparent M-decay can be altered by light intensity,3 membrane potential,4 changes in lipid-protein interactions,5 or exposure to decane (unpublished observations). (3) Branched sequences of the form

There may be more than a single state in common before the * Corresponding author. Fax: 301-402 1519. E-mail: [email protected]. † Laboratory of Cell Biology, National Heart, Lung, and Blood Institute, National Institutes of Health. ‡ Mathematical and Statistical Computing Laboratory, CIT, National Institutes of Health.

branch point. The analysis becomes simplified with the realization that all three possibilities can be represented as combinations of linear sequences of the first type. The new procedures we have used to deduce a specific kinetic model consistent with all known facts about the photocycle are fully described as Method 3 in the Supporting Information.6 Methods 1. General Principles. A complete presentation of the theory on which the search for a correct kinetic model is contained in the Supporting Information for this paper.6 Only an abbreviated version is presented here. For a fuller description of any of the procedures used in this paper, the Supporting Information must be consulted. Experimental data are arranged in a matrix (call it A h ) the columns of which form a sequence of spectra acquired at different times after initiation of the photocycle by a laser pulse. The bar notation in A h denotes that it is an average of several samples and contains experimental noise. Each element in the matrix is an optical density at a particular wavelength (indexed by the rows of the matrix) and at a particular time (indexed by the columns of the matrix). In previous deconvolution studies from this and other laboratories, the time courses of either A h (global fitting approach) or of an the eigenvector representation of the time courses in the V matrix isolated by SVD deconvolution of A h were used to obtain unique difference spectra which correspond to each isolated exponential decay constant. The latter approach, to which we refer as the method of SVD-based least squares (SVD-lsq),7.8 when used to fit a sum of exponentials to the data, leads to the deconvolution of the raw data matrix A h into a matrix (call it C) of fundamental optical difference spectra for each of the fitted exponential decay time constants and a constant background spectrum, along with a matrix (call it X) containing

10.1021/jp002362z CCC: $20.00 © 2001 American Chemical Society Published on Web 03/27/2001

3320 J. Phys. Chem. B, Vol. 105, No. 16, 2001

Hendler et al.

all of the fitted exponentials in its rows plus a row of 1’s to account for the final spectrum in A h . For more detailed descriptions of the method of SVD-lsq, see refs 9 and 10. In linear algebra terms, the relationship between A h , C, and X is given by

A h ) CX + R

(1)

where R is a matrix of residuals, A h - CX. This solution can be used to describe a model where decays of all species are independent and start at zero time. The solution also serves to describe a forward sequential model when successive time constants are reasonably well separated. As the time constants come closer together, the isolated difference spectra become contaminated with spectral information of other steps. An additional problem with the exponential model is that neither the order of the isolated kinetic steps nor the connectivity of the system (branches, reverse steps, etc.) is specified. The simplest assumption is that the steps proceed in a sequential order of increasing time constants, but this may not be true. The objective of the current project is to find a specific kinetic model which leads to the absolute spectra for each state involved in the photocycle and the clean difference spectra for the transitions that occur between successive states. By this, we mean that each difference spectrum should show only the disappearance of the precursor state and formation of the product state. We refer to these as transition difference spectra, or TD, to distinguish them from another type of difference spectra in which each intermediate state is referenced to the final or ground-state spectrum (i.e., GD spectra). We will use the general designation D to refer to model-specific spectra, whether they be absolute, TD, or GD, and always make it clear which spectra are contained in D. The deconvolution we seek for the original data matrix A h is one that extracts a model-specific set of difference spectra in D and the corresponding real time course for the growth and decay of each spectrum in Y, such that

A h ) DY + R

(2)

The D matrix in eq 2 contains GD spectra. The model-dependent time courses in the rows of Y are related to the exponentials in the rows of X by a mixing matrix, M, such that

Y ) MX

(3)

In the Supporting Information, we describe in detail how to extract the mixing matrix M from the Jacobian matrix described below. TD spectra between successive intermediates are obtained by subtracting the appropriate GD or absolute spectra in D. For example, when GD spectra are used, in the linear sequence described above where A* passes through states B and C before eventually reaching the ground-state A, the transition spectrum between B and C is obtained by subtracting the column in D for B relative to the final state A from the column in D for C relative to A. The spectrum for the ground state is then canceled out, leaving the desired transition difference spectrum between states B and C. The starting point in the process is to express the tested model in the form of a system of differential equations. For example, in a forward-sequential model where the starting species y0 after activation by a laser flash is converted to its activated state, y1, which then passes through a sequence of intermediates such as y2, y3, and y4, with respective kinetic constants of k1, k2, k3, and k4, before returning to the ground-state y0, we have the following:

dy1/dt ) -k1y1

y1(0) ) yT

dy2/dt ) k1y1 - k2y2

y2(0) ) 0

dy3/dt ) k2y2 - k3y3

y3(0) ) 0

dy4/dt ) k3y3 - k4y4

y4(0) ) 0

(4)

where yT is the total concentration of y, yi(0) is the initial value of yi at t ) 0, and yi and dyi/dt without explicit arguments are both understood to be evaluated at some common time t. In matrix terms, the derivatives and the initial conditions are

dy/dt ) Jy

y(0) ) [yT 0 0 0]T

y ) [y1 y2 y3 y4]T dy/dt )

[

dy1 dy2 dy3 dy4 dt dt dt dt

(6)

]

T

and the Jacobian matrix J ) d(dy/dt)/dy is

[

(5)

-k1 0 0 0 k1 -k2 0 0 J) k2 -k3 0 0 k3 -k4 0 0

(7)

]

(8)

In applying the procedures described above, the SVD-lsq approach is not required. The exponential kinetic constants can be determined directly from the raw data by global fitting to all of the data. However, possible advantages to using SVDlsq for this preliminary step are the reduction in computational time for extremely large data matrixes and the reduction of overall background noise. With the exponential constants, the X and J matrixes can be constructed, and from these, the M matrix extracted. The M matrix is then used to obtain the modelspecific Y time courses. With Y, the model-specific difference spectra are extracted directly from the raw data by the formula

D)A h Y+

(9)

where Y+ is the pseudoinverse of Y. The procedure in eq 9 is capable of extracting either difference or absolute spectra from matrix A h . This is accomplished by the nature of the last row in Y. If a row of 1’s is appended to the model-specific time courses in Y, then the spectra in D are difference spectra relative to the ground state (i.e., GD spectra). But if the appended row is 1 sum(Y rows 1 to n) where n is the number of time constants, then the absolute spectra are obtained in D. However, as explained in the Supporting Information, absolute spectra can be computed only after all background signal (e.g., scattering) is removed from A h. The J matrix is a concise mathematical representation of a single kinetic model. From the J matrix, the correct time course matrix Y can be obtained (eq 3), and then the correct spectral matrix D extracted directly from the raw data (eq 9). This approach, which we call Method 1,11 is one of two existing methods used to obtain the correct Y and D matrixes. The problem is that most often neither the correct J nor even the most likely J’s are known. For a simple forward sequential linear sequential model with six exponentials, there are 6! possible J matrixes. Allowing for possible single or multiple branch points and parallel photocycles greatly augments the number of possible J matrixes. If reversible reactions are considered, the population of possible J matrixes skyrockets.

Bacteriorhodopsin Photocycle Method 212 is the reverse of Method 1. If one knows the proper spectral shapes of the intermediates, matrix Y can be extracted directly from A. The time courses in Y are then globally fitted to exponentials and the true kinetic model in J determined as described in the Supporting Information. Neither Method 1 nor Method 2 in their pure form can be applied to our data. Method 1 requires at least three distinct temperatures, and Method 2 requires complete information about species-specific spectra. Consequently, we have resorted to a hybrid method as explained in the Supporting Information. The method is called hybrid because it uses information from both ends of the problem. It generates of J and y0 at the modelspecific end, and it uses criteria on the generated spectra, namely, that a spectrum should not absorb far from its major peak. Such criteria require only that we know the approximate locations of the peaks, which are well-established in the literature. The result is the resolution of all model parameters from data at only one temperature, while avoiding the assumption that spectra remain constant with changing temperature. In this paper and the accompanying the Supporting Information, we introduce Method 3, which uses objective mathematical criteria and filters to eliminate most of the total population of J matrixes for forward sequential models, leaving only a few contenders that could be compatible with the experimental data. The final model may be obtained by using additional known information about the system. If no viable forward sequential model compatible with the data exists, none will be found. 2. Description of Experimental and Synthesized Data. The preparation of purple membranes, the description of the multichannel rapid scan spectrometer, and data collection were as previously described.13 PM, suspended at a concentration of 6 µM BR in 3 mL of 50 mM potassium phosphate buffer (pH 7.2) at room temperature, was activated with a laser pulse of 40 mJ at 532 nm over a 2 cm2 area of the optical cuvette. Typically, spectra comprised of 92 wavelengths with ∼2.9 nm resolution were collected by two spectrometers in a sequence of 512 successive time points. The first 30 time points were collected before the laser flash to mark the ground-state spectrum. The basic time resolution was 10 µs. Early points were taken every 10 µs, and the spacing was then increased to cover a total time of 189 ms. Time-averaging was applied to a total of 900 repeat spectra spaced apart by 4 s. Each photocycle was started by a 5 ns saturation laser flash at right angles to the monitoring light. Two steps were taken to protect the diodes from exposure to the high-intensity laser light. One was to use a 530 nm holographic laser notch filter (Kaiser Optical Systems, Ann Arbor, Mich.) at an angle experimentally determined to have the most light-blocking effectiveness (approximately 30°). The other was to separate the ranges of the two spectrometers to avoid light in the 520-536 nm range that passed through the interference filter. Prior to analysis, any diode showing an erratic or noisy response was eliminated. This behavior was clearly revealed as spikes in the U matrixes obtained by SVD deconvolution. Typically, the first and last diode and one or two internal diodes were removed, leaving 88-89 wavelengths for analysis. Similarly, each time course trace was examined for aberrant or “spike” type behavior, and these time courses were eliminated. Usually, this resulted in the elimination of 3-5 time points, leaving 478-480 times in the analysis. Model-specific synthetic data were generated to provide a data set where the correct model is known so that we can compare experimental data with the known data set in a series of verification tests described below. The synthetic data set was

J. Phys. Chem. B, Vol. 105, No. 16, 2001 3321 TABLE 1 100a ms

189

82

40

30

0.9b µs 8

5.95c 5.98d 33.6 33.6 124 123 494 460 1806 1830 4294 4371 21963

5.86 5.92 33.2 33.4 122 123 548 490 1713 1783 4126 4310 11855

5.75 5.86 32.7 33.1 120 122 643 534 1490 1704 3772 4240 6768

5.70 5.83 32.3 33.0 119 121 748 570 1224 1632 3355 4215 5320

40 112 546 1700 5000 22000 a

The first row shows the extent of fitted data from zero to the time indicated in milliseconds. b All of the other numbers show fitted exponential time constants in microseconds. The data in Column 1 are the fitted constants at 24° reported by Chizhov et al.12 All of the other columns contain fitted constants obtained by us. c The top numbers in regular type in every row were obtained using the sum-of-exponentials model based on seven exponentials. d The bottom numbers in bold type in every row were obtained using a model with six exponentials and one nonexponential term, as described in the Supporting Information.

constructed using the same principles described above. In this case, a D matrix is built using narrow Gaussian shapes centered at characteristic wavelengths to represent BR and each photocycle intermediate. Thus, for BR, L, M, N, and O, the Gaussians were centered at 570, 550, 412, 560, and 640 nm, respectively. Maximum amplitudes were normally set at 0.3 A units, except when two forms of the same species were required such as Lf and Ls, in which case one was set at 0.4 A units and the other at 0.2 A units. The width at half-height was set to 5 nm. Using the model-specific Y obtained as described above, the synthetic data matrix A is constructed directly using

A ) DY where A is used instead of A h to represent noiseless data. For the bicyclic data used in this paper, the procedure was used both on the 5- and 3-exponential sequences to obtain an A for each, and then the two A matrixes were combined for the complete model. These simulations are intended to verify that our methods work as expected. They are not intended as realistic imitations of the experimental systems. The extreme narrowness of the simulated peaks compared to the experimental peaks is an advantage in that the nonoverlapping behavior of the simulated peaks will point up any discrepancies between expected and calculated results in an obvious way. Indeed, the simulations were of immense help in this regard while our methods were under development. Results 1. Exponential Fitting Model. The fitting model most often used for kinetic data in general, which includes data obtained from the BR-photocycle, is a sum of exponentials. All of our previous analyses for this system were based on this approach. Chizhov et al.,14 using the sum-of-exponentials model, found that eight exponentials were required to fully describe the events in the BR-photocycle (Table 1, column 1). With our time resolution of 10 µs, we were not able to confirm the earliest event seen by Chizhov et al., but as shown in column 2 of Table 1, we were able to obtain constants which closely matched the

3322 J. Phys. Chem. B, Vol. 105, No. 16, 2001

Figure 1. Difference spectra obtained from experimental data by SVDlsq. The spectra shown are for each isolated transition having the time constant indicated in the panel. They are representative of either completely parallel decays or of a reasonably well-separated linear sequential chain, as described in the text. In this, as well as in all figures that follow, the dashed lines represent zero absorbances or delta absorbances.

other seven values. Considering that different preparations of PM were used in the two laboratories and that all conditions may not have been exactly reproduced, the agreement between the two data sets is quite close. The difference spectra which we obtained using SVD-lsq (Figure 1) also closely match the difference spectra obtained by Chizhov et al. (their Figure 7), who used a global fitting approach. Both we and Chizhov et al. (as well as others cited by Chizhov et al.) have noticed that the slowest component (i.e., ∼20 ms) occurs with a much smaller amplitude than that seen for the faster components. For this and other reasons, they have decided that this component is not part of the major all-trans cycle but is a minor component representative of a dark-adapted 13-cis cycle. This slow component also differs from the faster components insofar as it does not behave as an exponential.6 A result of this fact is that fitting the data to the sum-of-exponentials model can lead to serious distortions in the values of the fitted kinetic constants, as illustrated in Tables 1 and 2. Table 1 (regular typeface) shows that contrary to the expectations of the sum-of-exponentials model, the fitted constants are functions of the range of data that is fit. The slowest three kinetic constants markedly decrease as the range of fitted data is decreased, while the fourth slowest constant increases. When a model based on six exponentials plus the one nonexponential of the form 1/x(time + 1) is used, the fitted constants become much less sensitive to the range of data that is used (Table 1, bold typeface). A further illustration of the superiority of the alternative fitting model is shown in Table 2. Although the data set we used for Table 1, analyzed by the sum-of-exponentials model, produced kinetic constants

Hendler et al. close to those found by Chizhov et al., we have obtained other data sets which produce different values (Table 2, regular typeface). Using the alternative fitting model brings all of the data sets into agreement (Table 2, bold typeface). The two different fitting models produce markedly different averages, and the standard errors of the means are significantly more narrow in the case of the newer fitting model. Because the derived difference and absolute spectra obtained from the original experimental data are based on the precise values of the fitted constants, these differences in kinetic constants become extremely important. On the basis of the analyses shown in Table 1 and the greater agreements found for the repeat experiments shown in Table 2, it was decided that the fitting model based on six exponentials and one nonexponential produced a more reliable set of kinetic constants. This fitting model applied to experimental data with a time range of 100 ms was used in this paper, unless otherwise noted. 2. Screening Process. The procedures described in this section are an implementation of Method 3 described in the Supporting Information.6 2.1. First Procedures to Scan all Possible Forward Sequential Kinetic Models and Eliminate Unlikely Candidates. With six kinetic constants, there are a total of 1950 unique forwardsequential models. There are 720 permutations each for models containing six and five kinetic transitions, and 360, 120, and 30 for models containing four, three, and two transitions, respectively. The model-specific spectra contained in a matrix D are difference spectra with respect to the final state, which is BR (i.e., GD spectra). This means that the true model must produce spectra in D which show the characteristic features for each particular intermediate in the positive range of the difference spectrum and the disappearing spectrum for only BR in the negative range (i.e., a trough at 570 nm). The procedures we have applied will always lead to matrix D. But, if the model used is not correct, the trough may be at a wrong wavelength, if it exists at all. Another possible consequence resulting from the use of an incorrect model is that the magnitudes for absorbance changes in the spectra of D may become unacceptably large, that is in excess of the amount of BR used in the analysis. Therefore, the first filter we have applied to screen out incorrect models is the finding of troughs in wavelength positions far (i.e., (15 nm) from that of BR and the finding of absorbance changes in the spectra of D in excess of 1 absorbance unit. However, the filter was not applied to the first (i.e., τ ) 5.9 µs) transition for the following reasons. The first species BR*, is shared by the fast and slow pathways, leading to potential branch-point ambiguities.6 Furthermore, because of the 10 µs time resolution of our system, the amount of information contained in A h , for this step is minimal. In view of these considerations, we did not want to reject potentially viable submodels on the basis of any distortions that might arise in this first derived spectrum. In applying the filter to all of the other transitions, we found 3 possible models using all 6 exponentials, 9 using 5 exponentials, 14 using 4 exponentials, 13 using 3 exponentials, and 10 using 2 exponentials. Therefore, the first filter reduced the number of linear sequences under consideration from a total of 1950 to 49. The next filter that we employed eliminated all that were not forward sequential linear paths. The bases for these eliminations were the following: (1) Two separate steps where the same intermediate is formed. For example, if L has already been converted to M, then in a forward sequential path, a repeat conversion of L to M cannot occur. (2) Two separate steps where the same intermediate decayssfor example, M f N and M f O. (3)

Bacteriorhodopsin Photocycle

J. Phys. Chem. B, Vol. 105, No. 16, 2001 3323

TABLE 2: Different Sets of Data (columns) and Different Kinetic Decays (rows) in the Photocyclea 1 2 3 4 5 6 7

1

2

3

4

5

6

7

8

Av ( sem

5.95b 5.93c 33.6 33.4 124 123 494 483 1806 1794 4294 4325 21963

5.37 5.43 31.7 31.5 132 130 584 524 1773 1803 5370 5562 20268

4.79 5.14 28.6 29.6 113 118 809 490 1115 1697 3300 4146 5858

4.67 4.86 27.4 27.9 109 111 585 465 1429 1613 3530 4017 7817

4.65 4.98 28.7 29.5 115 119 757 493 1228 1717 3351 4208 6133

4.68 4.86 28.1 28.3 114 115 610 493 1505 1684 3730 4232 8300

4.80 5.16 28.0 29.1 111 117 773 476 1110 1662 3273 4067 5778

4.57 4.74 28.4 28.8 115 117 582 475 1513 1670 3781 4204 8890

4.94 ( 0.17d 5.14 ( 0.14 29.3 ( 0.8 29.8 ( 0.6 117 ( 3 119 ( 2 649 ( 40 487 ( 6 1435 ( 96 1705 ( 23 3828 ( 250 4345 ( 177 10626 ( 2331

a All of the numbers show fitted time constants in microseconds. b The top numbers in regular type in every row were obtained using the sum-ofexponentials model based on seven exponentials. c The bottom numbers in bold type in every row were obtained using a model with six exponentials and one nonexponential term as described in the Supporting Information. d The last column shows averages ( standard error of the mean for the analyses based on the seven-exponential model (regular type) and the model using six exponentials and one nonexponential term (bold type).

Sequences in which an intermediate is formed but does not decay. (4) Sequences in which an intermediate decays but has not been formed. Applying these criteria reduced the number of contending possible linear sequences from the 49 that survived the first filter to a total of 19: 0 for the 6-exponential model and 3, 5, 7, and 4 for the 5-, 4-, 3-, and 2-exponential linear sequences, respectively, 19 sequences in all. 2.2. Selection of the “Most Likely” Kinetic Model. All of the 19 sequences that survived the screens described above are shown in Table 3. A common feature of all of the sequences with three or more steps is that the final step has a time constant of 4.3 ms. An unexpected finding is that when both the 483 µs step and 1.8 ms step are present in the same sequence, the slower step precedes the faster one. In the evaluation of these “possible” linear sequences, additional facts were considered. There are several observations (listed below) that point to either two separate parallel photocycles or one with a very early branch (i.e., before L). (1) SVD-lsq isolates two kinetic steps showing L f M conversions, one with a time constant of 33 µs and the other 123 µs. Two L f M conversions were also seen in timeresolved FTIR studies.15 (2) There are two kinetically different forms of M. One decays with a time constant close to 2 ms (Mf), and the other decays with a time constant between 4 and 5 ms (Ms). (3) The decay of Mf is accompanied by the formation of O, whereas the difference spectrum for the decay of Ms shows the simultaneous formation of BR.16 (4) The ratio of the Mf to Ms amplitudes in the same sample can be varied over a wide range by light intensity and changes in conditions. Mathematically, the early-branched and parallel photocycle models are indistinguishable. However, there are major conceptual and physical differences between the two cases. The branched photocycle assumes a homogeneous pool of BR. The parallel cycle assumes two physically distinct populations of molecules or molecules in different environments. We have already cited observations which support this kind of heterogeneity for the population of BR units.3,13 The N intermediate was seen only in the 5- and 4- exponential sequences (Table 3). One of these must be included in the final model. The 5-exponential model includes the expected L f M transition, whereas the three 4-exponential sequences which include the N intermediate do not. Therefore, one of the 5-exponential sequences must be selected. It was previously shown that Mf decays through the O intermediate to BR, whereas Ms bypasses the O intermediate in its decay to BR.16 Because both Mf and O are present in the 5-exponential sequences, one of these must represent the fast part of the

TABLE 3: Nineteen Submodel Sequences That Passed the Filters Designed to Eliminate Impossible Candidates, as Explained in the Texta Five Exponentials BR* f 6 µs 6 µs 33 µs

33 µs 123 µs 123 µs

L f M f N f O f BR 1.8 ms 483 µs 1.8 ms 483 µs 1.8 ms 483 µs

4.3 ms 4.3 ms 4.3 ms

Four Exponentials 6 µs 33 µs

BR* f L f M f O f BR 33 µs 1.8 ms 123 µs 1.8 ms

4.3 ms 4.3 ms

6 µs 33 µs 123 µs

BR* f M f N f O f BR 1.8 ms 483 µs 1.8 ms 483 µs 1.8 ms 483 µs

4.3 ms 4.3 ms 4.3 ms

Three Exponentials 6 µs 6 µs 33 µs

BR* f L f M f BR 33 µs 123 µs 123 µs

4.3 ms 4.3 ms 4.3 ms

6 µs 33 µs 123 µs 483 µs

BR* f M f O f BR 1.8 ms 1.8 ms 1.8 ms 1.8 ms

4.3 ms 4.3 ms 4.3 ms 4.3 ms

Two Exponentials BR* f O f BR 6 µs 33 µs 123 µs

483 µs 483 µs 483 µs BR* f M f BR

123 µs

483 µs

a

The designation BR* indicates a photon-activated form of BR. The numbers in the table are fitted time constants for the transitions. The sequences shown in bold typeface (1st row under the 5-exponential sequence and 2nd row under the 3-exponential sequence) were selected as described in the text as the submodels required for the final complete model that has been deduced from the experimental data using the procedures described in this and the Supporting Information.

photocycle. The third sequence which shows the BR f L step with a time constant of 33 µs followed by an L f M step with a time constant of 123 µs is rejected because SVD-lsq identifies both as L f M transitions. One should be present in the Mf decay route and the other in the Ms decay route. Because the 5-exponential sequence represents the fast decay route, we chose

3324 J. Phys. Chem. B, Vol. 105, No. 16, 2001

Hendler et al. 2, panel A). The corresponding spectrum from the experimental data (Figure 1, panel A) shows the disappearance of BR with a steep rise in the 550 nm region which is characteristic of the L intermediate. A decrease in absorbance in the 410 nm region is also seen. A similar but smaller decrease in absorbance in the 410 nm region was also seen with the synthetic data (Figure 2). Because of the paucity of points available in this early time domain due to our 10 µs resolution, confidence in the reliability of this difference spectrum is less than that for the spectrum derived from the slower events shown in the other panels. Panels B and C from both the synthetic (Figure 2) and experimental (Figure 1) data show the conversion of L to M. Panels D are discussed below. Panels E from both sets of data show the simultaneous transformations of M and N to O. Panels F from both sets of data show the simultaneous transformations of M and O into ground-state BR. The results of this first test show that SVD-lsq could not distinguish between the experimental data set and the one based on our deduced bicyclic model. At this point, it is important to recognize the basis and significance of difference spectra derived from the exponential model (e.g., SVD-lsq). This model pertains to a decay process in which all species are at their maximum concentrations at t ) 0 and decay to zero independently. The spectra in C are valid species transition spectra only in such a situation. However, the spectra can also be good approximations of TD spectra for a forward sequential model with well-separated time constants in increasing order (to which we will refer as FSIT model), as in τ1

Figure 2. Difference spectra obtained from model-specific synthetic data by SVD-lsq. The model used to synthesize the data was the parallel two-cycle model deduced by the procedures described in this paper.

the first sequence with the faster L decay (in bold typeface). When a model was tested using the slower L decay, similar but less clean difference spectra were obtained than those from the model using the faster L decay in the 5 exponential process. To complete the overall model, we must describe the Ms decay path. Because the decay of Ms bypasses O and because the slower L f M transition must be accounted for, one of the 3-exponential sequences consistent with these facts must be chosen. The only choice consistent with all of the considerations discussed above is the second sequence (in bold type) shown under the BR f L f M f BR pathway. Therefore, the model we end up with includes the highlighted (i.e., bold type) 5- and 3-exponential sequences shown in Table 3. 3. Testing and Verification of the Bicyclic Model. 3.1. Tests Based on Exponential Fitting. A synthetic data matrix was constructed on the basis of the specific bicyclic model deduced in the previous section. The procedures used to synthesize these data are described briefly in the Methods section and in detail in the Supporting Information. The first test of the synthetic data was to analyze it by the exponential model and to compare the results with those obtained from the experimental data. Keep in mind that for the synthetic data, narrow, well-separated, noiseless Gaussian shapes are used, whereas with the experimental data, broader and less-well separated spectral shapes with noise are present. The difference spectra derived from the experimental data using seven exponentials (Table 1, normal typeface) are shown in Figure 1, and those obtained from the synthetic data, based on six exponentials (Table 1, bold typeface), are shown in Figure 2. It is seen that the difference spectra obtained from the synthetic data match the salient features of the corresponding difference spectra obtained from the experimental data. As expected, the 5.9 µs transition with the synthetic data shows a clear conversion of BR to L (Figure

τ2

τ3

y1 98 y2 98 y3 98 ... , τ1 < τ2 < τ3 ... The closer the τ’s become, the greater will be the danger of spectral contamination of a difference spectrum with spectra of subsequent steps. In cases where a more complicated model has some forward sequential paths, the spectra in C may still be of conceptual value to the overall solution. With these facts in mind, it is obvious that the results of exponential fitting to experimental data clearly show that the reactions in the photocycle are not compatible with the “FSIT model”. The spectra from C show two sequential conversions of L to M (Figure 1, panels B and C). If L has already been converted to M, it cannot happen again in a separate step. For the model based on two sequences, two steps of L to M conversion are expected (Figure 2, panels B and C). Another serious problem is that the exponential model implies that O disappears at 494 µs (panels D of Figures 1 and 2) before it was formed. An additional problem with the applying the results of exponential fitting to an FSIT model is that in two cases, two different intermediates are seen to decay simultaneously. Thus, in Figures 1 and 2, both M and N appear to decay to O in panels E, and both M and O appear to decay to BR in panels F. All of these problems disappear when both the experimental and synthetic data are analyzed by procedures based on the bicyclic model, as described below. 3.2. Tests Based on the Specific Parallel Bicyclic Model. The transition difference spectra obtained from experimental data are shown in Figure 3 and those from synthetic data in Figure 4. In these figures, as well as in Figures 5-8, a double line separates the top five panels from the bottom three. The top panels show spectra of the 5-step fast cycle, and the bottom three panels show spectra of the 3-step slow cycle. In Figure 3, the difference spectra for the sequential steps in the fast cycle are entirely in agreement with the designated intermediates

BR* f Lf f Mf f N f O f BR

Bacteriorhodopsin Photocycle

Figure 3. Transition difference spectra obtained from experimental data using the model-specific analytic tools described in this paper. BR* designates the activated form of BR after absorbing a photon. The analysis was based on the two-cycle model described in the legend to Figure 2.

where BR* represents activated BR after it has absorbed a photon. In fact, this is the first visual spectroscopic evidence which shows N as a clear intermediate in the BR-photocycle, where it should be, between M and O. Because N is formed slowly from M (τ ) 1.8 ms) and decays rapidly (τ ) 483 µs), it is present transiently in very small amounts. The bottom three panels in Figure 3 show spectra of the 3-step slow cycle and are entirely in agreement with the intermediates

BR* f Ls f Ms f BR The wavelengths of minimum absorbance were 550 and 560 nm for the Ls to Ms and N to O transitions, respectively, and the wavelength of maximum absorbance at 570 nm for the O to BR and Ms to BR transitions. Figure 4 shows a comparable analysis performed with synthetic data constructed on the basis of the bicyclic model. With the narrow Gaussians representing the intermediates, the isolated difference spectra show the clean transitions expected from the known sequences in the two cycles. The deviations from the Gaussian shapes for the L and N intermediates seen in the narrow wavelength region of 520-536 nm are due to the measures taken to eliminate optical disturbances from the laser flash, which included the use of a blocking notch filter and the separation of the ranges of data collection by our two spectrometers. 3.3. Tests Based on an Incorrect Parallel Bicyclic Model. The initial screens used to test all possible linear sequences eliminated models in which the faster 483 µs step preceded the slower 1.8 ms step. We altered the bicyclic model tested above by placing the faster step before the slower one and reanalyzed

J. Phys. Chem. B, Vol. 105, No. 16, 2001 3325

Figure 4. Transition difference spectra obtained from the bicyclic model synthetic data described in the legend to Figure 2. The analysis was performed in the same way as that described in the legend to Figure 3. The deviations from Gaussian shape seen in the spectra for L and N are due to a separation in wavelength range of the two spectrometers from 520 to 536 nm, to avoid disturbances from the laser flash, as described in Methods.

both the experimental and synthetic data sets. Figure 5, which shows the results with synthetic data, should be compared to Figure 4. The evidence for the sequential steps Mf f N f O is now lost. Instead, there are two sequential steps of M decay, with the major one leading to O formation. The evidence for N as an intermediate is greatly diminished. With the narrow and cleanly separated spectral shapes, however, an N feature is still seen. The results obtained with the experimental data (Figure 6) were consistent with those seen with the synthetic data (Figure 5). Because of the broader shapes of the characteristic spectra for the intermediates, no evidence for N conversion to O is seen. The transition at 483 µs is seen only as an M f O conversion. As found above in the tests using SVD-lsq and the tests using the correct bicyclic model for analysis, the tests based on an incorrect bicyclic model confirm the marked similarity of the experimental data set with synthetic data constructed from the correct bicyclic model. 3.4. Tests Based on Obtaining Absolute Spectra Directly from the Raw Data Using Y. Using the hybrid method 6 on the bicyclic model, we have isolated the absolute spectra shown in Figure 7. The absolute spectrum we obtain for the activated form of BR* is indistinguishable from that of the ground-state BR. A single spectral shape (Mtotal) was used for Mf and Ms. The absolute spectra for Mtotal, Ls, N, O, and BR all show maximum absorbances at ∼412, ∼550, ∼560, and ∼640 nm, respectively, as expected. The abrupt change in absorbance of Mtotal in the wavelength region 520-536 nm is due to the use of the notch filter and separation of the two spectrometers as described in the Methods. The slight rise in absorbance seen in

3326 J. Phys. Chem. B, Vol. 105, No. 16, 2001

Figure 5. Transition difference spectra obtained from the bicyclic model synthetic data described in the legend to Figure 2. However, for the analysis, analytical software based on an incorrect bicyclic model was used. The incorrect model placed the 483 µs decay step just before the 1.8 ms decay step instead of just after it.

the wavelength region 490-526 nm indicates that the M intermediate spectrum may have a minor absorbance feature in this region. The absolute spectrum for Lf was more noisy than the others, and it is not possible to fix its peak wavelength with the same accuracy. Difference spectra obtained by subtraction of the appropriate absolute spectra shown in Figure 7 are shown in Figure 8. These spectra are quite comparable with the TD difference spectra obtained directly from the GD difference spectra, as shown in Figure 3. Discussion The main impetus for these studies was to find a valid model for the BR-photocycle so that important events could be dependably attributed to unique steps in this cycle. Many of the investigations we are carrying out have reached the point where this greater precision is required. In particular, we are in the process of combining time-resolved data obtained by both optical and FTIR spectroscopies to be able to assign particular changes in protein conformation and proton-binding to transitions between known intermediates in the photocycle to identify the steps involved in proton-pumping. Similarly, we would like to associate particular steps with particular events such as the alteration of photocycle kinetics by the intensity of laser light, the control of particular rates by membrane potential, and the generation of membrane potential. Methods based on exponential fitting alone are not capable of providing the unique kinetic sequences of the BR-photocycle. We are not the first to recognize the importance of defining the steps of the BR-photocycle. In an extensive review of the problem in 1995, Lanyi and Varo summarized the progress of

Hendler et al.

Figure 6. Transition difference spectra obtained from the same experimental data shown in Figure 3 using the incorrect model described in the legend to Figure 5.

efforts to solve this problem.1 They made the point that finding the correct kinetic model could lead to understanding the BRphotocycle and that it could also lead to an understanding of the principles of pumps in general. This review, which is highly recommended reading, provides a detailed discussion of evidence in favor of a single homogeneous, branched, and separate or parallel cycles. The authors also make a case for the existence of reversible steps between certain of the intermediates. We developed new approaches that could objectively lead to a unique solution to this problem. Our starting point was the finding, in agreement with the studies of Chizhov et al.,14 that six exponential kinetic constants were needed to describe the events of the BR-photocycle between 0.005 and 100 ms. Because parallel and branched cycles can be considered as special combinations of linear sequences, we developed a screen to examine all linear sequences containing from 2 to 6 exponentials. We used a linear algebraic deconvolution method on experimental data that, for any viable model, would yield difference spectra for each intermediate relative to the final (i.e., ground) state. If the deconvolution produced difference spectra not related to the ground state or if more than a single intermediate was represented in a single difference spectrum, the model was discarded. We added the requirement that no intermediate could be present in an amount greater than could be derived from the starting amount of BR. These two criteria lowered the number of possible linear sequences to be considered from 1950 to 49. A second screen eliminated unreasonable models, such as the conversion of L to M as a step following a previous L to M conversion, two separate steps where the same intermediate is converted to a different intermediate, sequences where an intermediate is seen to decay before it was formed, and sequences where an intermediate is formed but does not decay. This second screen reduced the number of possible

Bacteriorhodopsin Photocycle

Figure 7. Absolute spectra obtained directly from the experimental data shown in Figure 3 using the procedures described in the Supporting Information.6 To obtain these spectra, we used a single spectral shape for Mf and Ms. In the text, this spectrum is referred to as Mtotal. The apparent discontinuity in the traces, most notably seen in the spectra for M, is due to the separation of the two spectrometers to avoid laser noise, which leaves a gap in the region 520-536 nm.

models from 49 to 19: 0 with 6 exponentials and 3, 5, 7, and 4 for 5, 4, 3, and 2 exponentials, respectively. The final model was chosen on the basis of its consistency with independent experimental findings, namely: (1) Two kinetically different forms of M are present, one with a τ near 2 ms which appears to decay through the O intermediate (Mf) and one with a τ near 5 ms which appears to bypass O and decay directly to BR.16 (2) Two kinetically separate steps are seen for an L to M conversion (this paper and ref 15). (3) N must be present as an intermediate and it should be formed from M and decay to O. A single model which accounted for all of these observations was found, namely, a parallel model with one 5-step sequence that includes Lf, Mf, N, O, and BR, and the other, a 3-step sequence that includes Ls, Ms, and BR. A unique and unexpected feature of the selected model is that N decays much faster than its rate of formation. This model was then subjected to a series of tests using both the experimental data and synthetic data constructed according to the unique bicyclic solution described above. The following were found: (1) The model-specific synthetic data produced the same results by analysis using SVD-lsq as the experimental data. (2) The model-specific analytical procedures applied to the experimental data produced difference spectra between successive intermediates entirely in agreement with expectation. The same results were obtained with the synthetic data. (3) A slight alteration in the model-specific testing procedures, such as placing the faster 483 µs decay step before, instead of after, the 1.8 ms decay step produced (the same) unacceptable results from both the experimental and synthetic data sets. (4) Analyti-

J. Phys. Chem. B, Vol. 105, No. 16, 2001 3327

Figure 8. Transition difference spectra obtained by subtraction of the absolute spectra shown in Figure 7.

cal software developed on the basis of the unique bicyclic model produced the expected absolute spectra for the intermediates directly from the raw data. (5) Subtraction of these successive absolute spectra produced the same difference spectra as obtained directly from the raw data. To our knowledge, no other model for the BR photocycle has been based on as exhaustive a search nor subjected to as rigorous testing as the one we have deduced from the studies reported in this paper. Any other unidirectional model that was proposed in the past did not survive the screening procedures we have applied. The spectrally silent model of Chizhov et al.14 was proposed to account for a situation where more exponentials than intermediates were present. This is not the case in our data. As discussed in this paper, the raw data available to us and the Chizhov group were very similar. Our use of two forms of L intermediate and the modeling procedures we describe led us to a solution which includes only pure intermediate states as an alternative to the spectrally silent model. The bicyclic model we have derived does not contain any reversible steps, which is at odds with recent models under consideration.1 At the moment, no complete reversible model demonstrated to be entirely compatible with experimental data has been presented. The most promising development in this direction is based on the use of singular value decomposition with self-modeling.17 At its present state of development, however, the approach is limited to a cycle with four intermediates. Therefore, a complete model for the wild-type photocycle is not available for testing. When available, if this model is capable of explaining any facts that our simpler model fails to explain, it should be adopted. In conclusion, we would like to make it clear that the studies reported in this paper apply to the BR photocycle under a particular set of conditions, namely, at pH near neutrality, 50 mM phosphate buffer, and a temperature near 23°. It is possible

3328 J. Phys. Chem. B, Vol. 105, No. 16, 2001 that under a different set of conditions, a different photocycle may occur, even one with reversible reaction steps. Supporting Information Available: Paper detailing the theory behind the implementation and results of our new methodology based on linear algebra. This material is available free of charge via the Internet at http://pubs.acs.org. References and Notes (1) Lanyi, J. K.; Varo, G. V. Isr. J. Chem. 1995, 35, 365-385. (2) Haupts, U.; Tittor, J.; Oesterhelt, D. Annu. ReV. Biomol. Struct. 1999, 28, 367-399. (3) Shrager, R. I.; Hendler, R. W.; Bose, S. Eur. J. Biochem. 1995, 229, 589-595. (4) Hendler, R. W.; Drachev, L. A.; Bose, S.; Joshi, M. K. Eur. J. Biochem. 2000, 267, 5879-5890. (5) Joshi, M. K.; Dracheva, S.; Mukhopadhyay, A. K.; Bose, S.; Hendler, R. W. Biochemistry 1998, 41, 14463-14470.

Hendler et al. (6) See Supporting Information. (7) Shrager, R. I.; Hendler, R. W. Anal. Chem. 1982, 54, 1147-1152. (8) Shrager, R. I. Chemometr. Intell. Lab. Sys. 1986, 1, 59-70. (9) Hendler, R. W.; Shrager, R. I. J. Biochem. Biophys. Methods 1994, 28, 1-33. (10) Henry, E. R.; Hofrichter, J. Methods Enzymol. 1992, 210, 129192. (11) Nagle, J. F. Biophys. J. 1991, 59, 476-487. (12) Szundi, I.; Lewis, J. W.; Kliger, D. S. Biophys. J. 1997, 73, 688702. (13) Mukhopadhyay, A. K.; Bose, S.; Hendler, R. W. Biochemistry 1994, 33, 10889-10895. (14) Chizhov, I.; Chernavskii, D. S.; Engelhard, M.; Mueller, K.-H.; Zubov, B. V.; Hess, B. Biophys. J. 1996, 71, 2329-2345. (15) Ro¨dig, C.; Chizhov, I.; Weidlich, O.; Seibert, F. Biophys. J. 1999, 76, 2687-2701. (16) Hendler, R. W.; Dancshazy, Zs.; Bose, S.; Shrager, R. I.; Tokaji, Zs. Biochemistry 1994, 33, 4604-4610. (17) Zimanyi, L.; Kulcsar, A.; Lanyi, J. K.; Sears, D. F., Jr.; Salteil, J. S. Proc. Natl. Acad. Sci. U.S.A. 1999, 4414-4419.