
Article

Protein microenvironment governs the suitability of labeling sites for single molecule spectroscopy of RNP complexes Andreas Schmidt, Nadide Altincekic, Henrik Gustmann, Josef Wachtveitl, and Martin Hengesbach ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.8b00348 • Publication Date (Web): 30 Jul 2018 Downloaded from http://pubs.acs.org on August 1, 2018


ACS Chemical Biology


Protein microenvironment governs the suitability of labeling sites for single molecule spectroscopy of RNP complexes

Andreas Schmidt*, Nadide Altincekic*, Henrik Gustmann#, Josef Wachtveitl#, and Martin Hengesbach*

* Institute for Organic Chemistry and Chemical Biology, Goethe-University Frankfurt
# Institute for Physical and Theoretical Chemistry, Goethe-University Frankfurt

Abstract. Single molecule techniques offer unique insights into biological systems because they provide unequaled access to structural dynamics and conformational heterogeneity. One major bottleneck for reliable smFRET analysis is the identification of suitable fluorophore labeling sites that neither impair the function of the biological system nor cause photophysical artifacts of the fluorophore. To address this issue, we identified the contribution of virtually all individual parameters that affect Förster resonance energy transfer between two fluorophores attached to a ribonucleoprotein complex consisting of the RNA-binding protein L7Ae and a cognate kink-turn containing RNA. A non-natural amino acid was incorporated at various positions of the protein using an amber suppression system (pEVOL) in order to label the protein via copper(I)-catalyzed alkyne-azide cycloaddition (CuAAC). Simulations followed by functional, structural and multiparameter fluorescence analysis of five different smFRET RNPs yielded new insights into the design of smFRET RNPs. From these, we established a correlation between the photophysical properties of fluorophores attached to the protein and the predictability of the corresponding smFRET construct. Additionally, we identify a straightforward experimental method to characterize selected labeling sites. Overall, this protocol allows fast generation and assessment of functional RNPs for accurate single molecule experiments.

Introduction. In recent years, several studies have highlighted important structural dynamics in RNP complexes using single-molecule Förster resonance energy transfer (smFRET) spectroscopy. These FRET studies addressed large complexes such as ribosomes, spliceosomes and polymerases [1-5]. Many of the results obtained in these studies have significantly advanced our understanding of structural dynamics that are pivotal to the function of these complexes. As FRET reports on distances and distance changes between two fluorophores in spatial proximity, most of these studies were only possible by employing technologies that allow site-specific labeling of RNAs and proteins within their respective complexes [7, 8]. Many initial studies relied on cysteine labeling, which comes at the expense of mutating other cysteines and can lead to inhomogeneous FRET populations in the case of partial cysteine replacement [9-12]. More recent work on the expansion of the genetic code has allowed incorporation and bioorthogonal labeling of non-natural amino acids [14-18]. However, many of these studies relied on labeling of proteins in isolation, and often analyzed only a few labeling sites for each protein.

Two of the major problems in analyzing proteins by smFRET spectroscopy are a) expression of protein functionalized with a non-natural amino acid, and b) a change in fluorescent dye properties caused by the physicochemical microenvironment of the dye. While many of the photophysical properties that affect Förster resonance energy transfer can in principle be easily measured or even exploited to increase the accuracy of the smFRET measurement [19], the precision and predictability of labeling sites remain challenging. Despite recent significant advances in the modeling of fluorophores attached to, e.g., proteins [20, 21], many additional factors need to be considered to improve the identification and assessment of labeling sites as well as their precision and predictability. With these or similar modeling approaches, a number of studies have used very elaborate and very successful strategies to back-calculate distances from smFRET data [20-22].

Here, we demonstrate how careful design, expression and purification, biophysical investigation and smFRET measurements can be combined to circumvent these problems, increasing the number of accurate smFRET constructs as well as the validity of the obtained data. Based on these new insights, we propose an optimized workflow to obtain smFRET constructs that report more accurately on distances, as shown by way of example for RNP complexes. We demonstrate all these steps for the RNA-binding protein L7Ae, a highly conserved component of the H/ACA complex, which is responsible for, e.g., ribosomal RNA (rRNA) pseudouridylation. Within the H/ACA complex, L7Ae binds an archaeal sRNA that is required for targeting the modification site within the rRNA. L7Ae specifically binds the kink-turn motif containing sRNA with high affinity [23-25]. Choosing five different labeling sites, we analyzed and compared these at each of the experimental steps, including smFRET assessment. Determination of biophysical fluorescence parameters allows the identification of constructs that significantly increase the precision of distance assessment, aiding in identifying and quantifying structural dynamics in RNP complexes.


Results and Discussion

Selection of labeling sites. The selection of labeling sites is a crucial aspect in the design of smFRET constructs, mainly for three reasons: a suboptimal choice of labeling site and the introduction of a modifiable amino acid at such a position may result in 1) functionally impaired molecules, 2) difficult subsequent attachment of a suitable fluorophore, and/or 3) severely perturbed photophysical behavior of the attached dye. To obtain meaningful data from smFRET experiments, however, the incorporation of modifiers should leave structure and function of the protein of interest unaffected and allow for quantitative labeling with a fluorophore under native conditions. To this end, steric accessibility of the amino acid at the labeling site can easily be rationalized as an important feature. In any case, detailed structural information about the protein in question is absolutely essential at this stage.

Figure 1. Schematic representation of the RNP FRET construct and SDS-PAGE analysis of L7Ae variants. A: Representation of all labeling sites within the RNP. Distances are measured between attachment site atoms (Cα-Cα). B: Coomassie-stained SDS gel of cell lysate before (-) and after (+) induction of protein expression. C: In-gel fluorescence scan for Cy3 before (-) and after (+) coupling with Sulfo-Cy3-azide. D: Coomassie-stained gel of the same samples as in C.

Based on an X-ray structure of the L7Ae/sRNA complex, suitable labeling sites were selected (Figure 1A), initially mainly by inspection with respect to surface accessibility and intermolecular interactions within the complex. As the complex originates from a thermophilic organism, we expect few structural dynamics at the ambient temperature used for the majority of smFRET experiments. The highly exposed U26 of the sRNA was chosen as attachment site of the acceptor dye (Figure 1A). Five different attachment sites for the donor dye within the protein were identified according to the following criteria: 1) minimization of structural changes of the RNP complex upon non-natural amino acid incorporation by giving preference to residues exposed on the protein surface, 2) exclusion of labeling sites within the binding interface between L7Ae and the sRNA, 3) a preference for distances shorter than the Förster radius, which allows a direct readout of photophysical perturbation of the dyes on the single molecule level and a straightforward benchmarking of the labeling position, and 4) avoidance of labeling sites near the C-terminus of the protein to allow for an improved analysis and purification of functionalized protein. At this stage, distances between the FRET dyes of these initial labeling sites are calculated from their Cα-Cα distance. The five labeling positions within L7Ae are shown in Figure 1A.

Solvent accessibility and the number of direct contacts of the substituted amino acid with other residues were calculated employing the Voronoi Laguerre Delaunay Protein webserver (VLDPws) and the X-ray structure of L7Ae (PDB entry: 3HJW [25, 26], Table 1). The labeling sites ("X") can be divided into three groups: one highly accessible labeling site with the lowest number of contacts with other amino acids (K14X, 5 contacts), two sites of medium accessibility (K32X and G47X, 7 contacts each) and two sites of low accessibility (E25X and S83X, 10 and 8 contacts, respectively). It can be assumed that replacing a highly exposed amino acid that has few contacts by a non-natural amino acid does not cause a significant structural or functional perturbation. The nature and strength of interactions between neighboring amino acids will very likely be reflected in the same way. At the same time, solvent-accessible residues may increase the probability of high labeling yields due to minimization of steric hindrance. Neither of these considerations, however, takes into account the structural and physicochemical microenvironment of the fluorophore after labeling.
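The Cα-Cα-based estimate mentioned above amounts to inserting a single point distance into the standard Förster relation E = 1 / (1 + (r/R0)^6). The following minimal sketch illustrates this; the Förster radius `R0_ASSUMED_NM` and the example distances are illustrative assumptions and are not values taken from this study.

```python
def fret_efficiency(distance_nm: float, r0_nm: float) -> float:
    """Point-distance FRET efficiency E = 1 / (1 + (r / R0)^6)."""
    if distance_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


# Assumed placeholder for a Cy3/Cy5-like pair; not a value from this study.
R0_ASSUMED_NM = 5.4

# Hypothetical Calpha-Calpha distances (nm), chosen only for illustration.
for r_nm in (2.0, 3.5, 5.4, 7.0):
    print(f"r = {r_nm:.1f} nm -> E = {fret_efficiency(r_nm, R0_ASSUMED_NM):.2f}")
```

Because of the sixth-power dependence, distances well below R0 all map to efficiencies close to 1, so a pure Cα-Cα estimate cannot distinguish between such sites.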
To identify the effect of local structures on the accessible volume, we used the FRET positioning and screening (FPS) program [6, 21]. This algorithm takes into account linker length and flexibility, as well as steric clashes with the rest of the complex, to predict the FRET distribution that can be expected from the labeling sites of choice (shown in Table 1). This should give a better prediction of the FRET distribution than merely measuring the Cα-Cα distance.

Incorporation of a clickable non-natural amino acid into L7Ae. To site-specifically functionalize L7Ae with a suitable reactive group for labeling, the orthogonal pyrrolysyl-tRNA/pyrrolysyl-tRNA synthetase pair from Methanosarcina mazei was used, which is able to incorporate a range of non-natural amino acids at an amber stop codon position within proteins [27-29]. An amber stop codon (TAG) was introduced at each of the previously identified labeling sites within the expression vector DNA by whole plasmid site-directed mutagenesis. The non-natural amino acid propargyllysine (PrK) (SFigure 1) was synthesized according to the literature [30]. The non-bulky alkyne group allows bioorthogonal coupling to an azide-functionalized dye by Cu(I)-catalyzed alkyne-azide cycloaddition (CuAAC) under native conditions [31, 32]. Even though smFRET spectroscopy does not require large amounts of sample, in practical terms this technique requires highly pure protein. This in turn is irreconcilable with low expression yields.
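At the sequence level, introducing an amber codon simply means replacing the codon of the chosen residue with TAG. The sketch below shows only this in-silico step, assuming a coding sequence that starts at residue 1 and is in frame; the function name and the example sequence are hypothetical, and primer design for the actual mutagenesis is not covered.

```python
def introduce_amber_codon(cds: str, residue_number: int) -> str:
    """Replace the codon of a 1-based residue position with the amber codon TAG."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= residue_number <= n_codons:
        raise ValueError("residue number outside of coding sequence")
    start = 3 * (residue_number - 1)
    return cds[:start] + "TAG" + cds[start + 3:]


# Hypothetical three-codon sequence (Met-Lys-Gly), not the L7Ae gene.
example_cds = "ATGAAAGGC"
print(introduce_amber_codon(example_cds, 2))
```

If the expression construct carries an N-terminal tag, the residue numbering has to be shifted accordingly before calling such a helper.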


Table 1. Properties of selected labeling sites and comparison of FRET efficiencies.

Labeling site | Solvent accessibility factor | Number of AA-AA contacts | Xc-EFRET Cα-Cα | Xc-EFRET AV | Xc-EFRET meas. | Δ (AV − meas.) | FWHM
K14X | 0.72 | 5 | 0.92 | 0.71 ± 0.05 | 0.68 ± 0.01 | 0.03 | 0.09
E25X | 0.35 | 10 | 1.00 | 0.92 ± 0.02 | 0.82 ± 0.01 | 0.10 | 0.08
K32X | 0.52 | 7 | 1.00 | 0.96 ± 0.02 | 0.84 ± 0.01 | 0.12 | 0.10
G47X | 0.48 | 7 | 0.96 | 0.72 ± 0.05 | 0.70 ± 0.01 | 0.02 | 0.11
S83X | 0.27 | 8 | 0.99 | 0.87 ± 0.03 | 0.69 ± 0.01 | 0.18 | 0.13

Number of AA-AA contacts: number of contacts of the replaced amino acid (AA) with other amino acids in the protein; Xc-EFRET Cα-Cα: center of the FRET distribution calculated from the distance between the Cα atoms at the attachment sites of the dyes; Xc-EFRET AV: center of the FRET distribution calculated from accessible volumes (AV) with FPS [6]; Xc-EFRET meas.: center of the measured FRET distribution.

To optimize the incorporation efficiency of PrK, we systematically investigated the influence of the PrK concentration, the media composition and the time delay between induction of the amber suppression system and of the target protein. The results uncovered a strong dependence of the expression yield of functionalized protein on all three parameters (data not shown). To obtain conditions yielding the highest incorporation efficiency, the proteins were expressed in rich media (Terrific Broth) supplemented with 2.5 mM PrK, while the time delay between induction of the amber suppression system and the L7Ae protein was optimized. Figure 1B shows that under these optimized conditions there is significant expression of full-length target protein.

Protein identity and yield. Sequential purification by affinity chromatography, SEC and removal of nucleic acids yielded highly pure protein. Incorporation of PrK into L7Ae was confirmed by SDS-PAGE and mass spectrometry (MS) (Figure 1, C-D, STable 1). The MS data also revealed an incorporation efficiency of PrK of > 90% for four of the labeling sites (K14X, E25X, G47X, S83X, SFigure 2), while position K32X showed a reduced incorporation efficiency of only ~ 60%. MS results suggest that this is due to misincorporation of lysine (SFigure 2). As this degree of misincorporation is not seen for the other constructs, we conclude that the local sequence context within the mRNA and/or translation speed may increase misincorporation, as has been suggested before [27, 33]. Under optimized expression and purification conditions, the yield of isolated pure protein for all five L7Ae mutants ranged between 20-40 mg L⁻¹ of culture, which represents ~60-80% of the yield of the wild type protein.

Structural and functional integrity of L7Ae mutants. To validate the folding of purified L7Ae containing PrK, circular dichroism (CD) spectroscopy was performed to determine folding and secondary structure properties of the proteins in solution. This analysis revealed no significant changes in the spectra between the wild type L7Ae [34] and any of the five L7Ae mutant proteins (Figure 2). These findings suggest no significant perturbation of the L7Ae structure due to incorporation of PrK, and the same overall structure as the wild type protein. In agreement with published data [25], the spectra of L7Ae WT and all the mutants correctly report an overall alpha-helical fold [34]. The absence of any significant signals between 250-270 nm (Figure 2, inset) strongly suggests that the protein is free of contaminating RNA [35, 36].

To assess the functionality of the L7Ae mutant proteins, we investigated their interaction with a cognate sRNA construct by means of microscale thermophoresis (MST) [37-40]. These experiments were performed under native conditions after reconstitution of the RNP complex in the presence of 50 nM Cy5-U26 sRNA with a range of protein concentrations (0.6 nM to 40 µM). The data revealed that, compared to the wild type protein, incorporation of PrK at any of the five positions within L7Ae has only a minor effect on the RNA binding capacity of L7Ae, as indicated by an average 1.6-fold increase in the apparent KD (Table 2), which is in good agreement with previously published data for the L7Ae-sRNA interaction [41]. The residue with the most amino acid contacts within the protein structure (E25) exhibits the highest increase in the apparent KD (2.3-fold), suggesting that PrK does not functionally replace all native amino acid contacts and that the local structure around this position is disrupted.
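To illustrate what a shift in the apparent KD means for a titration at a fixed RNA concentration of 50 nM, the sketch below evaluates a 1:1 binding model that accounts for ligand depletion. This is a commonly used model for such titrations and is an assumption here; the fitting model actually applied to the MST data is not specified in this section. The KD values are the apparent values for WT and E25X from Table 2.

```python
import math


def fraction_bound(protein_nM: float, rna_nM: float, kd_nM: float) -> float:
    """Fraction of RNA bound in a 1:1 equilibrium (quadratic solution)."""
    if protein_nM < 0 or rna_nM <= 0 or kd_nM <= 0:
        raise ValueError("concentrations must be non-negative, RNA and KD positive")
    s = protein_nM + rna_nM + kd_nM
    return (s - math.sqrt(s * s - 4.0 * protein_nM * rna_nM)) / (2.0 * rna_nM)


RNA_NM = 50.0                                  # labeled sRNA concentration
KD_APP_NM = {"WT": 226.0, "E25X": 521.0}       # apparent KD values, Table 2

for protein_nM in (10.0, 100.0, 1000.0, 10000.0):
    row = ", ".join(
        f"{name}: {fraction_bound(protein_nM, RNA_NM, kd):.2f}"
        for name, kd in KD_APP_NM.items()
    )
    print(f"[L7Ae] = {protein_nM:>7.0f} nM -> {row}")
```

At protein concentrations far above either KD, both variants approach full occupancy, which is why a roughly two-fold change in KD is of little practical consequence for reconstituting the complex at saturating protein concentrations.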

Figure 2. CD spectra of L7Ae wild type (WT) and five mutant proteins bearing PrK (“X”) at positions K14, E25, K32, G47, S83. Inset: enlarged region of 240-270 nm.


Table 2. Apparent dissociation constants (KDapp) of L7Ae variants and Cy3 fluorophore labeling yield.

Name | KDapp (nM) | Fold change | Labeling yield (%)
WT | 226 ± 45 | - | -
K14X | 388 ± 68 | 1.7 | 95
E25X | 521 ± 103 | 2.3 | 87
K32X | 304 ± 45 | 1.3 | 86
G47X | 384 ± 194 | 1.7 | 93
S83X | 198 ± 37 | 0.9 | 91

Site-specific labeling of L7Ae mutants. CuAAC was used as a bioorthogonal click reaction to selectively couple azide derivatives of Sulfo-Cy3 (Cy3), Alexa Fluor 555 or Atto 550 to the site-specifically incorporated PrK within L7Ae. Further purification was achieved via size exclusion chromatography (SEC) to quantitatively remove excess dye (SFigure 3, 4). Covalent attachment of the dyes to L7Ae was confirmed by SDS-PAGE in-gel fluorescence (Figure 1D, and SFigure 4) and MALDI-MS (STable 1). Labeling yield was determined by UV/Vis absorption spectroscopy. Overall, near-quantitative coupling was achieved for all five labeling sites (≥ 86%, Table 2).

Benchmarking of constructs in single molecule FRET experiments. We validated the performance of all Cy3-L7Ae labeling sites on the single molecule level in the reconstituted RNP complex, i.e. with an acceptor-labeled RNA. smFRET measurements were performed on immobilized molecules, using a polyethylene glycol (PEG) passivated glass surface and biotin-streptavidin mediated immobilization. Imaging was performed on an objective-type TIRF microscope with an integration time of 100 ms. After recording data of several thousand molecules, the FRET efficiency distributions for each construct were plotted as histograms. Molecules with EFRET ≤ 0.2 were considered donor-only molecules and were thus removed. Fitting one or two Gaussian distributions to the resulting histograms showed that most of the constructs could be assigned a single population each. From these fits, two attributes of the smFRET constructs were obtained: the center (Xc-EFRET) and the full width at half-maximum (FWHM) of the EFRET distribution. As shown in Figure 3, only the G47X construct shows an indication of a significant additional second population. This finding can also be seen in the analysis of longer time traces (SFigure 5). While we do not anticipate a secondary mode of binding between the RNA and the protein, this may be explained by the presence of a minor overlap of the accessible volume (as derived from FPS calculations) with the sRNA (SFigure 6), rendering this construct less useful for accurate smFRET experiments. Four of the five Cy3 smFRET constructs (K14X, E25X, K32X, S83X) show one major EFRET population and a very narrow width (~ 0.1), indicating the absence of photophysical artifacts and a width limited by shot noise (Figure 3 and Table 1).

To further test our findings, we prepared Alexa Fluor 555- and Atto 550-labeled L7Ae constructs for two labeling sites (K14X and E25X). Regarding smFRET analysis, a similar observation was made for the Alexa Fluor 555 constructs, with both K14X and E25X displaying a narrow distribution and static FRET (SFigure 7, 8). The Atto 550 construct K14X showed one major population, while the E25X construct did not yield analyzable smFRET data.

Further, we calculated the Xc-EFRET of each smFRET construct based on the Cα-Cα distance of the attachment sites (Cα-Cα) and on FPS. The FRET efficiency calculated from Cα-Cα failed to predict the EFRET derived from our smFRET measurements (Table 1, and STable 2) [8, 13]. FPS modeling of the accessible volumes of all donor dye positions at L7Ae and of Cy5 at the sRNA, and its implementation into the FPS-based calculation of Xc-EFRET (Xc-EFRET AV), led to a significantly improved predictability of the smFRET data (Table 1, and STable 3).
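The histogram analysis described above (removal of donor-only molecules at EFRET ≤ 0.2, followed by extraction of a center and an FWHM) can be sketched as follows. Note two simplifications that are assumptions of this sketch: the efficiency is computed as an uncorrected proximity ratio, and the Gaussian least-squares fit is replaced by a moment-based estimate (mean and standard deviation), which is only appropriate for a single population. The data are simulated and do not represent measurements from this study.

```python
import math
import random
import statistics


def proximity_ratio(donor: float, acceptor: float) -> float:
    """Uncorrected apparent FRET efficiency I_A / (I_D + I_A)."""
    total = donor + acceptor
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return acceptor / total


def summarize_population(efficiencies, donor_only_cutoff=0.2):
    """Drop donor-only molecules, then estimate center and FWHM of one population."""
    kept = [e for e in efficiencies if e > donor_only_cutoff]
    if len(kept) < 2:
        raise ValueError("not enough molecules above the donor-only cutoff")
    center = statistics.fmean(kept)
    sigma = statistics.stdev(kept)
    fwhm = 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma  # Gaussian FWHM from sigma
    return center, fwhm, len(kept)


# Simulated per-molecule efficiencies: one FRET population plus donor-only molecules.
rng = random.Random(0)
simulated = [rng.gauss(0.68, 0.04) for _ in range(5000)]
simulated += [rng.gauss(0.05, 0.03) for _ in range(1000)]

center, fwhm, n_kept = summarize_population(simulated)
print(f"molecules kept: {n_kept}, Xc = {center:.2f}, FWHM = {fwhm:.2f}")
```

For histograms with two populations, as indicated for G47X, a proper two-Gaussian fit would be required instead of the moment estimate used here.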

Figure 3. smFRET distribution histograms of RNP complexes and absolute fluorescence quantum yield. A-E: FRET distribution of Cy3-L7Ae in complex with Cy5-sRNA. F: Absolute fluorescence quantum yield of Cy3-L7Ae. Insets show the accessible volume (AV) clouds of Cy3 (green) and Cy5 (red) within the RNP. The blue line represents the Gaussian fit. The dashed line represents the simulated Xc based on AV calculation.

As Table 1 shows, four of the smFRET constructs (E25X, K32X, G47X, S83X) showed discrepancies between the modeled and the measured data of values between 0.10-0.18. This would render many of these attachment sites useless for accurate smFRET experiments. However, the K14X construct shows very good agreement between the FPS simulation and the smFRET measurement (Xc-EFRET 0.68 ± 0.01 and Xc-EFRET AV 0.71 ± 0.05, Table 1). FPS has previously been shown to increase the accuracy of smFRET measurements [6]; in our data, however, this holds for only one of the constructs tested. We therefore hypothesize that the deviation between the simulated and the measured data is the result of photophysical perturbation of Cy3 at its attachment site, caused by the local protein environment. The same trend could be observed for both Alexa Fluor 555 and Atto 550 constructs (STable 2): whereas the K14X-Alexa Fluor 555 construct shows a small deviation of Xc-EFRET from the simulated value, E25X-Alexa Fluor 555 shows a deviation towards lower FRET efficiency. For the Atto 550 derivatives, the quality of the data was significantly decreased; whereas the K14X construct showed a reduced Xc-EFRET, E25X did not yield analyzable smFRET data. To investigate this trend, we performed an exhaustive multiparameter fluorescence analysis of the Cy3 constructs to gain a detailed view of each labeling site and to determine the photophysical basis of these differences.

Multiparameter fluorescence analysis of Cy3-L7Ae. The fluorescence properties of Cy3 depend strongly on its environment, e.g. solvent, viscosity, temperature, or attachment to a macromolecule. Some of these factors influence the spectral overlap, anisotropy, or quantum yield, e.g. by changing the rate constant of trans-cis photoisomerization [42-46]. All of these may influence the results of smFRET measurements and may therefore have a significant impact on data interpretation [19], especially when considering local effects on spectral changes in excitation or emission wavelengths [47, 48]. Fluorescence anisotropy is an important parameter that is used to study the rotational freedom of a fluorophore at its attachment site [49-51]. The mobility of a fluorophore influences the orientation of the dipoles during the FRET process, thereby significantly altering the Förster radius, which is a major potential source of uncertainty in smFRET experiments aiming at high precision [19, 50, 52].
We therefore systematically characterized the fluorescence properties of Cy3 at each labeling site by performing a multiparameter fluorescence analysis in order a) to identify differences between the photophysical properties of Cy3 at each labeling site, b) to transfer these data into computational analysis, and c) to refine the understanding of smFRET construct design and place it on a rational basis.

Spectral properties of labeled L7Ae. As evident from the data shown in Supplementary Figure 9, the effect of attachment to the protein L7Ae on the spectral properties of Cy3, Alexa Fluor 555 or Atto 550 is very small. In comparison to the free dye in solution, a negligible uniform bathochromic shift of 2-3 nm of both the absorption and the emission maximum was observable for all labeling positions analyzed.

Mobility of Cy3-L7Ae. The fluorescence anisotropy at all Cy3 labeling positions was slightly increased compared to the free dye in solution (Figure 4A). All Cy3 labeling positions were in the same range, indicating a labeling site independent decrease in mobility of the dye. However, the data did not exhibit a clear trend with respect to the labeling site. Previous studies have shown that caution should be exercised when interpreting fluorescence anisotropy data with respect to the rotational freedom of the dye, because anisotropy is insensitive to the microenvironment, which is crucial for the photophysical behavior of the dye [50].

Analysis of absolute fluorescence quantum yield to probe the microenvironment of L7Ae labels. As spectral properties and anisotropy of L7Ae-attached dyes do not explain the deviation between simulated and experimentally determined FRET efficiency, we assessed the rotational diffusion of all labeled L7Ae constructs by determining the fluorescence quantum yield (QY) [49]. In the case of Cy3, the QY is strongly dependent on the rate of trans-cis photoisomerization. Steric hindrance caused by proximity of the dye to the protein may lead to a decrease in the formation of the non-fluorescent cis-isomer, and thus result in an increased QY [42, 53]. By comparing Cy3-L7Ae constructs with the free dye in solution, information on the mobility of the dye can be gained indirectly. Absolute QYs were determined using an integrating sphere and revealed that the absolute QY of Cy3 attached to positions E25X, K32X, G47X and S83X ranged between 0.13-0.23, which is significantly higher than the QY of the free dye in solution (0.09 ± 0.007), indicating a change in mobility caused by residues in its proximity within the protein (Figure 3F). In contrast, Cy3 at position K14 had the same QY (0.09 ± 0.012) as free Cy3 in solution, which suggests a nearly unrestricted mobility of the dye (Figure 3F). For Alexa Fluor 555, the quantum yield of the K14X construct was within the range of the free dye, whereas a significant increase was observable for E25X (STable 4). For Atto 550, the quantum yield was decreased for the K14X construct and vastly increased for the E25X construct. This confirms the trend observed for Cy3, in that the quantum yield provides a valuable prediction for the validity of the smFRET experiments.
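In Förster theory, the Förster radius scales with the sixth root of the donor quantum yield when the orientation factor, refractive index and spectral overlap are held constant. The sketch below shows how large this effect is for the QY values reported above for free Cy3 (0.09) and Cy3 at E25X (0.23). The reference Förster radius and the inter-dye distance are hypothetical placeholders, and the assumption that all other factors stay constant is a simplification that need not hold for a dye interacting with the protein surface.

```python
def scaled_forster_radius(r0_ref_nm: float, qy_ref: float, qy_new: float) -> float:
    """Rescale R0 for a changed donor quantum yield: R0 ~ QY^(1/6)."""
    if r0_ref_nm <= 0 or qy_ref <= 0 or qy_new <= 0:
        raise ValueError("R0 and quantum yields must be positive")
    return r0_ref_nm * (qy_new / qy_ref) ** (1.0 / 6.0)


def fret_efficiency(distance_nm: float, r0_nm: float) -> float:
    """Point-distance FRET efficiency E = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


R0_REF_NM = 5.4      # assumed placeholder for the free-dye Foerster radius
DISTANCE_NM = 4.5    # hypothetical inter-dye distance

r0_e25x = scaled_forster_radius(R0_REF_NM, qy_ref=0.09, qy_new=0.23)
print(f"R0 (QY 0.09) = {R0_REF_NM:.2f} nm, R0 (QY 0.23) = {r0_e25x:.2f} nm")
print(f"E at {DISTANCE_NM} nm: {fret_efficiency(DISTANCE_NM, R0_REF_NM):.2f} "
      f"vs {fret_efficiency(DISTANCE_NM, r0_e25x):.2f}")
```

A QY change alone therefore shifts R0 only moderately, which illustrates why the QY is better read as an indicator of a perturbed microenvironment than as a simple correction factor.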

Figure 4. Bulk fluorescence analysis. A: Steady-state anisotropy of Cy3 and Cy3-labeled L7Ae, and of Cy5 and Cy5-labeled sRNA. Error bars represent the standard error of 20 trials. B: Fluorescence decay of Cy3-L7Ae. Fit results are summarized in Table 3.

To further analyze the underlying difference in the microenvironment of Cy3 at each labeling site, we used time-correlated single photon counting (TCSPC). TCSPC is a powerful technique that relies on pulsed laser excitation of the fluorophore at a high repetition rate and allows the determination of fluorescence lifetimes from fluorescence decays on the picosecond timescale. In general, the fluorescence lifetime is proportional to the QY, which depends on the rotational freedom of the fluorophore, and becomes longer if the rotation is restricted, for example by interaction with a protein [42, 49, 53].
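The average lifetimes listed in Table 3 can be cross-checked from the two fitted components. The sketch below assumes that the percentages given in Table 3 are fractional intensities, so that the average is the sum of each lifetime multiplied by its fraction; all numbers are copied from Table 3, and the tolerance accounts for rounding of the published values.

```python
# (lifetime in ns, fractional intensity) per component, and reported average (ns)
TABLE3_CY3_L7AE = {
    "K14X": ([(0.34, 0.64), (1.00, 0.36)], 0.58),
    "E25X": ([(0.45, 0.37), (1.77, 0.63)], 1.28),
    "K32X": ([(0.51, 0.58), (1.26, 0.42)], 0.83),
    "G47X": ([(0.44, 0.63), (1.18, 0.37)], 0.71),
    "S83X": ([(0.37, 0.66), (1.11, 0.34)], 0.62),
}


def weighted_average_lifetime(components):
    """Average lifetime as sum(f_i * tau_i) / sum(f_i)."""
    total_fraction = sum(fraction for _, fraction in components)
    if total_fraction <= 0:
        raise ValueError("fractions must sum to a positive value")
    return sum(tau * fraction for tau, fraction in components) / total_fraction


for site, (components, reported) in TABLE3_CY3_L7AE.items():
    calculated = weighted_average_lifetime(components)
    print(f"{site}: calculated {calculated:.3f} ns, reported {reported:.2f} ns")
```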


Table 3. Fluorescence lifetime and absolute fluorescence quantum yield of Cy3 and Cy3-L7Ae.

Name | Fluorescence lifetime (ns)ᵃ | Absolute fluorescence QY
free Cy3 | τ = 0.47 (100%) | 0.09 ± 0.01
L7Ae-Cy3 K14X | τ1 = 0.34 (64%), τ2 = 1.00 (36%), τav = 0.58 | 0.09 ± 0.01
L7Ae-Cy3 E25X | τ1 = 0.45 (37%), τ2 = 1.77 (63%), τav = 1.28 | 0.23 ± 0.02
L7Ae-Cy3 K32X | τ1 = 0.51 (58%), τ2 = 1.26 (42%), τav = 0.83 | 0.18 ± 0.01
L7Ae-Cy3 G47X | τ1 = 0.44 (63%), τ2 = 1.18 (37%), τav = 0.71 | 0.15 ± 0.01
L7Ae-Cy3 S83X | τ1 = 0.37 (66%), τ2 = 1.11 (34%), τav = 0.62 | 0.13 ± 0.02

To extract fluorescence lifetimes from the fluorescence decay data (Figure 4B), we fitted the minimal number of exponential terms required to correctly describe the data [54]. The presence of the sRNA had a negligible influence on the Cy3 fluorescence lifetime (SFigure 10, and STable 5). In contrast to the free dye, a satisfactory description of the Cy3-L7Ae fluorescence decays, both in the absence and in the presence of the sRNA, required two exponential terms, yielding a short-lived (0.33-0.51 ns) and a relatively long-lived (1.00-1.77 ns) species (Table 3). The lifetime of the short-lived species is comparable to that of the free dye and accounts for the main part of the fractional intensity (except at the E25X position). As we did not find contamination of the samples with free dye (SFigure 3, 4), this is consistent with fairly unrestricted dye molecules attached to the protein [55]. Further, we assume that the minor long-lived species at positions K14X, K32X, G47X and S83X represent constrained fluorophores that are, however, not devoid of trans-cis photoisomerization, which is also reflected in the absolute QY values (Table 3) [19]. In contrast, the E25X labeling position suggests a mainly isomerization-impaired fluorophore, as the dominant long-lived species exhibits a lifetime close to 2 ns (E25X: τ1 = 0.45 ns (37%), τ2 = 1.77 ns (63%)) [19, 42, 48, 50]. The finding that E25X has the highest QY of all labeling sites (Table 3, STable 4) supports these observations. The intensity-weighted fluorescence lifetimes (τav) scale with the QY, which suggests a collisional quenching process for Cy3-L7Ae [49] (Table 3). These results are comparable with previously reported data for the fluorescence behavior of Cy3 attached to DNA [43, 50, 52, 56-60]. Collectively, only the microenvironment analysis of Cy3 could uncover the differences between the labeling positions. It shows that the position with the highest solvent accessibility, K14 (Table 1), has highly similar photophysical properties with respect to spectral properties and QY as the free dye in solution (Table 3, SFigure 9), indicating unchanged photophysics upon labeling. Analysis of two attachment sites labeled with Alexa Fluor 555 or Atto 550 further confirms this conclusion (SFigure 9, STable 4).

Predictability of smFRET construct accuracy. To generalize the identification of suitable labeling sites for smFRET constructs, we deconvoluted our experimental and computational dataset with respect to the predictability of Xc-EFRET. Most importantly, our results confirm that the FPS-based approach leads to a more accurate simulation of Xc-EFRET. To refine the simulation data, we implemented the Förster radii of each smFRET construct, calculated on the basis of the multiparameter fluorescence analysis, into the FPS-based simulation. With the exception of K14X, this does not significantly improve the predictability of Xc-EFRET (STable 3). Comparison of the multiparameter analysis of the Cy3-L7Ae constructs revealed one relevant difference between the K14 labeling position and the other four labeling positions, namely the largely unaltered photophysical properties of Cy3 at K14 compared to the free dye in solution (Tables 1-3, SFigure 9). Moreover, calculations showed the highest solvent accessibility for the K14 position compared to the other labeling positions, and revealed a mainly linear correlation between the discrepancy of Xc-EFRET AV and the solvent accessibility factor (Table 1). These observations suggest that fewer differences in photophysical properties between the coupled dye and the free dye in solution lead to a significantly smaller deviation in the predicted Xc-EFRET AV value.
This correlation corroborates the previous assumption that the uncertainties in the Xc-EFRET AV prediction result from changes in the photophysical properties of the dye rather than from systematic errors in distance assumptions between the FRET pair. This is again corroborated by the analysis of constructs with Alexa Fluor 555 or Atto 550 donor dyes. In summary, we find that the solvent accessibility of the replaced residue correlates with unaltered photophysical dye properties. Together with FPS modeling of the AV of a FRET pair and its inclusion in the Xc-EFRET calculation, this leads to the most accurate simulation of the measured data and paves the way towards easy identification and robust predictability of accurate smFRET constructs.

A novel workflow to improve smFRET construct accuracy. Based on this detailed exemplary study, we derived a general workflow (Figure 5) to design and validate smFRET constructs for RNP complexes, which may well be transferable to protein- or RNA-only constructs: Starting from existing structural data, we suggest implementing tools for both solvent accessibility calculation and accessible volume simulation prior to deciding on labeling sites, as both significantly improve accuracy. Optimization of protein expression and labeling using site-specific introduction of a clickable non-natural amino acid allows the inter-dye distance to be defined more precisely by vastly increasing the number of potential labeling sites.

Figure 5: Proposed workflow. To obtain accurate smFRET constructs, we suggest performing a thorough computational analysis comprising both a structural assessment of possible labeling-site amino acids and a simulation of accessible volumes for the FRET dyes. Following expression, labeling, and purification of the initial constructs, quality control measures should be implemented. Determination of the quantum yield of the donor fluorophore serves as an important measure to immediately assess smFRET construct quality.

Consideration of solvent accessibility also improves labeling. However, neither computational nor biochemical approaches remove the need for thorough control experiments: structure and function of the labeled protein as well as the photophysical properties of the fluorophores should be assessed prior to smFRET analysis. Here, we find that quantum yield determination of the coupled dye provides an inexpensive, robust means to obtain an excellent initial estimate of the suitability of a given labeling site.

Conclusion

This study shows how, at each step of biomolecular smFRET studies, accuracy can immediately be improved by three measures: First, in silico analysis and simulation of potential fluorophore attachment sites yields a superior measure for their usability and validity. Second, this allows these constructs to be generated by establishing a systematic optimization for expression, labeling, and functional integrity. Third, experimentally straightforward determination of quantum yield then allows for assessment of FRET efficiency deviations from expected and/or calculated values. Together, these steps significantly accelerate and improve generation and validation of constructs for single molecule FRET measurements, including RNP complexes.

Material and Methods

AV calculation and Xc-EFRET prediction. The accessible volume calculation and Xc-EFRET prediction were performed with the FRET positioning and screening software (FPS)61. For the Sulfo-Cy3 dye, a linker length of 23.1 Å, a linker width of 4.5 Å and dye radii of 6.8 Å, 3.0 Å and 1.5 Å, respectively, were used. Sulfo-Cy5 was characterized by a linker length of 22.3 Å, a linker width of 4.5 Å and dye radii of 11.0 Å, 3.0 Å and 1.5 Å, respectively. The corresponding measures for Alexa Fluor 555 were 23 Å, 4.5 Å, 6.8 Å, 4.5 Å, 1.5 Å and for Atto 550 24 Å, 4.5 Å, 8.1 Å, 3.4 Å, 2.1 Å (x, y, z dimensions of the dyes, SFigure 11). The attachment
points for each dye position were identified based on the X-ray structure (PDB entry: 3HJW)25.

Introduction of an amber stop codon. The amber stop codon (TAG) was introduced at the desired positions within L7Ae using the QuikChange Site-directed Mutagenesis Kit (Agilent Genomics). DNA oligonucleotides were designed according to the manufacturer’s protocol and purchased from Eurofins Genomics, Munich.

Synthesis of PrK. PrK was synthesized as described previously31 and analyzed by NMR.

Expression and purification of L7Ae mutants. L7Ae with incorporated PrK was expressed in E. coli BL21 (DE3) cells. Cells harboring pET15a-XL7Ae (encodes amber stop codon-containing L7Ae from Pyrococcus furiosus with 6xHis at the N-terminus; X denotes one of the mutation sites: K14, E25, K32, G47, S83) and the pEVOL plasmid31 (encodes the wild type pyrrolysyl tRNA/pyrrolysyl-tRNA synthetase pair from Methanosarcina mazei) were inoculated into terrific broth medium supplemented with ampicillin (100 µg mL-1) and chloramphenicol (34 µg mL-1) (OD600 ~ 0.1). After 1 h of incubation (37°C, 160 rpm), the culture was supplemented with 2.5 mM PrK and incubated for an additional hour before the pEVOL system was induced by addition of arabinose to a final concentration of 0.2% (w/v). Expression of L7Ae was induced with 0.1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) five hours after induction of the amber suppression system. After 18-20 h of induction, the cells were harvested (10,000 g, 4°C, 10 min) and suspended in ~ 30 mL buffer a (50 mM Na2HPO4 (pH 6.0), 1 M NaCl, 0.1 mM EDTA, 10 mM imidazole, 14 mM 2-mercaptoethanol, supplemented with EDTA-free protease inhibitor cocktail (Roche) and 100 µg RNase A (Qiagen)). Lysis was achieved by sonication on ice. The cell lysate was clarified by centrifugation (10,000 g, 4°C, 30 min); the supernatant was supplemented with 20 µg RNase A and incubated at 70°C for 20 min.
After centrifugation (10,000 g, 4°C, 15 min), the supernatant was loaded onto a Ni2+-NTA column (HisTrap HP, GE Healthcare) pre-equilibrated with buffer a at a flow rate of 1 mL min-1. The protein was eluted with a linear gradient of buffer b (500 mM Na2HPO4 (pH 6.0), 1 M NaCl, 0.1 mM EDTA, 10 mM imidazole, 14 mM 2-mercaptoethanol) on an Äkta purifier 900 system. Fractions that contained protein (absorption at 280 nm) were pooled and analyzed by 15% (w/v) SDS-PAGE. To remove copurified RNA, a second RNase A treatment followed (40 µg RNase A, 70°C, 30 min). The RNase A was removed by a second Ni2+-NTA column purification as described above. The L7Ae-containing fractions were combined, concentrated (VivaSpin 20, 3 kDa MWCO PES, Sartorius) and further purified by size exclusion chromatography (SEC) on a Superdex 75 10/300 GL column (GE Healthcare) at a flow rate of 0.4 mL min-1 (buffer c: 50 mM HEPES pH 7.0, 200 mM KCl, 5% (v/v) glycerol). Purified L7Ae was analyzed by SDS-PAGE and mass spectrometry. The protein concentration was determined using a NanoDrop One UV/Vis spectrophotometer (Thermo Fisher Scientific) and an extinction coefficient (ε) of 4470 M-1 cm-1 at 280 nm. The yield was up to 40 mg L-1.
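The concentration determination described above follows the Beer-Lambert law, c = A / (ε · l). The short Python sketch below illustrates that arithmetic. Only the extinction coefficient is taken from the text; the absorbance reading, the 1 cm path length and the function name `molar_concentration` are illustrative assumptions, not values or tools used by the authors.

```python
# Beer-Lambert estimate of the molar protein concentration from A280.
# Only EPSILON_L7AE_280 comes from the Methods; everything else is a
# hypothetical example input.

EPSILON_L7AE_280 = 4470.0  # M^-1 cm^-1, extinction coefficient stated in the text


def molar_concentration(absorbance: float,
                        epsilon: float = EPSILON_L7AE_280,
                        path_cm: float = 1.0) -> float:
    """Return the concentration in mol/L using c = A / (epsilon * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


a280 = 0.2235  # hypothetical absorbance reading at 280 nm (1 cm path length)
conc_micromolar = molar_concentration(a280) * 1e6
print(f"Estimated concentration: {conc_micromolar:.1f} uM")
```

Because the extinction coefficient is small, modest absorbance readings already correspond to tens of micromolar protein, which is in the range used for the labeling reaction (50 µM).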

Expression and purification of L7Ae wild type protein. L7Ae wild type protein was expressed in the same cells and medium as the L7Ae mutants, harboring the pET15a-L7Ae plasmid. Expression of the protein was induced by adding 0.1 mM IPTG at OD600 ~ 4-6. After 3 h of induction, the cells were treated as described above. The yield was up to 60 mg L-1.

Circular dichroism (CD) spectrometry. CD spectra were recorded over a range of 190-300 nm on a CD spectropolarimeter (model J-810, Jasco) using a 0.1 cm path length quartz cuvette in a Peltier temperature-controlled cell holder (model PTC-423S, Jasco). The spectra were acquired with a scanning speed of 50 nm min-1 at a 1 nm data interval. Protein concentration was adjusted to 20 µM in CD measurement buffer (20 mM Na2HPO4 (pH 7.0), 20 mM NaCl). Each spectrum was baseline corrected and represents an average of 10 scans.

Protein labeling via Cu(I) catalyzed alkyne-azide cycloaddition (CuAAC) and purification. Protein labeling was performed with 50 µM purified protein carrying the site-specifically incorporated PrK and 150 µM fluorophore-azide (Sulfo-Cy3: Jena Bioscience, Alexa Fluor 555: Thermo Fisher Scientific, Atto 550: Attotec) in buffer c supplemented with 500 µM CuSO4, 2.5 mM Tris(3-hydroxypropyltriazolylmethyl)amine (THPTA, Sigma Aldrich), 5 mM aminoguanidine and 5 mM sodium ascorbate (freshly added) at 37°C for 6 h. SEC was performed to remove unreacted dye, and quantitative dye removal was verified by SDS gel electrophoresis (SFigure 3, 4); peaks that showed both protein and dye absorption were pooled and analyzed by SDS-PAGE and mass spectrometry. The labeling yield was determined by UV/Vis spectroscopy by means of equation (1), using the area under the curve (AUC) of the peaks for the coupled dye (AUC_Cy3-Protein) and the free dye (AUC_Cy3) and the concentrations of dye (c_0,Cy3) and protein (c_0,Protein). The peaks were integrated with Origin software (v8.0, OriginLab Corp.).

$$\frac{AUC_{Cy3-Protein}}{AUC_{Cy3-Protein} + AUC_{Cy3}} \cdot \frac{c_{0,Cy3}}{c_{0,Protein}} \cdot 100\% = \text{labeling yield}/\% \tag{1}$$

Emission spectra. The emission spectra were recorded in 10 mm x 4 mm quartz cuvettes (29-F/Q/10, Starna) on a JASCO FP 8500 spectrometer (Jasco). The spectra represent an average of 10 scans and were corrected as described previously49.

Absorption spectra. The absorption spectra were recorded in 10 mm x 4 mm quartz cuvettes (29-F/Q/10, Starna) on a JASCO V 650 spectrometer (Jasco). The baseline-corrected spectra represent an average of 10 scans.

Steady-state fluorescence anisotropy. The measurements of steady-state fluorescence anisotropy were performed as described previously51. Briefly, the concentration of Cy3-labeled protein was adjusted to 5 nM in psi-buffer (100 mM Tris-HCl (pH 8.0), 100 mM NH4OAc, 5 mM MgCl2) and the measurements were performed at 22°C on a FluoroMax-4 spectrophotometer (Horiba Scientific) in L-format geometry. The integration time was 100 ms, and the calculated fluorescence anisotropy value (r) represents an average of 20 scans at an excitation/emission wavelength of 525 nm/565 nm with an excitation/emission bandwidth of 7 nm.
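For readers who want to reproduce the anisotropy value r from polarized intensities, the following Python sketch applies the standard textbook expression for L-format measurements, r = (I_VV - G·I_VH) / (I_VV + 2·G·I_VH), with G = I_HV / I_HH. The text only states that r was calculated as described previously, so the use of this particular formula, the function names and the example intensities are assumptions made for illustration.

```python
from statistics import mean


def g_factor(i_hv: float, i_hh: float) -> float:
    """Instrument correction factor G = I_HV / I_HH (L-format geometry)."""
    return i_hv / i_hh


def anisotropy(i_vv: float, i_vh: float, g: float = 1.0) -> float:
    """Steady-state anisotropy r = (I_VV - G*I_VH) / (I_VV + 2*G*I_VH)."""
    denominator = i_vv + 2.0 * g * i_vh
    if denominator <= 0:
        raise ValueError("total intensity must be positive")
    return (i_vv - g * i_vh) / denominator


def mean_anisotropy(scans, g: float = 1.0) -> float:
    """Average r over repeated scans given as (I_VV, I_VH) pairs."""
    return mean(anisotropy(i_vv, i_vh, g) for i_vv, i_vh in scans)


# Hypothetical intensities (arbitrary units) for three repeated scans.
example_scans = [(1500.0, 900.0), (1480.0, 910.0), (1510.0, 895.0)]
g = g_factor(i_hv=1000.0, i_hh=1050.0)
print(f"G = {g:.3f}, mean r = {mean_anisotropy(example_scans, g):.3f}")
```

Averaging r over scans, as in `mean_anisotropy`, mirrors the statement that the reported value represents an average of 20 scans; averaging the intensities first would be an equally plausible reading.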

Absolute quantum yield determination. For absolute quantum yield determination, a JASCO FP 8500 spectrometer (Jasco) equipped with a 100 mm integrating sphere (ILF-835, JASCO) was used. Sample and solvent were prepared in identical 10 mm x 4 mm quartz cuvettes (29-F/Q/10, Starna) containing psi-buffer. The absorbance of the sample at 500 nm (excitation wavelength) was kept below 0.05, and the measurements were performed at room temperature.

Time-resolved fluorescence measurements. The fluorescence lifetimes were measured with a partly home-built time-correlated single photon counting (TCSPC) setup as described previously62. For excitation, a mode-locked titanium-doped sapphire (Ti:Sa) laser (Tsunami 3941-X3BB, Spectra-Physics) was pumped by a 10 W continuous wave diode-pumped solid state laser (Millennia eV, Spectra-Physics, 532 nm). The Ti:Sa laser was tuned to 950 nm at a repetition rate of 80 MHz. An acousto-optic modulator reduced the repetition rate to 8 MHz, and the excitation wavelength of 475 nm was obtained by second harmonic generation (SHG) in a BBO crystal (frequency doubler and pulse selector, Model 3980, Spectra-Physics). Excitation pulses of about 0.1 nJ at 475 nm were applied to the sample. The sample was prepared in a 10 mm x 4 mm quartz cuvette (29-F/Q/10, Starna) at a fixed temperature of 22°C. Emission filters (OG515, OG530, Schott AG) suppressed excitation stray light. The instrument response function (IRF, FWHM 200 ps) was obtained without emission filters using a TiO2 suspension as scattering sample. For single-photon detection, a photomultiplier tube (PMT, PMA-C 182-M, PicoQuant) and a TimeHarp 260 PICO Single PCIe card (PicoQuant) were used. Multi-exponential fitting was carried out with FluoFit Pro 4.6 (PicoQuant)54. The sample was diluted to ~ 1 µM in psi-buffer. For the measurements in a complex with the sRNA, the sample was reconstituted by incubation at 70°C for 5 min. Intensity-weighted average lifetimes (τav) were calculated from the fitted amplitudes (α_i) and time constants (τ_i) according to equation (2)63.

$$\tau_{av} = \frac{\sum_i \alpha_i \tau_i^2}{\sum_i \alpha_i \tau_i} \tag{2}$$
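As a minimal sketch of equation (2), the Python snippet below computes the intensity-weighted average lifetime from amplitudes and time constants. It additionally assumes that the percentages listed next to the lifetimes in Table 3 are fractional intensities, f_i = α_i τ_i / Σ α_j τ_j, in which case equation (2) reduces to τav = Σ f_i τ_i. The function names are illustrative and not part of FluoFit.

```python
def intensity_weighted_lifetime(amplitudes, lifetimes):
    """Equation (2): tau_av = sum(a_i * tau_i**2) / sum(a_i * tau_i)."""
    numerator = sum(a * t * t for a, t in zip(amplitudes, lifetimes))
    denominator = sum(a * t for a, t in zip(amplitudes, lifetimes))
    return numerator / denominator


def lifetime_from_fractions(fractions, lifetimes):
    """Same quantity expressed with fractional intensities f_i."""
    return sum(f * t for f, t in zip(fractions, lifetimes)) / sum(fractions)


def amplitudes_from_fractions(fractions, lifetimes):
    """Recover relative amplitudes, a_i proportional to f_i / tau_i."""
    return [f / t for f, t in zip(fractions, lifetimes)]


# Lifetimes (ns) and percentages for K32X as listed in Table 3,
# with the percentages interpreted as fractional intensities (assumption).
k32x_taus = [0.51, 1.26]
k32x_fractions = [0.58, 0.42]

tau_av = lifetime_from_fractions(k32x_fractions, k32x_taus)
amps = amplitudes_from_fractions(k32x_fractions, k32x_taus)
print(f"K32X tau_av from fractions:  {tau_av:.3f} ns")
print(f"K32X tau_av from amplitudes: {intensity_weighted_lifetime(amps, k32x_taus):.3f} ns")
```

Under this interpretation the sketch agrees with the tabulated τav values to within rounding, which supports reading the percentages as fractional intensities rather than as amplitudes.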

RNA labeling and DNA-splinted RNA ligation. DNA-splinted ligation after RNA labeling was performed as described previously64. Briefly, the RNA oligonucleotides (oligo 1: 5’-GGG CCA CGG AAA CCG CGC GCG GUG AUC AAU-3’) were purchased in their 2’-ACE protected form from Dharmacon, resuspended in H2O and ethanol precipitated. The amine-reactive dye (Dye Packs Cy5, GE Healthcare) was resuspended in 50% (v/v) DMSO, and the coupling reaction was carried out in the presence of 30 nmol of oligo 1 for 90 min at room temperature. Following ethanol precipitation, deprotection (30 min, 60°C in the provided deprotection buffer) and a further ethanol precipitation, the RNA was resuspended in HPLC buffer (0.1 M trimethylamine (pH 7.0)). HPLC purification with acetonitrile as second eluent was performed on an Äkta purifier using a C8 column (250 x 4.6 mm, Kromasil). Peaks that showed both RNA and dye absorption were pooled and ethanol precipitated. Concentrations were determined using a NanoDrop One UV spectrophotometer (Thermo Fisher Scientific). For the DNA-splinted RNA ligation, the oligonucleotides (labeled oligo 1, oligo 2: 5’-pGAG CCG CGU UCG CUC CCG UGG CCC ACA A-3’-biotin, and DNA splint: 5’-TTG TGG GCC ACG GGA GCG AAC GCG GCT CAT TGA TCA CCG CGC GCG GTT TCC GTG GCC CTA TAG TGA GTC GTA TTA-3’) were used in equimolar amounts (1 µM). The oligonucleotides were hybridized in 0.5x T4 DNA Ligase buffer (Thermo Fisher Scientific) by heating for 3 min at 85°C and slow cooling to room temperature. Samples were adjusted to 1x T4 DNA Ligase buffer, 5 units of T4 DNA Ligase (Thermo Fisher Scientific) were added, and the reaction was incubated at 37°C for 3 h. The DNA splint was digested by addition of 2 units DNase (TURBO DNase, Ambion) at 37°C for 30 min. The RNA was ethanol precipitated and the ligation products were separated by 10% (w/v) denaturing urea PAGE. The ligation product was identified by eye (colored band with more restricted migration behavior), excised, and eluted by shaking in 0.5 M ammonium acetate at room temperature for ~ 18 h. Following ethanol precipitation, the concentration was determined using a NanoDrop One UV/Vis spectrophotometer (Thermo Fisher Scientific).

Calculation of isotropic Förster radii. The Förster radius for each construct was calculated as described previously19, using the fluorescence quantum yield of each Cy3 labeling site and the overlap integral of the recorded donor emission spectra (excitation wavelength 500 nm) and acceptor absorption spectra (350-750 nm). Together with the refractive index (n = 1.35) and the orientation factor (κ2 = 2/3), the calculated isotropic Förster radii for Sulfo-Cy3 were: R0 = 61 ± 8 Å for the K14X construct, R0 = 72 ± 6 Å for the E25X construct, R0 = 68 ± 2 Å for the K32X construct, R0 = 66 ± 6 Å for the G47X construct and R0 = 62 ± 8 Å for the S83X construct.
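The Förster radius calculation outlined above can be sketched in a few lines of Python. The snippet uses the common textbook form R0 = 0.211 · (κ² · n⁻⁴ · Φ_D · J)^(1/6) in Å, with the overlap integral J in M⁻¹ cm⁻¹ nm⁴, together with E = 1 / (1 + (r/R0)⁶). Only n = 1.35 and κ² = 2/3 are taken from the text; the quantum yield, overlap integral and distance below are hypothetical inputs, and the function names are illustrative rather than the authors' implementation.

```python
REFRACTIVE_INDEX = 1.35        # value stated in the text
KAPPA_SQUARED = 2.0 / 3.0      # isotropic orientation factor stated in the text


def forster_radius_angstrom(quantum_yield: float,
                            overlap_integral: float,
                            n: float = REFRACTIVE_INDEX,
                            kappa2: float = KAPPA_SQUARED) -> float:
    """R0 in Angstrom; overlap_integral J in M^-1 cm^-1 nm^4."""
    if min(quantum_yield, overlap_integral, n, kappa2) <= 0:
        raise ValueError("all inputs must be positive")
    return 0.211 * (kappa2 * quantum_yield * overlap_integral / n ** 4) ** (1.0 / 6.0)


def fret_efficiency(distance: float, r0: float) -> float:
    """Transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    return 1.0 / (1.0 + (distance / r0) ** 6)


# Hypothetical inputs for illustration only (not measured values).
donor_qy = 0.15
overlap_j = 7.0e15   # M^-1 cm^-1 nm^4
r0 = forster_radius_angstrom(donor_qy, overlap_j)
print(f"R0 = {r0:.1f} A, E at 50 A = {fret_efficiency(50.0, r0):.2f}")
```

Because R0 scales with the sixth root of the donor quantum yield, even a doubling of the quantum yield changes R0 by only about 12%; the effect on the predicted efficiency is nevertheless largest for distances close to R0.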
For Alexa Fluor 555, the Förster radii were R0 = 62 ± 7 Å for the K14X construct and R0 = 71 ± 6 Å for the E25X construct; for Atto 550, they were R0 = 73 ± 4 Å for the K14X construct and R0 = 84 ± 5 Å for the E25X construct. The errors of R0 were calculated by Gaussian error propagation.

Microscale thermophoresis (MST). Measurements were performed on a NanoTemper Monolith NT.115 instrument using standard treated capillaries, and the data were analyzed with the MO.AffinityAnalysis v2.1.2030 software. The apparent KD was determined by fitting the data with the Hill function (n = 1) using Origin software (v8.0, OriginLab). The error represents the standard deviation of three measurements. Instrument settings were: red excitation type, MST power 80%, excitation power 40%, and temperature control at 22°C. The concentration of the Cy5-labeled sRNA was kept constant at 50 nM. Prior to the measurement, the sample was incubated for 5 min at 70°C in psi-buffer to reconstitute the RNP complex and cooled down to 22°C.

Surface passivation and cover slip preparation. Microscope slides and cover slips were cleaned by exposure to oxygen plasma for 10 min. After sonication for 5 min in methanol, cover slips were silanized with 3-triethoxysilylpropylamine (Sigma Aldrich) and 1 mM acetic acid dissolved in methanol for 20 min at room temperature. Following a washing step with H2O and drying with nitrogen, the cover slip surface was functionalized with PEG/PEG-biotin (33 mM mPEG-NHS, MW 5 kDa; 0.7 mM biotin-PEG-NHS, MW 5 kDa; NANOCS, PG1-SC-5k, PG2-BNNS-5k) in 100 mM bicarbonate buffer overnight at room temperature. Excess PEG was removed by washing with H2O, and the cover slips were dried with nitrogen. Prior to an smFRET experiment, a measurement chamber was assembled from cover slips and microscope slides with double-sided sticky tape64.

smFRET spectroscopy and data analysis. Measurements and analysis were performed as described previously51, 64. Briefly, to reconstitute the RNP complex, 300 nM Cy5-labeled (position 26), biotinylated sRNA and 900 nM Cy3-labeled protein were incubated in psi-buffer for 5 min at 70°C and immediately placed on ice. Prior to the measurement, the sample was diluted in psi-buffer to a final concentration of ~ 30-60 pmol. After preparation of the measurement slides64 and PEG passivation as described above, a measurement channel was flushed with 1 mg mL-1 biotinylated BSA (Sigma Aldrich) and incubated for 5 min. After washing with 50 µL psi-buffer, the measurement channel was flushed with 10-20 µL of the diluted sample and incubated for 2 min. The sample flushing volume was adjusted to yield 200-350 molecules per field. After this, the buffer in the measurement channel was exchanged for psi-buffer supplemented with an oxygen scavenging system (10% (w/w) glucose, 14 U mL-1 glucose oxidase (Sigma Aldrich), 1000 U mL-1 catalase (Sigma Aldrich), Trolox (salt, saturated, Carl Roth)). All measurements were performed at 22°C with 532 nm laser excitation on an objective-type spinning-spot total internal reflection microscopy setup with an EMCCD camera (iXon, Andor Technology) at 100 ms integration time65.
For histograms, the first 20 frames (2 s) of up to 20 movies were analyzed as described previously64. From the same samples, single molecule traces were obtained from movies of 2 min to check for single-step photobleaching and possible FRET dynamics. Only FRET efficiencies >0.2 were included in the Gaussian distribution fitting of the histogram data with Origin software (v8.0, OriginLab).

Mass spectrometry. MALDI-TOF mass spectrometry was performed by the mass spectrometry service of Goethe University Frankfurt. Prior to the measurement, the proteins were desalted according to the manufacturer's protocol using C18 resin pipet tips (ZipTip, Millipore).

Acknowledgements

This work has been supported by the DFG, CRC “Molecular Principles of RNA-based regulation”. M.H. and J.W. are members of the Cluster of Excellence “Macromolecular complexes in action”. The authors would like to thank H. Schwalbe for constant support, and M. Heilemann for access to smFRET microscopy. pEVOL plasmids were a gift from E. Lemke; the expression plasmid for L7Ae was a gift from H. Li.

Supporting Information

Supporting Figure 1: Molecular structure of propargyllysine (PrK); Supporting Figure 2 and Supporting Table 1: Mass spectrometry of L7Ae variants; Supporting Figure 3: Analysis of unspecific binding of free Sulfo-Cy3 azide to L7Ae under denaturing conditions by Tricine-SDS-PAGE before TCSPC; Supporting Figure 4: Analysis of L7Ae labeling with Alexa Fluor 555/Atto 550 by SDS-PAGE; Supporting Figure 5: Representative smFRET time traces of RNP complexes labelled with Sulfo-Cy3; Supporting Figure 6: Cartoon representation of the steric clash of Cy3-L7Ae at position 47; Supporting Figure 7: FRET distribution histograms of smFRET RNP complexes; Supporting Figure 8: Representative smFRET time traces of smFRET RNP complexes labelled with Alexa Fluor 555 and Atto 550; Supporting Figure 9: Spectral properties of free and protein-coupled donor dyes; Supporting Figure 10 and Supporting Table 4: Fluorescence decays of L7Ae-Cy3; Supporting Figure 11: Molecular structure of Alexa Fluor 555 and Atto 550; Supporting Table 2: Comparison of FRET efficiencies of Alexa Fluor 555 and Atto 550 attached to L7Ae; Supporting Table 3: Distance and Xc-EFRET calculation of Cy3-Cy5 attachment sites within the RNP; Supporting Table 5: Fluorescence lifetime of Cy3-L7Ae in presence and absence of sRNA. This material is available free of charge via the internet at http://pubs.acs.org.

References

(1) Hoskins, A. A., Friedman, L. J., Gallagher, S. S., Crawford, D. J., Anderson, E. G., Wombacher, R., Ramirez, N., Cornish, V. W., Gelles, J., and Moore, M. J. (2011) Ordered and Dynamic Assembly of Single Spliceosomes, Science 331, 1289-1295.
(2) Munro, J. B., Wasserman, M. R., Altman, R. B., Wang, L., and Blanchard, S. C. (2010) Correlated conformational events in EF-G and the ribosome regulate translocation, Nat. Struct. Mol. Biol. 17, 1470.
(3) Wang, L., Pulk, A., Wasserman, M. R., Feldman, M. B., Altman, R. B., Cate, J. H. D., and Blanchard, S. C. (2012) Allosteric control of the ribosome by small-molecule antibiotics, Nat. Struct. Mol. Biol. 19, 957.
(4) Nagy, J., Grohmann, D., Cheung, A. C., Schulz, S., Smollett, K., Werner, F., and Michaelis, J. (2015) Complete architecture of the archaeal RNA polymerase open complex from single-molecule FRET and NPS, Nat. Commun. 6, 6161.
(5) Schulz, S., Gietl, A., Smollett, K., Tinnefeld, P., Werner, F., and Grohmann, D. (2016) TFE and Spt4/5 open and close the RNA polymerase clamp during the transcription cycle, Proc. Natl. Acad. Sci. U. S. A. 113, E1816-1825.
(6) Kalinin, S., Peulen, T., Sindbert, S., Rothwell, P. J., Berger, S., Restle, T., Goody, R. S., Gohlke, H., and Seidel, C. A. (2012) A toolkit and benchmark study for FRET-restrained high-precision structural modeling, Nat. Methods 9, 1218-1225.
(7) Förster, T. (1948) Zwischenmolekulare Energiewanderung und Fluoreszenz, Annalen der Physik 437, 55-75.
(8) Roy, R., Hohng, S., and Ha, T. (2008) A practical guide to single-molecule FRET, Nat. Methods 5, 507-516.
(9) Seo, M.-H., Lee, T.-S., Kim, E., Cho, Y. L., Park, H.-S., Yoon, T.-Y., and Kim, H.-S. (2011) Efficient Single-Molecule Fluorescence Resonance Energy Transfer Analysis by Site-Specific Dual-Labeling of Protein Using an Unnatural Amino Acid, Anal. Chem. 83, 8849-8854.
(10) Lang, S., Spratt, D. E., Guillemette, J. G., and Palmer, M. (2005) Dual-targeted labeling of proteins using cysteine and selenomethionine residues, Anal. Biochem. 342, 271-279.
(11) Ratner, V., Kahana, E., Eichler, M., and Haas, E. (2002) A general strategy for site-specific double labeling of globular proteins for kinetic FRET studies, Bioconjugate Chem. 13, 1163-1170.
(12) Kim, Y., Ho, S. O., Gassman, N. R., Korlann, Y., Landorf, E. V., Collart, F. R., and Weiss, S. (2008) Efficient Site-Specific Labeling of Proteins via Cysteines, Bioconjugate Chem. 19, 786-791.
(13) Sun, Y., Shopova, S. I., Wu, C.-S., Arnold, S., and Fan, X. (2010) Bioinspired optofluidic FRET lasers via DNA scaffolds, Proc. Natl. Acad. Sci. U. S. A. 107, 16039-16042.
(14) Mitchell, A. L., Addy, P. S., Chin, M. A., and Chatterjee, A. (2017) A Unique Genetically Encoded FRET Pair in Mammalian Cells, ChemBioChem 18, 511-514.
(15) Chin, J. W. (2017) Expanding and reprogramming the genetic code, Nature 550, 53-60.
(16) Ryu, Y., and Schultz, P. G. (2006) Efficient incorporation of unnatural amino acids into proteins in Escherichia coli, Nat. Methods 3, 263-265.
(17) Ellman, J., Mendel, D., and Schultz, P. (1992) Site-specific incorporation of novel backbone structures into proteins, Science 255, 197-200.
(18) Wang, K., Schmied, W. H., and Chin, J. W. (2012) Reprogramming the genetic code: from triplet to quadruplet codes, Angew. Chem., Int. Ed. Engl. 51, 2288-2297.
(19) Iqbal, A., Arslan, S., Okumus, B., Wilson, T. J., Giraud, G., Norman, D. G., Ha, T., and Lilley, D. M. (2008) Orientation dependence in fluorescent energy transfer between Cy3 and Cy5 terminally attached to double-stranded nucleic acids, Proc. Natl. Acad. Sci. U. S. A. 105, 11176-11181.
(20) Nagy, J., Grohmann, D., Cheung, A. C. M., Schulz, S., Smollett, K., Werner, F., and Michaelis, J. (2015) Complete architecture of the archaeal RNA polymerase open complex from single-molecule FRET and NPS, Nat. Commun. 6, 6161.
(21) Sindbert, S., Kalinin, S., Nguyen, H., Kienzler, A., Clima, L., Bannwarth, W., Appel, B., Muller, S., and Seidel, C. A. (2011) Accurate distance determination of nucleic acids via Förster resonance energy transfer: implications of dye linker length and rigidity, J. Am. Chem. Soc. 133, 2463-2480.
(22) Nagy, J., Eilert, T., and Michaelis, J. (2018) Precision and accuracy in smFRET based structural studies-A benchmark study of the Fast-NanoPositioning System, J. Chem. Phys. 148, 123308.
(23) Wang, J., Fessl, T., Schroeder, K. T., Ouellet, J., Liu, Y., Freeman, A. D., and Lilley, D. M. (2012) Single-Molecule Observation of the Induction of k-Turn RNA Structure on Binding L7Ae Protein, Biophys. J. 103, 2541-2548.
(24) Rozhdestvensky, T. S., Tang, T. H., Tchirkova, I. V., Brosius, J., Bachellerie, J.-P., and Hüttenhofer, A. (2003) Binding of L7Ae protein to the K-turn of archaeal snoRNAs: a shared RNA binding motif for C/D and H/ACA box snoRNAs in Archaea, Nucleic Acids Res. 31, 869-877.
(25) Liang, B., Zhou, J., Kahen, E., Terns, R. M., Terns, M. P., and Li, H. (2009) Structure of a functional ribonucleoprotein pseudouridine synthase bound to a substrate RNA, Nat. Struct. Mol. Biol. 16, 740-746.
(26) Esque, J., Leonard, S., de Brevern, A. G., and Oguey, C. (2013) VLDP web server: a powerful geometric tool for analysing protein structures in their environment, Nucleic Acids Res. 41, W373-378.
(27) Wan, W., Tharp, J. M., and Liu, W. R. (2014) Pyrrolysyl-tRNA synthetase: an ordinary enzyme but an outstanding genetic code expansion tool, Biochim. Biophys. Acta 1844, 1059-1070.
(28) Crnkovic, A., Suzuki, T., Soll, D., and Reynolds, N. M. (2016) Pyrrolysyl-tRNA synthetase, an aminoacyl-tRNA synthetase for genetic code expansion, Croat. Chem. Acta 89, 163-174.
(29) Nikic, I., and Lemke, E. A. (2015) Genetic code expansion enabled site-specific dual-color protein labeling: superresolution microscopy and beyond, Curr. Opin. Chem. Biol. 28, 164-173.
(30) Milles, S., Tyagi, S., Banterle, N., Koehler, C., VanDelinder, V., Plass, T., Neal, A. P., and Lemke, E. A. (2012) Click Strategies for Single-Molecule Protein Fluorescence, J. Am. Chem. Soc. 134, 5187-5195.
(31) Milles, S., Tyagi, S., Banterle, N., Koehler, C., VanDelinder, V., Plass, T., Neal, A. P., and Lemke, E. A. (2012) Click strategies for single-molecule protein fluorescence, J. Am. Chem. Soc. 134, 5187-5195.
(32) Presolski, S. I., Hong, V. P., and Finn, M. G. (2011) Copper-Catalyzed Azide-Alkyne Click Chemistry for Bioconjugation, Curr. Protoc. Chem. Biol. 3, 153-162.
(33) Pott, M., Schmidt, M. J., and Summerer, D. (2014) Evolved sequence contexts for highly efficient amber suppression with noncanonical amino acids, ACS Chem. Biol. 9, 2815-2822.
(34) Greenfield, N. J. (2006) Using circular dichroism spectra to estimate protein secondary structure, Nat. Protoc. 1, 2876-2890.
(35) Kypr, J., Kejnovska, I., Bednářová, K., and Vorlickova, M. (2012) Circular Dichroism Spectroscopy of Nucleic Acids.
(36) Gray, D. M., Hung, S.-H., and Johnson, K. H. (1995) [3] Absorption and circular dichroism spectroscopy of nucleic acid duplexes and triplexes, Methods Enzymol., pp 19-34, Academic Press.
(37) Seidel, S. A., Dijkman, P. M., Lea, W. A., van den Bogaart, G., Jerabek-Willemsen, M., Lazic, A., Joseph, J. S., Srinivasan, P., Baaske, P., Simeonov, A., Katritch, I., Melo, F. A., Ladbury, J. E., Schreiber, G., Watts, A., Braun, D., and Duhr, S. (2013) Microscale thermophoresis quantifies biomolecular interactions under previously challenging conditions, Methods 59, 301-315.
(38) Jerabek-Willemsen, M., Wienken, C. J., Braun, D., Baaske, P., and Duhr, S. (2011) Molecular interaction studies using microscale thermophoresis, Assay Drug Dev. Technol. 9, 342-353.
(39) Mueller, A. M., Breitsprecher, D., Duhr, S., Baaske, P., Schubert, T., and Langst, G. (2017) MicroScale Thermophoresis: A Rapid and Precise Method to Quantify Protein-Nucleic Acid Interactions in Solution, Methods Mol. Biol. 1654, 151-164.
(40) Jerabek-Willemsen, M., André, T., Wanner, R., Roth, H. M., Duhr, S., Baaske, P., and Breitsprecher, D. (2014) MicroScale Thermophoresis: Interaction analysis and beyond, J. Mol. Struct. 1077, 101-113.
(41) Fourmann, J. B., Tillault, A. S., Blaud, M., Leclerc, F., Branlant, C., and Charpentier, B. (2013) Comparative study of two box H/ACA ribonucleoprotein pseudouridine-synthases: relation between conformational dynamics of the guide RNA, enzyme assembly and activity, PLoS One 8, e70313.
(42) Stennett, E. M. S., Ciuba, M. A., Lin, S., and Levitus, M. (2015) Demystifying PIFE: The Photophysics Behind the Protein-Induced Fluorescence Enhancement Phenomenon in Cy3, J. Phys. Chem. Lett. 6, 1819-1823.
(43) Sanborn, M. E., Connolly, B. K., Gurunathan, K., and Levitus, M. (2007) Fluorescence Properties and Photophysics of the Sulfoindocyanine Cy3 Linked Covalently to DNA, J. Phys. Chem. B 111, 11064-11074.
(44) Aramendia, P. F., Negri, R. M., and Roman, E. S. (1994) Temperature Dependence of Fluorescence and Photoisomerization in Symmetric Carbocyanines. Influence of Medium Viscosity and Molecular Structure, J. Phys. Chem. 98, 3165-3173.
(45) Gruber, H. J., Hahn, C. D., Kada, G., Riener, C. K., Harms, G. S., Ahrer, W., Dax, T. G., and Knaus, H.-G. (2000) Anomalous Fluorescence Enhancement of Cy3 and Cy3.5 versus Anomalous Fluorescence Loss of Cy5 and Cy7 upon Covalent Linking to IgG and Noncovalent Binding to Avidin, Bioconjugate Chem. 11, 696-704.
(46) Lerner, E., Ploetz, E., Hohlbein, J., Cordes, T., and Weiss, S. (2016) A Quantitative Theoretical Framework For Protein-Induced Fluorescence Enhancement–Förster-Type Resonance Energy Transfer (PIFE-FRET), J. Phys. Chem. B 120, 6401-6410.
(47) Morgan, T. T., Muddana, H. S., Altinoglu, E. I., Rouse, S. M., Tabakovic, A., Tabouillot, T., Russin, T. J., Shanmugavelandy, S. S., Butler, P. J., Eklund, P. C., Yun, J. K., Kester, M., and Adair, J. H. (2008) Encapsulation of organic molecules in calcium phosphate nanocomposite particles for intracellular imaging and drug delivery, Nano Lett. 8, 4108-4115.
(48) Muddana, H. S., Morgan, T. T., Adair, J. H., and Butler, P. J. (2009) Photophysics of Cy3-encapsulated calcium phosphate nanoparticles, Nano Lett. 9, 1559-1566.
(49) Lakowicz, J. (2006) Principles of Fluorescence Spectroscopy.
(50) Sanborn, M. E., Connolly, B. K., Gurunathan, K., and Levitus, M. (2007) Fluorescence properties and photophysics of the sulfoindocyanine Cy3 linked covalently to DNA, J. Phys. Chem. B 111, 11064-11074.
(51) Warhaut, S., Mertinkus, K. R., Hollthaler, P., Furtig, B., Heilemann, M., Hengesbach, M., and Schwalbe, H. (2017) Ligand-modulated folding of the full-length adenine riboswitch probed by NMR and single-molecule FRET spectroscopy, Nucleic Acids Res. 45, 5512-5522.
(52) Norman, D. G., Grainger, R. J., Uhrín, D., and Lilley, D. M. J. (2000) Location of Cyanine-3 on Double-Stranded DNA: Importance for Fluorescence Resonance Energy Transfer Studies, Biochemistry 39, 6317-6324.
(53) Lerner, E., Ploetz, E., Hohlbein, J., Cordes, T., and Weiss, S. (2016) A Quantitative Theoretical Framework For Protein-Induced Fluorescence Enhancement–Förster-Type Resonance Energy Transfer (PIFE-FRET), J. Phys. Chem. B 120, 6401-6410.
(54) Enderlein, J., and Erdmann, R. (1997) Fast fitting of multiexponential decay curves, Opt. Commun. 134, 371-378.
(55) Chibisov, A. K., Zakharova, G. V., Goerner, H., Sogulyaev, Y. A., Mushkalo, I. L., and Tolmachev, A. I. (1995) Photorelaxation Processes in Covalently Linked Indocarbocyanine and Thiacarbocyanine Dyes, J. Phys. Chem. 99, 886-893.
(56) Spiriti, J., Binder, J. K., Levitus, M., and van der Vaart, A. (2011) Cy3-DNA stacking interactions strongly depend on the identity of the terminal basepair, Biophys. J. 100, 1049-1057.
(57) Levitus, M., and Ranjit, S. (2011) Cyanine dyes in biophysical research: the photophysics of polymethine fluorescent dyes in biomolecular environments, Q. Rev. Biophys. 44, 123-151.
(58) Harvey, B. J., Perez, C., and Levitus, M. (2009) DNA sequence-dependent enhancement of Cy3 fluorescence, Photochem. Photobiol. Sci. 8, 1105-1110.
(59) Harvey, B. J., and Levitus, M. (2009) Nucleobase-specific enhancement of Cy3 fluorescence, J. Fluoresc. 19, 443-448.
(60) Stennett, E. M., Ma, N., van der Vaart, A., and Levitus, M. (2014) Photophysical and dynamical properties of doubly linked Cy3-DNA constructs, J. Phys. Chem. B 118, 152-163.

(61) Kalinin, S., Peulen, T., Sindbert, S., Rothwell, P. J., Berger, S., Restle, T., Goody, R. S., Gohlke, H., and Seidel, C. A. M. (2012) A toolkit and benchmark study for FRET-restrained high-precision structural modeling, Nat. Methods 9, 1218-1225.
(62) Reuss, A. J., Grunewald, C., Braun, M., Engels, J. W., and Wachtveitl, J. (2016) The Three Possible 2-(Pyrenylethynyl) Adenosines: Rotameric Energy Barriers Govern the Photodynamics of These Structural Isomers, ChemPhysChem 17, 1369-1376.
(63) Lee, S., Lee, J., and Hohng, S. (2010) Single-molecule three-color FRET with both negligible spectral overlap and long observation time, PLoS One 5, e12270.
(64) Hengesbach, M., Kim, N.-K., Feigon, J., and Stone, M. D. (2012) Single-Molecule FRET Reveals the Folding Dynamics of the Human Telomerase RNA Pseudoknot Domain, Angew. Chem., Int. Ed. Engl. 51, 5876-5879.
(65) Ellefsen, K. L., Dynes, J. L., and Parker, I. (2015) Spinning-Spot Shadowless TIRF Microscopy, PLoS One 10, e0136055.
