2D IR Correlation Spectroscopy in the ... - ACS Publications

May 1, 2017 - Protein Research Center, University of Puerto Rico, Mayagüez Campus, Mayagüez, Puerto Rico 00681-9019, United States. •S Supporting ...

Article

2D IR Correlation Spectroscopy in the Determination of Aggregation and Stability of KH Domain GXXG Loop Peptide in the Presence and Absence of Trifluoroacetate. Aslin Rodriguez Nassif, Igor de la Arada, Jose Luis R. Arrondo, and Belinda Pastrana Rios Anal. Chem., Just Accepted Manuscript • Publication Date (Web): 01 May 2017 Downloaded from http://pubs.acs.org on May 2, 2017


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


2D IR Correlation Spectroscopy in the Determination of Aggregation and Stability of KH Domain GXXG Loop Peptide in the Presence and Absence of Trifluoroacetate

Aslin Rodríguez Nassif, † Igor de la Arada, ‡ José Luis Arrondo, ‡ and Belinda Pastrana-Rios †, ⁎

†Department of Chemistry, University of Puerto Rico, Mayagüez Campus, Mayagüez, PR 00681-9019, USA
‡Biofisika Institute and Biochemistry and Molecular Biology Department, CSIC and University of Basque Country, Bilbao, 48080, Spain
⁎Protein Research Center, University of Puerto Rico, Mayagüez Campus, Mayagüez, PR 00681-9019, USA

ABSTRACT: Trifluoroacetate (TFA) is a strong anion byproduct of solid phase peptide synthesis. Fourier transform infrared (FT-IR) spectroscopy can be used to ascertain the presence of this excipient in peptide samples for quality assessment. TFA absorbs as a strong sharp peak (1675 cm-1) within the amide I’ band of the spectral region. A peptide sample and the TFA excipient can be studied simultaneously by FT-IR and 2D IR correlation spectroscopies. In addition, these techniques are able to determine the effect of TFA on the stability of the peptide. Herein, we describe the spectroscopic characterization of the GXXG loop peptide (GXXGlp), which is present in KH domain containing proteins. The sequence of the Homo sapiens Krr1 GXXGlp is evolutionarily conserved (165KRRQRLIGPKGSTLKALELLTNCY189) and has been associated with ssDNA interaction and ribosome biogenesis. Our goal was to determine the structural elements present in this peptide and evaluate whether TFA affects the stability of GXXGlp during thermal stress. We observed differences in the molecular behavior of the synthetic peptide in the presence and absence of TFA at various peptide concentrations. The mechanism of aggregation was established by FT-IR at high peptide concentrations, but more importantly, we were able to define the optimum formulation conditions under which to study this peptide. Finally, 2D IR correlation spectroscopy was used for the determination of the unfolding process, mechanism and extent of peptide aggregation, and the effect of TFA on the stability of the peptide. This spectroscopic method can be applied to the characterization of any synthetic peptide.

Peptides are an important class of molecules. 1–3 They are used for fragment-based drug design, as pharmacologically active molecules, to identify or confirm the binding interface in protein-protein interactions of interest, as a tethered component in antibody-drug conjugates, and as modified peptides to probe targets in situ. The development of synthetic peptide-based drugs can be complicated by the challenge of adequate characterization, because peptides exhibit structural effects and stability differences that can be due to the polarity of the side chain composition and to the effect of excipients or of complex impurity profiles. Identification of chemical impurities and analysis of the conformation of the peptide often require orthogonal analytical techniques to ensure the quality of the peptide sample. Another factor that affects the structural analysis of peptides is aggregation, 4–8 which is a concern in the biotechnology industry and is frequently observed during the expression and purification of recombinant proteins in the research laboratory. Protein aggregation can occur through several distinct mechanisms. 9–12 Aggregation can occur when the native state of the protein is changed or modified and/or when particles act as seeds that aggregate the protein. One type of reversible aggregation is self-association, in which monomers interact in a non-covalent manner. The monomer is self-complementary and can reversibly form different oligomeric species, and the aggregate can also dissociate. 13–17 To prevent protein aggregation, the mechanism of aggregation must be ascertained. Protein aggregation can be detected by FT-IR, which shows a characteristic shoulder at ~1624 cm-1 within the amide I’ band, which is typically located within 1700–1600 cm-1. 4–7

Fourier transform infrared (FT-IR) and 2D IR correlation spectroscopies are useful techniques for the study of proteins and peptides in vitro without the use of labels. Because they are based on vibrational modes (motion), they are extremely sensitive to conformational changes, allowing for direct observation of the protein or peptide and its excipients in equilibrium and in solution. IR spectroscopy is a direct optical method that relies on first principles. This technique is both highly sensitive and selective, allowing for the simultaneous study of peptides and contaminants or excipients in the same sample. Unlike other spectroscopic techniques, if aggregates are present, the information associated with the protein structure is not lost; instead, an opportunity exists to decipher the mechanism of aggregation as a function of sample conditions such as temperature, pH, light exposure, acoustics, agitation, etc. It is possible to use a correlation algorithm (2D IR correlation analysis) to relate spectral changes to the perturbation (response of the peptide to the stressor), thus providing a molecular understanding of these effects on the peptide. Together, FT-IR and 2D IR correlation spectroscopy are especially suited to determine the effect of an excipient on the stability of a peptide and whether the excipient inhibits aggregation of the peptide. 18–22 The 2D IR correlation spectroscopy technique has proven useful in: (1) validating the presence of TFA, (2) determining the effect of TFA on the stability of the peptide, and (3) determining the presence of aggregate species in a peptide sample of interest. 23–26

Trifluoroacetic acid (TFA) is a strong carboxylic acid that is commonly used in solid phase peptide synthesis for the removal of protecting groups. However, this


strong counter ion is an excipient that interferes with the structural analysis of proteins and with the stability of cell cultures or animal model studies. Typically, dry peptide samples contain 30-50% TFA by weight; the TFA can be removed by ion exchange chromatography or by repeated lyophilizations using 0.1 N HCl followed by dialysis. 27–31 FT-IR spectroscopy can be used to ascertain the presence of this excipient in a sample within the mid-IR region. 18–22,32 TFA absorbs as a strong narrow peak at 1675.3 cm-1 within the amide I’ band in the spectral region 1700–1600 cm-1. 29,30

The objective of this study was to characterize a synthetic peptide whose sequence is based on the Krr1 protein by FT-IR and 2D IR spectroscopies. Homo sapiens Krr1 is involved in the processing, assembly, and maturation of the small subunit of eukaryotic ribosomes. Previous studies showed that defects in ribosome biogenesis can have detrimental effects on cellular metabolism, and many diseases have been found to be associated with defects in ribosome synthesis, including cancer and the fragile X syndrome, which is associated with mental retardation. 33–41 Krr1 contains a conserved KH domain, which is present in a wide variety of nucleic acid-binding ribosomal proteins. The sequence of the Krr1 GXXG loop peptide (lp) is KRRQRLIGPKGSTLKALELLTNCW and corresponds to residues 165-189 of Krr1’s KH domain (see supporting information). The sequence identity of this region is 100% for mammalian organisms (Figure 1A). Homology modeling has yielded a proposed combination of secondary structures, which includes a beta strand, a loop, and two helical segments (Figure 1B). The surface charge model of this peptide shows a high concentration of positively charged residues at its N-terminus (Figure 1C). We were interested in evaluating the secondary structure components of the peptide in a buffered aqueous solution and the effect of TFA, inherently present as a byproduct of the solid phase synthesis, on the stability of the GXXGlp.

EXPERIMENTAL SECTION

Materials. The KH domain GXXGlp (Ac-KRRQRLIGPKGSTLKALELLTNCW-NH2) was purchased from Biosynthesis, Inc. (see supporting information, Figure S1). Part of the sample was subjected to TFA removal by repeated lyophilizations with 0.1 N HCl, followed by exhaustive dialysis against the appropriate buffer (see below) as per our established method. 4,19,21,51 The GXXGlp was then analyzed by CD to determine if the peptide had any structural components.

Circular dichroism. Far-UV CD spectra of 20.7 µM TFA-free GXXGlp in 2.5 mM HEPES, 7.5 mM NaCl, 0.2 mM MgCl2, and 0.2 mM CaCl2, at pH 7.4, were acquired (10 scans) within the spectral range of 250–193 nm at a scan rate of 20 nm/min and 20 °C using a Jasco spectropolarimeter model J-810 (Tokyo, Japan) equipped with a temperature controller. The CD absorbance data were normalized to mean residue molar ellipticity and smoothed with an adaptive smoothing method. 42

FT-IR Spectroscopy. A comparative analysis of GXXGlp under various conditions was performed using FT-IR spectroscopy. The peptide samples were: (a) 33.5 mg/mL GXXGlp in the presence of TFA (i.e., commercial peptide product without the removal of the solid phase synthesis byproduct), and in the absence of TFA: (b) high concentration, 50.1 mg/mL GXXGlp, and (c) low concentration, 16.7 mg/mL GXXGlp. The buffer for each sample consisted of 50 mM HEPES, 150 mM NaCl, and 4 mM CaCl2, at pH 7.4. These peptide samples were fully H→D exchanged by repeated lyophilization (8×) after dissolving the sample in D2O as previously described. 4,18–22,30,32 The final volume of the sample, 60 µL, was placed on custom-milled CaF2 cells. A reference cell was prepared with only the buffer (50 mM HEPES, 150 mM NaCl, and 4 mM CaCl2, pH 7.4). Both cells were set in a custom dual-chamber cell holder. The FT-IR spectrophotometer used was a Jasco model 6200 equipped with an MCT detector and sample shuttle interface. Thermal control was achieved via a Neslab RTE-740 refrigerated bath (Thermo Electron Corp.) and monitored with a thermocouple in close contact with the sample cell. Data were acquired at 5 °C intervals, and acquisition began only after the desired temperature was reached and thermal equilibrium (15 min) was achieved. Each spectrum comprised 512 scans, apodized with a triangular function and Fourier transformed to provide a resolution of 4 cm-1 with data encoded every 2 cm-1. 4,18–22,30,32

2D IR Correlation Spectroscopy. 2D IR correlation spectroscopy is a technique developed by Dr. Isao Noda 43–46 that uses the FT-IR series of sequential spectra as a function of a perturbation, in this case temperature (5-90 °C), to obtain a difference spectral data set by subtraction of the initial spectrum from all subsequent spectra. Difference spectra used in 2D IR correlation spectroscopy are defined as:

$$\tilde{A}(\nu, t_j) = \begin{cases} A(\nu, t_j) - \bar{A}(\nu) & \text{for } 1 \le j \le m \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

where t_j is the jth value of the perturbation (here, temperature), m is the number of spectra, and \bar{A}(\nu) is the initial spectrum of the data set, used to generate the covariance spectra. Synchronous 2D correlation intensities of the covariance spectral data are defined by:

$$\Phi(\nu_1, \nu_2) = \frac{1}{m-1} \sum_{j=1}^{m} \tilde{A}(\nu_1, t_j) \cdot \tilde{A}(\nu_2, t_j) \tag{2}$$

The resulting correlation intensity \Phi(\nu_1, \nu_2) as a function of two independent wavenumber axes, ν1 and ν2, is the synchronous plot. Asynchronous 2D correlation intensities of the covariance spectral data are defined by:

$$\Psi(\nu_1, \nu_2) = \frac{1}{m-1} \sum_{j=1}^{m} \tilde{A}(\nu_1, t_j) \cdot \sum_{k=1}^{m} N_{jk}\, \tilde{A}(\nu_2, t_k) \tag{3}$$

The term N_{jk} is the element of the so-called Hilbert-Noda transformation matrix, given by:

$$N_{jk} = \begin{cases} 0 & \text{if } j = k \\ \dfrac{1}{\pi (k - j)} & \text{otherwise} \end{cases} \tag{4}$$

It is to this difference spectral data set that a cross correlation function is applied, which results in two separate, yet symmetrical 2D plots. The first plot is referred to as the synchronous plot. It contains positive peaks on the diagonal, known as the auto peaks, and summarizes the changes observed in the spectral data set. The relationship established in this synchronous plot relates the spectral intensity changes that are in-phase with one another. The second 2D plot is known as the asynchronous plot. This plot relates the out-of-phase intensity changes, enhances the resolution of the spectral region of interest, and can easily be distinguished from the previous plot because it lacks peaks on the diagonal. Both plots contain off-diagonal peaks, which are referred to as cross peaks; these peaks correlate the spectral changes observed. Spectral intensity changes observed are due to the incremental thermal perturbation applied to the peptide sample. Therefore, the information in both plots allows for the determination of the sequential order of molecular events that occur during the perturbation following Noda’s rules. 43–46 These plots are symmetrical in


nature and for discussion purposes we will always refer to the top triangle for analysis. We begin with the plot that has the greatest resolution enhancement (i.e., the asynchronous plot):

I. If the asynchronous cross peak, ν2, is positive, then ν2 is perturbed prior to ν1 (ν2 → ν1).
II. If the asynchronous cross peak, ν2, is negative, then ν2 is perturbed after ν1 (ν2 ← ν1).
III. If the corresponding synchronous cross peak is positive, then the order of the event is established using the asynchronous plot (rules I and II).
IV. However, if the corresponding synchronous cross peak is negative and the asynchronous cross peak is positive, then the order is reversed.

The order of events can be established for each peak observed in the ν2 axis. Our group has successfully applied 2D IR correlation spectroscopy to the study of numerous proteins and peptides. 4,18–20,22
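The definitions in Equations 1-4 can be sketched numerically as follows. This is a minimal illustration in plain Python, not the Kinetics/MATLAB routine used in this work; the function names and the toy absorbance values are hypothetical, and the 1/(m − 1) normalization follows Noda's generalized formulation.

```python
import math

def dynamic_spectra(spectra):
    """Eq. 1: subtract the initial spectrum from every spectrum in the series."""
    reference = spectra[0]
    return [[a - r for a, r in zip(row, reference)] for row in spectra]

def hilbert_noda_matrix(m):
    """Eq. 4: N[j][k] = 0 if j == k, otherwise 1 / (pi * (k - j))."""
    return [[0.0 if j == k else 1.0 / (math.pi * (k - j)) for k in range(m)]
            for j in range(m)]

def correlation_maps(spectra):
    """Return (synchronous, asynchronous) maps following Eqs. 2 and 3.

    `spectra` is a list of m spectra (one per temperature), each a list of
    absorbances sampled at the same wavenumbers.
    """
    dyn = dynamic_spectra(spectra)
    m, n_points = len(dyn), len(dyn[0])
    noda = hilbert_noda_matrix(m)
    # Hilbert-Noda transform along the perturbation axis:
    # transformed[j][q] = sum_k N[j][k] * dyn[k][q]
    transformed = [[sum(noda[j][k] * dyn[k][q] for k in range(m))
                    for q in range(n_points)] for j in range(m)]
    scale = 1.0 / (m - 1)
    sync = [[scale * sum(dyn[j][p] * dyn[j][q] for j in range(m))
             for q in range(n_points)] for p in range(n_points)]
    asyn = [[scale * sum(dyn[j][p] * transformed[j][q] for j in range(m))
             for q in range(n_points)] for p in range(n_points)]
    return sync, asyn

# Toy data (made up): 4 spectra at increasing temperature, 3 wavenumber points.
spectra = [
    [0.10, 0.50, 0.20],
    [0.12, 0.45, 0.26],
    [0.15, 0.38, 0.31],
    [0.19, 0.30, 0.33],
]
sync, asyn = correlation_maps(spectra)
```

In this sketch the synchronous map is symmetric about the diagonal with non-negative auto peaks, while the asynchronous map has a zero diagonal and changes sign on reflection across it, which is why only one triangle needs to be read.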

Spectral analysis. The spectral data were not manipulated except for baseline correction. Spectral overlay, peak pick routines, and 2D IR correlation spectroscopy analysis were performed using the Kinetics program by E. Goormaghtigh (Free University of Brussels, Brussels, Belgium) for MATLAB (MathWorks, Natick, MA). The curve-fitting routines were done using Grams 3.01 (Galactic Industries Corp.). Finally, the CD spectrum, curve-fitting, and thermal dependence plots were generated using Origin version 7 from OriginLab Corp. (Northampton, MA).

RESULTS AND DISCUSSION

Circular dichroism. Far-UV CD was employed to verify the secondary structure composition of the GXXGlp in the absence of TFA (Figure 1D). The mean and standard deviation for triplicate spectra are as follows: the GXXGlp helical contribution was determined at 222 nm and 25.0 °C to be [θ]MR = -18,599.9 ± 839.3 degrees cm2/decimol, while the random coil component was determined at 201 nm and 25.0 °C to be [θ]MR = -62,845.8 ± 1,431.1 degrees cm2/decimol. Spectral features were indicative of a helical peptide, but the overlap of the beta or loop components of the spectrum precluded a determination of other secondary structures.

Vibrational spectroscopy. We explored the molecular behavior of fully H→D exchanged GXXGlp in the presence and absence of TFA at various peptide concentrations using FT-IR spectroscopy within the spectral region of 1750-1500 cm-1 (Figure 2A-C). The spectral region of interest comprised the amide I′ band (1700-1600 cm-1) and shoulders due to side chain modes (1600-1500 cm-1). The complete H→D exchange simplifies the amide I’ band contour, allowing the carbonyl stretching modes (ʋ(C=O)) associated with the helical and loop structures to be observed. These carbonyl stretching vibrations are highly sensitive to conformational changes. Similarly, the side chain modes were studied as shoulders within the contour.
The side chain modes are limited to the spectral region of 1600−1500 cm-1 and comprise the arginine guanidinium asymmetric and symmetric stretching vibrational modes (ʋ(N-D)) and a single glutamate carboxylate stretching vibrational mode (ʋ(COO-)). This glutamate residue, at position 18 of the 24-residue peptide and located near the C-terminal end within the second helical motif of the peptide sequence, serves as a probe for this region of the peptide. In general for the spectral overlay, the amide I’ band maximum shifted toward higher wavenumbers as the temperature increased, and a concomitant decrease in intensity was observed, suggesting a transition toward unfolding (Figure 2A-C). The intensity of the side chain band of GXXGlp also decreased with increasing temperature in the absence of TFA at both high and low peptide concentration (Figure 2B, C), in contrast to the TFA-containing sample (Figure 2A), which may be indicative of the interaction between the TFA and the peptide. In the absence of TFA and at high peptide concentration, aggregation of the peptide was observed as a shoulder at 1620.7 cm-1 (Figure 2B). In this peptide sample, the intensity of the aggregation peak continued to increase and shift to higher wavenumbers until the temperature reached 50 °C, after which the intensity of the aggregation peak (shoulder at 1620.7 cm-1) decreased and shifted to lower wavenumbers. Subsequent spectral changes followed the unfolding process of the peptide, suggesting that at temperatures above Tm, the backbone dynamics governed the process.

FT-IR thermal dependence study. We monitored the amide I′ band maximum peak position as a function of temperature to generate the thermal dependence plots shown in Figure 2D-F. In the presence of TFA, the initial amide I’ band position was 1644.5 cm-1 at 5 °C, suggesting that the peptide is less stable than the peptide sample in the absence of TFA (Figure 2C). Moreover, GXXGlp in the presence of TFA had a transition temperature of Tm = 27.5 °C, and no aggregation was observed. In the absence of TFA, we determined the thermal transition temperature for the well-behaved peptide sample at low concentration to be Tm = 44.9 °C (Figure 2F). Furthermore, the unfolding process was determined to be reversible for this peptide sample (see supporting information Figure S3). However, at high peptide concentration, we observed aggregation of the GXXGlp peptide (Figure 2E). For this sample, we monitored both the amide I’ band and the aggregation peak maximum as a function of temperature. The amide I’ band concomitantly shifted to higher wavenumbers, suggesting destabilization of the secondary structure, and a dramatic shift of the aggregation peak was observed. Furthermore, the unfolding process dominated at higher temperatures (50-70 °C), thus dissociating the aggregate species.
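The text does not state how Tm was extracted from the thermal dependence plots. One simple possibility, shown here purely as an assumption, is to take the temperature at which the amide I’ peak position crosses the midpoint between its initial and final values, using linear interpolation between adjacent temperature steps. The function name and the data below are hypothetical and synthetic; they are not values from Figure 2.

```python
def midpoint_temperature(temps, positions):
    """Estimate a transition midpoint by linear interpolation.

    Assumption: the transition midpoint is where the peak position crosses
    halfway between its first (folded) and last (unfolded) values.
    """
    half = (positions[0] + positions[-1]) / 2.0
    points = list(zip(temps, positions))
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        crosses = (p0 - half) * (p1 - half) <= 0
        if crosses and p0 != p1:
            # Linear interpolation between the two bracketing temperatures.
            return t0 + (half - p0) * (t1 - t0) / (p1 - p0)
    raise ValueError("peak position never crosses the midpoint")

# Synthetic example: 5-90 degrees C in 5-degree steps, with a peak shift
# from 1640 to 1644 cm-1 between 20 and 30 degrees C.
temps = list(range(5, 95, 5))
positions = [1640.0 if t <= 20 else 1642.0 if t == 25 else 1644.0 for t in temps]
tm = midpoint_temperature(temps, positions)
```

With 5 °C steps, an interpolated midpoint of this kind can resolve Tm more finely than the step size, but only as well as the assumed local linearity holds; a sigmoidal fit would be a reasonable alternative.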
At temperatures above 70 °C, the behavior of the aggregate region followed that of the unfolding peptide. The decrease in aggregate species at high temperatures and the continued unfolding event suggest that the aggregation mechanism was that of self-association at high concentrations of the peptide. Moreover, no aggregation was observed at low peptide concentration in the absence of TFA (Figure 2D, F).

Band assignments. 2D IR correlation spectroscopy and curve-fitting analysis were performed to improve our understanding of the relationship between the backbone dynamics and side chain interactions involved in the aggregation process and to determine the extent of aggregation. The band assignment values determined are the mean values for each assignment (Table 1). The spectra of the TFA-containing GXXGlp sample show TFA absorption 4,47,48 as a sharp peak at 1675.3 cm-1 (Figure 2A). The backbone vibrational modes for GXXGlp (Figure 2A-C) include two loop types (loopA and loopB), both of which refer to the single GXXG loop found in the peptide (Figure 1); they differ in flexibility because of the hydrogen bonding found in a mixed population of the peptide in solution. LoopA (1684.3 cm-1) will be considered the flexible loop present in lower abundance within the sample, and loopB (1661.5 cm-1) will be considered the rigid loop present at higher abundance. We also assigned the helical component (1642.2 cm-1) and the β-strand component (1624.1 cm-1) located at the C-terminal end of the peptide. Aggregation was observed only for the highly concentrated TFA-free sample (Figure 2B) as a shoulder at 1620.7 cm-1. The side chain modes included the guanidinium stretching modes ʋa(N−D) and ʋs(N−D) of the positively charged arginine residues (R166, R167, and R169) at 1605.7 and 1585.5 cm-1, respectively, and the stretching vibrational mode ʋ(COO-) of the single glutamate residue (E182) at 1558.7 cm-1.
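The band positions quoted above can be collected into a small lookup for labeling picked peaks. This is an illustrative sketch only: the wavenumbers are the mean values given in the text, whereas the function name and the 2 cm-1 matching tolerance are assumptions chosen for the example (the spectra were encoded every 2 cm-1).

```python
# Mean band positions (cm-1) and assignments as quoted in the text.
BAND_ASSIGNMENTS = {
    1684.3: "loopA (flexible loop)",
    1675.3: "TFA",
    1661.5: "loopB (rigid loop)",
    1642.2: "helix",
    1624.1: "beta-strand",
    1620.7: "aggregate",
    1605.7: "Arg guanidinium asymmetric N-D stretch",
    1585.5: "Arg guanidinium symmetric N-D stretch",
    1558.7: "Glu carboxylate stretch",
}

def assign_band(wavenumber, tolerance=2.0):
    """Return the assignment nearest to `wavenumber`, or None if no
    tabulated band lies within `tolerance` cm-1 (tolerance is an assumption)."""
    nearest = min(BAND_ASSIGNMENTS, key=lambda center: abs(center - wavenumber))
    if abs(nearest - wavenumber) <= tolerance:
        return BAND_ASSIGNMENTS[nearest]
    return None

label = assign_band(1675.0)      # close to the TFA band
unassigned = assign_band(1700.0) # no tabulated band within tolerance
```

Because the β-strand (1624.1 cm-1) and aggregate (1620.7 cm-1) bands lie only 3.4 cm-1 apart, a lookup like this cannot replace inspection of the concentration and temperature dependence when distinguishing the two.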
These band assignments are validated by the high resolution secondary structure information available. 40

Curve-fitting analysis. The number and position of sub-bands used for the curve-fitting routine were determined by using the defined auto peaks within the synchronous plot of the 2D IR correlation analysis. Typical curve-fitted spectra are shown in Figure 3. All of the sub-bands used in the fit have been properly assigned to a specific vibrational mode (discussed above, see supporting information Figure S4 and Table S1). In the curve-fitting analysis (Figure 3), we can see that the TFA band shows only small intensity changes as a function of increasing temperature because TFA is a small molecule (Figure 3A-C). In the presence of TFA, the α-helical sub-band confirms the thermal dependence study and the 2D IR correlation analysis in that the secondary structure contribution is initially less stable. We determined the GXXGlp helical contribution to be 61.8 ± 0.6% for this sample. In the absence of TFA and at high GXXGlp concentration (Figure 3D-F), the α-helix contribution is 40.4 ± 0.7%, along with aggregate species. The extent of aggregation was determined to be 27.0 ± 1.0%. However, at low GXXGlp concentration (Figure 3G-I), the helical contribution was determined to be 67.4 ± 0.4%, suggesting that the TFA does indeed destabilize the secondary structure of the peptide. More importantly, the positively charged residues (arginines and lysines) are located within the helical motifs, where the TFA will interact as an ion pair, thus destabilizing the helical structure of the peptide.

2D IR correlation spectroscopy. This technique enhances the resolution of the FT-IR spectra (Figure 2A-C) and correlates the peak intensity changes observed due to the perturbation, enabling the description of the molecular behavior of the peptide during the thermal stress. Analyses of the synchronous (Figure 4D-F) and asynchronous plots (Figure 4G-I) involve the evaluation of peaks observed in these contour plots. In general, the synchronous plot contains peaks on the diagonal known as auto peaks. These peaks are always positive and define the magnitude of the overall spectral intensity changes observed.
For both the synchronous and asynchronous plots, the off-diagonal peaks are referred to as cross peaks. Cross peaks can be either positive or negative. The sign of the cross peaks is used to determine the order of molecular events, as per Noda’s rules (see Experimental Section). 43–46 Covariance between the 2D IR correlation plots can be observed for GXXGlp (Figure 4D-I), suggesting distinct peptide behavior due to thermal stress in each sample in the presence and absence of TFA and at varying peptide concentration. A step analysis 18–21 was performed to highlight key events at set temperature ranges (pre-transition, during the increase in aggregation, and during the unfolding event), as shown for GXXGlp at high concentration and in the absence of TFA in Figure 5. The 2D IR correlation plots for the step analysis show the alterations due to the aggregation (self-association) when compared to the monomer species and the observed similarities during thermal unfolding. The 2D IR plots confirm the molecular events defined for the full temperature range (Figure 4D-I). Similarly, the step analyses for the peptide samples in the presence of TFA and in the absence of TFA at low GXXGlp concentration are described in supporting information Figures S5 and S6, respectively.

Sequential order of molecular events. In each case we analyzed the asynchronous and synchronous plots (Figure 4D-I) to generate the order of events for GXXGlp in the presence and absence of TFA at varying peptide concentrations within the spectral region of 1750-1500 cm-1 in the temperature range of 5-90 °C. The initial events of perturbation within the peptide occur at low temperatures and therefore correspond to the least stable regions of the peptide. Similarly, the last event of perturbation occurs at high temperatures and therefore corresponds to the most stable regions of the peptide. The results of the interpretation are shown in Figure 6, and a summary of events is given in the supporting information, Tables S2-S4.
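The sign logic of rules I-IV in the Experimental Section can be written as a small decision function. This is a sketch with a hypothetical function name, not the procedure used to build Figure 6. Rules I-IV as stated do not cover the case where both cross peaks are negative; the sketch handles it with the sign-product form of Noda's rule, which is an assumption beyond the text.

```python
def event_order(sync_sign, async_sign):
    """Order of perturbation for a cross peak (nu1, nu2) read from the top triangle.

    Arguments are the signs (+1, -1, or 0) of the synchronous and
    asynchronous cross peaks. Returns None when either peak is absent.
    """
    if sync_sign == 0 or async_sign == 0:
        return None  # no order can be read from a missing cross peak
    if sync_sign * async_sign > 0:
        # Rule I with rule III (both positive). The both-negative case is
        # an assumption based on Noda's sign-product rule.
        return "nu2 before nu1"
    # Rule II with rule III (async negative, sync positive), or
    # rule IV (sync negative, async positive: order reversed).
    return "nu1 before nu2"

orders = {
    ("+", "+"): event_order(+1, +1),  # rules I and III
    ("+", "-"): event_order(+1, -1),  # rules II and III
    ("-", "+"): event_order(-1, +1),  # rule IV
}
```

Applying such a function to every cross peak along the ν2 axis yields pairwise orderings that can then be merged into a single sequence of events.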
For the GXXGlp in the presence of TFA (Figure 6A and supporting information Table S2), the helical component (1642.2 cm-1) and the β-strand (1624.1 cm-1) located in the C-terminal end were perturbed initially, followed by the flexible loopA and rigid loopB (1684.3 and 1661.5 cm-1, respectively); by the arginines (1605.7 cm-1, 1585.5 cm-1), which were perturbed by TFA via ion-pair interaction; and then by the glutamate (1558.7 cm-1). The final event was the dissociation of the ion pairing with the TFA (1675.3 cm-1). Finally, the intramolecular salt-bridge interactions were perturbed (Glu- 1558.7 cm-1 and Arg 1585.5 cm-1).

In the absence of TFA and at high GXXGlp concentration (Figure 6B and supporting information Table S3), the initial perturbation occurs within the flexible loopA (1684.3 cm-1, the least stable motif), followed by the β-strand (1624.1 cm-1) located in the C-terminal end; then the aggregate species (1620.7 cm-1) was observed to increase, involving the α-helical motif (1642.2 cm-1), and as the temperature increased, the intermolecular salt-bridge interactions involving the single glutamate (E182) residue (1558.7 cm-1) and the arginines (1585.5, 1605.7 cm-1) located in the N-terminal helix were broken, disrupting the self-association. Finally, the most stable motif was the rigid loopB (1661.5 cm-1). In this sample, there were two separate molecular events occurring as the temperature increased: the first was the increase in aggregation or self-association, which upon closer inspection revealed molecular details about the self-association involving the salt-bridge interaction, which was also modeled (Figure 7). Also highlighted were the cross peaks evident in the interaction that support the self-association mechanism.

In the absence of TFA at low GXXGlp concentration (Figure 6C and supporting information Table S4), which represents the optimum formulation conditions, both loop types representing the ssDNA binding region of the peptide were perturbed initially (loopA at 1684.3 cm-1 and loopB at 1661.5 cm-1).
As the temperature increased, the helical (1642.2 cm-1) and β-strand (1624.1 cm-1) motifs were perturbed, followed by the disruption of the intramolecular salt-bridge interaction involving the single glutamate (E182) residue (1558.7 cm-1) and the arginines (1585.5, 1605.7 cm-1) located in the N-terminal end of the α-helical motif. Consequently, the intramolecular salt-bridge interaction is key to the structural integrity of the peptide.

CONCLUSIONS

Trifluoroacetate (TFA) is intrinsic to solid phase synthesis of peptides. TFA is a strong counter ion to the peptide, which commonly is provided in a lyophilized powder form. Often these peptide samples are used as is without further purification, yielding erroneous physicochemical characterization of the peptide, destabilizing cell cultures and biological assays, or potentially interfering with drug efficacy in a clinical treatment of a patient. 51–54 Some spectroscopists suggest the digital subtraction of the TFA peak at 1675.3 cm-1 from the FT-IR spectrum without being aware of the strong ion pairing that exists between TFA and the peptide via its positively charged amino acids. We were motivated to investigate the effect of TFA on the stability and integrity of a structured synthetic peptide. In this study, the secondary structure of the GXXGlp in the absence of TFA was characterized by CD and was also validated by FT-IR and 2D IR correlation spectroscopies. In addition, FT-IR and 2D IR correlation spectroscopies were used to establish the differences in conformational stability of the GXXGlp in the presence and absence of TFA. The results presented herein have demonstrated the wealth of information that the combination of these molecular biophysical techniques can provide, in the following order: 2D IR correlation > FT-IR > CD. This orthogonal approach provided a direct and label-free method for monitoring the molecular changes in a peptide/protein.
Furthermore, 2D IR correlation spectroscopy, via the identification of cross peaks, enabled us to define TFA’s interaction with the arginine residues, presumably through ion pairing, which resulted in the destabilization of the amino terminal end helical motif. We established the relative stability of the structural motifs within the peptide for each formulation. More importantly, we determined the mechanism and extent of aggregation/self-association and the optimal formulation conditions, allowing novel peptide-target interactions to be characterized. Also, we identified the intermolecular salt-bridge as the key factor in the mechanism of aggregation of GXXGlp. The salt-bridge was disrupted by increasing the temperature, which allowed the peptide to continue its path towards unfolding. We are convinced that FT-IR and 2D IR correlation spectroscopies can be used to obtain critical empirical information about the stability of proteins and peptides under a variety of conditions. These studies may be extended in the future to include a ternary complex involving the GXXGlp under biologically significant conditions. Finally, the dynamic molecular information used to describe the behavior of the peptide in solution may be used for in silico modeling.

ASSOCIATED CONTENT

Supporting Information. The supporting information is available free of charge on the ACS Publications Website: Figures S1-S6 and Tables S1-S4.

AUTHOR INFORMATION

Corresponding Author

* Telephone: (787) 832-4040 ext. 2302, E-mail: [email protected]

Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

ACKNOWLEDGMENT

The authors would like to acknowledge the support of SLOAN and NIH-RISE (ARN), NIH Grant No. SO6GM08103-28 (B.P.R.), the Henry Dreyfus Foundation Teacher Scholar Award (B.P.R.), National Institutes of Health COBRE grant P20 RR16439-01 (B.P.R.), Hector Collazo, Esq., for the funds provided through the International Health Games, the University of Puerto Rico Mayagüez Campus, and Dr. Melissa Stauffer (Scientific Editing Solutions, Walworth, WI) for editing the manuscript. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

REFERENCES

(1) Pelay-Gimeno, M.; Glas, A.; Koch, O.; Grossmann, T. N. Angew. Chem. Int. Ed. 2015, 54, 8896–8927.
(2) Navab, M.; Anantharamaiah, G. M.; Reddy, S. T.; Hama, S.; Hough, G.; Frank, J. S.; Grijalva, V. R.; Ganesh, V. K.; Mishra, V. K.; Palgunachari, M. N.; Fogelman, A. M. Circ. Res. 2005, 97 (6), 524–532.
(3) Pini, A.; Falciani, C.; Bracci, L. Curr. Protein Pept. Sci. 2008, 9 (5), 468–477.
(4) Pastrana-Rios, B.; del Valle Sosa, L.; Santiago, J. Struct. Dyn. 2015, 2 (4), 041701–041711.
(5) Mitri, E.; Kenig, S.; Coceano, G.; Bedolla, D. E.; Tormen, M.; Grenci, G.; Vaccari, L. Anal. Chem. 2015, 87 (7), 3670–3677.
(6) Shivu, B.; Seshadri, S.; Li, J.; Oberg, K. A.; Uversky, V. N.; Fink, A. L. Biochemistry 2013, 52 (31), 5176–5183.
(7) Kong, J.; Yu, S. Acta Biochim. Biophys. Sin. 2007, 39 (8), 549–559.
(8) Rondeau, P.; Navarra, G.; Cacciabaudo, F.; Leone, M.; Bourdon, E.; Militello, V. Biochim. Biophys. Acta 2010, 1804 (4), 789–798.
(9) Cromwell, M. E. M.; Hilario, E.; Jacobson, F. AAPS J. 2006, 8 (3), E572–E579.
(10) Philo, J. S.; Arakawa, T. Curr. Pharm. Biotechnol. 2009, 10 (4), 348–351.
(11) Philo, J. S. Curr. Pharm. Biotechnol. 2009, 10 (4), 359–372.
(12) Roberts, S. A.; Andrews, P. A.; Blanset, D.; Flagella, K. M.; Gorovits, B.; Lynch, C. M.; Martin, P. L.; Kramer-Stickland, K.; Thibault, S.; Warner, G. Regul. Toxicol. Pharmacol. 2013, 67 (3), 382–391.
(13) Balbo, A.; Schuck, P. Protein-Protein Interact. A Mol. Cloning Man. 1950, 301, 2–33.
(14) Gottschalk, M.; Venu, K.; Halle, B. Biophys. J. 2003, 84 (6), 3941–3958.
(15) Schuck, P. Anal. Biochem. 2003, 320 (1), 104–124.
(16) Zehender, F.; Ziegler, A.; Schönfeld, H. J.; Seelig, J. Biochemistry 2012, 51 (6), 1269–1280.
(17) Kudryashova, E. V.; Visser, A. J. W. G.; De Jongh, H. H. J. Protein Sci. 2005, 14 (2), 483–493.
(18) Iloro, I.; Pastrana-Rios, B. J. Mol. Struct. 2006, 799 (1–3), 153–157.
(19) Pastrana-Rios, B. Biochemistry 2001, 40 (31), 9074–9081.
(20) Pastrana-Rios, B.; Ocaña, W.; Rios, M.; Vargas, G. L.; Ysa, G.; Poynter, G.; Tapia, J.; Salisbury, J. L. Biochemistry 2002, 41 (22), 6911–6919.
(21) Sosa, L. del V.; Alfaro, E.; Santiago, J.; Narváez, D.; Rosado, M. C.; Rodríguez, A.; Gómez, A. M.; Schreiter, E. R.; Pastrana-Ríos, B. Proteins Struct. Funct. Bioinforma. 2011, 79 (11), 3132–3143.
(22) Pastrana-Ríos, B.; Reyes, M.; De Orbeta, J.; Meza, V.; Narváez, D.; Gómez, A. M.; Rodríguez Nassif, A.; Almodovar, R.; Díaz Casas, A.; Robles, J.; Ortiz, A. M.; Irizarry, L.; Campbell, M.; Colón, M. Biochemistry 2013, 52 (7), 1236–1248.
(23) Takaoka, Y.; Sakamoto, T.; Tsukiji, S.; Narazaki, M.; Matsuda, T.; Tochio, H.; Shirakawa, M.; Hamachi, I. Nat. Chem. 2009, 1 (7), 557–561.
(24) Lin, M. F.; Larive, C. K. Anal. Biochem. 1995, 229 (2), 214–220.
(25) Shen, C. L.; Fitzgerald, M. C.; Murphy, R. M. Biophys. J. 1994, 67 (3), 1238–1246.
(26) D’Hondt, M.; Gevaert, B.; Wynendaele, E.; De Spiegeleer, B. J. Pharm. Anal. 2016, 6 (1), 24–31.
(27) Merrifield, R. B. J. Am. Chem. Soc. 1963, 85 (14), 2149–2154.
(28) Aapptec. Aapptec, Synth. Notes 2011, 1, 1–76.
(29) Sigdel, T. K.; Nicora, C. D.; Hsieh, S.-C.; Dai, H.; Qian, W.-J.; Camp, D. G.; Sarwal, M. M. Clin. Proteomics 2014, 11 (1), 7–14.
(30) Gaussier, H.; Morency, H.; Lavoie, M. C.; Subirade, M. Appl. Environ. Microbiol. 2002, 68 (10), 4803–4808.
(31) Roux, S.; Zekri, E.; Rousseau, B.; Paternostre, M.; Cintrat, J.-C.; Fay, N. J. Pept. Sci. 2008, 14 (3), 354–359.
(32) Barth, A. Biochim. Biophys. Acta - Bioenerg. 2007, 1767 (9), 1073–1101.
(33) Valverde, R.; Edwards, L.; Regan, L. FEBS J. 2008, 275 (11), 2712–2726.
(34) Ciuffreda, L.; Sanza, C. Di; Milella, U. C. I. and M. Curr. Cancer Drug Targets 2010, 10 (5), 484–495.
(35) Drygin, D.; Rice, W. G.; Grummt, I. Annu. Rev. Pharmacol. Toxicol. 2010, 50, 131–156.
(36) Yuwen, T.; Xue, Y.; Skrynnikov, N. R. Biochemistry 2016, 55 (12), 1784–1800.
(37) Freed, E. F.; Bleichert, F.; Dutca, L. M.; Baserga, S. J. Mol. Biosyst. 2010, 6 (3), 481–493.
(38) Narla, A.; Ebert, B. L. Blood 2010, 115 (16), 3196–3205.
(39) van Riggelen, J.; Yetil, A.; Felsher, D. W. Nat. Rev. Cancer 2010, 10 (4), 301–309.
(40) Zheng, S.; Lan, P.; Liu, X.; Ye, K. 2014, 289 (33), 22692–22703.
(41) Kuo, P.-H.; Chuang, L.-C.; Su, M.-H.; Chen, C.-H.; Chen, C.-H.; Wu, J.-Y.; Yen, C.-J.; Wu, Y.-Y.; Liu, S.-K.; Chou, M.-C.; Chou, W.-J.; Chiu, Y.-N.; Tsai, W.-C.; Gau, S. S.-F. PLoS One 2015, 10 (9), 1–15.
(42) Circular Dichroism: Principles and Applications, 2nd Ed.; Berova, N., Nakanishi, K., Woody, R.; Wiley Online Library, 2000.
(43) Noda, I. Vib. Spectrosc. 2004, 36 (2), 143–165.
(44) Noda, I. J. Mol. Struct. 2016, 1124, 29–41.
(45) Two-dimensional correlation spectroscopy: applications in vibrational and optical spectroscopy; Noda, I.; Ozaki, Y. Wiley Online Library, 2004.
(46) Noda, I. J. Mol. Struct. 2014, 1069 (1), 23–49.
(47) Valenti, L. E.; Paci, M. B.; Pauli, C. P. De; Giacomelli, C. E. Anal. Biochem. 2011, 410 (1), 118–123.
(48) Ito, F. Vib. Spectrosc. 2014, 71, 57–61.
(49) Billaud, J. QIAGEN Bioinformatics 2014, 1, 1–12.
(50) Pettersen, E. F.; Goddard, T. D.; Huang, C. C.; Couch, G. S.; Greenblatt, D. M.; Meng, E. C.; Ferrin, T. E. J. Comput. Chem. 2004, 25 (13), 1605–1612.
(51) Roux, S.; Zekri, E.; Rousseau, B.; Paternostre, M.; Cintrat, J.-C.; Fay, N. J. Pept. Sci. 2008, 14, 354–359.
(52) Cornish, J.; Callon, K. E.; Lin, C. Q.; Xiao, C. L.; Mulvey, T. B.; Cooper, G. J.; Reid, I. R. Am. J. Physiol. Endocrinol. Metab. 1999, 277 (5), E779–E783.
(53) You, C.; Cheng, L.; Ju, C. Toxicol. Lett. 2010, 194 (3), 79–85.
(54) Morange, I.; De Boisvilliers, F.; Chanson, P.; Lucas, B.; DeWally, D.; Catus, F.; Thomas, F.; Jaquet, P. J. Clin. Endocrinol. Metab. 1994, 79 (1), 145–151.


FIGURE CAPTIONS

Figure 1. Krr1 KH domain peptide sequence and structure. (A) Sequence alignment from K165-Y189 specific for the GXXGlp region based on the following Universal Protein Resource accession numbers: Q13601 (Homo sapiens), K7CWD4 (Pan troglodytes), F6YQM6 (Macaca mulatta), G1SKQ8 (Oryctolagus cuniculus), M3XB72 (Felis catus), Q3B7L9 (Bos taurus), F1SGD9 (Sus scrofa), Q8BGA5 (Mus musculus), A1A5R3 (Rattus norvegicus), H0VQU1 (Cavia porcellus), Q1ED12 (Danio rerio), P25586 (Saccharomyces cerevisiae), and Q9VPU8 (Drosophila melanogaster) using CLC Main Workbench 7. Dots in the sequence alignment represent residues identical to the top sequence (Homo sapiens). Also included is a schematic secondary structure representation of the peptide: the red ribbon represents the α-helix component, the black lines represent random coil, the yellow region represents a short β-strand region, and the green line represents the GXXG loop region.49 (B) Ribbon model representation of the Homo sapiens peptide using the Saccharomyces cerevisiae high resolution structure. The model was made using UCSF’s Chimera program and Protein Data Bank structure 4QMF.50 Blue represents the N-terminal end and red corresponds to the C-terminal end of the peptide. (C) Electrostatic solid surface representation of the GXXGlp. Two views are shown, rotated by 180° about the y-axis. The peptide appears to expose many positively charged residues (in blue; R166, 167, 169 and K165, 174, 179) and a single negatively charged residue (in red; E182). The positively charged regions are where ion-pairing may occur with the TFA excipient. (D) CD analysis of the KH domain GXXG loop peptide in triplicate. Far-UV CD spectra at 25.0 °C in the spectral region of 193−250 nm for 20.7 µM GXXGlp in 2.5 mM HEPES, 7.5 mM NaCl, and 0.2 mM CaCl2 at pH 7.4. Helical and random coil contributions were observed for the peptide.

Figure 2. FT-IR spectroscopy of GXXGlp in 50 mM HEPES, 150 mM NaCl, 4 mM CaCl2, and 4 mM MgCl2 at pH 7.4. (A-C) FT-IR spectral overlay of GXXGlp at 5°C (dark blue), 10-30°C (blue), 35-65°C (green), 70°C (yellow), 75-85°C (orange), and 90°C (red) showing the amide I′ and side chain bands in the spectral region of 1750−1500 cm-1 for: (A) 33.5 mg/mL GXXGlp in the presence of TFA, (B) 50.1 mg/mL GXXGlp in the absence of TFA, and (C) 16.7 mg/mL GXXGlp in the absence of TFA. (D-F) Thermal dependence plots are shown for the samples corresponding to (A), (B), and (C). The amide I′ (black) and the aggregation (red) peak maxima are shown as a function of increasing temperature (5-90°C). Also shown in (E) is the maximum extent of aggregation at 50°C, with a dramatic drop due to thermal denaturation and subsequent unfolding of the aggregated peptide within the temperature range 55-75°C, suggesting the mechanism of aggregation was due to self-association. The optimal condition was confirmed by the broad sigmoidal curve observed for the cooperative unfolding process for this peptide at lower peptide concentration (F).

Figure 3. Typical curve-fitted spectra of GXXGlp at 20, 45, and 85°C. (A-C) 33.5 mg/mL GXXGlp in the presence of TFA, (D-F) 50.1 mg/mL GXXGlp in the absence of TFA, and (G-I) 16.7 mg/mL GXXGlp in the absence of TFA. All sub-bands used for the curve fitting have been assigned to a vibrational mode. The sub-bands associated with backbone vibrational modes are represented by black lines, the side chain sub-bands are represented by gray lines, the TFA sub-band is represented by a dash/dot red line (A-C), and the aggregation sub-band is represented by a dash/dot black line (D-F).

Figure 4. 2D IR correlation spectroscopy during the thermal perturbation of the GXXGlp in the presence and absence of TFA within the spectral region of 1750-1500 cm-1 and in the temperature range 5-90°C. These plots provide a molecular description of the peptide conformational changes, the side chain interactions, and the aggregation events, if present, during thermal stress through the correlation of the peak changes observed. (A–C) The thermally induced spectral changes in GXXGlp in the presence of TFA at 33.5 mg/mL (A) and in the absence of TFA at 50.1 mg/mL (B) and 16.7 mg/mL (C) are shown as sub-bands associated with the backbone vibrational modes (black lines), side chain modes (gray lines), TFA (red dash/dot line), and aggregation (black dash/dot line). (D–F) Synchronous and (G–I) asynchronous plots.

Figure 5. Selected step analysis 2D IR correlation plots confirm the details of the thermally induced spectral changes in the aggregation process in the absence of TFA. (A, C, E) Synchronous and (B, D, F) asynchronous plots for the temperature ranges 5-15°C, 45-55°C, and 75-85°C, respectively.

Figure 6. The sequential order of molecular events for all three GXXGlp samples. Schematic representation summarizing the molecular events during the thermal perturbation of (A) 33.5 mg/mL GXXGlp in the presence of TFA, (B) 50.1 mg/mL GXXGlp in the absence of TFA, and (C) 16.7 mg/mL GXXGlp in the absence of TFA. Differences in stability are attributed to the presence of TFA and the aggregate species when compared to the optimized conditions sample in the absence of TFA at lower concentration of the peptide.

Figure 7. Model of the inter-molecular salt-bridge interaction that mediates the self-association of 50.1 mg/mL GXXGlp in the absence of TFA, along with the empirical evidence of the interaction via 2D IR correlation plots. (A) A representative cartoon model depicting the GXXGlp based on the Saccharomyces cerevisiae Krr1 high resolution structure available in Protein Data Bank (PDB ID: 4QMF). The N-terminal and C-terminal ends of the peptide are indicated in blue and red, respectively. The ball-and-stick side chains in blue and red correspond respectively to the arginines (R165, 166, 168) located at the C-terminal end and the internal glutamate (E182), showing the inter-molecular salt-bridge interaction. (B-C) 2D IR synchronous and asynchronous plots within the spectral range of 1630-1500 cm-1 and in the temperature range of 5-90°C. These plots provide the molecular evidence of the weak interaction of the inter-molecular salt-bridge that is responsible for the aggregation (self-association) of the peptide.
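The synchronous and asynchronous plots named in the captions for Figures 4, 5, and 7 are the two outputs of generalized 2D correlation analysis (refs 43-46). The following is a minimal sketch of that calculation, not the software used in this study. It assumes the mean spectrum as the reference and uses the Hilbert-Noda transformation matrix for the asynchronous map. The function names, the 5 °C temperature step, the band widths, and the two sigmoidal intensity profiles are illustrative assumptions, and the spectra are synthetic rather than measured.

```python
import numpy as np


def hilbert_noda_matrix(m):
    """Return the m x m Hilbert-Noda matrix: 0 on the diagonal, 1/(pi*(k-j)) elsewhere."""
    idx = np.arange(m)
    diff = idx[None, :] - idx[:, None]          # element [j, k] holds k - j
    noda = np.zeros((m, m))
    off_diag = diff != 0
    noda[off_diag] = 1.0 / (np.pi * diff[off_diag])
    return noda


def two_d_correlation(spectra):
    """Synchronous and asynchronous maps for spectra of shape (m steps, n wavenumbers)."""
    spectra = np.asarray(spectra, dtype=float)
    m = spectra.shape[0]
    if m < 2:
        raise ValueError("need at least two spectra along the perturbation axis")
    dynamic = spectra - spectra.mean(axis=0)    # assumption: mean spectrum as reference
    sync = dynamic.T @ dynamic / (m - 1)
    asyn = dynamic.T @ hilbert_noda_matrix(m) @ dynamic / (m - 1)
    return sync, asyn


# Synthetic illustration only (hypothetical intensities, not the paper's data).
temps = np.arange(5, 95, 5)                     # 5..90 degrees C in assumed 5-degree steps
wn = np.arange(1500.0, 1751.0, 1.0)             # 1500..1750 cm-1


def band(center, width=8.0):
    """Gaussian band shape on the wavenumber grid (width is an assumed value)."""
    return np.exp(-0.5 * ((wn - center) / width) ** 2)


helix_loss = 1.0 / (1.0 + np.exp(-(temps - 45.0) / 8.0))   # assumed midpoint 45 C
agg_gain = 1.0 / (1.0 + np.exp(-(temps - 60.0) / 8.0))     # assumed midpoint 60 C

# One band decreasing (1642.2 cm-1) and one band increasing (1620.7 cm-1) with temperature.
spectra = np.outer(1.0 - helix_loss, band(1642.2)) + np.outer(agg_gain, band(1620.7))
sync, asyn = two_d_correlation(spectra)
```

The synchronous map is symmetric and the asynchronous map is antisymmetric about the diagonal. Bands whose intensities change in opposite directions give a negative synchronous cross peak, and bands that change with exactly the same temperature profile give no asynchronous cross peak. The signs of the cross peaks in the two maps, read together, are what allow a sequential order of events to be proposed.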


TABLES

Table 1. Summary of Peak Assignments for the Fully H→D Exchanged GXXGlp in D2O in the Spectral Region of 1750−1500 cm-1 as Determined by 2D IR Correlation Spectroscopy Analysis.

                                       GXXGlp Peak Positions (cm-1)
Peak Assignment                Mean      Presence of TFA    Absence of TFA,       Absence of TFA,
                                                            high concentration    low concentration
TFA                            1675.3    1675.3             ---a                  ---a
Backbone modes
  loopA                        1684.3    1693.7             1679.6                1679.6
  loopB                        1661.5    1658.7             1661.1                1664.8
  α-helix                      1642.2    1642.3             1640.5                1643.7
  β-strand                     1624.1    1623.3             1622.3                1626.7
Aggregation                    1620.7    ---b               1620.7                ---b
Side chain modes
  Arginine guanidinium
    νa(N-D)                    1605.7    1606.5             1600.9                1609.7
    νs(N-D)                    1585.5    1584.5             1587.4                1584.7
  Glutamate νs(COO-)           1558.7    1564.5             1546.4                1565.3

a N/A: the sample does not contain TFA. b The sample is aggregate free.
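The "Mean" column of Table 1 appears to be the arithmetic mean of each band position over the samples in which that band was observed. That reading is an inference from the tabulated numbers and is not stated in the table. A short check under that assumption, with the values transcribed from Table 1 and the names `table1` and `mean_position` chosen here for illustration:

```python
# Peak positions (cm-1) transcribed from Table 1.
# Each entry: reported mean, then [with TFA, no TFA high conc., no TFA low conc.].
# None marks a band that is absent from that sample (footnotes a and b).
table1 = {
    "TFA": (1675.3, [1675.3, None, None]),
    "loopA": (1684.3, [1693.7, 1679.6, 1679.6]),
    "loopB": (1661.5, [1658.7, 1661.1, 1664.8]),
    "alpha-helix": (1642.2, [1642.3, 1640.5, 1643.7]),
    "beta-strand": (1624.1, [1623.3, 1622.3, 1626.7]),
    "aggregation": (1620.7, [None, 1620.7, None]),
    "Arg va(N-D)": (1605.7, [1606.5, 1600.9, 1609.7]),
    "Arg vs(N-D)": (1585.5, [1584.5, 1587.4, 1584.7]),
    "Glu vs(COO-)": (1558.7, [1564.5, 1546.4, 1565.3]),
}


def mean_position(values):
    """Arithmetic mean over the samples in which the band is present."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)


# Difference between the recomputed and the reported mean for every band.
deviations = {
    name: abs(mean_position(values) - reported)
    for name, (reported, values) in table1.items()
}
# Bands whose recomputed mean agrees with the reported mean to one decimal place.
consistent = [name for name, dev in deviations.items() if dev < 0.051]
```

Under this assumption, every recomputed mean agrees with the reported value to the one decimal place given in the table. These are also the positions quoted in the text for the helical, β-strand, arginine, and glutamate bands.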
