Identification of Formaldehyde-Induced Modifications in Proteins

Bioconjugate Chem. 2006, 17, 815−822


Identification of Formaldehyde-Induced Modifications in Proteins: Reactions with Insulin Bernard Metz,*,†,‡ Gideon F. A. Kersten,† Gino J. E. Baart,† Ad de Jong,† Hugo Meiring,† Jan ten Hove,† Mies J. van Steenbergen,‡ Wim E. Hennink,‡ Daan J. A. Crommelin,‡ and Wim Jiskoot‡,§ Unit Research and Development, The Netherlands Vaccine Institute (NVI), Bilthoven, The Netherlands, Department of Pharmaceutics, Utrecht Institute for Pharmaceutical Sciences (UIPS), Faculty of Pharmaceutical Sciences, Utrecht University, Utrecht, The Netherlands, and Department of Drug Delivery Technology, Leiden/Amsterdam Center for Drug Research (LACDR), Leiden University, Leiden, The Netherlands. Received November 24, 2005; Revised Manuscript Received April 6, 2006

Formaldehyde is frequently used to inactivate, stabilize, or immobilize proteins. The treatment results in a large variety of chemical modifications in proteins, such as the formation of methylol groups, Schiff bases, and methylene bridges. The purpose of the present study was to identify the stable formaldehyde-induced modifications in a small protein. Therefore, insulin was treated with excess formaldehyde (CH2O) or deuterated formaldehyde (CD2O). In a separate experiment, insulin was modified by formaldehyde (CH2O vs CD2O) and glycine. The mixture of CH2O-treated and CD2O-treated insulin was digested by the proteinase Glu-C. The peptide fragments obtained were analyzed by liquid chromatography-mass spectrometry (LC-MS). Seven intramolecular cross-links were identified in formaldehyde-treated insulin. Furthermore, eight out of the sixteen potentially reactive sites of the insulin molecule were modified by incubation with formaldehyde and glycine. Both the location and the chemical nature of the modifications could be assigned based on the mass increase of potential adducts as elucidated in our previous study (B. Metz et al. (2004) J. Biol. Chem. 279, 6235-6243). To confirm the assigned structures, LC-MS measurements with collision-induced dissociation (LC-MS/MS) were performed on insulin fragments. The results of the LC-MS/MS analyses agreed excellently with the assignments. The study showed that arginine, tyrosine, and lysine residues were very reactive. However, eight theoretically reactive residues did not show detectable modifications, probably because of their low intrinsic reactivity, inaccessibility, or both. The asparagine, glutamine, and histidine residues were not converted in insulin. The N-termini of insulin were partly converted to the expected imidazolidinone adducts, indicating that the protein conformation affects the accessibility and reactivity of these residues.
In conclusion, this study shows that, based on our current insights into the chemistry of the reactions between proteins and formaldehyde, we are able to elucidate the location and nature of formaldehyde-induced modifications in a small protein. The approach followed in this study may be generally applicable to larger formaldehyde-treated proteins, such as toxoids used in vaccines.

INTRODUCTION

In the biochemical and pharmaceutical fields, chemical modifications are often introduced in proteins to add useful properties. Chemical treatment is utilized for bioconjugation of biologicals to reduce the immunogenicity (1-3), for stabilization of biopharmaceuticals to protect against proteolytic activity (4-7), and for coupling of functional groups to proteins (8-10). Formaldehyde is one of the well-known chemicals that is frequently applied as a cross-linking agent, for example, for the preparation of vaccines (11-14), for isotope-labeling of proteins (15-17), and for studying protein-protein interactions (18-20). However, the identification of the formaldehyde-modified sites in the primary sequence of proteins is cumbersome because numerous chemical modifications occur in proteins during the treatment with formaldehyde. During the past 60 years, extensive studies have revealed that formaldehyde can modify several amino acid residues (4, 21-28). Only a few of these articles describe investigations of formaldehyde-induced modifications in proteins (25-28), but the nature of each specific modification and the exact location of the modified residues were not identified.

* Corresponding author. Mailing address: P.O. Box 457, 3720 AL Bilthoven, The Netherlands. Phone: +31 30 27 433 73. Fax: +31 30 27 444 26. E-mail: [email protected]. † The Netherlands Vaccine Institute (NVI). ‡ Utrecht University. § Leiden University.

Table 1. Mass (∆m) Increase Due to Formaldehyde Modifications of Reactive Amino Acid Residues^a

| residue | methylol ∆m | imine ∆m | formaldehyde-glycine adduct ∆m |
|---|---|---|---|
| arginine | 30 | b | 99/198 |
| asparagine | b | b | 87 |
| cysteine | 30 | b | b |
| glutamine | b | b | 87 |
| histidine | 30 | b | 87 |
| lysine | 30 | 12 | b |
| tryptophan | 30 | 12 | 87 |
| tyrosine | b | b | 87/174 |
| N-terminal amino acid | b | 12 | 99 |

^a Results from studies with peptides (29). ^b No modification observed.
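For readers who want to use Table 1 programmatically, the following is a minimal sketch that encodes the table as a lookup. The names `TABLE_1` and `possible_increments` are illustrative choices and do not come from the original study; the numbers are copied from Table 1, with an empty tuple standing for "no modification observed".

```python
# Mass increments (Da) copied from Table 1.
# Each entry: (methylol, imine, formaldehyde-glycine adduct);
# an empty tuple means "no modification observed".
TABLE_1 = {
    "arginine": ((30,), (), (99, 198)),
    "asparagine": ((), (), (87,)),
    "cysteine": ((30,), (), ()),
    "glutamine": ((), (), (87,)),
    "histidine": ((30,), (), (87,)),
    "lysine": ((30,), (12,), ()),
    "tryptophan": ((30,), (12,), (87,)),
    "tyrosine": ((), (), (87, 174)),
    "n_terminus": ((), (12,), (99,)),
}


def possible_increments(residue):
    """Return the sorted, distinct mass increments listed for a residue."""
    methylol, imine, adduct = TABLE_1[residue]
    return sorted(set(methylol + imine + adduct))


if __name__ == "__main__":
    for name in TABLE_1:
        print(name, possible_increments(name))
```

Such a lookup only restates the table; it says nothing about which increments are actually observed in a folded protein.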

In a recent study with model peptides, we elucidated the structures of formaldehyde-induced modifications and the intrinsic reactivity of each amino acid residue (29). The reaction of formaldehyde with a peptide or protein starts with the formation of unstable methylol adducts on amino and thiol groups of arginine, cysteine, histidine, lysine, and tryptophan residues (Table 1). The methylol groups partially dehydrate on lysine and tryptophan residues, yielding labile Schiff bases (Scheme 1). The Schiff base on a lysine residue can form stable cross-links with several amino acid residues, including arginine, asparagine, glutamine, histidine, tryptophan, and tyrosine. Also, our previous study confirmed other studies showing that the N-terminal residues are modified after formaldehyde treatment, resulting in an imidazolidinone (29-32). However, the effect of the protein conformation on the reactivity of amino acid residues cannot be studied in model peptides. In addition, the position and local environment of each intrinsically reactive amino acid in the protein may affect the reactivity.

The aim of the present study was to investigate whether we can elucidate formaldehyde-induced and formaldehyde/glycine-induced modifications in proteins by applying the knowledge obtained from our previous study with model peptides (29). For this purpose, bovine insulin was used as a model protein. Insulin is composed of two polypeptide chains: the A-chain consists of 21 amino acid residues, and the B-chain contains 30 amino acid residues; the two chains are interconnected via two disulfide linkages, CysA7-CysB7 and CysA20-CysB19 (Figure 1). Moreover, the A-chain contains an intrachain disulfide bridge (CysA6-CysA11). The protein can also appear in different quaternary structures, as monomers, dimers, and hexamers (33). For example, insulin molecules associate into hexameric complexes in the presence of two zinc ions and at a neutral pH (34). In this study, insulin was treated with either native formaldehyde or deuterated formaldehyde (CH2O vs CD2O). In a parallel experiment, formaldehyde treatments of insulin were also performed in the presence of glycine. The reaction conditions were largely based on the detoxification process of diphtheria toxin for vaccine production (11). After the chemical treatment, mixtures of equimolar amounts of CH2O-treated and CD2O-treated insulin were enzymatically digested into small peptides, which were subsequently analyzed by reversed-phase liquid chromatography, electrospray ionization mass spectrometry (LC-MS), and LC-MS measurements with collision-induced dissociation (LC-MS/MS). Using this approach, we were able to unequivocally identify the chemical nature and location of the formaldehyde-induced modifications in the protein.

10.1021/bc050340f CCC: $33.50 © 2006 American Chemical Society. Published on Web 04/28/2006

Bioconjugate Chem., Vol. 17, No. 3, 2006

Scheme 1. The Reaction of Formaldehyde with Proteins^a

^a The reaction starts with the formation of methylol adducts on amino groups [1]. The methylol adducts of primary amino groups are partially dehydrated, yielding labile Schiff bases [2], which can form cross-links with several amino acid residues, for example, with tyrosine [3]. After addition of formaldehyde, a 4-imidazolidinone adduct can be formed at the N-terminal site of the protein, probably via a Schiff base intermediate [4].

Figure 1. The primary structure of bovine insulin. The A-fragment and B-fragment of insulin are connected to each other via two disulfide bridges. Furthermore, the A-fragment has an intrachain disulfide bridge. In theory, residues given in black are reactive with formaldehyde and glycine.

Table 2. Composition of Reaction Mixtures Used in This Study

| mixture | insulin, 0.35 mM (mL) | formaldehyde, 1 M (µL) | glycine, 1 M (µL) | NaCNBH3, 1 M (µL) |
|---|---|---|---|---|
| 1 | 3.36 | 320 | a | a |
| 2 | 3.36 | 320 | 320 | a |
| 3 | 3.36 | 320 | a | 320 |

^a Not applicable.

MATERIALS AND METHODS

Chemicals. Formaldehyde (37% (w/v) in water with 10% methanol), formic acid (99%), glycine, potassium dihydrogen phosphate (KH2PO4·3H2O), and dipotassium hydrogen phosphate (K2HPO4·3H2O) were purchased from Merck (Amsterdam, The Netherlands). Formaldehyde-D2 (CD2O) was obtained from C/D/N Isotopes Inc. (Utrecht, The Netherlands). Insulin from bovine pancreas, β-lactoglobulin, DL-dithiothreitol (DTT), and sodium cyanoborohydride (NaCNBH3) were obtained from Sigma (Zwijndrecht, The Netherlands). Endoproteinase Glu-C was bought from Roche Applied Science (Almere, The Netherlands).

Size-Exclusion Chromatography. Size-exclusion chromatography (SEC) was performed to determine the quaternary structure of bovine insulin prior to chemical treatment. The method used was adapted from Ahmad et al. (35). SEC was performed on a Waters Alliance 2695 instrument equipped with a Superdex 75 10/300GL column from GE Healthcare (exclusion limit 1 × 10^5 Da) at a flow rate of 0.5 mL/min with detection at 280 nm. Bovine insulin (MW 5.7 kDa) was dissolved in 10 mM potassium phosphate, pH 8.5 (the buffer used for the reactions with formaldehyde), or in 20% acetic acid (in which hexameric insulin dissociates to form monomers (35)) to a final concentration of 0.35 mM. In addition, β-lactoglobulin (MW 36 kDa) was used as a marker. Aliquots of 200 µL were loaded on the column, which was equilibrated with 10 mM potassium phosphate, pH 7.2.

Reactions of Insulin with Formaldehyde. Three different reactions with insulin were performed in this study: (i) insulin with formaldehyde, (ii) insulin with formaldehyde and glycine, and (iii) insulin with formaldehyde and NaCNBH3. For each reaction, two identical mixtures were prepared except for the type of formaldehyde: either native formaldehyde (CH2O) or deuterium-labeled formaldehyde (CD2O) was used. Prior to the reactions, insulin was dissolved in 10 mM potassium phosphate, pH 8.5, to a concentration of 0.35 mM, and formaldehyde (CH2O or CD2O), glycine, and NaCNBH3 were each prepared as 1.0 M solutions. The compositions of the reaction mixtures are given in Table 2. The final concentration of insulin during the reaction was 0.30 mM, and the final concentrations of formaldehyde, glycine, and NaCNBH3 were 80 mM. After mixing, the solution was incubated for 1 week at 35 °C. Samples were extensively dialyzed against 10 mM potassium phosphate, pH 8.5 (MWCO 1000). After incubation and dialysis, equimolar amounts of CH2O-treated and CD2O-treated insulin were mixed. All samples were stored at 4 °C prior to digestion with proteinase Glu-C.
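The stated final concentrations can be checked against the volumes in Table 2 with simple dilution arithmetic. The sketch below is an illustration only: the function name `final_concentrations` is hypothetical, and it assumes that the volumes are additive and that nothing else is added to the mixture.

```python
def final_concentrations(insulin_ml, insulin_mm, reagents_ul, stock_molar=1.0):
    """Final concentrations (mM) after mixing, assuming additive volumes.

    insulin_ml   -- volume of the insulin solution in mL
    insulin_mm   -- concentration of the insulin solution in mM
    reagents_ul  -- dict mapping reagent name to added stock volume in µL
    stock_molar  -- concentration of every reagent stock in M
    """
    total_ml = insulin_ml + sum(reagents_ul.values()) / 1000.0
    result = {"insulin": insulin_mm * insulin_ml / total_ml}
    for name, volume_ul in reagents_ul.items():
        # stock in mM times the volume fraction contributed by that stock
        result[name] = stock_molar * 1000.0 * (volume_ul / 1000.0) / total_ml
    return result


if __name__ == "__main__":
    # Mixture 2 of Table 2: insulin plus formaldehyde and glycine.
    mixture_2 = final_concentrations(3.36, 0.35, {"formaldehyde": 320, "glycine": 320})
    for name, value in mixture_2.items():
        print(f"{name}: {value:.3f} mM")
```

Under these assumptions, mixtures 2 and 3 (4.0 mL total) give about 0.29 mM insulin and 80 mM of each reagent, in line with the values quoted above. For mixture 1 the same arithmetic gives a slightly smaller total volume and therefore slightly higher concentrations.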


Digestion by Endoproteinase Glu-C. Nontreated insulin and formaldehyde-modified insulin samples were digested by mixing 50 µL of these samples with 5 µL of 1.0 M potassium phosphate buffer, pH 9.0, 1.0 µL of 1 µg/µL endoproteinase Glu-C solution, and 44 µL of water. The mixtures were incubated for 24 h at 37 °C. To reduce disulfide bonds, 1 µL of 1.0 M DTT was added to the digests, and the samples were incubated for 1 h at 37 °C. Subsequently, the samples were stored at -20 °C prior to LC-MS analysis.

Nano-electrospray MS. Insulin was diluted to a concentration of 1 µM in water containing 5% (v/v) DMSO and 5% (v/v) formic acid. Gold-coated nano-electrospray needles (length 600-700 µm; internal diameter 1-2 µm; homemade) were loaded with 10 µL of the sample. The protein solution was analyzed by electrospray ionization mass spectrometry using a Q-TOF Ultima API mass spectrometer (Waters, England). MS scans were obtained from m/z 350 to 2000. Spectra were deconvoluted using the MaxEnt 1 tool in the MassLynx MS software (Waters).

LC-MS. Protein digests were analyzed by nanoscale reversed-phase liquid chromatography electrospray ionization mass spectrometry (Q-TOF Ultima API), essentially as previously described by Meiring et al. (36). Briefly, each digested sample was diluted in water containing 5% (v/v) DMSO and 5% (v/v) formic acid to a concentration corresponding to 2.5 nM of the original insulin concentration. An injection volume of 10 µL was used for analysis. Analytes were trapped on a trapping column of 15 mm (length) × 100 µm (inner diameter) with Aqua C18 (5 µm; Phenomenex) at a flow rate of 3 µL/min with 100% solvent A (0.1 M acetic acid in water) as eluent for 10 min. Then, analytes were separated by reversed-phase chromatography using a 25 cm (length) × 50 µm (inner diameter) analytical column with Pepmap (5 µm; Dionex) at a flow rate of 125 nL/min. A linear gradient was run from 5% solvent B (0.1 M acetic acid in acetonitrile) to 60% solvent B in 55 min.

The digested peptides were measured in the MS mode (m/z 300-2000) to determine the masses of peptides in the mixture. The mass spectrometer was adjusted to the following conditions: the electrospray voltage was set to 1.8 kV, the TOF voltage to 9.1 kV, and the MCP voltage to 2.2 kV. Peptides containing formaldehyde modifications typically appeared as mass spectral doublets as a result of the use of native (CH2O) and deuterated (CD2O) formaldehyde. The location and the nature of a formaldehyde-induced modification were assigned on the basis of the measured mass, the number of incorporated formaldehyde molecules, and the knowledge of potential formaldehyde adducts, the structures of which were elucidated in our previous study with synthetic model peptides (29). From the LC-MS results, a list of masses was compiled of formaldehyde-modified peptides, which were measured in a second run by LC-MS/MS to obtain sequence information and to verify the modified residues. For this purpose, the peptides were analyzed by data-dependent scanning comprising a survey MS scan (m/z 300-2000) followed by collisionally activated decomposition (CAD) of the abundant ion in the MS spectra from the compiled list. The collision energy was set between 15 and 25 V.

Accessibility of Amino Acid Residues. The accessibility of each amino acid residue was calculated from the crystal structure of bovine insulin (37) using the method described by Fraczkiewicz and Werner (38). The accessible surface area of the amino acid residues in monomeric insulin varied from 0% for completely buried residues to 100% for surface residues. Several amino acid residues were not observed as entire structural entities in the crystal structure of bovine insulin (37). Therefore, the accessibility of the residues GlnA5, TyrA14, GlnA15, AsnB3, and GlnB4 was taken from corresponding residues in human insulin (39). Furthermore, for certain residues, the accessibility in the hexameric form differs from that in the monomeric form (40). For these residues, the accessibilities in both forms were considered because of the presence of hexameric insulin.

Conversion of Amino Acid Residues. The conversion (Conv) of a particular residue was estimated based on the peak areas of the modified (Amod) and nonmodified peptide (Anon) containing that particular residue, according to the equation Conv (%) = 100 × Amod/(Amod + Anon). For this estimation, it is assumed that the specific response (peak area/mole) is equal for the modified and nonmodified peptide.

Table 3. List of Expected and Detected Insulin-Derived Peptides after Digestion of Insulin with Proteinase Glu-C

| peptide | fragment | sequence | MH+ (Da)^a |
|---|---|---|---|
| A-Fragment | | | |
| 1 | A1-A4 | GIVE | 417.2 (0.0) |
| 2 | A5-A17 | QCCASVCSLYQLE | 1446.6 (-2.0)^b |
| 3 | A18-A21 | NYCN | 513.2 (0.0) |
| B-Fragment | | | |
| 4 | B1-B13 | FVNQHLCGSHLVE | 1482.7 (0.0) |
| 5 | B14-B21 | ALYLVCGE | 867.4 (0.0) |
| 6 | B22-B30 | RGFFYTPKA | 1086.5 (-0.1) |
| Peptides Connected with Disulfide Links | | | |
| 7 | A5-A17 and B1-B13 | QCCASVCSLYQLE and FVNQHLCGSHLVE | 2924.6 (0.3) |
| 8 | A18-A21 and B14-B21 | NYCN and ALYLVCGE | 1377.6 (0.0) |

^a Eight masses were observed that could be ascribed to peptides from bovine insulin. Deviation from the calculated masses given between brackets. ^b Mass difference of 2 Da most probably due to nonreduced intrachain disulfide bridge (CysA6-CysA11) or reoxidized cysteine residues.
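The expected Glu-C peptides and their MH+ values listed in Table 3 can be approximated with a short in-silico digest. The sketch below is not the authors' software. It assumes standard monoisotopic residue masses, cleavage after every glutamate or aspartate with no missed cleavages, and fully reduced cysteines; the chain sequences are the concatenated Table 3 fragments, and the function names are illustrative.

```python
# Standard monoisotopic residue masses (Da) for the residues present in insulin.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "H": 137.05891, "F": 147.06841, "R": 156.10111,
    "Y": 163.06333,
}
WATER = 18.01056
PROTON = 1.00728

A_CHAIN = "GIVEQCCASVCSLYQLENYCN"           # fragments A1-A4, A5-A17, A18-A21
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKA"  # fragments B1-B13, B14-B21, B22-B30


def glu_c_digest(sequence, sites="ED"):
    """Split a sequence after every residue in `sites` (no missed cleavages)."""
    peptides, start = [], 0
    for index, residue in enumerate(sequence):
        if residue in sites:
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated, unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


if __name__ == "__main__":
    for chain in (A_CHAIN, B_CHAIN):
        for peptide in glu_c_digest(chain):
            print(f"{peptide:15s} {mh_plus(peptide):8.1f}")
```

Disulfide-linked pairs (peptides 7 and 8) are not handled by this sketch, since they would require summing two peptides and subtracting the hydrogens lost on disulfide formation.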

RESULTS

Characterization of Untreated Insulin. Bovine insulin was chosen as a small model protein to study the possible reactions of formaldehyde with amino acid side chains of proteins. Prior to formaldehyde treatment, the composition and purity of the insulin were examined by size-exclusion chromatography (SEC) and nano-electrospray MS analysis. The protein, which can appear as monomers, dimers, and hexamers (33), was analyzed by SEC, which revealed one single peak with an elution time of 25.7 min (data not shown). The elution was close to that of β-lactoglobulin (ca. 21.7 min; MW of 36 kDa), indicating that the insulin dissolved in potassium phosphate buffer (pH 8.5) is completely assembled into hexamers (MW of 34 kDa). The elution time of monomeric insulin (MW of 5.7 kDa) dissolved in 20% acetic acid was 41 min. Nano-electrospray MS analysis showed that the major compound had a mass of 5729 Da (arbitrary intensity 100%), close to the theoretical mass of bovine insulin (5730 Da). Insulin was observed as a monomeric compound by MS as a result of the low concentration and the low pH used for ionization (41, 42). Three other masses of 5659, 5792, and 5855 Da were also observed, with relative intensities of 9%, 33%, and 7%, respectively. The mass of 5659 Da can be ascribed to insulin lacking the alanine residue at the C-terminus, probably as a result of an incorrect cleavage of the proinsulin. The other two masses can be explained by the binding of one or two Zn2+ ions, each taking the position of two hydrogens (∆m = 65 - 2 Da).

Insulin was digested into small peptides by using proteinase Glu-C. This protease cleaves proteins at the C-terminal side of glutamate and aspartate residues (43, 44). In theory, complete cleavage of the insulin and reduction of disulfide bridges will result in six nonoverlapping peptides (Table 3, peptides 1-6). All six expected peptides were detected. Peptide 2 was observed with a mass difference of 2 Da with regard to the expected mass, which is most probably due to a nonreduced intrachain disulfide bridge (CysA6-CysA11) or reoxidized cysteine residues. Also due to incomplete reduction, peptide fragments of the A- and B-chains were observed that are connected to each other with disulfide links (Table 3, peptides 7 and 8). Besides the peptides shown in Table 3, four minor peaks were observed with masses that were 16 Da heavier than the theoretical mass. A few masses with minor intensities were also found that could not directly be related to insulin (not shown). Altogether, the digestion of insulin with proteinase Glu-C resulted in a total of eight peptides that cover the whole protein. In principle, this permits the complete mapping of formaldehyde/glycine-induced modifications in hexameric insulin.

Formaldehyde-Induced Modifications in Insulin. In two separate samples, the insulin hexamers were treated with either CH2O or CD2O. The experiment was aimed at revealing intramolecular cross-links in insulin as a result of the chemical treatment. After the reaction, these two samples were mixed, digested by proteinase Glu-C, and analyzed by LC-MS (details given in Materials and Methods). This approach provided us with a method to discriminate formaldehyde-modified peptides (showing peptide pairs) from unmodified ones (no peptide pairs, see Figure 2A). Peptides containing formaldehyde modifications typically appear as mass spectral doublets because the use of CH2O or CD2O results in peptide pairs with mass differences of n(2 Da), where n is the number of formaldehyde residues incorporated in the peptide. Indeed, the LC-MS analyses revealed seven of these peptide pairs, which eluted simultaneously from the LC column (Table 4). A representative example of a peptide pair is given in Figure 2B. The mass difference between each peptide pair was 2 or 4 Da. The total mass increment of the modified peptides, as compared to the nonmodified peptides, was 12 or 24 Da for CH2O-treated peptides and 14 or 28 Da for CD2O-treated peptides (Table 4). These mass increases correspond to the formation of one and two formaldehyde cross-links in the peptide (-CH2- or -CD2-). So, the observed doublets could be assigned to peptides with an intrachain cross-link. In addition, LC-MS/MS measurements were done to reveal the exact location of individual cross-links in these peptides. The MS/MS spectra showed an increased mass of 12 Da for residues GlyA1, CysA7, TyrA19, PheB1, TyrB16, and TyrB26. A mass increment of 24 Da was also observed for the residue TyrB26. Previous reactions with model peptides demonstrated that primary amino groups and thiol groups react first with formaldehyde and subsequently form cross-links with amino acid residues, for example, tyrosine and glutamine residues (29). Based on the present LC-MS/MS results and the previous study, we conclude that the intrapeptide cross-links had occurred between the TyrA19 and CysA20 residues, the TyrB16 and CysB19 residues, the TyrB26 and LysB29 residues (Figure 2) and, probably, the GlnA5 and CysA7 residues. The formaldehyde-induced modifications of the N-terminal residues, GlyA1 and PheB1, are consistent with the formation of a 4-imidazolidinone, as has been described in many other studies (31, 32, 45) (Scheme 1, reaction 4).

Figure 2. Examples of mass spectra of a native peptide (A), a formaldehyde-modified peptide (B), and a formaldehyde/glycine-modified peptide (C). These three peptides were observed as doubly protonated ions ([M + 2H]2+), corresponding to (A) MH+ = 1086.5 (Table 3, peptide 6), (B) MH+ = 1098.5 (Table 4, peptide 4), and (C) MH+ = 1296.4 (Table 5, peptide 14). From the distance between peptide pairs (B and C), the number of incorporated formaldehyde molecules can be determined. In peptides B and C, one and five formaldehyde molecules were incorporated, respectively.

Formaldehyde/Glycine-Induced Modifications in Insulin. In a parallel experiment, insulin hexamers were treated with either CH2O or CD2O in combination with glycine. In our previous paper (29), we showed that nine different amino acid residues in proteins are modifiable by formaldehyde and glycine (Table 1). According to that study, 16 amino acid residues in insulin can be modified (as indicated in Figure 1), provided that they are easily accessible to formaldehyde and glycine. MS analysis of intact formaldehyde-treated insulin resulted in a broad peak, demonstrating the heterogeneous character of the product (results not shown). Subsequently, the digest of modified insulin was analyzed by LC-MS. The measurements showed 21 masses from peptide pairs, each of which represents one or more formaldehyde/glycine-induced modifications (Table 5). The mass difference between the observed peptide pairs was n(2 Da), with n ranging from one to six, corresponding to the number of incorporated formaldehyde residues (CH2O vs CD2O). An example of a formaldehyde/glycine-modified peptide pair containing five formaldehyde residues is given in Figure 2C. All masses listed in Table 5 could be traced back as modified peptide fragments from insulin. The assignments were based on the insulin sequence and the earlier described formaldehyde modifications (29). Moreover, the number of incorporated formaldehyde molecules corresponded with the prediction. The resulting mass difference with the nonmodified peptides varied between 12 and 372 Da (Table 5). LC-MS/MS measurements were performed to verify the assigned peptide sequences as listed in Table 5 and to reveal the exact location of individual modifications in the peptides. Two examples of MS/MS spectra are given in Figure 3. From the data presented in Figure 3, it appears that the formaldehyde-glycine attachments are not resistant to fragmentation during MS/MS analysis. Apparently, this fragmentation started with the loss of a part of the formaldehyde-glycine adduct from the peptide backbone, followed by backbone fragmentation.
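The doublet rule described above, a spacing of n(2 Da) between the CH2O- and CD2O-labeled forms, can be turned into a small calculation that also accounts for the charge state of the observed ions. The sketch below is illustrative only: the function names are hypothetical, the nominal 2 Da shift per formaldehyde is taken from the text, and the m/z values in the usage example are computed from the quoted MH+ values rather than read from the spectra.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz_from_mh(mh_plus, charge):
    """Convert a singly protonated mass (MH+) to the m/z of the [M + zH]z+ ion."""
    return (mh_plus + (charge - 1) * PROTON) / charge


def formaldehyde_count(mz_light, mz_heavy, charge, shift_per_unit=2.0):
    """Estimate n from the spacing of a CH2O/CD2O doublet.

    The mass difference is the m/z spacing multiplied by the charge;
    each incorporated formaldehyde contributes a nominal 2 Da.
    """
    mass_shift = (mz_heavy - mz_light) * charge
    return round(mass_shift / shift_per_unit)


if __name__ == "__main__":
    # Doubly protonated pair for MH+ = 1296.4 carrying five formaldehyde units:
    light = mz_from_mh(1296.4, 2)
    heavy = mz_from_mh(1296.4 + 5 * 2.0, 2)
    print(round(light, 2), round(heavy, 2), formaldehyde_count(light, heavy, 2))
```

For a doubly protonated ion the doublet therefore appears n m/z units apart rather than 2n, which is why the charge state has to be known before n can be read from the spectrum.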


Table 4. Insulin-Derived Peptides Containing Formaldehyde-Induced Cross-Links

| peptide | MH+ (Da) | peptide sequence^a | fragment | no. of CH2O^d | ∆m^e (Da) | conversion (%) |
|---|---|---|---|---|---|---|
| 1 | 429.2 | GIVE^b | A1-A4 | 1 | 12 | 20.7 |
| 2 | 525.2 | NYCN^c | A18-A21 | 1 | 12 | 35.7 |
| 3 | 879.2 | ALYLVCGE^c | B14-B21 | 1 | 12 | 1.0 |
| 4 | 1098.4 | RGFFYTPKA^c | B22-B30 | 1 | 12 | 33.1 |
| 5 | 1110.4 | RGFFYTPKA^c | B22-B30 | 2 | 24 | 44.8 |
| 6 | 1458.6 | QCCASVCSLYQLE^c | A5-A17 | 1 | 12 | 3.6 |
| 7 | 1496.6 | FVNQHLCGSHLVE^b | B1-B13 | 1 | 12 | 32.9 |

^a Modified residues marked in bold. ^b The N-terminal peptide has formed a 4-imidazolidinone adduct (Scheme 1). ^c Intrachain cross-link formed between underlined residues. ^d Number of formaldehyde residues incorporated in the peptide. ^e Mass increment as compared to nonmodified peptide.
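The conversion column follows the definition given in Materials and Methods, Conv (%) = 100 × Amod/(Amod + Anon). A minimal sketch of that estimate is shown below; the function name and the peak areas in the usage example are made up for illustration, and the calculation inherits the stated assumption that modified and nonmodified peptides give the same response per mole.

```python
def conversion_percent(area_modified, area_nonmodified):
    """Conv (%) = 100 * Amod / (Amod + Anon), from LC-MS peak areas."""
    total = area_modified + area_nonmodified
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * area_modified / total


if __name__ == "__main__":
    # Hypothetical peak areas, in arbitrary units.
    print(conversion_percent(area_modified=1.0, area_nonmodified=3.0))
```

Because the estimate is a ratio of areas, it is insensitive to the absolute amount injected but would be biased if a modification changes the ionization efficiency of the peptide.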

Table 5. Insulin-Derived Peptides with Formaldehyde/Glycine-Modified Amino Acid Residues

| peptide | MH+ (Da) | peptide sequence^a | fragment | no. of CH2O^d | ∆m^e (Da) | conversion^f (%) |
|---|---|---|---|---|---|---|
| 1 | 429.2 | GIVE^b | A1-A4 | 1 | 12 | 33.0 ± 25.9 |
| 2 | 525.2 | NYCN^c | A18-A21 | 1 | 12 | 31.3 ± 4.7 |
| 3 | 879.4 | ALYLVCGE | B14-B21 | 1 | 12 | 2.4 ± 1.6 |
| 4 | 954.4 | ALYLVCGE | B14-B21 | 1 | 87 | 38.6 ± 3.7 |
| 5 | 1041.6 | ALYLVCGE | B14-B21 | 2 | 174 | 29.2 ± 5.6 |
| 6 | 1098.6 | RGFFYTPKA^c | B22-B30 | 1 | 12 | 4.4 ± 1.5 |
| 7 | 1110.6 | RGFFYTPKA^c | B22-B30 | 2 | 24 | 1.9 ± 1.2 |
| 8 | 1173.6 | RGFFYTPKA | B22-B30 | 1 | 87 | 2.7 ± 1.0 |
| 9 | 1185.4 | RGFFYTPKA | B22-B30 | 2 | 99 | 17.7 ± 5.6 |
| 10 | 1197.4 | RGFFYTPKA^c | B22-B30 | 3 | 111 (R, +99; Y-K, +12) | 10.8 ± 4.6 |
| 11 | 1272.1 | RGFFYTPKA | B22-B30 | 3 | 186 (R, +99; Y, +87) | 7.9 ± 2.0 |
| 12 | 1284.4 | RGFFYTPKA | B22-B30 | 4 | 198 | 12.1 ± 1.9 |
| 13 | 1284.4 | RGFFYTPKA^c | B22-B30 | 4 | 198 (R, +99; Y, +87; Y-K, +12) | 6.4 ± 1.5 |
| 14 | 1296.4 | RGFFYTPKA^c | B22-B30 | 5 | 210 (R, +198; Y-K, +12) | 8.7 ± 2.3 |
| 15 | 1371.1 | RGFFYTPKA | B22-B30 | 5 | 285 (R, +198; Y, +87) | 4.7 ± 0.9 |
| 16 | 1383.6 | RGFFYTPKA^c | B22-B30 | 6 | 297 (R, +198; Y, +87; Y-K, +12) | 11.0 ± 3.6 |
| 17 | 1458.5 | RGFFYTPKA | B22-B30 | 6 | 372 (R, +198; Y, +174) | 0.9 ± 0.7 |
| 18 | 1458.6 | QCCASVCSLYQLE^c | A5-A17 | 1 | 12 | 0.4 ± 0.4 |
| 19 | 1533.6 | QCCASVCSLYQLE | A5-A17 | 1 | 87 | 44.6 ± 6.4 |
| 20 | 1620.6 | QCCASVCSLYQLE | A5-A17 | 2 | 174 | 53.4 ± 6.7 |
| 21 | 1494.7 | FVNQHLCGSHLVE^b | B1-B13 | 1 | 12 | 39.6 ± 28.2 |
| 22 | 1581.7 | FVNQHLCGSHLVE^b | B1-B13 | 2 | 99 | 9.0 ± 6.3 |

^a Modified residues marked in bold. ^b The N-terminal peptide has formed a 4-imidazolidinone adduct (Scheme 1). ^c Intrachain cross-link formed between underlined residues. ^d Number of formaldehyde residues incorporated in the peptide. ^e Mass increment as compared to nonmodified peptide. ^f Mean ± SD; n = 3.
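The composite mass increments listed for the B22-B30 peptide (RGFFYTPKA) are sums of the per-residue increments from Table 1. The sketch below enumerates which combinations add up to a given ∆m. It is an illustration under simplifying assumptions: only ArgB22 (+99 or +198), TyrB26 (+87 or +174), and the Tyr-Lys cross-link (+12 or +24) are considered, and the names `OPTIONS` and `candidate_assignments` are hypothetical.

```python
from itertools import product

# Allowed mass increments (Da) per site in RGFFYTPKA; 0 means unmodified.
OPTIONS = {
    "R": (0, 99, 198),    # formaldehyde-glycine adducts on ArgB22
    "Y": (0, 87, 174),    # formaldehyde-glycine adducts on TyrB26
    "Y-K": (0, 12, 24),   # methylene bridges between TyrB26 and LysB29
}


def candidate_assignments(delta_m):
    """List every combination of site increments that sums to delta_m."""
    sites = list(OPTIONS)
    hits = []
    for combo in product(*(OPTIONS[site] for site in sites)):
        if sum(combo) == delta_m:
            # Keep only the sites that actually carry a modification.
            hits.append({s: v for s, v in zip(sites, combo) if v})
    return hits


if __name__ == "__main__":
    for delta in (111, 198, 372):
        print(delta, candidate_assignments(delta))
```

Several increments, for example 198 Da, have more than one arithmetically valid decomposition, which is why mass alone cannot settle the assignment and sequence-level evidence from MS/MS is needed.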

Figure 3. MS/MS analysis of two peptide fragments of insulin containing formaldehyde-induced modifications. Spectrum A is obtained from the peptide (Table 5, peptide 11, MH+ = 1272.1) that contained a formaldehyde-glycine adduct on an arginine (B22) and on a tyrosine (B26) residue and is dominated by a y-ion series. Spectrum B is acquired from the peptide (Table 5, peptide 14, MH+ = 1296.4) with an intramolecular cross-link between a tyrosine (B26) and a lysine (B29) residue and two formaldehyde-glycine adducts on the arginine residue.

Fortunately, a small mass tag of 12 Da was left on the modified residue and enabled the identification of the modified sites. When two formaldehyde-glycine adducts were attached to tyrosine or arginine residues (resulting in mass increments of 174 and 198, respectively), the LC-MS/MS analyses demon-

strated a residual mass tag of 24 Da. All peptides were sequenced by MS/MS, and the results corresponded to the assignment based on MS data. Formaldehyde-induced modifications were observed on eight residues: GlyA1, TyrA14, TyrA19, PheB1, TyrB16, ArgB22, TyrB26, and LysB29. However, there was

820 Bioconjugate Chem., Vol. 17, No. 3, 2006

Metz et al.

Figure 4. Formaldehyde-induced cross-links in an insulin-derived peptide (Table 4, peptide 4 and 5).

no evidence of modification of eight other residues, GlnA4, GlnA15, AsnB3, AsnB3, AsnB3, GlnB4, HisB5, and HisB10, although these types of residues have been shown before to be reactive with formaldehyde and glycine (29). The same intramolecular cross-links were expected in insulin as those observed after the reaction with formaldehyde alone. Indeed, nine insulin-derived peptides could be ascribed to peptides with an intrachain crosslink and verified by LC-MS/MS (Table 5). In summary, LCMS/MS analyses could identify the modified amino acid residues in insulin. Conversion and Accessibility of Formaldehyde/Glycine Modifiable Residues. In the previous section, the identification of formaldehyde/glycine-induced modifications in insulin hexamers was described. However, eight expected modifications in insulin were missing, for example, the formaldehyde-glycine adducts upon asparagine, glutamine, and histidine residues. No or little conversion observed is most probably due to poor intrinsic reactivity (29), poor or partial accessibility, or both in the native insulin molecule (37-39). In this section, the accessibility and the intrinsic reactivity are compared to the conversion of the formaldehyde reactive residues. The conversion of each modified residue was calculated based on the peak areas of the formaldehyde/glycinemodified peptide and the corresponding nonmodified peptide (Table 6). The intrinsic reactivity values for each modifiable residue were previously assessed (29) and are also given in Table 6. Furthermore, the accessibility of each amino acid residue was calculated from the crystal structure of both monomeric and hexameric bovine insulin (37) using a described method of Fraczkiewicz and Werner (38, 39) (Table 6). Consistent with our previous study (29), it was shown that arginine and tyrosine residues were very reactive (Table 6). Nevertheless, a large variation in conversion was found between the different tyrosine residues, probably, as a result of

accessibility. The most accessible tyrosine residue (TyrA14) was highly modified by formaldehyde and glycine (97%), whereas a buried residue (TyrA19) was converted in a much lesser degree (36%). Substantial conversion (59%) was observed for a fully buried tyrosine residue (TyrB26), but 32% of this conversion was due to the formation of methylene bridges between LysB29 and TyrB26. A methylene bridge (-CH2-) is smaller than a formaldehyde-glycine adduct, which might explain the relatively high conversion of this tyrosine residue. No modifications of asparagine, glutamine, and histidine residues were observed. The low intrinsic reactivity of these residues together with their poor accessibility (between 13% and 68%) in hexameric insulin are likely key to the absence of conversion. Only moderate conversion of the N-terminal amino groups (residues GlyA1 and PheB1) into imidazolidinone adducts was observed after either formaldehyde treatment or formaldehydeglycine treatment (Tables 4 and 6). However, high conversion was expected, because in general imidazolidinone adducts are rapidly formed by the reaction of formaldehyde with N-terminal amino groups (29-32). Therefore, the reactivity of N-terminal amino groups in insulin for formaldehyde was studied in more detail. Insulin was treated with formaldehyde (CH2O or CD2O) and NaCNBH3 to study the accessibility to formaldehyde of the N-terminal residues GlyA1 and PheB1. Full accessibility would be reflected by full conversion of the N-terminal amines to dimethylated amine groups. Indeed, LC-MS analyses revealed masses corresponding to N-terminal insulin fragments of the A-chain and the B-chain with a dimethylated GlyA1 or a dimethylated PheB1 residue as confirmed by MS/MS. Moreover, these residues, GlyA1 and PheB1, were completely converted (100%). We can therefore conclude that poor accessibility of the N-termini to formaldehyde was not the reason for the partial imidazolinone formation observed. 
Instead, the second step in the imidazolidinone formation may be hampered in the A-chain and B-chain of insulin as a result of poor access to the adjacent residues, IleA2 and ValB2, which take part in the reaction with formaldehyde (20). Indeed, according to the crystal structure, these amino acids are buried by surrounding residues: the accessibility of IleA2 and ValB2 is 1.0% and 0.0%, respectively (37, 38). These results underline the vital influence of protein conformation on the conversion of intrinsically reactive sites.
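The conversion values discussed above are derived from LC-MS peak areas of the modified and the corresponding nonmodified peptide. The following is a minimal sketch of such a calculation, assuming that conversion is the modified peak area divided by the sum of both areas; the exact formula is given in Materials and Methods and is not reproduced here. The function names and the peak areas are hypothetical and serve only as an illustration.

```python
from statistics import mean, stdev


def conversion_percent(area_modified: float, area_unmodified: float) -> float:
    """Percent conversion from the peak areas of one peptide pair.

    Assumption: conversion = modified / (modified + nonmodified) * 100.
    """
    total = area_modified + area_unmodified
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * area_modified / total


def summarize(replicates):
    """Return (mean, sample SD) of the conversion over replicate runs."""
    values = [conversion_percent(mod, unmod) for mod, unmod in replicates]
    return mean(values), stdev(values)


# Hypothetical peak areas (modified, nonmodified) for three replicates.
example_runs = [(30.0, 70.0), (35.0, 65.0), (25.0, 75.0)]
avg, sd = summarize(example_runs)
print(f"conversion = {avg:.1f} ± {sd:.1f} %")
```

Reporting the mean together with the sample standard deviation mirrors the "mean ± SD; n = 3" convention used for the conversion column of Table 6.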

Table 6. Formaldehyde/Glycine-Reactive Residues in Insulin

residue^a      position  expected ∆m  observed ∆m  accessibility^c,e (%)  intrinsic reactivity^f (%)  conversion^g (%)
glycine        A1        12/99        12           55.3 (80.1)            89                          33.0 ± 25.9
glutamine      A5        87           h            46.8^d (36.1)          4                           0
tyrosine       A14       87/174       87/174       89.5^d (89.1)          67                          98.1 ± 0.4
glutamine      A15       87           h            49.5^d (49.5)          4                           0
asparagine     A18       87           h            65.4 (67.6)            4                           0
tyrosine       A19       87/174       87/174       20.1 (18.9)            67                          31.3 ± 4.7
asparagine     A21       87           h            90.1 (63.3)            4                           0
phenylalanine  B1        12/99        12/99        59.3 (86.3)            89                          48.6 ± 34.5
asparagine     B3        87           h            69.4^d (36.6)          4                           0
glutamine      B4        87           h            64.4^d (46.7)          4                           0
histidine      B5        87           h            56.6 (16.4)            7                           0
histidine      B10       87           h            76.5 (13.3)            7                           0
tyrosine       B16       87/174       87/174       73.0 (1.2)             67                          59.4 ± 5.1
arginine       B22       99/198       99/198       53.0 (59.0)            97                          80.3 ± 3.9
tyrosine       B26       87/174       87/174       28.0 (0.0)             67                          59.4 ± 5.1
lysine         B29       12/24^b      12/24        96.6 (71.5)            i                           32.2 ± 4.8

a Possible formaldehyde/glycine-reactive amino acid residues in bovine insulin are listed based on our previous study (29).
b Formaldehyde treatment causes intramolecular cross-links between lysine residues and other amino residues, resulting in a mass increase of 12 or 24 Da (29).
c Accessibility of amino acids in monomeric bovine insulin (see Materials and Methods for further details).
d The accessibility of these residues was calculated based on the structure of monomeric human insulin (48).
e Values given in parentheses represent the accessibility of the residue in hexameric insulin (40).
f The intrinsic reactivity of each type of amino acid residue was assessed previously, except for the lysine residue (29). The table lists the expected conversions after a reaction time of 48 h.
g The conversion was determined after a reaction time of 1 week (see Materials and Methods for further details). Mean ± SD; n = 3.
h No change.
i Unknown.
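The expected mass increases in Table 6 can be read as a lookup from residue type to the nominal ∆m values that a modification of that residue would produce. The sketch below illustrates how an observed mass increase might be matched against those values. It is an illustration only: the dictionary transcribes the "expected ∆m" column, while the function name and the 0.5 Da tolerance are assumptions made for this example and are not taken from the study.

```python
# Expected nominal mass increases (Da) per residue type, transcribed from
# the "expected ∆m" column of Table 6.
EXPECTED_DELTA_M = {
    "glycine": (12, 99),        # N-terminal GlyA1
    "phenylalanine": (12, 99),  # N-terminal PheB1
    "glutamine": (87,),
    "asparagine": (87,),
    "histidine": (87,),
    "tyrosine": (87, 174),
    "arginine": (99, 198),
    "lysine": (12, 24),
}


def candidate_residue_types(observed_delta_m, tolerance=0.5):
    """List residue types whose expected ∆m matches the observed increase.

    The tolerance (in Da) is a hypothetical choice for nominal masses.
    """
    matches = [
        residue
        for residue, deltas in EXPECTED_DELTA_M.items()
        if any(abs(observed_delta_m - d) <= tolerance for d in deltas)
    ]
    return sorted(matches)


# A mass increase of 198 Da is compatible only with arginine in this table.
print(candidate_residue_types(198))
# A mass increase of 87 Da is ambiguous between several residue types.
print(candidate_residue_types(87))
```

A mass increase alone often leaves several candidate residues, which is why the assignments in the study were confirmed by LC-MS/MS.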


DISCUSSION This study demonstrates a great diversity of chemical modifications in the model protein insulin caused by treatment either with formaldehyde or with formaldehyde and glycine. Eight different amino acid residues were modified by the reaction of formaldehyde and glycine. Furthermore, seven distinct intrachain cross-links were identified. The results showed that arginine, tyrosine, and lysine residues were very reactive, whereas little or no conversion was observed for asparagine, glutamine, and histidine residues, which is consistent with our earlier study with model peptides (29). Moreover, the data indicate that the protein conformation can have a serious effect on the reactivity of particular amino acid residues, because formaldehyde and glycine convert the poorly accessible, potentially reactive amino acids only slightly or not at all.

Insulin was used as a model protein to study the reaction of formaldehyde with amino acid residues. The protein was chosen because of its defined structure (35, 37, 39-42). In this study, hexameric insulin was used, which has a moderate size (34 kDa). Complete digestion of native insulin by proteinase Glu-C resulted in a small set of six peptides (Table 3). Nevertheless, after formaldehyde treatment 21 different masses were detected that could be assigned to these peptides containing different modifications (Table 5). As a consequence of the formaldehyde treatment, the modifications formed are distributed over different (monomeric) insulin molecules. This finding underscores the structural heterogeneity of formaldehyde-treated proteins, which will be even more pronounced for proteins larger than monomeric insulin.

In conclusion, we have provided a practical method to reveal the site and chemical nature of each particular formaldehyde-induced modification in proteins.
The approach may be useful to study the complex modifications of larger proteins, such as the formaldehyde treatment of bacterial toxins for preparation of vaccines. The detoxification process of the bacterial toxins is of utmost importance for the antigenicity, immunogenicity, and residual toxicity of the resulting toxoid (11, 46, 47). Detailed knowledge about the chemical modifications in these antigens can help to gain a better insight into the relationship between the structure of the antigens and the safety and efficacy of the corresponding vaccines. Moreover, our strategy used to identify chemical modification sites in proteins may be applicable to many other bioconjugations.

ACKNOWLEDGMENT We thank Wilma Witkamp, Sytse Piersma, Bert Zomer, and Anne Jan Metz for their contributions to this work. This study was, in part, supported by a grant from the “Platform Alternatieven voor Dierproeven” (the Dutch platform on alternatives to animal experiments), Grant No. 3170.0039.

LITERATURE CITED
(1) Caliceti, P., Schiavon, O., and Veronese, F. M. (2001) Immunological properties of uricase conjugated to neutral soluble polymers. Bioconjugate Chem. 12, 515-522.
(2) Hermeling, S., Crommelin, D. J., Schellekens, H., and Jiskoot, W. (2004) Structure-immunogenicity relationships of therapeutic proteins. Pharm. Res. 21, 897-903.
(3) Schellekens, H. (2002) Immunogenicity of therapeutic proteins: clinical implications and future prospects. Clin. Ther. 24, 1720-1740; discussion 1719.
(4) Kelly, D. P., Dewar, M. K., Johns, R. B., Wei-Let, S., and Yates, J. F. (1977) Cross-linking of amino acids by formaldehyde. Preparation and 13C NMR spectra of model compounds. Adv. Exp. Med. Biol. 86, 641-647.
(5) Rice, R. H., Means, G. E., and Brown, W. D. (1977) Stabilization of bovine trypsin by reductive methylation. Biochim. Biophys. Acta 492, 316-321.
(6) Baudys, M., Letourneur, D., Liu, F., Mix, D., Jozefonvicz, J., and Kim, S. W. (1998) Extending insulin action in vivo by conjugation to carboxymethyl dextran. Bioconjugate Chem. 9, 176-183.
(7) Baudys, M., Uchio, T., Mix, D., Wilson, D., and Kim, S. W. (1995) Physical stabilization of insulin by glycosylation. J. Pharm. Sci. 84, 28-33.
(8) Stan, A. C., Radu, D. L., Casares, S., Bona, C. A., and Brumeanu, T. D. (1999) Antineoplastic efficacy of doxorubicin enzymatically assembled on galactose residues of a monoclonal antibody specific for the carcinoembryonic antigen. Cancer Res. 59, 115-121.
(9) Trakselis, M. A., Alley, S. C., and Ishmael, F. T. (2005) Identification and mapping of protein-protein interactions by a combination of cross-linking, cleavage, and proteomics. Bioconjugate Chem. 16, 741-750.
(10) Hentz, N. G., Richardson, J. M., Sportsman, J. R., Daijo, J., and Sittampalam, G. S. (1997) Synthesis and characterization of insulin-fluorescein derivatives for bioanalytical applications. Anal. Chem. 69, 4994-5000.
(11) Rappuoli, R. (1997) New and improved vaccines against diphtheria and tetanus, in New Generation Vaccines (Levine, M. M., Woodrow, G. C., Kaper, J. B., and Cobon, G. S., Eds.) pp 417-435, Marcel Dekker, Inc., New York.
(12) Stapleton, J. T., and Lemon, S. M. (1997) New vaccines against hepatitis A, in New Generation Vaccines (Levine, M. M., Woodrow, G. C., Kaper, J. B., and Cobon, G. S., Eds.) pp 417-435, Marcel Dekker, Inc., New York.
(13) Leppla, S. H., Robbins, J. B., Schneerson, R., and Shiloach, J. (2002) Development of an improved vaccine for anthrax. J. Clin. Invest. 110, 141-144.
(14) Murdin, A. D., Barreto, L., and Plotkin, S. (1996) Inactivated poliovirus vaccine: past and present experience. Vaccine 14, 735-746.
(15) Jentoft, J. E., Jentoft, N., Gerken, T. A., and Dearborn, D. G. (1979) 13C NMR studies of ribonuclease A methylated with [13C]Formaldehyde. J. Biol. Chem. 254, 4366-4370.
(16) Means, G. E., and Feeney, R. E. (1995) Reductive alkylation of proteins. Anal. Biochem. 224, 1-16.
(17) Gold, T. B., Smith, S. L., and Digenis, G. A. (1996) Studies on the influence of pH and pancreatin on 13C-formaldehyde-induced gelatin cross-links using nuclear magnetic resonance. Pharm. Dev. Technol. 1, 21-26.
(18) Kunkel, G. R., Mehrabian, M., and Martinson, H. G. (1981) Contact-site cross-linking agents. Mol. Cell Biochem. 34, 3-13.
(19) Jackson, V. (1999) Formaldehyde cross-linking for studying nucleosomal dynamics. Methods 17, 125-139.
(20) Feldman, M. Y. (1973) Reactions of nucleic acids and nucleoproteins with formaldehyde. Prog. Nucleic Acid Res. Mol. Biol. 131-149.
(21) Fraenkel-Conrat, H., and Mecham, D. K. (1948) The reaction of formaldehyde with proteins VII. Demonstration of intermolecular cross-linking by means of osmotic pressure measurements. J. Biol. Chem. 177, 477-487.
(22) Fraenkel-Conrat, H., and Olcott, H. S. (1948) Reaction of formaldehyde with proteins VI. Cross-linking of amino groups with phenol, imidazole, or indole groups. J. Biol. Chem. 174, 827-843.
(23) Blass, J. (1966) Préparation et étude d'une base de Mannich cristallisée obtenue par action du formol sur mélange de thréonine et de dimethy-2,4-phénol. Bull. Soc. Chim. Fr. 10, 3120-3121.
(24) Blass, J., Bizzini, B., and Raynaud, M. (1965) Mechanisme de la détoxification par le formol. C. R. Acad. Sci. Paris 261, 1448-1449.
(25) Bizzini, B., and Raynaud, M. (1974) La detoxication des toxines proteiques par le formol: mecanismes supposes et nouveaux developpements [Detoxication of protein toxins by formol: supposed mechanisms and new developments]. Biochimie 56, 297-303.
(26) Blass, J., Bizzini, B., and Raynaud, M. (1968) Etude sur le mecanisme de la detoxification des toxines proteiques par le formol. I. Incorporation de la lysine libre radioactive dans la toxine diphterique pure, en presence du formol (detoxification dite “irreversible”). Ann. Inst. Pasteur (Paris) 115, 881-898.
(27) Blass, J., Bizzini, B., and Raynaud, M. (1969) Etude sur le mecanisme de la detoxification des toxines proteiques par le formol. II. Fixation quantitative du formol 14C. [Study on the mechanism of protein toxin detoxification by formol. II. Quantitative fixation of C14 formol]. Ann. Inst. Pasteur (Paris) 116, 501-521.
(28) Blass, B. (1964) Etat actuel de nos connaissances sur le mécanisme de la détoxification par le formol. Biol. Méd. 53, 202-234.
(29) Metz, B., Kersten, G. F. A., Hoogerhout, P., Brugghe, H. F., Timmermans, H. A., de Jong, A., Meiring, H., ten Hove, J., Hennink, W. E., Crommelin, D. J., and Jiskoot, W. (2004) Identification of formaldehyde-induced modifications in proteins: reactions with model peptides. J. Biol. Chem. 279, 6235-6243.
(30) Lin, R. C., Smith, J. B., Radtke, D. B., and Lumeng, L. (1995) Structural analysis of peptide-acetaldehyde adducts by mass spectrometry and production of antibodies directed against nonreduced protein-acetaldehyde adducts. Alcohol: Clin. Exp. Res. 19, 314-319.
(31) Fowles, L. F., Beck, E., Worrall, S., Shanley, B. C., and de Jersey, J. (1996) The formation and stability of imidazolidinone adducts from acetaldehyde and model peptides. A kinetic study with implications for protein modification in alcohol abuse. Biochem. Pharmacol. 51, 1259-1267.
(32) Heck, A. J., Bonnici, P. J., Breukink, E., Morris, D., and Wills, M. (2001) Modification and inhibition of vancomycin group antibiotics by formaldehyde and acetaldehyde. Chemistry 7, 910-916.
(33) Birnbaum, D. T., Dodd, S. W., Saxberg, B. E., Varshavsky, A. D., and Beals, J. M. (1996) Hierarchical modeling of phenolic ligand binding to 2Zn-insulin hexamers. Biochemistry 35, 5366-5378.
(34) Goldman, J., and Carpenter, F. H. (1974) Zinc binding, circular dichroism, and equilibrium sedimentation studies on insulin (bovine) and several of its derivatives. Biochemistry 13, 4566-4574.
(35) Ahmad, A., Millett, I. S., Doniach, S., Uversky, V. N., and Fink, A. L. (2004) Stimulation of insulin fibrillation by urea-induced intermediates. J. Biol. Chem. 279, 14999-15013.
(36) Meiring, H. D., Heeft, E. v. d., Hove, G. J. t., and Jong, A. d. (2002) Nanoscale LC-MS(n): technical design and applications to peptide and protein analysis. J. Sep. Sci. 25, 557-568.
(37) Badger, J., and Caspar, D. L. (1991) Water structure in cubic insulin crystals. Proc. Natl. Acad. Sci. U.S.A. 88, 622-626.
(38) Fraczkiewicz, R., and Braun, W. (1998) Exact and efficient analytical calculation of the accessible surface areas and their gradients for macromolecules. J. Comput. Chem. 19, 319-333.
(39) Smith, G. D., Ciszak, E., Magrum, L. A., Pangborn, W. A., and Blessing, R. H. (2000) R6 hexameric insulin complexed with m-cresol or resorcinol. Acta Crystallogr., Sect. D: Biol. Crystallogr. 56 (Part 12), 1541-1548.
(40) O’Donoghue, S. I., Chang, X., Abseher, R., Nilges, M., and Led, J. J. (2000) Unraveling the symmetry ambiguity in a hexamer: calculation of the R6 human insulin structure. J. Biomol. NMR 16, 93-108.
(41) Whittingham, J. L., Scott, D. J., Chance, K., Wilson, A., Finch, J., Brange, J., and Guy Dodson, G. (2002) Insulin at pH 2: structural analysis of the conditions promoting insulin fibre formation. J. Mol. Biol. 318, 479-490.
(42) Nettleton, E. J., Tito, P., Sunde, M., Bouchard, M., Dobson, C. M., and Robinson, C. V. (2000) Characterization of the oligomeric states of insulin in self-assembly and amyloid fibril formation by mass spectrometry. Biophys. J. 79, 1053-1065.
(43) Sorensen, S. B., Sorensen, T. L., and Breddam, K. (1991) Fragmentation of proteins by S. aureus strain V8 protease. Ammonium bicarbonate strongly inhibits the enzyme but does not improve the selectivity for glutamic acid. FEBS Lett. 294, 195-197.
(44) Gatlin, C. L., Eng, J. K., Cross, S. T., Detter, J. C., and Yates, J. R., 3rd. (2000) Automated identification of amino acid sequence variations in proteins by HPLC/microspray tandem mass spectrometry. Anal. Chem. 72, 757-763.
(45) Braun, K. P., Cody, R. B., Jones, D. R., and Peterson, C. M. (1995) A structural assignment for a stable acetaldehyde-lysine adduct. J. Biol. Chem. 270, 11263-11266.
(46) Rappuoli, R. (1994) Toxin inactivation and antigen stabilization: two different uses of formaldehyde. Vaccine 12, 579-581.
(47) Metz, B., Jiskoot, W., Hennink, W. E., Crommelin, D. J. A., and Kersten, G. F. A. (2003) Physicochemical and immunochemical techniques predict the quality of diphtheria toxoid vaccines. Vaccine 22, 156-167.
(48) Hua, Q. X., Jia, W., Frank, B. H., Phillips, N. F., and Weiss, M. A. (2002) A protein caught in a kinetic trap: structures and stabilities of insulin disulfide isomers. Biochemistry 41, 14700-14715.

BC050340F