Cyanogen Bromide Peptides of the Fibrillar Collagens I, III, and V and Their Mass Spectrometric Characterization: Detection of Linear Peptides, Peptide Glycosylation, and Cross-Linking Peptides Involved in Formation of Homo- and Heterotypic Fibrils
Werner Henkel*,† and Klaus Dreisewerd‡
Am Waldplatz 15, D-33098 Paderborn, Germany, and Institute of Medical Physics and Biophysics, University of Münster, Robert-Koch-Str. 31, D-48149 Münster, Germany
Received May 25, 2007

The network of the fibrillar collagens I, III, and V, extracted from fetal calf skin and cleaved with cyanogen bromide, was studied by means of ultraviolet matrix-assisted laser desorption ionization time-of-flight mass spectrometry (UV-MALDI MS). Nearly all of the expected cyanogen bromide peptides of the different α-chains were detected. Distinct peptides are identified that can serve as reference signals for the individual α-chains. Homo- and heterotypic cross-linking patterns, some of which have not been described before for bovine collagen, are indicated by comparison of the mass spectrometric data with documented amino acid sequences. Potential cross-linking mechanisms are discussed. For example, the mass spectrometric data suggest that the formation of heterotypic I/III and I/V fibrils is substantially determined by the telo-regions of type I collagen, which are covalently connected to the corresponding helical and nonhelical cross-linking domains of adjacent molecules by either 4D- or 0D-stagger bonds. The chemical nature of the cross-links can be inferred. The data also indicate a disturbed formation of heterotypic fibrils. Finally, collagen glycosylation can also be identified.
Keywords: MALDI MS • collagens • cyanogen bromide cleavage • cross-linking • glycosylation

* To whom correspondence should be addressed. Dr. Werner Henkel; Phone, +49-5251-71835; E-mail, [email protected].
† Am Waldplatz 15.
‡ University of Münster.
10.1021/pr070318r CCC: $37.00 © 2007 American Chemical Society
Journal of Proteome Research 2007, 6 (11), 4269-4289

Introduction
Embryonic calf skin contains the three tissue-specific collagens type I, III, and V, all of which are fibril-forming molecules with a cross-striation period of 67 nm, building a network of cross-linked copolymers. Previous studies have established the codistribution of collagens I, III, and V by immunoelectron microscopic methods.1 Despite the eminent physiological importance of the heterotypic interactions (e.g., type V collagen levels play a key role in the assembly of corneal fibrils by regulating the fibril diameter),2 their molecular basis has thus far been investigated in only a few studies.3,4 For biochemical analysis, collagens are frequently cleaved with cyanogen bromide (CNBr) instead of enzymes.5-8 This approach has some advantages: typically, the cleavage reaction is almost complete, and the relatively small number of methionine residues in the α-chains means that on average no more than about a dozen CB peptides per chain are formed. The presence of larger peptides, some of which contain more than 200 amino acids, facilitates their identification and, hence, the determination of the collagen type. CB peptides have been characterized by their amino acid composition,6 their amino acid sequence,9 and, above all, their molecular weight (MW) using sodium dodecyl sulfate (SDS) disk electrophoresis.10 Tandem MS approaches addressing enzymatically digested collagens have also been described. In a recent publication, Zhang et al., for instance, identified collagen type I and II specific tryptic peptides by liquid chromatography-electrospray ionization tandem mass spectrometry (LC-ESI-MS/MS).11 Here, we have applied matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS) to the analysis of linear and cross-linked CB peptides produced from the embryonic calf skin collagens I, III, and V. By their molecular weight, all CB peptides, and also most of their copolymers, fall into the mass range well suited for MALDI-TOF-MS analysis, ranging from a few hundred Da to several tens of kDa. A drawback and complication of the method is the formation of multiplets, owing to microheterogeneities of the α-chains as well as formylation and other reactions occurring during the cyanogen bromide induced cleavage. The nature of the multiplet formation is discussed in detail using the example of four CB peptides of different lengths. In addition to the linear CB peptides, further ion signals are detected. Comparison of their molecular weights with amino acid sequence and cDNA data indicates homo- and heterotypic cross-linking structures. Their possible molecular basis is discussed. Glycosylation of the α-chains, in particular that of the heavily glycosylated type V collagen, is also investigated.
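The link between the number of methionine residues and the number of CB peptides can be illustrated with a minimal in-silico sketch. CNBr cleaves on the C-terminal side of methionine, so a chain with n internal methionines yields n + 1 peptides when cleavage is complete. The function name `cnbr_cleave` and the short toy sequence are illustrative assumptions and are not taken from the study. The sketch also ignores the conversion of the C-terminal methionine to homoserine (lactone) and any missed cleavages.

```python
from typing import List


def cnbr_cleave(sequence: str) -> List[str]:
    """Split a one-letter amino acid sequence after every methionine (M).

    Idealized model: complete cleavage, no side reactions.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue == "M":
            # The methionine stays at the C-terminus of the released peptide.
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Remaining C-terminal peptide without a terminal methionine.
        peptides.append(sequence[start:])
    return peptides


# Hypothetical collagen-like toy sequence with two methionines.
toy_chain = "GPMGPSGPRGMPGAPGPQGFQ"
fragments = cnbr_cleave(toy_chain)
print(len(fragments), fragments)
```

For a real α-chain, the same function applied to a databank sequence would give the list of expected linear CB peptides, which could then be compared with the peptide numbering used in the literature.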


Published on Web 10/16/2007

Experimental Section
Materials. Pepsin was purchased from Serva, cyanogen bromide from Fluka, 98% formic acid from Merck, and sodium iodoacetate from Sigma-Aldrich. Bio-Gel A-15 (200-400 mesh) was from Bio-Rad, and the Mono Q HR 5/5 column was from Pharmacia-LKB. The MALDI matrices 2,5-dihydroxybenzoic acid (2,5-DHB) and 2-hydroxy-5-methoxybenzoic acid, as well as the calibration peptides (bradykinin fragment 1-7, angiotensin I, melittin, insulin, cytochrome C), were from Sigma-Aldrich (Deisenhofen, Germany) and were used without further purification. Aqueous solutions of 10 mg/mL 2,5-DHB and 10 mg/mL 2-hydroxy-5-methoxybenzoic acid (each containing 10% ethanol) were mixed 10:1 (v/v) to produce the DHBs matrix.
General Strategy. The general strategy of the combined approach is sketched in Figure 1. Numbers denote the amounts of material obtained in the study. The preparation of the collagens was described in detail previously12 and comprises pepsin digestion of fetal calf tissue and differential salt precipitation to isolate the three collagen types, followed, in part, by heat denaturation of the collagens and chromatographic separation of the resulting α and β subunits. CNBr cleavage is performed either with the total salt precipitates containing the whole triple-helical molecules (γ) of collagen I, III, and V or with the subunits thereof (α, β). MALDI MS data obtained from the products are compared with those deposited in amino acid sequence and cDNA databanks.
Cyanogen Bromide Cleavage. Fragmentation with cyanogen bromide was performed according to Rauterberg and Kühn5 and adapted to small amounts of collagen proteins; 0.5-1 mg of the collagen samples in a 2 mL screw-cap glass vial were dissolved in 100 µL of 70% formic acid. The solutions were saturated with nitrogen for 1 min; 0.5-1 mg of cyanogen bromide was added, and the suspension was shaken until the cyanogen bromide crystals were dissolved. After passing nitrogen through the solution for 1 min, the solution was incubated for 4 h at 30 °C. The digestion was terminated by adding 100 µL of 0.1 M acetic acid, and the solution was evaporated to dryness under nitrogen. This procedure was repeated twice to remove residues of cyanogen bromide. The dried peptides were taken up in 100 µL of 0.1 M acetic acid. The suspension was transferred from the screw-cap vial into a 1 mL standard Eppendorf vial and centrifuged for 2 min at 14 000 rpm. The clear supernatant was used for mass spectrometric analysis.
Mass Spectrometer and MALDI Sample Preparation. Ultraviolet (UV-)MALDI MS measurements were performed with a time-of-flight mass spectrometer (TOF-MS; Reflex III, Bruker Daltonik, Bremen, Germany), equipped with an N2 laser that emits pulses of 3 ns duration at λ = 337 nm. The focal spot is about 70 µm in diameter. The reflector mode of the instrument was used. MALDI samples were prepared by successively applying 1 µL of DHBs matrix solution and 0.5 µL of collagen sample onto the stainless steel sample plate. Typically, the concentration of the collagen species was 1 mg per 100 µL of 0.1 M acetic acid, corresponding to about 5 µg per 0.5 µL of the solvent; given an approximate average weight of 300 kDa for triple-helical assemblies, this corresponds to about 17 pmol of material used for the analysis. Typically, 2-3 mass spectra were recorded for each sample, covering the lower mass range (up to 5000-7000 Da) and the range above it. Typical MALDI conditions were applied. To obtain an optimal mass resolution of about 5000 (full-width half-maximum, fwhm) for peptide ions in the lower mass range, the oscilloscope, which digitizes the ion detector signal and is equipped with (only) 50k channels for storage of the transient time-of-flight signals, was set to a time resolution of 2 ns. Laser fluence was adjusted to within a factor of 2 above the ion detection threshold fluence (the ion detection threshold can be estimated from other studies as ∼100 J m⁻²).13 To cover the higher mass range, the time resolution of the oscilloscope was changed to 4 and/or 10 ns. To enhance the intensity of ions of very high mass, laser fluences were slightly increased (by about 20%) in some measurements. External mass calibration was achieved with selected standard peptides and proteins (see above) under the conditions used for the subsequent collagen analysis. As is typical for this type of TOF mass analyzer, mass accuracy is affected not only by the mass resolution of the spectrum; the calibration may also shift slightly when moving over the inhomogeneous sample preparation or from spot to spot. Moreover, the applied delayed ion extraction conditions provide an optimal mass resolution only for a limited mass window. For the low-to-intermediate mass range, the overall precision of the mass determination was, therefore, in the range of 100 to a few hundred ppm. We note that internal calibration by adding suitable standards would improve the performance. Experimental m/z values are presented here with one decimal place for ions for which the 12C/13C isotopes are well resolved (up to ∼m/z 3500), whereas whole numbers are listed for ions of larger masses. Because the specific distributions of large CB peptides, which result from microheterogeneity and chemical modifications of the molecules, cannot be resolved above about 10 kDa, broad signals are obtained for heavy ions. Therefore, rounded m/z values are listed where appropriate.
Nomenclature.
The abbreviations used are: CB, cyanogen bromide peptide; he, triple helical; no-he, nontriple helical; Gal, galactose; Glc, glucose; C- and N-telopeptides, short nontriple-helical sequences adjoining the carboxy and amino ends of the main triple helix of type I, III, and V collagen chains; D-period, the 67 nm repeating periodicity of type I, III, and V collagen fibrils.
Composition of Type I, III, and V Collagen. Triple helices of type I and V collagen are of the two-chain form [α1(I)]2[α2(I)] and [α1(V)]2[α2(V)], respectively, and are held together essentially by noncovalent bonds generated by electrostatic forces such as hydrogen bonds,37 hydrophobic bonding,37 and van der Waals interactions resulting from closely packed amino acid residues in the folded triple helix.14 In contrast, triple helices of type III collagen are of the single-chain form [α1(III)]3, and the chains are stabilized, apart from noncovalent hydrogen bonds, also by disulfide bridges between groups located at the C-terminus of the α-chains.
Nomenclature of Cross-Linked Peptides. A cross-linked peptide is composed of single peptide components; in all presented figures, the covalent connection between the components is represented by the symbol “×”. The type of collagen is denoted by Roman numerals in brackets. The cyanogen bromide cleavage that produced the peptides is denoted by “CB”. Numbering of the peptides derived from α1(I),15 α1(III),15 and α1(V)7 follows the historical notation, whereas those derived from α2(I) and α2(V) are numbered in sequence order from the N- to the C-terminus of the chains (see Figure 2 for a schematic illustration). The subscript numerals indicate either the number of single peptide components involved in cross-linking or the number of saccharide components per peptide. The term (he+n·no-he) denotes a single peptide component consisting mainly of the triple-helical domain but including n nontriple-helical residues from either the N- or C-terminus of the α-chain. Thus, [α2(I)CB1(he+2no-he)]2 × α1(V)CB5(he+1no-he)·(Gal·Glc)7 is a cross-linked peptide containing three polypeptide chains, two derived from type I collagen and one from type V collagen. The type I component α2(I)CB1 is present as a dimer consisting of the triple-helical domain and two nontriple-helical residues. The type V component α1(V)CB5, which binds seven (Gal·Glc) disaccharide units, is composed of the triple-helical domain including one nontriple-helical residue.

Figure 1. Work flow diagram of the general strategy to analyze the fibrillar collagen network.

Figure 2. Cyanogen bromide peptides of the α1(I),5 α2(I), α1(III),6 α1(V),7 and α2(V)8 chains from embryonic calf skin and their glycosylation. The corresponding values of the α2(I)-chain are based on cDNA data from the databank (see text). The triple-helix-forming regions of the α-chains are represented. Vertical red lines indicate positions of methionine residues; numbers without brackets designate the peptides, and numbers in brackets give the number of residues per peptide. The vertical dark lines represent hydroxylysine glycosides: the long ones represent glucosyl·galactosyl derivatives and the short ones galactosyl derivatives. The number of hydroxylysine-linked glycosides per peptide is shown, not their location. The helical cross-linking domains and their corresponding CB peptides, as detected by mass spectrometry, are indicated by red lines.

Calculation of Molecular Weights. Molecular weight calculation of CB peptides is based on amino acid sequence and nucleotide sequence data of collagen α-chains taken from databanks. Modifications of single amino acid residues must be taken into consideration. In the strongly acidic 70% formic acid solution, N-formyl-methionine is readily formed, and the C-terminal methionine (N-formyl-methionine) may be further transformed to homoserine lactone (N-formyl-homoserine lactone). In some cases, this may occur in equilibrium with homoserine (N-formyl-homoserine). In CB “double peptides”, which contain a missed cleavage site, the internal methionine has to be calculated as a methionine residue and the C-terminal one as homoserine lactone. Furthermore, post-translational biochemical modifications such as hydroxylation of proline and lysine as well as glycosylation of hydroxylysine must be taken into account. Hydroxyproline occurs mainly as 4-hydroxyproline and in small amounts (