Article pubs.acs.org/jpr
Cite This: J. Proteome Res. XXXX, XXX, XXX-XXX
Characterization of Proteomes Extracted through Collagen-based Stable Isotope and Radiocarbon Dating Methods
Caroline Wadsworth and Michael Buckley*
Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, United Kingdom
ABSTRACT: Isotope analyses on “collagen” extracted from ancient bone have been routinely used for dietary and chronological inferences worldwide for decades. These methods involve the decalcification of biomineralized tissues with acid, often followed by processes to remove exogenous contaminants, and then gelatinization of what is often described as the “collagen” fraction. However, little is known about the relative content of collagen to the many other noncollagenous proteins (NCPs) potentially present. Some of these NCPs have great longevity in ancient bone, and some, for example, fetuin-A, are useful for obtaining better taxonomic information than collagen. This study uses Orbitrap Elite LC−MS/MS to characterize the proteomes of the acid-soluble and base-soluble fractions, which are usually discarded, and the gelatinized “collagen” fraction obtained from both stable isotope and radiocarbon methods applied to several ancient bovine bones. The results showed that all fractions tested contain numerous NCPs, but the base-soluble fraction for both methods contains the greatest number of NCPs with the highest relative abundances. This study confirms that not only do the waste fractions obtained from the “collagen” extraction procedure of stable isotope or radiocarbon dating methods yield a plentiful resource of NCPs that is currently being overlooked but that they also provide proteomes as complex as those obtained from standard proteomics methods.
KEYWORDS: radiocarbon dating, stable isotope analysis, collagen, noncollagenous proteins, ancient bone proteomics
■
INTRODUCTION
For decades, the analysis of isotopes in ancient organic remains has been routine in archeological science,1−3 carried out daily by many researchers across the planet. Skeletal tissues such as bone and teeth are most frequently used in isotope analyses because they are the most common organic materials recovered at archeological sites and are relatively resistant to environmental contamination due to low porosity,4 with various methodological procedures used to remove any such contamination.2 However, although it has been shown that hundreds of different proteins can be obtained from different proteomics-based extractions from bone,5 the protein composition of fractions specific to these commonly used bioarcheological methods remains unclear. This is particularly important given the potential effects on isotopic results that different proteins (or their ligands) could have: although the fraction of interest is considered to be “collagen”, this is likely to be true only in the more poorly preserved remains. Equally important is that these isotopic methods undoubtedly remove proteins earlier in the processing, yet given the importance of many of the samples being put through these methods, the molecular information contained could become much more widely utilized.6 We address these limits in our understanding of the protein components of these isotopic methods by using proteomics to characterize each of the fractions typically discarded through some of the most commonly used molecular protocols in bioarcheology, as well as the fraction often described as “collagen”.
Isotope Analysis in Ancient Tissues
Isotopes of carbon, nitrogen, oxygen, and strontium are some of the most often analyzed in archeological samples, with carbon7 and nitrogen8 isotopes being of greatest importance in biomolecular archeology because they are used in the reconstruction of ancient diets (both) as well as in radiocarbon dating (carbon).9 Isotopes become incorporated into biological tissues via the food chain, initially by plants via photosynthesis and then into animal tissues when the plants are ingested by other organisms.10 These isotopes are continuously incorporated into an organism’s tissues until its death, at which point the decaying unstable isotopes (e.g., 14C) are no longer replaced while the stable isotopes (e.g., 12C and 13C and 14N and 15N) remain in the same concentrations.10 The relative amounts of these isotopes can then be measured using mass spectrometry to obtain information about a sample.3,11
Stable Isotope Analysis. Because the ratios of stable isotopes remain constant after an organism’s death, the isotope ratios measured in archeological “collagen” are considered representative of the dietary isotopes consumed by the
Received: September 1, 2017
DOI: 10.1021/acs.jproteome.7b00624 J. Proteome Res. XXXX, XXX, XXX−XXX
Journal of Proteome Research
typically ranges from 70 (ref 11) to 95 °C (e.g., ref 2). Modifications to these techniques include a treatment step with sodium hydroxide (NaOH) before gelatinization to remove base-soluble contaminants such as humic acids,12,25−27 the addition of an acid wash step to remove dissolved carbon dioxide, and the use of ultrafiltration to preferentially isolate gelatinized collagen (along with other proteins) above a particular molecular weight.28 These pretreatment steps produce several extracts from each sample, although normally only the gelatinized “collagen” fraction (known as the “pH 3” fraction in this study) is used for stable isotope analysis or radiocarbon dating and the rest are typically discarded. As well as the “collagen” fraction itself, this study characterizes the fractions usually discarded in isotope analyses of ancient bone (acid-soluble, base-soluble, and acid wash) using proteomic techniques to evaluate their potential as a largely overlooked resource of alternative biomolecular information.6
organism in life.7,12 In the natural environment, lighter isotopes are predominant (e.g., 12C, 14N), and their heavier counterparts (e.g., 13C, 15N) are present in much lower amounts (by orders of magnitude). To become incorporated into an organism’s tissues, dietary isotopes must undergo various biochemical reactions (e.g., photosynthesis) that preferentially metabolize the lighter isotopes, causing the isotope ratios of a biological sample to be enriched in heavier isotopes when compared with an environmental standard; this process is known as kinetic fractionation.10 In bioarcheology, the most important application of stable isotope analysis is in dietary reconstruction, which uses the differences in 13C/12C and 15N/14N isotope ratios in ancient bone and/or tooth tissue to determine the composition of ancient diets, for example, the contribution of animal products versus plant products (including distinguishing C3 from C4 plants such as maize13 or the inclusion of marine resources).11
Radiocarbon Dating. Radiocarbon dating is a technique used to date organic remains from archeological sites by measuring their content of 14C, a radioisotope present at trace levels in the atmosphere.14 It is created in the upper atmosphere by cosmic ray spallation reactions, eventually combining with atmospheric oxygen to create radioactive carbon dioxide.15 This radioactive carbon dioxide is then incorporated into plant tissues by photosynthesis and enters the food chain when plants are consumed by various animal species. In the same way as stable isotopes, radiocarbon stops being incorporated into organic tissues when the organism dies; however, the 14C, which is no longer being replaced, decays with a half-life considered to be 5730 ± 40 years;16 therefore, the amount of radiocarbon detected in a sample by accelerator mass spectrometry (AMS) can be used to calculate when the organism died (i.e., the older the sample, the less radiocarbon will be detected).
Bone is generally used for radiocarbon dating when available17 because archeologists often want to date a skeleton directly rather than indirectly through other associated materials (e.g., charcoal).
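The relationship between the surviving 14C and the time since death can be illustrated with a minimal sketch. It assumes simple exponential decay with the 5730 year half-life quoted above and ignores calibration and laboratory conventions; the function names are illustrative and do not come from the study.

```python
import math

HALF_LIFE_YEARS = 5730.0  # half-life of 14C quoted in the text


def fraction_remaining(age_years: float, half_life: float = HALF_LIFE_YEARS) -> float:
    """Fraction of the original 14C still present after age_years."""
    return 0.5 ** (age_years / half_life)


def radiocarbon_age(fraction: float, half_life: float = HALF_LIFE_YEARS) -> float:
    """Uncalibrated years since death for a given surviving 14C fraction."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the interval (0, 1]")
    # N/N0 = 0.5 ** (t / half_life)  =>  t = half_life * ln(N0/N) / ln(2)
    return half_life * math.log(1.0 / fraction) / math.log(2.0)


if __name__ == "__main__":
    for frac in (1.0, 0.5, 0.25):
        print(f"{frac:.2f} of 14C remaining -> about {radiocarbon_age(frac):.0f} years")
```

After one half-life half of the 14C remains, and after two half-lives a quarter remains, which is why older samples yield progressively less measurable radiocarbon.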
Liquid Chromatography−Tandem Mass Spectrometry
In recent years, the study of ancient proteins has benefited from the development of highly sensitive and selective proteomic techniques such as liquid chromatography−tandem mass spectrometry (LC−MS/MS), a technique particularly suited to the study of ancient proteins because it uses soft ionization approaches that cause minimal damage to peptides during analysis and allows peptide detection at extremely low (femtomolar or even lower) levels. Recent studies have shown that peptides from NCPs can be detected in ancient bone using this technique,29,30 and that such peptides are detectable in bones up to almost 1 million years in age31 and have enough sequence variation to be useful in phylogenetic investigations.6
Label-Free Quantitation
Previous work on ancient bone proteomes has focused primarily on qualitative data, that is, the presence or absence of peptides from a particular protein in a sample when analyzed by LC−MS/MS.29,31 This approach can be useful for poorly preserved samples in which limited protein survival is expected but is less useful when analyzing well-preserved samples that may have more consistent and similar proteomes. We consider that a quantitative analysis would be more useful in such comparisons because it would allow an assessment of protein abundance between samples, which is important for improving our understanding of the heterogeneity of different sample types, particularly low-yield samples. For example, it has been debated whether NCPs become relatively more abundant and even outlast collagen, which would impact upon isotopic measurements.32,33 There are several ways of obtaining quantitative data in proteomics,34 but this study utilizes a label-free quantitation method for relative abundances because it does not require any additional enzymatic or chemical alterations to the extracted peptides. There are two main methods of label-free quantitation: spectral counting and the measurement of MS1 ion intensity or peak area.34 Spectral counting is the simplest form of obtaining quantitative data and involves counting the number of MS2 (MS/MS) spectra that match to a particular peptide; protein abundances can be calculated from the data by summing the spectral counts for all peptides identified as deriving from a specific protein. However, spectral counting becomes inaccurate when considering peptides at low abundances, making this method potentially unsuitable for the analysis of proteins from ancient samples that are likely to have
“Collagen” Extraction Methods
Bone tissue is made up of both organic and inorganic materials; the organic component accounts for ∼30% of the tissue (by dry weight), ∼90% of which is collagen (predominantly type I collagen), with the remainder made up of noncollagenous proteins (NCPs), polysaccharides, and other molecules of the extracellular matrix.18 Calcium phosphate crystals, also known as bioapatite, make up the inorganic (mineral) component and account for the other ∼70% of bone.19 Bioarcheological isotope analyses of ancient bone usually focus on the acid-insoluble organic “collagen” fraction, whereas the initial acid-soluble fraction, as well as other subsequent fractions, are often removed and discarded. However, here we emphasize the degree to which highly informative proteins are present in these fractions, which are potentially useful as a screening technique to identify samples that have good biomolecular preservation20 or for species identification.21−24 Although there are many different protocols for the extraction of gelatinized “collagen” from archeological bone for isotopic analysis,12 each method generally involves a step using an acid to remove carbonates, phosphates, and fulvic acids (and therefore dissolving the bone mineral matrix; this step is known as demineralization), followed by collagen gelatinization of the resulting pellet by acid hydrolysis in a much weaker acid solution at an elevated temperature, which
the supernatant was removed and retained as the base-soluble (NaOH) fraction, and the pellet was rinsed with 3 × 1 mL aliquots of distilled water to neutralize. The remaining pellet was incubated at 95 °C for 5 h in 500 μL of 1 mM HCl (ref 2) and spun a final time in a microfuge using the same conditions as above, removing and retaining the supernatant as the pH 3 fraction (gelatinized collagen). The three fractions obtained (the acid-soluble, base-soluble, and pH 3 fractions) were then buffer-exchanged into 50 mM AMBIC using a 10 kDa MWCO poly(ether sulfone) spin column (Vivaspin, U.K.) with 3 × 500 μL 50 mM AMBIC washes, and the retentates were removed in a final volume of 100 μL of 50 mM AMBIC. Fractions extracted using the stable isotope analysis sample processing protocol were labeled “SI” (Table 2).
Radiocarbon Dating Protein Extraction Method. Following established methods for radiocarbon dating,35 ∼50 mg of bone powder from each sample was weighed into 1.5 mL microtubes, shaken, and incubated for 2 h at room temperature in 1 mL of 0.6 M HCl. After incubation, the samples were spun in a microfuge for 2 min at 14 000g, the supernatant was removed and retained as the first acid-soluble fraction, and a further 1 mL of 0.6 M HCl was added to the pellet before it was shaken and incubated overnight at room temperature. The next day, the samples were spun as before and the supernatant was retained as the second acid-soluble fraction; then, the pellet was rinsed to neutrality with 3 × 1 mL aliquots of distilled water before being incubated at room temperature for 30 min in 500 μL of 0.1 M NaOH. Subsequently, the samples were spun in a microfuge as previously described, the supernatant was removed and retained as the base-soluble NaOH fraction, and the pellet was rinsed to neutrality before being incubated with 500 μL of 0.6 M HCl at room temperature for 2 h.
Next, the samples were centrifuged, the supernatant removed and retained as the acid wash fraction, and the pellet rinsed with 3 × 1 mL aliquots of distilled water. The remaining pellet was then incubated with 500 μL of 1 mM HCl at 70 °C for 20 h, then centrifuged as before and the supernatant likewise removed and retained as the pH 3 fraction. This process resulted in four fractions for each sample: the acid-soluble fraction (first and second soluble fractions pooled) and the base-soluble (NaOH), acid wash, and pH 3 fractions. These four fractions were then buffer-exchanged into 50 mM AMBIC using a 10 kDa MWCO spin column (Vivaspin, U.K.) with 3 × 500 μL of 50 mM AMBIC washes, and the retentate removed in a final volume of 100 μL of 50 mM AMBIC. Fractions extracted using the radiocarbon dating sample-processing protocol were labeled “RC” (Table 2).
30 kDa MWCO Retentate and Filtrate Proteins. In the standard protocols for both the stable isotope and radiocarbon dating protein extraction methods the gelatinized collagen fraction (pH 3 fraction) is concentrated using 30 kDa molecular weight cutoff (MWCO) ultrafilters, where molecules that pass through the column are typically discarded. It is
lower peptide abundances than those from modern tissues due to degradation. Measurement of MS1 peak areas can also be used to obtain quantitative data in which masses and elution times from LC−MS analysis are normalized and aligned with data from other runs in the same experiment to create an accurate profile of identified features in the analysis (e.g., m/z values, etc.). These features are then matched to peptides by comparing MS2 data to protein sequences stored on a reference database; from this, relative abundance values for each identified protein are calculated by summing the area under the chromatographic elution profile for selected peptides belonging to that protein. This latter approach is utilized in this study because it is able to identify and compare low abundance proteins more accurately than spectral counting.
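The difference between the two label-free approaches described above can be sketched as follows. This is a minimal illustration on made-up peptide-spectrum matches, not the workflow used in this study; the protein names, peptide strings, and peak areas are hypothetical, and retention-time alignment and normalization are omitted.

```python
from collections import defaultdict

# Hypothetical peptide-spectrum matches: (protein, peptide, MS1 peak area).
# A peptide appears once per matching MS2 spectrum.
psms = [
    ("collagen alpha-1(I)", "PEPTIDEA", 4.0e6),
    ("collagen alpha-1(I)", "PEPTIDEA", 4.0e6),
    ("collagen alpha-1(I)", "PEPTIDEB", 1.5e6),
    ("fetuin-A", "PEPTIDEC", 2.0e4),
]


def spectral_counts(psms):
    """Spectral counting: number of matched MS2 spectra per protein."""
    counts = defaultdict(int)
    for protein, _peptide, _area in psms:
        counts[protein] += 1
    return dict(counts)


def summed_peak_areas(psms):
    """MS1 approach: sum the peak areas of the distinct peptides of each protein."""
    peptide_area = {}
    for protein, peptide, area in psms:
        # Each distinct peptide contributes its area once,
        # however many MS2 spectra matched it.
        peptide_area[(protein, peptide)] = area
    totals = defaultdict(float)
    for (protein, _peptide), area in peptide_area.items():
        totals[protein] += area
    return dict(totals)


if __name__ == "__main__":
    print(spectral_counts(psms))
    print(summed_peak_areas(psms))
```

In this toy data set the low-abundance protein has a spectral count of one, which gives little resolution, whereas its peak area still provides a continuous value that can be compared between runs.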
■
MATERIALS AND METHODS
Four ancient bovine bone samples (UI3, NF1, AuCPC, and KC2) ranging from 4000 years to 130 000 years in age, most of which have previously been shown to contain both collagens and NCPs,31 were selected for this study (Table 1). Extracts from all four ancient samples were used in the comparison of the stable isotope (SI) and radiocarbon dating (RC) methods, and two of these (AuCPC and NF1) were used to evaluate what passes through the 30 kDa molecular weight cutoff (MWCO) ultrafilters in terms of protein extraction (i.e., the total number and relative abundances of NCPs extracted using each type of filter). Modern cow bone powder was used as a positive control for all comparisons, and blank samples of 50 mM ammonium bicarbonate (AMBIC) were used as negative controls for the ultrafiltration tests.

Table 1. List of the Ancient Samples Used in This Study, Their Site Locations, and Approximate Ages
sample   age      location
UI3      4 Ka     Durrington Walls, Wiltshire
NF1      6 Ka     North Ferriby, East Yorkshire
AuCPC    6 Ka     Carsington Pasture Cave, Derbyshire
KC2      130 Ka   Kirkdale Cave, North Yorkshire

Protein Extraction Methods
Stable Isotope Extraction Method. Approximately 50 mg of modern and ancient bovine bone (Table 2), powdered using a Dremel drill, was weighed into a 1.5 mL microtube, shaken, and incubated for 18 h at 4 °C in 1 mL of 0.6 M hydrochloric acid (HCl). Following incubation, samples were spun in a benchtop microfuge at 14 000g for 2 min, and the supernatant was removed and retained as the acid-soluble fraction. The remaining pellet was rinsed with 3 × 1 mL aliquots of distilled water, then incubated in 500 μL of 0.1 M NaOH at room temperature for 20 h (following an average of previously published protocols2,25). The samples were then spun a second time in the microfuge using the same conditions,
Table 2. Conditions of the Fractions Characterized in This Study Showing the Three Fractions from Stable Isotope (SI) Methods and Four from Radiocarbon (RC) Methods
method step                      stable isotope (SI)   radiocarbon (RC)
decalcification (0.6 M HCl)      18 h @ 4 °C           2 + 18 h @ RT
removal of humics (0.1 M NaOH)   20 h @ RT             0.5 h @ RT
acid wash/rinse (0.6 M HCl)      (none)                2 h @ RT
gelatinization (0.001 M HCl)     5 h @ 95 °C           20 h @ 70 °C
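The two protocols in Table 2 can also be written down as structured data, which makes it easy to compare the number of retained fractions and the total incubation time. The sketch below is only an illustrative encoding of the table; the class and variable names are hypothetical, and room temperature is represented as None because no numerical value is given in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    name: str
    reagent: str
    hours: float
    temperature_c: Optional[float]  # None stands for room temperature (RT)


# Values transcribed from Table 2.
SI_PROTOCOL = [
    Step("decalcification", "0.6 M HCl", 18, 4),
    Step("removal of humics", "0.1 M NaOH", 20, None),
    Step("gelatinization", "0.001 M HCl", 5, 95),
]

RC_PROTOCOL = [
    Step("decalcification", "0.6 M HCl", 2 + 18, None),  # two acid steps, pooled
    Step("removal of humics", "0.1 M NaOH", 0.5, None),
    Step("acid wash/rinse", "0.6 M HCl", 2, None),
    Step("gelatinization", "0.001 M HCl", 20, 70),
]


def total_hours(protocol):
    """Total incubation time across all steps, in hours."""
    return sum(step.hours for step in protocol)


if __name__ == "__main__":
    for label, protocol in (("SI", SI_PROTOCOL), ("RC", RC_PROTOCOL)):
        print(f"{label}: {len(protocol)} fractions, {total_hours(protocol)} h of incubation")
```

Each step yields one retained supernatant, so the list lengths correspond to the three SI and four RC fractions characterized in this study.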
Figure 1. Comparison of (A) total protein complexities obtained using both the SI and RC extraction methods as well as those by individual fraction for (B) radiocarbon and (C) stable isotope methods.
(activation: CID, 2+ default charge state, 2 m/z isolation width, 35 eV normalized collision energy, 0.25 Activation Q, 10.0 ms activation time).
Data Handling. Proteome Complexity. In this study, one of the parameters used to assess the degree of preservation of a sample was its proteome complexity, here defined as a measure of the number of proteins extracted from a sample. It was calculated by manually examining peptide matches from Mascot searches for the relevant taxon (e.g., Bos or Bison) that have an ion score above a set cutoff point. To determine proteome complexity, peptide masses obtained via LC−MS/MS analysis were searched against the Swiss-Prot database for matches to primary protein sequences using the Mascot search engine. Each search included the fixed carbamidomethyl modification of cysteine (+57.02 Da), a modification resulting from the alkylation step described in the methods above, and the variable modifications for deamidation (asparagine and glutamine, +0.98 Da), pyroglutamate formation (from N-terminal glutamine residues, −17.02 Da), and oxidation of lysine, proline, and methionine residues (all +15.99 Da) to account for post-translational modifications and diagenetic alterations. Enzyme specificity was limited to trypsin with up to two missed cleavages allowed, mass tolerances were set at 0.5 ppm for the precursor ions and 0.5 Da for the fragment ions, and all spectra were considered as having either 2+ or 3+ precursors (a semitrypsin search with one missed cleavage allowed was also carried out). A protein was classified as present in a sample if it had at least two unique peptide matches above an ion cutoff score of 40. An ancient bone proteome summary was then created by collating a list of all proteins that were identified at least once in any of the ancient samples (according to the above criteria).
Label-Free Quantitation.
Comparison of the relative abundances of proteins between samples and fractions for both the radiocarbon dating and stable isotope methods was used to assess which method was most effective in terms of the amount of protein extracted. Relative abundances were calculated using a label-free quantitation method carried out using Progenesis QI software. This works by aligning raw retention time data from LC−MS/MS runs and normalizing all obtained retention time and m/z values. These normalized peptide data are then identified using database matching software (e.g., Mascot) as described earlier. The relative abundances of different proteins are then calculated by comparing the area under the curve (AUC) of the chromatographic elution profile for each peptide assigned to a specific protein (in the default settings only the three most abundant unique peptides are selected for use in quantitation and then averaged). Two separate analyses were run on the Progenesis QI software (Supplementary Tables S1−S2). One analysis compared the relative abundances of proteins in the pH 3
possible that NCPs extracted from ancient samples that could be used for proteome analysis are being removed in this ultrafiltration step because, due to degradation over time, they are potentially small enough to pass through the membrane in the 30 kDa MWCO ultrafilter. To determine whether the flow-through from 30 kDa MWCO columns contains any proteins or fragments of proteins that might be useful for proteomic analysis, the stable isotope and radiocarbon dating protein extraction methods described above were repeated on a subset of the samples: modern cow, NF1, and AuCPC; the only difference in the methods was that all fractions obtained other than the final pH 3 fraction were discarded. The pH 3 fraction was then filtered through a 30 kDa MWCO ultrafilter and centrifuged at 14 000g for 20 min; the flow-through was then filtered through a 10 kDa MWCO spin column in the same manner. The retentates from both the 10 and 30 kDa MWCO ultrafilters were buffer-exchanged separately into 50 mM AMBIC as described above and processed as normal.
Reduction, Alkylation, Digestion, and Purification. All samples were reduced by incubating them in the presence of 5 mM dithiothreitol (final concentration; DTT) at 60 °C for 10 min, then alkylated by incubation in the presence of 15 mM iodoacetamide (final concentration; IAN) for 45 min in the dark at room temperature. The reaction was quenched by the addition of 5 mM DTT (final concentration) and incubation at room temperature for 10 min. Trypsin digestions for each sample were then carried out by adding 0.1 μg of trypsin and incubating for 18 h at 37 °C. After digestion, samples were acidified to 0.1% trifluoroacetic acid (TFA) and then purified using C18 ZipTips (OMIX, U.K.) following the manufacturer’s protocol, eluting with 50% acetonitrile (ACN)/0.1% TFA.36 Samples were then completely lyophilized in a centrifugal evaporator and resuspended in 20 μL of 0.1% formic acid (FA) in preparation for LC−MS/MS analysis.
Mass Spectrometry.
LC−MS/MS analyses were carried out using an UltiMate 3000 Rapid Separation LC (RSLC, Dionex Corporation, Sunnyvale, CA) coupled to an Orbitrap Elite (Thermo Fisher Scientific, Waltham, MA) mass spectrometer (120k resolution, full scan, positive mode, normal mass range 350−1500). Peptides in the sample were separated on a 250 mm × 75 μm i.d., 1.7 μm ethylene bridged hybrid (BEH) C18 analytical column (Waters, U.K.) using a gradient from 92% A (0.1% FA in water) and 8% B (0.1% FA in ACN) to 33% B in 44 min at a flow rate of 300 nL min−1. Peptides were then automatically selected for fragmentation by data-dependent analysis; six MS/MS scans (Velos ion trap, product ion scans, rapid scan rate, Centroid data; scan event: 500 count minimum signal threshold, top 6) were acquired per cycle, dynamic exclusion was employed, and one repeat scan (two MS/MS scans total) was acquired in a 30 s repeat duration with that precursor being excluded for the subsequent 30 s
Figure 2. Proteome networks in each fraction of sample NF1 from both the RC (A) and SI (B) methods and (C) compared with that obtained from a standard proteome extraction method previously published.31
fractions grouping both the SI and RC methods filtered using 30 kDa MWCO membranes as well as the filtrates using 10 kDa membranes for each of three samples (modern, AuCPC and NF1; resulting in six groups). The other compared protein relative abundances between the different fractions obtained in both methods, grouping the fractions separately.
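The identification criterion and the quantitation rule described in the Data Handling section can be sketched together as follows. This is a minimal illustration under stated assumptions, not the Mascot or Progenesis QI implementation: run alignment and normalization are omitted, "above" the cutoff is read as strictly greater than 40, and the protein labels, peptide strings, scores, and areas are made up.

```python
from collections import defaultdict

ION_SCORE_CUTOFF = 40     # ion score cutoff stated in the text
MIN_UNIQUE_PEPTIDES = 2   # minimum unique peptides for a protein to count as present
TOP_N = 3                 # number of most abundant peptides averaged per protein

# Hypothetical peptide matches: (protein, peptide, ion score, peak AUC).
matches = [
    ("COL1A1", "AAA", 65, 9.0e6),
    ("COL1A1", "BBB", 52, 6.0e6),
    ("COL1A1", "CCC", 48, 3.0e6),
    ("COL1A1", "DDD", 45, 1.0e6),
    ("FETUA", "EEE", 55, 4.0e5),
    ("FETUA", "FFF", 41, 2.0e5),
    ("LUM", "GGG", 70, 8.0e5),   # only one peptide: not classified as present
    ("PEDF", "HHH", 30, 5.0e5),  # both peptides score below the cutoff
    ("PEDF", "III", 35, 4.0e5),
]


def confident_peptides(matches):
    """Map each protein to its unique peptides scoring above the cutoff."""
    peptides = defaultdict(dict)
    for protein, peptide, score, auc in matches:
        if score > ION_SCORE_CUTOFF:
            best = peptides[protein].get(peptide, 0.0)
            peptides[protein][peptide] = max(best, auc)
    return peptides


def present_proteins(matches):
    """Proteins with at least MIN_UNIQUE_PEPTIDES confident unique peptides."""
    return {
        protein: peps
        for protein, peps in confident_peptides(matches).items()
        if len(peps) >= MIN_UNIQUE_PEPTIDES
    }


def top_n_abundance(peptide_aucs, n=TOP_N):
    """Average AUC of the n most abundant peptides of one protein."""
    top = sorted(peptide_aucs.values(), reverse=True)[:n]
    return sum(top) / len(top)


present = present_proteins(matches)
complexity = len(present)  # proteome complexity of this hypothetical sample
abundances = {p: top_n_abundance(peps) for p, peps in present.items()}

if __name__ == "__main__":
    print("proteome complexity:", complexity)
    print(abundances)
```

Averaging only the most abundant peptides keeps the abundance estimate from being pulled down by the many weakly detected peptides of a large protein such as collagen.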
■
richest in all of the samples except KC2 and NF1, in which the RC pH 3 and SI NaOH fractions were most complex (Supplementary Table S1). For almost every sample the RC method extracted a greater total number of proteins than the SI method (with the exception of sample NF1). In general, the NaOH fraction in both the RC (Figure 1B) and SI (Figure 1C) methods yielded the greatest protein complexity, followed by the acid wash fraction (where included) and then the pH 3 fractions. Collagen alpha 1 (I) and alpha 2 (I) peptides (hereafter collagen α1(I) and α2(I), respectively) were recovered in all fractions from all samples (Figure 2). NCPs such as fetuin-A, lumican, pigment epithelium derived factor (PEDF), and prothrombin were regularly observed, most often in the SI and RC NaOH fractions, but were also found in the pH 3, acid wash, and acid-soluble fractions for some of the better-preserved ancient samples as well as in the modern samples. With regard to the solutions passing through the 30 kDa ultrafilters, the RC method pH 3 solutions contained more proteins than those of the SI method (Table 3), and in most cases the fraction retained by the 30 kDa filter contained more proteins than the fraction that passed through; however, there was a notable exception with the NF1 sample (RC method), which had 41 proteins (39 NCPs) pass through compared with only 18 proteins (13 NCPs) retained. Perhaps most interestingly, this flow-through sample also contained more proteins than any other fraction from this sample or any fraction from any other ancient sample in this ultrafiltration experiment (only the modern sample contained more proteins).
RESULTS
Proteome Complexity
There were 99 individual proteins (93 NCPs) identified in total across all SI and RC tests of ancient bones (Supplementary Tables S1−S2) with an additional 25 proteins (all NCPs) only identified in modern samples. Of the ancient samples, UI3 appeared the best preserved, with 77 proteins (71 NCPs) identified in the RC extractions and 29 (24 NCPs) in the SI extraction. The most poorly preserved of the ancient samples was KC2, with 24 (21 NCPs) and five proteins (two NCPs) for the RC and SI methods, respectively (Figure 1A); typically more proteins were recovered using the RC method (four fractions) than the SI method (three fractions). What is somewhat unexpected is that both of these approaches yielded a wider range of proteins than was observed for more standard proteome extraction methods using guanidine hydrochloride31 for several of the same specimens (e.g., for NF1 this produced a range of proteins typical of bone albeit with the notable lack of fetuin-A observed previously, a serum protein that was frequently observed in most ancient specimens previously analyzed).31 When examined as individual fractions the RC method generally produced fractions with greater protein complexity than the SI method (Figure 1). The RC NaOH fractions were
each step after the initial demineralization. This is particularly valuable information given that the less damaging RC approach still manages to repeatedly yield a greater range of proteins (Table 3). Overall, however, it is clear that the methodological extraction steps are proving more efficient at preferentially removing the NCPs, including serum proteins, than the target collagens, with a reduced complexity that focuses more on collagen in the final pH 3 gelatinization step (Figure 2).
Table 3. Number of Proteins (NCP-only numbers shown in parentheses) Obtained from Each Sample Using Either the RC or SI Protein Extraction Method, for the Fraction Retained on the 30 kDa MWCO Spin Column (Retentate) and the Fraction That Passed through the 30 kDa and Was Retained on the 10 kDa MWCO Spin Column (Filtrate) (Supplementary Table S2)
sample      no. of retentate proteins (NCPs)   no. of filtrate proteins (NCPs)
NF1 RC      18 (13)                            41 (39)
NF1 SI      11 (6)                             9 (4)
AuCPC RC    21 (16)                            9 (7)
AuCPC SI    12 (8)                             5 (2)
modern RC   68 (62)                            28 (24)
modern SI   27 (22)                            8 (3)

Relative Abundances of Proteins
Using proteome complexities to determine the presence of proteins in different samples and fractions is relatively simple but this approach does not provide much information beyond a final count of the total number of proteins extracted from each. With ancient samples that are generally poorly preserved, this total number of proteins extracted and the presence or absence of certain proteins can indeed be useful for assessing the biomolecular preservation state of a sample or making comments on the longevity of different types of proteins.31 However, when analyzing relatively well preserved “young” samples (i.e., 30 kDa pH 3 fractions of NF1, AuCPC, and modern bone for both radiocarbon and stable isotope methods compared with levels of protein that passed through (>10 kDa).
relative abundances in the RC method. In NF1, all five of these NCPs had higher total relative abundances in the SI method whereas in UI3 all had higher total relative abundances in the fractions obtained from the RC method. However, in the modern sample, all had higher total relative abundances in the RC method than the SI method. NCPs such as fetuin-A, lumican, and PEDF were most commonly found in their highest abundances in the NaOH fraction and at their lowest in the acid-soluble or pH 3 fractions (e.g., Figure 5). There were occasional exceptions; for example, both lumican and PEDF were found at their most abundant in the acid wash fractions for samples KC2 and UI3 (lumican only). Fetuin-A, which was generally more abundant in the RC methods, was abundantly present specifically in the NaOH fraction but also present at low levels in the acid-soluble, pH 3, and acid wash fractions for the ancient samples (Figure 5). Similarly, chondroadherin had greater abundance in the RC fractions for the NF1 and AuCPC samples but in the SI fractions for the modern sample.
Ultrafilter Size and Protein Recovery. As expected, the number of different proteins identified in the filtrate was typically lower than that in the retentate, with the only exception being the NF1 filtrate via the RC method (41 proteins compared with only 18; Table 3 and Supplementary Table S2). However, most of the NCPs of greatest value to species identification and phylogenetic inference (e.g., fetuin-A and albumin6) were still recovered in the retentate regardless of the size of membrane used. In terms of relative abundance, the SI filtrate fractions in the modern and NF1 samples had the greatest relative abundance for both collagen chains, whereas with sample AuCPC this was the RC retentate fraction (Figure 6A).
Given the aims of this study and the ubiquity of collagen in bone samples, it is important to consider the NCPs. For these, the RC retentate appeared to have a greater relative abundance than the SI retentate for the NF1 and AuCPC samples (although the opposite was observed in the modern sample), with a much greater difference in abundance between retentate and filtrate for chondroadherin (Figure 6B) than for collagen (Figure 6A). Lumican also appeared more abundant in the RC fractions, although in the ancient samples all NCPs appeared more abundant in the retentate than the filtrate (Figure 6B). Conversely, the relative abundances of PEDF were higher in the SI fractions for the modern and NF1 samples but lower in the RC fractions for AuCPC; similarly, abundances of prothrombin were higher in the fractions obtained from the SI method than the RC method for NF1 and AuCPC samples (although not for the modern sample; Figure 6B).
■
DISCUSSION
Proteome Differences between Stable Isotope and Radiocarbon Dating Methods
Two main aspects of the proteomes were investigated in this study: proteome complexity and the relative abundances of collagens and selected NCPs. For the former, the RC extraction method appears to extract more proteins in total than the SI method. This could be because the RC method utilizes two initial demineralization steps, which were pooled to create the acid-soluble fraction, as well as a further acid wash step; the extra demineralization step could be removing lower molecular weight collagen from the bone extracts at an early stage (supported by the decreasing relative amount of semitryptic peptides), allowing the less abundant NCPs to be more easily solubilized and detected via mass spectrometry in subsequent fractions (e.g., the base-soluble [NaOH] fraction). Collagen α1(I) and α2(I) sequences were identified in every fraction from every sample in both methods; they were found at their greatest abundances in the acid-soluble and pH 3 fractions, and, in general, both collagen chains had greater relative abundances in the SI fractions than in those from the RC method. The latter likely relates to the initial demineralization steps of the RC method being carried out at room temperature rather than at 4 °C as in the SI method, as these acid-soluble fractions are thought to consist of the more damaged and low-molecular-weight collagen molecules and the collagen N- and C-terminal telopeptides that have increased solubility when compared with the intact collagen molecule.39 Conversely, the collagen in the pH 3 fractions is likely to be hydrolyzed sections of insoluble collagen caused by the gelatinization process itself, which was noticeably less aggressive
in the RC approach than the SI approach. However, one of the clearest outcomes of this study was the heterogeneity of the proteomes of different archeological specimens, for which the state of collagen (and NCP) preservation would clearly affect which fractions are most relevant. This appears to be true even of different species of modern bone, where our results for modern bovine bone conflict with those published elsewhere for modern chicken40 in that we observe some of our lowest ranges of proteins in the initial demineralization step.
It is unexpected that, of the two collagen chains analyzed, the α2(I) chain was generally more abundant than the α1(I) chain in every sample (e.g., Figure 6; Supplementary Table S4), given that there are two α1(I) chains and one α2(I) chain per molecule. However, it is possible that this is due to the way in which Progenesis QI software selects proteins for use in quantitation. Only nonconflicting peptides, that is, those that are not also part of another protein match, are used in the quantitation calculations; the highly conserved collagen α1(I) sequences are potentially only being quantified using a small number of peptides that contain minute variations from the other collagen types also present in bone.
It is unclear which of the two extraction methods gives the highest relative abundances for NCPs overall, as this appears to vary between samples and does not appear to depend on any obvious factors such as sample age. However, it was noticeable that NCPs were most often identified at their highest abundances in the base-soluble NaOH fraction for both extraction methods, whereas the collagens are generally found at their lowest relative abundances in these fractions. This suggests that NaOH incubation may be relatively more efficient for solubilizing NCPs, indicating that it could be a potential alternative to GuHCl extractions for low abundance proteins in ancient bone.
Limitations of Progenesis QI and Relative Abundance Determination
Although Progenesis QI is an excellent tool in label-free quantitation methods, it has some disadvantages when applied to ancient samples. The initial problem is that these methods still rely on matching experimental peptide masses identified in the analysis to calculated mass values derived from curated sequence databases. Because the peptide sequences obtained from archeological bones come from ancient organisms, it is possible that these sequences do not exactly match the modern peptide sequences in the databases, or they may be from a species that is not included in the database; this problem is exacerbated by the fact that ancient peptides are often degraded and therefore have different masses from those expected. In this study this problem is overcome by using bones from Bos species, which have well-characterized protein sequences; however, it must be taken into consideration if applying similar methods to ancient bone extracts from other species (i.e., cross-species proteomics). An additional problem lies in the calculation of relative abundances carried out by Progenesis QI. The use of nonconflicting peptides ensures that only peptides unique to a particular protein are used for quantitation, theoretically giving a more accurate result despite sacrificing some peptide data. However, when applied to peptides from ancient samples this approach becomes problematic for two reasons: (1) the experimental peptides may be degraded and are therefore not recognized as belonging to a specific protein and (2) there is often a reduced number of peptides present to assign to each protein. Both of these factors mean that there may be some instances in which peptides have been assigned to a protein, but they are not being accounted for in quantitation because they are shared with another protein that has a similar sequence.
Given that the highly repetitive collagen α1(I) has higher homology to other collagenous proteins than collagen α2(I) does, this could be affecting such approaches to relative quantitation, particularly through false positive matches to the other similar sequences present.
Ultrafilter Size and Proteome Recovery
Interestingly, extracts analyzed from above the 30 kDa MWCO filters contained additional proteins to those that passed through but were retained above the 10 kDa MWCO filters, although there were some exceptions. This indicates that some proteins do not pass through the 30 kDa MWCO filter either due to size or potentially through association with the much larger collagen fibrils. Although understanding this is not essential in isotope analyses, as these studies are reliant on isotope ratios measured in this gelatinized collagen, further processing of these extracts to remove excess collagen may be necessary if they are to be used in proteomic analysis. Also, as expected, relative abundances for both collagens and NCPs were higher in the retentate fractions concentrated with only the 30 kDa MWCO filters than in those that passed through (but were retained by the 10 kDa filters). However, it is noteworthy that although the 30 kDa MWCO extracts generally had higher relative abundances of NCPs such as chondroadherin and lumican, the filtrate extracts had greater relative abundances of serum and ECM proteins (e.g., PEDF and prothrombin) in the SI method but not the RC method. Chondroadherin and lumican are members of the small leucine-rich proteoglycan (SLRP) family, which are known to interact with collagen (e.g., Figure 2) and are thought to play a role in collagen fibril organization in bone.41,42 This could explain their survival (via affiliation with collagen) and higher abundance in the 30 kDa fraction, which likely contains high concentrations of relatively intact collagen molecules. However, this observation is likely related to the differences in the method, with the SI method being apparently more aggressive. It is also possible that the proteins present interact differently with the ultrafilter membrane due to various other properties such as polarity, although this effect should be minimal.
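As a rough illustration of the nonconflicting peptide rule discussed under Limitations of Progenesis QI, the following Python sketch shows how restricting quantitation to peptides that match a single protein could, in principle, make collagen α2(I) appear more abundant than α1(I) even when the underlying signal follows the expected 2:1 chain ratio. It is not the Progenesis QI algorithm: the peptide table, the abundance values, the protein labels, and the function names are all hypothetical and chosen only to make the effect visible.

```python
from collections import defaultdict

# Hypothetical peptide table: (abundance, set of proteins the peptide matches).
# Six COL1A1 peptides and three COL1A2 peptides mimic a 2:1 chain ratio;
# four of the COL1A1 peptides are also shared with other collagen types.
PEPTIDES = [
    (100.0, {"COL1A1", "COL2A1"}),
    (100.0, {"COL1A1", "COL2A1"}),
    (100.0, {"COL1A1", "COL3A1"}),
    (100.0, {"COL1A1", "COL3A1"}),
    (100.0, {"COL1A1"}),
    (100.0, {"COL1A1"}),
    (100.0, {"COL1A2"}),
    (100.0, {"COL1A2"}),
    (100.0, {"COL1A2"}),
]


def quantify_nonconflicting(peptides):
    """Sum abundances per protein using only peptides that match exactly one protein."""
    totals = defaultdict(float)
    for abundance, proteins in peptides:
        if len(proteins) == 1:
            (protein,) = proteins  # unpack the single matching protein
            totals[protein] += abundance
    return dict(totals)


def quantify_all(peptides):
    """Sum abundances per protein using every matching peptide, shared or not."""
    totals = defaultdict(float)
    for abundance, proteins in peptides:
        for protein in proteins:
            totals[protein] += abundance
    return dict(totals)


if __name__ == "__main__":
    print("nonconflicting only:", quantify_nonconflicting(PEPTIDES))
    print("all peptides:", quantify_all(PEPTIDES))
```

With this toy table, the nonconflicting totals rank COL1A2 above COL1A1, whereas counting all peptides restores the 2:1 ratio but also credits the shared signal to the other collagen types. Real data would additionally be affected by peptide degradation and database mismatches, which the sketch does not model.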
■
CONCLUSIONS
Currently there are relatively few proteomic studies that utilize ancient bone, as the processing of ancient samples for analysis requires specific techniques and protocols to prevent contamination and to reduce further protein degradation through sample processing. In archeology, radiocarbon dating and stable isotope analyses are routinely practiced by thousands of academics across the world; however, these methods inherently discard biomolecular material that could provide molecular information on past organisms, particularly those which may be too old for sequenceable DNA to survive. This study has shown that several of the extracts specifically created by these methods, which are typically discarded, contain NCPs that can be successfully identified by proteomic techniques. In particular, the base-soluble NaOH fraction appears to be the best fraction to retain for subsequent proteomic analysis because it generally has the highest number of identifiable NCPs among the fractions analyzed here, providing a plentiful resource of NCP-rich extracts for use in either species identification or recovering phylogenies.6 It is particularly important that similar complexities of proteomes can be directly retrieved from some of the most commonly used methodological extractions in archeological science when compared with standard proteomic methods.
■
ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jproteome.7b00624.
Supplementary Table captions. (PDF)
Table S1. Standard protein summary. (XLSX)
Table S2. Ultrafilter protein summary. (XLSX)
Table S3. Standard peptide abundances. (XLSX)
Table S4. Standard protein abundances. (XLSX)
■
AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected].
ORCID
Michael Buckley: 0000-0002-4166-8213
Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS
We acknowledge the NERC for studentship funding to C.W. (NE/J500057/1) as well as the Royal Society for fellowship funding to M.B. (UF120473). We also thank Andrew Chamberlain and Mike Parker-Pearson and the Yorkshire Museum for access to samples, Andrew Chamberlain and Andrew Millard for reading an early draft of this manuscript, as well as the Biomolecular Analysis Core facility at the University of Manchester, specifically Julian Selley, for bioinformatics support. C.W. carried out laboratory analyses, whereas both C.W. and M.B. carried out data analysis and wrote the manuscript.
■
REFERENCES
(1) Chisholm, B. S.; Nelson, D. E.; Hobson, K. A.; Schwarcz, H. P.; Knyf, M. Carbon isotope measurement techniques for bone collagen: notes for the archaeologist. Journal of Archaeological Science 1983, 10 (4), 355−60.
(2) Ambrose, S. H. Preparation and characterization of bone and tooth collagen for isotopic analysis. Journal of Archaeological Science 1990, 17 (4), 431−51.
(3) Richards, M. P.; Hedges, R. E. Stable isotope evidence for similarities in the types of marine foods used by Late Mesolithic humans at sites along the Atlantic coast of Europe. Journal of Archaeological Science 1999, 26 (6), 717−22.
(4) Hedges, R. E. Bone diagenesis: an overview of processes. Archaeometry 2002, 44 (3), 319−28.
(5) Jiang, X.; Ye, M.; Jiang, X.; Liu, G.; Feng, S.; Cui, L.; Zou, H. Method development of efficient protein extraction in bone tissue for proteome analysis. J. Proteome Res. 2007, 6 (6), 2287−94.
(6) Buckley, M.; Wadsworth, C. Proteome degradation in ancient bone: Diagenesis and phylogenetic potential. Palaeogeogr., Palaeoclimatol., Palaeoecol. 2014, 416, 69−79.
(7) Lee-Thorp, J. A.; Sealy, J. C.; Van der Merwe, N. J. Stable carbon isotope ratio differences between bone collagen and bone apatite, and their relationship to diet. Journal of Archaeological Science 1989, 16 (6), 585−99.
(8) Hedges, R. E.; Reynard, L. M. Nitrogen isotopes and the trophic level of humans in archaeology. Journal of Archaeological Science 2007, 34 (8), 1240−51.
(9) Longin, R. New method of collagen extraction for radiocarbon dating. Nature 1971, 230, 241−2.
(10) Tieszen, L. L.; Boutton, T. W.; Tesdahl, K.; Slade, N. A. Fractionation and turnover of stable carbon isotopes in animal tissues: implications for δ13C analysis of diet. Oecologia 1983, 57 (1−2), 32−7.
(11) Milner, N.; Craig, O. E.; Bailey, G. N.; Pedersen, K.; Andersen, S. H. Something fishy in the Neolithic? A re-evaluation of stable isotope analysis of Mesolithic and Neolithic coastal populations. Antiquity 2004, 78 (299), 9−22.
(12) DeNiro, M. J.; Epstein, S. Influence of diet on the distribution of nitrogen isotopes in animals. Geochim. Cosmochim. Acta 1981, 45, 341−51.
(13) Vogel, J. C.; van der Merwe, N. J. Isotopic Evidence for Early Maize Cultivation in New York State. Am. Antiq. 1977, 42 (2), 238−42.
(14) Levin, I.; Bösinger, R.; Bonani, G.; Francey, R. J.; Kromer, B.; Münnich, K.; Suter, M.; Trivett, N. B.; Wölfli, W. Radiocarbon in atmospheric carbon dioxide and methane: global distribution and trends. In Radiocarbon after Four Decades; Springer, 1992; pp 503−18.
(15) Craig, H. The natural distribution of radiocarbon and the exchange time of carbon dioxide between atmosphere and sea. Tellus 1957, 9 (1), 1−17.
(16) Godwin, H. Half-life of radiocarbon. Nature 1962, 195 (4845), 984.
(17) Wood, R. From revolution to convention: the past, present and future of radiocarbon dating. Journal of Archaeological Science 2015, 56, 61−72.
(18) Currey, J. D. Bones: Structure and Mechanics; Princeton University Press: Princeton, NJ, 2002; xii, p 436.
(19) Weiner, S.; Traub, W. Bone structure: from angstroms to microns. FASEB J. 1992, 6 (3), 879−85.
(20) Harvey, V. L.; Egerton, V. M.; Chamberlain, A. T.; Manning, P. L.; Buckley, M. Collagen Fingerprinting: A New Screening Technique for Radiocarbon Dating Ancient Bone. PLoS One 2016, 11 (3), e0150650.
(21) Van der Sluis, L.; Hollund, H.; Buckley, M.; De Louw, P.; Rijsdijk, K.; Kars, H. Combining histology, stable isotope analysis and ZooMS collagen fingerprinting to investigate the taphonomic history and dietary behaviour of extinct giant tortoises from the Mare aux Songes deposit on Mauritius. Palaeogeogr., Palaeoclimatol., Palaeoecol. 2014, 416, 80−91.
(22) Buckley, M.; Gu, M.; Shameer, S.; Patel, S.; Chamberlain, A. T. High-throughput collagen fingerprinting of intact microfaunal remains; a low-cost method for distinguishing between murine rodent bones. Rapid Commun. Mass Spectrom. 2016, 30 (7), 805−12.
(23) Buckley, M.; Kansa, S. W. Collagen fingerprinting of archaeological bone and teeth remains from Domuztepe, South Eastern Turkey. Archaeological and Anthropological Sciences 2011, 3 (3), 271−80.
(24) Buckley, M.; Fraser, S.; Herman, J.; Melton, N.; Mulville, J.; Pálsdóttir, A. Species identification of archaeological marine mammals using collagen fingerprinting. Journal of Archaeological Science 2014, 41, 631−41.
(25) Sealy, J.; Johnson, M.; Richards, M.; Nehlich, O. Comparison of two methods of extracting bone collagen for stable carbon and nitrogen isotope analysis: comparing whole bone demineralization with gelatinization and ultrafiltration. Journal of Archaeological Science 2014, 47, 64−9.
(26) Katzenberg, M. A. Stable isotope analysis of archaeological faunal remains from southern Ontario. Journal of Archaeological Science 1989, 16 (3), 319−29.
(27) Schoeninger, M. J.; DeNiro, M. J. Nitrogen and carbon isotopic composition of bone collagen from marine and terrestrial animals. Geochim. Cosmochim. Acta 1984, 48 (4), 625−39.
(28) Brown, T. A.; Nelson, D. E.; Vogel, J. S.; Southon, J. R. Improved Collagen Extraction by Modified Longin Method. Radiocarbon 1988, 30, 171−7.
(29) Cappellini, E.; Jensen, L. J.; Szklarczyk, D.; Ginolhac, A.; da Fonseca, R. A.; Stafford, T. W., Jr; Holen, S. R.; Collins, M. J.; Orlando, L.; Willerslev, E.; et al. Proteomic analysis of a pleistocene mammoth femur reveals more than one hundred ancient bone proteins. J. Proteome Res. 2012, 11 (2), 917−26.
(30) Buckley, M.; Larkin, N.; Collins, M. Mammoth and Mastodon collagen sequences; survival and utility. Geochim. Cosmochim. Acta 2011, 75 (7), 2007−16.
(31) Wadsworth, C.; Buckley, M. Proteome degradation in fossils: investigating the longevity of protein survival in ancient bone. Rapid Commun. Mass Spectrom. 2014, 28 (6), 605−15.
(32) Nielsen-Marsh, C. M.; Richards, M. P.; Hauschka, P. V.; Thomas-Oates, J. E.; Trinkaus, E.; Pettitt, P. B.; Karavanić, I.; Poinar, H.; Collins, M. J. Osteocalcin protein sequences of Neanderthals and modern primates. Proc. Natl. Acad. Sci. U. S. A. 2005, 102 (12), 4409−13.
(33) Masters, P. M. Preferential preservation of noncollagenous protein during bone diagenesis: Implications for chronometric and stable isotopic measurements. Geochim. Cosmochim. Acta 1987, 51 (12), 3209−14.
(34) Nikolov, M.; Schmidt, C.; Urlaub, H. Quantitative mass spectrometry-based proteomics: an overview. Methods Mol. Biol. 2012, 893, 85−100.
(35) Storm, P.; Wood, R.; Stringer, C.; Bartsiokas, A.; de Vos, J.; Aubert, M.; Kinsley, L.; Grün, R. U-series and radiocarbon analyses of human and faunal remains from Wajak, Indonesia. Journal of Human Evolution 2013, 64 (5), 356−65.
(36) Buckley, M.; Fariña, R. A.; Lawless, C.; Tambusso, P. S.; Varela, L.; Carlini, A. A.; Powell, J. E.; Martinez, J. G. Collagen sequence analysis of the extinct giant ground sloths Lestodon and Megatherium. PLoS One 2015, 10 (11), e0139611.
(37) Procopio, N.; Buckley, M. Minimizing Laboratory-Induced Decay in Bone Proteomics. J. Proteome Res. 2017, 16 (2), 447−58.
(38) Smith, B. D.; McKenney, K. H.; Lustberg, T. J. Characterization of collagen precursors found in rat skin and rat bone. Biochemistry 1977, 16 (13), 2980−5.
(39) Buckley, M.; Collins, M.; Thomas-Oates, J. A method of isolating the collagen (I) α2 chain carboxytelopeptide for species identification in bone fragments. Anal. Biochem. 2008, 374 (2), 325−34.
(40) Schroeter, E. R.; DeHart, C. J.; Schweitzer, M. H.; Thomas, P. M.; Kelleher, N. L. Bone protein “extractomics”: comparing the efficiency of bone protein extractions of Gallus gallus in tandem mass spectrometry, with an eye towards paleoproteomics. PeerJ 2016, 4, e2603.
(41) Neame, P. J.; Kay, C. J. Small leucine-rich proteoglycans. Proteoglycans-structure, biology and molecular interactions 2000, 201−35.
(42) Young, M.; Bi, Y.; Ameye, L.; Xu, T.; Wadhwa, S.; Heegaard, A.; Kilts, T.; Chen, X. Small leucine-rich proteoglycans in the aging skeleton. J. Musculoskeletal Neuronal Interact. 2006, 6 (4), 364.