

Article

Exploring biological and geological age-related changes through variations in intra- and inter-tooth proteomes of ancient dentine Noemi Procopio, Andrew T. Chamberlain, and Michael Buckley J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.7b00648 • Publication Date (Web): 22 Jan 2018 Downloaded from http://pubs.acs.org on January 23, 2018


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

Exploring biological and geological age-related changes through variations in intra- and inter-tooth proteomes of ancient dentine

Noemi Procopio1, Andrew T. Chamberlain2, Michael Buckley1*

1 Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK

2 School of Earth and Environmental Sciences, The University of Manchester, Stopford Building, 99 Oxford Road, Manchester, M13 9PG, UK

*Correspondence: [email protected]; +44(0)161 306 5175

ACS Paragon Plus Environment


Abstract

Proteomic analyses are becoming more widely used in archaeology, not only because proteins are better preserved than DNA in ancient specimens but also because they can offer different information, particularly relating to compositional preservation and potentially a means to estimate biological and geological age. However, it remains unclear to what extent different burial environments affect these aspects of proteome decay. Teeth have to date been much less studied than bone, but they are ideal for exploring how proteins decay with time because of the negligible turnover that occurs in dentine relative to bone. Here we investigated the proteome variability and deamidation levels of different sections of molar teeth from archaeological bovine mandibles, as well as of their mandibular bone. We obtained a greater yield of proteins from the crown of the teeth, but did not find differences between the different molars analysed within each mandible. We also obtained the greatest variety of proteins from a well-preserved mandible that was not archaeologically the youngest one, showing the influence of preservation conditions on the final proteomic outcome. Intriguingly, we also noticed an increase in the abundance of fetuin-A in biologically younger mandibles, as reported previously, but we observed the opposite trend when analysing tooth dentine. Interestingly, we observed higher glutamine deamidation levels in teeth from the geologically oldest mandible, despite it being the biologically youngest specimen, showing that archaeological age affects the level of deamidation more strongly than biological ageing does, and indicating that the glutamine deamidation ratio of selected peptides may act as a good predictor of the relative geochronological age of archaeological specimens.

Keywords Ancient dentine proteome, ancient bone proteome, biological ageing, geological ageing, shotgun proteomics, deamidations, fetuin-A.


Introduction

Proteomic analyses on ancient mineralised specimens are becoming increasingly utilized in archaeology and palaeontology, fuelled by their potential not only to inform upon the fossilisation process, but also to access the phylogenetic information that can be obtained from proteins1–5, beyond the limits of ancient DNA (aDNA). Moreover, because of the ontogenetic information that they appear to contain6,7, they may be applied to assess the biological age of skeletal remains, perhaps those of unknown victims in crime scenes, mass graves or mass disasters8–10. Previous studies have tested whether the degradation of bone collagen could predict the postmortem interval (PMI) of skeletonized remains11, or whether the age-at-death (AAD) could be determined from skeletons by combining different parameters such as collagen thermal degradation12, but these questions remain poorly resolved. Among the mineralised tissues, however, teeth have long been considered one of the best sources of information for forensic applications owing to the high stability of collagen in these specimens and to their limited in vivo turnover13. Teeth are composed essentially of two main elements: the intra-oral part, called the crown, and the lower part located anatomically inside the jaw, called the root14. The crown is coated by the enamel layer, a hard and highly mineralised substance that protects the tooth, whereas the root is covered with an avascular bone-like tissue called cementum14. Under these exterior layers is the dentine, a calcified tissue in which the organic component is composed of ~90% collagen (mostly type I collagen) and ~10% Non-Collagenous Proteins (NCPs).
The NCPs include phosphorylated proteins such as SIBLINGs (Small Integrin-Binding Ligand N-linked Glycoproteins), SLRPs (Small Leucine-Rich Proteoglycans), amelogenin and proteolipids, and some non-phosphorylated proteins like osteocalcin, MGP (Matrix-Gla Protein), osteonectin, proteins from the blood serum (albumin and fetuin), SLRPs, growth factors, enzymes, polyamines and calcium-binding proteins15. Of particular interest is that collagen in dentine has been shown to be metabolically stable during life, with a turnover so low that it is assumed to be negligible13,16,17; for this reason, dentine is the tissue of choice in forensics when the age of remains must be estimated using techniques such as radiocarbon dating and aspartic acid (Asp) racemization analysis8,10,18–22. Aspartic acid racemization (AAR) is a non-enzymatic covalent modification that takes place in different tissues in an age-dependent way and converts L-Asp into D-Asp through the formation of a short-lived L-succinimide ring23, although this phenomenon can also occur under acidic conditions via direct hydrolysis without such formation24. While AAR analysis on
dentine appears to be capable of estimating the age of a body in forensic settings, there are some problems when applying the method to archaeological materials and to exposed remains because of the limited knowledge about the diagenetic processes involving the decay of the proteins and the effects that different burial conditions can have on racemization rates25. Since racemization rates are normally evaluated on collagen (commonly on the ‘acid-insoluble’ fraction of the protein extracts), they can be strongly influenced by the protein composition of the analysed fraction20,22, as well as by the biological ageing that can lead to collagen deterioration. This can in turn result in a subsequent increase in solubilised collagen under acidic conditions25. Furthermore, the treatment of the samples during laboratory procedures (e.g., temperature, pH of the demineralising agent, pulverization of the samples prior to their extraction, etc.) can affect racemization levels; for example, intentional acidic hydrolysis can induce racemization25. Similarly, the non-enzymatic deamidation of asparaginyl and glutaminyl residues to produce aspartyl and glutamyl residues is thought to be a good indicator of the level of protein degradation26; this post-translational modification (PTM) is thought to be directly related to the biological ageing process, being considered a “molecular clock” for protein turnover, organismic development and ageing27–29. Whereas asparaginyl deamidation has been more associated with biological ageing in vivo due to its fast deamidation rate in living tissues30, the slower glutaminyl deamidation31 has been more frequently used for decay studies and for the dating of historical specimens31,32. The levels of glutamine deamidation in collagen have been used to estimate the damage of archaeological bone extracts34, and this technique has also been used to estimate the level of degradation in mural paintings35.
These studies focused on the identification of a few peptides that could be used as deterioration signatures and not on the whole set of extracted proteins, to control for factors that can affect the deamidation rate like the primary sequence36, the secondary structure37 and the three-dimensional structure38. However, other aspects of the molecular environment (proteome) will most likely also affect the deamidation rate. To better understand how proteins degrade in the absence of the complications involved in bone turnover, and also to test the proteomic variability within teeth (from the top of the crown to the apex of the root) and within individuals (looking for differences between teeth that form at different times in the animal's development), we compared different proteomes obtained from three sections (top/crown, middle and bottom/root) of three molar teeth (M1, M2, M3) each from three mandibles of different biological ages. The acid-soluble and acid-insoluble protein fractions were kept separate in order also to investigate the proteomic variability between the two. To
evaluate the effects that different environments may have on the protein recovery, the three bovine mandibles used in the experiment were obtained from two different archaeological sites (Carsington Pasture Cave and Rains Cave) both located on the same geological substrate of Carboniferous Limestone in Derbyshire, U.K. To compare the levels of protein survival and decay between teeth and bones from the same individual, we also extracted bone proteomes from the mandibles used for the study, with the final aim to evaluate the effects that bone remodelling can have on the proteomic composition and on PTMs.

Experimental procedures

Materials

Three archaeological bovine mandibles of different biological ages were used in this study (Table 1 and Supporting Fig. S-1). One mandible (RC-1139) was collected from an archaeological deposit in Rains Cave (latitude 53.095N, longitude 1.664W) and was radiocarbon dated to 4290 ± 32 yr BP (~5000 years old), whereas the other two specimens (CPC-2355 and CPC-3020) were from Carsington Pasture Cave (latitude 53.080N, longitude 1.641W) and were radiocarbon dated to 3670 ± 32 yr BP and 4638 ± 32 yr BP, respectively (~4000 yr old and ~5500 yr old). The ages at death of the three animals represented by the mandibles were estimated using standard zooarchaeological methods that take account of the state of development, the extent of occlusal wear and the location of the cemento-enamel junction on the molar teeth39,40. Specimen CPC-3020 had M2 in wear and M3 erupting but not yet in wear (Jones/Sadler wear stage Dcd), with an estimated age (mean ± s.d.) of 19 ± 2 months. Specimen CPC-2355 had M2 and M3 both in wear at stage g (Jones/Sadler wear stage Ggh), with an estimated age of 47 ± 4 months. Specimen RC-1139 had M2 and M3 in wear at stages k and j respectively (Jones/Sadler wear stage Gk+), with an estimated age of 73 ± 6 months (Supporting Fig. S-2).

Specimen    SUERC code               Cave                       Biological age    Geological age
RC-1139     SUERC-76663 (GU46404)    Rains Cave                 73 ± 6 months     ~5000 yr old
CPC-2355    SUERC-76664 (GU46405)    Carsington Pasture Cave    47 ± 4 months     ~4000 yr old
CPC-3020    SUERC-76665 (GU46406)    Carsington Pasture Cave    19 ± 2 months     ~5500 yr old

Table 1. Archaeological specimens used in the study. Listed are the specimen names, the SUERC codes assigned by the Scottish Universities Environmental Research Centre AMS Facility for radiocarbon dating, the cave from which each specimen was collected, the estimated biological age at death and the geological age.

From each specimen, we hand drilled in triplicate part of the area close to the angle of the mandible, in order to obtain ~25 mg of bone powder from each of the three drilling locations (in total, we drilled nine bone powder samples from three mandibles). In addition, we drilled ~400 mg of bone powder from approximately the same positions with a diamond-tipped Dremel™, to obtain three samples for radiocarbon (14C) dating. In addition to the bone samples, three molars (M1, M2 and M3) were extracted from each of the three mandibles using a Proxxon hand-held drill with a cutting disk, and cut in half in the sagittal plane with a Buehler water-cooled circular saw (25.40 cm diameter by 1 mm thickness). After the first cut, the teeth were further cut transversally into five slices with a thickness of ~0.5 cm each, in order to obtain, from the crown to the root, a slice from the top part of the tooth, a second one below it, a slice at the centre of the tooth, a fourth one below that and a fifth slice containing the root of the tooth. We used only the top/crown slice (called “T”), the middle slice (called “M”) and the root slice (called “R”) for this experiment, and excluded the two slices in between. In total, we therefore extracted proteins from 27 tooth slices (Fig. 1).

Figure 1. Images of the three sections of the three molars taken from each mandible (M1 – M2 – M3) showing for the first mandible, the three slices used in the study (crown/top “T”, middle “M” and root “R”). The orientation of the second molar from mandible 3020 (fourth column) is reversed.

Formic acid (FA) was purchased from Fluka (UK); Tris, dithiothreitol (DTT), iodoacetamide (IAM), trifluoroacetic acid (TFA), guanidine hydrochloride (GuHCl) and ammonium acetate (AMAC) were purchased from Sigma-Aldrich (UK); hydrochloric acid (HCl) and Sartorius 0.45 µm syringe filters were purchased from Fisher Scientific (UK); 10 kDa Molecular Weight Cut Off (MWCO) ultrafiltration units were purchased from Vivaspin (UK); sequencing grade trypsin was purchased from Promega (UK); and OMIX C18 reversed-phase Zip-Tips were purchased from Agilent Technologies (UK).

Methods

We carried out all of our experiments with at least three biological replicates, comprising three sections of each of three molar teeth from three archaeological bovine mandibles of different biological ages. This resulted in a dataset of 27 proteomes from tooth dentine, as well as a further 9 from the mandibular bone samples. As we analysed both the acid-soluble and acid-insoluble fractions from each tooth subsample, this resulted in 63 proteomes in total, which collectively included 120,915 ions. We focused only on the acid-insoluble fraction from the bone subsamples in order to maximise the variability of the recovered proteome, as shown previously by Procopio and Buckley23. The first section of the manuscript focuses on results obtained from dentine analysis, whereas the last part focuses on mandible proteins and on their comparison with the tooth results.
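The sample arithmetic above (27 tooth subsamples analysed as two fractions each, plus nine bone samples, giving 63 proteomes) can be checked with a short enumeration. This is only an illustrative sketch: the specimen, molar and slice labels come from the text, while the tuple layout and the `drill-N` labels for the bone replicates are hypothetical names introduced here.

```python
from itertools import product

MANDIBLES = ("RC-1139", "CPC-2355", "CPC-3020")
MOLARS = ("M1", "M2", "M3")
SECTIONS = ("T", "M", "R")  # crown/top, middle, root slices
TOOTH_FRACTIONS = ("acid-soluble", "acid-insoluble")
BONE_REPLICATES = (1, 2, 3)  # three drilling locations per mandible


def build_run_list():
    """Enumerate every LC-MS/MS run implied by the study design."""
    runs = []
    # Dentine: every slice of every molar is analysed as two fractions.
    for mandible, molar, section, fraction in product(
        MANDIBLES, MOLARS, SECTIONS, TOOTH_FRACTIONS
    ):
        runs.append(("dentine", mandible, molar, section, fraction))
    # Bone: only the acid-insoluble fraction is analysed.
    # The "drill-N" labels are hypothetical, used only for illustration.
    for mandible, replicate in product(MANDIBLES, BONE_REPLICATES):
        runs.append(("bone", mandible, None, f"drill-{replicate}", "acid-insoluble"))
    return runs


runs = build_run_list()
tooth_runs = [r for r in runs if r[0] == "dentine"]
bone_runs = [r for r in runs if r[0] == "bone"]
print(len(tooth_runs), len(bone_runs), len(runs))  # 54 9 63
```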

Collagen extraction for 14C dating

Bone powder was demineralised with 0.5 M HCl at 4 °C overnight. The acid was poured off and the samples were rinsed three times with ultrapure water and then gelatinised in pH 3 ultrapure water at 70 °C for 48 hours. After this step, the samples were filtered using 0.45 µm filters to remove the undissolved bone, frozen at -80 °C overnight and finally freeze-dried until drying was complete. The lyophilized collagen was then sent to the Radiocarbon Laboratory at the Scottish Universities Environmental Research Centre (SUERC) for 14C dating.

Protein extraction

Tooth fragments were washed with ultrapure water before starting the protein extraction procedure. To limit laboratory-induced deamidation, we followed the procedure from Procopio and Buckley23 for both tooth and bone samples. 2 mL of 10% FA was added to each piece of tooth, whereas 1 mL of 10% FA was added to ~25 mg of bone powder; the samples were then left to demineralise at 4 °C for six hours. After this step, the acid-soluble fraction was removed and frozen, while the teeth or the pellets were further treated with 1.7 mL (for teeth) or 500 µL (for bone pellets) of 6 M GuHCl buffer containing 100 mM Tris at pH 7.4 at 4 °C for 18 hours (acid-insoluble fraction). The fractions were then ultrafiltered separately with 10 kDa MWCO filters (both fractions for teeth; for mandibular bone samples, only the insoluble fraction was ultrafiltered and further processed). The buffer was then exchanged with 200 µL of 50 mM AMAC and the samples were reduced with 5 mM DTT, alkylated with 15 mM IAM and digested with 1 µg of trypsin at 37 °C for 5 hours. After digestion, the samples were desalted, purified and concentrated using C18 reversed-phase Zip-Tips (OMIX, UK) following the manufacturer’s protocol, and eluted in 100 µL of 50% acetonitrile (ACN) and 0.1% TFA. Samples were dried in a fume cupboard for 1 day at room temperature and then resuspended in 20 μL of 0.1% TFA/5% ACN for subsequent LC-MS/MS analysis.

LC-Orbitrap mass spectrometry

Resuspended samples were analysed by LC-MS/MS using an UltiMate 3000 Rapid Separation LC (RSLC, Dionex Corporation, Sunnyvale, CA, USA) coupled to an Orbitrap Elite (Thermo Fisher Scientific, Waltham, MA, USA) mass spectrometer (120 k resolution, full scan, positive mode, normal mass range 350−1500). Peptides were separated on an Ethylene Bridged Hybrid (BEH) C18 analytical column (250 mm × 75 μm i.d., 1.7 μm; Waters) using a gradient from 92% A (0.1% FA in water) and 8% B (0.1% FA in ACN) to 33% B in 44 min at a flow rate of 300 nL min−1. Peptides were then automatically selected for fragmentation by data-dependent analysis; six MS/MS scans (Velos ion trap, product ion scans, rapid scan rate, centroid data; scan event: 500 count minimum signal threshold, top six) were acquired per cycle, dynamic exclusion was employed, and one repeat scan (i.e., two MS/MS scans total) was acquired in a 30 s repeat duration with that precursor being excluded for the subsequent 30 s (activation: collision-induced dissociation (CID), 2+ default charge state, 2 m/z isolation width, 35 eV normalized collision energy, 0.25 Activation Q, 10.0 ms activation time).
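The data-dependent selection described above (top six precursors per cycle, a 500-count minimum signal, two MS/MS scans allowed within a 30 s repeat duration, then 30 s of exclusion) can be illustrated with a simplified sketch. It is not the instrument's actual control logic: matching precursors by m/z rounded to two decimals, the `ExclusionTracker` class and the reading of the repeat settings are all assumptions made here for illustration.

```python
from dataclasses import dataclass, field

MIN_SIGNAL = 500           # minimum signal threshold (counts)
TOP_N = 6                  # MS/MS scans per cycle
MAX_SELECTIONS = 2         # first scan plus one repeat scan
REPEAT_DURATION_S = 30.0   # window in which the repeat must occur
EXCLUSION_S = 30.0         # exclusion time once the limit is reached
MZ_DECIMALS = 2            # assumption: precursors are matched by rounded m/z


@dataclass
class ExclusionTracker:
    """Hypothetical bookkeeping for dynamic exclusion."""
    first_seen: dict = field(default_factory=dict)
    counts: dict = field(default_factory=dict)
    excluded_until: dict = field(default_factory=dict)

    def is_excluded(self, key, now):
        return now < self.excluded_until.get(key, float("-inf"))

    def record(self, key, now):
        start = self.first_seen.get(key)
        # Start a fresh repeat window if none is open or the old one expired.
        if start is None or now - start > REPEAT_DURATION_S:
            self.first_seen[key] = now
            self.counts[key] = 0
        self.counts[key] += 1
        if self.counts[key] >= MAX_SELECTIONS:
            self.excluded_until[key] = now + EXCLUSION_S
            del self.first_seen[key]
            del self.counts[key]


def select_precursors(survey_peaks, now, tracker):
    """Pick up to TOP_N eligible precursors from (m/z, intensity) pairs."""
    eligible = [
        (mz, intensity)
        for mz, intensity in survey_peaks
        if intensity >= MIN_SIGNAL
        and not tracker.is_excluded(round(mz, MZ_DECIMALS), now)
    ]
    eligible.sort(key=lambda peak: peak[1], reverse=True)
    chosen = eligible[:TOP_N]
    for mz, _ in chosen:
        tracker.record(round(mz, MZ_DECIMALS), now)
    return [mz for mz, _ in chosen]


# Invented survey scan: seven peaks above threshold, one below.
peaks = [(400.0, 1000), (401.0, 900), (402.0, 800), (403.0, 700),
         (404.0, 600), (405.0, 550), (406.0, 520), (407.0, 100)]
tracker = ExclusionTracker()
cycle_0 = select_precursors(peaks, 0.0, tracker)  # six most intense peaks
cycle_1 = select_precursors(peaks, 1.0, tracker)  # same six, now excluded
cycle_2 = select_precursors(peaks, 2.0, tracker)  # only 406.0 remains eligible
print(cycle_0, cycle_1, cycle_2)
```

In this toy run the seventh peak is only fragmented once the six more intense precursors have been excluded, which is the behaviour dynamic exclusion is meant to produce for lower-abundance peptides.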

Data analysis

In total, we performed 54 LC-MS/MS analyses (27 soluble fractions and 27 insoluble fractions) from tooth samples, and nine further LC-MS/MS analyses from the bone powder extracted from the mandibles. Results obtained via LC-MS/MS were searched as .mgf files (created assuming 0 #C13s with peak list generation using ExtractMSN) against the database Swiss-Prot 2016_04 (550,960 sequences; 196,692,942 residues) using the Mascot Daemon search engine (version 2.5.1; Matrix Science, London, UK). Each search included the fixed carbamidomethyl modification of cysteine (+57.02 Da) and the variable modifications for deamidation (+0.98 Da) of asparagine/glutamine and oxidation (+15.99 Da) of lysine, proline, and methionine residues, to account for PTMs and diagenetic alterations; the oxidation of lysine and proline is equivalent to hydroxylation. Enzyme specificity was set to trypsin with up to two missed cleavages allowed; mass tolerances were set at 5 ppm for the precursor ions and 0.5 Da for the fragment ions, with all spectra considered as having either 2+ or 3+ precursors. To compare samples in terms of number of proteins and number of deamidated peptides, Scaffold software version 4.4.1 (Proteome Software Inc., Portland, OR) was used. Peptide identifications were accepted if they could be established at greater than 90% probability by the
Peptide Prophet algorithm41 with Scaffold delta-mass correction. Protein identifications were accepted if they could be established at greater than 90% probability and contained at least two identified unique peptides; protein probabilities were assigned by the Protein Prophet algorithm42. Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony (the display option chosen for this work was “total spectrum count” to allow a semi-quantitative measurement of the proteins present in the sample). Subsequently, the results were filtered to retain only bovine proteins, in order to exclude exogenous proteins; in addition, a filter for deamidation was used to extract data for the analysis of this particular PTM. Progenesis QI for proteomics version 2.0 (Nonlinear Dynamics, Newcastle, UK) was also used to perform quantitative analysis, which attempts to give the relative quantitation of each sample, enabling within-sample protein comparison; it also allows for comparisons between different replicates or different experimental conditions. To do so, Progenesis quantifies the ionic abundance of unique peptides and then normalizes the data to allow comparisons between the runs. Peptide identifications were done against the database Swiss-Prot 2016_04 (550,960 sequences; 196,692,942 residues) using Mascot version 2.5, using a .mgf file created by Progenesis. All the search parameters were the same as those described above; search results were then imported back to Progenesis as .xml files.
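The acceptance criteria stated above (peptide probability above 90%, protein probability above 90%, at least two unique peptides, bovine proteins only) amount to a simple filter. The sketch below is a hypothetical re-expression of those thresholds in Python, not Scaffold's implementation; the `ProteinHit` record, the accession labels and the peptide labels are invented for illustration.

```python
from dataclasses import dataclass

PEPTIDE_MIN_PROB = 0.90   # peptide probability must exceed this value
PROTEIN_MIN_PROB = 0.90   # protein probability must exceed this value
MIN_UNIQUE_PEPTIDES = 2
TARGET_SPECIES = "Bos taurus"


@dataclass
class ProteinHit:
    """Hypothetical record for one protein identification."""
    accession: str
    species: str
    probability: float
    peptide_probabilities: dict  # unique peptide label -> probability


def accepted_peptides(hit):
    """Unique peptides whose probability exceeds the peptide threshold."""
    return [
        peptide
        for peptide, prob in hit.peptide_probabilities.items()
        if prob > PEPTIDE_MIN_PROB
    ]


def passes_filters(hit):
    """Apply the species, protein-probability and unique-peptide criteria."""
    return (
        hit.species == TARGET_SPECIES
        and hit.probability > PROTEIN_MIN_PROB
        and len(accepted_peptides(hit)) >= MIN_UNIQUE_PEPTIDES
    )


# Invented example hits; none of these values come from the study.
hits = [
    ProteinHit("PROT_A", "Bos taurus", 0.99, {"pep1": 0.97, "pep2": 0.95}),
    ProteinHit("PROT_B", "Bos taurus", 0.99, {"pep1": 0.97, "pep2": 0.60}),
    ProteinHit("PROT_C", "Homo sapiens", 0.99, {"pep1": 0.99, "pep2": 0.99}),
    ProteinHit("PROT_D", "Bos taurus", 0.85, {"pep1": 0.99, "pep2": 0.99}),
]
kept = [hit.accession for hit in hits if passes_filters(hit)]
print(kept)  # ['PROT_A']
```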
Principal Component Analysis (PCA) was performed on normalized abundances from Progenesis, both at the peptide and protein level, using R Software with the package FactoMineR; plots were then produced with the Python programming language (Python Software Foundation, https://www.python.org/) to visually separate the samples according to the variations in abundance of proteins in each sample. We used a score cut-off of 41 on protein exports in order to have identity or extensive homology according to the Mascot evaluation of the peptide score distribution. To exclude unreliable proteins, we also removed proteins for which the number of unique peptides was equal to or less than one. STRING software (version 10.5) was used to create protein-protein interaction networks, with the thickness of the network edges representing the strength of the interaction. Data were clustered using the k-means algorithm; the “k” value chosen ranged from 2 to 4 depending on the number of proteins, reflecting ubiquitous, bone and plasma proteins (the latter often forming two clusters). The minimum required interaction score was set to “medium confidence” (0.400), and all of the active interaction sources were selected
(text mining, experiments, databases, co-expression, neighbourhood, gene fusion and co-occurrence). The statistical tests chosen for each analysis varied with the statistical question to be addressed, and they were always calculated by the software used for the proteomic analyses (e.g. Scaffold or Progenesis). The only statistical test not performed directly by this software was the ANOVA test, which was used to assess the variation among and between groups of samples when this was required. Analyses of the biochemical properties of the proteins were done using the protein statistics tool EMBOSS Pepstats (https://www.ebi.ac.uk/Tools/seqstats/emboss_pepstats/).
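For readers unfamiliar with the ANOVA test mentioned above, the following minimal sketch computes the one-way ANOVA F statistic, i.e. the ratio of between-group to within-group variance. The text does not say which tool or ANOVA variant was used, so this is a generic textbook formulation; the function name and the example values are invented and are not data from the study. Converting F to a P-value requires the F distribution and is not shown.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(group) for group in groups)
    grand_mean = fmean(x for group in groups for x in group)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (fmean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of each observation around its own group mean.
    ss_within = sum(
        (x - fmean(group)) ** 2 for group in groups for x in group
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented values for three groups of three samples each.
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(example_groups)
print(round(f_stat, 6), df_between, df_within)  # 13.0 2 6
```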

Results

Soluble and insoluble protein fractions from dentine

Proteins were extracted from teeth following a previously published protocol23, keeping the soluble and the insoluble fractions separate; in total, we identified 56 bovine proteins within the complete dataset analysed with Scaffold, finding 21 different proteins that were only in the insoluble fraction, four different proteins only in the soluble fraction and 31 proteins present in both fractions (Fig. 2; see Supporting Table S-1 for the complete protein list and functions and Supporting Data S-1 for peptide and protein lists from Progenesis). Proteins uniquely present in the insoluble fraction were mostly ubiquitous proteins like serine proteases, chaperones and proteins involved in the immune response and in the coagulation cascade. However, some tooth-specific proteins were also only observed in this fraction, such as ameloblastin (AMBN), specifically involved in the mineralisation and organization of enamel43,44, asporin (ASPN), a suppressor of the differentiation and mineralisation of the periodontal ligament45, and metalloproteinase-2 (MMP2), a protein expressed during the development of teeth46. By contrast, the proteins only observed in the soluble fraction were two proteins normally bound to hydroxyapatite (bone sialoprotein 2 (IBSP) and osteopontin (SPP1)47), one tooth-specific protein (dentin matrix acidic phosphoprotein 1 (DMP1)48) and one collagenous protein (collagen alpha-1(XVII) chain (COL17A1)) that was shown to differ in cases of epidermolysis bullosa and abnormal dentition49. Proteins that were present in both fractions revealed strong interactions with one another, showing distinct clusters for specific types of proteins. One cluster (Fig. 2; blue) was characterised by plasma proteins, with some of them also interacting with the extracellular matrix or involved in the ossification of bones and teeth (SPARC50, fetuin-A51 (AHSG)
and clusterin52 (CLU)). Another cluster (Fig. 2; red) comprised collagenous proteins and collagen-binding proteins such as biglycan (BGN), transforming growth factor beta-1 (TGFB1) and lumican (LUM). The third (Fig. 2; green) was composed of plasma proteins with coagulation functions (prothrombin (F2), coagulation factor IX (F9), coagulation factor X (F10) and antithrombin-III (SERPINC1)), but also of some bone- and tooth-specific proteins such as osteomodulin53 (OMD), matrix GLA protein54 (MGP) and matrix metalloproteinase-2055 (a tooth-specific protein also called enamelysin, MMP20). Proteins in yellow clustered separately from the rest, and included thrombospondin-1 (THBS1), which has a role in dentinogenesis, insulin-like growth factor binding protein 5 (IGFBP5), which was found to be highly expressed in dental tissue-derived mesenchymal stem cells56, and angiopoietin 1 (ANGPT1), which regulates vascular development and angiogenesis. We also investigated the biochemical properties of the proteins extracted from the two fractions, to test whether the size, the charge and the hydrophobicity/hydrophilicity were associated with the fraction in which specific proteins were collected. While size and charge did not appear to influence the extraction of specific proteins in specific fractions, the hydrophobicity/hydrophilicity showed some association, with 75% of the proteins from the soluble fraction being hydrophilic (polar) and 86% of the proteins in the insoluble fraction being hydrophobic (non-polar).

Figure 2. Networks of proteins extracted uniquely from the insoluble (blue), soluble (red) and from both fractions (purple); the line thickness indicates the strength of data support (edge confidence).

Furthermore, semi-quantitative levels of the proteins extracted from the two fractions were evaluated. Results showed that six proteins in particular were statistically more abundant in
the soluble fraction, 36 proteins more abundant in the insoluble fraction and 14 equally abundant between the two fractions (Fisher’s Exact Test was applied, P-value