
Article

Screening and Validation of Novel Biomarkers in Osteoarticular Pathologies by Comprehensive Combination of Protein Array Technologies

Álvaro Sierra-Sánchez, Diego Garrido-Martín, Lucía Lourido, María González-González, Paula Díez, Cristina Ruiz-Romero, Ronald Sjöberg, Conrad Friedrich Droste, Javier De Las Rivas, Peter Nilsson, Francisco Javier Blanco García, and Manuel Fuentes

J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.6b00980 • Publication Date (Web): 05 Apr 2017

Downloaded from http://pubs.acs.org on April 6, 2017

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

Screening and Validation of Novel Biomarkers in Osteoarticular Pathologies by Comprehensive Combination of Protein Array Technologies

Álvaro Sierra-Sánchez‡¹,², Diego Garrido-Martín‡¹,², Lucía Lourido³, María González-González¹,², Paula Díez¹,², Cristina Ruiz-Romero³, Ronald Sjöberg⁴, Conrad Droste⁵, Javier De Las Rivas⁵, Peter Nilsson⁴, Francisco Blanco³* & Manuel Fuentes¹,²*

‡ Both authors contributed equally.

¹ Department of Medicine and General Cytometry Service-Nucleus, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain

² Proteomics Unit, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain

³ Proteomics Group-PBR2-ProteoRed/ISCIII, Rheumatology Division, Instituto de Investigación Biomédica de A Coruña (INIBIC/CHUAC/Sergas/UDC), 15001 A Coruña, Spain

⁴ Affinity Proteomics, Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology (KTH), SE-17165 Stockholm, Sweden

⁵ Bioinformatics and Functional Genomics Research Group, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain

* Corresponding authors:

E-mail: [email protected]. Phone: +34 923294811. Fax: +34 923294743

E-mail: [email protected]. Phone: +34 981 17 82 72. Fax: +34 981 17 82 73


ACS Paragon Plus Environment


ABSTRACT

Osteoarthritis (OA) is one of the most prevalent articular diseases. The identification of proteins closely associated with diagnosis, progression, prognosis, and treatment response is urgently needed for this pathology. In this work, differential serum protein profiles were identified in OA and rheumatoid arthritis (RA) using antibody arrays containing 151 antibodies against 121 antigens, in a cohort of 36 samples. These differential profiles were then validated in a larger cohort of 282 samples. The overall immunoreactivity was higher in the pathological conditions than in the controls. Several proteins were identified as biomarker candidates for OA and RA. Most of these candidates are proteins related to the inflammatory response, lipid metabolism, or bone and extracellular matrix formation, degradation, or remodeling.

Keywords: Antibody arrays, serum protein profiles, osteoarthritis, rheumatoid arthritis, contact printing, non-contact printing, biomarkers.


1. INTRODUCTION

Osteoarthritis (OA) is one of the most prevalent articular diseases. It is characterized by a gradual loss of cartilaginous matrix that often extends over years or decades. Indeed, age is the main risk factor for OA. Thus, as longevity increases, OA has become a leading cause of disability among older adults in developed countries. Worldwide estimates indicate that approximately 10% of men and 18% of women aged over 60 years have symptomatic osteoarthritis. Moreover, about 80% of them have substantial limitations of movement, and 25% cannot perform their major daily activities¹.

Currently, OA diagnosis is essentially symptomatic and relies on the description of pain symptoms, stiffness of the affected joints, and radiography, which is still the reference technique to determine the degree of joint destruction. However, radiography provides only indirect information about the tissue and lacks the sensitivity to detect small changes in the joint structures. Furthermore, most OA treatments are also symptomatic, and the only effective therapy in advanced stages is joint replacement².

Important advances have been made during the last decade in understanding the pathogenesis of OA and other diseases affecting the joint tissues, such as rheumatoid arthritis (RA), in which an autoimmune response leads to joint destruction. However, we are still far from having a clear picture of the molecular network that predisposes an individual to develop the disease, to experience worsening symptoms, or to respond successfully to a specific treatment. In this regard, the identification of proteins closely associated with diagnosis, disease progression, prognosis, and treatment response in these pathologies is urgently needed. Over the last years, multiple biological markers have been proposed that may reflect the synthesis or degradation of the three main joint tissues (cartilage, synovial membrane, and bone)³. However, none of them is sufficiently validated and qualified for systematic use.

Proteomic profiling technologies are powerful tools for biomarker discovery and validation. The basic strategy involves the fractionation of the proteins contained in the samples, followed by peptide identification using mass spectrometry (MS). This allows an indirect and highly specific identification of the proteins⁴,⁵. Indeed, several studies have recently reported the MS characterization of synovial fluid, cartilage, or subchondral bone in order to identify proteins as potential biomarker candidates⁶–⁹. Nonetheless, MS lacks the sensitivity to analyze complex biological samples. In the case of human serum, the high dynamic range of protein concentrations¹⁰ precludes the direct detection of medium- and low-abundance biomarkers by MS¹¹. In contrast, high-throughput protein microarrays offer a direct approach to simultaneously screen thousands of antigens in an unbiased manner using a minimal amount of sample. Moreover, they have proven to be a suitable tool for antigen and autoantibody profiling in complex samples across multiple diseases²,¹².

One of the key steps in the construction of protein microarrays is the methodology for spotting/printing the proteins or antibodies onto the functionalized surface of the slide¹³. Currently, multiple technologies are commercially available with different printing procedures, most of them adapted from DNA arrays. These procedures range from simple deposition by pins or needles, followed by adsorption of the biomolecule onto the functionalized surface (commonly called contact printing), to more complex nano-injection fluidic approaches, in which nanodrops are dispensed by piezoelectric systems onto the functionalized surface (commonly referred to as non-contact printing)¹⁴. Both approaches have advantages and drawbacks regarding intra- and inter-array reproducibility, surface format (planar slides, microtiter wells, color-coded beads, etc.), real-time detection systems, or compatibility with complex biological samples, among others¹⁵.

In this study, we pursued the characterization of differential serum protein profiles in OA patients, RA patients, and healthy controls (C). With this aim, in a preliminary discovery phase, a small subset of samples (n = 36) was employed to perform a comprehensive evaluation of the suitability of the clinical samples, antibodies, reagents, and experimental procedures for biomarker discovery. This cohort was screened against a panel of 151 antibodies targeting 121 proteins (151 x 121) using antibody arrays developed by contact printing. Then, a validation phase was performed in a larger cohort (n = 282 samples) of the different patient groups (OA, RA, C) by screening the sera with 151 x 121 antibody arrays developed by non-contact printing.

2. MATERIALS AND METHODS

2.1. Materials

Acetone >98% (Panreac, Barcelona, Spain); 3-(2-aminoethylamino) propyl-methyl dimethoxysilane (MANAE) (Fluka, Steinheim, Germany); dimethyl sulfoxide (DMSO) (Merck Millipore, Billerica, USA); bis-(sulfosuccinimidyl)-suberate (BS3), Nunc® 384 clear flat well plates, SuperBlock® Blocking Buffer, Microtiter Plate 96 Well/V Bottom, Lifterslip™ coverslips (Thermo Scientific, Portsmouth, USA); Bovine Serum Albumin (BSA) >98%, NHS-PEG4-Biotin, Tween®20 viscous liquid (polyoxyethylenesorbitan monolaurate), ampicillin sodium, Corning® hybridization chambers (Sigma Aldrich, St Louis, USA); Peroxidase-AffiniPure Goat Anti-Human IgG, Fcγ Fragment Specific, Peroxidase-AffiniPure F(ab')2 Fragment Goat Anti-Mouse IgG (H+L) (Jackson ImmunoResearch Laboratories, Baltimore, USA); TSA individual cyanine 3 Tyramide Reagent Pack (TSA) (PerkinElmer, Waltham, USA); slides Ground Edges 76x26 mm (LíneaLab, Badalona, Spain); Amersham Cy™5-Streptavidin, 16-Array Chamber Covers (GE Healthcare, Buckinghamshire, UK); Goat Anti-Rabbit IgG (H+L) HRP Conjugate (BIO-RAD, California, USA); powdered concentrated skimmed milk (Central Lechera Asturiana, Granda-Siero, Spain).

Purified rabbit polyclonal anti-human antibodies against the antigens of interest (Supporting Table 1) were kindly provided by the Human Protein Atlas (www.hpa.com). These antigens had been previously described to be altered in osteoarthritic patients at different levels in cartilage (extracellular matrix, chondrocytes) and blood, by genomic and mass spectrometry (MS) assays¹⁶–²⁰.

2.2. Patients

Human serum samples from three groups, osteoarthritis (OA), rheumatoid arthritis (RA), and healthy controls (C), were provided by the Biobank at the Institute for Biomedical Research of A Coruña (INIBIC). The samples were extracted and processed after each donor had signed written informed consent, according to the guidelines of the local Ethics Committee (Comité Ético de Galicia, Galicia, Spain). The OA group consisted of 108 patients diagnosed with OA according to the American College of Rheumatology (ACR) criteria²¹. The RA group comprised 108 patients diagnosed with RA following the ACR/European League Against Rheumatism (EULAR) criteria²². The 102 healthy controls selected for the analyses were donors without a history of joint disease. In the first discovery phase, a small subset of 36 samples (OA, n=12; RA, n=12; and C, n=12) was selected. In a second validation stage, a total of 282 samples were studied (OA, n=96; RA, n=96; and C, n=90).

2.3. Methods

2.3.1. Surface functionalization

The glass slide surfaces were activated⁸ by treatment with 2% (v/v) MANAE in acetone for 30 min with shaking at room temperature (RT). Slides were subsequently washed with acetone and MilliQ® water and dried with compressed filtered air.

2.3.2. Microarray preparation

Contact printing technology (CT)

Anti-human polyclonal antibodies (0.25 mg/mL) were supplemented with 2 mM BS3 as cross-linker. Spotting buffers with and without cross-linker, in the absence of antibodies, as well as BSA (0.038 × 10⁻³ mg/mL to 0.625 mg/mL range), were used as negative controls. In turn, positive controls included NHS-PEG4-biotin (0.78 mg/mL) and goat anti-human IgG (0.25 mg/mL). Antibodies and controls were spotted onto MANAE-functionalized slides using the MicroGrid II printer with a 4x4 384 split spot tool (BioRobotics, Cambridge, UK). Each sample was spotted in quadruplicate in each of the three identical subarrays contained in each slide. The spot diameter was set at 150 µm and the distance between spots at 585 µm (see Supporting Figure 1A). Finally, printed slides were packed and stored protected from light in a dry atmosphere at RT until assayed.


Non-contact printing technology (NC)

In this case, anti-human polyclonal antibodies were resuspended (1:1) in a 47% (v/v) glycerol solution, according to the ArrayJet® Printer Marathon v1.4 specifications (ArrayJet®, Roslin, UK). NHS-PEG4-biotin (0.39 mg/mL) was prepared as positive control. Spotting buffers with and without cross-linker, in the absence of antibodies, and BSA (0.6 mg/L to 3.66 mg/mL range) were prepared as negative controls. Slides printed using the ArrayJet® Printer Marathon contained 12 identical subarrays, each one including all antibodies and controls to be analyzed. The spot diameter was set at 100 µm and the separation distance between spots at 200 µm (see Supporting Figure 1B). A total of 6 serum samples were analyzed per array, in duplicate. Finally, printed arrays were packed and stored protected from light in a dry atmosphere at RT until assayed.

2.3.3. Evaluation of array performance

All the following steps were performed at RT. Antibody arrays were blocked with a SuperBlock®-PBS solution for 1 h on a rocking platform. Then, they were washed with PBS (5 min, 3X). After that, the arrays were incubated with HRP-conjugated anti-rabbit secondary IgG (1:200 (v/v) in SuperBlock®-PBS) for 1 h in a humidified chamber. Then, they were individually washed with i) PBS (5 min, 3X) and ii) distilled water (5 min, 1X). Subsequently, arrays were incubated with 1:50 (v/v) TSA solution for 10 min in a humidified chamber. Arrays were then washed as described above and dried with filtered compressed air. Finally, they were scanned using the GenePix® 4000B Scanner (Axon Instruments, Union City, USA) and the SensoSpot Fluorescence Scanner (Sensovation AG, Radolfzell, Germany) for CT and NC technologies, respectively, and analyzed.

2.3.4. Sera biotinylation

Following the protocol described by Häggmark A. et al.²³, proteins present in OA, RA, and C serum samples were biotinylated by incubation with 0.78 mg/mL NHS-PEG4-biotin for 2 h at 4 °C. Biotinylation reactions were stopped with 0.5 M Tris-HCl (pH 8).

2.3.5. Detection of protein serum profiles

All steps were performed at RT unless otherwise specified.

Discovery phase

CT antibody arrays were blocked with a blocking solution (1% PBS, 0.2% Tween20 and 5% (w/v) powdered skimmed milk) for 1 h on a rocking platform, followed by washing with distilled water (5 min, 1X). Each array was incubated with 1:600 (v/v) serum (diluted in blocking solution), with slight shaking at 4 °C, overnight (O/N). Subsequently, slides were incubated with 1:100 (v/v) Cy3-Streptavidin for 1 h in darkness, in a humidified chamber. Prior to scanning, arrays were washed as described above and dried with filtered compressed air.

Validation phase

NC arrays were blocked with SuperBlock® for 1 h with shaking and subsequently washed with distilled water (5 min, 3X). Then, 40 µL of 1:1000 (v/v) biotinylated serum was added to each well of the 16-array chamber. Chambers were covered and incubated O/N at 4 °C with slight shaking. After that, the arrays were individually washed with distilled water and revealed using 1:50 (v/v) Cy5-Streptavidin for 20 min. Finally, arrays were washed, dried with compressed filtered air, and scanned.

2.3.6. Image analysis

The TIFF images generated by array scanning were analyzed using GenePix® Pro 4.0 software. Parameters were set to quantify light intensity values at Cy3 (λ=532 nm) and Cy5 (λ=635 nm) emission wavelengths for CT and NC technologies, respectively.

2.3.7. Signal standardization

Signal intensity values were normalized following equation 1, where $S_{norm}$ refers to the intra- and inter-array normalized signal, $S_{raw}$ to the raw signal in antibody-containing spots, $S_{bg}$ to the background signal, and $S_{buffer}$ to the raw signal in companion buffers' spots. Background signal subtraction within each array was followed by fold change calculation with respect to a blank across arrays²⁴.

$$S_{norm} = \frac{S_{raw} - S_{bg}}{S_{buffer} - S_{bg}} \qquad (1)$$
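As a minimal sketch of equation 1, the following Python snippet applies background subtraction followed by fold change over the blank. It assumes that the blank is the median of the background-subtracted companion-buffer spot signals; the function names and the toy intensities are hypothetical and are not taken from the study data.

```python
from statistics import median


def blank_signal(buffer_signals, backgrounds):
    """Median of background-subtracted buffer-spot signals.

    buffer_signals and backgrounds are parallel lists: each buffer spot is
    paired with the background estimate of the array it was printed on.
    """
    subtracted = [s - b for s, b in zip(buffer_signals, backgrounds)]
    return median(subtracted)


def normalize(raw, background, blank):
    """Equation 1: background-subtracted signal as fold change over the blank."""
    if blank <= 0:
        raise ValueError("blank must be positive to compute a fold change")
    return (raw - background) / blank


# Toy values (hypothetical, arbitrary fluorescence units).
blank = blank_signal([520.0, 480.0, 510.0], [100.0, 90.0, 110.0])
# Three replicate spots of one antibody on an array with background 100.
replicates = [normalize(r, 100.0, blank) for r in (900.0, 1300.0, 1100.0)]
# The median of the replicates serves as the reactivity estimate.
reactivity = median(replicates)
```

A normalized value above 1 means that the background-subtracted antibody signal exceeds the blank.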

In the discovery phase (CT arrays), the background signal was estimated for each array using the DEPC water spot signal values corresponding to the first quartile. In the validation phase (NC arrays), as several sera were assayed per array (in duplicate), the background signal was estimated as the mean of the DEPC water spot intensity values present in each pair of subarrays corresponding to the same serum. For both printing technologies, the blank signal was estimated as the median of the intra-array normalized intensity values of negative control spots containing the companion buffers (i.e., PBS+H2O+BS3 in CT and glycerol+PBS+BS3 in NC), across all the arrays. The median signal of the replicates per antibody was computed as an estimate of reactivity. Antibodies with normalized intensity values greater than 1 were considered hits (signal due to antibody-antigen binding), and the corresponding samples were considered positive for the antigen specifically bound by the antibody.

2.3.8. Statistical analysis

To remove the effect of outlier samples, the sera in which the number of hits detected was below 1% or above 99% were excluded from further analyses. After this filtering step, 34 CT samples (12 RA, 12 OA, and 10 C) in the discovery phase and 246 NC samples (80 RA, 84 OA, and 82 C) in the validation phase were analyzed. The significance level selected was 0.05.

The comparison of the overall serum protein profiles among patient groups (RA, OA, and C) was performed using multivariate analysis of variance (MANOVA). The Canonical Biplot (CB) method²⁵ was also employed to represent both patients and protein levels (MultBiplotR R package v. 0.2, http://biplot.usal.es/classicalbiplot/multbiplot-in-r)²⁶. The CB method provides a simultaneous representation of n individuals belonging to K groups and p variables measured on them in a space of reduced dimension, maximizing the ratio of “between-group” to “pooled within-group” variance. It allows not only the observation of differences among groups but also the identification of the variables responsible for them. In a CB representation, individuals are displayed as points and variables as vectors. The group means and their confidence intervals are also shown. The length of the vectors shows the relevance of the variables in explaining the differences among groups. The angle between variables (vectors) can be interpreted as an approximation of their correlation (small angles indicate high correlations). The projection of individual points onto a vector provides an estimate of the levels of the corresponding variable in these individuals. It is possible to determine the magnitude of the differences between groups and their significance by checking that the confidence circles do not overlap when projected onto the variables. These confidence circles correspond to the 95% confidence intervals for the location of the centroid of each group, representing the average individual.

The small sample size (n=34) in the discovery phase did not allow MANOVA (higher number of variables than observations); however, a CB representation of the data provided an approximation of the differences in the serum protein profiles. Subsequent ANOVA and t-test analyses were performed for both discovery and validation phases in order to explore differences in the levels of individual proteins among groups. To control for multiple hypothesis testing, the Benjamini-Hochberg method for FDR was employed. The FDR level was set at 0.01.

Additionally, for the validation phase, a hierarchical clustering analysis using Euclidean distances was carried out on log2 values of normalized signals, to address the unsupervised classification of patients based on significant protein levels (FDR < 0.01). In addition, different supervised machine learning approaches were employed to classify the samples based on the levels of the significant proteins (FDR < 0.01). The validation dataset was split into a training set (172 samples, 70%) and a test set (74 samples, 30%) with balanced classes. We selected three different, commonly used classifiers: i) multinomial logistic regression (MLR), ii) support vector machine with linear kernel (SVM), and iii) artificial neural network with one hidden layer of three units (NN), and trained them using the train() function in the caret R package v. 6.0-70 (https://CRAN.R-project.org/package=caret)²⁷. We used 10-fold cross-validation repeated 10 times, with parameter tuning, to select the optimal models based on their ROC values. We employed the selected models to make predictions on the test dataset, obtaining overall and by-class (“one vs all”) statistics (i.e., accuracy, sensitivity, specificity). ROC curves were built for the OA and RA patient groups, using the average “one vs all” sensitivity and 1-specificity values across classifiers.
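The “one vs all” statistics can be illustrated with a short Python sketch. It is not the caret implementation used in the analysis; the function name and the toy labels are hypothetical and only show how per-class sensitivity and specificity can be derived from predicted and true labels.

```python
def one_vs_all_stats(y_true, y_pred, positive):
    """Treat `positive` as the class of interest and pool all other classes."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one prediction is required")
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(pairs),
    }


# Toy labels (hypothetical, not study results).
y_true = ["OA", "OA", "RA", "RA", "C", "C"]
y_pred = ["OA", "RA", "RA", "RA", "C", "OA"]
oa = one_vs_all_stats(y_true, y_pred, "OA")
ra = one_vs_all_stats(y_true, y_pred, "RA")
```

Each class then contributes one point (1-specificity, sensitivity) to a ROC plot for a given classifier.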


3. RESULTS AND DISCUSSION

In this work, a total of 318 serum samples (36 in a preliminary discovery phase and 282 in a validation phase) from patients with OA or RA, as well as healthy controls, were screened using a panel of 151 antibodies, immobilized on planar microarrays, against 121 preselected antigens (Supporting Table 1). The main goals of this study were the evaluation of the suitability of the experimental pipeline for biomarker discovery, the characterization of differential serum protein profiles of patients suffering from OA and RA as potential diagnostic tools, and the identification of candidate biomarkers.

3.1. Evaluation of antibody array performance

This study comprised two consecutive phases, discovery and validation, which involved multiple experimental steps (Figure 1). For each phase, a different array printing technology was employed: contact (CT) and non-contact (NC), respectively. To be able to assess functionality and detect differences in performance, both phases followed almost identical processing steps. This experimental design avoided adding further biases to the intrinsic variability of the printing procedures.

An identical quality control (QC) was designed for both CT and NC arrays in order to evaluate spot features and to detect undesirable effects such as cross-talk or cross-contamination between spots. Antibody arrays were incubated with HRP-conjugated anti-rabbit IgG, and the signals were matched with the expected content of each spot (i.e., antibody spot, empty spot, blank spot). In addition, the signal variation could be related to the amount of immobilized antibody, and the saturation concentration was determined.


According to the results of the QC, both printing technologies showed high reliability for use in biomarker discovery. However, CT arrays displayed spots of irregular shape and larger size, with an increased probability of cross-contamination. This limited the number of subarrays that could be printed per slide, and therefore the number of replicates and samples that could be assayed per array. CT printing also required considerably larger amounts of clinical sample and reagents than NC printing and was consequently not suitable for the analysis of a large number of clinical samples.

Taking into account the limitations of CT printing, we decided to use a small number of samples for the discovery phase (n = 36). This sample size would be sufficient to assess the experimental procedures and characterize differential protein profiles, and also to identify individual biomarker candidates, assuming sufficient, albeit limited, statistical power. This decision was key to saving substantial amounts of sample and reagents, increasing the performance of the subsequent stage, in which we aimed to validate the results observed in the initial phase and to identify new candidate antigens. For the validation phase, in which a larger number of antibody arrays and clinical samples were analyzed, NC technology was selected because of its higher printing precision and the lower amount of reagents and samples required.

Additionally, to evaluate the similarity in performance of both array platforms, the median normalized signal per antigen and patient group (OA, RA, and C) was computed for each array technology. The correlation (r, Pearson) of these values between CT and NC arrays was obtained for each antigen. The distribution of the correlation coefficients is shown in Supporting Figure 2. Note that we did not aim to exhaustively compare both array formats, but to assess the degree of similarity in terms of the global trends. Despite the multiple sources of variability (experimental procedures, different samples assayed in each case, etc.), correlations were positive for the majority of antigens (71%), showing similar behavior on both array platforms. In fact, 31% of the antigens showed correlations larger than 0.8, implying highly concordant measurements across technologies.

3.2. Identification of differential protein profiles in serum

3.2.1. Immunoreactivity

Overall, immunoreactivity estimates obtained using both CT and NC technologies displayed a wide range of values across individual samples. Despite the high variability, different profiles were clearly detected, with proteins showing low or moderate levels in certain patient groups and higher levels in others.

In the discovery phase, the total number of hits per sample was similar between the two disease groups (median values of 67.5 and 68.5 in OA and RA, respectively) and higher than for healthy controls (median value of 49.0). The median normalized intensities of hits per sample, which serve as estimates of the antigen levels, displayed an analogous behavior (Figure 2A). The percentage of positive samples per antigen was plotted and correlated to illustrate differences in immunoreactivity between the patient groups (Figure 2B, see also Supporting Figure 3). The fraction of reactive samples for particular antigens was higher in OA and RA than in C. Antigens present in a high number of pathological samples and absent in most of the controls would be potential biomarker candidates. We found antigens with disease-specific immunoreactivity profiles for both RA and OA.
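The percentage of positive samples per antigen can be computed as sketched below in Python. The hit threshold of 1 follows Section 2.3.7; the function name, the data layout, and the toy values are hypothetical and do not correspond to any measured antigen.

```python
def percent_positive(signals_by_group, threshold=1.0):
    """Percentage of samples per group whose normalized signal exceeds the threshold.

    signals_by_group maps group -> antigen -> list of normalized signals,
    one value per serum sample.
    """
    result = {}
    for group, antigens in signals_by_group.items():
        result[group] = {
            antigen: 100.0 * sum(v > threshold for v in values) / len(values)
            for antigen, values in antigens.items()
            if values  # skip antigens with no measurements
        }
    return result


# Toy normalized signals (hypothetical) for one placeholder antigen.
toy = {
    "OA": {"antigen_A": [1.5, 0.8, 2.0, 1.2]},
    "C": {"antigen_A": [0.5, 1.1, 0.9, 0.7]},
}
fractions = percent_positive(toy)
```

Plotting these percentages for one group against another, antigen by antigen, gives the kind of comparison described for Figure 2B.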


In the validation stage, the total number of hits per sample differed substantially among the three groups of samples and was clearly higher for pathological sera than for the C group (median values of 93.7 for OA, 67.6 for RA, and 21.2 for C). However, the estimates of antigen levels (Figure 2A) displayed similar median levels for RA and C and higher levels for OA. This may be a result of the antibody selection, which targeted antigens previously reported in OA pathology. This pattern was not seen in the discovery phase. The variability of intensity values was higher for the C group, suggesting that some of the individuals in this group might have been erroneously included (note that the inclusion criterion for healthy controls was just the absence of a history of joint disease) and would eventually develop the pathology. As in the initial phase, the percentage of positive samples per antigen was plotted and correlated (Figure 2B; see also Supporting Figure 3). The range of percentages was narrower for this second stage; however, the cloud of points was greatly displaced toward the pathology side, meaning that for the vast majority of tested antigens, the fraction of samples showing high levels was markedly larger in the disease groups than in the healthy controls. We also found antigens with strongly disease-biased immunoreactivity profiles when comparing RA vs OA.

3.2.2. Deciphering differential protein serum profiles

A CB representation of the samples (points) and antigens (vectors) analyzed in the discovery phase is shown in Figure 3A (left panel). This representation of the data reflects the differences in protein profiles across the patient groups. A relatively good separation of the patients is achieved based on the antigen levels, especially in the case of OA samples. The top 5 antigens with the largest contribution to the differences among groups (TNF, ITGAM, SPARCL1, TGFBI, VASN) are displayed as vectors, their length representing the size of their contribution (note that this could be done for all the assayed antigens, but we have restricted the representation to the most relevant ones for visualization purposes). Given the properties of the CB representation and the direction of the antigen vectors, most of the differences correspond to changes in antigen levels between the C group (lower levels) and both disease groups (higher levels). An exception was the VASN protein, whose levels differed between the RA group (higher levels) and the other two groups (OA and C). We can also make statements about the correlation of the antigen levels across samples: the smaller the angle between their corresponding vectors, the higher their correlation. For instance, here we can see a higher correlation, and thus similar behavior across sera, for the antigens TNF, SPARCL1, and TGFBI. The overlap of the Bonferroni-corrected confidence circles for the group centroids (average individuals) over the antigen vectors, together with the subsequent statistical analyses based on ANOVA and t-tests, showed these differences in antigen levels to be non-significant after multiple testing correction (FDR > 0.01).

In the validation stage, significantly different protein profiles were detected among the studied groups (MANOVA, Pillai's trace, p-value < 0.001). The CB representation of the serum samples (Figure 3A, right panel) shows a good separation of the OA and RA patients and healthy controls along the canonical axes. The antigens IL1RAP, PLTP, SLC11A1, ANXA6, FBN1, COL1A1, and VASN are displayed as vectors (again, note that this could be done for all the assayed antigens, but we have restricted the representation to the most relevant ones for visualization purposes). These antigens showed significant differences in their levels between groups in ANOVA and t-test analyses after multiple testing correction (FDR < 0.01), and also large contributions to the differences between groups (Figure 4). IL1RAP showed higher levels in the C group, SLC11A1 and PLTP in the OA group, and COL1A1, ANXA6, FBN1, and VASN in the RA group. The higher correlation (small angles between the corresponding vectors) between the antigens ANXA6, FBN1, and VASN in the CB representation reflects their similar behavior across sera. It is also relevant that vectors corresponding to the same antigens have similar directions in both screening phases, relative to the patient groups, meaning that the antigen levels are high or low in the same groups of samples for both discovery and validation phases (data not shown). This also supports the similarity and comparability of both printing technologies (see section 3.1).

The highlighted antigens in both screening phases belong to three major protein groups: i) Proteins involved in proinflammatory and inflammatory processes, such as IL1RAP (interleukin 1 receptor accessory protein) and VASN (vasorin). Indeed, IL1RAP is a necessary component of the interleukin 1 receptor complex, which initiates the signaling cascade that results in the activation of IL1-responsive genes. ii) Lipid metabolism related proteins, such as PLTP (phospholipid transfer protein) and ANXA6 (annexin A6). For example, PLTP is one of the lipid transfer proteins that interact with apolipoproteins A1 and A2 (APOA1, APOA2). iii) Proteins related to ECM formation, degradation, or repair processes, such as FBN1 (fibrillin 1) and COL1A1 (collagen type I alpha 1 chain). COL1A1 is one of the components of collagen I, a member of the family of proteins that strengthen and support many tissues in the body (cartilage, bone, tendons, skin, etc.).

Some of the identified proteins had already been reported as relevant for rheumatic pathologies in previous studies (refs 28 and 29), which can be considered an asset of this work. Others have been related to OA and RA for the first time at the protein level in this study. Note also that VASN has been detected as highly expressed in the RA patients when compared to OA and C in both discovery and validation phases. This protein has been recently described as a modulator of the vascular response to injury through the attenuation of TGF-β signaling (ref 30) and might have a similar role during joint destruction.

Different unsupervised and supervised methods were employed to classify patients and controls according to their serum protein profiles. We restricted most of these analyses to the validation phase, given the difficulty of applying them to the entire dataset, mainly due to the differences in signal distribution between the two array technologies. Median raw intensities were larger in NC than in CT arrays. Signal variability was also platform-dependent, being smaller in NC arrays due to the higher printing precision. Even if standardization substantially corrected these issues, hierarchical clustering on the entire study still pointed to the array technology as an important clustering factor (see Supporting Figure 4). In addition, the proteins employed for classification (those identified in the validation stage) were not seen as significant for distinguishing between patient groups in the initial discovery phase (except for VASN). Consequently, including the discovery samples would hinder the classification. Note also that they only represent a small fraction of the total number of samples (36/(282 + 36) ≈ 11%) and therefore would not provide much additional information.

Hierarchical clustering of all antigens and samples analyzed for all patient groups and both array technologies is shown in Supporting Figure 4. Heatmaps for the log2 values of the normalized intensities are also displayed. As can be seen, there is no clear separation of the patient groups based on the levels of all the antigens. Additionally, in the case of the validation phase, for each pair of groups, the samples were also clustered based on the antigens whose levels were identified as significantly different (FDR < 0.01) between patient groups (Figure 3B). A substantially better classification was achieved in this case.

We further addressed the supervised classification of the validation samples based on the levels of the significantly differential proteins (FDR < 0.01), using different classifiers (multinomial logistic regression, MLR; support vector machine with linear kernel, SVM; and neural network with one hidden layer of three units, NN). The average classification accuracy achieved in the test set was 0.717 (very similar across classifiers; MLR: 0.712, SVM: 0.726, NN: 0.712). Estimates of other statistics by class ("one vs all" classification) were also obtained and are summarized in Supporting Table 2. As they were similar across classifiers, the focus was set on their average values. ROC curves built upon these values for OA and RA are shown in Figure 5. The highest sensitivity (proportion of positives that were correctly classified as such) was achieved for the control group (0.833). Sensitivity for the OA group was also high (0.800). However, it was low for RA classification (0.514). Specificity (proportion of negatives that were correctly identified as such) estimates were especially high for the RA group (0.966), followed by C (0.857) and OA (0.750). This means that it is possible to discriminate with high confidence between diseased and healthy individuals. Additionally, it is feasible to determine with moderate confidence whether an individual is affected by OA or not. However, in the case of RA, it is only possible to affirm with high confidence that a given individual does not have the disease. This is likely due to the antibody selection, which targeted antigens that have been previously reported in OA pathology and that may not be related to RA.
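The per-class statistics discussed above follow from a confusion matrix once each group is treated in turn as the positive class ("one vs all"). The Python sketch below illustrates that calculation under stated assumptions: the function names and the confusion-matrix counts are hypothetical and are not the study's data or the authors' code. Only the three per-classifier accuracies at the end are taken from the text.

```python
from typing import Dict, List


def one_vs_all_metrics(confusion: List[List[int]],
                       labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Per-class sensitivity and specificity.

    confusion[i][j] is the number of samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]                         # class k predicted as k
        fn = sum(confusion[k]) - tp                  # class k predicted as other
        fp = sum(row[k] for row in confusion) - tp   # other predicted as k
        tn = total - tp - fn - fp                    # other predicted as other
        metrics[label] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    return metrics


def accuracy(confusion: List[List[int]]) -> float:
    """Overall fraction of correctly classified samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total


# Hypothetical test-set counts, invented for illustration only.
labels = ["C", "OA", "RA"]
confusion = [
    [8, 2, 0],  # true C
    [2, 6, 2],  # true OA
    [1, 4, 5],  # true RA
]

for group, values in one_vs_all_metrics(confusion, labels).items():
    print(group, values)
print("accuracy:", accuracy(confusion))

# Accuracies reported in the text for the three classifiers.
reported_accuracy = {"MLR": 0.712, "SVM": 0.726, "NN": 0.712}
mean_accuracy = sum(reported_accuracy.values()) / len(reported_accuracy)
print("mean reported accuracy:", round(mean_accuracy, 3))
```

In this toy matrix the RA class has few false positives but many false negatives, which is the same qualitative pattern (high specificity, low sensitivity) described for RA in the text.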


The present study provides insights into the differential protein serum profiles of OA, RA, and healthy individuals, leading to candidate antigens worthy of further evaluation. However, additional steps will be required to verify the present findings. These could also be immunoassay-based, employing alternative formats to the CT/NC antibody arrays (e.g., beads, ELISA), but would preferably rely on more orthogonal techniques such as mass spectrometry (MS), combined with liquid chromatography (HPLC) and different labelling techniques (e.g., isobaric tags for relative and absolute quantitation, iTRAQ; or tandem mass tags, TMT). Finally, although studying the clinical interest of individual antigens is beyond the goals of the present study, it is worth mentioning that the profiles identified, together with information on other clinical variables (bone densitometry, current treatment, environmental factors, etc.), constitute a valuable resource for early diagnosis, progression evaluation, and treatment response assessment in OA and RA. In addition, antibody arrays provide a simple, fast, and miniaturized technology whose immunoassay format allows direct translation into the clinic (e.g., ELISA).


4. CONCLUSIONS

Our results showed differential protein profiles between patient groups. Immunoreactivity of pathological samples (OA and RA) was substantially higher than that of healthy control samples. A set of antigens showing significantly different levels between groups was identified, mostly related to inflammatory response, lipid metabolism, and bone and ECM degradation or remodeling. Unsupervised and supervised machine learning approaches allowed the classification of the patients based on these antigens (average test-set accuracy of 0.717 for the supervised classifiers), which constitute candidate biomarkers. NC printing appears to be the best strategy to process a higher number of samples in a reproducible manner, reducing the amount of antibodies, reagents, and sample required. Nevertheless, further studies will be necessary to accurately quantify differences in protein levels, with a larger number of samples, including other rheumatic/inflammatory pathologies and exploring the correlation with clinical information (e.g., treatment, disease status).


FIGURE LEGENDS

Figure 1. General overview of the experimental procedures. The steps performed to process the antibody arrays were the following: 1. Selection and cleaning of glass slides. 2. Surface activation by the addition of highly reactive amine groups. 3. Antibody selection and preparation of the antibody and control solutions for printing. 4. Antibody spotting on the slides using two different printing techniques: a) contact (CT, discovery phase) and b) non-contact (NC, validation phase). 5. Quality control (QC) of the printing procedure: addition of HRP-conjugated secondary antibodies and revealing by TSA. 6. Serum samples from RA and OA patients and healthy controls (C). 7. Biotinylation of the proteins present in the sera. 8. Incubation of printed arrays with biotinylated sera. Specific antibody-antigen bindings were revealed using Cy3- and Cy5-streptavidin for CT and NC technologies, respectively. 9. Array scanning and image processing.

Figure 2. Immunoreactivity profiles of the patient groups. For discovery and validation phases: A) signal intensity of hits in rheumatoid arthritis (RA) patients, osteoarthritis (OA) patients, and healthy controls (C); and B) correlation of the percentage ([%]) of positive samples per antibody across patient groups.

Figure 3. Differential serum protein profiles across patient groups. A) For both discovery (left panel) and validation (right panel) phases, canonical biplot (CB) representations of patients (points) and proteins (vectors) are displayed. The protein vectors have been scaled by a factor of 0.5 (discovery phase) and 0.125 (validation phase) to fit the plotting areas. For simplicity, only the proteins with major differences in their level estimates among groups (rheumatoid arthritis, RA; osteoarthritis, OA; and healthy controls, C) are shown (top 5 proteins for the discovery phase; significant proteins in t-test, FDR < 0.01, for the validation phase). The name of each patient group marks the position of the group centroid (average individual). The percentage of variability explained by each canonical axis is shown in parentheses. B) For the validation phase, hierarchical clusters based on the proteins with significant differences (t-test, FDR < 0.01) in RA vs C, OA vs C, and RA vs OA are shown. Heatmaps are also displayed, representing the estimates of the antigen levels (log2).

Figure 4. Normalized signal for proteins significantly (t-test, FDR < 0.01) differentially present among patient groups in the validation phase. Box plots show the distributions of the non-contact (NC) signal (estimate of the protein levels) for the proteins IL1RAP, PLTP, SLC11A1, ANXA6, FBN1, COL1A1, and VASN in rheumatoid arthritis (RA) patients, osteoarthritis (OA) patients, and healthy controls (C).

Figure 5. ROC curves. Representation in the ROC space, for each classifier (multinomial logistic regression, MLR; support vector machine with linear kernel, SVM; and neural network with one hidden layer of three units, NN) and disease group (osteoarthritis, OA, and rheumatoid arthritis, RA), of the values of sensitivity and 1 − specificity corresponding to the "one vs all" classification in the test set. The curves are drawn through the group centroids.
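As a rough illustration of the ROC-space representation described for Figure 5, the Python sketch below maps a (sensitivity, specificity) pair to the point (1 − specificity, sensitivity) and computes the area under the straight-line curve through (0, 0), that point, and (1, 1). The input values are the classifier-averaged sensitivities and specificities quoted in the text for OA and RA. The function names are hypothetical, and treating those averages as the group centroids, as well as the resulting areas, are assumptions of this sketch rather than results reported in the paper.

```python
from typing import List, Tuple


def roc_point(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Map (sensitivity, specificity) to ROC coordinates (1 - specificity, sensitivity)."""
    return (1.0 - specificity, sensitivity)


def trapezoid_area(points: List[Tuple[float, float]]) -> float:
    """Area under the piecewise-linear curve through (0, 0), the points, and (1, 1)."""
    curve = [(0.0, 0.0)] + sorted(points) + [(1.0, 1.0)]
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid between consecutive points
    return area


# Classifier-averaged (sensitivity, specificity) values quoted in the text.
reported = {"OA": (0.800, 0.750), "RA": (0.514, 0.966)}

for group, (sens, spec) in reported.items():
    point = roc_point(sens, spec)
    print(group, point, round(trapezoid_area([point]), 3))
```

With a single operating point, this area reduces to the mean of sensitivity and specificity, so it summarizes one threshold only and should not be read as a full ROC analysis.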


ACKNOWLEDGMENTS

We gratefully acknowledge financial support from the Carlos III Health Institute of Spain (ISCIII, FIS PI14/01538, PI12/00624, PI12/00329, PI14/01707, CIBER-CB06/01/0040, and RETIC-RIER-RD12/0009/0018), Fondos FEDER (EU), Junta Castilla y León (BIO/SA07/15), and Fundación Solórzano (FS-23-2015). The Proteomics Unit belongs to ProteoRed, PRB2-ISCIII, supported by grant PT13/0001 (ISCIII-Fondos FEDER). P.D. and C.D. are supported by a JCYL-EDU/346/2013 Ph.D. scholarship. The work of D.G.M. was supported by a collaboration scholarship (BOE-A-2014-3844) from the Spanish Ministry of Education, Culture, and Sports (MECD), and was awarded in the XIII Certamen Universitario Arquímedes (BOE-A-2014-12688, MECD). We also thank Javier Martín Vallejo and Manuel Muñoz Aguirre for useful discussions and statistical support.


SUPPORTING INFORMATION

The following files are available free of charge at the ACS website (http://pubs.acs.org):

Supporting Figure 1. TIFF images and design of contact (CT) and non-contact (NC) microarrays.
Supporting Figure 2. Distribution of the correlation of median antigen levels per patient group between array platforms.
Supporting Figure 3. Percentage of hits in the serum samples of each patient group.
Supporting Figure 4. Clustering analysis of the samples based on the levels of all antigens.
Supporting Table 1. Antibodies employed in the study.
Supporting Table 2. Classification statistics (validation phase).


REFERENCES

(1) Cross, M.; Smith, E.; Hoy, D.; Nolte, S.; Ackerman, I.; Fransen, M.; Bridgett, L.; Williams, S.; Guillemin, F.; Hill, C. L.; et al. The global burden of hip and knee osteoarthritis: estimates from the global burden of disease 2010 study. Ann. Rheum. Dis. 2014, 73 (7), 1323–1330.

(2) Henjes, F.; Lourido, L.; Ruiz-Romero, C.; Fernández-Tajes, J.; Schwenk, J. M.; Gonzalez-Gonzalez, M.; Blanco, F. J.; Nilsson, P.; Fuentes, M. Analysis of autoantibody profiles in osteoarthritis using comprehensive protein array concepts. J. Proteome Res. 2014, 13 (11), 5218–5229.

(3) Rousseau, J.-C.; Delmas, P. D. Biological markers in osteoarthritis. Nat. Clin. Pract. Rheumatol. 2007, 3 (6), 346–356.

(4) Ruiz-Romero, C.; Blanco, F. J. Proteomics role in the search for improved diagnosis, prognosis and treatment of osteoarthritis. Osteoarthritis Cartilage 2010, 18, 500–509.

(5) Nedelkov, D.; Kiernan, U. A.; Niederkofler, E. E.; Tubbs, K. A.; Nelson, R. W. Investigating diversity in human plasma proteins. Proc. Natl. Acad. Sci. U. S. A. 2005, 102 (31), 10852–10857.

(6) Balakrishnan, L.; Nirujogi, R.; Ahmad, S.; Bhattacharjee, M.; Manda, S. S.; Renuse, S.; Kelkar, D. S.; Subbannayya, Y.; Raju, R.; Goel, R.; et al. Proteomic analysis of human osteoarthritis synovial fluid. Clin. Proteomics 2014, 11 (1), 6.

(7) Kharaz, Y. A.; Tew, S. R.; Peffers, M.; Canty-Laird, E. G.; Comerford, E. Proteomic differences between native and tissue-engineered tendon and ligament. Proteomics 2016, 16 (10), 1547–1556.


(8) Hsueh, M.-F.; Khabut, A.; Kjellström, S.; Önnerfjord, P.; Kraus, V. B. Elucidating the Molecular Composition of Cartilage by Proteomics. J. Proteome Res. 2016, 15 (2), 374–388.

(9) Briggs, M. T.; Kuliwaba, J. S.; Muratovic, D.; Everest-Dass, A. V.; Packer, N. H.; Findlay, D. M.; Hoffmann, P. MALDI mass spectrometry imaging of N-glycans on tibial cartilage and subchondral bone proteins in knee osteoarthritis. Proteomics 2016, 16 (11–12), 1736–1741.

(10) Anderson, N. L.; Polanski, M.; Pieper, R.; Gatlin, T.; Tirumalai, R. S.; Conrads, T. P.; Veenstra, T. D.; Adkins, J. N.; Pounds, J. G.; Fagan, R.; et al. The human plasma proteome: a nonredundant list developed by combination of four separate sources. Mol. Cell. Proteomics 2004, 3 (4), 311–326.

(11) Gillette, M. A.; Mani, D. R.; Carr, S. A. Place of Pattern in Proteomic Biomarker Discovery. J. Proteome Res. 2005, 4 (4), 1143–1154.

(12) Ayoglu, B.; Häggmark, A.; Khademi, M.; Olsson, T.; Uhlén, M.; Schwenk, J. M.; Nilsson, P. Autoantibody profiling in multiple sclerosis using arrays of human protein fragments. Mol. Cell. Proteomics 2013, 12 (9), 2657–2672.

(13) González-González, M.; Bartolome, R.; Jara-Acevedo, R.; Casado-Vela, J.; Dasilva, N.; Matarraz, S.; García, J.; Alcazar, J. A.; Sayagues, J. M.; Orfao, A.; et al. Evaluation of homo- and hetero-functionally activated glass surfaces for optimized antibody arrays. Anal. Biochem. 2014, 450, 37–45.

(14) McWilliam, I.; Kwan, M. C.; Hall, D. Inkjet Printing for the Production of Protein Microarrays; 2011; pp 345–361.


(15) Gonzalez-Gonzalez, M.; Jara-Acevedo, R.; Matarraz, S.; Jara-Acevedo, M.; Paradinas, S.; Sayagües, J. M.; Orfao, A.; Fuentes, M. Nanotechniques in proteomics: protein microarrays and novel detection platforms. Eur. J. Pharm. Sci. 2012, 45 (4), 499–506.

(16) Lourido, L.; Calamia, V.; Mateos, J.; Fernández-Puente, P.; Fernández-Tajes, J.; Blanco, F. J.; Ruiz-Romero, C. Quantitative Proteomic Profiling of Human Articular Cartilage Degradation in Osteoarthritis. J. Proteome Res. 2014, 13 (12), 6096–6106.

(17) Fernández-Puente, P.; Mateos, J.; Fernández-Costa, C.; Oreiro, N.; Fernández-López, C.; Ruiz-Romero, C.; Blanco, F. J. Identification of a Panel of Novel Serum Osteoarthritis Biomarkers. J. Proteome Res. 2011, 10 (11), 5095–5101.

(18) Mateos, J.; Lourido, L.; Fernández-Puente, P.; Calamia, V.; Fernández-López, C.; Oreiro, N.; Ruiz-Romero, C.; Blanco, F. J. Differential protein profiling of synovial fluid from rheumatoid arthritis and osteoarthritis patients using LC–MALDI TOF/TOF. J. Proteomics 2012, 75 (10), 2869–2878.

(19) Díaz-Prado, S.; Cicione, C.; Muiños-López, E.; Hermida-Gómez, T.; Oreiro, N.; Fernández-López, C.; Blanco, F. J. Characterization of microRNA expression profiles in normal and osteoarthritic human chondrocytes. BMC Musculoskelet. Disord. 2012, 13 (1), 144.

(20) Evangelou, E.; Kerkhof, H. J.; Styrkarsdottir, U.; Ntzani, E. E.; Bos, S. D.; Esko, T.; Evans, D. S.; Metrustry, S.; Panoutsopoulou, K.; Ramos, Y. F. M.; et al. A meta-analysis of genome-wide association studies identifies novel variants associated with osteoarthritis of the hip. Ann. Rheum. Dis. 2014, 73 (12), 2130–2136.

(21) Altman, R.; Asch, E.; Bloch, D.; Bole, G.; Borenstein, D.; Brandt, K.; Christy, W.; Cooke, T. D.; Greenwald, R.; Hochberg, M. Development of criteria for the classification and reporting of osteoarthritis. Classification of osteoarthritis of the knee. Diagnostic and Therapeutic Criteria Committee of the American Rheumatism Association. Arthritis Rheum. 1986, 29 (8), 1039–1049.

(22) Aletaha, D.; Neogi, T.; Silman, A. J.; Funovits, J.; Felson, D. T.; Bingham, C. O.; Birnbaum, N. S.; Burmester, G. R.; Bykerk, V. P.; Cohen, M. D.; et al. 2010 rheumatoid arthritis classification criteria: an American College of Rheumatology/European League Against Rheumatism collaborative initiative. Ann. Rheum. Dis. 2010, 69 (9), 1580–1588.

(23) Häggmark, A.; Byström, S.; Ayoglu, B.; Qundos, U.; Uhlén, M.; Khademi, M.; Olsson, T.; Schwenk, J. M.; Nilsson, P. Antibody-based profiling of cerebrospinal fluid within multiple sclerosis. Proteomics 2013, 13 (15), 2256–2267.

(24) Díez, P.; Dasilva, N.; González-González, M.; Matarraz, S.; Casado-Vela, J.; Orfao, A.; Fuentes, M. Data Analysis Strategies for Protein Microarrays. Microarrays 2012, 1 (2), 64–83.

(25) Varas, M. J.; Vicente-Tavera, S.; Molina, E.; Vicente-Villardón, J. L. Role of canonical biplot method in the study of building stones: an example from Spanish monumental heritage. Environmetrics 2005, 16 (4), 405–419.

(26) Vicente-Villardón, J. L. MultBiplotR: Multivariate Analysis Using Biplots. R package version 0.2. 2015, http://biplot.usal.es/classicalbiplot/multbiplot-in-r

(27) Kuhn, M.; Wing, J.; Weston, S.; Williams, A.; Keefer, C.; Engelhardt, A.; Cooper, T.; Mayer, Z.; Kenkel, B.; the R Core Team; Benesty, M.; Lescarbeau, R.; Ziem, A.; Scrucca, L.; Tang, Y.; Candan, C. caret: Classification and Regression Training. R package version 6.0-70. 2016, https://CRAN.R-project.org/package=caret

(28) Balakrishnan, L.; Nirujogi, R.; Ahmad, S.; Bhattacharjee, M.; Manda, S. S.; Renuse, S.; Kelkar, D. S.; Subbannayya, Y.; Raju, R.; Goel, R.; et al. Proteomic analysis of human osteoarthritis synovial fluid. Clin. Proteomics 2014, 11 (1), 6.

(29) Campbell, K. A.; Minashima, T.; Zhang, Y.; Hadley, S.; Lee, Y. J.; Giovinazzo, J.; Quirno, M.; Kirsch, T. Annexin A6 interacts with p65 and stimulates NF-κB activity and catabolic events in articular chondrocytes. Arthritis Rheum. 2013, 65 (12), 3120–3129.

(30) Ikeda, Y.; Imai, Y.; Kumagai, H.; Nosaka, T.; Morikawa, Y.; Hisaoka, T.; Manabe, I.; Maemura, K.; Nakaoka, T.; Imamura, T.; et al. Vasorin, a transforming growth factor β-binding protein expressed in vascular smooth muscle cells, modulates the arterial response to injury in vivo. Proc. Natl. Acad. Sci. U. S. A. 2004, 101 (29), 10732–10737.


[Graphical abstract, image not reproduced. Labels: functionalized glass slides; selected antibodies; discovery (contact arrays) and validation (non-contact arrays); disease and control sera incubation; streptavidin-Cy revealing; positive biomarker candidates.]


Figure 1. [Image not reproduced. Workflow panels: 1. Glass slides; 2. Surface functionalization (NH3+ groups); 3. Selected antibodies; 4. Array printing, a) CT (discovery phase), b) NC (validation phase); 5. QC; 6. Patient sera samples (RA, OA, C); 7. Protein biotinylation; 8. Incubation and revealing; 9. Scanning and image processing.]


Figure 2. [Image not reproduced. Panel A: immunoreactivity plots for the discovery phase and the validation phase; y-axis: median normalized intensity of hits; x-axis: group (C, OA, RA). Panel B: scatter plots of the percentage of positive samples for RA vs C, OA vs C, and OA vs RA; axes: C [%], OA [%], and RA [%], each from 0 to 100.]


Figure 3. [Image not reproduced. Panel A: canonical biplots. Discovery phase: canonical axis I (61.75 %) and canonical axis II (38.25 %); groups C, OA, RA; vectors ITGAM, VASN, TNF, SPARCL1, TGFBI. Validation phase: canonical axis I (55.92 %) and canonical axis II (44.08 %); groups C, OA, RA; vectors VASN, ANXA6, FBN1, IL1RAP, COL1A1, SLC11A1, PLTP. Panel B: clustered heatmaps for C and OA (IL1RAP, PLTP, SLC11A1), RA and C (IL1RAP, FBN1, VASN, ANXA6, COL1A1), and RA and OA (PLTP, SLC11A1, ANXA6, FBN1, VASN).]


Figure 4. [Image not reproduced. Box plots titled "Differential proteins (FDR < 0.01, validation phase)"; panels: ANXA6, COL1A1, FBN1, IL1RAP, PLTP, SLC11A1, VASN; y-axis: normalized intensity of hits (log2), from 0 to 4; groups: C, OA, RA.]


Figure 5. [Image not reproduced. ROC plot; x-axis: 1 − specificity (0.0 to 1.0); y-axis: sensitivity (0.0 to 1.0); legend: disease groups OA and RA; classifiers MLR, SVM, and NN.]