ARTICLE pubs.acs.org/ac

Characterization of Differences between Blood Sample Matrices in Untargeted Metabolomics

Judith R. Denery,* Ashlee A. K. Nunes, and Tobin J. Dickerson*

Department of Chemistry and Worm Institute for Research and Medicine, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States

Supporting Information

ABSTRACT: Large-scale proteomic and metabolomic technologies are increasingly gaining attention for their use in the diagnosis of human disease. In order to ensure the statistical power of relevant markers, such analyses must incorporate a large number of representative samples. While in a best-case scenario these samples are collected through a study design that is specifically tailored for the desired analysis, often studies must rely upon the analysis of large numbers of previously banked samples that may or may not have complete and accurate documentation of their associated collection and storage methods. In this study, several human blood matrices were analyzed and compared for the quality of metabolomic output. The sample types that were tested include plasma prepared with a variety of anticoagulants and serum collected by venipuncture and capillary blood collection protocols. Analysis with liquid chromatography-mass spectrometry (LC-MS) revealed only subtle differences between the various plasma preparation methods. Differences between the serum and plasma samples appear to be largely peptide/protein-based and are consistent with the biological distinction of the two matrices. Interestingly, the small molecule lysophosphatidylinositol was found to be in higher abundance in plasma, as a possible consequence of the effect of the intrinsic clotting cascade on adjacent metabolic pathways. Comparison of the small-molecule profiles of the capillary- and venipuncture-collected samples revealed 23 statistically significant compound differences between these sample types. Most of these features can be attributed to surfactants and detergents used to pretreat the skin in order to maintain the sterility of sample collection. However, several have the same mass and molecular formula as endogenous human metabolites and could be erroneously attributed to actual metabolic perturbations. Understanding the extent of these matrix effects is important for control of systematic bias and ensuring the quality of metabolomic analysis.

Received: October 28, 2010. Accepted: December 8, 2010. Published: December 22, 2010.
© 2010 American Chemical Society
dx.doi.org/10.1021/ac102806p | Anal. Chem. 2011, 83, 1040–1047

Intuitively, a comprehensive approach to studying the downstream, posttranslational processes of an organism is a logical means of studying the effects of disease. Metabolomics, or the measurement of all the metabolites present in an organism, and metabolite profiling, in which a smaller subset of metabolites is measured, have become established as useful tools in the “real-time” measurement of organismal metabolism. As metabolomics technologies advance (e.g., analytical equipment, data processing, and statistical software), the capacity of these techniques is increasingly being realized for use as a tool in the diagnosis of human disease.1-5 The most successful example of the use of metabolomics for disease diagnosis is the widely adopted screening for multiple metabolic disorders in newborn blood.6-8 Recent studies have demonstrated the utility of metabolomics as a diagnostic tool in various types of cancer, diabetes, and coronary heart disease5,9 and a growing number of infectious diseases.10,11

As a starting point, the blood matrix serves as an easy-to-obtain, chemically complex, data-rich matrix for metabolite analysis.12,13 Historically, clinical trials and epidemiological studies have tended to collect blood specimens as a method for assessing overall patient health, leading to an available supply of banked serum/plasma samples. Small-molecule analysis of whole blood is technically feasible, yet disfavored, presumably due to the increased complexity of the matrix by including extracellular and intracellular pools of molecules in the same analysis. Furthermore, the process of cell lysis and isolation can lead to significant quantities of membrane components that further complicate identification of relevant biomarkers. Thus, it is the blood-derived serum and plasma matrices that have been the focus of the majority of metabolomic studies to date.

Often, the intrinsic differences between the two biofluids are overlooked; however, inherent to their preparation, serum and plasma are quite different. Serum can usually be prepared quite simply through utilization of the complex enzymatic reactions of the blood coagulation cascade, involving the activation of numerous glycoproteins resulting in the production of a polymeric fibrin- and platelet-based blood clot. In the presence of an anticoagulant [i.e., heparin, sodium citrate, ethylenediaminetetraacetic acid (EDTA)], the fibrinogen-initiated clotting cascade is not activated and no clot is formed, resulting in plasma. Therefore, it is expected that the proteins involved with the coagulation cascade are still present. Depending on the type of biochemical assay to be performed, specific analysis of either serum or plasma may be preferable. Additionally, the choice of anticoagulant may have an effect on the quality of the data and potentially result in analytical bias. For example, citrate is preferred for the analysis of matrix metalloproteinases (MMPs) and tissue inhibitor of matrix metalloproteinases (TIMP).14 In situations where sample volumes must be maximized, plasma samples may be preferable since, in comparison to serum, a greater volume of plasma can be obtained from the same initial volume of whole blood. Depending on the analytical platform, however, it is possible that technical issues with the presence of the anticoagulant can complicate analysis (e.g., competing ionization in MS, an overwhelming signal in NMR).

While there are numerous advantages with the metabolomics of these whole-blood-derived biofluids for disease diagnosis, the collection of samples from a sizable population, such as that necessary for ensuring the statistical power required for metabolome-wide association studies, can be an operational challenge. In a setting with a large number of potential research subjects, the volume of blood that must be withdrawn, as well as the time required for the route of puncture and sample withdrawal, must all be kept at a minimum. Obtaining filter paper blood spots, in which patient blood is evenly applied to a filter paper, dried, and then extracted with solvent, is one approach. Another is low-volume collection of capillary blood through a finger prick.
Unfortunately, there are numerous potential drawbacks to the filter paper analysis of blood samples for metabolomics, including: the difficulty with extracting metabolites that may have bound directly or indirectly to chemical components of the paper, the presence of background interference attributable to the filter paper, the loss of labile metabolites during drying and extraction conditions, and the potential for errors of quantitation caused by uneven application of blood. While capillary whole-blood collection avoids some of these pitfalls, this biofluid has yet to be tested for its utility in metabolomic analysis. Increased attention has been paid toward the introduction of standards for conducting and reporting research in the field of metabolomics.15,16 Recent studies have raised the issue of the effect of analytical and experimental bias on the results of both small-molecule and proteomic biomarker detection approaches.17,18 Interfering compounds that arise from external factors such as the method with which a sample is collected, stored, handled, or analyzed can introduce erroneous results that are not necessarily relevant to the focus of the study and can interfere with the identification of potentially relevant, biologically significant markers. In this study, a metabolomic analysis was conducted to compare the effect of different blood matrices on the quality of resultant data. Initially, the variable of plasma preparation anticoagulant type was optimized and then used for the preparation of matched plasma and serum samples collected from the same donor individuals. The quality of the metabolomic output from blood collected through both capillary and venipuncture withdrawal under simulated point-of-care conditions was also analyzed by liquid chromatography (LC) and both positive and negative electrospray ionization modes with time-of-flight (TOF) mass spectrometry (MS). 
Statistical comparison of the data revealed a number of significant distinguishing compounds between the various sample types. These compounds could be used to illustrate the intrinsic matrix effects between these heterogeneous biofluids and serve to highlight considerations that must be made for minimizing experimental bias in metabolomic studies.


METHODS

Sample Collection. Blood samples used in this study were collected from healthy consenting adult donors through The Scripps Research Institute Normal Blood Donor Service. The research use of human blood samples in this study was approved by the Scripps Health Human Subjects Committee. All patient codes have been removed in this publication. Samples used for the serum/plasma comparison were collected from 12 donors. From each donor, venous blood (10 mL) was collected into heparinized (plasma) and nonheparinized (serum) Vacutainer tubes. After collection, samples were allowed to clot (∼30 min) and then placed on ice (∼20 min). The samples were aliquoted into 2 mL Eppendorf tubes, without disrupting the clot for the untreated tubes. After centrifugation for 5 min at 13780g, serum/plasma was removed in 100 μL aliquots and stored at -80 °C until sample preparation.

Samples used for the venipuncture/capillary comparison were collected from nine donors. For each donor, the capillary collection was performed first, followed by the venipuncture blood collection. The protocol for capillary blood collection involved the use of 1.5 mm × 2.0 mm BD Microtainer retractable lancets (BD Scientific, Franklin Lakes, NJ) and was based upon the manufacturer's instructions and the approved standard for capillary blood collection.19 Prior to puncture, the site was warmed, cleaned with a 70% isopropyl alcohol wipe, and allowed to air-dry. Once the puncture was performed with the pressure-activated lancet, the first drop of blood was allowed to form and then wiped away with clean gauze. The remaining blood was collected into a BD Microtainer tube (BD Scientific, Franklin Lakes, NJ) containing no additive. The tube was filled to 500 μL or until blood flow stopped.
To mimic an optimized transport protocol under anticipated point-of-care conditions, the blood collection tubes for the capillary/venipuncture samples were allowed to clot upon collection, placed on ice for 3–4 h, centrifuged at 13780g, and transferred to Eppendorf tubes in 100 μL aliquots. Subsequent steps of sample preparation and metabolite extraction were identical to those used for the serum/plasma analysis.

Liquid Chromatographic-Electrospray Ionization Mass Spectrometric Analysis. Experiments were performed with an electrospray ionization time-of-flight (ESI-TOF) MS (Agilent 1200 LC, TOF 6210, Agilent Technologies, Santa Clara, CA). Each sample analysis consisted of an 8 μL injection of extracted sample with chromatographic separation across a reverse-phase C18 column (Zorbax 300SB C18 microbore rapid resolution column, 3.5 μm, 1 mm × 150 mm; Agilent Technologies, Santa Clara, CA) held constant at 30 °C within a thermostated column compartment. The capillary pump flow rate was set to 75 μL/min. For positive-mode analysis, the solvent system was composed of mobile phase A (water with 0.1% formic acid) and mobile phase B (acetonitrile with 0.1% formic acid). For negative-mode analysis, the solvent system was mobile phase A (5 mM ammonium acetate in 95% water and 5% acetonitrile) and mobile phase B (acetonitrile). Each sample was analyzed over a 60 min run time.

Data Preprocessing and Statistical Analysis. All mass spectral data were collected in .d format by use of Mass Hunter Qualitative (MHQ) software version B.03.01 (Agilent Technologies, Santa Clara, CA) and then converted from .d to .cef file format. Statistical comparison was conducted with Mass Profiler Professional (MPP) version 2.0 (Agilent Technologies, Santa Clara, CA). For targeted analysis, the compounds of interest from MPP were confirmed from the raw data files in MHQ via a recursive workflow, followed by export back to MPP and another iteration of the data analysis with identical settings. Preliminary compound identification was conducted by use of the MPP ID Browser. MPP results were used to create proportional Venn diagrams with the Pacific Northwest National Laboratories freely downloadable Venn Diagram Plotter.20
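A proportional Venn diagram of two sample types needs only three numbers: the compounds unique to each matrix and the compounds shared by both. The following minimal Python sketch illustrates that bookkeeping with set operations. It is an illustration under stated assumptions, not the MPP or Venn Diagram Plotter workflow; the function name `venn_counts` and the compound identifiers are hypothetical and are not data from this study.

```python
def venn_counts(serum_ids, plasma_ids):
    """Return the three region sizes of a two-set Venn diagram."""
    serum, plasma = set(serum_ids), set(plasma_ids)
    return {
        "serum_only": len(serum - plasma),   # detected only in serum
        "shared": len(serum & plasma),       # detected in both matrices
        "plasma_only": len(plasma - serum),  # detected only in plasma
    }

# Hypothetical aligned-compound identifiers (illustrative, not study data).
serum_features = ["c1", "c2", "c3", "c4"]
plasma_features = ["c3", "c4", "c5"]

counts = venn_counts(serum_features, plasma_features)
print(counts)
```

The three counts can then be passed to any plotting tool that scales circle areas to set sizes.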

RESULTS AND DISCUSSION

Recent research has emphasized the development of metabolomic approaches for disease diagnostics. Small-molecule/metabolite-based tests hold the potential for reflecting a more accurate picture of infection status; a metabolite profile serves as a comprehensive measure of the effects of posttranslational modification and regulation. Furthermore, small molecules diffuse easily relative to peptides and proteins, are frequently constitutively produced (e.g., excretory-secretory products), and are inherently nonimmunogenic in vivo, thus avoiding some of the technical challenges associated with DNA- and protein-based diagnostics. In approaching a metabolomics study of human samples, it is important to minimize potential variability between subject populations. While standardization of the collected samples would be ideal, it is frequently impractical. Factors such as diet,12,21 exercise,22 gender, and age23 have the potential to introduce bias in a study. In situations where the identification of generalizable disease markers is desired, the best alternative is to incorporate a wide variety of subject populations in the study design in order to minimize the effects of nonrelevant metabolic variation and magnify those metabolic differences that are not only statistically significant between specific populations but also relevant in identifying the changes in metabolism that can be directly attributable to disease.24 Unfortunately, access to samples collected by identical protocols is not always feasible.

Sample Preparation Standards. Whether designing or analyzing and interpreting a metabolomics experiment, factors such as sample collection, handling, storage, and preparation can have a tremendous impact on the quality of data output.
While universal standards for conducting and reporting metabolomics experiments are still being developed,16,25 standard operating procedures (SOPs) for proteomics experiments proposed by the Early Detection Research Network (EDRN)26 and the Human Proteome Organisation (HUPO) Plasma Proteome Project27 have been established and can easily be generalized for a metabolomics study. Therefore, our experimental design was based upon the precepts of these proposed proteomics SOPs for sample collection and processing. These standards included consistency of blood collection tube types, clot formation conditions, use of a consistent holding temperature with minimal time until processing, and the flash-freezing and storage of aliquoted samples. Both sets of SOPs suggest that the choice of anticoagulant is dependent upon how the samples will be processed downstream.

Plasma Preparation. Ideally, all of the plasma samples being analyzed within a particular study would be prepared with the same type of anticoagulant. We hypothesized that the specific anticoagulant used for plasma preparation could affect the resulting metabolic profile. For example, while heparin functions as an anti-thrombin activator, both EDTA and sodium citrate act to chelate calcium ions, cofactors necessary for enzyme activation in the coagulation cascade. By affecting different stages of the clotting cascade, different molecules can accumulate, resulting in dramatically different metabolic profiles. In this experiment, three separate 100 μL aliquots of each plasma sample type were analyzed in both positive and negative ionization modes by ESI-TOF MS. The analysis yielded a larger number of features in positive mode with comparable coefficients of variation (CV) for the three biological replicates (separate aliquots collected from the same donor) (Table S-1, Supporting Information). Although the number of detectable features does not directly correlate to absolute data output quality, this measure can be utilized as a first-level assessment of relative data quality across a sample set. The difference in the number of features between the heparin and sodium citrate anticoagulants was small, indicating little difference between the effects of these anticoagulants on metabolic output. On the basis of this information and the consideration that the majority of accessible clinical plasma samples are collected with heparin as an anticoagulant, it was decided that blood collection with heparinized tubes would be used for all subsequent plasma analysis. Certainly each anticoagulant type will have a specific and unique biochemical interaction with the blood matrix.27 This biochemical effect must be considered within the context of the metabolic question under study. For example, in a targeted metabolomic analysis, anticoagulant type may have a profound effect on the quality of the resultant data, and this variable should be investigated in the early steps of experimental design.

Serum versus Plasma Matrix Distinction. Due to the physical and chemical processes essential to coagulation, there are inherent biochemical differences between serum and plasma.
These distinctions have been investigated by LC-MS technologies on a protein level28,29 and have been analyzed for metabolomics applications by NMR17 and GC-MS.30 However, relative to these analytical methods, LC-MS approaches can have increased sensitivity and compatibility with aqueous samples; as such, we applied this method to more clearly delineate the differences in biofluids at the small-molecule level. Twelve matched serum and heparinized plasma samples were analyzed within separate analytical sequences by ESI-TOF MS in both positive and negative ionization modes. To test the overall reproducibility of the analysis, all mass feature intensity values were compared across triplicate injections of a single serum sample analyzed throughout the course of each of the analytical sequences. The coefficient of variation was found to be 14.4% for positive-mode and 13.4% for negative-mode analysis. These values are comparable to previous studies of analytical variation within plasma and serum analysis by our laboratory and consistent with a number of other LC-MS-based metabolomics studies.31,32

The Mass Profiler Professional (MPP) software comparison between the matched serum and plasma samples uncovered a large number of aligned compounds (Table 1). Initial threshold settings of the centroided data were chosen to be as inclusive as possible of potentially valuable features of interest. Under such liberal conditions, the number of aligned compounds is most likely inflated due to the presence of noise, random interference, or potentially misaligned compounds. Further filtration of the data set was conducted in order to include only those features that are of good mass spectral quality and consistency (e.g., a mass was detected in at least two of the total number of samples). Analysis of the filtered data set revealed that the majority of compounds are common to both blood derivatives (Figure 1).
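The reproducibility figure quoted above is a coefficient of variation, that is, the standard deviation of a feature's intensity across replicate injections divided by its mean, expressed as a percentage. The minimal Python sketch below shows one way such a value could be computed. It makes two assumptions that the text does not state: the sample standard deviation (n - 1 denominator) is used, and the per-feature values are summarized by their mean. The function names and the intensity values are hypothetical, not data from this study.

```python
from statistics import mean, stdev

def percent_cv(intensities):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    m = mean(intensities)
    if m == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return 100.0 * stdev(intensities) / m

def mean_percent_cv(feature_table):
    """Average the per-feature CVs (assumed summary; the text does not specify)."""
    return mean(percent_cv(values) for values in feature_table.values())

# Hypothetical intensities for triplicate injections of one sample.
replicates = {
    "feature_A": [100.0, 110.0, 90.0],
    "feature_B": [2000.0, 2200.0, 1800.0],
}

overall_cv = mean_percent_cv(replicates)
print(f"mean CV across features: {overall_cv:.1f}%")
```

Because the CV is scale-free, low- and high-abundance features contribute on equal footing, which is why it suits a comparison across thousands of features with very different intensities.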

Table 1. Results of Statistical Comparison between Blood Sample Types in Both Positive- and Negative-Mode Analysis of Serum and Plasma Samples and Serum Collected through Capillary and Venipuncture Withdrawal

comparison | serum vs plasma | serum vs plasma | capillary vs venipuncture | capillary vs venipuncture
ionization mode | + | - | + | -
data files | 24 | 24 | 18 | 18
total aligned compounds | 26 170 | 29 550 | 11 063 | 7508
compounds present in at least 2 samples | 7549 | 11 631 | 3029 | 3508
compounds present in percentage of one of the sample conditions | 327 (100%) | 318 (100%) | 483 (66%) | 858 (66%)
compounds passing p-value and >2-fold change filters | 46 | 8 | 29 | 45
PEGylated compounds removed | | | 12 |
compounds present after targeted recursion | 44 | 3 | 14 | 9

Figure 1. Proportional Venn diagrams display a representational view of the results of both positive and negative ionization modes for (A) compounds present in at least two of the total 24 serum and plasma samples compared and (B) compounds present after targeted analysis.
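The frequency filters summarized in Table 1 (detection in at least two samples overall, and detection in 66% or 100% of the samples of at least one condition) can be expressed in a few lines. The Python sketch below is only an illustration of that logic under stated assumptions; it is not the MPP implementation, and the function names, sample names, and detection pattern are hypothetical.

```python
def detected_in_min_samples(detected, min_samples=2):
    """True if the feature was detected in at least `min_samples` samples overall."""
    return sum(1 for hit in detected.values() if hit) >= min_samples

def passes_frequency_filter(detected, groups, min_fraction):
    """True if the feature is detected in at least `min_fraction` of the
    samples of at least one condition (for example 0.66 or 1.0)."""
    for samples in groups.values():
        hits = sum(1 for s in samples if detected.get(s, False))
        if hits / len(samples) >= min_fraction:
            return True
    return False

# Hypothetical design: three samples per condition (illustrative only).
groups = {"serum": ["s1", "s2", "s3"], "plasma": ["p1", "p2", "p3"]}

# Hypothetical detection pattern for a single feature.
detected = {"s1": True, "s2": True, "s3": False,
            "p1": False, "p2": False, "p3": False}

print(detected_in_min_samples(detected))               # present in >= 2 samples
print(passes_frequency_filter(detected, groups, 0.66))  # 2 of 3 serum samples
print(passes_frequency_filter(detected, groups, 1.0))   # no condition is complete
```

Applying the per-condition test rather than a test on the pooled sample set retains features that are consistently present in one matrix and absent from the other, which are the features of interest in a matrix comparison.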

The inclusion of features that are present in a certain percentage (66% or 100%) of samples helps to focus the total number of statistically significant features and ensures that those features used for downstream data processing are indeed features that are consistent across a majority of the sample set. Further statistical requirements (>2-fold change and p-value