
Article

A methodology for the validation of isotopic analyses by mass spectrometry in stable-isotope labelling experiments Maud Heuillet, Floriant Bellvert, Edern Cahoreau, Fabien Letisse, Pierre Millard, and Jean-Charles Portais Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.7b03886 • Publication Date (Web): 20 Dec 2017 Downloaded from http://pubs.acs.org on December 21, 2017


Analytical Chemistry

Maud Heuillet(1,2), Floriant Bellvert(1,2), Edern Cahoreau(1,2), Fabien Letisse(1,3), Pierre Millard(1) and Jean-Charles Portais(1,2,3)

(1) LISBP, Université de Toulouse, CNRS, INRA, INSA, Toulouse, France.
(2) MetaToul-MetaboHUB, National infrastructure of metabolomics and Fluxomics
(3) Université Paul Sabatier, Université de Toulouse, Toulouse, France.

ABSTRACT: Stable-isotope labeling experiments (ILEs) are widely used to investigate the topology and operation of metabolic networks. The quality of isotopic data collected in ILEs is of utmost importance to ensure reliable biological interpretations, but current evaluation approaches are limited due to a lack of suitable reference material and relevant evaluation criteria. In this work, we present a complete methodology to evaluate mass spectrometry (MS) methods used for quantitative isotopic studies of metabolic systems. This methodology, based on a biological sample containing metabolites with controlled labeling patterns, exploits different quality metrics specific to isotopic analyses (accuracy and precision of isotopologue masses, abundances, and mass shifts, and isotopic working range). We applied this methodology to evaluate a novel LC-MS method for the analysis of amino acids, which was tested on high resolution (Orbitrap operating in fullscan mode) and low resolution (triple quadrupole operating in MRM mode) mass spectrometers. Results show excellent accuracy and precision over a large working range and revealed matrix-specific as well as mode-specific characteristics. The proposed methodology can identify reliable (and unreliable) isotopic data in an easy and straightforward way, and efficiently supports the identification of sources of systematic biases as well as of the main factors that influence the overall accuracy and precision of measurements. This approach is generic and can be used to validate isotopic analyses on different matrices, analytical platforms, labeled elements, or classes of metabolites. It is expected to strengthen the reliability of isotopic measurements and thereby the biological value of ILEs.

INTRODUCTION

Stable isotope labeling experiments (ILEs) are widely used to investigate metabolic systems, e.g. to identify metabolic pathways1 and metabolites2, to assist the discovery of novel regulatory interactions3, to quantify the metabolic response to perturbations4, to profile metabolic variants5, or to quantify the control exerted by enzymes on fluxes.6,7 Mass spectrometry (MS) is a method of choice for measuring the isotopic incorporation in molecules collected during ILEs.8 MS distinguishes molecular entities of a given molecule according to the number of each isotope incorporated (i.e. isotopologues), from which the isotopic content of the molecule is determined in terms of isotopologue distribution.9 The reliability of isotopic measurements determines the quality of the isotopic information and, hence, of its biological interpretation. Biases can occur at every level of the analytical workflow (e.g. during cultivation, sampling, analysis, data treatment).10–15 This may impact the quality of isotopic data and ultimately jeopardize the biological value of ILEs.

It is therefore essential to assess, both quantitatively and qualitatively, the analytical platforms used for isotopic analyses.16,17 This is usually done by evaluating accuracy and precision, two criteria classically considered for the evaluation of quantitative metabolomics platforms.18,19 However, additional criteria specific to isotopic measurements (e.g. linearity within the isotopic cluster, or limit of quantification of the carbon isotopologue distribution, CID) should be assessed. As another important requirement, validation must be performed on each isotopologue of each molecule of interest. This excludes the use of standards at natural abundance, in which only the lightest isotopologues can be detected and quantified.17,20 Another approach consists in using mixtures of commercially available labeled compounds,17,21 but not all the isotopic species of a particular compound are available, some are very expensive, and some sources of bias (e.g. related to the matrix or to sample preparation) cannot be investigated. An interesting alternative consists in the biological production of a labeled reference sample that contains all isotopic species in a predictable and controllable amount. The complete conceptual and practical framework for the production of this sample was published by Millard et al.8 Briefly, it is obtained from a 13C-labelling experiment in which the actual distribution of the isotope in the various metabolites does not depend on the operating metabolic pathways but depends strictly on the composition of the label input. This biological sample is perfectly suited for the evaluation of methods for isotopic measurement.

In this work, we exploit such a reference sample to establish a generic workflow for quality assessment of MS platforms dedicated to isotopic analyses. The proposed workflow includes: i) evaluation of analytical methods using unlabeled material and standard metrics, ii) the production of suitable labeled reference material, and iii) the validation of the method for isotopic measurement using the labeled reference material and specific criteria that we introduce here. For demonstration purposes, the proposed workflow was applied to an LC-MS method developed for isotopic analyses of underivatized amino acids, considering two different matrices (cellular extracts and proteinogenic amino acids) and two LC-MS platforms (i.e. high resolution MS in the fullscan mode and low resolution MS in the MRM mode) widely used in metabolomics.2–4,17–21 The results illustrate how the present workflow can support the identification of sources of systematic biases as well as of the main factors that influence the quality of isotopic measurements.

EXPERIMENTAL SECTION

Chemicals and Reagents. All unlabeled amino acids, formic acid (98% LC-MS grade), acetonitrile (> 99.9% LC-MS grade) and methanol (> 99.9% LC-MS grade) were purchased from Sigma-Aldrich (St. Louis, MO, USA). 12C-acetate was obtained from VWR International GmbH (Darmstadt, Germany) and 1-13C-, 2-13C- and U-13C-acetate (13C-purity > 99%) were obtained from Eurisotop (St. Aubin, France). MilliQ water was used to prepare all samples, extraction solutions and mobile phases.

Organism and Culture. Escherichia coli K-12 MG1655 was grown on minimal medium containing 5 mM KH2PO4, 10 mM Na2HPO4, 9 mM NaCl, 40 mM NH4Cl, 0.8 mM MgSO4, 0.1 mM CaCl2, 0.3 mM thiamine and 45 mM 13C-acetate as the sole carbon source. Cultivation was carried out in a Multifors bioreactor (Infors HT, Bottmingen-Basel, Switzerland) with 500 mL of medium following the protocol developed by Millard et al.8 The pH was maintained at 7.0 by adding the appropriate volume of 2 M HCl. Cell growth was monitored by measuring optical density at 600 nm with a Genesys 6 spectrophotometer (Thermo, Carlsbad, CA, USA). The amount of biomass in each sample was calculated using a coefficient of 0.37 g of cell dry weight per OD unit.

Analysis of 13C-Labeled Acetate by NMR. The 13C-labeled substrate was obtained by mixing the four different isotopic forms of acetate in equal proportions, i.e. 25 % of 12C-acetate, 25 % of 1-13C-acetate, 25 % of 2-13C-acetate and 25 % of U-13C-acetate. The isotopic composition of the 13C-labeled substrate was controlled by quantitative 1H-NMR before use. NMR spectra were recorded with an Avance 500 MHz spectrometer (Bruker, Rheinstetten, Germany) at 298 K, using a 30° pulse and a relaxation delay of 20 s. The proportion of each isotopic form was quantified by fitting using a mixed Gaussian-Lorentzian model, as described in Millard et al.6
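As a worked check of this substrate design, the 13C enrichment at each carbon position of acetate can be computed from the proportions of the four isotopic forms. This is a minimal sketch: the function name `positional_enrichment` and the dictionary encoding of the forms are illustrative assumptions, not part of the original protocol.

```python
# Each acetate form is encoded as (label on C1, label on C2): 1 = 13C, 0 = 12C.
# This encoding is a hypothetical convenience for the example.
ACETATE_FORMS = {
    "12C": (0, 0),
    "1-13C": (1, 0),
    "2-13C": (0, 1),
    "U-13C": (1, 1),
}

def positional_enrichment(fractions):
    """Return the 13C enrichment at C1 and C2 for a mixture of acetate forms."""
    total = sum(fractions.values())
    c1 = sum(f * ACETATE_FORMS[name][0] for name, f in fractions.items()) / total
    c2 = sum(f * ACETATE_FORMS[name][1] for name, f in fractions.items()) / total
    return c1, c2

equimolar = {"12C": 0.25, "1-13C": 0.25, "2-13C": 0.25, "U-13C": 0.25}
print(positional_enrichment(equimolar))  # (0.5, 0.5)
```

With equal proportions of the four forms, each carbon position carries 13C with a probability of 0.5, independently of the other position.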

Preparation of the Reference Material. Free intracellular amino acids (FA-PT sample) were sampled by fast filtration.22 Briefly, 1 mL of broth at OD600nm=1.7 (corresponding to 0.65 mg of biomass) was filtered (Sartolon polyamide, 0.2 µm, Sartorius, Goettingen, Germany). The filter was rapidly plunged into a precooled centrifuge tube maintained at -20 °C and containing 5 mL of acetonitrile/methanol/H2O (2:2:1) with 0.1 % formic acid, and incubated for 20 min at -20 °C. The cellular extract was centrifuged (7000 g, -20 °C, 15 min) to remove proteins and cell debris. The supernatant was then evaporated with a Rotavapor RII (Buchi, Flawil, Switzerland), re-suspended in 250 µL of MilliQ water and stored at -80 °C. To sample proteinogenic amino acids (PA-PT sample), the pellet obtained with the cellular extract was hydrolyzed with 4 mL of 6 N HCl overnight at 110 °C.23 Samples were evaporated and rinsed 3 times with 4 mL of MilliQ water. After centrifugation (10 min, 12000 g), the supernatant was stored at -80 °C. These solutions, expressed as µg of biomass, correspond to the injected amounts of proteinogenic amino acids obtained after hydrolysis of 10, 5, 2.5, 1.25, 0.66, 0.16 and 0.08 µg of biomass.

Determination of Carbon Isotopologue Distributions by LC-MS Analysis. Amino acids were separated on a PFP column (150 × 2.1 mm i.d., particle size 5 µm; Supelco, Bellefonte, PA, USA). Solvent A was 0.1 % formic acid in H2O and solvent B was 0.1 % formic acid in acetonitrile, at a flow rate of 250 µL/min. The gradient was adapted from the method used by Boudah et al.24 Solvent B was varied as follows: 0 min: 2 %, 2 min: 2 %, 10 min: 5 %, 16 min: 35 %, 20 min: 100 % and 24 min: 100 %. The column was then equilibrated for 6 min at the initial conditions prior to the next sample analysis. The injection volume was 20 µL.

Low resolution MS experiments were performed with a 1290 UHPLC system (Agilent) coupled to a 6460 Triple Quadrupole (Agilent). MS analyses were carried out in the positive mode with an electrospray ionization probe with the following source parameters: source temperature was set at 275 °C, nebulizer gas flow was 11 L/min, nebulizer pressure was 40 psi, sheath gas temperature was 300 °C, sheath gas flow was 12 L/min and capillary voltage was ± 4000 V. Dwell time was set at 20 ms for absolute quantification and at 10 ms for isotopic measurement. Compound-dependent parameters were determined individually. Isotopic clusters of molecular ions [M+H]+ were quantified in the MRM mode. Transitions for each compound were selected as a compromise between the most abundant fragments (to increase sensitivity) and fragments with a low number of carbons (to limit the number of transitions to be monitored). The selected transitions were optimized with the Agilent MassHunter Optimizer software and are reported in Supporting Information S-1. The nature of the fragments generated was determined from the literature.25–27

High-resolution experiments were performed with an Ultimate 3000 HPLC system (Dionex, CA, USA) coupled to an LTQ Orbitrap Velos mass spectrometer (Thermo Fisher Scientific, Waltham, MA, USA) equipped with a heated electrospray ionization probe. MS analyses were performed in the positive FTMS mode at a resolution of 60 000 (at 400 m/z) in fullscan mode, with the following source parameters: capillary temperature was 275 °C, source heater temperature was 250 °C, sheath gas flow rate was 45 a.u. (arbitrary unit), auxiliary gas flow rate was 20 a.u., S-Lens RF level was 40 % and source voltage was 5 kV. Isotopic clusters were determined by extracting the exact mass of all isotopologues, with a tolerance of 5 ppm (Supporting Information S-2).
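The extraction of isotopologue masses with a ppm tolerance can be illustrated as follows. This is only a sketch under stated assumptions: the function name `isotopologue_windows` is hypothetical, the atomic masses are standard monoisotopic values rather than values taken from the paper, and alanine is used purely as an example. The actual extraction lists are those of Supporting Information S-2.

```python
# Monoisotopic masses in u (standard values, not taken from the paper).
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646      # mass of H+ (hydrogen atom minus one electron)
DELTA_13C = 1.0033548    # mass difference between 13C and 12C

def isotopologue_windows(formula, tol_ppm=5.0):
    """Return (k, m/z, low, high) for each carbon isotopologue of [M+H]+."""
    mono = sum(MASS[el] * count for el, count in formula.items()) + PROTON
    windows = []
    for k in range(formula["C"] + 1):
        mz = mono + k * DELTA_13C             # k carbons replaced by 13C
        half_width = mz * tol_ppm * 1e-6      # ppm tolerance converted to m/z
        windows.append((k, mz, mz - half_width, mz + half_width))
    return windows

alanine = {"C": 3, "H": 7, "N": 1, "O": 2}    # neutral alanine, C3H7NO2
for k, mz, low, high in isotopologue_windows(alanine):
    print(f"M{k}: {mz:.4f} ({low:.4f} to {high:.4f})")
```

At m/z 90, a 5 ppm tolerance corresponds to a window of less than ±0.0005 m/z around each isotopologue.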

Data Processing. Experimental CIDs were obtained after correction of raw MS data for naturally occurring isotopes other than carbon, using IsoCor.28 Elemental formulas used to construct the correction matrices were adapted to the mass resolution of each dataset. For each compound, the correction formula includes only elements that give rise to the measured isotopic cluster. For example, since carbon isotopologues (CI) were separated from nitrogen isotopologues (NI) in HRMS analysis, the natural abundance of 15N isotopes was not considered in this correction step. The predicted CIDs were calculated using the equation8:

$$M_k = \binom{n}{k} \, p^{k} \, (1 - p)^{n-k} \qquad \text{eq. (1)}$$

where M_k is the predicted abundance of the isotopologue having k 13C atoms, n is the total number of carbon atoms in the molecular entity and p is the abundance of 13C isotopes. p was calculated from the molecular enrichment of 13C-acetate measured by NMR.
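The predicted CID follows a binomial law in n and p, so it can be computed in a few lines. The sketch below is illustrative only; the function name `predicted_cid` is an assumption and is not taken from the authors' scripts.

```python
from math import comb

def predicted_cid(n_carbons, p):
    """Binomial isotopologue fractions M_0..M_n for n carbons at 13C enrichment p."""
    return [comb(n_carbons, k) * p**k * (1 - p)**(n_carbons - k)
            for k in range(n_carbons + 1)]

cid = predicted_cid(3, 0.5)   # e.g. a three-carbon molecule such as alanine
print(cid)                    # [0.125, 0.375, 0.375, 0.125]
```

At p = 0.5 the fractions are proportional to a row of Pascal's triangle (1:3:3:1 for three carbons), so the outermost isotopologues M0 and Mn are the least abundant.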

Validation Criteria. First of all, the method was validated following the EURACHEM guideline29, without any 13C tracer, as detailed in Supporting Information (S-3 and S-4). The method was then validated in the context of isotopic analyses (i.e. using a 13C tracer). We thus introduce specific metrics dedicated to the evaluation of isotopic measurements:

• The mass accuracy, i.e. the error on CI masses, estimated from the difference between the theoretical mass of each CI (calculated from its elemental formula) and its experimental exact mass:

$$\text{mass accuracy (ppm)} = \frac{m_{exp} - m_{theo}}{m_{theo}} \cdot 10^{6}$$

• The mass precision, i.e. the spread of experimental CI masses, estimated from the standard deviation of replicate measurements of the PA-PT sample.

• The accuracy of isotopic distance, i.e. the error of experimental mass shifts induced by the incorporation of an additional 13C atom, estimated from each pair of successive CI peaks within each CID and compared to the theoretical value of 1.0034 m/z:

$$\text{isotopic distance accuracy } (m/z) = (m_{i} - m_{i-1}) - 1.0034$$

• The precision of isotopic distance, i.e. the spread of experimental mass shifts, estimated from the standard deviation of replicate measurements of the PA-PT sample.

• The CID accuracy, i.e. the error of measured CIDs, determined from the difference between predicted and measured CIDs:

$$\text{CID}_{bias} = \text{CID}_{measured} - \text{CID}_{predicted}$$

• The CID precision, i.e. the spread of measured CIDs, estimated from the standard deviation of replicate measurements of the PA-PT sample.

• The CID working range, i.e. the interval over which the method provides CIDs with an acceptable uncertainty, fixed here as an accuracy and precision below 0.05.

The accuracy and precision of masses and isotopic distances, and the accuracy and precision of CID measurements, were determined from 10 independent analyses of the most concentrated PA-PT sample (10 µg of biomass injected). The CID working range was assessed through the injection in triplicate of seven dilutions of the PA-PT sample, which correspond to 10, 5, 2.5, 1.25, 0.66, 0.16 and 0.08 µg of biomass (cell dry weight) injected.
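The metrics listed above can be sketched in code. This is a hedged illustration, not the authors' implementation (their scripts are in Supporting Information S-5): all function names are hypothetical, the replicate values are invented for the example, and the sign convention (experimental minus theoretical, measured minus predicted) is an assumption.

```python
from statistics import mean, stdev

ISOTOPIC_DISTANCE = 1.0034   # theoretical 13C mass shift used in the text (m/z)

def mass_error_ppm(m_exp, m_theo):
    """Mass accuracy of one CI, in ppm (assumed sign: experimental minus theoretical)."""
    return (m_exp - m_theo) / m_theo * 1e6

def isotopic_distance_errors(masses):
    """Deviation of each successive mass shift from the theoretical distance."""
    return [(b - a) - ISOTOPIC_DISTANCE for a, b in zip(masses, masses[1:])]

def cid_bias_and_sd(replicates, predicted):
    """Per-isotopologue bias (mean measured minus predicted) and SD over replicates."""
    per_ci = list(zip(*replicates))          # one tuple of replicate values per CI
    bias = [mean(vals) - pred for vals, pred in zip(per_ci, predicted)]
    sd = [stdev(vals) for vals in per_ci]
    return bias, sd

def within_working_range(bias, sd, limit=0.05):
    """True when every CI has |bias| and SD below the acceptance limit."""
    return all(abs(b) < limit for b in bias) and all(s < limit for s in sd)

# Hypothetical replicate CIDs of a three-carbon compound (illustrative numbers only).
predicted = [0.125, 0.375, 0.375, 0.125]
replicates = [
    [0.127, 0.374, 0.373, 0.126],
    [0.124, 0.376, 0.376, 0.124],
    [0.126, 0.375, 0.374, 0.125],
]
bias, sd = cid_bias_and_sd(replicates, predicted)
print(within_working_range(bias, sd))        # True
```

Applying `within_working_range` to each dilution level in turn gives the range of injected amounts over which the 0.05 acceptance limit is met.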

Statistical Analyses. The uncertainty on the predicted CIDs was determined using a Monte Carlo analysis from the NMR analysis of the label input, as detailed previously.8,30 Two-way ANOVA without interaction was used to identify factors that impact the reliability of isotopic measurements, from which general validation criteria could be established, as detailed in the Results section. The following factors were considered to explain the experimental bias of the 700 isotopologues measured in the seven dilutions of PA-PT samples: metabolite, CI position in the CID (M0: lightest, Mx: intermediate, Mn: heaviest), amount of biomass-equivalent injected, peak area and signal-to-noise ratio of each CI, and average peak area of each CID. ANOVA was performed using the FactoMineR package (v1.32) of R (v3.2.4, www.r-project.org).

All the scripts used to analyze the data and generate the figures are distributed in Supporting Information S-5 under an open source license to ensure reproducibility and reusability.

RESULTS AND DISCUSSION

Definitions and Methodology. The workflow proposed to evaluate the reliability of isotopic measurements by a given LC-MS method consists of three main steps (Figure 1): i) method validation using unlabeled material and standard metrics, ii) (biological) production of a labeled reference material containing metabolites with controlled and predictable isotopic patterns, and iii) evaluation of the method for isotopic measurements based on this reference material and dedicated metrics. This work focuses on 13C-labelling experiments, which are the most widespread approach in isotopic studies of metabolism, though most of the concepts and tools proposed in this work can be applied to other elements of interest (e.g. 15N). Some important definitions are given below before detailing the workflow.

Definitions and Properties of Carbon Isotopologue Distributions. The actual results of isotopic measurements made by MS in the context of 13C-labelling experiments are the Carbon Isotopologue Distributions (CIDs) of the various metabolites detected in the labelled samples. Isotopologues refer to molecular entities that differ only in their number of isotopic substitutions, according to the IUPAC definition31, and Carbon isotopologues (CIs) are defined as molecular entities that differ only in their composition in carbon isotopes. Considering only carbon, which has two predominant stable isotopes (12C and 13C), a molecular entity containing n carbon atoms has a total of 2^n isotopic forms which distribute in n+1 (carbon) isotopologues. The Carbon Isotopologue Distribution (CID) of a given compound is defined as the relative distribution of all the CIs of this compound. In the MS spectrum, the CIs are embedded into the peaks of the isotopic cluster, which also encompasses isotopes from other elements. The contribution to the isotopic cluster of naturally occurring isotopes of elements other than carbon thus has to be removed.
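The relation between isotopic forms and isotopologues can be checked by enumeration. This is a small illustrative sketch; the function name `isotopologue_counts` is an assumption made for the example.

```python
from itertools import product
from collections import Counter

def isotopologue_counts(n_carbons):
    """Group the 2**n labeling patterns of n carbons by their number of 13C atoms."""
    patterns = product((0, 1), repeat=n_carbons)   # 0 = 12C, 1 = 13C at each position
    return Counter(sum(pattern) for pattern in patterns)

counts = isotopologue_counts(3)
print(sorted(counts.items()))   # [(0, 1), (1, 3), (2, 3), (3, 1)]
print(sum(counts.values()))     # 8
```

For three carbons, the 8 isotopic forms fall into 4 isotopologues (M0 to M3) containing 1, 3, 3 and 1 forms respectively.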
The CID is calculated from individual CIs after correction for naturally occurring isotopes. Quantification of a CID is therefore a complex process based on the measurement of not only one peak but many peaks to which significant data processing (correction for natural abundance, calculation of relative fractions) is applied. This means that CID measurement is highly prone to error. The CID being a relative parameter, its measurement is also strongly prone to error propagation, since the error on the measurement of only one CI will generate an error on the entire CID.

Reference Material and Evaluation Criteria. The reference material consists of a biologically produced sample containing 13C-labelled metabolites with controlled and predictable isotopologue composition. The nature, production and qualification of this standard sample have been extensively detailed in a previous report.8 Each metabolite in the sample is forced to contain all possible isotopic species at the same concentration. This is critical for the purpose of evaluating the measurement methods since each form can potentially behave differently during the analytical workflow and hence can introduce a bias in the measurement. This reference material will be referred to as the Pascal Triangle (PT) sample. Based on CID properties, specific criteria were defined to evaluate the reliability of CID measurements by a given MS method (and instrument).

Figure 1. Workflow used for evaluation of isotopic analyses by MS. After having validated the method with unlabeled material, different amounts of a biological labeled sample containing metabolites with controlled and predictable CIDs are analyzed by MS(/MS). Quality metrics (grey boxes) are computed to evaluate these measurements, from which various information (relative to the matrix, the presence of contaminants, or the analytical platform) can be inferred.

These criteria (Figure 1) include:

• The accuracy (and precision) of exact masses of CIs, which represents the closeness of agreement between the theoretical and experimental mass of each isotopologue.

• The accuracy (and precision) of isotopic distances, which represents the closeness of agreement between the exact mass shift of the 13C isotope and the mass differences between two consecutive peaks of the isotopic cluster. These criteria are applicable to high-resolution MS instruments only. A systematic error could indicate an instrument bias (related to the instrument), and a specific error could indicate co-elution with another compound (related to the compound and/or matrix and/or contaminations).

• The CID accuracy is the closeness of agreement between the measured CIDs and the CIDs predicted for the reference material. The predicted CIDs, as well as uncertainties on the predicted values, are calculated from the isotopic composition of the labeled substrate. A systematic error could indicate an instrument bias (related to the instrument), and a specific error could indicate overlap with another compound (related to the compound and/or matrix).

• The CID precision is the spread of measured CID values. It is estimated from replicate measurements of the PT sample and is expressed as standard deviation (SD). A high SD value can be explained by a low signal-to-noise ratio, a signal under the LOQ, etc.

• The CID working range is the interval of metabolite concentration (or sample amount) over which the method provides CID values with an acceptable uncertainty, i.e. CIDs should be constant over this range. It was determined from the analysis of various amounts of the PT sample. This metric can be applied to other isotopologue distributions in the reference material (i.e. binomial distributions with enrichments different from 0.5) or in biological samples with unknown labeling.

Test Cases. For demonstration purposes, the evaluation workflow was applied to the measurement of amino acid CIDs by LC-MS with two different MS analyzers. Amino acids provide a simple but valuable test case since the isotopic profiling of these compounds is broadly used in current fluxomics approaches. An LC method for the separation of underivatized amino acids was first developed, as detailed in the Experimental Section. To test the most widespread MS instruments for isotopic profiling, the LC method was set up on both a triple-quadrupole (Agilent 6460) and a high-resolution instrument (LTQ-Orbitrap). MS analyses were performed in the MRM mode with the triple-quadrupole, and in the full scan mode with the LTQ-Orbitrap. The two LC-MS systems were first validated against classical evaluation criteria (accuracy, limit of detection, limit of quantification, linearity) using unlabeled amino acid standards (Supporting Information S-3 and S-4).

As reference labeled material, 13C-proteinogenic amino acids were produced from E. coli to evaluate amino acid CID measurements (PA-PT sample). Because proteins represent about 50% of the total cell material in E. coli32, large amounts of amino acids can be released from protein hydrolysis. Such material was produced from E. coli cells grown on an equimolar mixture of the four carbon isotopic forms of acetate (i.e. unlabeled, [1-13C]-, [2-13C]-, and [1,2-13C]-acetate) as sole carbon source. To obtain reference values as accurate as possible, the theoretical (i.e. predicted) CIDs were calculated from the actual proportion of each of the isotopic forms of acetate used in the experiment, determined by 1H-NMR (0.4995). Uncertainties on predicted CIDs were determined from the standard deviation of NMR measurements (± 0.005). The values of predicted CIDs and their errors are given in Supporting Information S-6. Finally, a second reference material, consisting of free, intracellular 13C-labelled amino acids (FA-PT sample), was produced. It was obtained from the same cultivation as the PA-PT sample by extracting intracellular amino acids without protein degradation. This sample was used to investigate matrix effects and additional analytical considerations. The predicted amino acid CIDs and their errors for the FA-PT sample are the same as for the PA-PT sample.
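The propagation of the NMR uncertainty to the predicted CIDs can be approximated by simulation. The sketch below is an assumption-laden illustration and not the published procedure (refs 8, 30): it assumes that the uncertainty can be summarized as a Gaussian error on a single enrichment value p (0.4995 ± 0.005, the values quoted above), and the function names are hypothetical.

```python
import random
from math import comb
from statistics import mean, stdev

def predicted_cid(n_carbons, p):
    """Binomial isotopologue fractions for n carbons at 13C enrichment p."""
    return [comb(n_carbons, k) * p**k * (1 - p)**(n_carbons - k)
            for k in range(n_carbons + 1)]

def monte_carlo_cid(n_carbons, p_mean, p_sd, n_draws=10000, seed=0):
    """Mean and SD of each predicted isotopologue fraction under uncertainty on p."""
    rng = random.Random(seed)                     # fixed seed for reproducibility
    draws = [predicted_cid(n_carbons, rng.gauss(p_mean, p_sd))
             for _ in range(n_draws)]
    per_ci = list(zip(*draws))                    # one tuple of simulated values per CI
    return [mean(v) for v in per_ci], [stdev(v) for v in per_ci]

means, sds = monte_carlo_cid(3, 0.4995, 0.005)
for k, (m, s) in enumerate(zip(means, sds)):
    print(f"M{k}: {m:.4f} +/- {s:.4f}")
```

The resulting standard deviations give an order of magnitude for the uncertainty of the reference values against which measured CIDs are compared.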

Evaluation of CID Measurement by High Resolution Mass Spectrometry. The evaluation workflow was applied to the measurement of amino acid CIDs by High Resolution Mass Spectrometry (HRMS) in the full scan mode. HRMS is increasingly used for isotopic profiling, for two main reasons. First, the high resolution allows a much better separation (in the m/z dimension) of analytes, isotopes and possibly isotopologues. Second, it allows untargeted isotopic profiling approaches to be set up using the full scan mode of acquisition, though these approaches are still in their infancy. The full scan LC-HRMS method was first developed and validated for the quantification of underivatized, unlabeled amino acids (Supporting Information S-4). The results showed an excellent sensitivity (LOQ < 11 pmol injected) and accuracy (R² > 0.99) over a wide range of concentrations (from 10.40 to 782.88 pmol injected). The same analytical method was then used for measuring amino acid CIDs in the PA-PT sample. Without any particular adaptation, we measured CIDs of the 16 amino acids that were detected. With the MS detector (LTQ Orbitrap) operating at a resolution of 60 000 (at 400 m/z), it was possible to fully separate 15N from all 13C-isotopologues. Consequently, only the CI peaks were exploited and were corrected for the natural abundance of overlapping isotopes, such as 17O and 18O, to calculate CIDs. This was done by adapting the molecular formula used to construct the correction matrix (see Methods). The first evaluation criterion was the accuracy and precision of CI masses. From 10 replicate analyses of the PA-PT sample, the error of CI masses was -4.42 ± 0.74 ppm, which is less than the acceptable measurement error (set at 5 ppm). Then, the accuracy of isotopic distances was determined from the mass differences between CIs belonging to the same molecule, which were compared to the theoretical value (1.0034 m/z). For our LC-HRMS method, the mean bias for all measured CIs was -2.4·10⁻⁵ ± 6.9·10⁻⁵ m/z. These results indicated the absence of systematic errors and an unambiguous assignment of isotopic peaks.

Figure 2. Distribution of measurement bias (A) and precision (B) for the 100 amino acid isotopologues of the PA-PT sample. The inset in panel A shows the distribution of errors for CIs of all amino acids excluding leucine. In this inset, the black line corresponds to a Gaussian distribution of measurement errors, and the dotted vertical red line to the center of this distribution.

The CID precision and accuracy of the LC-HRMS method were then evaluated from the same 10 replicate measurements of the PA-PT sample. All CID values are given in Supporting Information S-7. The CID precision was very good, with a mean SD of 0.0006 over the 100 measured isotopologues (Figure 2B). The highest standard deviations were observed for M0 (unlabeled metabolite) and Mn (fully labeled metabolite) isotopologues, i.e. the isotopologues with the lowest abundances in the PA-PT sample, especially for amino acids that are not well ionized (valine and glycine). Errors ranged between -0.0388 and 0.0327 with a mean error of -2·10⁻⁵, indicating a very good agreement between measured and predicted values and the absence of systematic bias (Figure 2A). The confidence interval determined from the standard deviations of these values was 0.0103, which indicates a high reproducibility. Moreover, the error distribution shows no clear trend related to particular isotopologue(s). The CID of only one amino acid out of 16, namely leucine, showed elevated bias (Figure 2). As discussed later in this report, this is due to contamination by another compound, an analytical issue that can be clearly identified using the proposed workflow. Excluding leucine, these results showed that the LC-HRMS method is very accurate for amino acid isotopologue measurements. The working range of the LC-HRMS method was evaluated from the analysis of samples containing varying amounts of the PA-PT sample, covering more than two orders of magnitude of amino acid concentrations (Figure 3 and Supporting Information S-8). Results showed that CIDs of the majority of amino acids are accurately measured at the highest concentration. Only M0 and Mn of glycine had a bias higher than 0.025, which can be explained by the weak ionization of this compound.
Since a CID is a relative measurement, the bias on a single CI impacts the entire CID, and this impact increases for molecules with a low number of carbon atoms. When samples are drastically diluted (from around 65-fold relative to the most concentrated sample), two different behaviors are observed. For the majority of amino acids (i.e. 9 of 15 amino acids), these dilutions induce a systematic positive bias on the M0, as observed elsewhere,15 despite a low mass scan range (75-700 m/z). This could be explained by the limit of quantification being reached for these isotopologues, leading to their overestimation. A large bias on M0 causes a negative bias on the other isotopologues (e.g. serine, alanine, and histidine). For the six other amino acids, dilutions lead to non-detection of the least abundant isotopologue(s). The impact of this phenomenon is weaker for metabolites with a large number of carbon atoms. Isoleucine, phenylalanine and tyrosine are thus less impacted than serine or glycine, which contain fewer carbon atoms. Overall, the method showed a CID working range of at least two orders of magnitude for most amino acids.

A CID is a relative measurement, in contrast to the (absolute) concentrations measured in metabolomics; hence LOQs may differ between the two approaches. ANOVA was applied to the complete data set to identify the main factor(s) that determine the LOQ of CIDs in isotopic studies. Among the six factors considered (metabolites, dilution levels, isotopic cluster peak areas, average isotopic cluster peak areas, signal-to-noise ratios, and position of the CIs in the CIDs), the most prominent one is the average area of isotopic cluster peaks (p-value of 2.28·10⁻⁵, Supporting Information S-9). The distribution of biases in CI abundances measured for the various dilutions of the PA-PT sample was thus plotted against the average area of isotopic cluster peaks (Figure 4). Considering an acceptable bias of 0.05, reliable measurement of amino acid CIDs by LC-HRMS is achieved when the average area of isotopic clusters is above the threshold of 1·10⁶ a.u. (arbitrary units). This threshold value can be refined for a given amino acid. For example, with this mass spectrometer, the lowest CID LOQ was observed for methionine (average isotopic cluster area of 4.6·10⁵ ± 1.0·10⁴ a.u., corresponding to 0.33 µg of biomass injected) and the highest CID LOQ for glycine (average isotopic cluster area of 3.2·10⁶ ± 5.1·10⁴ a.u., corresponding to 1.25 µg of biomass injected). Compared to the LOQ estimated from the unlabeled sample (Supporting Information S-4), CID LOQs were on average 4 times lower for all amino acids. For glycine, the areas corresponding to the LOQ were 9·10⁶ with the metabolomics method and 1·10⁶ with the isotopic method. This confirms that the LOQ determined from unlabeled samples cannot be used as an estimator of the LOQ in ILEs.

Figure 3. Evaluation of the working range of amino acid CID measurements by LC-HRMS. CIDs of amino acids were measured for various dilutions of the PA-PT sample. The solid lines connect the CI abundances measured for the same dilution level (the darkest line corresponds to the most concentrated sample). A color code is applied to each single value (circles) to represent the bias of the measured value compared to the predicted value. Circles filled in blue represent negative biases and circles filled in red positive biases.

Figure 4. Relationship between the averaged area of isotopic clusters and the accuracy of CIDs measured by LC-HRMS. The inset corresponds to a zoom of the area in which the biases increased significantly with the decrease of signal.
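The area-based criterion discussed above can be sketched as a simple acceptance check. This is a minimal illustration under stated assumptions: the "average area" is taken here as the arithmetic mean of the isotopologue peak areas of one cluster, the CID is computed from raw areas without any correction for naturally occurring isotopes, and the function names and example areas are hypothetical. Only the two constants (1·10⁶ a.u. and 0.05) come from the text, and they apply to the instrument used in this study.

```python
AREA_THRESHOLD = 1e6  # a.u., threshold reported in the text for this instrument
MAX_BIAS = 0.05       # acceptable bias considered in the text

def average_cluster_area(peak_areas):
    """Mean peak area over the isotopologues (M0..Mn) of one isotopic cluster."""
    return sum(peak_areas) / len(peak_areas)

def cid_from_areas(peak_areas):
    """Relative isotopologue abundances (fractions summing to 1)."""
    total = sum(peak_areas)
    return [a / total for a in peak_areas]

def assess_cluster(peak_areas, predicted_cid,
                   area_threshold=AREA_THRESHOLD, max_bias=MAX_BIAS):
    """Check one cluster against the area threshold and the bias limit."""
    avg = average_cluster_area(peak_areas)
    cid = cid_from_areas(peak_areas)
    worst = max(abs(m - p) for m, p in zip(cid, predicted_cid))
    return {
        "average_area": avg,
        "above_area_threshold": avg >= area_threshold,
        "max_abs_bias": worst,
        "bias_ok": worst <= max_bias,
    }

# Hypothetical areas for a diluted and a concentrated injection.
predicted = [0.25, 0.50, 0.25]
diluted = assess_cluster([2.0e5, 6.0e5, 2.0e5], predicted)
concentrated = assess_cluster([4.0e6, 8.0e6, 4.0e6], predicted)
```

Because the threshold can be refined per amino acid, `area_threshold` is left as a parameter rather than hard-coded inside the function.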

Identification of potential matrix and analytical issues. A significant and reproducible error of 0.3854 ± 0.0015 was observed on the M0 of leucine in the PA-PT sample. To understand this discrepancy, the commercial standard (at natural abundance) and the PA-PT sample were analyzed by MS², which revealed two distinct fragmentation patterns. One fragment observed in the PA-PT sample (peak at 86.0963 m/z) corresponds to that of the unlabeled standard, but another fragment of unknown origin (peak at 114.0913 m/z) was also detected (Figure 5). This contaminant affects only the M0, indicating that it originates from an unlabeled compound that is isobaric with leucine (M0). While data obtained for the M0 isotopologue are not reliable and should be disregarded, these results indicate that data related to the other isotopologues of leucine are reliable and can be exploited further in metabolic studies. These results also suggest that contamination may have occurred during the sample preparation step, and thereby might be related to the particular matrix analyzed.

To test this hypothesis, we analyzed the CIDs of the same metabolites in a different matrix (FA-PT sample, Supporting Information S-10). In this intracellular extract containing free amino acids, the leucine CID was in excellent agreement with the predicted one. No contamination of the M0 was observed, confirming that the contamination of the PA-PT sample occurs during its preparation. Results for the other amino acids of the FA-PT sample were also consistent with predictions, except for asparagine, for which the lightest CIs were strongly overestimated (Figure 6). In this sample, the M0-M2 of asparagine overlap with the M3-M5 of an unknown compound, which is labeled and thus endogenous to E. coli. Elemental formulas generated by Xcalibur from the exact mass of the M0 of the unknown compound (130.0500 m/z) suggested it could be pyroglutamic acid, which may be produced in-source by cyclization of glutamate and glutamine.31 The injection of commercial standards of glutamine and glutamate confirmed this hypothesis. The precise identification of this co-elution allows a correction step to be implemented, in which the labelling pattern of glutamate can be used to correct asparagine signals.

These results highlight the importance of having such a reference sample with predictable CIDs, which allows not only the identification of unreliable isotopic data but also provides clues on their origin (e.g. contamination with exogenous unlabeled compounds or overlap with isotopologues of other endogenous labeled metabolites). This would not have been possible with another reference sample, e.g. one at natural abundance or with unpredictable labeling. These results demonstrate the necessity of evaluating the analytical methods with respect to each matrix analyzed. It should be mentioned that the present sample has been produced using E. coli grown on acetate. If isotopic analyses are carried out with another carbon source or another organism, this sample will not be fully adapted. The reference sample should preferably be obtained from the organism and conditions of interest, or from closely related organisms and conditions if this is too complicated to implement (e.g. for mammalian cells, which require several carbon sources to grow, the reference material can be produced from yeast).

Figure 5. LC-MS/MS characterization of leucine contamination in the PA-PT sample. Product ion spectra for m/z 132.1018 at 10.95 min of the commercial standard of leucine (A) and the PA-PT sample (B), acquired in collision-induced dissociation mode (NCE=20) by LC-HRMS.

Figure 6. Overlapping of asparagine (Asn) and pyroglutamate (pGlu) CIDs. Comparison of the predicted (black bars) and measured CID (grey bars) of Asn (top) and the corresponding mass spectrum (bottom, with Asp peaks in red, pGlu peaks in blue, and peaks from other unknown compounds in black).
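The correction step mentioned for asparagine is not detailed in the text; the sketch below shows one way it could be implemented, under several assumptions that should be checked before any real use: (i) pyroglutamate carries the same labelling pattern as glutamate, (ii) the M0-M2 peaks of pyroglutamate are free of interference and can be used to scale its total signal, and (iii) overlapping signals add linearly. The function name and all numbers are hypothetical.

```python
def correct_asn_for_pglu(asn_areas, pglu_clean_areas, glu_cid):
    """Remove the pGlu M3-M5 contribution from the Asn M0-M2 peaks.

    asn_areas:        areas measured at the m/z of Asn M0..M4 (4 carbons)
    pglu_clean_areas: areas of pGlu M0..M2 (assumed not overlapped)
    glu_cid:          glutamate CID M0..M5, assumed to describe pGlu labelling
    Returns the corrected Asn CID (fractions summing to 1).
    """
    # Fraction of the pGlu signal expected in its non-overlapped peaks.
    clean_fraction = sum(glu_cid[:3])
    if clean_fraction <= 0:
        raise ValueError("glutamate M0-M2 fractions are needed to scale pGlu")
    # Total pGlu signal inferred from the non-overlapped isotopologues.
    pglu_total = sum(pglu_clean_areas) / clean_fraction
    corrected = list(asn_areas)
    for k in range(3):
        # pGlu M(3+k) falls on Asn M(k); subtract its expected area.
        expected_overlap = pglu_total * glu_cid[3 + k]
        corrected[k] = max(asn_areas[k] - expected_overlap, 0.0)
    total = sum(corrected)
    return [a / total for a in corrected]

# Hypothetical example: the first three Asn peaks are inflated by pGlu.
glu_cid = [0.1, 0.2, 0.2, 0.2, 0.2, 0.1]
pglu_clean_areas = [100.0, 200.0, 200.0]
asn_areas = [400.0, 600.0, 900.0, 400.0, 200.0]
asn_cid = correct_asn_for_pglu(asn_areas, pglu_clean_areas, glu_cid)
```

Clipping negative values at zero is a pragmatic choice for noisy data; a corrected area that is strongly negative before clipping would suggest that one of the assumptions does not hold.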

Evaluation of CID measurement with a triple quadrupole. The proposed method was then applied to evaluate the quality of amino acid CIDs measured by LC-QqQ, a platform extensively used to measure isotopic profiles. Following the proposed workflow, the LC-QqQ analysis of amino acids at natural abundance (Supporting Information S-3) showed good sensitivity (LOQ < 3 pmol injected) and accuracy (R² > 0.99) over a wide range of concentrations (from 2.66 to 643.53 pmol). This method was then extended to the analysis of amino acid CIDs by including the 164 MRM transitions required for measuring isotopic clusters (Supporting Information S-1, S-11 and S-12). Results (Supporting Information S-11 and S-13) showed a very good correlation between measured and predicted CI abundances, with a mean bias of -6·10⁻⁴ ± 0.0303 over the 100 measured CIs. CIDs appeared to be reliable (bias < 0.05) and reproducible (mean bias of -1.1·10⁻⁷ ± 0.0127, average SD of 6.2·10⁻⁴ ± 0.0082, Supporting Information S-13) for 10 amino acids, including leucine. For the latter compound, the good agreement between predicted and measured CIDs is related to the MRM mode, which in this situation shows increased specificity compared to the full-scan mode.18 ANOVA (Supporting Information S-9) confirms that the factor selected to estimate the CID LOQ of the HRMS method, i.e. the average area of isotopic cluster peaks, is also relevant for LOQ determination on the LC-QqQ platform. The bias is relatively constant across the various levels of sample dilution, indicating good linearity of CID measurements over the tested range of concentrations (Supporting Information S-13). Overall, these results highlight mode-specific characteristics and demonstrate the applicability and benefits of the workflow on low-resolution instruments.
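To give an idea of why isotopic analyses by MRM require many transitions, the sketch below enumerates the precursor/product pairs needed for one metabolite when the product ion retains only part of the carbon backbone. It is a generic illustration, not the transition list used in this work (which is given in the Supporting Information). The leucine values are the m/z reported above for the precursor and the 86.0963 fragment; singly charged ions, carbon-only labelling, and a five-carbon fragment (loss of the carboxyl carbon) are assumptions of this example.

```python
C13_SHIFT = 1.003355  # mass difference between 13C and 12C, in Da

def mrm_transitions(precursor_mz, product_mz, n_precursor_c, n_product_c):
    """List (i, j, precursor m/z, product m/z) for all feasible label splits.

    i = number of 13C atoms in the precursor ion
    j = number of 13C atoms retained in the product ion
    Assumes singly charged ions and labelling on carbon only.
    """
    n_lost = n_precursor_c - n_product_c  # carbons lost in the neutral fragment
    transitions = []
    for i in range(n_precursor_c + 1):
        # The lost part can carry at most n_lost labels, the product at most
        # n_product_c, which bounds the feasible values of j.
        for j in range(max(0, i - n_lost), min(i, n_product_c) + 1):
            transitions.append((
                i,
                j,
                round(precursor_mz + i * C13_SHIFT, 4),
                round(product_mz + j * C13_SHIFT, 4),
            ))
    return transitions

# Leucine: 6 carbons in the precursor, 5 assumed in the 86.0963 fragment.
leucine = mrm_transitions(132.1018, 86.0963, 6, 5)
n_transitions = len(leucine)
```

The abundance of each precursor isotopologue is then obtained by summing the signals of all transitions that share the same value of `i`.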

CONCLUSION The workflow presented in this work allows in-depth evaluation of analytical platforms used for quantitative isotopic measurements. In addition to the quality metrics classically investigated in quantitative metabolomics analyses, we introduced metrics specific to isotopic analyses. Based on these metrics, we defined several quality criteria that ensure detailed assessment of MS-based methods and of the resulting isotopic datasets. This workflow was applied to both high- and low-resolution mass spectrometry analyses of underivatized amino acids, and highlighted mode-specific as well as matrix-specific characteristics. Besides assessing the overall accuracy and precision of each LC-MS method in an easy and straightforward way, the workflow proved to be a valuable solution to efficiently identify reliable (and unreliable) isotopic data and trace the origin of measurement error, as illustrated by two different examples. The proposed approach is generic and can be used to validate isotopic measurements carried out on other matrices, analytical platforms, labeled elements, or classes of metabolites. It is expected to strengthen the reliability of isotopic datasets and thereby the biological value of ILEs.

ASSOCIATED CONTENT Supporting Information Additional information as indicated in the text. This material is available free of charge via the Internet at http://pubs.acs.org/.

AUTHOR INFORMATION Corresponding Author * Telephone: +33(0)5-61-55-94-07. E-mail: [email protected].

Notes The authors declare no competing financial interest.

ACKNOWLEDGMENT MetaToul (Metabolomics & Fluxomics Facilities, Toulouse, France, www.metatoul.fr) and its staff members are gratefully acknowledged for technical support and access to NMR and mass spectrometry facilities. MetaToul is part of the national infrastructure MetaboHUB-ANR-11-INBS-0010 (www.metabohub.fr).

REFERENCES
(1) Peyraud, R.; Kiefer, P.; Christen, P.; Massou, S.; Portais, J.-C.; Vorholt, J. A. Proc. Natl. Acad. Sci. U. S. A. 2009, 106 (12), 4846–4851.
(2) Huang, X.; Chen, Y.-J.; Cho, K.; Nikolskiy, I.; Crawford, P. A.; Patti, G. J. Anal. Chem. 2014, 86 (3), 1632–1639.
(3) Link, H.; Kochanowski, K.; Sauer, U. Nat. Biotechnol. 2013, 31 (4), 357–361.
(4) Revelles, O.; Millard, P.; Nougayrède, J.-P.; Dobrindt, U.; Oswald, E.; Létisse, F.; Portais, J.-C. PLoS One 2013, 8 (6), e66386.
(5) Fischer, E.; Sauer, U. Eur. J. Biochem. 2003, 270 (5), 880–891.
(6) Millard, P.; Cahoreau, E.; Heuillet, M.; Portais, J.-C.; Lippens, G. Anal. Chem. 2017, 89 (3), 2101–2106.
(7) Millard, P.; Portais, J.-C.; Mendes, P. BMC Syst. Biol. 2015, 9, 64.
(8) Millard, P.; Massou, S.; Portais, J.-C.; Létisse, F. Anal. Chem. 2014, 86 (20), 10288–10295.
(9) Nic, M.; Jirat, J.; Kosata, B.; Jenkins, A.; McNaught, A. IUPAC Compendium of Chemical Terminology, Edition 2.1.0; IUPAC: Research Triangle Park, NC, 2009.
(10) Antoniewicz, M. R.; Kelleher, J. K.; Stephanopoulos, G. Anal. Chem. 2007, 79 (19), 7554–7559.
(11) Hellerstein, M. K.; Neese, R. A. Am J Physiol 1999, 276, 1146–1170.
(12) Dauner, M.; Sauer, U. Biotechnol. Prog. 2000, 16 (4), 642–649.
(13) Vogt, J. A.; Wachter, U.; Georgieff, M. J. Mass Spectrom. JMS 2003, 38 (2), 222–230.
(14) Wahl, S. A.; Dauner, M.; Wiechert, W. Biotechnol. Bioeng. 2004, 85 (3), 259–268.
(15) Su, X.; Lu, W.; Rabinowitz, J. D. Anal. Chem. 2017, 89 (11), 5940–5948.
(16) Theorell, A.; Leweke, S.; Wiechert, W.; Nöh, K. Biotechnol. Bioeng. 2017, 114 (11), 2668–2684.
(17) Guerrasio, R.; Haberhauer-Troyer, C.; Steiger, M.; Sauer, M.; Mattanovich, D.; Koellensperger, G.; Hann, S. Anal. Bioanal. Chem. 2013, 405 (15), 5133–5146.
(18) Kiefer, P.; Portais, J.-C.; Vorholt, J. A. Anal. Biochem. 2008, 382 (2), 94–100.
(19) Wu, L.; Mashego, M. R.; van Dam, J. C.; Proell, A. M.; Vinke, J. L.; Ras, C.; van Winden, W. A.; van Gulik, W. M.; Heijnen, J. J. Anal. Biochem. 2005, 336 (2), 164–171.
(20) Kiefer, P.; Nicolas, C.; Letisse, F.; Portais, J.-C. Anal. Biochem. 2007, 360 (2), 182–188.
(21) Antoniewicz, M. R.; Kelleher, J. K.; Stephanopoulos, G. Anal. Chem. 2007, 79 (19), 7554–7559.
(22) Millard, P.; Massou, S.; Wittmann, C.; Portais, J.-C.; Létisse, F. Anal. Biochem. 2014, 465, 38–49.
(23) Fischer, E.; Sauer, U. Eur. J. Biochem. 2003, 270, 880–891.
(24) Boudah, S.; Olivier, M.-F.; Aros-Calt, S.; Oliveira, L.; Fenaille, F.; Tabet, J.-C.; Junot, C. J. Chromatogr. B 2014, 966, 34–47.
(25) Thiele, B.; Füllner, K.; Stein, N.; Oldiges, M.; Kuhn, A. J.; Hofmann, D. Anal. Bioanal. Chem. 2008, 391 (7), 2663–2672.
(26) Petritis, K.; Chaimbault, P.; Elfakir, C.; Dreux, M. J. Chromatogr. A 2000, 896 (1–2), 253–263.
(27) Qu, J.; Chen, W.; Luo, G.; Wang, Y.; Xiao, S.; Ling, Z.; Chen, G. The Analyst 2002, 127 (1), 66–69.
(28) Millard, P.; Letisse, F.; Sokol, S.; Portais, J.-C. Bioinforma. Oxf. Engl. 2012, 28 (9), 1294–1296.
(29) Magnusson, B.; Örnemark, U. Eurachem guide: The fitness for purpose of analytical methods, 2014.
(30) Guide to the expression of uncertainty in measurement; ISO/IEC Guide 98; 2008.
(31) Purwaha, P.; Silva, L. P.; Hawke, D. H.; Weinstein, J. N.; Lorenzi, P. L. Anal. Chem. 2014, 86 (12), 5633–5637.
(32) Neidhardt, F. C.; Curtiss, R.; Ingraham, J. L.; Lin, E. C. C.; Low, K. B. ASM press: Washington DC 1996.

Figure 1. Workflow used for evaluation of isotopic analyses by MS. After having validated the method with unlabeled material, different amounts of a biological labeled sample containing metabolites with controlled and predictable CIDs are analyzed by MS(/MS). Quality metrics (grey boxes) are computed to evaluate these measurements, from which various information (relative to the matrix, the presence of contaminants, or the analytical platform) can be inferred.

Figure 2. Distribution of measurement bias (A) and precision (B) for the 100 amino acid isotopologues of the PA-PT sample. The inset in panel A shows the distribution of errors for CIs of all amino acids excluding leucine. In this inset, the black line corresponds to a Gaussian distribution of measurement errors, and the dotted vertical red line to the center of this distribution.
