Scale-Invariant Biomarker Discovery in Urine and Plasma Metabolite

Aug 21, 2017 - This is a feature of zero-sum regression that proved valuable in the context of cross-platform application of linear signatures.(26). F...
1 downloads 15 Views 944KB Size
Subscriber access provided by Eastern Michigan University | Bruce T. Halle Library

Article

Scale-invariant biomarker discovery in urine and plasma metabolite fingerprints Helena Ursula Zacharias, Thorsten Rehberg, Sebastian Mehrl, Daniel Richtmann, Tilo Wettig, Peter J. Oefner, Rainer Spang, Wolfram Gronwald, and Michael Altenbuchinger J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.7b00325 • Publication Date (Web): 21 Aug 2017 Downloaded from http://pubs.acs.org on August 22, 2017

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

Scale-invariant biomarker discovery in urine and plasma metabolite ngerprints Helena U. Zacharias,

Richtmann,



∗, †, §

Thorsten Rehberg,

Tilo Wettig,

Gronwald,

∗,†



‡, §

Peter J. Oefner,



Sebastian Mehrl,

Rainer Spang,

and Michael Altenbuchinger





Daniel

Wolfram

∗, ‡

†Institute of Functional Genomics, University of Regensburg, Am Biopark 9, 93053

Regensburg, Germany ‡Statistical Bioinformatics, Institute of Functional Genomics, University of Regensburg,

Am Biopark 9, 93053 Regensburg, Germany ¶Department of Physics, University of Regensburg, Universitätsstraÿe 31, 93053

Regensburg, Germany §The authors wish it to be known that, in their opinion, the rst two authors should be

regarded as Joint First Authors. E-mail: [email protected]; [email protected]; [email protected]

Abstract Metabolomics data is typically scaled to a common reference like a constant volume of body uid, a constant creatinine level, or a constant area under the spectrum. Such scaling of the data, however, may aect the selection of biomarkers and the biological interpretation of results in unforeseen ways. Here, we studied how both the outcome of hypothesis tests for dierential metabolite concentration and the screening for multivariate metabolite signatures are aected by the choice of scale. To overcome this problem for metabolite signatures and to establish 1

ACS Paragon Plus Environment

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 30

a scale-invariant biomarker discovery algorithm, we extended linear zero-sum regression to the logistic regression framework and showed in two applications to 1 H NMR-based metabolomics data how this approach overcomes the scaling problem. Logistic zero-sum regression is available as an R package as well as a high-performance computing implementation that can be downloaded at https://github.com/rehbergT/zeroSum.

Keywords: metabolomics, NMR, LASSO, zero-sum, normalization, scaling.

Introduction Metabolomics is the comprehensive study of all small organic compounds in a biological specimen.

1

Metabolite concentrations in body uids such as urine, serum, and plasma have

proven valuable in predicting disease onset and progression.

26

Metabolomic data can be

generated by a variety of methods of which mass spectrometry and are the most common.

1

H NMR spectroscopy

While its high sensitivity makes mass spectrometry the preferred

method for discovery projects, the high reproducibility of NMR data

7

is ideal for applications

in precision medicine.

1

H NMR allows for the simultaneous detection of all proton-containing metabolites

present at sucient concentrations in biological specimens. Furthermore, NMR signal volume scales linearly with concentration. The complete set of NMR signals acquired from a given specimen is called its metabolite ngerprint. Due to dierences in pH, salt concentration, and/or temperature, relative displacement of signal positions occurs across spectra. Binning schemes can compensate for this eect by splitting spectra into segments called bins and summing up signal volumes contained therein.

Equal-sized bins are commonly used,

albeit other schemes such as adaptive binning have been suggested.

8,9

Binned ngerprint

data is the typical starting point for subsequent multivariate data analysis. Metabolite ngerprints need to be scaled to a common unit. Typical examples include

2

ACS Paragon Plus Environment

Page 3 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

mmol metabolite per ml plasma, mmol metabolite per mmol creatinine in urine, or the relative contribution of a bin to the total spectral intensity of an NMR spectrum. In practice this is achieved by dividing each spectrum by the unit dening quantity: the intensity of an NMR reference such as TSP, the intensity of creatinine, or the total spectral intensity. This scaling of the raw data denes the measurement, but it also serves a second purpose: scaling corrects for unwanted experimental and physiological variability in the raw spectra. We refer to correcting unwanted sample-to-sample variability as normalization.

10

A

second preprocessing step is variance adjustment across all measured metabolites to reduce, e.g., heteroscedasticity of the data.

10

Here we are mostly concerned with the rst step, while

the second is achieved by taking the logarithm of the data.

11

All scales have their pros and cons: the NMR reference can normalize for spectrometer performance

11

but not for unwanted variability, e.g., in urine density, which is aected

by drinking, respiration, defecation, perspiration, or medication, without reecting disease state.

10,12

Choosing creatinine as a standard for urine assumes the absence of inter-individual

dierences in the production and renal excretion of creatinine.

13

In fact, creatinine produc-

tion and excretion is aected by sex, age, muscle mass, diet, pregnancy, and, most importantly, renal pathology.

14,15

Normalization to a constant total spectral area or spectral map-

ping to a reference spectrum assume that the total amount of metabolites is constant over time and across patients and that spectra are not contaminated by signals that do not represent metabolites. However, this is not always the case.

16,17

In fact, for urinary specimens of

patients suering from proteinuria the additional protein signals greatly increase total spectral area, and scaling to a constant total intensity systematically underestimates metabolite abundances. Similarly, excessive glucose uptake, for example by a glucose infusion, leads to high total spectral intensities that are dominated by glucose and its metabolites. While the high values for these metabolites correctly reect the metabolic state of these patients, their inuence on the total spectral area leads to systematic underestimation of metabolites not related to glucose metabolism.

3

ACS Paragon Plus Environment

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 30

In general, the preferred scaling protocol depends on the specic data set to be investigated. However, for some data sets it is not possible to use the same scale for all patients in the cohort. We will describe such data sets below. In this case dierent protocols need to be used for dierent patients, introducing new challenges to data analysis. In this contribution, we rst studied how the choice of scale aects statistical analysis, the selection of biomarkers, and patients' diagnosis by these biomarkers. We report on two supervised metabolomics data analysis scenarios, namely urine and plasma biomarker discovery in 1D

1

H NMR metabolite ngerprints for the early detection of acute kidney injury

onset after cardiac surgery.

4,5

In both applications we tested metabolites for dierential

abundance using alternative scaling protocols. We observed pronounced disagreements between the lists of signicantly dierential metabolites depending on how the same data was scaled.

More importantly, the dierent scalings led to inconsistencies in the classication

of individual patients. In view of these observations, reproducibility of metabolic studies is only possible if the exact same scaling protocols are used. To overcome this problem, we extended zero-sum regression, demonstrated to be invariant under any rescaling of data,

19

18,19

which has recently been

to logistic zero-sum regres-

sion and compared it to standard methods for constructing multivariate signatures. Unlike commonly used methods, logistic zero-sum regression always identies the same biomarkers regardless of the scaling method. Consequently, prior data normalization may be omitted completely. We make logistic zero-sum regression available as an

R

package and as a high-

performance computing software that can be downloaded at https://github.com/rehbergT/zeroSum.

Materials and methods Data Sets The rst data set comprised 1D

1

H NMR ngerprints of

n = 106 urine specimens, which had

been collected from patients 24 h after cardiac surgery with cardiopulmonary bypass (CPB)

4

ACS Paragon Plus Environment

Page 5 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

use at the University Clinic of Erlangen.

4

Of these 106, 34 were diagnosed with acute kidney

injury (AKI) 48 h after surgery. The challenge in this data is to dene a urinary biomarker signature that allows for the early detection of AKI onset (Acute Kidney Injury Network (AKIN) stages 1 to 3). The second data set consisted of 85 EDTA-plasma specimens, which had been collected 24 h post-op from a subcohort of the original cohort of 106 patients undergoing cardiac surgery with CPB use and which had been subjected to 10 kDa cuto ltration. AKI.

5

5

In total, 33 patients out of these 85 patients were diagnosed with postoperative

Again our goal is to detect biomarkers for an earlier detection of AKI.

NMR Spectroscopy A total of 400

µl

of urine or EDTA-plasma ultraltrate was mixed with 200

phate buer, pH 7.4, and 50

µl

µl

of phos-

of 0.75% (w) 3-trimethylsilyl-2,2,3,3-tetradeuteropropionate

(TSP) dissolved in deuterium oxide as the internal standard (Sigma-Aldrich, Taufkirchen, Germany). NMR experiments were carried out on a 600 MHz Bruker Avance III (Bruker

1 13 31 2 BioSpin GmbH, Rheinstetten, Germany) employing a triple resonance ( H, C, P, H lock) cryogenic probe equipped with each sample, a 1D

1

z -gradients

and an automatic cooled sample changer.

For

H NMR spectrum was acquired employing a 1D nuclear Overhauser

enhancement spectroscopy (NOESY) pulse sequence with solvent signal suppression by presaturation during relaxation and mixing time following established protocols.

20,21

NMR sig-

nals were identied by comparison with reference spectra of pure compounds acquired under equal experimental conditions.

20

Data extraction The spectral region from 9.5 to

−0.5

ppm of the 1D spectra was exported as even bins of

0.001 ppm width employing Amix 3.9.13 (Bruker BioSpin), see supplementary data les 1 and 2 for the urine and plasma AKI data set, respectively. The data matrix was imported into the statistical analysis software

R

version 3.3.2. For the urinary spectra, the region 6.5

5

ACS Paragon Plus Environment

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 30

 4.5 ppm, which contains the broad urea and water signals, was excluded prior to further analysis. For plasma spectra, the region 6.2  4.6 ppm, containing the urea and remaining water signals, was removed prior to analysis. In addition, the regions 3.82  3.76 ppm, 3.68  3.52 ppm, 3.23  3.2 ppm, and 0.75  0.72 ppm, corresponding to lter residues and free EDTA, were excluded prior to classication for the plasma specimens.

Scaling and normalization We compare our scaling- and normalization-invariant approach to standard analysis strategies that were applied to data preprocessed by four state-of-the-art normalization protocols. For creatinine normalization of urinary data, we divided each bin intensity by the summed intensities of the creatinine reference region ranging from 3.055 to 3.013 ppm of the corresponding NMR spectrum. When scaling to the total spectral area we summed over all signal intensities from 9.5 to 0.5 ppm after exclusion of the water and urea signals and divided each bin intensity by this sum. For the plasma spectra, free EDTA and lter residues were still present at this point and were only removed for sample classication to ensure that these metabolites were not included in the models.

When scaling to the signal intensity of the

NMR reference compound, in our case TSP, we summed up the intensities of the TSP bins ranging from

−0.025 to 0.025 ppm and divided each bin intensity by this sum.

ization method corrects for dierences in spectrometer performance, in uid intake.

5

This normal-

but not for dierences

Finally, we also evaluated Probabilistic Quotient Normalization (PQN).

22

PQN follows the rationale that changes in concentration of one or a few metabolites aect only small segments of the spectra, whereas specimen dilution, for example due to dierences in uid intake in case of urinary specimens, inuences all spectral signals simultaneously. Following

22

22

we rst normalized each spectrum to its total spectral area. Then, we took the

median across all these spectra as the reference spectrum and calculated the ratio of all bin intensities between raw and reference spectra. The median of these ratios in a sample was our nal division factor.

6

ACS Paragon Plus Environment

Page 7 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

For subsequent analysis, only the spectral region between 9.5 and 0.5 ppm was taken into account. To compensate for slight shifts in signal positions across spectra due to small variations in sample pH, salt concentration, and/or temperature, bin intensities across ten bins were fused together in one bin of 0.01 ppm width by summing up the individual absolute values of the bin intensities. Here, absolute bin intensities were summed up to circumvent negative bin intensities, which cannot be logarithmically transformed. ties were nally

log2

These bin intensi-

transformed to account for heteroscedasticity, and all subsequent data

analysis was performed with these values.

Logistic zero-sum regression Zero-sum regression of the data.

19

18

is a novel machine-learning algorithm that is insensitive to rescaling

It allows for a selection of biomarkers that does not depend on the units

chosen. The classications of patients that result from these signatures do not depend on any scaling of the data either. In fact, patients can be classied with spectral data that were not subjected to any scaling normalization. For the reader's convenience we review the concept here in a nutshell: Let

xi = (xi1 , xi2 , . . . , xip )T

be metabolomics data, where

j ∈ {1, . . . , p}

i ∈ {1, . . . , N }

bin

in sample

and

yi

xij

(xi , yi )

with

is the logarithm of the intensity for

the corresponding clinical response of

patient i. In regression analysis the data sets need to be normalized to a common unit. Note that the data are on a logarithmic scale, therefore the scaling to a common unit becomes a shifting of the spectrum

xi

γi .

Thus for normalized data the

βj (xij + γi ) + i .

(1)

by some sample specic value

regression equation reads

yi = β0 +

p X j=1

Now note that the equation becomes independent of the normalization

7

ACS Paragon Plus Environment

γi

if and only if the

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

βj

regression coecients

Page 8 of 30

sum up to zero, i.e.,

p X

βj = 0 .

(2)

j=1

This is the idea of zero-sum regression.

In the machine-learning context of high-content

data, zero-sum regression can be combined with the least absolute shrinkage and selection operator (LASSO)

23

or elastic-net regularization and shows predictive performances that

were not compromised by the zero-sum constraint.

19

In many biomarker discovery challenges the response

y

is not continuous but binary. We

thus need to extend the concept of zero-sum regression from linear regression to classication. We do this by introducing logistic zero-sum regression. In standard logistic regression the log-likelihood of normalized data reads

1 L0 (β0 , β) = N where

x˜ij = xij + γi .

( N X



β0 +˜ xT i β

yi β0 + x˜Ti β − log e 

+1

γi

if the regression coecients

βj

ridge regression

25

α

is usually xed to a specic value in

and

nding coecients

(3)

α = 1 to the LASSO. 23

(β0 , βj )

add up to zero.

This

λPα (β) = λ(α||β||1 + (1 − α)||β||2 ), which

corresponds to the elastic-net regularization penalty. The parameter while

,

i=1

statement also holds if we add the penalizing term

24

)

In this generalized linear model the log-likelihood again becomes

independent of the normalization

validation,



[0, 1].

λ is calibrated in cross-

Here,

α=0

corresponds to

Thus, logistic zero-sum regression amounts to

that minimize

−L0 (β0 , β) + λPα (β)

subject to

p X

βj = 0 .

(4)

α = 1.

Note that Equation

j=1

For all applications throughout the article, we have chosen (4) yields the same regression weights

βj

for the data matrix

xij

as for

are arbitrary feature-wise shifts (all dierences can be absorbed into

8

ACS Paragon Plus Environment

xij + ωj ,

β0 ).

where

ωj

Thus, we learn

Page 9 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

the same linear models independently from the mean intensity of a bin. This is a feature of zero-sum regression that proved valuable in the context of cross-platform application of linear signatures.

26

Further details about the developed algorithm are given in Supplementary Section 1.

Classication algorithms and performance assessment We compared logistic zero-sum regression to three classication algorithms for normalized data, namely a Support Vector Machine univariate feature ltering (f-SVM), (LASSO) logistic regression, squares (PLS).

29

23

28

27

employing a linear kernel function combined with

the least absolute shrinkage and selection operator

and a sparse classication method based on partial least

The latter is currently one of the most popular methods for multivariate

data analysis in metabolomics. The classication performance was assessed in leave-one-out cross-validation, where in each cross-validation step the parameter estimation was performed independently.

30

Details can be found in the supplementary material.

HPC Implementation of zero-sum logistic regression We evaluated two dierent data sets, each within a leave-one-out CV, one data set with 4 and the other with 3 dierent normalizations, and therefore we had to do 107 ×4+86×3=686 independent inner CVs (10 fold inner CV). Each of these CVs was performed on an approximated regularization path of length 500, albeit, the algorithm stops when overtting occurs. In total we performed lenge.

1,790,855

model ts, which obviously presents a computational chal-

To avoid numerical uncertainties in the feature selection we used the very precise

convergence criterion of

10−8 ,

which additionally increases the computational eort.

On a standard server (2 Intel Xeon X5650 processors with 6 cores each), the complete calculation took about two days.

We were able to bring the compute time down to 12

minutes by developing an HPC implementation of

zeroSum

and executing it on 174 nodes

of the supercomputer QPACE 3 operated by the computational particle physics group.

9

ACS Paragon Plus Environment

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 30

QPACE 3 currently comprises 352 Intel Xeon Phi 7210 (Knights Landing) processors, connected by an OmniPath network. Each processor contains 64 compute cores, and each of these compute cores contains two 512-bit wide vector units. To run eciently on this machine, the

zeroSum

C code was extended to use AVX512 vector intrinsics for the calculation

of the coordinate-descent updates. OpenMP is used to parallelize the inner CV so that the data has to be stored in memory only once and can be accessed from all folds. Additionally, we provide a new

zeroSum R package, which is a wrapper around the HPC

version and can easily be used within

R. This R package also includes functions for exporting

and importing all necessary les for/from the HPC implementation. For users without access to HPC facilities, our package can be run on a regular workstation with reduced convergence precision and a shorter regularization path in less than one hour.

Results Stand-alone biomarker discovery depends on the choice of scale Metabolic biomarkers can be metabolites or just spectral features. Moreover, they can be identied as stand-alone predictors or as part of a multivariate biomarker signature. Before we focus on signatures we study the eect of scales on stand-alone biomarkers. More precisely, we focus on spectral bins with dierential intensities between two classes of samples, i.e., from patients who developed acute kidney injury after cardiac surgery versus from those that did not. To this end the urinary AKI data set is considered. Bin intensities were normalized to (a) a constant total spectral area, (b) a constant creatinine intensity, (c) a median reference spectrum employing the most probable quotient, and (d) a constant intensity of the NMR reference. For scoring normalized bin intensities as potential biomarkers we used Benjamini-Hochberg (B/H) corrected erated

t -statistics implemented in the R

package LIMMA

31

10

ACS Paragon Plus Environment

and

p-values

log 2

from the mod-

fold changes between

2

8

7

6

8

7

6

5

4

3

2

log2 FC

0.5

2

8

7

6

● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●●●● ● ● ● ●● ● ● ● ● ● ●● ● ●●● ● ● ●●● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●● ●● ● ● ● ● ● ●● ● ● ● ● ● ●●●●● ● ● ● ●● ● ●● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ●● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ●●● ●●●●● ● ● ●● ● ● ●● ● ● ● ● ●●● ● ● ● ● ● ● ● ●● ● ●● ●● ●● ● ● ● ● ●● ● ● ●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ●●● ● ●● ● ● ● ●● ● ● ● ● ●

2

8

7

6

5

6 4

9

8

7

6

4

3

2

● ●

● ● ● ●● ●● ● ● ●

● ●●

5

4

3

2

1

TSP

(h) ●

● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ●●● ● ●●● ● ●● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ●●●●● ●● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ●● ●● ●● ●● ● ● ● ●● ● ● ●● ● ● ● ● ● ●●●● ● ● ●● ●● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●●● ● ● ●● ● ● ● ● ● ●● ● ●● ● ●● ● ● ● ● ● ●● ● ●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ●● ● ●●● ● ●● ● ● ● ●● ● ● ● ● ●

●● ● ● ● ● ● ●● ● ●● ● ●● ●● ● ●● ●● ● ● ● ●● ●● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ●● ●● ●●● ● ●● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ●● ●● ●● ●● ● ● ●● ● ● ● ● ● ● ●●● ● ● ● ●● ●● ● ●●● ● ● ● ●● ● ● ●● ● ●● ●● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ●●●● ●●●● ● ● ●● ● ● ●● ● ●● ●● ● ● ● ● ● ●● ● ● ●● ● ●●●● ● ● ● ● ● ● ● ● ● ●

1





● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ●●● ● ●●● ● ●● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●● ●● ● ● ● ● ● ●● ● ● ● ● ● ●●●●● ● ● ● ●● ● ●● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ●● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ●●● ●●●●● ● ● ● ● ●● ● ● ● ● ●●● ● ●● ● ● ● ● ● ●● ● ●● ●● ●● ● ● ● ● ● ●● ● ●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ●● ● ●● ● ● ● ●● ● ● ● ● ●

● ●● ●● ● ● ● ● ● ●● ● ●● ● ●● ●● ● ●● ●● ● ● ● ●● ●● ● ● ●● ● ●● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ●● ●● ● ●● ● ● ●● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●●●●● ● ● ● ●● ● ●●● ● ● ● ●● ● ●● ● ● ●● ●● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ●●●● ●●●● ● ●● ● ● ●● ● ●● ● ●●● ● ● ● ● ● ●● ● ●●●● ●● ●● ● ● ● ● ● ● ● ● ● ● ●



−1.5

−1.5

−1.5

9

ppm

2

1





● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ●● ● ● ● ● ● ●●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ●

ppm

● ● ●●

● ●



1

3

PQN

(g)





4



ppm

● ● ● ●● ● ●● ● ● ● ● ● ●● ● ●● ● ●● ●● ● ●● ●● ●●● ● ● ●● ●● ● ● ●● ● ●● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ●● ●● ● ●● ● ● ●● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ●●● ● ●● ● ● ● ● ● ● ● ●●● ● ● ● ●● ● ●●● ● ● ● ● ●● ● ● ●● ●● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ●●●● ●●●● ● ●● ● ● ●● ● ●● ● ●●● ● ● ● ● ● ●● ● ●●●● ●● ●● ● ● ● ● ● ● ● ● ●

5

−log10 (p−value)

8 9

0

4

1



● ● ●●● ● ●● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ●● ●● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1.5

1.5 9

−0.5

1.5 0.5

log2 FC

−0.5

● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●●●● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ●●● ● ●●● ● ●● ● ● ● ● ●●● ● ● ● ●● ● ● ● ● ● ● ●●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ●● ●● ● ● ● ● ● ● ●●● ● ●● ● ●● ●● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ●●● ● ● ●● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ●●●●● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ●● ●● ●● ●● ●● ● ● ● ●● ● ●● ● ● ●● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ●●● ● ●● ● ● ● ●● ● ● ● ● ●



3

Creatinine

(f)



● ● ● ●● ● ●● ● ● ● ● ● ●● ● ●● ● ●● ●● ● ●● ●● ● ● ● ●● ●● ● ● ●● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ●● ● ● ●● ● ●●● ● ● ● ● ●● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ●● ●● ●● ●●● ● ● ●● ● ● ● ● ● ● ●●● ● ● ● ●● ●● ● ●●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ●● ● ● ● ● ●● ● ● ●● ● ● ●● ●● ●●●● ● ● ●● ● ●● ●● ●● ●● ● ● ● ● ● ●● ● ● ●● ● ●●● ●● ● ● ● ● ● ● ● ● ● ●

4



● ● ●● ●● ● ● ●● ● ● ● ●● ● ● ● ● ● ●● ●● ● ●● ●● ● ● ● ● ● ● ●● ●●●● ●● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ●● ●● ●● ● ● ● ● ●● ● ● ● ●● ● ● ●● ●●● ● ●● ● ●●● ●● ●● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ●● ● ●● ●● ●● ● ● ●● ●●● ● ●● ● ●● ● ●

ppm

Total Area

(e)

5

6

8 9

0

6

1





0.5

3



log2 FC

4



● ●● ● ● ●● ● ●● ●● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ●●●● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ●●●● ● ● ●● ● ● ●● ● ● ●● ●●●● ● ●● ● ●● ●● ● ●● ●● ● ● ● ● ● ● ● ● ●● ●●● ●● ● ● ●● ●●● ● ● ● ● ●● ● ●●●●●● ● ● ● ●●● ● ●● ●●● ●● ● ● ●● ● ● ●● ● ● ● ●● ● ● ●● ● ● ● ● ● ●● ●●● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ●● ●● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ●● ● ● ● ● ● ● ●●● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●

1.5

5

ppm

TSP

(d)



0.5

6

●● ● ● ●●● ●● ●● ● ●●● ●●● ● ● ●●● ● ● ●●● ● ● ● ●● ●●● ●● ●● ● ● ● ● ●● ● ● ●● ● ●●● ●● ● ● ● ● ●● ●●● ● ● ● ● ● ●● ●● ● ● ● ●● ●● ●● ● ● ●●● ● ●● ● ●● ● ● ●●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ●● ●● ● ● ● ● ● ●●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ●●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ●● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ●● ● ●●● ●

log2 FC

7

● ●●

−0.5

8



● ● ● ● ●● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ●● ● ●● ● ● ●● ● ●● ● ● ● ● ●● ● ●● ●● ● ● ● ●●●● ●● ● ●● ●● ● ●●●● ● ● ● ● ● ●● ● ●● ●●●● ●● ● ● ● ● ●●● ●● ●● ●● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ●●●● ● ●● ● ● ●●● ● ●● ● ● ● ● ● ●● ● ● ●● ● ● ●● ●● ● ●● ● ● ●●● ● ●● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ●● ●●●●●●● ●● ● ● ●●●● ● ● ● ● ● ●● ●● ●●● ● ● ●●● ● ● ● ●● ● ● ●●● ● ●● ● ● ●●●●● ● ● ●● ●● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ●● ●● ● ●●● ● ● ●●●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ●● ●

−log10 (p−value)

8 9

PQN

(c) ● ●

4

●● ● ●●● ● ● ● ●● ● ● ●● ● ● ●● ●● ● ●● ● ● ● ●● ● ● ● ●●● ●● ●●● ●● ● ● ● ● ● ●● ● ● ● ● ●●● ●●● ●● ● ● ● ● ● ●● ● ● ●● ● ●● ●● ●● ●● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●●● ● ● ● ● ●● ● ●●● ● ●●● ●● ●● ●● ●● ●● ● ●● ●● ● ● ●●●● ●● ●● ●● ● ●●● ●● ● ● ● ●● ●●● ● ●● ●● ●●● ● ● ●● ● ●●● ●●● ●● ●● ● ● ●● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ●● ●● ● ●● ● ● ● ●● ● ●

2



0

● ● ● ● ● ● ●● ● ● ● ● ●● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ●● ●● ● ● ● ●●● ●●● ● ● ● ● ●● ● ●● ●●● ●● ● ● ● ● ●●● ● ●● ● ●● ● ●● ● ● ● ● ● ● ●● ●● ● ● ●●● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ●● ● ● ●● ● ●● ●● ● ●● ● ● ● ●●● ● ● ●● ●● ●● ●●● ●● ● ● ●● ● ● ● ● ● ● ●● ● ● ●● ● ●●● ● ● ● ● ●● ● ● ●● ● ● ●● ● ● ●● ● ●●● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ●● ●● ●● ● ● ●●●● ● ● ●● ● ●●●● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ●●● ● ●●●● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ●●

● ●

−log10 (p−value)

8 6 4 2

−log10 (p−value)

● ●●●● ●

0

Creatinine

(b)

2

Total Area

(a)

−1.5

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

−0.5

Page 11 of 30

9

ppm

8

7

6

5

4

3

2

1

9

ppm

8

7

6

5

4

3

2

1

ppm

Figure 1: − log10 (p-values) of moderated t-test analysis comparing healthy versus diseased patients for the urinary AKI data set after preprocessing with four dierent normalization methods, i.e., scaling to (a) equal total spectral area, (b) scaling to creatinine, (c) PQN, and (d) scaling to TSP, respectively, plotted versus the ppm regions of the corresponding NMR features (upper gure). A red line marks the signicance level for Benjamini-Hochberg (B/H) adjusted

p-values

below 0.01, corresponding to a false discovery rate (FDR) below

1%. All NMR features with a B/H-adjusted

p-value

The lower gures, (e - h), show the corresponding ppm regions of the corresponding NMR features. non-AKI, thus positive

log 2

below 0.01 are represented as red dots.

log 2 fold changes (log 2 FC) versus the log 2 FCs were calculated as AKI minus

FCs correspond to higher values in AKI than in non-AKI.

the classes. Figure 1 shows plots of spectral bin positions against

− log 10 (p-values)

(a-d) and

log 2

fold changes (e-h) between AKI and non-AKI urine specimens. From left to right the plots correspond to (a,e) normalization by total spectral area, (b,f ) normalization to creatinine, (c,g) PQN, and (d,h) normalization to TSP. The red line in plots (a-d) marks a signicance level of 0.01, corresponding to a false discovery rate (FDR) below 1 %. Signicant bins are plotted in red in all eight plots. Normalization to total spectral area: On this scale we observed almost exclusively

negative fold changes indicating lower bin intensities for AKI in comparison to non-AKI. 242 features were signicant with features corresponding to carnitine, 2-oxoglutaric acid, and glutamine ranking highest. Supplementary Table S1 (a) lists raw and B/H-adjusted as well as metabolite assignments of the top ten signicant bins.

11

ACS Paragon Plus Environment

p-values

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 30

Technical artifacts of the scale become apparent too: the spectral region from 4 to 3.5 ppm, highlighted as a blue band, hardly comprised any signicant features. This region is dominated by signals from sugars such as D-mannitol, which had been used as a pre-lling material for the tubes of the CPB machine.

4

As D-mannitol exhibits a rather large number

of NMR signals, as highlighted in an exemplary urine spectrum shown in Supplementary Figure S2, it comprised between

15%

and

82%

of the total spectral areas.

All patients receive D-mannitol during surgery, and the amount of D-mannitol still found in urine 24 h past surgery is modulated by actual kidney function. Thus, D-mannitol entangles kidney function with the total spectral area. As a consequence, the spectral areas for AKI are higher than those for non-AKI. Hence, normalization to constant spectral areas leads to underestimation of metabolite concentrations predominantly in AKI patients and, consequently, many negative log-fold changes of individual metabolites. Therefore, the spectral area can not be recommended for scaling biomarkers, at least not in this context. Normalization to creatinine:

log 2

For this scale we observed dierent analysis results.

fold changes were predominantly positive indicating higher metabolite levels in AKI

patients. 204 bins were signicant, and the top ten are listed in Supplementary Table S1 (b). The biomarker ranked highest was tranexamic acid, which is given in cardiac surgery to prevent excessive blood loss. Strikingly, the region between 4 and 3.5 ppm (blue band) containing the D-mannitol signals was highly signicant on this scale. However, creatinine confounds with aspects of kidney function.

13

If renal performance

is compromised, creatinine accumulates in the blood, leading to lower creatinine levels in urinary specimens of AKI patients. Thus, normalization to creatinine leads to overestimation of metabolite concentrations predominantly in AKI patients and, consequently, many positive log-fold changes of individual metabolites. This is particularly true for metabolites actively secreted into urine. Furthermore, normalization to creatinine can also obscure other metabolite biomarkers if their excretion correlates with that of creatinine. Probabilistic Quotient Normalization: On this scale we observed both positive and

12

ACS Paragon Plus Environment

Page 13 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

negative fold changes in almost equal numbers. Only 89 features were now signicant, and the top ten are listed in Table S1 (c), with carnitine and glutamine as the leading biomarkers. Again, the blue shaded region now covers signicant features, although the number is much lower than after creatinine scaling. Due to the strong conceptual similarity between PQN and normalization to total spectral area, the D-mannitol artifact also compromises the use of PQN. TSP normalization: On this scale we did not detect any signicant biomarkers. The

scaling method corrects only for dierences in spectrometer performance, not for changes in global metabolite concentration. Due to large variability in urine density throughout the cohort, it should not be used in this context. We observed only one bin at 3.715 ppm, identied as an overlap of propofol-glucuronide, broad protein signals, and tentatively D-glucuronic acid, which obtained a signicant

p-value

on three scales (Figure 2a). The plasma data biomarker discovery was not consistent across scales either (Supplementary Table S2, Figure S1, and Figure 2b). Scaling to total spectral area and scaling to TSP predominantly identied metabolites that accumulate in the blood of patients developing AKI. One might explain this observation by a reduced glomerular ltration in these patients. However, the PQN data immediately challenged this interpretation, as it identied a large set of down-regulated metabolites in AKI. Creatinine normalization is not common for plasma metabolomics and was not investigated here.

(a)

(b) PQN

TSP

0

0

Area

0

32 188

Crea. 0

0

0

TSP

PQN 3

86

126

64

0 0

0

1

56 98

1

21

62

Area

Figure 2:

Number of signicant bins and their overlap between the dierent scaling methods

for (a) the AKI urine and (b) the AKI plasma data set.

13

ACS Paragon Plus Environment

Journal of Proteome Research

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 30

Multivariate biomarker signatures can be scale dependent Biomarkers can be combined to biomarker signatures in multivariate analysis. learning algorithms are used to learn these signatures from training data. of these algorithms, namely a linear SVM with

t-score

Machine-

We tested two

based feature ltering (f-SVM) and

standard logistic LASSO regression. The corresponding results for a third classier, sparse partial least squares discriminant analysis (sPLS-DA), can be found in the Supplement. All these methods implement linear signatures of the form

β0 +

p X

βj log2 (Γi Xij ),

(5)

j=1

where

Xij

is the raw signal from bin

j

in patient

the scaling factor used to normalize sample

i.

i, βj

the weight of this feature, and

Γi

All methods were used to learn signatures

from data normalized in four dierent ways, namely scaling by total spectral area, scaling to creatinine, PQN, and scaling to TSP. The performance of the algorithms was tested in cross-validation. Figure 3a shows ROC curves of signatures that aim to predict AKI from urinary ngerprints using f-SVM signatures. The areas under the ROC curves (AUC-ROC) are summarized in Table 1.

The higher this area the better the prediction of patient outcome.

Performances ranged from an AUC of 0.72 for TSP-scaled data to an AUC of 0.83 for PQN data. In line with the results for stand-alone biomarkers, the f-SVM algorithm picked dierent features across scaling methods. Not a single bin was chosen for all four scales (Figure 3a bottom row). Figure 3b and Table 1 show the corresponding results for standard LASSO logistic regression. The LASSO signatures performed slightly better and were more consistent. Nevertheless, there is still a dependence of the chosen biomarkers on the scale. Similar results were observed for the plasma data set (Supplementary Figure S3).

14

ACS Paragon Plus Environment

Page 15 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

Table 1: AUC-ROC values of four dierent classication approaches, f-SVM, LASSO logistic regression, sPLS-DA, and zero-sum regression after application of four dierent normalization methods, i.e., scaling to total spectral area, scaling to creatinine, Probabilistic Quotient Normalization (PQN), and scaling to TSP, for the (a) urinary AKI and (b) plasma AKI data set.

Normalization method Tot. spec. area Creatinine PQN TSP Normalization method Tot. spec. area PQN TSP

(a) Urinary AKI data set f-SVM LASSO sPLS-DA logistic regr.

Zero-sum logistic regr.

(b) Plasma AKI data set f-SVM LASSO sPLS-DA logistic regr.

Zero-sum logistic regr.

0.80 0.77 0.83 0.72

0.85 0.84 0.87

0.79 0.81 0.85 0.81

0.84 0.86 0.84

0.75 0.73 0.81 0.79

0.84 0.82 0.83

0.83 0.83 0.83 0.83

0.89 0.89 0.89

For standard methods prediction of patient outcome is scale dependent From a clinical perspective, varying biomarkers constitute only a minor problem, as long as they agree in their predictions of outcome.

However, they usually do not agree.

For the

urinary data set, LASSO signatures yielded conicting predictions in 16% of patients. For f-SVM signatures, the percentage increased to 30%. Figure 4 summarizes the predictions of AKI onset after cardiac surgery patient by patient for the urinary data set.

The row patient outcome shows patients in blue who did not

develop AKI (AKIN stage 0) and in red those that developed AKI. Furthermore, the yellow dashed lines highlight patients with AKIN stage 2 and 3, a more severe manifestation of kidney injury.

The latter patients constitute the high-risk group where early detection of

AKI onset can save lives.

32

The true outcomes 48 h after surgery are contrasted to the

predictions 24 h earlier. Shown are predictions of f-SVM and LASSO signatures for the four scaling methods. We observed that predictions frequently changed with scale and that signatures learned on data that was scaled to, e.g., creatinine did not properly identify the highest risk group. For f-SVM only the signature on PQN data identied all high-risk patients correctly. The

15

ACS Paragon Plus Environment

Journal of Proteome Research

(a)

(b)

0.6 0.4

True positive rate

0.8

1.0

LASSO

0.0

0.2

0.8 0.6 0.4 0.2

True positive rate

1.0

f−SVM

0.0

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 30

0.0

0.2

0.4

0.6

0.8

1.0

0.0

0.2

False positive rate

PQN Area 33

TSP

1

39 0 2 6 30 27 8

0 0

0

0.4

0.6

0.8

1.0

False positive rate

PQN Crea.

Area

TSP

7

35

10

5

7

0 7

40

2

1

0

Crea.

4 1

1

0

7 2

3

Figure 3: The urinary AKI data set: Receiver operating characteristic (ROC) curves for two classication approaches, (a) SVM in combination with

t-test

based feature ltering

and (b) LASSO, after application of four dierent normalization strategies: scaling to total spectral area (red solid line), scaling to creatinine (blue dashed line), Probabilistic Quotient Normalization (PQN) (green dotted line), and scaling to TSP (yellow dashed-dotted line). The bottom row shows the number of features included in the respective classication models in Venn diagrams. The corresponding models were built by averaging over all models of the outer CV loop.

LASSO identied high-risk patients for total spectral area and PQN data correctly. We discuss some misclassications in more detail. Patients AKI-35 and AKI-106 developed severe AKI after 48 h but were not predicted to do so by two of the f-SVM signatures, namely those for creatinine and total spectral area scaled data. A key observation of this paper is that these misclassications could be traced back to data normalization. In theory, the predictions could have been saved by readjusting the scaling of these ngerprints (however, the necessary scaling factor is not evident probabilities of these signatures on the

y -axis.

an onset of AKI, otherwise we did not. The

a priori ).

Figure 5 shows the prediction

If this probability was above 0.5 we predicted

x-axis

shows possible multiplicative scale ad-

16

ACS Paragon Plus Environment

Page 17 of 30

zero−sum (TSP) zero−sum (PQN) zero−sum (Crea.) zero−sum (area) LASSO (TSP) LASSO (PQN) LASSO (Crea.) LASSO (area) f−SVM (TSP) f−SVM (PQN) f−SVM (Crea.) f−SVM (area) patient outcome AKI−1 AKI−3 AKI−7 AKI−9 AKI−10 AKI−11 AKI−12 AKI−13 AKI−14 AKI−16 AKI−19 AKI−20 AKI−21 AKI−22 AKI−23 AKI−24 AKI−26 AKI−27 AKI−30 AKI−31 AKI−32 AKI−33 AKI−34 AKI−37 AKI−39 AKI−40 AKI−41 AKI−42 AKI−43 AKI−44 AKI−50 AKI−51 AKI−52 AKI−53 AKI−55 AKI−56 AKI−57 AKI−58 AKI−59 AKI−60 AKI−61 AKI−63 AKI−64 AKI−66 AKI−67 AKI−68 AKI−69 AKI−72 AKI−73 AKI−77 AKI−78 AKI−79 AKI−81 AKI−84 AKI−85 AKI−86 AKI−87 AKI−88 AKI−91 AKI−93 AKI−95 AKI−96 AKI−97 AKI−100 AKI−101 AKI−102 AKI−104 AKI−105 AKI−107 AKI−108 AKI−109 AKI−110 AKI−2 AKI−6 AKI−8 AKI−15 AKI−17 AKI−18 AKI−28 AKI−29 AKI−45 AKI−46 AKI−48 AKI−49 AKI−62 AKI−65 AKI−71 AKI−74 AKI−75 AKI−76 AKI−80 AKI−82 AKI−89 AKI−90 AKI−92 AKI−98 AKI−99 AKI−103 AKI−36 AKI−38 AKI−106 AKI−4 AKI−5 AKI−35 AKI−70 AKI−94

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Proteome Research

Figure 4: Classication results patient by patient for the urinary AKI data set: The row patient outcome shows patients that did not develop AKI in blue, patients that developed AKI in red, whereas patients that developed severe AKI (AKIN stage 2 and 3) are further highlighted by the yellow dashed region. Above we give predictions for the onset of AKI for f-SVM, LASSO, and zero-sum, using normalization strategies as indicated in brackets. AKI predictions are shown in red (AKIN stages 1 to 3), while patients predicted as non-AKI are shown in blue. Patients discussed in the text are indicated by green sample names.

justments

Γ.

A value of

Γ = 1 indicates the actual scale used for prediction.

correspond to a down-scaling of all bins by the factor to an up-scaling by

Γ.

Γ,

while values of

Values of

Γ>1

Γ