iTop-Q: an Intelligent Tool for Top-down Proteomics Quantitation

Nov 22, 2017 - The performance evaluations on an in-house standard data set and a public large-scale yeast lysate data set show that iTop-Q achieves h...

Article

iTop-Q: an intelligent tool for top-down proteomics quantitation using DYAMOND algorithm
Hui-Yin Chang, Ching-Tai Chen, Chu-Ling Ko, Yi-Ju Chen, Yu-Ju Chen, Wen-Lian Hsu, Chiun-Gung Juo, and Ting-Yi Sung
Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.7b02343 • Publication Date (Web): 22 Nov 2017
Downloaded from http://pubs.acs.org on November 23, 2017


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

iTop-Q: an intelligent tool for top-down proteomics quantitation using DYAMOND algorithm
Hui-Yin Chang1,&, Ching-Tai Chen1,&, Chu-Ling Ko2, Yi-Ju Chen3, Yu-Ju Chen3, Wen-Lian Hsu1, Chiun-Gung Juo4,5,*, Ting-Yi Sung1,*



1. Institute of Information Science, Academia Sinica, Taipei 115, Taiwan
2. Department of Computer Science, National Chiao Tung University, Hsinchu 300, Taiwan
3. Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan
4. Molecular Medicine Research Center, Chang Gung University, Taoyuan 333, Taiwan
5. PharmaEssentia Corp., Taipei 115, Taiwan

ABSTRACT: Top-down proteomics using liquid chromatography coupled with mass spectrometry has been increasingly applied to analyze intact proteins in order to study genetic variation, alternative splicing, and post-translational modifications (PTMs) of proteins (proteoforms). However, only a few tools have been developed for charge state deconvolution, monoisotopic/average molecular weight determination and quantitation of proteoforms from LC-MS1 spectra. Although Decon2LS and MASH Suite Pro provide intra-spectrum charge state deconvolution and quantitation, manual processing is still required to quantify proteoforms across multiple MS1 spectra. An automated tool for inter-spectrum quantitation is therefore a pressing need. In this paper, we present a user-friendly tool, called iTop-Q (intelligent Top-down Proteomics Quantitation), that automatically performs large-scale proteoform quantitation based on inter-spectrum abundance in top-down proteomics. Instead of using a single spectrum for proteoform quantitation, iTop-Q constructs extracted ion chromatograms (XICs) of possible proteoform peaks across adjacent MS1 spectra to calculate abundances for accurate quantitation. Notably, iTop-Q implements a newly proposed algorithm, called DYAMOND, which uses dynamic programming for charge state deconvolution. In addition, iTop-Q performs proteoform alignment to support quantitation analysis across replicates/samples. The performance evaluations on an in-house standard data set and a public large-scale yeast lysate data set show that iTop-Q achieves highly accurate quantitation and more consistent quantitation than intra-spectrum quantitation. Furthermore, the DYAMOND algorithm is suitable for high charge state deconvolution and can distinguish shared peaks in co-eluting proteoforms. iTop-Q is publicly available for download at http://ms.iis.sinica.edu.tw/COmics/Software_iTop-Q.

Liquid chromatography (LC) coupled with mass spectrometry (MS) or tandem mass spectrometry (MS2) has become a predominant platform for proteomics research because of its high sensitivity, increasing resolution and high processing speed1-4. Bottom-up and top-down proteomics are two complementary approaches in the field5,6. In bottom-up proteomics, proteins are digested into peptides using proteases, and the peptides are then separated by LC and analyzed by MS and MS/MS7,8. Top-down proteomics, which omits proteolytic digestion, uses intact protein masses for proteomics analyses, providing an opportunity to characterize and identify post-translational modifications (PTMs) on the proteins. In top-down proteomics, intact proteins are separated by LC prior to MS. With the assistance of heat, nebulizing gas and high voltage, the separated intact proteins are desorbed as multiply charged protein ions for MS detection9,10. Since a protein usually elutes as a chromatographic peak over a retention time interval, the charged protein ions are recorded in several consecutive MS1 spectra, and intense protein ions are subjected to MS2 analysis to obtain fragment information for identifying proteoforms along with their PTMs11.

Quantitation is an important task in proteomics because it enables comparative studies of proteins between different disease or health states for biomarker discovery11-13. Several strategies have been proposed for intact protein quantitation. For example, Du et al. used 14N/15N metabolic labeling to measure expression ratios of intact proteins by top-down mass spectrometry14. Bergmann et al. combined bottom-up and top-down approaches with a MeCAT labeling strategy for the absolute quantitation of proteolytic peptides and intact proteins from a complex biological system15. Nevertheless, labeling strategies have limited applicability to intact protein quantitation because the labeling efficiency decreases as molecular mass increases11,13,16. The label-free strategy, on the other hand, has drawn much attention for the relative quantitation of intact proteins because of its relatively easy sample preparation, the absence of expensive labeling reagents, and its applicability to primary human samples13. For example, Ntai et al. presented an integrated platform for identification and label-free quantitation analyses of proteoforms and applied it to measure the abundance fold


change caused by deleting a histone deacetylase in S. cerevisiae17. In general, there are two different methodologies in the label-free strategy. The first is intra-spectrum quantitation, which performs relative quantitation of modified and unmodified proteoforms present within the same spectrum11,13,18. For example, Pesavento et al. demonstrated that ionization efficiency is a major issue by measuring the intensity ratios of histone H4 proteoforms and their fragment ions in single MS2 spectra19. The other methodology is the construction of extracted ion chromatograms (XICs) from MS1 spectra. Castagnola et al. revealed hypo-phosphorylation as a defective event in more than 60% of autistic spectrum disorder patients using extracted ion abundances of intact proteins of interest in human saliva20. Recently, Wu et al. quantitatively profiled 83 proteoforms of 20 identified proteins in human parotid and submandibular gland secretions by using an accurate mass and time tag database for identified proteoforms and generating XICs from the raw data accordingly21. Several tools have been proposed for intact protein quantitation. Decon2LS22 and MASH Suite Pro23 are two public tools for intra-spectrum quantitation. ProSightPC™24-26 (Thermo Scientific™), BioPharma Finder™ (Thermo Scientific™), and MassHunter BioConfirm™ (Agilent) are commercial tools that also perform intact protein quantitation. Nevertheless, the cross-spectra label-free strategy for proteoform quantitation is more challenging because it additionally requires automated construction of XICs. Recently, ProMex, included in MSPathFinder27, has become publicly available; it clusters isotopic envelopes, constructs XICs to determine the elution time span for refinement, and uses theoretical isotopic envelopes to score the likelihood of detected proteoform features.
In this study, we present a fully automated tool, called iTop-Q (intelligent Top-down Proteomics Quantitation), to construct XICs across multiple MS1 spectra for proteoform quantitation. Since most proposed charge state deconvolution algorithms, such as MaxEnt28, THRASH29 (implemented in Decon2LS and MASH Suite Pro), MS-Deconv30,31, and UniDec32, are mainly aimed at intra-spectrum deconvolution, we propose a new deconvolution algorithm for iTop-Q, called DYAMOND (DYnamic progrAMming ON charge state Deconvolution), for the deconvolution of the constructed XICs. In iTop-Q, the constructed XICs are clustered, and those passing our quality criteria are called putative proteoform envelopes, corresponding to putative proteoforms. With the DYAMOND algorithm, the monoisotopic and average masses of detected putative proteoforms are calculated and reported. Moreover, iTop-Q aligns the detected putative proteoforms across different replicates/samples for direct abundance comparison.

EXPERIMENTAL SECTION

Standard Protein Data Set.

Chemicals. Cytochrome c (Cyt c) standard protein (theoretical protein average mass: 12361.96 Da, obtained from ProteoMass™) and all chemicals and solvents were purchased from Sigma Aldrich (St. Louis, MO, USA). The chemicals were all of analytical grade. Water and acetonitrile were of CHROMASOLV grade. The protein sample was dissolved in 10% acetonitrile to form a solution of 1 mg/mL.


Instrument. A UPLC system (Waters, Milford, MA, USA) equipped with a C4 reversed-phase column (2.1 × 100 mm, 1.8 µm, BEH 300; Waters, Milford, MA, USA) was coupled with an LTQ-Orbitrap XL MS (Thermo Scientific, San Jose, CA, USA) with an orthogonal electrospray ionization (ESI) source. For liquid chromatography, the initial flow rate was 0.1 mL/min with 98% solvent A (0.1% formic acid and 0.01% trifluoroacetic acid) and 2% solvent B (acetonitrile with 0.1% formic acid and 0.01% trifluoroacetic acid). A volume of 5 µL of sample was injected. After injection, solvent B was maintained at 2% for 10 minutes, increased to 40% over 40 minutes, maintained at 40% for 5 minutes, and then increased to 98% over 5 minutes, after which this composition was held for 12 minutes. Finally, solvent B was reduced to 2% over 8 minutes and held at this percentage for 5 minutes. For mass spectrometry, full scan acquisition was performed in profile mode with a preset resolution of 60,000.

Public Large-scale Yeast Lysate Data Set

A public yeast lysate data set of 7 fractions acquired by LC-UVPD-MS/MS (LC: Bruker-Michrom, Auburn, CA; MS/MS: Thermo Scientific Orbitrap Elite mass spectrometer, Bremen, Germany), with three or four technical replicates in each fraction, was downloaded33. A total of 292 proteoforms corresponding to 215 proteins were identified from these 7 fractions using ProSightPC™ 3.0. Data processing for both data sets is described in detail in Supporting Information, Section I.

METHODS

Intelligent Algorithms for Top-down Proteomic Quantitation. iTop-Q accepts input files in mzXML and mzML formats, which can be conveniently converted from raw data by existing converters. It accepts LC-MS data (possibly also containing MS2) acquired in profile or centroid mode. Since iTop-Q focuses on quantifying intact proteins, it extracts and processes only MS1 spectra. The general workflow of iTop-Q is shown in Figure 1.
Preprocessing of MS1 data

In order to reduce data complexity, iTop-Q first preprocesses each MS1 spectrum. Signal centroiding, noise removal, selection of representative isotopic signals in each spectrum and construction of XICs across MS1 spectra are described in detail in Supporting Information, Section II. In the following paragraphs, the term "peaks" refers to XICs.

The DYAMOND algorithm for charge state deconvolution

Grouping peaks based on retention time. Since peaks of a protein theoretically elute at similar retention times, iTop-Q first groups peaks based on their retention time. Starting from the most intense peak, say pi, among the detected peaks, iTop-Q groups the peaks whose apex retention time lies in the range of t1-∆ to t2+∆, where ∆ is the retention time tolerance (2 seconds by default), and t1 and t2 are the starting and ending retention times of pi, respectively.
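The retention-time grouping step can be illustrated with a minimal Python sketch. The `Peak` record, its field names and the function `group_by_retention_time` are hypothetical and are not part of iTop-Q; repeating the seeding on the peaks left over after each group is also an assumption, since the text only describes the first seed.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    """Hypothetical XIC record; field names are illustrative only."""
    mz: float         # m/z of the XIC
    rt_start: float   # starting retention time t1 (seconds)
    rt_apex: float    # apex retention time (seconds)
    rt_end: float     # ending retention time t2 (seconds)
    intensity: float  # apex intensity


def group_by_retention_time(peaks, delta=2.0):
    """Seed with the most intense remaining peak and collect every peak
    whose apex lies in [t1 - delta, t2 + delta] of the seed."""
    remaining = sorted(peaks, key=lambda p: p.intensity, reverse=True)
    groups = []
    while remaining:
        seed = remaining[0]
        lo, hi = seed.rt_start - delta, seed.rt_end + delta
        group = [p for p in remaining if lo <= p.rt_apex <= hi]
        remaining = [p for p in remaining if not (lo <= p.rt_apex <= hi)]
        # Sort each group by increasing m/z, as required by the next step.
        groups.append(sorted(group, key=lambda p: p.mz))
    return groups
```

Each returned group corresponds to one peak list P, sorted by m/z, that would be passed on to charge state deconvolution.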


$$ F(i, j) = \max \begin{cases} F(i-1, j-1) + M(p_i, z_j) \\ F(i-1, j) - d_1 \\ F(i, j-1) - d_2 \end{cases} $$

Equation 3

$$ M(p_i, z_j) = \begin{cases} +2n, & \text{if } z_j \in l_i \\ -2n, & \text{if } z_j \notin l_i \end{cases} $$

Figure 1. The general workflow of iTop-Q. After data input, iTop-Q first performs putative proteoform detection on each individual run using DYAMOND. Then, it aligns detected putative proteoforms across different samples/replicates and generates a summary table of protein abundances with sample/replicate names in columns and detected proteoforms in rows.

Calculating possible charge states. Let P = {p1, p2, …, pi, …, pn} be a list of grouped peaks sorted by increasing m/z, where n is the number of peaks. Assuming that most peaks in P correspond to a single protein, Prot, with mass M and consecutive charge states (e.g., 12+, 13+, 14+), we compute the possible charge states between any two peaks in P. Specifically, assuming that two arbitrary peaks, pi and pj, have consecutive charge states zi and zi−1, we solve the simultaneous equations

$$ mz_i = \frac{M + z_i \times H}{z_i} \quad \text{and} \quad mz_j = \frac{M + (z_i - 1) \times H}{z_i - 1} $$

Equation 1.

to obtain

$$ z_i = \frac{mz_j - H}{mz_j - mz_i} $$

Equation 2.

where mzi and mzj are the m/z values of pi and pj (mzj > mzi), and H is the mass of a proton. For each pi in the peak list P, by applying Equation 2 to pi paired with any peak pj in P, iTop-Q generates a list of possible charge states, li, for peak pi, 1 ≤ i ≤ n−1. To reduce the size of li, we check for each charge state candidate, z, whether there is an isotopic peak on both the right-hand and left-hand sides of the most intense isotopic peak of pi, with m/z intervals of 1/z among the three isotopic peaks. If so, z is retained as a possible charge state; otherwise, z is removed from li.

Determining charge states using dynamic programming. Let zmin and zmax be the minimum and maximum charges, respectively, of all peaks in P, and let Z be the list of consecutive integers in decreasing order from zmax to zmin. We use dynamic programming to assign charge states by optimizing the scoring function F(i, j) given in Equation 3 for i = 1, 2, …, n and j = 1, 2, …, zmax − zmin + 1.
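As an illustration of Equation 2 and the isotopic check, the following Python sketch enumerates candidate charge states for a sorted list of m/z values. The function names, the integer-rounding tolerance `tol`, the upper bound `z_max` and the m/z matching tolerance `mz_tol` are assumptions made for this example; the text does not state how non-integer solutions of Equation 2 are handled.

```python
PROTON = 1.007276  # mass of a proton in Da (H in Equations 1 and 2)


def charge_from_pair(mz_i, mz_j, proton=PROTON):
    """Equation 2: charge of the lower-m/z peak p_i, assuming the
    higher-m/z peak p_j carries exactly one charge fewer."""
    return (mz_j - proton) / (mz_j - mz_i)


def candidate_charges(mzs, tol=0.05, z_max=100):
    """Return one candidate set l_i per peak for m/z values sorted in
    increasing order. Only near-integer solutions are kept (assumption)."""
    candidates = []
    for i, mz_i in enumerate(mzs):
        l_i = set()
        for mz_j in mzs[i + 1:]:
            z = charge_from_pair(mz_i, mz_j)
            z_int = round(z)
            if 2 <= z_int <= z_max and abs(z - z_int) <= tol:
                l_i.add(z_int)
        candidates.append(l_i)
    return candidates


def has_isotope_neighbours(z, mz_top, isotope_mzs, mz_tol=0.01):
    """Keep z only if isotopic signals exist 1/z to the left and to the
    right of the most intense isotopic signal at mz_top."""
    spacing = 1.0 / z
    left = any(abs(m - (mz_top - spacing)) <= mz_tol for m in isotope_mzs)
    right = any(abs(m - (mz_top + spacing)) <= mz_tol for m in isotope_mzs)
    return left and right
```

Note that pairing non-adjacent charge states can produce spurious integer candidates (for a 12,000 Da protein, the 12+ and 10+ peaks yield a candidate of 6), which is why the isotopic check and the subsequent dynamic programming step are needed.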

In Equation 3, d1 and d2 are the penalties for assigning a possible charge state to a gap in the peak list and for assigning a gap in the charge state (i.e., no charge state) to a peak, respectively; we set d1 = d2 = 1. Initially, F(i, 0) = F(0, j) = 0 for all i, j. Dynamic programming fills a score table, on which a backtracking procedure is applied to find the optimized charge state assignment. The list of peaks with optimized assigned charge states, denoted by Pc, defines a putative proteoform of an intact protein Prot. When more than one path achieves the maximum score, DYAMOND calculates the standard deviation of the protein masses determined by the peaks in each path and selects the path with the minimum standard deviation as the optimized path for charge state assignment. Furthermore, if there are discontinuous charge states in the defined proteoform (e.g., the proteoform is assigned the charge states 13+, 14+, 16+ and 18+, and the charge states 15+ and 17+ are missing), a post-processing procedure re-searches all peaks (with or without charge states) in the retention time range and calculates the mass of each peak using its m/z value and the missing charge states. If the difference between the calculated mass and M is within a mass tolerance, the peak is also regarded as part of the defined proteoform, and it is reported as a shared peak if it has already been assigned to another previously determined proteoform. Finally, iTop-Q validates the quality of Pc by the number of peaks and the continuity of the charge states. If Pc contains at least 3 peaks and includes at least 2 continuous charge states, Pc is regarded as qualified. Otherwise, Pc is regarded as unqualified and the peaks of Pc are put back into the original peak list for another round of charge state deconvolution.
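The recurrence of Equation 3 can be sketched in Python as follows. This is a simplified illustration, not the iTop-Q implementation: the function name `assign_charges` is hypothetical, backtracking is assumed to start from the last cell F(n, zmax − zmin + 1), ties are broken in favour of the diagonal move rather than by the standard deviation of masses, and the candidate lists li are passed in directly.

```python
def assign_charges(candidates, d1=1, d2=1):
    """Align peaks (sorted by increasing m/z) to the decreasing charge
    list Z using the recurrence of Equation 3.

    candidates[i] is the set l_i of possible charges for peak i.
    Returns one assigned charge (or None) per peak.
    """
    n = len(candidates)
    all_z = set().union(*candidates)
    if not all_z:
        return [None] * n
    z_list = list(range(max(all_z), min(all_z) - 1, -1))  # zmax .. zmin
    m = len(z_list)

    def match(i, j):
        # M(p_i, z_j): +2n if z_j is a candidate of p_i, otherwise -2n.
        return 2 * n if z_list[j - 1] in candidates[i - 1] else -2 * n

    # Score table with F(i, 0) = F(0, j) = 0.
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + match(i, j),
                          F[i - 1][j] - d1,
                          F[i][j - 1] - d2)

    # Backtrack from F(n, m) (assumption), preferring the diagonal move.
    assigned = [None] * n
    i, j = n, m
    while i > 0 and j > 0:
        if F[i][j] == F[i - 1][j - 1] + match(i, j):
            if match(i, j) > 0:
                assigned[i - 1] = z_list[j - 1]
            i, j = i - 1, j - 1
        elif F[i][j] == F[i - 1][j] - d1:
            i -= 1
        else:
            j -= 1
    return assigned
```

For instance, with hypothetical candidate sets `[{13}, {12, 6}, {11}, {10}]`, the spurious candidate 6 is discarded because choosing it would break the run of consecutive charge states and lower the total score.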
In addition to validating the quality of putative proteoforms, iTop-Q also validates the quality of the charge states in each putative proteoform using isotopic signals. Specifically, for each charge state in a putative proteoform, we examine whether there are at least 3 isotopic signals with m/z intervals of 1/z, where z is the charge state. If so, the charge state is regarded as qualified and the peak is considered validated. Otherwise, the charge state is regarded as unqualified and the peak is considered invalidated (marked in green in the user interface of iTop-Q).

Calculating the masses and abundances of a putative proteoform. Since Pc is composed of multiple peaks, determining its monoisotopic/average mass and abundance is an important task. We calculate the protein average mass from the m/z and charge state of the most intense peak in Pc and apply the averagine model34 to compute the protein monoisotopic mass of Pc. For protein abundance, based on our analyses of six different abundance calculation methods, we use the abundance of the most intense peak in Pc as the representative since it gives the most consistent quantitation performance. After putative proteoform detection, each input file has its corresponding putative proteoform list.
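A small Python sketch of the qualification rule for Pc and of the mass obtained from one peak is given below. The function names are hypothetical; "at least 2 continuous charge states" is interpreted here as a run of at least two consecutive integers, and the mass formula is simply Equation 1 rearranged for M, which may differ from how iTop-Q derives the reported average mass.

```python
PROTON = 1.007276  # mass of a proton in Da (H in Equation 1)


def is_qualified(charges, min_peaks=3, min_run=2):
    """Pc qualifies if it has at least `min_peaks` peaks and its charge
    states contain a run of at least `min_run` consecutive integers."""
    if len(charges) < min_peaks:
        return False
    zs = sorted(set(charges))
    run = best = 1
    for a, b in zip(zs, zs[1:]):
        run = run + 1 if b == a + 1 else 1
        best = max(best, run)
    return best >= min_run


def neutral_mass(mz, z, proton=PROTON):
    """Equation 1 solved for M: M = z * (mz - H)."""
    return z * (mz - proton)
```

With the example from the text, `is_qualified([13, 14, 16, 18])` is true because 13+ and 14+ are consecutive, whereas a list such as `[13, 15, 17]` would be sent back for another round of deconvolution.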


Figure 2. Comparison of iTop-Q, ProMex, and Decon2LS on the standard protein Cytochrome c (Cyt c) in terms of calculated charge states and protein monoisotopic mass, where “Total” denotes the total number of charge states detected, and “Distinct” denotes the number of distinct charge states. (A) The number of charge states and monoisotopic masses of Cyt c calculated by iTop-Q, where PPE1 and PPE2 are two detected putative proteoforms. (B) The number of charge states and monoisotopic masses of Cyt c calculated by ProMex. PPE1 and PPE2 are displayed together because ProMex reports multiple proteoform features corresponding to Cyt c, each containing peaks eluted within the entire retention time range of PPE1 and PPE2, without distinguishing PPE1 from PPE2. Since the representative peak of one detected feature of Cyt c is assigned an incorrect charge state, the standard deviation of the calculated monoisotopic mass in the third replicate is larger than those in the other two replicates; it becomes much smaller after removing this feature. (C) The number of charge states and monoisotopic masses calculated by Decon2LS, where three MS1 spectra, each with the most intense signals of PPE1 and PPE2, are selected from each replicate as representatives.

Aligning Putative Proteoforms across Replicates/Samples

In a label-free top-down LC-MS experiment, proteins with almost the same mass that elute at close retention times in any two runs are commonly regarded as identical proteoforms. iTop-Q therefore groups putative proteoforms across runs based on their masses and retention times. To compensate for possible retention time shifts of the same proteoform in different runs, iTop-Q performs a retention time adjustment procedure prior to putative proteoform grouping. iTop-Q first selects the run with the largest number of detected proteoforms as the reference and aligns the proteoform list of each of the other runs pairwise with that of the reference. The putative proteoforms commonly detected in both runs, called landmarks, are used to model the retention time shift distribution between the reference and the other run. Two proteoforms are considered commonly detected in both runs if they satisfy the following conditions: (1) their masses differ by no more than a user-defined mass tolerance; (2) they have at least two common peaks, i.e., peaks with close m/z and the same charge state. With a list of landmarks, we use the LOESS regression algorithm35,36 with a span of 20% and a weight of 1 to construct a retention time drift model, and adjust all putative proteoforms in the aligned run accordingly. After retention time adjustment, putative proteoforms in the other run are aligned with those in the reference if two proteoforms have a protein mass difference less than a mass threshold and an adjusted retention time difference within a given retention time tolerance.

RESULTS AND DISCUSSION

Performance Evaluation by a Standard Protein Data Set

We first used a standard protein data set with three technical replicates to evaluate the performance of iTop-Q. The standard intact protein Cytochrome c (Cyt c) was injected at the same concentration in three technical replicates. Because of the LC gradient separation, the intact protein forms two proteoform envelopes in each technical replicate (as shown in Supporting Information Figure S1).

Evaluation on charge state deconvolution. Using iTop-Q, two putative proteoform envelopes, called PPE1 and PPE2, with assigned charge states and close retention times were detected in the three replicates (Supporting Information Figure S2). The m/z, retention time, charge states, intensity, and S/N values of the peaks of PPE1 and PPE2 in the three replicates are listed in Supporting Information Tables S1 and S2, respectively. To evaluate the performance of iTop-Q, we also processed this data set with ProMex27 and Decon2LS22. Similar to iTop-Q, ProMex constructs XICs across MS1 spectra. Decon2LS determines the charge states of signals in each MS1 spectrum using the THRASH algorithm29 and reports the abundances of detected signals accordingly. The number of peaks, proteoform features or signals belonging to Cyt c


detected by iTop-Q, ProMex and Decon2LS in each replicate are listed in Supporting Information Table S3. Figure 2 shows the number of charge states and monoisotopic masses of Cyt c in the three replicates calculated by iTop-Q, ProMex, and Decon2LS. Since Decon2LS detects the signals of Cyt c in several neighboring MS1 spectra depending on the elution of Cyt c, we only show the MS1 spectra with the most intense signals of PPE1 and PPE2 in Figure 2(C). All the MS1 spectra with the signals of PPE1 and PPE2 detected by Decon2LS are shown in Supporting Information Figures S3-5. As shown in Figure 2(A), the total number of detected charge states is equal to the number of distinct charge states, meaning that there are no redundant charge states in iTop-Q. In addition, we noticed that ProMex did not distinguish PPE1 and PPE2, i.e., the proteoform features contain peaks in the entire retention time range of PPE1 and PPE2 as shown in its output results (Supporting Information Table S4), whereas iTop-Q detected both proteoform envelopes even though they eluted at close retention times. For protein monoisotopic mass calculation, iTop-Q, ProMex, and Decon2LS report masses of 12350.04±0.4, 12408.83±99.8, and 12351.8±0.9, respectively; the relatively large standard deviation of ProMex was caused by a detected peak that was assigned an incorrect charge state (as shown in Figure 2 and Supporting Information Table S4). After removing the incorrect charge state, the standard deviation of the monoisotopic mass calculated by ProMex becomes much smaller (12352.71±5.9). For protein average mass calculation, both iTop-Q and Decon2LS have a mass error smaller than 5 Da compared with the theoretical average mass obtained from ProteoMass™, while ProMex is not compared since it does not provide protein average mass information. iTop-Q has a smaller standard deviation (calculated protein average mass: 12,358.46±0.35) than Decon2LS (calculated protein average mass: 12,359.61±1.11).
The detailed average mass comparison of iTop-Q and Decon2LS is shown in Supporting Information Figure S6. Finally, iTop-Q took 3.28 minutes on average to process each replicate, whereas ProMex and Decon2LS took 66.33 and 76.48 minutes on average, respectively.

Evaluation on abundance calculation by six different methods. Selecting a proper method for accurately quantifying a proteoform is important. We considered six different methods to calculate the abundance of a proteoform: using the most abundant peak area (M1), using the apex intensity of the most intense peak (M2), summing the areas of the top three intense peaks (M3), summing the apex intensities of the top three intense peaks (M4), summing the areas of all peaks (M5), and summing the apex intensities of all peaks (M6). Note that here peaks refer to the XICs for iTop-Q, proteoform features for ProMex (for which both apex intensity and area-based abundance are provided), and signals for Decon2LS. To evaluate these methods, we calculated proteoform abundance ratios between any two replicates, defined as the abundance in replicate i measured by a specific method divided by the abundance in replicate j, 1 ≤ i, j ≤ 3 and i ≠ j, which are expected to be close to 1. Since Decon2LS reports signal abundances in individual spectra, different abundances of PPE1 and PPE2 could be reported from different MS1 spectra of each replicate. We thus calculated the abundance ratios among all of the MS1 spectra containing signals of PPE1 or PPE2 of the three replicates using the methods M2, M4, and M6. Note that M1, M3, and M5 cannot be applied to Decon2LS since it does not provide integrated signal abundance across spectra. All the abundance ratios of Decon2LS are listed in Supporting Information Tables S5-7.
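The six abundance definitions and the pairwise replicate ratios can be written compactly in Python. This is an illustrative sketch only: the function names are hypothetical, each peak is represented as an `(area, apex_intensity)` tuple, and ranking the "top three intense peaks" by apex intensity is an assumption.

```python
from itertools import permutations


def abundance(peaks, method):
    """Abundance of one proteoform from its peaks.

    peaks:  list of (area, apex_intensity) tuples, one per XIC.
    method: one of "M1" ... "M6" as defined in the text.
    """
    # Rank peaks by apex intensity, most intense first (assumption).
    ranked = sorted(peaks, key=lambda p: p[1], reverse=True)
    areas = [area for area, _ in ranked]
    apexes = [apex for _, apex in ranked]
    if method == "M1":
        return max(areas)       # most abundant peak area
    if method == "M2":
        return apexes[0]        # apex intensity of the most intense peak
    if method == "M3":
        return sum(areas[:3])   # areas of the top three peaks
    if method == "M4":
        return sum(apexes[:3])  # apex intensities of the top three peaks
    if method == "M5":
        return sum(areas)       # areas of all peaks
    if method == "M6":
        return sum(apexes)      # apex intensities of all peaks
    raise ValueError(f"unknown method: {method}")


def replicate_ratios(abundances):
    """Ratios abundance[i] / abundance[j] for all ordered pairs i != j."""
    idx = range(len(abundances))
    return {(i, j): abundances[i] / abundances[j]
            for i, j in permutations(idx, 2)}
```

For three technical replicates this yields six ordered ratios per proteoform and method, all of which should lie close to 1 when quantitation is consistent.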

Figure 3. Abundance ratio distributions under six different abundance calculation methods for the proteoform envelopes of the standard protein data set detected by iTop-Q, ProMex, and Decon2LS. The six methods are as follows. M1: using the most abundant peak area, M2: using the apex intensity of the most intense peak, M3: summing the areas of the top three intense peaks, M4: summing the apex intensities of the top three intense peaks, M5: summing the areas of all peaks, and M6: summing the apex intensities of all peaks.

Figure 3 shows the abundance ratio distributions under the six abundance calculation methods for putative proteoforms detected by iTop-Q, ProMex and Decon2LS. For all three tools, all six methods give replicate abundance ratios close to 1, and, for iTop-Q, M1 has the smallest standard deviation. Compared with Decon2LS, iTop-Q and ProMex have smaller standard deviations of the calculated abundance ratios, suggesting that using the XIC area could provide better quantitation accuracy.

Performance Evaluation and an Application Demonstrated by a Public Large-scale Yeast Lysate Data Set

To evaluate iTop-Q’s performance, we used a public yeast lysate data set33 of higher sample complexity than the standard data set, in which overlapping proteoform envelopes may occur. This data set includes 7 fractions, each with three or four technical replicates. A total of 292 proteoforms, provided by the authors, were identified from these 7 fractions using ProSightPC™ 3.0. This data set was processed by iTop-Q, ProMex, and Decon2LS. Although iTop-Q constructs XICs across MS1 spectra and reassembles XICs to form putative proteoform envelopes, it took on average 1.4 minutes to process each replicate, compared to 154.67 and 746.67 minutes per replicate for ProMex and Decon2LS, respectively. From this data set, iTop-Q acquired a total of 4,027 putative proteoform envelopes (including non-distinct putative proteoforms, i.e., those detected multiple times), comprising 26,089 peaks.
On the other hand, ProMex detected a total number of 79,448 proteoform features (each with a representative peak with assigned charge state and m/z value), and Decon2LS detected a total number of 1,199,287 signals in all of the replicates of the 7 fractions. Manually examining the quality of 292 identified proteoforms by our proposed criteria for qualified proteoform


envelopes described in the DYAMOND algorithm in Experimental Section, we found that 176 proteoforms passed the criteria, whereas 47 proteoforms did not pass the criteria even though their precursor XICs were constructed, and 69 had signal intensities so low that their precursor XICs could not be constructed. We thus used the 176 proteoforms as the benchmark for the following analyses. The detailed information of the 176 proteoforms is listed in Supporting Information Table S8.

Performance evaluation on charge state deconvolution and protein monoisotopic/average mass accuracy. iTop-Q detected 1,516 peaks of all the 176 proteoform envelopes from the replicates of the 7 fractions, while ProMex and Decon2LS detected 943 peaks of 168 proteoforms and 7,387 signals of 176 proteoforms, respectively. The detailed information of the peaks, proteoform features, and signals detected by iTop-Q, ProMex, and Decon2LS is listed in Supporting Information Tables S9-11, respectively. In the output results of Decon2LS, signals with the same charge state and m/z could be repeatedly reported (Supporting Information Figure S7), and the number of detected signals comprising the same putative proteoform could vary across spectra (Supporting Information Figure S8 and Table S10). A similar situation occurs in the output results of ProMex (Supporting Information Figure S9 and Table S11). We thus independently took the union of the charge states of the proteoforms detected by ProMex and by Decon2LS to compare with the charge states detected by iTop-Q. The numbers of total and distinct charge states detected by the three tools are shown in Supporting Information Figure S10(A). The charge state comparison of iTop-Q, ProMex, and Decon2LS is shown in Supporting Information Figure S10(B), where 28% (506/1,827) of the charge states were commonly detected by the three tools, and 7% (135/1,827) of the charge states were detected only by iTop-Q.
In addition, our tool detected more high charge states than ProMex and Decon2LS (Supporting Information Figure S10(C)). Analyzing the peaks with charge states detected by iTop-Q alone, we observed that ProMex and Decon2LS also detected 48% (65 out of 135) and 83% (112 out of 135) of them in their representative peaks and signals, respectively, but assigned them incorrect charge states (Supporting Information Table S12). This is probably due to noisy background or overlapping isotope envelopes that lead to incorrect charge state deconvolution. Our DYAMOND algorithm calculates possible charge states and applies dynamic programming to filter out incorrect charge state assignments, and is thus shown to be highly accurate. On the other hand, the 17% (311 out of 1,827) of charge states undetected by iTop-Q are mainly due to signals with S/N ratios so low that their XICs could not be constructed (Supporting Information Figure S11). It is also noted that, although the number of distinct charge states of ProMex is lower than those of iTop-Q and Decon2LS because only one representative peak with its charge state is reported for each detected feature, the number of detected charge states greatly increases when all charge states within the charge state range provided by ProMex are considered as detected (Supporting Information Figure S10(A) and (B)). Finally, based on the 506 commonly detected peaks/signals, the monoisotopic masses calculated by iTop-Q are close to those calculated by ProMex and Decon2LS (Supporting Information Figure S12(A)), and the average masses calculated by iTop-Q and Decon2LS are also highly correlated (Supporting Information Figure S12(B)).

Performance evaluation on protein quantitation. Quantifying proteoforms in a complex data set is challenging since overlapping proteoforms may occur, increasing the difficulty of accurate protein quantitation. As in the quantitation analysis of the standard protein data set, we compared the quantitation of the 176 benchmark proteoforms detected by iTop-Q, ProMex, and Decon2LS. Since we do not know the exact protein abundance ratios among different fractions, we calculated protein ratios among technical replicates (which are expected to be 1) in each fraction for our evaluation. Figure 4 shows the abundance ratios of the detected features between any two replicates in the same fraction using iTop-Q, ProMex, and Decon2LS. All three tools achieved median proteoform ratios of 1. iTop-Q and ProMex had much smaller abundance ratio deviations than Decon2LS. This result suggests that using peak areas yields more consistent abundance calculation, echoing the previous finding in the standard data set. Furthermore, based on the ratio distributions of iTop-Q and ProMex, using the area of the most intense peak (M1) in a putative proteoform envelope provides the smallest protein ratio deviation.
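The replicate-based evaluation can be illustrated with a short sketch. It assumes abundances are stored per proteoform as one value per technical replicate (with `None` for a missing detection), works on log2 ratios so that an expected ratio of 1 maps to 0, and uses the median absolute deviation as one possible measure of spread. These choices and all names are assumptions made for illustration, not the procedure used to draw Figure 4.

```python
import math
from itertools import combinations
from statistics import median


def replicate_log2_ratios(abundances):
    """Return log2 ratios between every pair of replicates of each proteoform.

    abundances: dict mapping a proteoform id to a list of abundances,
    one per technical replicate (None means not detected).
    """
    ratios = []
    for values in abundances.values():
        observed = [v for v in values if v is not None and v > 0]
        for a, b in combinations(observed, 2):
            ratios.append(math.log2(a / b))
    return ratios


def summarize(log2_ratios):
    """Median ratio (back on the linear scale) and spread on the log2 scale."""
    med = median(log2_ratios)
    mad = median([abs(r - med) for r in log2_ratios])
    return {"median_ratio": 2 ** med, "mad_log2": mad}


# Toy abundances for two hypothetical proteoforms in three replicates.
toy = {"A": [100.0, 100.0, 200.0], "B": [50.0, None, 50.0]}
ratios = replicate_log2_ratios(toy)
stats = summarize(ratios)
print(ratios, stats)
```

A tool with consistent quantitation should give a median ratio near 1 and a small spread; comparing the spread across abundance calculation methods mirrors the comparison made in the text.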

Figure 4. The abundance ratio distributions of iTop-Q, ProMex, and Decon2LS in the yeast lysate data set using the six different abundance calculation methods.

Accurate deconvolution of shared peaks in co-eluting proteoforms. In large-scale top-down proteomics experiments, several proteoforms can co-elute at close retention times. In some cases of co-elution, one or more peaks in a proteoform envelope may overlap with peaks in another proteoform envelope; we call such overlapping peaks in co-eluting proteoforms shared peaks henceforth. It is important for a tool to be capable of separating proteoform envelopes from one another and distinguishing shared peaks in co-eluting proteoforms. Among the 176 proteoforms detected by iTop-Q, only 25 proteoforms did not co-elute with any other proteoform within a retention time tolerance of ±0.5 minute, while 151 proteoforms co-eluted with other putative proteoforms (Supporting Information Figure S13(A)). The detailed number of co-eluting proteoforms is shown in Supporting Information Figure S13(B). Among the 151 co-eluting proteoforms, 123 proteoforms did not have any shared peak and were relatively easy to distinguish, and 28 proteoforms had shared peaks, which were successfully detected by iTop-Q (Supporting Information Figure S13(C)).
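The three categories above (no co-elution, co-elution without shared peaks, co-elution with shared peaks) can be expressed as a simple classification. The sketch below is not the iTop-Q implementation: the `Envelope` record, the use of a single apex retention time per envelope, and the m/z tolerance used to decide that two peaks coincide are all assumptions; only the ±0.5 minute retention time tolerance comes from the text.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Envelope:
    name: str        # hypothetical proteoform label
    rt_min: float    # apex retention time in minutes
    peak_mzs: list   # m/z of each charge-state peak in the envelope


def co_elute(a, b, rt_tol=0.5):
    """True if the two envelopes elute within the retention time tolerance."""
    return abs(a.rt_min - b.rt_min) <= rt_tol


def shared_peaks(a, b, mz_tol=0.01):
    """Pairs of peaks from the two envelopes whose m/z values coincide."""
    return [(x, y) for x in a.peak_mzs for y in b.peak_mzs if abs(x - y) <= mz_tol]


def classify(envelopes, rt_tol=0.5, mz_tol=0.01):
    """Label each envelope as isolated, co-eluting, or co-eluting with shared peaks."""
    status = {e.name: "isolated" for e in envelopes}
    for a, b in combinations(envelopes, 2):
        if not co_elute(a, b, rt_tol):
            continue
        has_shared = bool(shared_peaks(a, b, mz_tol))
        for e in (a, b):
            if has_shared:
                status[e.name] = "co-eluting, shared peaks"
            elif status[e.name] == "isolated":
                status[e.name] = "co-eluting, no shared peaks"
    return status


# Toy envelopes with made-up retention times and m/z values.
toy = [
    Envelope("X", 10.0, [800.10, 850.20]),
    Envelope("Y", 10.3, [850.205, 900.00]),
    Envelope("Z", 10.4, [700.00]),
    Envelope("W", 20.0, [800.10]),
]
labels = classify(toy)
print(labels)
```

Counting the labels over a full data set would give a breakdown of the same form as the 25 / 123 / 28 split reported above.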

Figure 5. An example of two identified and co-eluted proteins, RS20_YEAST (average protein mass: 13,817.62) and RS19B_YEAST (average protein mass: 15,784.39). The proteoform envelopes of RS20_YEAST and RS19B_YEAST are colored in red and blue, respectively. The peak shared by the two proteoforms is colored in green.

Figure 5 shows a heat map of the co-eluting envelopes of two proteoforms selected from the 176 identified proteoforms, where the x-axis and y-axis represent the retention time and m/z, respectively. As shown in Figure 5, one green peak is shared by the two proteoform envelopes. In addition, four peaks outlined by dotted lines in the two proteoform envelopes are invalidated by iTop-Q since their isotopic patterns do not fit those calculated from the charge states assigned by iTop-Q. Using iTop-Q, not only proteoforms but also shared peaks can be successfully distinguished from one another. This analysis demonstrates the ability of iTop-Q to distinguish co-eluting proteins and to allow users to verify the quality of detected proteoform envelopes.

Application to protein post-translational modification quantitation. The quantitative study of post-translational modifications is important since post-translational modifications play pivotal roles in the determination of biological processes (37). For example, SOD1 protein localized in mitochondria protects proteins from oxidative injury, contributes to energy generation, and is reported to be ubiquitous in both fermentative and respiratory yeast (38). The expression level of this protein is altered in yeast in response to different redox stimuli, and it may directly or indirectly influence the activity of protein kinases, regulating the translational activity of the ribosomes (39). In addition, SOD1 activity and the phosphorylation state of ribosomal P proteins have been shown to correlate closely during the diauxic shift and logarithmic growth of yeast (32). Computing the abundance ratios of SOD1 protein and its phosphorylated form could benefit the study of biological processes in yeast lysate.

According to the identification results of the yeast lysate data set provided by Cannon et al. (33), 56% (98/176) of the proteoforms were modified, and acetylation was the most commonly observed modification in the data set (Supporting Information Table S13). Specifically, seven proteins including SOD1 protein (corresponding to 18 proteoforms) were identified with both unmodified and modified forms; ten proteins (corresponding to 35 proteoforms) were identified with various PTM forms but no unmodified form; and 123 proteins (corresponding to 123 proteoforms) were identified with either a modified or an unmodified form. Using iTop-Q, the abundance ratios of the seven proteins with both modified and unmodified patterns were computed (Figure 6), where the protein (id: P02293) was calculated four times since this protein has been identified with four modifications. Notably, iTop-Q quantified decreased phosphorylation levels on Cu–Zn superoxide dismutase (SOD1, P00445; 0.27-fold), 60S ribosomal protein L22-A (RPL22A, P05749; 0.07-fold), and L26-B (RPL26B, P53221; 0.11-fold).

Figure 6. The abundances of seven proteins with and without post-translational modifications. Two proteins (accession number: P02293 and P02294) have four and two different post-translational modifications, respectively, and the other five proteins have a single post-translational modification. The fold change is defined as the abundance of a protein with post-translational modification divided by the abundance of the protein without post-translational modification.
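The fold change defined in the caption of Figure 6 is a plain ratio, which the following sketch computes for every modified form of a protein. The dictionary layout, the accession labels, and the abundance values are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the paper.

```python
def ptm_fold_changes(unmodified, modified):
    """Fold change = abundance with PTM / abundance without PTM.

    unmodified: dict mapping accession -> abundance of the unmodified form
    modified:   dict mapping (accession, ptm) -> abundance of a modified form
    Proteins without a positive unmodified abundance are skipped.
    """
    result = {}
    for (accession, ptm), abundance in modified.items():
        base = unmodified.get(accession)
        if base is None or base <= 0:
            continue
        result[(accession, ptm)] = abundance / base
    return result


# Toy abundances for hypothetical accessions (not measured values).
unmodified = {"PROT_A": 1000.0, "PROT_B": 400.0}
modified = {
    ("PROT_A", "phospho"): 270.0,
    ("PROT_B", "acetyl"): 800.0,
    ("PROT_B", "methyl"): 100.0,
    ("PROT_C", "acetyl"): 50.0,   # no unmodified form, so it is skipped
}
fold = ptm_fold_changes(unmodified, modified)
print(fold)
```

A protein with several modified forms simply yields one ratio per modification, which is why a protein identified with four modifications is calculated four times.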

iTop-Q: a Friendly and Graphical Quantitation Tool

We implemented iTop-Q in the C# programming language as a portable tool (i.e., one that requires no installation), so that it can be run easily on several Microsoft Windows platforms (including Windows 7, 8, 10, and Windows Server 2008, 2012) for top-down proteomics quantitation. To help users quantify their LC-MS data, iTop-Q provides a quantitation wizard that guides users through processing the imported data step by step. In the quantitation wizard, only one parameter, mzWidth, is required, since the data resolution varies among instruments. Users may nevertheless modify other parameters, such as the mass tolerance and noise threshold in the advanced settings, to optimize quantitation results. Once the quantitation process is completed, the main user interface of iTop-Q displays six panels (Figure 7(A)-(F)). The first panel (Figure 7(A)) shows a proteoform summary table that lists the monoisotopic mass, average mass, retention time, and abundance of detected putative proteoforms in different replicates or samples. When the user double-clicks an abundance entry (i.e., the abundance of a proteoform in a specific processed file) in the summary table, the putative proteoform envelope table (Figure 7(B)) lists detailed information (e.g., the m/z, retention time, assigned charge state, and intensities) on the peaks included in the envelope of the selected putative proteoform. More importantly, co-elution information is also provided in this panel. If the selected proteoform co-elutes with another proteoform, say, proA, the protein ID of proA is listed in the column "Shared with protein ID". The third panel (Figure 7(C)-(E)) provides graphical visualizations that display the elution heat map, constructed XIC, and projected MS1 spectrum of the selected proteoform. In the plot of the constructed XIC, users can use the mouse to redefine the XIC boundaries, and the proteoform abundance is updated instantly. Finally, the fourth panel (Figure 7(F)) lists the parameter settings used in the quantitation. To use iTop-Q for intact protein quantitation, users who have identification results for a top-down proteomics data set can map their identified proteins to those detected by iTop-Q using proper monoisotopic/average mass and retention time tolerances.
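The mapping step mentioned at the end of the paragraph above can be sketched as a tolerance-based match. The record fields, the ppm mass tolerance, the retention time tolerance, and the rule of keeping the candidate with the smallest mass error are all assumptions made for this example; the text only states that proper mass and retention time tolerances should be used.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def map_identifications(identified, detected, mass_tol_ppm=10.0, rt_tol_min=1.0):
    """Match each identified protein to the closest detected feature.

    identified: list of dicts with keys "accession", "mono_mass", "rt"
    detected:   list of dicts with keys "id", "mono_mass", "rt"
    A feature is a candidate if both the mass error (ppm) and the retention
    time difference (minutes) are within tolerance; the candidate with the
    smallest mass error wins. Unmatched proteins map to None.
    """
    matches = {}
    for ident in identified:
        best = None
        for feat in detected:
            ppm = abs(ppm_error(feat["mono_mass"], ident["mono_mass"]))
            drt = abs(feat["rt"] - ident["rt"])
            if ppm <= mass_tol_ppm and drt <= rt_tol_min:
                if best is None or ppm < best[0]:
                    best = (ppm, feat["id"])
        matches[ident["accession"]] = best[1] if best else None
    return matches


# Toy records with hypothetical accessions, masses, and retention times.
identified = [
    {"accession": "PROT_A", "mono_mass": 13809.00, "rt": 25.0},
    {"accession": "PROT_B", "mono_mass": 15774.00, "rt": 40.0},
]
detected = [
    {"id": "F1", "mono_mass": 13809.05, "rt": 25.3},
    {"id": "F2", "mono_mass": 13809.12, "rt": 25.1},
    {"id": "F3", "mono_mass": 15780.00, "rt": 40.0},
]
mapping = map_identifications(identified, detected)
print(mapping)
```

Matching on average mass instead of monoisotopic mass would only require swapping the mass field; wider tolerances trade more matches for a higher risk of false assignments among co-eluting proteoforms.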

Figure 7. The main user interface of iTop-Q. (A) Proteoform summary table summarizes all detected putative proteoforms with their calculated masses, retention time, and abundances in different replicates/samples. (B) Putative proteoform envelope table lists peaks included in the clicked proteoform envelope with their m/z, retention time, intensity values, and charge states. (C), (D), and (E) are the heat map, XIC plot, and projected spectrum plot of a selected peak in the putative proteoform envelope table. (F) Parameter setting table lists the parameters used in the quantitation.

CONCLUSION

As top-down proteomics continues to increase in throughput and in the complexity of the samples analyzed, the lack of robust bioinformatics tools for top-down data analysis, management, and interpretation has become a major obstacle in comparison with the bottom-up approach (10, 12). To address these challenges, we have developed iTop-Q as a friendly and graphical tool for protein quantitation at the MS1 level. An intelligent algorithm, DYAMOND, has also been designed to perform the difficult charge state deconvolution on MS1 spectra. According to our analyses, iTop-Q is an effective quantitation tool with high quantitation accuracy.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website.

ac-2017-02343z_iTop-Q_SupportingInformation_Figures.pdf: The description of data processing and parameter settings; the preprocessing of MS1 data; supplementary Figures S1-S13.

ac-2017-02343z_iTop-Q_SupportingInformation_Tables.xlsx: supplementary Tables S1-S13.

AUTHOR INFORMATION

Corresponding Authors

* Ting-Yi Sung, Phone: +886-2-2788-3799 ext. 1711. Fax number: +886-2-2782-4814. Email: [email protected].
* Chiun-Gung Juo, Phone: +886-2-2655-7688 ext. 1367. Fax number: +886-2-2655-7626. Email: [email protected].

AUTHOR CONTRIBUTION

& H.-Y.C. and C.-T.C. contributed equally.

ACKNOWLEDGMENTS

This work was supported by the Academia Sinica, Ministry of Science and Technology of Taiwan (MOST106-2221-E-001018), and Taiwan International Graduate Program. The first co-author also thanks Prof. Alexey Nesvizhskii for his support during paper revision.

REFERENCES

(1) Bogdanov, B.; Smith, R. D. Mass Spectrom Rev 2005, 24, 168-200.
(2) Zhou, H.; Ning, Z.; Starr, A. E.; Abu-Farha, M.; Figeys, D. Anal Chem 2012, 84, 720-734.
(3) Gosetti, F.; Mazzucco, E.; Gennaro, M. C.; Marengo, E. J Chromatogr B Analyt Technol Biomed Life Sci 2013, 927, 22-36.
(4) Lanucara, F.; Holman, S. W.; Gray, C. J.; Eyers, C. E. Nat Chem 2014, 6, 281-294.
(5) Savaryn, J. P.; Catherman, A. D.; Thomas, P. M.; Abecassis, M. M.; Kelleher, N. L. Genome Med 2013, 5, 53.
(6) Skinner, O. S.; Havugimana, P. C.; Haverland, N. A.; Fornelli, L.; Early, B. P.; Greer, J. B.; Fellers, R. T.; Durbin, K. R.; Do Vale, L. H.; Melani, R. D.; Seckler, H. S.; Nelp, M. T.; Belov, M. E.; Horning, S. R.; Makarov, A. A.; LeDuc, R. D.; Bandarian, V.; Compton, P. D.; Kelleher, N. L. Nat Methods 2016, 13, 237-240.
(7) Washburn, M. P.; Wolters, D.; Yates, J. R., 3rd. Nat Biotechnol 2001, 19, 242-247.
(8) Link, A. J.; Eng, J.; Schieltz, D. M.; Carmack, E.; Mize, G. J.; Morris, D. R.; Garvik, B. M.; Yates, J. R. Nat Biotechnol 1999, 17, 676-682.
(9) Kelleher, N. L.; Thomas, P. M.; Ntai, I.; Compton, P. D.; LeDuc, R. D. Expert Rev Proteomics 2014, 11, 649-651.
(10) Catherman, A. D.; Skinner, O. S.; Kelleher, N. L. Biochem Biophys Res Commun 2014, 445, 683-693.
(11) Cai, W.; Tucholski, T. M.; Gregorich, Z. R.; Ge, Y. Expert Rev Proteomics 2016, 13, 717-730.
(12) Cui, W. D.; Rohrs, H. W.; Gross, M. L. Analyst 2011, 136, 3854-3864.
(13) Toby, T. K.; Fornelli, L.; Kelleher, N. L. Annu Rev Anal Chem (Palo Alto Calif) 2016, 9, 499-519.
(14) Du, Y.; Parks, B. A.; Sohn, S.; Kwast, K. E.; Kelleher, N. L. Anal Chem 2006, 78, 686-694.
(15) Bergmann, U.; Ahrends, R.; Neumann, B.; Scheler, C.; Linscheid, M. W. Anal Chem 2012, 84, 5268-5275.
(16) Collier, T. S.; Hawkridge, A. M.; Georgianna, D. R.; Payne, G. A.; Muddiman, D. C. Anal Chem 2008, 80, 4994-5001.
(17) Ntai, I.; Kim, K.; Fellers, R. T.; Skinner, O. S.; Smith, A. D. t.; Early, B. P.; Savaryn, J. P.; LeDuc, R. D.; Thomas, P. M.; Kelleher, N. L. Anal Chem 2014, 86, 4961-4968.
(18) Smith, L. M.; Kelleher, N. L.; Consortium for Top Down, P. Nat Methods 2013, 10, 186-187.

(19) Pesavento, J. J.; Mizzen, C. A.; Kelleher, N. L. Anal Chem 2006, 78, 4271-4280.
(20) Castagnola, M.; Messana, I.; Inzitari, R.; Fanali, C.; Cabras, T.; Morelli, A.; Pecoraro, A. M.; Neri, G.; Torrioli, M. G.; Gurrieri, F. J Proteome Res 2008, 7, 5327-5332.
(21) Wu, S.; Brown, J. N.; Tolic, N.; Meng, D.; Liu, X.; Zhang, H.; Zhao, R.; Moore, R. J.; Pevzner, P.; Smith, R. D.; Pasa-Tolic, L. Proteomics 2014, 14, 1211-1222.
(22) Jaitly, N.; Mayampurath, A.; Littlefield, K.; Adkins, J. N.; Anderson, G. A.; Smith, R. D. BMC Bioinformatics 2009, 10, 87.
(23) Cai, W.; Guner, H.; Gregorich, Z. R.; Chen, A. J.; Ayaz-Guner, S.; Peng, Y.; Valeja, S. G.; Liu, X.; Ge, Y. Mol Cell Proteomics 2016, 15, 703-714.
(24) LeDuc, R. D.; Taylor, G. K.; Kim, Y. B.; Januszyk, T. E.; Bynum, L. H.; Sola, J. V.; Garavelli, J. S.; Kelleher, N. L. Nucleic Acids Res 2004, 32, W340-345.
(25) Zamdborg, L.; LeDuc, R. D.; Glowacz, K. J.; Kim, Y. B.; Viswanathan, V.; Spaulding, I. T.; Early, B. P.; Bluhm, E. J.; Babai, S.; Kelleher, N. L. Nucleic Acids Res 2007, 35, W701-706.
(26) Fellers, R. T.; Greer, J. B.; Early, B. P.; Yu, X.; LeDuc, R. D.; Kelleher, N. L.; Thomas, P. M. Proteomics 2015, 15, 1235-1238.
(27) Park, J.; Piehowski, P. D.; Wilkins, C.; Zhou, M.; Mendoza, J.; Fujimoto, G. M.; Gibbons, B. C.; Shaw, J. B.; Shen, Y.; Shukla, A. K.; Moore, R. J.; Liu, T.; Petyuk, V. A.; Tolic, N.; Pasa-Tolic, L.; Smith, R. D.; Payne, S. H.; Kim, S. Nat Methods 2017, 14, 909-914.
(28) Ferrige, A. G.; Seddon, M. J.; Jarvis, S. Rapid Commun Mass Spectrom 1991, 5, 374-377.
(29) Horn, D. M.; Zubarev, R. A.; McLafferty, F. W. J Am Soc Mass Spectrom 2000, 11, 320-332.
(30) Liu, X.; Inbar, Y.; Dorrestein, P. C.; Wynne, C.; Edwards, N.; Souda, P.; Whitelegge, J. P.; Bafna, V.; Pevzner, P. A. Mol Cell Proteomics 2010, 9, 2772-2782.
(31) Kou, Q.; Wu, S.; Liu, X. BMC Genomics 2014, 15, 1140.
(32) Marty, M. T.; Baldwin, A. J.; Marklund, E. G.; Hochberg, G. K.; Benesch, J. L.; Robinson, C. V. Anal Chem 2015, 87, 4370-4376.
(33) Cannon, J. R.; Cammarata, M. B.; Robotham, S. A.; Cotham, V. C.; Shaw, J. B.; Fellers, R. T.; Early, B. P.; Thomas, P. M.; Kelleher, N. L.; Brodbelt, J. S. Anal Chem 2014, 86, 2185-2192.
(34) Senko, M. W.; Beu, S. C.; McLafferty, F. W. J Am Soc Mass Spectrom 1995, 6, 229-233.
(35) Cleveland, W. S. J Am Stat Assoc 1979, 74, 829-836.
(36) Cleveland, W. S. Am Stat 1981, 35, 54-54.
(37) Aebersold, R.; Mann, M. Nature 2016, 537, 347-355.
(38) Nedeva, T. S.; Petrova, V. Y.; Zamfirova, D. R.; Stephanova, E. V.; Kujumdzieva, A. V. FEMS Microbiol Lett 2004, 230, 19-25.
(39) Zielinski, R.; Pilecki, M.; Kubinski, K.; Zien, P.; Hellman, U.; Szyszka, R. Biochem Biophys Res Commun 2002, 296, 1310-1316.
