
Publication Date (Web): February 25, 2019. Copyright © 2018 American Chemical Society.
Addition/Correction pubs.acs.org/jpr

Cite This: J. Proteome Res. XXXX, XXX, XXX−XXX

Correction to “Large-Scale Reanalysis of Publicly Available HeLa Cell Proteomics Data in the Context of the Human Proteome Project”

Thibault Robin, Amos Bairoch, Markus Müller, Frédérique Lisacek, and Lydie Lane*

J. Proteome Res. 2018, 17 (12), 4160−4170. DOI: 10.1021/acs.jproteome.8b00392




PROBLEM DESCRIPTION

While designing a new experiment, we realized that the results of our previous study, “Large-Scale Reanalysis of Publicly Available HeLa Cell Proteomics Data in the Context of the Human Proteome Project”, were affected by a data handling problem. Out of the 1233 tandem mass spectrometry files that were originally processed in that work, 27 turned out to be associated with cell lines other than HeLa. The problematic files come from two distinct data sets: PXD001426 (3/3 files, originating from the HCT 116 cell line) and PXD002395 (24/42 files, originating from the HepG2 and HEK293 cell lines). The PXD001426 data set was not produced on HeLa cells and was improperly selected for analysis by our workflow: although HeLa was used in a subpart of that study, the raw files available on PRIDE all belong to HCT 116. The PXD002395 data set was produced on a large panel of 11 distinct cell lines including HeLa. Of note, the raw files were properly annotated in the PRIDE database. Our oversight was a loose regular expression, “He*”, which retrieved files matching cell line names starting with He and whose results we omitted to double-check.
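To illustrate how a loose pattern of this kind can over-match, the following minimal sketch compares a permissive prefix match with an exact match. The list of cell line names, the function name, and the case-insensitive comparison are assumptions made for illustration; the selection code actually used in the original workflow is not described in this correction.

```python
import re

# Hypothetical annotations: only the cell lines named in the text are listed;
# the remainder of the 11-line panel is not reproduced here.
CELL_LINES = ["HeLa", "HepG2", "HEK293", "HCT 116"]

# Assumed reading of the loose pattern: any name beginning with "He",
# compared case-insensitively.
LOOSE = re.compile(r"^he", re.IGNORECASE)

# Stricter alternative: the whole name must be exactly "HeLa".
STRICT = re.compile(r"^hela$", re.IGNORECASE)


def select(pattern, names):
    """Return the names for which the pattern matches."""
    return [name for name in names if pattern.search(name)]


loose_hits = select(LOOSE, CELL_LINES)
strict_hits = select(STRICT, CELL_LINES)
print(loose_hits)   # ['HeLa', 'HepG2', 'HEK293']
print(strict_hits)  # ['HeLa']
```

Under these assumptions the loose pattern also selects HepG2 and HEK293, the two cell lines named for PXD002395, whereas HCT 116 is not matched, which is consistent with PXD001426 having been selected for a different reason.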



REANALYSIS

The workflow previously described was reapplied on the 1206 files from the 40 data sets associated with the HeLa cell line after the removal of the 27 incriminated files (representing a loss of 1 103 531 tandem mass spectra, 2% of the total amount). The methods detailed in the original article were identically reapplied. The FDR threshold was recomputed at the different levels, and the N-terminal acetylation and phosphorylation sites were validated again. The variant and “missing protein” identifications were also revalidated, and the sign-test statistical analysis was rerun. The different tables from the Supporting Information were rebuilt.

RESULTS

The reanalysis of the 40 HeLa data sets led to the identification of 48 583 (−883) unique peptides in 7174 (−92) protein entries, allowing the validation of 5508 (−68) protein entries in accordance with the HPP guidelines version 2.1. The X! Tandem hyperscore threshold remained at 60.4 after recomputation, ensuring that the global FDR would be set to 1.0% at the protein level. All of the missing protein identifications were conserved. The validation of two N-terminal acetylation sites was lost, reducing the number of validated sites to 390, while no phosphorylation site validation was lost (Table 1). The identification of the wild-type peptide of the TARS p.T453I variant was lost (Table 2), removing as a consequence the variant from the sign-test and reducing the number of heterozygous cases where a significant difference was detected to two out of eight. Overall, the removal of the 27 files that did not belong to the HeLa cell line only slightly alters our results and does not change the conclusions that we drew in the original article.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jproteome.9b00113.
Descriptions of file contents (PDF)
Supporting data (ZIP)

Table 1. Summary of the PTM Site Identifications

PTM                      Modified proteins   Modified peptides   Sites   Novel sites   Novel sites (%)
phosphorylation          1779                3511                4921    189           3.84%
N-terminal acetylation   1048                1246                1083    390           36.0%
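The percentages in Table 1 follow from dividing the number of novel sites by the number of sites. The short sketch below recomputes them from the values reported in the table; the dictionary layout and the function name are illustrative only.

```python
# Values copied from Table 1: (number of sites, number of novel sites).
TABLE_1 = {
    "phosphorylation": (4921, 189),
    "N-terminal acetylation": (1083, 390),
}


def percent_novel(sites, novel):
    """Share of novel sites among all sites, in percent."""
    return 100.0 * novel / sites


for ptm, (sites, novel) in TABLE_1.items():
    print(f"{ptm}: {percent_novel(sites, novel):.2f}% novel sites")
```

Rounded to three significant figures, the two computed values agree with the 3.84% and 36.0% reported in the table.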


DOI: 10.1021/acs.jproteome.9b00113 J. Proteome Res. XXXX, XXX, XXX−XXX


Table 2. List of HeLa Variant Identifications (a)

(a) The identified variant and wild-type (phospho)peptides are reported along with their corresponding spectral counts. For variants where both peptide versions were observed, a two-tailed sign-test with an α level of significance of 5% was applied. Two cases emerged as having a statistically significant abundance difference, potentially induced by the variant insertion.
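For readers unfamiliar with the sign-test mentioned in the footnote, a minimal sketch of an exact two-tailed sign test is given below. The paired spectral counts are invented for illustration and are not taken from Table 2; how the original study paired its observations is not stated in this correction, so pairing variant and wild-type counts by data set is an assumption.

```python
from math import comb


def sign_test_two_tailed(pairs):
    """Exact two-tailed sign test on paired observations.

    Ties are discarded; under the null hypothesis the number of positive
    differences follows a Binomial(n, 0.5) distribution.
    """
    positives = sum(1 for a, b in pairs if a > b)
    negatives = sum(1 for a, b in pairs if a < b)
    n = positives + negatives
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    # Probability of a split at least this extreme in one tail, then doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical (variant, wild-type) spectral counts, one pair per data set.
example_pairs = [(12, 5), (9, 4), (7, 7), (15, 6), (8, 3),
                 (11, 2), (6, 1), (10, 4), (13, 5)]

ALPHA = 0.05  # 5% significance level, as stated in the footnote
p_value = sign_test_two_tailed(example_pairs)
print(f"p = {p_value:.4f}, significant at 5%: {p_value < ALPHA}")
```

In this invented example the variant count exceeds the wild-type count in all eight non-tied pairs, so the test rejects the hypothesis of equal abundance at the 5% level.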
