Editorial pubs.acs.org/ac
Cite This: Anal. Chem. 2018, 90, 8721−8721
Where Is the Data?
When reading the Analytical Chemistry author guidelines and comparing them to those of some of the other journals you may publish in, you will find no specific requirements for data deposition. Our Editors, Reviewers, and Authors all have strong feelings on data retention and sharing, so why is this? Said differently, what is expected for our manuscripts?

The overarching goals of data sharing are several and include supporting the results, emphasizing rigor and transparency in research, helping others repeat the work, and advancing the field by allowing data mining. Toward these goals, we expect that appropriate manuscripts have a data availability statement that contains a persistent identifier (e.g., DOIs or accession numbers), a statement if specific data will not be made available (e.g., because of human subject identification concerns), and other important details on data availability.

While it is possible to come up with these general guidelines, some journals outline exactly the requisite details for a specific data type (e.g., for proteomics or transcriptomics data) and where to deposit the data. It is difficult to come up with specific requirements/locations for data deposition for our journal, which encompasses all subfields of analytical chemistry. We do not want to hinder the rapid evolution of a field or prevent innovative manuscripts from being submitted to Analytical Chemistry by specifying requirements that may not be flexible, appropriate, or even possible. However, we do expect that the experimental details and software used are described in enough detail in the manuscript to allow someone to repeat the measurements. Where practical, example data should be included in the Supporting Information and appropriate data repositories used. Finally, authors need to keep the data long enough to answer questions that often arise from the scientific community during and after publication.

We recognize that there are many issues (e.g., format, file size) that can hamper the ability to deposit data. As an example, mass spectrometry imaging proteomics data are often not accepted by existing proteomics repositories. My group now acquires tens of thousands of spectra, each from a distinct individual cell; we have yet to find a repository that will readily accept such large data files. Although our university (and perhaps yours) provides a general data repository for its staff (in my case, the Illinois Data Bank), we are limited to data sizes of less than 2 Tb per data set per year. For our larger data sets, we are still not sure what to do. Fortunately, new opportunities for data storage and dissemination continue to emerge. Despite the “extra” effort and challenges, I strongly support sharing the data with interested users.

In addition to meeting the requirements of the publisher, authors must often follow a separate set of rules from the source of funding for the research. Interestingly, the Office of Science and Technology Policy in the USA distinguishes between publications (which should be made available long-term) and data (for which the expectations are fairly nebulous). Indeed, the specific language calls for “...preserving the balance between the relative value of long-term preservation and access and the associated cost and administrative burden.” (1) Federal guidelines also specify what cannot be included, such as with human subjects research, where all information that could be used to identify individuals must be removed.

In my experience, there seems to be some confusion about these rules. I have been told by many scientists in the USA that the National Institutes of Health (NIH) requires data to be kept forever. I am not sure what “forever” really means. I have not found this to be the typical practice among past colleagues, and I have never found stashes of old faculty notebooks in our basement. In actuality, the NIH Office of Research Integrity states that data should be kept for three years after the last required report is filed. (2) Three years is certainly not forever.

Another aspect of this issue that has caused tension between reviewers, readers, and authors involves the author’s proprietary interests, confidential business information, and intellectual property rights. These are important concerns but do not outweigh the requirement to provide enough detail to allow the experiments to be repeated.

Given this complexity, there are many factors to be considered and worked out before data deposit becomes normative behavior. (3) While the details are not always clear and can be expected to evolve over time, ensuring data availability and preservation should be the ultimate objective and will help advance our fields. Our goal at Analytical Chemistry is to handle this issue in a thoughtful manner, balancing utility, necessity, and practicality.

© 2018 American Chemical Society
Jonathan V. Sweedler
■ AUTHOR INFORMATION
ORCID
Jonathan V. Sweedler: 0000-0003-3107-9922
Notes
Views expressed in this editorial are those of the author and not necessarily the views of the ACS.
■ REFERENCES
(1) Holdren, J. P. Increasing Access to the Results of Federally Funded Scientific Research. Office of Science and Technology Policy, 2013; http://web.archive.org/web/20160115125401/https://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf.
(2) https://ori.hhs.gov/education/products/rcradmin/topics/data/tutorial_11.shtml and https://www.gpo.gov/fdsys/granule/CFR-2012-title2-vol1/CFR-2012-title2-vol1-part215, page 102.
(3) Imker, H. J. Chapter 5: Overlooked and Overrated Data Sharing: Why so many scientists are confused and/or dismissive. In Curating Research Data; Johnston, L. R., Ed.; Association of College and Research Libraries, 2017; pp 127−150, http://hdl.handle.net/2142/95024.
Published: August 7, 2018
DOI: 10.1021/acs.analchem.8b03212