Proteomics. 2011 Jun;11(11):2182-94.
doi: 10.1002/pmic.201000602. Epub 2011 May 2.

A posteriori quality control for the curation and reuse of public proteomics data

Joseph M Foster et al. Proteomics. 2011 Jun.

Abstract

Proteomics is a rapidly expanding field encompassing a multitude of complex techniques and data types. To date, much effort has been devoted to achieving the highest possible coverage of proteomes with the aim of informing future developments in basic biology as well as in clinical settings. As a result, growing amounts of data have been deposited in publicly available proteomics databases. These data are in turn increasingly reused for orthogonal downstream purposes such as data mining and machine learning. These downstream uses, however, need ways to validate a posteriori whether a particular data set is suitable for the envisioned purpose. Furthermore, the (semi-)automatic curation of repository data is dependent on analyses that can highlight misannotation and edge conditions for data sets. Such curation is an important prerequisite for efficient proteomics data reuse in the life sciences in general. We therefore present here a selection of quality control metrics and approaches for the a posteriori detection of potential issues encountered in typical proteomics data sets. We illustrate our metrics by relying on publicly available data from the Proteomics Identifications Database (PRIDE), and simultaneously show the usefulness of the large body of PRIDE data as a means to derive empirical background distributions for relevant metrics.
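
The idea of an empirical background distribution can be illustrated with a minimal sketch. The abstract does not specify which metrics or thresholds the authors use, so everything here is an assumption made for illustration: the function names `empirical_percentile` and `flag_outlier`, the 2.5%/97.5% cut-offs, and the example values are hypothetical and are not taken from the paper or from PRIDE.

```python
from bisect import bisect_right


def empirical_percentile(background, value):
    """Return the fraction of background values that are <= value.

    `background` stands in for one quality control metric computed over
    many public data sets (hypothetical values, not real PRIDE data).
    """
    if not background:
        raise ValueError("background distribution must not be empty")
    ordered = sorted(background)
    return bisect_right(ordered, value) / len(ordered)


def flag_outlier(background, value, lower=0.025, upper=0.975):
    """Flag a data set whose metric falls in either tail of the background.

    The cut-offs are illustrative assumptions, not values from the paper.
    """
    p = empirical_percentile(background, value)
    return p < lower or p > upper


# Hypothetical metric values for five previously deposited data sets.
background = [0.1, 0.2, 0.3, 0.4, 0.5]

print(empirical_percentile(background, 0.3))  # 0.6
print(flag_outlier(background, 0.3))          # False
print(flag_outlier(background, 0.9))          # True
```

With a background of realistic size, a newly submitted or reused data set could be compared against the distribution in this way to decide whether it deserves a closer look by a curator. A flag of this kind signals an unusual value, not necessarily an error.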

Cited by

  • The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013.
    Vizcaíno JA, Côté RG, Csordas A, Dianes JA, Fabregat A, Foster JM, Griss J, Alpi E, Birim M, Contell J, O'Kelly G, Schoenegger A, Ovelleiro D, Pérez-Riverol Y, Reisinger F, Ríos D, Wang R, Hermjakob H. Nucleic Acids Res. 2013 Jan;41(Database issue):D1063-9. doi: 10.1093/nar/gks1262. Epub 2012 Nov 29. PMID: 23203882 Free PMC article.
  • Visualization of proteomics data using R and bioconductor.
    Gatto L, Breckels LM, Naake T, Gibb S. Proteomics. 2015 Apr;15(8):1375-89. doi: 10.1002/pmic.201400392. PMID: 25690415 Free PMC article. Review.
  • The reuse of public datasets in the life sciences: potential risks and rewards.
    Sielemann K, Hafner A, Pucker B. PeerJ. 2020 Sep 22;8:e9954. doi: 10.7717/peerj.9954. eCollection 2020. PMID: 33024631 Free PMC article.
  • Toward an Integrated Machine Learning Model of a Proteomics Experiment.
    Neely BA, Dorfer V, Martens L, Bludau I, Bouwmeester R, Degroeve S, Deutsch EW, Gessulat S, Käll L, Palczynski P, Payne SH, Rehfeldt TG, Schmidt T, Schwämmle V, Uszkoreit J, Vizcaíno JA, Wilhelm M, Palmblad M. J Proteome Res. 2023 Mar 3;22(3):681-696. doi: 10.1021/acs.jproteome.2c00711. Epub 2023 Feb 6. PMID: 36744821 Free PMC article. Review.
  • qcML: an exchange format for quality control metrics from mass spectrometry experiments.
    Walzer M, Pernas LE, Nasso S, Bittremieux W, Nahnsen S, Kelchtermans P, Pichler P, van den Toorn HW, Staes A, Vandenbussche J, Mazanek M, Taus T, Scheltema RA, Kelstrup CD, Gatto L, van Breukelen B, Aiche S, Valkenborg D, Laukens K, Lilley KS, Olsen JV, Heck AJ, Mechtler K, Aebersold R, Gevaert K, Vizcaíno JA, Hermjakob H, Kohlbacher O, Martens L. Mol Cell Proteomics. 2014 Aug;13(8):1905-13. doi: 10.1074/mcp.M113.035907. Epub 2014 Apr 23. PMID: 24760958 Free PMC article.
