Probability-based protein identification by searching sequence databases using mass spectrometry data
- PMID: 10612281
- DOI: 10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2
Abstract
Several algorithms have been described in the literature for protein identification by searching a sequence database using mass spectrometry data. In some approaches, the experimental data are peptide molecular weights from the digestion of a protein by an enzyme. Other approaches use tandem mass spectrometry (MS/MS) data from one or more peptides. Still others combine mass data with amino acid sequence data. We present results from a new computer program, Mascot, which integrates all three types of search. The scoring algorithm is probability based, which has a number of advantages: (i) A simple rule can be used to judge whether a result is significant or not. This is particularly useful in guarding against false positives. (ii) Scores can be compared with those from other types of search, such as sequence homology. (iii) Search parameters can be readily optimised by iteration. The strengths and limitations of probability-based scoring are discussed, particularly in the context of high-throughput, fully automated protein identification.
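The "simple rule" in point (i) can be illustrated with a minimal sketch. The abstract does not give the formula, so the following rests on assumptions: that a score is reported as -10*log10(P), where P is the probability that a match is a chance event, and that a match counts as significant when P falls below alpha divided by the number of candidate entries searched. The function names, the 0.05 default, and the example database size are hypothetical and are not taken from Mascot itself.

```python
import math


def score_from_probability(p: float) -> float:
    """Convert a chance-match probability into a -10*log10(p) score (assumed form)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -10.0 * math.log10(p)


def significance_threshold(num_candidates: int, alpha: float = 0.05) -> float:
    """Score above which a match is expected by chance less often than alpha.

    Assumes num_candidates independent trials, so the per-candidate
    probability cut-off is alpha / num_candidates.
    """
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in the interval (0, 1)")
    return -10.0 * math.log10(alpha / num_candidates)


def is_significant(score: float, num_candidates: int, alpha: float = 0.05) -> bool:
    """Apply the rule: a match is significant if its score exceeds the threshold."""
    return score > significance_threshold(num_candidates, alpha)


if __name__ == "__main__":
    # Hypothetical search against 500,000 database entries.
    n = 500_000
    print(f"threshold: {significance_threshold(n):.1f}")
    print("score 75 significant:", is_significant(75.0, n))
    print("score 65 significant:", is_significant(65.0, n))
```

Putting scores on a logarithmic probability scale is also what would allow them to be compared with scores from other search types, as in point (ii), since each score maps back to a probability of a chance match.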