Anal Chem. 2013 Sep 3;85(17):8291-7.
doi: 10.1021/ac401564v. Epub 2013 Aug 14.

Method for assessing the statistical significance of mass spectral similarities using basic local alignment search tool statistics

Fumio Matsuda et al. Anal Chem.

Abstract

A novel method for assessing the statistical significance of mass spectral similarities was developed using modified basic local alignment search tool (BLAST; Karlin-Altschul) statistics. In gas chromatography/mass spectrometry-based metabolomics, many signals in raw metabolome data are identified on the basis of unexpected similarities between their mass spectra and the spectra of standards. Because the observed spectra inevitably contain noise, any list of identified metabolites includes some false positives. In the developed method, electron ionization (EI) mass spectrometry-BLAST, a similarity score for two mass spectra is calculated using a general scoring scheme, and the probability of obtaining that score by chance (P value) is derived from it. For this purpose, a simple rule for converting a unit EI mass spectrum to a mass spectral sequence was developed, together with a score matrix for aligned mass spectral sequences. A Monte Carlo simulation using randomly generated mass spectral sequences demonstrated that the null distribution, or the expected number of hits (E value), follows modified Karlin-Altschul statistics. A metabolite data set obtained from green tea extract was analyzed using the developed method. Among 171 metabolite signals in the metabolome data, 93 signals were identified on the basis of significant similarities (P < 0.015) with reference data. Since the expected number of false positives is 2.6, the false discovery rate was estimated to be 2.8% (2.6/93), indicating that the search threshold (P < 0.015) is reasonable for metabolite identification.
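
The arithmetic behind the reported figures can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation. The first two functions use the standard, unmodified Karlin-Altschul form from sequence BLAST (E = K·m·n·exp(−λS), P = 1 − exp(−E)); the abstract states that EI-MS-BLAST uses a modified version whose details are not given here, so `K`, `lam`, `m`, and `n` are placeholder parameters. The third function assumes that the expected number of false positives is the number of queried signals multiplied by the P threshold, an assumption that happens to reproduce the 2.6 quoted in the abstract.

```python
import math


def karlin_altschul_e_value(score, m, n, K, lam):
    """Expected number of chance hits with at least this score.

    Standard (unmodified) Karlin-Altschul form; m and n are the two
    sequence lengths, K and lam are scoring-system constants.
    All parameter values are hypothetical placeholders here.
    """
    return K * m * n * math.exp(-lam * score)


def p_value_from_e(e_value):
    """Probability of at least one chance hit, P = 1 - exp(-E)."""
    return -math.expm1(-e_value)


def estimated_fdr(n_queries, p_threshold, n_hits):
    """Estimate the false discovery rate among the accepted hits.

    Assumption: expected false positives = n_queries * p_threshold.
    """
    expected_fp = n_queries * p_threshold
    return expected_fp, expected_fp / n_hits


# Figures taken from the abstract: 171 signals, P < 0.015, 93 hits.
expected_fp, fdr = estimated_fdr(171, 0.015, 93)
print(f"expected false positives: {expected_fp:.1f}")
print(f"estimated FDR: {fdr:.1%}")
```

Under this assumption the sketch gives about 2.6 expected false positives and an FDR of about 2.8%, matching the values in the abstract.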
