Combining evidence using p-values: application to sequence homology searches

doi:10.1093/bioinformatics/14.1.48

Bioinformatics. 1998;14(1):48-54.


T L Bailey¹, M Gribskov


PMID: 9520501
DOI: 10.1093/bioinformatics/14.1.48


T L Bailey et al. Bioinformatics. 1998.


Affiliation

¹ San Diego Supercomputer Center, CA 92186-9784, USA.


Abstract

Motivation: To illustrate an intuitive and statistically valid method for combining independent sources of evidence that yields a p-value for the complete evidence, and to apply it to the problem of detecting simultaneous matches to multiple patterns in sequence homology searches.

Results: In sequence analysis, two or more (approximately) independent measures of the membership of a sequence (or sequence region) in some class are often available. We would like to estimate the likelihood of the sequence being a member of the class in view of all the available evidence. An example is estimating the significance of the observed match of a macromolecular sequence (DNA or protein) to a set of patterns (motifs) that characterize a biological sequence family. An intuitive way to do this is to express each piece of evidence as a p-value, and then use the product of these p-values as the measure of membership in the family. We derive a formula and algorithm (QFAST) for calculating the statistical distribution of the product of n independent p-values. We demonstrate that sorting sequences by this p-value effectively combines the information present in multiple motifs, leading to highly accurate and sensitive sequence homology searches.
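As an illustration of the idea in the Results paragraph, the sketch below computes the p-value of a product of n independent p-values. It relies on the standard closed form for the distribution of a product of n independent Uniform(0,1) variables, P(product <= p) = p * sum_{i=0}^{n-1} (-ln p)^i / i!, which is assumed here to be the quantity the abstract describes. The function name `product_p_value` is hypothetical, and this is not the published QFAST algorithm itself, only a minimal sketch under those assumptions.

```python
import math


def product_p_value(p_values):
    """Return P(product of n independent Uniform(0,1) values <= observed product).

    Hypothetical helper for illustration; assumes the input p-values are
    independent and uniformly distributed under the null hypothesis.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    if any(not 0.0 <= p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    if any(p == 0.0 for p in p_values):
        return 0.0  # a zero p-value forces the product, and its p-value, to zero

    # Work in log space so that products of many small p-values do not underflow.
    log_product = sum(math.log(p) for p in p_values)
    q = -log_product  # q >= 0

    # Accumulate sum_{i=0}^{n-1} q^i / i! one term at a time.
    term = 1.0
    total = 1.0
    for i in range(1, n):
        term *= q / i
        total += term

    # Combined p-value = product * total, evaluated as exp(log(product) + log(total)).
    return min(1.0, math.exp(log_product + math.log(total)))


if __name__ == "__main__":
    # Two motif matches, each with p = 0.05; the combined p-value is larger
    # than the raw product 0.0025 because the product of two p-values is not
    # itself uniformly distributed.
    print(product_p_value([0.05, 0.05]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the formula. Sorting sequences by the returned value, rather than by the raw product, lets scores based on different numbers of motifs be compared on the same scale.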


Comment in

Concerning the accuracy of MAST E-values.
Bailey TL, Gribskov M. Bioinformatics. 2000 May;16(5):488-9. doi: 10.1093/bioinformatics/16.5.488. PMID: 10871274. No abstract available.



Grants and funding

P41 RR-08605/RR/NCRR NIH HHS/United States
