A new method to measure the semantic similarity of GO terms
- PMID: 17344234
- DOI: 10.1093/bioinformatics/btm087
Abstract
Motivation: Although controlled biochemical or biological vocabularies, such as Gene Ontology (GO) (http://www.geneontology.org), address the need for consistent descriptions of genes in different data sources, there is still no effective method to determine the functional similarities of genes based on gene annotation information from heterogeneous data sources.
Results: To address this critical need, we proposed a novel method to encode a GO term's semantics (biological meanings) into a numeric value by aggregating the semantic contributions of its ancestor terms (including the term itself) in the GO graph and, in turn, designed an algorithm to measure the semantic similarity of GO terms. Based on the semantic similarities of the GO terms used for gene annotation, we designed a new algorithm to measure the functional similarity of genes. The results of using our algorithm to measure the functional similarities of genes in pathways retrieved from the Saccharomyces Genome Database (SGD), and the outcomes of clustering these genes based on the similarity values obtained by our algorithm, are shown to be consistent with human perspectives. Furthermore, we developed a set of online tools for gene similarity measurement and knowledge discovery.
Availability: The online tools are available at: http://bioinformatics.clemson.edu/G-SESAME.
Supplementary information: http://bioinformatics.clemson.edu/Publication/Supplement/gsp.htm.
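The aggregation idea described under Results can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption: the rule that an ancestor's contribution is the best product of edge weights along any path from the term, the weights 0.8 for is_a and 0.6 for part_of, the best-match averaging used for genes, and the toy graph, which is hypothetical and not real GO data.

```python
# Hypothetical toy DAG (not real GO data): term -> list of (parent, relation).
TOY_PARENTS = {
    "root": [],
    "A": [("root", "is_a")],
    "B": [("root", "is_a")],
    "C": [("A", "is_a"), ("B", "part_of")],
    "D": [("A", "part_of")],
}

# Assumed semantic contribution factors per relation type.
WEIGHTS = {"is_a": 0.8, "part_of": 0.6}


def s_values(term, parents, weights=WEIGHTS):
    """Contribution of each ancestor (and the term itself) to `term`.

    The term contributes 1.0 to itself; an ancestor gets the largest
    product of edge weights over all paths leading up from the term.
    """
    s = {term: 1.0}
    stack = [term]
    while stack:
        node = stack.pop()
        for parent, rel in parents[node]:
            candidate = s[node] * weights[rel]
            if candidate > s.get(parent, 0.0):
                s[parent] = candidate
                stack.append(parent)  # revisit: a better path was found
    return s


def term_similarity(a, b, parents, weights=WEIGHTS):
    """Shared-ancestor contributions divided by the two semantic values."""
    sa, sb = s_values(a, parents, weights), s_values(b, parents, weights)
    shared = sa.keys() & sb.keys()
    numerator = sum(sa[t] + sb[t] for t in shared)
    return numerator / (sum(sa.values()) + sum(sb.values()))


def gene_similarity(terms1, terms2, parents, weights=WEIGHTS):
    """Average, over both annotation sets, of each term's best match."""
    best1 = sum(max(term_similarity(t, u, parents, weights) for u in terms2)
                for t in terms1)
    best2 = sum(max(term_similarity(u, t, parents, weights) for t in terms1)
                for u in terms2)
    return (best1 + best2) / (len(terms1) + len(terms2))


if __name__ == "__main__":
    print(s_values("C", TOY_PARENTS))
    print(term_similarity("C", "D", TOY_PARENTS))
    print(gene_similarity(["C"], ["C", "D"], TOY_PARENTS))
```

In this sketch, a term compared with itself scores 1, and two terms score higher the more of their weighted ancestry they share. A gene is represented simply as a list of its annotating terms.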
Similar articles
- Measure the Semantic Similarity of GO Terms Using Aggregate Information Content. IEEE/ACM Trans Comput Biol Bioinform. 2014 May-Jun;11(3):468-76. doi: 10.1109/TCBB.2013.176. PMID: 26356015
- A relation based measure of semantic similarity for Gene Ontology annotations. BMC Bioinformatics. 2008 Nov 4;9:468. doi: 10.1186/1471-2105-9-468. PMID: 18983678 Free PMC article.
- PlasmoDraft: a database of Plasmodium falciparum gene function predictions based on postgenomic data. BMC Bioinformatics. 2008 Oct 16;9:440. doi: 10.1186/1471-2105-9-440. PMID: 18925948 Free PMC article.
- The human phenotype ontology. Clin Genet. 2010 Jun;77(6):525-34. doi: 10.1111/j.1399-0004.2010.01436.x. Epub 2010 Feb 11. PMID: 20412080 Review.
- Gene Ontology annotation status of the fission yeast genome: preliminary coverage approaches 100%. Yeast. 2006 Oct 15;23(13):913-9. doi: 10.1002/yea.1420. PMID: 17072883 Review.
Cited by
- EFMSDTI: Drug-target interaction prediction based on an efficient fusion of multi-source data. Front Pharmacol. 2022 Sep 23;13:1009996. doi: 10.3389/fphar.2022.1009996. eCollection 2022. PMID: 36210804 Free PMC article.
- Inflammation and immune activation are associated with risk of Mycobacterium tuberculosis infection in BCG-vaccinated infants. Nat Commun. 2022 Nov 3;13(1):6594. doi: 10.1038/s41467-022-34061-7. PMID: 36329009 Free PMC article.
- Fusing literature and full network data improves disease similarity computation. BMC Bioinformatics. 2016 Aug 30;17(1):326. doi: 10.1186/s12859-016-1205-4. PMID: 27578323 Free PMC article.
- Predicting synthetic lethal interactions in human cancers using graph regularized self-representative matrix factorization. BMC Bioinformatics. 2019 Dec 24;20(Suppl 19):657. doi: 10.1186/s12859-019-3197-3. PMID: 31870274 Free PMC article.
- WEADE: A workflow for enrichment analysis and data exploration. PLoS One. 2018 Sep 28;13(9):e0204016. doi: 10.1371/journal.pone.0204016. eCollection 2018. PMID: 30265728 Free PMC article.