Is protein classification necessary? Toward alternative approaches to function annotation
- PMID: 19269161
- PMCID: PMC2745633
- DOI: 10.1016/j.sbi.2009.02.001
Abstract
The current nonredundant protein sequence database contains over seven million entries, and the number of individual functional domains is significantly larger still. The vast quantity of data associated with these proteins poses enormous challenges to any attempt at function annotation. Classification of proteins into sequence and structural groups has been widely used to simplify the problem. In this article we question such strategies. We describe how the multifunctionality and structural diversity of even closely related proteins confound efforts to assign function on the basis of overall sequence or structural similarity. Instead, we suggest that strategies that avoid classification may offer a more robust approach to protein function annotation.