Identification of subfamily-specific sites based on active sites modeling and clustering
- PMID: 20980272
- DOI: 10.1093/bioinformatics/btq595
Abstract
Motivation: Current computational approaches to function prediction are mostly based on protein sequence classification and on transferring annotation from known proteins to their closest homologous sequences, relying on the orthology concept of function conservation. This approach suffers from a major weakness: annotation reliability depends on global sequence similarity to known proteins, and it performs poorly for enzyme superfamilies whose members catalyze different reactions. Structural biology offers a different strategy to overcome the annotation problem by adding information about protein 3D structures. This information can be used to identify amino acids located in active sites, focusing on the detection of functionally polymorphic residues in an enzyme superfamily. Structural genomics programs are providing more and more novel protein structures at a high-throughput rate. However, there is still a huge gap between the number of sequences and the number of available structures. Computational methods such as homology modeling provide reliable approaches to bridge this gap and could be a new, precise tool to annotate protein functions.
Results: Here, we present the Active Sites Modeling and Clustering (ASMC) method, a novel unsupervised method to classify sequences using structural information on protein pockets. ASMC combines homology modeling of family members, structural alignment of the modeled active sites and a subsequent hierarchical conceptual classification. Comparing the profiles obtained from the computed clusters allows the identification of residues correlated with subfamily function divergence, called specificity determining positions. The ASMC method was validated on a benchmark of 42 Pfam families for which previously resolved holo-structures were available. ASMC was also applied to several families containing known protein structures and comprehensive functional annotations. We discuss how ASMC improves the annotation and understanding of protein family functions, with specific illustrative examples on nucleotidyl cyclases, protein kinases and serine proteases.
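The profile-comparison step described in the Results can be illustrated with a minimal Python sketch. It is not the ASMC implementation: it assumes the clusters are already given (the homology modeling, structural alignment and hierarchical classification steps are not reproduced), and it uses a simple, hypothetical rule in which a position is flagged when every cluster has a dominant residue there and those dominant residues differ between clusters. The function names, the `min_conservation` threshold and the toy active-site strings are all invented for illustration.

```python
from collections import Counter


def column_profile(sites):
    """Count residues at each position of equal-length active-site strings."""
    return [Counter(column) for column in zip(*sites)]


def specificity_positions(clusters, min_conservation=0.8):
    """Return 0-based positions that are conserved within every cluster
    (dominant residue frequency >= min_conservation) but whose dominant
    residue is not the same in all clusters.

    Hypothetical criterion for illustration only, not the ASMC rule.
    """
    profiles = {name: column_profile(sites) for name, sites in clusters.items()}
    length = len(next(iter(profiles.values())))
    hits = []
    for pos in range(length):
        dominant = []
        for name, profile in profiles.items():
            residue, count = profile[pos].most_common(1)[0]
            if count / len(clusters[name]) < min_conservation:
                break  # position is not conserved inside this cluster
            dominant.append(residue)
        else:
            # Every cluster is conserved here; flag it if clusters disagree.
            if len(set(dominant)) > 1:
                hits.append(pos)
    return hits


# Toy data (invented): aligned active-site residues for two clusters.
clusters = {
    "cluster_A": ["GDSKT", "GDSKT", "GDSRT"],
    "cluster_B": ["GESKA", "GESKA", "GESKA"],
}
sdp = specificity_positions(clusters)
print(sdp)
```

With this toy input, positions 1 and 4 are flagged: each cluster is fully conserved there but the clusters carry different residues. Position 3 is skipped because cluster_A falls below the conservation threshold at that position.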
Availability: http://www.genoscope.fr/ASMC/.
Similar articles
- Functional classification of CATH superfamilies: a domain-based approach for protein function annotation. Bioinformatics. 2015 Nov 1;31(21):3460-7. doi: 10.1093/bioinformatics/btv398. Epub 2015 Jul 2. PMID: 26139634 Free PMC article.
- Isofunctional Protein Subfamily Detection Using Data Integration and Spectral Clustering. PLoS Comput Biol. 2016 Jun 27;12(6):e1005001. doi: 10.1371/journal.pcbi.1005001. eCollection 2016 Jun. PMID: 27348631 Free PMC article.
- Analysis and prediction of functional sub-types from protein sequence alignments. J Mol Biol. 2000 Oct 13;303(1):61-76. doi: 10.1006/jmbi.2000.4036. PMID: 11021970
- Enzymes, pseudoenzymes, and moonlighting proteins: diversity of function in protein superfamilies. FEBS J. 2020 Oct;287(19):4141-4149. doi: 10.1111/febs.15446. Epub 2020 Jun 30. PMID: 32534477 Review.
- Prediction of protein function from protein sequence and structure. Q Rev Biophys. 2003 Aug;36(3):307-40. doi: 10.1017/s0033583503003901. PMID: 15029827 Review.
Cited by
- New computational approaches to understanding molecular protein function. PLoS Comput Biol. 2018 Apr 5;14(4):e1005756. doi: 10.1371/journal.pcbi.1005756. eCollection 2018 Apr. PMID: 29621256 Free PMC article. No abstract available.
- Revealing the hidden functional diversity of an enzyme family. Nat Chem Biol. 2014 Jan;10(1):42-9. doi: 10.1038/nchembio.1387. Epub 2013 Nov 17. PMID: 24240508
- Domain-mediated interactions for protein subfamily identification. Sci Rep. 2020 Jan 14;10(1):264. doi: 10.1038/s41598-019-57187-z. PMID: 31937869 Free PMC article.
- Structure-guided selection of specificity determining positions in the human Kinome. BMC Genomics. 2016 Aug 18;17 Suppl 4(Suppl 4):431. doi: 10.1186/s12864-016-2790-3. PMID: 27556159 Free PMC article.
- An Atlas of Peroxiredoxins Created Using an Active Site Profile-Based Approach to Functionally Relevant Clustering of Proteins. PLoS Comput Biol. 2017 Feb 10;13(2):e1005284. doi: 10.1371/journal.pcbi.1005284. eCollection 2017 Feb. PMID: 28187133 Free PMC article.