Detecting homology of distantly related proteins with consensus sequences
- PMID: 3430622
- DOI: 10.1016/0022-2836(87)90200-2
Abstract
A simple protocol is described that is suitable for the detection of distantly related members of a protein family. In this procedure, similarity to a consensus sequence is used to distinguish chance similarity from similarity due to common ancestry. The consensus sequence is constructed from the sequences of established members of a protein family, and it incorporates features characteristic of the protein fold of this family: conserved residues, the pattern of variable and conserved segments, preferred locations of gaps, etc. The database is searched with the consensus sequence, using the unitary matrix or the log odds matrix for scoring the alignments, with a variable gap penalty. The advantage of the method is that it weights key residues, ignores sequence similarity in variable segments (thus partially eliminating "background noise" coming from chance similarity), and distinguishes gaps disrupting conserved segments from those occurring in positions known to be tolerant of gap events. The utility of the method was demonstrated in the case of the protein family homologous with the internal repeats of complement B as well as the internal repeats identified in fibroblast proteoglycan PG40. The consensus sequence method succeeded in finding some new members of these protein families that could not be detected by earlier methods of sequence comparison.
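The scoring idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 0.6 conservation threshold, the two gap-penalty values, the "." symbol for variable positions, and the toy alignment are all assumptions made for illustration. It uses the unitary matrix (1 for identity, 0 otherwise), weights each conserved position by how often its residue occurs in the family, gives variable positions zero weight, and charges a lower gap penalty at positions where family members already show gaps.

```python
from collections import Counter


def build_consensus(alignment, min_fraction=0.6, gap_conserved=2.0, gap_tolerant=0.5):
    """Derive a consensus, per-position weights and gap penalties.

    `alignment` is a list of equal-length aligned sequences with "-" for gaps.
    The threshold and penalty values are illustrative assumptions.
    """
    n = len(alignment)
    consensus, weights, gap_penalties = [], [], []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            continue  # skip columns that contain only gaps
        residue, count = Counter(residues).most_common(1)[0]
        fraction = count / n
        if fraction >= min_fraction:
            consensus.append(residue)   # conserved position, weighted
            weights.append(fraction)
        else:
            consensus.append(".")       # variable position, ignored in scoring
            weights.append(0.0)
        # Positions where the family already tolerates gaps are cheap to gap.
        has_gap = len(residues) < n
        gap_penalties.append(gap_tolerant if has_gap else gap_conserved)
    return "".join(consensus), weights, gap_penalties


def score_against_consensus(consensus, weights, gap_penalties, sequence):
    """Global alignment score of `sequence` against the consensus.

    Identity at a conserved position earns that position's weight (unitary
    matrix scaled by the weight); gaps cost the position-specific penalty.
    """
    m, n = len(consensus), len(sequence)
    F = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = F[i - 1][0] - gap_penalties[i - 1]
    for j in range(1, n + 1):
        F[0][j] = F[0][j - 1] - gap_penalties[0]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = weights[i - 1] if sequence[j - 1] == consensus[i - 1] else 0.0
            F[i][j] = max(
                F[i - 1][j - 1] + match,             # align residue to position i
                F[i - 1][j] - gap_penalties[i - 1],  # position i missing in sequence
                F[i][j - 1] - gap_penalties[i - 1],  # extra residue near position i
            )
    return F[m][n]


# Toy family alignment (invented data, not taken from the paper).
family = ["CKGWC", "CRGWC", "CK-WC", "CEGWC"]
consensus, weights, gaps = build_consensus(family)
full_match = score_against_consensus(consensus, weights, gaps, "CAGWC")
tolerated_gap = score_against_consensus(consensus, weights, gaps, "CAWC")
disrupted = score_against_consensus(consensus, weights, gaps, "CAGC")
```

In this toy case a candidate that lacks the residue at the gap-tolerant position scores higher than one that lacks a fully conserved residue, which is the distinction the abstract draws between the two kinds of gap. A real database search would more likely use local or end-gap-free alignment and a significance estimate; global alignment is used here only to keep the sketch short.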