Review
Curr Opin Struct Biol. 2012 Jun;22(3):316-25.
doi: 10.1016/j.sbi.2012.05.001. Epub 2012 May 24.

The use of evolutionary patterns in protein annotation

Angela D Wilkins et al. Curr Opin Struct Biol. 2012 Jun.

Abstract

With genomic data skyrocketing, their biological interpretation remains a serious challenge. Diverse computational methods address this problem by pointing to the existence of recurrent patterns among sequence, structure, and function. These patterns emerge naturally from evolutionary variation, natural selection, and divergence--the defining features of biological systems--and they identify molecular events and shapes that underlie specificity of function and allosteric communication. Here we review these methods, and the patterns they identify in case studies and in proteome-wide applications, to infer and rationally redesign function.

Figures

Figure 1
Evolutionary approaches to characterizing protein function rely on both global and local patterns. These include global sequence similarity (Homology) and local residue conservation (Motifs), as well as global structural similarity (Fold Recognition) and local structural similarity (3D Templates). Another pattern is evolutionary classification (Phylogenomics). The Evolutionary Trace (ET) combines these approaches by defining key structural or functional positions based on whether their evolutionary variations couple to small evolutionary divergences (blue, where the breaks between rectangles indicate residue variations) or to large ones (red). Top-ranked positions typically map out functional sites, which can guide targeted mutations or be used to extract functional motifs, such as those for 3D templates. Proteome-wide 3D template matches between structures give rise to a proteomic network that can be analyzed for global function prediction.
Figure 2
Distribution of the statistical clustering z-score of ET residues in 10417 proteins from the PDB90. This z-score is the difference between the observed clustering and the clustering expected at random, in units of standard deviation. A z-score can be obtained at any ET coverage of a protein. This histogram shows the maximum clustering z-score between 0% and 50% coverage, which is representative of z-scores over most of this interval. The high values (94% with z-score > 2) show that, as a general rule, evolutionarily important residues cluster together in the protein.
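
The z-score definition in the Figure 2 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the clustering statistic used here (a count of selected residue pairs closer than a distance cutoff), the 4.0 Å cutoff, the function names, and the random-sampling estimate of the expected value and standard deviation are all assumptions made for illustration. Only the final formula, observed minus expected divided by the standard deviation, comes from the caption.

```python
import math
import random
import statistics


def clustering_weight(selected, coords, cutoff=4.0):
    """Count pairs of selected residues closer than `cutoff`.

    Hypothetical clustering statistic; the cutoff value is an assumption.
    """
    sel = sorted(selected)
    count = 0
    for a in range(len(sel)):
        for b in range(a + 1, len(sel)):
            if math.dist(coords[sel[a]], coords[sel[b]]) < cutoff:
                count += 1
    return count


def clustering_z_score(selected, coords, n_random=1000, cutoff=4.0, seed=0):
    """z = (observed - expected) / standard deviation.

    Expected value and standard deviation are estimated here by drawing
    random residue sets of the same size (an assumption of this sketch).
    """
    rng = random.Random(seed)
    observed = clustering_weight(selected, coords, cutoff)
    k = len(selected)
    indices = range(len(coords))
    random_weights = [
        clustering_weight(rng.sample(indices, k), coords, cutoff)
        for _ in range(n_random)
    ]
    mean = statistics.fmean(random_weights)
    sd = statistics.pstdev(random_weights)
    if sd == 0:
        return 0.0  # no variation among random sets; z is undefined
    return (observed - mean) / sd


# Toy "protein": 20 residues on a straight line, 3.8 units apart.
coords = [(3.8 * i, 0.0, 0.0) for i in range(20)]
clustered = [0, 1, 2, 3, 4]      # contiguous, so neighbours are in contact
scattered = [0, 4, 8, 12, 16]    # spread out, so no contacts

z_clustered = clustering_z_score(clustered, coords)
z_scattered = clustering_z_score(scattered, coords)
print(f"clustered: {z_clustered:.2f}, scattered: {z_scattered:.2f}")
```

In this toy case the contiguous set scores well above 2 and the scattered set scores below zero, which mirrors how the caption reads a high z-score as evidence that top-ranked residues cluster more than expected by chance.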


