BMC Bioinformatics. 2010 Feb 12;11:86.
doi: 10.1186/1471-2105-11-86.

Enrichment of homologs in insignificant BLAST hits by co-complex network alignment

Like Fokkens et al. BMC Bioinformatics.

Abstract

Background: Homology is a crucial concept in comparative genomics. BLAST is probably the most widely used algorithm for homology detection in comparative genomics. Usually a stringent score cutoff is applied to distinguish putative homologs from possible false positive hits. As a consequence, some BLAST hits that are in fact homologous are discarded.

Results: Analogous to the use of genomic context in genome alignments, we test whether conserved functional context can be used to select candidate homologs from insignificant BLAST hits. We make a co-complex network alignment between complex subunits in yeast and human and find that proteins with an insignificant BLAST hit that are part of homologous complexes are likely to be homologous themselves. Further analysis of the distant homologs recovered using the co-complex network alignment shows that a large majority of them are in fact ancient paralogs.

Conclusions: Our results show that, even though evolution takes place at the sequence and genome level, co-complex networks can be used as circumstantial evidence to improve confidence in the homology of distantly related sequences.

Figures

Figure 1
Co-complex network alignment and homology inference in insignificant BLAST hits. Green lines: human-yeast unambiguous and readily identifiable orthologs (human and yeast proteins in one Inparanoid cluster), gray dotted line: insignificant BLAST hit. If two proteins with an insignificant BLAST hit are subunits of homologous complexes, are these proteins more likely to be homologous than would follow from the score returned by BLAST?
Figure 2
Fraction of true positives per E-value bin for different subsets of BLAST hits. 'BLAST' (blue line): all BLAST hits. 'BLAST+cocomplex' (red line): BLAST hits for which both the human query and the yeast hit are part of a co-complex network. 'BLAST+cocomplex+inparanoid' (brown line): BLAST hits for which both the human query and the yeast hit are part of a co-complex network and both have a direct co-complex network neighbour that has a clear ortholog in the other species (the neighbour is part of a human-yeast Inparanoid cluster). 'BLAST+network alignment' (green line): BLAST hits for which both the human query and the yeast hit are part of a co-complex network and both have a direct co-complex network neighbour, and these neighbours are clear orthologs of each other (they are part of the same human-yeast Inparanoid cluster).
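The four subsets in this caption are nested, so each BLAST hit can be assigned to the most specific subset it satisfies. The following is a minimal sketch of that assignment, not the authors' implementation. It assumes each co-complex network is a dict mapping a protein to the set of its direct neighbours, and that `cluster_of` maps every protein in a human-yeast Inparanoid cluster to a cluster identifier. The function name, the data layout, and the protein names are hypothetical.

```python
def classify_hit(human, yeast, human_net, yeast_net, cluster_of):
    """Return the most specific subset label for one human-yeast BLAST hit.

    human_net / yeast_net: dict protein -> set of direct co-complex neighbours
    (assumed layout). cluster_of: dict protein -> Inparanoid cluster id,
    containing only proteins that sit in a human-yeast cluster (assumed layout).
    """
    # Both proteins must be part of a co-complex network.
    if human not in human_net or yeast not in yeast_net:
        return "BLAST"

    # Inparanoid clusters reached through direct co-complex neighbours.
    human_clusters = {cluster_of[n] for n in human_net[human] if n in cluster_of}
    yeast_clusters = {cluster_of[n] for n in yeast_net[yeast] if n in cluster_of}

    # Neighbours on both sides share a cluster: they are orthologs of each other.
    if human_clusters & yeast_clusters:
        return "BLAST+network alignment"
    # Both sides have a neighbour with a clear ortholog, but not the same cluster.
    if human_clusters and yeast_clusters:
        return "BLAST+cocomplex+inparanoid"
    return "BLAST+cocomplex"


# Toy data with hypothetical protein names.
human_net = {"hA": {"hB"}, "hC": {"hD"}, "hE": {"hF"}}
yeast_net = {"yA": {"yB"}, "yC": {"yD"}, "yE": {"yF"}}
cluster_of = {"hB": 1, "yB": 1, "hD": 2, "yD": 3}

print(classify_hit("hA", "yA", human_net, yeast_net, cluster_of))
```

Because the subsets are nested, a hit labelled 'BLAST+network alignment' also counts towards the three broader curves when the fraction of true positives per E-value bin is computed.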
Figure 3
The Multisynthetase complex. Yeast homologs were detected for all subunits of the Multisynthetase complex. Green solid lines link proteins which are together in an Inparanoid cluster, green dashed lines indicate a significant BLAST hit between the two proteins linked, gray dashed lines indicate insignificant BLAST hits between proteins for which homology is confirmed by the co-complex network alignment.
Figure 4
Evolutionary histories that explain why, for a query protein in human, we find both a close and a distant homolog in yeast. Some proteins for which we recover a distant homolog in yeast with our method in fact have a better hit (a closer homolog) in yeast. The three scenarios depicted here could have this effect. We test which scenario occurs more often by checking whether the distant homolog in yeast (Yeast2 in this figure) has a closer homolog in human than Human1. Red square: gene duplication event, green circle: speciation event.
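The check described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it uses the BLAST E-value as a proxy for evolutionary closeness (lower means closer), and it takes the hits of Yeast2 against the human proteome as a dict from human protein to E-value. The function name and the protein names are hypothetical.

```python
def has_closer_human_homolog(yeast2_hits, human1):
    """Return True if Yeast2 hits some human protein better than Human1.

    yeast2_hits: dict human protein -> E-value of the Yeast2 BLAST hit
    (assumed layout). A lower E-value is treated as a closer homolog,
    which is a simplifying assumption.
    """
    # If Human1 is not among the hits, any reported hit counts as closer.
    e_human1 = yeast2_hits.get(human1, float("inf"))
    return any(
        e_value < e_human1
        for protein, e_value in yeast2_hits.items()
        if protein != human1
    )


# Toy example with hypothetical proteins and E-values.
hits = {"Human1": 0.5, "Human2": 1e-40}
print(has_closer_human_homolog(hits, "Human1"))
```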
