BMC Syst Biol. 2013 Jul 17:7:61.
doi: 10.1186/1752-0509-7-61.

A novel function prediction approach using protein overlap networks

Shide Liang et al. BMC Syst Biol. 2013.

Abstract

Background: Construction of a reliable network remains the bottleneck for network-based protein function prediction. We built an artificial network model, the protein overlap network (PON), for the entire genome of each of four species: yeast, fly, worm, and human. Each node of the network represents a protein, and two proteins are connected if they share a domain according to the InterPro database.
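The construction rule above (one node per protein, an edge whenever two proteins share a domain) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `build_pon`, the protein names `P1`..`P4`, and the domain names `D1`..`D3` are all hypothetical, and in practice the protein-to-domain map would come from InterPro annotations.

```python
from collections import defaultdict
from itertools import combinations


def build_pon(protein_domains):
    """Return an adjacency map: protein -> set of proteins sharing a domain.

    `protein_domains` maps each protein ID to the set of its domain IDs.
    (Hypothetical helper; the input format is an assumption.)
    """
    # Invert the mapping: domain -> proteins that contain it.
    domain_to_proteins = defaultdict(set)
    for protein, domains in protein_domains.items():
        for domain in domains:
            domain_to_proteins[domain].add(protein)

    # Every protein is a node, even if it ends up with no neighbors.
    network = {protein: set() for protein in protein_domains}

    # Connect every pair of proteins that carry the same domain.
    for members in domain_to_proteins.values():
        for a, b in combinations(sorted(members), 2):
            network[a].add(b)
            network[b].add(a)
    return network


# Made-up toy data, for illustration only.
toy_domains = {
    "P1": {"D1", "D2"},
    "P2": {"D2"},
    "P3": {"D1"},
    "P4": {"D3"},
}
pon = build_pon(toy_domains)
```

In this toy input, `P1` is linked to `P2` (shared `D2`) and to `P3` (shared `D1`), while `P4` stays isolated because no other protein carries `D3`.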

Results: The function of a protein can be predicted by counting the occurrence frequency of GO (Gene Ontology) terms associated with the domains of its direct neighbors. The average success rate and coverage were 34.3% and 43.9%, respectively, for the test genomes, and increased to 37.9% and 51.3% when a composite PON of the four species was used for the prediction. By comparison, the success rate of the random control procedure was 7.0%. We also made predictions with GO term annotations of the second-layer nodes using the composite network and obtained a success rate above 30% and coverage above 30%, even for small genomes. Further improvement was achieved by statistical analysis of manually annotated GO terms for each neighboring protein.
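The frequency-counting idea can be illustrated as follows. This is a sketch under stated assumptions rather than the published procedure: the abstract does not say how ties are broken or whether a term is counted once per neighbor or once per domain, so the code counts once per (neighbor, domain) pair and breaks ties by term ID. The function name `predict_go_terms`, the node names, and the domain-to-GO map are hypothetical; the two GO IDs are the standard identifiers for ATP binding (GO:0005524) and protein binding (GO:0005515).

```python
from collections import Counter


def predict_go_terms(protein, network, protein_domains, domain_go, top_n=1):
    """Rank GO terms by how often they occur on domains of direct neighbors.

    Assumption: each (neighbor, domain) pair contributes one count per GO term.
    """
    counts = Counter()
    for neighbor in network.get(protein, ()):
        for domain in protein_domains.get(neighbor, ()):
            counts.update(domain_go.get(domain, ()))

    # Highest count first; ties are broken by term ID for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


# Made-up toy data: query protein Q with three direct neighbors.
network = {"Q": {"N1", "N2", "N3"}}
protein_domains = {"N1": {"D1"}, "N2": {"D1", "D2"}, "N3": {"D3"}}
domain_go = {
    "D1": {"GO:0005524"},  # ATP binding
    "D2": {"GO:0005515"},  # protein binding
    "D3": {"GO:0005524"},  # ATP binding
}

best = predict_go_terms("Q", network, protein_domains, domain_go)
top_two = predict_go_terms("Q", network, protein_domains, domain_go, top_n=2)
```

Here GO:0005524 is seen three times (via `D1` on two neighbors and `D3` on one) and GO:0005515 once, so the top-ranked prediction for `Q` is GO:0005524.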

Conclusions: The PONs are composed of dense modules accompanied by a few long-distance connections. Based on the PONs, we developed multiple approaches that are effective for protein function prediction.

Figures

Figure 1
Relationship between quantity and size of sub-graphs. The PON of an entire genome was calculated.
Figure 2
Main sub-graph of yeast PON. (A) Main sub-graph consisting of two modules. (B) Network connections between the two modules. A dashed line represents an edge between two nodes, and an arrow indicates a domain connection within a protein. The two modules in the graph are connected by the RWD domain [Pfam: PF05773], which is associated with protein binding function. Pkinase [Pfam: PF00069, upper module], DEAD [Pfam: PF00270, lower module], and Helicase_C [Pfam: PF00271, lower module] are the prevailing domains associated with ATP binding function. Only domains annotated in the InterPro database are shown for proteins GCN2 [UniProt: P15442], GIR2 [UniProt: Q03768], IMPACT homolog [UniProt: P25637], and a putative ATP-dependent RNA helicase [UniProt: Q06698].
Figure 3
Distribution of degree values in human PON. The points in the box represent huge clusters of proteins with the same domain composition.
Figure 4
Effect of domain diversity on function prediction accuracy. A GO term was used for prediction only if it was associated with at least a certain number of domain types at the neighboring nodes.
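The domain-diversity filter described in the Figure 4 caption can be sketched as a threshold on the number of distinct domain types that support a GO term among the neighbors. This is an illustrative reading of the caption only: the function name `go_terms_with_domain_support`, the parameter `min_domain_types`, and the toy data are hypothetical, and the thresholds the authors actually tested are not given here.

```python
from collections import defaultdict


def go_terms_with_domain_support(neighbors, protein_domains, domain_go,
                                 min_domain_types):
    """Keep GO terms backed by at least `min_domain_types` distinct domains.

    Assumption: a domain type "supports" a term if it occurs on any
    neighboring protein and is annotated with that term.
    """
    supporting = defaultdict(set)  # GO term -> distinct supporting domains
    for neighbor in neighbors:
        for domain in protein_domains.get(neighbor, ()):
            for term in domain_go.get(domain, ()):
                supporting[term].add(domain)
    return {term for term, domains in supporting.items()
            if len(domains) >= min_domain_types}


# Made-up toy data, for illustration only.
toy_neighbors = ["N1", "N2", "N3"]
toy_protein_domains = {"N1": {"D1"}, "N2": {"D1", "D2"}, "N3": {"D3"}}
toy_domain_go = {
    "D1": {"GO:0005524"},
    "D2": {"GO:0005515"},
    "D3": {"GO:0005524"},
}

strict = go_terms_with_domain_support(
    toy_neighbors, toy_protein_domains, toy_domain_go, min_domain_types=2)
loose = go_terms_with_domain_support(
    toy_neighbors, toy_protein_domains, toy_domain_go, min_domain_types=1)
```

With a threshold of 2, only GO:0005524 survives because it is supported by two different domain types (`D1` and `D3`); with a threshold of 1, both terms are kept.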
