Nucleic Acids Res. 2011 Mar;39(6):e34.
doi: 10.1093/nar/gkq1232. Epub 2010 Dec 21.

GiRaF: robust, computational identification of influenza reassortments via graph mining

Niranjan Nagarajan et al. Nucleic Acids Res. 2011 Mar.

Abstract

Reassortments in the influenza virus--a process where strains exchange genetic segments--have been implicated in two out of three pandemics of the 20th century as well as the 2009 H1N1 outbreak. While advances in sequencing have led to an explosion in the number of whole-genome sequences that are available, an understanding of the rate and distribution of reassortments and their role in viral evolution is still lacking. An important factor in this is the paucity of automated tools for confident identification of reassortments from sequence data due to the challenges of analyzing large, uncertain viral phylogenies. We describe here a novel computational method, called GiRaF (Graph-incompatibility-based Reassortment Finder), that robustly identifies reassortments in a fully automated fashion while accounting for uncertainties in the inferred phylogenies. The algorithms behind GiRaF search large collections of Markov chain Monte Carlo (MCMC)-sampled trees for groups of incompatible splits using a fast biclique enumeration algorithm coupled with several statistical tests to identify sets of taxa with differential phylogenetic placement. GiRaF correctly finds known reassortments in human, avian, and swine influenza populations, including the evolutionary events that led to the recent 'swine flu' outbreak. GiRaF also identifies several previously unreported reassortments via whole-genome studies to catalog events in H5N1 and swine influenza isolates.

Figures

Figure 1.
Incompatibility graph. The graph contains a node for every split observed in any sampled tree. Edges connect incompatible splits contained in trees from different segments. The weight of a subset of splits is equal to the probability that the true tree contains one of the splits, estimated here as the weighted fraction of sampled trees that contain at least one of the splits. Darker lines indicate a biclique and boxes show the trees that contain one of the splits involved in the biclique. For a high-confidence biclique, the true trees for the segments cannot simultaneously come from the corresponding boxed sets of trees.
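The quantities in this caption can be illustrated with a minimal Python sketch. It is not GiRaF's implementation. It assumes each split is stored as one side of a bipartition over a shared taxon set, that splits are represented canonically so that set membership works, and that each sampled tree comes with a weight. The names `incompatible`, `is_biclique` and `subset_weight` are hypothetical, and `is_biclique` only checks a given pair of split sets rather than enumerating bicliques.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Taxa = FrozenSet[str]  # one side of a split; the other side is its complement


def incompatible(s1: Taxa, s2: Taxa, taxa: Taxa) -> bool:
    """Two splits conflict when all four pairwise side intersections are non-empty."""
    c1, c2 = taxa - s1, taxa - s2
    return all((s1 & s2, s1 & c2, c1 & s2, c1 & c2))


def is_biclique(left: Iterable[Taxa], right: Iterable[Taxa], taxa: Taxa) -> bool:
    """True if every split from one segment conflicts with every split from the other."""
    right = list(right)
    return all(incompatible(a, b, taxa) for a in left for b in right)


def subset_weight(splits: Set[Taxa], sampled_trees: List[Tuple[float, Set[Taxa]]]) -> float:
    """Weighted fraction of sampled trees containing at least one of the given splits."""
    total = sum(w for w, _ in sampled_trees)
    hit = sum(w for w, tree_splits in sampled_trees if tree_splits & splits)
    return hit / total


# Toy data (assumed for illustration only).
taxa = frozenset("abcde")
trees = [(1.0, {frozenset("ab")}), (1.0, {frozenset("ac")}), (2.0, {frozenset("de")})]
weight = subset_weight({frozenset("ab"), frozenset("ac")}, trees)
```

Here the splits {a, b} and {a, c} conflict, and two of the four units of tree weight belong to trees containing one of them.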
Figure 2.
Reassortment candidates. The pair of incompatible splits in the two segment trees define four candidate sets (obtained by computing intersections, {a, b}∩{a, c} = {a}, {a, b} ∩ {b, d, e} = {b}, {c, d, e} ∩ {a, c} = {c} and {c, d, e} ∩ {b, d, e} = {d, e}), one of which is the reassortment set ({b}). The set {b} also satisfies the condition that it is similar to some taxa and more diverged with respect to others when comparing the two segment trees. Note that set {d} also has this property, demonstrating that it is not a sufficient condition for identifying a reassortment.
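The four intersections listed in this caption can be reproduced with a short Python sketch. The function name `candidate_sets` is hypothetical, and the sketch covers only the intersection step. It does not cover the statistical tests that select the reassortment set among the candidates.

```python
from typing import FrozenSet, List

Taxa = FrozenSet[str]


def candidate_sets(split1: Taxa, split2: Taxa, taxa: Taxa) -> List[Taxa]:
    """Intersect each side of the first split with each side of the second."""
    sides1 = (split1, taxa - split1)
    sides2 = (split2, taxa - split2)
    return [x & y for x in sides1 for y in sides2]


# The example from the caption: {a, b} | {c, d, e} versus {a, c} | {b, d, e}.
taxa = frozenset("abcde")
candidates = candidate_sets(frozenset("ab"), frozenset("ac"), taxa)
# candidates holds {a}, {b}, {c} and {d, e}, in that order
```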
Figure 3.
Multiple reassortments in recent human influenza A (H3N2) isolates. Consensus trees [from sampled trees in GiRaF, using MrBayes (25)] for (a) HA segment and (b) NA segment for the 156 isolates studied in Holmes et al. (6). The three candidate reassortments identified by GiRaF are {A/New York/52/2004, A/New York/59/2003} and {A/New York/32/2003, A/New York/198/2003, A/New York/199/2003}, which were also previously identified, and the novel candidate {A/New York/105/2002}. The candidate reassortments are highlighted on the trees (drawn using Mesquite version 2.72, http://mesquiteproject.org). Note that some clades have been collapsed for clarity and the full trees can be seen in Supplementary Figure S5.
Figure 4.
Analysis of avian influenza isolates from Salzberg et al. (7). Consensus trees for (a) PB1 segment and (b) PA segment. The two candidate reassortments identified by GiRaF are {A/chicken/Nigeria/1047-62/2006}, which was previously identified, and the novel candidate {A/cygnus olor/Italy/742/2006}. Both are highlighted on the trees (drawn using Mesquite).
Figure 5.
Sensitivity of GiRaF as a function of phylogenetic distance. Results from the ‘All Events’ dataset in Table 1 were categorized based on the F84 distance of implanted reassortments (from their original location), and the corresponding frequency histogram was graphed. This distance is a proxy for the sequence similarity of the (unobserved) ancestral sequences from which the two segments derived. GiRaF has nearly perfect sensitivity for implants with F84 distance >0.005, suggesting that the false positives are largely due to the challenge of distinguishing subtle events from phylogenetic noise.
Figure 6.
Robustness to complex reassortment histories. The graph summarizes results from four datasets where minsize = 1, maxsize = 20 and count was varied over the set {1, 2, 5, 10}, testing GiRaF’s robustness to multiple reassortments and complex histories. While the task of identifying the original implants (‘Exact’ results) becomes increasingly intractable, GiRaF’s sensitivity and PPV remain stable under a more relaxed definition of matches (‘Relaxed’ results).
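Sensitivity and PPV (positive predictive value) have standard definitions: sensitivity is the fraction of implanted events that are recovered, and PPV is the fraction of reported events that correspond to an implant. The Python sketch below scores predictions under exact set equality. The function name `score` and the pluggable `match` predicate are assumptions made for illustration, since the relaxed matching criterion is not specified in this text.

```python
from typing import Callable, FrozenSet, List, Tuple

Taxa = FrozenSet[str]


def score(
    predicted: List[Taxa],
    implanted: List[Taxa],
    match: Callable[[Taxa, Taxa], bool] = lambda p, t: p == t,
) -> Tuple[float, float]:
    """Return (sensitivity, PPV) for predicted versus implanted reassortment sets."""
    # Implanted events matched by at least one prediction.
    recovered = sum(any(match(p, t) for p in predicted) for t in implanted)
    # Predictions that match at least one implanted event.
    correct = sum(any(match(p, t) for t in implanted) for p in predicted)
    sensitivity = recovered / len(implanted) if implanted else float("nan")
    ppv = correct / len(predicted) if predicted else float("nan")
    return sensitivity, ppv


# Toy data (assumed): one of three implants recovered, one of two predictions correct.
implanted = [frozenset("b"), frozenset("c"), frozenset("d")]
predicted = [frozenset("b"), frozenset("x")]
sens, ppv = score(predicted, implanted)
```

A looser criterion could be supplied through `match`, for example a test for overlapping taxon sets, but any such rule would be a stand-in rather than the definition used for the ‘Relaxed’ results.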

References

    1. Kawaoka Y, Krauss S, Webster RG. Avian-to-human transmission of the PB1 gene of influenza A viruses in the 1957 and 1968 pandemics. J. Virol. 1989;63:4603–4608.
    2. Dawood FS, Jain S, Finelli L, Shaw MW, Lindstrom S, Garten RJ, Gubareva LV, Xu X, Bridges CB, Uyeki TM. Emergence of a novel swine-origin influenza A (H1N1) virus in humans. N. Engl. J. Med. 2009;360:2605–2615.
    3. Ghedin E, Sengamalay NA, Shumway M, Zaborsky J, Feldblyum T, Subbu V, Spiro DJ, Sitz J, Koo H, Bolotov P, et al. Large-scale sequencing of human influenza reveals the dynamic nature of viral genome evolution. Nature. 2005;437:1162–1166.
    4. Rambaut A, Pybus OG, Nelson MI, Viboud C, Taubenberger JK, Holmes EC. The genomic and epidemiological dynamics of human influenza A virus. Nature. 2008;453:615–619.
    5. Rabadan R, Levine AJ, Krasnitz M. Non-random reassortment in human influenza A viruses. Influenza Other Resp. Viruses. 2008;2:9–22.
