Nucleic Acids Res. 2012 Dec;40(22):11189-201.
doi: 10.1093/nar/gks918. Epub 2012 Oct 12.

LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets

Andreas Wilm et al. Nucleic Acids Res. 2012 Dec.

Abstract

The study of cell-population heterogeneity in a range of biological systems, from viruses to bacterial isolates to tumor samples, has been transformed by recent advances in sequencing throughput. While the high-coverage afforded can be used, in principle, to identify very rare variants in a population, existing ad hoc approaches frequently fail to distinguish true variants from sequencing errors. We report a method (LoFreq) that models sequencing run-specific error rates to accurately call variants occurring in <0.05% of a population. Using simulated and real datasets (viral, bacterial and human), we show that LoFreq has near-perfect specificity, with significantly improved sensitivity compared with existing methods and can efficiently analyze deep Illumina sequencing datasets without resorting to approximations or heuristics. We also present experimental validation for LoFreq on two different platforms (Fluidigm and Sequenom) and its application to call rare somatic variants from exome sequencing datasets for gastric cancer. Source code and executables for LoFreq are freely available at http://sourceforge.net/projects/lofreq/.
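The abstract does not spell out the statistical model, so the following is only a minimal sketch of one way to test a candidate variant against quality-derived error rates. It assumes that each base's error probability comes from its Phred quality score, that errors are independent across reads, and that every error produces the candidate base (a deliberate simplification). The function names `phred_to_error_prob`, `poisson_binomial_tail` and `is_candidate_variant`, and the threshold `alpha`, are hypothetical and are not taken from the LoFreq source code.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to an error probability (Q = -10 log10 p)."""
    return 10 ** (-q / 10)


def poisson_binomial_tail(error_probs, k):
    """Return P(X >= k), where X counts errors among independent bases
    and base i is erroneous with probability error_probs[i]."""
    n = len(error_probs)
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    # dist[j] = P(exactly j errors so far) for j < k.
    # Probability mass that reaches k errors is absorbed into `tail`.
    dist = [1.0] + [0.0] * (k - 1)
    tail = 0.0
    for p in error_probs:
        tail += dist[k - 1] * p
        for j in range(k - 1, 0, -1):
            dist[j] = dist[j] * (1 - p) + dist[j - 1] * p
        dist[0] *= (1 - p)
    return tail


def is_candidate_variant(qualities, n_alt, alpha=0.05):
    """Flag a column as a candidate variant if seeing n_alt or more
    non-reference bases is unlikely under sequencing error alone.
    `alpha` is an illustrative threshold (assumption), uncorrected
    for multiple testing."""
    probs = [phred_to_error_prob(q) for q in qualities]
    return poisson_binomial_tail(probs, n_alt) < alpha


if __name__ == "__main__":
    # Hypothetical column: 1000 reads, all at Q30, 10 non-reference bases.
    column = [30] * 1000
    print(poisson_binomial_tail([phred_to_error_prob(q) for q in column], 10))
    print(is_candidate_variant(column, 10))
```

Because the tail probability is computed exactly by dynamic programming over the per-base error probabilities, the sketch needs no normal or Poisson approximation; its cost grows with coverage times the number of observed non-reference bases.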

Figures

Figure 1.
In silico and experimental validation. (a) Sensitivity as a function of SNV frequency for LoFreq, SNVer and Breseq on a simulated viral population (see ‘Materials and Methods’ section). (b) Venn diagram showing the overlap of SNV predictions on the simulated population. (c) Detection limits for LoFreq and SNVer as a function of sequencing quality and coverage. Note that SNVer results are unaffected by varying quality values. (d) Validation results for rare variants on a Fluidigm Digital Array. Standard deviations are shown as boxes with error-bars. Note that three assays failed (reporting a non-sense frequency of 50%) and are not shown here.
Figure 2.
SNV calling in the presence of tumor sample heterogeneity. Germline and somatic variant frequencies for paired tumor-normal exome sequencing datasets from a custom samtools-based pipeline (32) are compared here with those from LoFreq (see ‘Materials and Methods’ section). As shown, while germline variants are consistently distributed around 50% (as expected for heterozygous variants), somatic variants are shifted to lower frequencies, likely due to contamination in the tumor sample from normal stromal tissue. Note that while samtools-based somatic calls appear ‘clipped’ at lower frequencies, LoFreq calls are symmetrically distributed as expected.
Figure 3.
Mutational hotspots and cold-spots in the dengue virus genome. Circos plots (56) of mutational hotspots and cold-spots derived from clinical (a) DENV1 and (b) DENV2 samples. Outer ring: gene annotation; inner ring: average coverage (log10-scaled). The inner bars mark mutational hotspots (red) and cold-spots (blue), which were derived from intra-host variations called by LoFreq (see ‘Materials and Methods’ section). Height of hotspots indicates how often the hotspot was found (sqrt(count)), whereas the height of cold-spots is fixed. The cold-spot in prM is shared between both serotypes. The last hotspot window in NS1 for the DENV2 samples was only found in pre-dose samples (Table 3) and disappears at later time points.
Figure 4.
Structural view of hot and cold-spots in the dengue virus genome. (a) Surface representation of dengue virus NS5 methyltransferase (PDB accession number 1R6A). The nucleoside-analog ribavirin 5′-triphosphate (RTP) is shown in blue and the by-product of S-adenosyl-l-methionine (SAM) after the transfer of a methyl group, S-adenosyl-l-homocysteine (SAH), is in red, both in ball-and-stick representation. Cold-spots are colored in violet. The first group of cold-spots consists of contiguous residues which completely enclose the binding site for SAM. SAM molecules serve as a methyl donor in the reaction catalyzed by the NS5 methyltransferase, which results in the capping of viral mRNAs. The second group of cold-spots corresponds to the carboxyl end of the NS5 methyltransferase, which acts as the linker region that connects the domain to the NS5 polymerase domain. (b) Surface representation of dengue virus NS5 RNA-dependent RNA polymerase (PDB accession number 2J7W). The GDD catalytic triad is colored in red whereas the cold-spots identified from SNV analysis are colored in violet. Cold-spots include the dengue virus NS5 RNA-dependent RNA polymerase GDD catalytic triad and also parts of the template tunnel through which the viral RNA substrate enters and exits during replication.

References

    1. Eigen M. Selforganization of matter and the evolution of biological macromolecules. Die Naturwissenschaften. 1971;58:465–523.
    2. Thai KTD, Henn MR, Zody MC, Tricou V, Nguyet NM, Charlebois P, Lennon NJ, Green L, de Vries PJ, Hien TT, et al. High-resolution analysis of intrahost genetic diversity in dengue virus serotype 1 infection identifies mixed infections. J. Virol. 2012;86:835–843.
    3. Lee HH, Molla MN, Cantor CR, Collins JJ. Bacterial charity work leads to population-wide resistance. Nature. 2010;467:82–85.
    4. Toprak E, Veres A, Michel J-B, Chait R, Hartl DL, Kishony R. Evolutionary paths to antibiotic resistance under dynamically sustained drug selection. Nat. Genet. 2012;44:101–105.
    5. Blaby IK, Lyons BJ, Wroclawska-Hughes E, Phillips GCF, Pyle TP, Chamberlin SG, Benner SA, Lyons TJ, Crécy-Lagard V de, Crécy E de. Experimental evolution of a facultative thermophile from a mesophilic ancestor. Appl. Environ. Microbiol. 2012;78:144–155.