Genome Biol. 2022 Dec 27;23(1):271.
doi: 10.1186/s13059-022-02840-6.

Truvari: refined structural variant comparison preserves allelic diversity


Adam C English et al. Genome Biol. 2022.

Abstract

The fundamental challenge of multi-sample structural variant (SV) analysis, such as merging and benchmarking, is identifying when two SVs are the same. Common approaches for comparing SVs were developed alongside technologies that produce ill-defined boundaries. As SV detection becomes more exact, algorithms to preserve this refined signal are needed. Here, we present Truvari, an SV comparison, annotation, and analysis toolkit, and demonstrate the effect of SV comparison choices by building population-level VCFs from 36 haplotype-resolved long-read assemblies. We observe over-merging from other SV merging approaches, which causes up to a 2.2× inflation of allele frequency relative to Truvari.
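To make the allele-frequency effect concrete, the following is a minimal sketch, not Truvari's implementation. It assumes a hypothetical toy locus with four diploid samples and two distinct insertion alleles, and it models over-merging as collapsing both records into one. The function names `allele_frequency` and `over_merge` and all genotype data are illustrative assumptions.

```python
def allele_frequency(genotypes):
    """Fraction of haplotypes carrying the allele.

    genotypes: list of (hap1, hap2) tuples, 1 = allele present, 0 = absent.
    """
    carried = sum(h for gt in genotypes for h in gt)
    return carried / (2 * len(genotypes))


def over_merge(gt_a, gt_b):
    """Collapse two variant records into one (hypothetical over-merge).

    A haplotype carries the merged record if it carried either input allele.
    """
    return [(max(a1, b1), max(a2, b2)) for (a1, a2), (b1, b2) in zip(gt_a, gt_b)]


# Hypothetical toy data: two distinct insertion alleles at the same locus.
allele_a = [(1, 0), (0, 0), (1, 0), (0, 0)]
allele_b = [(0, 1), (0, 1), (0, 0), (0, 0)]

af_a = allele_frequency(allele_a)
af_b = allele_frequency(allele_b)
af_merged = allele_frequency(over_merge(allele_a, allele_b))

# Inflation relative to the mean frequency of the distinct alleles.
inflation = af_merged / ((af_a + af_b) / 2)
print(af_a, af_b, af_merged, inflation)
```

In this toy case each distinct allele is carried by 2 of 8 haplotypes, while the collapsed record is carried by 4 of 8, so the reported frequency doubles even though no individual allele became more common.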

Keywords: SV annotation; SV benchmarking; SV comparison; SV merging; Structural variation.


Conflict of interest statement

FJS received research support from PacBio and Oxford Nanopore.

Figures

Fig. 1
Overview of the Truvari method and comparison metrics. a Schematic illustrating the Truvari bench matching approach of a baseline and comparison (comp) VCF. b–e Comparison metrics used by Truvari to measure similarity
Fig. 2
Intra-sample merging. a Distributions of similarity metrics of SVs between NA24385 haplotypes. Colors are thresholds for sequence and size similarity. b Effect of stringency on intra-sample merging SV counts for GRCh38. The trend line is the average number of SVs per merge. Separation of samples is attributable to ancestry
Fig. 3
Merging strategies’ impact on pVCF number of SVs and their allele frequency over Ensembl genes. a Count of deletions and insertions produced by each merging strategy. b Average allele frequency of SVs as merging leniency increases
Fig. 4
Investigation of tandem repeats to assess merging strategies’ performance. a Illustration of a locus where eight insertion alleles (Input) have between +2 and +5 copies of a 29-bp repeat across 10 samples. Four of the alleles are annotated as redundant representations (blue) since they have a counterpart with an equal number of copies (orange). A correct merge would preserve each of the unique alleles and remove all redundant alleles, leaving 0 missing and 0 redundant SVs in the locus. An incorrect merge removes two unique insertions (+3, +4) and leaves 1 redundant insertion. b Boxplot of the number of missing variants per locus for each merging strategy. c Barplot of the number of loci with none or any redundant alleles post-merging
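The missing and redundant counts described in the Fig. 4 caption can be sketched as follows. This is an illustrative simplification, not the authors' evaluation code: it assumes each allele is fully identified by its repeat copy number, and the function name `assess_merge` and the example merge outputs are hypothetical.

```python
from collections import Counter


def assess_merge(input_copies, merged_copies):
    """Count missing and redundant alleles in a merged locus.

    input_copies:  copy-number gain of each input allele (e.g. 2 for +2).
    merged_copies: copy-number gain of each allele kept after merging.
    Assumption: two alleles are the same if and only if their copy
    numbers are equal.
    """
    unique_expected = set(input_copies)
    kept = Counter(merged_copies)
    # Unique alleles that no longer appear after merging.
    missing = len(unique_expected - set(kept))
    # Every extra copy of an already-kept allele is redundant.
    redundant = sum(n - 1 for n in kept.values())
    return missing, redundant


# Eight input alleles: +2 to +5 copies, each copy number seen twice.
locus_input = [2, 2, 3, 3, 4, 4, 5, 5]

correct = assess_merge(locus_input, [2, 3, 4, 5])
incorrect = assess_merge(locus_input, [2, 2, 5])  # lost +3 and +4, kept +2 twice
print(correct, incorrect)
```

Under these assumptions the first merge yields 0 missing and 0 redundant alleles, and the second yields 2 missing and 1 redundant, matching the correct and incorrect cases in the caption.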


