false genotyping large deletion #505

Open
DHmeduni opened this issue Aug 13, 2024 · 18 comments

@DHmeduni

Hi, I have a 62 kb deletion that is very obviously heterozygous, but Sniffles is calling it as homozygous, even though the COVERAGE field makes it obvious that it is heterozygous. Does anybody have an idea what to do, or is this a bug?

chr7 148238576 Sniffles2.DEL.4A5S6 N <DEL> 60 PASS PRECISE;SVTYPE=DEL;SVLEN=-62659;END=148301235;SUPPORT=83;COVERAGE=154,75,57,79,157;STRAND=+-;AF=1.000;STDEV_LEN=0.000;STDEV_POS=0.000 GT:GQ:DR:DV 1/1:60:0:83
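
The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of Sniffles: it assumes the five COVERAGE values are the depths upstream, at the start, at the center, at the end, and downstream of the SV, and it flags a deletion genotyped 1/1 whose center depth is still a sizeable fraction of the flanking depth. The function names and the `min_ratio` threshold of 0.25 are hypothetical, chosen only for illustration.

```python
from statistics import mean


def parse_single_sample_record(line):
    """Return (info, sample) dictionaries for a single-sample VCF record."""
    fields = line.split()
    # Assumption: the last three whitespace-separated fields are
    # INFO, FORMAT and the single sample column.
    info_raw, fmt_raw, sample_raw = fields[-3], fields[-2], fields[-1]
    info = {}
    for item in info_raw.split(";"):
        key, _, value = item.partition("=")
        info[key] = value  # flags such as PRECISE map to an empty string
    sample = dict(zip(fmt_raw.split(":"), sample_raw.split(":")))
    return info, sample


def center_to_flank_ratio(info):
    """Depth at the SV center divided by the mean depth of the two flanks."""
    # Assumption: COVERAGE = upstream, start, center, end, downstream.
    upstream, _start, center, _end, downstream = (
        float(x) for x in info["COVERAGE"].split(",")
    )
    flank = mean([upstream, downstream])
    return center / flank if flank > 0 else None


def looks_suspicious(line, min_ratio=0.25):
    """True for a DEL called 1/1 that still has substantial depth inside it."""
    info, sample = parse_single_sample_record(line)
    ratio = center_to_flank_ratio(info)
    return (
        info.get("SVTYPE") == "DEL"
        and sample.get("GT") == "1/1"
        and ratio is not None
        and ratio >= min_ratio  # hypothetical threshold
    )


# The record quoted above, with ALT written as <DEL> to match the other
# Sniffles2 deletion records in this thread.
record = (
    "chr7 148238576 Sniffles2.DEL.4A5S6 N <DEL> 60 PASS "
    "PRECISE;SVTYPE=DEL;SVLEN=-62659;END=148301235;SUPPORT=83;"
    "COVERAGE=154,75,57,79,157;STRAND=+-;AF=1.000;STDEV_LEN=0.000;"
    "STDEV_POS=0.000 GT:GQ:DR:DV 1/1:60:0:83"
)

print(looks_suspicious(record))
```

A truly homozygous deletion should leave close to zero depth at its center, so a record that passes this check is worth inspecting in IGV.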

@mletexier-cnrgh

Hi @DHmeduni & @fritzsedlazeck,
I see the same phenomenon, repeated across several events. For example, a 39 kb deletion is genotyped 1/1 by Sniffles (VAF=100%, (0, 25)), whereas visually in IGV it is 0/1. The reads in IGV are sorted by haplotype.
I've attached the IGV screenshot.
The parameters used are:
--minsupport 5
--tandem-repeats hs38me.trf.bed
--minsvlen 50
--min-alignment-length 2000
--mapq 20

[Image: FalseGenotype_DEL_IGV (IGV screenshot)]

Thanks for any help you can give
Mélanie

@fritzsedlazeck
Owner

Thanks, we can now reproduce this here as well. @lfpaulin is working on it. We hope to have a fix in the next version.

We found the issue: the read counting is handled incorrectly for large split-read events. The bug affects deletions; the counting works for all other SV types. In short, we are undercounting the reference-supporting reads by a wide margin, which leads to a 1/1 GT.

Thanks
Fritz
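
To see why undercounting reference-supporting reads pushes the call to 1/1, here is a deliberately simplified sketch. It is not the Sniffles genotyper; it assumes a toy model in which the genotype is derived from the variant allele fraction DV / (DR + DV) using two hypothetical cut-offs (`het_min` and `hom_min`). The DR/DV pairs are the ones reported later in this thread for the deletion at 9:91857666.

```python
def variant_allele_fraction(dr, dv):
    """Fraction of reads supporting the variant: DV / (DR + DV)."""
    total = dr + dv
    return dv / total if total else 0.0


def toy_genotype(dr, dv, het_min=0.2, hom_min=0.8):
    """Toy threshold genotyper; the cut-offs are hypothetical."""
    vaf = variant_allele_fraction(dr, dv)
    if vaf >= hom_min:
        return "1/1"
    if vaf >= het_min:
        return "0/1"
    return "0/0"


# Reference reads undercounted (DR=0): every counted read supports the deletion.
print(toy_genotype(dr=0, dv=15))   # 1/1
# Reference reads counted (DR=15): half of the reads support the deletion.
print(toy_genotype(dr=15, dv=15))  # 0/1
```

The variant-supporting count is the same in both cases; only the missing reference reads change the call.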

@mletexier-cnrgh

Thank you very much for your prompt feedback, Fritz.
Looking forward to the next version :)

Mélanie

@tuannguyen8390

Hi @fritzsedlazeck! Do you have any estimate for when the next release might be coming? We're really looking forward to seeing this fix, as this is having an impact on our recessive lethal tracking 🍄

@fritzsedlazeck
Owner

We have a working prototype that is going through testing right now. It looks like we have resolved it. We hope to roll out the new version next week.
Thanks
Fritz

@tuannguyen8390

Thanks Fritz, really looking forward to it :)

Tuan

@tuannguyen8390

Sorry for the back-to-back message, Fritz, but does the bug cause errors in existing SNF files, or can we just run the SNF merge with the new Sniffles version?

@fritzsedlazeck
Owner

So this is still something we are investigating; @lfpaulin can maybe expand a bit more. The SNF file holds all variants, but they already have filters set. These filters can be overwritten with new behavior during the merge, which also applies the same filters.

@lfpaulin
Collaborator

The SNF file is "primed" with a filter; however, during the merge it can be overridden if there is support in other samples. Regarding the genotype, we are still testing that feature. Theoretically, if the coverage support values are correct in the SNF, then they should provide the correct GT; otherwise you may need to re-run them. We have one test case that I will check tomorrow, and I will let you know our findings.

@tuannguyen8390

Thanks for the input, @fritzsedlazeck & @lfpaulin. The files that we received from partners are from v2.2, so this is a concern on our end. We have a few known recessive SVs that are not supposed to be 1/1, so it's a relatively straightforward test on our end.

@lfpaulin lfpaulin added this to the 2.5 milestone Nov 6, 2024
@tuannguyen8390

We tested v2.5 today. It seems we need to regenerate the old SNF files:

  1. Call from BAM
zgrep 91857666 4780_2.5.2.vcf.gz
9       91857666        Sniffles2.DEL.E0CS8     N       <DEL>   60      PASS    PRECISE;SVTYPE=DEL;SVLEN=-138347;END=91996012;SUPPORT=15;COVERAGE=34,20,11,13,27;STRAND=+-;STDEV_LEN=0.000;STDEV_POS=0.000;VAF=0.500        GT:GQ:DR:DV     0/1:60:15:15
  2. Call from 2.5 SNF
zgrep 91857666 4780_snf_2.5.2_new_snf.vcf.gz
9       91857666        Sniffles2.DEL.2210M8    N       <DEL>   60      PASS    PRECISE;SVTYPE=DEL;SVLEN=-138347;END=91996012;SUPPORT=15;COVERAGE=34,20,11,13,27;STRAND=+-;STDEV_LEN=0.000;STDEV_POS=0.000;VAF=0.500        GT:GQ:DR:DV:ID  0/1:60:15:15:Sniffles2.DEL.E0CS8
  3. Call from 2.2 SNF
zgrep 91857666 4780_snf_2.5.2.vcf.gz
9       91857666        Sniffles2.DEL.2210M8    N       <DEL>   60      PASS    PRECISE;SVTYPE=DEL;SVLEN=-138347;END=91996012;SUPPORT=15;COVERAGE=34,20,11,13,27;STRAND=+-;AF=1.000;STDEV_LEN=0.000;STDEV_POS=0.000 GT:GQ:DR:DV:ID  1/1:41:0:15:Sniffles2.DEL.E0CS8
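
The comparison above was done by eye on one site. A rough sketch of how it could be automated across whole VCFs follows. It is an illustration under stated assumptions, not a Sniffles feature: each VCF is assumed to be single-sample (the last column is the sample), sites are matched on chromosome, position, SVTYPE and END, and the function names are hypothetical.

```python
def genotype_by_site(vcf_lines):
    """Map (chrom, pos, svtype, end) to GT for single-sample VCF records."""
    calls = {}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split()
        # Assumption: the last three fields are INFO, FORMAT and the sample.
        info = dict(item.partition("=")[::2] for item in fields[-3].split(";"))
        sample = dict(zip(fields[-2].split(":"), fields[-1].split(":")))
        key = (fields[0], int(fields[1]), info.get("SVTYPE"), info.get("END"))
        calls[key] = sample["GT"]
    return calls


def discordant_sites(reference_calls, other_calls):
    """Sites present in both call sets whose genotypes differ."""
    shared = reference_calls.keys() & other_calls.keys()
    return {
        site: (reference_calls[site], other_calls[site])
        for site in shared
        if reference_calls[site] != other_calls[site]
    }


# Records 1 (call from BAM) and 3 (call from 2.2 SNF) from the comment above.
from_bam = [
    "9 91857666 Sniffles2.DEL.E0CS8 N <DEL> 60 PASS "
    "PRECISE;SVTYPE=DEL;SVLEN=-138347;END=91996012;SUPPORT=15;"
    "COVERAGE=34,20,11,13,27;STRAND=+-;STDEV_LEN=0.000;STDEV_POS=0.000;"
    "VAF=0.500 GT:GQ:DR:DV 0/1:60:15:15"
]
from_old_snf = [
    "9 91857666 Sniffles2.DEL.2210M8 N <DEL> 60 PASS "
    "PRECISE;SVTYPE=DEL;SVLEN=-138347;END=91996012;SUPPORT=15;"
    "COVERAGE=34,20,11,13,27;STRAND=+-;AF=1.000;STDEV_LEN=0.000;"
    "STDEV_POS=0.000 GT:GQ:DR:DV:ID 1/1:41:0:15:Sniffles2.DEL.E0CS8"
]

diff = discordant_sites(genotype_by_site(from_bam), genotype_by_site(from_old_snf))
for site, (gt_bam, gt_snf) in diff.items():
    print(site, gt_bam, gt_snf)
```

For real files, the lines could be read with `gzip.open(path, "rt")` and passed to `genotype_by_site` directly. Matching on exact position and END is strict; calls whose breakpoints shift between runs would need a tolerance.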

@fritzsedlazeck
Owner

Sorry about that. I can cross-check with the team whether there is a mode to rescue the call from the SNF file. The SV should be in there but already has a failed tag assigned. @lfpaulin knows this better than I do by now.
Thanks
Fritz

@tuannguyen8390

Thanks @fritzsedlazeck, it would be very nice if we could have that option. Asking 17 partners to remake their SNF files is a bit of a nightmare for me 😆.

@fritzsedlazeck
Owner

Yeah... I also have a similar situation. I am meeting with @hermannromanek next Monday and will discuss it. Stay tuned, please.
Fritz

@hermannromanek
Collaborator

hermannromanek commented Dec 10, 2024

Hey @tuannguyen8390

We've been working hard to identify the issue but so far have been unable to reproduce it. Maybe there was also a misunderstanding here.

Just to confirm what I understood:

  • Cases 2 and 3 in false genotyping large deletion #505 (comment) were generated by using the SNF file as input (which would cause Sniffles to do a "merge" on a single-sample "population")? You're getting wrong genotypes there.
  • Are there known SVs missing when calling from an old SNF file vs. a new one (this is the one we couldn't reproduce)?
  • Anything else I didn't understand?

Thanks,
Hermann

Edit: is it possible for you to share the old 2.2 snf?

@tuannguyen8390

tuannguyen8390 commented Dec 10, 2024

Hi @hermannromanek ,

Just to reconfirm stuff,

  • Cases 2 & 3: yes, we were getting the wrong genotype when using v2.5 to call SVs from an SNF generated with v2.2. I understand that this is a single-sample call; however, in our testing we see the same behavior with multiple samples.
  • Yes, 9:91857666 is a known SV in bovine that we know should not be 1/1. As per my previous comment, v2.5 calls it correctly as 0/1 from the v2.5 SNF, while it calls it 1/1 from the v2.2 SNF.

I can share the 2.2 SNF with you. Uploading now. Can I drop you an email for confidentiality?

Cheers,

Tuan

@hermannromanek
Collaborator

Hi Tuan,

Thank you for confirming. I've added some code allowing Sniffles to re-genotype variants from old SNF files, and we'll check it with human samples, but I also want to verify that it fixes your issue, or do some more analysis if it doesn't. You can reach me at sniffles@romanek.at

Thanks for your help,
Hermann

@tuannguyen8390

tuannguyen8390 commented Dec 10, 2024

Hi @hermannromanek ,

Email sent. Please let me know if there is any problem accessing the files.

EDIT: To be ultra clear LOL :D

Are there known SVs missing when calling from an old SNF file vs. a new one (this is the one we couldn't reproduce)?

For this one, I think the answer is no: the SV is still there, but the genotyper just typed it wrong (1/1 instead of 0/1).

Tuan

hermannromanek pushed a commit that referenced this issue Dec 16, 2024