Indel calling #154
Comments
If you want to find indels up to 50, then you should make -d a little bigger than 50 in case there are other differences in the read, like SNPs away from the indel. The max value for -d is 62 or 63 (depending on other stuff), so you have some slack here. Using a -d this big will slow SNAP down quite a bit if you happen to have a lot of reads that don't align at all (or align with high edit distance) but that have enough similarity to the reference to have many seed hits. You may or may not care about this, and you can experiment with your data to see what happens.
What -i does is look for potential indels in the seeding phase. That is, if it sees two seed hits that are close to one another but offset (which might indicate an indel in the read between the seeds), then it increases the max edit distance only for that alignment candidate. It will have a much smaller performance impact than -d, but it will miss an indel that doesn't have seed hits on both sides of it. That can happen because the indel is close to the end of the read, or because the region between the indel and one end of the read has enough differences from the reference that it contains no exact match corresponding to a seed SNAP looked at. In truth, if indels are close to the end of the read they're likely to be soft clipped anyway unless you turn off soft clipping.
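To make the seed-offset idea above concrete, here is a minimal Python sketch. It is not SNAP's actual code: the function names, the `max_gap` window, and the rule that a flagged candidate gets its limit raised to the -i value are all assumptions made for illustration. Each seed hit is a pair (offset in the read, position in the reference); two nearby hits whose reference-minus-read difference changes by k suggest an indel of about k bases between them.

```python
from typing import List, Tuple

SeedHit = Tuple[int, int]  # (offset in read, position in reference)


def indel_hint(seed_hits: List[SeedHit], max_indel: int, max_gap: int = 100) -> int:
    """Return the largest diagonal shift (<= max_indel) seen between
    neighbouring seed hits, or 0 if the hits are all consistent.

    max_gap is a hypothetical window: hits further apart than this in
    the read are not treated as "close to one another".
    """
    hits = sorted(seed_hits)
    best = 0
    for (read1, ref1), (read2, ref2) in zip(hits, hits[1:]):
        if read2 - read1 > max_gap:
            continue
        # With no indel between the seeds, ref - read is the same for both.
        shift = abs((ref2 - read2) - (ref1 - read1))
        if 0 < shift <= max_indel:
            best = max(best, shift)
    return best


def candidate_edit_limit(seed_hits: List[SeedHit], d: int, i: int) -> int:
    """Edit distance limit for one alignment candidate.

    Assumption: a candidate that looks like it contains an indel is
    allowed up to i edits; every other candidate keeps the global limit d.
    """
    return max(d, i) if indel_hint(seed_hits, i) else d


# Two seeds on the same diagonal: no indel suspected, limit stays at d.
print(candidate_edit_limit([(0, 1000), (40, 1040)], d=15, i=60))
# Second seed is shifted by 25 bases: limit is raised for this candidate only.
print(candidate_edit_limit([(0, 1000), (60, 1085)], d=15, i=60))
```

The sketch also shows why an indel near the end of a read is missed by this route: with a seed hit on only one side there is no second diagonal to compare against.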
Thank you for your detailed answer! So, to summarize.
With these two things, will I get the best SNAP performance for indels?
I'd increase -d to 60 and see if you like the output. I'd also try -i 60, which will probably produce less noise since it will only increase the distance when it looks like there's an indel.
I think I spoke too soon about turning off soft clipping. We don't expose an option to do that, so you're stuck with it. So you're not likely to find big indels that are near the ends of reads, since they'll get clipped. That said, they're also pretty unreliable, so you probably just want to stick with ones in the middle anyway.
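The arithmetic behind picking 60 rather than 50 can be sketched as follows. This is only an illustration, under the assumption that every inserted or deleted base counts as one edit, just like a mismatch; the helper names are hypothetical and not part of SNAP.

```python
def fits_edit_budget(indel_len: int, other_edits: int, d: int) -> bool:
    """True if a read with one indel plus some other differences
    (for example SNPs) stays within the max edit distance d.

    Assumption: each indel base costs one edit.
    """
    return indel_len + other_edits <= d


def slack(indel_len: int, d: int) -> int:
    """Edits left over for other differences after paying for the indel."""
    return d - indel_len


# A 50-base indel plus 3 SNPs does not fit under -d 50 ...
print(fits_edit_budget(50, 3, d=50))
# ... but does fit under -d 60, which leaves 10 edits of slack.
print(fits_edit_budget(50, 3, d=60), slack(50, d=60))
```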
Hi!
I want to accurately detect indels in some panel samples. I care only about small indels (<= 50 bases). Do you think I should increase the "-d max edit distance" option to 50? Does this increase only affect SNAP's speed, or does it also affect accuracy?
How does "-d max edit distance" compare with "-i max edit distance to consider for potential indels"?
Are there any other options I should consider to increase sensitivity and precision around indels?
My first priority is sensitivity and accuracy, not speed.
KK