Tiny fraction of reads mapped #61
Comments
On Sun, Nov 01, 2015 at 09:21:43AM -0800, Richard Smith-Unna wrote:
Maybe, although the advice online is to ignore it. I recompiled snap and am now trying to figure out how to get transrate
Same problem as #60.
No, those messages don’t have anything to do with read mapping. SNAP tries to bind the aligner threads to cores, which somewhat improves efficiency because the hardware doesn’t have to move the cache state to follow the thread. This message means that it failed to bind a thread to a core, which usually happens when you give it -t with more threads than there are cores in the system. When that happens, the extra threads float to whatever core is idle, which might affect performance but won’t affect behavior.

Seeing lots of reads in the input which don’t make it into the output is probably because of one of two things. One is that the paired read matcher can’t match ends of reads to one another, either because RNEXT and PNEXT aren’t filled in, or because of a bug that Ravi’s working on now; it usually generates a message at the end of the alignment to this effect. The other reason is that the input reads are marked with the secondary or supplementary alignment flags (0x100 and 0x800), which ordinarily are dropped during the input phase because these aren’t real reads from the sequencer, they’re artifacts produced by a previous aligner. If you want to keep them, you can say -sa.

If it’s not either of those things, please let me know and I’ll try to figure out what’s going on.

--Bill

From: Richard Smith-Unna [mailto:notifications@github.com]

via @ctb

User input 114396588 pairs, of which only 77778 were reported in the BAM file.

SNAP logs:

Loading index from directory... 0s. 236823466 bases, seed size 23
Aligning.
Welcome to SNAP version 1.0beta.18.
sched_setaffinity: Invalid argument
sched_setaffinity: Invalid argument
sched_setaffinity: Invalid argument
sched_setaffinity: Invalid argument

Could those log messages have something to do with it? Can provide input data if necessary.
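For anyone who wants to check the second cause described above on SAM-format input, here is a minimal sketch in plain Python (not part of SNAP). It counts how many records carry the secondary (0x100) or supplementary (0x800) flag. The function name `count_flagged_records` is made up for illustration, and the sketch assumes uncompressed, tab-separated SAM text with the FLAG value in the second column, as the SAM format defines it.

```python
SECONDARY = 0x100       # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800   # SAM FLAG bit: supplementary alignment


def count_flagged_records(sam_lines):
    """Return (total, flagged) over the alignment records in SAM text lines.

    Hypothetical helper for illustration only; it is not a SNAP function.
    Header lines (starting with '@') and blank lines are skipped.
    """
    total = 0
    flagged = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        total += 1
        if flag & (SECONDARY | SUPPLEMENTARY):
            flagged += 1
    return total, flagged
```

If `flagged` is a large share of `total`, the dropped reads are probably explained by those flags. FASTQ input has no FLAG column, so this check only applies to SAM or BAM input.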
On Sun, Nov 01, 2015 at 10:11:05AM -0800, Bill Bolosky wrote:
Yes, sorry, the original problem was few reads mapping, and that was the only
These are FASTQ format input, so I don't think it can be RNEXT/PNEXT or supplementary aln flags.
Well then that’s very strange. By “reported in the BAM file” do you mean that the reads are present at all, rather than present and mapped? That is, you’re saying that it’s completely losing reads rather than simply failing to map them, right? What does SNAP print out at the end of its run when it reports read counts (the line that starts “Total Reads Aligned MAPQ >= 10…”)?

--Bill

From: Richard Smith-Unna [mailto:notifications@github.com]

These are FASTQ format input, so I don't think it can be RNEXT/PNEXT or supplementary aln flags.
tl;dr: I can chase down the error messages if you really want, but I think the root problem is #60.

Longer version: I was using transrate, and getting very low mapping stats. The version of transrate that I was using comes with snap 1.0b18, which had the above error message. @blahah thought it might be the problem behind the low mapping rate, so he created this issue.

In the meantime, I compiled my own version of snap-aligner, identified as 1.0b20 (<= latest from github), and figured out that (variously) 1.0b20 crashed on 'snap-aligner paired' when used with the index that I'd created, and that 1.0b18 misbehaved in some way with the same index. At some point some snap-aligner command said, hey, I hate your FASTA header format, and so I shortened the headers in my reference transcriptome. transrate out-of-the-box (with snap 1.0b18) now maps a decent number of reads and all is well.

If you want error messages or verification, I am happy to provide them, but I suspect the root cause is the length of my FASTA headers, which is documented in #60. If you fix that I can re-run everything with the original data and verify that it solves all the problems!
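The header-shortening workaround described above could be done along these lines. This is only a sketch under stated assumptions: the thread does not say how the headers were shortened or what length SNAP accepts, so the function name `shorten_fasta_headers`, the choice to keep only the first whitespace-delimited token, and the `max_len` default of 50 are all hypothetical.

```python
def shorten_fasta_headers(lines, max_len=50):
    """Yield FASTA lines with each header reduced to its first token.

    Hypothetical helper: max_len is an illustrative cap, not a documented
    SNAP limit. Raises ValueError if shortening yields an empty or
    duplicate sequence name, since either would make the reference ambiguous.
    """
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0][:max_len] if tokens else ""
            if not name:
                raise ValueError("empty FASTA header")
            if name in seen:
                raise ValueError(f"shortened header is not unique: {name}")
            seen.add(name)
            yield ">" + name
        else:
            # Sequence lines pass through unchanged.
            yield line
```

The uniqueness check matters because truncation can collapse two long names into one. In that case it is safer to fail loudly than to build an index with colliding contig names.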
OK. I have a fix for that that’s almost ready to go. I’ll try to get it checked in tomorrow.
great!
I pushed a fix for very long contig names in beta.21 (and dev.91). You should try that and see if it helps.
via @ctb
User input 114396588 pairs, of which only 77778 were reported in the BAM file.
SNAP logs:
Could those log messages have something to do with it? Can provide input data if necessary.