Dear all, I am assembling a 40 Mb genome with an expected coverage of 181x. I am using 76 bp Illumina reads with an insert size of 200 bp (SD 20 bp). I tried Velvet for these assemblies; 86-99% of the reads were used, with an N50 of 80 kb (k-mer settings 21,55,2). The strange thing is that every assembly comes out at only 19 Mb. The whole genome was covered during library preparation. What could be the reason for this? Is it due to repeat elements, since some of my NODEs have more than 5000x coverage? I would appreciate your suggestions.
Thanks in advance, Rahul
I think there's a good chance that some elements are over-covered. Try reducing the amount of data you assemble (e.g. downsample from 181x to 40x) and see whether you get the same result. Also, check this: http://www.illumina.com/Documents/products/technotes/technote_denovo_assembly_ecoli.pdf
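If it helps, here is a rough sketch of the downsampling step in Python. It is only an illustration under a few assumptions: the reads are in plain 4-line FASTQ, the two mates are in separate files in the same order, and the function and file names (`keep_fraction`, `subsample_pairs`, `reads_1.fastq`, ...) are made up for this example. Going from 181x to 40x means keeping roughly 40/181, or about 22%, of the read pairs.

```python
import random
from itertools import islice


def keep_fraction(current_cov, target_cov):
    """Fraction of read pairs to keep to go from current_cov to target_cov."""
    return min(1.0, target_cov / current_cov)


def read_fastq(handle):
    """Yield FASTQ records as lists of 4 lines (assumes unwrapped 4-line records)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_pairs(r1_in, r2_in, r1_out, r2_out, fraction, seed=0):
    """Keep each read pair with probability `fraction`, so mates stay in sync.

    Returns (pairs kept, pairs seen). Assumes both inputs list mates in the same order.
    """
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    kept = total = 0
    for rec1, rec2 in zip(read_fastq(r1_in), read_fastq(r2_in)):
        total += 1
        if rng.random() < fraction:
            r1_out.writelines(rec1)
            r2_out.writelines(rec2)
            kept += 1
    return kept, total


# Example use (hypothetical file names):
# frac = keep_fraction(181, 40)
# with open("reads_1.fastq") as a, open("reads_2.fastq") as b, \
#      open("sub_1.fastq", "w") as oa, open("sub_2.fastq", "w") as ob:
#     print(subsample_pairs(a, b, oa, ob, frac))
```

The decision is made once per pair rather than per read, so the paired-end information is preserved. If your reads are already interleaved into one file for Velvet, the same idea applies with 8 lines per pair instead of two files.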