Dear all, I'm trying to run grenedalf, a population genetics statistics tool for analyzing pool-seq data.
I have used the following command:
> grenedalf fst \
> --sam-path sam_files/*.sam.gz \
> --window-type genome \
> --pool-sizes sam_files/pool-sizes.csv \
> --method unbiased-nei \
> --window-average-policy valid-loci \
> --no-extra-columns
I get the following error message when I run the script:
Computing FST between 1 pair of samples.
At chromosome scaffold1
At chromosome scaffold2
At chromosome scaffold3
At chromosome scaffold4
At chromosome scaffold5
At chromosome scaffold6
At chromosome scaffold7
At chromosome scaffold8
At chromosome scaffold9
Error: Invalid sorting order of input Variants. By default, we expect lexicographical sorting of chromosomes, and then sorting by position within chromosomes. Alternatively, when a sequence dictionary is specified (such as from a .dict or .fai file, or from a reference genome .fasta file), we expect the order of chromosomes as specified there. Offending input going from scaffold9:4326433 to scaffold10:1
I sorted the SAM files and tried using sorted BAM files as input; I also attempted using .sync files (output from Popoolation2), but I still see the same error. I tested the example files, and they work perfectly fine. I'm not sure what is going wrong here. Has anyone encountered a similar error?
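To illustrate what the error message is describing, here is a minimal Python sketch (not grenedalf's own code; the function name `first_order_violation` is made up for this example). It assumes "lexicographical sorting" means plain string comparison of chromosome names, under which `scaffold10` sorts before `scaffold9`:

```python
def first_order_violation(chromosomes):
    """Return the first adjacent pair that breaks plain string order, or None."""
    for prev, curr in zip(chromosomes, chromosomes[1:]):
        if curr < prev:  # string comparison: "scaffold10" < "scaffold9"
            return (prev, curr)
    return None


# Chromosome order as it appears in the log above, followed by scaffold10.
seen = [f"scaffold{i}" for i in range(1, 11)]

# Lexicographic order puts scaffold10 right after scaffold1.
print(sorted(seen)[:3])

# The first transition that is out of lexicographic order.
print(first_order_violation(seen))
```

Under that assumption, files in the usual numeric scaffold order (scaffold9 followed by scaffold10) would trip the default check even though they are properly coordinate-sorted.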
How did you sort them? Why is it a sam.gz file when it could have been a BAM file after sorting? How about using --reference-genome-dict or --reference-genome-fai?
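The idea behind supplying a .fai or .dict file is that the expected chromosome order is then taken from that file rather than from string sorting, as the error message states. A rough Python sketch of that kind of check follows; it is only an illustration, not how grenedalf implements it. The helper names and the .fai values are hypothetical; only the .fai layout (tab-separated, sequence name in the first column) is standard.

```python
def order_from_fai(fai_text):
    """Map each sequence name to its rank, using the first column of each .fai line."""
    names = [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]
    return {name: rank for rank, name in enumerate(names)}


def follows_order(chromosomes, rank):
    """True if the chromosomes appear in non-decreasing rank order."""
    ranks = [rank[c] for c in chromosomes]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


# Hypothetical .fai content: name, length, offset, bases per line, bytes per line.
# The numbers are invented for the example.
example_fai = (
    "scaffold9\t5000000\t11\t60\t61\n"
    "scaffold10\t3000000\t5083356\t60\t61\n"
)

rank = order_from_fai(example_fai)

# With the order taken from the index, scaffold9 -> scaffold10 is acceptable.
print(follows_order(["scaffold9", "scaffold10"], rank))
```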
I used samtools sort to sort the BAM files. I also sorted the SAM files, but I deleted the scripts, so I need to dig through my files to check how.
For the example run in grenedalf, they used sam.gz files as input, so I thought I would try that as well, but it gave me the same error. They do recommend using BAM files as input, but that did not change my output.
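One way to confirm how a file was sorted is to look at its header (for example, the text printed by `samtools view -H`). The `@HD` line carries an `SO:` tag, and the `@SQ` lines list the reference sequences in the order that a coordinate sort follows. Below is a small Python sketch that pulls both out of header text; the function names and the example header are made up for illustration.

```python
def sam_sort_order(header_text):
    """Return the SO: value from the @HD line, or None if it is absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


def sq_names(header_text):
    """Return reference sequence names in the order of the @SQ lines."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


# Hypothetical header; the lengths are invented for the example.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:scaffold9\tLN:5000000\n"
    "@SQ\tSN:scaffold10\tLN:3000000\n"
)

print(sam_sort_order(example_header))
print(sq_names(example_header))
```

If the header reports `SO:coordinate` and the `@SQ` order is scaffold9 before scaffold10, the file is sorted by coordinate but not in lexicographic chromosome order, which would match the error shown above.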
I did create an index file using
as well as a sequence dictionary using Picard.
Finally, I thought it could be some issue with my reference genome, so I aligned the reads against a different reference genome. This time when I use the alignment to run grenedalf, I get an empty output file.
Be more precise. Does the grenedalf software require a name-based sort or a coordinate-based sort?
Please do not delete posts that have received feedback. If the feedback helped solve the problem, vote/respond accordingly. If you solved your problem by yourself, add an answer outlining your steps so others in your position benefit from your effort.