duplicate and close variants of the same alignment in the output #37
When I use these parameters:
./miniprot -G 100 -O 10 -J 34 -F 30 --gff -ut32 nucleotide.fasta proteins.fasta
I get very close variants of the same alignment:
gpipedev21:issue-34$ grep WP_004242317 miniprot.gff | grep PAF
##PAF gi|490362554|ref|WP_004242317.1| 343 149 343 + gi|545778205|gb|U00096.3| 4641652 3221864 3222446 402 582 0 AS:i:680 ms:i:680 np:i:159 da:i:-1 do:i:0 cg:Z:194M cs:Z::2*accC*gacS*aatA*atcV:2*atcV:2*cacS*gaaD*cccR*ggcQ:1*ggtD:9*cgcY:1*agtA*aaaQ*gaaS*atcV*atcT:2*tatF:1*aacA:2*gttY*aatD:7*gaaQ:1*gagS:1*ggcA*aagA:8*gcgT:3*cgaS:1*aaaR*caaG:3*gaaG:3*tggY:2*ggtD:3*tcgA:3*gaaA:7*cggG:1*gacS:19*attL:2*cgaQ*ggcH*ctgI*aacA:2*cagE:2*tcgA:10*cgaK:2*tttI:1*ccgS:9*atgV:8*gtgL*tatF:1*aaaR*gccL:2*ggtE:1*gcgQ*ctgE:2*ttaQ*gtcI:1*gttA*cccA:1*aaaR:1*aaaI:5*cgtK
##PAF gi|490362554|ref|WP_004242317.1| 343 154 343 + gi|545778205|gb|U00096.3| 4641652 3221879 3222446 396 567 0 AS:i:675 ms:i:675 np:i:157 da:i:-1 do:i:0 cg:Z:189M cs:Z:*atcV:2*atcV:2*cacS*gaaD*cccR*ggcQ:1*ggtD:9*cgcY:1*agtA*aaaQ*gaaS*atcV*atcT:2*tatF:1*aacA:2*gttY*aatD:7*gaaQ:1*gagS:1*ggcA*aagA:8*gcgT:3*cgaS:1*aaaR*caaG:3*gaaG:3*tggY:2*ggtD:3*tcgA:3*gaaA:7*cggG:1*gacS:19*attL:2*cgaQ*ggcH*ctgI*aacA:2*cagE:2*tcgA:10*cgaK:2*tttI:1*ccgS:9*atgV:8*gtgL*tatF:1*aaaR*gccL:2*ggtE:1*gcgQ*ctgE:2*ttaQ*gtcI:1*gttA*cccA:1*aaaR:1*aaaI:5*cgtK
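To list such near-identical hits across the whole file, something like the following could be used. It is only a sketch: `near_duplicate_paf` and the `min_frac` threshold are hypothetical names made up for illustration, not part of miniprot, and it assumes the `##PAF` lines are tab-separated with the standard PAF column order (query name, length, start, end, strand, target name, length, start, end).

```python
from collections import defaultdict
from itertools import combinations


def near_duplicate_paf(lines, min_frac=0.9):
    """Report pairs of ##PAF hits for the same protein/contig/strand whose
    target intervals overlap by at least min_frac of the shorter hit.

    Hypothetical helper; assumes tab-separated ##PAF lines.
    """
    by_key = defaultdict(list)
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        # cols[0] is the "##PAF" tag; the PAF fields start at cols[1].
        if len(cols) < 10 or cols[0] != "##PAF":
            continue
        key = (cols[1], cols[6], cols[5])  # (protein, contig, strand)
        by_key[key].append((int(cols[8]), int(cols[9])))  # target start/end

    hits = []
    for key, intervals in by_key.items():
        for (s1, e1), (s2, e2) in combinations(intervals, 2):
            overlap = min(e1, e2) - max(s1, s2)
            shorter = min(e1 - s1, e2 - s2)
            if shorter > 0 and overlap / shorter >= min_frac:
                hits.append((key, (s1, e1), (s2, e2)))
    return hits
```

Run on `miniprot.gff` with `near_duplicate_paf(open("miniprot.gff"))`, the two WP_004242317 hits above should come out as one pair, since the second target interval lies entirely inside the first.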
This may also show up as duplicated alignment records in the GFF output. For example:
gi|545778205|gb|U00096.3| miniprot CDS 729583 733323 6547 + 0 Parent=MP001848;Rank=18;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3| miniprot mRNA 729583 733323 6547 + . ID=MP001849;Rank=19;Identity=0.9719;Positive=0.9783;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3| miniprot CDS 729583 733323 6547 + 0 Parent=MP001849;Rank=19;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3| miniprot mRNA 729583 733323 6547 + . ID=MP001850;Rank=20;Identity=0.9719;Positive=0.9783;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3| miniprot CDS 729583 733323 6547 + 0 Parent=MP001850;Rank=20;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3| miniprot mRNA 729583 733323 6547 + . ID=MP001851;Rank=21;Identity=0.9719;Positive=0.9783;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3| miniprot CDS 729583 733323 6547 + 0 Parent=MP001851;Rank=21;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247
These records have the same coordinates, score, identity, and target. Only the Rank=x value (and the matching ID/Parent) differs between them.
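A rough way to find every such group in the file is to compare mRNA records after dropping the `ID` and `Rank` attributes. This is a sketch under assumptions: `find_duplicate_mrna` is a hypothetical helper name, and the input is assumed to be tab-separated GFF3 as written by `--gff`.

```python
from collections import defaultdict


def find_duplicate_mrna(gff_lines):
    """Group mRNA records that are identical apart from ID and Rank.

    Hypothetical helper; assumes nine tab-separated GFF3 columns.
    Returns {record key: [(ID, Rank), ...]} for groups with more than one member.
    """
    groups = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip ##PAF and other comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        rec_id = attrs.pop("ID", None)
        rank = attrs.pop("Rank", None)
        # Everything except ID and Rank must match for two records to be grouped.
        key = (tuple(cols[:8]), tuple(sorted(attrs.items())))
        groups[key].append((rec_id, rank))
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

For the excerpt above, this would group MP001849, MP001850, and MP001851 (Rank 19 to 21) under a single key.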