duplicate and close variants of the same alignment in the output #37

Open
@azat-badretdin

Description

When I use these parameters:

 ./miniprot -G 100 -O 10 -J 34 -F 30 --gff -ut32 nucleotide.fasta proteins.fasta

I get very close variants of the same alignment:

gpipedev21:issue-34$ grep WP_004242317 miniprot.gff  | grep PAF
##PAF   gi|490362554|ref|WP_004242317.1|        343     149     343     +       gi|545778205|gb|U00096.3|       4641652 3221864 3222446 402     582     0       AS:i:680        ms:i:680      np:i:159 da:i:-1 do:i:0  cg:Z:194M       cs:Z::2*accC*gacS*aatA*atcV:2*atcV:2*cacS*gaaD*cccR*ggcQ:1*ggtD:9*cgcY:1*agtA*aaaQ*gaaS*atcV*atcT:2*tatF:1*aacA:2*gttY*aatD:7*gaaQ:1*gagS:1*ggcA*aagA:8*gcgT:3*cgaS:1*aaaR*caaG:3*gaaG:3*tggY:2*ggtD:3*tcgA:3*gaaA:7*cggG:1*gacS:19*attL:2*cgaQ*ggcH*ctgI*aacA:2*cagE:2*tcgA:10*cgaK:2*tttI:1*ccgS:9*atgV:8*gtgL*tatF:1*aaaR*gccL:2*ggtE:1*gcgQ*ctgE:2*ttaQ*gtcI:1*gttA*cccA:1*aaaR:1*aaaI:5*cgtK
##PAF   gi|490362554|ref|WP_004242317.1|        343     154     343     +       gi|545778205|gb|U00096.3|       4641652 3221879 3222446 396     567     0       AS:i:675        ms:i:675      np:i:157 da:i:-1 do:i:0  cg:Z:189M       cs:Z:*atcV:2*atcV:2*cacS*gaaD*cccR*ggcQ:1*ggtD:9*cgcY:1*agtA*aaaQ*gaaS*atcV*atcT:2*tatF:1*aacA:2*gttY*aatD:7*gaaQ:1*gagS:1*ggcA*aagA:8*gcgT:3*cgaS:1*aaaR*caaG:3*gaaG:3*tggY:2*ggtD:3*tcgA:3*gaaA:7*cggG:1*gacS:19*attL:2*cgaQ*ggcH*ctgI*aacA:2*cagE:2*tcgA:10*cgaK:2*tttI:1*ccgS:9*atgV:8*gtgL*tatF:1*aaaR*gccL:2*ggtE:1*gcgQ*ctgE:2*ttaQ*gtcI:1*gttA*cccA:1*aaaR:1*aaaI:5*cgtK
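To make the relationship between the two records explicit, here is a minimal Python sketch that parses the `##PAF` comment lines and reports pairs where one alignment lies entirely inside another on the same query, target and strand. The function names `parse_paf_comment` and `contained_pairs` are hypothetical helpers written for this report, not part of miniprot, and the column positions assume the standard PAF layout shifted by one for the `##PAF` tag.

```python
from collections import defaultdict


def parse_paf_comment(line):
    """Parse one '##PAF' comment line (as emitted with --gff) into a dict.

    Assumes standard PAF column order after the leading '##PAF' tag.
    """
    f = line.split()
    return {
        "query": f[1],
        "qstart": int(f[3]),
        "qend": int(f[4]),
        "strand": f[5],
        "target": f[6],
        "tstart": int(f[8]),
        "tend": int(f[9]),
    }


def contained_pairs(records):
    """Return (outer, inner) pairs where inner's target interval lies
    inside outer's, for records sharing query, target and strand."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["query"], r["target"], r["strand"])].append(r)

    pairs = []
    for recs in groups.values():
        for i, a in enumerate(recs):
            for b in recs[i + 1:]:
                # Treat the longer target interval as the candidate outer hit.
                len_a = a["tend"] - a["tstart"]
                len_b = b["tend"] - b["tstart"]
                outer, inner = (a, b) if len_a >= len_b else (b, a)
                if outer["tstart"] <= inner["tstart"] and inner["tend"] <= outer["tend"]:
                    pairs.append((outer, inner))
    return pairs
```

Applied to the two lines above, this should report the second record (target 3221879 to 3222446) as contained in the first (target 3221864 to 3222446), since both end at the same target coordinate.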

Possibly related: some alignments are duplicated outright in the GFF output. For example:

gi|545778205|gb|U00096.3|       miniprot        CDS     729583  733323  6547    +       0       Parent=MP001848;Rank=18;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        mRNA    729583  733323  6547    +       .       ID=MP001849;Rank=19;Identity=0.9719;Positive=0.9783;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        CDS     729583  733323  6547    +       0       Parent=MP001849;Rank=19;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        mRNA    729583  733323  6547    +       .       ID=MP001850;Rank=20;Identity=0.9719;Positive=0.9783;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        CDS     729583  733323  6547    +       0       Parent=MP001850;Rank=20;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        mRNA    729583  733323  6547    +       .       ID=MP001851;Rank=21;Identity=0.9719;Positive=0.9783;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        CDS     729583  733323  6547    +       0       Parent=MP001851;Rank=21;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247

The coordinates, score, identity and target are the same in every record; only the ID/Parent and Rank=x values differ.
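To count how often this happens across the whole file, the mRNA records can be grouped by locus and target. The following is a rough Python sketch, assuming tab-separated GFF3 columns and the `ID` and `Target` attributes shown above; `duplicate_mrna_records` is a hypothetical helper name, not something miniprot provides.

```python
from collections import defaultdict


def duplicate_mrna_records(gff_lines):
    """Group mRNA records by (seqid, start, end, strand, Target) and
    return only the groups that contain more than one record ID."""
    groups = defaultdict(list)
    for line in gff_lines:
        # Skip comments (including '##PAF' lines) and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "mRNA":
            continue
        # Column 9 holds 'key=value' attributes separated by ';'.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        key = (cols[0], cols[3], cols[4], cols[6], attrs.get("Target"))
        groups[key].append(attrs.get("ID"))
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Running it as `duplicate_mrna_records(open("miniprot.gff"))` should list each locus/target combination together with all the record IDs that share it, such as MP001849, MP001850 and MP001851 in the excerpt above.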

Labels: enhancement, question