Hello, I annotated my VCF with snpEff using a command like:
java -Xmx6g -jar snpEff.jar -c configfile GRCH38.86 > result.vcf
I checked the ANN information in the INFO field of result.vcf and found that the gene ND5 has two names (ND5 or MT-ND5), like:
C|upstream_gene_variant|MODIFIER|ND5|ND5|transcript|TRANSCRIPT_ND5|protein_coding||c.-4476T>C|||||4476|WARNING_TRANSCRIPT_MULTIPLE_STOP_COD
or
A|upstream_gene_variant|MODIFIER|MT-ND5|ENSG00000198786|transcript|ENST00000361567.2|protein_coding||c.-4484G>A||
Why is that? Do I need to keep only one gene name?
MT probably stands for mitochondrial. The annotation is probably about the effects on the normal gene and on the mitochondrial gene.
They are the same gene - MT-ND5 is the official symbol, ND5 is an old alias.
Many sources are slow to update gene symbols, which change relatively frequently. Go by the gene or transcript ID whenever possible, and use those IDs to look up gene symbols when creating final tables, etc.
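If it helps, here is a rough Python sketch of that idea: split each ANN entry on `|`, read the gene name and gene ID from their standard positions, and map both to one current symbol. The two lookup tables are hypothetical placeholders that I filled in by hand for this one gene; in practice you would build them from an HGNC or Ensembl export. The function names are made up for illustration too.

```python
# Names of the first eight sub-fields of a snpEff ANN entry, in order.
ANN_FIELDS = ("allele", "annotation", "impact", "gene_name", "gene_id",
              "feature_type", "feature_id", "biotype")

# Hypothetical lookup tables (assumption: you would load these from HGNC/Ensembl).
SYMBOL_BY_GENE_ID = {"ENSG00000198786": "MT-ND5"}
SYMBOL_BY_ALIAS = {"ND5": "MT-ND5"}


def parse_ann(info):
    """Return one dict per ANN entry found in a VCF INFO string."""
    for item in info.split(";"):
        if item.startswith("ANN="):
            entries = item[len("ANN="):].split(",")
            # zip stops at the shorter sequence, so extra sub-fields are ignored.
            return [dict(zip(ANN_FIELDS, entry.split("|"))) for entry in entries]
    return []


def current_symbol(entry):
    """Prefer the gene ID lookup, then the alias table, then the name as given."""
    gene_id = entry.get("gene_id", "")
    name = entry.get("gene_name", "")
    return SYMBOL_BY_GENE_ID.get(gene_id) or SYMBOL_BY_ALIAS.get(name, name)


# Illustrative INFO string combining the two entries from the question.
info = ("DP=10;ANN=C|upstream_gene_variant|MODIFIER|ND5|ND5|transcript|"
        "TRANSCRIPT_ND5|protein_coding||c.-4476T>C,"
        "A|upstream_gene_variant|MODIFIER|MT-ND5|ENSG00000198786|transcript|"
        "ENST00000361567.2|protein_coding||c.-4484G>A")
symbols = [current_symbol(e) for e in parse_ann(info)]
```

The ID lookup comes first because IDs are more stable than symbols; the alias table is only a fallback for entries like the first one, where the gene ID column itself holds the old name.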