Generally, how are non-canonical chromosomes treated in human variation studies?
4 weeks ago
eebloom ▴ 90

I can't seem to find a consistent answer to this question. When calling variants in human data aligned to GRCh38, some variants are called on contigs other than the canonical chromosomes, such as:

chr17_GL000205v2_random

chr1_KI270706v1_random

chrUn_KN707874v1_decoy

chrEBV

chrM

I understand these to be various alternative contigs, unlocalised sequences, unknown chromosomes, the mitochondrial chromosome, etc. (see descriptions here).

If your biological question concerns human variation in the nuclear (autosomal and sex) chromosomes, it seems reasonable to remove variants mapping to chrEBV, chrM and unknown chromosomes.
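To make that filtering step concrete, here is a minimal sketch in plain Python. It assumes UCSC-style GRCh38 contig names (chr1 to chr22, chrX, chrY) and uncompressed VCF text; the names `CANONICAL`, `is_canonical` and `keep_canonical` are made up for this illustration and are not from any particular tool.

```python
import re

# Assumption: UCSC-style naming, where the primary nuclear chromosomes are
# chr1..chr22, chrX and chrY. Everything else (chrM, chrEBV, *_random,
# chrUn_*, *_decoy, *_alt) counts as non-canonical here.
CANONICAL = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y)$")


def is_canonical(contig):
    """Return True if the contig is a primary nuclear chromosome."""
    return CANONICAL.match(contig) is not None


def keep_canonical(vcf_lines):
    """Yield header lines plus records whose CHROM column is canonical."""
    for line in vcf_lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            yield line  # keep all header lines unchanged
        elif is_canonical(line.split(None, 1)[0]):
            yield line
```

Note that this only looks at the CHROM column, so a breakend record on a canonical chromosome is kept even if its mate lies on an unplaced contig.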

However, what is the general consensus on unlocalised sequences on canonical chromosomes, or on reads with multiple alignments? For instance, some variants have one breakpoint on a canonical chromosome and another on an unknown chromosome, e.g.

chr16   76295587    r_246_0 A   [chrUn_KI270518v1:835[G .   SVTYPE=BND;MATEID=r_246_1   TR:VR   213:5   175:0
chrUn_KI270518v1    835 r_246_1 G   [chr16:76295587[G   .   SVTYPE=BND;MATEID=r_246_0   TR:VR   213:5   175:0
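Records like these can at least be flagged automatically. The sketch below pulls the mate contig out of a breakend ALT string and reports whether either end falls outside the primary chromosomes. It assumes the bracketed breakend notation shown above and UCSC-style contig names; `bnd_mate` and `bnd_touches_noncanonical` are hypothetical helper names, and whether to drop or keep flagged records remains a judgment call.

```python
import re

# Breakend ALT strings embed the mate position between square brackets,
# e.g. "[chrUn_KI270518v1:835[G". The greedy ".+" splits on the last colon,
# so contig names that themselves contain colons are still handled.
BND_ALT = re.compile(r"[\[\]](.+):(\d+)[\[\]]")

# Assumption: primary nuclear chromosomes are chr1..chr22, chrX, chrY.
PRIMARY = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y)$")


def bnd_mate(alt):
    """Return (contig, position) of the mate breakend, or None if not a BND ALT."""
    m = BND_ALT.search(alt)
    if m is None:
        return None
    return m.group(1), int(m.group(2))


def bnd_touches_noncanonical(chrom, alt):
    """True if the record's own contig or its mate's contig is non-primary."""
    contigs = [chrom]
    mate = bnd_mate(alt)
    if mate is not None:
        contigs.append(mate[0])
    return any(PRIMARY.match(c) is None for c in contigs)
```

Applied to the two records above, both ends are flagged because one side of the pair sits on chrUn_KI270518v1, so the pair can be removed or kept together rather than leaving an orphaned mate.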

This does not seem to be documented in the methods sections of many structural variant publications. I have seen one example where the authors discuss "filter[ing] SVs found in the sex and unknown chromosomes".

variants SV assembly WGS • 199 views