Hello all, I am currently learning about phasing and imputation, and I have come across a few ways of representing genotypes. As I understand it, 0 refers to the reference allele while 1 refers to the alternate allele. 0/0 is homozygous for the reference allele, 1/1 is homozygous for the alternate allele, and 0/1 and 1/0 are the heterozygotes.
However, I have also come across genotypes represented by a single 0, 1 or 2, and here is where I am confused. It's a bit hard to find info on this, but it seems that 0 and 1 refer to the homozygous genotypes, while 2 refers to the heterozygous genotypes. Is this correct?
Which of 0 or 1 would be homozygous for the reference allele and which for the alternate allele? When denoting a genotype as 2, does it not matter whether it's 0/1 or 1/0? Thanks very much!
In a VCF, a genotype written as a single 0 or 1 is a haploid call. For a diploid organism, that could be a region that is effectively haploid, for example where there is a deletion on the other copy.
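For the other formats and a diploid organism, the number is usually the count of alternate alleles: 0 = HOM_REF, 1 = HET, 2 = HOM_VAR. Phase is not kept in this coding, so 0/1 and 1/0 both become 1.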
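If it helps to see the mapping spelled out, here is a minimal Python sketch that turns a VCF-style GT string into the 0/1/2 coding described above. The function name alt_allele_count is made up for illustration and does not come from any library, and I am assuming the usual convention that the number counts non-reference alleles. Some tools may count a different allele, so check the documentation of whatever produced your file.

```python
def alt_allele_count(gt):
    """Convert a VCF GT string such as '0/1' or '1|0' to a count of ALT alleles.

    Hypothetical helper for illustration only, not part of any library.
    Returns None when any allele is missing ('.').
    """
    # Treat phased ('|') and unphased ('/') separators the same way,
    # which is why 0|1 and 1|0 end up with the same code.
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return None
    # Any allele index other than '0' is a non-reference allele.
    return sum(1 for a in alleles if a != "0")


for gt in ["0/0", "0/1", "1|0", "1/1"]:
    print(gt, alt_allele_count(gt))
```

Note that a haploid call such as a lone 1 also gives a count of 1 here, so the ploidy has to be tracked separately if you need to tell it apart from a heterozygote.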