You can use make a QQ plot in the following ways.
- one-liner for reading tons of millions of P values from the pipe
# python
zcat pval.txt.gz | qqplot.py -out test -title "QQ plot on the fly"
# julia (recommand to run it in the REPL)
zcat pval.txt.gz | qqplot.jl --out test --title "QQ plot on the fly"
warning : If you have 100 billion P values to process you should definitely use qqplot.jl instead of qqplot.py. The hourly processed number of lines of julia version is 5 billion while python is only 700 million on my server.
- running in a julia REPL (recommanded)
include("qqplot.jl")
cmd = pipeline(`zcat pval.gz`, `awk 'NR>1{print $10}'`)
sigp, expp = qqfly("test", cmd=cmd)
- use qqplot.py in your script
import numpy as np
from qqplot import qq
p = np.random.random(1000000)
qq(x=p, figname="test.png")
Usage: downsample.sh [-b <bamlist>] [-d <depth>] [-n <cores>] [-o <outdir>]
Usage: beagle3-imputation.sh [options]
Pipeline of genotype refinement for median depth sequencing data using beagle3
-h, Display help
-i, Input VCF/BCF file
-o, Output folder
-f, MAF filters before imputation
When you run imputation analysis with BEAGLE
(or other imputation tools), you may want to know the distribution of genotype discordance between the original vcf and imputed vcf.
usage: calc_imputed_gt_discord.py [-h] [-chr STRING] VCF1 VCF2 OUT
warning : Before running the script, you must be sure the two vcfs have the exact same sites and samples for each chromosome.
plot INFO/R2 after imputation by BEAGLE
etc.
Before running bcftools merge
, you maybe need to fix the ref and alt and corresponding genotypes, otherwise bcftools
will surprise you.
usage: fixref.py [-h] REF_VCF IN_VCF OUT_VCF