Bioinformatics. 2016 May 15;32(10):1479-85.
doi: 10.1093/bioinformatics/btv722. Epub 2015 Dec 26.

Fast and efficient QTL mapper for thousands of molecular phenotypes

Halit Ongen et al. Bioinformatics.

Abstract

Motivation: In order to discover quantitative trait loci, multi-dimensional genomic datasets combining DNA-seq and ChIP-/RNA-seq require methods that rapidly correlate tens of thousands of molecular phenotypes with millions of genetic variants while appropriately controlling for multiple testing.

Results: We have developed FastQTL, a method that implements a popular cis-QTL mapping strategy in a user- and cluster-friendly tool. FastQTL also proposes an efficient permutation procedure to control for multiple testing. The outcome of permutations is modeled using beta distributions trained from a few permutations and from which adjusted P-values can be estimated at any level of significance with little computational cost. The Geuvadis and GTEx pilot datasets can now be easily analyzed an order of magnitude faster than with previous approaches.
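The beta approximation described above can be illustrated with a minimal Python sketch. It is not FastQTL's implementation: the function names are hypothetical, the null data are simulated as the minimum of 50 independent uniform P-values per permutation (real cis-windows contain correlated variants), and the beta parameters are fitted by the method of moments as a simplification, whereas the figure captions refer to ML estimates. The direct estimate cannot fall below 1/(R+1) for R permutations, while the fitted beta CDF can be evaluated at any observed P-value.

```python
import math
import random


def fit_beta_moments(values):
    """Method-of-moments estimates (k, n) of a Beta(k, n) distribution.

    Simplification for illustration; an ML fit would refine these values.
    """
    count = len(values)
    mean = sum(values) / count
    var = sum((v - mean) ** 2 for v in values) / (count - 1)
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction of the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for num in (even, odd):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def beta_cdf(x, a, b):
    """Regularized incomplete beta function I_x(a, b), i.e. the Beta(a, b) CDF."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def adjusted_p_values(p_obs, null_min_p):
    """Return (direct estimate, beta-approximated estimate, (k, n))."""
    k, n = fit_beta_moments(null_min_p)
    hits = sum(p <= p_obs for p in null_min_p)
    direct = (hits + 1) / (len(null_min_p) + 1)
    return direct, beta_cdf(p_obs, k, n), (k, n)


# Simulated null: best P-value among 50 independent tests, 1000 permutations.
rng = random.Random(0)
null_min_p = [min(rng.random() for _ in range(50)) for _ in range(1000)]
direct, approx, (k, n) = adjusted_p_values(1e-5, null_min_p)
print(f"k={k:.3f} n={n:.1f} direct={direct:.2e} beta={approx:.2e}")
```

In this toy setting the best null P-value follows Beta(1, 50) exactly, so the fitted k should land near 1 and n near 50, with n playing the role of an effective number of independent tests.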

Availability and implementation: Source code, binaries and comprehensive documentation of FastQTL are freely available to download at http://fastqtl.sourceforge.net/

Contact: emmanouil.dermitzakis@unige.ch or olivier.delaneau@unige.ch

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
(a, b) Density plots of the k and n parameter ML estimates made from 100, 1K, 10K and 100K permutations on GEUV_EUR. (c) A scatter plot of the number of variant sites tested per gene (cis-window ±1Mb of the TSS) against the n parameter ML estimates made again from 100, 1K, 10K and 100K permutations on GEUV_EUR. (d, e) Quantile–Quantile plots of the best P-values obtained through 1000 permutations (observed) of the GEUV_EUR dataset against simulated P-values sampled from the fitted beta distributions (expected). Expected P-values are plotted against the observed ones for all genes pooled together in (d) and for each gene separately in (e). (f) The KS test −log10 P-values comparing observations and expectations for each gene. The red line shows the expected Bonferroni significance threshold when testing 13 703 genes.
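As a small worked example of the Bonferroni threshold drawn in panel (f), the following sketch computes its position on the −log10 scale. The family-wise error rate of 0.05 is an assumption made for illustration; the caption does not state the level used.

```python
import math

n_genes = 13703   # number of genes tested, from the caption
alpha = 0.05      # assumed family-wise error rate (not stated in the caption)

# Bonferroni: each of the n_genes tests is held to alpha / n_genes.
per_test_level = alpha / n_genes
threshold_neg_log10 = -math.log10(per_test_level)
print(f"-log10 threshold: {threshold_neg_log10:.2f}")
```

Under that assumption, a gene would need a KS test −log10 P-value of roughly 5.4 or more to cross the line.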
Fig. 2.
(a, b) Scatter plots of the adjusted P-values obtained from 1000 permutations via the direct method (in gray) and the beta approximation (in light blue) against those obtained through the standard permutation scheme with 100K permutations (a) or through the adaptive method with up to 1M permutations (b). All analyses were performed on the GEUV_EUR dataset. Adjusted P-values are plotted on both linear (a) and log (b) scales. Expected variation for 1000 permutations is shown by the 95% confidence intervals in red. (c) The equivalent number of permutations required by the direct permutation scheme to reach the same calibration as the beta approximation (from 1000 permutations) as a function of the adjusted P-value targeted. The dashed and solid gray lines show the expected accuracy of the adaptive permutation scheme that stops when 5 and 10 stronger null signals are found, respectively. (d) The sensitivity–specificity ratio of reasonable FastQTL runs (beta approximation or direct method with 50–5000 permutations) to recover an optimal eQTL set derived from 100 000 permutations. (e) The sensitivity–specificity ratio to recover the nine official eQTL sets released by the GTEx consortium using both Matrix eQTL (direct method) and FastQTL (beta approximation) with 100, 500 and 1000 permutations.
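The adaptive permutation scheme mentioned in panels (b) and (c) can be sketched as follows. This is an illustrative version under stated assumptions, not FastQTL's code: the function name and the `draw_null_min_p` callback are hypothetical, and the estimator used (hits divided by permutations done on early stop, otherwise (hits + 1) / (max_perm + 1)) is one common sequential Monte Carlo convention that may differ from the one used in the paper.

```python
import random


def adaptive_permutation_p(p_obs, draw_null_min_p, stop_after=5, max_perm=1_000_000):
    """Permute until `stop_after` null signals at least as strong as p_obs are seen.

    draw_null_min_p: callable returning the best P-value of one permutation
    (hypothetical interface, used here to keep the sketch self-contained).
    Returns (adjusted P-value estimate, number of permutations performed).
    """
    hits = 0
    for done in range(1, max_perm + 1):
        if draw_null_min_p() <= p_obs:
            hits += 1
            if hits >= stop_after:
                # Early stop: weak signals are resolved after few permutations.
                return hits / done, done
    # Budget exhausted: fall back to the usual direct estimate.
    return (hits + 1) / (max_perm + 1), max_perm


# Toy null: best of 50 independent uniform P-values per permutation.
rng = random.Random(1)
p_adj, n_done = adaptive_permutation_p(
    0.01, lambda: min(rng.random() for _ in range(50)), stop_after=5, max_perm=10_000
)
print(f"adjusted P ~ {p_adj:.3f} after {n_done} permutations")
```

The design trade-off is that phenotypes with weak signals stop after a handful of permutations, while only strong candidates consume the full permutation budget.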
