PLoS Genet. 2015 Jan 24;11(1):e1004942.
doi: 10.1371/journal.pgen.1004942. eCollection 2015 Jan.

Aberrant gene expression in humans

Yong Zeng et al. PLoS Genet.

Abstract

Gene expression as an intermediate molecular phenotype has been a focus of research interest. In particular, studies of expression quantitative trait loci (eQTL) have offered promise for understanding gene regulation through the discovery of genetic variants that explain variation in gene expression levels. Existing eQTL methods are designed for assessing the effects of common variants, but not rare variants. Here, we address the problem by establishing a novel analytical framework for evaluating the effects of rare or private variants on gene expression. Our method starts from the identification of outlier individuals that show markedly different gene expression from the majority of a population, and then reveals the contributions of private SNPs to the aberrant gene expression in these outliers. Using population-scale mRNA sequencing data, we identify outlier individuals using a multivariate approach. We find that outlier individuals are more readily detected with respect to gene sets that include genes involved in cellular regulation and signal transduction, and less likely to be detected with respect to the gene sets with genes involved in metabolic pathways and other fundamental molecular functions. Analysis of polymorphic data suggests that private SNPs of outlier individuals are enriched in the enhancer and promoter regions of corresponding aberrantly-expressed genes, suggesting a specific regulatory role of private SNPs, while the commonly-occurring regulatory genetic variants (i.e., eQTL SNPs) show little evidence of involvement. Additional data suggest that non-genetic factors may also underlie aberrant gene expression. Taken together, our findings advance a novel viewpoint relevant to situations wherein common eQTLs fail to predict gene expression when heritable, rare inter-individual variation exists. The analytical framework we describe, taking into consideration the reality of differential phenotypic robustness, may be valuable for investigating complex traits and conditions.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. MD-based multivariate outlier detection.
(A) Scatter plot of the expression levels of two hypothetical genes. The three outliers, indicated with red stars, have the largest Mahalanobis distance (MD) values from the population mean. (B) The chi-square plot showing the relative position and order of the three outlier data points, compared to those of non-outlier data points.
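The two-gene case in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the classical sample mean and covariance (whether the paper uses a robust estimator is not stated here), the function names `squared_mahalanobis_2d` and `chi_square_plot` are hypothetical, and the expression values are simulated with three planted outliers.

```python
import math
import random

def squared_mahalanobis_2d(points):
    """Squared Mahalanobis distance of each (x, y) point to the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the sample covariance matrix (denominator n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    out = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^T S^{-1} d, using the closed-form inverse of a 2x2 matrix.
        out.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return out

def chi_square_plot(md2):
    """Pair ordered squared distances with chi-square (df = 2) quantiles.

    Returns (theoretical quantile, observed squared MD, sample index) tuples.
    """
    n = len(md2)
    order = sorted(range(n), key=lambda i: md2[i])
    # For df = 2 the chi-square quantile has the closed form -2 * ln(1 - p).
    return [(-2.0 * math.log(1.0 - (rank + 0.5) / n), md2[i], i)
            for rank, i in enumerate(order)]

# Simulated data (an assumption for illustration): 100 typical samples
# plus three planted outliers at indices 100, 101 and 102.
rng = random.Random(0)
expr = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
expr += [(6.0, -6.0), (7.0, 7.0), (-8.0, 5.0)]

md2 = squared_mahalanobis_2d(expr)
top3 = sorted(range(len(expr)), key=lambda i: md2[i], reverse=True)[:3]
plot_points = chi_square_plot(md2)
```

Plotting the first two elements of each tuple in `plot_points` against each other gives a chi-square plot of the kind described in panel B; points that sit far above the diagonal at the upper end are the outlier candidates.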
Figure 2. Gene expression profiles and outlier detection in the gene set, G-protein coupled receptor activity.
(A) The expression profiles of 326 EUR samples for 94 genes in the gene set. The expression profile of the outlier individual with the largest SSMD is outlined in red. (B) The chi-square plot showing three outliers, as highlighted with the star symbol. (C) The null distribution of SSMD established from 1,000 permutations of 94 randomly selected genes. The red vertical line indicates the observed value of SSMD computed for the original gene set.
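The null distribution in panel C can be sketched as follows: draw many random gene sets of the same size as the set of interest, compute the statistic for each, and compare the observed value against them. This is only an outline under stated assumptions. The helper names `null_distribution` and `empirical_p` are hypothetical, the data are simulated, and `toy_statistic` is a deliberately simple stand-in rather than the SSMD used in the paper.

```python
import random

def null_distribution(statistic, expression, set_size, n_perm=1000, seed=0):
    """Statistic values for n_perm random gene sets of the given size.

    expression maps a gene name to its list of per-sample expression values.
    """
    rng = random.Random(seed)
    genes = sorted(expression)
    return [statistic([expression[g] for g in rng.sample(genes, set_size)])
            for _ in range(n_perm)]

def empirical_p(observed, null_values):
    """One-sided empirical p-value with the usual +1 correction."""
    exceed = sum(v >= observed for v in null_values)
    return (exceed + 1) / (len(null_values) + 1)

def toy_statistic(rows):
    """Stand-in statistic (NOT the paper's SSMD): for each gene, the largest
    squared deviation from that gene's mean, summed over genes."""
    total = 0.0
    for values in rows:
        mean = sum(values) / len(values)
        total += max((v - mean) ** 2 for v in values)
    return total

# Simulated data (an assumption): 200 genes by 30 samples, with one sample
# made aberrant across a 10-gene target set.
rng = random.Random(1)
expression = {f"gene{i:03d}": [rng.gauss(0, 1) for _ in range(30)]
              for i in range(200)}
target = [f"gene{i:03d}" for i in range(10)]
for g in target:
    expression[g][0] = 10.0

observed = toy_statistic([expression[g] for g in target])
null = null_distribution(toy_statistic, expression, set_size=len(target))
p_value = empirical_p(observed, null)
```

Passing the statistic in as a function keeps the resampling logic independent of how the statistic is defined, so a real SSMD implementation could be substituted without changing the rest.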
Figure 3. Power of SSMD test and validation of significant L- and S-SSMD gene sets.
(A) The change of power as a function of sample size. (B) The change of power as a function of the size of a gene set. (C) Validation of significant L- and S-SSMD gene sets using different expression data. Original: Geuvadis LCL expression data normalized using PEER (i.e., data used for the main results); Rep1: first replication set of Geuvadis LCL expression data without PEER normalization; Rep2: second replication set of Geuvadis LCL expression data without PEER normalization; Whole blood: GTEx whole blood expression data; and Muscle: GTEx muscle expression data. The boxplot shows the frequency with which the observed SSMD is greater than the control SSMD across 1,000 random replicates.
Figure 4. Change of diffSSMD as a function of the ratio between partitioned samples and the power of diffSSMD test under varying sample size.
(A) The change of diffSSMD as a function of the size ratio of partitioned samples. The results with respect to two gene sets of size 20 and 40 are shown. For each partition ratio, the distribution of diffSSMD_rand was constructed from 100 randomly shuffled samples. (B) The change of the power of the diffSSMD test between EUR and AFR populations for the population-specific effect as a function of the size of EUR samples. The red line is fitted by cubic polynomial regression.
Figure 5. Differences in expression discordance, heritability and variability between L- and S-SSMD genes.
(A) Normalized mean discordant expression (measured as the relative mean difference, RMD) per gene. (B) Heritability of gene expression. (C) Coefficient of variation of single-cell expression.
Figure 6. Distributions of nonzero effect size β of cis-eSNPs of L-SSMD genes in outlier and non-outlier individuals.
The effect size β is genotype-weighted (i.e., β = |β| × genotype, where genotype ∈ {0, 1, 2}).
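The genotype weighting in this caption is simple arithmetic and can be illustrated with a short sketch. The function name `genotype_weighted_effect` and the example (β, genotype) pairs are hypothetical; only the formula |β| × genotype and the coding {0, 1, 2} come from the caption.

```python
def genotype_weighted_effect(beta, genotype):
    """Return |beta| * genotype for a genotype coded as 0, 1 or 2."""
    if genotype not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2")
    return abs(beta) * genotype

# Hypothetical (beta, genotype) pairs for one individual's cis-eSNPs.
esnps = [(-0.5, 2), (0.25, 0), (0.75, 1)]
weighted = [genotype_weighted_effect(b, g) for b, g in esnps]

# The figure shows nonzero values only, so genotype-0 SNPs drop out.
nonzero = [w for w in weighted if w != 0]
```

Because the absolute value is taken before weighting, the sign of β is discarded and a SNP with genotype 0 always contributes zero, which is why the caption restricts the distributions to nonzero effect sizes.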
