Review
Cell. 2012 Oct 26;151(3):476-82.
doi: 10.1016/j.cell.2012.10.012.

Revisiting global gene expression analysis

Jakob Lovén et al.

Abstract

Gene expression analysis is a widely used and powerful method for investigating the transcriptional behavior of biological systems, for classifying cell states in disease, and for many other purposes. Recent studies indicate that common assumptions currently embedded in experimental and analytical practices can lead to misinterpretation of global gene expression data. We discuss these assumptions and describe solutions that should minimize erroneous interpretation of gene expression data from multiple analysis platforms.


Figures

Figure 1
A) Schematic representation of the pattern of change in gene expression when the levels of total RNA in the two cells are similar. The square box represents a perturbation such as increased expression of a gene regulator or a change in environment or cell state. Red arrows point to target genes affected by the perturbation, which are represented as circles. Red shading of circles indicates relative transcriptional increase.
B) Schematic representation of microarray normalization when the overall levels of mRNA per cell do not change between two conditions. Relative mRNA levels for 9 different genes (A–I) are indicated along the y-axis for condition 1 (black) and condition 2 (orange). The panels, from left to right, depict the actual relationship between mRNA levels for the two conditions; the effect of median normalization; the calculated fold-changes based on median normalization, with increased expression represented by red bars above the midline and decreased expression represented by green bars below the midline; and the perceived transcriptional response to a limited transcriptional increase in gene expression.
C) Schematic representation of the pattern of change in gene expression when the levels of total RNA in the two cells differ, as in transcriptional amplification, where most genes are expressed at higher levels. The square box represents a perturbation such as increased expression of a gene regulator or a change in environment or cell state. Red arrows point to target genes affected by the perturbation, which are represented as circles. Red shading of circles indicates relative transcriptional increase.
D) Schematic representation of microarray normalization when the overall levels of mRNA per cell are increased in one condition compared to another. Relative mRNA levels for 9 different genes (A–I) are indicated along the y-axis for condition 1 (black) and condition 2 (orange). The panels, from left to right, depict the actual relationship between mRNA levels for the two conditions; the effect of median normalization; the calculated fold-changes based on median normalization, with increased expression represented by red bars above the midline and decreased expression represented by green bars below the midline; and the perceived transcriptional response following transcriptional amplification of gene expression.
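The effect described in panel D can be illustrated with a small numerical sketch. This is not code from the article: the nine expression values, the amplification factors, and the function names are hypothetical, chosen only so the arithmetic is easy to follow. It assumes median normalization means dividing each sample by its own median.

```python
from math import log2
from statistics import median


def median_normalize(levels):
    """Divide every value by the sample median (assumed form of median normalization)."""
    m = median(levels)
    return [x / m for x in levels]


def log2_fold_changes(cond1, cond2):
    """Per-gene log2(condition 2 / condition 1)."""
    return [log2(b / a) for a, b in zip(cond1, cond2)]


# Hypothetical per-cell mRNA levels for nine genes (A-I); not data from the article.
condition1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0]
# Hypothetical amplification: gene A unchanged, B-G doubled, H-I quadrupled.
factors = [1, 2, 2, 2, 2, 2, 2, 4, 4]
condition2 = [x * f for x, f in zip(condition1, factors)]

# Fold-changes computed from the actual per-cell levels.
true_fc = log2_fold_changes(condition1, condition2)

# Fold-changes computed after each sample is normalized to its own median.
apparent_fc = log2_fold_changes(
    median_normalize(condition1), median_normalize(condition2)
)

print("true    :", true_fc)      # [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0]
print("apparent:", apparent_fc)  # [-1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
```

In this toy case every fold-change is shifted down by one log2 unit: the unchanged gene appears repressed, the doubled genes appear unchanged, and only the most strongly induced genes still appear increased.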
Figure 2
A) Schematic representation of microarray normalization when the total level of mRNA per cell differs, as in transcriptional amplification, but spike-in RNAs are used as standards for normalization. mRNA levels are indicated along the y-axis for condition 1 (black) and condition 2 (orange); individual genes are represented along the x-axis. Spike-in standards (S1–S3) in the mRNA for condition 1 are represented by black triangles and spike-in standards in the mRNA for condition 2 are represented by orange triangles. The panels, from left to right, depict the actual relationship between mRNA levels for the two conditions; the effect of normalization using the spike-in standards; the resulting fold-changes between condition 1 and condition 2, where increased expression is represented by red bars above the midline; and the perceived transcriptional response following transcriptional amplification of gene expression normalized with spike-in RNAs.
B) Heatmap showing the results of different normalization methods on the interpretation of microarray data. The data represent fold-change of expression in high-Myc vs. low-Myc cells. Each line represents data for an individual probe on the microarray. Red indicates increased expression in high-Myc vs. low-Myc cells. Green indicates decreased expression in high-Myc vs. low-Myc cells. Black indicates no change in expression. The left panel displays data using a standard microarray normalization method (MAS5). The right panel shows the same data, now re-normalized using spike-in standards.
C) Heatmap showing the results of different normalization methods on the interpretation of RNA-sequencing data. The data represent fold-change of expression in high-Myc vs. low-Myc cells. Each line represents data for an individual gene. Red indicates increased expression in high-Myc vs. low-Myc cells. Green indicates decreased expression in high-Myc vs. low-Myc cells. Black indicates no change in expression. The left panel displays data using a standard sequencing normalization (reads per kilobase of exon model per million mapped reads). The right panel shows the same data, now re-normalized using spike-in standards.
D) Heatmap showing the results of different sample preparation methods on the interpretation of digital quantification data. The data represent counts of mRNA molecules in high-Myc vs. low-Myc cells. Each line represents data for an individual gene. Red indicates increased expression in high-Myc vs. low-Myc cells. Green indicates decreased expression in high-Myc vs. low-Myc cells. Black indicates no change in expression. The left panel displays the results if the quantification is performed with equal amounts of total RNA for the high-Myc vs. low-Myc cells. The right panel displays the results if the quantification is performed with RNA from equal numbers of high-Myc and low-Myc cells.
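The spike-in normalization described in panel A can be sketched in the same spirit. This is an illustrative toy, not the authors' pipeline: the spike-in amounts, the per-sample measurement factor of 0.5, and the function name `spike_in_scale` are all hypothetical. The sketch assumes that the same amount of each spike-in standard is added per cell in both conditions, so any difference in measured spike-in signal reflects a sample-wide scaling that can be divided out.

```python
from math import log2
from statistics import median


def spike_in_scale(spikes_ref, spikes_sample):
    """Factor that maps the sample's spike-in signals onto the reference's.

    Assumes the spike-ins were added in equal amounts per cell to both samples.
    """
    return median([r / s for r, s in zip(spikes_ref, spikes_sample)])


# Hypothetical per-cell mRNA levels; condition 2 shows transcriptional amplification.
genes1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0]
genes2 = [1.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 512.0, 1024.0]
# Hypothetical spike-in standards S1-S3, identical amounts per cell in both samples.
spikes_per_cell = [2.0, 8.0, 32.0]

# Hypothetical measurement: sample 2's signal is halved across the board,
# for example because equal total RNA, not equal cell numbers, was loaded.
factor2 = 0.5
signal_genes1, signal_spikes1 = genes1, spikes_per_cell
signal_genes2 = [x * factor2 for x in genes2]
signal_spikes2 = [x * factor2 for x in spikes_per_cell]

# Fold-changes taken directly from the measured signals.
raw_fc = [log2(b / a) for a, b in zip(signal_genes1, signal_genes2)]

# Rescale sample 2 so its spike-in signals match sample 1, then recompute.
scale = spike_in_scale(signal_spikes1, signal_spikes2)
spike_fc = [log2(b * scale / a) for a, b in zip(signal_genes1, signal_genes2)]

print("raw      :", raw_fc)    # [-1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
print("spike-in :", spike_fc)  # [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0]
```

Because the spike-in signals are tied to cell number and not to total RNA, rescaling by them recovers the per-cell fold-changes in this toy case, which is the contrast the left and right heatmap panels are meant to show.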


