PLoS Comput Biol. 2008 Oct 3;4(10):e1000189.
doi: 10.1371/journal.pcbi.1000189.

Removal of AU bias from microarray mRNA expression data enhances computational identification of active microRNAs

Ran Elkon et al. PLoS Comput Biol.

Abstract

Elucidating the regulatory roles played by microRNAs (miRs) in various biological networks is one of the greatest challenges in present-day molecular and computational biology. The integrated analysis of gene expression data and 3'-UTR sequences holds great promise as an effective means to systematically delineate active miRs in different biological processes. Applying such an integrated analysis, we uncovered a striking relationship between 3'-UTR AU content and gene response in numerous microarray datasets. We show that this relationship is secondary to a general bias that links gene response to probe AU content, and that it reflects the fact that in the majority of current arrays probes are selected from the 3'-UTRs of target transcripts. Removal of this bias, which is warranted in any analysis of microarray datasets, is therefore of crucial importance when integrating expression data and 3'-UTR sequences to identify regulatory elements embedded in this region. We developed visualization and normalization schemes for the detection and removal of such AU biases and demonstrate that their application to microarray data significantly enhances the computational identification of active miRs. Our results substantiate that, after removal of AU biases, mRNA expression profiles contain ample information that allows in silico detection of miRs that are active in physiological conditions.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Relationship between 3′-UTR AU content and gene response during HPC differentiation.
Expression profiles were measured at several time points after stimulation of HPC differentiation into megakaryocytes. To visualize the relationships between 3′-UTR AU content and gene response, the genes were sorted for each time point according to their fold of repression/induction relative to the expression level at t0, and the mean 3′-UTR AU content was calculated in a sliding window that encompassed in each step 5% of the genes included in the analysis. (At each step the sliding window was moved to the right by 5% of its size.) Each plot corresponds to the time point indicated above it. Genes are sorted on the X-axis according to their response, from the most repressed genes at the left to the most induced genes at the right. The Y-axis represents the mean 3′-UTR AU content calculated on each sliding window. The p value above each plot is for the comparison (Wilcoxon test) between the 3′-UTR AU content of the top 5% (most strongly up-regulated) and bottom 5% (most strongly down-regulated) genes at the corresponding time point. Note the striking relationship between 3′-UTR AU content and gene response at the 16 hr time point.
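The sliding-window calculation described in this legend can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `au_content` and `sliding_window_means` are hypothetical, and the rounding of window and step sizes to whole genes is an assumption the legend does not specify.

```python
from statistics import mean


def au_content(seq):
    """Fraction of A/U bases in a sequence (T is counted as U for DNA notation)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AUT" for base in seq) / len(seq)


def sliding_window_means(fold_changes, au_values, window_frac=0.05, step_frac=0.05):
    """Sort genes by fold change and average 3'-UTR AU content per window.

    window_frac: share of all genes covered by one window (5% in the legend).
    step_frac:   step size as a share of the window size (5% in the legend).
    Rounding down to whole genes is an assumption of this sketch.
    """
    # Most repressed genes first, most induced genes last (the X-axis order).
    order = sorted(range(len(fold_changes)), key=lambda i: fold_changes[i])
    sorted_au = [au_values[i] for i in order]
    n = len(sorted_au)
    window = max(1, int(n * window_frac))
    step = max(1, int(window * step_frac))
    # One mean per window position (the Y-axis values).
    return [mean(sorted_au[s:s + window]) for s in range(0, n - window + 1, step)]
```

With 1,000 genes, for example, each window holds 50 genes and advances by 2 genes per step, so consecutive windows overlap heavily and the resulting curve is smooth.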
Figure 2. Strong relationship between 3′-UTR AU content and gene response detected in a comparison between technical replicates.
The figure shows the relationship between 3′-UTR AU content and gene fold-change in a comparison between two chips hybridized with identical universal reference RNA pools. The plot was generated as described in the legend to Figure 1. A highly significant relationship between 3′-UTR AU content and gene response was detected in this technical comparison (p value = 8.1 × 10^−84 for the comparison between the bottom and top 5% ‘responding’ genes), pointing to a major AU bias in microarray measurements.
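The comparison between the bottom and top 5% of genes can be illustrated with a small rank-sum test. The sketch below is an assumption-laden stand-in rather than the authors' procedure: `extreme_groups` and `rank_sum_p` are hypothetical names, and the p value uses the normal approximation without a tie correction in the variance, so an established statistics library would be preferable for real data.

```python
import math


def extreme_groups(fold_changes, au_values, frac=0.05):
    """Return AU contents of the most down-regulated and most up-regulated genes."""
    order = sorted(range(len(fold_changes)), key=lambda i: fold_changes[i])
    k = max(1, int(len(order) * frac))
    bottom = [au_values[i] for i in order[:k]]
    top = [au_values[i] for i in order[-k:]]
    return bottom, top


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p value (normal approximation, illustrative only)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mu) / sigma
    # erfc keeps precision in the far tail, where very small p values arise.
    return math.erfc(abs(z) / math.sqrt(2))
```

A typical call would be `rank_sum_p(*extreme_groups(fold_changes, au_values))`, where both inputs are per-gene lists in the same order.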
Figure 3. M-A and M-AU plots.
(A). M-A plot shows that there is no intensity-response bias in the comparison between the two chips hybridized with identical universal reference RNA pools. The Y axis (denoted as M) represents the log2 fold-change and the X-axis (denoted as A) represents the average log2 intensity. Each dot in the plot corresponds to a gene in the dataset. (B). Adopting the M-A plot concept, we introduced the M-AU plot, in which the Y axis represents the log2 fold-change (as in the M-A plots), and the X axis represents the 3′-UTR AU content of a gene. The M-AU plot shows a major AU bias in this technical dataset. The red line is the lowess smoothing line calculated for the scatter plot.
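The coordinates used in these plots follow directly from the definitions in the legend. The sketch below computes them for paired intensity measurements; the function names `m_a_values` and `m_au_points` are hypothetical, and the lowess smoothing line is not reproduced here.

```python
import math


def m_a_values(intensity_1, intensity_2):
    """Return (M, A) per gene: M = log2 fold-change, A = average log2 intensity."""
    points = []
    for x1, x2 in zip(intensity_1, intensity_2):
        log1, log2_ = math.log2(x1), math.log2(x2)
        points.append((log1 - log2_, (log1 + log2_) / 2))
    return points


def m_au_points(intensity_1, intensity_2, au_values):
    """Return (AU, M) per gene: the M-AU plot swaps the A axis for 3'-UTR AU content."""
    return [(au, m) for au, (m, _) in zip(au_values, m_a_values(intensity_1, intensity_2))]
```

For a gene measured at intensities 8 and 2 on the two chips, M is 2 (a four-fold difference) and A is 2.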
Figure 4. The AU response bias is related to probe base composition regardless of probe location along the target transcript.
Probe-level M-AU plot for the comparison between two chips hybridized with a common human brain reference sample. This dataset used the new-generation Affymetrix Human Gene 1.0 ST Array, in which probes are located throughout the target transcripts. We generated plots that either included all probes or included separately only those mapped to the 5′-UTR, CDS, or 3′-UTR of the targets. (As each probe is 25 bases long, a probe's AU content (X axis) takes only discrete values in the 0–100% range, in steps of 4%.) Probes mapped to the different transcript regions exhibited similar levels of AU response bias.
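The 4% steps follow from the probe length: a 25-base probe can contain between 0 and 25 A/U bases, and each additional A/U base adds 100/25 = 4 percentage points. A short sketch, with the hypothetical helper `probe_au_percent`:

```python
PROBE_LENGTH = 25  # probe length in bases, as stated in the legend


def probe_au_percent(probe):
    """AU content of a probe in percent (T is counted as U for DNA notation)."""
    probe = probe.upper()
    if len(probe) != PROBE_LENGTH:
        raise ValueError(f"expected a {PROBE_LENGTH}-base probe")
    au_count = sum(base in "AUT" for base in probe)
    return 100 * au_count / PROBE_LENGTH


# All values the X axis can take: 0, 4, 8, ..., 100 (26 levels).
possible_levels = [100 * k / PROBE_LENGTH for k in range(PROBE_LENGTH + 1)]
```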
Figure 5. AU normalization.
M-AU plots before (A) and after (B) applying an AU normalization scheme to the technical dataset that profiled the universal reference RNA pool.
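The legend does not describe the normalization scheme itself. As a rough illustration of the general idea only, the sketch below removes an AU-dependent trend by subtracting the median M of genes with similar AU content. The binned-median approach, the function name `au_normalize`, and the `n_bins` parameter are all assumptions of this example and should not be read as the authors' method; a smoothed fit through the M-AU scatter would be another way to estimate the trend.

```python
from collections import defaultdict
from statistics import median


def au_normalize(m_values, au_values, n_bins=20):
    """Subtract the per-bin median M from each gene (illustrative stand-in only).

    m_values:  log2 fold-changes, one per gene.
    au_values: AU content per gene as a fraction in [0, 1].
    """
    bins = defaultdict(list)
    keys = []
    for m, au in zip(m_values, au_values):
        # Assign the gene to an AU-content bin; AU = 1.0 goes in the last bin.
        key = min(int(au * n_bins), n_bins - 1)
        keys.append(key)
        bins[key].append(m)
    centers = {key: median(values) for key, values in bins.items()}
    # After subtraction, each bin's median M is zero.
    return [m - centers[key] for m, key in zip(m_values, keys)]
```

After such a correction, genes in every AU-content bin are centered on zero fold-change, which is the qualitative outcome a flat M-AU plot would show.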
Figure 6. AU bias in the miR-155 dataset.
Relationship between 3′-UTR AU content and gene response in the dataset that compared gene expression profiles between miR-155-deficient and control Th2 cells. (A) Without AU normalization. (B) After applying AU normalization to the dataset. Plots were generated as described in the legend to Figure 1.
