Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Review
. 2010;11(12):220.
doi: 10.1186/gb-2010-11-12-220. Epub 2010 Dec 22.

From RNA-seq reads to differential expression results

Affiliations
Review

From RNA-seq reads to differential expression results

Alicia Oshlack et al. Genome Biol. 2010.

Abstract

Many methods and tools are available for preprocessing high-throughput RNA sequencing data and detecting differential expression.

PubMed Disclaimer

Figures

Figure 1
Figure 1
Overview of the RNA-seq analysis pipeline for detecting differential expression. The steps in the pipeline are in red boxes; the methodological components of the pipeline are shown in blue boxes and bold text; software examples and methods for each step (a non-exhaustive list) are shown by regular text in blue boxes. References for the tools and methods shown are listed in Table 1. First, reads are mapped to the reference genome or transcriptome (using junction libraries to map reads that cross exon boundaries); mapped reads are assembled into expression summaries (tables of counts, showing how may reads are in coding region, exon, gene or junction); the data are normalized; statistical testing of differential expression (DE) is performed, producing and a list of genes with associated P-values and fold changes. Systems biology approaches can then be used to gain biological insights from these lists.
Figure 2
Figure 2
Summarizing mapped reads into a gene level count. (a) Mapped reads from a small region of the RNA-binding protein 39 (RBM39) gene are shown for LNCaP prostate cancer cells [90], human liver and human testis from the UCSC track. The three rows of RNA-seq data (blue and black graphs) are shown as a 'pileup track', where the y-axis at each location measures the number of mapped reads that overlap that location. Also shown are the genomic coordinates, gene model (labeled RBM39; blue boxes indicate exons) and conservation score across vertebrates. It is clear that many reads originate from regions with no known exons. (b) A schematic of a genomic region and reads that might arise from it. Reads are color-coded by the genomic feature from which they originate. Different summarization strategies will result in the inclusion or exclusion of different sets of reads in the table of counts. For example, including only reads coming from known exons will exclude the intronic reads (green) from contributing to the results. Splice junctions are listed as a separate class to emphasize both the potential ambiguity in their assignment (such as which exon should a junction read be assigned to) and the possibility that many of these reads may not be mapped because they are harder to map than continuous reads. CDS, coding sequence.

Similar articles

Cited by

References

    1. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet. 2008;40:1413–1415. doi: 10.1038/ng.259. - DOI - PubMed
    1. Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, Seifert M, Borodina T, Soldatov A, Parkhomchuk D, Schmidt D, O'Keeffe S, Haas S, Vingron M, Lehrach H, Yaspo ML. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science. 2008;321:956–960. doi: 10.1126/science.1160342. - DOI - PubMed
    1. Wagner JR, Ge B, Pokholok D, Gunderson KL, Pastinen T, Blanchette M. Computational analysis of whole-genome differential allelic expression data in human. PLoS Comput Biol. 2010;6:e1000849. doi: 10.1371/journal.pcbi.1000849. - DOI - PMC - PubMed
    1. Wang X, Sun Q, McGrath SD, Mardis ER, Soloway PD, Clark AG. Transcriptome-wide identification of novel imprinted genes in neonatal mouse brain. PLoS ONE. 2008;3:e3839. doi: 10.1371/journal.pone.0003839. - DOI - PMC - PubMed
    1. Gregg C, Zhang J, Weissbourd B, Luo S, Schroth GP, Haig D, Dulac C. High-resolution analysis of parent-of-origin allelic expression in the mouse brain. Science. 2010;329:643–648. doi: 10.1126/science.1190830. - DOI - PMC - PubMed

Publication types

LinkOut - more resources