PLoS One. 2014 Apr 15;9(4):e94650.
doi: 10.1371/journal.pone.0094650. eCollection 2014.

Long-read sequencing of chicken transcripts and identification of new transcript isoforms

Sean Thomas et al. PLoS One. 2014.

Abstract

The chicken has long served as an important model organism in many fields, and continues to aid our understanding of animal development. Functional genomics studies aimed at probing the mechanisms that regulate development require high-quality genomes and transcript annotations. The quality of these resources has improved dramatically over the last several years, but many isoforms and genes have yet to be identified. We hope to contribute to the process of improving these resources with the data presented here: a set of long cDNA sequencing reads, and a curated set of new genes and transcript isoforms not represented in the most up-to-date genome annotation currently available to the community of researchers who rely on the chicken genome.

Conflict of interest statement

Competing Interests: J. Underwood formerly worked for, and E. Tseng currently works for Pacific Biosciences, which has financial interests in long-read sequencing technology. There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials, as detailed online in the guide for authors.

Figures

Figure 1. Long-read sequences map accurately to the chicken genome.
Shown in this UCSC genome browser image is one example of a long-read alignment shown alongside the corresponding short-read data for this region as well as existing RefSeq and Ensembl annotations.
Figure 2. Broad coverage of existing chicken genes using long-read sequencing.
Along the x-axis representing transcript length is a histogram of the number of RefSeq transcripts within a given range of lengths (grey). A similar histogram is shown for those transcripts that overlap each RefSeq annotation by any amount (red), or by more than 90% of the gene length (dark red).
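The binning described in this caption can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the function names, the 1,000 bp bin size, the single-interval (start, end) representation that ignores exon structure, and the example coordinates are all assumptions made for the sketch.

```python
from collections import Counter

def covered_fraction(annotation, read):
    """Fraction of an annotation's span overlapped by one read alignment.

    Both arguments are hypothetical half-open (start, end) intervals on the
    same chromosome and strand; exon structure is ignored for simplicity.
    """
    a_start, a_end = annotation
    r_start, r_end = read
    overlap = max(0, min(a_end, r_end) - max(a_start, r_start))
    return overlap / (a_end - a_start)

def length_histograms(annotations, reads, bin_size=1000, threshold=0.9):
    """Count annotations per length bin: all, overlapped at all, overlapped > threshold."""
    all_bins, any_bins, high_bins = Counter(), Counter(), Counter()
    for ann in annotations:
        length_bin = (ann[1] - ann[0]) // bin_size * bin_size
        # Best coverage achieved by any single read (0.0 if there are no reads).
        best = max((covered_fraction(ann, r) for r in reads), default=0.0)
        all_bins[length_bin] += 1
        if best > 0:
            any_bins[length_bin] += 1
        if best > threshold:
            high_bins[length_bin] += 1
    return all_bins, any_bins, high_bins

# Made-up coordinates, for illustration only.
annotations = [(0, 1500), (5000, 5800), (9000, 12000)]
reads = [(0, 1480), (5500, 6000)]
all_bins, any_bins, high_bins = length_histograms(annotations, reads)
```

Here the three counters correspond to the grey, red, and dark red histograms, respectively, under the simplifying assumptions above.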
Figure 3. Validation of mapped long-read splice sites.
TopHat2 was used to identify observed splice junctions from the short-read data. The light red and light blue lines show the distribution of distances from Ensembl-annotated splice sites to the experimentally observed splice sites. Both of these lines peak heavily at 0, indicating the degree of agreement between these orthogonal datasets. Splice sites annotated from long-read sequencing (blue and red) also show overall agreement, with a small peak of misidentified splice donor sites within 10 bp of the accurate one, possibly due to alignment errors near sites with multiple possible splice donor sites.
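One way to picture the distance distribution described in this caption is the sketch below. It is a simplified illustration rather than the authors' method: it assumes splice-site positions are plain integer coordinates on a single chromosome and strand, and the function names and example positions are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

def nearest_offset(site, observed_sorted):
    """Signed offset (observed - site) to the closest observed position.

    observed_sorted must be a non-empty, ascending list of positions.
    """
    i = bisect_left(observed_sorted, site)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = observed_sorted[max(i - 1, 0): i + 1]
    return min((obs - site for obs in candidates), key=abs)

def offset_distribution(annotated, observed):
    """Tally, for each annotated site, its offset to the nearest observed site."""
    observed_sorted = sorted(set(observed))
    if not observed_sorted:
        return Counter()
    return Counter(nearest_offset(s, observed_sorted) for s in annotated)

# Made-up positions, for illustration only.
observed = [100, 250, 400]        # e.g. junction ends found in short-read data
annotated = [100, 250, 404, 250]  # e.g. splice sites from another annotation
dist = offset_distribution(annotated, observed)
```

A tally concentrated at offset 0 corresponds to the sharp peak at 0 in the figure; a secondary cluster at small non-zero offsets would correspond to the misplaced donor sites within 10 bp that the caption mentions.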
Figure 4. Identification of new isoforms and genes.
Shown in this image from the UCSC genome browser are examples of the new genes and isoforms identified from among the short and long-read annotations: A. This region carries two distinct annotations, one for an alternate transcription start site (TSS), and another for a completely new gene that is currently unannotated. B. A heart-specific isoform of the FKBP7 gene. C. Both long-read and short-read data support the existence of transcripts going in opposite directions in this region of chromosome 9.
Figure 5. New isoforms and genes with tissue-specific expression.
The relative expression of (a) the 2,018 differentially expressed isoforms and (b) the 120 new differentially expressed gene annotations is shown across chicken adult brain, cerebellum, heart, kidney, liver and testes datasets, according to the scale provided.
