Nucleic Acids Res. 2007 Jan;35(Database issue):D610-7.
doi: 10.1093/nar/gkl996. Epub 2006 Dec 5.

Ensembl 2007

T J P Hubbard et al. Nucleic Acids Res. 2007 Jan.

Abstract

The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of chordate genome sequences. Over the past year the number of genomes available from Ensembl has increased from 15 to 33, with the addition of sites for the mammalian genomes of elephant, rabbit, armadillo, tenrec, platypus, pig, cat, bush baby, common shrew, microbat and European hedgehog; the fish genomes of stickleback and medaka; and second examples of the genomes of the sea squirt (Ciona savignyi) and the mosquito (Aedes aegypti). Major features added during the year include the first complete gene sets for genomes with low sequence coverage, new strain variation data, and new orthology/paralogy annotations based on gene trees.

Figures

Figure 1
Figure shows the growth in the number of genomes provided by Ensembl over the past 5 years. The discontinuities at the start of 2006 and 2005 represent the removal of the honeybee (Apis mellifera) and nematode (Caenorhabditis briggsae) Ensembl sites, respectively. The black line shows all genomes and the red line shows mammalian genomes.
Figure 2
Figure shows a screenshot of part of an AlignSliceView web page from the elephant (Loxodonta africana) genome as an example of the output from the Ensembl gene build system when applied to low-coverage shotgun genomes. The top panel shows elephant genome sequence and the bottom panel shows the region of human genome sequence that aligns to it. In the DNA(contigs) track blue regions indicate sequence and blank regions indicate gaps. The track for elephant gives an idea of the fragmentation of the genome assembly (the gaps in the track for human do not indicate gaps in the genome but rather gaps in the alignment between elephant and human). Elephant DNA contigs have been organized into ‘gene-scaffolds’ based on whole genome alignments (WGA) to a reference genome, in this case human (see text). Elephant transcripts, such as the reverse strand transcript ENSLAFT00000011080 shown here, are built by projecting the protein-coding part of human transcripts through the WGA. In this case the elephant transcript has been built by projecting the annotated transcript C9orf138; however, there is no WGA alignment for the third exon of this transcript [the third exon from the right is positioned against a gap between contigs in the DNA(contigs) track]. As a result this exon is missing from the view of human transcripts, a fact that is indicated by the green dotted link connecting exons 2 and 4 (CI138_HUMAN). The elephant transcript ENSLAFT00000011080 does contain this exon; however, because of the gap in the elephant sequence, only the exon length can be inferred from the corresponding human transcript, so the exon sequence is composed entirely of ‘N’s in the transcript and ‘X’s in the corresponding translation. Interestingly, in human a shorter alternative transcript is also annotated with a missing third exon (Q5VZT6_HUMAN); however, the form with the third exon appears to be conserved across mouse, rat and dog, suggesting that it is likely to be conserved in elephant too.
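The gap handling described in this caption can be illustrated with a small sketch. This is not Ensembl's gene build code: the function names, the toy codon table and the example sequences are all hypothetical. It assumes only what the caption states, namely that an exon falling in an assembly gap keeps the length inferred from the human transcript and is written as ‘N’s, which appear as ‘X’s in the translation.

```python
from typing import Sequence, Union

# Toy codon table for this example only (hypothetical subset; a real table has 64 codons).
CODON_TABLE = {"ATG": "M", "GCT": "A", "AAA": "K", "TGA": "*"}


def build_transcript(exons: Sequence[Union[str, int]]) -> str:
    """Join projected exons into one coding sequence.

    Each exon is either its sequence (str) or, when it falls in an
    assembly gap, only its inferred length (int), which is padded with 'N'.
    """
    parts = []
    for exon in exons:
        if isinstance(exon, int):
            parts.append("N" * exon)  # sequence unknown, length known
        else:
            parts.append(exon)
    return "".join(parts)


def translate(cds: str) -> str:
    """Translate codon by codon; any codon containing 'N' becomes 'X'."""
    protein = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if "N" in codon:
            protein.append("X")
        else:
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)


# Three exons; the middle one lies in a gap and is known only by its length (6 bp).
transcript = build_transcript(["ATGGCT", 6, "AAATGA"])
print(transcript)             # ATGGCTNNNNNNAAATGA
print(translate(transcript))  # MAXXK*
```

Keeping the placeholder exon at its inferred length preserves the reading frame of the downstream exons, which is why the final codons in the example still translate normally.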
Figure 3
Figure shows the sequence variation across mouse strains for the transcript ENSMUST00000006949 in the new view TranscriptSNPView. Strain-specific SNPs were calculated by aligning mouse reads from different strains against the reference genome as described previously (25). This gene-centric view collapses the size of introns to focus on variation within exons, which are shown separately for each strain with the consequences of any SNPs on their coding sequence. The extent of resequencing coverage is also shown for each strain.
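One way to picture the intron collapsing mentioned in this caption is as a coordinate mapping from genomic positions to display positions. The sketch below is an assumption for illustration, not the TranscriptSNPView implementation: the function name, the fixed `intron_width` and the example coordinates are hypothetical.

```python
from typing import List, Optional, Tuple


def collapsed_position(
    exons: List[Tuple[int, int]],
    position: int,
    intron_width: int = 10,
) -> Optional[int]:
    """Map a genomic position to a 0-based display offset.

    exons: sorted, non-overlapping (start, end) pairs, inclusive.
    Every intron is drawn with the same fixed width, so exons dominate
    the display. Returns None for positions outside all exons.
    """
    offset = 0
    for start, end in exons:
        if start <= position <= end:
            return offset + (position - start)
        # Skip this exon plus one fixed-width intron placeholder.
        offset += (end - start + 1) + intron_width
    return None


exons = [(100, 199), (1000, 1099)]
print(collapsed_position(exons, 100))   # 0
print(collapsed_position(exons, 1000))  # 110
print(collapsed_position(exons, 500))   # None (intronic)
```

In this toy case an 800 bp intron occupies only 10 display units, so a SNP at genomic position 1000 is drawn 110 units from the start rather than 900.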
Figure 4
Figure shows the gene tree panel from the GeneTreeView web page for the human FOXJ3 gene, generated by the Ensembl gene orthology/paralogy prediction pipeline. Most of the ortholog relationships are one-to-one; however, there is a one-to-many relationship to the fish lineage, where the gene appears to have duplicated. The relationship to the paralogous gene FOXJ2 can also be seen, where the orthologs in the fish lineage appear to have been lost. The full web page (not shown) includes links to view the tree structure in the Java applet ATV and to view the protein sequence alignment upon which it is based in the Java applet Jalview. The green bars represent the alignments of the protein translations upon which the tree is based, where shaded blocks represent aligned regions. Poor and fragmented alignments can cause erroneous placement of genes in the tree, so visualizing the alignment is useful when interpreting the tree.
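The terms one-to-one and one-to-many used in this caption can be made concrete with a short sketch. It simply counts partners in a list of ortholog pairs between two species; it is not the Ensembl pipeline, which infers these relationships from gene trees. The function name and the non-human gene identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def classify_orthology(
    pairs: Iterable[Tuple[str, str]],
) -> Dict[Tuple[str, str], str]:
    """Label each (gene_in_species_a, gene_in_species_b) ortholog pair."""
    pairs = list(pairs)
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)

    labels = {}
    for a, b in pairs:
        a_has_many = len(a_to_b[a]) > 1  # gene a has several orthologs in species b
        b_has_many = len(b_to_a[b]) > 1  # gene b has several orthologs in species a
        if a_has_many and b_has_many:
            labels[(a, b)] = "many-to-many"
        elif a_has_many or b_has_many:
            labels[(a, b)] = "one-to-many"
        else:
            labels[(a, b)] = "one-to-one"
    return labels


# Human versus a mammal: a single ortholog.
print(classify_orthology([("FOXJ3", "mammal_gene")]))
# Human versus a fish in which the gene has duplicated: two orthologs.
print(classify_orthology([("FOXJ3", "fish_gene_a"), ("FOXJ3", "fish_gene_b")]))
```

The first call labels its pair one-to-one and the second labels both pairs one-to-many, matching the pattern the caption describes for the fish lineage.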
Figure 5
Figure shows a portion of the human genome in ContigView showing the transcript Q6PEX7_HUMAN. ContigView has a ‘Basepair view’ panel allowing DNA sequence and six-frame translation to be examined, which is shown centered on the second exon of this transcript. The introduction of AJAX functionality to ContigView greatly simplifies navigation to precise locations in ‘Basepair view’. Clicking and dragging in any ContigView panel draws a red box, and a popup appears when the mouse is released. In this example a region around the start of translation has been selected in ‘Detailed view’. The ‘click…drag…release’ mouse gesture that the user needs to perform is shown by the annotation on the figure. Clicking on the first option in the popup would reposition ‘Basepair view’ around this feature. This functionality greatly improves the interactivity of the web interface and will be progressively incorporated into other Ensembl views.
