Nucleic Acids Res. 2016 Jan 4;44(D1):D20-6.
doi: 10.1093/nar/gkv1352. Epub 2015 Dec 15.

The European Bioinformatics Institute in 2016: Data growth and integration

Charles E Cook et al. Nucleic Acids Res. 2016.

Abstract

New technologies are revolutionising biological research and its applications by making it easier and cheaper to generate ever-greater volumes and types of data. In response, the services and infrastructure of the European Bioinformatics Institute (EMBL-EBI, www.ebi.ac.uk) are continually expanding: total disk capacity increases significantly every year to keep pace with demand (75 petabytes as of December 2015), and interoperability between resources remains a strategic priority. Since 2014 we have launched two new resources: the European Variation Archive for genetic variation data and EMPIAR for two-dimensional electron microscopy data, as well as a Resource Description Framework platform. We also launched the Embassy Cloud service, which allows users to run large analyses in a virtual environment next to EMBL-EBI's vast public data resources.

Figures

Figure 1.
Installed (2008–2015) storage at EMBL-EBI. These figures include all installed storage, counting multiple backups for all data resources as well as unused storage to handle submissions in the immediate future. The actual total volume of a single copy of all data resources is roughly 30% of total installed storage capacity. Figures are for end-of-year; 2015 figure is estimated based on installed capacity in October 2015.
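As a rough illustration of the caption's "roughly 30%" figure, the sketch below estimates the volume of a single copy of all data resources from an installed capacity. The function name and the choice of 75 petabytes as input are assumptions made for illustration: the abstract quotes 75 petabytes as of December 2015, whereas the figure's 2015 value is estimated from October 2015, so the result is only indicative.

```python
# Fraction of installed storage taken by a single copy of all data
# resources, per the Figure 1 caption ("roughly 30%").
SINGLE_COPY_FRACTION = 0.30


def single_copy_estimate(installed_pb, fraction=SINGLE_COPY_FRACTION):
    """Estimate the single-copy data volume (in petabytes).

    `installed_pb` is the total installed storage, which includes
    backups and unused headroom. This helper is hypothetical and
    only restates the caption's ratio as arithmetic.
    """
    return installed_pb * fraction


# Assumed input: the 75 PB total quoted in the abstract.
estimate_pb = single_copy_estimate(75.0)
print(f"Single-copy estimate: about {estimate_pb:.1f} PB")
```

Under these assumptions the single-copy volume comes out at about 22.5 petabytes; the remainder covers backups and spare capacity for upcoming submissions.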
Figure 2.
(A) Data accumulation at EMBL-EBI by data type, for example mass spectrometry (MS); (B) Data accumulation by dedicated resource, for example PRIDE. The y-axis is logarithmic, and the slope of the dashed lines indicates a 12-month doubling time. Data growth continues across all data types and all data resources at EMBL-EBI. For all data resources shown here, growth rates are predicted to continue increasing, with notably sustained exponential growth in PRIDE, the European Genome-phenome Archive (EGA) and MetaboLights, each of which has a doubling time of around 12 months. All three contributing platforms show growth rates that are increasing over time, with data growing exponentially at a doubling time of around 12 months.
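To make the "12-month doubling time" in this caption concrete, here is a minimal sketch of the underlying arithmetic, assuming pure exponential growth. The function names are hypothetical and the numbers in the usage lines are illustrative, not values read from the figure.

```python
import math


def projected_volume(initial, months, doubling_months=12.0):
    """Volume after `months` of exponential growth with a fixed doubling time."""
    return initial * 2 ** (months / doubling_months)


def log10_slope_per_year(doubling_months=12.0):
    """Slope of log10(volume) against time in years.

    On a log-scale y-axis, exponential growth is a straight line;
    this is the slope such a line has for the given doubling time.
    """
    return math.log10(2) * 12.0 / doubling_months


def doubling_time_months(v_start, v_end, months_elapsed):
    """Doubling time implied by two observations of the same resource."""
    return months_elapsed * math.log(2) / math.log(v_end / v_start)


# Illustrative inputs only: one unit of data growing for three years.
print(projected_volume(1.0, 36))             # eight-fold after three doublings
print(round(log10_slope_per_year(), 3))      # about 0.301 decades per year
print(doubling_time_months(1.0, 8.0, 36.0))  # recovers roughly 12 months
```

A 12-month doubling time therefore corresponds to a rise of about 0.3 on a log10 axis per year, which is why a resource tracking the dashed lines is doubling annually.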
Figure 3.
Representation of the internal interactions between different databases and resources at EMBL-EBI, as determined by the exchange of data. All resources are placed on the circumference of the circle, with each resource represented by an arc proportional to its total number of interactions. The width of each internal arc, which transects the circle and connects two different resources, is weighted according to the number of different data types that are exchanged between the two resources at the ends of the arc. The colouring of the internal arcs does not reflect the direction of data exchange. The graphic was generated using the D3 JavaScript library (http://d3js.org) and the data, gathered as part of an external review, were accurate at the time of acquisition (Jan 2015).
