Review
Biomolecules. 2022 Oct 4;12(10):1425.
doi: 10.3390/biom12101425.

Protein Data Bank: A Comprehensive Review of 3D Structure Holdings and Worldwide Utilization by Researchers, Educators, and Students

Stephen K Burley et al. Biomolecules. 2022.

Abstract

The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the United States National Science Foundation, National Institutes of Health, and Department of Energy, supports structural biologists and Protein Data Bank (PDB) data users around the world. The RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, serves as the US data center for the global PDB archive housing experimentally determined three-dimensional (3D) structure data for biological macromolecules. As the wwPDB-designated Archive Keeper, RCSB PDB is also responsible for the security of PDB data and the weekly update of the archive. Each year, RCSB PDB serves tens of thousands of data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) working on all permanently inhabited continents. RCSB PDB makes PDB data available from its research-focused web portal at no charge and without usage restrictions to many millions of PDB data consumers around the globe. It also provides educators, students, and the general public with an introduction to the PDB and related training materials through its outreach and education-focused web portal. This review article describes the growth of the PDB, examines the evolution of experimental methods for structure determination viewed through the lens of the PDB archive, and provides a detailed accounting of PDB archival holdings and their utilization by researchers, educators, and students worldwide.

Keywords: DNA; Open Access; Protein Data Bank; RNA; Worldwide Protein Data Bank; biological macromolecules; carbohydrates; cryogenic electron microscopy; cryogenic electron tomography; electron crystallography; macromolecular crystallography; micro-electron diffraction; nuclear magnetic resonance spectroscopy; nucleic acids; proteins; small-molecule ligands.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Geographic distribution of PDB depositions from 1971 to mid-2022.
Figure 2
PDB archive metrics. (A). Growth 1976–2021. (B). New MX, 3DEM, and NMR structures released annually (2000–2021). (C). MX and 3DEM structure counts vs. resolution (Å). (D). Average number of residues per structure for structures released annually (2000–2021). (E). Average number of polymer chains per structure for structures released annually (2000–2021). (F). Average number of non-polymer ligands per structure for structures released annually (2000–2021).
Figure 3
PDB MX structure phasing method trends vs. year of structure release from 2001–2021 (MR: molecular replacement; MAD: multi-wavelength anomalous dispersion; SAD: single-wavelength anomalous dispersion; IR: isomorphous replacement).
Figure 4
Box plot display of PDB MX structure resolution vs. time. The bold solid bar within each box corresponds to the median value for structures publicly released that year. (N.B.: Small numbers of extreme outliers with resolution > 4 Å were excluded from this analysis for clarity.)
Figure 5
(A). Annual average reported resolution (blue) and annual best reported resolution (orange) for 3DEM PDB structures released 2013–2022. (B). Percentage of 3DEM PDB structures released per year reporting use of direct electron detectors. (C). Top-three reported image reconstruction software packages per year shown as a percentage of 3DEM PDB structures reporting reconstruction software.
Figure 6
Breakdown of NMR PDB structure holdings by sample type.
Figure 7
Phylogenetic tree showing PDB holdings (as of mid-2022). Within each of the three branches, PDB structure totals are provided for selected organisms. N.B.: The PDB also houses 3D structures that solely contain nucleic acids (DNA, RNA, DNA-RNA hybrids, etc.) and/or viral proteins or human-designed proteins, which collectively accounted for ~8% of archival holdings as of mid-2022.
Figure 8
(A). Geographic distribution of RCSB.org users by country. (B). Top 10 countries with the highest percentage of users from 2019–2021. Data from Google Analytics.
Figure 9
Average monthly usage of PDB-101 (PDB101.RCSB.org, accessed on 28 August 2022) from 2019–2021. Data from Google Analytics.
Figure 10
Experimental approaches for determination of protein structures and computational methods for predicting structures both rely on open access to genomic and 3D structure data. Here, methods for determining the structure of the RNA-binding protein Nova-2 are shown. The MX structure (left) was determined for an isolated domain of the protein bound to its RNA target. The computed structure (right) includes the entire polypeptide chain, which is predicted to include three well-folded domains (blue/cyan) connected by apparently unstructured linkers (yellow/orange). Image adapted from New England Journal of Medicine, Stephen K. Burley, Wadih Arap, Renata Pasqualini, Predicting Proteome-Scale Protein Structure with Artificial Intelligence, 385, 2191–2194 [173]. Copyright © 2022 Massachusetts Medical Society. Reprinted with permission.
