BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S1.
doi: 10.1186/1471-2164-9-S2-S1.

The unfoldomics decade: an update on intrinsically disordered proteins


A Keith Dunker et al. BMC Genomics. 2008.

Abstract

Background: Our first predictor of protein disorder was published just over a decade ago in the Proceedings of the IEEE International Conference on Neural Networks (Romero P, Obradovic Z, Kissinger C, Villafranca JE, Dunker AK (1997) Identifying disordered regions in proteins from amino acid sequence. Proceedings of the IEEE International Conference on Neural Networks, 1: 90-95). By now more than twenty other laboratory groups have joined the efforts to improve the prediction of protein disorder. While the various prediction methodologies used for protein intrinsic disorder resemble those used for secondary structure prediction, the two types of structures are entirely different. For example, the two structural classes have very different dynamic properties, with the irregular secondary structure class being much less mobile than the disorder class. The prediction of secondary structure has been useful. On the other hand, the prediction of intrinsic disorder has been revolutionary, leading to major modifications of the more than 100-year-old views relating protein structure and function. Experimentalists have been providing evidence over many decades that some proteins lack fixed structure or are disordered (or unfolded) under physiological conditions. Experimentalists are also showing that, for many proteins, function depends on the unstructured rather than the structured state; such results are in marked contrast to more than hundred-year-old views such as the lock and key hypothesis. Despite extensive data on many important examples, including disease-associated proteins, the importance of disorder for protein function has been largely ignored. Indeed, to our knowledge, current biochemistry books do not present even one acknowledged example of a disorder-dependent function, even though some reports of disorder-dependent functions are more than 50 years old. The results from genome-wide predictions of intrinsic disorder and the results from other bioinformatics studies of intrinsic disorder are demanding attention for these proteins.

Results: Disorder prediction has been important for showing that the relatively few experimentally characterized examples are members of a very large collection of related disordered proteins that are widespread over all three domains of life. Many significant biological functions are now known to depend directly on, or are importantly associated with, the unfolded or partially folded state. Here our goal is to review the key discoveries and to weave these discoveries together to support novel approaches for understanding sequence-function relationships.

Conclusion: Intrinsically disordered protein is common across the three domains of life, but especially common among the eukaryotic proteomes. Signaling sequences and sites of posttranslational modifications are frequently, or very likely most often, located within regions of intrinsic disorder. Disorder-to-order transitions are coupled with the adoption of different structures with different partners. Also, the flexibility of intrinsic disorder helps different disordered regions to bind to a common binding site on a common partner. Such capacity for binding diversity plays important roles in protein-protein interaction networks and likely also in gene regulation networks. Such disorder-based signaling is further modulated in multicellular eukaryotes by alternative splicing; such splicing events map to regions of disorder much more often than to regions of structure. Associating alternative splicing with disorder rather than structure alleviates theoretical and experimentally observed problems associated with the folding of different-length, isomeric amino acid sequences. The combination of disorder and alternative splicing is proposed to provide a mechanism for easily "trying out" different signaling pathways, thereby providing the mechanism for generating signaling diversity and enabling the evolution of cell differentiation and multicellularity. Finally, several recent small molecules of interest as potential drugs have been shown to act by blocking protein-protein interactions based on intrinsic disorder of one of the partners. Study of these examples has led to a new approach for drug discovery, and bioinformatics analysis of the human proteome suggests that various disease-associated proteins are very rich in such disorder-based drug discovery targets.


Figures

Figure 1
Peculiarities of the amino acid sequences of intrinsically disordered proteins. A. Mean net charge versus mean hydropathy plot (charge-hydropathy plot) for a set of 275 folded proteins (blue squares) and 91 natively unfolded proteins (red circles) [37]. B. Amino-acid composition, relative to the set of globular proteins Globular-3D, of intrinsically disordered regions 10 residues or longer from the DisProt database. Dark gray indicates DisProt 1.0 (152 proteins), whereas light gray indicates DisProt 3.4 (460 proteins). Amino acid compositions were calculated per disordered region and then averaged. The arrangement of the amino acids is by peak height for the DisProt 3.4 release. Confidence intervals were estimated using per-protein bootstrapping with 10,000 iterations [40].
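The two coordinates of a charge-hydropathy plot like the one in panel A can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure of [37]: the caption names neither a hydropathy scale nor a charge convention, so the sketch assumes the Kyte-Doolittle scale rescaled to the range 0 to 1, and counts K and R as +1 and D and E as -1 while ignoring histidine. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (an ASSUMED choice; the caption names no scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq):
    """Mean hydropathy of an uppercase one-letter sequence, each residue rescaled to [0, 1]."""
    lo, hi = -4.5, 4.5  # extremes of the Kyte-Doolittle scale (Arg and Ile)
    values = [(KYTE_DOOLITTLE[aa] - lo) / (hi - lo) for aa in seq]
    return sum(values) / len(values)


def mean_net_charge(seq):
    """Absolute net charge per residue; ASSUMES K/R = +1, D/E = -1, His neutral."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return abs(positive - negative) / len(seq)


def charge_hydropathy_point(seq):
    """Return (mean hydropathy, mean net charge), the two axes of the plot."""
    return mean_hydropathy(seq), mean_net_charge(seq)
```

Calling `charge_hydropathy_point` on each sequence of a folded set and an unfolded set gives the points to scatter. Residues outside the 20 standard amino acids raise a `KeyError` here; a real analysis would need a policy for them.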
Figure 2
Functional anthology of intrinsic disorder.
Figure 3
PONDR-based analysis of hirudin and thrombin. The correspondence between PONDR® VL-XT predictions and regions of known structure is shown. Two PDB structures are presented – 5HIR (left) and 1NO9 (right) – where each chain is color coded – folded N-terminal domain of hirudin (yellow, disulphide bridges are shown by maroon lines), acidic C-terminal domain of hirudin (red) bound to a heavy chain of thrombin (blue), and light chain of thrombin (green). These color codes are also used for bars in two PONDR® VL-XT plots – (top) hirudin and (bottom) thrombin – to indicate the positions of the regions of known structure in the context of the PONDR® VL-XT predictions. Drawn over these bars, hash marks show the residues in contact with other chains, where the color of the hash mark corresponds to the color code of the chain in contact. The black hash mark in the PONDR® VL-XT plot for thrombin corresponds to the factor Xa cleavage site. A predicted α-MoRF region of hirudin is shown in the corresponding PONDR® VL-XT plot as a pink bar.
Figure 4
Examples of structurally divergent MoRFs. MoRFs (red ribbons) and partners (green surface) are shown. (A) An α-MoRF, Proteinase Inhibitor IA3, bound to Proteinase A (PDB entry 1DP5). (B) A β-MoRF, viral protein pVIc, bound to Human Adenovirus 2 Proteinase (PDB entry 1AVP). (C) An ι-MoRF, Amphiphysin, bound to α-adaptin C (PDB entry 1KY7). (D) A complex-MoRF, β-amyloid precursor protein (βAPP), bound to the PTB domain of the neuron specific protein X11 (PDB entry 1X11). Partner interfaces (gray surface) are also indicated.
Figure 5
Bioinformatics evidence for the unstructured character of MoRFs in their unbound states. Surface and interface area normalized by the number of residues in each chain for MoRF and the OC datasets.
Figure 6
Structural changes in MoRF partners. Ribbon representations of MoRF partners are shown unbound (blue ribbons) and bound (green ribbons) to MoRFs (red ribbons). (A) Small scale structural alterations in CheY induced by binding of the MoRF region of FliM (PDB entries: unbound – 1U8T and bound – 1F4V). (B) Large scale structural alterations in calmodulin induced by binding to the MoRF of GAD (PDB entries: unbound – 1CLL and bound – 1NWD). (C) Partial disorder-to-order transition in PCNA induced by binding to the MoRF of FEN-1 (PDB entries: unbound – 1RWZ and bound – 1RXZ). (D) Partial order-to-disorder transition in Bcl-xL induced by binding to the MoRF of Bim (PDB entries: unbound – 1PQ0 and bound – 1PQ1).
Figure 7
Sequence and structure comparison for the four overlapping complexes in the C-terminus of p53. (A) Primary, secondary, and quaternary structure of p53 complexes. (B) The ΔASA for rigid association between the components of complexes for each residue in the relevant sequence region of p53. The two hatched bars indicate acetylated lysine residues.
Figure 8
Abundance of intrinsic disorder in alternatively spliced regions. Fractions of alternatively spliced regions of RNA coding for entirely disordered protein, for both ordered and disordered protein, and for fully structured regions are shown as red, violet and blue pieces of the pie chart, respectively.
Figure 9
Abundance of conserved predicted disordered regions in various organisms. Histogram of conserved predicted disorder effective length classes by kingdom.
Figure 10
Abundance of intrinsic disorder in disease-associated proteins. Percentages of disease-associated proteins with ≥ 30 to ≥ 100 consecutive residues predicted to be disordered. The error bars represent 95% confidence intervals and were calculated using 1,000 bootstrap resamplings. Corresponding data for signaling and ordered proteins are shown for comparison. The analyzed sets of disease-related proteins included 1786, 487, 689, and 285 proteins for cancer, CVD, neurodegenerative disease and diabetes, respectively.
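The statistic in this caption, the percentage of proteins containing a run of at least N consecutive predicted-disordered residues with a bootstrap confidence interval, can be illustrated with a minimal sketch. Several details are assumptions rather than the authors' stated procedure: each protein is represented as a list of per-residue disorder scores, a residue counts as disordered when its score is at least 0.5, resampling is done over proteins, and the interval is a simple percentile interval. All function and parameter names are hypothetical.

```python
import random


def longest_disordered_run(scores, cutoff=0.5):
    """Length of the longest stretch of consecutive residues scoring >= cutoff (ASSUMED cutoff)."""
    best = current = 0
    for score in scores:
        current = current + 1 if score >= cutoff else 0
        best = max(best, current)
    return best


def percent_with_long_run(proteins, min_len=30, cutoff=0.5):
    """Percentage of proteins whose longest disordered run is at least min_len residues."""
    hits = [longest_disordered_run(p, cutoff) >= min_len for p in proteins]
    return 100.0 * sum(hits) / len(hits)


def bootstrap_ci(proteins, min_len=30, cutoff=0.5, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the percentage, resampling proteins with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Classify each protein once; resampling then only reshuffles these flags.
    hits = [longest_disordered_run(p, cutoff) >= min_len for p in proteins]
    stats = sorted(
        100.0 * sum(rng.choices(hits, k=len(hits))) / len(hits)
        for _ in range(n_boot)
    )
    lower = stats[round((alpha / 2) * (n_boot - 1))]
    upper = stats[round((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper
```

Repeating the call with `min_len` set to 30, 40, and so on up to 100 would give one percentage and one interval per length class.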
Figure 11
IDPs as drug targets. Protein-protein interactions involving α-helical or β-strand portions of the partners are used to design small molecules for cancer drugs. A. A ribbon diagram of the complex of β-catenin (light colors) and T cell factor (red) was regenerated from PDB 1G3J. The structure of β-catenin consists of 12 tri-helical repeats (except repeat 7, which has just two helical units). Small molecules from a natural-product library were screened and a couple of inhibitors were found. However, the binding sites for the small molecule inhibitors were not clear. B. A ribbon diagram of the complex of MDM2 (green) and a p53 fragment (red) was regenerated from PDB 1YCR. Small molecule inhibitors were designed based on the structure of the helical fragment of p53. C. A ribbon diagram of the complex of Bcl-xL (green) and a BAK fragment (red) was regenerated from PDB 1BXL. Small molecules were designed based on the 20-residue helix of BAK to inhibit the interaction. D. A ribbon diagram of the complex of XIAP (green) and a Smac fragment (red) was regenerated from PDB 1G3F. Small molecule inhibitors were designed based on the β-strand fragment (AVPIAQKSE) of Smac.

References

    1. Landsteiner K. The specificity of serological reactions. Baltimore: C. C. Thomas; 1936.
    2. Pauling L. A Theory of the Structure and Process of Formation of Antibodies. J Am Chem Soc. 1940;62:2643–2657.
    3. Dunker AK, Garner E, Guilliot S, Romero P, Albrecht K, Hart J, Obradovic Z, Kissinger C, Villafranca JE. Protein disorder and the evolution of molecular recognition: theory, predictions and observations. Pac Symp Biocomput. 1998:473–484.
    4. Karush F. Heterogeneity of the binding sites of bovine serum albumin. J Am Chem Soc. 1950;72:2705–2713.
    5. Spolar RS, Record MT Jr. Coupling of local folding to site-specific binding of proteins to DNA. Science. 1994;263:777–784.
