J Mol Biol. 2022 Apr 15;434(7):167517.
doi: 10.1016/j.jmb.2022.167517. Epub 2022 Feb 28.

Conformational Variation in Enzyme Catalysis: A Structural Study on Catalytic Residues

Ioannis G Riziotis et al. J Mol Biol.

Abstract

Conformational variation in catalytic residues can be captured as alternative snapshots in enzyme crystal structures. Addressing the question of whether active site flexibility is an intrinsic and essential property of enzymes for catalysis, we present a comprehensive study on the 3D variation of active sites of 925 enzyme families, using explicit catalytic residue annotations from the Mechanism and Catalytic Site Atlas and structural data from the Protein Data Bank. Through weighted pairwise superposition of the functional atoms of active sites, we captured structural variability at the single-residue level and examined the geometrical changes as ligands bind or as mutations occur. We demonstrate that catalytic centres of enzymes can be inherently rigid or flexible to various degrees according to the function they perform, and that structural variability most often involves a subset of the catalytic residues, usually those not directly involved in the formation or cleavage of bonds. Moreover, the data suggest that two-thirds of active sites are flexible, and in half of those, flexibility is only observed in the side chain. The goal of this work is to characterise our current knowledge of the extent of flexibility at the heart of catalysis and ultimately place our findings in the context of the evolution of catalysis as enzymes evolve new functions and bind different substrates.
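
As an illustration of what a weighted pairwise superposition involves, the following is a minimal sketch, not the authors' implementation. It assumes a standard weighted Kabsch fit with NumPy; the function name `weighted_superpose_rmsd`, the per-atom weights and the example coordinates are all hypothetical, and the weighting scheme actually used in the study is not given in the abstract.

```python
import numpy as np


def weighted_superpose_rmsd(ref, mob, weights):
    """Fit `mob` onto `ref` (both N x 3) with per-atom weights and
    return the weighted RMSD after optimal rotation and translation.

    Hypothetical helper: a generic weighted Kabsch fit, not the
    procedure described in the paper.
    """
    ref = np.asarray(ref, dtype=float)
    mob = np.asarray(mob, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to 1

    # Remove the weighted centroids (optimal translation).
    ref0 = ref - (w[:, None] * ref).sum(axis=0)
    mob0 = mob - (w[:, None] * mob).sum(axis=0)

    # Weighted 3 x 3 covariance matrix between the two atom sets.
    cov = (mob0 * w[:, None]).T @ ref0
    u, _, vt = np.linalg.svd(cov)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    fitted = mob0 @ rot.T
    sq_dev = ((fitted - ref0) ** 2).sum(axis=1)
    return float(np.sqrt((w * sq_dev).sum()))


# Made-up coordinates: the second set is the first one rotated by
# 90 degrees about z and translated, so the fitted RMSD should be ~0.
site_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
site_b = site_a @ rot_z.T + np.array([1.0, 2.0, 3.0])
rmsd_same = weighted_superpose_rmsd(site_a, site_b, [1.0, 1.0, 2.0, 2.0])
```

Giving some atoms larger weights makes the fit favour them, which is one way a superposition could be focused on the functional atoms of each residue.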

Keywords: Catalysis; Catalytic residues; Enzyme; Flexibility; Structure.

Conflict of interest statement

Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Figure 1
Pairwise comparison of the reference enzyme in each M-CSA family with all its homologues, in sequence and structure (30,859 data points in each plot). a and b: The left part of each panel shows the percentage of overall sequence similarity vs. RMSD over all Cα atoms (a) and the percentage of overall sequence similarity vs. RMSD over active site Cα atoms (b). The right part illustrates an example comparison of two homologous enzymes in superposition over all Cα atoms (a) and over the Cα atoms of their active sites (b). The corresponding data points on the plots (left part) are indicated by red circles. c: RMSD over all Cα atoms vs. RMSD over active site Cα atoms. d: RMSD over active site functional atoms vs. RMSD over active site Cα atoms (refer to text for functional atoms definitions). In all plots, data points corresponding to enzymes whose active site is composed of residues belonging to multiple chains are indicated by a different colour, and the distributions of data points on each axis are shown as inline histograms. 3D structure models were prepared in PyMOL.
Figure 2
Conformational variation between active site pairs of homologous enzymes, expressed as the Root Mean Square Deviation of their functional groups’ atomic positions. a and b: Distributions of the average pairwise active site RMSD of homologous enzyme families, grouped by the presence of ligands in each active site comparison pair. Panels a and b refer to pairs of enzymes that have the same or different UniProt mapping respectively. Active site pair groups that have the same or different catalytic residues (e.g. by mutations in the PDB structure) are coloured differently (see legend). The mean and median RMSD values are marked by a horizontal line and a white circle respectively. A simple linear regression showing differences in the mean of each group is drawn as a red line. The size of each group is labelled above each boxplot. c: Distributions of functional atoms RMSD when the same (upper panel) or different ligands bind to the active site (lower panel). The dataset used in this plot includes only enzymes that have identical UniProt mapping and the same catalytic residues. Red dashed lines indicate the mean RMSD value, while green dotted lines indicate the RMSD values of the major peaks, with their difference being labelled. d: Distributions of functional atoms RMSD of ligand-free active site pairs from the same protein and with the same catalytic residues, grouped by the number of catalytic residues.
Figure 3
Enrichment analysis of conformational variation according to residue type and functional role in the pairwise active site superpositions dataset. a and b: Single residue RMSD distributions in various residue types (a) and functional roles (b). Red dashed lines indicate the 0.5 Å RMSD cut-off used to define residues as “rigid” or “flexible”. c-e: Ratios of enrichment frequencies of each residue type (c), functional role (d) and role category (e) in two subsets of residue RMSD observations, one containing only residues defined as “rigid” and the other only the “flexible” ones. The odds ratio of the frequencies represents the propensity to be flexible or rigid. A ratio > 1 indicates that this specific type/role/category tends to be more variable in 3D space, and thus more flexible, while a ratio < 1 indicates a higher potential to be rigid. The dataset used here is sampled (50 superpositions of conserved and identical active sites from the same protein per M-CSA homologous family). After enrichment and grouping, only groups occupying at least 2% of the sampled dataset are reported.
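
The rigid/flexible enrichment described in the Figure 3 caption can be sketched roughly as follows. This is a hedged illustration only: the caption does not give the exact formula, so the ratio here is assumed to be the relative frequency of a label among "flexible" observations divided by its relative frequency among "rigid" ones. The function name `flexibility_ratios`, the choice to treat an RMSD of exactly 0.5 Å as rigid, and the example values are all assumptions.

```python
from collections import Counter

RMSD_CUTOFF = 0.5  # Å, the rigid/flexible cut-off quoted in the caption


def flexibility_ratios(observations, cutoff=RMSD_CUTOFF):
    """Return {label: ratio} from (label, rmsd) observations.

    ratio = frequency of the label in the flexible subset divided by
    its frequency in the rigid subset (assumed definition).
    Labels with no rigid observations are skipped (ratio undefined).
    """
    flexible, rigid = Counter(), Counter()
    for label, rmsd in observations:
        # Assumption: values above the cut-off count as flexible.
        (flexible if rmsd > cutoff else rigid)[label] += 1

    n_flex = sum(flexible.values())
    n_rigid = sum(rigid.values())
    ratios = {}
    if n_flex == 0 or n_rigid == 0:
        return ratios  # one subset is empty, nothing to compare
    for label in set(flexible) | set(rigid):
        if rigid[label] == 0:
            continue
        ratios[label] = (flexible[label] / n_flex) / (rigid[label] / n_rigid)
    return ratios


# Made-up per-residue RMSD observations for two residue types.
example = [("HIS", 0.2), ("HIS", 0.3), ("HIS", 0.9),
           ("LYS", 0.8), ("LYS", 1.2), ("LYS", 0.1)]
ratios = flexibility_ratios(example)
```

With these made-up values, the hypothetical "LYS" label comes out above 1 (more often flexible) and "HIS" below 1 (more often rigid), matching the reading of the ratio given in the caption.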
Figure 4
a: Different structural paradigms of active site structural behaviour (inherently rigid, inherently flexible, open/closed and extensively variable), presented here as orange schematic histograms. For each paradigm, conserved active sites from a representative M-CSA enzyme family are shown in superposition, with the M-CSA identification number of each example entry annotated. Two distinct active site conformations in the open/closed type are indicated by a bidirectional arrow. 3D models were prepared in PyMOL. b: Examples of M-CSA families adhering to each paradigm. Plots on the left are histograms of all-vs-all functional atoms RMSD over the whole active site, and on the right, the RMSD distributions are plotted on a per-residue basis. Only conserved active sites were used to generate these distributions.
Figure 5
Active site structural analyses summary for P450cams. a: Conserved active sites from homologous enzymes are shown in superposition over the reference active site. Clusters of aligned catalytic residues are shown as distinctly coloured sticks and the structure of the reference enzyme of the family (PDB ID: 1YRC) is shown in cartoon representation in the background. Ligands bound in the reference are shown as sticks and labelled with their PDB three-letter code. Residue groups forming individual modules are circled with a dashed line. b: All-vs-all functional atoms RMSD matrix of conserved active sites. Darker and lighter colours, as shown in the scale to the left of the matrix, indicate higher and lower structural similarity respectively. Clustering is performed by pruning the hierarchical dendrogram shown on the left of the matrix at a height indicated with a dashed line. Some representative members of each major cluster are overlaid on the matrix as sticks (side chain only) and lines (all atoms) for catalytic residues and bound ligands respectively. c: Secondary structure preferences for each residue position as assigned by DSSP (H: α-helix, E: β-strand, B: β-bridge, T: turn, S: bend, I: π-helix, C: random coil), presented in the form of a frequency pseudo-sequence logo. d: Distributions of the Miller solvent accessibility for each residue position. e: Functional atoms RMSD for the whole active site. f: Functional atoms RMSD distributions for each residue. g: Relative frequency of ligands bound in active sites within each cluster, expressed as a “word cloud” of PDB three-letter codes. Each three-letter code is scaled so it reflects the relative frequency of the corresponding compound in the active sites of the cluster (larger-sized codes indicate a more frequently bound ligand within the cluster, while smaller-sized codes indicate the opposite). Ligand-free active sites are also counted and shown in the plots with the word “FREE”.
Figure 6
Active site structural analyses summary for peptidyl dipeptidases. a: Conserved active sites from homologous enzymes are shown in superposition over the reference active site. Clusters of aligned catalytic residues are shown as distinctly coloured sticks and the structure of the reference enzyme of the family (PDB ID: 1O8A) is shown in cartoon representation in the background (N and C letters in parentheses indicate the subdomain to which the residues belong). Ligands bound in the reference are shown as sticks and labelled with their PDB three-letter code. Residue groups forming individual modules are circled with a dashed line. b: All-vs-all functional atoms RMSD matrix of conserved active sites. Darker and lighter colours, as shown in the scale to the left of the matrix, indicate higher and lower structural similarity respectively. Clustering is performed by pruning the hierarchical dendrogram shown on the left of the matrix at a height indicated with a dashed line. Some representative members of each major cluster are overlaid on the matrix as sticks (side chain only) and lines (all atoms) for catalytic residues and bound ligands respectively. c: Secondary structure preferences for each residue position as assigned by DSSP (H: α-helix, E: β-strand, B: β-bridge, T: turn, S: bend, I: π-helix, C: random coil), presented in the form of a frequency pseudo-sequence logo. d: Functional atoms RMSD for the whole active site. e: Functional atoms RMSD distributions for each residue. f: Relative frequency of ligands bound in active sites within each cluster, expressed as a “word cloud” of PDB three-letter codes. Each three-letter code is scaled so it reflects the relative frequency of the corresponding compound in the active sites of the cluster (larger-sized codes indicate a more frequently bound ligand within the cluster, while smaller-sized codes indicate the opposite). Ligand-free active sites are also counted and shown in the plots with the word “FREE”. Polymeric (protein or nucleic acid) bound ligands are shown with the word “POLYMER”.
Figure 7
Active site structural analyses summary for the α-subunit of G-proteins. a: Conserved active sites from homologous enzymes are shown in superposition over the reference active site. Clusters of aligned catalytic residues are shown as distinctly coloured sticks and the structure of the reference enzyme of the family (PDB ID: 1BH2) is shown in cartoon representation in the background. Ligands bound in the reference are shown as sticks and labelled with their PDB three-letter code. Residue groups forming individual modules are circled with a dashed line. b: All-vs-all functional atoms RMSD matrix of conserved active sites. Darker and lighter colours, as shown in the scale to the left of the matrix, indicate higher and lower structural similarity respectively. Clustering is performed by pruning the hierarchical dendrogram shown on the left of the matrix at a height indicated with a dashed line. Some representative members of each major cluster are overlaid on the matrix as sticks (side chain only) and lines (all atoms) for catalytic residues and bound ligands respectively. c: Secondary structure preferences for each residue position as assigned by DSSP (H: α-helix, E: β-strand, B: β-bridge, T: turn, S: bend, I: π-helix, C: random coil), presented in the form of a frequency pseudo-sequence logo. d: Functional atoms RMSD for the whole active site. e: Functional atoms RMSD distributions for each residue. f: Relative frequency of ligands bound in active sites within each cluster, expressed as a “word cloud” of PDB three-letter codes. Each three-letter code is scaled so it reflects the relative frequency of the corresponding compound in the active sites of the cluster (larger-sized codes indicate a more frequently bound ligand within the cluster, while smaller-sized codes indicate the opposite).
