Front Mol Biosci. 2023 Jan 4:9:1080836.
doi: 10.3389/fmolb.2022.1080836. eCollection 2022.

Structural and evolutionary insights into astacin metallopeptidases

F Xavier Gomis-Rüth et al. Front Mol Biosci.

Abstract

The astacins are a family of metallopeptidases (MPs) that has been extensively described in animals. They are multidomain extracellular proteins, which have a conserved core architecture encompassing a signal peptide for secretion, a prodomain or prosegment and a zinc-dependent catalytic domain (CD). This constellation is found in the archetypal name-giving digestive enzyme astacin from the European crayfish Astacus astacus. Astacin catalytic domains span ∼200 residues and consist of two subdomains that flank an extended active-site cleft. They share several structural elements including a long zinc-binding consensus sequence (HEXXHXXGXXH) immediately followed by an EXXRXDRD motif, which features a family-specific glutamate. In addition, a downstream SIMHY-motif encompasses a "Met-turn" methionine and a zinc-binding tyrosine. The overall architecture and some structural features of astacin catalytic domains match those of other more distantly related MPs, which together constitute the metzincin clan of metallopeptidases. We further analysed the structures of PRO-, MAM, TRAF, CUB and EGF-like domains, and described their essential molecular determinants. In addition, we investigated the distribution of astacins across kingdoms and their phylogenetic origin. Through extensive sequence searches we found astacin CDs in >25,000 sequences down the tree of life from humans beyond Metazoa, including Choanoflagellata, Filasterea and Ichthyosporea. We also found <400 sequences scattered across non-holozoan eukaryotes including some fungi and one virus, as well as in selected taxa of archaea and bacteria that are pathogens or colonizers of animal hosts, but not in plants. Overall, we propose that astacins originate in the root of Holozoa consistent with Darwinian descent and that the latter genes might be the result of horizontal gene transfer from holozoan donors.

Keywords: catalytic domain (CD); Darwinian descent; evolution of metallopeptidases; horizontal gene transfer (HGT); phylogeny of enzymes.
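
The sequence signature described in the abstract (HEXXHXXGXXH immediately followed by EXXRXDRD, with a SIMHY motif downstream) can be illustrated as a simple pattern scan. The following is a minimal sketch and not the authors' search procedure: it assumes that X stands for any residue and that all three motifs occur exactly as written, whereas real family members may deviate from the consensus. The function names and the test sequence are hypothetical, and the sequence is a synthetic string rather than a real protein.

```python
import re

# Consensus strings as written in the abstract; "X" is taken to mean any residue.
ZINC_MOTIF = "HEXXHXXGXXH"   # zinc-binding consensus with three histidines
FAMILY_MOTIF = "EXXRXDRD"    # starts with the family-specific glutamate
MET_TURN = "SIMHY"           # Met-turn methionine and zinc-binding tyrosine


def consensus_to_regex(consensus):
    """Turn a consensus string into a regex, mapping X to a wildcard."""
    return "".join("." if c == "X" else re.escape(c) for c in consensus)


# The family motif must follow the zinc motif directly; the Met-turn may lie
# any distance downstream (assumption: no upper bound on the gap is enforced).
ASTACIN_RE = re.compile(
    "(?P<zinc>" + consensus_to_regex(ZINC_MOTIF) + ")"
    "(?P<family>" + consensus_to_regex(FAMILY_MOTIF) + ")"
    ".*?"
    "(?P<met>" + consensus_to_regex(MET_TURN) + ")"
)


def find_astacin_signature(seq):
    """Return 0-based start positions of the three motifs, or None if absent."""
    match = ASTACIN_RE.search(seq.upper())
    if match is None:
        return None
    return {name: match.start(name) for name in ("zinc", "family", "met")}


# Synthetic toy sequence (not a real protein) built to contain all three motifs.
toy = "AAA" + "HEAAHAAGAAH" + "EAARADRD" + "GGGGG" + "SIMHY" + "AAA"
print(find_astacin_signature(toy))
```

A strict scan of this kind would miss sequences that diverge from the consensus at any fixed position, so it is best read as an illustration of how the motifs are arranged rather than as a search tool.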

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Astacin domain combinations and phylogenetic occurrence. Metazoan astacins minimally comprise an N-terminal signal peptide for extracellular secretion (S), a propeptide or prodomain conferring latency (PRO) and a catalytic zinc-dependent metallopeptidase domain (CD). Moreover, most astacins contain additional domains, listed with Prosite database codes (PS; https://prosite.expasy.org): ABC (ABC-transporter; PS00211), CD (catalytic protease domain; PS51864), C (cytoplasmic tail), CLEC (C-type lectin; PS50041), CUB (found in complement subcomponents C1r/C1s, Uegf and BMP1; PS01180), EGF (epidermal growth factor-like; PS00022), FA58C (factor-5/8 type-C domain; PS50022), FN3 (fibronectin type-III domain; PS50853), HYR (hyalin repeat protein; PS50825), I (intervening domain in meprin α, contains a furin cleavage site), IG (immunoglobulin-like; PS50835), KRING (kringle; PS00021), LC (low complexity domain, disordered), LCCL (Limulus C-domain; PS50820), LYSM (extracellular receptor domain; PS51782), MAM (found in meprin, A5 protein and receptor protein tyrosine phosphatase μ; PS00740), MATH (meprin and TRAF homology domain; PS50144), PAN (also dubbed APPLE; found in plasma kallikrein and factor XI; PS50948), PLAC (polycystin-1, lipoxygenase and α-toxin; PS50095), PTX (pentraxin; PS51828), RICIN (ricin-type lectin; PS50231), SH2 (Src homology 2 domain; PS50001), ShKT (K+-channel-blocking Stichodactyla helianthus toxin; PS51670), SMB (somatomedin; PS50958), SRCR (cysteine-rich scavenger receptor; PS50287), SUSHI (sushi adhesion domain; PS50923), TPR (tetratricopeptide repeat; PS50005), TSP (thrombospondin-like domain; PS50092), VWF (von-Willebrand-factor domain; PS50234), ZF-UBR (UBR-type zinc-finger; PS51157) and ZP2 (zona pellucida protein 2 domain; PS51034). On the left, typical astacin family members are listed, for which physiological functions are documented (see Supplementary Table S1 for complete protein and gene names, and UniProt accession codes). Phylogenetic occurrences are indicated on the right.
FIGURE 2
Representative structures of the most relevant astacin domains. (A) Ribbon-type plot of the mature Astacus astacus crayfish astacin catalytic domain [PDB 1AST; residues 50–251, see UniProt P07584; (Bode et al., 1992; Gomis-Rüth et al., 1993)], which is shown in the standard orientation of MPs [left; (Gomis-Rüth et al., 2012b)] and vertically rotated by 90 degrees (right). Regular secondary structure elements are shown as yellow β-strands (β1–β7) and aquamarine α-helices (αA–αC). The first five strands constitute the typical five-stranded β-sheet of astacins (Gomis-Rüth et al., 2012a) and the helices are dubbed “backing helix” (αA), “active-site helix” (αB) and “C-terminal helix” (αC). The latter is split in two by a kink. Unbound mature astacin has its catalytic zinc cation (magenta sphere) bound in trigonal-bipyramidal coordination by the three histidines (①–③) of a characteristic zinc-binding motif [HEXXHXXGXXH; (Bode et al., 1993)] plus a more distal downstream tyrosine (④) and the catalytic solvent molecule [small red sphere; (Arolas et al., 2018)]. The glutamate within the motif (⑤) is the general base/acid for catalysis (Arolas et al., 2018). The “Met-turn” with the conserved methionine [⑥; (Bode et al., 1993; Tallant et al., 2010)] is shown as an orange ribbon. The mature N-terminal residue (labelled N) is bound to the family-specific glutamate (E103) [⑦; (Gomis-Rüth, 2003)] after the third zinc-binding histidine. The C-terminus is also labelled (C) and the two disulfide bonds of the structure (C42–C198 and C64–C84) are further displayed with sulphur atoms in green. (B) The structure of the unique EGF-like domain of human BMP1 predicted with AlphaFold (Jumper et al., 2021) shows two β-ribbons and three disulfide bonds. Two orthogonal orientations are displayed. (C) Experimental structure of the MAM domain of meprin β [PDB 4GWM; (Arolas et al., 2012)] in two orthogonal orientations. 
The β-sandwich domain (residues 259–427, see UniProt Q16820) features two disulfide bonds and a structural sodium cation (blue sphere) octahedrally coordinated by six protein oxygens. (D) Structure of the first CUB domain of human BMP1 predicted with AlphaFold in two orthogonal orientations, which show a β-sandwich architecture with two disulfide bonds. (E) Experimental structure of the TRAF domain of meprin β [PDB 4GWM; (Arolas et al., 2012)] in two orthogonal orientations. The β-sandwich domain (residues 428–597, see UniProt Q16820) has two short helices and a β-ribbon grafted into strand-connecting loops. (F–I) Experimental zymogen structures as Cα-traces in standard orientation (top panels) and after a vertical 90-degree rotation (bottom panels) of (F) crayfish astacin [PDB 3LQ0; (Guevara et al., 2010)], (G) human meprin β [PDB 4GWM; (Arolas et al., 2012)], (H) myroilysin from the bacterium Myroides sp. [PDB 5GWD; (Xu et al., 2017)] and (I) astacin from the horseshoe crab Limulus polyphemus [PDB 8A28; (Guevara et al., 2022)]. Only the PROs (aquamarine) and CDs (sandy brown) are displayed for clarity, together with the catalytic zinc ions (magenta spheres) and the side chains of the respective aspartate/cysteine-switch residue.
FIGURE 3
Classification of holozoans. Dendrogram depicting the hierarchical clustering of phyla within holozoans proposed herein, assembled from current literature (Ryan et al., 2010; Ruggiero et al., 2015; Torruella et al., 2015; Cannon et al., 2016; Lu et al., 2017; Sebé-Pedrós et al., 2017; Whelan et al., 2017; Adl et al., 2019; Giribet et al., 2019; Laumer et al., 2019; Marlétaz et al., 2019; Sogabe et al., 2019; Hickman et al., 2020; Schoch et al., 2020; Schulze and Kawauchi, 2021). Phylum Chordata is further divided into its constituent subphyla Vertebrata, Tunicata/Urochordata and Cephalochordata. The first two together form Olfactores.
FIGURE 4
Sequence alignment of 20 astacins representing different animal phyla excerpted from the sets shown in Figure 5, Table 1 and Supplementary Table S1. In the headline, “pro” labels three residues of the prodomain with the conserved aspartate residue (underlaid green) responsible for latency of the proenzyme, which is absent from myroilysin. The red asterisks below “act” indicate the site of proteolytic activation (maturation). Cysteines are underlaid yellow. “Zinc” labels the three zinc-binding histidines (underlaid blue), the catalytically essential glutamate (Arolas et al., 2018) following the first histidine is underlaid magenta, and the family-specific glutamate after the third histidine is underlaid green. Its negatively charged sidechain forms a salt bridge with the positively charged mature amino terminus after activation. Also labelled is the Met-turn (“Met”, underlaid black) with the tyrosine zinc ligand underlaid blue. Finally, S1’ indicates the region shaping the binding pocket of the P1’ residue of substrates within the catalytic cleft (Schechter and Berger, 1967; Gomis-Rüth et al., 2012b). Sequences: MEPβ HOMSA (Homo sapiens, human, phylum Chordata, subphylum Vertebrata), MEPβ PETMA (Petromyzon marinus, lamprey, phylum Chordata, subphylum Vertebrata), SMD SACKO (Saccoglossus kowalevskii, acorn worm, phylum Hemichordata), CUB ASTRU (Asterias rubens, sea star, phylum Echinodermata), LysM RAMVA (Ramazzottius varieornatus, water bear, phylum Tardigrada), TLD DROME (Drosophila melanogaster, fruit fly, phylum Arthropoda, subphylum Hexapoda), LASTMAM LIMPO (Limulus polyphemus, horseshoe crab, phylum Arthropoda, subphylum Chelicerata), AST ASTAS (Astacus astacus, crayfish, phylum Arthropoda, subphylum Crustacea), HCH1 CAEEL (Caenorhabditis elegans, nematode, phylum Nematoda), ShKT PRICA (Priapulus caudatus, cactus worm, phylum Priapulida), CUBMAM BUGNE (Bugula neritina, common bugula, phylum Bryozoa), ShKT LINUN (Lingula unguis, lamp shell, phylum Brachiopoda), AST DIMGY (Dimorphilus gyrociliatus, phylum Annelida), MAMEGF BRAPC (Brachionus plicatilis, rotifer, phylum Rotifera), ShKT8 MYTCO (Mytilus coruscus, Korean mussel, phylum Mollusca), ShKT SCHMD (Schmidtea mediterranea, triclad flatworm, phylum Platyhelminthes), ASTLD MNELE (Mnemiopsis leidyi, sea walnut, phylum Ctenophora), HAS7 HYDVU (Hydra vulgaris, hydra, phylum Cnidaria), CUBEGF TRIAD (Trichoplax adhaerens, flat-bodied animal, phylum Placozoa) and IG4 AMPQE (Amphimedon queenslandica, sponge, phylum Porifera). In addition, an astacin xenologue from the bacterium Myroides sp., the only known astacin with a proven cysteine switch activation mechanism (Xu et al., 2017; Ran et al., 2020), was further included for comparison (MYR MYRSP).
FIGURE 5
Phylogenetic tree based on the catalytic domains of a selection of 147 astacins. The species and their UniProt and GenBank accession numbers are listed in Supplementary Table S1. The asterisk in the top right quadrant indicates the position of the prototypical name-giving enzyme astacin from the crayfish Astacus astacus. Crayfish astacin is translated with a signal peptide for extracellular targeting, a prodomain conferring latency and a catalytic protease domain (see also Figure 1). The domain compositions of astacins consisting merely of these three domains are omitted for clarity. Astacin-like proteases with more complex domain structures are shown schematically. A detailed list of domains with Prosite database accession numbers is contained in Figure 1 and in Supplementary Table S1.

References

    1. Adl S. M., Bass D., Lane C. E., Lukes J., Schoch C. L., Smirnov A., et al. (2019). Revisions to the classification, nomenclature, and diversity of eukaryotes. J. Eukaryot. Microbiol. 66 (1), 4–119. doi: 10.1111/jeu.12691
    2. Algrain M., Hennebert E., Bertemes P., Wattiez R., Flammang P., Lengerer B. (2022). In the footsteps of sea stars: Deciphering the catalogue of proteins involved in underwater temporary adhesion. Open Biol. 12 (8), 220103. doi: 10.1098/rsob.220103
    3. Altschul S. F., Koonin E. V. (1998). Iterated profile searches with PSI-BLAST--a tool for discovery in protein databases. Trends Biochem. Sci. 23 (11), 444–447. doi: 10.1016/s0968-0004(98)01298-5
    4. AmbuAli A., Monaghan S. J., McLean K., Inglis N. F., Bekaert M., Wehner S., et al. (2020). Identification of proteins from the secretory/excretory products (SEPs) of the branchiuran ectoparasite Argulus foliaceus (Linnaeus, 1758) reveals unique secreted proteins amongst haematophagous ecdysozoa. Parasit. Vectors 13 (1), 88. doi: 10.1186/s13071-020-3964-z
    5. Aricescu A. R., Hon W. C., Siebold C., Lu W., van der Merwe P. A., Jones E. Y. (2006). Molecular analysis of receptor protein tyrosine phosphatase μ-mediated cell adhesion. EMBO J. 25 (4), 701–712. doi: 10.1038/sj.emboj.7600974