Elife. 2024 Aug 7:13:e94948.
doi: 10.7554/eLife.94948.

Insights into early animal evolution from the genome of the xenacoelomorph worm Xenoturbella bocki

Philipp H Schiffer et al. Elife. 2024.

Abstract

The evolutionary origins of Bilateria remain enigmatic. One of the more enduring proposals highlights similarities between a cnidarian-like planula larva and simple acoel-like flatworms. This idea is based in part on the view of the Xenacoelomorpha as an outgroup to all other bilaterians which are themselves designated the Nephrozoa (protostomes and deuterostomes). Genome data can provide important comparative data and help understand the evolution and biology of enigmatic species better. Here, we assemble and analyze the genome of the simple, marine xenacoelomorph Xenoturbella bocki, a key species for our understanding of early bilaterian evolution. Our highly contiguous genome assembly of X. bocki has a size of ~111 Mbp in 18 chromosome-like scaffolds, with repeat content and intron, exon, and intergenic space comparable to other bilaterian invertebrates. We find X. bocki to have a similar number of genes to other bilaterians and to have retained ancestral metazoan synteny. Key bilaterian signaling pathways are also largely complete and most bilaterian miRNAs are present. Overall, we conclude that X. bocki has a complex genome typical of bilaterians, which does not reflect the apparent simplicity of its body plan that has been so important to proposals that the Xenacoelomorpha are the simple sister group of the rest of the Bilateria.

Keywords: Deuterostomia; Xenacoelomorpha; Xenoturbella; Xenoturbella bocki; animal evolution; evolutionary biology; genetics; genome analysis; genomics.

Plain language summary

Xenoturbella bocki is a small marine worm predominantly found on the seafloor of fjords along the west coast of Sweden. This simple organism’s unusual evolutionary history has long intrigued zoologists as it is not clear how it is related to other animal groups. The worm may belong to one of the earliest branches of the animal kingdom, which would explain its simple body. On the other hand, it could be related to a more complex group, the deuterostomes, which includes a wide range of animals, from mammals and birds to sea urchins and starfish. Understanding X. bocki’s evolution could provide valuable insights into how bilaterians evolved as a whole. Unlike its close relatives, the acoelomorphs, X. bocki evolves more slowly, which makes it simpler to study its genome. As a result, it serves as a starting point for investigating the evolutionary processes and genetics underpinning the broader group of bilaterians. To better understand the evolution of X. bocki’s simple body, Schiffer et al. asked whether its genome is simpler or differs in other ways from that of more complex bilaterian organisms. Sequencing the entire X. bocki genome revealed that it has a similar number of genes to that of other animals and includes the genes required for complex biochemical pathways. Reconstructing the worm’s chromosomes – the structures that house genetic information – showed that the X. bocki genes are also distributed in a manner similar to those in other animals. The findings suggest that, despite its simple body plan, X. bocki has a complex genome that is typical of bilaterians. This challenges the idea that X. bocki belongs to a more primitive, simplified sister group to Bilateria and provides a starting point for further studies of how this simple worm evolved.

Conflict of interest statement

PS, PN, DL, HR, FL, FM, BF, LB, FS, EH, AZ, PK, KH, SM, MM, HM, RC, RK, PS, MT No competing interests declared

Figures

Figure 1. Schematic drawings of X. bocki showing the simple body organization of the marine vermiform animal.
ant, anterior; post, posterior; lf, lateral furrow; rf, ring furrow; m, mouth opening.
Figure 2. A comparison of the total length of exons, introns, and intergenic space in the X. bocki genome with other metazoans (data from Francis and Wörheide, 2017).
X. bocki does not appear to be an outlier in any of these comparisons.
Figure 3. X. bocki harbors a marine Chlamydiae species as a potential symbiont.
In the phylogenetic analysis of 16S rDNA (ML: GTR + F + R7; bootstrap values included), the bacteria in our X. bocki isolate (arrow) are sister to a previous isolate from X. westbladi. X. westbladi is most likely a misidentification of X. bocki.
Figure 4. A phylogeny based on the presence and absence of genes calculated with OMA.
Both analyses (a) and (b) confirm Xenambulacraria, that is, Xenoturbellida in a group with echinoderms and hemichordates. Inclusion of the acoel flatworms places these as sister to all other Bilateria (b). This placement appears to be an artifact of the very fast evolution in this taxon, particularly as good evidence exists for uniting Xenoturbellida and Acoela (Philippe et al., 2019; Cannon et al., 2016; Rouse et al., 2016; Srivastava et al., 2014; Philippe et al., 2011; Bourlat et al., 2006; Ueki et al., 2019).
Figure 5. In our orthology screen, X. bocki shows similar percentages of genes in orthogroups, unassigned genes, and species-specific orthogroups as other well-annotated genomes.
Figure 6. The heatmaps show a comparative measure of relative completeness of signaling pathways based on KEGG and assessed with GenomeMaple, or abundance of genes in a given gene family based on InterProScan annotations.
(a) Cell signaling pathways in X. bocki are functionally complete, but in comparison to other species contain fewer genes. The overall completeness is not significantly different to, for example, the nematode C. elegans (inset, t-test). (b) The number of family members per species in major gene families (based on Pfam domains), like transcription factors, fluctuates in evolution. The X. bocki genome does not appear to contain notably fewer or more genes in any of the analyzed families. Due to the comparative nature of the assay, no ‘true’ scale can be given: darker colors indicate higher comparative completeness. Schematic cladograms are drawn by the authors.
Figure 7. X. bocki has five HOX genes, which are located in relatively close proximity on one of our chromosome-size scaffolds.
Similar clusters exist for the ParaHox and ‘pharyngeal’ genes. Numbers between genes are distance (below) and number of genes between (below). Colors indicate gene families. Red box marks the position of a partial Hox gene. The ‘?’ gene has an unresolved homeodomain identity.
Figure 8. The X. bocki genome contains genes for most bilaterian-specific peptidergic systems and a prokineticin gene containing a signature sequence shared with Ambulacraria.
(a) Sequence alignment of cnidarian Colipase-like protein, ecdysozoan Astakine-like protein, and spiralian, chordate, and xenacoelomorph Prokineticin-like proteins shows conserved cysteine positions (highlighted by red triangle), as well as clade-specific signature sequences, among which a “K/R-RFP-K/R” sequence is shared only by ambulacrarians and X. bocki. The signatures previously reported for Ecdysozoa and Chordata, as well as new signatures we found in Spiralia and Cnidaria, are absent from ambulacrarian and X. bocki prokineticin ligand sequences. Sequences are available as Figure 8—source data 1; alignment files are available at https://doi.org/10.5281/zenodo.6962271. (b) Peptidergic systems found in Xenoturbella (X), Nemertodermatida (N) and Acoelomorpha (A). Novel findings are highlighted in the top right inset. Color of schemes and inset cladogram nodes on grey background depicts the evolutionary origin of peptidergic systems in accordance with our analysis: bilaterian, protostomian, chordate, xenacoelomorph + ambulacrarian last common ancestors respectively. 7B2, Neuroendocrine protein 7B2; AKH, adipokinetic hormone; Asta-A, Allatostatin-A; Asta-C, Allatostatin-C; AVP, arginine vasopressin; AVT, Arginine vasotocin; CCAP, crustacean cardioactive peptide; CCHa, CCHamide peptide; CCK, cholecystokinin; CRF, Corticotropin-releasing factor; DH31, diuretic hormone 31; DH44, diuretic hormone 44; EH, eclosion hormone; GlycH A5, Glycoprotein Hormone alpha5; GlycH B2, Glycoprotein Hormone beta2; GnRH, Gonadotropin Releasing Hormone; GPR54, G Protein-Coupled Receptor 54; GPR83, G Protein-Coupled Receptor 83; ILP, Insulin-like peptide; Kiss, Kisspeptine; MCH, melanin concentrating hormone; Nmn-B, Neuromedin B; Np-S, Neuropeptide S; NP-Y/F, Neuropeptide Y/F; NucB2, nucleobindin 2; PDF, Pigment-dispersing factor; PEN, neuroendocrine peptide PEN; PTTH, Prothoracicotropic hormone; RYa, RYamide peptide; t-FMRFa, trochozoan-FMRFamide peptide.
Figure 8—figure supplement 1. Radial tree representation of the phylogenetic analysis of bilaterian glycoprotein hormone and Bursicon.
Colored dots indicate support (UFB, 1000 ultrafast bootstrap replicates; SHL, 1000 SH-aLRT replicates) and follow the color code in the left inset. Scale bar unit for branch length is the number of substitutions per site. Branches are colored according to the phylogenetic position of the organism from which the sequence originates and follow the color code in the left inset. Nbl1, neuroblastoma suppressor of tumorigenicity 1. Sequences are available as Figure 8—source data 1; alignment and IQTREE tree files are available at https://doi.org/10.5281/zenodo.6962271.
Figure 8—figure supplement 2. Radial tree representation of the sequence similarities analysis of bilaterian insulin-related peptides.
Tree is calculated from concatenated alignment of A and B chains. Scale bar unit for branch length is the number of substitutions per site. Branches are colored according to the phylogenetic position of the organism from which the sequence originates and follow the color code in the bottom inset. dILP, Drosophila insulin-like peptide; GSS, gonad-stimulating substance; ILP, insulin-like peptide; IGF, insulin-like growth factor. Full version of this tree is presented as Supplementary file 2. Sequences are available as Figure 8—source data 1; alignment and IQTREE tree files are available at https://doi.org/10.5281/zenodo.6962271.
Figure 8—figure supplement 3. Circular tree representation of the phylogenetic analysis of bilaterian Leucine-rich repeat-containing G-protein coupled Receptors (Rhodopsin type G-protein coupled Receptors delta).
Colored dots indicate support (UFB, 1000 ultrafast bootstrap replicates; SHL, 1000 SH-aLRT replicates) and follow the color code in the bottom inset. Scale bar unit for branch length is the number of substitutions per site. Branches are colored according to the phylogenetic position of the organism from which the sequence originates and follow the color code in the bottom inset. Collapsed groups colored in red contain at least one X. bocki sequence. GPA2, Glycoprotein Hormone alpha5; GPB5, Glycoprotein Hormone beta2; GPCR, G Protein-Coupled Receptor; GRL-101, G-protein coupled receptor GRL101. Full version of this tree is presented in Supplementary file 3. Sequences are available as Figure 8—source data 2; alignment and IQTREE tree files are available at https://doi.org/10.5281/zenodo.6962271.
Figure 8—figure supplement 4. Circular tree representation of the phylogenetic analysis of bilaterian Rhodopsin type G-protein coupled Receptors beta and gamma.
Colored dots indicate support (UFB, 1000 ultrafast bootstrap replicates; SHL, 1000 SH-aLRT replicates) for main nodes and follow the color code in the bottom inset. Scale bar unit for branch length is the number of substitutions per site. Branches are colored according to the phylogenetic position of the organism from which the sequence originates and follow the color code in the bottom inset. Circular gray bars highlight names of groups of annotated sequences. Circular red bars indicate the position of groups of Xenacoelomorpha sequences, and the associated number gives the number of X. bocki sequence(s) within these groups. AKH, adipokinetic hormone; Asta-A, Allatostatin-A; Asta-C, Allatostatin-C; CAPA, Cardio acceleratory peptide; CCAP, crustacean cardioactive peptide; CCHa, CCHamide peptide; CCK, cholecystokinin; CRZ, Corazonin; eFMRF, ecdysozoan-FMRFamide peptide; GGN-EP, GGN excitatory peptide; ETH, ecdysis triggering hormone; GnRH, Gonadotropin Releasing Hormone; GPR150, G Protein-Coupled Receptor 150; GPR54, G Protein-Coupled Receptor 54; GPR83, G Protein-Coupled Receptor 83; MCH, melanin concentrating hormone; Myomod, Myomodulin; NK-2, Neurokinin 2; Np-B/W, Neuropeptide B/W; Np-FF, Neuropeptide FF; Np-F, Neuropeptide F; Np-S, Neuropeptide S; Np-Y, Neuropeptide Y; PBAN, pheromone biosynthesis activation neuropeptide; PEN, neuroendocrine peptide PEN; PRP, Prolactin releasing peptide; QRFP, Neuropeptide QRFP; RYa, RYamide peptide; SIFa, SIFamide peptide; SPR, Sex peptide receptor; tFMRFa, trochozoan-FMRFamide peptide; TRH, thyrotrophin-releasing hormone. Full version of this tree is presented in Supplementary file 4. Sequences are available as Figure 8—source data 2; alignment and IQTREE tree files are available at https://doi.org/10.5281/zenodo.6962271.
Figure 8—figure supplement 5. Circular tree representation of the phylogenetic analysis of bilaterian Tyrosine kinase Receptors.
Colored dots indicate support (UFB, 1000 ultrafast bootstrap replicates; SHL, 1000 SH-aLRT replicates) and follow the color code in the bottom inset. Scale bar unit for branch length is the number of substitutions per site. Branches are colored according to the phylogenetic position of the organism from which the sequence originates and follow the color code in the bottom inset. Collapsed groups colored in red contain at least one X. bocki sequence. EGF, Epidermal Growth Factor; Discoidin cont. R, discoidin domain-containing receptor; Orphan Tyr. Kinase Ror2, receptor tyrosine kinase-like orphan receptor 2; VKR, Venus kinase Receptor; ILP, Insulin-like peptide; PDGF, Platelet-derived growth factor; VEGF, Vascular endothelial growth factor; GDNF, Glial cell line-derived neurotrophic factor; FGF, fibroblast growth factor; PTTH, Prothoracicotropic hormone. Full version of this tree is presented in Supplementary file 5. Sequences are available as Figure 8—source data 2; alignment and IQTREE tree files are available at https://doi.org/10.5281/zenodo.6962271.
Figure 8—figure supplement 6. Circular tree representation of the phylogenetic analysis of bilaterian Secretin type G-protein coupled Receptors.
Colored dots indicate support (UFB, 1000 ultrafast bootstrap replicates; SHL, 1000 SH-aLRT replicates) for main nodes and follow the color code in the bottom inset. Scale bar unit for branch length is the number of substitutions per site. Branches are colored according to the phylogenetic position of the organism from which the sequence originates and follow the color code in the bottom inset. Circular gray bars highlight names of groups of annotated sequences. Circular red bars indicate the position of groups of Xenacoelomorpha sequences, and the associated number gives the number of X. bocki sequence(s) within these groups. DH31, diuretic hormone 31; Np-R B1, Neuropeptide receptor B3; Np-R B4, Neuropeptide receptor B1; PDF, Pigment-dispersing factor; CRF, Corticotropin-releasing factor; DH-44, diuretic hormone 44; PTH2/3-R, Parathyroid hormone receptor2/3; GIP, Gastric inhibitory polypeptide; PACAP, Pituitary adenylate cyclase-activating polypeptide; VIP-R, Vasoactive intestinal polypeptide receptor; GHRH, Growth hormone-releasing hormone; PTH, Parathyroid hormone receptor; SCTR, Secretin Receptor. Full version of this tree is presented in Supplementary file 6. Sequences are available as Figure 8—source data 2; alignment and IQTREE tree files are available at https://doi.org/10.5281/zenodo.6962271.
Figure 9. The revised microRNA complement of X. bocki has a near-complete set of metazoan, bilaterian, and deuterostome families and genes.
Presence (color) and absence (black) of microRNA families (column), paralog numbers (values and heatmap coloring) organized in node-specific blocks in a range of representative protostome and deuterostome species compared with Xenoturbella (species from MirGeneDB 2.1; Fromm et al., 2022). The bottom row depicts the 2011 complement from Philippe et al., 2011 (blue numbers on black depict detected miRNA reads, but lack of genomic evidence). Red ‘x’ in the pink box highlights the lack of evidence for an Ambulacraria-specific microRNA in X. bocki.
Figure 10. A comparison of scaffolds in the X. bocki genome with other Metazoa.
17 of the 18 large scaffolds in the X. bocki genome are linked via synteny to distinct chromosomal scaffolds in these species.
Figure 10—figure supplement 1. Conservation of metazoan synteny and methylation in X. bocki.
(a) A summary plot of synteny between major scaffolds in the X. bocki genome assembly and early branching highly contiguous metazoan genome assemblies: Ephydatia muelleri, Trichoplax adhaerens, Branchiostoma floridae, Saccoglossus kowalevskii, Ciona intestinalis, Nematostella vectensis, Asterias rubens, Pecten maximus, Nemopilema nomurai, and Carcinoscorpius rotundicauda. All but one of the chromosome-sized scaffolds in our assembly have at least one syntenic match in each of the other species (see the main text for one-to-one plots with key species and a description of the aberrant scaffold). We performed the same analysis with amphioxus as the focal species as a proof of principle (inset). (b) Analysis of methylation on the largest scaffold in the X. bocki genome assembly. One scaffold with a deviant gene age and synteny structure (see the main text) also stands out in terms of methylation. A detailed analysis of methylation patterns across the genome and classes of genes will be published separately.
Figure 10—figure supplement 2. Intergenomic comparison of X. bocki and E. muelleri highlighting synteny connections between the aberrant scaffold c1896 and scaffolds across the sponge genome.
Figure 11. Phylostratigraphic age distribution of genes on all major scaffolds in the X. bocki genome.
One scaffold (c1896), which showed no synteny to a distinct chromosomal scaffold in the other metazoan species, also had a divergent gene age structure in comparison to other X. bocki scaffolds.
Figure 12. Blobplot analysis of the primary Illumina genome assembly.
The assembly shows no major microorganismal contamination, apart from the Chlamydia and Gammaproteobacteria described in the main text. The DIAMOND tool was used to search against the UniProt database for this analysis.
Figure 13. Hi-C based genome scaffolding with instaGRAAL.
(a) Contact frequency map of the largest 18 scaffolds and (b) distribution of contact frequency as a function of distance (distance law).
Figure 13—figure supplement 1. K-mer profile of the X. bocki Illumina WGS reads obtained with GenomeScope2 (Ranallo-Benavidez et al., 2020).
Linear plot and transformed linear plots are shown. As per the description at http://qb.cshl.edu/genomescope/, we used 21-mers counted with Jellyfish (Marçais and Kingsford, 2011). GenomeScope genome property estimates and measures were len: 222,242,800 bp, uniq: 31.7%, aa: 99.1%, ab: 0.929%, kcov: 8.72, err: 0.665%, dup: 0.527, k: 21, p: 2, model fit min: 34.6%, model fit max: 96.3.
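For readers unfamiliar with k-mer profiles, the following is a minimal Python sketch of the kind of count histogram that a k-mer profiler takes as input. It is an illustration only, not the authors' pipeline: the function names (`canonical`, `count_kmers`, `kmer_histogram`) are hypothetical, the reads are assumed to be plain strings already loaded in memory, and the choice to count canonical k-mers (a k-mer and its reverse complement treated as one) is an assumption rather than something stated in the caption.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k=21):
    """Count canonical k-mers across reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def kmer_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers seen that many times."""
    return dict(sorted(Counter(counts.values()).items()))


if __name__ == "__main__":
    # Toy reads and a small k, chosen only to keep the example readable.
    toy_counts = count_kmers(["ACGTA", "TACGT"], k=3)
    print(kmer_histogram(toy_counts))
```

On real sequencing data a dedicated counter such as Jellyfish is far more efficient; the sketch only shows what the histogram of k-mer multiplicities represents.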

Update of

  • doi: 10.1101/2022.06.24.497508

References

    1. Almagro Armenteros JJ, Tsirigos KD, Sønderby CK, Petersen TN, Winther O, Brunak S, von Heijne G, Nielsen H. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nature Biotechnology. 2019;37:420–423. doi: 10.1038/s41587-019-0036-z.
    2. Arimoto A, Hikosaka-Katayama T, Hikosaka A, Tagawa K, Inoue T, Ueki T, Yoshida MA, Kanda M, Shoguchi E, Hisata K, Satoh N. A draft nuclear-genome assembly of the acoel flatworm Praesagittifera naikaiensis. GigaScience. 2019;8:giz023. doi: 10.1093/gigascience/giz023.
    3. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. Journal of Computational Biology. 2012;19:455–477. doi: 10.1089/cmb.2012.0021.
    4. Barnett DW, Garrison EK, Quinlan AR, Strömberg MP, Marth GT. BamTools: a C++ API and toolkit for analyzing and managing BAM files. Bioinformatics. 2011;27:1691–1692. doi: 10.1093/bioinformatics/btr174.
    5. Baudry L, Guiglielmoni N, Marie-Nelly H, Cormier A, Marbouty M, Avia K, Mie YL, Godfroy O, Sterck L, Cock JM, Zimmer C, Coelho SM, Koszul R. instaGRAAL: chromosome-level quality scaffolding of genomes using a proximity ligation-based scaffolder. Genome Biology. 2020;21:148. doi: 10.1186/s13059-020-02041-z.