Elife. 2013 Jul 2;2:e00569.
doi: 10.7554/eLife.00569.

The genome sequence of the colonial chordate, Botryllus schlosseri

Ayelet Voskoboynik et al. Elife.

Abstract

Botryllus schlosseri is a colonial urochordate that follows the chordate plan of development after sexual reproduction, but invokes a stem cell-mediated budding program during subsequent rounds of asexual reproduction. As urochordates are considered to be the closest living invertebrate relatives of vertebrates, they are ideal subjects for whole genome sequence analyses. Using a novel method for high-throughput sequencing of eukaryotic genomes, we sequenced and assembled 580 Mbp of the B. schlosseri genome. The genome assembly comprises nearly 14,000 intron-containing predicted genes and 13,500 intron-less predicted genes, 40% of which could be confidently parceled into 13 (of 16 haploid) chromosomes. A comparison of homologous genes between B. schlosseri and other diverse taxonomic groups revealed genomic events underlying the evolution of vertebrates and lymphoid-mediated immunity. The B. schlosseri genome is a community resource for studying alternative modes of reproduction, natural transplantation reactions, and stem cell-mediated regeneration. DOI: http://dx.doi.org/10.7554/eLife.00569.001.

Keywords: Botryllus schlosseri; Other; genome; hematopoiesis; stem cell; tunicates; vertebrate evolution.

Conflict of interest statement

Stanford has filed US and International patent application numbers 61/532,882 and 13/608,778 entitled “Methods for obtaining a sequence” with inventors AV, DP, and SRQ. DP and SRQ are co-founders of Moleculo Inc.

The other authors declare that no competing interests exist.

Figures

Figure 1. Botryllus schlosseri anatomy, life cycle, and phylogeny.
B. schlosseri reproduces through both sexual and asexual (budding) pathways, giving rise to virtually identical adult body plans. Upon settlement, the tadpole phase of the B. schlosseri lifecycle (A) will metamorphose into a founder individual (oozooid) (B), which, through asexual budding, generates a colony. The colony includes three overlapping generations: an adult zooid, a primary bud, and a secondary bud, all of which are connected via a vascular network (bv) embedded within a gelatinous matrix (termed tunic). The common vasculature terminates in finger-like protrusions (termed ampullae; BD). Bud development commences in stage A (C). Through budding, B. schlosseri generates its entire body, including digestive (ds) and respiratory (brs) systems, a simple tube-like heart (h), an endostyle (en) that harbors a stem cell niche, a primitive neural complex, and siphons used for feeding, waste, and releasing larvae (BD). Each week, successive buds grow large (D) and complete replication of all zooids in the colony, ultimately replacing the previous generation’s zooids, which die through a massive apoptosis. (E) A phylogenomic tree produced from analysis of 521 nuclear genes (40,798 aligned amino acids) from 15 species, including B. schlosseri. Scale bar: 1 mm. DOI: http://dx.doi.org/10.7554/eLife.00569.003
Figure 1—figure supplement 1. Mitogenomic analysis of tunicates and deuterostomes.
Based on the 13 mitochondrially-encoded proteins. The tree was inferred by PhyloBayes under a GTR+G+CAT model. Support values at nodes represent Bayesian Posterior Probability (PP) and are reported only when >0.5 and <0.95. Nodes with PP < 0.5 were collapsed. The tree was rooted with the non-deuterostome Drosophila and Aplysia species. The main deuterostome lineages are represented in different colours. Abbreviations for tunicate orders: Stolido: Stolidobranchia; Phlebo: Phlebobranchia; Aplouso: Aplousobranchia. Colonial tunicates are indicated by an asterisk and include Botryllus schlosseri, all Aplousobranchia ascidians, and the thaliacean Doliolum nationalis. DOI: http://dx.doi.org/10.7554/eLife.00569.004
Figure 2. A novel short read genome sequencing and assembly method for complex, repeat-rich genomes.
(A) Genomic DNA is sheared into 6–8 kb fragments, partitioned into twelve 96-well plates, further fragmented to 600–800 bp, barcoded and sequenced separately for each well (Illumina HiSeq 2000, 2x100 bp), and assembled by Velvet. (B) Size distribution of contigs assembled from a representative library preparation (BL5). (C) Limiting the number of amplifiable molecules per well (barcode) to a level at which almost 100% of all amplifiable molecules are present as single copies (<1000 gDNA molecules) greatly reduces the chance of having a repeated or homologous sequence within a well. Sample complexity is thereby significantly reduced, which reduces ambiguity in the reconstruction of a consensus sequence. As an example, two different predicted repeat-containing genes (g2001, 1189 bp; and g2002, 688 bp) were assembled from two different wells (005 and 145, respectively). Although they contain highly homologous repeats (represented as a dot matrix plot, D), these repetitive genes were resolved and reconstructed properly in the final assembly. DOI: http://dx.doi.org/10.7554/eLife.00569.005
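The dilution argument in panel C can be illustrated with a rough back-of-envelope calculation. The sketch below is not the authors' analysis: it assumes 1000 molecules per well (the caption's upper bound), a 7 kb mean fragment (midpoint of the 6–8 kb range), the 580 Mbp assembled size as a stand-in for genome size, and uniform, independent sampling of fragments. All function and constant names are hypothetical.

```python
import math

# Assumed inputs, taken loosely from the caption and abstract.
MOLECULES_PER_WELL = 1000    # caption states "<1000", so this is an upper bound
FRAGMENT_BP = 7_000          # midpoint of the 6-8 kb shearing range
GENOME_BP = 580_000_000      # assembled size, used as a stand-in for genome size


def well_depth(molecules, fragment_bp, genome_bp):
    """Expected per-base depth contributed by the fragments in one well."""
    return molecules * fragment_bp / genome_bp


def p_locus_in_well(depth):
    """Poisson probability that a given locus is covered at least once."""
    return 1.0 - math.exp(-depth)


depth = well_depth(MOLECULES_PER_WELL, FRAGMENT_BP, GENOME_BP)
p_one = p_locus_in_well(depth)
# Chance that two unlinked repeat copies land in the same given well,
# assuming they are sampled independently.
p_pair = p_one ** 2

print(f"depth per well: {depth:.4f}")
print(f"P(locus present in a well): {p_one:.4f}")
print(f"P(two unlinked copies share a well): {p_pair:.2e}")
```

Under these assumptions each well samples roughly 1% of the genome, so two copies of a repeat at unlinked loci rarely share a barcode, which is the intuition behind assembling each well separately.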
Figure 2—figure supplement 1. Validation of LRseq approach on human genomic DNA.
Genomic DNA from HapMap NA7019 was prepared for LRseq. These figures show LRseq assembly statistics, obtained by mapping sequenced reads to human genome reference 36. These data were also used to estimate the concentration of amplifiable molecules in B. schlosseri 356a DNA samples prepared by an identical protocol. DOI: http://dx.doi.org/10.7554/eLife.00569.006
Figure 2—figure supplement 2. Clonality confirmation of the genome of clone Sc6a-b and clone 356a.
(A) The Sc6a-b clone, a long-lived (7 years old when sampled), highly regenerative colony, was chosen to be sequenced. Sc6a-b subclones were starved for 48 hr prior to sampling, and 400 individuals (zooids) were sampled for sequencing. Subclones of this colony are still alive and maintained in our mariculture facility. (B) A few zooids were taken from every sample set and tested via AFLP genotyping analysis, confirming that all zooids belong to one genotype. (C and D) Sc6a-b microsatellite loci were homozygous (2 loci) and heterozygous (1 locus), confirming one genotype. (E and F) The 356a clone was a highly regenerative, long-lived colony. 150 individuals were sampled and their gDNA was sequenced. Microsatellite loci were homozygous (E and F), confirming one genotype. Scale bar: 1 mm. DOI: http://dx.doi.org/10.7554/eLife.00569.007
Figure 2—figure supplement 3. Statistics for 356a assembly.
(A) Contig length distribution. (B) Distribution of coverage of 356a assembled Celera contigs by Velvet assembled fragments. DOI: http://dx.doi.org/10.7554/eLife.00569.008
Figure 2—figure supplement 4. Interspersed and tandem repeats distribution in the B. schlosseri genome.
(A) RepeatScout (version 1.0.5; Price et al., 2005) was used to identify interspersed repeat elements de novo using a k-mer length of 14. All identified repeats were subsequently filtered for tandem repeat and low complexity content, using RepeatScout. Genome-wide interspersed repeats were catalogued using RepeatMasker (version open-4.0; Smit et al., 1996-2010). The distribution of large interspersed repeat families (≥1 kb) ordered by copy number is presented. (B) To identify both perfect (100% sequence identity) and degenerate genomic tandem repeats, we used XSTREAM (Newman and Cooper, 2007), with a minimum repeat length of 20 bp, minimum word match of 0.8, and otherwise default parameters. 3,183,988 tandem repeats were identified (period range: 1–6525 bp; copy number range: 2.7–1096x). DOI: http://dx.doi.org/10.7554/eLife.00569.009
Figure 2—figure supplement 5. Coverage of 4 fosmids by the B. schlosseri assembly.
Fosmid sequences (red lines; gi; ac numbers are shown, number=bp), were compared with B. schlosseri contigs using blast (e-value < e−10). Best alignments between contigs >500bp (black lines) are shown. Repetitive regions are marked (blue). DOI: http://dx.doi.org/10.7554/eLife.00569.010
Figure 2—figure supplement 6. Validation of putative B. schlosseri genes.
We experimentally validated 145 B. schlosseri predicted genes. Genes were validated by observing expression in B. schlosseri cDNAs and gDNA via PCR and qPCR assays and resequencing them on Sanger. (A) cDNA PCR product of several early erythroid and HSC putative genes identified in B. schlosseri tissues (endostyle, blood or zooid). Names of the putative genes and the tissues that were tested in this experiment are indicated on the gel image. (B) qPCR expression in B. schlosseri blood of six putative immunity genes. DOI: http://dx.doi.org/10.7554/eLife.00569.011
Figure 3. Clustering and assignment of B. schlosseri chromosomes.
(A) We isolated and sequenced 21 metaphase chromosome mixtures using a microfluidic device. Each chromosome mixture was amplified, barcoded and sequenced separately (Illumina HiSeq). Genomic contigs larger than 7 kb were aligned to the chromosome reads using BWA. Subsequently, assignment of scaffolds to chromosome clusters was performed using iterative k-means clustering on the correlation matrix between scaffolds. In addition, to find the number of clusters (chromosomes), we performed k-means clustering iteratively across different cluster numbers. This plot demonstrates that increasing beyond 13 clusters does little to reduce the error; therefore 13 chromosomes were successfully resolved. (B) To estimate the configuration after the clustering step, 17 out of the 21 wells were deduced to contain information that is used in the clustering process. The average number of normalized read counts from each metaphase chromosome mixture (well) that align to each scaffold in a cluster group was calculated and plotted. Each peak can be inferred to denote the presence of a specific chromosome in the well. Examples of four representative wells are presented; metaphase chromosome mixtures contained between 1 and 4 chromosomes (see also Figure 3—figure supplement 1). DOI: http://dx.doi.org/10.7554/eLife.00569.012
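The idea in panel A of choosing the cluster number where extra clusters stop reducing the error can be sketched with a toy "elbow" search. This is only an illustration under stated assumptions, not the authors' pipeline: it clusters 2D toy points rather than a scaffold correlation matrix, uses a plain Lloyd's k-means with a deterministic farthest-first start, and the `min_gain` threshold and all function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic start: each new center is the point farthest from all chosen ones."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_error(points, k, iters=50):
    """Run Lloyd's k-means and return the within-cluster sum of squared errors."""
    centers = farthest_first(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def pick_elbow(errors, min_gain=0.2):
    """Smallest k for which adding one more cluster cuts the error by less than min_gain."""
    for k in sorted(errors):
        if k + 1 in errors and (errors[k] - errors[k + 1]) / errors[k] < min_gain:
            return k
    return max(errors)


# Toy data: three well-separated groups of four points each.
offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
points = [(ox + dx, oy + dy) for ox, oy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]

errors = {k: kmeans_error(points, k) for k in range(1, 6)}
best_k = pick_elbow(errors)
print(errors, best_k)
```

On the toy data the error falls steeply up to three clusters and only marginally afterwards, which mirrors the reasoning the caption applies at 13 clusters.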
Figure 3—figure supplement 1. Distribution of B. schlosseri chromosome groups across different wells.
We isolated and sequenced diluted metaphase chromosome mixtures using a microfluidic device. Each chromosome mixture was amplified, barcoded and sequenced separately (Illumina HiSeq). The average number of normalized read counts from each diluted chromosome mixture (well) that align to each scaffold in a cluster group was calculated and plotted. Each peak represents the presence of a specific chromosome in the well. In the 17 wells presented above, chromosome mixtures contained between 1 and 4 chromosomes. DOI: http://dx.doi.org/10.7554/eLife.00569.013
Figure 3—figure supplement 2. Pipeline for the assignment of chromosome scaffolds and the 356a–chromosomes hybrid assembly process.
DOI: http://dx.doi.org/10.7554/eLife.00569.014
Figure 3—figure supplement 3. 356a-Chromosome hybrid assembly of B. schlosseri.
Reads from each of the individual chromosome sample preparations were subsequently assembled using Velvet. The resulting chromosome level contigs were then merged with the 356a assembly to create a 356a-chromosome hybrid assembly. DOI: http://dx.doi.org/10.7554/eLife.00569.015
Figure 3—figure supplement 4. The fraction of B. schlosseri predicted intron-less genes (blue) and genes with introns (red) in the different chromosomes.
DOI: http://dx.doi.org/10.7554/eLife.00569.016
Figure 4. Innovations underlying the emergence and early diversification of vertebrates.
Protein-encoding genes in B. schlosseri were compared to a diverse sampling of 18 well-annotated genomes from other species, and for each genome, the presence or absence of homology to human or mouse proteins was assessed (all vs all blastp e-value threshold of 1e−10; Figure 4—source data 1A). Our data indicate that homologs of ∼660 human/mouse genes were present in the common ancestor of tunicates and vertebrates, but not in non-chordate species (Figure 4—source data 1B). Among them are genes associated with the development, function, and pathology of vertebrate features, including heart, eye, hearing, immunity, pregnancy and cancer (Figure 4—source data 1C). Gray box = no homology; Yellow box = homology. DOI: http://dx.doi.org/10.7554/eLife.00569.017
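The presence/absence logic described in this caption can be sketched as a simple filter over BLAST hits. The example below is a minimal illustration, not the authors' code: the hit tuples, gene names, and species groupings are made-up toy data, the function names are hypothetical, and treating the 1e−10 threshold as a strict "less than" is an assumption.

```python
E_VALUE_CUTOFF = 1e-10  # threshold quoted in the caption

# Toy hits: (human/mouse query gene, species with the hit, e-value). All hypothetical.
hits = [
    ("geneA", "B. schlosseri", 1e-40),
    ("geneA", "D. melanogaster", 1e-3),    # too weak, does not count as homology
    ("geneB", "B. schlosseri", 1e-25),
    ("geneB", "D. melanogaster", 1e-30),   # also found in a non-chordate
    ("geneC", "B. schlosseri", 1e-5),      # too weak
]

TUNICATES = {"B. schlosseri"}
NON_CHORDATES = {"D. melanogaster"}


def presence_by_gene(hits, cutoff):
    """Map each query gene to the set of species with a hit below the cutoff."""
    present = {}
    for gene, species, evalue in hits:
        if evalue < cutoff:
            present.setdefault(gene, set()).add(species)
    return present


def chordate_restricted(present, tunicates, non_chordates):
    """Genes with a tunicate homolog and no homolog in any non-chordate species."""
    return sorted(
        gene
        for gene, species in present.items()
        if species & tunicates and not species & non_chordates
    )


present = presence_by_gene(hits, E_VALUE_CUTOFF)
candidates = chordate_restricted(present, TUNICATES, NON_CHORDATES)
print(candidates)
```

In the figure, each yellow box would correspond to a gene and species pair that passes the cutoff, and each gray box to a pair that does not.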
Figure 5. Analysis of blood and immune cell type-specific genes across evolution reveals evidence for hematopoietic precursors in B. schlosseri.
We analyzed gene expression microarray data from 26 different human blood cell populations, organized into four cell lineages (HSC; Lymphoid Progenitors; Myeloid and Lymphoid Lineage), and identified a set of twenty signature genes with highly enriched expression profiles for each population (Supplementary file 3). For each blood-related gene set, we identified homologous gene sequences in B. schlosseri and 17 other species; the fraction of genes (out of 20) found for each species is displayed as a heat map. Within each major lineage, cell populations are sorted in decreasing order by a conservation index, calculated as the average number of genes found across the 18 species (indicated by a blue bar graph). DOI: http://dx.doi.org/10.7554/eLife.00569.020
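The two quantities in this caption, the per-species fraction of signature genes found and the conservation index, can be written down directly. The sketch below assumes the index is the plain mean of the per-species counts, as the caption words it; the gene and species names are toy placeholders, and the toy signature has 4 genes rather than the 20 used in the figure.

```python
def fraction_found(signature, homologs):
    """Fraction of a population's signature genes with a homolog in one species."""
    return len(signature & homologs) / len(signature)


def conservation_index(signature, homologs_by_species):
    """Average number of signature genes found across all species."""
    counts = [len(signature & homologs) for homologs in homologs_by_species.values()]
    return sum(counts) / len(counts)


# Toy data (hypothetical): one signature set and three species.
signature = {"g1", "g2", "g3", "g4"}
homologs_by_species = {
    "species_X": {"g1", "g2", "g3"},
    "species_Y": {"g1", "g9"},
    "species_Z": set(),
}

heat_map_row = {sp: fraction_found(signature, h) for sp, h in homologs_by_species.items()}
index = conservation_index(signature, homologs_by_species)
print(heat_map_row, index)
```

Each value in `heat_map_row` plays the role of one heat map cell, and `index` is the quantity the caption uses to sort cell populations within a lineage.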
