Review
Front Cell Infect Microbiol. 2022 Jun 6:12:900878.
doi: 10.3389/fcimb.2022.900878. eCollection 2022.

Paving the Way: Contributions of Big Data to Apicomplexan and Kinetoplastid Research

Robyn S Kent et al. Front Cell Infect Microbiol. 2022.

Abstract

In the age of big data an important question is how to ensure we make the most out of the resources we generate. In this review, we discuss the major methods used in Apicomplexan and Kinetoplastid research to produce big datasets and advance our understanding of Plasmodium, Toxoplasma, Cryptosporidium, Trypanosoma and Leishmania biology. We debate the benefits and limitations of the current technologies, and propose future advancements that may be key to improving our use of these techniques. Finally, we consider the difficulties the field faces when trying to make the most of the abundance of data that has already been, and will continue to be, generated.

Keywords: apicomplexa; functional screens; genomics; kinetoplastid; microscopy; proteomics; transcriptomics.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Apicomplexan parasites. Most apicomplexan parasites have complex life cycles with several developmental stages that occur in different hosts and in different organs or tissues within the host. While advances have been made to culture many stages of these organisms in vitro, some are restricted to short-term culture. For others, only a limited number of stages can be sustained. Equally, not all stages are amenable to genetic modification. In this figure we summarize the main features of Toxoplasma gondii, Cryptosporidium parvum, and Plasmodium spp.
Toxoplasma gondii. (i) After a feline ingests bradyzoite cysts from an intermediate host, the sexual developmental cycle of T. gondii occurs in the feline gut, culminating in the shedding of large numbers of (ii) unsporulated oocysts in the faeces. Within a few days, oocysts sporulate in the environment and become infective. (iii) Intermediate hosts can become infected by consuming contaminated soil, water or plants. Once consumed, oocysts transform into (iv) tachyzoites in the host’s gut. Dissemination of a tachyzoite infection and repeated rounds of cell infection, replication and egress (the lytic cycle) lead to a systemic infection. Immune pressure triggers some tachyzoites to form tissue cysts that contain (v) bradyzoites. In humans, tissue cysts are most commonly found in skeletal muscle, the heart, the eyes and the brain. T. gondii tachyzoites are able to cross the placenta from mother to fetus. Reactivation of an encysted infection can occur upon immune suppression, and ingestion of bradyzoite cysts by another intermediate host can transmit the infection (v to iii). Among these T. gondii stages, tachyzoites and bradyzoites can be cultured in vitro in large amounts. Tachyzoites are amenable to genetic modification.
Cryptosporidium parvum. (i) Sporulated oocysts are excreted by infected hosts in faeces, and transmission to humans usually occurs via contaminated water. Following ingestion, the parasite undergoes excystation, whereby (ii) sporozoites are released and invade the epithelial cells of the ileum. Here, C. parvum undergoes (iii) 3 cycles of asexual expansion, followed by (iv) sexual commitment to either micro- or macro-gametes. Fertilization occurs and results in the generation of a (v) zygote, which develops into a sporulated oocyst. Some oocysts continue to reinfect the host while others are excreted. Cryptosporidium does not complete its life cycle in vitro without the use of complex organoid systems. Even so, the amount of parasite material that can be generated is limited. The sporozoite is the only stage used for transfections; to generate transgenic oocysts, transfected sporozoites must immediately infect an animal model or organoid.
Plasmodium spp. Female Anopheles mosquitoes are responsible for transmitting Plasmodium spp. Mosquitoes are the definitive host, where Plasmodium undergoes sexual replication. This occurs in the mosquito’s midgut, where micro/macro-gametes generate zygotes, which become motile (i) ookinetes and invade the midgut wall. Here they develop into (ii) oocysts. As oocysts mature, they rupture, releasing (iii) sporozoites, which migrate to various locations in the mosquito, including its salivary glands. Following an infectious bite, sporozoites migrate from the dermis to the blood vasculature in humans. This allows them to reach the host liver, where they invade hepatocytes and undergo a single round of (iv) asexual replication (by schizogony), resulting in the release of (v) merosomes filled with merozoites. Merosomes rupture in the blood circulation and release thousands of merozoites, which then infect red blood cells (RBCs; once infected, iRBCs) and give rise to the erythrocytic stage of infection. Within RBCs, parasites develop into (vi) rings, trophozoites and schizonts. Mature schizonts rupture, releasing merozoites which invade other RBCs, exponentially increasing the parasite mass. Some of these parasites will develop into sexual stages, called (vii) gametocytes. Mature asexual stages cause iRBC sequestration in organs including the brain, lungs, placenta, pancreas and adipose tissues, while sexual stages display preferential tropism to the bone marrow and other hematopoietic organs. Among these Plasmodium spp. stages, ookinetes, liver, asexual RBC, and sexual RBC stages can all be cultured in vitro in large amounts. Merosomes, rings, and schizonts are amenable to genetic modification.
Note: large arrows in the diagram refer to the completion of the cycle. Figure created with BioRender.com.
Figure 2
Kinetoplastid parasites. Kinetoplastid parasites have complex life cycles with various stages occurring in insect vector and mammalian hosts, and in different organs or tissues within their hosts. While advances have been made to culture several stages of these organisms in vitro, many are restricted to short-term culture. For others, only a limited number of stages can be sustained. Equally, not all stages are amenable to genetic modification. In this figure we summarize the main features of Trypanosoma cruzi, Trypanosoma brucei, and Leishmania spp.
Trypanosoma cruzi. Triatomine insect vectors of the genera Triatoma, Rhodnius and Panstrongylus become infected by feeding on infected blood (from humans or other animals). Ingested trypomastigote metacyclics transform into (i) epimastigotes in the insect’s midgut. These multiply and differentiate into (ii) metacyclic trypomastigotes in the hindgut. Infected vectors release trypomastigotes in their faeces onto the host skin. Parasites enter the skin via wounds or mucosal membranes (such as through the eyes). Inside the host, (iii) trypomastigotes invade cells of a plethora of tissues and transform into (iv) amastigotes, which multiply and differentiate again into trypomastigotes, which are released from lysed cells. Some of these travel in the (v) bloodstream and can be ingested by triatomine vectors during a blood meal. The most commonly affected organ is the heart, but others, including the liver, spleen, and adipose tissues, are invaded too, some of them becoming important parasite reservoirs. Among these T. cruzi stages, epimastigotes, trypomastigotes and amastigotes can be cultured in vitro in large amounts, and the whole life cycle can be modeled in vitro. Epimastigotes, trypomastigotes and amastigotes are amenable to genetic modification.
Trypanosoma brucei. Tsetse flies (from the genus Glossina) become infected by feeding on infected blood (from humans and other animals). Within the fly’s midgut, T. brucei stumpy forms transform into (i) procyclic trypomastigotes (PCF). These multiply, egress from the midgut, and transform into (ii) epimastigotes, which can reach the fly’s salivary glands and continue to multiply. (iii) Metacyclic trypomastigotes are injected into the host skin during a blood meal. Inside the host, they transform into bloodstream form (BSF) trypomastigotes that can be (iv) slender or (v) stumpy forms, the latter of which rapidly transform into procyclic forms in the tsetse midgut upon a blood meal. While slender BSFs multiply and thrive in the bloodstream, T. brucei is an extracellular parasite capable of invading multiple organs including the brain, spleen, adipose tissue, pancreas, lungs and lymphatic vasculature. These (iv) tissue-specific forms are relatively poorly understood. Among these Trypanosoma brucei stages, procyclics and BSFs can be cultured in vitro in large amounts, and the same stages are amenable to genetic modification.
Leishmania spp. Phlebotomine sandflies become infected by ingesting infected cells during a blood meal. Within the sandfly, (i) amastigote forms of Leishmania spp. transform into (ii) promastigotes, which develop in the vector’s gut and migrate to the proboscis. Infected sandflies transmit (iii) promastigotes during a blood meal. After entry into the skin, promastigotes are ingested by phagocytic cells (e.g. macrophages and neutrophils). Within these cells, promastigotes transform into (iv) amastigotes, which multiply and (v) infect other cells. Depending on parasite and host factors, cutaneous or visceral leishmaniasis can result. For the former, the skin and soft tissues like the nose and mouth can be affected. For the latter, affected organs commonly include the spleen, liver and bone marrow. Among these Leishmania spp. stages, promastigotes, axenic amastigotes and intracellular amastigotes can be cultured in vitro in large amounts. Promastigotes and amastigotes are amenable to genetic modification.
Note: large arrows refer to the completion of the cycle. Figure created with BioRender.com.
Figure 3
Timeline of major achievements in parasite genome sequencing. Only the oldest article using each technology for each parasite is cited. WGS, whole genome shotgun sequencing; PSS, partial shotgun sequencing; 3rd Gen Seq, third generation sequencing or long-read sequencing; ONT, Oxford Nanopore technology; PacBio, Pacific Biosciences; PMID, PubMed identification number.
Figure 4
Publication of transcriptomic studies in parasitology. The use of transcriptomics has increased rapidly over time, with early techniques (ESTs, SAGE and microarrays) becoming less frequently used in favour of bulk RNA-seq. Most recently, the number of studies using scRNA-seq methods to deconvolve mixed populations has increased. Note: *Each term was searched for in publication titles and/or abstracts, along with at least one species of unicellular parasite included in the VEuPathDB database (Amos et al., 2022).
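The counting rule described in the note to Figure 4 (a method term in the title or abstract, plus at least one parasite name) can be sketched as below. This is a minimal illustration and not the authors' search pipeline: the records are invented, the `count_by_year` helper is hypothetical, and the five genus names discussed in this review stand in for the full VEuPathDB species list.

```python
import re
from collections import Counter

# Assumption: method terms taken from the Figure 4 caption.
METHOD_TERMS = ["EST", "SAGE", "microarray", "RNA-seq", "scRNA-seq"]
# Assumption: genus names stand in for the VEuPathDB species list.
PARASITES = ["Plasmodium", "Toxoplasma", "Cryptosporidium", "Trypanosoma", "Leishmania"]

def count_by_year(records, terms=METHOD_TERMS, parasites=PARASITES):
    """Count (year, term) pairs for records that mention a term and a parasite."""
    counts = Counter()
    for rec in records:
        text = f"{rec['title']} {rec['abstract']}"
        if not any(p.lower() in text.lower() for p in parasites):
            continue  # no parasite named, so the record is skipped
        for term in terms:
            # Word boundaries keep "RNA-seq" from matching inside "scRNA-seq".
            if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
                counts[(rec["year"], term)] += 1
    return counts

# Invented toy records, for illustration only.
RECORDS = [
    {"year": 2005, "title": "Microarray analysis of Plasmodium falciparum", "abstract": ""},
    {"year": 2019, "title": "scRNA-seq of Trypanosoma brucei", "abstract": ""},
    {"year": 2019, "title": "RNA-seq of mouse liver", "abstract": ""},
    {"year": 2019, "title": "Bulk RNA-seq in Leishmania donovani", "abstract": ""},
]
counts = count_by_year(RECORDS)
```

Plotting such counts per year would give a trend of the kind summarised in the figure; a real search would also need manual curation, as the authors describe for the functional screens in Figure 7.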
Figure 5
Power of single cell transcriptomics. Cell atlases (central figure) contain the transcriptomes of individual cells, organised according to transcript signature similarities and differences in low dimensional space. The result is a transcriptomic map descriptive of the system in question, which in parasitology can reflect the parasite’s complete life cycle (expressed as arrows). Individual parasite transcriptomes of the same life cycle form (single points) are positioned close together and, if captured, cells undergoing differentiation between different forms are positioned between the cell type clusters. These data can be mined and used as highly valuable resources in several ways. 1) Clustering analyses can be used to group similar cells in increasing resolution, often to identify life cycle forms. Differential expression analysis between clusters reveals novel marker genes, specifically expressed in a particular cluster. 2) Pseudotime analysis can be performed to identify dynamic gene expression patterns across the life cycle. A path, or trajectory, is drawn through the cell atlas map connecting neighbouring cells, and differential expression analysis is performed as a function of the trajectory. This reveals transcripts which change in level during the life cycle, and their exact expression patterns. Genes which peak in transitioning cells can reveal novel regulators. 3) The cell atlas can further be used as a reference to which query single cell transcriptomes can be mapped. For example, when only a few transcriptomes are available, or only those containing fewer transcripts per cell, mapping them to a high quality reference can identify their detailed position in the life cycle. 4) Transcriptomes of different genetically perturbed parasites, varied strains and even different species can also be mapped to the reference cell atlas through data integration methods. This allows detailed comparisons between datasets and across several cell types. Figure created with BioRender.com.
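Two of the uses listed in the Figure 5 caption, marker gene discovery between clusters (1) and mapping of query cells to a reference atlas (3), can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the gene names, expression values, cluster labels and the `min_diff` threshold are invented, and real analyses would use statistical testing and dedicated single cell toolkits rather than simple centroid comparisons.

```python
from statistics import mean

GENES = ["g1", "g2", "g3"]  # hypothetical gene names

# Toy reference atlas: cluster label -> cells (gene -> expression level).
ATLAS = {
    "form_A": [{"g1": 9.0, "g2": 1.0, "g3": 5.0}, {"g1": 8.0, "g2": 2.0, "g3": 5.0}],
    "form_B": [{"g1": 1.0, "g2": 9.0, "g3": 5.0}, {"g1": 2.0, "g2": 8.0, "g3": 4.0}],
}

def centroid(cells):
    """Mean expression of each gene across the cells of one cluster."""
    return {g: mean(c[g] for c in cells) for g in GENES}

def marker_genes(atlas, cluster, min_diff=3.0):
    """Genes whose mean in `cluster` exceeds every other cluster's mean by min_diff."""
    cents = {k: centroid(v) for k, v in atlas.items()}
    others = [c for k, c in cents.items() if k != cluster]
    return [g for g in GENES
            if all(cents[cluster][g] - o[g] >= min_diff for o in others)]

def map_query(atlas, cell):
    """Assign a query cell to the cluster with the nearest centroid
    (squared Euclidean distance over all genes)."""
    cents = {k: centroid(v) for k, v in atlas.items()}
    return min(cents, key=lambda k: sum((cell[g] - cents[k][g]) ** 2 for g in GENES))

markers_a = marker_genes(ATLAS, "form_A")
markers_b = marker_genes(ATLAS, "form_B")
assigned = map_query(ATLAS, {"g1": 7.0, "g2": 3.0, "g3": 5.0})
```

Here `g1` and `g2` separate the two clusters while `g3` does not, and the query cell is placed in `form_A` because its profile lies closest to that centroid.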
Figure 6
Commonly used quantitative methods to study proteomics in Apicomplexans and Kinetoplastids. A PubMed search was carried out for each genus with key terms for the proteomic methods. The number of publications by term and by parasite are shown in brackets. Leishmania had the most quantitative proteomic publications, followed closely by Plasmodium, Trypanosoma, and Toxoplasma. Both Leishmania and Plasmodium showed a larger diversity of methods with the inclusion of SWATH-MS and SRM, respectively. Cryptosporidium had the fewest publications in quantitative proteomics and the least diversity of methods. SRM, selected reaction monitoring; SILAC, stable isotope labelling by amino acids in cell culture; SWATH-MS, sequential window acquisition of all theoretical mass spectra. Figure created with BioRender.com.
Figure 7
Publications using functional screens in Apicomplexans and Kinetoplastids. The number of functional screens completed in various Apicomplexans and Kinetoplastids is summarised. PubMed searches were carried out for each genus and screening method (y-axis); manual curation confirmed whether the method was used for screening rather than follow-up studies. In Plasmodium spp., random insertional mutagenesis (pf) and KO screens (pb) have been used extensively. In T. gondii, chemical mutagenesis and, more recently, mutagenesis using CRISPR libraries dominate. Functional screens in Trypanosoma spp. have been exclusively and extensively completed using RNAi. Few screens have been completed in Leishmania spp. and these are recent. Due to the lack of high-throughput technologies in Cryptosporidium spp., no screens have been carried out on the parasites, only host screens. A summary of currently available technologies and their adaptation to high throughput, required for screening, is also shown. Figure created with BioRender.com.
Figure 8
Future directions for functional screens. Most functional screens in Apicomplexa rely on non-conditional gene depletion methods to phenotype mutants. Top left panel: After the gene(s) of interest has been disrupted, only those that are dispensable for growth during the transfected stage can be phenotyped. With life cycle progression (through stages 2 - N), more mutants within the pool will be lost as the disrupted genes become critical for survival. This means that, even without reducing the population with selective pressure (e.g. drug), the set of mutants within the pool that can be characterised is not complete across the life cycle. Top right panel: In conditional regulation systems, the means of downregulation are integrated (e.g. the auxin tag for the AID system) following transfection. Of note, it is likely that a few candidates will not tolerate the tag and will be lost from the population. As downregulation can be induced across the life cycle, all mutants within the population can be characterised and none are lost due to prior stage essentiality. Bottom panel: After generation of mutants, many functional screens rely on reductive assays to select mutants with a specific phenotype (e.g. drug resistance). This is followed by further candidate prioritisation before in-house phenotyping of a small number of mutants, often re-derived as conditionally regulatable knockdowns to allow characterisation throughout the life cycle. If pools of mutants are instead characterised by high-throughput imaging, they can be classified based on tagged protein localisation or mutant phenotype. Classification of mutants allows for in-house phenotyping, while open access data sharing distributes follow-up studies throughout the field and improves equitability. Figure created with BioRender.com.
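The contrast drawn in the top panels of Figure 8 can be made concrete with a toy model. The sketch below assumes a hypothetical pool of five mutants, each annotated with the first life cycle stage at which the disrupted gene is essential; the gene names and stage numbers are invented, and the loss of tag-intolerant candidates in conditional systems is ignored.

```python
# Hypothetical pool: gene -> first stage (1-based) at which it is essential,
# or None if the gene is dispensable throughout the life cycle.
ESSENTIAL_AT = {"geneA": 1, "geneB": 2, "geneC": 3, "geneD": None, "geneE": 2}

def phenotypable(essential_at, stage, conditional):
    """Return the mutants that can still be characterised at `stage`.

    Non-conditional depletion: a mutant is lost from the pool once the
    life cycle reaches the stage where its gene is essential.
    Conditional depletion: knockdown is induced only at the stage being
    assayed, so no mutant is lost to essentiality at an earlier stage.
    """
    if conditional:
        return sorted(essential_at)
    return sorted(g for g, s in essential_at.items() if s is None or s > stage)

pool_by_stage = {s: phenotypable(ESSENTIAL_AT, s, conditional=False) for s in (1, 2, 3)}
conditional_pool = phenotypable(ESSENTIAL_AT, 3, conditional=True)
```

In this toy pool the non-conditional screen shrinks from four characterisable mutants at stage 1 to one at stage 3, whereas the conditional screen retains all five.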
Figure 9
Microscopy usage in parasitology. (A) Timeline of key developments allowing high-throughput bioluminescent and fluorescent studies. The left side of the timeline shows the first transgenic bioluminescent lines created for each parasite (Plasmodium spp., Toxoplasma gondii, Cryptosporidium spp., Trypanosoma spp., and Leishmania spp.), as well as the first luminescence-based high-throughput screen (BLS) performed for each parasite. The right side shows the first generation of fluorescent reporter lines for each parasite, and the first use of high-throughput fluorescence imaging, including the use of high content imaging (HCI), ImageStream, and proximity-dependent labelling (PDL). (B) Top section shows studies using bioluminescent reporter parasite lines, specifying the percentage used in high-throughput screens (HTS) for Apicomplexans (Plasmodium spp., Toxoplasma gondii and Cryptosporidium spp.) and Kinetoplastids (Leishmania spp. and Trypanosoma spp.). Bottom section shows the proportion of ‘omics’ and high-throughput screens using no imaging, low/medium throughput imaging (LT/MT), or high throughput imaging (HT). A PubMed search was performed for each genus, for all ‘omics’ methods, and each was explored to determine usage and throughput of microscopy. Figure created with BioRender.com.
Figure 10
Microscopy current contributions and future directions. Multiple imaging modalities, including electron microscopy, fluorescence microscopy, bioluminescence, and force nanoscopy, have been extensively used in Apicomplexan and Kinetoplastid research. Efforts on technology development have led to the generation of hybrid imaging platforms (e.g. combining electron and fluorescence microscopy in CLEM); the integration of cell culture, microfluidics, and bioengineering advances (e.g. organoids and organs-on-chip, consistent with animal replacement and reduction) with imaging methods; the integration of 3D printing and robotics for the generation of versatile imaging platforms, including high-content imaging; and the integration of artificial intelligence for image analysis. Many of these improvements are consistent with a philosophy of open science, and have facilitated data-sharing and the creation of low-cost complex imaging equipment. Most of these have already been incorporated into parasitology research, but not in high-throughput modalities. Efforts towards increasing the throughput of conventionally low-throughput techniques are shown in the left column, and include autonomous imaging (whereby user input is not required, reducing human resource demands), miniaturization and parallelization, multiplexing, and faster acquisition methods. Equally, a bottleneck for microscopy-based research is image analysis. Incorporation of artificial intelligence and machine learning, and integration with open databases and open code, are promising fields for parasitology. Together, these various elements will likely play a major role in the integration of imaging into ‘omics’ studies to understand parasite biology. Figure created with BioRender.com.
Figure 11
‘Omics’ technologies in parasitology: challenges and future directions. Genomics, transcriptomics and proteomics (as well as other ‘omics’ technologies) have allowed us to study a plethora of questions in parasitology. While bulk-based ‘omics’ have contributed greatly to parasitology, single cell-based ‘omics’ approaches have highlighted unexpected heterogeneity in parasite populations. Given the complex life cycles of Apicomplexan and Kinetoplastid parasite species, understanding the interconnection between parasite and host is key. ‘Omics’ technologies that allow the simultaneous investigation of parasite and host will continue to play important roles in understanding host-pathogen interactions, including topics of current major interest such as tissue tropism; immune evasion; parasite latency, dormancy, and persistence; and parasite-host circadian rhythms, among others. Functional screens based on genome mutagenesis, RNA regulation and protein regulation have become vital tools for investigating parasite biology. Together with auxiliary technologies, such as high-throughput imaging, functional screens are powerful tools for parasitology. Current advances in microscopy are allowing valuable low-throughput techniques (e.g. super-resolution and electron microscopy) to be adapted for higher throughput. Together with the incorporation of robotics and artificial intelligence, these valuable tools could become suitable for integration into ‘omics’ research and functional screens. Several challenges remain for ‘omics’ in parasitology, including experimental, technological, and overall challenges. Among the latter is the need for better data annotation and integration. We envisage that within-omics data integration, multi-omics data integration, and multi-disciplinary data integration will be vital for improved and increased use of available and future ‘omics’ data. Equally, funding and dedicated personnel for the creation, maintenance, annotation, curation and update of available data are vital. A successful transition in this respect will enable improved collaborative science and help address novel and outstanding biological questions. Figure created with BioRender.com.

References

    1. Abecasis G. R., Altshuler D., Auton A., Brooks L. D., Durbin R. M., Gibbs R. A., et al. (2010). A Map of Human Genome Variation From Population-Scale Sequencing. Nature 467, 1061–1073. doi: 10.1038/nature09534
    2. Abrahamsen M. S., Templeton T. J., Enomoto S., Abrahante J. E., Zhu G., Lancto C. A., et al. (2004). Complete Genome Sequence of the Apicomplexan, Cryptosporidium Parvum. Science 304, 441–445. doi: 10.1126/science.1094786
    3. Adams C. L., Sjaastad M. D. (2009). Design and Implementation of High-Content Imaging Platforms: Lessons Learned From End User-Developer Collaboration. Comb. Chem. High Throughput Screen. 12, 877–887. doi: 10.2174/138620709789383240
    4. Adaui V., Kröber-Boncardo C., Brinker C., Zirpel H., Sellau J., Arévalo J., et al. (2020). Application of CRISPR/Cas9-Based Reverse Genetics in Leishmania Braziliensis: Conserved Roles for HSP100 and HSP23. Genes (Basel) 11 (10), 1159. doi: 10.3390/genes11101159
    5. Adil A., Kumar V., Jan A. T., Asger M. (2021). Single-Cell Transcriptomics: Current Methods and Challenges in Data Acquisition and Analysis. Front. Neurosci. 15. doi: 10.3389/fnins.2021.591122
