Review
Int J Mol Sci. 2019 Aug 6;20(15):3837.
doi: 10.3390/ijms20153837.

Retrotransposons in Plant Genomes: Structure, Identification, and Classification through Bioinformatics and Machine Learning

Simon Orozco-Arias et al. Int J Mol Sci. 2019.

Abstract

Transposable elements (TEs) are genomic units able to move within the genome of virtually all organisms. Because of their repetitive nature and high structural diversity, identifying and classifying TEs in sequenced genomes remains a challenge. Although TEs were initially regarded as "junk DNA", they have been shown to play key roles in chromosome structure, gene expression and regulation, and adaptation and evolution. Highly reliable annotation of these elements is therefore crucial to better understand genome function and evolution. To date, many bioinformatics tools have been developed for TE detection and classification, but problems remain with the reliability, precision, and speed of the analyses. Machine learning and deep learning algorithms can make automatic predictions and decisions in a wide variety of scientific applications. They have been tested in bioinformatics and, more specifically, in TE classification, with encouraging results. In this review, we discuss important aspects of TEs, such as their structure, their importance in the evolution and architecture of the host, and their current classifications and nomenclatures. We also address current methods for identifying and classifying TEs, along with their limitations.

Keywords: bioinformatics; classification; deep learning; detection; function; machine learning; retrotransposons; structure; transposable elements.

Conflict of interest statement

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Figures

Figure 1. Structure of LTR retrotransposon. The env gene might not be present in some elements. Orange arrows correspond to LTRs.

Figure 2. Structure of non-autonomous elements. Orange arrows correspond to LTRs and single lines correspond to non-coding regions. PBS: primer binding site; PPT: Poly-Purine Tract; TRIM: Terminal-Repeat Retrotransposons in Miniature; LARD: LArge Retrotransposon Derivatives.

Figure 3. Structure of non-LTR retrotransposons.

Figure 4. Structure of Penelope-like elements (PLEs).

Figure 5. Structure of Dictyostelium intermediate repeat sequences (DIRS).

Figure 6. Classification of TEs following Rexdb and GyDB nomenclature. Adapted from [26].

Figure 7. Accuracy of machine learning (ML) algorithms tested for TE identification and classification problems. A Neural Network and Ridor were used for only one problem. Adapted from Loureiro et al. [170].

References

    1. Mita P., Boeke J.D. How Retrotransposons shape genome regulation. Curr. Opin. Genet. Dev. 2016;37:90–100. doi: 10.1016/j.gde.2016.01.001.
    2. Keidar D., Doron C., Kashkush K. Genome-wide analysis of a recently active retrotransposon, Au SINE, in wheat: Content, distribution within subgenomes and chromosomes, and gene associations. Plant Cell Rep. 2018;37:193–208. doi: 10.1007/s00299-017-2213-1.
    3. Ou S., Chen J., Jiang N. Assessing genome assembly quality using the LTR assembly index (LAI). Nucleic Acids Res. 2018;46:1–11. doi: 10.1093/nar/gky730.
    4. Mustafin R.N., Khusnutdinova E.K. The role of transposons in epigenetic regulation of ontogenesis. Russ. J. Dev. Biol. 2018;49:61–78. doi: 10.1134/S1062360418020066.
    5. Muszewska A., Hoffman-Sommer M., Grynberg M. LTR Retrotransposons in Fungi. PLoS ONE. 2011;6:12. doi: 10.1371/journal.pone.0029425.
