Cell Cycle. 2009 Jun 1;8(11):1698-710.
doi: 10.4161/cc.8.11.8580. Epub 2009 Jun 27.

Prediction of novel families of enzymes involved in oxidative and other complex modifications of bases in nucleic acids

Lakshminarayan M Iyer et al. Cell Cycle.

Abstract

Modified bases in nucleic acids present a layer of information that directs biological function over and beyond the coding capacity of the conventional bases. While a large number of modified bases have been identified, many of the enzymes generating them still remain to be discovered. Recently, members of the 2-oxoglutarate- and iron(II)-dependent dioxygenase superfamily, which modify diverse substrates from small molecules to biopolymers, were predicted and subsequently confirmed to catalyze oxidative modification of bases in nucleic acids. Of these, two distinct families, namely the AlkB and the kinetoplastid base J-binding proteins (JBP), catalyze in situ hydroxylation of bases in nucleic acids. Using sensitive computational analysis of sequences, structures and contextual information from genomic structure and protein domain architectures, we report five distinct families of 2-oxoglutarate- and iron(II)-dependent dioxygenases that we predict to be involved in nucleic acid modifications. Among the DNA-modifying families, we show that the dioxygenase domains of the kinetoplastid base J-binding proteins belong to a larger family that includes the Tet proteins, prototyped by the human oncogene Tet1, and proteins from basidiomycete fungi, chlorophyte algae, heterolobosean amoeboflagellates and bacteriophages. We present evidence that some of these proteins are likely to be involved in oxidative modification of the 5-methyl group of cytosine leading to the formation of 5-hydroxymethylcytosine. The Tet/JBP homologs from basidiomycete fungi such as Laccaria and Coprinopsis show large lineage-specific expansions and a tight linkage with genes encoding a novel and distinct family of predicted transposases, and a member of the Maelstrom-like HMG family. We propose that these fungal members are part of a mobile transposon.
To the best of our knowledge, this is the first report of a eukaryotic transposable element that encodes its own DNA-modification enzyme with a potential regulatory role. Through a wider analysis of other poorly characterized DNA-modifying enzymes, we also show that the phage Mu Mom-like proteins, which catalyze the N6-carbamoylmethylation of adenines, are also linked to diverse families of bacterial transposases, suggesting that DNA modification by transposable elements might have a more general presence than previously appreciated. Among the other families of 2-oxoglutarate- and iron(II)-dependent dioxygenases identified in this study, one, which is found in algae, is predicted to comprise mainly RNA-modifying enzymes and shows a striking diversity in protein domain architectures, suggesting the presence of RNA modifications with possibly unique adaptive roles. The results presented here are likely to provide the means for future investigation of unexpected epigenetic modifications, such as hydroxymethylcytosine, that could profoundly impact our understanding of gene regulation and processes such as DNA demethylation.

Figures

Figure 1
Multiple alignment of selected examples of the newly predicted families of the nucleic-acid-modifying 2-oxoglutarate- and iron(II)-dependent dioxygenase superfamily. Protein sequences are represented by their gene names, species names and GenBank index numbers (where available). Temporary gene names were assigned for predicted proteins from Naegleria, Aureococcus, Daphnia and Micromonas. The full-length protein sequences from these are available in the Supplementary material. The coloring scheme and consensus abbreviations are shown in the key. Family names are shown to the right of the alignment. The distinct inserts of the TET/JBP and the AlkB families are shown within boxes. The key conserved residues defining the 2OGFeDO protein have been marked below the alignment. The consensus secondary structure derived from crystal structures of characterized members of the superfamily is shown above.
Figure 2
Representative domain architectures of the newly identified versions of nucleic-acid-modifying 2OGFeDO proteins. Architectures are arranged by their phylogenetic and family affinities, and are labeled by their gene and species names. Domain architectures within a family are boxed. Domains are typically denoted by their standard names. Non-standard domain nomenclatures are clarified in the inset key at the bottom of the figure. Operons are shown as arrows where the arrow head points from the 5′ to the 3′ direction of the coding frame of the gene. Gene neighborhoods are labeled by the gene coding for the 2OGFeDO protein.
Figure 3
Genomic organization and domain architectures of predicted transposons encoding DNA-modifying enzymes. Genes are depicted as arrows with the arrow head pointing from the 5′ to the 3′ direction of the coding sequence. Gene neighborhoods of the predicted transposase are typically labeled with the gene name of the 2OGFeDO-containing protein, the species name and the gi number. In potential fragmentary elements where the 2OGFeDO is absent, the gene neighborhood is labeled with the gene name of the predicted transposase-containing gene. The key at the bottom of the figure explains non-standard domain and gene names, while other gene and domain names are as commonly used in the literature.
Figure 4
Multiple alignment of the proposed catalytic domain of the transposase of the novel predicted transposon encoding 2OGFeDO proteins. Protein sequences are represented by their gene names, species names and GenBank index numbers. The predicted secondary structure is shown above the alignment. The coloring scheme and consensus abbreviations are shown in the key. Conserved residues defining the catalytic site of the predicted transposase are marked below the alignment.
