BMC Genomics. 2016 Mar 31:17:264.
doi: 10.1186/s12864-016-2590-9.

An evolutionary roadmap to the microtubule-associated protein MAP Tau

Frederik Sündermann et al. BMC Genomics. 2016.

Abstract

Background: The microtubule associated protein Tau (MAPT) promotes assembly and interaction of microtubules with the cytoskeleton, impinging on axonal transport and synaptic plasticity. Its neuronal expression and intrinsic disorder implicate it in some 30 tauopathies such as Alzheimer's disease and frontotemporal dementia. These pathophysiological studies have yet to be complemented by computational analyses of its molecular evolution and structural models of all its functional domains to explain the molecular basis for its conservation profile, its site-specific interactions and the propensity to conformational disorder and aggregate formation.

Results: We systematically annotated public sequence data to reconstruct unspliced MAPT, MAP2 and MAP4 transcripts spanning all represented genomes. Bayesian and maximum likelihood phylogenetic analyses, genetic linkage maps and domain architectures distinguished a nonvertebrate outgroup from the emergence of MAP4 and its subsequent ancestral duplication to MAP2 and MAPT. These events were coupled to other linked genes such as KANSL1L and KANSL1 and may thus be consequent to large-scale chromosomal duplications originating in the extant vertebrate genomes of hagfish and lamprey. Profile hidden Markov models (pHMMs), clustered subalignments and 3D structural predictions defined potential interaction motifs and specificity determining sites to reveal distinct signatures between the four homologous microtubule binding domains and independent divergence of the amino terminus.

Conclusion: These analyses clarified ambiguities of MAPT nomenclature, defined the order, timing and pattern of its molecular evolution and identified key residues and motifs relevant to its protein interaction properties and pathogenic role. Additional unexpected findings included the expansion of cysteine-containing, microtubule binding domains of MAPT in cold adapted Antarctic icefish and the emergence of a novel multiexonic saitohin (STH) gene from repetitive elements in MAPT intron 11 of certain primate genomes.

Keywords: Domain architecture; Gene phylogeny; Microtubule associated protein Tau (MAPT protein; MAPT gene); Microtubule binding domain; Molecular evolution; Profile hidden Markov models; Saitohin (STH); Structure-function prediction.

Figures

Fig. 1
Gene organization and transcript variants of human MAPs with tau-like microtubule binding domains. a Chromosomal loci and genetic linkage maps of human MAPT, MAP2 and MAP4, including MAPT-IT1 and STH within the MAPT gene. There is also evidence that segmental chromosome duplications 17 ↔ 2 and 17 ↔ 3 formed the paralogous gene pairs MAPT-KANSL1 and MAP2-KANSL1L (see text). b Official gene names and sizes identify the graphic outlines of their respective exon distributions. MAPT intron 1 contains MAPT-IT1 (intronic transcript 1, long non-coding RNA), intron 11 contains the saitohin gene (STH) encoding a single open reading frame and peptide, while numerous remaining non-coding regulatory RNAs and repetitive elements in other introns are not annotated here. c Alternatively spliced transcripts of 8 human MAPT isoforms are identified by formal and familiar terminology showing size distributions of untranslated and coding (grey-filled) exons. The descriptive summary of protein products corresponds to different alternatively spliced exons produced by skipping of one or more exons 3, 4, 6, 8, 10 and 12 in different cell types and conditions; note that a previous, non-standard nomenclature restricted to the 6 brain isoforms started numbering at 1 for the first “coding” exon 2 while exon 4A was designated for exon 6 leading to an apparent maximum exon number of 13 instead of the true 15 [12]. The 4 MTBDs are marked at the top, the second one in color to denote the possible splicing out of exon 12. Known and predicted phosphorylation sites are identified by the ball and stick symbol above MAPT isoform 1. Note that the underlined 6 protein isoforms (v2, v3, v4, v5, v7, v8) are expressed in the central nervous system. Experimental evidence for the expression of exon 10 in humans is still lacking (NCBI BLASTN human RefSeq transcripts). A schematic representation showing the functional organization of tau is displayed on top. d Human MAP2 and MAP4 coding (grey-filled), non-coding and alternatively spliced exons (red numbers) are shown to characterize protein isoforms and localize MAP domains, including the exon splicing affecting the second MTBD.
Fig. 2
a Bayesian consensus phylogenetic tree of the MAPT/MAP2/MAP4 family. Putative homologous proteins of MAPT, MAP2 and MAP4 were retrieved from the NCBI-GenPept and UniProt databases and either completed or reconstructed ab initio by manual curation from BLAST and HMMER comparisons of genome assembly and coding transcript sequences. Full-length proteins representing the full species range for each vertebrate subfamily and a nonvertebrate outgroup were aligned (1947 aa from 102 species) and analyzed to consensus with ExaBayes on the Hanover supercomputer. Posterior probabilities and ML bootstrap percentage confidence values (in brackets) for the branching topology are shown at the nodes and branch lengths (SBL 63.9) are proportional to the amount of evolution along the horizontal scale (non-linear time). The branching topology was well supported and conformed to the known species divergence order identified by taxon symbols and descriptive labels. b Protein domain architectures (MTBD) representative of the vertebrate subfamilies were observed to contrast with various nonvertebrate homologs included in the phylogenetic analysis
Fig. 3
a pHMM sequence logo of the 4 microtubule binding domains in the 3 protein subfamilies MAPT, MAP2 and MAP4. More than 1200 individual MTBDs were aligned to build a pHMM and saved as sequence logo in scalable vector graphics format. The interpretation of amino acid distributions and column heights is summarized in Fig. 4 legend. Those sites characterized by Z-scores from SDPpred as having distinct but conserved aa between the 4 different MTBDs in 3 paralogous subfamilies of a subclassified alignment of the 1200 domains are shaded and starred as “specificity determining positions” responsible for functional divergence. b The 12 individual subfamily logos enable a direct comparison of all MTBD molecular profiles. The aa replacement of Ile/Val for Cys in the core tubulin binding motif “KCGS” of domains 2 and 3 was the most significant (Z-score 4.64 in A) specificity determining site (starred) and the deletion at position 12 in domains 1–3 differentiates these from domain 4. c SDP sequence logos of 5 sites in MTBD subalignments of 100+ proteins each with the highest Z-scores from SDPpred analysis. d Maximum likelihood analysis of the 12 MTBD subalignments of 100+ proteins each (using RAxML, WAG substitution model, 100 bootstrap pseudoalignments and gamma rates with alpha = 1.3). The point of separation of domains 1 and 4 from 2 and 3 was based on a midpoint root reflecting the evolutionary relatedness of these domain pairs. Modest bootstrap values were a consequence of the short, 33-aa sequence length and the triangle fans represent 100+ species orthologs for each MTBD category. e The influence of extreme cold adaptation on MAPT in the Antarctic rockcod Notothenia coriiceps was determined by reconstructing the transcript and deduced protein sequence from the corresponding genome assembly (gb:KL666590.1). The results showed 7 sequential MTBDs identified by their match score E-value to individual pHMMs (B above) with a 4-fold tandem duplication of MTBD 2 containing the typical central Cys preceded by an aa replacement from Lys to Arg.
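The Fig. 3 legend describes a core tubulin binding motif whose second residue is Cys in some domains and Ile or Val in others. As a rough illustration only, the sketch below scans a protein string for that four-residue pattern with a regular expression. The function name `find_core_motifs` and the toy sequence are hypothetical, and the authors identified domains with pHMMs, not with a regex like this.

```python
import re

# Pattern for the core motif named in the legend ("KCGS"), allowing the
# Ile/Val replacement of Cys. This exact pattern is an assumption made
# for illustration; it is not the method used in the paper.
CORE_MOTIF = re.compile(r"K[CIV]GS")

def find_core_motifs(sequence):
    """Return (0-based start, matched motif) for every core motif hit."""
    return [(m.start(), m.group()) for m in CORE_MOTIF.finditer(sequence.upper())]

if __name__ == "__main__":
    toy = "AAKCGSAAKIGSAA"  # hypothetical toy sequence, not real MAPT
    for start, motif in find_core_motifs(toy):
        kind = "Cys-containing" if motif[1] == "C" else "Ile/Val variant"
        print(start, motif, kind)
```

A simple pattern match like this only flags candidate sites; it carries none of the position-specific scoring that a profile HMM provides.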
Fig. 4
a Alignment pHMM logo for coding exons from full-length vertebrate MAPT homologs. The profile was reconstructed by SKYLIGN from a hidden Markov model based on a protein alignment of 117 orthologs validated by phylogenetic analysis. Exon numbers and lengths in amino acids and nucleotides are indicated with intron insertion phase numbers at exon junctures. Each site shows the relative proportion of 20 possible amino acids above background level (observed or hidden) and the total column height reflects the information content at each site, based on the overall conservation level imposed by functional constraint. Known MTBDs in exons 11–14 are grey shaded and sites inferred to possess some functional role (known or unknown) are shaded in red to emphasize their elevated column height conservation. The legend at the lower right summarizes other documented sequence features such as nonsynonymous population variants (inverted triangles), post-translational modifications (P-phosphorylation or A-acetylation) and specificity determining Cys residues (yellow stars). b Mammalian subHMM matches projected on a linear 776 aa human MAPT sequence. Each class is represented as a ribbon plot and each subHMM match is color-coded by the HHsearch match probability. The position of the match is indicated relative to the human sequence. The functional organization of tau is indicated on top similar to Fig. 1c. The single subHMMs are indicated in Roman numbers and the pHMM logos are displayed in Additional file 4: Figure S4
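The Fig. 4 legend states that total column height in the logo reflects the information content at each site. A minimal sketch of that idea follows, assuming a uniform background over the 20 amino acids and plain observed frequencies; SKYLIGN offers several height calculations and works from the HMM itself, so this is not a reproduction of the published logo. The function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_information(column):
    """Information content in bits of one alignment column.

    Assumes a uniform background over 20 amino acids; gaps and any
    non-standard symbols are ignored.
    """
    residues = [aa for aa in column.upper() if aa in AMINO_ACIDS]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(AMINO_ACIDS)) - entropy

def letter_heights(column):
    """Split the column height among residues by their observed frequency."""
    residues = [aa for aa in column.upper() if aa in AMINO_ACIDS]
    if not residues:
        return {}
    info = column_information(column)
    total = len(residues)
    return {aa: (n / total) * info for aa, n in Counter(residues).items()}

if __name__ == "__main__":
    # Toy columns: one fully conserved, one split between two residues.
    print(column_information("CCCCCC"))
    print(letter_heights("IIIVVV"))
```

Under these assumptions a fully conserved column reaches the maximum of log2(20) bits, while a column with all 20 residues at equal frequency has a height of zero.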
Fig. 5
Display of exon structure, conservation and biophysical properties of MAPT on a potential structural model of human MAPTv6. Exon distribution (a) with red-boxed inset showing a MAPT fragment (602–647 in 776 aa isoform 6) that adopts a more stable helical structure when bound to tubulin [6], pdb:2MZ7; site-specific evolutionary conservation calculated by CONSURF (b); surface maps of hydrophobicity by CHIMERA (c); and surface electric potential from APBS (d) are shown. The predicted model with highest confidence score from I-TASSER was reconstructed by threading template fragments from the Protein Data Bank and ab initio modeling with consideration of steric constraints and low free-energy state. Note that MAPT is an intrinsically disordered protein without fixed constraints on 3D crystal or solution structure [18], so this model is intended mainly as a display platform for the protein physicochemical properties.
Fig. 6
Saitohin (STH) gene locus in MAPT 5’intron 11. a Genomic repetitive element distribution in human MAPT intron 11 showing overlap of the saitohin main ORF with an L2c LINE2 element and an LTR1 long terminal repeat element predicted by REPEATMASKER (http://www.repeatmasker.org/). b An expanded view of the saitohin main ORF designated exon 4 here, overlapping with L2c and LTR1 elements within MAPT 5’intron 11, together with discontiguous, putative exons 1–3 and 5 encoding possible N- and C-terminal extensions. c Saitohin extended amino acid graphic alignment by TBLASTN for the 22 primate genomes listed. d A saitohin full-length pHMM reflects the relative conservation of individual amino acids as their probable frequency (letter size) while the information content or functional potential is reflected in total column height, exemplified by the conserved RGE motif marked in red highlight. This recognized exon 4 ORF is complete only for 7 Catarrhine apes, while N- or C-terminal ORF extensions with an upstream Met or downstream Stop codon have been observed for Macaca fascicularis, Papio anubis and Pongo abelii. Missense or nonsense mutations disrupt the main ORF in various species, although recent RNA Seq data may rectify the true genomic sequences to yield a longer translated protein in some cases (see text). e Saitohin protein 3D model (242 aa) predicted by I-TASSER with the confirmed single exon 4 ORF highlighted in blue, the disease-associated SNP “Q7R” at position 89 and an exposed RGE motif at positions 200–202

References

    1. Derisbourg M, Leghay C, Chiapetta G, Fernandez-Gomez FJ, Laurent C, Demeyer D, Carrier S, Buée-Scherrer V, Blum D, Vinh J, Sergeant N, Verdier Y, Buée L, Hamdane M. Role of the Tau N-terminal region in microtubule stabilization revealed by new endogenous truncated forms. Sci Rep. 2015;5:9659. doi: 10.1038/srep09659.
    1. Dehmelt L, Halpain S. The MAP2/Tau family of microtubule-associated proteins. Genome Biol. 2005;6:1–10.
    1. Fichou Y, Heyden M, Zaccai G, Weik M, Tobias DJ. Molecular dynamics simulations of a powder model of the intrinsically disordered protein Tau. J Phys Chem B. 2015; doi: 10.1021/acs.jpcb.5b05849.
    1. Li XH, Culver JA, Rhoades E. Tau binds to multiple tubulin dimers with helical structure. J Am Chem Soc. 2015;Jul 12. doi: 10.1021/jacs.5b04561.
    1. Wright PE, Dyson HJ. Intrinsically disordered proteins in cellular signalling and regulation. Nat Publ Gr. 2015;16:18–29.