Review
Trends Biochem Sci. 2017 Feb;42(2):98-110.
doi: 10.1016/j.tibs.2016.08.008. Epub 2016 Oct 3.

Alternative Splicing May Not Be the Key to Proteome Complexity

Michael L Tress et al. Trends Biochem Sci. 2017 Feb.

Abstract

Alternative splicing is commonly believed to be a major source of cellular protein diversity. However, although many thousands of alternatively spliced transcripts are routinely detected in RNA-seq studies, reliable large-scale mass spectrometry-based proteomics analyses identify only a small fraction of annotated alternative isoforms. The clearest finding from proteomics experiments is that most human genes have a single main protein isoform, while those alternative isoforms that are identified tend to be the most biologically plausible: those with the most cross-species conservation and those that do not compromise functional domains. Indeed, most alternative exons do not seem to be under selective pressure, suggesting that a large majority of predicted alternative transcripts may not even be translated into proteins.

Keywords: RNA-seq; alternative splicing; dominant isoforms; functional isoforms; homology; proteomics.

Figures

Figure 1. Types of alternative isoforms.
This figure presents three types of alternative variants defined using the gene SLC25A3, a mitochondrial phosphate carrier protein. In each case we show the effect at the transcript level and at the protein level. (A) Homologous exons. Above, schema of variant SLC25A3–005, which is generated from variant SLC25A3–001 via the substitution of exon 2a (black) by exon 2b (orange). The differing protein sequences are shown in the alignment below the transcript-level comparison. Middle, example spectra for the two peptides that identify the two different alternative isoforms. Below, the likely effect on protein structure (shown in two views) for the similar gene SLC25A4 (PDB code: 1okc); residues that differ between the two isoforms are shown as orange sticks. The change to the structure and function is likely to be comparatively subtle: no residues are lost and most of the changes are found on the outside of the pore. (B) Non-homologous substitution. Above, schema of variant SLC25A3–015, which is generated from variant SLC25A3–001 via the substitution of exon 3 (the longer alternative exon is in red). Below, the likely effect on protein structure shown in two different views; residues that would be lost in the alternative isoforms are shown in red. (C) Indels. Above, schema of variant SLC25A3–002, which is generated from variant SLC25A3–001 via the skipping of exon 6 (green). Below, the likely structural effect of this loss of 28 amino acids is shown in two different views; residues that would be lost in the alternative isoforms are shown in green. The deletion would remove the base of the pore and parts of two different trans-membrane helices, meaning that the trans-membrane sections would have to completely refold. Images generated with the PyMOL Molecular Graphics System, Version 1.8, Schrödinger, LLC.
Figure 2. Coincidence between main proteomics isoforms and other reference isoforms.
The percentage of genes in which the reference isoform for a gene agreed with the main proteomics isoform calculated from the proteomics experiments [36]. The comparison was made over all 5,011 genes from the same proteomics study for the longest isoform, over a subset of 3,331 genes with CCDS unique isoforms [41] for the CCDS comparison, over a subset of 4,186 genes with principal isoforms for the APPRIS comparison [43], and over a subset of 1,038 genes with five-fold dominant transcripts across all tissues for the RNA-seq comparison [37]. The Highest Connected Isoform comparison was made using data from the paper that introduced the method [42]. A random selection of isoforms would have agreed with the main proteomics isoform 46% of the time.
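The agreement percentages and the random baseline described in this caption can be illustrated with a minimal Python sketch. It assumes, hypothetically, that the random baseline is the mean of 1/k over genes, where k is the number of annotated isoforms per gene; the caption does not say how the 46% figure was derived. The gene names, isoform labels, and counts below are invented toy values, not data from the study.

```python
from fractions import Fraction


def observed_agreement(reference, main):
    """Fraction of shared genes whose reference isoform equals the main proteomics isoform."""
    shared = reference.keys() & main.keys()
    hits = sum(reference[gene] == main[gene] for gene in shared)
    return Fraction(hits, len(shared))


def random_agreement(isoform_counts):
    """Expected agreement if one isoform per gene were picked uniformly at random.

    Assumption: each gene contributes 1/k, where k is its number of isoforms.
    """
    return sum(Fraction(1, k) for k in isoform_counts) / len(isoform_counts)


# Hypothetical toy inputs (not from the paper).
reference = {"GENE_A": "iso1", "GENE_B": "iso2", "GENE_C": "iso1"}
main = {"GENE_A": "iso1", "GENE_B": "iso1", "GENE_C": "iso1"}

print(float(observed_agreement(reference, main)))
print(float(random_agreement([2, 2, 4, 1])))
```

Exact fractions are used so that the toy results can be checked by hand before converting to percentages.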
Figure 3. Solved crystal structures for two pairs of MS-detected alternative isoforms.
Solved protein structures for alternative isoforms that differ by substitution of homologous exons. In each figure one isoform is coloured orange and the other blue. The region coded by the homologous exons is shown in light blue and light orange. (A) Pyruvate kinase isoforms M1 and M2 [53]; those residues that differ in the alternative isoform are shown as sticks. The two structures (PDB codes 1srf and 1srd) are practically identical; the largest differences are in a loop from the substituted region (bottom right) and in the loop region where the M2 isoform binds the fructose bisphosphate substrate and the M1 isoform does not (top right). (B) “Central” and “peripheral” isoforms of ketohexokinase [54]. Both isoforms bind the substrate fructose; the homologous exon substitution affects the substrate-binding site; the two residues that differ in the site are shown as blue and grey sticks. The peripheral isoform does not bind fructose as strongly as the central isoform; the change in binding residues may mean that the peripheral isoform has a different substrate.
Figure 4. Genome-wide distribution of sequence variants in principal and alternative isoforms.
The ratio of non-synonymous to synonymous variants (A) and the percentage of high-impact variants (B) shown for three sets of protein-coding sites: Alternative, those sites that fall inside exons belonging exclusively to alternative variants (895,887 sites in total); APPRIS, those sites from exons that code for APPRIS main isoforms [43] and not for alternative isoforms (4,732,523 sites); and Intersection, those sites that fall inside exons that code for both alternative variants and APPRIS main isoforms (10,792,735 sites). Each ratio was calculated for both rare and common allele frequencies identified from phase 3 of the 1000 Genomes Project [47] (the boundary between rare and common was set at an allele count of 25, corresponding to an allele frequency of 0.005). High-impact variants defined by VEP [55] were splice acceptor variants, splice donor variants, stop gains, stop losses and frameshift variants.
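The rare/common boundary and the non-synonymous to synonymous ratio in this caption can be worked through with a short Python sketch. It assumes the 1000 Genomes phase 3 panel of 2,504 individuals (5,008 alleles), which makes an allele count of 25 correspond to a frequency of roughly 0.005. Treating a count of exactly 25 as common is an assumption, since the caption does not say which side the boundary falls on, and the variant list is hypothetical toy data.

```python
N_INDIVIDUALS = 2504           # assumed 1000 Genomes phase 3 sample size
N_ALLELES = 2 * N_INDIVIDUALS  # diploid genomes: 5,008 alleles
BOUNDARY_COUNT = 25            # rare/common boundary from the caption


def allele_frequency(count, n_alleles=N_ALLELES):
    """Convert an allele count into an allele frequency."""
    return count / n_alleles


def frequency_class(count, boundary=BOUNDARY_COUNT):
    """Label a variant as rare or common (assumption: count == boundary is common)."""
    return "rare" if count < boundary else "common"


def nonsyn_to_syn_ratio(variants):
    """Ratio of non-synonymous to synonymous variants per frequency class.

    `variants` is a list of (consequence, allele_count) pairs.
    """
    tallies = {"rare": [0, 0], "common": [0, 0]}  # [non-synonymous, synonymous]
    for consequence, count in variants:
        index = 0 if consequence == "nonsynonymous" else 1
        tallies[frequency_class(count)][index] += 1
    return {cls: (ns / s if s else None) for cls, (ns, s) in tallies.items()}


# Hypothetical toy variants (not data from the study).
toy_variants = [
    ("nonsynonymous", 3), ("synonymous", 1), ("nonsynonymous", 2),
    ("nonsynonymous", 40), ("synonymous", 100), ("synonymous", 30),
]

print(round(allele_frequency(BOUNDARY_COUNT), 4))
print(nonsyn_to_syn_ratio(toy_variants))
```

In the figure, this kind of ratio is computed separately for the Alternative, APPRIS and Intersection site sets; the sketch shows only the per-class calculation for a single set.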

References

    1. Harrow J et al. (2012) GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 760–774
    2. Sánchez-Pla A et al. (2012) Transcriptomics: mRNA and alternative splicing. J. Neuroimmunol. 248, 23–31
    3. Uhlén M et al. (2015) Proteomics. Tissue-based map of the human proteome. Science 347, 1260419
    4. Juntawong P et al. (2012) Translational dynamics revealed by genome-wide profiling of ribosome footprints in Arabidopsis. Proc. Natl. Acad. Sci. U.S.A. 111, E203–E212
    5. Mollet IG et al. (2010) Unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome. Nucleic Acids Res. 38, 4740–4754
