Dissecting the role of low-complexity regions in the evolution of vertebrate proteins
- PMID: 22920595
- PMCID: PMC3523016
- DOI: 10.1186/1471-2148-12-155
Abstract
Background: Low-complexity regions (LCRs) in proteins are tracts that are highly enriched in one or a few amino acids. Given their high abundance, and their capacity to expand in relatively short periods of time through replication slippage, they can contribute greatly to expanding protein sequence space and generating novel protein functions. However, little is known about the global impact of LCRs on protein evolution.
Results: We have traced back the evolutionary history of 2,802 LCRs from a large set of homologous protein families from H. sapiens, M. musculus, G. gallus, D. rerio and C. intestinalis. Transcription factors and other regulatory functions are overrepresented among proteins containing LCRs. We have found that the gain of novel LCRs is frequently associated with repeat expansion, whereas the loss of LCRs is more often due to the accumulation of amino acid substitutions than to deletions. This dichotomy results in net protein sequence gain over time. We have detected a significant increase in the rate of accumulation of novel LCRs in the ancestral Amniota and mammalian branches, and a reduction in the chicken branch. Alanine and/or glycine-rich LCRs are overrepresented in recently emerged LCR sets from all branches, suggesting that their expansion is better tolerated than that of other LCR types. LCRs enriched in positively charged amino acids show the opposite pattern, indicating an important effect of purifying selection in their maintenance.
Conclusion: We have performed the first large-scale study on the evolutionary dynamics of LCRs in protein families. The study has shown that the composition of an LCR is an important determinant of its evolutionary pattern.
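To make the definition of an LCR concrete, the sketch below flags tracts dominated by one or a few amino acids using a sliding-window Shannon entropy. It is only an illustration, not the detection procedure used in the study: the function names, the window length of 12 and the entropy cutoff of 1.5 bits are all assumptions chosen for the example, and the test sequence is invented.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (bits) of the residue composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(seq: str, window: int = 12, max_entropy: float = 1.5):
    """Return merged half-open (start, end) intervals of low-entropy windows.

    The window length and entropy cutoff are illustrative assumptions,
    not parameters taken from the paper.
    """
    regions = []
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) <= max_entropy:
            start, end = i, i + window
            if regions and start <= regions[-1][1]:
                # Overlapping or adjacent to the previous hit: extend it.
                regions[-1] = (regions[-1][0], max(regions[-1][1], end))
            else:
                regions.append((start, end))
    return regions


# Invented toy sequence: a 15-residue alanine tract between two diverse flanks.
example = "MKTLVNDERQ" + "A" * 15 + "WYHCFPSTIG"
regions = low_complexity_windows(example)
for start, end in regions:
    print(start, end, example[start:end])
```

Because a window still counts as low complexity when it holds a few flanking residues, the reported interval extends slightly beyond the alanine tract itself; a stricter cutoff or a trimming step would narrow it.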