StringTie enables improved reconstruction of a transcriptome from RNA-seq reads
- PMID: 25690850
- PMCID: PMC4643835
- DOI: 10.1038/nbt.3122
Abstract
Methods used to sequence the transcriptome often produce more than 200 million short sequences. We introduce StringTie, a computational method that applies a network flow algorithm originally developed in optimization theory, together with optional de novo assembly, to assemble these complex data sets into transcripts. When used to analyze both simulated and real data sets, StringTie produces more complete and accurate reconstructions of genes and better estimates of expression levels, compared with other leading transcript assembly programs including Cufflinks, IsoLasso, Scripture and Traph. For example, on 90 million reads from human blood, StringTie correctly assembled 10,990 transcripts, whereas the next best assembly was of 7,187 transcripts by Cufflinks, which is a 53% increase in transcripts assembled. On a simulated data set, StringTie correctly assembled 7,559 transcripts, which is 20% more than the 6,310 assembled by Cufflinks. As well as producing a more complete transcriptome assembly, StringTie runs faster on all data sets tested to date compared with other assembly software, including Cufflinks.
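The abstract names a network flow algorithm but does not describe it here. As an illustration only, the sketch below computes a maximum flow with the generic Edmonds-Karp method on a toy graph whose nodes stand in for exons and whose edge capacities stand in for read support. The graph, the capacities, and the names `max_flow` and `toy_graph` are hypothetical, and this is not StringTie's implementation or its exact formulation.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Generic Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    # Copy capacities into a residual graph and add zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path remains

        # Find the smallest residual capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u

        # Push that amount of flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical toy graph: capacities are made-up read-support values.
toy_graph = {
    "source": {"exon1": 10},
    "exon1": {"exon2": 6, "exon3": 4},
    "exon2": {"exon3": 5},
    "exon3": {"sink": 10},
}

print(max_flow(toy_graph, "source", "sink"))  # prints 9
```

In this toy case the flow is limited by the two edges entering `exon3` (4 + 5), which shows how edge capacities bound the total amount that can be routed from source to sink.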
Conflict of interest statement
The authors declare no competing financial interests.