Screening non-coding RNAs in transcriptomes from neglected species using PORTRAIT: case study of the pathogenic fungus Paracoccidioides brasiliensis

doi:10.1186/1471-2105-10-239

BMC Bioinformatics. 2009 Aug 4;10:239.


Roberto T Arrial¹, Roberto C Togawa, Marcelo de M Brigido

PMID: 19653905
PMCID: PMC2731755
DOI: 10.1186/1471-2105-10-239

Roberto T Arrial et al. BMC Bioinformatics. 2009.

Affiliation

¹ Biology Institute, University of Brasília, Brazil. rtarrial@gmail.com

Abstract

Background: Transcriptome sequences complement structural genomic information and provide snapshots of an organism's transcriptional profile. Such sequences also offer an alternative means of characterizing neglected species that are not expected to undergo whole-genome sequencing. One difficulty in transcriptome sequencing of these organisms is the low quality of reads and incomplete coverage of transcripts, both of which compromise downstream bioinformatics analyses. Another complicating factor is the lack of known protein homologs, which frustrates searches against established protein databases. This lack of homologs may result from divergence from well-characterized and over-represented model organisms. Alternatively, the sequenced transcripts may include non-coding RNAs (ncRNAs). ncRNAs are RNA sequences that, unlike messenger RNAs, do not code for protein products and instead perform unique functions by folding into higher-order structural conformations. ncRNA screening software specific to transcriptome sequences is available, but its analyses are optimized for transcriptomes that are well represented in protein databases and assume that input ESTs are full-length and of high quality.

Results: We propose an algorithm called PORTRAIT, which is suitable for ncRNA analysis of transcriptomes from poorly characterized species. Sequences are translated by software that is resistant to sequencing errors, and the predicted putative proteins, along with their source transcripts, are evaluated for coding potential by a support vector machine (SVM). Either of two SVM models may be employed: if a putative protein is found, a protein-dependent SVM model is used; if none is found, a protein-independent SVM model is used instead. Only ab initio features are extracted, so no homology information is needed. We illustrate the use of PORTRAIT by predicting ncRNAs from the transcriptome of the pathogenic fungus Paracoccidioides brasiliensis and five other related fungi.

Conclusion: PORTRAIT can be integrated into pipelines, and provides a low computational cost solution for ncRNA detection in transcriptome sequencing projects.
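The two-model scheme described in the Results can be pictured with a minimal Python sketch. This is not PORTRAIT's code: `longest_orf` is a naive, hypothetical stand-in for the error-tolerant translation software, the `min_codons` cutoff is an assumed parameter, and the returned strings merely name which SVM model would be applied.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str, min_codons: int = 30) -> Optional[str]:
    """Hypothetical stand-in for an error-tolerant translator.

    Returns the longest forward-strand ATG-to-stop ORF that has at least
    min_codons codons (stop codon included), or None if there is none.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                long_enough = len(orf) // 3 >= min_codons
                if long_enough and (best is None or len(orf) > len(best)):
                    best = orf
                start = None  # close the candidate and keep scanning
    return best


def choose_model(transcript: str, min_codons: int = 30) -> str:
    """Pick the model name: protein-dependent if an ORF was found."""
    if longest_orf(transcript, min_codons) is not None:
        return "protein_dependent"
    return "protein_independent"
```

The point of the dispatch is that a transcript without a usable ORF is still classified, using only transcript-derived features, instead of being discarded.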

Figures

**Figure 1**
**Construction of the training database (dbTR)**. The dbTR comprises both negative and positive instances, and is subdivided into transcripts having identified ORFs (dbTR_OP) and transcripts lacking ORFs (dbTR_OA). Each of these subsets harbors its own negative and positive instances. The dbTR_OP training subset was used to induce the protein-dependent SVM model, while the dbTR_OA training subset generated the protein-independent SVM model.

**Figure 2**
**ROC curves showing performance of classifiers on dbTS sets**. Sensitivity is plotted against (1-specificity), allowing accuracy comparisons among classifiers. A perfect classifier would yield a curve that passes through the point (0,1) and ends at (1,1); that is, curves closer to the top-left corner indicate better classification performance. Classification threshold was set to 0.5 for all classifiers.
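As a rough illustration of how such a curve is built, the following Python sketch sweeps a decision threshold over classifier scores and records (1-specificity, sensitivity) at each step. The function names `roc_points` and `auc` are hypothetical, and nothing here reproduces the paper's data or evaluation code.

```python
def roc_points(labels, scores):
    """Return (1 - specificity, sensitivity) pairs, one per score threshold.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier outputs; higher means more likely positive.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes are needed to draw a ROC curve")
    points = [(0.0, 0.0)]  # threshold above every score: nothing called positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

With perfectly separated scores, the curve reaches (0, 1) before moving right and the area equals 1, which matches the caption's description of a perfect classifier.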

**Figure 3**
**Distribution of *P. brasiliensis* transcript sequences classified as ncRNA by several classifiers as a function of specific annotations by Felipe et al. (2005)**. Annotations of the 6,022 transcripts [32] were considered only after classifier prediction, so even transcripts previously manually annotated as proteins were evaluated for coding potential. A "Confident annotation" refers to a transcript description which lacks the words: "*putative*", "*probable*" and "*hypothetical*". The numbers of transcripts classified as ncRNA are shown in the legend (except for dbPB, which shows the total of Pb transcripts).
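The "Confident annotation" rule in this caption amounts to a simple text filter. A minimal Python sketch follows, assuming whole-word and case-insensitive matching (the caption does not specify either); `is_confident_annotation` is a hypothetical name.

```python
import re

# The three hedging words named in the caption; whole-word,
# case-insensitive matching is an assumption of this sketch.
_HEDGE_WORDS = re.compile(r"\b(putative|probable|hypothetical)\b", re.IGNORECASE)


def is_confident_annotation(description: str) -> bool:
    """True if the description contains none of the hedging words."""
    return _HEDGE_WORDS.search(description) is None
```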

Cited by

Long non-coding RNA-encoded micropeptides: functions, mechanisms and implications.
Xiao Y, Ren Y, Hu W, Paliouras AR, Zhang W, Zhong L, Yang K, Su L, Wang P, Li Y, Ma M, Shi L. Cell Death Discov. 2024 Oct 23;10(1):450. doi: 10.1038/s41420-024-02175-0. PMID: 39443468. Review.
Common Features in lncRNA Annotation and Classification: A Survey.
Klapproth C, Sen R, Stadler PF, Findeiß S, Fallmann J. Noncoding RNA. 2021 Dec 13;7(4):77. doi: 10.3390/ncrna7040077. PMID: 34940758. Review.
Identification and Expression Analysis of Long Noncoding RNAs in Fat-Tail of Sheep Breeds.
Bakhtiarizadeh MR, Salami SA. G3 (Bethesda). 2019 Apr 9;9(4):1263-1276. doi: 10.1534/g3.118.201014. PMID: 30787031.
Exosomes Could Offer New Options to Combat the Long-Term Complications Inflicted by Gestational Diabetes Mellitus.
Floriano JF, Willis G, Catapano F, Lima PR, Reis FVDS, Barbosa AMP, Rudge MVC, Emanueli C. Cells. 2020 Mar 10;9(3):675. doi: 10.3390/cells9030675. PMID: 32164322. Review.
De novo transcriptome assembly from inflorescence of Orchis italica: analysis of coding and non-coding transcripts.
De Paolo S, Salvemini M, Gaudio L, Aceto S. PLoS One. 2014 Jul 15;9(7):e102155. doi: 10.1371/journal.pone.0102155. eCollection 2014. PMID: 25025767.

References

1. Ravasi T, Suzuki H, Pang KC, Katayama S, Furuno M, Okunishi R, Fukuda S, Ru K, Frith MC, Gongora MM, Grimmond SM, Hume DA, Hayashizaki Y, Mattick JS. Experimental validation of the regulated expression of large numbers of non-coding RNAs from the mouse genome. Genome Res. 2006;16:11–19. doi: 10.1101/gr.4200206.
2. Mattick JS. RNA regulation: a new genetics? Nat Rev Genet. 2004;5:316–323. doi: 10.1038/nrg1321.
3. Jossinet F, Ludwig TE, Westhof E. RNA structure: bioinformatic analysis. Curr Op Microbiol. 2007;10:279–285. doi: 10.1016/j.mib.2007.05.010.
4. Teramoto R, Aoki M, Kimura T, Kanaoka M. Prediction of siRNA functionality using generalized string kernel and support vector machine. FEBS Lett. 2005;579:2878–2882. doi: 10.1016/j.febslet.2005.04.045.
5. Xue C, Li F, He T, Liu G-P, Li Y, Zhang X. Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine. BMC Bioinformatics. 2005;6:310–317. doi: 10.1186/1471-2105-6-310.
