ESLpred2: improved method for predicting subcellular localization of eukaryotic proteins
- PMID: 19038062
- PMCID: PMC2612013
- DOI: 10.1186/1471-2105-9-503
Abstract
Background: The growth of raw protein sequence databases in the post-genomic era, together with the availability of newly annotated sequences for the major localizations, motivated us to introduce an improved version of our earlier eukaryotic subcellular localization prediction method, "ESLpred". Because the subcellular localization of a protein offers essential clues about its function, a localization predictor can aid and expedite protein characterization studies. However, the robustness of a predictor depends heavily on the quality of the dataset and of the extracted protein features; it is therefore important to improve the existing method using the latest dataset and informative input features.
Results: Here, we describe the improvement in prediction performance obtained for our ESLpred method by using new features as input to a Support Vector Machine (SVM). The present study also uses a recently available, highly non-redundant dataset comprising three kingdom-specific protein sequence sets: 1198 fungal, 2597 animal and 491 plant sequences. First, using evolutionary information in the form of profile composition, together with whole-sequence and N-terminal sequence composition, as an input feature vector of 440 dimensions, overall accuracies of 72.7%, 75.8% and 74.5% were achieved for the three sets, respectively, under five-fold cross-validation. Performance improved further when similarity-search-based results were combined with whole-sequence and N-terminal sequence composition and profile composition, yielding overall accuracies of 75.9%, 80.8% and 76.6%, respectively; these are the best accuracies reported to date on the same datasets.
Conclusion: These results support the reliability and accuracy of the SVM modules developed in the present study, which use sequence and profile compositions along with similarity-search-based results. The modules are implemented as the web server "ESLpred2", available at http://www.imtech.res.in/raghava/eslpred2/.
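The 440-dimensional input described in the Results can be illustrated with a minimal sketch. The abstract does not give the breakdown, so the split used here (400 profile-composition values plus 20 whole-sequence and 20 N-terminal composition values) is an assumption chosen only because it adds up to 440. The function names, the profile normalization, and the N-terminal length `n_term_len=25` are likewise hypothetical and are not taken from the ESLpred2 implementation.

```python
from typing import List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq: str) -> List[float]:
    """Fraction of each standard residue in seq (20 values)."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def profile_composition(seq: str, pssm: Sequence[Sequence[float]]) -> List[float]:
    """Collapse an L x 20 profile into 20 x 20 = 400 values.

    Assumed scheme: for each residue type, sum the profile rows at the
    positions where that residue occurs, then divide by sequence length.
    """
    sums = {a: [0.0] * 20 for a in AMINO_ACIDS}
    for residue, row in zip(seq.upper(), pssm):
        if residue in sums:  # skip non-standard residues such as X
            for j, value in enumerate(row):
                sums[residue][j] += value
    length = max(len(seq), 1)
    return [value / length for a in AMINO_ACIDS for value in sums[a]]


def feature_vector(seq: str, pssm: Sequence[Sequence[float]],
                   n_term_len: int = 25) -> List[float]:
    """Concatenate profile (400), whole (20) and N-terminal (20) features."""
    return (profile_composition(seq, pssm)
            + aa_composition(seq)
            + aa_composition(seq[:n_term_len]))


# Toy input: a short peptide with a dummy profile of all ones.
toy_seq = "MKTAYIAKQR"
toy_pssm = [[1.0] * 20 for _ in toy_seq]
toy_vec = feature_vector(toy_seq, toy_pssm)
print(len(toy_vec))
```

In practice the profile would come from a PSI-BLAST position-specific scoring matrix rather than a dummy matrix, and the resulting vectors would be passed to an SVM trained separately for each kingdom-specific dataset.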
Similar articles
- ESLpred: SVM-based method for subcellular localization of eukaryotic proteins using dipeptide composition and PSI-BLAST. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W414-9. doi: 10.1093/nar/gkh350. PMID: 15215421. Free PMC article.
- RSLpred: an integrative system for predicting subcellular localization of rice proteins combining compositional and evolutionary information. Proteomics. 2009 May;9(9):2324-42. doi: 10.1002/pmic.200700597. PMID: 19402042.
- A machine learning based method for the prediction of secretory proteins using amino acid composition, their order and similarity-search. In Silico Biol. 2008;8(2):129-40. PMID: 18928201.
- ProLoc-GO: utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization. BMC Bioinformatics. 2008 Feb 1;9:80. doi: 10.1186/1471-2105-9-80. PMID: 18241343. Free PMC article.
- pLoc_bal-mPlant: Predict Subcellular Localization of Plant Proteins by General PseAAC and Balancing Training Dataset. Curr Pharm Des. 2018;24(34):4013-4022. doi: 10.2174/1381612824666181119145030. PMID: 30451108. Review.
Cited by
- Molecular characterization and identification of Th1 epitopes of a Schistosoma japonicum protein similar to prosaposin. Parasitol Res. 2014 Mar;113(3):983-92. doi: 10.1007/s00436-013-3730-7. Epub 2013 Dec 21. PMID: 24363182.
- Fast H-DROP: A thirty times accelerated version of H-DROP for interactive SVM-based prediction of helical domain linkers. J Comput Aided Mol Des. 2017 Feb;31(2):237-244. doi: 10.1007/s10822-016-9999-8. Epub 2016 Dec 27. PMID: 28028736.
- TIM-Finder: a new method for identifying TIM-barrel proteins. BMC Struct Biol. 2009 Dec 14;9:73. doi: 10.1186/1472-6807-9-73. PMID: 20003393. Free PMC article.
- YLoc--an interpretable web server for predicting subcellular localization. Nucleic Acids Res. 2010 Jul;38(Web Server issue):W497-502. doi: 10.1093/nar/gkq477. Epub 2010 May 27. PMID: 20507917. Free PMC article.
- The effector candidate repertoire of the arbuscular mycorrhizal fungus Rhizophagus clarus. BMC Genomics. 2016 Feb 9;17:101. doi: 10.1186/s12864-016-2422-y. PMID: 26861502. Free PMC article.