Chaotic genetic algorithm for gene selection and classification problems
- PMID: 19594377
- DOI: 10.1089/omi.2009.0007
Abstract
Pattern recognition techniques suffer from a well-known curse, the dimensionality problem. Microarray data classification is a classic example of a complex pattern recognition problem. Selecting relevant genes from microarray data poses a formidable challenge to researchers because of the high dimensionality of the features, the multiple classes involved, and the usually small sample size. The goal of feature (gene) selection is to select those subsets of differentially expressed genes that are potentially relevant for distinguishing the sample classes. In this paper, information gain and a chaotic genetic algorithm (GA) are proposed for the selection of relevant genes, and a K-nearest neighbor classifier evaluated with leave-one-out cross-validation serves as the classifier. The genetic algorithm is modified with a chaotic mutation operator to increase population diversity, and the enhanced diversity expands the GA's search ability. The proposed approach is tested on 10 microarray data sets from the literature. The experimental results show that the proposed method not only effectively reduced the number of selected genes, but also achieved lower classification error rates than other methods.
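The two ingredients named in the abstract, a chaotic mutation operator and a nearest-neighbor fitness scored by leave-one-out cross-validation, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say which chaotic map is used, so the logistic map with r = 4 is an assumption, as are the function names, the `rate` threshold, and the choice of K = 1. The information-gain pre-filter is omitted.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map x -> r*x*(1-x) (assumed map; chaotic at r = 4)."""
    return r * x * (1.0 - x)


def chaotic_mutation(chromosome, x, rate=0.1):
    """Flip bits driven by a logistic-map sequence instead of a uniform RNG.

    `rate` is a threshold on the chaotic value, not an exact mutation
    probability, because the map's values are not uniformly distributed.
    The seed `x` should avoid 0, 0.25, 0.5, 0.75 and 1, which collapse to 0.
    Returns the mutated chromosome and the advanced chaotic state.
    """
    mutated = list(chromosome)
    for i in range(len(mutated)):
        x = logistic_map(x)
        if x < rate:
            mutated[i] = 1 - mutated[i]
    return mutated, x


def loocv_1nn_accuracy(samples, labels, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier.

    Only genes whose bit in `mask` is 1 contribute to the squared
    Euclidean distance. An empty gene subset scores 0.0.
    """
    selected = [g for g, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0
    correct = 0
    for i, query in enumerate(samples):
        best_dist, best_label = float("inf"), None
        for j, other in enumerate(samples):
            if i == j:
                continue  # hold out the query sample itself
            dist = sum((query[g] - other[g]) ** 2 for g in selected)
            if dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(samples)


if __name__ == "__main__":
    # Toy data (made up): gene 0 separates the classes, gene 1 does not.
    samples = [[0.1, 5.0], [0.2, 1.0], [0.9, 5.1], [1.0, 1.1]]
    labels = [0, 0, 1, 1]
    print(loocv_1nn_accuracy(samples, labels, [1, 0]))
    print(loocv_1nn_accuracy(samples, labels, [0, 1]))
    child, state = chaotic_mutation([1, 0], 0.3)
    print(child, state)
```

In a full GA, each chromosome would be a bit mask over the genes that survive the information-gain filter, `loocv_1nn_accuracy` would serve as the fitness, and `chaotic_mutation` would replace the usual random bit-flip step after crossover.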