Improving cluster visualization in self-organizing maps: application in gene expression data analysis
- PMID: 17544390
- DOI: 10.1016/j.compbiomed.2007.04.003
Abstract
Cluster analysis is a crucial step in gene expression pattern (GEP) analysis. It leads to the discovery or identification of temporal patterns and coexpressed genes. GEP analysis involves high-dimensional multivariate data, which demand appropriate tools. A good alternative for grouping many multidimensional objects is the self-organizing map (SOM), an unsupervised neural network algorithm able to find relationships among data. The SOM groups the data and maps them topologically. However, it may be difficult to identify clusters with the usual visualization tools for SOM. We propose a simple algorithm to identify and visualize clusters in SOM (the RP-Q method). The RP is a new node-adaptive attribute that moves in a two-dimensional virtual space, imitating the movement of the codebook vectors of the SOM net in the input space. The Q statistic evaluates the SOM structure, providing an estimate of the number of clusters underlying the data set. The SOM-RP-Q algorithm permits the visualization of clusters in the SOM and their node patterns. The algorithm was evaluated on several simulated and real GEP data sets. Results show that the proposed algorithm successfully displays the underlying cluster structure directly from the SOM and is robust to different net sizes.
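For readers unfamiliar with how SOM codebook vectors move in the input space, the following is a minimal sketch of the standard online SOM update only. It is not the RP-Q method: the abstract gives neither the RP update rule nor the Q statistic, so neither is attempted here. The function names (`train_som`, `assign_nodes`), the grid size, the decay schedules, and the synthetic two-cluster data are all assumptions made for illustration.

```python
import numpy as np

def train_som(data, rows=4, cols=4, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM with the classic online update rule.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]
    # One codebook vector per node, initialised from randomly chosen samples.
    codebook = data[rng.integers(0, n_samples, rows * cols)].astype(float)
    # Fixed 2-D grid coordinates of each node (the map topology).
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5   # shrinking neighbourhood radius
        x = data[rng.integers(0, n_samples)]  # one random training sample
        # Best-matching unit: node whose codebook vector is closest to x.
        bmu = np.argmin(np.sum((codebook - x) ** 2, axis=1))
        # Gaussian neighbourhood measured on the grid, centred on the BMU.
        d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        # Move each codebook vector toward x, weighted by its neighbourhood value.
        codebook += lr * h[:, None] * (x - codebook)
    return codebook, grid

def assign_nodes(data, codebook):
    """Return, for each sample, the index of its best-matching node."""
    d2 = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Synthetic stand-in for expression profiles: two well-separated groups in 5-D.
rng = np.random.default_rng(1)
data = np.vstack([
    rng.normal(0.0, 0.3, size=(50, 5)),
    rng.normal(10.0, 0.3, size=(50, 5)),
])
codebook, grid = train_som(data)
labels = assign_nodes(data, codebook)
# Mean distance from each sample to its best-matching codebook vector.
quant_error = np.linalg.norm(data - codebook[labels], axis=1).mean()
```

Plotting `grid[labels]` for each sample shows where profiles land on the map. The difficulty the abstract describes is that this view alone does not make cluster boundaries obvious, which is the gap the RP-Q method is proposed to fill.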