Tissue classification with gene expression profiles
- PMID: 11108479
- DOI: 10.1089/106652700750050943
Abstract
Constantly improving gene expression profiling technologies are expected to provide understanding and insight into cancer-related cellular processes. Gene expression data is also expected to significantly aid in the development of efficient cancer diagnosis and classification platforms. In this work we examine three sets of gene expression data measured across sets of tumor and normal clinical samples. The first set consists of 2,000 genes, measured in 62 epithelial colon samples (Alon et al., 1999). The second consists of approximately 100,000 clones, measured in 32 ovarian samples (an unpublished extension of the data set described in Schummer et al., 1999). The third set consists of approximately 7,100 genes, measured in 72 bone marrow and peripheral blood samples (Golub et al., 1999). We examine the use of scoring methods that measure the separation of tissue types (e.g., tumors from normals) using individual gene expression levels. These are then coupled with high-dimensional classification methods to assess the classification power of complete expression profiles. We present results of leave-one-out cross validation (LOOCV) experiments on the three data sets, employing a nearest neighbor classifier, SVM (Cortes and Vapnik, 1995), AdaBoost (Freund and Schapire, 1997), and a novel clustering-based classification technique. As tumor samples can differ from normal samples in their cell-type composition, we also perform LOOCV experiments using appropriately modified sets of genes, attempting to eliminate the resulting bias. We demonstrate a success rate of at least 90% in tumor versus normal classification, using sets of selected genes, with, as well as without, cellular-contamination-related members. These results are insensitive to the exact selection mechanism, over a certain range.
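To make the evaluation setup in the abstract concrete, the following is a minimal Python sketch of leave-one-out cross validation with per-gene scoring followed by a nearest neighbor classifier. It is an illustration under stated assumptions, not the authors' implementation: the mean-difference `separation_score`, the Euclidean distance, the synthetic data from `make_toy_data`, and all function names are hypothetical choices made for this example. The abstract does not specify the paper's scoring function or distance measure.

```python
import random
import statistics


def separation_score(values, labels):
    """Score one gene: |difference of class means| / (sum of class std devs).

    This scoring rule is an assumption for illustration only.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = statistics.pstdev(pos) + statistics.pstdev(neg)
    return abs(statistics.fmean(pos) - statistics.fmean(neg)) / (spread + 1e-9)


def top_genes(X, y, k):
    """Return indices of the k highest-scoring genes on the given samples."""
    n_genes = len(X[0])
    scores = [separation_score([row[g] for row in X], y) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def nearest_neighbor(train_X, train_y, sample, genes):
    """Predict the label of the closest training sample (squared Euclidean)."""
    def dist(row):
        return sum((row[g] - sample[g]) ** 2 for g in genes)
    best = min(range(len(train_X)), key=lambda i: dist(train_X[i]))
    return train_y[best]


def loocv_accuracy(X, y, k):
    """Hold out each sample once; select genes on the remaining samples only."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        genes = top_genes(train_X, train_y, k)  # no peeking at the held-out sample
        correct += nearest_neighbor(train_X, train_y, X[i], genes) == y[i]
    return correct / len(X)


def make_toy_data(n_samples=20, n_genes=50, n_informative=5, seed=0):
    """Synthetic expression matrix: the first few genes are shifted in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
accuracy = loocv_accuracy(X, y, k=5)
print(f"LOOCV accuracy: {accuracy:.2f}")
```

The design point worth noting is that gene selection is repeated inside every fold, using only the training samples. Selecting genes once on the full data set and then cross-validating would let information about the held-out sample leak into the classifier and inflate the estimated success rate.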