Abstract
Underlying every microarray experiment is an experimental question that one would like to address. Finding a useful and satisfactory answer relies on careful experimental design and the use of a variety of data-mining tools to explore the relationships between genes or reveal patterns of expression. While other sections of this issue deal with these lofty issues, this review focuses on the much more mundane but indispensable tasks of 'normalizing' data from individual hybridizations to make meaningful comparisons of expression levels, and of 'transforming' them to select genes for further analysis and data mining.
Acknowledgements
The work presented here evolved from looking at a large body of data and would have been much less useful without the contributions of Norman H. Lee, Renae L. Malek, Priti Hegde, Ivana Yang, Shuibang Wang, Yonghong Wang, Simon Kwong, Heenam Kim, Wei Liang, Vasily Sharov, John Braisted, Alex Saeed, Joseph White, Jerry Li, Renee Gaspard, Erik Snesrud, Yan Yu, Emily Chen, Jeremy Hasseman, Bryan Frank, Lara Linford, Linda Moy, Tara Vantoai, Gary Churchill and Roger Bumgarner. J.Q. is supported by grants from the US National Science Foundation, the National Heart, Lung, and Blood Institute, and the National Cancer Institute. The MIDAS software system used for the normalization and data filtering presented here is freely available as either executable or source code from http://www.tigr.org/software, along with the MADAM data-management system, the Spotfinder image-processing software, and the MeV clustering and data-mining tool.
Ethics declarations
Competing interests
The author declares no competing financial interests.
About this article
Cite this article
Quackenbush, J. Microarray data normalization and transformation. Nat Genet 32 (Suppl 4), 496–501 (2002). https://doi.org/10.1038/ng1032