Cancer Inform. 2015 Sep 10;14(Suppl 5):1-10.
doi: 10.4137/CIN.S27718. eCollection 2015.

An Application of Sequential Meta-Analysis to Gene Expression Studies

Putri W Novianti et al. Cancer Inform. 2015.

Abstract

Most of the discoveries from gene expression data are driven by a study claiming an optimal subset of genes that play a key role in a specific disease. Meta-analysis of the available datasets can help in getting concordant results so that a real-life application may be more successful. Sequential meta-analysis (SMA) is an approach for combining studies in chronological order while preserving the type I error and pre-specifying the statistical power to detect a given effect size. We focus on the application of SMA to find gene expression signatures across experiments in acute myeloid leukemia. SMA of seven raw datasets is used to evaluate whether the accumulated samples show enough evidence or more experiments should be initiated. We found 313 differentially expressed genes, based on the cumulative information of the experiments. SMA offers an alternative to existing methods in generating a gene list by evaluating the adequacy of the cumulative information.

Keywords: differentially expressed genes; gene expression; sequential meta-analysis; triangular test.
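As a rough illustration of what combining studies in chronological order can look like for a single gene, the sketch below accumulates a score Z and an information measure V study by study. It assumes a fixed-effect model with inverse-variance weights; that weighting, the function name `cumulative_z_v`, and the example numbers are assumptions made for illustration and are not taken from the article's Methods.

```python
from itertools import accumulate


def cumulative_z_v(effects, variances):
    """Return the cumulative (Z, V) pair after each study, in the order given.

    Assumed fixed-effect model: study i gets weight w_i = 1 / variance_i, so
    Z_k = sum of w_i * effect_i and V_k = sum of w_i over the first k studies.
    """
    if len(effects) != len(variances):
        raise ValueError("effects and variances must have the same length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    z_path = list(accumulate(w * e for w, e in zip(weights, effects)))
    v_path = list(accumulate(weights))
    return list(zip(z_path, v_path))


# Hypothetical effect estimates for one gene, studies listed chronologically.
path = cumulative_z_v(effects=[0.5, 1.0, 0.75], variances=[0.25, 0.5, 0.25])
# Under this model, Z / V at each step is the pooled effect estimate so far.
pooled = [z / v for z, v in path]
```

Each new study extends the path by one point, so the same running totals can be reused when a further experiment becomes available.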

Figures

Figure 1
General proposed approach to apply sequential meta-analysis to gene expression datasets. The details for each step are described in the Methods section.
Figure 2
Example of a double triangular test (TT) designed with prespecified α, 1 − β, and θR. A decision can be made when the sample path crosses one of the boundaries, i.e., the null hypothesis is rejected in favor of the alternative hypothesis when the path crosses the red lines, and the null hypothesis is not rejected when the path crosses the blue dashed lines. No decision can be made while the sample path stays inside the boundaries; in that case, more studies need to be included in the analysis. The y-axis and x-axis represent the Z and V scores, respectively. A more detailed explanation of the Z and V scores is provided in the Methods section.
Figure 3
Pairwise comparisons of the differentially expressed genes in individual selected experiments. The number within each block represents the overlap of differentially expressed genes between two experiments, which is then represented by the color. The x-axis and y-axis represent the experiment number.
Figure 4
Heatmaps of the 12,211 fully replicated genes. The colors represent the status of each gene in the sequential tests: orange, no decision can be made; red, do not reject the null hypothesis; white, reject the null hypothesis. The y-axis represents the genes that appeared in all experiments, while the x-axis is the cumulative number of experiments used in the sequential test following Whitehead’s boundaries approach. The boundaries were constructed for a relevant effect size θR = 0.8, power 1 − β = 80%, and a type I error α = 0.5% (left) or α = 0.0004% (right, Bonferroni correction for α = 5% and 12,211 tests).
Figure 5
Triangular tests of four selected genes. The boundaries were constructed for a prespecified effect size θR = 0.8, power 1 − β = 80%, and type I error α = 0.5% (the upper row) or α = 0.0004% (the lower row, Bonferroni correction for α = 5%). The y-axis and x-axis represent the Z and V scores, respectively. A more detailed explanation of the Z and V scores is provided in the Methods section.
Figure 6
Triangular tests of four selected genes that have inconsistent conclusions. The boundaries were constructed for a relevant effect size θR = 0.8, power 1 − β = 80%, and type I error α = 0.5%. The sequential analyses were continued although the sample paths crossed the boundaries (so-called overrunning). The y-axis and x-axis represent the Z and V scores, respectively. A more detailed explanation of the Z and V scores is provided in the Methods section.
Figure 7
Triangular tests for the FLT3 gene. The boundaries were constructed for a prespecified effect size θR = 0.8, power 1 − β = 80%, and type I error α = 0.5% (the first column) or α = 0.0004% (the second column, Bonferroni correction for α = 5%). The first (second) row shows the triangular tests when the full (filtered) data are used for analysis. The y-axis and x-axis represent the Z and V scores, respectively. A more detailed explanation of the Z and V scores is provided in the Methods section.
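The captions above describe two calculations that a short sketch may make concrete: the Bonferroni-corrected significance level (5% divided over 12,211 tests) and the decision read off a double triangular plot. The sketch assumes a common parameterization of the double triangular test with outer lines Z = ±(a + cV) and inner lines Z = ±(−a + 3cV), and it ignores any correction for discrete monitoring. The intercept `a`, slope `c`, function names, and sample path are hypothetical; the article's Methods may derive its boundaries differently from α, 1 − β, and θR.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


def double_triangular_decision(path, a, c):
    """Return (look, decision) for the first boundary crossing along a path.

    path : sequence of cumulative (Z, V) points, one per added study.
    a, c : assumed boundary intercept and slope (hypothetical inputs here).
    """
    for look, (z, v) in enumerate(path, start=1):
        outer = a + c * v          # beyond +/- outer: reject H0
        inner = -a + 3.0 * c * v   # within +/- inner: do not reject H0
        if abs(z) >= outer:
            return look, "reject H0"
        if abs(z) <= inner:
            return look, "do not reject H0"
    # The path is still inside the continuation region: add more studies.
    return None, "no decision"


# 5% family-wise level spread over 12,211 genes, roughly 0.0004 percent.
alpha_percent = 100 * bonferroni_alpha(0.05, 12211)

# Hypothetical boundaries and sample paths, for illustration only.
signal = double_triangular_decision([(2.0, 4.0), (4.0, 6.0), (7.0, 10.0)], a=4.0, c=0.2)
null = double_triangular_decision([(0.0, 4.0), (0.5, 8.0)], a=4.0, c=0.2)
pending = double_triangular_decision([(2.0, 4.0)], a=4.0, c=0.2)
```

With these made-up numbers, the first path crosses an outer line at the third study, the second enters the inner wedge at the second study, and the third is still undecided, which corresponds to the three colors used in the Figure 4 heatmaps.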
