Proc Natl Acad Sci U S A. 2003 Aug 5;100(16):9440-5.

doi: 10.1073/pnas.1530509100. Epub 2003 Jul 25.

Statistical significance for genomewide studies

John D Storey¹, Robert Tibshirani

PMID: 12883005
PMCID: PMC170937
DOI: 10.1073/pnas.1530509100

John D Storey et al. Proc Natl Acad Sci U S A. 2003.

Affiliation

¹ Department of Biostatistics, University of Washington, Seattle, WA 98195, USA. jstorey@u.washington.edu

Abstract

With the increase in genomewide experiments and the sequencing of multiple genomes, the analysis of large data sets has become commonplace in biology. It is often the case that thousands of features in a genomewide data set are tested against some null hypothesis, where a number of features are expected to be significant. Here we propose an approach to measuring statistical significance in these genomewide studies based on the concept of the false discovery rate. This approach offers a sensible balance between the number of true and false positives that is automatically calibrated and easily interpreted. In doing so, a measure of statistical significance called the q value is associated with each tested feature. The q value is similar to the well known p value, except it is a measure of significance in terms of the false discovery rate rather than the false positive rate. Our approach avoids a flood of false positive results, while offering a more liberal criterion than what has been used in genome scans for linkage.
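The abstract does not spell out how a q value is computed. The sketch below shows one common step-up form, assuming the p values are valid and that an estimate of the proportion of truly null features is supplied. The function name `q_values` and the parameter `pi0` are illustrative choices, not names taken from the paper, and `pi0=1.0` is the most conservative setting.

```python
def q_values(p_values, pi0=1.0):
    """Return one q value per p value, in the original order.

    Assumption: pi0 is an externally supplied estimate of the
    proportion of null features (1.0 is the conservative choice).
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0  # q values are capped at 1
    # Walk from the largest p value down to the smallest so that
    # each q value is the minimum FDR estimate at or above its rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        fdr_estimate = pi0 * m * p_values[idx] / rank
        running_min = min(running_min, fdr_estimate)
        q[idx] = running_min
    return q


if __name__ == "__main__":
    example = [0.01, 0.04, 0.03, 0.5]
    for p, q in zip(example, q_values(example)):
        print(f"p = {p:.2f}  q = {q:.4f}")
```

Calling a feature significant when its q value is at most 0.05 then targets a false discovery rate of roughly 5% among the features called significant, rather than a 5% false positive rate among all null features.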

Figures

Fig. 1.
A density histogram of the 3,170 p values from the Hedenfalk et al. (14) data. The dashed line is the density histogram we would expect if all genes were null (not differentially expressed). The dotted line is at the height of our estimate of the proportion of null p values.

Fig. 2.
Results from the Hedenfalk et al. (14) data. (a) The q values of the genes versus their respective t statistics. (b) The q values versus their respective p values. (c) The number of genes occurring on the list up through each q value versus the respective q value. (d) The expected number of false positive genes versus the total number of significant genes given by the q values.

Fig. 3.
The $\hat{\pi}_0(\lambda)$ versus λ for the data of Hedenfalk et al. (14). The solid line is a natural cubic spline fit to these points to estimate $\hat{\pi}_0(\lambda = 1)$.
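The points plotted against λ in Fig. 3 can be sketched as follows, assuming the usual tail-count estimate of the null proportion: the number of p values above λ divided by m(1 − λ). The function names are hypothetical, and the natural cubic spline fit mentioned in the caption is deliberately left out of this sketch.

```python
def pi0_at_lambda(p_values, lam):
    """Tail-count estimate of the null proportion at threshold lam.

    Assumption: p values above lam come mostly from null features,
    which are uniform on [0, 1], so about m * pi0 * (1 - lam) of
    them are expected there.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(p_values)
    tail_count = sum(1 for p in p_values if p > lam)
    return tail_count / (m * (1.0 - lam))


def pi0_curve(p_values, lambdas):
    """Return (lam, estimate) pairs, i.e. the points of such a plot."""
    return [(lam, pi0_at_lambda(p_values, lam)) for lam in lambdas]


if __name__ == "__main__":
    example = [0.01, 0.02, 0.03, 0.6]
    for lam, est in pi0_curve(example, [0.0, 0.25, 0.5]):
        print(f"lambda = {lam:.2f}  estimate = {est:.3f}")
```

Larger values of λ reduce bias but use fewer p values and so increase variance, which is why a smoother is fit across the whole range of λ instead of relying on a single threshold.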

References

1. Morton, N. E. (1955) Am. J. Hum. Gen. 7, 277–318.
2. Lander, E. S. & Kruglyak, L. (1995) Nat. Genet. 11, 241–247.
3. Storey, J. D. (2003) Ann. Stat., in press.
4. Storey, J. D. (2002) J. R. Stat. Soc. B 64, 479–498.
5. Benjamini, Y. & Hochberg, Y. (1995) J. R. Stat. Soc. B 57, 289–300.
