Comparative Study
J Epidemiol. 2005 Nov;15(6):235-43.
doi: 10.2188/jea.15.235.

Systematic evaluation and comparison of statistical tests for publication bias

Yasuaki Hayashino et al. J Epidemiol. 2005 Nov.

Abstract

Background: This study evaluates the statistical and discriminatory powers of three statistical test methods (Begg's, Egger's, and Macaskill's) to detect publication bias in meta-analyses.

Methods: The data sources were 130 reviews from the Cochrane Database of Systematic Reviews 2002 issue, each of which considered a binary endpoint and contained 10 or more individual studies. Funnel plots on which the observers agreed were used as the reference standard. We evaluated the trade-off between sensitivity and specificity by varying the cut-off p-value, the power of each statistical test at fixed false positive rates, and the area under the receiver operating characteristic (ROC) curve.
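The evaluation described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes each review contributes one test p-value and one reference-standard label (publication bias judged present or absent on the funnel plot), and that a test is called positive when its p-value falls below the cut-off. The function names are hypothetical.

```python
def sensitivity_specificity(p_values, biased, cutoff):
    """Sensitivity and specificity of a test that is positive when p < cutoff.

    p_values: one p-value per review (assumed input format)
    biased:   reference-standard labels, True if bias was judged present
    """
    tp = sum(1 for p, b in zip(p_values, biased) if b and p < cutoff)
    fn = sum(1 for p, b in zip(p_values, biased) if b and p >= cutoff)
    tn = sum(1 for p, b in zip(p_values, biased) if not b and p >= cutoff)
    fp = sum(1 for p, b in zip(p_values, biased) if not b and p < cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one biased and one unbiased review")
    return tp / (tp + fn), tn / (tn + fp)


def roc_area(p_values, biased):
    """Area under the ROC curve, computed as the proportion of
    (biased, unbiased) review pairs in which the biased review has the
    smaller p-value; ties count as one half."""
    pos = [p for p, b in zip(p_values, biased) if b]
    neg = [p for p, b in zip(p_values, biased) if not b]
    if not pos or not neg:
        raise ValueError("need at least one biased and one unbiased review")
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos < p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Calling `sensitivity_specificity` over a grid of cut-offs (for example 0.05, 0.1, 0.2) traces out the trade-off that the ROC curve summarizes; the pairwise area shown here is equivalent to the usual rank-based estimate.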

Results: In 36 reviews, 733 original studies evaluated 2,874,006 subjects. The number of trials included in each review ranged from 10 to 70 (median 14.5). Given a false positive rate of 0.1, the sensitivity of Egger's method was 0.93, which was larger than that of Begg's method (0.86) and Macaskill's method (0.43). The sensitivities of the three statistical tests increased as the cut-off p-value increased, without a substantial decrease in specificity. The area under the ROC curve of Egger's method was 0.955 (95% confidence interval, 0.889-1.000), which was not different from that of Begg's method (area=0.913, p=0.2302) but was larger than that of Macaskill's method (area=0.719, p=0.0116).

Conclusion: Egger's linear regression method and Begg's method had stronger statistical and discriminatory powers than Macaskill's method for detecting publication bias given the same type I error level. The power of these methods could be improved by increasing the cut-off p-value without a substantial increase in the false positive rate.
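For readers unfamiliar with Egger's linear regression method, the sketch below shows its commonly described unweighted form: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one over the standard error), and an intercept far from zero suggests funnel plot asymmetry. It assumes the effects are log odds ratios with known standard errors, which is an assumption made here rather than a detail given in the abstract, and the function name is hypothetical. The t statistic would be referred to a t distribution with n - 2 degrees of freedom to obtain a p-value.

```python
import math


def egger_regression(effects, std_errors):
    """Unweighted Egger regression of (effect / SE) on (1 / SE).

    effects:    per-study effect estimates, assumed to be log odds ratios
    std_errors: their standard errors
    Returns (intercept, slope, t statistic of the intercept).
    """
    if len(effects) != len(std_errors) or len(effects) < 3:
        raise ValueError("need at least three studies with matching lengths")
    y = [e / s for e, s in zip(effects, std_errors)]  # standardized effects
    x = [1.0 / s for s in std_errors]                 # precisions
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    if se_intercept == 0:
        raise ValueError("perfect fit: intercept standard error is zero")
    return intercept, slope, intercept / se_intercept
```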

Figures

Figure 1. Receiver-operating characteristic (ROC) curves for Begg’s method, Egger’s method and Macaskill’s method.
Left column: Main analyses by using funnel plots that were scored as 2 and with observers’ agreement. Right column: Sensitivity analyses by using all 130 reviews. CI: confidence interval.
Figure 2. Representative examples of funnel plots in our analysis.
(A) A typical example of the absence of publication bias. The number of included studies was 13; the total sample size was 855; the median sample size per review was 46 (range, 20-234); the pooled odds ratio was 0.96 (95% CI, 0.68-1.35). The funnel plots were scored at 4, and both observers agreed that there is no publication bias in this analysis. The p-values were 0.583, 0.641, and 0.603, respectively, for Egger’s method, Begg’s method, and Macaskill’s method.
(B) A typical example of the presence of publication bias. The number of included studies was 15; the total sample size was 1278; the median sample size per review was 73 (range, 23-158); the pooled odds ratio was 0.78 (95% CI, 0.61-1.00). The funnel plots were scored at 2, and both observers agreed that there is a publication bias in this analysis. The p-values were 0.006, 0.002, and 0.02, respectively, for Egger’s method, Begg’s method, and Macaskill’s method.
(C) An example of the inconsistency between two observers in the interpretation of funnel plots. The number of included studies was 25; the total sample size was 2478; the median sample size per review was 97 (range, 36-200); the pooled odds ratio was 1.64 (95% CI, 1.28-2.11). The funnel plots were scored at 3, and observer A asserted that there was publication bias in this analysis, whereas observer B did not. The p-values were 0.500, 0.944, and 0.419, respectively, for Egger’s method, Begg’s method, and Macaskill’s method.
(D) An example of the inconsistency between the three statistical tests in detecting publication bias. The number of included studies was 11; the total sample size was 619; the median sample size per review was 33 (range, 20-204); the pooled odds ratio was 3.32 (95% CI, 2.24-4.92). The funnel plots were scored at 2, and both observers agreed that there is no publication bias in this analysis. The p-values were 0.018, 0.020, and 0.352, respectively, for Egger’s method, Begg’s method, and Macaskill’s method. With the positivity criterion p<0.1, Egger’s method and Begg’s method suggested the presence of publication bias, but Macaskill’s method did not.
