Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2014 Feb 21:15:50.
doi: 10.1186/1471-2105-15-50.

Evaluation of copy number variation detection for a SNP array platform

Affiliations

Evaluation of copy number variation detection for a SNP array platform

Xin Zhang et al. BMC Bioinformatics. .

Abstract

Background: Copy Number Variations (CNVs) are usually inferred from Single Nucleotide Polymorphism (SNP) arrays by use of some software packages based on given algorithms. However, there is no clear understanding of the performance of these software packages; it is therefore difficult to select one or several software packages for CNV detection based on the SNP array platform.We selected four publicly available software packages designed for CNV calling from an Affymetrix SNP array, including Birdsuite, dChip, Genotyping Console (GTC) and PennCNV. The publicly available dataset generated by Array-based Comparative Genomic Hybridization (CGH), with a resolution of 24 million probes per sample, was considered to be the "gold standard". Compared with the CGH-based dataset, the success rate, average stability rate, sensitivity, consistence and reproducibility of these four software packages were assessed compared with the "gold standard". Specially, we also compared the efficiency of detecting CNVs simultaneously by two, three and all of the software packages with that by a single software package.

Results: Simply from the quantity of the detected CNVs, Birdsuite detected the most while GTC detected the least. We found that Birdsuite and dChip had obvious detecting bias. And GTC seemed to be inferior because of the least amount of CNVs it detected. Thereafter we investigated the detection consistency produced by one certain software package and the rest three software suits. We found that the consistency of dChip was the lowest while GTC was the highest. Compared with the CNVs detecting result of CGH, in the matching group, GTC called the most matching CNVs, PennCNV-Affy ranked second. In the non-overlapping group, GTC called the least CNVs. With regards to the reproducibility of CNV calling, larger CNVs were usually replicated better. PennCNV-Affy shows the best consistency while Birdsuite shows the poorest.

Conclusion: We found that PennCNV outperformed the other three packages in the sensitivity and specificity of CNV calling. Obviously, each calling method had its own limitations and advantages for different data analysis. Therefore, the optimized calling methods might be identified using multiple algorithms to evaluate the concordance and discordance of SNP array-based CNV calling.

PubMed Disclaimer

Figures

Figure 1
Figure 1
Flowchart of the study.
Figure 2
Figure 2
Study of CNV calling in matched groups. (A) Venn showing CNV calls generated by four software packages (B) CNV calls generated by multiple software packages.
Figure 3
Figure 3
Study Performance of CNV calling. (A) CNV calls of size distribution. (B) CNV frequency of occurrence.
Figure 4
Figure 4
Study CNV calling of size distribution and Chromosome distribution. (A) CNV calls of size distribution (B) CNV calls of Chromosome distribution.
Figure 5
Figure 5
Study CNV calling in non_overlap group. (A) Venn showing CNV calls generated by four software packages (B) CNV calls of multiple software packages.
Figure 6
Figure 6
ROC/AUC of study.

Similar articles

Cited by

References

    1. Zhang F, Gu W, Hurles ME, Lupski JR. Copy number variation in human health, disease, and evolution. Annu Rev Genom Hum Genet. 2009;10:451–481. doi: 10.1146/annurev.genom.9.081307.164217. - DOI - PMC - PubMed
    1. Diskin SJ, Hou C, Glessner JT, Attiyeh EF, Laudenslager M, Bosse K, Cole K, Mosse YP, Wood A, Lynch JE, Pecor K, Diamond M, Winter C, Wang K, Kim C, Geiger EA, McGrady PW, Blakemore AI, London WB, Shaikh TH, Bradfield J, Grant SF, Li H, Devoto M, Rappaport ER, Hakonarson H, Maris JM. Copy number variation at 1q21.1 associated with neuroblastoma. Nature. 2009;459(7249):987–991. doi: 10.1038/nature08035. - DOI - PMC - PubMed
    1. Sebat J, Lakshmi B, Malhotra D, Troge J, Lese-Martin C, Walsh T, Yamrom B, Yoon S, Krasnitz A, Kendall J, Leotta A, Pai D, Zhang R, Lee YH, Hicks J, Spence SJ, Lee AT, Puura K, Lehtimäki T, Ledbetter D, Gregersen PK, Bregman J, Sutcliffe JS, Jobanputra V, Chung W, Warburton D, King MC, Skuse D, Geschwind DH, Gilliam TC. et al.Strong association of de novo copy number mutations with autism. Science. 2007;316(5823):445–449. doi: 10.1126/science.1138659. - DOI - PMC - PubMed
    1. Gonzalez E, Kulkarni H, Bolivar H, Mangano A, Sanchez R, Catano G, Nibbs RJ, Freedman BI, Quinones MP, Bamshad MJ, Murthy KK, Rovin BH, Bradley W, Clark RA, Anderson SA, O'connell RJ, Agan BK, Ahuja SS, Bologna R, Sen L, Dolan MJ, Ahuja SK. The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science. 2005;307(5714):1434–1440. doi: 10.1126/science.1101160. - DOI - PubMed
    1. Kim J, Yim S, Jeong Y, Jung S, Xu H, Shin S, Chung Y. Comparison of normalization methods for defining copy number variation using whole-genome SNP genotyping data. Genomics Inf. 2008;6(4):231–234. doi: 10.5808/GI.2008.6.4.231. - DOI

Publication types

MeSH terms