Review
2019 Sep 27;20(5):1734-1753.
doi: 10.1093/bib/bby046.

Evaluating the consistency of large-scale pharmacogenomic studies

Raziur Rahman et al. Brief Bioinform.

Abstract

Recent years have seen an increase in the availability of pharmacogenomic databases, such as Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE), that provide genomic and functional characterization information for multiple cell lines. Studies have suggested that specific characterizations may be inconsistent between different databases. Analyzing the potential discrepancies between databases is important, as these sources are frequently used to analyze and validate methodologies for personalized cancer therapies. In this article, we review recent developments in investigating the correspondence between different pharmacogenomic databases and discuss the factors that require attention when incorporating these sources in any modeling analysis. Furthermore, we explore the consistency among these databases using copulas, which can capture nonlinear dependencies between two sets of data.

Keywords: copulas; database dependencies; pairwise relationships; pharmacogenomic databases.

Figures

Figure 1.
Venn diagrams of data intersection between NCI60, CCLE and GDSC data sets.
Figure 2.
(A–C) Pearson (blue) and Spearman (yellow) correlation coefficients between the common drugs and common cell lines (CLs) of CCLE and GDSC. With range adjustment and normalization, the average Pearson correlation of the drugs increased considerably (of 16 drugs, 8 have a correlation coefficient >0.5, marked by the red reference line at 0.5). For NCI60 and CCLE (D–F) and NCI60 and GDSC (G–I), the correlation coefficients are generally low. In (G–I), a histogram of the correlation coefficients is shown because the number of common drugs between NCI60 and GDSC is high. In some cases, the correlation coefficient is not considered because the IC50 shows no variation across CLs.
Figure 3.
Pearson correlation coefficients of CCLE and GDSC drug pairs are shown in two dimensions.
Figure 4.
For all 105 possible cases, box plots of the Pearson correlation coefficients for drug pair responses across bootstrapped sets of CCLE are shown (red plus signs indicate outlier sets). The corresponding Pearson correlation coefficients for the same drug pair responses in GDSC (green stars) are included to show how often these correlations lie inside the box.
Figure 5.
Illustration of copula generation with three hypothetical common CLs of CCLE and GDSC with five genes.
Figure 6.
Illustration of copula generation with a hypothetical common CL of CCLE and GDSC with five genes that are ordered differently for different cases.
Figure 7.
Distribution of the Frobenius norm difference of copulas with ordered and disordered genes for identical CLs of the CCLE and GDSC databases. The mean Frobenius norm difference is 0.05 for the ordered gene case and 2.12 for the disordered gene case.
Figure 8.
Distribution of the Frobenius norm difference of copulas with ordered and disordered CLs for identical genes of the CCLE and GDSC databases. The mean Frobenius norm difference is 0.75 for the ordered CL case and 1.44 for the disordered CL case.
Figure 9.
Distribution of the Frobenius norm difference of copulas with ordered and disordered CLs for identical drugs of the CCLE and GDSC databases. The mean Frobenius norm difference is 0.29 for the ordered CL case and 0.76 for the disordered CL case.
Figure 10.
Distribution of the Frobenius norm difference of copulas of drug pairs with ordered and disordered drug responses for 15 common drugs from the CCLE and GDSC databases. The mean Frobenius norm difference is 0.23 for the ordered CL case and 0.53 for the disordered CL case.
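
The captions for Figures 5 to 10 refer to building copulas from paired measurements and comparing them by a Frobenius norm difference. The sketch below illustrates one way this comparison could be set up. It is not the authors' procedure: it assumes an empirical copula evaluated on a regular grid, and the function names (`pseudo_observations`, `empirical_copula`, `frobenius_difference`), the grid size, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def pseudo_observations(x):
    """Rank-transform a sample to (0, 1] as rank / n (ties broken by position)."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x, kind="stable"), kind="stable") + 1
    return ranks / len(x)


def empirical_copula(x, y, grid_size=10):
    """Empirical copula of paired samples (x, y) on a grid_size x grid_size grid.

    Entry [i, j] is the fraction of pairs whose rank-transformed values
    satisfy u <= grid[i] and v <= grid[j].
    """
    u = pseudo_observations(x)
    v = pseudo_observations(y)
    grid = np.arange(1, grid_size + 1) / grid_size
    below_u = (u[None, :] <= grid[:, None]).astype(float)  # (grid, n)
    below_v = (v[None, :] <= grid[:, None]).astype(float)  # (grid, n)
    return below_u @ below_v.T / len(u)


def frobenius_difference(c1, c2):
    """Frobenius norm of the difference between two copula matrices."""
    return float(np.linalg.norm(c1 - c2, ord="fro"))


# Synthetic stand-ins for one measurement reported by two databases
# (hypothetical data, not taken from CCLE or GDSC).
rng = np.random.default_rng(0)
reference = rng.normal(size=200)
source_a = reference + 0.1 * rng.normal(size=200)
source_b = np.exp(reference + 0.1 * rng.normal(size=200))  # nonlinear scale

# "Ordered": samples matched one-to-one; "disordered": matching is shuffled.
ordered = frobenius_difference(
    empirical_copula(reference, source_a),
    empirical_copula(reference, source_b),
)
disordered = frobenius_difference(
    empirical_copula(reference, source_a),
    empirical_copula(reference, rng.permutation(source_b)),
)
```

Because the copula depends only on ranks, a monotone rescaling of one source (such as the exponential used here) leaves it unchanged, which is why a copula-based comparison can register agreement that a linear correlation coefficient might understate. Under these assumptions, `ordered` would be expected to be smaller than `disordered`, mirroring the ordered versus disordered contrast described in the captions.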
