BMC Cancer. 2013 Jul 2;13:326.
doi: 10.1186/1471-2407-13-326.

Interrogating differences in expression of targeted gene sets to predict breast cancer outcome

Sarah A Andres et al. BMC Cancer. 2013.

Abstract

Background: Genomics provides opportunities to develop precise tests for diagnostics, therapy selection and monitoring. From analyses of our own studies and of published results, 32 candidate genes were identified whose expression appears related to clinical outcome of breast cancer. Expression of these genes was validated by qPCR and correlated with clinical follow-up to identify a gene subset for development of a prognostic test.

Methods: RNA was isolated from 225 frozen invasive ductal carcinomas, and qRT-PCR was performed. Univariate hazard ratios and 95% confidence intervals for breast cancer mortality and recurrence were calculated for each of the 32 candidate genes. A multivariable gene expression model for predicting each outcome was determined using the LASSO, with 1000 splits of the data into training and testing sets to determine predictive accuracy based on the C-index. Models with gene expression data were compared to models with standard clinical covariates and models with both gene expression and clinical covariates.
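The evaluation loop described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: it computes Harrell's C-index for right-censored data and repeats it over random training/testing splits. The names `c_index`, `repeated_split_c_index`, and `fit_predict` are hypothetical, and the training fraction of 2/3 is an assumption, since the abstract does not state the split proportions. The `fit_predict` callable stands in for fitting a penalized Cox model on the training rows and scoring the test rows.

```python
import random


def c_index(times, events, risks):
    """Harrell's concordance index for right-censored data.

    A pair is comparable when the subject with the shorter follow-up
    time had an observed event (events == 1). The pair is concordant
    when that subject also has the higher predicted risk; tied risks
    count as 0.5. Pairs with tied follow-up times are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def repeated_split_c_index(times, events, fit_predict,
                           n_splits=1000, train_fraction=2 / 3, seed=0):
    """C-index on the held-out part of each random split.

    fit_predict(train_idx, test_idx) is a hypothetical callable that
    fits a model on the training indices and returns one risk score
    per test index, in the same order as test_idx.
    """
    rng = random.Random(seed)
    n = len(times)
    scores = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        cut = int(n * train_fraction)  # assumed split proportion
        train, test = idx[:cut], idx[cut:]
        risks = fit_predict(train, test)
        scores.append(c_index([times[k] for k in test],
                              [events[k] for k in test],
                              risks))
    return scores
```

The median of the returned list is then the kind of summary reported as a "median C-index on test data sets". A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of subjects by risk.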

Results: Univariate analyses revealed that over-expression of each of RABEP1, PGR, NAT1, PTP4A2, SLC39A6, ESR1, EVL, TBC1D9, FUT8, and SCUBE2 was associated with reduced time to disease-related mortality (HR between 0.8 and 0.91, adjusted p < 0.05), while RABEP1, PGR, SLC39A6, and FUT8 were also associated with reduced recurrence times. Multivariable analyses using the LASSO revealed PGR, ESR1, NAT1, GABRP, TBC1D9, SLC39A6, and LRBA to be the most important predictors for both disease mortality and recurrence. Median C-indexes on test data sets for the gene expression, clinical, and combined models were 0.65, 0.63, and 0.65 for disease mortality and 0.64, 0.63, and 0.66 for disease recurrence, respectively.
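As a reading aid for the hazard ratios quoted above: in a Cox model the hazard ratio is the exponential of the regression coefficient, and a Wald-type 95% confidence interval is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. A hazard ratio below 1 means the hazard decreases as the covariate increases. The sketch below assumes that convention; the function name `hazard_ratio` and the standard error used in the example are hypothetical and are not values from the study.

```python
import math


def hazard_ratio(beta, se, z=1.96):
    """Return (HR, lower, upper) for a Cox coefficient and its standard error.

    HR = exp(beta); the interval is exp(beta - z*se) to exp(beta + z*se).
    """
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Illustrative only: a coefficient chosen so that HR = 0.8, with a
# made-up standard error of 0.05.
hr, lower, upper = hazard_ratio(math.log(0.8), 0.05)
```

An interval that lies entirely below 1, as in this made-up example, is what an association significant at the 5% level looks like on the hazard ratio scale before any multiplicity adjustment.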

Conclusions: Molecular signatures consisting of five genes (PGR, GABRP, TBC1D9, SLC39A6 and LRBA) for disease mortality and of six genes (PGR, ESR1, GABRP, TBC1D9, SLC39A6 and LRBA) for disease recurrence were identified. These signatures were as effective as standard clinical parameters in predicting recurrence/mortality, and when combined, offered some improvement relative to clinical information alone for disease recurrence (median difference in C-values of 0.03, 95% CI of -0.08 to 0.13). Collectively, results suggest that these genes form the basis for a clinical laboratory test to predict clinical outcome of breast cancer.
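The "median difference in C-values" with its interval can be illustrated with a short sketch. The abstract does not say how the interval was computed; the code below assumes it is the 2.5th to 97.5th percentile range of the per-split differences between the combined and clinical-only models. The function names `percentile` and `summarize_difference` are hypothetical.

```python
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile of an ascending list, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def summarize_difference(c_combined, c_clinical):
    """Median and 2.5%/97.5% percentiles of paired C-index differences.

    Both inputs hold one C-index per split, in the same split order,
    so each difference compares the two models on the same test set.
    """
    diffs = sorted(a - b for a, b in zip(c_combined, c_clinical))
    return (statistics.median(diffs),
            percentile(diffs, 0.025),
            percentile(diffs, 0.975))
```

Under this reading, an interval that spans zero, such as the reported -0.08 to 0.13, indicates that the combined model did not outperform the clinical model on every split.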

Figures

Figure 1
Kaplan-Meier plots illustrating separation among the 1000 test data sets. Predictions were based on L1-penalized (LASSO) Cox regression models fitted to each training set. Plots A (OS = overall survival) and B (DFS = disease free survival) represent predictions of low or high risk based on actual data, while plots C (OS) and D (DFS) represent predictions based on permuted data sets.
Figure 2
Boxplots of C-index values of the 1000 test data sets. Predictions were made using L1-penalized (LASSO) Cox regression models fitted to the training data sets. Predictions made using both actual and permuted data are shown.
Figure 3
Boxplots of beta coefficients associated with expression of the seven most frequently occurring genes in the Cox regression models among the 1000 training data sets, where genes were selected using the LASSO. Left panel = disease mortality, right panel = disease recurrence.
Figure 4
Kaplan-Meier plots illustrating separation among the 1000 test data sets. Predictions of low or high risk were based on Cox regression models fitted to each training set, either using gene expression (GE) only (A = overall survival (OS), D = disease free survival (DFS)), clinical data only (B = OS, E = DFS), or both gene expression and clinical data (C = OS, F = DFS). Clinical data included for both outcomes were patient stage of disease at diagnosis (1, 2, and 3 or 4), ER status (+/-), and PR status (+/-). Genes included in both the OS and DFS models were PGR, GABRP, TBC1D9, SLC39A6 and LRBA, while NAT1 was also included in the model for DFS.
Figure 5
Boxplots of C-index values for the 1000 test data sets. Predictions were made using Cox regression models fitted to each training set. Models were derived from either gene expression (GE) data only, clinical data only, or results from both gene expression and clinical data. The green line shown on each panel represents the C-index corresponding to the 10 year Adjuvant! Online risk scores calculated for both disease mortality and disease recurrence, respectively. Clinical data included for both outcomes were patient stage of disease at diagnosis (1, 2, and 3 or 4), ER status (+/-), and PR status (+/-). Genes included in both the OS and DFS models were PGR, GABRP, TBC1D9, SLC39A6 and LRBA, while NAT1 was also included in the model for DFS.
Figure 6
Boxplots of C-index values for the 1000 test data sets, stratified by ER +/- status. Predictions were made using Cox regression models fitted to each training set, derived using gene expression (GE) data. Genes included in both the OS and DFS models were PGR, GABRP, TBC1D9, SLC39A6 and LRBA, while NAT1 was also included in the model for DFS.
Figure 7
Boxplots of C-index values for the 1000 test data sets derived from the TRANSBIG data. Predictions were made using Cox regression models fitted to each training set. Letters correspond to the following fitted models: A) clinical data only (age, size and grade of tumor, ER +/- status), B) gene expression data (PGR, GABRP, TBC1D9, SLC39A6 and LRBA for both OS and DFS, and additionally NAT1 for DFS), C) clinical data plus gene expression data, D) randomly selected gene expression data (5 genes for OS and 6 genes for DFS), and E) randomly selected gene expression data plus clinical data. All five models included the medical center where the patient was seen as a covariate. The horizontal green line shown on each panel represents the C-index corresponding to the Veridex 76-gene prognostic signature [11] calculated based on the full data for both disease mortality and disease recurrence, respectively.
Figure 8
Boxplots of Beta coefficients associated with expression of the genes in the Cox regression models fitted to the training data sets from the TRANSBIG data. Genes included in both the OS and DFS models were PGR, GABRP, TBC1D9, SLC39A6 and LRBA, while NAT1 was also included in the model for DFS. Separate panels are given for the entire data and ER +/- subsets (rows), and for disease mortality and recurrence (columns).
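The Kaplan-Meier plots in Figures 1 and 4 rest on the product-limit estimator of the survival function. A minimal sketch follows, assuming event indicators coded 1 for an observed event and 0 for censoring; the function name `kaplan_meier` is hypothetical and this is not the authors' plotting code.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set.
        at_risk -= removed
    return curve
```

To reproduce the style of plot described in the captions, one curve would be estimated for the subjects predicted to be low risk and another for those predicted to be high risk. How the low/high cutoff was chosen is not stated in the captions.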

References

    1. Bonner RF, Emmert-Buck M, Cole K, Pohida T, Chuaqui R, Goldstein S, Liotta LA. Laser capture microdissection: molecular analysis of tissue. Science. 1997;278:1481–1483. doi: 10.1126/science.278.5342.1481.
    2. Emmert-Buck MR, Bonner RF, Smith PD, Chuaqui RF, Zhuang Z, Goldstein SR, Weiss RA, Liotta LA. Laser capture microdissection. Science. 1996;274:998–1001. doi: 10.1126/science.274.5289.998.
    3. Wittliff JL, Kunitake ST, Chu SS, Travis JC. Applications of laser capture microdissection in genomics and proteomics. J Clin Ligand Assay. 2000;23:66.
    4. Wittliff JL, Erlander MG. Laser capture microdissection and its applications in genomics and proteomics. Methods Enzymol. 2002;356:12–25.
    5. Kang Y, Siegel PM, Shu W, Drobnjak M, Kakonen SM, Cordon-Cardo C, Guise TA, Massague J. A multigenic program mediating breast cancer metastasis to bone. Cancer Cell. 2003;3:537–549. doi: 10.1016/S1535-6108(03)00132-6.
