Estimation and Validation of Ratio-based Conditional Average Treatment Effects Using Observational Data

J Am Stat Assoc. 2021;116(533):335-352. doi: 10.1080/01621459.2020.1772080. Epub 2020 Jul 7.

Steve Yadlowsky¹, Fabio Pellegrini², Federica Lionetto³, Stefan Braune⁴, Lu Tian⁵

Affiliations

¹ Stanford University, Electrical Engineering, 1265 Welch Rd, Stanford, 94305-6104 United States.
² Biogen International GmbH, Baar, Switzerland.
³ PwC Data & Analytics, Zurich, Switzerland.
⁴ NeuroTransData, Neurology, Neuburg an der Donau, Germany.
⁵ Stanford University, Department of Biomedical Data Science, Stanford, 94305-6104 United States.

PMID: 33767517
PMCID: PMC7985957
DOI: 10.1080/01621459.2020.1772080

Steve Yadlowsky et al. J Am Stat Assoc. 2021.


Abstract

While sample sizes in randomized clinical trials are large enough to estimate the average treatment effect well, they are often insufficient for estimating the treatment-covariate interactions that are critical to data-driven precision medicine. Observational data from real-world practice may play an important role in alleviating this problem. One common approach in trials is to predict the outcome of interest with separate regression models in each treatment arm, and to estimate the treatment effect from the contrast of the predictions. Unfortunately, this simple approach may induce spurious treatment-covariate interactions in observational studies when the regression model is misspecified. Motivated by the need to model the number of relapses in multiple sclerosis patients, where the ratio of relapse rates is a natural choice of treatment effect, we propose to estimate the conditional average treatment effect (CATE) as the ratio of expected potential outcomes, and derive a doubly robust estimator of this CATE in a semiparametric model of treatment-covariate interactions. We also provide a validation procedure to check the quality of the estimator on an independent sample. We conduct simulations to demonstrate the finite-sample performance of the proposed methods, and illustrate their advantages on real data by examining the treatment effect of dimethyl fumarate compared to teriflunomide in multiple sclerosis patients.

Keywords: Conditional Average Treatment Effect; Doubly Robust Estimation; Heterogeneous Treatment Effect; Observational Study; Precision Medicine.
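
For readers who want a concrete picture of the "separate regression models in each treatment arm" baseline mentioned in the abstract, the following is a minimal sketch for a count outcome. It is not the authors' doubly robust estimator. It assumes a log-linear Poisson model in each arm, and every name in it (`fit_poisson`, `two_regression_log_cate`, the synthetic data-generating process and its coefficients) is hypothetical and chosen only for illustration. The ratio-based CATE is taken to be the ratio of predicted rates, so its logarithm is a difference of linear predictors.

```python
import numpy as np


def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Fit a log-linear Poisson regression with Newton's method.

    Assumption for this sketch: E[y | x] = exp(b0 + x @ b).
    Returns the coefficient vector with the intercept first.
    """
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(X1.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)  # start at the marginal mean rate
    for _ in range(n_iter):
        mu = np.exp(X1 @ beta)            # fitted rates
        grad = X1.T @ (y - mu)            # score of the Poisson log-likelihood
        hess = X1.T @ (X1 * mu[:, None])  # Fisher information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def two_regression_log_cate(X, treat, y):
    """Log of the ratio-based CATE from separate per-arm Poisson fits."""
    b1 = fit_poisson(X[treat == 1], y[treat == 1])  # treated arm
    b0 = fit_poisson(X[treat == 0], y[treat == 0])  # control arm
    X1 = np.column_stack([np.ones(len(X)), X])
    # log( rate_1(x) / rate_0(x) ) = x @ (b1 - b0)
    return X1 @ (b1 - b0)


# Synthetic observational data (hypothetical; treatment depends on covariates).
rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 2))
propensity = 1.0 / (1.0 + np.exp(-0.5 * X[:, 0]))
treat = rng.binomial(1, propensity)
true_log_cate = -0.3 + 0.5 * X[:, 1]
log_rate = 0.2 + 0.3 * X[:, 0] + treat * true_log_cate
y = rng.poisson(np.exp(log_rate))

log_cate_hat = two_regression_log_cate(X, treat, y)
corr = np.corrcoef(log_cate_hat, true_log_cate)[0, 1]
print(f"correlation between estimated and true log-CATE: {corr:.3f}")
```

In this sketch the per-arm models are correctly specified, so the contrast tracks the true log-CATE. The abstract's concern is the opposite case: when the outcome models are misspecified and treatment assignment depends on covariates, the same contrast may show spurious treatment-covariate interactions.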

Figures

**Fig. 1**
The ATE in subgroups of patients identified by different CATE scores in four simulation settings including: the true CATE, contrast regression, two regression, naïve regression, boosting, modified outcome boosting, and Bayesian additive regression trees (BART).

**Fig. 2**
The distribution of correlation coefficients between the estimated and true CATE in four simulation settings; there are six methods considered from left to right: contrast regression (dark gray), two regression (light gray), naïve regression, boosting, modified outcome boosting, and Bayesian additive regression trees (BART).

**Fig. 3**
The log-transformed CATE scores based on the standard regression approach and the proposed doubly robust adjustment method: the CATE for the ratio of relapse rates.

**Fig. 4**
The ATE (relapse rate ratio of TERI vs DMF) in subgroups of patients based on the CATE scores constructed in the training set (two proposed methods, naïve regression and boosting) in the NTD registry.

**Fig. 5**
The cross-validated ATE (relapse rate ratio of TERI vs DMF) of subgroups of patients identified by different CATE scores (two proposed methods, naïve regression and boosting) in the NTD registry.

Cited by

Implementation of a data control framework to ensure confidentiality, integrity, and availability of high-quality real-world data (RWD) in the NeuroTransData (NTD) registry.
Wehrle K, Tozzi V, Braune S, Roßnagel F, Dikow H, Paddock S, Bergmann A, van Hövell P. JAMIA Open. 2022 Mar 9;5(1):ooac017. doi: 10.1093/jamiaopen/ooac017. eCollection 2022 Apr. PMID: 35571355. Free PMC article.

Toward a causal model of chronic back pain: Challenges and opportunities.
Huie JR, Vashisht R, Galivanche A, Hadjadj C, Morshed S, Butte AJ, Ferguson AR, O'Neill C. Front Comput Neurosci. 2023 Jan 11;16:1017412. doi: 10.3389/fncom.2022.1017412. eCollection 2022. PMID: 36714527. Free PMC article. Review.

Overall and patient-level comparative effectiveness of dimethyl fumarate and fingolimod: A precision medicine application to the Observatoire Français de la Sclérose en Plaques registry.
Simoneau G, Jiang X, Rollot F, Tian L, Copetti M, Guéry M, Ruiz M, Vukusic S, de Moor C, Pellegrini F; OFSEP investigators. Mult Scler J Exp Transl Clin. 2022 Aug 4;8(3):20552173221116591. doi: 10.1177/20552173221116591. eCollection 2022 Jul-Sep. PMID: 35959484. Free PMC article.
