J Am Stat Assoc. 2021;116(533):335-352.
doi: 10.1080/01621459.2020.1772080. Epub 2020 Jul 7.

Estimation and Validation of Ratio-based Conditional Average Treatment Effects Using Observational Data

Steve Yadlowsky et al. J Am Stat Assoc. 2021.

Abstract

While sample sizes in randomized clinical trials are large enough to estimate the average treatment effect well, they are often insufficient for estimating the treatment-covariate interactions that are critical to data-driven precision medicine. Observational data from real-world practice may play an important role in alleviating this problem. One common approach in trials is to predict the outcome of interest with separate regression models in each treatment arm, and to estimate the treatment effect from the contrast of the predictions. Unfortunately, this simple approach may induce spurious treatment-covariate interactions in observational studies when the regression model is misspecified. Motivated by the need to model the number of relapses in multiple sclerosis patients, where the ratio of relapse rates is a natural choice of treatment effect, we propose to estimate the conditional average treatment effect (CATE) as the ratio of expected potential outcomes, and derive a doubly robust estimator of this CATE in a semiparametric model of treatment-covariate interactions. We also provide a validation procedure to check the quality of the estimator on an independent sample. We conduct simulations to demonstrate the finite sample performance of the proposed methods, and illustrate their advantages on real data by examining the treatment effect of dimethyl fumarate compared to teriflunomide in multiple sclerosis patients.

Keywords: Conditional Average Treatment Effect; Doubly Robust Estimation; Heterogeneous Treatment Effect; Observational Study; Precision Medicine.
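
As an illustration only, the "separate regression models in each treatment arm" approach mentioned in the abstract, combined with a ratio-type CATE, might be sketched as follows. This is not the authors' doubly robust estimator. The simulated data, the randomized treatment assignment, the log-linear Poisson outcome model, and the helper names `fit_poisson` and `cate_ratio` are all assumptions made for this sketch.

```python
import numpy as np

def fit_poisson(X, y, n_iter=25):
    """Fit a log-linear Poisson regression with Newton's method (hypothetical helper)."""
    Xd = np.column_stack([np.ones(len(y)), X])   # design matrix with intercept
    beta = np.zeros(Xd.shape[1])
    beta[0] = np.log(y.mean())                   # start at the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(Xd @ beta)                   # fitted means under the current beta
        grad = Xd.T @ (y - mu)                   # score of the Poisson log-likelihood
        hess = Xd.T @ (Xd * mu[:, None])         # observed information matrix
        beta += np.linalg.solve(hess, grad)      # Newton update
    return beta

def cate_ratio(X, beta_arm1, beta_arm0):
    """Ratio of predicted mean outcomes, arm 1 over arm 0, at covariates X."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return np.exp(Xd @ (beta_arm1 - beta_arm0))

# Simulated data (an assumption): one covariate, randomized binary treatment,
# Poisson counts whose log-rate is linear in x within each arm.
rng = np.random.default_rng(0)
n = 4000
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n)
log_rate = np.where(a == 1, -0.1 + 0.8 * x, 0.2 + 0.3 * x)
y = rng.poisson(np.exp(log_rate))

# Separate regression in each arm, then the ratio of the two predictions.
beta1 = fit_poisson(x[a == 1], y[a == 1])
beta0 = fit_poisson(x[a == 0], y[a == 0])
ratio = cate_ratio(x, beta1, beta0)

# Estimated log-ratio coefficients; the simulation truth is (-0.3, 0.5).
print(beta1 - beta0)
```

Here treatment is randomized and both outcome models are correctly specified, so the estimated ratio tracks the simulated truth. The abstract's caution concerns observational data with a misspecified regression model, where this contrast can show spurious treatment-covariate interactions.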

Figures

Fig. 1
The ATE in subgroups of patients identified by different CATE scores in four simulation settings. The scores compared are the true CATE, contrast regression, two regression, naïve regression, boosting, modified outcome boosting, and Bayesian additive regression trees (BART).
Fig. 2
The distribution of correlation coefficients between the estimated and true CATE in four simulation settings. Six methods are shown from left to right: contrast regression (dark gray), two regression (light gray), naïve regression, boosting, modified outcome boosting, and Bayesian additive regression trees (BART).
Fig. 3
The log-transformed CATE scores based on the standard regression approach and the proposed doubly robust adjustment method: the CATE for the ratio of relapse rates.
Fig. 4
The ATE (relapse rate ratio of TERI vs DMF) in subgroups of patients based on the CATE scores constructed in the training set (two proposed methods, naïve regression and boosting) in the NTD registry.
Fig. 5
The cross-validated ATE (relapse rate ratio of TERI vs DMF) of subgroups of patients identified by different CATE scores (two proposed methods, naïve regression and boosting) in the NTD registry.
