Delta Method implementation to estimate standard errors in a {tidyverse} workflow.
You can install the stable version from CRAN with
install.packages("tidydelta")
Or the development version from GitHub with
remotes::install_github("JavierMtzRdz/tidydelta")
# Or
devtools::install_github("JavierMtzRdz/tidydelta")
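In general terms, the Delta Method provides a tool for approximating the
behaviour of an estimator $\phi(T_n)$, where $T_n$ is an estimator of a
parameter $\theta$ and $\phi$ is a differentiable function. The
approximation rests on a first-order Taylor expansion of $\phi$ around
$\theta$:

$$\phi(T_n) \approx \phi(\theta) + \phi'(\theta)\,(T_n - \theta)$$

This observation allows us to take a step forward to decompose the
theorem of the DM. Assuming that $T_n$ is asymptotically normal,

$$\sqrt{n}\,(T_n - \theta) \xrightarrow{d} N(0, \sigma^2)$$

Now, as the sample size $n \to \infty$, $T_n$ concentrates around
$\theta$, the remainder of the expansion becomes negligible, and

$$\sqrt{n}\,\bigl(\phi(T_n) - \phi(\theta)\bigr) \xrightarrow{d} N\bigl(0, \phi'(\theta)^2 \sigma^2\bigr)$$

Given the previous result, we can rearrange the equations to show that

$$\phi(T_n) \approx N\left(\phi(\theta), \frac{\phi'(\theta)^2 \sigma^2}{n}\right)$$

as $n \to \infty$.

We often encounter scenarios where our estimator is a function of
several estimates, $T_n = (T_{n,1}, \dots, T_{n,k})$, such as the ratio
of two means.
To approximate this scenario using the Delta Method, we need to compute
the vector of all partial derivatives of $\phi$, the gradient
$\nabla\phi(\theta)$, which gives

$$\phi(T_n) \approx N\left(\phi(\theta), \nabla\phi(\theta)^{T}\,\frac{\Sigma}{n}\,\nabla\phi(\theta)\right)$$

In this equation, $\Sigma / n$ is the covariance matrix of $T_n$, and the
standard error of $\phi(T_n)$ is the square root of the variance term.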
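To make the multivariate case concrete, here is a minimal base R sketch that computes a Delta Method standard error by hand for the ratio of two sample means. It does not use tidydelta and is not a description of the package internals. The simulated data and the names `cov_t`, `grad`, and `se_delta` are assumptions made for illustration.

```r
# Minimal sketch: Delta Method standard error for mean(y) / mean(x).
# Base R only; all object names are illustrative.
set.seed(1)
x <- rnorm(1000, mean = 5, sd = 2)
y <- rnorm(1000, mean = 15, sd = 3)

# Point estimates and the covariance matrix of the two sample means
m_y <- mean(y)
m_x <- mean(x)
cov_t <- cov(cbind(y, x)) / length(x)

# deriv() differentiates the formula symbolically; evaluating the result
# returns the value of y / x with the gradient attached as an attribute
d <- deriv(~ y / x, c("y", "x"))
val <- eval(d, list(y = m_y, x = m_x))
grad <- attr(val, "gradient") # 1 x 2 matrix: (1 / x, -y / x^2)

# Delta Method variance: gradient %*% covariance %*% t(gradient)
phi_hat <- as.numeric(val)
se_delta <- sqrt(drop(grad %*% cov_t %*% t(grad)))

c(estimate = phi_hat, se = se_delta)
```

The covariance matrix passed in determines what the standard error refers to. Here it is the covariance of the sample means, so `se_delta` describes the uncertainty of the ratio of means.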
Using tidydelta(), the following commands are equivalent:
# Load packages
library(tidydelta)
library(tidyverse)
# Simulate samples
set.seed(547)
x <- rnorm(10000, mean = 5, sd = 2)
y <- rnorm(10000, mean = 15, sd = 3)
bd <- tibble(x, y)
# Equivalent uses of tidydelta()
tidydelta(~ y / x,
  conf_lev = .95
)
#> # A tibble: 1 × 6
#>       y     x   T_n    se lower_ci upper_ci
#>   <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1  15.0  5.02  2.99  1.33    0.378     5.61
tidydelta(~ bd$y / bd$x,
  conf_lev = .95
)
#> # A tibble: 1 × 6
#>       y     x   T_n    se lower_ci upper_ci
#>   <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1  15.0  5.02  2.99  1.33    0.378     5.61
bd %>%
  summarise(tidydelta(~ y / x,
    conf_lev = .95
  ))
#> # A tibble: 1 × 6
#>       y     x   T_n    se lower_ci upper_ci
#>   <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1  15.0  5.02  2.99  1.33    0.378     5.61
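The printed interval is consistent with a normal-approximation interval of the form T_n ± z × se. The sketch below checks that arithmetic from the rounded values in the output. It is an inference from the printed numbers, not a statement about how tidydelta() builds the interval internally, and the object names are illustrative.

```r
# Values copied from the printed output (rounded to three significant digits)
t_n <- 2.99
se <- 1.33
conf_lev <- 0.95

# Two-sided standard normal quantile for the chosen confidence level
z <- qnorm(1 - (1 - conf_lev) / 2)

# Normal-approximation interval around the point estimate
ci <- c(lower = t_n - z * se, upper = t_n + z * se)
ci
```

Because the inputs are rounded, the result agrees with the lower_ci and upper_ci columns only to about two decimal places.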
Now, the transformation is applied to every observation in the data
frame so that its distribution can be compared with the estimate from
tidydelta(). In the real world, you would not need to compute the Delta
Method if you had many samples of the transformed quantity, but the
example shows how it can be incorporated in a tidyverse workflow.
(result <- bd %>%
  summarise(tidydelta(~ x / y,
    conf_lev = .95
  )))
#> # A tibble: 1 × 6
#>       x     y   T_n    se lower_ci upper_ci
#>   <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
#> 1  5.02  15.0 0.334 0.149   0.0422    0.626
# Histogram of the row-wise ratio, with the Delta Method estimate and CI
ggplot() +
  geom_histogram(
    data = bd %>%
      mutate(t = x / y),
    aes(x = t)
  ) +
  # Point estimate
  geom_vline(aes(
    xintercept = result$T_n,
    color = "T_n"
  )) +
  # Confidence interval bounds
  geom_vline(
    aes(
      xintercept = c(
        result$lower_ci,
        result$upper_ci
      ),
      color = "CI"
    ),
    linetype = "dashed"
  ) +
  labs(color = NULL) # drop the legend title
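The same comparison can be made numerically. The base R sketch below regenerates the simulated columns, computes a Delta Method standard deviation for x / y from the column means and the column covariance, and sets it next to the empirical standard deviation of the row-wise ratio. Using the column covariance without dividing by n is an assumption chosen because it lands close to the se printed for `result`. It is not taken from the package source, and the object names are illustrative.

```r
# Regenerate the simulated data with the same seed as the example
set.seed(547)
x <- rnorm(10000, mean = 5, sd = 2)
y <- rnorm(10000, mean = 15, sd = 3)

# Gradient of phi(x, y) = x / y evaluated at the column means
m_x <- mean(x)
m_y <- mean(y)
grad <- c(1 / m_y, -m_x / m_y^2)

# Delta Method spread using the covariance of the columns (assumption)
se_delta <- sqrt(drop(t(grad) %*% cov(cbind(x, y)) %*% grad))

# Empirical spread of the row-wise ratio shown in the histogram
sd_empirical <- sd(x / y)

c(delta = se_delta, empirical = sd_empirical)
```

The two numbers should be similar but not identical, since the Delta Method is a first-order approximation and ignores the curvature of x / y.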
Bouchard-Côté, Alexandre. n.d. Probability, Illustrated. Accessed October 25, 2023.
Vaart, A. W. van der. 2000. Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press.
Weisberg, Sanford. 2005. Applied Linear Regression. 3rd ed. Hoboken NJ: Wiley.
Zepeda-Tello, Rodrigo, Michael Schomaker, Camille Maringe, Matthew J. Smith, Aurelien Belot, Bernard Rachet, Mireille E. Schnitzer, and Miguel Angel Luque-Fernandez. 2022. “The Delta-Method and Influence Function in Medical Statistics: A Reproducible Tutorial.” arXiv. https://arxiv.org/abs/2206.15310.