Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing reference model components to delay_group_lmpf and generated quantities #147

Merged
merged 60 commits into from
Sep 20, 2022
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
60 commits
Select commit Hold shift + click to select a range
b58ecfe
add required delay_group_lmpf changes for missing ref model
seabbs Jul 31, 2022
3cadc35
debug to compilation
seabbs Jul 31, 2022
a004d8e
add missing reference model to gq
seabbs Jul 31, 2022
5dd2dbc
use segment where possible
seabbs Aug 1, 2022
b235579
add first draft of missing reference look-up
seabbs Aug 1, 2022
48acfdc
correct delay structure
seabbs Aug 1, 2022
9a47a13
model fitting
seabbs Aug 1, 2022
f172d8a
debug allocation of missing reference effects
seabbs Aug 1, 2022
c1e8396
model fitting but not recovering simulated proportion
seabbs Aug 1, 2022
28c47bd
update snaps for enw_missing
seabbs Aug 1, 2022
77cee2d
add plot
seabbs Aug 2, 2022
1856b89
make a output processing
seabbs Aug 2, 2022
a9f8e01
use the correct likelihood you tool
seabbs Aug 2, 2022
28a0d58
make example multi-threaded
seabbs Aug 2, 2022
ddee268
use the correct helper function (log1m_exp not log1m)
seabbs Aug 2, 2022
85a4617
add enw_incidence_to_cumulativ and update enw_new_reports to match
seabbs Aug 10, 2022
0223a5c
update nowcast date for missing example and clean up code
seabbs Aug 10, 2022
7bbd646
reset to same date as used in all other examples
seabbs Aug 10, 2022
3da960e
fix merge issues and turn off warning
seabbs Aug 10, 2022
f726334
use a fixed proportion missing
seabbs Aug 10, 2022
d5dbd24
explore example
seabbs Aug 11, 2022
09e5536
add enw_incidence_to_cumulative
seabbs Aug 11, 2022
c009112
add enw_incidence_to_cumulative
seabbs Aug 11, 2022
ac9c2dd
local CRAN check
seabbs Aug 11, 2022
6a7aac5
solve merge conflicts
seabbs Aug 11, 2022
7a00cbe
add internal helper functions for missing reference lookup
seabbs Aug 11, 2022
4aa7a18
add new global variables
seabbs Aug 11, 2022
be5bbe8
update wordlist
seabbs Aug 11, 2022
d9e8f4f
Merge branch 'develop' into feature-missing-reference-function
seabbs Aug 12, 2022
8a09ec7
add missing refeerence model definition
seabbs Aug 12, 2022
2d1964e
Merge branch 'feature-missing-reference-function' of https://github.c…
seabbs Aug 12, 2022
0d45c7b
write tests for enw_reps_with_complete_refs
seabbs Aug 16, 2022
8e7c1de
fix indexing bug with enw_reference_by_report
seabbs Aug 16, 2022
e77c079
merge develop
seabbs Aug 16, 2022
74ad2df
fix issues from #151 causing spurious warnings
seabbs Aug 16, 2022
8029a18
fix failing tests due to sorting standardisation
seabbs Aug 16, 2022
0b64683
debug ordering changes
seabbs Aug 16, 2022
caef167
more test fixes
seabbs Aug 16, 2022
83ef418
fix enw_complete_dates tests
seabbs Aug 17, 2022
bd0a430
add enw_simulate_missing_reference to package
seabbs Aug 17, 2022
b6cc38f
exporte enw_simulate_missing_reference
seabbs Aug 17, 2022
5808134
make cmdstanr tests skip locally
seabbs Aug 17, 2022
6a2ab6d
complete missing model convergence check
seabbs Aug 17, 2022
79154a5
update news and contribnuting
seabbs Aug 17, 2022
38ae0d0
typo in contributing
seabbs Aug 18, 2022
40af056
Change temp variable name
adrian-lison Aug 30, 2022
b890f81
Refactor filt_obs_indexed
adrian-lison Aug 30, 2022
1a7cb2c
Fix filt_obs_indexes
adrian-lison Aug 30, 2022
5231dec
Improve in-model code doc
adrian-lison Aug 31, 2022
a68bf80
Streamline time wording in in-model doc
adrian-lison Aug 31, 2022
3cbe6b4
Refactor variable names in delay_group_lpmf
adrian-lison Sep 1, 2022
6f3ff55
add localisation changes from main
seabbs Sep 1, 2022
da55c32
Merge branch 'develop' into feature-missing-reference-function
seabbs Sep 1, 2022
0e6a1c7
add usage warning for the missing data MVP
seabbs Sep 1, 2022
11d647b
update news
seabbs Sep 1, 2022
2686216
spelling and global variables
seabbs Sep 1, 2022
2ec8058
add handling of group-wise missing reference observations and look-ups
seabbs Sep 1, 2022
c3494e5
use filtered missing reference obs in likelihood
seabbs Sep 1, 2022
aeb7eef
add missing lookup variables
seabbs Sep 1, 2022
8bdf530
update test snapshot
seabbs Sep 1, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
add enw_incidence_to_cumulative
  • Loading branch information
seabbs committed Aug 11, 2022
commit 09e553617d17e9adca4f643f54f32133e1a20f56
3 changes: 2 additions & 1 deletion NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ export(enw_add_pooling_effect)
export(enw_assign_group)
export(enw_complete_dates)
export(enw_construct_data)
export(enw_cumulative_to_incidence)
export(enw_dates_to_factors)
export(enw_delay_filter)
export(enw_delay_metadata)
Expand All @@ -24,13 +25,13 @@ export(enw_filter_report_dates)
export(enw_fit_opts)
export(enw_formula)
export(enw_formula_as_data_list)
export(enw_incidence_to_cumulative)
export(enw_latest_data)
export(enw_manual_formula)
export(enw_metadata)
export(enw_missing)
export(enw_missing_reference)
export(enw_model)
export(enw_new_reports)
export(enw_nowcast_samples)
export(enw_nowcast_summary)
export(enw_obs)
Expand Down
86 changes: 58 additions & 28 deletions R/preprocess.R
Original file line number Diff line number Diff line change
Expand Up @@ -118,8 +118,8 @@ enw_assign_group <- function(obs, by = c()) {

#' @title FUNCTION_TITLE
#' @description FUNCTION_DESCRIPTION
#' @param obs PARAM_DESCRIPTION
#' @return OUTPUT_DESCRIPTION
#' @inheritParams enw_cumulative_to_incidence
#' @family preprocess
#' @export
#' @importFrom data.table as.data.table copy
Expand All @@ -131,15 +131,16 @@ enw_add_delay <- function(obs) {

#' @title FUNCTION_TITLE
#' @description FUNCTION_DESCRIPTION
#' @param obs PARAM_DESCRIPTION
#'
#' @return OUTPUT_DESCRIPTION
#'
#' @inheritParams enw_cumulative_to_incidence
#' @inheritParams enw_latest_data
#' @family preprocess
#' @export
#' @importFrom data.table copy
enw_add_max_reported <- function(obs) {
obs <- data.table::copy(obs)
obs <- check_dates(obs)
orig_latest <- enw_latest_data(obs)
orig_latest <- orig_latest[
,
Expand Down Expand Up @@ -291,8 +292,8 @@ enw_latest_data <- function(obs) {
#'
#' @param obs A data frame containing at least the following variables:
#' `reference date` (index date of interest), `report_date` (report date for
#' observations), `confirm` (cumulative observations by reference and report
#' date), and `.group` (as added by [enw_assign_group()]).
#' observations), and `confirm` (cumulative observations by reference and report
#' date).
#'
#' @param set_negatives_to_zero Logical, defaults to TRUE. Should negative
#' counts (for calculated incidence of observations) be set to zero. Currently
Expand All @@ -304,29 +305,28 @@ enw_latest_data <- function(obs) {
#' reported on each day (`prop_reported`) is also added.
#' @family preprocess
#' @export
#' @importFrom data.table copy shift
#' @importFrom data.table shift
#' @examples
#' # Default reconstruct incidence
#' dt <- enw_assign_group(
#' germany_covid19_hosp[location == "DE"],
#' by = "age_group"
#' )
#' enw_new_reports(dt)
#' dt <- germany_covid19_hosp[location == "DE"][age_group == "00+"]
#' enw_cumulative_to_incidence(dt)
#'
#' # Make use of maximum reported to calculate empirical daily reporting
#' dt <- enw_assign_group(dt)
#' dt <- enw_add_max_reported(dt)
#' enw_new_reports(dt)
enw_new_reports <- function(obs, set_negatives_to_zero = TRUE) {
reports <- data.table::copy(obs)
reports <- reports[order(reference_date)]
#' enw_cumulative_to_incidence(dt)
enw_cumulative_to_incidence <- function(obs, set_negatives_to_zero = TRUE,
by = c()) {
reports <- check_dates(obs)
reports <- reports[order(reference_date, report_date)]
reports[, new_confirm := confirm - data.table::shift(confirm, fill = 0),
by = c("reference_date", ".group")
by = c("reference_date", by)
]
reports <- reports[,
.SD[reference_date >= min(report_date) | is.na(reference_date)],
by = c(".group")
by = by
]
reports <- reports[, delay := 0:(.N - 1), by = c("reference_date", ".group")]
reports <- reports[, delay := 0:(.N - 1), by = c("reference_date", by)]

if (!is.null(reports$max_confirm)) {
reports[, prop_reported := new_confirm / max_confirm]
Expand All @@ -338,12 +338,41 @@ enw_new_reports <- function(obs, set_negatives_to_zero = TRUE) {
return(reports[])
}

#' Calculate cumulative reported cases from incidence of new reports
#'
#' @param obs A data frame containing at least the following variables:
#' `reference date` (index date of interest), `report_date` (report date for
#' observations), and `new_confirm` (incident observations by reference and
#' report date).
#'
#' @return The input data frame with a new variable `confirm`.
#' @family preprocess
#' @export
#' @examples
#' # Default reconstruct incidence
#' dt <- germany_covid19_hosp[location == "DE"][age_group == "00+"]
#' enw_cumulative_to_incidence(dt)
#'
#' # Make use of maximum reported to calculate empirical daily reporting
#' dt <- enw_assign_group(dt)
#' dt <- enw_add_max_reported(dt)
#' enw_cumulative_to_incidence(dt)
enw_incidence_to_cumulative <- function(obs, by = c()) {
obs <- check_dates(obs)

obs <- obs[!is.na(reference_date)]
obs[order(reference_date, report_date)]

obs[, confirm := cumsum(new_confirm), by = c(by, "reference_date")]
return(obs[])
}

#' Filter observations to restrict the maximum reporting delay
#'
#' @return A data frame filtered so that dates by report are less than or equal
#' the reference date plus the maximum delay.
#'
#' @inheritParams enw_new_reports
#' @inheritParams enw_cumulative_to_incidence
#' @inheritParams enw_preprocess_data
#' @family preprocess
#' @export
Expand All @@ -367,8 +396,8 @@ enw_delay_filter <- function(obs, max_delay) {
#' Constructs the reporting triangle with each row representing a reference date
#' and columns being observations by report date
#'
#' @param obs A data frame as produced by [enw_new_reports()]. Must contain the
#' following variables: `reference_date`, `.group`, `delay`.
#' @param obs A data frame as produced by [enw_cumulative_to_incidence()]. Must
#' contain the following variables: `reference_date`, `.group`, `delay`.
#'
#' @return A data frame with each row being a reference date, and columns being
#' observations by reporting delay.
Expand Down Expand Up @@ -498,9 +527,9 @@ enw_complete_dates <- function(obs, by = c(), max_delay,
#' Returns reports with missing reference dates as well as calculating
#' the proportion of reports for a given reference date that were missing.
#'
#' @param obs A data frame as produced by [enw_new_reports()]. Must contain the
#' following variables: `report_date`, `reference_date`, `.group`, and
#' `confirm`, and `new_confirm`.
#' @param obs A data frame as produced by [enw_cumulative_to_incidence()]. Must
#' contain the following variables: `report_date`, `reference_date`, `.group`,
#' and `confirm`, and `new_confirm`.
#'
#' @return A `data.table` of missing counts and proportions by report date and
#' group.
Expand All @@ -519,7 +548,7 @@ enw_complete_dates <- function(obs, by = c(), max_delay,
#' )
#' obs <- enw_complete_dates(obs)
#' obs <- enw_assign_group(obs)
#' obs <- enw_new_reports(obs)
#' obs <- enw_cumulative_to_incidence(obs)
#' enw_missing_reference(obs)
enw_missing_reference <- function(obs) {
obs <- check_dates(obs)
Expand Down Expand Up @@ -717,7 +746,7 @@ enw_construct_data <- function(obs, new_confirm, latest, missing_reference,
#' - `max_date`: The maximum available report date.
#'
#' @family preprocess
#' @inheritParams enw_new_reports
#' @inheritParams enw_cumulative_to_incidence
#' @export
#' @importFrom data.table as.data.table data.table
#' @examples
Expand All @@ -742,9 +771,10 @@ enw_preprocess_data <- function(obs, by = c(), max_delay = 20, holidays = c(),

obs <- enw_delay_filter(obs, max_delay = max_delay)

diff_obs <- enw_new_reports(
diff_obs <- enw_cumulative_to_incidence(
obs,
set_negatives_to_zero = set_negatives_to_zero
set_negatives_to_zero = set_negatives_to_zero,
by = by
)

# filter obs based on diff constraints
Expand Down
7 changes: 7 additions & 0 deletions tests/test-enw_incidence_to_cumulative.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
test_that("enw_incidence_to_cumulative can return toy cumulative data", {
dt <- enw_incidence_to_cumulative(toy_incidence)
expect_equal(
dt[, .(reference_date, report_date, confirm)],
toy_cumulative[order(reference_date, report_date)]
)
})
14 changes: 14 additions & 0 deletions tests/testthat/setup.R
Original file line number Diff line number Diff line change
Expand Up @@ -5,3 +5,17 @@ if (not_on_cran() & FALSE) {
source(enw_example("script"))
)
}

# Toy example data
toy_incidence <- data.table::data.table(
reference_date = data.table::as.IDate("2021-10-01"),
report_date = seq(
data.table::as.IDate("2021-10-01"),
length.out = 10, by = 1
),
new_confirm = c(1, 2, 3, 4, -2, 5, 5, 6, 7, 9)
)

toy_cumulative <- data.table::copy(toy_incidence)
toy_cumulative <- toy_cumulative[, confirm := cumsum(new_confirm)]
toy_cumulative <- toy_cumulative[sample(.N, .N)][, new_confirm := NULL]
33 changes: 33 additions & 0 deletions tests/testthat/test-enw_cumulative_to_incidence.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
test_that("enw_cumulative_to_incidence can return toy incidence data", {
dt <- enw_cumulative_to_incidence(
toy_cumulative,
set_negatives_to_zero = FALSE
)
expect_equal(
dt[, .(reference_date, report_date, new_confirm)],
toy_incidence
)
expect_equal(dt$delay, 0:9)
})

test_that("enw_cumulative_to_incidence can calculate reporing proportions", {
dt <- enw_cumulative_to_incidence(
data.table::copy(toy_cumulative)[, max_confirm := 100],
set_negatives_to_zero = FALSE
)
expect_equal(
dt[, prop_reported],
dt$new_confirm / 100
)
})

test_that("enw_cumulative_to_incidence can set negatives to zero", {
dt <- enw_cumulative_to_incidence(
toy_cumulative,
set_negatives_to_zero = TRUE
)
expect_equal(
dt[, new_confirm],
c(1, 2, 3, 4, 0, 5, 5, 6, 7, 9)
)
})