Epinowcast 0.2.0
This release adds several extensions to our modelling framework, including modelling of missing data, flexible modelling of the generative process underlying case counts, an optional renewal equation-based generative process (enabling direct estimation of the effective reproduction number), and convolution-based latent reporting delays (enabling the modelling of both directly observed and unobserved delays as well as partial ascertainment). Much of the methodology used in these extensions is based on work done by Adrian Lison and is currently being evaluated.
On top of these model extensions, this release adds a range of quality-of-life features, such as helper functions for constructing convolution matrices and combining probability mass functions. It also comes with improved computational efficiency, thanks to a refactoring of the hazard model computations to the log scale and extended parallelisation of the likelihood that is optimised for the structure of the input data. We have also extended the package documentation and streamlined the contribution process.
As a large-scale project, this package remains in an experimental state, although it is sufficiently stable for both research and production usage. More core development is needed to improve post-processing, pre-processing, and documentation coverage. Moreover, the optimal configuration for different settings still needs to be explored further and is currently mainly the responsibility of the user. Please see our community site, contributing guide, and list of issues/proposed features if you are interested in getting involved. Any scale of contribution is warmly welcomed, including user feedback, requests to extend our functionality to cover your setting, and evaluations of the package in your context. This is a community project that needs support from its users in order to provide improved tools for real-time infectious disease surveillance.
We thank @adrian-lison, @choi-hannah, @sbfnk, @Bisaloo, @seabbs, @pearsonca, and @pratikunterwegs for code contributions to this release. We also thank all community members for their contributions including @jhellewell14, @FelixGuenther, @parksw3, and @jbracher.
Full details on the changes in this release can be found in the following sections.
Package
- Added `.Rhistory` to the `.gitignore` file. See #132 by @choi-hannah.
- Fixed indentations for authors and contributors in the `DESCRIPTION` file. See #132 by @choi-hannah.
- Renamed `enw_new_reports()` to `enw_cumulative_to_incidence()` and added the reverse function `enw_incidence_to_cumulative()`. Both functions use a `by` argument to allow specification of variable groupings. See #157 by @seabbs.
- Switched class checking to `inherits(x, "class")` rather than `class(x) %in% "class"`. See #155 by @Bisaloo.
- Changed the `enw_add_metaobs_features()` interface so that the `holidays` argument is a series of dates. Changed the interface of `enw_preprocess_data()` to pass `...` to `enw_add_metaobs_features()`. The interface changes come with an internal rewrite and unit tests. As part of the internal rewrite, introduced `coerce_date()` to `R/utils.R`, which wraps `data.table::as.IDate()` with error handling. See #151 by @pearsonca.
- Changed the style of using `match.arg` for validating inputs. Briefly, the preference is now to define options via function arguments and validate with the automatic `match.arg` idiom, with corresponding enumerated documentation of the options. For this idiom, the first item in the definition is the default. This approach only applies to string-based arguments; different types of arguments cannot be matched this way, nor can arguments that allow for vector-valued options (e.g., if `somearg = c("option1", "option2")` were a legal argument indicating to use both options). See #162 by @pearsonca addressing issue #156 by @Bisaloo.
- Refined the use of data ordering throughout the preprocessing functions. See #147 by @seabbs.
- Skipped tests that use `cmdstan` locally to improve the developer/contributor experience. See #147 by @seabbs and @adrian-lison.
- Added a basic simulator function for missing reference data. See #147 by @seabbs and @adrian-lison.
- Added support for right-hand-side interactions as syntactic sugar for random effects. This allows the specification of, for example, independent random effects by day for each stratum of another variable. See #169 by @seabbs.
- Added support for passing `cpp_options` to `cmdstanr::cmdstan_model()`. See #182 by @seabbs.
- Added a function, `convolution_matrix()`, for constructing convolution matrices. See #183 by @seabbs.
- Added a pass-through from `enw_model()` to `write_stan_files_no_profile()` for the `target_dir` argument. This allows users to compile the model once and then share the compiled model across sessions rather than having to recompile each time the temporary directory is cleared. See #185 by @seabbs.
- Added `add_pmfs()` to sum probability mass functions into a new probability mass function. Initial implementation by @seabbs in #183, refactored by @pratikunterwegs in #187, following a suggestion in issue #186 by @pearsonca.
- Added a warning when the observed empirical maximum delay is less than the specified maximum delay. See #190 by @seabbs.
- Added nested support for converting array syntax in `convert_cmdstan_to_rstan`. See #192 by @sbfnk.
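The ideas behind the `convolution_matrix()` and `add_pmfs()` helpers listed above can be illustrated outside the package. The following is a minimal sketch in plain Python rather than R, and it is not the package's implementation: the names `convolution_matrix_sketch`, `add_pmfs_sketch`, and `apply_matrix` are hypothetical, the matrix layout (lower triangular, one row per time point) is an assumption, and it assumes that "summing" probability mass functions means taking the distribution of the sum of independent delays, i.e. a discrete convolution.

```python
def convolution_matrix_sketch(pmf, t):
    """Build a t x t lower-triangular matrix with entry [i][j] = pmf[i - j].

    Multiplying this matrix by a vector of counts delays those counts
    according to the PMF (hypothetical layout, for illustration only).
    """
    mat = [[0.0] * t for _ in range(t)]
    for i in range(t):
        for d, p in enumerate(pmf):
            if i - d >= 0:
                mat[i][i - d] = p
    return mat


def add_pmfs_sketch(pmfs):
    """Return the PMF of the sum of independent delays (discrete convolution)."""
    out = [1.0]  # PMF of a delay that is always zero
    for pmf in pmfs:
        new = [0.0] * (len(out) + len(pmf) - 1)
        for i, a in enumerate(out):
            for j, b in enumerate(pmf):
                new[i + j] += a * b
        out = new
    total = sum(out)
    return [p / total for p in out]  # renormalise against rounding error


def apply_matrix(mat, vec):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


# Illustrative values only: a three-day delay PMF applied to four days of counts.
delay = [0.5, 0.3, 0.2]
counts = [10.0, 20.0, 30.0, 40.0]
expected_reports = apply_matrix(convolution_matrix_sketch(delay, 4), counts)

# Combining two delays into one overall delay distribution.
combined = add_pmfs_sketch([[0.5, 0.5], [0.2, 0.8]])
```

With these inputs the early rows of `expected_reports` are smaller than a full convolution would give, because counts from before the first time point are not available to the matrix.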
Model
- Added support for parametric log-logistic delay distributions. See #128 by @adrian-lison.
- Implemented direct specification of parametric baseline hazards. See #134 by @adrian-lison.
- Refactored the observation model, the combination of logit hazards, and the effects priors to be contained in generic functions to make extending package functionality easier. See #137 by @seabbs.
- Implemented specification of the parametric baseline hazards and probabilities on the log scale to increase robustness and efficiency. Also includes refactoring of these functions and reorganisation of `inst/stan/epinowcast.stan` to increase modularity and clarity. See #140 by @seabbs.
- Introduced two new delay likelihoods, `delay_snap_lmpf` and `delay_group_lmpf`. These stratify by either snapshots or groups. This is helpful for some models (such as the missingness module). The ability to choose which function is used has been exposed to the user in `enw_fit_opts()` via the `likelihood_aggregation` argument. Both of these functions rely on a newly added `expected_obs_from_snaps` function, which vectorises `expected_obs_from_index`. See #138 by @seabbs and @adrian-lison.
- Added support for supplying missingness model parameters to the model as well as optional priors and effect estimation. See #138 by @seabbs and @adrian-lison.
- Refactored model generated quantities to be functional. See #138 by @seabbs and @adrian-lison.
- Added support for modelling missing reference dates to the likelihood. See #147 by @seabbs and @adrian-lison.
- Added additional functionality to `delay_group_lmpf` to support modelling observations missing reference dates. Also updated the generated quantities to support this mode. See #147 by @seabbs and @adrian-lison based on #64 by @adrian-lison.
- Added a flexible expectation process on the growth rate scale. The default expectation model has been updated to a group-wise random walk on the growth rate. See #152 by @seabbs and @adrian-lison.
- Added a deterministic renewal equation and a latent reporting process. See #152 and #183 by @seabbs and @adrian-lison.
- Added support for no intercept in the expectation model and more general formula support to enable this as a feature in other modules going forward. See #170 by @seabbs.
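As a rough illustration of the renewal and latent reporting items above, the sketch below uses the standard discrete renewal equation, I_t = R_t * sum_s g_s * I_(t-s), followed by a convolution with a reporting delay and a scaling by an ascertainment proportion. It is written in plain Python and is not the package's Stan code: the function names, the seeding approach, and all numeric values are assumptions made for illustration.

```python
def renewal_sketch(seed, rt, gt_pmf):
    """Deterministic renewal equation.

    seed:   initial infections, at least as long as gt_pmf
    rt:     reproduction numbers for the time points after the seed
    gt_pmf: gt_pmf[s - 1] is the probability of a generation interval of s
    """
    infections = list(seed)
    for r in rt:
        # Weighted sum of past infections, most recent first.
        force = sum(g * infections[-s] for s, g in enumerate(gt_pmf, start=1))
        infections.append(r * force)
    return infections


def latent_reporting_sketch(infections, delay_pmf, ascertainment=1.0):
    """Convolve latent infections with a reporting delay, then scale them.

    delay_pmf[d] is the probability of a delay of d time points;
    ascertainment is the (assumed constant) proportion that is ever reported.
    """
    reports = []
    for t in range(len(infections)):
        total = sum(
            p * infections[t - d]
            for d, p in enumerate(delay_pmf)
            if t - d >= 0
        )
        reports.append(ascertainment * total)
    return reports


# Illustrative values only.
infections = renewal_sketch(seed=[10.0, 10.0], rt=[2.0, 1.0], gt_pmf=[0.5, 0.5])
reports = latent_reporting_sketch(infections, delay_pmf=[0.0, 1.0], ascertainment=0.5)
```

Here every infection is reported exactly one time point later and half of them are ascertained, so `reports` is a shifted, halved copy of `infections`.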
Documentation
- Removed explicit links to authors and issues in the `NEWS.md` file. See #132 by @choi-hannah.
- Added a new example using simulated data and the `enw_missing()` model module. See #138 by @seabbs and @adrian-lison.
- Updated the model definition vignette to include the missing reference date model. See #147 by @seabbs and @adrian-lison.
- Added the use of an expectation model to the "Hierarchical nowcasting of age stratified COVID-19 hospitalisations in Germany" vignette. See #193 by @seabbs.
Bugs
- The probability-only model (i.e., only a parametric distribution is used and hence the hazard scale is not needed) was not used due to a mistake specifying `ref_as_p` in the Stan code. There was an additional issue in that the `enw_report()` module self-declared as on regardless of whether it was or not. This bug had no impact on results but would have increased runtimes for simple models. Both of these issues were fixed in #142 by @seabbs.
- The addition of the meta features week and month did not number weeks and months sequentially when time series crossed year boundaries. This would impact models that included effects expecting those to in fact be sequentially numbered (e.g., random walks). Fixed in #151 by @pearsonca.
- #151 also corrects a minor issue with `enw_example()` pointing at an old file name when `type="script"`. By @pearsonca.
What's Changed
- Add loglogistic parametric distribution by @adrian-lison in #128
- Feature improve news by @choi-hannah in #132
- Feature modularise observations by @seabbs in #137
- Hazard specification of parametric baseline hazard by @adrian-lison in #134
- Bug: ref_as_p not activating by @seabbs in #142
- Feature logit reference hazard by @seabbs in #140
- Feature missing reference support code by @seabbs in #138
- Feature incidence to cumulative by @seabbs in #157
- Use inherits to test class by @Bisaloo in #155
- fixes #148 re holidays input / documentation by @pearsonca in #151
- address match.args change by @pearsonca in #162
- Localise to epinowcast by @seabbs in #164
- Add missing reference model components to delay_group_lmpf and generated quantities by @seabbs in #147
- Feature: interactions in random effects by @seabbs in #169
- Flexible expectation model by @seabbs in #152
- Feature cpp options by @seabbs in #182
- Renewal and latent reporting on the real scale by @seabbs in #183
- Add target dir passing to enw_model by @seabbs in #185
- Refactor add_pmf by @pratikunterwegs in #187
- Added a warning about empirical and specified delay being different by @seabbs in #190
- compatibility with additional array notation by @sbfnk in #192
- Featur - update-documentation by @seabbs in #193
- Add renewal and latent reporting to model overview by @seabbs in #189
- 0.2.0 Release by @seabbs in #180
New Contributors
- @choi-hannah made their first contribution in #132
- @Bisaloo made their first contribution in #155
- @pearsonca made their first contribution in #151
- @pratikunterwegs made their first contribution in #187
Full Changelog: v0.1.0...v0.2.0