
Introducing gulmc: full Monte Carlo loss engine #1137

Merged
mtazzari merged 79 commits from feature/gulmc into develop on Dec 8, 2022
Conversation

mtazzari
Contributor

@mtazzari mtazzari commented Oct 26, 2022

Introducing gulmc, full Monte Carlo loss calculation engine

This PR introduces gulmc, a new tool that uses a "full Monte Carlo" approach for ground up loss calculation: instead of drawing loss samples from the 'effective damageability' probability distribution (as done by calling eve | modelpy | gulpy), it first draws a sample of the hazard intensity and then draws a sample of the damage from the vulnerability function corresponding to that hazard intensity sample.
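
To make the difference between the two sampling schemes concrete, here is a minimal Python/numpy sketch for a single event, areaperil and vulnerability function. It is illustrative only (all arrays are hypothetical and this is not gulmc's implementation): the 'effective damageability' route collapses the hazard uncertainty into a single mixture distribution before sampling damage, whereas the full Monte Carlo route samples a hazard intensity bin first and then samples damage conditional on it.

import numpy as np

rng = np.random.default_rng(42)

haz_prob = np.array([0.2, 0.6, 0.2])           # hazard intensity bin probabilities (footprint row)
vuln = np.array([[0.9, 0.1, 0.0],              # damage bin probabilities per intensity bin
                 [0.3, 0.5, 0.2],              # (one vulnerability function)
                 [0.0, 0.4, 0.6]])
damage_bin_mid = np.array([0.05, 0.5, 0.95])   # representative damage factor per damage bin
n_samples = 1000

# effective damageability (eve | modelpy | gulpy): mix over hazard first, then sample damage
eff_damag = haz_prob @ vuln
loss_eff = rng.choice(damage_bin_mid, size=n_samples, p=eff_damag)

# full Monte Carlo (gulmc default): sample a hazard intensity bin, then sample damage
# from the vulnerability distribution conditional on that bin
haz_bin = rng.choice(len(haz_prob), size=n_samples, p=haz_prob)
loss_mc = np.array([rng.choice(damage_bin_mid, p=vuln[b]) for b in haz_bin])

print(loss_eff.mean(), loss_mc.mean())         # the two sample means should be close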

Comparing gulpy and gulmc output

gulmc can run the same algorithm as eve | modelpy | gulpy, i.e. the 'effective damageability' calculation mode, with the same command line arguments. For example, running a model with 1000 samples, alloc rule 1, and streaming the binary output to the output.bin file can be done with:

eve 1 1 | modelpy | gulpy -S1000 -a1 -o output.bin

or

eve 1 1 | gulmc -S1000 -a1 -o output.bin

Hazard uncertainty treatment

If the hazard intensity in the footprint has no uncertainty, i.e.:

event_id,areaperil_id,intensity_bin_id,probability
1,4,1,1
[...]

then gulpy and gulmc produce the same outputs. However, if the hazard intensity has a probability distribution, e.g.:

event_id,areaperil_id,intensity_bin_id,probability
1,4,1,2.0000000298e-01
1,4,2,6.0000002384e-01
1,4,3,2.0000000298e-01
[...]

then, by default, gulmc performs the full Monte Carlo sampling: first of the hazard intensity, then of the damage. Reproducing the same results that gulpy produces can be achieved by using the --effective-damageability flag:

eve 1 1 | gulmc -S1000 -a1 -o output.bin --effective-damageability
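
To see concretely why gulpy and gulmc coincide when the footprint has no hazard uncertainty, and why they differ otherwise, here is a minimal illustration (hypothetical arrays, not the actual footprint or vulnerability data structures): with a single intensity bin of probability 1, the effective damageability distribution reduces to the corresponding row of the vulnerability matrix, so sampling the hazard first makes no difference; with a genuine probability distribution it becomes a mixture of rows, and gulmc's default two-step sampling no longer reproduces gulpy's samples.

import numpy as np

vuln = np.array([[0.9, 0.1, 0.0],          # damage bin probabilities per intensity bin
                 [0.3, 0.5, 0.2],
                 [0.0, 0.4, 0.6]])

haz_certain = np.array([1.0, 0.0, 0.0])    # footprint row with no hazard uncertainty
haz_uncertain = np.array([0.2, 0.6, 0.2])  # footprint rows with hazard uncertainty

print(haz_certain @ vuln)    # equals vuln[0]: effective damageability is a single vulnerability row
print(haz_uncertain @ vuln)  # a mixture of vulnerability rows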

On the usage of modelpy and eve with gulmc

Due to internal refactoring, gulmc now incorporates the functionality performed by modelpy; therefore, modelpy should not be used in a pipe with gulmc:

eve 1 1 | modelpy | gulmc -S1000 -a1 -o output.bin        # wrong usage, won't work
eve 1 1 | gulmc -S1000 -a1 -o output.bin                  # correct usage

Note: both gulpy and gulmc can read the events stream from a binary file, i.e. without the need for eve, with:

 gulmc -i input/events.bin -S1000 -a1 -o output.bin
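
For reference, here is a minimal sketch of inspecting such an events file with numpy, assuming input/events.bin is a flat array of int32 event ids (the ktools events file layout); this is illustrative only and not part of gulmc:

import numpy as np

# read the whole file as 4-byte integer event ids (assumed layout)
event_ids = np.fromfile("input/events.bin", dtype=np.int32)
print(event_ids[:10])   # first few event ids that eve would otherwise stream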

Printing the random values used for sampling

Since we now sample in two dimensions (hazard intensity and damage), the -d flag has been revamped to output the random values used for either sampling step. While gulpy -d printed the random values used to sample the effective damageability distribution, in gulmc:

gulmc -d1 [...]   # prints the random values used for the hazard intensity sampling
gulmc -d2 [...]   # prints the random values used for the damage sampling

Note: if the --effective-damageability flag is used, only -d2 is valid since there is no sampling of the hazard intensity, and the random values printed are those used for the effective damageability sampling.

Note: if -d1 or -d2 is passed, the only valid alloc_rule value is 0, because back-allocation is not meaningful when printing the random values. alloc_rule=0 is the default value, or it can be set explicitly with -a0. If a value other than 0 is passed to -a, an error is raised.
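
A minimal sketch of the kind of check described above, with hypothetical function and argument names (not gulmc's actual code):

def check_debug_args(debug, alloc_rule):
    # requesting the debug output of random values (debug 1 or 2) only makes sense
    # without back-allocation, so any alloc_rule other than 0 is rejected
    if debug in (1, 2) and alloc_rule != 0:
        raise ValueError(
            f"alloc_rule must be 0 when debug is {debug}, got alloc_rule={alloc_rule}"
        )

check_debug_args(debug=1, alloc_rule=0)  # OK
check_debug_args(debug=2, alloc_rule=1)  # raises ValueError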

Testing suite

This PR introduces:

  • a minimal toy model in tests/assets/test_model_1/ that can be used to run unit tests on various functionality in the repository. A more detailed description of the model's content can be found at tests/assets/test_model_1/README.md.
  • a suite of 192 quantitative tests of the gulmc output for combinations of input parameters (alloc rule, correlation, etc.). Binary files with the expected outputs are stored at tests/assets/test_model_1/expected/ (a sketch of such a comparison is shown after this list).
  • a suite of 48 quantitative tests of the gulpy output for combinations of input parameters.
  • tests checking that ValueErrors in gulmc are raised when expected.
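
A minimal sketch (hypothetical helper and paths, not the actual test suite) of such a quantitative test: the produced binary stream is compared byte-for-byte against the stored expected output.

from pathlib import Path

def assert_binary_equal(produced: Path, expected: Path) -> None:
    # read both files fully and compare; a mismatch reports which pair differed
    assert produced.read_bytes() == expected.read_bytes(), (
        f"{produced} differs from expected output {expected}"
    )

# usage (hypothetical paths):
# assert_binary_equal(Path("output.bin"),
#                     Path("tests/assets/test_model_1/expected/output.bin"))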

@mtazzari mtazzari added enhancement New feature or request feature A main feature, captured on the backlog labels Oct 26, 2022
@mtazzari mtazzari self-assigned this Oct 26, 2022
Contributor

@sambles sambles left a comment


  • Nice to have: It would be useful if the gulmc testing could output the failed differences between expected and results. At the moment the output notes that files differ (true/false). However, the binary format needs to be converted to something human readable first.

  • Note: gulmc testing has failures when running on a non-Ubuntu-based distro (Arch Linux based). We suspect it might be linked to OS libraries, but haven't found the cause.

Marco Tazzari
if we can get tests running on other OSs then we can see what we can support, which would be even better; but being so focused on numerical performance, it's fair to pick a specific OS and rely on its reproducibility rather than aiming at a Windows-style "we work on any computer", which is going to be hard to maintain

Stephane Struzik
I don't agree on this one: I think the tests should pass on all platforms we advertise. If we just have rounding errors, I would prefer that we convert the result to csv and use dataframes to compare. It also has the added benefit of providing a better diff message.
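
A minimal sketch of this suggestion, assuming both the produced and the expected binaries have already been converted to csv (hypothetical file names): comparing DataFrames with a tolerance means pure rounding differences do not fail the test, while real mismatches produce a readable diff message.

import pandas as pd

df_result = pd.read_csv("output.csv")
df_expected = pd.read_csv("expected/output.csv")

# check_exact=False with a relative tolerance absorbs small floating point differences
pd.testing.assert_frame_equal(df_result, df_expected, check_exact=False, rtol=1e-6)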

@mtazzari
Contributor Author

mtazzari commented Dec 5, 2022

Thank you @sambles for the comments.
Regarding the "nice to have", I'm coding up a handy binary-file-check function customized for us so that it converts to csv if the binary differs and shows the differences. I'll add that to PR #1168 which is a continuation of this PR, with additional functionality.

Contributor

@sstruzik sstruzik left a comment


Just a few minor comments. The logic looks good.

Review comments (outdated, resolved) on:
  • oasislmf/pytools/gulmc/manager.py (×4)
  • oasislmf/pytools/getmodel/manager.py (×1)
@codecov-commenter

codecov-commenter commented Dec 8, 2022

Codecov Report

Merging #1137 (0280821) into develop (f12dd9d) will increase coverage by 2.44%.
The diff coverage is 38.26%.


@@             Coverage Diff             @@
##           develop    #1137      +/-   ##
===========================================
+ Coverage    43.18%   45.63%   +2.44%     
===========================================
  Files           90       93       +3     
  Lines        11175    11584     +409     
===========================================
+ Hits          4826     5286     +460     
+ Misses        6349     6298      -51     
Impacted Files Coverage Δ
oasislmf/pytools/gulmc/cli.py 0.00% <0.00%> (ø)
oasislmf/pytools/modelpy.py 0.00% <0.00%> (ø)
oasislmf/pytools/gulmc/manager.py 39.28% <39.28%> (ø)
oasislmf/pytools/getmodel/manager.py 26.73% <50.00%> (+26.73%) ⬆️
oasislmf/pytools/gul/manager.py 18.86% <66.66%> (+18.86%) ⬆️
oasislmf/pytools/gul/random.py 50.63% <71.42%> (+50.63%) ⬆️
oasislmf/pytools/common.py 100.00% <100.00%> (ø)
oasislmf/pytools/getmodel/common.py 100.00% <100.00%> (ø)
oasislmf/pytools/gul/common.py 100.00% <100.00%> (ø)
oasislmf/pytools/gul/io.py 14.01% <100.00%> (+14.01%) ⬆️
... and 8 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5f4a32e...0280821.

@mtazzari mtazzari merged commit 2ccb138 into develop Dec 8, 2022
@mtazzari mtazzari deleted the feature/gulmc branch December 8, 2022 17:44
sambles pushed a commit that referenced this pull request Jan 12, 2023
* [mcgul] first implementation of full MC gul

* [modelpy] montecarlo implementation in modelpy

* stop tracking mcgul

* [modelpy] fixes

* simplify algorithm

* remove unused imports

* use numba, process last areaperil id, cleanup

* [modelpy] add docstrings

* [modelpy] function namechange

* [modelpy] add TODOs not to forget

* [gulpy] bugfix in calling read_getmodel_data

* [gulpy] drafting monte carlo implementation

* [mcgul] Add major modelpy and gulpy rewrite as one tool

* [mcgul] do not sample haz if no haz uncertainty

* [mcgul] cleanup

* [mcgul] good working implementation

* [mcgul] perfectly reproduces effective damageability

* [mcgul] further simplification

* [mcgul] wip

* [mcgul] compute haz cdf in map_areaperil_ids_in_footprint

* [gulmc] update cli

* [getmodel] reverting full mc modifications

* [gul] reverting mc modifications

* [getmodel] reverting mc modifications

* [getmodel] Reverting unused mc modifications

* [gul] updating docstring

* [getmodel] update docstring

* [gulmc] dynamic buff_size

* [gulmc] imports cleanup

* [gulmc] cleanup

* [gulmc] dynamic buff size

* [gulmc] compute effective damageability

* [gulmc] effective damageability with numba

* [gulpy] minor bugfix

* [gulmc] bugfix: use 4 as item size in int32_mv

* [gulmc] minor cleanup

* [gulmc] fix conflicts with stashed edits

* [gulmc] cleanup

* [gulmc] remove unused imports

* [modelpy] remove one blank line

* [gulmc] add effective_damageability optional arg

* [gulmc] bugfix effective damageability

* [gulmc] add tests

* [gulmc] add tests for effective damageability

* [gulmc] move gulpy tests to separate module

* [tests] add test_model_1 to the tests assets

* [tests] use typing.Tuple for type hints

* [gulmc] better tests, tiv set to float64

* [gulmc] log info about effective_damageabilty

* [gulmc] cleaning up, adding docs (WIP).

* [gulmc] adding documentation and docstrings

* [gulmc] bugfix

* [gulmc] adding docs

* [gulmc] add docs

* [gulmc] rewrite complex outputs as tuples

* [gulmc] add final docs

* [gulmc] remove unused import

* [gulmc] Improve --debug flag

* [gulmc] raise ValueError if alloc_rule>0 when debug is 1 or 2

* [gulmc] cleanup

* [flake8] fix error code in ignore config

* [requirements] testing unpinning virtualenv

* [requirements] testing unpinning virtualenv

* [requirements] fixing package clash

* [gulmc] test ValueError if alloc_rule is invalid

* [gulmc] improve tests

* [gulmc] remove unnecessary binary files

* [requirements] removing unnecessary virtualenv

* [CI] specify pip-compile resolver for py 3.7

* [CI] fix bug in CI

* [CI] bugfix

* [gulmc] update following review comments

* [gulmc] implement fixes following review

* [gulmc] bugfix in logging
@awsbuild awsbuild added this to the 1.27.0 milestone Jan 12, 2023
Labels
enhancement New feature or request feature A main feature, captured on the backlog
Development

Successfully merging this pull request may close these issues.

Stochastic disaggregation 7 Full Monte Carlo
5 participants