
Take3 model averaging #414

Merged: 15 commits into pymc-devs:main on Sep 21, 2022
Conversation

@reshamas (Contributor) commented Aug 9, 2022

Description

References

Checklist

Helpful links

Notes for the Reviewer

  • There are two references to "model_comparison.ipynb" and the link is broken. I cannot find this notebook.
  • I changed mentions of "PyMC3" to "PyMC". Is that ok?

#DataUmbrellaPyMCSprint

@review-notebook-app
Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks.

@reshamas reshamas requested a review from OriolAbril August 9, 2022 12:07
@reshamas (Contributor, author) commented Aug 9, 2022

@OriolAbril
Thank you for the helpful Git notes (#412 (comment))

All checks are passing now. (phew!)

I have a few notes at the top, under "Notes for the Reviewer", regarding the updates I made to the notebook.

@OriolAbril (Member)

> There are two references to "model_comparison.ipynb" and the link is broken. I cannot find this notebook.

Here are the references and links (not to be used, only so you see where they point to):

> I changed mentions of "PyMC3" to "PyMC". Is that ok?

Yes, all notebooks need to be updated from PyMC3 to PyMC, both in text and in code. Since you are working on the text, you should fix it.

> Closes #67

It should not close the issue. The notebook still runs on v3, so it will move to the "book style" column, not yet to "done" (which is what would close the issue).

myst_nbs/diagnostics_and_criticism/model_averaging.myst.md (outdated)

One alternative is to perform model selection but discuss all the different models together with the computed values of a given Information Criterion. It is important to put all these numbers and tests in the context of our problem so that we and our audience can have a better feeling of the possible limitations and shortcomings of our methods. If you are in the academic world you can use this approach to add elements to the discussion section of a paper, presentation, thesis, and so on.

- Yet another approach is to perform model averaging. The idea now is to generate a meta-model (and meta-predictions) using a weighted average of the models. There are several ways to do this and PyMC3 includes 3 of them that we are going to briefly discuss, you will find a more thorough explanation in the work by [Yuling Yao et. al.](https://arxiv.org/abs/1704.02030)
+ Yet another approach is to perform model averaging. The idea now is to generate a meta-model (and meta-predictions) using a weighted average of the models. There are several ways to do this and PyMC includes 3 of them that we are going to briefly discuss, you will find a more thorough explanation in the work by [Yuling Yao et. al.](https://arxiv.org/abs/1704.02030)

Also, model comparison is done by ArviZ, so we should update "PyMC includes" to something like "PyMC integrates with ArviZ".

myst_nbs/diagnostics_and_criticism/model_averaging.myst.md (outdated)
@@ -71,7 +80,7 @@ The above formula for computing weights is a very nice and simple approach, but

## Stacking

- The third approach implemented in PyMC3 is know as _stacking of predictive distributions_ and it has been recently [proposed](https://arxiv.org/abs/1704.02030). We want to combine several models in a metamodel in order to minimize the diverge between the meta-model and the _true_ generating model, when using a logarithmic scoring rule this is equivalently to:
+ The third approach implemented in PyMC is known as [_stacking of predictive distributions_](https://arxiv.org/abs/1704.02030). We want to combine several models in a metamodel in order to minimize the divergence between the meta-model and the _true_ generating model, when using a logarithmic scoring rule this is equivalent to:

Same here (it's actually the same paper as above, so so far just 1 reference to add to the bibtex file).
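For reference, the sentence in the hunk above ends with an objective that is not shown in this diff. A sketch of it, following the notation of the Yao et al. paper linked in the text (and not copied from the notebook itself), is:

$$
\max_{w \in S_K} \frac{1}{n} \sum_{i=1}^{n} \log \sum_{k=1}^{K} w_k \, p(y_i \mid y_{-i}, M_k),
\qquad
S_K = \left\{ w : w_k \ge 0,\ \textstyle\sum_{k=1}^{K} w_k = 1 \right\}
$$

where $p(y_i \mid y_{-i}, M_k)$ is the leave-one-out predictive density of observation $i$ under model $M_k$.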


- The following example is taken from the superb book [Statistical Rethinking](http://xcelab.net/rm/statistical-rethinking/) by Richard McElreath. You will find more PyMC3 examples from this book in this [repository](https://github.com/aloctavodia/Statistical-Rethinking-with-Python-and-PyMC3). We are going to explore a simplified version of it. Check the book for the whole example and a more thorough discussion of both, the biological motivation for this problem and a theoretical/practical discussion of using Information Criteria to compare, select and average models.
+ The following example is taken from the superb book [Statistical Rethinking](http://xcelab.net/rm/statistical-rethinking/) by Richard McElreath. You will find more PyMC examples from this book in this [repository](https://github.com/aloctavodia/Statistical-Rethinking-with-Python-and-PyMC3). We are going to explore a simplified version of it. Check the book for the whole example and a more thorough discussion of both, the biological motivation for this problem and a theoretical/practical discussion of using Information Criteria to compare, select and average models.

Statistical Rethinking should be a citation (it already is in the bibtex file, but it would be good to add the URL to the bibtex entry). The link to the PyMC port of the book code should now point to https://github.com/pymc-devs/pymc-resources instead.

myst_nbs/diagnostics_and_criticism/model_averaging.myst.md (outdated)
```

+++ {"papermill": {"duration": 0.055089, "end_time": "2020-11-29T12:14:57.977616", "exception": false, "start_time": "2020-11-29T12:14:57.922527", "status": "completed"}, "tags": []}

- Now that we have sampled the posterior for the 3 models, we are going to use WAIC (Widely applicable information criterion) to compare the 3 models. We can do this using the `compare` function included with PyMC3.
+ Now that we have sampled the posterior for the 3 models, we are going to use WAIC (Widely applicable information criterion) to compare the 3 models. We can do this using the `compare` function included with PyMC.

Similar comment to the one in the introduction: `compare` is an ArviZ function now.

We should probably also add a note or comment on the code. I think the code does use WAIC, but `az.compare` now defaults to using LOO instead, so running the same code will not use WAIC anymore.
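To make that default change concrete, here is a minimal sketch using the example datasets bundled with ArviZ (they only stand in for the notebook's three models, which are not refit here): passing `ic="waic"` keeps the comparison on WAIC, since `az.compare` otherwise defaults to LOO.

```python
import arviz as az

# Example datasets shipped with ArviZ, used here only as stand-ins for the
# notebook's fitted models.
models = {
    "centered": az.load_arviz_data("centered_eight"),
    "non_centered": az.load_arviz_data("non_centered_eight"),
}

# az.compare now defaults to LOO, so WAIC has to be requested explicitly
# if the surrounding text keeps talking about WAIC.
comp_waic = az.compare(models, ic="waic")
print(comp_waic)
```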


Or maybe leave a note on the issue for whoever updates the code and reruns it on PyMC v4? (Which can't really be done for now, as sample_posterior_predictive_w doesn't work on v4 yet.)


- We can also see that we get a column with the relative `weight` for each model (according to the first equation at the beginning of this notebook). This weights can be _vaguely_ interpreted as the probability that each model will make the correct predictions on future data. Of course this interpretation is conditional on the models used to compute the weights, if we add or remove models the weights will change. And also is dependent on the assumptions behind WAIC (or any other Information Criterion used). So try to do not overinterpret these `weights`.
+ We can also see that we get a column with the relative `weight` for each model (according to the first equation at the beginning of this notebook). This weights can be _vaguely_ interpreted as the probability that each model will make the correct predictions on future data. Of course this interpretation is conditional on the models used to compute the weights, if we add or remove models the weights will change. And also is dependent on the assumptions behind WAIC (or any other Information Criterion used). So try to not overinterpret these `weights`.

This also needs a note; not sure how to rewrite it though, I'll try to come back later. The weight-probability interpretation is only valid for BMA, not for stacking. The notebook should be clear on this because it is a common source of confusion, see arviz-devs/arviz#2077 or https://discourse.pymc.io/t/bayesian-model-averaging-ranking-of-model-weights-and-loo-dont-match/4658
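A small sketch of that distinction, again with ArviZ's example datasets rather than the notebook's models: the `weight` column depends on the `method` argument, and only the pseudo-BMA style weights support the approximate "probability of making the best predictions" reading, while stacking weights are the optimal mixture weights for the combined predictive distribution.

```python
import arviz as az

# Illustrative stand-ins for the notebook's competing models.
models = {
    "centered": az.load_arviz_data("centered_eight"),
    "non_centered": az.load_arviz_data("non_centered_eight"),
}

# Stacking (the az.compare default): weights chosen to optimize the combined
# predictive distribution, not posterior model probabilities.
print(az.compare(models, method="stacking")["weight"])

# Bayesian-bootstrap pseudo-BMA: weights that admit the (approximate)
# probability interpretation discussed in the notebook text.
print(az.compare(models, method="BB-pseudo-BMA")["weight"])
```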

myst_nbs/diagnostics_and_criticism/model_averaging.myst.md (outdated)
myst_nbs/diagnostics_and_criticism/model_averaging.myst.md (outdated)
@@ -247,11 +268,11 @@ comp

+++ {"papermill": {"duration": 0.056609, "end_time": "2020-11-29T12:14:58.387481", "exception": false, "start_time": "2020-11-29T12:14:58.330872", "status": "completed"}, "tags": []}

- We can see that the best model is `model_2`, the one with both predictor variables. Notice the DataFrame is ordered from lowest to highest WAIC (_i.e_ from _better_ to _worst_ model). Check [this notebook](model_comparison.ipynb) for a more detailed discussing on model comparison.
+ We can see that the best model is `model_2`, the one with both predictor variables. Notice the DataFrame is ordered from lowest to highest WAIC (_i.e_ from _better_ to _worst_ model). Check [model_comparison notebook](model_comparison.ipynb) for a more detailed discussion on model comparison.

This should also be a Sphinx cross-reference, not a local Markdown link.

@reshamas (Contributor, author)

@OriolAbril This seems to be the pre-commit error. It's in another notebook, and I'm not sure how to fix it:

examples/howto/custom_distribution.ipynb:124: "If your distribution exists in scipy.stats (https://docs.scipy.org/doc/scipy/reference/stats.html), then you can use the Random Variates method scipy.stats.{dist_name}.rvs to generate random samples.\n",

@reshamas reshamas requested a review from OriolAbril September 2, 2022 16:10
@OriolAbril (Member)

The error is because that notebook, examples/howto/custom_distribution.ipynb, references the scipy docs with URLs instead of cross-references. Now that you have added the scipy docs to the list of domains to avoid, it is failing. For now, add that notebook to the list of ignored notebooks here: https://github.com/pymc-devs/pymc-examples/blob/main/.pre-commit-config.yaml#L62

@@ -6,9 +6,9 @@ jupytext:
format_version: 0.13
jupytext_version: 1.13.7
kernelspec:
- display_name: pymc-dev-py39
+ display_name: Python 3 (ipykernel)

These changes should not be included in the PR. The only notebook being modified should be the model averaging one.

@OriolAbril OriolAbril merged commit 9fad19c into pymc-devs:main Sep 21, 2022