Reinforcement Learning Notebook #410

juanitorduz · 2022-08-04T13:15:44Z

Closes #272

- Notebook follows style guide https://docs.pymc.io/en/latest/contributing/jupyter_style.html
- PR description contains a link to the relevant issue: a tracker one for existing notebooks or a proposal one for new notebooks
- Check the notebook is not excluded from any pre-commit check: https://github.com/pymc-devs/pymc-examples/blob/main/.pre-commit-config.yaml

Helpful links

https://github.com/pymc-devs/pymc-examples/blob/main/CONTRIBUTING.md

juanitorduz · 2022-08-04T19:43:32Z

OK! I think the style is in good shape for a first review iteration (thanks for the comments!), in case someone wants to add or suggest more comments on the content. I have made no changes (besides style).

Remark: The title seems a bit long? In the gallery it does not render nicely. Maybe it is better if we remove "two action" from the title?

ricardoV94 · 2022-08-04T20:57:38Z

> Remark: The title seems a bit long? In the gallery it does not render nicely. Maybe it is better if we remove "two action" from the title?

Agree

review-notebook-app · 2022-08-05T22:21:05Z

ricardoV94 commented on 2022-08-05T22:21:04Z
----------------------------------------------------------------

We could extend this bonus section and give it a less lazy name. One reason why it's useful to use the Bernoulli likelihood is that one can then do prior and posterior predictive sampling as well as model comparison. Even if we don't show it, it's useful to mention it.

juanitorduz commented on 2022-08-17T20:53:23Z
----------------------------------------------------------------

Following the suggestion, I added some model comparison and a posterior predictive check for the Bernoulli model, see https://github.com//pull/410/commits/7d9dd5681a33096cae8dac05f587b2e5305c8c12

juanitorduz commented on 2022-08-18T07:23:17Z
----------------------------------------------------------------

Better comment added in https://github.com//pull/410/commits/e9035c7611e59cd831e5a70255251dbb3f2f28d0

review-notebook-app · 2022-08-17T21:10:18Z

ricardoV94 commented on 2022-08-17T21:10:18Z
----------------------------------------------------------------

By model comparison I meant comparison between different models using stuff like LOO approximations. With Potential you cannot do it, because PyMC does not know what is likelihood and what is prior. With a Bernoulli likelihood you can.

Otherwise, here you compared the very same models written slightly differently, which is not very interesting.
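
The distinction in the comment above can be illustrated with a small plain-Python sketch. This is not PyMC code and does not reproduce the notebook's model; the `actions` and `probs` values are made up for illustration. It only shows that a lumped term, like the one a Potential adds, contributes a single scalar to the joint log-probability, while an explicit Bernoulli likelihood keeps one log-likelihood term per observation, which is what LOO-style approximations work from.

```python
import math


def bernoulli_logp(y, p):
    """Log-probability of one binary outcome y given success probability p."""
    return math.log(p) if y == 1 else math.log1p(-p)


# Hypothetical data: observed binary actions and the model-implied
# probability of choosing action 1 on each trial.
actions = [1, 0, 1, 1, 0]
probs = [0.7, 0.4, 0.6, 0.8, 0.3]

# Lumped term: a single number added to the joint log-probability.
# Nothing marks it as "likelihood", and the per-trial terms are gone.
potential_term = sum(bernoulli_logp(y, p) for y, p in zip(actions, probs))

# Explicit likelihood: one term per observation is kept, so each trial's
# contribution can be inspected (or left out) individually.
pointwise = [bernoulli_logp(y, p) for y, p in zip(actions, probs)]

print(potential_term)
print(pointwise)
```

Both formulations give the same total, which is why the two versions of the model fit identically; the difference is only in what information remains available afterwards.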

review-notebook-app · 2022-08-17T21:10:19Z

ricardoV94 commented on 2022-08-17T21:10:19Z
----------------------------------------------------------------

This plot is a bit difficult to read. What are the x/y axes showing?

OriolAbril commented on 2022-08-17T21:25:33Z
----------------------------------------------------------------

I would recommend using https://python.arviz.org/en/latest/api/generated/arviz.plot_separation.html which is designed for binary outcomes.

For completeness, here is what the plot above is showing. The black line is the histogram of the observations. As the dtype is integer, ArviZ will not use bins with width smaller than 1. There are therefore two bins, [0, 1) and [1, 2), which show that it is more probable to observe a 1 than a 0. The blue lines represent the same thing, but each for a specific chain+draw combination. If the model is explaining the data generating process, the black line should be indistinguishable from these blue ones. The orange one is the histogram over all chains and draws instead of a single one, which can sometimes be informative too.
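
As a rough illustration of the description above, the following plain-Python sketch computes the three kinds of two-bin histograms by hand. It does not call ArviZ; the observed values, the number of draws, and the 0.7 success probability are all made-up assumptions, and `two_bin_density` is a hypothetical helper.

```python
import random
from collections import Counter


def two_bin_density(values):
    """Fraction of binary values falling in the bins [0, 1) and [1, 2)."""
    counts = Counter(values)
    n = len(values)
    return counts[0] / n, counts[1] / n


rng = random.Random(0)

# Hypothetical observed binary outcomes (the "black line").
observed = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]

# Hypothetical predictive draws: each inner list is one simulated
# replicate of the dataset for a single chain+draw combination.
draws = [[1 if rng.random() < 0.7 else 0 for _ in observed] for _ in range(4)]

obs_density = two_bin_density(observed)                   # black line
per_draw_density = [two_bin_density(d) for d in draws]    # blue lines
pooled_density = two_bin_density([v for d in draws for v in d])  # orange line

print(obs_density)
print(per_draw_density)
print(pooled_density)
```

With only two bins, each "line" reduces to a pair of proportions, which is part of why this kind of plot is hard to read for binary data.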

juanitorduz · 2022-08-17T21:33:46Z

Thanks for your comments! I misunderstood the suggestion for this section 🤦 (apologies).
I will remove these confusing plots and add @ricardoV94's comments as part of the last section so that we do not make this notebook unnecessarily long. I think this could be a good first iteration 😅

review-notebook-app · 2022-08-18T07:44:08Z

ricardoV94 commented on 2022-08-18T07:44:08Z
----------------------------------------------------------------

Suggestion:

"With pm.Potential you cannot do it, because PyMC does not know what is likelihood and what is prior... NOR HOW TO GENERATE RANDOM DRAWS. NEITHER OF THESE IS A PROBLEM WHEN USING PM.BERNOULLI"

juanitorduz commented on 2022-08-18T08:56:33Z
----------------------------------------------------------------

Agree! Thanks! But probably not with the UPPER CASE, right? Seems a bit too aggressive XD.

ricardoV94 commented on 2022-08-18T09:03:14Z
----------------------------------------------------------------

Hehe maybe not (couldn't think of a better way to emphasize where the suggestion connected with the existing text)

juanitorduz commented on 2022-08-18T09:15:43Z
----------------------------------------------------------------

added in https://github.com//pull/410/commits/f72af486f44452e877e835caa5965a480f4affba

ricardoV94

Approved content-wise. I leave it to @OriolAbril to confirm the meta stuff is correct.

OriolAbril

left a couple more nits, looks great, thanks

Reinforcement Learning Notebook (pymc-devs#410)

initial version (wip)

c4e1573

juanitorduz changed the title ~~initial version (wip)~~ [WIP] Reinforcement Learning Notebook Aug 4, 2022

juanitorduz marked this pull request as draft August 4, 2022 13:16

fix date and minor corrections

a3959b9

OriolAbril reviewed Aug 4, 2022

initial feedback improvements

69acb1f

ricardoV94 reviewed Aug 4, 2022

juanitorduz added 6 commits August 4, 2022 21:02

improve plot and fix title

b365732

hide plot cell

c1d9664

add ricardo comment text

5252c2b

hide plot cell v2

a2946b4

add manual tag notebook

cdaecc8

hide just input

83474ed

juanitorduz requested review from ricardoV94 and OriolAbril August 4, 2022 19:41

juanitorduz marked this pull request as ready for review August 4, 2022 19:41

juanitorduz changed the title ~~[WIP] Reinforcement Learning Notebook~~ Reinforcement Learning Notebook Aug 4, 2022

shorten notebook

afce224

OriolAbril reviewed Aug 5, 2022

remove redundant credits section

49b3d6b

juanitorduz requested a review from OriolAbril August 5, 2022 11:32

juanitorduz added 3 commits August 5, 2022 13:43

fix credits section

61e7881

add re-execution date

be0bb83

omg fix date ....facepalm

21998f5

improve last section

7d9dd56

improve style

661359a

remove last plots and add better final remarks

e9035c7

extend bernoulli comment

f72af48

ricardoV94 approved these changes Aug 18, 2022

OriolAbril approved these changes Aug 18, 2022

final style review

053036a

juanitorduz requested a review from OriolAbril August 18, 2022 18:45

OriolAbril merged commit f603fe3 into pymc-devs:main Aug 18, 2022

juanitorduz deleted the rf_notebook branch August 19, 2022 11:48

kuvychko added a commit to kuvychko/pymc-examples that referenced this pull request Sep 9, 2022

Merge pull request #5 from pymc-devs/main

1719fc9

Reinforcement Learning Notebook (pymc-devs#410)
