Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reinforcement Learning Notebook #410

Merged
merged 19 commits into from
Aug 18, 2022
Merged

Conversation

juanitorduz
Copy link
Collaborator

@juanitorduz juanitorduz commented Aug 4, 2022

Closes #272

Helpful links

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@juanitorduz juanitorduz changed the title initial version (wip) [WIP] Reinforcement Learning Notebook Aug 4, 2022
@juanitorduz juanitorduz marked this pull request as draft August 4, 2022 13:16
@juanitorduz juanitorduz marked this pull request as ready for review August 4, 2022 19:41
@juanitorduz juanitorduz changed the title [WIP] Reinforcement Learning Notebook Reinforcement Learning Notebook Aug 4, 2022
@juanitorduz
Copy link
Collaborator Author

juanitorduz commented Aug 4, 2022

Ok! I think the style is in good shape for a first review iteration (thanks for the comments!) in case someone wants to add or suggest more comments on the content. I have done no changes (besides style).

Remark: Seems the title is a bit long? In the gallery id does not render nicely. Maybe is better if we remove the "two action" from the title?

@ricardoV94
Copy link
Member

Remark: Seems the title is a bit long? In the gallery id does not render nicely. Maybe is better if we remove the "two action" from the title?

Agree

@juanitorduz juanitorduz requested a review from OriolAbril August 5, 2022 11:32
@review-notebook-app
Copy link

review-notebook-app bot commented Aug 5, 2022

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2022-08-05T22:21:04Z
----------------------------------------------------------------

We could extend this bonus section and give it a less lazy name. One reason why it's useful to use the bernoulli likelihood is that one can then do prior and posterior predictive sampling as well as model comparison. Even if we don't show it, it's useful to mention it.


juanitorduz commented on 2022-08-17T20:53:23Z
----------------------------------------------------------------

Following suggestion I added some model comparison and a posterior predictive check for the Bernoulli model, see https://github.com//pull/410/commits/7d9dd5681a33096cae8dac05f587b2e5305c8c12

juanitorduz commented on 2022-08-18T07:23:17Z
----------------------------------------------------------------

Better comment added in https://github.com//pull/410/commits/e9035c7611e59cd831e5a70255251dbb3f2f28d0

Copy link
Collaborator Author

Following suggestion I added some model comparison and a posterior predictive check for the Bernoulli model, see https://github.com//pull/410/commits/7d9dd5681a33096cae8dac05f587b2e5305c8c12


View entire conversation on ReviewNB

@review-notebook-app
Copy link

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2022-08-17T21:10:18Z
----------------------------------------------------------------

By model comparison I meant comparison between different models using stuff like LOO approximations. With Potential you cannot do it, because PyMC does not know what is likelihood and what is prior. With a Bernoulli likelihood you can.

Otherwise, here you compared the very same models written slightly differently, which is not very interesting.


@review-notebook-app
Copy link

review-notebook-app bot commented Aug 17, 2022

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2022-08-17T21:10:19Z
----------------------------------------------------------------

This plot is a bit difficult to read, what are the x/y axis showing?


OriolAbril commented on 2022-08-17T21:25:33Z
----------------------------------------------------------------

I would recommend using https://python.arviz.org/en/latest/api/generated/arviz.plot_separation.html which is designed for binary outcomes.

For completeness, here is what the plot above is showing. The black line is the histogram of the observations. As the dtype is integer, ArviZ will not use bins with width smaller than 1. There are therefore two bins: [0, 1) and [1, 2) that show it is more probable to observe a 1 than a 0. The blue lines represent the same thing, but for a specific chain+draw combination, if the model is explaining the data generating process, the black line should be undistinguishable from these blue ones. The orange one is the histogram but over all chains an draws instead of a single one which can sometimes be informative too.

Copy link
Member

I would recommend using https://python.arviz.org/en/latest/api/generated/arviz.plot_separation.html which is designed for binary outcomes.

For completeness, here is what the plot above is showing. The black line is the histogram of the observations. As the dtype is integer, ArviZ will not use bins with width smaller than 1. There are therefore two bins: [0, 1) and [1, 2) that show it is more probable to observe a 1 than a 0. The blue lines represent the same thing, but for a specific chain+draw combination, if the model is explaining the data generating process, the black line should be undistinguishable from these blue ones. The orange one is the histogram but over all chains an draws instead of a single one which can sometimes be informative too.


View entire conversation on ReviewNB

@juanitorduz
Copy link
Collaborator Author

juanitorduz commented Aug 17, 2022

Thanks for your comments! I misunderstood the suggestion for this section 🤦 (apologies).
I will remove these confusing plots and add @ricardoV94 's comments as part of the last section so that we do not make this notebook unnecessary long. I think this could be a good first iteration 😅

Copy link
Collaborator Author

Better comment added in https://github.com//pull/410/commits/e9035c7611e59cd831e5a70255251dbb3f2f28d0


View entire conversation on ReviewNB

@review-notebook-app
Copy link

review-notebook-app bot commented Aug 18, 2022

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2022-08-18T07:44:08Z
----------------------------------------------------------------

Suggestion:

"With pm.Potential you cannot do it, because PyMC does not know what is likelihood and what is prior... NOR HOW TO GENERATE RANDOM DRAWS. NEITHER OF THIS IS A PROBLEM WHEN USING PM.BERNOULLI"


juanitorduz commented on 2022-08-18T08:56:33Z
----------------------------------------------------------------

Agree! Thanks! But probably not with the UPPER CASE right? Seems a bit to aggressive XD.

ricardoV94 commented on 2022-08-18T09:03:14Z
----------------------------------------------------------------

Hehe maybe not (couldn't think of a better way to emphasize where the suggestion connected with the existing text)

juanitorduz commented on 2022-08-18T09:15:43Z
----------------------------------------------------------------

added in https://github.com//pull/410/commits/f72af486f44452e877e835caa5965a480f4affba

Copy link
Collaborator Author

Agree! Thanks! But probably not with the UPPER CASE right? Seems a bit to aggressive XD.


View entire conversation on ReviewNB

Copy link
Member

Hehe maybe not (couldn't think of a better way to emphasize where the suggestion connected with the existing text)


View entire conversation on ReviewNB

Copy link
Collaborator Author

added in https://github.com//pull/410/commits/f72af486f44452e877e835caa5965a480f4affba


View entire conversation on ReviewNB

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved content wise. I leave it to @OriolAbril to confirm the meta stuff is correct

Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a couple more nits, looks great, thanks

myst_nbs/case_studies/reinforcement_learning.myst.md Outdated Show resolved Hide resolved
examples/references.bib Outdated Show resolved Hide resolved
myst_nbs/case_studies/reinforcement_learning.myst.md Outdated Show resolved Hide resolved
@juanitorduz juanitorduz requested a review from OriolAbril August 18, 2022 18:45
@OriolAbril OriolAbril merged commit f603fe3 into pymc-devs:main Aug 18, 2022
@juanitorduz juanitorduz deleted the rf_notebook branch August 19, 2022 11:48
kuvychko added a commit to kuvychko/pymc-examples that referenced this pull request Sep 9, 2022
Reinforcement Learning Notebook (pymc-devs#410)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

Reinforcement Learning Model for behavioral data
3 participants