Add GLM Out of Sample Predictions Notebook #37

juanitorduz · 2021-02-14T15:11:40Z

I would like to add a small example to the docs. See #33 and the original post https://juanitorduz.github.io/glm_pymc3/. I am open to suggestions and changes that would make this notebook useful for the community.

review-notebook-app · 2021-02-14T15:11:43Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

review-notebook-app · 2021-02-23T14:55:33Z

View / edit / reply to this conversation on ReviewNB

MarcoGorelli commented on 2021-02-23T14:55:33Z
----------------------------------------------------------------

*in the design matrix

twiecki commented on 2021-03-02T08:48:21Z
----------------------------------------------------------------

typo still needs to be fixed.

juanitorduz commented on 2021-03-02T09:03:47Z
----------------------------------------------------------------

Thanks, this is fixed now.

review-notebook-app · 2021-02-23T14:55:34Z

MarcoGorelli commented on 2021-02-23T14:55:33Z
----------------------------------------------------------------

This could be the case when one-hot-encoding categorical variables.

This isn't something I've come across, and I'm not sure I see how that would happen, as the one-hot encoding uses training set information only - do you have a reference?

twiecki commented on 2021-03-02T08:48:57Z
----------------------------------------------------------------

Yeah I also don't get this point.

juanitorduz commented on 2021-03-02T09:04:53Z
----------------------------------------------------------------

OK, maybe I needed to explain this better, but it is not really relevant, so I removed this confusing remark.

review-notebook-app · 2021-02-23T14:55:34Z

MarcoGorelli commented on 2021-02-23T14:55:34Z
----------------------------------------------------------------

are you sure that

fpr, tpr, thresholds = roc_curve(
    y_true=y_test, y_score=y_test_pred, pos_label=1, drop_intermediate=False
)

is correct? Looking at the sklearn docs, it seems that the second argument (y_score) should be the raw predictions p_test_pred rather than the predicted labels y_test_pred. Indeed, if you look at the examples in the sklearn docs, they pass clf.predict_proba(X)[:, 1]
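
To make the difference concrete, here is a minimal sketch on synthetic data. Only the names y_test, p_test_pred and y_test_pred follow the notebook; the data, the 0.5 cutoff and the other variable names are assumptions made up for illustration. With hard labels the ROC curve collapses to a single operating point, while probabilities give one point per distinct score:

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(42)

# Synthetic ground truth and hypothetical predicted probabilities
# (illustrative stand-ins, not the notebook's data).
y_test = rng.integers(0, 2, size=200)
p_test_pred = np.clip(0.5 * y_test + rng.normal(0.25, 0.2, size=200), 0, 1)

# Hard labels obtained by thresholding the probabilities at 0.5.
y_test_pred = (p_test_pred >= 0.5).astype(int)

# Passing probabilities as y_score: many thresholds, a full curve.
fpr_p, tpr_p, thr_p = roc_curve(
    y_true=y_test, y_score=p_test_pred, pos_label=1, drop_intermediate=False
)

# Passing hard labels as y_score: only two distinct scores (0 and 1),
# so the "curve" has just three points.
fpr_l, tpr_l, thr_l = roc_curve(
    y_true=y_test, y_score=y_test_pred, pos_label=1, drop_intermediate=False
)

print(len(thr_p), len(thr_l))
```

The second call still runs without error, which is why this kind of mistake is easy to miss.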

twiecki commented on 2021-03-02T08:52:36Z
----------------------------------------------------------------

Yes, that's correct, needs to be fixed.

juanitorduz commented on 2021-03-02T09:05:08Z
----------------------------------------------------------------

This is fixed now.

review-notebook-app · 2021-02-23T14:55:35Z

MarcoGorelli commented on 2021-02-23T14:55:35Z
----------------------------------------------------------------

output *by the model

juanitorduz commented on 2021-03-02T09:05:33Z
----------------------------------------------------------------

this is fixed now.

review-notebook-app · 2021-02-23T14:55:36Z

MarcoGorelli commented on 2021-02-23T14:55:35Z
----------------------------------------------------------------

Let us *now

juanitorduz · 2021-02-23T15:23:35Z

MarcoGorelli commented on 2021-02-23T14:55:33Z

This could be the case when one-hot-encoding categorical variables.

This isn't something I've come across, and I'm not sure I see how that would happen, as the one-hot encoding uses training set information only - do you have a reference?

This is related to [this post](https://dzone.com/articles/pandasscikit-learn-get-dummies-testtrain-sets), where the model might see new categories once it is deployed. One could use handle_unknown = 'ignore' if this is expected (and wanted), but if we use get_dummies in the cross-validation step, we are already showing the model what to expect. Maybe I can just delete this line, as it is not very relevant for the post itself :)
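
A small sketch of what I mean, assuming scikit-learn's OneHotEncoder and a made-up "color" column (the data frames and variable names are hypothetical, not from the notebook). Fitting the encoder on the training set only and passing handle_unknown="ignore" encodes an unseen category as an all-zero row, whereas calling get_dummies on the combined data creates a column for it up front:

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: "blue" appears only in the test set.
train = pd.DataFrame({"color": ["red", "green", "red"]})
test = pd.DataFrame({"color": ["green", "blue"]})

# Encoder fitted on the training set only; unknown categories are ignored.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)
encoded_test = encoder.transform(test).toarray()

# Learned categories are sorted: ["green", "red"].
# "green" -> [1, 0]; unseen "blue" -> [0, 0].
print(encoder.categories_)
print(encoded_test)

# get_dummies on train and test together already "knows" about "blue".
dummies_all = pd.get_dummies(pd.concat([train, test]))
print(list(dummies_all.columns))
```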

juanitorduz · 2021-02-23T15:44:07Z

MarcoGorelli commented on 2021-02-23T14:55:34Z

are you sure that

fpr, tpr, thresholds = roc_curve(
    y_true=y_test, y_score=y_test_pred, pos_label=1, drop_intermediate=False
)

is correct? Looking at the sklearn docs, it seems that the second argument (y_score) should be the raw predictions p_test_pred rather than the predicted labels y_test_pred. Indeed, if you look at the examples in the sklearn docs, they pass clf.predict_proba(X)[:, 1]

You are totally right 🙈! Thank you so much! I corrected the typos and this error in this commit.

MarcoGorelli · 2021-02-28T12:36:51Z

Cool, thanks @juanitorduz - looks good to me. I would just advise adding a prior predictive check (even though it's not the main focus of your notebook, it's probably something to encourage where possible).
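
For reference, a rough NumPy-only sketch of the idea behind a prior predictive check for a logistic regression. The Normal(0, 1) priors, the single predictor and all variable names are assumptions for illustration, not what the notebook uses; in PyMC3 itself this would normally go through pm.sample_prior_predictive on the actual model:

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_draws = 100, 500

# Hypothetical design matrix: intercept plus one standardized predictor.
x = rng.normal(size=n_obs)
X = np.column_stack([np.ones(n_obs), x])

# Draw coefficients from the (assumed) priors: beta_j ~ Normal(0, 1).
beta_prior = rng.normal(0.0, 1.0, size=(n_draws, X.shape[1]))

# Push each prior draw through the inverse-logit link.
logits = beta_prior @ X.T  # shape (n_draws, n_obs)
p_prior = 1.0 / (1.0 + np.exp(-logits))

# Simulate binary outcomes implied by the priors alone.
y_prior = rng.binomial(1, p_prior)

# Share of positives per prior draw; inspect its spread before fitting.
prior_positive_rate = y_prior.mean(axis=1)
print(prior_positive_rate.min(), prior_positive_rate.max())
```

If almost all simulated data sets are all zeros or all ones, the priors are probably too wide on the logit scale.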

juanitorduz · 2021-02-28T16:50:44Z

@MarcoGorelli I added a subsection on prior predictive checks. Let me know what you think about it 🤓.

review-notebook-app · 2021-03-01T01:48:52Z

OriolAbril commented on 2021-03-01T01:48:51Z
----------------------------------------------------------------

Should be az.summary; the latest pymc3 has removed the aliases to arviz functions, so the old call won't work anymore.
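
A minimal sketch of calling ArviZ directly, assuming an ArviZ 0.x release where az.summary accepts anything convertible to InferenceData. The fake "beta" draws stand in for a real trace and are purely illustrative:

```python
import arviz as az
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for posterior samples with shape (chain, draw).
# In the notebook this would be the trace returned by sampling.
fake_posterior = {"beta": rng.normal(size=(4, 500))}

# Call the ArviZ function directly instead of a pymc3 alias.
summary_df = az.summary(fake_posterior)
print(summary_df)
```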

juanitorduz commented on 2021-03-02T09:05:52Z
----------------------------------------------------------------

this is fixed now.

juanitorduz · 2021-03-01T08:19:58Z

Thanks @OriolAbril ! I fixed it :)

review-notebook-app · 2021-03-01T10:35:52Z

ricardoV94 commented on 2021-03-01T10:35:52Z
----------------------------------------------------------------

Should be patsy, not pasty, right?

juanitorduz commented on 2021-03-02T09:06:26Z
----------------------------------------------------------------

this is fixed now.

review-notebook-app · 2021-03-01T10:35:53Z

ricardoV94 commented on 2021-03-01T10:35:52Z
----------------------------------------------------------------

patsy?

review-notebook-app · 2021-03-01T10:35:53Z

ricardoV94 commented on 2021-03-01T10:35:53Z
----------------------------------------------------------------

but taking the mean -> by taking the mean

review-notebook-app · 2021-03-01T10:35:54Z

ricardoV94 commented on 2021-03-01T10:35:54Z
----------------------------------------------------------------

Missing space after period. "data set.We"

examples/table_of_contents_examples.js

Co-authored-by: Thomas Wiecki <thomas.wiecki@gmail.com>

juanitorduz · 2021-03-01T11:09:22Z

Thank you @ricardoV94, I applied your suggestions! (I always have issues naming patsy 😆)
