-
-
Notifications
You must be signed in to change notification settings - Fork 253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GLM Out of Sample Predictions Notebook #37
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
View / edit / reply to this conversation on ReviewNB MarcoGorelli commented on 2021-02-23T14:55:33Z *in the design matrix twiecki commented on 2021-03-02T08:48:21Z typo still needs to be fixed. juanitorduz commented on 2021-03-02T09:03:47Z Thanks, this is fixed now. |
View / edit / reply to this conversation on ReviewNB MarcoGorelli commented on 2021-02-23T14:55:33Z This could be the case when one-hot-encoding categorical variables. This isn't something I've come across, and I'm not sure I see how that would happen as the one-hot encoding using training set information only - do you have a reference? twiecki commented on 2021-03-02T08:48:57Z Yeah I also don't get this point. juanitorduz commented on 2021-03-02T09:04:53Z Ok, maybe I needed to explain this better but this is really not relevant so I removed this confusing remark. |
View / edit / reply to this conversation on ReviewNB MarcoGorelli commented on 2021-02-23T14:55:34Z are you sure that fpr, tpr, thresholds = roc_curve( y_true=y_test, y_score=y_test_pred, pos_label=1, drop_intermediate=False ) is correct? Looking at the sklearn docs , it seems that the second argument (y_score) should be the raw predictions p_test_pred rather than the predicted labels y_test_pred . Indeed, if you look at the examples in the sklearn docs, they pass twiecki commented on 2021-03-02T08:52:36Z Yes, that's correct, needs to be fixed. juanitorduz commented on 2021-03-02T09:05:08Z this is fixed now |
View / edit / reply to this conversation on ReviewNB MarcoGorelli commented on 2021-02-23T14:55:35Z output *by the model juanitorduz commented on 2021-03-02T09:05:33Z this is fixed now. |
View / edit / reply to this conversation on ReviewNB MarcoGorelli commented on 2021-02-23T14:55:35Z Let us *now |
This is related to (this post)[https://dzone.com/articles/pandasscikit-learn-get-dummies-testtrain-sets], where once the model is deployed it might see new categories. One could try to use |
You are totally right 🙈 ! Thank you so much! I corrected the typos and this error on this commit. |
Cool, thanks @juanitorduz - looks good to me, I would just advise to add a prior predictive check (even though it's not the main focus of your notebook, it's probably something to encourage where possible) |
@MarcoGorelli I added a subsection on prior predictive checks. Let me know what you think about it 🤓 . |
View / edit / reply to this conversation on ReviewNB OriolAbril commented on 2021-03-01T01:48:51Z Should be juanitorduz commented on 2021-03-02T09:05:52Z this is fixed now. |
Thanks @OriolAbril ! I fixed it :) |
View / edit / reply to this conversation on ReviewNB ricardoV94 commented on 2021-03-01T10:35:52Z Should be patsy, not pasty, right?
juanitorduz commented on 2021-03-02T09:06:26Z this is fixed now. |
View / edit / reply to this conversation on ReviewNB ricardoV94 commented on 2021-03-01T10:35:52Z patsy? |
View / edit / reply to this conversation on ReviewNB ricardoV94 commented on 2021-03-01T10:35:53Z but taking the mean -> by taking the mean |
View / edit / reply to this conversation on ReviewNB ricardoV94 commented on 2021-03-01T10:35:54Z Missing space after period. "data set.We" |
Co-authored-by: Thomas Wiecki <thomas.wiecki@gmail.com>
Thank you @ricardoV94 I fixed your suggestions! (I always have issues naming |
typo still needs to be fixed. View entire conversation on ReviewNB |
Yeah I also don't get this point. View entire conversation on ReviewNB |
Yes, that's correct, needs to be fixed. View entire conversation on ReviewNB |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2021-03-02T09:00:29Z typo: interaction juanitorduz commented on 2021-03-02T09:11:48Z fixed |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2021-03-02T09:00:30Z Cell repeated. juanitorduz commented on 2021-03-02T09:07:02Z this is fixed now. |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2021-03-02T09:00:30Z I feel like the prior predictive check doesn't really add anything here and is rather confusing. I'd suggest to remove it. juanitorduz commented on 2021-03-02T09:23:33Z The initial version of the notebook did not have any prior predictive checks and I was suggested to add some. Should I remove it completely or do you have any ideas or suggestions on how to do it so that it fits the purpose of the example? MarcoGorelli commented on 2021-03-02T14:22:51Z Hey - sorry, the prior checks were my suggestion (was just trying to encourage them as part of the Bayesian workflow) but I have no objection to them being removed ricardoV94 commented on 2021-03-02T15:19:49Z I agree with having them removed. This makes it more straightforward to get to the point of this notebook. juanitorduz commented on 2021-03-02T15:38:20Z As there is no strong opinion to keep this section, I removed it from the notebook. |
Thanks, this is fixed now. View entire conversation on ReviewNB |
Ok, maybe I needed to explain this better but this is really not relevant so I removed this confusing remark. View entire conversation on ReviewNB |
this is fixed now View entire conversation on ReviewNB |
this is fixed now. View entire conversation on ReviewNB |
3 similar comments
this is fixed now. View entire conversation on ReviewNB |
this is fixed now. View entire conversation on ReviewNB |
this is fixed now. View entire conversation on ReviewNB |
fixed View entire conversation on ReviewNB |
you are right, I removed it. View entire conversation on ReviewNB |
I added this reference into the resources section. View entire conversation on ReviewNB |
Thank you all for your suggestions and catching my typos 🙈 (it is hard to catch them with a 1-year old learning to walk around you 😄 )! |
The initial version of the notebook did not have any prior predictive checks and I was suggested to add some. Should I remove it completely or do you have any ideas or suggestions on how to do it so that it fits the purpose of the example? View entire conversation on ReviewNB |
View / edit / reply to this conversation on ReviewNB ricardoV94 commented on 2021-03-02T14:18:39Z Is there a reason why you choose tho add Gaussian noise before the logistic transformation, instead of using the bernoulli noise?
intercept = - 0.5 beta_x1 = 1 beta_x2 = -1 beta_interaction = 2 z = intercept + beta_x1 * x1 + beta_x2 * x2 + beta_interaction * x1 * x2 p = 1 / (1 + np.exp(-z)) y = stats.bernoulli(p=p).rvs()
The intermediate variables ( juanitorduz commented on 2021-03-02T15:16:01Z Thank you for pointing this out! I actually just wanted to generate noisy data, but this remark makes sense. I changed it following your suggestion :) |
Hey - sorry, the prior checks were my suggestion (was just trying to encourage them as part of the Bayesian workflow) but I have no objection to them being removed View entire conversation on ReviewNB |
@MarcoGorelli Yeah, when I read that I thought "this is a great suggestion" and I love prior checks but now seeing it implemented I don't think it really benefits much here and I can't think of a way to improve it, so I would argue to remove it again. What do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a comment about the true data model specification. It's not a blocker, but I would at least mention it in the notebook.
Thank you for pointing this out! I actually just wanted to generate noisy data, but this remark makes sense. I changed it following your suggestion :) View entire conversation on ReviewNB |
I agree with having them removed. This makes it more straightforward to get to the point of this notebook. View entire conversation on ReviewNB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
As there is no strong opinion to keep this section, I removed it from the notebook. View entire conversation on ReviewNB |
Thanks @juanitorduz! |
I would like to add a small example to the docs. See #33 and the original post https://juanitorduz.github.io/glm_pymc3/. I am open for suggestions and changes to make this notebook useful for the community.