Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

posterior_linpred() for ordinal families: argument for taking the intercept into account #1137

Merged
merged 34 commits into from
May 5, 2021

Conversation

fweber144
Copy link
Contributor

@fweber144 fweber144 commented Apr 12, 2021

This PR introduces an argument (incl_thres) to posterior_linpred() for taking the intercept into account (in case of an ordinal family) which is required for the augmented-data approach in projpred.

A different topic: In line

out <- get_dpar(object, dpar = dpar, ilink = TRUE)
I don't quite understand why ilink = TRUE is used there (since scale = "linear" should also be possible there, right?). But I guess it's correct and I just don't get it. In case it's not correct: Do you want me to open an issue?

… into account (in case of an ordinal family) which is required for the augmented-data approach in projpred.
@fweber144
Copy link
Contributor Author

Of course, you can delete the comment in this line if you agree that using sweep() with its recycling checks is preferable.

@paul-buerkner
Copy link
Owner

paul-buerkner commented Apr 12, 2021 via email

@fweber144
Copy link
Contributor Author

Sure

@paul-buerkner
Copy link
Owner

Thanks again for our call yesterday. I have taken the liberty to push my changes directly to this PR so that you can continue working on it as well. I have implemented the basic functionality for the envisioned incl_thres feature.

There is some more work to do, which I don't have time for at the moment and I was hoping that perhaps you can continue with it, if you like and have time. Specifically, there are the following TODOs remaining:

  • Refactor the remaining d<ordinal-family> functions following what I have done with dcumulative in distributions.R.
  • Implement the (extended) link functions corresponding to the (extended) inverse link functions. This requires to do some math first because I don't know if anybody has actually written out these extended links already somewhere. The math shouldn't be too complicated though.

@fweber144
Copy link
Contributor Author

Thanks a lot! The basic case works like a charm (see the test added in tests/local/tests.models_new.R). However, I have found two cases which probably need special handling:

  1. Specifying a formula for the distributional parameter disc (e.g., disc ~ 1).
  2. Grouped thresholds (<...> | thres(th, gr) ~ <...>).

I have started tests for these two special cases in tests/local/tests.models_new.R but of course, those two tests still fail. I can have a look if I can fix these special cases.

I'll tackle the TODOs asap.

@paul-buerkner
Copy link
Owner

Thanks! My implementation should handle these specials cases already. In fact these special cases are one reason my implementation is set up in the way it is. Please let me know if you find out why its not working and I am happy to fix it.

@fweber144
Copy link
Contributor Author

Sure. I'm realizing that I forgot to take disc into account when performing the check. My bad, sorry.

@fweber144
Copy link
Contributor Author

fweber144 commented Apr 30, 2021

I fixed the disc ~ 1 test. However, the second special case (grouped thresholds) remains to discuss. As explained in my original corresponding unit test (commit 8e7f4fe), the unmatching group/threshold combinations seem to be assigned a value of zero which is probably misleading. It might be better to replace those zeros with NAs or use a completely different structure (e.g. a list, as in the original check for that unit test). If both is not an option, perhaps disallow grouped thresholds for incl_thres = TRUE. In commit b690b75, I have now replaced the zeros by NAs but I don't know if that breaks other code (which could rely on zeros).

@paul-buerkner
Copy link
Owner

Good catch! The reason things are 0 is because the probability of them is 0. NA should be fine as well, but perhaps this can be done conditionally, that is NA for identity link and 0 otherwise?

@fweber144
Copy link
Contributor Author

fweber144 commented Apr 30, 2021 via email

@paul-buerkner
Copy link
Owner

paul-buerkner commented May 4, 2021 via email

@fweber144
Copy link
Contributor Author

fweber144 commented May 4, 2021

Ah, yeah that's a good idea. I was too focused on the former matrix syntax to see how using arrays in the inv_link_<...>() functions will make life easier.

@fweber144
Copy link
Contributor Author

fweber144 commented May 4, 2021

To get back to the comment from above:

Did you check already if the sratio and cratio families yield the same results when run in brms with a symmetric link function

No, I haven't yet. I'll do so (and perhaps even add this as a unit test).

Families sratio() and cratio() indeed give the same results for symmetric link functions. I'll add unit tests for that. But I think it's better to create a new PR for that so we don't mix up too many different things here.

@paul-buerkner
Copy link
Owner

Thanks. Glad to hear things are correctly implemented in brms. And indeed, we can worry about the unit test later and separately.

@fweber144
Copy link
Contributor Author

Here are the commits with which I would consider this PR to be finished. But of course, if you want me to add or change something, that's no problem. Some notes:

  1. I added some internal documentation. I'm not 100% sure it's always correct, so better check it.
  2. I basically only "translated" the inv_link_<ordinal_family>() functions from matrix syntax to array syntax. However, I think my unit tests for these inv_link_<ordinal_family>() functions should provide a more efficient implementation since they are vectorized as much as possible. If you want, I can swap the two implementations (the original one from the functions and my new one from the unit tests) in a new PR. Of course, I would then also generalize my implementation from the unit tests to arrays with more than 3 dimensions.
  3. As mentioned in a comment of commit 8e7f4fe (these lines), I'm currently re-using the object fit from the previous test. I did so to save computing time when running the tests. However, it breaks the self-containment of that test which is probably bad practice.
  4. As mentioned in this comment, I'll create a separate PR for testing the equivalence of the sratio() and the cratio() family in case of symmetric distribution functions.
  5. The equivalence of the sratio() and the cumulative() (not cratio()) family in case of the "cloglog" link (see Appendix A of this paper) does not seem to hold in brms. That needs to be checked and when clarified, it should probably be included in the PR for the equivalence of the sratio() and the cratio() family.
  6. I have already started some code and the mathematical derivation of the link functions. However, as the link functions are not needed for the new argument incl_thres, I would create a new PR for that if you're OK with it. I also have some other projpred-related code for brms which I would then include in that PR. I think it's best to use the current PR only for argument incl_thres and changes directly related to it.

@paul-buerkner
Copy link
Owner

paul-buerkner commented May 5, 2021

Great, thank you so much!

  1. I will check the doc before merging.
  2. I think it is sufficient if the inv_link functions work for 1D to 3D arrays. We can swap the implementations but only if the speed differences is large enough to make it worthwhile. Did you investigate speed differences already? Personally, I am find with just keeping it in the current way for this PR. We can still change it later on once we have the speed evaluation done.
  3. I will edit to make it self-contained.
  4. Yes, we can do this in a separate PR but we can also choose not to do this at all right now since its not in focus at the moment.
  5. Let's ignore this at the moment. I got this equivalence statement from another paper and didn't check it myself in detail. I don't think its worth investigating at the moment.
  6. Sounds good!

@paul-buerkner
Copy link
Owner

I am running checks now. Once they pass, I will merge this PR. Thank you again for working on it!

@fweber144
Copy link
Contributor Author

Great, thanks. And I'm glad to help. Concerning

We can swap the implementations but only if the speed differences is large enough to make it worthwhile. Did you investigate speed differences already? Personally, I am find with just keeping it in the current way for this PR. We can still change it later on once we have the speed evaluation done.

I will perform such a speed comparison when I get the time.

@fweber144
Copy link
Contributor Author

Concerning commit 5c2cdb4: I'm just realizing that you can probably replace none_cat_dims by marg_othdim.

@paul-buerkner
Copy link
Owner

Good catch. Fixed it and will merge now.

@paul-buerkner paul-buerkner merged commit c545d81 into paul-buerkner:master May 5, 2021
fweber144 added a commit to fweber144/brms that referenced this pull request May 5, 2021
@fweber144 fweber144 deleted the projpred_augdat branch May 5, 2021 14:41
@fweber144
Copy link
Contributor Author

Great, thanks. And I'm glad to help. Concerning

We can swap the implementations but only if the speed differences is large enough to make it worthwhile. Did you investigate speed differences already? Personally, I am find with just keeping it in the current way for this PR. We can still change it later on once we have the speed evaluation done.

I will perform such a speed comparison when I get the time.

You can now find the speed comparison in PR #1155.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants