Improve efficiency of the `inv_link_<ordinal_family>()` functions #1155

fweber144 · 2021-05-06T12:31:30Z

This PR is based on PR #1154, so it's better to merge #1154 first.

As suspected here (enumeration point 2), the implementation used in the current unit tests for the inv_link_<ordinal_family>() functions is indeed more efficient than the original one. Here is the requested speed comparison:

```r
library(brms)
options(mc.cores = parallel::detectCores(logical = FALSE))
data(inhaler, package = "brms")
bfit_cumul <- brm(
  formula = rating ~ period + carry + treat + (1 | subject),
  data = inhaler,
  family = cumulative(),
  seed = 475064792
)
library(microbenchmark)
microbenchmark(epred_cumul <- posterior_epred(bfit_cumul),
               times = 25)
### Old:
# Unit: seconds
#                                       expr      min       lq     mean   median      uq      max neval
# epred_cumul <- posterior_epred(bfit_cumul) 1.206015 1.415697 1.525062 1.429557 1.68592 2.037138    25
###
### New:
# Unit: seconds
#                                       expr      min       lq    mean   median       uq      max neval
# epred_cumul <- posterior_epred(bfit_cumul) 1.322469 1.329213 1.35928 1.336847 1.343451 1.615867    25
###

bfit_sratio <- update(bfit_cumul, family = sratio())
microbenchmark(epred_sratio <- posterior_epred(bfit_sratio),
               times = 25)
### Old:
# Unit: seconds
#                                         expr      min       lq     mean   median       uq      max neval
# epred_sratio <- posterior_epred(bfit_sratio) 13.98803 14.29517 14.40808 14.38126 14.58076 14.65054    25
###
### New:
# Unit: seconds
#                                         expr      min       lq     mean   median       uq      max neval
# epred_sratio <- posterior_epred(bfit_sratio) 8.475701 8.635784 8.723748 8.690884 8.795831 9.093068    25
###

bfit_cratio <- update(bfit_cumul, family = cratio())
microbenchmark(epred_cratio <- posterior_epred(bfit_cratio),
               times = 25)
### Old:
# Unit: seconds
#                                         expr      min       lq     mean   median       uq      max neval
# epred_cratio <- posterior_epred(bfit_cratio) 13.97217 14.32164 14.39963 14.34807 14.47658 14.70686    25
###
### New:
# Unit: seconds
#                                         expr      min       lq    mean   median       uq      max neval
# epred_cratio <- posterior_epred(bfit_cratio) 8.169052 8.572888 8.62483 8.597578 8.634792 8.878164    25
###

bfit_acat <- update(bfit_cumul, family = acat())
microbenchmark(epred_acat <- posterior_epred(bfit_acat),
               times = 25)
### Old:
# Unit: seconds
#                                     expr      min       lq     mean   median       uq      max neval
# epred_acat <- posterior_epred(bfit_acat) 13.73694 13.88184 13.94854 13.92991 13.95867 14.17641    25
###
### New:
# Unit: seconds
#                                     expr      min       lq    mean   median       uq      max neval
# epred_acat <- posterior_epred(bfit_acat) 12.32044 12.65225 12.7692 12.72448 12.90858 13.42917    25
###

bfit_acat_probit <- update(bfit_cumul, family = acat(link = "probit"))
microbenchmark(epred_acat_probit <- posterior_epred(bfit_acat_probit),
               times = 25)
### Old:
# Unit: seconds
#                                                   expr      min      lq     mean   median       uq      max neval
# epred_acat_probit <- posterior_epred(bfit_acat_probit) 31.17294 31.3034 31.45372 31.49983 31.55278 31.70364    25
###
### New:
# Unit: seconds
#                                                   expr      min       lq     mean   median       uq      max neval
# epred_acat_probit <- posterior_epred(bfit_acat_probit) 21.54279 22.21362 22.39333 22.48148 22.65841 23.15681    25
###
```

Because of this speed improvement (roughly 6 % to 40 % in median runtime, depending on the family and link, but always present), this PR swaps the two implementations of the inv_link_<ordinal_family>() functions (the original one and the one from the unit tests).

Of course, one could go one step further and achieve another speed improvement by using arrays when calling d<ordinal_family>() in posterior_epred_ordinal(), i.e. by not iterating over the observations, but instead including them as an additional array margin. But that probably requires larger changes.
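To make the array idea concrete, here is a minimal sketch for the cumulative family with a logit link. It is not the code in brms: the function name `inv_link_cumulative_sketch()` and the assumed input layout (an array of `threshold - eta` values with margins draws x observations x thresholds) are made up for this illustration.

```r
# Hypothetical sketch, not the brms implementation.
# x: array with dim c(ndraws, nobs, nthres) holding (threshold - eta).
# Returns category probabilities with dim c(ndraws, nobs, nthres + 1).
inv_link_cumulative_sketch <- function(x) {
  stopifnot(length(dim(x)) == 3)
  ndraws <- dim(x)[1]
  nobs <- dim(x)[2]
  nthres <- dim(x)[3]
  # P(Y <= k) for all draws, observations and thresholds at once
  cum <- plogis(x)
  out <- array(NA_real_, dim = c(ndraws, nobs, nthres + 1))
  out[, , 1] <- cum[, , 1]
  if (nthres > 1) {
    # Differences of adjacent cumulative probabilities
    out[, , 2:nthres] <- cum[, , 2:nthres] - cum[, , 1:(nthres - 1)]
  }
  out[, , nthres + 1] <- 1 - cum[, , nthres]
  out
}

# Toy input: 4 draws, 3 observations, 2 thresholds (i.e. 3 categories)
set.seed(1)
ndraws <- 4; nobs <- 3; nthres <- 2
eta <- matrix(rnorm(ndraws * nobs), ndraws, nobs)
thres <- c(-0.5, 0.7)
x <- array(NA_real_, dim = c(ndraws, nobs, nthres))
for (k in seq_len(nthres)) x[, , k] <- thres[k] - eta
probs <- inv_link_cumulative_sketch(x)
```

The observations are just one more array margin here, so no loop over observations is needed. The other ordinal families would need their own array versions.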

paul-buerkner · 2021-05-06T14:16:53Z

Very elegant implementations. Thank you! Will merge once the checks pass.

paul-buerkner · 2021-05-06T14:36:14Z

When I change the number of categories in inv_link_ordinal_sim from 3 to 2, some tests fail. Can you take a look and fix the implementations to work with 2 categories as well?

fweber144 · 2021-05-06T15:26:10Z

Yes, I'll take a look at it. Thanks for the hint.

fweber144 · 2021-05-06T19:36:10Z

Should be fixed now. And I hope I have included all special cases in the unit tests now. Thanks again for pointing this out and sorry for not being aware of this.

paul-buerkner · 2021-05-06T21:19:51Z

Thanks for fixing this! And no worries, 1/3 of all brms bugs are caused by R dropping dimensions somewhere in edge cases :-D

fweber144 · 2021-05-07T06:16:48Z

Yeah, that dropping of margins is not really developer-friendly, especially when there's no way to turn it off, as for apply(). Thanks for merging!
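For readers unfamiliar with the problem, a small illustration of how apply() drops a margin in an edge case. The toy matrices are made up for this example and only loosely mirror the 2-category situation (a single threshold).

```r
x_two <- matrix(1:6, nrow = 2)   # 2 rows (e.g. thresholds), 3 columns
x_one <- matrix(1:3, nrow = 1)   # edge case: only 1 row

res_two <- apply(x_two, 2, cumsum)
res_one <- apply(x_one, 2, cumsum)

dim(res_two)  # 2 3
dim(res_one)  # NULL: the result collapsed to a plain vector

# Restoring the intended shape explicitly
res_one_fixed <- array(res_one, dim = dim(x_one))
dim(res_one_fixed)  # 1 3
```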

jgabry · 2021-05-07T23:34:16Z

> And no worries, 1/3 of all brms bugs are caused by R dropping dimensions somewhere in edge cases :-D

1/3 of all nightmares I have are caused by R dropping dimensions ;)

wds15 · 2021-05-08T06:40:42Z

And then filling up containers by repeating things is a real nightmare.

fweber144 · 2021-05-08T07:22:51Z

You mean because I repeatedly added those array(..., dim = c(dim_thres, dim_noncat)) calls in commit 8b1704a? That's true, that could perhaps have been solved more elegantly by adding a custom wrapper around apply() which does not drop margins.
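A rough sketch of what such a wrapper might look like. The name `apply_keep()` is hypothetical (it exists neither in brms nor in base R), and the sketch assumes that the applied function returns a vector of the same length on every call.

```r
# Hypothetical helper (not part of brms or base R): like apply(), but the
# result always has dim c(<length of FUN's output>, dim(x)[margin]).
apply_keep <- function(x, margin, fun, ...) {
  res <- apply(x, margin, fun, ...)
  # apply() returns a list if FUN's output length varies; not handled here
  stopifnot(!is.list(res))
  n_out <- length(res) / prod(dim(x)[margin])
  array(res, dim = c(n_out, dim(x)[margin]))
}

x <- array(1:6, dim = c(1, 3, 2))
dim(apply(x, c(2, 3), cumsum))       # 3 2: the first margin is dropped
dim(apply_keep(x, c(2, 3), cumsum))  # 1 3 2
```

This works because apply() always orders its result with the index of the function's output varying fastest, followed by the margins, so re-wrapping in array() only restores the dimension attribute.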

paul-buerkner · 2021-05-08T07:34:08Z

That was not a comment about your code, I think, but only about R's way of filling arrays by reusing values from the start if the sizes don't match. Although you are right, we could add an apply2 function as a wrapper around apply().
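A small illustration of that recycling behaviour, with made-up values. The guard function `array_strict()` is a hypothetical helper, not something from brms.

```r
vals <- 1:4
dims <- c(2, 3)

# array() recycles the 4 values from the start to fill all 6 cells
a <- array(vals, dim = dims)
as.vector(a)  # 1 2 3 4 1 2

# Hypothetical guard: fail early instead of recycling
array_strict <- function(data, dim) {
  stopifnot(length(data) == prod(dim))
  array(data, dim = dim)
}
```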


fweber144 · 2021-05-08T15:54:30Z

Ah I see :D

Concerning the apply() wrapper: I'm currently lacking the time to implement this, but I'll try to keep it in mind.

paul-buerkner · 2021-05-08T17:16:15Z

Yeah, no worries. I can just do that myself. The projpred things are much more important anyway (and probably a lot of other things you have on your table).


fweber144 · 2021-05-20T04:09:42Z

As if the R Core Team had heard us: For R 4.1.0, the NEWS file says:

> apply() gains a simplify argument to allow disabling of simplification of results.

This new argument doesn't offer exactly what I would have desired, but it should be a good starting point for writing a custom apply() wrapper. @paul-buerkner, do you want me to write such a wrapper? But it would make brms depend on R >= 4.1.0.
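As a minimal sketch of how the new argument could serve as a starting point (it requires R >= 4.1.0, and the toy matrix is made up for this example):

```r
# Requires R >= 4.1.0 for the simplify argument of apply()
x_one <- matrix(1:3, nrow = 1)

# With simplify = FALSE, apply() returns one list element per column
res_list <- apply(x_one, 2, cumsum, simplify = FALSE)

# Binding the elements column-wise keeps the 1 x 3 shape
res <- do.call(cbind, res_list)
dim(res)  # 1 3
```

The list result still has to be reassembled by hand, which is why the argument alone does not replace a margin-preserving wrapper.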

paul-buerkner · 2021-05-20T06:39:12Z

Thanks for looking into it. I don't want to strictly depend on R 4.1+ or even R 4.0+ at the moment, so I suggest we don't change it for now.


Swap the two implementations of the inv_link_<ordinal_family>() functions: In `distributions.R`, use the more efficient implementation from the unit tests and in the unit tests, use the original implementation.

c6644fb

This was referenced May 6, 2021

Refactor unit tests for inv_link_<ordinal_family>() #1154

Merged

posterior_linpred() for ordinal families: argument for taking the intercept into account #1137

Merged

paul-buerkner added 2 commits May 6, 2021 16:04

Merge branch 'master' into pr/1155

17d6aaf

minor cleaning

14d3e63

Merge branch 'master' into pr/1155

4e5fd6c

fweber144 added 4 commits May 6, 2021 20:59

Fix a bug for ncat == 2 and add tests for this.

8b1704a

Add tests for ndraws <- 1.

16a7cb8

Add tests for nobsv <- 1.

5e2a7bf

In the inv_link_<ordinal_family>() functions: Also wrap the last remaining apply() call in array(), even though it shouldn't be necessary (but better be explicit).

5f7dc59

minor cleaning

e37b203

paul-buerkner merged commit 98d0cc8 into paul-buerkner:master May 6, 2021

fweber144 deleted the ordinal_speed branch May 7, 2021 06:17
