-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance regression in mean filter (v0.18.0 onwards) #6028
Comments
As far as I can tell, the only thing that's changed is #5399, which removes the gil from the core loop of rank filters. I can imagine that this adds a small overhead when running single threaded in some situations, e.g. maybe releasing and reacquiring the gil is more expensive on Windows? However, the change results in very significant (multi-x) speedups in multithreaded environments, so I'm not sure whether much can be done here... (ie we are not going to revert #5399.) Perhaps @soyers you can try to benefit from the multithreading improvements using @scikit-image/core unless others have ideas about how to fix this, I suggest (unfortunately!) closing this as wont-fix... @soyers do you have any thoughts on the above? I'm disappointed to not be able to help but I don't see another way forward... |
It would be really useful to know how many cores were available to the Win10 (virtual?) environment in question @soyers. Beyond the code changes, did anything get tweaked globally in how Windows builds are compiled and optimized around 0.18? It's been a while, though, and I would have expected lots more complaints since if there was something systematic going on, especially with |
@JDWarner what do you mean by "single threaded windows users"? I would expect that most users work with single threads. I think it's more likely that most people, especially desktop users (vs HPC running Linux) wouldn't notice a 30% performance decrease. |
thank you for the detailed feedback and your thoughts on this! @JDWarner I am using a Windows 10 System (native, no VM) with a (mini)conda environment. The scikit-image library is installed from the anaconda repository. The machine executing the tests is a Dell Precision 7520 with an Intel Core i7-6920HQ (4C8T) and 32GB RAM. The experiment above is not parallelized (runs on a single core). I am currently struggling to understand which version of scikit-image actually introduced the kernels which release the GIL. #5399 was merged in May 2017 while scikit-image 0.17.2 was released in 2020. I was therefore assuming that version 0.17.2 already contains the changes for releasing the GIL. As far as I understand this commit should be the one that was released as v0.17.2: https://github.com/scikit-image/scikit-image/blob/1188c33c3defefdc0f672870b5bcc36beaec967f/skimage/filters/rank/bilateral_cy.pyx Using the release tags to narrow down the time frame in which the regression must have been introduced, I would argue that the following PRs most likely introduced the regression:
I just skimmed over the changes and noticed that this PR introduces a couple of changes in how parameters are passed to the cython part of the code. cd5c1a7#diff-11036dd1acc33e83cc65e4a480cb1eb4d56877c10ebbf93c5e5e974191d92988 shows that a lot of cdef Py_ssize_t [::1] se_e_r = np.empty(se_size, dtype=np.intp) changed to se_e_r = <Py_ssize_t*>malloc(se_size) Unfortunately, I don't have the knowledge to judge if some of these changes could potentially cause regressions, but mayve you have more insight concerning this topic! I'm grateful about any input 😄 |
If your application allows relaxing the |
This might be related. I think sometimes Cython memoryviews can be a little slower than using pointers, but we would have to benchmark it to be sure. |
@grlee77 Thank you very much for the hint! I hadn't thought about using a uniform filter. Concerning the memoryviews vs pointers issue, I can imagine that it's hard to tell whether this is really an issue. I am not sure when I will find the time to investigate this, but if somebody else already has a development stack running and is trying to reproduce this, I will do my best to help! |
Description
Our performance tests detected a regression in the mean filter (
from skimage.filters.rank.mean
)after upgrading from scikit-image 0.17.2 to 0.18.3.I was not able to reproduce this issue in linux-based colab notebooks, so it seems to be specific for the Windows builds of skimage.
After some profiling I found that the
skimage.filters.rank.generic_cy._mean
function is about >30% slower in skimage v0.18.3 compared to v0.17.2.Way to reproduce
Run the following code in an ennvironment with skimage v0.17.2 and afterwards run the same code with skimage v0.18.3:
The output of script run with v0.18.3 is (first line is skimage version, second line is average execution time of _mean function):
The output of script run with v0.17.2 is
Version information
Output from v0.18.3 environment:
Output from v0.17.2 environment:
The text was updated successfully, but these errors were encountered: