Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve median filter #150

Merged
merged 2 commits into from
Nov 29, 2024
Merged

Improve median filter #150

merged 2 commits into from
Nov 29, 2024

Conversation

cgohlke
Copy link
Member

@cgohlke cgohlke commented Nov 28, 2024

Description

This PR improves upon #147, simplifying the implementation and boosting performance of median filtering:

  • Do not call _quickselect twice in _median function for even-sized kernels.
    The functions are now merged into one _median function.
  • Move "repeat" loop into _apply_2d_median_filter function nogil section.
  • Separate "repeat" loops for real and imag to improve memory efficiency.
  • Merge _median_filter_2d and _median_filter_nd into one _median_filter function.
  • Rename _apply_2d_median_filter to _median_filter_2d.
  • Use separate nan_mask for real and imag to match Cython implementation.
  • Move median filter after thresholding in introductory tutorial. That should be better practice now that the median filter correctly handles NaN, no? Added a note that "filtered coordinates can no longer be used to reconstruct the original signal" (is there a better phrasing? Should that be added to the function docstring?)

Checklist

  • The pull request title and description are concise.
  • Related issues are linked in the description.
  • New dependencies are explained.
  • The source code and documentation can be distributed under the MIT license.
  • The source code adheres to code standards.
  • New classes, functions, and features are thoroughly tested.
  • New, user-facing classes, functions, and features are documented.
  • New features are covered in tutorials.
  • No files other than source code, documentation, and project settings are added to the repository.

@cgohlke cgohlke added the enhancement New feature or request label Nov 28, 2024
@cgohlke cgohlke self-assigned this Nov 28, 2024
@bruno-pannunzio
Copy link
Contributor

Hi @cgohlke! Thanks for implementing this improvements for median filtering. The modifications helped improving a little bit more the execution times, since now it seems to be around 20% faster than Scipy's (compared to 10-15% faster from what is now in main).

  • Use separate nan_mask for real and imag to match Cython implementation.

This is something we can discuss if it makes sense to apply the same logic as the phasor_threshold for propagating NaN across real and imag so that same values are invalid in all components of the phasor. But we can keep it like this I think and remember to discuss it next meeting.

  • Move median filter after thresholding in introductory tutorial. That should be better practice now that the median filter correctly handles NaN, no?

I think it would be a good idea to have the input of @lmalacrida on this. I remember he once said to me it's better practice to first filter and then threshold. I tried first thresholding and then filtering and the result phasor has wider spread and doesn't look as nice (even with the current implementation which is NaN consistent).

This is an example result of first thresholding and then filtering:
image

And this is first filtering and then thresholding where you can see it is more compact and fewer pixels lie outside the semicircle:
image

  • Added a note that "filtered coordinates can no longer be used to reconstruct the original signal" (is there a better phrasing? Should that be added to the function docstring?)

This can be a good idea to add to the docstring as a warning that once performed the filter you can't go back to the signal. Also maybe add a note in the phasor_to_signal function stating the same warning?

@cgohlke
Copy link
Member Author

cgohlke commented Nov 29, 2024

it's better practice to first filter and then threshold

OK, I reverted the changes to the tutorial since they are disputed.

apply the same logic as the phasor_threshold for propagating NaN across real and imag so that same values are invalid in all components of the phasor.

This would need to be documented and tested. And it should be implemented for all code paths, not just one. The filter would no longer be independent for real and imag. The logic should also apply to the intensity (at least optionally). That would probably be better implemented in a separate function (?)

I think it is easier to implement and document if the median filter is applied independently (as it is now).

@lmalacrida
Copy link

@bruno-pannunzio I agree that Filtering should be performed on the entire image array, regardless of thresholding or before it. This is how i believe was performed at SimFCS.

@cgohlke
Copy link
Member Author

cgohlke commented Nov 29, 2024

I agree that Filtering should be performed on the entire image array, regardless of thresholding or before it. This is how i believe was performed at SimFCS.

And the scientific reason is?

@cgohlke
Copy link
Member Author

cgohlke commented Nov 29, 2024

That would probably be better implemented in a separate function (?)

That's what phasor_threshold already does by default.

@cgohlke
Copy link
Member Author

cgohlke commented Nov 29, 2024

This is how i believe was performed at SimFCS.

I skimmed through the SimFCS source code. At least in some places, a minimum intensity threshold is applied as part of calculating phasor coordinates, that is before applying median filter. Phasor coordinates for intensities below threshold are set to zero, which are then included in the median filter (I presume). Other code paths in SimFCS might or might not do things differently.

For comparison: PhasorPy does not apply any intensity threshold by default when calculating phasor coordinates and sets phasor coordinates with no intensity to NaN. The phasor_threshold function sets values below or above thresholds to NaN. NaN values are ignored in the median filter.

Copy link
Contributor

@bruno-pannunzio bruno-pannunzio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Christoph I think it makes sense what you propose here and keep the workflow as it is now.

Lets keep it like this with the modifications proposed here, where everything is very well documented.

@cgohlke cgohlke merged commit f6b96c0 into phasorpy:main Nov 29, 2024
14 checks passed
@cgohlke cgohlke deleted the improve-median branch November 29, 2024 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants