Skip to content

RegionProps.__setattr__ has non-negligible overhead #6601

Open
@anntzer

Description

Description:

RegionProps currently relies on __setattr__ to provide backcompat access of old property names (e.g. mapping BoundingBox to bbox); but because this also affects all attribute initializations in __init__, this ends up having a non-negligible overhead which could be avoided.

Way to reproduce:

First check how much time it takes to label a random (1024, 1024) binary image.

$ python -mtimeit -s 'import numpy as np; from skimage.measure import label, regionprops; np.random.seed(0); im = np.random.randint(0, 2, (1024, 1024))' -- 'label(im)'

-> 3673 labels (label(im).max()); 30ms/iteration

Then check how much time it takes just to instantiate the corresponding regionprops.

$ python -mtimeit -s 'import numpy as np; from skimage.measure import label, regionprops; np.random.seed(0); im = np.random.randint(0, 2, (1024, 1024)); l = label(im)' -- 'regionprops(l)'

-> 31ms/iteration i.e. even more than the labeling itself

One easy way to skip __setitem__ is to directly update the object's __dict__ directly, e.g. replace, in RegionProperties.__init__,

        self.label = label
        
        self._slice = slice
        self.slice = slice
        self._label_image = label_image
        self._intensity_image = intensity_image
        
        self._cache_active = cache_active
        self._cache = {}
        self._ndim = label_image.ndim
        self._multichannel = multichannel
        self._spatial_axes = tuple(range(self._ndim))
        
        self._extra_properties = {}

by

        vars(self).update(
            label=label,
            _slice=slice,
            slice=slice,
            _label_image=label_image,
            _intensity_image=intensity_image,
            _cache_active=cache_active,
            _cache={},
            _ndim=label_image.ndim,
            _multichannel=multichannel,
            _spatial_axes=tuple(range(label_image.ndim)),
            _extra_properties={},
        )

which speeds up the above timeit call to 18ms/iteration -- almost twice as fast.

Admittedly that's a bit ugly; probably the "proper" way to do this is to completely get rid of the custom __setattr__ (and of backcompat alias handling in __getattr__) and to instead define a bunch of pass-through properties, e.g.

@property
def BoundingBox(self): return self.bbox
@BoundingBox.setter
def BoundingBox(self, value): self.bbox = value

(which can probably be done in a loop in the class definition body in you prefer, e.g. something along the lines of

for _alias, _name in PROPS.items():
    locals()[_alias] = property(
        lambda self, *, _name=_name: getattr(self, _name),
        lambda self, value, *, _name=name: setattr(self, _name, value))

)
OTOH, I see that #6000 explicitly chose not to provide properties for the old aliases for the purpose of better hiding them.

I don't really care whether to fix this with the vars(self).update(...) approach or by introducing these properties.

Traceback or output:

No response

Version information:

3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:36:39) [GCC 10.4.0]
Linux-5.19.14-300.fc37.x86_64-x86_64-with-glibc2.36
scikit-image version: 0.19.3
numpy version: 1.23.3

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions