-
Notifications
You must be signed in to change notification settings - Fork 392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX Make skorch work with sklearn 1.6.0 #1076
Conversation
The new sklearn version now requires the estimator to expose a method __sklearn_tags__. This PR implements a bare minimum of sklearn tags to make the tests pass again. Probably this list of tags can be improved to be more precise. Moreover, some tests now started failing. To be precise, using GridSearchCV with y being a torch tensor fails, as sklearn performs a check on the devices of y vs y_pred (a numpy array) and determines that the devices differ. The error message is: def device(*array_list, remove_none=True, remove_types=(str,)): """Hardware device where the array data resides on. If the hardware device is not the same for all arrays, an error is raised. Parameters ---------- *array_list : arrays List of array instances from NumPy or an array API compatible library. remove_none : bool, default=True Whether to ignore None objects passed in array_list. remove_types : tuple or list, default=(str,) Types to ignore in array_list. Returns ------- out : device `device` object (see the "Device Support" section of the array API spec). """ array_list = _remove_non_arrays( *array_list, remove_none=remove_none, remove_types=remove_types ) if not array_list: return None device_ = _single_array_device(array_list[0]) # Note: here we cannot simply use a Python `set` as it requires # hashable members which is not guaranteed for Array API device # objects. In particular, CuPy devices are not hashable at the # time of writing. for array in array_list[1:]: device_other = _single_array_device(array) if device_ != device_other: > raise ValueError( f"Input arrays use different devices: {str(device_)}, " f"{str(device_other)}" ) E ValueError: Input arrays use different devices: cpu, cpu The tests have now been amended to cast y to a numpy array. It is unfortunate that this used not be necessary but now is. I think we can live with that, as it's an edge case. I tested that with normal fitting, passing a torch.tensor as y is no problem.
Ping @githubnemo. Maybe @adrinjalali could also check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might also be interested in some of the utils we've put here: https://github.com/sklearn-compat/sklearn-compat/
def __sklearn_tags__(self): | ||
# TODO: this is just the bare minimum, more tags should be added | ||
tags = super().__sklearn_tags__() | ||
tags.estimator_type = 'classifier' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should probably be inheriting from ClassifierMixin
instead really.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, we don't inherit from BaseEstimator
, ClassifierMixin
, etc. So far, this went well but probably can cause issues like this. Would you think it is better to switch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you're already inherriting from ClassifierMixin
in this file, so adding BaseEstimator
would only make things easier.
In general, we've added a substantial amount of features to the BaseEstimator
, and you probably would have a much easier time if you inherit from it. These days I'd strongly recommend inheriting from the right mixins.
Altenative to #1076 As described in that PR, skorch is currently not compatible with sklearn 1.6.0 or above. As per suggestion, instead of implementing __sklearn_tags__, this PR solves the issue by inheriting from BaseEstimator. Related changes: - It is important to set the correct order when inheriting from BaseEstimator and, say, ClassifierMixin (BaseEstimator should come last). - As explained in #1076, using GridSearchCV with y being a torch tensor fails and two tests had to be adjusted. Unrelated changes - Removed unnecessary imports from callbacks/base.py.
Closing in favor of #1078. |
Alternative to #1076 As described in that PR, skorch is currently not compatible with sklearn 1.6.0 or above. As per suggestion, instead of implementing __sklearn_tags__, this PR solves the issue by inheriting from BaseEstimator. Related changes: - It is important to set the correct order when inheriting from BaseEstimator and, say, ClassifierMixin (BaseEstimator should come last). - As explained in #1076, using GridSearchCV with y being a torch tensor currently fails and two tests had to be adjusted. Unrelated changes - Removed unnecessary imports from callbacks/base.py.
The new sklearn version now requires the estimator to expose a method
__sklearn_tags__
. This PR implements a bare minimum of sklearn tags to make the tests pass again.Probably this list of tags can be improved to be more precise.
Moreover, some tests now started failing. To be precise, using
GridSearchCV
with y being a torch tensor fails, as sklearn performs a check on the devices ofy
vsy_pred
(a numpy array) and determines that the devices differ. The error message is:Error
The tests have now been amended to cast
y
to a numpy array. It is unfortunate that this used not be necessary but now is.I think we can live with that, as it's an edge case. I tested that with normal fitting, passing a
torch.tensor
asy
is no problem.Note: The commit message says sklearn 1.5 but it's in fact 1.6.