Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT Parameter validation for descendants of BaseLibSVM #24001

Merged
merged 28 commits into from
Jul 27, 2022

Conversation

stefmolin
Copy link
Contributor

@stefmolin stefmolin commented Jul 26, 2022

Reference Issues/PRs

Towards #23462

What does this implement/fix? Explain your changes.

Added parameter validation to the BaseLibSVM class and its descendants. Updated tests accordingly, and modified __str__() in StrOptions so that the values are always in the same order (e.g., always {'auto', 'scale'}) to make sure tests can be passed consistently.

Any other comments?

@jeremiedbb jeremiedbb added the Validation related to input validation label Jul 26, 2022
Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @stefmolin. Here are some suggestions.

Linting is failing. You can fix it by installing and running black.

sklearn/svm/_base.py Outdated Show resolved Hide resolved
sklearn/svm/_base.py Outdated Show resolved Hide resolved
sklearn/svm/_base.py Outdated Show resolved Hide resolved
@stefmolin
Copy link
Contributor Author

@jeremiedbb - I'm still working on this, but I will let you know when the PR is ready for review.

@stefmolin stefmolin marked this pull request as ready for review July 26, 2022 17:28
@stefmolin
Copy link
Contributor Author

@jeremiedbb – Ready for review. It is failing the mypy check now. I'm not sure how to fix that.

@glemaitre glemaitre self-requested a review July 27, 2022 09:05
sklearn/svm/_base.py Outdated Show resolved Hide resolved
sklearn/svm/_classes.py Outdated Show resolved Hide resolved
@glemaitre
Copy link
Member

You should check the test def test_bad_input() in test_svm.py where we should be removing some of the checks because it will be covered by the common test.

@glemaitre
Copy link
Member

Otherwise, LGTM

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I just fixed the verbose and cache_size constraints

@@ -1332,7 +1332,7 @@ class NuSVR(RegressorMixin, BaseLibSVM):
default 0.5 will be taken.

C : float, default=1.0
Penalty parameter C of the error term. Must be non-negative.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have been removed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm it does not make sense indeed. I don't understand why we expose the C parameter here.

I think this is better to let it as-is in this PR and to make a separate PR to remove (or deprecate) the parameter C from NuSVR. It will require adding an entry in the changelog. Deprecation will be better than just removing the parameter. It will be less brutal for the end-user.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glemaitre are you sure about that ? the doc says
However, unlike NuSVC, where nu replaces C, here nu replaces the parameter epsilon of epsilon-SVR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right. I was only reading the nu-SVC algorithm this morning. So I got confused. Thanks for the explanation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment was that the Must be non-negative. part that I added was removed in @jeremiedbb's commit. I wanted to confirm that it was intentional, since the parameter validation now requires non-negative values for C.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was just a nitpick, to keep parameter description consistent accross svm estimators. We'll need to make a pass on all docstrings at some point anyway to properly write the accepted range of values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, during the sprint at EuroPython I believe @glemaitre mentioned to update the docstrings to match the parameter validation. Good to know that I shouldn't update them on future PRs for this meta-issue. It would be a good idea to mention this in #23462 to be clear for others though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No no, it's fine to update the docstrings. It's just that here you only did the C description for one svm estimators so the others svm estimators would have a different description which I didn't find very user friendly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha.

@glemaitre glemaitre self-requested a review July 27, 2022 15:21
Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I will merge as-is.

Thanks @stefmolin. Do you want to make the following PR to deprecate the C parameter from NuSVR?

@glemaitre glemaitre merged commit 329adf1 into scikit-learn:main Jul 27, 2022
@stefmolin stefmolin deleted the maint/svm_parameter_validation branch July 27, 2022 21:25
ivannz added a commit to ivannz/scikit-learn that referenced this pull request Aug 29, 2022
…cSVM scikit-learn#24001)

finish v1.2 deprecation of params kwargs in `.fit` of SVDD (similar to ocSVM scikit-learn#20843)
removed SVDD param-validation exception from test_common.py since scikit-learn#23462 is go (scikit-learn#22722)
ivannz added a commit to ivannz/scikit-learn that referenced this pull request Sep 5, 2022
…cSVM scikit-learn#24001)

finish v1.2 deprecation of params kwargs in `.fit` of SVDD (similar to ocSVM scikit-learn#20843)
TST ensure SVDD passes param-validation test_common.py due to scikit-learn#23462 (scikit-learn#22722)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants