-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the ensemble_size == 0 error in automl.py #1369
Conversation
…by adding a valueError to the former and giving a warning and returning empty dictionary in the latter
Hey @eddiebergman can you check the PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems mostly good :) Just a bit more explicit checking with show_models()
.
We would also add some brief tests to make sure these are raised.
This constitutes 4 tests, one for each kind of error/warning. These would all go in test_automl.py
- Test that
fit_ensemble(..., ensemble_size=0)
gives an error. You don't need to worry about fitting before hand as the check is performed before anything explicit is done. - Check for the fitted error in
show_models()
, should be straight forward enough. - Check for empty dict when
show_models()
is called withensemble_size = 0
passed in construction ofAutoMLClassifier
. - The last test, checking for no models existing return an empty dict, We could construct an
AutoMLClassifier
withensemble_size > 0
, set whatever flags are needed to indicate it has been fitted so it passed the other check and then test for the correct kind of error being raise.
For reference for tests if you're not familiar, you'll need pytest.raises(<ErrorType>, match=<msg>)
for checking for errors being emitted.
autosklearn/automl.py
Outdated
@@ -1906,6 +1907,9 @@ def show_models(self) -> Dict[int, Any]: | |||
""" | |||
|
|||
ensemble_dict = {} | |||
if self._ensemble_size == 0: | |||
self._logger.warning('No models in the ensemble. Kindly check the ensemble size.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is a front facing API, this warning should be emitted to the user and not just logged silently, in fact there's not much point in logging it silently as it wouldn't help with debugging much.
import warnings
warnings.warn(msg)
Also, I don't think _ensemble_size
is the best proxy for checking this. A use could pass _ensemble_size = 10
but auto-sklearn failed to find any models, resulting in self._ensemble_size = 10
but having no models_
property.
Take a look at _load_models()
for the logic of self.ensemble_
and self.models_
.
While thinking about it, there's probably three separate conditions to check for:
- Has autosklearn been fitted? If not raise a RuntimeError
- Has autosklearn been given a paramter of
ensemble_size=0
? If so, issue a warning and give an empty dict. - Has autosklearn been given a parameter
ensemble_size > 0
but there is no ensemble to load from? If so, issue a warning and give an empty dict. - Has
I think checkingself.ensemble_
forNone
should be good enough
I would change it to
# At top
import warnings
if self._ensemble_size == 0
Thanks for the comments, will try to write tests and implement the suggestions. |
Codecov Report
@@ Coverage Diff @@
## development #1369 +/- ##
===============================================
+ Coverage 84.41% 84.48% +0.07%
===============================================
Files 146 146
Lines 11267 11282 +15
Branches 1925 1929 +4
===============================================
+ Hits 9511 9532 +21
+ Misses 1240 1234 -6
Partials 516 516 |
Hi @bkpcoding, Do you plan to continue this? If not it's okay but I would rather close this if it's going to go stale. Best, |
Hey @eddiebergman, |
No rush!! Please focus on your exams, this can wait. Just wanted to make sure there was still some plan to do this :) |
Added two tests to check if the automl.fit_ensemble() raises error when ensemble_size == 0 and if show_models() returns empty dictionary when ensemble_size == 0
Hey, @eddiebergman I have made the changes you had requested and added two tests for them, however, it was not clear to me how to implement the points you suggested below, I tried a few things but was not able to successfully run them.
|
Hi @bkpcoding,hope your exams are going well :) So I think any kind of checks help so I'll merge it in once possible, no need to discard the work :) I'll handle the merge once I can and this is finished! So really the class should definitely have some kind of explicit flag indicating it is fit. We will handle this eventually to try and get full sklearn-compatibility.
Turns out it's not very straightforward to see if autosklearn is fitted. Most places just use Can you do the following, it should make the check a lot easier, and allow you to easily implement Has autosklearn been fitted? If not raise a RuntimeError:
Sorry I'm not sure why this is needed, what's the issue that is occurring with this test? If it does not return an empty dict, then we need to fix def test_x():
with pytest.raises(ErrorType, match="First few words of the error"):
do_thing_that_raises_error() |
Test for checking if the show_models() functions raise an error if models are not fitted.
Add a function __sklearn_is_fitted__() which returns the boolean value of self.fitted(). And add the check for model fitting in show_models() function.
Seems good to me, I'll resolve the conflicts and run the workflow files :) If there's any issues, you have to git pull and fix them but otherwise I'm happy with it! @mfeurer, take a look if you like. |
I've done the merge, future contributions with formatting and checking should be easier with some new tooling :) If you want to run the pre-commit checks locally (the ones that normally fail -_-), you can do:
I'll check back in later today and see how the tests do. Edit:
Again this was made easier, in the latest version of |
Hey @eddiebergman, sorry I didn't run the pre-commit tests before. But I have run it now(after formatting using black and isort) and all of them have passed. Let me know if I need to make any more changes or run any tests(I did run test_automl.py and it passed all of them). |
Sorry, rerunning tests, seems like one of them had pip installation issue but it seems unrelated to this (timeouts when trying to find a requirement). I reran them just to be sure |
Hey @bkpcoding, Thanks very much for contributing! I've merged it so congrats :) If you have any comments on how the contribution process was for you, we'd be happy to hear any thoughts you might have to help us improve for the future!
You can leave it here as a comment or else you can find my email on my GitHub profile if you'd prefer to do so privately! Thanks again! Best, |
Hey @eddiebergman thanks for merging the PR and for all the trouble you took for review and suggestions. I think the entire codebase was very well written and after reading the paper and going through the documentation I was pretty much able to grasp the implementation of the library. |
Great to hear :) The code base can always be improved but step by step! Git is always one of those things that takes time, I still don't know much of it's functionality but always plenty more to learn. If there's anything you found helpful to know about git, feel free to throw it into the Contribution guide. If you like, I can tag you on issues that I don't think require going too deep or require extensive changes, otherwise feel free to poke around whenever you feel like it! Best of luck with the exams! Best, |
Surely you can tag me on the issues you think would be relevant to me, I would be happy to look at them |
* Fix the ensemble == 0 error in fit_ensemble and show_models function by adding a valueError to the former and giving a warning and returning empty dictionary in the latter * Update automl.py * Two tests for ensemble_size == 0 cases Added two tests to check if the automl.fit_ensemble() raises error when ensemble_size == 0 and if show_models() returns empty dictionary when ensemble_size == 0 * Update automl.py * Update test_automl.py Test for checking if the show_models() functions raise an error if models are not fitted. * Update automl.py Add a function __sklearn_is_fitted__() which returns the boolean value of self.fitted(). And add the check for model fitting in show_models() function. * Update autosklearn/automl.py * Formatting changes to clear all the pre-commit tests Co-authored-by: Eddie Bergman <eddiebergmanhs@gmail.com>
Fixing the
ensemble_size == 0
error in thefit_ensemble
andshow_models
functions by raising a ValueError in the former and issuing a warning and returning an empty dictionary in the latter. Related to issue #1365