[MRG] Add few more tests + Documentation for re-entrant cross-validation estimators #7823
Conversation
Maybe it's easier just to document under
Could you explain the issue with assert_equal a little more? Perhaps it needs a comment.
sklearn/model_selection/_search.py
Outdated
@@ -816,7 +816,7 @@ class GridSearchCV(BaseSearchCV):
For instance the below given table

+------------+-----------+------------+-----------------+---+---------+
|param_kernel|param_gamma|param_degree|split0_test_score|...|rank_....|
|param_kernel|param_gamma|param_degree|split0_test_score|...| rank... |
Not better as rank_t...?
sklearn/model_selection/_split.py
Outdated
Note
----
I'd prefer no blank line here...
sklearn/model_selection/_split.py
Outdated
----

Multiple calls to the ``split`` method will not return identical
training or testing sets unless ``random_state`` is set to an integer
No such param
TimeSeriesSplit has a random_state ...
sklearn/model_selection/_split.py
Outdated
----

Multiple calls to the ``split`` method will not return identical
training or testing sets unless ``random_state`` is set to an integer
only if shuffle=True
@jnothman Have addressed your comments... Another look please?
    np.testing.assert_equal(
        np.array(list(kf_iter_wrapped.split(X, y))),
        np.array(list(kf_randomized_iter_wrapped.split(X, y))))
except AssertionError:
What's the point?
Ah sorry you asked this before and I failed to give a response!
So the nested lists comparison raises a deprecation warning...
>>> assert_true(np.array(list(kfold.split(X, y))) != np.array(list(kfold.split(X, y))))
DeprecationWarning: elementwise != comparison failed; this will raise an error in the future.
np.testing.assert_equal, on the other hand, handles lists of np.ndarrays gracefully...
Argh wait I forgot the else clause. Sorry for that...
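The pattern under discussion can be sketched in isolation (plain NumPy; the split lists below are hypothetical stand-ins for a CV splitter's output, not scikit-learn's actual test code): `np.testing.assert_equal` recurses into nested lists of ndarrays, and the `else` clause is what makes an "assert not equal" check via try/except sound.

```python
import numpy as np

# Hypothetical (train, test) splits from two differently-seeded splitters.
splits_a = [(np.array([2, 3]), np.array([0, 1])),
            (np.array([0, 1]), np.array([2, 3]))]
splits_b = [(np.array([1, 3]), np.array([0, 2])),
            (np.array([0, 2]), np.array([1, 3]))]

# np.testing.assert_equal recurses into lists/tuples of ndarrays,
# avoiding the unreliable elementwise `!=` on nested structures.
# To assert the two splits are NOT identical, expect it to raise,
# and use the else clause to fail loudly if it does not.
try:
    np.testing.assert_equal(splits_a, splits_b)
except AssertionError:
    pass  # good: the splits differ
else:
    raise AssertionError("splits were unexpectedly identical")
```

Without the `else`, the check silently passes even when the two splitters produce identical splits, which is the bug being apologized for here.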
@@ -1175,12 +1175,30 @@ def test_grid_search_cv_splits_consistency():
                  cv=KFold(n_splits=n_splits))
gs2.fit(X, y)

# Give generator as a cv parameter
To be sure, we've not got anything that ensures this will remain a generator. Either use a generator expression or test for its type.
Done!
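Joel's point — don't rely on `KFold.split` happening to return a generator — can be made concrete with a generator expression, which guarantees the type regardless of how the splitter is implemented (the toy index pairs below are hypothetical, not scikit-learn's test data):

```python
from types import GeneratorType

# Hypothetical (train, test) index pairs standing in for a splitter's output.
folds = [([2, 3], [0, 1]), ([0, 1], [2, 3])]

# A generator expression is guaranteed to be a true generator,
# independent of how the underlying splitter builds its folds.
fold_gen = (fold for fold in folds)
assert isinstance(fold_gen, GeneratorType)

# A generator can be consumed exactly once, which is exactly the
# re-entrancy hazard the cv-parameter tests need to exercise.
first_pass = list(fold_gen)
second_pass = list(fold_gen)  # exhausted: yields nothing
```

This is why the review asks for either a generator expression or an explicit type check: if `KFold.split` ever changed to return a list, a test built on its return value would silently stop exercising the single-use case.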
    list(kf_randomized_iter_wrapped.split(X, y)))
assert_true(np.any(np.array(list(kf_iter_wrapped.split(X, y))) !=
                   np.array(list(kf_randomized_iter_wrapped.split(X, y)))))
np.testing.assert_array_equal(
still don't understand why you've resorted to np.testing.assert_array_equal
np.testing.assert_array_equal handles nested lists unlike sklearn.testing.assert_array_equal ...
Well, I think this should at least be commented on somewhere in the file, if not imported into sklearn.utils.testing and given a different name, or the tests rewritten to do this comparison of nested lists explicitly.
    list(kf_randomized_iter_wrapped.split(X, y)))
assert_true(np.any(np.array(list(kf_iter_wrapped.split(X, y))) !=
                   np.array(list(kf_randomized_iter_wrapped.split(X, y)))))
# numpy's assert_array_equal properly compares nested lists
@jnothman would this suffice? Or would you prefer having this imported into sklearn.utils.testing rather?
I suppose this comment is okay. Ideally I think we want all our asserts to come from one place and have clear naming for where they should be applied.
We say that the splits are different with random_state=None, but that's not tested anywhere, is it?
sklearn/model_selection/_split.py
Outdated
Note
----
Multiple calls to the ``split`` method will not return identical
training or testing sets if ``random_state`` parameter exists and is
Maybe "if splitting is randomized and the random_state parameter is not set to an integer"? I had to check where this was and what it means for a parameter to exist. Also, is that true? This class cannot know how the subclasses treat the random_state.
So maybe rather "this might not be deterministic", which is not very explicit. Maybe describe this in the class docstring for each class and link there?
Maybe it's my lack of coffee, but why does split use _iter_test_masks? Currently we create indices, transform them to booleans and then transform them back to indices. Is it for making the negation easy?
Sorry for coming to this very late. What do you suggest as the right thing to do?
"This *may not* be deterministic if `random_state`, if available, is not explicitly set while initializing the class"
Sounds okay to you?
How about "Randomized CV splitters may return different results for each call of split. This can be avoided (and identical results returned for each split) by setting random_state to an integer."
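The suggested wording can be demonstrated with a toy randomized splitter (a hypothetical helper, not scikit-learn's API): seeding with an integer makes repeated "split" calls identical, while an unseeded splitter generally will not be.

```python
import numpy as np

def toy_shuffled_split(n_samples, random_state=None):
    # Hypothetical stand-in for a randomized CV splitter: shuffle
    # the sample indices and cut them into one train/test split.
    rng = np.random.RandomState(random_state)
    indices = rng.permutation(n_samples)
    half = n_samples // 2
    return indices[:half], indices[half:]

# With random_state fixed to an integer, every call returns the
# same split; with random_state=None the splits generally differ.
a = toy_shuffled_split(10, random_state=0)
b = toy_shuffled_split(10, random_state=0)
np.testing.assert_equal(a, b)
```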
I think this belongs in the narrative docs too.
sklearn/model_selection/_split.py
Outdated
Note
----
Multiple calls to the ``split`` method will not return identical
training or testing sets unless ``random_state`` is set to an integer
I'm not sure what "exists" means here. This class has a random_state parameter.
Btw the if shuffle in the _make_test_fold method seems unnecessary and makes the code harder to follow imho.
Same comment as above
I am really unable to recollect why I did that :@ :@ :@ Arghh. I guess it was done so different split calls would produce same split? But I guess that was vetoed down (see: #7935).
I'll revert for now. I can reintroduce when someone complains.
Well, personally, I think having multiple split calls produce the same split is a better design, but not one that we currently implement.
def _pop_time_keys(cv_results):
    for key in ('mean_fit_time', 'std_fit_time',
                'mean_score_time', 'std_score_time'):
        cv_results.pop(key)
    return cv_results

# Check if generators as supported as cv and that the splits are consistent
np.testing.assert_equal(_pop_time_keys(gs3.cv_results_),
wouldn't it be better to somehow test that the same samples have been used? We could have a model that stores the training data and a scorer that produces a hash of the training data in the model and the test data passed to score? Or is that too hacky for testing?
Yeah that can be done. Maybe I'm being lazy but I feel this is sufficient? I can't see a case where this would pass if different samples were used. The score is quite sensitive to sample order no?
(Especially given that the dataset is make_classification and the estimator is LinearSVC, and not one of our toy estimators which would return 1 / 0 as the score.)
# Check if generators as supported as cv and that the splits are consistent
are supported
89593cd to 91963f4
Thanks
@@ -1075,11 +1076,15 @@ def test_search_cv_results_rank_tie_breaking():
                 cv_results['mean_test_score'][2])
pytest: assert cv_results['mean_test_score'][1] != approx(cv_results['mean_test_score'][2])
:|
How about just: assert_false(np.allclose(cv_results['mean_test_score'][1], cv_results['mean_test_score'][2]))
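Both suggestions encode the same idea — the tie-breaking test should require a genuine score gap, not floating-point equality. A sketch with plain NumPy (the score values below are made up for illustration):

```python
import numpy as np

# Hypothetical mean_test_score column from cv_results_.
mean_test_score = np.array([0.93, 0.90, 0.85])

# Entries at ranks 1 and 2 must be genuinely different scores, so
# the assertion is the negation of a tolerance-based comparison:
# plain `!=` could pass on mere floating-point noise.
assert not np.allclose(mean_test_score[1], mean_test_score[2])

# Whereas a value compared with itself is, of course, close.
assert np.allclose(mean_test_score[1], mean_test_score[1])
```

`pytest.approx` expresses the same tolerance-based inequality once the test suite is on pytest; `np.allclose` avoids adding that dependency here.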
                         shuffle=True, random_state=0).split(X, y),
            GeneratorType))

# Give generator as a cv parameter
If you really want to test that it's a generator, you should confirm (or ensure by using a generator expression) that it is indeed a generator. Otherwise the implementation in KFold may change and this test is no longer doing the right thing.
you haven't addressed this.
There is a test a few lines above (at L1431), which confirms that KFold indeed returns a GeneratorType.
Okay. But then the comment needs to appear before.
Thanks Joel, have addressed your comments
doc/modules/cross_validation.rst
Outdated
@@ -728,6 +728,10 @@ to shuffle the data indices before splitting them. Note that:
* To ensure results are repeatable (*on the same platform*), use a fixed value
  for ``random_state``.

The randomized CV splitters may return different results for each call of
This is described in less clear terms in the preceding bullet point. Please merge them.
sklearn/model_selection/_split.py
Outdated
Note
----
Randomized CV splitters may return different results for each call of
split. This can be avoided (and identical results returned for each
This seems a bit redundant. Maybe "You can make the results identical by setting random_state to an integer"?
This looks good but I haven't double-checked if it addressed all of @jnothman's comments from the original PR.
Done. If you guys are happy, this can be merged now. Unless Travis decides to give me a headache.
Thanks
…ion estimators (scikit-learn#7823)

* DOC Add NOTE that unless random_state is set, split will not be identical
* TST use np.testing.assert_equal for nested lists/arrays
* TST Make sure cv param can be a generator
* DOC rank_ becomes a link when rendered
* Use test_...
* Remove blank line; Add if shuffle is True
* Fix tests
* Explicitly test for GeneratorType
* TST Add the else clause
* TST Add comment on usage of np.testing.assert_array_equal
* TYPO
* MNT Remove if ;
* Address Joel's comments
* merge the identical points in doc
* DOC address Andy's comments
* Move comment to before the check for generator type
TODO

* Document that unless random_state is set, split will not be identical. Using np.testing.assert_equal to test for equality of nested lists fixes that.
* The word rank in the mock table view of cv_results_ is assumed as a link. Fixed in [MRG + 2] ENH Allow cross_val_score, GridSearchCV et al. to evaluate on multiple metrics #7388.

@jnothman @amueller Pl. review :)