BUG: force pipeline steps to be list not a tuple #9604

jorisvandenbossche · 2017-08-22T15:07:01Z

Alternative to #9221. Combined the fix of @agramfort and suggestion of @jnothman. And added a more explicit test for it.

Fixes #9587, closes #9221

What does this implement/fix? Explain your changes.

Previously, passing a tuple as the steps to a Pipeline worked; this broke in 0.19.
Therefore, this PR converts the passed steps to a list.

This modifies an init argument (self.steps). However, the current Pipeline implementation has a design problem: the fitted steps are saved in self.steps and not in self.steps_. Therefore, self.steps needs to be mutable and cannot be a tuple.
The correct fix would be to solve the design problem, for which there is a PR (#8350). This PR provides a smaller temporary fix to undo the regression until the other PR is merged (which will certainly not be in a bugfix release).
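
To illustrate the reasoning above, here is a minimal sketch using a toy class, not scikit-learn itself. `ToyPipeline` and its `convert_to_list` flag are hypothetical names invented for this example; the only thing it assumes from the description is that fitting writes the fitted steps back into `self.steps` in place.

```python
class ToyPipeline:
    """Hypothetical stand-in for Pipeline; this is not scikit-learn code."""

    def __init__(self, steps, convert_to_list=True):
        # The fix described above: coerce the argument to a mutable list.
        # convert_to_list is an illustrative switch, not a real parameter.
        self.steps = list(steps) if convert_to_list else steps

    def fit(self):
        # Mimics saving each fitted step back into self.steps in place.
        for idx, (name, step) in enumerate(self.steps):
            self.steps[idx] = (name, f"fitted({step})")
        return self


steps = (("transf", "Transf"), ("clf", "Clf"))

# Without the conversion, in-place assignment on a tuple fails.
try:
    ToyPipeline(steps, convert_to_list=False).fit()
    tuple_error = None
except TypeError as exc:
    tuple_error = exc

# With the conversion, fitting works and the caller's tuple is untouched.
fitted = ToyPipeline(steps).fit()
```

The sketch also shows the trade-off mentioned above: the object stored on the instance is no longer the object that was passed to the constructor.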

agramfort · 2017-08-22T16:12:23Z

LGTM +1 for MRG

jnothman · 2017-08-22T22:51:09Z

sklearn/tests/test_pipeline.py

+    pipe.fit(X, y=None)
+    pipe.score(X)
+
+    X = np.array([[1, 2]])


Not sure what these two lines are for.

jnothman · 2017-08-22T22:51:38Z

Otherwise LGTM

jnothman

Removed the lines. Will merge on green

jnothman · 2017-08-22T23:57:09Z

Thanks, @jorisvandenbossche

jorisvandenbossche · 2017-08-23T13:27:46Z

sklearn/tests/test_pipeline.py

@@ -215,8 +215,6 @@ def test_pipeline_init_tuple():
    pipe.fit(X, y=None)
    pipe.score(X)

-    X = np.array([[1, 2]])
-    pipe = Pipeline((('transf', Transf()), ('clf', FitParamT())))


The X was indeed redundant, but the redefinition of the pipe is actually to make sure that set_params also works when fit is not yet called.

In the end I put my fix (conversion to list) in the __init__ and not in the fit like Alex did, so it shouldn't matter. But now the test is less robust to a change in the Pipeline implementation.

jnothman · 2017-08-23T21:40:40Z

I'm not convinced initialising it again reduces much risk.


jorisvandenbossche · 2017-08-23T21:48:19Z

If someone were to move the conversion of steps to a list from the init to the fit method (as e.g. validation of parameters often happens in fit and not in init), then set_params would be broken in the specific case that the pipeline was not yet fitted, and the current test would not catch that.
(That's what I meant with "less robust to a change in the Pipeline implementation.")
But given that it is not that likely we will change that (as an actual refactor of Pipeline would ensure steps is not mutated, and doesn't need to be a list), this is not that important :-)
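
A rough sketch of the scenario described here, again with a toy class and not the real Pipeline. `LazyToyPipeline`, `set_step` and `make_pipe` are hypothetical names; `set_step` only stands in for the part of set_params that replaces one step, under the assumption that it assigns into `self.steps` in place.

```python
class LazyToyPipeline:
    """Hypothetical variant that converts steps in fit(), not in __init__."""

    def __init__(self, steps):
        self.steps = steps  # stored as passed, possibly a tuple

    def fit(self):
        # The conversion only happens once fit is called.
        self.steps = list(self.steps)
        return self

    def set_step(self, name, new_step):
        # Hypothetical analogue of set_params replacing one step in place.
        for idx, (step_name, _) in enumerate(self.steps):
            if step_name == name:
                self.steps[idx] = (name, new_step)
        return self


def make_pipe():
    return LazyToyPipeline((("transf", "Transf"), ("clf", "Clf")))


# After fit, steps is a list, so replacing a step works.
fitted_pipe = make_pipe().fit().set_step("transf", None)

# On a fresh, unfitted instance the same call fails. A test that only
# replaces a step after fit would not catch this.
try:
    make_pipe().set_step("transf", None)
    unfitted_error = None
except TypeError as exc:
    unfitted_error = exc
```

This is why re-creating the pipeline before calling set_params exercises a different code path than reusing the fitted one.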

jnothman · 2017-08-23T22:33:37Z

Feel free to change it, but comment clearly what the purpose of the test is. Tests need to be readable.

BUG: force pipeline steps to be list not a tuple

3df9563

jnothman reviewed Aug 22, 2017

Remove redundant lines

49adbd2

jnothman reviewed Aug 22, 2017

jnothman added this to the 0.19.1 milestone Aug 22, 2017

jnothman merged commit aae8700 into scikit-learn:master Aug 22, 2017

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Aug 23, 2017

FIX force pipeline steps to be list not a tuple (scikit-learn#9604)

d5b69d3

jorisvandenbossche commented Aug 23, 2017

AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017

FIX force pipeline steps to be list not a tuple (scikit-learn#9604)

4b99bdf

This was referenced Sep 8, 2017

Converting steps to list breaks pipeline cloning #9715

Closed

[MRG+1] Don't modify steps in Pipeline.__init__ #9716

Merged

maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

FIX force pipeline steps to be list not a tuple (scikit-learn#9604)

8fe1243

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

FIX force pipeline steps to be list not a tuple (scikit-learn#9604)

ad96498
