Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: force pipeline steps to be list not a tuple #9604

Merged
merged 2 commits into from
Aug 22, 2017

Conversation

jorisvandenbossche
Copy link
Member

Alternative to #9221. Combined the fix of @agramfort and suggestion of @jnothman. And added a more explicit test for it.

Fixes #9587, closes #9221

What does this implement/fix? Explain your changes.

Previously passing a tuple as the steps to a Pipeline worked, this broke in 0.19.
Therefore, this PR converts the passed steps to a list.

This is modifying an init argument (self.steps). However, there is currently a problem with the Pipeline implementation that the fitted steps are saved in self.steps and not in self.steps_. Therefore, self.steps needs to be mutable and cannot be a tuple.
The correct fix would be to solve the design problem, for which there is a PR (#8350). This PR provides a smaller temporary fix to undo the regression, until the other PR is merged (which will certainly not be in a bugfix release)

@agramfort
Copy link
Member

LGTM +1 for MRG

pipe.fit(X, y=None)
pipe.score(X)

X = np.array([[1, 2]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what these two lines are for.

@jnothman
Copy link
Member

Otherwise LGTM

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the lines. Will merge on green

@jnothman jnothman added this to the 0.19.1 milestone Aug 22, 2017
@jnothman jnothman merged commit aae8700 into scikit-learn:master Aug 22, 2017
@jnothman
Copy link
Member

Thanks, @jorisvandenbossche

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Aug 23, 2017
@@ -215,8 +215,6 @@ def test_pipeline_init_tuple():
pipe.fit(X, y=None)
pipe.score(X)

X = np.array([[1, 2]])
pipe = Pipeline((('transf', Transf()), ('clf', FitParamT())))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The X was indeed redundant, but the redefinition of the pipe is actually to make sure that set_params also works when fit is not yet called.

In the end I put my fix (conversion to list) in the __init__ and not in the fit like Alex did, so it shouldn't matter. But now the test is less robust to a change in the Pipeline implementation.

@jnothman
Copy link
Member

jnothman commented Aug 23, 2017 via email

@jorisvandenbossche
Copy link
Member Author

If someone would move the conversion of steps to a list from the init to the fit method (as eg validation of parameters often happens in the fit and not init), then set_params will be broken in the specific case that the pipeline was not yet fitted before, and the current test will not catch that.
(that's what I meant with "less robust to a change in the Pipeline implementation.")
But given that is not that likely we will change that (as an actual refactor of Pipeline would ensure steps is not mutated, and doesn't need to be a list), this is not that important :-)

@jnothman
Copy link
Member

jnothman commented Aug 23, 2017 via email

AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regression: Pipelines don't accept steps as a tuple in 0.19
3 participants