Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+3] Add flake8 linting of the diff in Travis #7127

Merged
merged 3 commits into from
Aug 30, 2016

Conversation

lesteve
Copy link
Member

@lesteve lesteve commented Aug 2, 2016

What does this implement/fix? Explain your changes.

This allows to check that PRs do not add obvious flake8 violations. The main things behind it are:

  • find common ancestor between branch and scikit-learn/scikit-learn remote
  • run flake8 --diff on the diff between the branch and the common ancestor

This approach has been used in joblib and nilearn for a little while and works fine.

Additional features:

  • the line numbers in Travis matches the local branch on the PR author machine
  • ./build_tools/travis/flake8_diff.sh can be run locally for quick turn-around

An example output for a PR adding flake 8 violations:
https://travis-ci.org/joblib/joblib/jobs/149138375

Creating a PR was motivated by https://travis-ci.org/joblib/joblib/jobs/149138375. As I said in this issue:

This way we make sure that no obvious flake8 violations are introduced by PRs while not having to fix all the flake8 violations in the entire source tree, which would avoid the main concern of creating conflicts with existing PRs.

ping @agramfort @GaelVaroquaux

# reported by Travis will match with the local code.
BRANCH_NAME=travis_pr_$TRAVIS_PULL_REQUEST
git fetch $REMOTE pull/$TRAVIS_PULL_REQUEST/head:$BRANCH_NAME
git checkout $BRANCH_NAME
Copy link
Member

@jnothman jnothman Aug 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume there are no concurrency issues with modifying the working copy in the current directory? The rest of this script only modifies .git. And I don't actually see why a working copy is needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind these lines were supposed of doing the same as what Travis does by default, but with the branch as it is on the PR author's machine instead of at is is after the merge into master. This way the line numbers on Travis and on the PR author machine match.

I assume there are no concurrency issues with modifying the working copy in the current directory? The rest of this script only modifies .git. And I don't actually see why a working copy is needed.

Note this only happens on Travis (not if you run the command locally) so I am not sure about which concurrency issue you are talking about. It may be true that we may not need a working copy but again this happens only on Travis so is it such a big problem ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a big problem. I suppose Travis can't reuse working copies for the various tests, so I'm thinking nonsense.

@jnothman
Copy link
Member

jnothman commented Aug 2, 2016

I think this is a good idea. Pity flake8 doesn't cover Cython.


# Conservative approach: diff without context so that code that was
# not changed does not create failures
git diff --unified=0 $COMMIT | flake8 --diff --show-source
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this takes care to only do things to .py files. I also assume it does not handle rst, pyx, pxd.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this takes care to only do things to .py files. I also assume it does not handle rst, pyx, pxd.

Only .py files are taken into account indeed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what's happening here. flake8 is only run on the added lines, right? So how can it detect undefined names? Or does it parse the diff and use the whole file as context?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From some tests I ran a long time ago it only uses the diff, I agree that the non PEP8 stuff (e.g. unused variable) is mostly limited to new file (or at least when a significant amount of code is added).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I'm not sure if that is how flake8 --diff is meant to be used? I think you just have to remove the =0 and it does the right thing.

Copy link
Member

@amueller amueller Aug 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok never mind. my test show that your line does the right thing, though I have no idea how it works.

@jnothman
Copy link
Member

jnothman commented Aug 2, 2016

LGTM

@@ -53,11 +53,11 @@ if [[ "$DISTRIB" == "conda" ]]; then
if [[ "$INSTALL_MKL" == "true" ]]; then
conda create -n testenv --yes python=$PYTHON_VERSION pip nose \
numpy=$NUMPY_VERSION scipy=$SCIPY_VERSION numpy scipy \
cython=$CYTHON_VERSION libgfortran mkl
cython=$CYTHON_VERSION libgfortran mkl flake8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if it's worth moving this to the if below because we only need it in the one test? Actually the two kinds of tests have pretty different requirements, right? The flake8 test doesn't even need numpy and scipy I guess? But maybe that doesn't matter so much and we should keep it simple.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you only need the source code, I went for the smallest change but I agree it could be simplified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the concern here is that if flake8, now or in the future, has other dependencies, then we are installing dependencies that may give us false green CI icons in the future. Seems unlikely, but I don't think it's a great hassle to set this up separately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed this: now flake8 is only installed when needed (i.e. "$RUN_FLAKE8" == "true").

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see that in the code. Can you please point it out?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnothman I didn't understand you comment. My concern was more that we are burning travis time by installing numpy, scipy and cython, thought it was a minor concern.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was saying that when we're actually running tests, installing deps that aren't on our deps list may confound test results.

@amueller
Copy link
Member

amueller commented Aug 2, 2016

can we get a make target to run that locally (and add it to the developer docs)?

@@ -0,0 +1,72 @@
#!/bin/bash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the file should start with the explanation of what's happening as you gave in this PR, I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@amueller
Copy link
Member

amueller commented Aug 2, 2016

Why don't we use https://flake8-diff.readthedocs.io/en/latest/ instead? Hm I guess it looks a bit abandoned. But it seems like something that should be factored out into it's own project.... (it sounds like you just copied a non-trivial bash script among 3 projects with no shared tracking).

@lesteve
Copy link
Member Author

lesteve commented Aug 2, 2016

Why don't we use https://flake8-diff.readthedocs.io/en/latest/ instead?

I did not bump into this when I looked for a solution for this problem I have to admit. Will look into it a bit more.

(it sounds like you just copied a non-trivial bash script among 3 projects with no shared tracking).

Yep my brain is the only place the shared tracking is happening at the moment ;-).

@amueller
Copy link
Member

amueller commented Aug 2, 2016

LGTM but I think a standalone solution at some point would be nice ;)
What does the githook in flake8 do?

@lesteve
Copy link
Member Author

lesteve commented Aug 9, 2016

@amueller @jnothman I added some explanation in flake8_diff.sh and a flake8-diff target in the Makefile (with a mention in the developer doc). Anything else to do before it can be merged?

# ancestor
#
# Additional features:
# - the line numbers in Travis matches the local branch on the PR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matches -> match

@NelleV
Copy link
Member

NelleV commented Aug 9, 2016

LGTM

with respect to common ancestor of branch and scikit-learn/scikit-learn
remote.

Add a flake8-diff target in the Makefile and mention it in the doc.
@NelleV
Copy link
Member

NelleV commented Aug 27, 2016

@amueller any more comments on this PR?

@lesteve
Copy link
Member Author

lesteve commented Aug 27, 2016

Thanks @NelleV for bumping this PR for me :)

@jnothman jnothman changed the title [MRG] Add flake8 linting of the diff in Travis [MRG+1] Add flake8 linting of the diff in Travis Aug 27, 2016
@jnothman
Copy link
Member

And yes, thank you, @NelleV, this will be great to have.

so that the flake8 build stands out more on the Travis website.
@jnothman
Copy link
Member

LGTM

@jnothman jnothman changed the title [MRG+1] Add flake8 linting of the diff in Travis [MRG+3] Add flake8 linting of the diff in Travis Aug 30, 2016
@jnothman
Copy link
Member

Now it's time to watch all those PRs get crosses next to them... muhuhahaha

@jnothman jnothman merged commit 4566251 into scikit-learn:master Aug 30, 2016
@jnothman
Copy link
Member

Thanks a lot @lesteve

@lesteve lesteve deleted the add-flake8-diff branch August 30, 2016 07:27
TomDLT pushed a commit to TomDLT/scikit-learn that referenced this pull request Oct 3, 2016
* Add flake8 linting on the diff

with respect to common ancestor of branch and scikit-learn/scikit-learn
remote.

Add a flake8-diff target in the Makefile and mention it in the doc.

* Only install flake8 when needed
python setup.py develop
fi

if [[ "$RUN_FLAKE8" == "true" ]]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @lesteve
I am working on a similar PR for matplotlib, and I was wondering why you used strings and not booleans here?

Copy link
Member Author

@lesteve lesteve Nov 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know nothing about bash booleans in case they do exist. Is there any clear advantage of using them? Also I guess INSTALL_MKL was used like this so I just copied it.

paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
* Add flake8 linting on the diff

with respect to common ancestor of branch and scikit-learn/scikit-learn
remote.

Add a flake8-diff target in the Makefile and mention it in the doc.

* Only install flake8 when needed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants