Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pre-commit config and dev instructions #21583

Merged
merged 1 commit into from
Jan 26, 2022
Merged

Conversation

ianhi
Copy link
Contributor

@ianhi ianhi commented Nov 10, 2021

PR Summary

Add a .pre-commit-config.yaml file for use with pre-commit as well as add instructions for setting it up. When installed this should shorten the feedback loop for catching flake8 errors as it will run every time you attempt to git commit - it also is smart about running only on files staged for commit.

I find this to be very convenient and think it would be nice to include as an option for those who want to use it. But this is certainly something that would be up for discussion.

PR Checklist

  • [NA] Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (run flake8 on changed files to check).
    • I know it is because of pre-commit!
  • New features are documented, with examples if plot related.
  • [i hope.. ] Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • Conforms to Matplotlib style conventions (install flake8-docstrings and run flake8 --docstring-convention=all).

@ianhi
Copy link
Contributor Author

ianhi commented Nov 10, 2021

Example of it running when creating this PR:

image

Copy link
Member

@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding pre-commit checks is fundamentally a good idea!

===========================
You can optionally install `pre-commit <https://pre-commit.com/>`_ hooks.
These will automatically check flake8 and other style issues when you run
`git commit`. To install the hooks ::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`git commit`. To install the hooks ::
``git commit``. To install the hooks ::

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should include instructions for where the yaml is in case folks need to tweak it?

Comment on lines 8 to 12
- repo: https://github.com/myint/autoflake
rev: v1.4
hooks:
- id: autoflake
args: ["--in-place", "--remove-all-unused-imports", "--ignore-init-module-imports", "--remove-unused-variables"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like auto-corrections that go beyond basic stuff like end-fo-file-fixer or trailing-whitespace. Anyway, one cannot apply the unused import detection as is to the current codebase because we have cases like this:

from .textpath import TextPath # Unused, but imported by others.

So let's leave this out.

Copy link
Contributor Author

@ianhi ianhi Nov 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing.

Although note that I think this basically pushes you to to use __all__ or # noqa either of which would prevent that from being removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently do that via F401 excludes in .flake8.

The # noqa comment is quite non-instructive. Technically, something like would work for flake8 but has the disadvantage that lines become too long very quickly and you get line-length-complaints. Also, last time I checked, not all tools can work with this expanded syntax.

from .textpath import TextPath  # Unused, but imported by others. # noqa: F401

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think autocorrections are a good idea either. That seems likely to sow more confusion than it helps. I don't think new contributors find flake8 intimidating, and many IDEs will show you where the flake8 errors are already, so I don't think its too cumbersome to ask users to edit by hand. If this were togglable, maybe that would be OK, but default off?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The # noqa comment is quite non-instructive.

Agreed, my preference would still be to use __all__ = ["TextPath", ...] and re-enable F401. This feels like the correct way to specify the "exports" of a file while also allowing checks for actually unused imports.

.pre-commit-config.yaml Show resolved Hide resolved
@ianhi
Copy link
Contributor Author

ianhi commented Nov 10, 2021

It's also easy to run this automatically on PRs using https://pre-commit.ci/ which I like doing

@tacaswell tacaswell added this to the v3.6.0 milestone Nov 10, 2021
@QuLogic QuLogic requested a review from greglucas November 10, 2021 21:56
Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍, it is a nice-to-have and doesn't force people to use it. Some comments for thought/discussion about expanding this to more things below, but overall I like the idea of adding this.

I'd suggest adding pre-commit to the environment.yml file. Can you even run pre-commit install for users automatically through the conda env create command somehow? That may be getting too invasive, but I feel like it could help users out locally rather than having CI tell them something is wrong and having to search through logs that can be long and cryptic.

There is some conversation going on about detecting added and then removed large files during a PR #21593. Adding the check-added-large-files would prevent me from committing locally, but I'm curious if using the pre-commit.ci would find that file or not? I think another way of phrasing that is: if I push a PR with 20 commits, does pre-commit.ci run on the squashed version, or does it run on every single commit?

For pre-commit.ci, I think running it would be good. I haven't used it before, but I am curious about the auto-fixing whitespace and things for new contributors, as that may make it less intimidating to contribute to the codebase. However, I'm curious if those users would be able to get those pushed changes from CI back into their local copy without going through merge/rebase issues which may actually be worse and more confusing.

.pre-commit-config.yaml Show resolved Hide resolved
.pre-commit-config.yaml Show resolved Hide resolved
@ianhi
Copy link
Contributor Author

ianhi commented Nov 11, 2021

There is some conversation going on about detecting added and then removed large files during a PR #21593. Adding the check-added-large-files would prevent me from committing locally, but I'm curious if using the pre-commit.ci would find that file or not? I think another way of phrasing that is: if I push a PR with 20 commits, does pre-commit.ci run on the squashed version, or does it run on every single commit?

If anyone is inspired to experiment pre-commit.ci is installed on https://github.com/matplotlib/ipympl so a PR there could be used to experiment.

Can you even run pre-commit install for users automatically through the conda env create command somehow?

I have no idea, but that seems a bit invasive unless there was an easy opt out (like a flag in the create command). Though it seems that conda allowing that would be a security risk of some sort so I don't expect it to be possible)

I haven't used it before, but I am curious about the auto-fixing whitespace and things for new contributors, as that may make it less intimidating to contribute to the codebase.

I think this is a really nice benefit, not having to mess around with lint errors. Though the conflicts are definitely an issue, they will either need to pull immediately or force push and communicating that could be tricky - but maybe accomplished with a commenting bot?

Copy link
Member

@jklymak jklymak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a nice idea to save extra hits on CI. I abuse CI as much as anyone, but it actually has a cost, in energy costs if nothing else.

Comment on lines 8 to 12
- repo: https://github.com/myint/autoflake
rev: v1.4
hooks:
- id: autoflake
args: ["--in-place", "--remove-all-unused-imports", "--ignore-init-module-imports", "--remove-unused-variables"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think autocorrections are a good idea either. That seems likely to sow more confusion than it helps. I don't think new contributors find flake8 intimidating, and many IDEs will show you where the flake8 errors are already, so I don't think its too cumbersome to ask users to edit by hand. If this were togglable, maybe that would be OK, but default off?

@ianhi
Copy link
Contributor Author

ianhi commented Nov 11, 2021

So for autofixing we can turn off the autofix functionality of pre-commit.ci using https://pre-commit.ci/#configuration-autofix_prs so it would not push commits, only fail (also looks like you can exclude certain checks) but they would still run locally.

@ianhi
Copy link
Contributor Author

ianhi commented Nov 11, 2021

dding the check-added-large-files would prevent me from committing locally, but I'm curious if using the pre-commit.ci would find that file or not? I think another way of phrasing that is: if I push a PR with 20 commits, does pre-commit.ci run on the squashed version, or does it run on every single commit?

From matplotlib/ipympl#392 it looks like it unfortunately runs on the squashed version

@greglucas
Copy link
Contributor

Thanks for checking on that! So, it looks like we will need that other duplicate file checker PR anyways. And now that I think about it, this test will only check large files, it wouldn't get flagged for a 400kb file added and then removed, which we also don't want to creep in.

For the CI portion, it looks like the CI will only "fix" the PRs? It won't give a comment about why it failed, or does it do that as well? I wonder if we want to have a separate GitHub Actions CI and just run pre-commit on an instance ourselves.
https://pre-commit.com/#github-actions-example
with a --show-diff-on-failure to try and make it easy to point out to new contributors. We could also add in a fail_fast: true to save CI resources if the pre-commit job fails immediately.

All these comments can be added in a follow-up PR too, as they may require more discussion than just adding in the config file and instructions.

@ianhi
Copy link
Contributor Author

ianhi commented Nov 11, 2021

For the CI portion, it looks like the CI will only "fix" the PRs? It won't give a comment about why it failed, or does it do that as well?

I set it so that it won't push commits to PRs unless asked to. It should still run and the details will give a reason for failure. For example see this PR: mpl-extensions/mpl-image-labeller#18 (although that has pushing commits on, but the final commit still fails pre-commit as it doesn't autofix everything)

After clicking through the errors look like this:

image

Compared to the current linting workflow on failure: https://github.com/matplotlib/matplotlib/runs/4159189797?check_suite_focus=true

Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks @ianhi!


pip install pre-commit
pre-commit install

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a sentence about where the config file is so devs can modify their local version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added something, let me know if that's what you were looking for

@ianhi
Copy link
Contributor Author

ianhi commented Nov 12, 2021

When running in ci I believe it will check all files in the repo, not just those changed in the PR (locally it just checks what is being committed). So if this going to be set up in CI we need to get to all green on pre-commit run --all-files. To that end I've added excludes and will push a commit with autofixes

@ianhi
Copy link
Contributor Author

ianhi commented Nov 12, 2021

Currently running pre-commit run --all-files gives:

Check for added large files..............................................Passed
Check docstring is first.................................................Failed
- hook id: check-docstring-first
- exit code: 1

examples/images_contours_and_fields/colormap_normalizations_symlognorm.py:15 Multiple module docstrings (first docstring on line 1).
lib/matplotlib/dates.py:213 Multiple module docstrings (first docstring on line 1).
lib/matplotlib/_constrained_layout.py:27 Multiple module docstrings (first docstring on line 1).
lib/matplotlib/backend_tools.py:992 Multiple module docstrings (first docstring on line 1).

Fix End of Files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing plot_types/basic/README.rst
Fixing plot_types/stats/README.rst
Fixing LICENSE/LICENSE_BAKOMA
Fixing .github/workflows/clean_pr.yml
Fixing doc/api/lines_api.rst
Fixing .github/ISSUE_TEMPLATE/feature_request.yml
Fixing plot_types/arrays/README.rst
Fixing doc/api/transformations.rst
Fixing LICENSE/LICENSE_STIX
Fixing doc/devel/testing.rst
Fixing doc/_static/.gitignore
Fixing LICENSE/LICENSE_COURIERTEN
Fixing lib/matplotlib/tests/README
Fixing lib/matplotlib/backends/web_backend/css/page.css
Fixing plot_types/unstructured/README.rst
Fixing plot_types/README.rst
Fixing doc/devel/style_guide.rst
Fixing LICENSE/LICENSE
Fixing LICENSE/LICENSE_CARLOGO
Fixing doc/api/toolkits/mplot3d/faq.rst
Fixing README.rst
Fixing doc/api/_enums_api.rst

Mixed line ending........................................................Passed
Trim Trailing Whitespace.................................................Failed
- hook id: trailing-whitespace
- exit code: 1
- files were modified by this hook

Fixing doc/users/faq/index.rst
Fixing doc/users/index.rst
Fixing doc/api/next_api_changes/removals/00001-ABC.rst
Fixing doc/users/getting_started/index.rst
Fixing lib/matplotlib/backends/web_backend/css/fbm.css
Fixing doc/README.txt
Fixing doc/api/patches_api.rst
Fixing LICENSE/LICENSE_BAKOMA
Fixing doc/api/tri_api.rst
Fixing doc/api/lines_api.rst
Fixing .github/ISSUE_TEMPLATE/feature_request.yml
Fixing .github/ISSUE_TEMPLATE/documentation.yml
Fixing doc/devel/contributing.rst
Fixing doc/users/next_whats_new/extending_MarkerStyle.rst
Fixing LICENSE/LICENSE_STIX
Fixing doc/api/next_api_changes/behavior/00001-ABC.rst
Fixing .github/ISSUE_TEMPLATE/maintenance.yml
Fixing examples/event_handling/README.txt
Fixing CODE_OF_CONDUCT.md
Fixing doc/api/next_api_changes/development/00001-ABC.rst
Fixing LICENSE/LICENSE_AMSFONTS
Fixing doc/_templates/automodule.rst
Fixing doc/devel/documenting_mpl.rst
Fixing plot_types/unstructured/README.rst
Fixing LICENSE/LICENSE_COLORBREWER
Fixing .github/ISSUE_TEMPLATE/bug_report.yml
Fixing plot_types/README.rst
Fixing doc/users/explain/fonts.rst
Fixing src/_path.h
Fixing doc/users/resources/index.rst
Fixing SECURITY.md
Fixing src/_backend_agg.h

flake8...................................................................Passed

@ianhi
Copy link
Contributor Author

ianhi commented Nov 12, 2021

With the last two commits the only failure is:

Check docstring is first.................................................Failed
- hook id: check-docstring-first
- exit code: 1

lib/matplotlib/_constrained_layout.py:27 Multiple module docstrings (first docstring on line 1).

The other docstring first failures were easy fixes but this one I didn't feel comfortable changing.

@jklymak do you have thoughts on what to do there?

@ianhi
Copy link
Contributor Author

ianhi commented Nov 12, 2021

This gained many files once I tested running pre-commit to get it to a place where everything passes. Happy to drop that commit and/or squash down as seems reasonable

@@ -989,12 +989,10 @@ def trigger(self, *args, **kwargs):
'help': ToolHelpBase,
'copy': ToolCopyToClipboardBase,
}
"""Default tools"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are docstrings; they should probably be comments instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about converting them but they seemed redundant with eachother and with the variable name. Happy to add back though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for removing the docstrings are indeed redundant.

@ianhi
Copy link
Contributor Author

ianhi commented Nov 12, 2021

I'd like to suggest that if pre-commit.ci is considered being added as an integration it is added before this is merged so that everyone can get a feel for it while it is only affecting this PR.

@jklymak
Copy link
Member

jklymak commented Nov 12, 2021

The other docstring first failures were easy fixes but this one I didn't feel comfortable changing.

I am not quite following - _contrained_layout.py doesn't fail our usual flake8 - what are you doing here that is different?

@ianhi
Copy link
Contributor Author

ianhi commented Nov 12, 2021

The other docstring first failures were easy fixes but this one I didn't feel comfortable changing.

I am not quite following - _contrained_layout.py doesn't fail our usual flake8 - what are you doing here that is different?

This is failing the check-docstring-first check because there is a second module level docstring in that file here:

"""
General idea:
-------------

@jklymak
Copy link
Member

jklymak commented Nov 12, 2021

Sure I appreciate that. However I'm not sure that we want to start being more stringent than our current linting. Can this check be implemented without changing our current policy? because our linting policy is a larger conversation.

Copy link
Member

@jklymak jklymak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to block on all the linting changes. If this pr can go in without them, then great I'm totally fine with the convenience. I'm also not necessarily against linting changes but that should be it's own PR and discussion. Thanks!

@jklymak jklymak added the status: needs comment/discussion needs consensus on next step label Nov 13, 2021
@greglucas
Copy link
Contributor

I actually think all of the linting updates look good, it isn't black formatting things ;) For the double docstring I'd just combine it all into one or turn the second one into a massive comment. Since it is a private module, I don't think it gets rendered anywhere in the docs...

I also agree that it would be helpful to separate that commit out into a separate PR. So, turn this into two PRs. A new one with a lot of linting updates, and the other with just the pre-commit configs.

@jklymak jklymak dismissed their stale review November 13, 2021 17:31

unblock

@jklymak
Copy link
Member

jklymak commented Nov 13, 2021

OK, I see, triple quotes are not to be used for in-line comments (I didn't know that before!). I guess I'm fine with that if it is a bonafide python standard, and I'm fine with the triple quotes in _constrained_layout.py being coverted to #.

@ianhi
Copy link
Contributor Author

ianhi commented Nov 13, 2021

I'm going to block on all the linting changes. If this pr can go in without them, then great I'm totally fine with the convenience.

I also agree that it would be helpful to separate that commit out into a separate PR. So, turn this into two PRs. A new one with a lot of linting updates, and the other with just the pre-commit configs.

Sure thing. I added all the changes here because I was concerned about adding an automated check that would fail on some files (or on any CI check) so if this does go in I think it would be nice to also have the linting one go in soon after.

An alternative PR breakup strategy that preserves the always passing of the check could maybe be as follows:
PR 1: This one minus the docstring check (which I think would be the only contentious styling check)
- config commits squashed into one for easy review
PR 2: Adding in potentially more checks (e.g. docstring) and having a longer discussion of lint rules.

But for now (likely tomorrow or monday) I'll break it up as suggested one for config, and one for changes

Update pre-commit hooks

rst format

Disable autofixes on PRs

Unless manually triggered. See https://pre-commit.ci/#configuration

add pre-commit to environment.yml

Slow down autoupdate schedule

alphabetize

Add sentence about where pre-commit config is

Add excludes to pre-commit config

Otherwise end of lines of test images will get fixed,
this also ignores all the prev_whats_new and prev_api_changes
as well as vendored components (agg, gitwash)

pre-comit config
@ianhi
Copy link
Contributor Author

ianhi commented Nov 15, 2021

This PR is now just the config and #21638 is config plus linting changes.

@ianhi
Copy link
Contributor Author

ianhi commented Nov 15, 2021

OK, I see, triple quotes are not to be used for in-line comments (I didn't know that before!). I guess I'm fine with that if it is a bonafide python standard, and I'm fine with the triple quotes in _constrained_layout.py being coverted to #.

I'm not sure that they are "not to be used" more so that you're (mis?)using a language features of defining a multiline string and not assigning it to a variable.

@greglucas
Copy link
Contributor

My approval still stands here.

It looks like we just need to authorize pre-commit.ci on the matplotlib/matplotlib repository if we want to have the CI checks and it appears any maintainer can do that. We should discuss at a dev call whether we want to enable the service generally or not. For now, adding a configuration file shouldn't affect any users unless they are using their own local pre-commit configuration already, so I still think this is a plus and then I can take advantage of it ;)

@ianhi, do you know if you remove the ci: ... from the pre-commit config will that disable the CI checks, or is the only way to turn on/off the pre-commit.ci service to enable/disable on the repository?

@ianhi
Copy link
Contributor Author

ianhi commented Nov 16, 2021

@ianhi, do you know if you remove the ci: ... from the pre-commit config will that disable the CI checks, or is the only way to turn on/off the pre-commit.ci service to enable/disable on the repository?

I'm a little confused exactly what your asking. The ci: ... is currently preventing the CI service pushing commits to PRs but I believe it would still run the checks and give a pass/fail. To turn off the check you need to (dis/en)able on the whole repo.

@greglucas
Copy link
Contributor

To turn off the check you need to (dis/en)able on the whole repo.

👍 that is exactly what I was wondering, thanks!

@greglucas
Copy link
Contributor

@ianhi, I think there was consensus to not use pre-commit.ci for now, since we aren't auto-fixing PRs. Instead, we should replace the current flake8 runner under workflows/reviewdog.yml with the pre-commit hook checks instead (piping the output from pre-commit into reviewdog). Something like: pre-commit run --from-ref origin/HEAD --to-ref HEAD | reviewdog. (I'm not sure if that will work directly because it shows the status of each hook when run, so perhaps there is a better command line option to only get the failure text to pipe into reviewdog)

It looks like you can also cache the pre-commit environment for use next time too: https://pre-commit.com/#github-actions-example

@ianhi
Copy link
Contributor Author

ianhi commented Nov 18, 2021

@ianhi, I think there was consensus to not use pre-commit.ci for now, since we aren't auto-fixing PRs.

Sure thing, though note I'm pretty sure that with the config in this PR even pre-commit.ci will not push autofixes, it would merely report which checks failed and should (I think) be faster than github actions.

@dopplershift
Copy link
Contributor

Right, the thinking was that instead of adding "yet another service", we could get the benefits of using pre-commit just replacing running flake8 in our existing GitHub Actions job with pre-commit. In my experience, the speed of the linting job isn't really important (compared with say tests/docs).

@tacaswell tacaswell merged commit 9679c52 into matplotlib:main Jan 26, 2022
@ianhi
Copy link
Contributor Author

ianhi commented Jan 26, 2022

@tacaswell was that an accidental merge? per #21638 (comment) we may sitll need to remove the "trailing-whitespace" fixer?

@tacaswell
Copy link
Member

Ah, the changes were gone from the diff so I thought it was dealt with.

I was not a fan of seeing all of the thrashing now, but if it comes in as we actually touch those files that seems more OK to me. The worst offenders are the vendored copy of Agg (which we basically never touch so the hooks will not see them, and if we do it will be by someone who knows how to turn them off ;) ).

I can see the case for adding rst to the ignored files on the trailing whitespace, but lets see how much we actually run into that first?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: needs comment/discussion needs consensus on next step
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants