Add pre-commit config and dev instructions #21583

ianhi · 2021-11-10T16:53:51Z

PR Summary

Add a .pre-commit-config.yaml file for use with pre-commit as well as add instructions for setting it up. When installed this should shorten the feedback loop for catching flake8 errors as it will run every time you attempt to git commit - it also is smart about running only on files staged for commit.

I find this to be very convenient and think it would be nice to include as an option for those who want to use it. But this is certainly something that would be up for discussion.

PR Checklist

[NA] Has pytest style unit tests (and pytest passes).
Is Flake 8 compliant (run flake8 on changed files to check).
- I know it is because of pre-commit!
New features are documented, with examples if plot related.
[i hope.. ] Documentation is sphinx and numpydoc compliant (the docs should build without error).
Conforms to Matplotlib style conventions (install flake8-docstrings and run flake8 --docstring-convention=all).

ianhi · 2021-11-10T16:57:06Z

Example of it running when creating this PR:

timhoffm

Adding pre-commit checks is fundamentally a good idea!

timhoffm · 2021-11-10T21:06:12Z

doc/devel/development_setup.rst

+===========================
+You can optionally install `pre-commit <https://pre-commit.com/>`_ hooks.
+These will automatically check flake8 and other style issues when you run
+`git commit`. To install the hooks ::


Suggested change

`git commit`. To install the hooks ::

``git commit``. To install the hooks ::

Probably should include instructions for where the yaml is in case folks need to tweak it?

timhoffm · 2021-11-10T21:21:39Z

.pre-commit-config.yaml

+  - repo: https://github.com/myint/autoflake
+    rev: v1.4
+    hooks:
+      - id: autoflake
+        args: ["--in-place", "--remove-all-unused-imports", "--ignore-init-module-imports", "--remove-unused-variables"]


I don't like auto-corrections that go beyond basic stuff like end-fo-file-fixer or trailing-whitespace. Anyway, one cannot apply the unused import detection as is to the current codebase because we have cases like this:

matplotlib/lib/matplotlib/text.py

Line 16 in c12791c

from .textpath import TextPath # Unused, but imported by others.

So let's leave this out.

Sure thing.

Although note that I think this basically pushes you to to use __all__ or # noqa either of which would prevent that from being removed.

We currently do that via F401 excludes in .flake8.

The # noqa comment is quite non-instructive. Technically, something like would work for flake8 but has the disadvantage that lines become too long very quickly and you get line-length-complaints. Also, last time I checked, not all tools can work with this expanded syntax.

from .textpath import TextPath # Unused, but imported by others. # noqa: F401

I don't think autocorrections are a good idea either. That seems likely to sow more confusion than it helps. I don't think new contributors find flake8 intimidating, and many IDEs will show you where the flake8 errors are already, so I don't think its too cumbersome to ask users to edit by hand. If this were togglable, maybe that would be OK, but default off?

The # noqa comment is quite non-instructive.

Agreed, my preference would still be to use __all__ = ["TextPath", ...] and re-enable F401. This feels like the correct way to specify the "exports" of a file while also allowing checks for actually unused imports.

.pre-commit-config.yaml

ianhi · 2021-11-10T21:40:54Z

It's also easy to run this automatically on PRs using https://pre-commit.ci/ which I like doing

greglucas

👍, it is a nice-to-have and doesn't force people to use it. Some comments for thought/discussion about expanding this to more things below, but overall I like the idea of adding this.

I'd suggest adding pre-commit to the environment.yml file. Can you even run pre-commit install for users automatically through the conda env create command somehow? That may be getting too invasive, but I feel like it could help users out locally rather than having CI tell them something is wrong and having to search through logs that can be long and cryptic.

There is some conversation going on about detecting added and then removed large files during a PR #21593. Adding the check-added-large-files would prevent me from committing locally, but I'm curious if using the pre-commit.ci would find that file or not? I think another way of phrasing that is: if I push a PR with 20 commits, does pre-commit.ci run on the squashed version, or does it run on every single commit?

For pre-commit.ci, I think running it would be good. I haven't used it before, but I am curious about the auto-fixing whitespace and things for new contributors, as that may make it less intimidating to contribute to the codebase. However, I'm curious if those users would be able to get those pushed changes from CI back into their local copy without going through merge/rebase issues which may actually be worse and more confusing.

.pre-commit-config.yaml

ianhi · 2021-11-11T05:10:59Z

There is some conversation going on about detecting added and then removed large files during a PR #21593. Adding the check-added-large-files would prevent me from committing locally, but I'm curious if using the pre-commit.ci would find that file or not? I think another way of phrasing that is: if I push a PR with 20 commits, does pre-commit.ci run on the squashed version, or does it run on every single commit?

If anyone is inspired to experiment pre-commit.ci is installed on https://github.com/matplotlib/ipympl so a PR there could be used to experiment.

Can you even run pre-commit install for users automatically through the conda env create command somehow?

I have no idea, but that seems a bit invasive unless there was an easy opt out (like a flag in the create command). Though it seems that conda allowing that would be a security risk of some sort so I don't expect it to be possible)

I haven't used it before, but I am curious about the auto-fixing whitespace and things for new contributors, as that may make it less intimidating to contribute to the codebase.

I think this is a really nice benefit, not having to mess around with lint errors. Though the conflicts are definitely an issue, they will either need to pull immediately or force push and communicating that could be tricky - but maybe accomplished with a commenting bot?

jklymak

This seems a nice idea to save extra hits on CI. I abuse CI as much as anyone, but it actually has a cost, in energy costs if nothing else.

jklymak · 2021-11-11T14:54:34Z

.pre-commit-config.yaml

+  - repo: https://github.com/myint/autoflake
+    rev: v1.4
+    hooks:
+      - id: autoflake
+        args: ["--in-place", "--remove-all-unused-imports", "--ignore-init-module-imports", "--remove-unused-variables"]


I don't think autocorrections are a good idea either. That seems likely to sow more confusion than it helps. I don't think new contributors find flake8 intimidating, and many IDEs will show you where the flake8 errors are already, so I don't think its too cumbersome to ask users to edit by hand. If this were togglable, maybe that would be OK, but default off?

ianhi · 2021-11-11T16:17:18Z

So for autofixing we can turn off the autofix functionality of pre-commit.ci using https://pre-commit.ci/#configuration-autofix_prs so it would not push commits, only fail (also looks like you can exclude certain checks) but they would still run locally.

ianhi · 2021-11-11T16:35:38Z

dding the check-added-large-files would prevent me from committing locally, but I'm curious if using the pre-commit.ci would find that file or not? I think another way of phrasing that is: if I push a PR with 20 commits, does pre-commit.ci run on the squashed version, or does it run on every single commit?

From matplotlib/ipympl#392 it looks like it unfortunately runs on the squashed version

greglucas · 2021-11-11T17:34:16Z

Thanks for checking on that! So, it looks like we will need that other duplicate file checker PR anyways. And now that I think about it, this test will only check large files, it wouldn't get flagged for a 400kb file added and then removed, which we also don't want to creep in.

For the CI portion, it looks like the CI will only "fix" the PRs? It won't give a comment about why it failed, or does it do that as well? I wonder if we want to have a separate GitHub Actions CI and just run pre-commit on an instance ourselves.
https://pre-commit.com/#github-actions-example
with a --show-diff-on-failure to try and make it easy to point out to new contributors. We could also add in a fail_fast: true to save CI resources if the pre-commit job fails immediately.

All these comments can be added in a follow-up PR too, as they may require more discussion than just adding in the config file and instructions.

ianhi · 2021-11-11T18:25:41Z

For the CI portion, it looks like the CI will only "fix" the PRs? It won't give a comment about why it failed, or does it do that as well?

I set it so that it won't push commits to PRs unless asked to. It should still run and the details will give a reason for failure. For example see this PR: mpl-extensions/mpl-image-labeller#18 (although that has pushing commits on, but the final commit still fails pre-commit as it doesn't autofix everything)

After clicking through the errors look like this:

Compared to the current linting workflow on failure: https://github.com/matplotlib/matplotlib/runs/4159189797?check_suite_focus=true

greglucas

Looks good to me, thanks @ianhi!

jklymak · 2021-11-12T15:25:40Z

doc/devel/development_setup.rst

+
+    pip install pre-commit
+    pre-commit install
+


Can you add a sentence about where the config file is so devs can modify their local version?

I added something, let me know if that's what you were looking for

ianhi · 2021-11-12T16:54:28Z

When running in ci I believe it will check all files in the repo, not just those changed in the PR (locally it just checks what is being committed). So if this going to be set up in CI we need to get to all green on pre-commit run --all-files. To that end I've added excludes and will push a commit with autofixes

ianhi · 2021-11-12T16:55:12Z

Currently running pre-commit run --all-files gives:

Check for added large files..............................................Passed
Check docstring is first.................................................Failed
- hook id: check-docstring-first
- exit code: 1

examples/images_contours_and_fields/colormap_normalizations_symlognorm.py:15 Multiple module docstrings (first docstring on line 1).
lib/matplotlib/dates.py:213 Multiple module docstrings (first docstring on line 1).
lib/matplotlib/_constrained_layout.py:27 Multiple module docstrings (first docstring on line 1).
lib/matplotlib/backend_tools.py:992 Multiple module docstrings (first docstring on line 1).

Fix End of Files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing plot_types/basic/README.rst
Fixing plot_types/stats/README.rst
Fixing LICENSE/LICENSE_BAKOMA
Fixing .github/workflows/clean_pr.yml
Fixing doc/api/lines_api.rst
Fixing .github/ISSUE_TEMPLATE/feature_request.yml
Fixing plot_types/arrays/README.rst
Fixing doc/api/transformations.rst
Fixing LICENSE/LICENSE_STIX
Fixing doc/devel/testing.rst
Fixing doc/_static/.gitignore
Fixing LICENSE/LICENSE_COURIERTEN
Fixing lib/matplotlib/tests/README
Fixing lib/matplotlib/backends/web_backend/css/page.css
Fixing plot_types/unstructured/README.rst
Fixing plot_types/README.rst
Fixing doc/devel/style_guide.rst
Fixing LICENSE/LICENSE
Fixing LICENSE/LICENSE_CARLOGO
Fixing doc/api/toolkits/mplot3d/faq.rst
Fixing README.rst
Fixing doc/api/_enums_api.rst

Mixed line ending........................................................Passed
Trim Trailing Whitespace.................................................Failed
- hook id: trailing-whitespace
- exit code: 1
- files were modified by this hook

Fixing doc/users/faq/index.rst
Fixing doc/users/index.rst
Fixing doc/api/next_api_changes/removals/00001-ABC.rst
Fixing doc/users/getting_started/index.rst
Fixing lib/matplotlib/backends/web_backend/css/fbm.css
Fixing doc/README.txt
Fixing doc/api/patches_api.rst
Fixing LICENSE/LICENSE_BAKOMA
Fixing doc/api/tri_api.rst
Fixing doc/api/lines_api.rst
Fixing .github/ISSUE_TEMPLATE/feature_request.yml
Fixing .github/ISSUE_TEMPLATE/documentation.yml
Fixing doc/devel/contributing.rst
Fixing doc/users/next_whats_new/extending_MarkerStyle.rst
Fixing LICENSE/LICENSE_STIX
Fixing doc/api/next_api_changes/behavior/00001-ABC.rst
Fixing .github/ISSUE_TEMPLATE/maintenance.yml
Fixing examples/event_handling/README.txt
Fixing CODE_OF_CONDUCT.md
Fixing doc/api/next_api_changes/development/00001-ABC.rst
Fixing LICENSE/LICENSE_AMSFONTS
Fixing doc/_templates/automodule.rst
Fixing doc/devel/documenting_mpl.rst
Fixing plot_types/unstructured/README.rst
Fixing LICENSE/LICENSE_COLORBREWER
Fixing .github/ISSUE_TEMPLATE/bug_report.yml
Fixing plot_types/README.rst
Fixing doc/users/explain/fonts.rst
Fixing src/_path.h
Fixing doc/users/resources/index.rst
Fixing SECURITY.md
Fixing src/_backend_agg.h

flake8...................................................................Passed

ianhi · 2021-11-12T17:05:25Z

With the last two commits the only failure is:

Check docstring is first.................................................Failed
- hook id: check-docstring-first
- exit code: 1

lib/matplotlib/_constrained_layout.py:27 Multiple module docstrings (first docstring on line 1).

The other docstring first failures were easy fixes but this one I didn't feel comfortable changing.

@jklymak do you have thoughts on what to do there?

ianhi · 2021-11-12T17:09:09Z

This gained many files once I tested running pre-commit to get it to a place where everything passes. Happy to drop that commit and/or squash down as seems reasonable

examples/images_contours_and_fields/colormap_normalizations_symlognorm.py

QuLogic · 2021-11-12T21:22:43Z

lib/matplotlib/backend_tools.py

@@ -989,12 +989,10 @@ def trigger(self, *args, **kwargs):
                 'help': ToolHelpBase,
                 'copy': ToolCopyToClipboardBase,
                 }
-"""Default tools"""


These are docstrings; they should probably be comments instead.

I thought about converting them but they seemed redundant with eachother and with the variable name. Happy to add back though

+1 for removing the docstrings are indeed redundant.

ianhi · 2021-11-12T21:55:10Z

I'd like to suggest that if pre-commit.ci is considered being added as an integration it is added before this is merged so that everyone can get a feel for it while it is only affecting this PR.

jklymak · 2021-11-12T22:23:56Z

The other docstring first failures were easy fixes but this one I didn't feel comfortable changing.

I am not quite following - _contrained_layout.py doesn't fail our usual flake8 - what are you doing here that is different?

ianhi · 2021-11-12T22:45:02Z

The other docstring first failures were easy fixes but this one I didn't feel comfortable changing.

I am not quite following - _contrained_layout.py doesn't fail our usual flake8 - what are you doing here that is different?

This is failing the check-docstring-first check because there is a second module level docstring in that file here:

matplotlib/lib/matplotlib/_constrained_layout.py

Lines 27 to 29 in f93c0a3

    
           """ 
        
           General idea: 
        
           -------------

jklymak · 2021-11-12T22:55:23Z

Sure I appreciate that. However I'm not sure that we want to start being more stringent than our current linting. Can this check be implemented without changing our current policy? because our linting policy is a larger conversation.

jklymak

I'm going to block on all the linting changes. If this pr can go in without them, then great I'm totally fine with the convenience. I'm also not necessarily against linting changes but that should be it's own PR and discussion. Thanks!

greglucas · 2021-11-13T15:54:17Z

I actually think all of the linting updates look good, it isn't black formatting things ;) For the double docstring I'd just combine it all into one or turn the second one into a massive comment. Since it is a private module, I don't think it gets rendered anywhere in the docs...

I also agree that it would be helpful to separate that commit out into a separate PR. So, turn this into two PRs. A new one with a lot of linting updates, and the other with just the pre-commit configs.

unblock

jklymak · 2021-11-13T17:34:22Z

OK, I see, triple quotes are not to be used for in-line comments (I didn't know that before!). I guess I'm fine with that if it is a bonafide python standard, and I'm fine with the triple quotes in _constrained_layout.py being coverted to #.

ianhi · 2021-11-13T23:23:31Z

I'm going to block on all the linting changes. If this pr can go in without them, then great I'm totally fine with the convenience.

I also agree that it would be helpful to separate that commit out into a separate PR. So, turn this into two PRs. A new one with a lot of linting updates, and the other with just the pre-commit configs.

Sure thing. I added all the changes here because I was concerned about adding an automated check that would fail on some files (or on any CI check) so if this does go in I think it would be nice to also have the linting one go in soon after.

An alternative PR breakup strategy that preserves the always passing of the check could maybe be as follows:
PR 1: This one minus the docstring check (which I think would be the only contentious styling check)
- config commits squashed into one for easy review
PR 2: Adding in potentially more checks (e.g. docstring) and having a longer discussion of lint rules.

But for now (likely tomorrow or monday) I'll break it up as suggested one for config, and one for changes

Update pre-commit hooks rst format Disable autofixes on PRs Unless manually triggered. See https://pre-commit.ci/#configuration add pre-commit to environment.yml Slow down autoupdate schedule alphabetize Add sentence about where pre-commit config is Add excludes to pre-commit config Otherwise end of lines of test images will get fixed, this also ignores all the prev_whats_new and prev_api_changes as well as vendored components (agg, gitwash) pre-comit config

ianhi · 2021-11-15T06:09:21Z

This PR is now just the config and #21638 is config plus linting changes.

ianhi · 2021-11-15T06:12:19Z

OK, I see, triple quotes are not to be used for in-line comments (I didn't know that before!). I guess I'm fine with that if it is a bonafide python standard, and I'm fine with the triple quotes in _constrained_layout.py being coverted to #.

I'm not sure that they are "not to be used" more so that you're (mis?)using a language features of defining a multiline string and not assigning it to a variable.

greglucas · 2021-11-16T00:31:49Z

My approval still stands here.

It looks like we just need to authorize pre-commit.ci on the matplotlib/matplotlib repository if we want to have the CI checks and it appears any maintainer can do that. We should discuss at a dev call whether we want to enable the service generally or not. For now, adding a configuration file shouldn't affect any users unless they are using their own local pre-commit configuration already, so I still think this is a plus and then I can take advantage of it ;)

@ianhi, do you know if you remove the ci: ... from the pre-commit config will that disable the CI checks, or is the only way to turn on/off the pre-commit.ci service to enable/disable on the repository?

ianhi · 2021-11-16T00:38:04Z

@ianhi, do you know if you remove the ci: ... from the pre-commit config will that disable the CI checks, or is the only way to turn on/off the pre-commit.ci service to enable/disable on the repository?

I'm a little confused exactly what your asking. The ci: ... is currently preventing the CI service pushing commits to PRs but I believe it would still run the checks and give a pass/fail. To turn off the check you need to (dis/en)able on the whole repo.

greglucas · 2021-11-16T00:44:22Z

To turn off the check you need to (dis/en)able on the whole repo.

👍 that is exactly what I was wondering, thanks!

greglucas · 2021-11-18T21:36:09Z

@ianhi, I think there was consensus to not use pre-commit.ci for now, since we aren't auto-fixing PRs. Instead, we should replace the current flake8 runner under workflows/reviewdog.yml with the pre-commit hook checks instead (piping the output from pre-commit into reviewdog). Something like: pre-commit run --from-ref origin/HEAD --to-ref HEAD | reviewdog. (I'm not sure if that will work directly because it shows the status of each hook when run, so perhaps there is a better command line option to only get the failure text to pipe into reviewdog)

It looks like you can also cache the pre-commit environment for use next time too: https://pre-commit.com/#github-actions-example

ianhi · 2021-11-18T21:42:17Z

@ianhi, I think there was consensus to not use pre-commit.ci for now, since we aren't auto-fixing PRs.

Sure thing, though note I'm pretty sure that with the config in this PR even pre-commit.ci will not push autofixes, it would merely report which checks failed and should (I think) be faster than github actions.

dopplershift · 2021-11-18T23:16:31Z

Right, the thinking was that instead of adding "yet another service", we could get the benefits of using pre-commit just replacing running flake8 in our existing GitHub Actions job with pre-commit. In my experience, the speed of the linting job isn't really important (compared with say tests/docs).

ianhi · 2022-01-26T22:06:07Z

@tacaswell was that an accidental merge? per #21638 (comment) we may sitll need to remove the "trailing-whitespace" fixer?

tacaswell · 2022-01-27T01:53:58Z

Ah, the changes were gone from the diff so I thought it was dealt with.

I was not a fan of seeing all of the thrashing now, but if it comes in as we actually touch those files that seems more OK to me. The worst offenders are the vendored copy of Agg (which we basically never touch so the hooks will not see them, and if we do it will be by someone who knows how to turn them off ;) ).

I can see the case for adding rst to the ignored files on the trailing whitespace, but lets see how much we actually run into that first?

timhoffm reviewed Nov 10, 2021

View reviewed changes

tacaswell added this to the v3.6.0 milestone Nov 10, 2021

QuLogic requested a review from greglucas November 10, 2021 21:56

greglucas reviewed Nov 11, 2021

View reviewed changes

.pre-commit-config.yaml Show resolved Hide resolved

.pre-commit-config.yaml Show resolved Hide resolved

jklymak reviewed Nov 11, 2021

View reviewed changes

ianhi mentioned this pull request Nov 11, 2021

Test PR of pre-commit matplotlib/ipympl#392

Closed

greglucas approved these changes Nov 12, 2021

View reviewed changes

jklymak reviewed Nov 12, 2021

View reviewed changes

greglucas mentioned this pull request Nov 12, 2021

DOC: Add conda environment instructions to developers guide #21611

Closed

QuLogic reviewed Nov 12, 2021

View reviewed changes

ianhi force-pushed the pre-commit branch from 99720c8 to 8f3c732 Compare November 12, 2021 21:55

jklymak previously requested changes Nov 12, 2021

View reviewed changes

jklymak added the status: needs comment/discussion needs consensus on next step label Nov 13, 2021

ianhi force-pushed the pre-commit branch from ca0dab1 to c2729bd Compare November 15, 2021 06:01

ianhi mentioned this pull request Nov 15, 2021

Pre-commit config + fixes #21638

Closed

3 tasks

tacaswell merged commit 9679c52 into matplotlib:main Jan 26, 2022

This was referenced Mar 31, 2022

[MNT]: Add codespell to pre-commit hooks #22740

Closed

DEV: Add codespell to pre-commit hooks #22777

Merged

matthewfeickert mentioned this pull request Apr 8, 2022

STY: Apply pre-commit hooks to codebase #22809

Merged

3 tasks

	`git commit`. To install the hooks ::
	``git commit``. To install the hooks ::

Add pre-commit config and dev instructions #21583

Add pre-commit config and dev instructions #21583

Conversation

ianhi commented Nov 10, 2021

PR Summary

PR Checklist

ianhi commented Nov 10, 2021

timhoffm left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ianhi Nov 10, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ianhi commented Nov 10, 2021

greglucas left a comment

Choose a reason for hiding this comment

ianhi commented Nov 11, 2021

jklymak left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ianhi commented Nov 11, 2021

ianhi commented Nov 11, 2021

greglucas commented Nov 11, 2021

ianhi commented Nov 11, 2021 • edited Loading

greglucas left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ianhi commented Nov 12, 2021

ianhi commented Nov 12, 2021

ianhi commented Nov 12, 2021 • edited Loading

ianhi commented Nov 12, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ianhi commented Nov 12, 2021

jklymak commented Nov 12, 2021

ianhi commented Nov 12, 2021

jklymak commented Nov 12, 2021

jklymak left a comment

Choose a reason for hiding this comment

greglucas commented Nov 13, 2021

jklymak commented Nov 13, 2021

ianhi commented Nov 13, 2021

ianhi commented Nov 15, 2021

ianhi commented Nov 15, 2021

greglucas commented Nov 16, 2021

ianhi commented Nov 16, 2021

greglucas commented Nov 16, 2021

greglucas commented Nov 18, 2021

ianhi commented Nov 18, 2021 • edited Loading

dopplershift commented Nov 18, 2021

ianhi commented Jan 26, 2022

tacaswell commented Jan 27, 2022

ianhi Nov 10, 2021 •

edited

Loading

ianhi commented Nov 11, 2021 •

edited

Loading

ianhi commented Nov 12, 2021 •

edited

Loading

ianhi commented Nov 18, 2021 •

edited

Loading