Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Alternative Schema Locations #1753

Merged
merged 30 commits into from
Mar 23, 2022
Merged

feat: Alternative Schema Locations #1753

merged 30 commits into from
Mar 23, 2022

Conversation

kratsg
Copy link
Contributor

@kratsg kratsg commented Jan 24, 2022

Description

Resolves #1609.

Refactor:

  • migrate from pkg_resources to importlib. See migration guide and details about importlib.
  • load_schema will handle functionality for finding the schema locations locally in multiple places
  • load_schema will not handle version identification (that is done through validate only)

Feature:

  • support overriding the paths for finding schemas, using the pyhf installed location as a base via pyhf.utils.schemas
  • support for offline access to currently supported version of schemas (via extra load_schema commands)
  • SchemaNotFound exception for when a schema cannot be found locally (outside of jsonschema.RefResolver calls)

Example Usage:

import pyhf, pathlib, json
pyhf.utils.schemas = pathlib.Path('/Users/kratsg/pyhf/myschemas')
ws = pyhf.Workspace(json.load(open('myschemas/custom.json')))

and then additionally, a nice benefit (maybe?) we get for free is that code which generates against a different schema won't work and fails more loudly (where it might have been more silent in the past).

>>> pyhf.simplemodels.uncorrelated_background([10], [5], [1]).spec
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/kratsg/.pyenv/versions/pyhf-dev/lib/python3.8/site-packages/pyhf/simplemodels.py", line 141, in uncorrelated_background
    return Model(spec, batch_size=batch_size)
  File "/Users/kratsg/.pyenv/versions/pyhf-dev/lib/python3.8/site-packages/pyhf/pdf.py", line 669, in __init__
    utils.validate(self.spec, self.schema, version=self.version)
  File "/Users/kratsg/.pyenv/versions/pyhf-dev/lib/python3.8/site-packages/pyhf/utils.py", line 74, in validate
    schema = load_schema(f'{version}/{schema_name}')
  File "/Users/kratsg/.pyenv/versions/pyhf-dev/lib/python3.8/site-packages/pyhf/utils.py", line 53, in load_schema
    raise pyhf.exceptions.SchemaNotFound(
pyhf.exceptions.SchemaNotFound: The schema 1.0.0/model.json was not found. Do you have the right version or the right path? /Users/kratsg/pyhf/myschemas/1.0.0/model.json

Documentation: https://pyhf--1753.org.readthedocs.build/en/1753/api.html#schema

Checklist Before Requesting Reviewer

  • Tests are passing
  • "WIP" removed from the title of the pull request
  • Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

  • Summarize commit messages into a comprehensive review of the PR
- Refactors
  * Migrate from `pkg_resources` to `importlib`. See importlib's migration guide.
    https://importlib-resources.readthedocs.io/en/latest/migration.html
  * Add importlib_resources as core dependency for Pythons older than 3.9.
  * `load_schema` will handle functionality for finding the schema locations
    locally in multiple places.
  * `load_schema` will not handle `version` identification (that is done through
    `validate` only).

- Features
  * Support overriding the paths for finding schemas, using the `pyhf` installed
    location as a base via `pyhf.schema.variables.path`.
  * Add support for offline access to currently supported version of schemas
    (via extra `load_schema` commands).
  * Add `SchemaNotFound` exception for when a schema cannot be found locally
    (outside of `jsonschema.RefResolver` calls)

- Python API
  * `pyhf.schema` introduced as a new API for all things schema-related.
  * `pyhf.schema.version`, `pyhf.schema.path`, `pyhf.schema.load_schema`, and
    `pyhf.schema.validate` are migrated over from `pyhf.utils`.

@codecov
Copy link

codecov bot commented Jan 24, 2022

Codecov Report

Merging #1753 (8b38df6) into master (1884c6c) will decrease coverage by 0.12%.
The diff coverage is 92.59%.

@@            Coverage Diff             @@
##           master    #1753      +/-   ##
==========================================
- Coverage   98.14%   98.02%   -0.13%     
==========================================
  Files          64       68       +4     
  Lines        4270     4310      +40     
  Branches      719      725       +6     
==========================================
+ Hits         4191     4225      +34     
- Misses         46       49       +3     
- Partials       33       36       +3     
Flag Coverage Δ
contrib 26.42% <49.38%> (+0.17%) ⬆️
doctest 60.74% <82.71%> (+0.13%) ⬆️
unittests 96.03% <92.59%> (-0.11%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/pyhf/utils.py 89.47% <71.42%> (-7.31%) ⬇️
src/pyhf/schema/variables.py 75.00% <75.00%> (ø)
src/pyhf/schema/loader.py 90.47% <90.47%> (ø)
src/pyhf/__init__.py 100.00% <100.00%> (ø)
src/pyhf/exceptions/__init__.py 100.00% <100.00%> (ø)
src/pyhf/patchset.py 100.00% <100.00%> (ø)
src/pyhf/pdf.py 97.80% <100.00%> (ø)
src/pyhf/readxml.py 94.37% <100.00%> (ø)
src/pyhf/schema/__init__.py 100.00% <100.00%> (ø)
src/pyhf/schema/validator.py 100.00% <100.00%> (ø)
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1884c6c...8b38df6. Read the comment docs.

Copy link
Member

@matthewfeickert matthewfeickert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kratsg Just a question, the general idea of

import pyhf, pathlib, json
pyhf.utils.schemas = pathlib.Path('/Users/kratsg/pyhf/myschemas')
ws = pyhf.Workspace(json.load(open('myschemas/custom.json')))

of being able to update the schema path seems fine, but instead of having users overwriting the value by assignment, do you think there is value in offering a pyhf.utils.update_schema_path or something that performs this operation through the API?

Other than that things seem to be looking good here I think. Though maybe worth also asking @alexander-held and maybe @lhenkelm as they asked for this in Issue #1609, and so might have some thoughts.

src/pyhf/utils.py Outdated Show resolved Hide resolved
@lhenkelm
Copy link
Contributor

lhenkelm commented Feb 1, 2022

instead of having users overwriting the value by assignment, do you think there is value in offering a pyhf.utils.update_schema_path or something that performs this operation through the API?

Thanks @kratsg for taking up my suggestion!

I second @matthewfeickert's suggestion to change schemas via function call rather than assignment. It is much simpler to add functionality to a function if it ever becomes necessary, than to replace an entire existing module with a class instance with a property.

I'd even go so far to say that a context manager doubling as an update function would be ideal IMO:

# in pyhf.utils
_SCHEMAS = Path(...)
class use_schema_path: # snake_case to remind of function-like usage
    def  __init__(self, path):
        global _SCHEMAS
        self._old_schemas = _SCHEMAS
        _SCHEMAS = pathlib.Path(path)
    def __enter__(self):
        pass
    def __exit__(self, *args, **kwargs):
        global _SCHEMAS
        _SCHEMAS = self._old_schemas

which can still be called as a function (only executing __init__), so short scripts etc. are not forced to use with blocks.
But it can also be used like so:

def make_my_workspace(spec):
    with pyhf.utils.use_schema_path('/my/very/special/schemas'):
        return pyhf.Workspace(spec)

So as a user writing code on top of pyhf, I don't have to worry about resesetting the global variable, the CM does it for me, and there are fewer mistakes to make.

@kratsg kratsg force-pushed the feat/altSchemaPaths branch from 7f2fae3 to 5148451 Compare March 22, 2022 15:10
@kratsg kratsg marked this pull request as ready for review March 22, 2022 17:44
@kratsg
Copy link
Contributor Author

kratsg commented Mar 22, 2022

So as a user writing code on top of pyhf, I don't have to worry about resesetting the global variable, the CM does it for me, and there are fewer mistakes to make.

This is filed as #1815 so this PR isn't overtly complicated. The API for this PR will be pyhf.schema(new_path: pathlib.Path)

@kratsg kratsg requested a review from lukasheinrich March 22, 2022 17:57
Copy link
Member

@matthewfeickert matthewfeickert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this nice work @kratsg. Overall it looks really solid 👍 and stepping through it with you annotations (thanks! 🙏) I see that it isn't really that bad at all (just looks like a lot).

I have a few super minor suggestions that are more just around testing and adding notes for ourselves in the future and then just one question (though it might not even be necessary if my question comes from me being confused).

setup.cfg Outdated Show resolved Hide resolved
src/pyhf/schema/loader.py Outdated Show resolved Hide resolved
src/pyhf/schema/variables.py Outdated Show resolved Hide resolved
src/pyhf/schema/loader.py Show resolved Hide resolved
src/pyhf/schema/validator.py Show resolved Hide resolved
src/pyhf/utils.py Outdated Show resolved Hide resolved
kratsg and others added 5 commits March 23, 2022 02:43
Co-authored-by: Matthew Feickert <matthew.feickert@cern.ch>
Co-authored-by: Matthew Feickert <matthew.feickert@cern.ch>
Co-authored-by: Matthew Feickert <matthew.feickert@cern.ch>
Co-authored-by: Matthew Feickert <matthew.feickert@cern.ch>
@matthewfeickert matthewfeickert added feat/enhancement New feature or request docs Documentation related refactor A code change that neither fixes a bug nor adds a feature labels Mar 23, 2022
Copy link
Member

@matthewfeickert matthewfeickert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kratsg. LGTM now!

tests/constraints.txt Show resolved Hide resolved
@lukasheinrich
Copy link
Contributor

LGTM!

@kratsg kratsg merged commit 2789d54 into master Mar 23, 2022
@kratsg kratsg deleted the feat/altSchemaPaths branch March 23, 2022 14:36
kratsg pushed a commit that referenced this pull request Mar 23, 2022
* Report coverage for the oldest Python (3.7) and newest (3.10) that are
tested to ensure that version specific 'if' statements are properly covered.
* Add Python version specific flags to the coverage reporting to compare
differences in coverage and track coverage changes across versions more easily.
* Follow up to PR #1753
matthewfeickert pushed a commit that referenced this pull request Mar 23, 2022
* Add contextlib support to the pyhf.schema API.
   - Follow up to PR #1753.
* Add tests for use of the schema contextlib to tests/test_schema.py.
* Add tests for use of the schema contextlib to the public API tests.
@kratsg kratsg mentioned this pull request Mar 24, 2022
1 task
lhenkelm added a commit to lhenkelm/pyhf that referenced this pull request Jul 6, 2022
lhenkelm added a commit to lhenkelm/pyhf that referenced this pull request Aug 22, 2022
matthewfeickert pushed a commit that referenced this pull request Aug 26, 2022
* Load `defs.json` into `schema.variables.SCHEMA_CACHE` from the local file.
  - in line with what was done before in PR #1753
* Add pytest-socket as a dependency of the 'test' extra. `pytest_socket.socket_disabled`
  was added in pytest-socket v0.2.0.
   - c.f. https://github.com/miketheman/pytest-socket/releases/tag/0.2.0
* Add a unit test to catch this regression in the future.
* `pyhf.schema` clears the cache to avoid stale schemas (and restores the cache if used as a CM).
* Amend unit tests for `pyhf.schema` to test the correct treatment of the cache.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation related feat/enhancement New feature or request refactor A code change that neither fixes a bug nor adds a feature schema and spec JSON schema
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expose base_uri in utils.validate
4 participants