perf: cache results of cnf, dnf and _merge_single_markers #609

Merged on Aug 17, 2023 (2 commits)

radoering
Member

@radoering radoering commented Jun 24, 2023

Picking up the suggestion in python-poetry/poetry#7257 (comment)

In order to cache `cnf` and `dnf`, we can no longer consider composite markers and constraints with the same sub elements but a different order to be equal because order is relevant when building an intersection or union. Neglecting order leads to different results if the result of an "equal" marker with a different order has already been cached. This can even result in a `RecursionError`, as the test suite demonstrates.
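
To illustrate the caching problem described above, here is a minimal sketch. It is not poetry-core code: `UnorderedUnion`, `OrderedUnion` and `render` are hypothetical stand-ins, and `render` merely imitates an order-sensitive transformation such as `cnf` or `dnf`.

```python
from functools import lru_cache


class UnorderedUnion:
    """Hypothetical composite marker whose equality ignores sub-marker order."""

    def __init__(self, *markers: str) -> None:
        self.markers = tuple(markers)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, UnorderedUnion) and set(self.markers) == set(other.markers)

    def __hash__(self) -> int:
        return hash(frozenset(self.markers))


class OrderedUnion:
    """Hypothetical composite marker whose equality respects sub-marker order."""

    def __init__(self, *markers: str) -> None:
        self.markers = tuple(markers)

    def __eq__(self, other: object) -> bool:
        return isinstance(other, OrderedUnion) and self.markers == other.markers

    def __hash__(self) -> int:
        return hash(("union", self.markers))


@lru_cache(maxsize=None)
def render(marker) -> str:
    # Stand-in for an order-sensitive, cached transformation (assumption).
    return " or ".join(marker.markers)


# With order-insensitive equality, the second call hits the cache entry of the
# first one and returns a result built from the "wrong" order.
stale = (render(UnorderedUnion("a", "b")), render(UnorderedUnion("b", "a")))

# With order-sensitive equality, each ordering gets its own cache entry.
fresh = (render(OrderedUnion("a", "b")), render(OrderedUnion("b", "a")))
```

Here `stale` holds the same string twice, while `fresh` holds one string per ordering, which is the behaviour the cache needs.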

For performance measurements, see python-poetry/poetry#7257 (comment).

@radoering radoering force-pushed the perf/cnf-dnf-caching branch 2 times, most recently from 9f58ec7 to b822add Compare June 26, 2023 04:28
@radoering radoering requested a review from a team July 2, 2023 14:03
@dimbleby
Contributor

dimbleby commented Jul 3, 2023

> In order to cache `cnf` and `dnf`, we can no longer consider composite markers and constraints with the same sub elements but a different order to be equal because order is relevant when building an intersection or union. Neglecting order leads to different results if the result of an "equal" marker with a different order has already been cached. This can even result in a `RecursionError`, as the test suite demonstrates.

this all makes me a bit uncomfortable:

  • in my mental model markers with the same set of sub-markers really should be equal
  • it's worrying that the code is so close to falling into recursion errors etc, perhaps it's all more fragile than we thought

on the other hand there are quite a lot of tests for this stuff, and if they're all passing then I suppose it's probably fine...!

should `__hash__` implementations be updated to match updated `__eq__`? instead of xor-ing in each individual marker (which was order-independent) it might make more sense to xor in `hash(tuple(self._markers))`? Or just construct a tuple that goes e.g. `("union", *self._markers)` and hash that.

(If they're ordered, perhaps submarkers should be a tuple rather than a list all along, so as to make them immutable??)
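
A rough sketch of the hashing idea above, assuming a simplified stand-in class (`UnionSketch` and `xor_hash` are hypothetical names, not the poetry-core implementation). It contrasts an order-independent xor hash with a hash over a tagged tuple, and stores the sub-markers as a tuple so they cannot be reordered in place.

```python
from functools import reduce
from operator import xor


def xor_hash(markers):
    """Order-independent: xor is commutative, so every permutation hashes the same."""
    return reduce(xor, (hash(m) for m in markers), hash("union"))


class UnionSketch:
    """Hypothetical ordered composite marker (illustration only)."""

    def __init__(self, *markers):
        # A tuple keeps the sub-markers immutable once the object is built.
        self._markers = tuple(markers)

    def __eq__(self, other):
        if not isinstance(other, UnionSketch):
            return NotImplemented
        return self._markers == other._markers  # order-sensitive

    def __hash__(self):
        # Hash a tagged tuple so the hash depends on order, like __eq__ does.
        return hash(("union", *self._markers))
```

Equal objects still hash equally, which is all Python requires; the tagged tuple additionally makes it very unlikely that two different orderings share a hash.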

re the SingleMarker: converting to a string for hash / equality purposes feels a bit weird? A typical trick is to define a method that returns a tuple of all the identifying features and use that for both hash and equality. I suppose that's kinda what you've done but with a string instead of a tuple...
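
For reference, the "tuple of identifying features" trick could look roughly like this. `SingleMarkerSketch`, its attributes and `_key` are assumptions made for illustration; the real `SingleMarker` may carry different fields.

```python
class SingleMarkerSketch:
    """Hypothetical single marker such as: python_version >= "3.8"."""

    def __init__(self, name: str, operator: str, value: str) -> None:
        self.name = name
        self.operator = operator
        self.value = value

    def _key(self) -> tuple:
        # Everything that identifies the marker, in one place.
        return (self.name, self.operator, self.value)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, SingleMarkerSketch):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self) -> int:
        return hash(self._key())
```

Because `__eq__` and `__hash__` both go through `_key()`, they cannot drift apart when a field is added later.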

@radoering
Member Author

> in my mental model markers with the same set of sub-markers really should be equal

I don't think so (anymore). What counts as equal is just a matter of definition.

If markers with the same set of sub-markers should be equal, shouldn't (more generally) markers that can be transformed into each other be equal, too? However, I don't think that works well with our algorithm.

IMO, a better concept is that only markers that have the same string representation are equal. It's easier to achieve and fits better with our algorithm.

> should `__hash__` implementations be updated to match updated `__eq__`?

It's not necessary, but it might make sense as long as the hash calculation does not become more expensive: we would get fewer distinct markers with equal hashes and thus have to do fewer comparisons for equality.
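
The effect on equality comparisons can be made visible with a small experiment. This is only a sketch with a hypothetical `CountingMarker` class; `hash(frozenset(...))` stands in for the old order-independent xor hash.

```python
from itertools import permutations


class CountingMarker:
    """Hypothetical ordered composite marker that counts __eq__ calls."""

    eq_calls = 0

    def __init__(self, markers, ordered_hash):
        self.markers = tuple(markers)
        self.ordered_hash = ordered_hash

    def __eq__(self, other):
        CountingMarker.eq_calls += 1
        return self.markers == other.markers  # order-sensitive equality

    def __hash__(self):
        if self.ordered_hash:
            return hash(self.markers)  # order-dependent hash
        return hash(frozenset(self.markers))  # order-independent hash


def count_eq_calls(ordered_hash):
    """Put all orderings of three sub-markers into a set and count comparisons."""
    CountingMarker.eq_calls = 0
    seen = {CountingMarker(p, ordered_hash) for p in permutations(("a", "b", "c"))}
    assert len(seen) == 6  # all six orderings are distinct markers
    return CountingMarker.eq_calls
```

With the order-independent hash, all six markers collide and the set has to fall back to `__eq__` to tell them apart; with the order-dependent hash, such collisions become very unlikely.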

> (If they're ordered, perhaps submarkers should be a tuple rather than a list all along, so as to make them immutable??)

IIRC, we have always considered submarkers immutable so making them a tuple instead of a list might make sense.

> re the SingleMarker: converting to a string for hash / equality purposes feels a bit weird? A typical trick is to define a method that returns a tuple of all the identifying features and use that for both hash and equality. I suppose that's kinda what you've done but with a string instead of a tuple...

Agreed.

@radoering radoering force-pushed the perf/cnf-dnf-caching branch from 68102fd to 56c28a3 Compare July 23, 2023 16:12
@radoering radoering force-pushed the perf/cnf-dnf-caching branch 2 times, most recently from 9274aae to baec059 Compare August 11, 2023 16:14
…a tuple instead of a list because it must not be changed
@radoering radoering force-pushed the perf/cnf-dnf-caching branch from baec059 to ef3eb4d Compare August 17, 2023 03:51
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

  • Bugs: 0 (rating A)
  • Vulnerabilities: 0 (rating A)
  • Security Hotspots: 0 (rating A)
  • Code Smells: 0 (rating A)
  • Coverage: no information
  • Duplication: 0.0%

@radoering radoering merged commit ca962a0 into python-poetry:main Aug 17, 2023
mwalbeck pushed a commit to mwalbeck/docker-python-poetry that referenced this pull request Aug 27, 2023
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [poetry](https://python-poetry.org/) ([source](https://github.com/python-poetry/poetry), [changelog](https://python-poetry.org/history/)) | minor | `1.5.1` -> `1.6.1` |

---

### Release Notes

<details>
<summary>python-poetry/poetry (poetry)</summary>

### [`v1.6.1`](https://github.com/python-poetry/poetry/blob/HEAD/CHANGELOG.md#161---2023-08-21)

[Compare Source](python-poetry/poetry@1.6.0...1.6.1)

##### Fixed

-   Update the minimum required version of `requests` ([#8336](python-poetry/poetry#8336)).

### [`v1.6.0`](https://github.com/python-poetry/poetry/blob/HEAD/CHANGELOG.md#160---2023-08-20)

[Compare Source](python-poetry/poetry@1.5.1...1.6.0)

##### Added

-   **Add support for repositories that do not provide a supported hash algorithm** ([#8118](python-poetry/poetry#8118)).
-   **Add full support for duplicate dependencies with overlapping markers** ([#7257](python-poetry/poetry#7257)).
-   **Improve performance of `poetry lock` for certain edge cases** ([#8256](python-poetry/poetry#8256)).
-   Improve performance of `poetry install` ([#8031](python-poetry/poetry#8031)).
-   `poetry check` validates that specified `readme` files do exist ([#7444](python-poetry/poetry#7444)).
-   Add a downgrading note when updating to an older version ([#8176](python-poetry/poetry#8176)).
-   Add support for `vox` in the `xonsh` shell ([#8203](python-poetry/poetry#8203)).
-   Add support for `pre-commit` hooks for projects where the pyproject.toml file is located in a subfolder ([#8204](python-poetry/poetry#8204)).
-   Add support for the `git+http://` scheme ([#6619](python-poetry/poetry#6619)).

##### Changed

-   **Drop support for Python 3.7** ([#7674](python-poetry/poetry#7674)).
-   Move `poetry lock --check` to `poetry check --lock` and deprecate the former ([#8015](python-poetry/poetry#8015)).
-   Change future warning that PyPI will only be disabled automatically if there are no primary sources ([#8151](python-poetry/poetry#8151)).

##### Fixed

-   Fix an issue where `build-system.requires` were not respected for projects with build scripts ([#7975](python-poetry/poetry#7975)).
-   Fix an issue where the encoding was not handled correctly when calling a subprocess ([#8060](python-poetry/poetry#8060)).
-   Fix an issue where `poetry show --top-level` did not show top level dependencies with extras ([#8076](python-poetry/poetry#8076)).
-   Fix an issue where `poetry init` handled projects with `src` layout incorrectly ([#8218](python-poetry/poetry#8218)).
-   Fix an issue where Poetry wrote `.pth` files with the wrong encoding ([#8041](python-poetry/poetry#8041)).
-   Fix an issue where `poetry install` did not respect the source if the same version of a package has been locked from different sources ([#8304](python-poetry/poetry#8304)).

##### Docs

-   Document **official Poetry badge** ([#8066](python-poetry/poetry#8066)).
-   Update configuration folder path for macOS ([#8062](python-poetry/poetry#8062)).
-   Add a warning about pip ignoring lock files ([#8117](python-poetry/poetry#8117)).
-   Clarify the use of the `virtualenvs.in-project` setting ([#8126](python-poetry/poetry#8126)).
-   Change `pre-commit` YAML style to be consistent with pre-commit's own examples ([#8146](python-poetry/poetry#8146)).
-   Fix command for listing installed plugins ([#8200](python-poetry/poetry#8200)).
-   Mention the `nox-poetry` package ([#8173](python-poetry/poetry#8173)).
-   Add an example with a PyPI source in the pyproject.toml file ([#8171](python-poetry/poetry#8171)).
-   Use `reference` instead of deprecated `callable` in the scripts example ([#8211](python-poetry/poetry#8211)).

##### poetry-core ([`1.7.0`](https://github.com/python-poetry/poetry-core/releases/tag/1.7.0))

-   Improve performance of marker handling ([#609](python-poetry/poetry-core#609)).
-   Allow `|` as a value separator in markers with the operators `in` and `not in` ([#608](python-poetry/poetry-core#608)).
-   Put pretty name (instead of normalized name) in metadata ([#620](python-poetry/poetry-core#620)).
-   Update list of supported licenses ([#623](python-poetry/poetry-core#623)).
-   Fix an issue where PEP 508 dependency specifications with names starting with a digit could not be parsed ([#607](python-poetry/poetry-core#607)).
-   Fix an issue where Poetry considered an unrelated `.gitignore` file resulting in an empty wheel ([#611](python-poetry/poetry-core#611)).

##### poetry-plugin-export ([`^1.5.0`](https://github.com/python-poetry/poetry-plugin-export/releases/tag/1.5.0))

-   Fix an issue where markers for dependencies required by an extra were not generated correctly ([#209](python-poetry/poetry-plugin-export#209)).

</details>

Reviewed-on: https://git.walbeck.it/walbeck-it/docker-python-poetry/pulls/846
Co-authored-by: renovate-bot <bot@walbeck.it>
Co-committed-by: renovate-bot <bot@walbeck.it>
@radoering radoering deleted the perf/cnf-dnf-caching branch November 24, 2024 12:41