Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move-package: add resolve_version hook #15867

Merged
merged 4 commits into from
Jan 31, 2024

Conversation

kklas
Copy link
Contributor

@kklas kklas commented Jan 23, 2024

Description

Implemented the version resolution hook. This hook is called during dependency graph resolution and returns an arbitrary version (Symbol) for a package. This is then used by the resolved to check that two different instances of the same package are of the same version.

The version hook effectively makes it possible to reference the same package in different locations (e.g., git, local, or on-chain). As long as their name and version matches, they will be considered the same package (respective of the address renaming and override flag).

When the version hook isn't used (returns None), packages will be differentiated as they were before the introduction of the hook based on their location (dependency kind in parent manifest).

This is part of the work to enable compiling against on-chain
dependencies #14178.

cc @rvantonder @amnn

Test Plan

Added unit tests


If your changes are not user-facing and do not break anything, you can skip the following section. Otherwise, please briefly describe what has changed under the Release Notes section.

Type of Change (Check all that apply)

  • protocol change
  • user-visible impact
  • breaking change for a client SDKs
  • breaking change for FNs (FN binary must upgrade)
  • breaking change for validators or node operators (must upgrade binaries)
  • breaking change for on-chain data layout
  • necessitate either a data wipe or data migration

Release notes

kklas added 3 commits January 23, 2024 13:31
Implemented the version resolution hook. This hook is called during
dependency graph resolution and returns an arbitrary version (`Symbol`)
for a package. This is then used by the resolved to check that two
different instances of the same package are of the same version.

The version hook effectively makes it possible to reference the same
package in different locations (e.g., git, local, or on-chain). As long
as their name and version matches, they will be considered the same
package (respective of the address renaming and override flag).

When the version hook isn't used (returns `None`), packages will be
differentiated as they were before the introduction of the hook based
on their location (dependency kind in parent manifest).
…files

This test highlights the problem of error messages using resolved pkg
names instead of original when dependnecy lock files are present. This
can be resolved by storing original pkg names in lockfiles.
Copy link

vercel bot commented Jan 23, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
mysten-ui ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 30, 2024 9:32pm
sui-core ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 30, 2024 9:32pm
sui-typescript-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 30, 2024 9:32pm
3 Ignored Deployments
Name Status Preview Comments Updated (UTC)
explorer ⬜️ Ignored (Inspect) Visit Preview Jan 30, 2024 9:32pm
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Jan 30, 2024 9:32pm
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Jan 30, 2024 9:32pm

Copy link
Member

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall, thanks as always @kklas ! I will leave it to @rvantonder to do the final review and merge, but the main thought from my review was whether it was worth naming this concept more generically than resolve_version, e.g. resolve_identifier or similar (and making some associated changes to the test set-up).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: test name typo (missmatch -> mismatch).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Surprised to see the lock file being checked in here -- is that required? (Perhaps you built this package directly at its location)? We probably need to .gitignore Move.lock files within these tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they need to be here. The purpose of this test is to show that, when loading dependencies from the lockfiles (instead of constructing everything directly), the error messages show the resolved names of packages (which in Sui case will be original ids) instead of original pkg name (as defined in their manifest). This is because the original pkg names aren't stored in the lockfile (just resolved), so when loading the deps from lockfiles here https://github.com/MystenLabs/sui/blob/main/external-crates/move/crates/move-package/src/resolution/dependency_graph.rs#L1088-L1125, the resolved name is used in the dep_orig_name field.

I realize now though that this test should probably have been added in a separate PR because it's somewhat unrelated to the version hook, so this may be confusing. Sorry, my mistake!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(by default the .gitignore does ignore Move.lock except for system packages (and this, which I'm guessing was added with -f)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that might have been the case. This test will highlight whether the problem has been solved after we add original pkg name to the lockfile.

@@ -0,0 +1,22 @@
# This test highlights the problem of error messages using resolved pkg
# names instead of original when dependnecy lock files are present. The
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo

Suggested change
# names instead of original when dependnecy lock files are present. The
# names instead of original when dependency lock files are present. The

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that your earlier PR that introduced the original package name would have fixed this, but I guess that's not the case -- is that to come?

There's also a confusion here in that if there is a conflict between an on-chain dependency and a source dependency, then the only identity they share is the publish-at address, which is the resolved name, so what could we show to indicate that the two packages are the same? Maybe we default to the original name, and fallabck to the resolved name if the original name doesn't exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It solved it largely, but we don't store the original name in the lockfile, so when loding the graph from the lockfile the resolved name is used (this is just for error messages though). It can only be solved by adding original pkg name to lockfiles which is actually a discussion point (3) in the OP #14178

Copy link
Contributor Author

@kklas kklas Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also a confusion here in that if there is a conflict between an on-chain dependency and a source dependency, then the only identity they share is the publish-at address, which is the resolved name

So just to clarify this, we have:

  • original name (user-friendly name in the manifest, used just for error messages)
  • resolved name (in Sui's case this will be the original package id)
  • resolved version (in Sui's case this will be the published-at id)

So when the algorithm is resolving the graph, it will load all packages and store them in the table based on their resolved name as the key. And when it encounters another package whose resolved name already exists in the table, it will then compare both packages based on their resolved version (which is implicitly done here https://github.com/MystenLabs/sui/blob/main/external-crates/move/crates/move-package/src/resolution/dependency_graph.rs#L717) and their dependencies (done immediately below https://github.com/MystenLabs/sui/blob/main/external-crates/move/crates/move-package/src/resolution/dependency_graph.rs#L751)

So packages are discriminated on the resolved name first, and then when there are duplicates we make sure they are the same resolved version (+their deps are the same).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on https://github.com/MystenLabs/sui/pull/15867/files#r1470220325 we're interested in keeping the resolved name in the lock file; that has my vote for readability and convenience.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a dumb question: what happens if a user duplicately specify an on-chain dependency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing that happens when you specify a duplicate source dependency. Both on-chain and source dependencies get resolved to (original_id, published_at) so after that it doesn't matter to the resolver where the dependency came from.
There are some rare edge cases where transitive deps pulled in from different locations do and don't define their published-at which is briefly dicussed here in the "Backwards compatibility" section (rarely should happen and you can use overrides to go around it).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially confused where this 1.5.0 came from because in the sui repo, this is not the version set on Move Stdlib. The test was set-up this way because when we started the package management refactor, it was destined for move-language/move, but now that it's not, we should switch the dependency here to one in the sui repo.

Comment on lines 1667 to 1671
// Compare deps in both graphs. When the resolve_version hook is defined (both packages have a version),
// we compare the packages based on their version rather than their location (`PM::DependencyKind`)
// as defined in their parent manifest. When the hook is not defined (or returns None) for both packages,
// we compare the packages based on their location. If the version resolves for one package but is None for
// the other, we consider the packages to be different.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to make this a doc comment on the implementation of PartialEq because I was scratching my head there a bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point!

@@ -447,10 +484,18 @@ impl<Progress: Write> DependencyGraphBuilder<Progress> {
&dep_pkg_path,
&mut self.progress_output,
)?;
(pkg_graph, false, true, dep_pkg_name)
// TODO: support resolved_pkg_name and resolved_version for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will that be part of this stack, or for follow-up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here is that both the resolve name and version hooks derive from the dependency's manifest file instead of lockfile, but with externally resolved deps we effectively have just the lockfile. Since external deps aren't really used on Sui (and I wasn't sure whether this is something vestigial that would be deprecated), I left this as is for this PR to keep the scope sane.
We can solve this by adding more info to the lockfile (resolved name and version) and / or making the hooks also be able to work with lockfiles instead of just manifest. Although if externally resolved deps aren't going to be used on Sui, perhaps it's not worth the effort?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be open to adding more info to the lock file to support this. For background the external resolution logic was initially intended to support a package manager -- the original package manager that it was intended to support (Movey) did not materialise, but we do expect others to take its place, and there are some plans to use external resolution as a way to simplify how the framework dependencies interact with the Move toolchain, so I don't think external resolution will disappear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, in that case I can add it in a follow up PR before we enable the hooks (there's no difference before then).

With this PR in the lockfile we now have:

  • resolved name for each dep (the resolved name is now stored in the name field where the original name is stored without the hook; not for the root package -- we resolve it from the manifest)
  • resolved version for each dep (not for the root package -- we resolve it from the manifest)

We don't have:
(a) original name for each dep (this is needed for user friendly error messages; for root we can always get it from the manifest)
(b) resolved name for the root package (this is actually needed only for the external resolver because otherwise we can resolve it from the dep manifest)
(c) version for the root package (also only needed for the external resolver for the same reason).

So the two obvious options I can see for adding this are:
(1) add (a), (b), and (c) to the lockfile
(2) add only (a) to the lockfile while for (b) and (c) we can add it as some extra metadata just for the external resolver

I'm more keen on (1) because with (b) and (c) we effectively get the original id and published-at into the lockfile which means that, with a reasonable tooling tweak, we can completely move them out of the manifest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, option (1) seems reasonable, and in fact we have plans to introduce the package's published-at and original-id into the lock file anyway (to support automated address management and address domain separation).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👆 what @amnn said, we have plans to add related things here. Though one thing to keep in mind though is that we plan to namespace this based on network, so there won't be only 3 three fields, but one for each network (with devnet rotating).

I can add it in a follow up PR before we enable the hooks (there's no difference before then)

So maybe we can compare notes / share plans here offline before making changes.

In general, the (a)/(b)/(c) breakdown is super useful and I'm in favor of being more explicit about info (option (1)) than not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So maybe we can compare notes / share plans here offline before making changes.

Of course! I'll share my ideas on #14178 before I send the PR.

@@ -144,6 +153,15 @@ pub struct Dependency {
pub dep_orig_name: PM::PackageName,
}

impl PartialEq for Dependency {
fn eq(&self, other: &Self) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc comment on this impl would be helpful as well to flag that it's avoiding comparing by dep_orig_name.

@@ -129,11 +134,15 @@ impl PartialEq for Package {
// comparing packages during insertion of externally resolved ones - an internally resolved
// existing package in the graph would not be recognized as a potential different version of
// the externally resolved one)
self.kind == other.kind
match (&self.version, &other.version) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This use of version made me think that this name is not doing the field justice -- it's being used for a lot more than just a version: It overrides the identity of the package as a whole. Is it worth giving it a more generic name accordingly?

Maybe I'm missing something, but it also makes the test runner setup a little strange, because currently it equates all packages that have the same version field. I think that's fine today because all packages were given different versions, but in practice, maybe the test-runner should implement resolve_version by creating some kind of combined string that represents both the package name and its version field? Giving the resolver and field a more generic name would make that kind of thing easier to spot as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the resolved_name should probably be resolve_identifier or similar, and yes it's confusing.

because currently it equates all packages that have the same version field

The packages are discriminated by resolved name, and then only for two packages with the same name, it will do the version check on them.

@kklas
Copy link
Contributor Author

kklas commented Jan 29, 2024

Thanks @amnn for the very detailed look! I will fix the nits on the final swipe, waiting for @rvantonder comments also.

Copy link
Contributor

@rvantonder rvantonder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, mostly left clarifying comments!

Comment on lines +48 to +50
/// The version resolved from the version resolution hook.
pub version: Option<String>,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not right now, but eventually it'll be worthwhile to tick a PR with user-facing changes, saying that we can compile from on-chain packages, and more specifically with docs, how this relates to the lock file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'm putting a note down to remember to do that when the on-chain hooks get enabled. For now there's effectively no changes.

@@ -447,10 +484,18 @@ impl<Progress: Write> DependencyGraphBuilder<Progress> {
&dep_pkg_path,
&mut self.progress_output,
)?;
(pkg_graph, false, true, dep_pkg_name)
// TODO: support resolved_pkg_name and resolved_version for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👆 what @amnn said, we have plans to add related things here. Though one thing to keep in mind though is that we plan to namespace this based on network, so there won't be only 3 three fields, but one for each network (with devnet rotating).

I can add it in a follow up PR before we enable the hooks (there's no difference before then)

So maybe we can compare notes / share plans here offline before making changes.

In general, the (a)/(b)/(c) breakdown is super useful and I'm in favor of being more explicit about info (option (1)) than not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(by default the .gitignore does ignore Move.lock except for system packages (and this, which I'm guessing was added with -f)

@@ -0,0 +1,22 @@
# This test highlights the problem of error messages using resolved pkg
# names instead of original when dependnecy lock files are present. The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on https://github.com/MystenLabs/sui/pull/15867/files#r1470220325 we're interested in keeping the resolved name in the lock file; that has my vote for readability and convenience.

@@ -0,0 +1,22 @@
# This test highlights the problem of error messages using resolved pkg
# names instead of original when dependnecy lock files are present. The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a dumb question: what happens if a user duplicately specify an on-chain dependency?

@kklas
Copy link
Contributor Author

kklas commented Jan 30, 2024

Nits are fixed!

@rvantonder
Copy link
Contributor

thanks @kklas 🙌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants