Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

workspace glob exclusions #15164

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

probably-neb
Copy link

@probably-neb probably-neb commented Nov 15, 2024

What does this PR do?

Adds the ability to exclude packages from a workspace using ! glob syntax

Implemented by preventing excluded directories from ever being added to workspace_names while processing the workspace names array.

This is accomplished by creating a hashmap of the excluded paths which are computed using GlobWalker before processing the rest of the entries in the workspace names array. The remaining paths are not added to workspace_names if they are present in the excluded path map.

This should have little to no overhead when there are no workspace exclusions as the only changes to the original algorithm are

  1. allocating space to store all of the entries in the workspace names array in order to allow for (2)
  2. iterating over the workspace names array twice to construct the exclusion map before processing the remaining entries
  3. checking each workspace path against the exclusion map (hashing is skipped if map is empty)

The assumptions that informed my decision to go with this implementation were

  • exclusions are rarely needed (as evidenced by lack of support thus far)
  • when exclusions are used they are likely used to exclude a single directory (not many with a glob)
  • the number of packages included in the workspace is likely much larger than the number of packages excluded from the workspace so the speed of checking is important
  • while using GlobWalker to get all of the valid paths described by each exclusion so they can be stored individually risks high memory usage when many paths are excluded, it is simple to implement/understand, robust, and requires much less bespoke logic than say trying to check each workspace path to see if it matches one of the exclusion globs

How did you verify your code works?

  • I checked the lifetime of memory allocated to verify it's (1) freed and (2) only freed when it should be
  • Wrote TypeScript/JavaScript tests and they pass locally (bun-debug test bun-workspaces.test)

Still TODO

  • Decide whether to keep or remove order-based precedence in workspace names array (see comment)
  • Update Documentation including description of precedence rules

@probably-neb
Copy link
Author

The reason for a draft PR is that right now I have it so the order in which workspace packages are defined in the root package.json matters.

For example this test is passing

test("workspace entry order matters", async () => {
    await Promise.all([
      write(
        join(packageDir, "package.json"),
        JSON.stringify({
          name: "foo",
          workspaces: [
              "packages/*",

              "!packages/re-excluded-pkg",
              "packages/re-excluded-pkg",
              "!packages/re-excluded-pkg",

              "!packages/re-included-pkg",
              "packages/re-included-pkg",
          ],
        }),
      ),
      write(
        join(packageDir, "packages", "re-excluded-pkg", "package.json"),
        JSON.stringify({
          name: "re-excluded-pkg",
          dependencies: { },
        }),
      ),
      write(
        join(packageDir, "packages", "re-included-pkg", "package.json"),
        JSON.stringify({
          name: "re-included-pkg",
          dependencies: { },
        }),
      )
    ]);

    await runBunInstall(env, packageDir);

    const filesInNodeModules = await readdir(join(packageDir, "node_modules"));

    expect(filesInNodeModules).not.toContain("re-excluded-pkg");
    expect(filesInNodeModules).toContain("re-included-pkg");
})

link

However I'm now questioning if that behavior is actually confusing because packages/excluded would actually be included in the workspace described by the following package.json which is probably counter-intuitive

{
  "workspaces": [
    "!packages/excluded",
    "packages/*"
  ]
}

Note that without order-based precedence or complex logic to figure out what the user is trying to do something like the following would result in packages/included not being included

{
  "workspaces": [
    "!packages/*",
    "packages/included"
  ]
}

I am leaning towards ignoring order and having exclusions always outweigh inclusions but felt it would be better to check first so the ux is acceptable.

@probably-neb probably-neb changed the title Probably neb/workspace glob exclusions workspace glob exclusions Nov 15, 2024
@probably-neb probably-neb force-pushed the probably-neb/workspace-glob-exclusions branch from 060aba1 to 48c1d69 Compare December 3, 2024 23:56
@probably-neb probably-neb force-pushed the probably-neb/workspace-glob-exclusions branch from 899957d to 2716e39 Compare January 7, 2025 21:10
@probably-neb
Copy link
Author

After some further research (scraping open source pnpm-workspace.json files on github) and analyzing the use of exclusions in those workspaces

https://github.com/probably-neb/pnpm-workspace-glob-exclusions-data/blob/4ee965b5c72cd5bf939894cfb44a7b20d6032b58/results-11-20-2024.json

I've determined that my initial assumptions were correct, and that additionally most people assume that exclusions have higher precedence than inclusions, and therefore my implementation should not allow re-including workspaces after they have been excluded based on order.

@probably-neb
Copy link
Author

I would appreciate some input on how to best test windows paths in the workspaces array, otherwise I feel this is ready to go

@probably-neb probably-neb marked this pull request as ready for review January 8, 2025 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant