fix(repo-map) go vendor included in repo map causing token limit error #777

Merged (1 commit) on Nov 3, 2024


brewinski
Contributor

When using the @codebase flag, I ran into token-limit-exceeded errors. The cause turned out to be that the Go project I'm working on keeps its dependencies in a vendor folder. That folder was being included in the repo map, which made the project context much larger than it needed to be.

I am open to feedback and willing to alter the solution if this approach is not suitable. Please let me know if there are any suggestions or improvements that can be made.

@b0o
Collaborator

b0o commented Nov 3, 2024

This seems like it would be useful to have as a configuration option. We could create a new configuration sub-table for repo-map options e.g.:

{
  repo_map = {
    ignore_patterns = { "%.git", "%.worktree", "__pycache__", "node_modules" }, -- ignore files matching these patterns when building the repo map
  }
}

With this approach, I think it's debatable whether vendor should be part of the default configuration. It's not as standard as node_modules or __pycache__; some users may actually want to include things inside vendor directories. Users who do want to ignore it can opt in.

If you decide to take this approach, then to decouple parse_gitignore from repo-map specifics, maybe we should not set the default ignore_patterns directly inside parse_gitignore, but instead extend the returned table inside RepoMap._build_repo_map:

function M.parse_gitignore(gitignore_path)
  local ignore_patterns = {}
  local negate_patterns = {}
  --- ... snip ...
  return ignore_patterns, negate_patterns
end

function RepoMap._build_repo_map(project_root, file_ext)
  local output = {}
  local gitignore_path = project_root .. "/.gitignore"
  local ignore_patterns, negate_patterns = Utils.parse_gitignore(gitignore_path)
  vim.list_extend(ignore_patterns, Config.repo_map.ignore_patterns)
  -- maybe we should support configurable negate_patterns too?
  -- ... snip ...
end
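
To make the merge step above concrete, here is a minimal, self-contained sketch in plain Lua. It assumes that patterns are tested with string.match, which the thread does not confirm; list_extend and is_ignored are hypothetical helpers written for this illustration, not functions from the plugin.

```lua
-- Defaults as proposed in the comment above ("vendor" deliberately left out).
local default_ignore_patterns = { "%.git", "%.worktree", "__pycache__", "node_modules" }

-- Hypothetical helper: append every item of `extra` to `base` in place,
-- mirroring what vim.list_extend does inside Neovim.
local function list_extend(base, extra)
  for _, item in ipairs(extra) do
    base[#base + 1] = item
  end
  return base
end

-- Hypothetical helper: a path is ignored when it matches any ignore pattern,
-- unless a negate pattern matches it first.
-- Assumption: patterns are Lua patterns tested with string.match.
local function is_ignored(path, ignore_patterns, negate_patterns)
  for _, pattern in ipairs(negate_patterns) do
    if path:match(pattern) then
      return false
    end
  end
  for _, pattern in ipairs(ignore_patterns) do
    if path:match(pattern) then
      return true
    end
  end
  return false
end

-- Pretend parse_gitignore returned these two tables.
local ignore_patterns = { "^build/" }
local negate_patterns = {}

-- Merge the defaults, then a user opt-in for the Go vendor directory.
list_extend(ignore_patterns, default_ignore_patterns)
list_extend(ignore_patterns, { "^vendor/" })

print(is_ignored("vendor/github.com/pkg/errors/errors.go", ignore_patterns, negate_patterns))
print(is_ignored("cmd/server/main.go", ignore_patterns, negate_patterns))
```

One thing the sketch highlights: an unanchored pattern such as "vendor" would also match a path like src/vendorize.go, so anchoring the opt-in pattern (here "^vendor/") may be worth considering.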

Additional notes for future repo-map improvements:
  • In the future, the repo_map configuration can be expanded with additional options, e.g.:

    {
      repo_map = {
        enabled = true, -- include additional context about your project with messages sent to LLMs
        ignore_patterns = { "%.git", "%.worktree", "__pycache__", "node_modules", "vendor" }, -- ignore files matching these patterns when building the repo map
        vcs_ignore = true,  -- in addition to ignore_patterns, also respect .gitignore files
      }
    }
  • Looking at the implementation of RepoMap._build_repo_map, it seems that only the project's root .gitignore is respected. Ideally we would respect nested .gitignore files. We already have a dependency on Plenary, which has support for this, so we should leverage it.

  • Perhaps we should have a configurable size limit for the repo_map, both in terms of the size of individual files and the overall number of files to include.
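
As a rough illustration of the size-limit idea in the last bullet, the sketch below filters a list of candidate files by per-file size and by total count. The option names max_file_size and max_files, and the apply_limits helper, are hypothetical; nothing in the thread fixes a name or a default value.

```lua
-- Hypothetical limits; these option names are assumptions for illustration only.
local limits = {
  max_file_size = 100 * 1024, -- skip individual files larger than this many bytes
  max_files = 500,            -- stop once this many files have been kept
}

-- `files` is a list of tables shaped like { path = "a.go", size = 123 }.
-- Returns a new list holding at most opts.max_files entries, each no larger
-- than opts.max_file_size, in the original order.
local function apply_limits(files, opts)
  local kept = {}
  for _, file in ipairs(files) do
    if #kept >= opts.max_files then
      break
    end
    if file.size <= opts.max_file_size then
      kept[#kept + 1] = file
    end
  end
  return kept
end

-- Usage with deliberately small limits so the effect is visible.
local sample = {
  { path = "a.go", size = 10 },
  { path = "big.go", size = 200 },
  { path = "b.go", size = 20 },
  { path = "c.go", size = 5 },
}
local kept = apply_limits(sample, { max_file_size = 100, max_files = 2 })
for _, file in ipairs(kept) do
  print(file.path)
end
```

Keeping the original order is just the simplest choice here; a real implementation might prefer to rank files before truncating.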

@brewinski
Contributor Author

Thanks for the feedback @b0o. Seems like a reasonable approach. I'll look into implementing this over the next couple of days.

@yetone
Owner

yetone commented Nov 3, 2024

@brewinski Thank you for your PR and @b0o's suggestions! I'll merge this PR first, and then you can submit a new PR to make it configurable.

@yetone yetone merged commit 99f3b3a into yetone:main Nov 3, 2024
5 checks passed