feat(adapters): add Hugging Face adapter (#527)

olimorris · Dec 13, 2024 · 48747c4
1 parent c021141

Showing 3 changed files with 200 additions and 44 deletions.
diff --git a/README.md b/README.md
@@ -11,7 +11,7 @@
 </p>
 
 <p align="center">
-Currently supports: Anthropic, Copilot, Gemini, Ollama, OpenAI, Azure OpenAI and xAI adapters<br><br>
+Currently supports: Anthropic, Copilot, Gemini, Ollama, OpenAI, Azure OpenAI, HuggingFace and xAI adapters<br><br>
 New features are always announced <a href="https://github.com/olimorris/codecompanion.nvim/discussions/categories/announcements">here</a>
 </p>
 
@@ -28,7 +28,7 @@ Thank you to the following people:
 ## :sparkles: Features
 
 - :speech_balloon: [Copilot Chat](https://github.com/features/copilot) meets [Zed AI](https://zed.dev/blog/zed-ai), in Neovim
-- :electric_plug: Support for Anthropic, Copilot, Gemini, Ollama, OpenAI, Azure OpenAI and xAI LLMs (or bring your own!)
+- :electric_plug: Support for Anthropic, Copilot, Gemini, Ollama, OpenAI, Azure OpenAI, HuggingFace and xAI LLMs (or bring your own!)
 - :rocket: Inline transformations, code creation and refactoring
 - :robot: Variables, Slash Commands, Agents/Tools and Workflows to improve LLM output
 - :sparkles: Built in prompt library for common tasks like advice on LSP errors and code explanations
@@ -288,6 +288,7 @@ The plugin uses adapters to connect to LLMs. Out of the box, the plugin supports
 - OpenAI (`openai`) - Requires an API key
 - Azure OpenAI (`azure_openai`) - Requires an Azure OpenAI service with a model deployment
 - xAI (`xai`) - Requires an API key
+- HuggingFace (`huggingface`) - Requires a Serverless Inference API key from HuggingFace.co
 
 The plugin utilises objects called Strategies. These are the different ways that a user can interact with the plugin. The _chat_ strategy harnesses a buffer to allow direct conversation with the LLM. The _inline_ strategy allows for output from the LLM to be written directly into a pre-existing Neovim buffer. The _agent_ and _workflow_ strategies are wrappers for the _chat_ strategy, allowing for [tool use](#robot-agents--tools) and [agentic workflows](#world_map-agentic-workflows).
 

diff --git a/doc/codecompanion.txt b/doc/codecompanion.txt
@@ -1,4 +1,4 @@
-*codecompanion.txt*       For NVIM v0.10.0       Last change: 2024 November 29
+*codecompanion.txt*       For NVIM v0.10.0       Last change: 2024 December 11
 
 ==============================================================================
 Table of Contents                            *codecompanion-table-of-contents*
@@ -15,7 +15,7 @@ Table of Contents                            *codecompanion-table-of-contents*
 FEATURES                                              *codecompanion-features*
 
 - Copilot Chat <https://github.com/features/copilot> meets Zed AI <https://zed.dev/blog/zed-ai>, in Neovim
-- Support for Anthropic, Copilot, Gemini, Ollama, OpenAI, Azure OpenAI and xAI LLMs (or bring your own!)
+- Support for Anthropic, Copilot, Gemini, Ollama, OpenAI, Azure OpenAI, HuggingFace and xAI LLMs (or bring your own!)
 - Inline transformations, code creation and refactoring
 - Variables, Slash Commands, Agents/Tools and Workflows to improve LLM output
 - Built in prompt library for common tasks like advice on LSP errors and code explanations
@@ -33,6 +33,9 @@ REQUIREMENTS                                      *codecompanion-requirements*
 
 INSTALLATION                                      *codecompanion-installation*
 
+
+  [!IMPORTANT] The plugin requires the markdown Tree-sitter parser to be
+  installed with `:TSInstall markdown`
 Install the plugin with your preferred package manager:
 
 **Lazy.nvim**
@@ -43,8 +46,6 @@ Install the plugin with your preferred package manager:
       dependencies = {
         "nvim-lua/plenary.nvim",
         "nvim-treesitter/nvim-treesitter",
-        -- The following are optional:
-        { "MeanderingProgrammer/render-markdown.nvim", ft = { "markdown", "codecompanion" } },
       },
       config = true
     }
@@ -61,8 +62,6 @@ Install the plugin with your preferred package manager:
       requires = {
         "nvim-lua/plenary.nvim",
         "nvim-treesitter/nvim-treesitter",
-        -- The following are optional:
-        { "MeanderingProgrammer/render-markdown.nvim", ft = { "markdown", "codecompanion" } },
       }
     })
 <
@@ -74,9 +73,6 @@ Install the plugin with your preferred package manager:
 
     Plug 'nvim-lua/plenary.nvim'
     Plug 'nvim-treesitter/nvim-treesitter'
-    " -- The following are optional
-    Plug 'MeanderingProgrammer/render-markdown.nvim'
-    " --
     Plug 'olimorris/codecompanion.nvim'
 
     call plug#end()
@@ -86,22 +82,57 @@ Install the plugin with your preferred package manager:
     EOF
 <
 
+**Completion**
+
+When conversing with the LLM, you can leverage variables, slash commands and
+tools in the chat buffer. Out of the box, the plugin will display these to you
+via a native Neovim completion menu (which you’ll need to trigger with
+`<C-_>`). However, it also has support for nvim-cmp
+<https://github.com/hrsh7th/nvim-cmp> and blink.cmp
+<https://github.com/Saghen/blink.cmp>. The former requires no setup; however, to
+enable completions for `blink.cmp`, please ensure you’ve enabled it in your
+config:
+
+>lua
+    sources = {
+      completion = {
+        enabled_providers = { "some", "other", "providers", "codecompanion" },
+      },
+      providers = {
+        codecompanion = {
+          name = "CodeCompanion",
+          module = "codecompanion.providers.completion.blink",
+          enabled = true,
+        },
+      },
+    },
+<
+
+**Slash Commands**
 
-  [!IMPORTANT] The plugin requires the markdown Tree-sitter parser to be
-  installed with `:TSInstall markdown`
 To better utilise Slash Commands, Telescope.nvim
 <https://github.com/nvim-telescope/telescope.nvim>, mini.pick
 <https://github.com/echasnovski/mini.pick> or fzf lua
 <https://github.com/ibhagwan/fzf-lua> can also be installed. Please refer to
 the |codecompanion-chat-buffer| section for more information.
 
+**Pinned plugins**
+
 As per #377 <https://github.com/olimorris/codecompanion.nvim/issues/377>, if
 you pin your plugins to the latest releases, consider setting plenary.nvim to:
 
 >lua
     { "nvim-lua/plenary.nvim", branch = "master" },
 <
 
+**Prettify with render-markdown.nvim**
+
+Add the following to your dependencies:
+
+>lua
+    { "MeanderingProgrammer/render-markdown.nvim", ft = { "markdown", "codecompanion" } },
+<
+
 
 QUICKSTART                                          *codecompanion-quickstart*
 
@@ -246,6 +277,7 @@ supports:
 - OpenAI (`openai`) - Requires an API key
 - Azure OpenAI (`azure_openai`) - Requires an Azure OpenAI service with a model deployment
 - xAI (`xai`) - Requires an API key
+- HuggingFace (`huggingface`) - Requires a Serverless Inference API key from HuggingFace.co
 
 The plugin utilises objects called Strategies. These are the different ways
 that a user can interact with the plugin. The _chat_ strategy harnesses a
@@ -364,22 +396,6 @@ You can also add your own keymaps:
     })
 <
 
-**Using with render-markdown.nvim**
-
-If you use the fantastic render-markdown.nvim
-<https://github.com/MeanderingProgrammer/render-markdown.nvim> plugin, then you
-can turn off the `show_header_separator` option for a cleaner chat buffer:
-
->lua
-    require("codecompanion").setup({
-      display = {
-        chat = {
-          show_header_separator = false,
-        }
-      }
-    })
-<
-
 
 ADAPTERS ~
 
@@ -657,6 +673,30 @@ The look and feel of the chat buffer can be customised as per the
 You can also add additional _Variables_ and _Slash Commands_ which can then be
 referenced in the chat buffer.
 
+**Slash Commands**
+
+As outlined in the |codecompanion-quickstart| section, Slash Commands allow you
+to easily share additional context with your LLM from the chat buffer. Some of
+the commands also allow for multiple providers:
+
+- `/buffer` - Has `default`, `telescope` and `fzf_lua` providers
+- `/files` - Has `default`, `telescope`, `mini_pick` and `fzf_lua` providers
+- `/help` - Has `telescope`, `mini_pick` and `fzf_lua` providers
+- `/symbols` - Has `default`, `telescope`, `mini_pick` and `fzf_lua` providers
+
+Please refer to the config
+<https://github.com/olimorris/codecompanion.nvim/blob/main/lua/codecompanion/config.lua>
+to see how to change the default provider.
+
+**References**
+
+When Slash Commands or Variables are used, a block quote is added to the chat
+buffer referencing what’s been shared with the LLM. When a conversation
+becomes long, this allows you to keep track of what’s been shared. You can
+modify these block quotes to stop references from being shared with the LLM,
+which will alter the history of the conversation. This can be useful to
+minimize token consumption.
+
 **Keymaps**
 
 When in the chat buffer, press `?` to bring up a menu that lists the available
@@ -689,21 +729,6 @@ You can display your selected adapter’s schema at the top of the buffer, if
 `display.chat.show_settings` is set to `true`. This allows you to vary the
 response from the LLM.
 
-**Slash Commands**
-
-As outlined in the |codecompanion-quickstart| section, Slash Commands allow you
-to easily share additional context with your LLM from the chat buffer. Some of
-the commands also allow for multiple providers:
-
-- `/buffer` - Has `default`, `telescope` and `fzf_lua` providers
-- `/files` - Has `default`, `telescope`, `mini_pick` and `fzf_lua` providers
-- `/help` - Has `telescope`, `mini_pick` and `fzf_lua` providers
-- `/symbols` - Has `default`, `telescope`, `mini_pick` and `fzf_lua` providers
-
-Please refer to the config
-<https://github.com/olimorris/codecompanion.nvim/blob/main/lua/codecompanion/config.lua>
-to see how to change the default provider.
-
 
 INLINE ASSISTANT ~
 

diff --git a/lua/codecompanion/adapters/huggingface.lua b/lua/codecompanion/adapters/huggingface.lua
@@ -0,0 +1,130 @@
+local log = require("codecompanion.utils.log")
+local openai = require("codecompanion.adapters.openai")
+
+---@class HuggingFace.Adapter: CodeCompanion.Adapter
+return {
+  name = "huggingface",
+  roles = {
+    llm = "assistant",
+    user = "user",
+  },
+  opts = {
+    stream = true,
+  },
+  features = {
+    text = true,
+    tokens = false,
+    vision = true,
+  },
+  url = "${url}/models/${model}/v1/chat/completions",
+  env = {
+    api_key = "HUGGINGFACE_API_KEY",
+    url = "https://api-inference.huggingface.co",
+    model = "schema.model.default",
+  },
+  raw = {
+    "--no-buffer",
+    "--silent",
+  },
+  headers = {
+    ["Content-Type"] = "application/json",
+    Authorization = "Bearer ${api_key}",
+  },
+  -- NOTE: the token counter handler is not implemented for now: the Inference API docs
+  -- say usage is supported, yet it is returned as null when streaming is enabled
+  handlers = {
+    ---@param self CodeCompanion.Adapter
+    ---@return boolean
+    setup = function(self)
+      if self.opts and self.opts.stream then
+        self.parameters.stream = true
+      end
+
+      return true
+    end,
+
+    --- Use the OpenAI adapter for the bulk of the work
+    form_parameters = function(self, params, messages)
+      return openai.handlers.form_parameters(self, params, messages)
+    end,
+    form_messages = function(self, messages)
+      return openai.handlers.form_messages(self, messages)
+    end,
+    chat_output = function(self, data)
+      return openai.handlers.chat_output(self, data)
+    end,
+    inline_output = function(self, data, context)
+      return openai.handlers.inline_output(self, data, context)
+    end,
+    on_exit = function(self, data)
+      return openai.handlers.on_exit(self, data)
+    end,
+  },
+  schema = {
+    model = {
+      order = 1,
+      mapping = "parameters",
+      type = "enum",
+      desc = "ID of the model to use from Hugging Face.",
+      default = "Qwen/Qwen2.5-72B-Instruct",
+      choices = {
+        "meta-llama/Llama-3.2-1B-Instruct",
+        "Qwen/Qwen2.5-72B-Instruct",
+        "google/gemma-2-2b-it",
+        "mistralai/Mistral-Nemo-Instruct-2407",
+      },
+    },
+    temperature = {
+      order = 2,
+      mapping = "parameters",
+      type = "number",
+      optional = true,
+      default = 0.5,
+      desc = "What sampling temperature to use, between 0 and 2.",
+      validate = function(n)
+        return n >= 0 and n <= 2, "Must be between 0 and 2"
+      end,
+    },
+    max_tokens = {
+      order = 3,
+      mapping = "parameters",
+      type = "integer",
+      optional = true,
+      default = 2048,
+      desc = "The maximum number of tokens to generate.",
+      validate = function(n)
+        return n > 0, "Must be greater than 0"
+      end,
+    },
+    top_p = {
+      order = 4,
+      mapping = "parameters",
+      type = "number",
+      optional = true,
+      default = 0.7,
+      desc = "Nucleus sampling parameter.",
+      validate = function(n)
+        return n >= 0 and n <= 1, "Must be between 0 and 1"
+      end,
+    },
+    -- caveat to using the cache: https://huggingface.co/docs/api-inference/parameters#caching
+    ["x-use-cache"] = {
+      order = 5,
+      mapping = "headers",
+      type = "string",
+      optional = true,
+      default = "true",
+      desc = "Whether to use the cache layer on the inference API...",
+      choices = { "true", "false" },
+    },
+    ["x-wait-for-model"] = {
+      order = 6,
+      mapping = "headers",
+      type = "string",
+      optional = true,
+      default = "false",
+      desc = "Whether to wait for the model to be loaded...",
+      choices = { "true", "false" },
+    },
+  },
+}
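
The adapter above could be selected from a user's config along the following lines. This is a minimal sketch and not part of the commit: it assumes the plugin's usual `setup()` and `require("codecompanion.adapters").extend()` entry points, that the key is exported in the shell as `HUGGINGFACE_API_KEY`, and it overrides the default model with another entry from the `choices` list purely for illustration.

```lua
-- Hypothetical user configuration (e.g. in init.lua); a sketch, not part of this commit.
require("codecompanion").setup({
  strategies = {
    -- Use the new adapter for both the chat buffer and inline edits
    chat = { adapter = "huggingface" },
    inline = { adapter = "huggingface" },
  },
  adapters = {
    huggingface = function()
      return require("codecompanion.adapters").extend("huggingface", {
        env = {
          -- Assumption: name of the environment variable holding the API key
          api_key = "HUGGINGFACE_API_KEY",
        },
        schema = {
          model = {
            -- Any entry from the adapter's `choices` list; illustrative only
            default = "mistralai/Mistral-Nemo-Instruct-2407",
          },
        },
      })
    end,
  },
})
```

With this in place, the `${model}` placeholder in the adapter's `url` would resolve to the overridden default, since `env.model` points at `schema.model.default`.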