
chore: Update extensions and engines docs #4413

Merged: 22 commits, Jan 14, 2025
c619f8c
Merge branch 'dev' of https://github.com/janhq/jan into dev
imtuyethan Jan 7, 2025
867a51c
Merge branch 'dev' of https://github.com/janhq/jan into dev
imtuyethan Jan 7, 2025
a90287b
[WIP] Updating Extensions & Engines guide
imtuyethan Jan 7, 2025
77ebd68
Merge branch 'dev' into chore/update-extensions-and-engines-docs
imtuyethan Jan 7, 2025
c3076f9
Updated Extensions pages
imtuyethan Jan 7, 2025
d4d6f2c
Merge branch 'chore/update-extensions-and-engines-docs' of https://gi…
imtuyethan Jan 7, 2025
a197dc2
Updated Engines pages
imtuyethan Jan 7, 2025
500529b
Updated Engines pages
imtuyethan Jan 8, 2025
59c6c3f
Merge branch 'dev' into chore/update-extensions-and-engines-docs
imtuyethan Jan 8, 2025
bc44c8e
updated images (app versions, remove tensorRT-LLM, remove Cortex sett…
imtuyethan Jan 13, 2025
f46eb87
Merge branch 'chore/update-extensions-and-engines-docs' of https://gi…
imtuyethan Jan 13, 2025
18f99b1
Merge branch 'dev' into chore/update-extensions-and-engines-docs
imtuyethan Jan 13, 2025
ed131f9
Removed unnecessary pages & files
imtuyethan Jan 13, 2025
5457d34
Merge branch 'chore/update-extensions-and-engines-docs' of https://gi…
imtuyethan Jan 13, 2025
1427f29
Merge branch 'dev' into chore/update-extensions-and-engines-docs
imtuyethan Jan 13, 2025
3b019ca
Resolved conflicts
imtuyethan Jan 13, 2025
1ee0095
updated remote engines pages
imtuyethan Jan 13, 2025
0048fe0
Small updates on Settings page
imtuyethan Jan 14, 2025
54aae45
Changed MMAP to mmap
imtuyethan Jan 14, 2025
1cf9e0e
Merge branch 'dev' into chore/update-extensions-and-engines-docs
imtuyethan Jan 14, 2025
389d83e
Updated Install Engines page with correct commands
imtuyethan Jan 14, 2025
d15f6fe
Merge branch 'dev' into chore/update-extensions-and-engines-docs
imtuyethan Jan 14, 2025
Updated Engines pages
imtuyethan committed Jan 8, 2025
commit 500529b41007b59f99bb2c449ecbe77d3f668d58
Binary file modified docs/src/pages/docs/_assets/extensions-09.png
Binary file added docs/src/pages/docs/_assets/tensorrt-llm-01.png
Binary file added docs/src/pages/docs/_assets/tensorrt-llm-02.png
26 changes: 20 additions & 6 deletions docs/src/pages/docs/install-engines.mdx
@@ -32,8 +32,17 @@ To add a new remote engine:

1. Navigate to **Settings** (<Settings width={16} height={16} style={{display:"inline"}}/>) > **Engines**
1. At **Remote Engine** category, click **+ Install Engine**

<br/>
![Install Remote Engines](./_assets/install-engines-01.png)
<br/>

2. Fill in the following required information:

<br/>
![Install Remote Engines](./_assets/install-engines-02.png)
<br/>

| Field | Description | Required |
|-------|-------------|----------|
| Engine Name | Name for your engine (e.g., "OpenAI", "Claude") | ✓ |
@@ -48,7 +57,14 @@ To add a new remote engine:
> - The conversion functions are only needed for providers that don't follow the OpenAI API format. For OpenAI-compatible APIs, you can leave these empty.
> - For OpenAI-compatible APIs like OpenAI, Anthropic, or Groq, you only need to fill in the required fields. Leave optional fields empty.

4. Click **Install** to complete
4. Click **Install**
5. Once installed, you should see your engine on the **Engines** page:
- You can rename or uninstall your engine
- You can open its own settings page

<br/>
![Install Remote Engines](./_assets/install-engines-03.png)
<br/>

### Examples
#### OpenAI-Compatible Setup
@@ -89,6 +105,9 @@ API Key: your_api_key_here
```

**Conversion Functions:**
> - Request: Convert from Jan's OpenAI-style format to your API's format
> - Response: Convert from your API's format back to OpenAI-style format

1. Request Format Conversion:
```javascript
function convertRequest(janRequest) {
  // ... (request conversion body elided in this diff)
}

function convertResponse(apiResponse) {
  // ... (response conversion body elided in this diff)
}
```
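The diff viewer elides the bodies of these functions. As a hedged illustration only, here is a minimal sketch of what such a pair might look like for a hypothetical provider whose API takes a `prompt` string and returns an `output` string — those field names are assumptions for the example, not from any real API:

```javascript
// Sketch: convert between Jan's OpenAI-style chat format and a
// hypothetical provider API. Field names on the provider side
// (`prompt`, `output`) are illustrative assumptions.
function convertRequest(janRequest) {
  return {
    // Flatten OpenAI-style messages into a single prompt string
    prompt: janRequest.messages
      .map((m) => `${m.role}: ${m.content}`)
      .join('\n'),
    max_tokens: janRequest.max_tokens,
    temperature: janRequest.temperature,
  };
}

function convertResponse(apiResponse) {
  // Wrap the provider's plain-text output back into an
  // OpenAI-style choices array
  return {
    choices: [
      {
        message: { role: 'assistant', content: apiResponse.output },
        finish_reason: 'stop',
      },
    ],
  };
}
```

A response converted this way exposes the OpenAI-style `choices[0].message.content` shape that OpenAI-compatible clients read.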

<Callout type="info">
The conversion functions should:
- Request: Convert from Jan's OpenAI-style format to your API's format
- Response: Convert from your API's format back to OpenAI-style format
</Callout>

**Expected Formats:**

91 changes: 74 additions & 17 deletions docs/src/pages/docs/local-engines/llama-cpp.mdx
@@ -57,26 +57,83 @@ Jan offers different backend variants for **llama.cpp** based on your operating
Choose the backend that matches your hardware. Using the wrong variant may cause performance issues or prevent models from loading.
</Callout>

### macOS
- `mac-arm64`: For Apple Silicon Macs (M1/M2/M3)
- `mac-amd64`: For Intel-based Macs

### Windows
- `win-cuda`: For NVIDIA GPUs using CUDA
- `win-cpu`: For CPU-only operation
- `win-directml`: For DirectML acceleration (AMD/Intel GPUs)
- `win-opengl`: For OpenGL acceleration

### Linux
- `linux-cuda`: For NVIDIA GPUs using CUDA
- `linux-cpu`: For CPU-only operation
- `linux-rocm`: For AMD GPUs using ROCm
- `linux-openvino`: For Intel GPUs/NPUs using OpenVINO
- `linux-vulkan`: For Vulkan acceleration
<Tabs items={['Windows', 'Linux', 'macOS']}>

<Tabs.Tab>
### CUDA Support (NVIDIA GPUs)
- `llama.cpp-avx-cuda-11-7`
- `llama.cpp-avx-cuda-12-0`
- `llama.cpp-avx2-cuda-11-7`
- `llama.cpp-avx2-cuda-12-0`
- `llama.cpp-avx512-cuda-11-7`
- `llama.cpp-avx512-cuda-12-0`
- `llama.cpp-noavx-cuda-11-7`
- `llama.cpp-noavx-cuda-12-0`

### CPU Only
- `llama.cpp-avx`
- `llama.cpp-avx2`
- `llama.cpp-avx512`
- `llama.cpp-noavx`

### Other Accelerators
- `llama.cpp-vulkan`

<Callout type="info">
For detailed hardware compatibility, please visit our guide for [Mac](/docs/desktop/mac#compatibility), [Windows](/docs/desktop/windows#compatibility), and [Linux](docs/desktop/linux).
- For detailed hardware compatibility, please visit our guide for [Windows](/docs/desktop/windows#compatibility).
- AVX, AVX2, and AVX-512 are CPU instruction sets. For best performance, use the most advanced instruction set your CPU supports.
- CUDA versions should match your installed NVIDIA drivers.
</Callout>

</Tabs.Tab>

<Tabs.Tab>
### CUDA Support (NVIDIA GPUs)
- `llama.cpp-avx-cuda-11-7`
- `llama.cpp-avx-cuda-12-0`
- `llama.cpp-avx2-cuda-11-7`
- `llama.cpp-avx2-cuda-12-0`
- `llama.cpp-avx512-cuda-11-7`
- `llama.cpp-avx512-cuda-12-0`
- `llama.cpp-noavx-cuda-11-7`
- `llama.cpp-noavx-cuda-12-0`

### CPU Only
- `llama.cpp-avx`
- `llama.cpp-avx2`
- `llama.cpp-avx512`
- `llama.cpp-noavx`

### Other Accelerators
- `llama.cpp-vulkan`
- `llama.cpp-arm64`

<Callout type="info">
- For detailed hardware compatibility, please visit our guide for [Linux](/docs/desktop/linux).
- AVX, AVX2, and AVX-512 are CPU instruction sets. For best performance, use the most advanced instruction set your CPU supports.
- CUDA versions should match your installed NVIDIA drivers.
</Callout>

</Tabs.Tab>

<Tabs.Tab>
### Apple Silicon
- `llama.cpp-mac-arm64`: For M1/M2/M3 Macs

### Intel
- `llama.cpp-mac-amd64`: For Intel-based Macs

<Callout type="info">
For detailed hardware compatibility, please visit our guide for [Mac](/docs/desktop/mac#compatibility).
</Callout>


</Tabs.Tab>

</Tabs>






66 changes: 25 additions & 41 deletions docs/src/pages/docs/local-engines/tensorrt-llm.mdx
@@ -28,12 +28,9 @@ import { Settings, EllipsisVertical, Plus, FolderOpen, Pencil } from 'lucide-rea
Jan uses **TensorRT-LLM** as an optional engine for faster inference on NVIDIA GPUs. This engine uses [Cortex-TensorRT-LLM](https://github.com/janhq/cortex.tensorrt-llm), which includes an efficient C++ server that executes the [TRT-LLM C++ runtime](https://nvidia.github.io/TensorRT-LLM/gpt_runtime.html) natively. It also includes features and performance improvements like OpenAI compatibility, tokenizer improvements, and queues.

<Callout type="info">
Currently only available for **Windows** users, **Linux** support is coming soon!
The TensorRT-LLM engine is only available for **Windows** users; **Linux** support is coming soon!
</Callout>

You can find its settings in **Settings** (<Settings width={16} height={16} style={{display:"inline"}}/>) > **Local Engine** > **TensorRT-LLM**:



## Requirements
- NVIDIA GPU with Compute Capability 7.0 or higher (RTX 20xx series and above)
@@ -45,59 +42,46 @@ You can find its settings in **Settings** (<Settings width={16} height={16} styl
For detailed setup guide, please visit [Windows](/docs/desktop/windows#compatibility).
</Callout>

## Engine Version and Updates
- **Engine Version**: View current version of TensorRT-LLM engine
- **Check Updates**: Check whether a newer engine version is available and install it when it is

## Available Backends

TensorRT-LLM is specifically designed for NVIDIA GPUs. Available backends include:

**Windows**
- `win-cuda`: For NVIDIA GPUs with CUDA support

<Callout type="warning">
TensorRT-LLM requires an NVIDIA GPU with CUDA support. It is not compatible with other GPU types or CPU-only systems.
</Callout>



## Enable TensorRT-LLM

<Steps>
### Step 1: Install TensorRT-Extension
### Step 1: Install Additional Dependencies
1. Navigate to **Settings** (<Settings width={16} height={16} style={{display:"inline"}}/>) > **Local Engine** > **TensorRT-LLM**:
2. At **Additional Dependencies**, click **Install**

1. Click the **Gear Icon (⚙️)** on the bottom left of your screen.
2. Select the **TensorRT-LLM** under the **Model Provider** section.
<br/>
![Click Tensor](../_assets/tensor.png)
<br/>
3. Click **Install** to install the required dependencies to use TensorRT-LLM.
<br/>
![Install Extension](../_assets/install-tensor.png)
![Click Tensor](../_assets/tensorrt-llm-01.png)
<br/>
3. Check that files are correctly downloaded.

3. Verify that files are correctly downloaded:
```bash
ls ~/jan/data/extensions/@janhq/tensorrt-llm-extension/dist/bin
# Your Extension Folder should now include `nitro.exe`, among other artifacts needed to run TRT-LLM

# Your Extension Folder should now include `cortex.exe`, among other artifacts needed to run TRT-LLM
```
4. Restart Jan

### Step 2: Download a Compatible Model
### Step 2: Download Compatible Models

TensorRT-LLM can only run models in `TensorRT` format. These models, aka "TensorRT Engines", are prebuilt for each target OS+GPU architecture.
TensorRT-LLM can only run models in `TensorRT` format. These models, also known as "TensorRT Engines", are prebuilt specifically for each operating system and GPU architecture.

We offer a handful of precompiled models for Ampere and Ada cards that you can immediately download and play with:
We currently offer a selection of precompiled models optimized for NVIDIA Ampere and Ada GPUs that you can use right away:

1. Restart the application and go to the Hub.
2. Look for models with the `TensorRT-LLM` label in the recommended models list > Click **Download**.
1. Go to **Hub**
2. Look for models with the `TensorRT-LLM` label and make sure they're compatible with your hardware
3. Click **Download**

<Callout type='info'>
This step might take some time. 🙏
<Callout type="info">
This download might take some time as TensorRT models are typically large files.
</Callout>

![image](https://hackmd.io/_uploads/rJewrEgRp.png)
<br/>
![Download TensorRT-LLM Model](../_assets/tensorrt-llm-02.png)
<br/>

### Step 3: Start Threads
Once a model is downloaded, start using it in [Threads](/docs/threads).
</Steps>


3. Click **Download** to download the model.

</Steps>