Semantic Router is an open-source framework designed to serve as a high-speed decision-making layer for Language Models (LLMs) and intelligent agents. This innovative router uses the semantic meaning of queries and documents to make real-time routing decisions, significantly enhancing the efficiency and relevance of information retrieval.
This tutorial will leverage semantic vector space to determines which documents from a database should be retrieved in response to a given query, whether it’s multiple documents, a single document, or none at all.
pip install semantic-router
Example is running with LlamaCpp Python and Instructor. You can running with other framework LLMs, more info in here.
pip install instructor
Install llama cpp python with CUDA.
To install with CUDA support, set the LLAMA_CUDA=on
environment variable before installing:
CMAKE_ARGS="-DLLAMA_CUDA=on" pip install llama-cpp-python
curl -L -o meta-llama-3-8b-instruct.Q8_0.gguf https://huggingface.co/SanctumAI/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/meta-llama-3-8b-instruct.Q8_0.gguf