Skip to content

ID-based RAG FastAPI: Integration with Langchain and PostgreSQL/pgvector

License

Notifications You must be signed in to change notification settings

danny-avila/rag_api

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

34 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ID-based RAG FastAPI

Overview

This project integrates Langchain with FastAPI in an Asynchronous, Scalable manner, providing a framework for document indexing and retrieval, using PostgreSQL/pgvector.

Files are organized into embeddings by file_id. The primary use case is for integration with LibreChat, but this simple API can be used for any ID-based use case.

The main reason to use the ID approach is to work with embeddings on a file-level. This makes for targeted queries when combined with file metadata stored in a database, such as is done by LibreChat.

The API will evolve over time to employ different querying/re-ranking methods, embedding models, and vector stores.

Features

  • Document Management: Methods for adding, retrieving, and deleting documents.
  • Vector Store: Utilizes Langchain's vector store for efficient document retrieval.
  • Asynchronous Support: Offers async operations for enhanced performance.

Setup

Getting Started

  • Configure .env file based on section below
  • Setup pgvector database:
    • Run an existing PSQL/PGVector setup, or,
    • Docker: docker compose up (also starts RAG API)
      • or, use docker just for DB: docker compose -f ./db-compose.yaml up
  • Run API:
    • Docker: docker compose up (also starts PSQL/pgvector)
      • or, use docker just for RAG API: docker compose -f ./api-compose.yaml up
    • Local:
      • Make sure to setup DB_HOST to the correct database hostname
      • Run the following commands (preferably in a virtual environment)
pip install -r requirements.txt
uvicorn main:app

Environment Variables

The following environment variables are required to run the application:

  • RAG_OPENAI_API_KEY: The API key for OpenAI API Embeddings (if using default settings).

    • Note: OPENAI_API_KEY will work but RAG_OPENAI_API_KEY will override it in order to not conflict with LibreChat setting.
  • RAG_OPENAI_BASEURL: (Optional) The base URL for your OpenAI API Embeddings

  • RAG_OPENAI_PROXY: (Optional) Proxy for OpenAI API Embeddings

  • POSTGRES_DB: (Optional) The name of the PostgreSQL database.

  • POSTGRES_USER: (Optional) The username for connecting to the PostgreSQL database.

  • POSTGRES_PASSWORD: (Optional) The password for connecting to the PostgreSQL database.

  • DB_HOST: (Optional) The hostname or IP address of the PostgreSQL database server.

  • DB_PORT: (Optional) The port number of the PostgreSQL database server.

  • RAG_HOST: (Optional) The hostname or IP address where the API server will run. Defaults to "0.0.0.0"

  • RAG_PORT: (Optional) The port number where the API server will run. Defaults to port 8000.

  • JWT_SECRET: (Optional) The secret key used for verifying JWT tokens for requests.

    • The secret is only used for verification. This basic approach assumes a signed JWT from elsewhere.
    • Omit to run API without requiring authentication
  • COLLECTION_NAME: (Optional) The name of the collection in the vector store. Default value is "testcollection".

  • CHUNK_SIZE: (Optional) The size of the chunks for text processing. Default value is "1500".

  • CHUNK_OVERLAP: (Optional) The overlap between chunks during text processing. Default value is "100".

  • RAG_UPLOAD_DIR: (Optional) The directory where uploaded files are stored. Default value is "./uploads/".

  • PDF_EXTRACT_IMAGES: (Optional) A boolean value indicating whether to extract images from PDF files. Default value is "False".

  • DEBUG_RAG_API: (Optional) Set to "True" to show more verbose logging output in the server console, and to enable postgresql database routes

  • CONSOLE_JSON: (Optional) Set to "True" to log as json for Cloud Logging aggregations

  • EMBEDDINGS_PROVIDER: (Optional) either "openai", "azure", "huggingface", "huggingfacetei" or "ollama", where "huggingface" uses sentence_transformers; defaults to "openai"

  • EMBEDDINGS_MODEL: (Optional) Set a valid embeddings model to use from the configured provider.

    • Defaults
    • openai: "text-embedding-3-small"
    • azure: "text-embedding-3-small" (will be used as your Azure Deployment)
    • huggingface: "sentence-transformers/all-MiniLM-L6-v2"
    • huggingfacetei: "http://huggingfacetei:3000". Hugging Face TEI uses model defined on TEI service launch.
    • ollama: "nomic-embed-text"
  • RAG_AZURE_OPENAI_API_VERSION: (Optional) Default is 2023-05-15. The version of the Azure OpenAI API.

  • RAG_AZURE_OPENAI_API_KEY: (Optional) The API key for Azure OpenAI service.

    • Note: AZURE_OPENAI_API_KEY will work but RAG_AZURE_OPENAI_API_KEY will override it in order to not conflict with LibreChat setting.
  • RAG_AZURE_OPENAI_ENDPOINT: (Optional) The endpoint URL for Azure OpenAI service, including the resource.

    • Example: https://YOUR_RESOURCE_NAME.openai.azure.com.
    • Note: AZURE_OPENAI_ENDPOINT will work but RAG_AZURE_OPENAI_ENDPOINT will override it in order to not conflict with LibreChat setting.
  • HF_TOKEN: (Optional) if needed for huggingface option.

  • OLLAMA_BASE_URL: (Optional) defaults to http://ollama:11434.

Make sure to set these environment variables before running the application. You can set them in a .env file or as system environment variables.

Cloud Installation Settings:

AWS:

Make sure your RDS Postgres instance adheres to this requirement:

The pgvector extension version 0.5.0 is available on database instances in Amazon RDS running PostgreSQL 15.4-R2 and higher, 14.9-R2 and higher, 13.12-R2 and higher, and 12.16-R2 and higher in all applicable AWS Regions, including the AWS GovCloud (US) Regions.

In order to setup RDS Postgres with RAG API, you can follow these steps:

  • Create a RDS Instance/Cluster using the provided AWS Documentation.

  • Login to the RDS Cluster using the Endpoint connection string from the RDS Console or from your IaC Solution output.

  • The login is via the Master User.

  • Create a dedicated database for rag_api: create database rag_api;.

  • Create a dedicated user\role for that database: create role rag;

  • Switch to the database you just created: \c rag_api

  • Enable the Vector extension: create extension vector;

  • Use the documentation provided above to set up the connection string to the RDS Postgres Instance\Cluster.

Notes:

  • Even though you're logging with a Master user, it doesn't have all the super user privileges, that's why we cannot use the command: create role x with superuser;
  • If you do not enable the extension, rag_api service will throw an error that it cannot create the extension due to the note above.

Dev notes:

Installing pre-commit formatter

Run the following commands to install pre-commit formatter, which uses black code formatter:

pip install pre-commit
pre-commit install