This project integrates Langchain with FastAPI in an asynchronous, scalable manner, providing a framework for document indexing and retrieval using PostgreSQL/pgvector.
Files are organized into embeddings by file_id. The primary use case is integration with LibreChat, but this simple API can be used for any ID-based use case.
The main reason for the ID approach is to work with embeddings at the file level. Combined with file metadata stored in a database, as LibreChat does, this allows targeted queries.
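As a rough illustration of the file-level idea, the sketch below tags each chunk embedding with a file_id and restricts a similarity search to one file. The Chunk class, the search_file function, and the toy two-dimensional vectors are hypothetical; the service itself stores embeddings in pgvector through Langchain, and its query path may differ.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Chunk:
    file_id: str            # ID of the file the chunk came from
    text: str
    embedding: list[float]  # toy vector; real embeddings come from a model


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_file(chunks, file_id, query_embedding, k=2):
    """Return the k chunks of a single file most similar to the query."""
    candidates = [c for c in chunks if c.file_id == file_id]
    candidates.sort(key=lambda c: cosine(c.embedding, query_embedding), reverse=True)
    return candidates[:k]


chunks = [
    Chunk("file-a", "intro", [1.0, 0.0]),
    Chunk("file-a", "details", [0.6, 0.8]),
    Chunk("file-b", "unrelated", [1.0, 0.1]),
]
results = search_file(chunks, "file-a", [1.0, 0.0], k=1)
```

Filtering by file_id before ranking keeps chunks from other files out of the results, even when they are close to the query.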
The API will evolve over time to employ different querying/re-ranking methods, embedding models, and vector stores.
- Document Management: Methods for adding, retrieving, and deleting documents.
- Vector Store: Utilizes Langchain's vector store for efficient document retrieval.
- Asynchronous Support: Offers async operations for enhanced performance.
- Configure the .env file based on the section below
- Set up the pgvector database:
  - Run an existing PSQL/PGVector setup, or,
  - Docker: docker compose up (also starts RAG API)
    - or, use docker just for DB: docker compose -f ./db-compose.yaml up
- Run API:
  - Docker: docker compose up (also starts PSQL/pgvector)
    - or, use docker just for RAG API: docker compose -f ./api-compose.yaml up
  - Local:
    - Make sure to set DB_HOST to the correct database hostname
    - Run the following commands (preferably in a virtual environment):
      pip install -r requirements.txt
      uvicorn main:app
The following environment variables are required to run the application:
- RAG_OPENAI_API_KEY: The API key for OpenAI API Embeddings (if using default settings).
  - Note: OPENAI_API_KEY will work, but RAG_OPENAI_API_KEY will override it to avoid conflicting with the LibreChat setting.
- RAG_OPENAI_BASEURL: (Optional) The base URL for your OpenAI API Embeddings
- RAG_OPENAI_PROXY: (Optional) Proxy for OpenAI API Embeddings
- POSTGRES_DB: (Optional) The name of the PostgreSQL database.
- POSTGRES_USER: (Optional) The username for connecting to the PostgreSQL database.
- POSTGRES_PASSWORD: (Optional) The password for connecting to the PostgreSQL database.
- DB_HOST: (Optional) The hostname or IP address of the PostgreSQL database server.
- DB_PORT: (Optional) The port number of the PostgreSQL database server.
- RAG_HOST: (Optional) The hostname or IP address where the API server will run. Defaults to "0.0.0.0"
- RAG_PORT: (Optional) The port number where the API server will run. Defaults to port 8000.
- JWT_SECRET: (Optional) The secret key used for verifying JWT tokens for requests.
  - The secret is only used for verification. This basic approach assumes a signed JWT from elsewhere.
  - Omit to run API without requiring authentication
- COLLECTION_NAME: (Optional) The name of the collection in the vector store. Default value is "testcollection".
- CHUNK_SIZE: (Optional) The size of the chunks for text processing. Default value is "1500".
- CHUNK_OVERLAP: (Optional) The overlap between chunks during text processing. Default value is "100".
- RAG_UPLOAD_DIR: (Optional) The directory where uploaded files are stored. Default value is "./uploads/".
- PDF_EXTRACT_IMAGES: (Optional) A boolean value indicating whether to extract images from PDF files. Default value is "False".
- DEBUG_RAG_API: (Optional) Set to "True" to show more verbose logging output in the server console, and to enable postgresql database routes
- CONSOLE_JSON: (Optional) Set to "True" to log as json for Cloud Logging aggregations
- EMBEDDINGS_PROVIDER: (Optional) either "openai", "azure", "huggingface", "huggingfacetei" or "ollama", where "huggingface" uses sentence_transformers; defaults to "openai"
- EMBEDDINGS_MODEL: (Optional) Set a valid embeddings model to use from the configured provider.
  - Defaults
    - openai: "text-embedding-3-small"
    - azure: "text-embedding-3-small"
    - huggingface: "sentence-transformers/all-MiniLM-L6-v2"
    - huggingfacetei: "http://huggingfacetei:3000". Hugging Face TEI uses model defined on TEI service launch.
    - ollama: "nomic-embed-text"
- RAG_AZURE_OPENAI_API_KEY: (Optional) The API key for Azure OpenAI service.
  - Note: AZURE_OPENAI_API_KEY will work, but RAG_AZURE_OPENAI_API_KEY will override it to avoid conflicting with the LibreChat setting.
- RAG_AZURE_OPENAI_ENDPOINT: (Optional) The endpoint URL for Azure OpenAI service, including the resource.
  - Example: https://example-resource.azure.openai.com
  - Note: AZURE_OPENAI_ENDPOINT will work, but RAG_AZURE_OPENAI_ENDPOINT will override it to avoid conflicting with the LibreChat setting.
- HF_TOKEN: (Optional) if needed for huggingface option.
- OLLAMA_BASE_URL: (Optional) defaults to http://ollama:11434.
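To illustrate how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a minimal character-window sketch using the default values from the list above. The split_text function is hypothetical and only shows the arithmetic; the service's actual splitter may behave differently, for example by preferring natural separators such as paragraph breaks.

```python
def split_text(text, chunk_size=1500, chunk_overlap=100):
    """Split text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


# A 3000-character text with the defaults yields windows starting at 0, 1400, 2800.
chunks = split_text("x" * 3000)
lengths = [len(c) for c in chunks]
```

A larger overlap repeats more context between neighbouring chunks at the cost of storing more embeddings per file.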
Make sure to set these environment variables before running the application. You can set them in a .env file or as system environment variables.
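The override and default rules described above can be sketched as follows. This is an assumption-laden illustration, not the project's configuration code: load_settings and the dictionary keys are hypothetical, only a subset of the variables is covered, and DEBUG_RAG_API is assumed to be off unless set to "True".

```python
import os


def load_settings(env=None):
    """Collect a few settings, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        # The RAG_-prefixed key takes precedence; the plain key is the fallback.
        "openai_api_key": env.get("RAG_OPENAI_API_KEY") or env.get("OPENAI_API_KEY"),
        "host": env.get("RAG_HOST", "0.0.0.0"),
        "port": int(env.get("RAG_PORT", "8000")),
        "collection_name": env.get("COLLECTION_NAME", "testcollection"),
        "chunk_size": int(env.get("CHUNK_SIZE", "1500")),
        "chunk_overlap": int(env.get("CHUNK_OVERLAP", "100")),
        "embeddings_provider": env.get("EMBEDDINGS_PROVIDER", "openai"),
        # Assumption: anything other than "True" (case-insensitive) means off.
        "debug": env.get("DEBUG_RAG_API", "False").lower() == "true",
    }


# Passing a plain dict instead of os.environ makes the precedence easy to see.
settings = load_settings({
    "OPENAI_API_KEY": "sk-shared",
    "RAG_OPENAI_API_KEY": "sk-rag",
    "RAG_PORT": "9000",
})
```

Calling load_settings() with no argument reads from the process environment instead.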
Make sure your RDS Postgres instance adheres to this requirement:
The pgvector extension version 0.5.0 is available on database instances in Amazon RDS running PostgreSQL 15.4-R2 and higher, 14.9-R2 and higher, 13.12-R2 and higher, and 12.16-R2 and higher in all applicable AWS Regions, including the AWS GovCloud (US) Regions.
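The version requirement above can be checked with a small helper. This is a sketch under stated assumptions: supports_pgvector_050 is a hypothetical function, it expects strings in the "15.4-R2" style used above, it treats a missing release suffix as R1, and it assumes major versions newer than 15 also qualify.

```python
import re

# Minimum (minor, release) per major version, taken from the requirement above.
MIN_SUPPORTED = {15: (4, 2), 14: (9, 2), 13: (12, 2), 12: (16, 2)}


def supports_pgvector_050(version):
    """Check an RDS engine version string such as "15.4-R2"."""
    match = re.fullmatch(r"(\d+)\.(\d+)(?:-R(\d+))?", version.strip())
    if not match:
        raise ValueError(f"unrecognized version: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumption: a missing release suffix is treated as R1.
    release = int(match.group(3) or 1)
    if major > max(MIN_SUPPORTED):
        return True  # Assumption: newer major versions also qualify.
    if major not in MIN_SUPPORTED:
        return False
    # Tuples compare element by element: minor first, then release.
    return (minor, release) >= MIN_SUPPORTED[major]


ok = supports_pgvector_050("15.4-R2")
too_old = supports_pgvector_050("14.8-R3")
```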
To set up RDS Postgres with RAG API, follow these steps:
- Create an RDS Instance/Cluster using the provided AWS Documentation.
- Log in to the RDS Cluster using the Endpoint connection string from the RDS Console or from your IaC Solution output.
- The login is via the Master User.
- Create a dedicated database for rag_api: create database rag_api;
- Create a dedicated user/role for that database: create role rag;
- Switch to the database you just created: \c rag_api
- Enable the Vector extension: create extension vector;
- Use the documentation provided above to set up the connection string to the RDS Postgres Instance/Cluster.
Notes:
- Even though you log in as the Master User, it does not have full superuser privileges, which is why the following command cannot be used: create role x with superuser;
- If you do not enable the extension, the rag_api service will throw an error saying it cannot create the extension, for the reason given in the note above.
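For the connection string mentioned in the last step, the sketch below assembles a generic PostgreSQL URL from the database-related environment variables. The build_postgres_url function, the fallback values (rag, rag_api, localhost), and the db.example.com hostname are illustrative assumptions; 5432 is the standard PostgreSQL port, and the exact URL scheme the service expects may differ.

```python
import os
from urllib.parse import quote


def build_postgres_url(env=None):
    """Assemble a postgresql:// URL from the database environment variables."""
    env = os.environ if env is None else env
    # Percent-encode credentials so characters like "@" or "/" stay unambiguous.
    user = quote(env.get("POSTGRES_USER", "rag"), safe="")
    password = quote(env.get("POSTGRES_PASSWORD", ""), safe="")
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_PORT", "5432")
    database = quote(env.get("POSTGRES_DB", "rag_api"), safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


url = build_postgres_url({
    "POSTGRES_USER": "rag",
    "POSTGRES_PASSWORD": "p@ss/word",
    "DB_HOST": "db.example.com",  # hypothetical RDS endpoint
    "DB_PORT": "5432",
    "POSTGRES_DB": "rag_api",
})
```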