This repository contains the code and resources used in the KI Park workshop for building and refining RAG pipelines to analyze transformer specifications. The implementation demonstrates multiple retrieval and generation strategies, progressing from basic similarity search to advanced reranking.
- docs/: Contains the Markdown (.md) file created from the original PDF specifications using docling.
- rag_app/chains/: Includes the implementations for the various RAG chains:
  - basic_qa_chain.py: Basic similarity search-based chain.
  - mmr_qa_chain.py: Chain with Maximal Marginal Relevance (MMR).
  - ensemble_qa_chain.py: Chain combining BM25 and embedding-based retrieval.
  - reranking_qa_chain.py: Advanced chain incorporating MMR, ensemble retrieval, and reranking.
- app_gui_<chain_name>.py: Gradio-based GUIs for each chain, e.g., app_gui_basic.py for the basic chain.
- chunking_embedding.py: Preprocessing script to chunk and embed the Markdown file into Chroma and BM25.
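The chunking step performed by chunking_embedding.py is not shown in this README, so the following is only a minimal sketch of one common approach: fixed-size character windows with overlap. The function name `chunk_text` and the default sizes are assumptions for illustration, not the values used by the workshop script.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the real script may split on Markdown headers
    or tokens instead of raw characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.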
- Install Dependencies
  Install the required Python packages using pip:
  pip install -r requirements.txt
- Prepare the Dataset
  The dataset (PDF) has already been converted to a Markdown file and is located in the docs/ folder.
- Embed the Data
  Run the chunking_embedding.py script to chunk and embed the Markdown file into the Chroma and BM25 stores:
  python chunking_embedding.py
- API Key Setup
  To run the application, you need to set up your Hugging Face API key:
  - Copy the example.env file to .env:
    cp example.env .env
  - Open the .env file and replace <YOUR_HUGGINGFACE_API_KEY> with your Hugging Face API key:
    HF_API_TOKEN=<YOUR_HUGGINGFACE_API_KEY>
  - Save the file.
- Run the Applications
  Run any of the GUI applications for side-by-side comparisons of the different chains. Each app runs on a different port:
  python app_gui_basic.py      # Port 7861
  python app_gui_mmr.py        # Port 7862
  python app_gui_ensemble.py   # Port 7863
  python app_gui_reranker.py   # Port 7864
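Before launching a GUI, it can help to confirm that the token from the API Key Setup step is actually readable. The sketch below uses only the standard library and a deliberately simple KEY=VALUE parser; the helper names `read_env_file` and `get_hf_token` are hypothetical, and the workshop apps may load the .env file differently (for example with a dotenv package).

```python
import os
from pathlib import Path

PLACEHOLDER = "<YOUR_HUGGINGFACE_API_KEY>"


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hf_token(path: str = ".env") -> str:
    """Return HF_API_TOKEN from the environment or the .env file."""
    token = os.environ.get("HF_API_TOKEN") or read_env_file(path).get("HF_API_TOKEN", "")
    if not token or token == PLACEHOLDER:
        raise RuntimeError("HF_API_TOKEN is missing or still set to the placeholder")
    return token
```

Calling `get_hf_token()` from the repository root raises early with a readable message instead of failing later inside a model call.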
- Basic Chain:
  - Implements simple similarity search-based retrieval.
  - Found in rag_app/chains/basic_qa_chain.py.
- MMR Chain:
  - Adds Maximal Marginal Relevance (MMR) for diverse retrieval.
  - Found in rag_app/chains/mmr_qa_chain.py.
- Ensemble Chain:
  - Combines BM25 and embedding-based retrieval for hybrid results.
  - Found in rag_app/chains/ensemble_qa_chain.py.
- Reranker Chain:
  - Uses MMR and ensemble retrieval, with reranking for improved precision.
  - Found in rag_app/chains/reranking_qa_chain.py.
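To make the difference between plain similarity search and MMR concrete, here is a small pure-Python sketch of greedy MMR selection over toy vectors. It is not the code in mmr_qa_chain.py; the function names and the `lambda_mult` parameter name are assumptions chosen for illustration. With `lambda_mult=1.0` the selection reduces to ordinary similarity ranking.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Greedily pick k document indices, trading relevance against redundancy."""
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    candidates = list(range(len(doc_vecs)))
    selected: list[int] = []

    while candidates and len(selected) < k:
        def mmr_score(i: int) -> float:
            # Redundancy = highest similarity to anything already selected.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    query = [1.0, 1.0, 0.0]
    docs = [
        [1.0, 0.9, 0.0],  # most relevant
        [1.0, 0.8, 0.0],  # near-duplicate of the first
        [0.0, 1.0, 0.3],  # less relevant but different
    ]
    print(mmr_select(query, docs, k=2, lambda_mult=1.0))  # relevance only
    print(mmr_select(query, docs, k=2, lambda_mult=0.5))  # balanced
```

In the toy data, the relevance-only run keeps both near-duplicates, while the balanced run swaps the second one for the more distinct document.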
- Flexible Retrieval Strategies: Experiment with similarity search, MMR, hybrid retrieval, and reranking.
- Gradio Integration: User-friendly GUIs for each chain.
- Scalable Preprocessing: Efficient document embedding with Chroma and BM25.
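Hybrid retrieval and reranking can be sketched without any retrieval library. The example below merges two ranked lists with weighted reciprocal rank fusion, one common way to combine keyword and embedding results, and then reorders the merged candidates with a caller-supplied scoring function that stands in for a cross-encoder. The function names, the constant `c=60`, and the toy scores are illustrative assumptions, not the repository's implementation.

```python
from collections import defaultdict
from typing import Callable


def weighted_rrf(rankings: list[list[str]], weights: list[float], c: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list contribute more.
            scores[doc_id] += weight / (c + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


def rerank(query: str, doc_ids: list[str],
           score_fn: Callable[[str, str], float], top_n: int = 3) -> list[str]:
    """Reorder candidates by score_fn(query, doc_id) and keep the top_n."""
    return sorted(doc_ids, key=lambda d: score_fn(query, d), reverse=True)[:top_n]


if __name__ == "__main__":
    bm25_hits = ["a", "b", "c"]    # hypothetical keyword results
    dense_hits = ["c", "a", "d"]   # hypothetical embedding results
    fused = weighted_rrf([bm25_hits, dense_hits], weights=[0.5, 0.5])
    print(fused)

    # Stand-in for a cross-encoder: fixed toy scores per document ID.
    toy_scores = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.2}
    print(rerank("example query", fused, lambda q, d: toy_scores[d], top_n=2))
```

Fusion works on ranks rather than raw scores, which sidesteps the fact that BM25 scores and embedding similarities are not on a comparable scale.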
- Process the Markdown file:
python chunking_embedding.py
- Start the RAG GUI:
python app_gui_mmr.py
- Query the system to explore transformer specifications interactively.
Each Gradio app GUI is configured to run on a unique port, allowing you to run multiple chains in parallel for easier comparison:
- app_gui_basic.py: Runs on port 7861.
- app_gui_mmr.py: Runs on port 7862.
- app_gui_ensemble.py: Runs on port 7863.
- app_gui_reranker.py: Runs on port 7864.
To run all GUIs simultaneously:
python app_gui_basic.py &
python app_gui_mmr.py &
python app_gui_ensemble.py &
python app_gui_reranker.py &
You can then access each GUI at localhost:<port>.
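On systems where the shell `&` syntax is unavailable, the same effect can be approximated from Python. This is a sketch only: the script names and ports come from the list above, while the helper names `build_commands` and `launch_all` are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Script name -> port, as listed in this README.
APPS = {
    "app_gui_basic.py": 7861,
    "app_gui_mmr.py": 7862,
    "app_gui_ensemble.py": 7863,
    "app_gui_reranker.py": 7864,
}


def build_commands(python: str = sys.executable) -> list[list[str]]:
    """Return one command line per GUI script."""
    return [[python, script] for script in APPS]


def launch_all() -> None:
    """Start every GUI as a child process and wait until they exit."""
    processes = [subprocess.Popen(cmd) for cmd in build_commands()]
    for script, port in APPS.items():
        print(f"{script} -> http://localhost:{port}")
    try:
        for proc in processes:
            proc.wait()
    except KeyboardInterrupt:
        # Ctrl+C stops all GUIs together.
        for proc in processes:
            proc.terminate()
```

Calling `launch_all()` from the repository root starts all four apps; pressing Ctrl+C terminates them together rather than leaving background jobs behind.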
During execution, you might encounter the following warnings. These are due to our choice to maintain compatibility with older versions for simplicity:
- LangChainDeprecationWarning:
  - Warning from ConversationBufferMemory in LangChain.
  - Migration guide: LangChain Migration Guide.
- Gradio UserWarning:
  - Warning about type='tuples' in the Gradio chatbot component.
  - This is related to upcoming Gradio updates that recommend using type='messages'.
These warnings are non-critical and do not affect the functionality of the pipelines. Updating to newer versions may introduce additional complexity not covered in this workshop.
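If the warnings clutter the console during the workshop, they can be filtered with Python's standard `warnings` module. The sketch below is hedged in two ways: it assumes the LangChain warning class derives from `DeprecationWarning`, and the message patterns are placeholder regular expressions that should be replaced with a distinctive phrase copied from the warning text you actually see. The names `WORKSHOP_PATTERNS` and `silence_warnings` are hypothetical.

```python
import warnings

# (category, message regex) pairs. The regexes are assumptions; adjust
# them to match the wording printed on your machine.
WORKSHOP_PATTERNS = [
    (DeprecationWarning, r".*migrat"),  # assumed LangChain memory notice
    (UserWarning, r".*tuples"),         # assumed Gradio chatbot notice
]


def silence_warnings(patterns=WORKSHOP_PATTERNS) -> None:
    """Ignore only warnings whose category and message match a pattern."""
    for category, message_regex in patterns:
        # The regex is matched case-insensitively from the start of the
        # message, hence the leading ".*".
        warnings.filterwarnings("ignore", message=message_regex, category=category)


if __name__ == "__main__":
    silence_warnings()
    warnings.warn("Defaulting to the 'tuples' format", UserWarning)  # hidden
    warnings.warn("An unrelated warning", UserWarning)               # still shown
```

Filtering by message rather than by category alone keeps unrelated deprecation notices visible.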
- The Markdown file in the docs/ folder was generated using Docling, an open-source project for processing documents.
- This project leverages the following libraries and tools:
  - Gradio – Build simple and interactive user interfaces.
  - Hugging Face – Embeddings, models, and cross-encoder reranking.
  - rank_bm25 – Keyword-based retrieval using the BM25 algorithm.
  - LangChain – Develop RAG pipelines and chain integrations.