RAG Workshop: Transformer Specification Analysis

This repository contains the code and resources used in the KI Park workshop for building and refining RAG pipelines to analyze transformer specifications. The implementation demonstrates multiple retrieval and generation strategies, progressing from basic similarity search to advanced reranking.

📂 Project Structure

- docs/: The Markdown (.md) file generated from the original PDF specifications with docling.
- rag_app/chains/: Implementations of the RAG chains:
  - basic_qa_chain.py: Basic similarity search-based chain.
  - mmr_qa_chain.py: Chain with Maximal Marginal Relevance (MMR).
  - ensemble_qa_chain.py: Chain combining BM25 and embedding-based retrieval.
  - reranking_qa_chain.py: Advanced chain incorporating MMR, ensemble retrieval, and reranking.
- app_gui_<chain_name>.py: Gradio-based GUIs, one per chain, e.g., app_gui_basic.py for the basic chain.
- chunking_embedding.py: Preprocessing script that chunks the Markdown file and indexes it in Chroma and BM25.

🛠️ Setup Instructions

Install Dependencies

Install the required Python packages using pip:
```
pip install -r requirements.txt
```

Prepare the Dataset

The source PDF has already been converted to a Markdown file, which is located in the docs/ folder. No further preparation is needed.

Embed the Data

Run the chunking_embedding.py script to chunk the Markdown file and index it in the Chroma and BM25 stores:
```
python chunking_embedding.py
```
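
The chunking step itself is not described in this README. As a rough illustration only, a fixed-size splitter with overlap could look like the sketch below. The function name `chunk_text` and the default sizes are hypothetical and are not taken from chunking_embedding.py, which may well split on Markdown headings or tokens instead.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.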

API Key Setup

To run the application, you need to set up your Hugging Face API key:
- Copy the env.example file to .env:
```
cp env.example .env
```
- Open the .env file and replace <YOUR_HUGGINGFACE_API_KEY> with your Hugging Face API key:
```
HF_API_TOKEN=<YOUR_HUGGINGFACE_API_KEY>
```
- Save the file.
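
How the applications read the key is not shown in this README. The sketch below is one possible way to check that HF_API_TOKEN is available before starting a GUI, using only the standard library. The helper names `load_env_file` and `get_hf_token` are hypothetical; the repository may rely on a dedicated dotenv package instead.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines from an env file, skipping blanks and comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hf_token(path: str = ".env") -> str:
    """Return HF_API_TOKEN from the environment, falling back to the env file."""
    token = os.environ.get("HF_API_TOKEN")
    if not token and Path(path).exists():
        token = load_env_file(path).get("HF_API_TOKEN")
    # A value starting with "<" means the placeholder was never replaced.
    if not token or token.startswith("<"):
        raise RuntimeError("HF_API_TOKEN is missing or still set to the placeholder")
    return token
```

Failing early with a clear message is usually easier to diagnose than an authentication error raised later by the model client.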

Run the Applications

Run any of the GUI applications for side-by-side comparisons of the different chains. Each app will run on a different port:

python app_gui_basic.py    # Port 7861
python app_gui_mmr.py      # Port 7862
python app_gui_ensemble.py # Port 7863
python app_gui_rerank.py   # Port 7864

🚀 Chains Overview

Basic Chain:
- Implements simple similarity search-based retrieval.
- Found in rag_app/chains/basic_qa_chain.py.
MMR Chain:
- Adds Maximal Marginal Relevance (MMR) for diverse retrieval.
- Found in rag_app/chains/mmr_qa_chain.py.
Ensemble Chain:
- Combines BM25 and embedding-based retrieval for hybrid results.
- Found in rag_app/chains/ensemble_qa_chain.py.
Reranker Chain:
- Uses MMR and ensemble retrieval, with reranking for improved precision.
- Found in rag_app/chains/reranking_qa_chain.py.
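
To make the difference between plain similarity search and MMR concrete, here is a minimal, dependency-free sketch of MMR selection over toy vectors. It is an illustration of the general technique, not the code in mmr_qa_chain.py; the names `cosine`, `mmr_select`, and `lambda_mult` are assumptions chosen for this example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query, docs, k=2, lambda_mult=0.5):
    """Return the indices of k documents chosen by Maximal Marginal Relevance.

    Each step picks the candidate that maximizes
    lambda_mult * sim(query, doc) - (1 - lambda_mult) * max sim(doc, selected).
    """
    selected = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, docs[i])
            redundancy = max((cosine(docs[i], docs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    query = [1.0, 1.0]
    docs = [[1.0, 0.9], [1.0, 0.8], [0.0, 1.0]]  # docs 0 and 1 are near-duplicates
    print(mmr_select(query, docs, k=2, lambda_mult=1.0))  # relevance only
    print(mmr_select(query, docs, k=2, lambda_mult=0.5))  # relevance and diversity
```

With `lambda_mult=1.0` the selection reduces to plain similarity ranking and returns the two near-duplicates; lowering it penalizes candidates that resemble documents already chosen, so the second pick shifts to the less similar document.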

✨ Key Features

- Flexible Retrieval Strategies: Experiment with similarity search, MMR, hybrid retrieval, and reranking.
- Gradio Integration: User-friendly GUIs for each chain.
- Scalable Preprocessing: Efficient document embedding with Chroma and BM25.
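
Hybrid retrieval has to merge two ranked lists, one from BM25 and one from the embedding search, whose raw scores are not directly comparable. One common way to do this is weighted reciprocal rank fusion, sketched below. This is an assumption about how such a merge can work, not a description of ensemble_qa_chain.py; the function name and the constant `c=60` are illustrative choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, weights=None, c=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives weight / (c + rank) from every list it appears in,
    so only rank positions matter and score scales never need to match.
    """
    weights = weights or [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (c + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    bm25_hits = ["a", "b", "c"]    # hypothetical keyword results
    dense_hits = ["b", "c", "d"]   # hypothetical embedding results
    print(reciprocal_rank_fusion([bm25_hits, dense_hits]))
```

Documents found by both retrievers rise to the top, while documents found by only one still appear further down the fused list.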

📝 Usage Example

Process the Markdown file:
```
python chunking_embedding.py
```
Start the RAG GUI:
```
python app_gui_mmr.py
```
Query the system to explore transformer specifications interactively.

🛠 Additional Details

Parallel Execution for Side-by-Side Comparison

Each Gradio app GUI is configured to run on a unique port, allowing you to run multiple chains in parallel for easier comparison:

app_gui_basic.py: Runs on port 7861.
app_gui_mmr.py: Runs on port 7862.
app_gui_ensemble.py: Runs on port 7863.
app_gui_rerank.py: Runs on port 7864.

To run all GUIs simultaneously:

python app_gui_basic.py &
python app_gui_mmr.py &
python app_gui_ensemble.py &
python app_gui_rerank.py &

You can then access each GUI in your browser at http://localhost:<port>, for example http://localhost:7861 for the basic chain.
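
If one of the GUIs fails to start, the port may already be taken by another process. The sketch below is an optional pre-flight check using only the standard library; `GUI_PORTS` and `port_is_free` are hypothetical names introduced here, with the port numbers copied from the list above.

```python
import socket

# Ports from the list above, keyed by chain name.
GUI_PORTS = {"basic": 7861, "mmr": 7862, "ensemble": 7863, "reranker": 7864}


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    for chain, port in GUI_PORTS.items():
        state = "free" if port_is_free(port) else "in use"
        print(f"{chain}: port {port} is {state}")
```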

🔔 Known Warnings

During execution, you might encounter the following warnings. They stem from our decision to stay on older library versions to keep the workshop simple:

LangChainDeprecationWarning:
- Warning from ConversationBufferMemory in LangChain.
- Migration guide: LangChain Migration Guide.
Gradio UserWarning:
- Warning about type='tuples' in the Gradio chatbot component.
- This is related to upcoming Gradio updates that recommend using type='messages'.

These warnings are non-critical and do not affect the functionality of the pipelines. Updating to newer versions may introduce additional complexity not covered in this workshop.
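
If the warnings clutter the console during the workshop, they can be hidden with Python's standard `warnings` module. The sketch below filters by message text. The patterns in `NOISY_PATTERNS` are assumptions about the wording of the two warnings, so adjust them to the messages you actually see; `silence_known_warnings` is a hypothetical helper name.

```python
import warnings

# Assumed message fragments; adjust them to the warning text you actually see.
NOISY_PATTERNS = [
    r".*ConversationBufferMemory.*",
    r".*tuples.*",
]


def silence_known_warnings(patterns=NOISY_PATTERNS):
    """Install 'ignore' filters for warnings whose message matches a pattern."""
    for pattern in patterns:
        # The message argument is a regex matched against the start of the text.
        warnings.filterwarnings("ignore", message=pattern)
```

Call `silence_known_warnings()` at the top of a script, before the libraries that emit the warnings are used. Filtering by message rather than by category leaves unrelated warnings visible.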

Credits and References

The markdown file in the docs/ folder was generated using Docling, an open-source project for processing documents.
This project leverages the following libraries and tools:
- Gradio – Build simple and interactive user interfaces.
- Hugging Face – Embeddings, models, and cross-encoder reranking.
- rank_bm25 – Keyword-based retrieval using the BM25 algorithm.
- LangChain – Develop RAG pipelines and chain integrations.
