This project is about Retriever-Extractor based Question Answering.
- Docker v4.25 or higher (only needed if you run the Docker containers).
- Poetry.
Using Docker: run the Docker Compose files to start all relevant services (docker compose up, or docker compose up --build to rebuild the images first).
You can also set up a virtual environment using Poetry. Poetry can be installed using pip:
pip install poetry
Then create the virtual environment with the required dependencies (see poetry.lock and pyproject.toml):
poetry config virtualenvs.in-project true # ensures virtual environment is in project
poetry install
The virtual environment can be accessed from the shell using:
poetry shell
IDEs like PyCharm can detect the interpreter of this virtual environment.
All the Transformer models used in this project are downloaded from HuggingFace; the code does this automatically the first time a model is needed.
The project is run through the main.py script. You can give it the following command-line arguments:
- Perform inference:
python script.py inference
To perform inference on a specific document collection:
python script.py inference <hugging_face_dataset>
- Train the baseline reader model and the primary reader model on the NewsQA train set:
python script.py train
- Evaluate the reader models (baseline, primary) on the NewsQA evaluation set:
python script.py eval
- Evaluate the reader models (baseline, primary) on out-of-distribution test sets:
python script.py test
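
For orientation, the four modes above could be dispatched with argparse roughly as follows. This is only a sketch of one possible layout, not the project's actual entry point: the function names build_parser and describe_run, the argument name dataset, and the help strings are all hypothetical, and the real script may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one sub-command per mode (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        description="Retriever-Extractor question answering (illustrative sketch)"
    )
    # dest="mode" stores the chosen sub-command name on the parsed namespace.
    subparsers = parser.add_subparsers(dest="mode", required=True)

    inference = subparsers.add_parser(
        "inference", help="answer questions over a document collection"
    )
    # Optional positional argument: a dataset identifier.
    # None means "use the default document collection" in this sketch.
    inference.add_argument("dataset", nargs="?", default=None)

    subparsers.add_parser("train", help="train the baseline and primary readers")
    subparsers.add_parser("eval", help="evaluate readers on the evaluation set")
    subparsers.add_parser("test", help="evaluate readers on out-of-distribution sets")
    return parser


def describe_run(args: argparse.Namespace) -> str:
    """Return a short description of what the parsed arguments request."""
    if args.mode == "inference":
        target = args.dataset if args.dataset is not None else "default collection"
        return f"inference on {target}"
    return args.mode
```

With this layout, build_parser().parse_args(["inference"]) leaves dataset as None, while passing a second argument stores it as the dataset identifier; an unknown mode makes argparse exit with a usage message.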