Final Project for the Natural Language Processing course

matthjs/nlp-final-project


Natural Language Processing - Final Project

About The Project

This project is about Retriever-Extractor based Question Answering.
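As a rough illustration of the retriever-extractor idea, the toy sketch below first retrieves the most relevant document for a question and then extracts the best-matching sentence from it. It uses plain token overlap in place of the Transformer models this project relies on; the function names (`retrieve`, `extract`, `overlap`) and the sample documents are hypothetical and are not taken from the project's code.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap(query_tokens, text):
    """Count how often the distinct query tokens occur in the text."""
    counts = Counter(tokenize(text))
    return sum(counts[token] for token in set(query_tokens))


def retrieve(question, documents, top_k=1):
    """Retriever stand-in: rank documents by token overlap with the question."""
    query_tokens = tokenize(question)
    ranked = sorted(documents, key=lambda doc: overlap(query_tokens, doc), reverse=True)
    return ranked[:top_k]


def extract(question, passage):
    """Extractor stand-in: return the sentence that best matches the question."""
    query_tokens = tokenize(question)
    sentences = re.split(r"(?<=[.!?])\s+", passage.strip())
    return max(sentences, key=lambda sentence: overlap(query_tokens, sentence))


# Hypothetical two-document collection, for illustration only.
docs = [
    "Paris is the capital of France. It lies on the Seine.",
    "Berlin is the capital of Germany. It lies on the Spree.",
]
question = "What is the capital of Germany?"
passage = retrieve(question, docs)[0]
answer = extract(question, passage)
print(answer)
```

In a real system the retriever and the extractor would each be a trained model; the two-stage structure is the only part this sketch is meant to convey.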

Getting started

Prerequisites

Running

Using Docker: run the Docker Compose files to start all relevant services (docker compose up, or docker compose up --build to rebuild the images first).

You can also set up a virtual environment using Poetry. Poetry can be installed using pip:

pip install poetry

Then create the virtual environment and install the required dependencies (see poetry.lock and pyproject.toml):

poetry config virtualenvs.in-project true    # ensures virtual environment is in project
poetry install

The virtual environment can be accessed from the shell using:

poetry shell

IDEs such as PyCharm can detect the interpreter of this virtual environment.

Models

All Transformer models used in this project are hosted on Hugging Face; the code downloads them automatically.
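For reference, this is roughly what an automatic download looks like with the Hugging Face `transformers` library: `from_pretrained` fetches the weights on first use and reuses the local cache afterwards. This is a minimal sketch, not the project's loading code. The helper name `load_reader` is hypothetical, and the checkpoint `distilbert-base-cased-distilled-squad` is only a publicly available example, not necessarily a model used here.

```python
# Example checkpoint only; the project's actual models may differ (assumption).
DEFAULT_READER = "distilbert-base-cased-distilled-squad"


def load_reader(model_name: str = DEFAULT_READER):
    """Load a tokenizer and an extractive QA model, downloading them if needed."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    return tokenizer, model
```

Calling `load_reader()` requires network access the first time; later calls read from the local Hugging Face cache.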

Usage

The project is run through the main.py script, which accepts the following command-line arguments:

  1. Perform inference:

    python main.py inference

    To perform inference on a specific document collection:

    python main.py inference <hugging_face_dataset>

  2. Train the baseline reader model and the primary reader model on the NewsQA train set:

    python main.py train

  3. Evaluate the reader models (baseline, primary) on the NewsQA evaluation set:

    python main.py eval

  4. Evaluate the reader models (baseline, primary) on out-of-distribution test sets:

    python main.py test
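The four modes listed above could be parsed with `argparse` subcommands along the following lines. This is a sketch of one possible command-line layout, not the project's actual argument handling: `build_parser`, the `mode` destination, and the `dataset` argument name are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per mode (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        description="Retriever-extractor question answering"
    )
    modes = parser.add_subparsers(dest="mode", required=True)

    inference = modes.add_parser("inference", help="perform inference")
    # Optional positional: a Hugging Face dataset to use as the document collection.
    inference.add_argument(
        "dataset",
        nargs="?",
        default=None,
        help="Hugging Face dataset name (optional)",
    )

    modes.add_parser("train", help="train the baseline and primary reader models")
    modes.add_parser("eval", help="evaluate the readers on the NewsQA evaluation set")
    modes.add_parser("test", help="evaluate the readers on out-of-distribution test sets")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["inference", "squad"])
print(args.mode, args.dataset)
```

With this layout, omitting the dataset leaves `args.dataset` as `None`, which the caller can treat as "use the default document collection".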
