Annif is an automated subject indexing toolkit. It was originally created as a statistical automated indexing tool that used metadata from the Finna.fi discovery interface as a training corpus.
This repo contains a rewritten production version of Annif based on the prototype. It is a work in progress.
Annif requires Python 3.5 or later. Pipenv is used for managing dependencies.
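The two prerequisites above can be checked before installing. The following is a minimal sketch, not part of Annif itself: the function name check_prerequisites and its parameters are hypothetical, and it assumes pipenv is expected to be reachable on PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 5)  # minimum version stated in this README


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    Both parameters are hypothetical hooks added for illustration so the
    check can be exercised without changing the real environment.
    """
    if version_info is None:
        version_info = sys.version_info
    problems = []
    # Compare only (major, minor) against the required minimum.
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python {}.{} or later is required".format(*MIN_PYTHON))
    # shutil.which returns None when the executable is not on PATH.
    if which("pipenv") is None:
        problems.append("pipenv was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```

If the script prints nothing, both prerequisites appear to be satisfied.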
Clone the repository.
Switch into the repository directory.
Install pipenv if you don't have it:
pip install pipenv # or pip3 install pipenv
Install dependencies and download NLTK data:
pipenv install # use --dev if you want to run tests etc.
python -m nltk.downloader punkt
Enter the virtual environment:
pipenv shell
Start up the application:
annif
See Getting Started in the wiki for more details.
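To confirm that the NLTK download step above succeeded, the punkt tokenizer data can be looked up from Python. This is a sketch under stated assumptions: punkt_available is a hypothetical helper, not an Annif function, and it assumes the data is registered under the resource name tokenizers/punkt, which is where the NLTK downloader normally places it.

```python
def punkt_available(find=None):
    """Return True if the NLTK punkt tokenizer data can be located.

    The find parameter is a hypothetical hook for illustration; by default
    the real nltk.data.find is used, which raises LookupError when the
    requested resource is missing.
    """
    if find is None:
        # Imported lazily so the function can be defined without NLTK installed.
        from nltk.data import find
    try:
        find("tokenizers/punkt")
    except LookupError:
        return False
    return True


if __name__ == "__main__":
    print("punkt data found" if punkt_available() else "punkt data missing")
```

Run it inside the virtual environment so that the NLTK installed by pipenv is the one being queried.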
Run pipenv shell to enter the virtual environment and then run pytest.
To have the test suite watch for changes in code and run automatically, use pytest-watch by running ptw.
The code in this repository is licensed under Apache License 2.0, except for the dependencies included under annif/static/css and annif/static/js, which have their own licenses. See the file headers for details.