Description
At the moment, we use setuptools and pip to manage dependencies, which are defined in setup.py. In the past we used Pipenv, but it was abandoned in PR #405 after many problems. Poetry is very similar to Pipenv, and one of its core features is that it manages virtual environments.
Here are some of the potential benefits of using Poetry for Annif:
- Simplify development installation (hopefully). For example, Poetry handles the creation of virtual environments, which currently has to be done manually.
- Move package metadata to a standardized (PEP 518) `pyproject.toml` file, instead of the current executable `setup.py`.
- Make it easier to manage dependencies and discover which libraries need updating. Poetry provides commands like `poetry update` (update dependencies to latest versions within the current constraints), `poetry show --outdated` (show available versions of dependencies, also outside the current constraints) and `poetry show --tree` (show the dependencies as a tree). These are a hassle in the current setup and/or require installing external tools.
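To make the second point more concrete, here is a rough sketch of what a Poetry-style `pyproject.toml` could look like. The version number, description, author, Python range, and the listed dependencies with their constraints are all placeholders for illustration, not Annif's actual metadata, and the dependency group syntax assumes Poetry 1.2 or later.

```toml
# Hypothetical sketch only: every value below is a placeholder.
[tool.poetry]
name = "annif"
version = "0.0.0"                      # placeholder version
description = "Placeholder description"
authors = ["Example Author <author@example.org>"]

[tool.poetry.dependencies]
python = ">=3.8,<3.12"                 # assumed range of supported Python versions
flask = "^2.0"                         # illustrative dependency and constraint

# Development-only dependencies (group syntax, Poetry 1.2+)
[tool.poetry.group.dev.dependencies]
pytest = "*"

# PEP 518 build-system table telling build frontends to use Poetry's backend
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
```

With a file like this in place, `poetry install` would create the virtual environment and install the declared dependencies into it.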
There are of course potential downsides as well - the change could be disruptive to developers, who need to learn new habits. CI jobs and documentation (mainly the top level README) need to be updated. There's always a risk of unwanted side effects and unexpected problems.
One detail: Poetry documentation recommends that the `poetry.lock` file should be stored under version control. However, I don't think that's a good idea for Annif, for several reasons: we support multiple Python versions in parallel (and potentially multiple platforms), which may lead to different versions of packages being installed in different situations; a committed lock file is a recipe for merge conflicts; etc. So I think we should put `poetry.lock` in `.gitignore` instead and try to ensure that declared dependencies are strict enough. See also this SO answer with similar arguments.
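As a sketch of what "strict enough" could mean without a committed lock file, the fragment below uses Poetry's tilde constraints and its multiple-constraints syntax for different Python versions. The package names `examplelib` and `otherlib` and all version numbers are hypothetical, chosen only to show the syntax.

```toml
# Hypothetical fragment: package names and versions are placeholders.
[tool.poetry.dependencies]
python = ">=3.8,<3.12"

# Tilde constraint: allows patch updates only (>=1.4.2,<1.5.0)
examplelib = "~1.4.2"

# Different constraints depending on the Python version in use
otherlib = [
    { version = "~2.1", python = "<3.10" },
    { version = "~2.3", python = ">=3.10" },
]
```

Constraints this narrow limit how far the installed versions can drift between environments, at the cost of more frequent manual bumps in `pyproject.toml`.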