Make vocabularies multilingual #600
Codecov Report
@@ Coverage Diff @@
## master #600 +/- ##
==========================================
- Coverage 99.54% 99.52% -0.02%
==========================================
Files 86 86
Lines 5653 5695 +42
==========================================
+ Hits 5627 5668 +41
- Misses 26 27 +1
Force-pushed from b1aa810 to 4d78bb5.
Rebased on current master (which now contains PR #597, the starting point of this branch) and force-pushed.
Kudos, SonarCloud Quality Gate passed! 0 Bugs. No Coverage information.
This is more or less done now, pending review. I tried to fix the issues reported by QA tools. Code Climate still complains about load_vocabulary, but I can't figure out how to make it better. Codecov says there's one more missed line than before in annif.vocab, but I can't find it in the detailed report.
This pull request introduces 1 alert when merging b258873 into b6a1363 - view on LGTM.com new alerts:
LGTM
Thanks for reviewing, @juhoinkinen! I will merge this now. We still need to do more testing before the next release, but there are other related changes to the vocabulary functionality coming up (possibly e.g. #602), so it makes sense to test them all in one go.
This PR implements #559, making vocabularies multilingual, so that there is no need for separate language-specific vocabulary IDs such as `yso-fi`, `yso-sv` and `yso-en`. Instead, the vocabulary ID `yso` can be used for all projects, and the `loadvoc` command needs to be executed just once: it detects which languages are available in the vocabulary and loads the labels in all of them.

It's also possible to define or override the language of labels, for example to use `vocab=lcsh(en)` in a Finnish language project.

The changes need to be carefully tested as they are quite disruptive. Documentation (including the Annif tutorial) should be updated, mainly by stripping language suffixes from vocabulary IDs in examples. However, old examples (vocabulary IDs with a language suffix) should still keep working.
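To illustrate the `vocab=lcsh(en)` syntax described above, here is a minimal sketch of how such a value could be split into a vocabulary ID and a label language, falling back to the project language when no override is given. The function name `parse_vocab_spec` and the regular expression are hypothetical and are not taken from the Annif code base. The sketch also makes no attempt to handle legacy suffixed IDs such as `yso-fi`.

```python
import re
from typing import Tuple

# Hypothetical pattern (an assumption, not Annif's actual parser):
# a vocabulary ID optionally followed by a language code in parentheses.
_VOCAB_SPEC = re.compile(r"^(?P<id>[\w-]+)(?:\((?P<lang>[\w-]+)\))?$")


def parse_vocab_spec(spec: str, project_language: str) -> Tuple[str, str]:
    """Split a spec such as 'lcsh(en)' into (vocab_id, label_language).

    If the spec carries no explicit language, the project language is used.
    """
    match = _VOCAB_SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"Invalid vocabulary specification: {spec!r}")
    # An explicit "(lang)" overrides the project's own language.
    language = match.group("lang") or project_language
    return match.group("id"), language


# A Finnish project using the shared multilingual vocabulary ID.
print(parse_vocab_spec("yso", "fi"))
# A Finnish project that overrides the label language to English.
print(parse_vocab_spec("lcsh(en)", "fi"))
```

Under these assumptions, a plain `yso` resolves to the project's own language, while `lcsh(en)` keeps the ID `lcsh` and selects English labels regardless of the project language.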