Suppress duplicate log messages from subject module #673

Merged
merged 3 commits into from
Feb 13, 2023

Conversation

juhoinkinen

When a project's vocabulary is not up to date with the corpus being used, there can be a very large number of warnings like warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036> that flood the screen.

See an example output
annif train tfidf-en ../Annif-tutorial/data-sets/yso-nlf/docs/train/
Backend tfidf: transforming subject corpus
Backend tfidf: creating vectorizer
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p5843>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p20828>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p8927>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2434>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p7227>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p733>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p7227>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p12534>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2794>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4458>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p656>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4813>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p9223>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p1967>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2794>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p6415>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p6415>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2880>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4008>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2434>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2434>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p1967>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p12534>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p10290>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4458>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p8927>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p6415>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4008>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p24827>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p6415>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p1586>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p484>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p3456>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2794>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2794>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p20071>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2434>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2794>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p1967>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4458>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2880>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p656>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4813>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4008>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4008>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p10589>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p9223>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2880>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2794>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p12107>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2794>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p656>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4813>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p1586>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p484>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p3456>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p24827>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p25715>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2794>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p656>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4813>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p1967>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p17355>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p656>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4813>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4458>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p14299>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2880>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
Backend tfidf: creating similarity index

However, the number of distinct URIs (or labels) that are absent from the vocabulary is usually small. The message is shown every time any of them is encountered in the corpus, so the same warnings are repeated many times. This PR suppresses the duplicate log messages that are raised from the subject.py module.

See the above output with this PR applied
annif train tfidf-en ../Annif-tutorial/data-sets/yso-nlf/docs/train/
Backend tfidf: transforming subject corpus
Backend tfidf: creating vectorizer
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p5843>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p20828>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p8927>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p649>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2434>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p7227>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p733>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p12534>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p22036>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p11216>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2794>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4458>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p656>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4813>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p9223>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p1967>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p6415>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p2880>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p4008>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p10290>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p24827>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p1586>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p484>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p3456>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p20071>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p10589>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p12107>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p25715>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p17355>
warning: Unknown subject URI <http://www.yso.fi/onto/yso/p14299>
Backend tfidf: creating similarity index

Based on a Stack Overflow answer.
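
For readers unfamiliar with the approach, here is a minimal sketch of how duplicate log messages can be suppressed with a `logging.Filter` from the Python standard library. It is an illustration only and not necessarily the code merged in this PR: the class name `DuplicateFilter`, the helper `ListHandler`, the logger name and the choice of deduplication key are all assumptions made for the example.

```python
import logging


class DuplicateFilter(logging.Filter):
    """Let each distinct log record through once; drop later repeats."""

    def __init__(self):
        super().__init__()
        self.seen = set()  # keys of records that were already emitted

    def filter(self, record):
        # Hypothetical key: level, message template and arguments.
        # This assumes record.args is hashable (a tuple of strings here).
        key = (record.levelno, record.msg, record.args)
        if key in self.seen:
            return False  # duplicate: suppress it
        self.seen.add(key)
        return True


class ListHandler(logging.Handler):
    """Collect formatted messages in a list so the effect can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


# Hypothetical logger name, used only for this demonstration.
logger = logging.getLogger("duplicate_filter_demo")
logger.setLevel(logging.WARNING)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)
logger.addFilter(DuplicateFilter())

uris = [
    "http://www.yso.fi/onto/yso/p649",
    "http://www.yso.fi/onto/yso/p22036",
    "http://www.yso.fi/onto/yso/p649",  # repeat of the first URI
]
for uri in uris:
    logger.warning("Unknown subject URI <%s>", uri)

for message in handler.messages:
    print(message)
```

In this sketch the filter is attached to the logger rather than to a handler, so it only affects records logged through that one logger. That matches the intent described above of limiting the suppression to a single module, but where the filter is actually attached in Annif is not shown here.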

@juhoinkinen juhoinkinen added this to the 0.61 milestone Feb 9, 2023

codecov bot commented Feb 9, 2023

Codecov Report

Base: 99.56% // Head: 99.56% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (8d7e1c5) compared to base (8a194c4).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #673   +/-   ##
=======================================
  Coverage   99.56%   99.56%           
=======================================
  Files          87       87           
  Lines        6145     6158   +13     
=======================================
+ Hits         6118     6131   +13     
  Misses         27       27           
Impacted Files Coverage Δ
annif/corpus/subject.py 100.00% <100.00%> (ø)
annif/util.py 98.57% <100.00%> (+0.26%) ⬆️


@juhoinkinen juhoinkinen marked this pull request as ready for review February 10, 2023 14:56

@osma osma left a comment


Looks good, only a minor suggestion for the variable name

annif/corpus/subject.py (outdated, resolved)
@sonarqubecloud

SonarCloud Quality Gate failed.

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 1 (rating E)
Code Smells: 0 (rating A)

Coverage: no coverage information
Duplication: 0.0%

@juhoinkinen juhoinkinen merged commit d8dd8db into master Feb 13, 2023
@juhoinkinen juhoinkinen deleted the supress-duplicate-logs-from-subject-module branch February 13, 2023 07:03
2 participants