Add POS tagging and Phrase chunking token classification examples #6457
* POS tagging example
* Phrase chunking example
Hi @vblagoje, thanks for adding this 👍 The GermEval dataset is currently not available; it seems that they've relaunched the shared task website. This dataset removal will also affect libraries such as Flair or … For PoS tagging it would be awesome if you could also report/output accuracy after training - just import …
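The accuracy reporting suggested above could look roughly like the sketch below. The import named in the comment is cut off in this thread, so this version uses plain Python only; the function name `token_accuracy` and the sample tags are hypothetical and are not taken from the PR. It assumes gold and predicted labels arrive as one list of tag strings per sentence, with padding and special tokens already removed.

```python
from typing import Sequence


def token_accuracy(gold: Sequence[Sequence[str]], pred: Sequence[Sequence[str]]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")
    correct = 0
    total = 0
    for gold_tags, pred_tags in zip(gold, pred):
        if len(gold_tags) != len(pred_tags):
            raise ValueError("each sentence needs one predicted tag per gold tag")
        # Count positions where the predicted tag matches the gold tag.
        correct += sum(g == p for g, p in zip(gold_tags, pred_tags))
        total += len(gold_tags)
    # Avoid division by zero on an empty evaluation set.
    return correct / total if total else 0.0


# Hypothetical POS tags for two short sentences (illustration only).
gold = [["DET", "NOUN", "VERB"], ["PRON", "VERB"]]
pred = [["DET", "NOUN", "NOUN"], ["PRON", "VERB"]]
print(f"accuracy = {token_accuracy(gold, pred):.2f}")  # prints "accuracy = 0.80"
```

For POS tagging, where every token carries exactly one tag, this per-token figure is the usual headline metric; for chunking or NER, span-level F1 is normally reported alongside it.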
Thanks for the review @stefan-it. Let me know if there are any additional suggestions. Perhaps we can add appropriate URLs for the GermEval dataset and remove the chunking example if needed.
This looks great, thanks! Note that there is a big rework of the examples to use the nlp library and Trainer in the pipeline. We're polishing the APIs before we start converting every script. I'll tag you when we get to this one to make sure we don't break anything. In the meantime, could you take care of the styling issue so we can merge?
Ok @sgugger, please do ping me and I'll make sure that all token classification examples work as expected; perhaps I can help with the transition. I am not sure why CI fails for styling, more specifically isort.
It may be because of the dep you're adding to examples. It should probably be added in the …
Ok @sgugger
Looks flaky, re-triggered the CI
…ples (huggingface#6457)" This reverts commit f4cd971.
This PR adds POS tagging and phrase chunking examples to the token classification examples. The current example (NER) is minimally adjusted so that users can easily experiment with training their own token classification models. Although skilled developers can already experiment with token classification tasks other than NER, this PR lowers the barrier to entry even further and demonstrates HF extensibility.
The adjustments made consist of:
I also noticed that:
If you think adding one token classification example rather than two is enough (say, POS tagging), let me know and I'll remove the other. Also, please let me know if any additional adjustments are needed.