bump openvino_tokenizers version #1333
Please address your comment from the other PR: https://github.com/openvinotoolkit/openvino.genai/pull/1246/files#r1853668649
Done.
The difference between the HF chat_sample and our chat_sample arises because the tests used LlamaTokenizer, which differs from the tokenizer that AutoTokenizer selects by default. LlamaTokenizer gives slightly different results than AutoTokenizer, and openvino_tokenizers is aligned with AutoTokenizer. When openvino_tokenizers fixed an issue and aligned with AutoTokenizer/LlamaTokenizerFast, tokens with ID 29871 appeared, which are missing from the LlamaTokenizer output, and the precommit tests started to differ. @apaniukov, please confirm that this analysis is correct.
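To make this kind of mismatch easier to spot, here is a minimal sketch of a helper that reports where two token ID sequences diverge. The function name `diff_token_ids` and the sample ID lists are hypothetical and chosen only for illustration; they are not output from either tokenizer and not code from this repository.

```python
from difflib import SequenceMatcher


def diff_token_ids(reference, candidate):
    """Return (op, reference_slice, candidate_slice) for each non-matching region."""
    matcher = SequenceMatcher(a=reference, b=candidate, autojunk=False)
    return [
        (tag, reference[i1:i2], candidate[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Made-up token IDs for illustration only; not real tokenizer output.
slow_ids = [1, 15043, 3186]
fast_ids = [1, 29871, 15043, 3186]

for tag, ref, cand in diff_token_ids(slow_ids, fast_ids):
    # Prints one line per divergent region, e.g. an inserted token.
    print(tag, ref, cand)
```

In a test, the two lists would come from encoding the same prompt with each tokenizer, so an extra 29871 would show up as an `insert` region instead of a bare assertion failure.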
No description provided.