One thing I am finding when using LLMs is that, as a user, I need to think about tokens, but it ends up being a bit whack-a-mole: it's not easy or intuitive for me to reason about how many tokens my message might be using, or how to optimize it.
It would be nice if token counts were somehow easy for me to see as I chat.
This would be a nice feature. Currently we don't store tokens on the inference backend, only the message as a string. Tokenization, truncation, etc. are all performed by the worker. It might not be ideal to store a token count or the tokens on the backend, because they could change for the same message string if the tokenizer changed (e.g. when switching models). So we would need some way for a worker to communicate up-to-date token counts to the backend, maybe at the end of each message generation?
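To illustrate why a stored count could go stale: the same string tokenizes to different counts under different tokenizers. A minimal sketch, with two toy tokenizers standing in for real model tokenizers:

```python
# Why storing a token count with a message is fragile: the same
# string yields different counts under different tokenizers.
# These toy tokenizers stand in for real model tokenizers.

def whitespace_tokens(text: str) -> list[str]:
    # Splits on whitespace, roughly one token per word.
    return text.split()

def char_tokens(text: str) -> list[str]:
    # One token per character, like a character-level model.
    return list(text)

msg = "How many tokens is this?"
print(len(whitespace_tokens(msg)))  # 5
print(len(char_tokens(msg)))        # 24
```

If the worker switches models (and thus tokenizers), any count the backend cached for `msg` would no longer match, which is why the worker reporting fresh counts after each generation seems safer.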
Yeah, or another option could be a button that checks how many tokens the current chat is using: a new endpoint that would tokenize the chat (or look up the answer if it already exists somewhere, based on chat ID) and send back the count.
As a user you would then have an easy way to check it, without it polluting the FE experience, although obviously that would depend on how the UI was done.
A separate button and endpoint might be easier to handle, although to be honest I'm not sure whether that's more hassle, or wasteful on the BE side.
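The on-demand endpoint idea could be sketched roughly like this (all names are hypothetical, not the project's actual API; a whitespace split stands in for the worker's real tokenizer):

```python
# Hypothetical handler for an on-demand chat token-count endpoint.
# The worker's actual model tokenizer would replace str.split here.

def count_chat_tokens(messages: list[str], tokenize=str.split) -> int:
    """Total tokens across all messages in a chat."""
    # A real implementation would also add any special/control tokens
    # the model inserts around each message.
    return sum(len(tokenize(m)) for m in messages)

chat = ["Hello, how are you?", "I am fine, thanks!"]
print(count_chat_tokens(chat))  # 8
```

Since the count is computed on request, it is always consistent with whatever tokenizer the worker is currently using, at the cost of re-tokenizing the chat each time the button is pressed.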