#9355 restores the functionality for getting performance measurements from within `libllama` (which was removed in #9294) via a new `llama_perf` API. `llama_context_params` is extended with a new `bool no_perf` parameter that can be used to disable the internal timings during `libllama` compute.
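As a rough illustration, a client that wants the internal timings could leave `no_perf` unset and query them through the new API. This is a minimal sketch, assuming the `llama.h` declarations introduced by #9355 (`llama_perf_context`, `llama_perf_context_print`, `llama_perf_context_reset` and the `llama_perf_context_data` fields); treat exact names and signatures as approximate and check the header of your version.

```c
// Sketch only: assumes the llama_perf API from #9355; verify against llama.h.
#include <stdio.h>
#include "llama.h"

void report_timings(struct llama_model * model) {
    struct llama_context_params cparams = llama_context_default_params();
    cparams.no_perf = false; // keep internal timing measurements enabled (set true to disable)

    struct llama_context * ctx = llama_new_context_with_model(model, cparams);

    // ... run llama_decode() calls here ...

    // query the accumulated timings programmatically ...
    struct llama_perf_context_data perf = llama_perf_context(ctx);
    printf("eval: %.2f ms over %d tokens\n", perf.t_eval_ms, perf.n_eval);

    // ... or print the standard summary, then reset the counters
    llama_perf_context_print(ctx);
    llama_perf_context_reset(ctx);

    llama_free(ctx);
}
```

Setting `cparams.no_perf = true` skips the timing bookkeeping entirely, in which case the `llama_perf_*` queries will not report meaningful numbers.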
## Overview

This is a list of changes to the public interface of the `llama` library. Collaborators are encouraged to edit this post in order to reflect important changes to the API that end up merged into the `master` branch.

If you are building a 3rd party project that relies on `libllama`, it is recommended to follow this issue and check it before upgrading to new versions.

See also: `llama-server` REST API

## Recent API changes (most recent at the top)
- Added `softmax` sampler and updated `dist` sampler
- Removed `all_pos_0, all_pos_1, all_seq_id` from `llama_batch`
- Added `LLAMA_POOLING_TYPE_RANK`
- Added `llama_n_head()`
- Added `llama_perf` API + param to disable internal profiling
- Added `llama_sampler_chain_remove()`
- Added `LLAMA_VOCAB_TYPE_RWKV` enum value
- Added `llama_threadpool` API + changed `uint32_t` -> `int32_t`
- Added `llama_model_is_recurrent`
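The `llama_batch` change above means callers can no longer rely on the `all_pos_0`, `all_pos_1` and `all_seq_id` shortcut fields and must fill positions and sequence ids per token. A minimal sketch, assuming the `llama_batch` field names from `llama.h` (`token`, `pos`, `n_seq_id`, `seq_id`, `logits`, `n_tokens`); the helper function and its name are illustrative, not part of the library:

```c
// Sketch only: fill_batch is a hypothetical helper, not a libllama function.
#include "llama.h"

// Populate a batch (allocated e.g. via llama_batch_init(n, 0, 1)) with one
// sequence of n tokens at positions 0..n-1.
void fill_batch(struct llama_batch * batch, const llama_token * tokens, int32_t n) {
    for (int32_t i = 0; i < n; i++) {
        batch->token[i]     = tokens[i];
        batch->pos[i]       = i;           // explicit position, formerly derived from all_pos_0/all_pos_1
        batch->n_seq_id[i]  = 1;
        batch->seq_id[i][0] = 0;           // explicit sequence id, formerly all_seq_id
        batch->logits[i]    = (i == n - 1); // request logits only for the last token
    }
    batch->n_tokens = n;
}
```

For the common single-sequence case, `llama_batch_get_one()` remains a simpler alternative that fills these fields for you.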
## For older changes, use:

## Upcoming API changes