
changelog : llama-server REST API #9291

@ggerganov

Description

Overview

This is a list of changes to the public HTTP interface of the llama-server example. Collaborators are encouraged to edit this post to reflect important API changes that get merged into the master branch.

If you are building a third-party project that relies on llama-server, it is recommended to follow this issue and to check it carefully before upgrading to new versions.

See also:

Recent API changes (most recent at the top)

| version | PR | description |
|---------|----|-------------|
| b4599 | #9639 | `/v1/chat/completions` now supports `tools` & `tool_choice` |
| TBD. | #10974 | `/v1/completions` is now OAI-compat |
| TBD. | #10783 | `logprobs` is now OAI-compat, defaults to pre-sampling probs |
| TBD. | #10861 | `/embeddings` supports pooling type `none` |
| TBD. | #10853 | Add optional `"tokens"` output to `/completions` endpoint |
| b4337 | #10803 | Remove `penalize_nl` |
| b4265 | #10626 | CPU Docker images' working directory changed to `/app` |
| b4285 | #10691 | (Again) Change `/slots` and `/props` responses |
| b4283 | #10704 | Change `/slots` and `/props` responses |
| b4027 | #10162 | `/slots` endpoint: remove `slot[i].state`, add `slot[i].is_processing` |
| b3912 | #9865 | Add option to time-limit the generation phase |
| b3911 | #9860 | Remove self-extend support |
| b3910 | #9857 | Remove legacy system prompt support |
| b3897 | #9776 | Change default security settings: `/slots` is now disabled by default; endpoints now check for the API key if it is set |
| b3887 | #9510 | Add `/rerank` endpoint |
| b3754 | #9459 | Add `[DONE]\n\n` in OAI stream response to match spec |
| b3721 | #9398 | Add `seed_cur` to completion response |
| b3683 | #9308 | Environment variable updated |
| b3599 | #9056 | Change `/health` and `/slots` |
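To illustrate the b4599 (#9639) row: `/v1/chat/completions` now accepts `tools` and `tool_choice` following the OpenAI convention. A minimal request-body sketch (the `get_weather` tool and its schema are illustrative placeholders, not part of llama-server):

```python
import json

# Sketch of a tools-enabled request body for /v1/chat/completions
# (supported since b4599, #9639). The get_weather tool below is an
# illustrative placeholder, not something llama-server ships with.
payload = {
    "messages": [
        {"role": "user", "content": "What is the weather in Paris?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # "auto" lets the model decide whether to call a tool,
    # per the OpenAI convention.
    "tool_choice": "auto",
}

body = json.dumps(payload)
```

The resulting `body` would be POSTed to the server's `/v1/chat/completions` endpoint.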

For older changes, use:

git log --oneline -p b3599 -- examples/server/README.md
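The b3754 (#9459) change means streamed responses end with an SSE `data: [DONE]` event, matching the OpenAI spec, so clients can treat it as an end-of-stream sentinel. A minimal parser sketch, assuming the stream has already been buffered into a string of `data: ...` lines:

```python
import json

def parse_sse_events(raw: str):
    """Yield decoded JSON chunks from an OAI-style SSE stream,
    stopping at the [DONE] sentinel (sent by llama-server since b3754)."""
    for line in raw.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator lines between events
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        yield json.loads(data)

stream = 'data: {"content": "Hel"}\n\ndata: {"content": "lo"}\n\ndata: [DONE]\n\n'
chunks = list(parse_sse_events(stream))
```

In a real client the lines would arrive incrementally over HTTP rather than from a prebuilt string.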

Upcoming API changes

  • TBD
