Examples with OpenLLM

This directory contains examples that show how to interact with OpenLLM features.

Features

The following examples demonstrate general OpenLLM features and show how to start running any open-source model in production.

OpenAI-compatible endpoints

The openai_completion_client.py script demonstrates how to use the OpenAI-compatible /v1/completions endpoint to generate text.

export OPENLLM_ENDPOINT=https://api.openllm.com
python openai_completion_client.py

# For streaming set STREAM=True
STREAM=True python openai_completion_client.py
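The request that such a client sends can be sketched with the Python standard library. This is a minimal illustration, not the contents of openai_completion_client.py itself; the model name and `max_tokens` value below are placeholder assumptions, and the endpoint is read from the same `OPENLLM_ENDPOINT` and `STREAM` environment variables used above.

```python
import json
import os

def build_completion_request(prompt, stream=False, model="placeholder-model"):
    # Body for the OpenAI-compatible /v1/completions endpoint.
    # The model name is a placeholder; use whatever model your
    # OpenLLM server is actually serving.
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": 128,     # assumed limit for this sketch
        "stream": stream,
    }

endpoint = os.environ.get("OPENLLM_ENDPOINT", "https://api.openllm.com")
url = f"{endpoint}/v1/completions"

# STREAM=True in the environment switches on server-sent streaming,
# mirroring the STREAM variable in the shell snippet above.
payload = build_completion_request(
    "Hello, world",
    stream=os.environ.get("STREAM") == "True",
)
body = json.dumps(payload)  # POST this JSON to `url` with an HTTP client
```

With `stream` set, the server responds with incremental chunks instead of a single JSON body, which is what the `STREAM=True` invocation above exercises.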

The openai_chat_completion_client.py script demonstrates how to use the OpenAI-compatible /v1/chat/completions endpoint to chat with a model.

export OPENLLM_ENDPOINT=https://api.openllm.com
python openai_chat_completion_client.py

# For streaming set STREAM=True
STREAM=True python openai_chat_completion_client.py
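The chat endpoint takes a list of role-tagged messages rather than a bare prompt. A minimal sketch of the request body, again with a placeholder model name and example messages that are not taken from the actual client script:

```python
import os

def build_chat_request(messages, stream=False, model="placeholder-model"):
    # Body for the OpenAI-compatible /v1/chat/completions endpoint.
    # `messages` is the conversation history as a list of
    # {"role": ..., "content": ...} dicts.
    return {"model": model, "messages": messages, "stream": stream}

# Example conversation (illustrative content only).
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is OpenLLM?"},
]

payload = build_chat_request(
    messages,
    stream=os.environ.get("STREAM") == "True",
)
```

Each follow-up turn appends the assistant's reply and the next user message to `messages` before the next request, which is how the chat client keeps conversational context.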

TinyLLM

The api_server.py script demonstrates how to write a production-ready BentoML service with OpenLLM and vLLM.

Install requirements:

pip install -U "openllm[vllm]"

To serve the Bento (assuming you have access to a GPU):

bentoml serve api_server:svc

To build the Bento:

bentoml build -f bentofile.yaml .
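The `bentoml build` command reads its packaging instructions from bentofile.yaml. A minimal sketch of what such a file could look like for this example, assuming the service object is `svc` in api_server.py as used in the serve command above (the include pattern and pinned package are illustrative, not the repository's actual file):

```yaml
# Sketch of a bentofile.yaml for the TinyLLM example (assumed contents).
service: "api_server:svc"   # matches the `bentoml serve api_server:svc` target
include:
  - "*.py"                  # package the service source files
python:
  packages:
    - "openllm[vllm]"       # same dependency installed above
```

The `service` field must match the `module:variable` reference passed to `bentoml serve`, so the built Bento exposes the same service.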