
refactor(breaking): unify LLM API #283

Merged 9 commits on Sep 1, 2023
chore: add breaking change notes
Signed-off-by: aarnphm-ec2-dev <29749331+aarnphm@users.noreply.github.com>
aarnphm committed Sep 1, 2023
commit 973c29f0826e50b9527ab2096bc89775d733f99d
20 changes: 20 additions & 0 deletions changelog.d/283.breaking.md
All environment variables are now simplified and no longer require a model-specific prefix.

For example, `OPENLLM_LLAMA_GENERATION_MAX_NEW_TOKENS` now becomes `OPENLLM_GENERATION_MAX_NEW_TOKENS`.
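As a sketch of how the rename affects an invocation (the value `256` is illustrative):

```bash
# Before: environment variables carried a model-specific prefix
OPENLLM_LLAMA_GENERATION_MAX_NEW_TOKENS=256 openllm start llama

# After: one unified prefix applies regardless of the model
OPENLLM_GENERATION_MAX_NEW_TOKENS=256 openllm start llama
```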

Miscellaneous environment variables are also unified. To switch between backends, use the `--backend` flag with both `start` and `build`:

```bash
openllm start llama --backend vllm
```

or set the `OPENLLM_BACKEND` environment variable:

```bash
OPENLLM_BACKEND=vllm openllm start llama
```

`openllm.Runner` now defaults to downloading the model on first use if it is not available locally, then caching it in the model store for subsequent runs.

Model serialisation has been updated to a new API version with clearer naming. Users are kindly asked to run `openllm prune -y --include-bentos` and then update to the current version of OpenLLM.
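A minimal migration sketch, assuming a pip-based install (the upgrade command is illustrative; adapt it to however you installed OpenLLM):

```bash
# Remove models and bentos serialised under the old format
openllm prune -y --include-bentos

# Install the current release of OpenLLM
pip install --upgrade openllm
```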