With OpenLLM, you can run inference on any open-source large language model, deploy to the cloud or on-premises, build powerful AI apps, and more.
To learn more about OpenLLM, please visit [OpenLLM's README.md](https://github.com/bentoml/OpenLLM).
This package holds the underlying client implementation for OpenLLM. If you are coming from OpenLLM, the client can be accessed via `openllm.client`.
It provides an API similar to `bentoml.Client` (via `openllm_client.benmin`) for interacting with an OpenLLM server. It can also be extended to work with generic BentoML servers.
> **Note:** Interop with generic BentoML servers is considered experimental and may be merged back into BentoML. If you are just using this package to interact with an OpenLLM server, nothing should change from the `openllm.client` namespace.
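As a rough illustration of the generic interop, the sketch below constructs a benmin client directly. The `Client.from_url` constructor here is hypothetical, modeled on `bentoml.Client`; consult `openllm_client.benmin` for the actual surface:

```python
# Hypothetical sketch only: 'Client.from_url' is an assumption modeled on
# bentoml.Client and is NOT a confirmed openllm_client API.
from openllm_client import benmin

# Point the client at any running BentoML server, not just OpenLLM.
client = benmin.Client.from_url('http://localhost:3000')
```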
For example, to query a running OpenLLM server:

```python
import openllm

# Create a client for a running OpenLLM server.
client = openllm.client.HTTPClient()
client.query('Explain to me the difference between "further" and "farther"')
```
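For asynchronous usage, a minimal sketch is shown below; it assumes `openllm.client` also exposes an `AsyncHTTPClient` whose `query` method is awaitable, mirroring the synchronous client above:

```python
import asyncio

import openllm

async def main() -> None:
    # Assumption: AsyncHTTPClient mirrors HTTPClient with awaitable methods.
    client = openllm.client.AsyncHTTPClient()
    result = await client.query('Explain to me the difference between "further" and "farther"')
    print(result)

asyncio.run(main())
```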
If you use OpenLLM in your research, please use the following citation:
```bibtex
@software{Pham_OpenLLM_Operating_LLMs_2023,
  author  = {Pham, Aaron and Yang, Chaoyu and Sheng, Sean and Zhao, Shenyang and Lee, Sauyon and Jiang, Bo and Dong, Fog and Guan, Xipeng and Ming, Frost},
  license = {Apache-2.0},
  month   = jun,
  title   = {{OpenLLM: Operating LLMs in production}},
  url     = {https://github.com/bentoml/OpenLLM},
  year    = {2023}
}
```