With OpenLLM, you can run inference with any open-source large language model, deploy to the cloud or on-premises, build powerful AI apps, and more.
To learn more about OpenLLM, please visit OpenLLM's README.md.
This package holds the underlying client implementation for OpenLLM. If you are coming from OpenLLM, the client can be accessed via openllm.client.
It provides APIs similar to bentoml.Client (via openllm_client.min) for interacting with an OpenLLM server. It can also be extended to work with a general BentoML server.
Note
Interop with a generic BentoML server is considered EXPERIMENTAL and will be refactored into a new client implementation soon!
If you are only using this package to interact with an OpenLLM server, the API should be the same as the openllm.client namespace.
import openllm
client = openllm.client.HTTPClient()
client.query('Explain to me the difference between "further" and "farther"')
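Since the API is meant to match the openllm.client namespace, code that relies only on the query call shown above can be written against a small interface and tried out without a running server. The following is a minimal sketch under that assumption; SupportsQuery, ask_all, and EchoClient are hypothetical names introduced for illustration and are not part of OpenLLM.

```python
from typing import Any, Dict, Iterable, Protocol


class SupportsQuery(Protocol):
    """Hypothetical protocol: any object exposing the query() call shown above."""

    def query(self, prompt: str) -> Any: ...


def ask_all(client: SupportsQuery, prompts: Iterable[str]) -> Dict[str, Any]:
    """Send each prompt through client.query() and collect the results by prompt."""
    return {prompt: client.query(prompt) for prompt in prompts}


class EchoClient:
    """Stand-in client for offline experiments; not part of OpenLLM."""

    def query(self, prompt: str) -> str:
        return f"echo: {prompt}"


# Offline usage with the stand-in client.
answers = ask_all(EchoClient(), ["further", "farther"])
print(answers)
```

With an OpenLLM server running, passing the client created in the snippet above in place of EchoClient should behave the same way, as long as only query is used.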
If you use OpenLLM in your research, we provide a citation to use:
@software{Pham_OpenLLM_Operating_LLMs_2023,
author = {Pham, Aaron and Yang, Chaoyu and Sheng, Sean and Zhao, Shenyang and Lee, Sauyon and Jiang, Bo and Dong, Fog and Guan, Xipeng and Ming, Frost},
license = {Apache-2.0},
month = jun,
title = {{OpenLLM: Operating LLMs in production}},
url = {https://github.com/bentoml/OpenLLM},
year = {2023}
}