diff --git a/CITATION.cff b/CITATION.cff
new file mode 100644
index 000000000..1a7c4e4ac
--- /dev/null
+++ b/CITATION.cff
@@ -0,0 +1,65 @@
+cff-version: 1.2.0
+title: 'OpenLLM: Operating LLMs in production'
+message: >-
+  If you use this software, please cite it using these
+  metadata.
+type: software
+authors:
+  - given-names: Aaron
+    family-names: Pham
+    email: aarnphm@bentoml.com
+    orcid: 'https://orcid.org/0009-0008-3180-5115'
+  - given-names: Chaoyu
+    family-names: Yang
+    email: chaoyu@bentoml.com
+  - given-names: Sean
+    family-names: Sheng
+    email: ssheng@bentoml.com
+  - given-names: Shenyang
+    family-names: Zhao
+    email: larme@bentoml.com
+  - given-names: Sauyon
+    family-names: Lee
+    email: sauyon@bentoml.com
+  - given-names: Bo
+    family-names: Jiang
+    email: jiang@bentoml.com
+  - given-names: Fog
+    family-names: Dong
+    email: fog@bentoml.com
+  - given-names: Xipeng
+    family-names: Guan
+    email: xipeng@bentoml.com
+  - given-names: Frost
+    family-names: Ming
+    email: frost@bentoml.com
+repository-code: 'https://github.com/bentoml/OpenLLM'
+url: 'https://bentoml.com/'
+abstract: >-
+  OpenLLM is an open platform for operating large language
+  models (LLMs) in production. With OpenLLM, you can run
+  inference with any open-source large language model,
+  deploy to the cloud or on-premises, and build powerful AI
+  apps. It has built-in support for a wide range of
+  open-source LLMs and model runtimes, including StableLM,
+  Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more.
+  OpenLLM helps serve LLMs over a RESTful API or gRPC with
+  one command, or query via the web UI, CLI, our
+  Python/JavaScript clients, or any HTTP client. It provides
+  first-class support for LangChain, BentoML, and Hugging
+  Face, allowing you to easily create your own AI apps by
+  composing LLMs with other models and services. Last but
+  not least, it automatically generates OCI-compatible
+  container images for your LLM server, or deploys it as a
+  serverless endpoint via BentoCloud.
+keywords:
+  - MLOps
+  - LLMOps
+  - LLM
+  - Infrastructure
+  - Transformers
+  - LLM Serving
+  - Model Serving
+  - Serverless Deployment
+license: Apache-2.0
+date-released: '2023-06-13'
diff --git a/README.md b/README.md
index 8fb2c0f85..81441262c 100644
--- a/README.md
+++ b/README.md
@@ -327,7 +327,8 @@ OPENLLM_FLAN_T5_FRAMEWORK=tf openllm start flan-t5

 ### Fine-tuning support (Experimental)

-One can serve OpenLLM models with any PEFT-compatible layers with `--adapter-id`:
+One can serve OpenLLM models with any PEFT-compatible layers via
+`--adapter-id`:

 ```bash
 openllm start opt --model-id facebook/opt-6.7b --adapter-id aarnphm/opt-6-7b-quotes
@@ -345,21 +346,26 @@ To use multiple adapters, use the following format:
 openllm start opt --model-id facebook/opt-6.7b --adapter-id aarnphm/opt-6.7b-lora --adapter-id aarnphm/opt-6.7b-lora:french_lora
 ```

-By default, the first adapter-id will be the default Lora layer, but optionally users can change what Lora layer to use for inference via `/v1/adapters`:
+By default, the first adapter-id will be the default LoRA layer, but users can
+optionally change which LoRA layer to use for inference via `/v1/adapters`:

 ```bash
 curl -X POST http://localhost:3000/v1/adapters --json '{"adapter_name": "vn_lora"}'
 ```

-Note that for multiple adapter-name and adapter-id, it is recommended to update to use the default adapter before sending the inference, to avoid any performance degradation
+Note that when using multiple adapter names and IDs, it is recommended to set
+the default adapter before sending inference requests, to avoid any
+performance degradation.

-To include this into the Bento, one can also provide a `--adapter-id` into `openllm build`:
+To include this in the Bento, one can also provide `--adapter-id` to
+`openllm build`:

 ```bash
 openllm build opt --model-id facebook/opt-6.7b --adapter-id ...
- ```
+```

-> **Note**: We will gradually roll out support for fine-tuning all models. Currently, only OPT has fully adapters support.
+> **Note**: We will gradually roll out support for fine-tuning all models.
+> Currently, only OPT has full adapter support.

 ### Integrating a New Model
@@ -582,3 +588,19 @@ capabilities or have any questions, don't hesitate to reach out in our
 Checkout our
 [Developer Guide](https://github.com/bentoml/OpenLLM/blob/main/DEVELOPMENT.md)
 if you wish to contribute to OpenLLM's codebase.
+
+## 📔 Citation
+
+If you use OpenLLM in your research, please cite it using the metadata in
+[CITATION.cff](./CITATION.cff):
+
+```bibtex
+@software{Pham_OpenLLM_Operating_LLMs_2023,
+author = {Pham, Aaron and Yang, Chaoyu and Sheng, Sean and Zhao, Shenyang and Lee, Sauyon and Jiang, Bo and Dong, Fog and Guan, Xipeng and Ming, Frost},
+license = {Apache-2.0},
+month = jun,
+title = {{OpenLLM: Operating LLMs in production}},
+url = {https://github.com/bentoml/OpenLLM},
+year = {2023}
+}
+```
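For anyone reviewing the adapter workflow that the README hunks above document, here is a minimal end-to-end sketch. The `openllm start` command and the `/v1/adapters` call are taken from the diff itself; the final `/v1/generate` request and its payload shape are assumptions for illustration only and are not part of this change:

```bash
# Serve OPT with two LoRA adapters (command from the README hunk above);
# the first --adapter-id becomes the default LoRA layer.
openllm start opt --model-id facebook/opt-6.7b \
  --adapter-id aarnphm/opt-6.7b-lora \
  --adapter-id aarnphm/opt-6.7b-lora:french_lora

# Switch the active adapter before inference, as the README recommends.
# Endpoint and flag are verbatim from the diff; --json needs curl >= 7.82.
curl -X POST http://localhost:3000/v1/adapters --json '{"adapter_name": "french_lora"}'

# Hypothetical inference request: the /v1/generate path and payload shape
# are assumptions, not part of this change; check the server's API docs.
curl -X POST http://localhost:3000/v1/generate --json '{"prompt": "Tell me a quote."}'
```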