-
Notifications
You must be signed in to change notification settings - Fork 495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add support for Llama 3.2-Vision models #2376
Conversation
This commit adds support for the Llama 3.2-Vision collection of multimodal LLMs for both the transformers and vllm engines. - Updated `llm_family.json` and `llm_family_modelscope.json` to include Llama 3.2-Vision and Llama 3.2-Vision-Instruct model information. - Modified `vllm` engine's `core.py` to handle these models. - Enhanced documentation with model reference files to reflect the newly supported built-in models.
Following is the CI error:
This seems to originate from |
This should be related to model_config.json, the json file cannot be read normally. |
I am running locally a production instance using my branch of xinference and it works without any errors and also it can load the Llama-3.2 models correctly. I need to make some changes to install vLLM 0.6.2 as it requires fastapi>=0.114.1 but xinference requires a dependency of fastapi==0.110.3 or smaller. I am using ubuntu 22.04 with python 3.11.9 using uv package manager. |
I think the limitation of fastapi can be removed now IMO. |
- Updated `llm_family.json` and `llm_family_modelscope.json` to remove trailing commas in the Llama-3.2 model configuration.
Ok will do that and commit again. I just fixed the trailing ',' error from the json files. JSON validator worked fine on it but trailing commas are ok in Python dictionaries not in JSON. |
- Updated `setup.cfg` to require `fastapi>=0.114.1` to support the installation of `vllm>=0.6.2`, which depends on the updated FastAPI version.
@qinxuye All the CI jobs passed except the self_hosted GPU, the error is linked to ChatTTS module not connected to changes in this PR. So I believe you should be able to merge this PR unless you want to fix the ChatTTS related errors which might have been introduced by some other merged PR.
|
This is a known issue, we can ignore it now, I will review this PR ASAP. |
Does Llama 3.2-Vision-Instruct work well? |
Merged with upstream changes and made modifications to VLLM_SUPPORTED_VISION_MODEL_LIST
Added space before VLLMModel class for flake8 rule
@qinxuye Any updates on this PR for Llama 3.2 Vision model? |
…sion Updated the model_id in modelscope model link for Llama-3.2-90B-Vision
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This pull request introduces support for the Llama 3.2-Vision collection of multimodal large language models (LLMs) within Xinference. These models bring the capability to process both text and image inputs, expanding the potential for diverse applications.
Key Changes:
This pull request adds support for the Llama 3.2-Vision collection of multimodal LLMs for both the transformers and vllm engines.
llm_family.json
andllm_family_modelscope.json
to include Llama 3.2-Vision and Llama 3.2-Vision-Instruct model information.vllm
engine'score.py
to handle these models.