#

qwen-vl

Here are 5 public repositories matching this topic...

gokayfem / awesome-vlm-architectures

Famous Vision Language Models and Their Architectures

awesome awesome-list kosmos clip image-encoder vlm blip multimodal text-encoder vision-language-model llava internlm cogvlm qwen-vl

Updated Sep 8, 2024
Markdown

PaddlePaddle / PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

Updated Nov 4, 2024
Python

zjysteven / lmms-finetune

A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, qwen-vl, qwen2-vl, phi3-v etc.

finetuning multimodal vision-language foundation-models instruction-tuning large-language-model llava visual-instruction-tuning multimodal-large-language-models large-multimodal-models qwen-vl llava-next

Updated Oct 22, 2024
Python

reidbarber / webmarker

Mark web pages for use with vision-language models

som prompt gemini claude playwright prompt-engineering llms vision-language-model gpt4v qwen-vl gpt4o set-of-mark

Updated Sep 22, 2024
TypeScript

autodistill / autodistill-qwen-vl

Qwen-VL base model for use with Autodistill.

zero-shot-object-detection autodistill qwen-vl

Updated Feb 8, 2024
Python

Improve this page

Add a description, image, and links to the qwen-vl topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the qwen-vl topic, visit your repo's landing page and select "manage topics."