[NeurIPS'24 Spotlight] EVE: Encoder-Free Vision-Language Models
-
Updated
Oct 2, 2024 - Python
[NeurIPS'24 Spotlight] EVE: Encoder-Free Vision-Language Models
Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
This repo is a live list of papers on game playing and large multimodality model - "A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges".
Open-source code for the paper "Enhancing Remote Sensing Vision-Language Models for Zero-Shot Scene Classification"
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
up-to-date curated list of state-of-the-art Large vision language models hallucinations research work, papers & resources
[ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models
[ICML 2024] Offical code repo for ICML2024 paper "Candidate Pseudolabel Learning: Enhancing Vision-Language Models by Prompt Tuning with Unlabeled Data"
This is an official implementation of our work, Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models, accepted to ECCV'24
Facial Expression Recognition using vision language models (VLMs)
Add a description, image, and links to the vision-language-models topic page so that developers can more easily learn about it.
To associate your repository with the vision-language-models topic, visit your repo's landing page and select "manage topics."