Stars
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of the Open World
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
MLLM for On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Official repository for paper "Can LVLMs Obtain a Driver’s License? A Benchmark Towards Reliable AGI for Autonomous Driving"
EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
This is the official implementation of "Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams"
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
[NeurIPS 2024 D&B Track] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
An open source implementation of CLIP.
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
[MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501
Anole: An Open, Autoregressive and Native Multimodal Model for Interleaved Image-Text Generation
HPHS: Hierarchical Planning based on Hybrid Frontier Sampling for Unknown Environments Exploration
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
A collection of papers on Diffusion for Image-to-Image Translation and Style Transfer
A collection of awesome resources on image-to-image translation.
A Framework of Small-scale Large Multimodal Models
A collection of visual instruction tuning datasets.