Organizations: @VQAssessment @Q-Future

- 🔥🔥 MLVU: Multi-task Long Video Understanding Benchmark (Python · 153 stars · Updated Nov 3, 2024)

- PyTorch code for our paper "Dog-IQA: Standard-guided Zero-shot MLLM for Mix-grain Image Quality Assessment" (13 stars · Updated Oct 7, 2024)

- Codebase for Aria - an Open Multimodal Native MoE (Jupyter Notebook · 769 stars · 66 forks · Updated Nov 4, 2024)

- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning (Python · 610 stars · 61 forks · Updated Jun 1, 2024)

- [NeurIPS 2024 Spotlight] Training in Pairs + Inference on Single Image with Anchors (Python · 18 stars · 1 fork · Updated Aug 24, 2024)

- [LMM + codec] A new paradigm of visual signal compression! (Python · 25 stars · Updated Jun 14, 2024)

- Accelerating the development of large multimodal models (LMMs) with lmms-eval (Python · 1,849 stars · 142 forks · Updated Nov 2, 2024)

- Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks (Python · 1,272 stars · 183 forks · Updated Nov 4, 2024)

- 🏆 [CVPRW 2024] COVER: A Comprehensive Video Quality Evaluator. 🥇 Winning solution for the Video Quality Assessment Challenge at the 1st AIS 2024 workshop @ CVPR 2024 (Python · 40 stars · 2 forks · Updated Jul 18, 2024)

- [NeurIPS 2024 D&B] Official dataloader and evaluation scripts for LongVideoBench (Python · 64 stars · 2 forks · Updated Jul 27, 2024)

- [LMM + AIGC] What do we expect from LMMs as AIGI evaluators, and how do they perform? (148 stars · 3 forks · Updated Sep 27, 2024)

- Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth value of the sentence. (HTML · 255 stars · 59 forks · Updated Aug 18, 2022)

- An open-source implementation for training LLaVA-NeXT (Python · 354 stars · 19 forks · Updated Oct 23, 2024)

- [NeurIPS 2024] Evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?" (Python · 144 stars · 5 forks · Updated Sep 26, 2024)

- Ongoing research training transformer models at scale (Python · 10,456 stars · 2,344 forks · Updated Nov 4, 2024)

- MambaOut: Do We Really Need Mamba for Vision? (Python · 2,020 stars · 34 forks · Updated Oct 22, 2024)

- Official code for the paper "Mantis: Multi-Image Instruction Tuning" (Python · 179 stars · 15 forks · Updated Oct 31, 2024)

- Multimodal language model benchmark featuring challenging examples (Python · 149 stars · 6 forks · Updated Aug 13, 2024)

- [MM 2024 Oral] Refiner for AIGC (Jupyter Notebook · 23 stars · 1 fork · Updated Jul 29, 2024)

- [CVPRW 2024] Official code for the paper "Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap" (11 stars · Updated Jun 14, 2024)

- Tips for releasing research code in Machine Learning (with official NeurIPS 2020 recommendations) (2,583 stars · 717 forks · Updated May 19, 2023)

- The official Meta Llama 3 GitHub site (Python · 26,962 stars · 3,055 forks · Updated Aug 12, 2024)

- Evaluating text-to-image/video/3D models with VQAScore (Python · 211 stars · 20 forks · Updated Sep 9, 2024)

- LLM training in simple, raw C/CUDA (Cuda · 24,296 stars · 2,738 forks · Updated Oct 2, 2024)

- A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems (193 stars · 12 forks · Updated Sep 19, 2024)

- Analysis of video quality datasets via design of minimalistic video quality models (Python · 8 stars · Updated Jul 15, 2024)

- [ACMMM 2024] AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception (57 stars · 5 forks · Updated Oct 30, 2024)