- 👋 Hi, I’m Teo (Timothy) Wu, a final-year PhD candidate at Nanyang Technological University 🇸🇬. Resume, Homepage
- I work on Large Multi-modal Models (LMMs), especially for videos and other long-form multi-modal inputs. See our NeurIPS 2024 D&B paper, LongVideoBench, the first benchmark designed for long-context video-text interleaved inputs!
Before moving to general LMMs, I created Q-Future, a project that uses LMMs to advance low-level vision, visual evaluation, and related topics. Representative works are listed below.

See my top repos:
- ICML 2024, Q-Align. TL;DR: The first visual quality and aesthetic evaluation method powered by MLLMs/LMMs, and still among the best.
- CVPR 2024, Q-Instruct. TL;DR: The first low-level visual instruction tuning dataset, with a model zoo of low-level-improved MLLMs!
- ICLR 2024 Spotlight, Q-Bench. TL;DR: The first low-level visual benchmark for multi-modality LLMs, focusing on three tracks (perception, description, and assessment).
- ACMMM 2023 Oral, MaxVQA/MaxWell. TL;DR: A 16-dimensional VQA dataset and method towards explainable VQA. Gradio demos are available in the repo.
- 🥇 ICCV 2023, DOVER. TL;DR: The SOTA NR-VQA method, predicting disentangled aesthetic and technical quality. Colab demo available.
- 🧰 ECCV 2022, End-to-End VQA Toolbox (FAST-VQA). TL;DR: An end-to-end Video Quality Assessment toolbox for developing your own methods; also the official repo for FAST-VQA!
- 🥇 ICME 2023 Oral, Zero-Shot BVQI. TL;DR: The SOTA zero-shot NR-VQA method.
- 📫 Reach me by e-mail: realtimothyhwu@gmail.com, or on Twitter
- Google Scholar