shuyansy

Yan Shu shuyansy

Researching multi-modal learning at BAAI

34 followers · 15 following

BAAI
China

Stars

weichaozeng / TextCtrl

TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control

Python 3 1 Updated Oct 10, 2024

VectorSpaceLab / OmniGen

695 13 Updated Sep 18, 2024

xiaoachen98 / Open-LLaVA-NeXT

An open-source implementation for training LLaVA-NeXT.

Python 267 11 Updated Sep 26, 2024

NVlabs / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,888 150 Updated Sep 25, 2024

EvolvingLMMs-Lab / lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Python 1,464 125 Updated Oct 15, 2024

JUNJIE99 / VISTA_Evaluation_FineTuning

Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original code and model can be accessed at FlagEmbedding.

Python 13 2 Updated Sep 4, 2024

EvolvingLMMs-Lab / LongVA

Long Context Transfer from Language to Vision

Python 310 17 Updated Aug 26, 2024

BAAI-DCAI / SpatialBot

The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models".

Python 149 9 Updated Sep 19, 2024

JUNJIE99 / MLVU

🔥🔥MLVU: Multi-task Long Video Understanding Benchmark

Python 145 Updated Oct 12, 2024

BAAI-DCAI / Multimodal-Robustness-Benchmark

Python 41 Updated Oct 11, 2024

FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Python 7,137 520 Updated Oct 10, 2024

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

12,192 778 Updated Oct 9, 2024

rese1f / MovieChat

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

Python 506 41 Updated Sep 6, 2024

BAAI-DCAI / Bunny

A family of lightweight multimodal models.

Python 905 68 Updated Sep 18, 2024

ttengwang / Awesome_Long_Form_Video_Understanding

Awesome papers & datasets specifically focused on long-term videos.

178 7 Updated Oct 3, 2024

kousw / experimental-consistory

Python 99 6 Updated Mar 3, 2024

md-mohaiminul / VideoRecap

Python 161 7 Updated Jul 12, 2024

TencentARC / SmartEdit

Official code of SmartEdit [CVPR-2024 Highlight]

Python 240 8 Updated Jun 21, 2024

shengliu66 / ICV

Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering

Python 133 6 Updated Jul 6, 2024

shuyansy / Efficient-Ambiguous-Text-Detector

The official project for the paper "Perceiving Ambiguity and Semantics without Recognition: An Efficient and Effective Ambiguous Scene Text Detector" (ACM MM 2023)

Python 27 3 Updated Dec 3, 2023

shuyansy / Survey-of-Visual-Text-Processing

The official project for the paper "Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing"

44 1 Updated Oct 7, 2024

bahjat-kawar / time-diffusion

Official code repo for "Editing Implicit Assumptions in Text-to-Image Diffusion Models"

Python 80 2 Updated Mar 15, 2023

yeungchenwa / Recommendations-Diffusion-Text-Image

A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text removal, text image super-resolution, text editing, handwritten ge…

193 4 Updated Aug 2, 2024

zhoubolei / bolei_awesome_posters

CVPR and NeurIPS poster examples and templates. May we have in-person poster session soon!

1,456 136 Updated May 9, 2023

UKPLab / sentence-transformers

State-of-the-Art Text Embeddings

Python 15,051 2,451 Updated Oct 15, 2024

dali92002 / OCR-TR

Optical Character Recognition (OCR / HTR) using Transformers

Python 10 1 Updated Aug 20, 2022

shuyansy / multilingual-machine-translation

Code for multilingual machine translation (English, Korean, Japanese, Arabic)

Python 1 Updated Sep 7, 2023

dali92002 / SSL-OCR

Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023

Python 23 6 Updated Jul 12, 2023

shuyansy / Synthesis-multilingual-handwritten-text-data

A simple method for generating handwritten text datasets, which is beneficial for handwritten text detection and segmentation

Python 1 Updated Sep 7, 2023

advimman / lama

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Jupyter Notebook 7,939 842 Updated Jul 26, 2024
