Tsinghua University - Beijing, China
https://www.yuangpeng.com - @yuang_peng
Stars
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
This repo is for processing the large video dataset WebVid-10M, including sampling, tracking, and more.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment
This is an implementation of an LLM attack on ChatGLM2.
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
The conventional commits specification
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
This repository contains demos I made with the Transformers library by HuggingFace.
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
Ongoing research training transformer models at scale
Open-Sora: Democratizing Efficient Video Production for All
DALL·E Mini - Generate images from a text prompt
Modeling, training, eval, and inference code for OLMo
Emu Series: Generative Multimodal Models from BAAI
Official Code for Stable Cascade
Large World Model -- Modeling Text and Video with Millions Context