zouhaoa
  • Zhejiang University
  • Hangzhou

Starred repositories

Codebase for Aria - an Open Multimodal Native MoE

Jupyter Notebook 777 68 Updated Nov 7, 2024

Make huge neural nets fit in memory

Python 2,724 270 Updated Apr 26, 2020
Python 118 Updated Oct 9, 2024

A paper list of recent work on token compression for ViT and VLM

128 2 Updated Nov 6, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,206 277 Updated May 4, 2024

g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains

Python 3,848 348 Updated Oct 7, 2024

Personal Project: MPP-Qwen14B & MPP-Qwen-Next(Multimodal Pipeline Parallel based on Qwen-LM). Support [video/image/multi-image] {sft/conversations}. Don't let the poverty limit your imagination! Tr…

Jupyter Notebook 375 20 Updated Sep 24, 2024

Official inference repo for FLUX.1 models

Python 15,705 1,128 Updated Oct 8, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,975 175 Updated Oct 4, 2024

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 31,225 3,710 Updated Nov 8, 2024

High-resolution models for human tasks.

Python 4,455 244 Updated Oct 24, 2024

Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,011 44 Updated Nov 5, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,824 111 Updated Jul 29, 2024

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

Python 318 15 Updated Oct 8, 2024

Controllable video and image Generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA

Python 1,395 66 Updated Sep 25, 2024

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

Python 9,258 717 Updated Aug 5, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 12,187 1,114 Updated Oct 14, 2024

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 13,678 1,093 Updated May 23, 2024

Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

Python 233 13 Updated Aug 6, 2024

Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models

Python 50 3 Updated May 25, 2024

ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation (ICCV 2023, Oral)

Python 511 30 Updated Jan 8, 2024

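AIGC-interview/CV-interview/LLMs-interview: a collection of interview questions and answers, along with new ideas, new questions, new resources, and new projects from work and research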

1,707 171 Updated Oct 14, 2024

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.

Jupyter Notebook 5,229 337 Updated Jun 28, 2024

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Python 11,078 807 Updated Jul 18, 2024

Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"

Jupyter Notebook 112 3 Updated Sep 21, 2024

Kolors Team

Python 3,824 262 Updated Sep 4, 2024

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

2,157 189 Updated Nov 7, 2024

Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.

Python 195 29 Updated Oct 18, 2024

Simulator-conditioned Driving Scene Generation

59 4 Updated Jun 29, 2024