zouhaoa
  • Zhejiang University
  • HangZhou

Starred repositories

74 Updated Sep 24, 2024

A paper list of recent works on token compression for ViT and VLM

62 Updated Sep 25, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,188 278 Updated May 4, 2024

g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains

Python 3,421 315 Updated Oct 1, 2024

Personal Project: MPP-Qwen14B & MPP-Qwen-Next(Multimodal Pipeline Parallel based on Qwen-LM). Support [video/image/multi-image] {sft/conversations}. Don't let the poverty limit your imagination! Tr…

Jupyter Notebook 360 20 Updated Sep 24, 2024

Official inference repo for FLUX.1 models

Python 14,461 1,039 Updated Oct 3, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 2,458 138 Updated Oct 4, 2024

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 28,237 3,220 Updated Oct 5, 2024

High-resolution models for human tasks.

Python 4,128 218 Updated Oct 3, 2024

Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 882 39 Updated Sep 30, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,779 108 Updated Jul 29, 2024

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

Python 303 15 Updated Sep 18, 2024

Controllable video and image Generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA

Python 1,299 60 Updated Sep 25, 2024

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

Python 9,213 715 Updated Aug 5, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 11,329 970 Updated Oct 5, 2024

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 13,234 1,062 Updated May 23, 2024

Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

Python 222 13 Updated Aug 6, 2024

Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models

Python 46 3 Updated May 25, 2024

ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation (ICCV 2023, Oral)

Python 511 30 Updated Jan 8, 2024

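AIGC-interview/CV-interview/LLMs-interview: a collection of interview questions and answers, also covering new ideas, new problems, new resources, and new projects encountered in work and research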

1,597 165 Updated Sep 29, 2024

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.

Jupyter Notebook 5,085 331 Updated Jun 28, 2024

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Python 10,959 800 Updated Jul 18, 2024

Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"

Jupyter Notebook 106 2 Updated Sep 21, 2024

Kolors Team

Python 3,666 242 Updated Sep 4, 2024

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

2,103 187 Updated Aug 20, 2024

Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.

Python 178 26 Updated Sep 2, 2024

Simulator-conditioned Driving Scene Generation

48 3 Updated Jun 29, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,705 112 Updated Sep 19, 2024

This repository is a curated collection of the most exciting and influential CVPR 2024 papers. 🔥 [Paper + Code + Demo]

Python 636 58 Updated Jun 24, 2024