Stars
Audio plugin for custom MP3 distortion and digital glitches
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Entropy Based Sampling and Parallel CoT Decoding
Python library and shell utilities to monitor filesystem events.
GammaCV is a WebGL-accelerated Computer Vision library for the browser
Gourieff / sd-webui-reactor
Forked from s0md3v/sd-webui-roop
Fast and Simple Face Swap Extension for StableDiffusion WebUI (A1111 SD WebUI, SD WebUI Forge, SD.Next, Cagliostro)
A collection of resources on digital human including clothed people digitalization, virtual try-on, and other related directions.
UdioWrapper is a Python package that enables the generation of music tracks using Udio's API through textual prompts. This package is based on the reverse engineering of the Udio API (https://www.…
Use API to call the music generation AI of suno.ai, and easily integrate it into agents like GPTs.
Official Implementation of Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Native UI for the Whispering Tiger project - https://github.com/Sharrnah/whispering (live transcription / translation)
High performance AI inference stack. Built for production. @ziglang / @openxla / MLIR / @bazelbuild
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
GUI for a Vocal Remover that uses Deep Neural Networks.
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models
Node.js client for Replicate
Generate a simple Node.js project structure for running AI models with Replicate's API
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Create butter-smooth transitions between prompts, powered by stable diffusion
[SIGGRAPH Asia 2024 (Journal Track)] StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
Foundational model for human-like, expressive TTS
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…