Stars
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
Deezer source separation library including pretrained models.
C++17 port of Demucs v3 (hybrid) and v4 (hybrid transformer) models with ggml and Eigen3
The implementation of "X-TF-GridNet: A Time-Frequency Domain Target Speaker Extraction Network with Adaptive Speaker Embedding Fusion", accepted by Information Fusion.
Real-time speech enhancement mobile app using Nested U-Net
Conformer-based Metric GAN for speech enhancement
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates @ INTERSPEECH 2022
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
We implemented the DEMUCS model for speech enhancement in the time-frequency domain, and additionally implemented HD-DEMUCS.
We provide a PyTorch implementation of the paper "Voice Separation with an Unknown Number of Multiple Speakers", in which we present a new method for separating a mixed audio sequence, in which multi…
A high-quality speech analysis, manipulation and synthesis system
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization
24/7 local AI screen & mic recording. Start recording your screen today ... or be left behind. Works with Ollama. Alternative to Rewind.ai & Zapier. Open. Secure. You own your data. Rust.
This is the code and dataset repo for the Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamics"
Implementation of the paper "DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement"
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
Transformer with Local Modeling by Convolution for Speech Separation and Enhancement
The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023
Windows desktop front end for Spleeter - AI source separation
API for a Vocal Remover that uses Deep Neural Networks.
A ComfyUI custom node for UVR5 that separates vocals from background music
Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)
Music repair method to convert lossy MP3 compressed music to lossless music.
Active noise cancellation using various algorithms (FxLMS, FuLMS, NLMS) in MATLAB, VST and C