Stars
[WACV 2023] Audio-Visual Efficient Conformer (AVEC) for Robust Speech Recognition
PyTorch implementation of "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring" (CVPR 2023) and "Visual Context-driven Audio Feature Enhan…
Official code implementation for the ICLR paper "LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading"
Implementation of DiffWave and SaShiMi audio generation models