
Wav2vec2 and Whisper models for audio classification

Using these scripts, you can implement several audio classification scenarios, such as speaker identification, language recognition, emotion recognition, sentiment analysis, and more, with the Wav2vec2 and Whisper models.

To fine-tune the Whisper model for audio classification, use Whisper_Emotion.py.
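
A minimal sketch of this kind of fine-tuning setup with Hugging Face Transformers is shown below; the checkpoint name, label set, and dummy input are assumptions for illustration, so check Whisper_Emotion.py for the configuration actually used here.

```python
# Sketch: Whisper for audio classification with Hugging Face Transformers.
# Checkpoint and labels are placeholders; see Whisper_Emotion.py for the real setup.
import torch
from transformers import WhisperFeatureExtractor, WhisperForAudioClassification

labels = ["angry", "happy", "neutral", "sad"]  # hypothetical emotion labels
feature_extractor = WhisperFeatureExtractor.from_pretrained("openai/whisper-base")
model = WhisperForAudioClassification.from_pretrained(
    "openai/whisper-base",
    num_labels=len(labels),
    label2id={l: i for i, l in enumerate(labels)},
    id2label={i: l for i, l in enumerate(labels)},
)

# A dummy 1-second, 16 kHz waveform stands in for a real utterance from the manifest.
waveform = torch.zeros(16000)
inputs = feature_extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(input_features=inputs.input_features).logits
print(labels[int(logits.argmax(dim=-1))])
```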

To fine-tune Wav2Vec2 for audio classification, use wav2vec_Emotion.py.
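
The Wav2Vec2 counterpart follows the same pattern; the sketch below again uses a placeholder checkpoint and label set, so refer to wav2vec_Emotion.py for the actual hyperparameters.

```python
# Sketch: Wav2Vec2 for audio classification. Checkpoint and labels are placeholders;
# see wav2vec_Emotion.py for the configuration used in this repository.
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2ForSequenceClassification

labels = ["angry", "happy", "neutral", "sad"]  # hypothetical label set
feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2ForSequenceClassification.from_pretrained(
    "facebook/wav2vec2-base",
    num_labels=len(labels),
)

waveform = torch.zeros(16000)  # dummy 1-second, 16 kHz clip
inputs = feature_extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[int(logits.argmax(dim=-1))])
```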

The manifest that feeds the WAV data must follow the format of the train_voice_emotion.csv file.
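
For illustration only, a manifest with an audio-path column and a label column could be loaded as follows; the column names here are assumptions, so take the real layout from train_voice_emotion.csv.

```python
# Illustrative only: a two-column manifest (audio path + label) is assumed here;
# check train_voice_emotion.csv for the actual column names and layout.
from datasets import Audio, load_dataset

dataset = load_dataset("csv", data_files={"train": "train_voice_emotion.csv"})["train"]
# Assuming a "path" column pointing at WAV files; decode it as 16 kHz audio.
dataset = dataset.cast_column("path", Audio(sampling_rate=16000))
print(dataset[0])
```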

In addition, wav2vec_Emotion_specaugm.py applies SpecAugment as an augmentation technique, wav2vec_embeding.py extracts and saves feature embeddings (see the sketch below), and wav2vec_emb_score.py extracts a score for each WAV file.
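
Embedding extraction with a Wav2Vec2 backbone can look roughly like the following; mean pooling over time and the .npy output are assumptions, so see wav2vec_embeding.py for the exact pooling and output format.

```python
# Sketch of feature-embedding extraction with a Wav2Vec2 backbone. Mean pooling
# over time and the .npy output format are assumptions; see wav2vec_embeding.py
# for the exact procedure used in this repository.
import numpy as np
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2Model

feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base").eval()

waveform = torch.zeros(16000)  # stand-in for one WAV file from the manifest
inputs = feature_extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # (1, time, hidden_size)
embedding = hidden_states.mean(dim=1).squeeze(0).numpy()  # one vector per file
np.save("embedding.npy", embedding)
```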

Please use slurm_run.sh to run the scripts with Slurm.
