Wav2vec2 and Whisper models for audio classification

These scripts cover several audio classification scenarios, including speaker identification, language recognition, emotion recognition, and sentiment analysis, using Wav2Vec2 and Whisper models.

To fine-tune the Whisper model for audio classification, use Whisper_Emotion.py.

To fine-tune the Wav2Vec2 model for audio classification, use wav2vec_Emotion.py.

The manifest that lists the WAV files must follow the same format as the train_voice_emotion.csv file.
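
As a rough illustration of how such a manifest could be read before training, here is a minimal sketch using only the Python standard library. The column names `path` and `label` are assumptions made for this example, as are the helper names `read_manifest` and `build_label_map`; check the header row of train_voice_emotion.csv and adjust the names to match.

```python
import csv


def read_manifest(csv_file, path_column="path", label_column="label"):
    """Read (wav_path, label) pairs from an open CSV manifest.

    The default column names are assumptions, not the repository's
    confirmed header; pass the real ones if they differ.
    """
    reader = csv.DictReader(csv_file)
    missing = {path_column, label_column} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"manifest is missing columns: {sorted(missing)}")
    return [(row[path_column], row[label_column]) for row in reader]


def build_label_map(rows):
    """Map each distinct label to an integer class id, in sorted order."""
    labels = sorted({label for _, label in rows})
    return {label: index for index, label in enumerate(labels)}
```

A call might look like `with open("train_voice_emotion.csv", newline="") as f: rows = read_manifest(f)`, followed by `build_label_map(rows)` to get the class ids a classification head expects.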

In addition, wav2vec_Emotion_specaugm.py applies SpecAugment as an augmentation technique, wav2vec_embeding.py extracts and saves feature embeddings, and wav2vec_emb_score.py extracts scores for each WAV file.
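
To give a feel for what SpecAugment-style masking does, here is a minimal NumPy sketch that zeroes random bands along the time and feature axes of a (time, feature) array. It is not taken from wav2vec_Emotion_specaugm.py, which may apply the augmentation differently (for example through the model configuration). The function name `spec_augment` and all parameter names and defaults are assumptions for illustration, and the time-warping step of the original SpecAugment method is left out.

```python
import numpy as np


def spec_augment(features, num_time_masks=2, max_time_width=10,
                 num_freq_masks=2, max_freq_width=8, rng=None):
    """Return a masked copy of a (time, feature) array.

    Hypothetical helper: masks are contiguous bands set to zero,
    with widths drawn uniformly from 0 up to the given maximum.
    """
    if rng is None:
        rng = np.random.default_rng()
    out = np.array(features, dtype=float, copy=True)
    n_time, n_feat = out.shape

    # Time masking: zero out whole frames in a random contiguous span.
    for _ in range(num_time_masks):
        width = int(rng.integers(0, min(max_time_width, n_time) + 1))
        start = int(rng.integers(0, n_time - width + 1))
        out[start:start + width, :] = 0.0

    # Feature masking: zero out a random contiguous span of channels.
    for _ in range(num_freq_masks):
        width = int(rng.integers(0, min(max_freq_width, n_feat) + 1))
        start = int(rng.integers(0, n_feat - width + 1))
        out[:, start:start + width] = 0.0

    return out
```

Masking is normally applied only during training, with a fresh random draw for every batch, so that the model never sees the same masked view twice.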

Please use slurm_run.sh to run the scripts with Slurm.

Repository: areffarhadi/Wav2vec2_Whisper_audio_classification