ASR-Pipeline-Abhinav

ASR pipeline for voice activity detection (VAD), chunking, and transcription of Indian-language audio.

This pipeline processes audio files through a series of stages:

Voice Activity Detection (VAD): Detects the regions of speech, removes silence, and splits the audio at the speech boundaries.
Audio Chunking: Splits the audio into smaller chunks.
Transcription with Forced Alignment: Transcribes the audio and aligns each word with its timestamps, writing the result to a JSON file.
Speaker Diarization: Identifies the unique speakers in the audio by comparing speaker embeddings against a cosine-similarity threshold.
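
The speaker diarization idea can be illustrated with a minimal sketch. It is not the code in speaker_embedding.py or num_speakers.py. It assumes each audio chunk has already been turned into an embedding vector, and it uses a simple greedy rule: a chunk joins the most similar known speaker if the cosine similarity reaches the threshold, otherwise it starts a new speaker. The function names, the greedy rule, and the threshold value 0.75 are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def assign_speakers(embeddings, threshold=0.75):
    """Greedily label each embedding with a speaker index.

    The first embedding seen for a speaker serves as that speaker's
    representative (an assumption; averaging would also be reasonable).
    Returns (labels, number_of_speakers).
    """
    representatives = []
    labels = []
    for emb in embeddings:
        best_idx, best_sim = -1, threshold
        for idx, rep in enumerate(representatives):
            sim = cosine_similarity(emb, rep)
            if sim >= best_sim:
                best_idx, best_sim = idx, sim
        if best_idx == -1:
            # No known speaker is similar enough: register a new one.
            representatives.append(emb)
            best_idx = len(representatives) - 1
        labels.append(best_idx)
    return labels, len(representatives)


# Toy 2-D embeddings: two chunks per speaker.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
labels, num_speakers = assign_speakers(toy_embeddings)
```

A higher threshold tends to split one speaker into several, while a lower one tends to merge different speakers, so the value usually needs tuning on real recordings.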

Run asr_pipeline.py and, when prompted, enter a stage number to process the audio files one stage at a time. Enter 0 to exit.
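
The prompt-driven flow might look roughly like the sketch below. It is a hedged illustration, not the contents of asr_pipeline.py: the stage numbering (1 to 4, in the order listed above), the prompt text, and the placeholder handlers are assumptions.

```python
def run_menu(stages, read_input=input, write=print):
    """Prompt for stage numbers until the user enters 0.

    `stages` maps a stage number to a (name, handler) pair.
    Returns the list of stage numbers that were run, in order.
    """
    executed = []
    while True:
        choice = read_input("Enter stage number (0 to exit): ").strip()
        try:
            number = int(choice)
        except ValueError:
            write(f"Not a number: {choice!r}")
            continue
        if number == 0:
            break
        if number not in stages:
            write(f"Unknown stage: {number}")
            continue
        name, handler = stages[number]
        write(f"Running stage {number}: {name}")
        handler()
        executed.append(number)
    return executed


# Placeholder handlers; a real pipeline would call into the stage modules.
STAGES = {
    1: ("Voice Activity Detection", lambda: None),
    2: ("Audio Chunking", lambda: None),
    3: ("Transcription with Forced Alignment", lambda: None),
    4: ("Speaker Diarization", lambda: None),
}
```

Calling run_menu(STAGES) starts the interactive loop. Passing read_input and write as parameters keeps the loop easy to exercise without a terminal.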

Repository files:

README.md
allignment.py
asr_pipeline.py
chunking.py
conversion_sr.py
num_speakers.py
requirements.txt
speaker_embedding.py
transcription.py
vad.py
