This is a speaker recognition system with a GUI, developed as an SRT project for the course Signal Processing (Fall 2013) at Tsinghua University.
Dependencies:
- SciPy
- scikit-learn
- scikits.talkbox
- bob
- pyssp
- PyQt
- PyAudio
- gcc >= 4.7
Algorithms used:
- Voice Activity Detection (VAD): Long-Term Spectral Divergence (LTSD)
- Gaussian Mixture Model (GMM)
- Universal Background Model (UBM)
- Continuous Restricted Boltzmann Machine (CRBM)
- Joint Factor Analysis (JFA)
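
As a rough illustration of how a GMM and a UBM are commonly combined in speaker verification, the sketch below scores an utterance by the average per-frame log-likelihood ratio between a speaker GMM and the UBM. It is not this project's implementation: the function names, the `(weights, means, variances)` tuple layout, and the toy two-dimensional models are all assumptions made for the example, and real systems would train the models on acoustic features rather than hard-code them.

```python
import math


def diag_gmm_log_likelihood(frame, weights, means, variances):
    """Log-density of one feature frame under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            # Log of a 1-D Gaussian density, summed over dimensions.
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Numerically stable log-sum-exp over the mixture components.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def mean_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of an utterance under one GMM."""
    return sum(diag_gmm_log_likelihood(f, *gmm) for f in frames) / len(frames)


def llr_score(frames, speaker_gmm, ubm):
    """Log-likelihood ratio: positive values favor the claimed speaker."""
    return mean_log_likelihood(frames, speaker_gmm) - mean_log_likelihood(frames, ubm)


# Hypothetical toy models, each given as (weights, means, variances).
ubm = ([0.5, 0.5], [[0.0, 0.0], [4.0, 4.0]], [[4.0, 4.0], [4.0, 4.0]])
alice = ([0.5, 0.5], [[-1.0, -1.0], [0.0, -2.0]], [[1.0, 1.0], [1.0, 1.0]])

# Hypothetical feature frames: one utterance near Alice's model, one far from it.
near_alice = [[-0.8, -1.1], [-0.2, -1.9], [-1.0, -0.7]]
far_from_alice = [[4.2, 3.9], [3.8, 4.5], [4.1, 4.0]]

if __name__ == "__main__":
    print("near:", llr_score(near_alice, alice, ubm))
    print("far: ", llr_score(far_from_alice, alice, ubm))
```

A verification decision would then compare `llr_score` against a threshold tuned on held-out data. The broad UBM acts as a model of "any speaker", so frames that fit the speaker model no better than the background yield a score near or below zero.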
For more details about this project, see:
- Our presentation slides
- Our complete report
The GUI supports recording, enrollment, training, and testing, and also visualizes speaker recognition in real time.
See our demo video (in Chinese) for more details.