Audio source separation and speech enhancement are fast-evolving fields with a growing number of papers submitted to conferences each year. While datasets such as wsj0-{2, 3}mix, WHAM or MS-SNSD are being shared, there has been little effort to create common codebases for the development and evaluation of source separation and speech enhancement algorithms. Hence Asteroid!
Asteroid is intended as a community-based project that goes beyond sharing datasets: we share tools to create new algorithms as well as recipes to reproduce published papers.
- User friendliness.
- Modularity.
- Extensibility.
- Reproducibility.
- Deep clustering (Hershey et al. and Isik et al.)
- Tasnet (Luo et al.)
- ConvTasnet (Luo et al.)
- Chimera ++ (for ) (Luo et al. and Wang et al.)
- FurcaNeXt (Shi et al.)
- DualPathRNN (Luo et al.)
- Two step learning (Tzinis et al.)
Analysis-synthesis
- STFT (See unit tests)
- Free, i.e. fully learned (Luo et al.)
- Analytic free, i.e. fully learned under an analyticity constraint (Pariente et al.)
- Parametrized Sinc (Pariente et al.)
Analysis only (can be extended)
- Fixed Multi-Phase Gammatones (Ditter et al.)
- Parametrized modulated Gaussian windows (Openreview)
- Parametrized Gammatone (Openreview)
- Parametrized Gammachirp (Openreview)
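To make the analysis-synthesis idea concrete, here is a minimal sketch in plain Python. It is not Asteroid's API: the function names `analysis` and `synthesis`, the hand-picked 2-tap orthonormal filters and the toy signal are all assumptions made for illustration. In a free filterbank the filter values would be learned, and the analysis and synthesis filters would not have to match as they do here.

```python
import math


def analysis(signal, filters, stride):
    """Encode: correlate every filter with each strided frame of the signal."""
    kernel_size = len(filters[0])
    n_frames = (len(signal) - kernel_size) // stride + 1
    return [
        [sum(w * signal[i * stride + k] for k, w in enumerate(f)) for f in filters]
        for i in range(n_frames)
    ]


def synthesis(coeffs, filters, stride):
    """Decode: weight each filter by its coefficient and overlap-add the frames."""
    kernel_size = len(filters[0])
    out = [0.0] * ((len(coeffs) - 1) * stride + kernel_size)
    for i, frame in enumerate(coeffs):
        for c, f in zip(frame, filters):
            for k, w in enumerate(f):
                out[i * stride + k] += c * w
    return out


# Hypothetical stand-in for learned filters: a 2-tap orthonormal pair.
s = 1 / math.sqrt(2)
filters = [[s, s], [s, -s]]

signal = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5]
coeffs = analysis(signal, filters, stride=2)          # one row of coefficients per frame
reconstructed = synthesis(coeffs, filters, stride=2)  # same length as the input here
```

Because the filters in this sketch are orthonormal and the frames do not overlap, synthesis inverts analysis up to floating-point error. Analysis-only filterbanks provide just the first half of this pipeline.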
Make table with all the results per dataset here.