Here is a brief introduction to each module (directory):

- `bin`: training and recognition binaries
- `dataset`: IO design
- `utils`: common utils
- `transformer`: the core of WeNet, in which the standard transformer/conformer is implemented. It contains the common blocks (backbone) of speech transformers (a generic sketch of the attention block appears after this list).
  - `transformer/attention.py`: standard multi-head attention
  - `transformer/embedding.py`: standard position encoding
  - `transformer/positionwise_feed_forward.py`: standard feed forward in transformer
  - `transformer/convolution.py`: ConvolutionModule in the Conformer model
  - `transformer/subsampling.py`: subsampling implementation for speech tasks
- `transducer`: transducer implementation
- `squeezeformer`: squeezeformer implementation; please refer to the paper
- `efficient_conformer`: efficient conformer implementation; please refer to the paper
- `paraformer`: paraformer implementation; please refer to the paper
  - `paraformer/cif.py`: Continuous Integrate-and-Fire implementation; please refer to the paper (see the CIF sketch after this list)
- `branchformer`: branchformer implementation; please refer to the paper
- `whisper`: whisper implementation; please refer to the paper
- `ssl`: self-supervised speech model implementations, e.g. wav2vec2, BEST-RQ, w2v-BERT
- `ctl_model`: implementation of "Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning"; please refer to the paper
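To make the "common blocks" concrete, here is a minimal, self-contained sketch of the standard multi-head attention that `transformer/attention.py` covers. This is illustrative PyTorch, not WeNet's actual class: it omits details such as relative position encoding, dropout on the attention weights, and the decoding cache.

```python
# Minimal sketch of standard multi-head attention (illustrative only,
# not WeNet's actual transformer/attention.py).
import math
import torch


class MultiHeadAttention(torch.nn.Module):
    def __init__(self, n_head: int, n_feat: int):
        super().__init__()
        assert n_feat % n_head == 0
        self.d_k = n_feat // n_head
        self.h = n_head
        self.linear_q = torch.nn.Linear(n_feat, n_feat)
        self.linear_k = torch.nn.Linear(n_feat, n_feat)
        self.linear_v = torch.nn.Linear(n_feat, n_feat)
        self.linear_out = torch.nn.Linear(n_feat, n_feat)

    def forward(self, q, k, v, mask=None):
        # q/k/v: (batch, time, n_feat); mask: (batch, 1, time) bool, True = keep.
        b = q.size(0)
        q = self.linear_q(q).view(b, -1, self.h, self.d_k).transpose(1, 2)
        k = self.linear_k(k).view(b, -1, self.h, self.d_k).transpose(1, 2)
        v = self.linear_v(v).view(b, -1, self.h, self.d_k).transpose(1, 2)
        # Scaled dot-product attention per head.
        scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(self.d_k)
        if mask is not None:
            scores = scores.masked_fill(~mask.unsqueeze(1), float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        # Merge heads back into one feature dimension.
        out = torch.matmul(attn, v).transpose(1, 2).contiguous()
        return self.linear_out(out.view(b, -1, self.h * self.d_k))
```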
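Likewise, a toy version of the Continuous Integrate-and-Fire rule behind `paraformer/cif.py` is sketched below. It assumes each per-frame weight stays below the firing threshold (so at most one boundary fires per frame) and is meant only to show the integrate/fire logic, not WeNet's batched implementation:

```python
import torch


def cif(hidden: torch.Tensor, alpha: torch.Tensor,
        threshold: float = 1.0) -> torch.Tensor:
    """hidden: (T, D) encoder frames; alpha: (T,) non-negative weights.
    Returns (N, D): one integrated embedding per fired token boundary."""
    integrated = hidden.new_zeros(hidden.size(1))
    accum = 0.0
    tokens = []
    for t in range(hidden.size(0)):
        a = float(alpha[t])  # assumed < threshold (at most one fire per frame)
        if accum + a < threshold:
            # Keep integrating: no token boundary yet.
            accum += a
            integrated = integrated + a * hidden[t]
        else:
            # Fire: spend just enough weight to fill the current token,
            # then carry the leftover weight into the next one.
            fire_w = threshold - accum
            tokens.append(integrated + fire_w * hidden[t])
            accum = a - fire_w
            integrated = accum * hidden[t]
    return torch.stack(tokens) if tokens else hidden.new_zeros(0, hidden.size(1))
```

For example, `cif(torch.randn(100, 256), torch.full((100,), 0.3))` would integrate a total weight of 30 and emit roughly 30 token embeddings.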
`transducer`, `squeezeformer`, `efficient_conformer`, `branchformer`, and `cif` are all based on `transformer`; they reuse many of its common blocks.
If you want to contribute your own x-former, please reuse the current code as much as possible.
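As a hedged illustration of that reuse, the sketch below shows how a hypothetical x-former encoder layer could be built from the existing blocks. The import paths match the files listed above, but the constructor and forward signatures vary between WeNet versions (for example, newer releases return an (output, cache) tuple from attention), so verify against your checkout rather than copy-pasting:

```python
# Hypothetical sketch of a new x-former layer reusing WeNet's common
# blocks; class names and signatures may differ across WeNet versions.
import torch
from wenet.transformer.attention import MultiHeadedAttention
from wenet.transformer.positionwise_feed_forward import PositionwiseFeedForward


class MyFormerLayer(torch.nn.Module):
    """Toy pre-norm layer: reused attention + FFN, plus whatever is new."""

    def __init__(self, size: int = 256, heads: int = 4, dropout: float = 0.1):
        super().__init__()
        self.self_attn = MultiHeadedAttention(heads, size, dropout)
        self.ffn = PositionwiseFeedForward(size, 4 * size, dropout)
        self.norm_att = torch.nn.LayerNorm(size)
        self.norm_ffn = torch.nn.LayerNorm(size)

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # Pre-norm residual blocks, the pattern used throughout wenet/transformer.
        y = self.norm_att(x)
        att = self.self_attn(y, y, y, mask)
        if isinstance(att, tuple):  # newer versions also return a cache
            att = att[0]
        x = x + att
        return x + self.ffn(self.norm_ffn(x))
```

The existing x-formers follow the same pattern: they subclass or compose the `transformer` blocks and add only the pieces that are genuinely new.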