Module Introduction

Here is a brief introduction to each module (directory).

  • bin: training and recognition binaries
  • dataset: IO design
  • utils: common utils
  • transformer: the core of WeNet, in which the standard transformer/conformer is implemented. It contains the common blocks (backbone) of speech transformers.
    • transformer/ Standard multi-head attention
    • transformer/ Standard position encoding
    • transformer/ Standard feed-forward layer in the transformer
    • transformer/ ConvolutionModule in the Conformer model
    • transformer/ Subsampling implementation for speech tasks
  • transducer: transducer implementation
  • squeezeformer: squeezeformer implementation, please refer to the paper
  • efficient_conformer: efficient conformer implementation, please refer to the paper
  • paraformer: paraformer implementation, please refer to the paper
    • paraformer/ Continuous Integrate-and-Fire (CIF) implementation, please refer to the paper
  • branchformer: branchformer implementation, please refer to the paper
  • whisper: whisper implementation, please refer to the paper
  • ssl: self-supervised speech model implementations, e.g. wav2vec2, bestrq, w2vbert
  • ctl_model: implementation of "Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning", please refer to the paper
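
To make the "standard position encoding" entry above more concrete, here is a minimal pure-Python sketch of the sinusoidal position encoding from the original Transformer. It is only an illustration under stated assumptions: the function name `sinusoidal_position_encoding` is hypothetical and not part of WeNet's API, and the actual module may differ in details such as scaling, dropout, or relative-position variants.

```python
import math
from typing import List


def sinusoidal_position_encoding(max_len: int, d_model: int) -> List[List[float]]:
    """Build a max_len x d_model table of sinusoidal position encodings.

    Hypothetical helper for illustration only; not WeNet's implementation.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(max_len):
        row = [0.0] * d_model
        # i walks over the even dimensions; i and i + 1 share one frequency.
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row[i] = math.sin(angle)      # even dimension: sine
            row[i + 1] = math.cos(angle)  # odd dimension: cosine
        table.append(row)
    return table


# One row per position, one column per model dimension.
pe = sinusoidal_position_encoding(max_len=4, d_model=8)
print(len(pe), len(pe[0]))
```

A table like this is typically added to the input features before the first encoder layer, so that otherwise order-agnostic attention layers can distinguish positions.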

transducer, squeezeformer, efficient_conformer, branchformer and cif are all based on transformer, and they reuse many of the common blocks of transformer.

If you want to contribute your own x-former, please reuse the current code as much as possible.