update conformer for tedlium2
pyf98 committed Dec 19, 2022
1 parent 26f432b commit 8ee35df
Showing 4 changed files with 156 additions and 0 deletions.
40 changes: 40 additions & 0 deletions egs2/tedlium2/asr1/README.md
@@ -1,4 +1,44 @@
# RESULTS

## Environments
- date: `Fri Dec 16 05:04:30 CST 2022`
- python version: `3.9.15 (main, Nov 24 2022, 14:31:59) [GCC 11.2.0]`
- espnet version: `espnet 202209`
- pytorch version: `pytorch 1.12.1`
- Git hash: `26f432bc859e5e40cac1a86042d498ba7baffbb0`
- Commit date: `Fri Dec 9 02:16:01 2022 +0000`

## asr_train_asr_conformer_raw_en_bpe500_sp

Config: [conf/tuning/train_asr_conformer.yaml](conf/tuning/train_asr_conformer.yaml)
Params: 30.76 M
Model: [https://huggingface.co/pyf98/tedlium2_conformer](https://huggingface.co/pyf98/tedlium2_conformer)

## Without LM

### WER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_asr_model_valid.acc.ave/dev|466|14671|93.1|4.4|2.5|1.0|7.8|69.7|
|decode_asr_asr_model_valid.acc.ave/test|1155|27500|93.4|4.0|2.6|1.0|7.6|64.2|

### CER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_asr_model_valid.acc.ave/dev|466|78259|97.0|0.9|2.2|0.9|3.9|69.7|
|decode_asr_asr_model_valid.acc.ave/test|1155|145066|96.9|0.9|2.2|0.9|4.0|64.2|

### TER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_asr_model_valid.acc.ave/dev|466|28296|94.7|2.9|2.4|0.9|6.3|69.7|
|decode_asr_asr_model_valid.acc.ave/test|1155|52113|95.0|2.6|2.5|0.9|5.9|64.2|
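
As a quick sanity check on the tables above, the `Err` column is the sum of the substitution, deletion, and insertion rates. The following is a minimal sketch that recomputes `Err` from the reported WER components. The names `error_rate` and `is_consistent` are illustrative and not part of ESPnet. Small mismatches, such as a recomputed 7.9 against the reported 7.8 on dev, are expected because each column is rounded independently.

```python
# Rows copied from the WER table above: (name, Sub, Del, Ins, reported Err).
WER_ROWS = [
    ("dev", 4.4, 2.5, 1.0, 7.8),
    ("test", 4.0, 2.6, 1.0, 7.6),
]


def error_rate(sub: float, dele: float, ins: float) -> float:
    """Total error rate in percent: substitutions + deletions + insertions."""
    return sub + dele + ins


def is_consistent(sub, dele, ins, reported, tol=0.15):
    """True if the recomputed Err matches the reported one up to rounding."""
    return abs(error_rate(sub, dele, ins) - reported) <= tol


for name, sub, dele, ins, reported in WER_ROWS:
    total = error_rate(sub, dele, ins)
    print(f"{name}: recomputed {total:.1f}, reported {reported:.1f}")
```

The tolerance of 0.15 is an assumption chosen to absorb one-decimal rounding in three summed columns.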



## Environments
- date: `Thu Nov 11 09:45:45 CST 2021`
- python version: `3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]`
6 changes: 6 additions & 0 deletions egs2/tedlium2/asr1/conf/decode_asr.yaml
@@ -0,0 +1,6 @@
beam_size: 20
ctc_weight: 0.3
lm_weight: 0.0
maxlenratio: 0.0
minlenratio: 0.0
penalty: 0.0
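
The decoding options above control how beam search ranks hypotheses. What follows is a minimal sketch, assuming the usual ESPnet joint CTC/attention convention: the attention decoder is weighted by `1 - ctc_weight`, the CTC score by `ctc_weight`, an external LM by `lm_weight`, and `penalty` acts as a per-token length bonus. The function `hypothesis_score` and its arguments are hypothetical names for illustration, not ESPnet API.

```python
def hypothesis_score(
    att_logp: float,
    ctc_logp: float,
    lm_logp: float,
    length: int,
    ctc_weight: float = 0.3,
    lm_weight: float = 0.0,
    penalty: float = 0.0,
) -> float:
    """Weighted log-probability used to rank one beam-search hypothesis."""
    return (
        (1.0 - ctc_weight) * att_logp  # attention decoder term
        + ctc_weight * ctc_logp        # CTC prefix score term
        + lm_weight * lm_logp          # external LM term (disabled at 0.0)
        + penalty * length             # length bonus (disabled at 0.0)
    )


# Made-up log-probabilities, only to show how the weights interact.
score = hypothesis_score(att_logp=-4.0, ctc_logp=-6.0, lm_logp=-9.0, length=5)
print(score)
```

With `lm_weight: 0.0` the LM term drops out entirely, which is consistent with `--use_lm false` in `run.sh`.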
77 changes: 77 additions & 0 deletions egs2/tedlium2/asr1/conf/tuning/train_asr_conformer.yaml
@@ -0,0 +1,77 @@
# Trained with NVIDIA A40 GPU (48GB) x 2
encoder: conformer
encoder_conf:
    output_size: 256
    attention_heads: 4
    linear_units: 1024
    num_blocks: 12
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    attention_dropout_rate: 0.1
    input_layer: conv2d
    normalize_before: true
    macaron_style: true
    rel_pos_type: latest
    pos_enc_layer_type: rel_pos
    selfattention_layer_type: rel_selfattn
    activation_type: swish
    use_cnn_module: true
    cnn_module_kernel: 31

decoder: transformer
decoder_conf:
    attention_heads: 4
    linear_units: 2048
    num_blocks: 6
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    self_attention_dropout_rate: 0.1
    src_attention_dropout_rate: 0.1

model_conf:
    ctc_weight: 0.3
    lsm_weight: 0.1
    length_normalized_loss: false

frontend_conf:
    n_fft: 512
    win_length: 400
    hop_length: 160

seed: 2022
use_amp: true
num_workers: 6
batch_type: numel
batch_bins: 50000000
accum_grad: 1
max_epoch: 50
init: none
best_model_criterion:
-   - valid
    - acc
    - max
keep_nbest_models: 10

optim: adam
optim_conf:
    lr: 0.002
    weight_decay: 0.000001
scheduler: warmuplr
scheduler_conf:
    warmup_steps: 15000

specaug: specaug
specaug_conf:
    apply_time_warp: true
    time_warp_window: 5
    time_warp_mode: bicubic
    apply_freq_mask: true
    freq_mask_width_range:
    - 0
    - 27
    num_freq_mask: 2
    apply_time_mask: true
    time_mask_width_ratio_range:
    - 0.
    - 0.05
    num_time_mask: 5
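
For reference, the `warmuplr` scheduler configured above follows a Noam-style schedule: the learning rate rises linearly for `warmup_steps` updates, peaks at the optimizer's `lr`, then decays with the inverse square root of the step. Below is a minimal sketch of that formula in plain Python. The function name `warmup_lr` is illustrative; the code mirrors the formula documented for ESPnet's `WarmupLR` rather than reproducing the class itself.

```python
def warmup_lr(step: int, base_lr: float = 0.002, warmup_steps: int = 15000) -> float:
    """Learning rate at a 1-indexed optimizer step under Noam-style warmup."""
    if step < 1:
        raise ValueError("step must be >= 1")
    # Linear ramp (step * warmup^-1.5) before the peak, step^-0.5 decay after it.
    return base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)


# Halfway through warmup, at the peak, and well into the decay phase.
for step in (1, 7500, 15000, 60000):
    print(step, f"{warmup_lr(step):.6f}")
```

Under this formula the peak learning rate equals `lr` (0.002 here) and is reached exactly at step `warmup_steps`.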
33 changes: 33 additions & 0 deletions egs2/tedlium2/asr1/run.sh
@@ -0,0 +1,33 @@
#!/usr/bin/env bash
# Set bash to strict mode; it will exit on:
# -e 'error', -u 'undefined variable', -o pipefail 'error in pipeline'
set -e
set -u
set -o pipefail

train_set="train"
valid_set="dev"
test_sets="test dev"

asr_config=conf/tuning/train_asr_conformer.yaml
inference_config=conf/decode_asr.yaml

./asr.sh \
    --lang en \
    --nj 8 \
    --ngpu 2 \
    --gpu_inference true \
    --inference_nj 2 \
    --feats_type raw \
    --audio_format "flac.ark" \
    --token_type bpe \
    --nbpe 500 \
    --use_lm false \
    --asr_config "${asr_config}" \
    --inference_config "${inference_config}" \
    --train_set "${train_set}" \
    --valid_set "${valid_set}" \
    --test_sets "${test_sets}" \
    --speed_perturb_factors "0.9 1.0 1.1" \
    --bpe_train_text "data/${train_set}/text" \
    --lm_train_text "data/${train_set}/text" "$@"
