Extract framewise alignment information using CTC decoding (k2-fsa#39)
* Use new APIs with k2.RaggedTensor

* Fix style issues.

* Update the installation doc, saying it requires at least k2 v1.7

* Extract framewise alignment information using CTC decoding.

* Print environment information.

Print information about k2, lhotse, PyTorch, and icefall.

* Fix CI.

* Fix CI.

* Compute framewise alignment information of the LibriSpeech dataset.

* Update comments for the time to compute alignments of train-960.

* Preserve cut id in mix cut transformer.

* Minor fixes.

* Add doc about how to extract framewise alignments.
csukuangfj authored Oct 18, 2021
1 parent bd7c2f7 commit 4890e27
Showing 18 changed files with 582 additions and 38 deletions.
9 changes: 8 additions & 1 deletion .github/workflows/test.yml
@@ -46,10 +46,18 @@ jobs:
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install libsndfile and libsox
        if: startsWith(matrix.os, 'ubuntu')
        run: |
          sudo apt update
          sudo apt install -q -y libsndfile1-dev libsndfile1 ffmpeg
          sudo apt install -q -y --fix-missing sox libsox-dev libsox-fmt-all

      - name: Install Python dependencies
        run: |
          python3 -m pip install --upgrade pip pytest
          pip install k2==${{ matrix.k2-version }}+cpu.torch${{ matrix.torch }} -f https://k2-fsa.org/nightly/
          pip install git+https://github.com/lhotse-speech/lhotse
          # icefall requirements
          pip install -r requirements.txt
@@ -88,4 +96,3 @@ jobs:
          # run tests for conformer ctc
          cd egs/librispeech/ASR/conformer_ctc
          pytest
6 changes: 4 additions & 2 deletions egs/librispeech/ASR/RESULTS.md
@@ -38,14 +38,16 @@ python conformer_ctc/train.py --bucketing-sampler True \
--concatenate-cuts False \
--max-duration 200 \
--full-libri True \
--world-size 4 \
--lang-dir data/lang_bpe_5000

python conformer_ctc/decode.py --nbest-scale 0.5 \
--epoch 34 \
--avg 20 \
--method attention-decoder \
--max-duration 20 \
--num-paths 100 \
--lang-dir data/lang_bpe_5000
```

### LibriSpeech training results (Tdnn-Lstm)
50 changes: 50 additions & 0 deletions egs/librispeech/ASR/conformer_ctc/README.md
@@ -1,3 +1,53 @@
## Introduction

Please visit
<https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html>
for how to run this recipe.

## How to compute framewise alignment information

### Step 1: Train a model

Please use `conformer_ctc/train.py` to train a model.
See <https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html>
for how to do it.

### Step 2: Compute framewise alignment

Run

```
# Choose a checkpoint and determine the number of checkpoints to average
epoch=30
avg=15
./conformer_ctc/ali.py \
--epoch $epoch \
--avg $avg \
--max-duration 500 \
--bucketing-sampler 0 \
--full-libri 1 \
--exp-dir conformer_ctc/exp \
--lang-dir data/lang_bpe_5000 \
--ali-dir data/ali_5000
```
and you will get four files inside the folder `data/ali_5000`:

```
$ ls -lh data/ali_5000
total 546M
-rw-r--r-- 1 kuangfangjun root 1.1M Sep 28 08:06 test_clean.pt
-rw-r--r-- 1 kuangfangjun root 1.1M Sep 28 08:07 test_other.pt
-rw-r--r-- 1 kuangfangjun root 542M Sep 28 11:36 train-960.pt
-rw-r--r-- 1 kuangfangjun root 2.1M Sep 28 11:38 valid.pt
```
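
To make the term concrete: a framewise alignment assigns one token ID to each output frame of the model. The sketch below is not taken from `conformer_ctc/ali.py` and does not read the `.pt` files above. It assumes a blank ID of 0, a subsampling factor of 4, and a 10 ms frame shift, and the helper name `collapse_alignment` is hypothetical. It shows how such a per-frame sequence could be collapsed into tokens with start times.

```python
from typing import List, Tuple


def collapse_alignment(
    frame_ids: List[int],
    blank_id: int = 0,  # assumption: the blank token has ID 0
    subsampling_factor: int = 4,  # assumption: the model subsamples frames by 4
    frame_shift_s: float = 0.01,  # assumption: 10 ms feature frame shift
) -> List[Tuple[int, float]]:
    """Return (token_id, start_time_in_seconds) for each emitted token."""
    tokens = []
    prev = blank_id
    for i, tok in enumerate(frame_ids):
        # A token starts where a non-blank ID differs from the previous frame.
        if tok != blank_id and tok != prev:
            start = round(i * subsampling_factor * frame_shift_s, 2)
            tokens.append((tok, start))
        prev = tok
    return tokens


# Toy alignment with one ID per output frame (not real data).
example = [0, 0, 7, 7, 0, 7, 12, 12, 0]
print(collapse_alignment(example))
```

Repeated IDs are merged unless a blank separates them, which is why token 7 appears twice in the result for the toy input.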

**Note**: It can take more than 3 hours to compute the alignment
for the training dataset, which contains 960 * 3 = 2880 hours of data
(the 960-hour training set plus its two speed-perturbed copies).

**Caution**: The model parameters in `conformer_ctc/ali.py` have to match those
in `conformer_ctc/train.py`.

**Caution**: You have to set the parameter `preserve_id` to `True` for `CutMix`.
Search `./conformer_ctc/asr_datamodule.py` for `preserve_id`.
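
Why the ID matters: assuming the alignments are stored in a mapping keyed by cut ID, a transform that assigns a fresh ID to a mixed cut would make the lookup fail. The following sketch uses plain dictionaries and does not call lhotse. The `mix` helper, the IDs, and the token values are all invented for illustration.

```python
import uuid
from typing import Dict, List

# Hypothetical alignments keyed by cut ID (toy values, not real data).
alignments: Dict[str, List[int]] = {"cut-0001": [0, 7, 7, 0, 12]}


def mix(cut: Dict[str, str], preserve_id: bool) -> Dict[str, str]:
    """Stand-in for a noise-mixing transform; only the ID handling is modeled."""
    new_id = cut["id"] if preserve_id else str(uuid.uuid4())
    return {"id": new_id, "audio": cut["audio"] + "+noise"}


cut = {"id": "cut-0001", "audio": "speech"}

kept = mix(cut, preserve_id=True)
renamed = mix(cut, preserve_id=False)

print(alignments.get(kept["id"]))     # the alignment is found
print(alignments.get(renamed["id"]))  # None: the new ID has no alignment
```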

**TODO:** Add doc about how to use the extracted alignment in the other pull-request.