forked from k2-fsa/icefall
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Extract framewise alignment information using CTC decoding (k2-fsa#39)
* Use new APIs with k2.RaggedTensor * Fix style issues. * Update the installation doc, saying it requires at least k2 v1.7 * Extract framewise alignment information using CTC decoding. * Print environment information. Print information about k2, lhotse, PyTorch, and icefall. * Fix CI. * Fix CI. * Compute framewise alignment information of the LibriSpeech dataset. * Update comments for the time to compute alignments of train-960. * Preserve cut id in mix cut transformer. * Minor fixes. * Add doc about how to extract framewise alignments.
- Loading branch information
1 parent
bd7c2f7
commit 4890e27
Showing
18 changed files
with
582 additions
and
38 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,53 @@ | ||
## Introduction | ||
|
||
Please visit | ||
<https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html> | ||
for how to run this recipe. | ||
|
||
## How to compute framewise alignment information | ||
|
||
### Step 1: Train a model | ||
|
||
Please use `conformer_ctc/train.py` to train a model. | ||
See <https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html> | ||
for how to do it. | ||
|
||
### Step 2: Compute framewise alignment | ||
|
||
Run | ||
|
||
``` | ||
# Choose a checkpoint and determine the number of checkpoints to average | ||
epoch=30 | ||
avg=15 | ||
./conformer_ctc/ali.py \ | ||
--epoch $epoch \ | ||
--avg $avg \ | ||
--max-duration 500 \ | ||
--bucketing-sampler 0 \ | ||
--full-libri 1 \ | ||
--exp-dir conformer_ctc/exp \ | ||
--lang-dir data/lang_bpe_5000 \ | ||
--ali-dir data/ali_5000 | ||
``` | ||
and you will get four files inside the folder `data/ali_5000`: | ||
|
||
``` | ||
$ ls -lh data/ali_500 | ||
total 546M | ||
-rw-r--r-- 1 kuangfangjun root 1.1M Sep 28 08:06 test_clean.pt | ||
-rw-r--r-- 1 kuangfangjun root 1.1M Sep 28 08:07 test_other.pt | ||
-rw-r--r-- 1 kuangfangjun root 542M Sep 28 11:36 train-960.pt | ||
-rw-r--r-- 1 kuangfangjun root 2.1M Sep 28 11:38 valid.pt | ||
``` | ||
|
||
**Note**: It can take more than 3 hours to compute the alignment | ||
for the training dataset, which contains 960 * 3 = 2880 hours of data. | ||
|
||
**Caution**: The model parameters in `conformer_ctc/ali.py` have to match those | ||
in `conformer_ctc/train.py`. | ||
|
||
**Caution**: You have to set the parameter `preserve_id` to `True` for `CutMix`. | ||
Search `./conformer_ctc/asr_datamodule.py` for `preserve_id`. | ||
|
||
**TODO:** Add doc about how to use the extracted alignment in the other pull-request. |
Oops, something went wrong.