Add CTC recipe to AISHELL-1 #1576
Huge! Is this comparable to the current SOTA?
Hi @TParcollet, I think it's good for a pure-CTC system with greedy decoding and no LM.
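For readers unfamiliar with the term, greedy CTC decoding without an LM can be sketched as follows. This is a minimal pure-Python illustration, not the recipe's actual code: the function name `ctc_greedy_decode`, the blank index 0, and the toy scores are all assumptions made for the example.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, blank_id=0):
    """Collapse per-frame argmax predictions into a label sequence.

    frame_scores: list of per-frame score lists (one score per label).
    blank_id: index of the CTC blank label (assumed to be 0 here).
    """
    # Pick the best-scoring label index for every frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # Merge consecutive repeats, then drop the blank label.
    collapsed = [label for label, _ in groupby(best_path)]
    return [label for label in collapsed if label != blank_id]


# Toy scores over 3 labels (0 = blank); values are made up for illustration.
scores = [
    [0.1, 0.8, 0.1],    # label 1
    [0.2, 0.7, 0.1],    # label 1 again (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.6, 0.3],    # label 1 (kept, since a blank separates it)
    [0.1, 0.2, 0.7],    # label 2
]
decoded = ctc_greedy_decode(scores)
print(decoded)  # [1, 1, 2]
```

No language model or beam search is involved: each frame is decided independently, which is why such a system is a useful lower bound when comparing against LM-rescored results.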
I see, not bad, but we use extra pre-training while they don't, correct?
Yes, exactly. Fair enough, but their Branchformer is quite something according to the results.
Hi @BenoitWang, minor details only.
The YAML file combines the related hparam files well; the same goes for the train script.
Has the extra pandas dependency been completely removed from the AISHELL-1 prepare script for all its recipes?
(Reducing dependencies is not a bad thing, although pandas is neat. Just asking: pandas is not in the SB requirements, nor is it explicitly listed for AISHELL-1, so it's a good catch.)
Hi @anautsch, thanks for the review, the fix is done. And yes, that's why I want to remove pandas: it is only used to generate the CSV files for all the recipes.
… into aishell-ctc
Hi @TParcollet @anautsch @Adel-Moumen, Thank you all for the reviews and tests! The HF link is added, here's a brief summary of the PR:
lgtm. Tested recipes in Side note: we have an internal issue with
Hi @mravanelli @TParcollet, this PR adds a typical CTC-wav2vec recipe to AISHELL-1.
Test CER: 5.06%
Dev CER: 4.52%
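As a reminder of what these numbers measure, CER is the character-level edit distance between hypothesis and reference, divided by the total number of reference characters. The sketch below is a plain-Python illustration under that definition; the helper names `edit_distance` and `cer` and the example sentences are made up here, and the toolkit's own scoring code may differ in details such as text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference characters
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion
                    curr[j - 1] + 1,     # insertion
                    prev[j - 1] + cost,  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level character error rate, in percent."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return 100.0 * errors / total


# Made-up example: one substituted character out of 8 reference characters.
refs = ["今天天气很好", "你好"]
hyps = ["今天天汽很好", "你好"]
print(cer(refs, hyps))  # 12.5
```

Because Mandarin transcripts are scored per character rather than per word, CER is the usual metric reported on AISHELL-1.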
Some points:
In prepare.py, pandas is not needed to generate the CSV files, so it has been removed together with some unused variables.
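To illustrate why pandas is not needed for this step, the standard-library csv module is enough to write a manifest. The sketch below is not the code from prepare.py: the function `write_manifest`, the column names, and the sample row are assumptions chosen for the example.

```python
import csv
import io

# Assumed column names for illustration; the real prepare script may differ.
FIELDS = ["ID", "duration", "wav", "transcript"]


def write_manifest(rows, fileobj):
    """Write manifest rows (a list of dicts) as CSV to an open text file object."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical entry; the path and values are placeholders.
rows = [
    {
        "ID": "utt_0001",
        "duration": 3.2,
        "wav": "/path/to/utt_0001.wav",
        "transcript": "你好",
    }
]

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_manifest(rows, buffer)
manifest_text = buffer.getvalue()
print(manifest_text)
```

When writing to disk, open the file with `open(path, "w", newline="", encoding="utf-8")` so that the csv module controls line endings and the Chinese transcripts are encoded consistently.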