AudioMNIST experiments #1

Merged 154 commits on Oct 29, 2024

Commits (154)
962ad50
data prep scripts update
Adel-Moumen Feb 10, 2024
39b5049
iterate over utterances
Adel-Moumen Feb 10, 2024
b313734
without parallel map
Adel-Moumen Feb 10, 2024
7bdb17f
parallel map -> so fast omfg
Adel-Moumen Feb 10, 2024
a3d2d4c
gigaspeech data prep done
Adel-Moumen Feb 10, 2024
4cb3257
speechcolab extra dep if one must download gigaspeech
Adel-Moumen Feb 10, 2024
e521cc1
create ASR CTC folder
Adel-Moumen Feb 10, 2024
92a17c1
base yaml + update data prep to better reflect potential different na…
Adel-Moumen Feb 10, 2024
4dd02a0
update recipe
Adel-Moumen Feb 10, 2024
c254085
update recipe to be compliant with gigaspeech csv
Adel-Moumen Feb 10, 2024
b4de83a
add transformers dep
Adel-Moumen Feb 10, 2024
c3afdcc
convert opus to wav
Adel-Moumen Feb 10, 2024
945b8bb
recipe --debug mode works.
Adel-Moumen Feb 10, 2024
ae91209
typo GRABAGE_UTTERANCE_TAGS -> GARBAGE_UTTERANCE_TAGS
Adel-Moumen Feb 10, 2024
28b4257
tmp DL file
Adel-Moumen Feb 11, 2024
3a6396c
update DL FILE
Adel-Moumen Feb 11, 2024
6e771d7
add DL file in ASR/CTC
Adel-Moumen Feb 11, 2024
ebfcddb
update extra_requirements.txt
Adel-Moumen Feb 11, 2024
a68d0b8
add support of savedir within Pretrained subclasses
Adel-Moumen Feb 12, 2024
b2ed2a9
add wbs requirements
Adel-Moumen Feb 12, 2024
4b8c533
webdataset
Adel-Moumen Feb 13, 2024
44785c0
remove print
Adel-Moumen Feb 13, 2024
e203d77
tmp files webdataset
Adel-Moumen Feb 13, 2024
9b44e8d
verbosity + metada.json
Adel-Moumen Feb 14, 2024
1426156
letzo now label_encoder can actually train + the recipe seems to work.
Adel-Moumen Feb 14, 2024
0786b0b
Merge branch 'develop' of https://github.com/Adel-Moumen/speechbrain …
Adel-Moumen Feb 14, 2024
99bdfb1
Merge branch 'speechbrain:develop' into gigaspeech
Adel-Moumen Feb 14, 2024
aaeee16
Merge branch 'gigaspeech' of https://github.com/Adel-Moumen/speechbra…
Adel-Moumen Feb 14, 2024
ce12662
remove wbs
Adel-Moumen Mar 18, 2024
ed3ba03
DL info
Adel-Moumen Mar 18, 2024
8ae360b
HF DL support
Adel-Moumen Mar 18, 2024
1601ddc
remove webdataset as it sucks :p
Adel-Moumen Mar 18, 2024
9531d0b
name
Adel-Moumen Mar 18, 2024
1356ff1
ngram commands
Adel-Moumen Mar 18, 2024
4fa921b
Merge branch 'speechbrain:develop' into gigaspeech
Adel-Moumen Mar 18, 2024
0485173
whisper baseline
Adel-Moumen Mar 18, 2024
b360f8b
fix HF
Adel-Moumen Mar 18, 2024
3d71a04
Merge remote-tracking branch 'speechbrain/develop' into gigaspeech
Adel-Moumen Mar 29, 2024
81884ee
pre-commit + sentencepiece char
Adel-Moumen Mar 29, 2024
0f3da32
remove csv
Adel-Moumen Mar 29, 2024
cf2507a
Add quirks.py, move global PyTorch config and GPU workarounds there
asumagic Sep 17, 2024
0ea337f
Add support for SB_DISABLE_QUIRKS environment variable
asumagic Sep 17, 2024
265aa24
Fetch rework: make savedir optional
asumagic Oct 4, 2024
10b5286
Merge branch 'develop' into gigaspeech
TParcollet Oct 8, 2024
0009cf2
bunch of updates to make it run
TParcollet Oct 8, 2024
8bdbd1e
no download script
TParcollet Oct 8, 2024
8083872
fix precommit
TParcollet Oct 8, 2024
a362bca
fix precommit
TParcollet Oct 8, 2024
603049c
readmes
TParcollet Oct 8, 2024
d4b3f0d
readmes
TParcollet Oct 8, 2024
ef87027
readmes
TParcollet Oct 8, 2024
8d53430
readmes
TParcollet Oct 8, 2024
762a7b2
doc update
TParcollet Oct 8, 2024
14a9df7
CI god not happy, make CI god happy
TParcollet Oct 8, 2024
19d4753
why you here little encoder
TParcollet Oct 8, 2024
beb2ab2
adding a tranduscer streaming recipe, because why not
TParcollet Oct 8, 2024
cde564a
add test for transducer
TParcollet Oct 8, 2024
7f1ff0e
works better when me not stupid
TParcollet Oct 8, 2024
d27e285
fix yaml
TParcollet Oct 8, 2024
800d637
update req
TParcollet Oct 8, 2024
b76911b
add warning for cache dir
TParcollet Oct 9, 2024
f1be37b
add warning for cache dir
TParcollet Oct 9, 2024
d96d2ce
enable multiprocessing
TParcollet Oct 9, 2024
2926264
Minor cleanups to fetching
pplantinga Oct 9, 2024
f87e350
Change default behavior of inference to not create savedir if not spe…
pplantinga Oct 9, 2024
5259f27
allow data prep without ddp
TParcollet Oct 10, 2024
c0ea27a
fix tests
TParcollet Oct 10, 2024
688cbe3
smoll readme update
TParcollet Oct 10, 2024
99d998e
fix review comments
TParcollet Oct 11, 2024
484e8f4
Merge branch 'develop' into gigaspeech
TParcollet Oct 11, 2024
0d77a46
fixed concat_start_index check (#2717)
gfdb Oct 11, 2024
9912b25
Ensure adapted models save their parameters (#2716)
pplantinga Oct 11, 2024
679e270
wtf
Oct 11, 2024
a33cd7b
update doc
TParcollet Oct 11, 2024
9e2af5b
more documentation on storage
Oct 11, 2024
468147d
missing arg
TParcollet Oct 11, 2024
575a55c
a bit of logs
TParcollet Oct 11, 2024
0886ec6
new schedulers
TParcollet Oct 11, 2024
e285300
new schedulers
TParcollet Oct 11, 2024
e31a066
Fixes #2656: Remove EOS from SoundChoice
flexthink Oct 11, 2024
a06221b
fix my stupidity
TParcollet Oct 11, 2024
8eb530d
Merge branch 'speechbrain:develop' into gigaspeech
Adel-Moumen Oct 11, 2024
6bd627c
Update non-HF code path for new preprocessing code in GigaSpeech
asumagic Oct 15, 2024
dd28c73
Fix CSV path for non-HF Gigaspeech
asumagic Oct 15, 2024
ab79b48
Fix formatting
asumagic Oct 15, 2024
410fe2f
Kmeans fix (#2642)
poonehmousavi Oct 15, 2024
4822cba
Merge branch 'develop' into fetch-take-two
mravanelli Oct 15, 2024
5043059
add call on start of fit_batch fn
Adel-Moumen Oct 17, 2024
cdf4860
Update core.py
Adel-Moumen Oct 17, 2024
d3599dc
Update core.py
Adel-Moumen Oct 17, 2024
c650072
Merge branch 'speechbrain:develop' into fix_call_on_start_fit_batch
Adel-Moumen Oct 17, 2024
339360a
Merge pull request #2722 from Adel-Moumen/fix_call_on_start_fit_batch
asumagic Oct 18, 2024
2d157c4
Merge pull request #2718 from flexthink/speechbrain-g2p-fix
asumagic Oct 18, 2024
4e64041
Fix preprocess_text example
asumagic Oct 18, 2024
2dd7232
Fix guess_source docstring with up-to-date info
asumagic Oct 18, 2024
2ee50fb
Also remove default savedir from Pretrained
pplantinga Oct 18, 2024
6e999fe
Merge pull request #2712 from pplantinga/fetch-take-two
asumagic Oct 18, 2024
942d5ed
Merge branch 'develop' into gpu-quirks
asumagic Oct 20, 2024
d9efa5a
Fix function name for log_applied_quirks
asumagic Oct 20, 2024
932dcde
wip audiomnist+gt
naspert Oct 21, 2024
bad5e05
Revert "fix normalization for LFB"
naspert Oct 21, 2024
32f6038
audiomnist classification setup
naspert Oct 21, 2024
24ed44e
fix config
naspert Oct 21, 2024
2535f2c
add missing file
naspert Oct 21, 2024
510396b
update dataset load/training
naspert Oct 21, 2024
3a583f7
remove unnecessary params
naspert Oct 21, 2024
fbef11a
remove sort
naspert Oct 21, 2024
52d6744
remove unnecessary code
naspert Oct 21, 2024
ccf0f89
fix paths
naspert Oct 21, 2024
cc58456
fix loss computation
naspert Oct 21, 2024
aa55757
add missing flatten
naspert Oct 21, 2024
08cc458
print summary
naspert Oct 22, 2024
c7bb76f
Explain quirks in docs/experiment.md
asumagic Oct 22, 2024
1d8074f
ok stupid linter check that hates intentional leading spaces in markdown
asumagic Oct 22, 2024
1bb368a
Merge pull request #2558 from asumagic/gpu-quirks
asumagic Oct 22, 2024
906ada0
add citing in README
Adel-Moumen Oct 22, 2024
77c089f
add code to pad all wavs to the same length
naspert Oct 22, 2024
7cb19e9
fix pad call
naspert Oct 22, 2024
ce718dc
fix error computation
naspert Oct 22, 2024
d551595
fix error computation
naspert Oct 23, 2024
0db8721
Make `collect_in` optional for `Pretrainer`, disable it by default
asumagic Oct 22, 2024
2026dfe
Change more defaults to `savedir=None` and `fetch_strategy=SYMLINK`
asumagic Oct 23, 2024
6f2b7ad
move flatten in audionet
naspert Oct 23, 2024
f3a6337
Merge remote-tracking branch 'upstream/develop' into gigaspeech
asumagic Oct 23, 2024
de7a7e8
Fix GS transducer test prediction decoding?
asumagic Oct 23, 2024
dd5f7d2
fix data prep logic and paths
naspert Oct 23, 2024
5b15078
Actually fix GS transducer test prediction decoding
asumagic Oct 23, 2024
76a803b
Remove punctuation filtering that is handled elsewhere
asumagic Oct 23, 2024
231c78a
HuggingFance
asumagic Oct 23, 2024
743f902
fix skip data prep logic
naspert Oct 23, 2024
59794d0
add original audionet feature extraction
naspert Oct 23, 2024
de3426e
fix pooling for audionet feature extraction
naspert Oct 24, 2024
1772e51
fix audionet shape + remove input norm
naspert Oct 24, 2024
be5cb6e
try data augmentation
naspert Oct 24, 2024
c0d0838
add missing refs
naspert Oct 24, 2024
f2188ad
- rework AudioNet to have optional pooling
naspert Oct 25, 2024
2ed662b
fix typo in url
naspert Oct 25, 2024
6d93019
update audionet hparams
naspert Oct 25, 2024
16e4408
update audionet custom hparams
naspert Oct 25, 2024
6c7b563
update audionet custom hparams
naspert Oct 25, 2024
7168e0c
Updated warning for load_collected
asumagic Oct 25, 2024
fd0cd20
Merge pull request #2727 from asumagic/pretrainer-no-collect-dir
asumagic Oct 25, 2024
d98e949
Add results and notices for results for GigaSpeech transducer & wavlm
asumagic Oct 25, 2024
db5b629
english hard
asumagic Oct 25, 2024
3d2eeee
Merge pull request #2405 from Adel-Moumen/gigaspeech
asumagic Oct 25, 2024
e978711
update audionet custom hparams
naspert Oct 28, 2024
ca375b3
fix doc + pre-commit clean
naspert Oct 28, 2024
4b9d9fe
fix code examples
naspert Oct 28, 2024
56812a3
Merge remote-tracking branch 'origin/develop' into gammatone_dev
naspert Oct 28, 2024
a6b47f3
fix consistency tests
naspert Oct 29, 2024
13a9cbb
fix pre commit
naspert Oct 29, 2024
7807cb4
remove config
naspert Oct 29, 2024
7a3ff84
fix docstring for LFB
naspert Oct 29, 2024
92a88a7
fix docstring for GammatoneConv1D
naspert Oct 29, 2024
Actually fix GS transducer test prediction decoding
asumagic committed Oct 23, 2024
commit 5b15078a40d93482bac1a8ac5093e6f6908c688b
8 changes: 1 addition & 7 deletions recipes/GigaSpeech/ASR/transducer/train.py
@@ -195,18 +195,12 @@ def compute_objectives(self, predictions, batch, stage):
                 logits_transducer, tokens, wav_lens, token_lens
             )
 
-        if stage == sb.Stage.VALID:
+        if stage != sb.Stage.TRAIN:
             # Decode token terms to words
             predicted_words = self.tokenizer(
                 predicted_tokens, task="decode_from_list"
             )
-        elif stage == sb.Stage.TEST:
-            predicted_words = [
-                self.tokenizer.decode_ids(utt_seq).split(" ")
-                for utt_seq in predicted_tokens
-            ]
 
-        if stage != sb.Stage.TRAIN:
             # Convert indices to words
             target_words = undo_padding(tokens, token_lens)
             target_words = self.tokenizer(target_words, task="decode_from_list")
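
The effect of this commit can be illustrated with a small standalone sketch: validation and test hypotheses go through one shared decoding path, and only training skips decoding. Everything below is an assumption made for illustration and is not taken from the recipe: the `Stage` enum stands in for `sb.Stage`, and `TOY_VOCAB`, `toy_decode_from_list` and `decode_predictions` are hypothetical names that mimic the shape of the `task="decode_from_list"` call without using the real tokenizer.

```python
from enum import Enum, auto


class Stage(Enum):
    """Stand-in for sb.Stage (hypothetical, for illustration only)."""

    TRAIN = auto()
    VALID = auto()
    TEST = auto()


# Toy vocabulary; a real recipe would use a trained tokenizer instead.
TOY_VOCAB = {1: "hello", 2: "world", 3: "again"}


def toy_decode_from_list(token_seqs):
    """Hypothetical decoder: map each sequence of ids to a list of words."""
    return [[TOY_VOCAB[i] for i in seq] for seq in token_seqs]


def decode_predictions(stage, predicted_tokens):
    """Decode hypotheses through the same path for every non-training stage."""
    if stage == Stage.TRAIN:
        return None  # no decoding while training
    return toy_decode_from_list(predicted_tokens)


hyps = [[1, 2], [3]]
valid_words = decode_predictions(Stage.VALID, hyps)
test_words = decode_predictions(Stage.TEST, hyps)
```

With a single branch, the validation and test word lists are produced by the same function, so any metric computed on them sees identically formatted input.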