Key error #225

Open
subratac opened this issue Dec 3, 2021 · 1 comment

Comments

subratac commented Dec 3, 2021

I am getting a KeyError when I run either setting (extractive or abstractive). Has anyone come across this? I have all the .pt datasets, and they are being loaded, as you can see in the first part of the log file.

[2021-12-03 04:00:45,621 INFO] loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at ../temp/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
gpu_rank 0
[2021-12-03 04:00:45,654 INFO] * number of parameters: 75951418
[2021-12-03 04:00:45,655 INFO] Start training...
[2021-12-03 04:00:45,766 INFO] Loading train dataset from ../bert_data/cnndm.train.18.bert.pt, number of examples: 1998
Traceback (most recent call last):
  File "train.py", line 122, in <module>
    train_abs(args, device_id)
  File "/content/PreSumm/src/train_abstractive.py", line 273, in train_abs
    train_abs_single(args, device_id)
  File "/content/PreSumm/src/train_abstractive.py", line 334, in train_abs_single
    trainer.train(train_iter_fct, args.train_steps)
  File "/content/PreSumm/src/models/trainer.py", line 142, in train
    for i, batch in enumerate(train_iter):
  File "/content/PreSumm/src/models/data_loader.py", line 142, in __iter__
    for batch in self.cur_iter:
  File "/content/PreSumm/src/models/data_loader.py", line 278, in __iter__
    for idx, minibatch in enumerate(self.batches):
  File "/content/PreSumm/src/models/data_loader.py", line 256, in create_batches
    for buffer in self.batch_buffer(data, self.batch_size * 300):
  File "/content/PreSumm/src/models/data_loader.py", line 224, in batch_buffer
    ex = self.preprocess(ex, self.is_test)
  File "/content/PreSumm/src/models/data_loader.py", line 195, in preprocess
    tgt = ex['tgt'][:self.args.max_tgt_len][:-1]+[2]
KeyError: 'tgt'
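
The traceback shows that `preprocess` indexes each example with `ex['tgt']`, so the error means at least one example in the loaded shard has no `tgt` field. A minimal sketch for checking that is below. It assumes a shard is a list of dicts (which the `ex['tgt']` lookup suggests) and that it can be read with `torch.load`; the helper name `summarize_missing_keys` and the sample data are hypothetical, and only the `tgt` key is confirmed by the traceback.

```python
from collections import Counter


def summarize_missing_keys(examples, required=("tgt",)):
    """Count, for each required key, how many examples lack it."""
    missing = Counter()
    for ex in examples:
        for key in required:
            if key not in ex:
                missing[key] += 1
    return dict(missing)


# Hypothetical stand-in for one loaded shard. With the real data this would
# be something like (assumption):
#   examples = torch.load("../bert_data/cnndm.train.18.bert.pt")
sample_shard = [
    {"src": [101, 2023, 102], "tgt": [1, 2023, 2]},
    {"src": [101, 2003, 102], "labels": [0]},  # this example has no 'tgt'
]

report = summarize_missing_keys(sample_shard)
print(report)
```

If every example in a shard is missing `tgt`, the files were probably written in a format different from the one this `data_loader.py` expects, and printing `sample_shard[0].keys()` on a real example would show which fields are actually present.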

subratac commented Dec 3, 2021

Also, here is the command I am running to train the abstractive summarization model:

!python train.py -mode train -accum_count 5 -batch_size 300 -bert_data_path ../bert_data/cnndm -dec_dropout 0.1 -log_file /content/PreSumm/logs/cnndm_baseline -lr 0.05 -model_path MODEL_PATH -save_checkpoint_steps 2000 -seed 777 -sep_optim false -train_steps 200000 -use_bert_emb true -use_interval true -warmup_steps 8000 -max_pos 512 -report_every 50 -enc_hidden_size 512 -enc_layers 6 -enc_ff_size 2048 -enc_dropout 0.1 -dec_layers 6 -dec_hidden_size 512 -dec_ff_size 2048 -encoder baseline -task abs
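
Since `-bert_data_path` is a prefix (`../bert_data/cnndm`) rather than a single file, it can help to list which shards that prefix matches before inspecting them one by one. The sketch below assumes the naming seen in the log (`cnndm.train.18.bert.pt`, i.e. `<prefix>.<split>.<index>.bert.pt`); the helper `list_shards` is hypothetical and the demo uses empty placeholder files in a temporary directory, not real data.

```python
import glob
import os
import re
import tempfile


def list_shards(data_prefix, split="train"):
    """Return shard paths matching <prefix>.<split>.<index>.bert.pt, sorted by index."""
    pattern = f"{glob.escape(data_prefix)}.{split}.[0-9]*.bert.pt"

    def shard_index(path):
        match = re.search(r"\.(\d+)\.bert\.pt$", path)
        return int(match.group(1)) if match else -1

    return sorted(glob.glob(pattern), key=shard_index)


# Demo with empty placeholder files (assumed names, modeled on the log line).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("cnndm.train.0.bert.pt",
                 "cnndm.train.18.bert.pt",
                 "cnndm.valid.0.bert.pt"):
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(p) for p in list_shards(os.path.join(tmp, "cnndm"))]

print(found)
```

On the real setup, calling `list_shards("../bert_data/cnndm")` and checking each returned file for the `tgt` field would narrow down whether one shard or all of them trigger the error.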
