Differences between facebook/bart-base and facebook/bart-large? #8005
Closed
Description
❓ Questions & Help
Are there any differences between facebook/bart-base and facebook/bart-large other than dimensions, heads, and layers?
Environment info
- transformers version: 3.3.1
- Python version: 3.6.12
- PyTorch version (GPU?): 1.4.0 GPU-version
Command:
I'm using the seq2seq/finetune.py script to finetune both BART models.
python finetune.py \
--data_dir=${DATA_DIR} \
--learning_rate=3e-5 \
--num_train_epochs 5 \
--task summarization \
--model_name_or_path=${MODEL} \
--train_batch_size=4 \
--eval_batch_size=4 \
--gpus 1 \
--output_dir=$OUTPUT_DIR \
--max_source_length=256 \
--max_target_length=256 \
--val_max_target_length=256 \
--test_max_target_length=256 \
--eval_max_gen_length=256 \
--do_train --do_predict \
--eval_beams 5
${MODEL} is either facebook/bart-base or facebook/bart-large.
Details
When I finetune facebook/bart-base, it works well:
"input_ids": " <s> ( report :ARG1 ( station :ARG1 ( troop :mod ( country :wiki Russia :name ( name :op1 Russia ) ) :ARG0-of ( withdraw :ARG2 ( country :quant 3 :location ( sea :wiki Baltic_Sea :name ( name :op1 Baltic :op2 Sea ) ) ) ) ) :ARG2 ( and :op1 ( state :wiki - :name ( name :op1 Jalininggele ) :location country ) :op2 ( state :wiki - :name ( name :op1 Simolingsike ) ) :op3 ( city :wiki - :name ( name :op1 Yelinia ) :location ( relative-position :op1 ( city :wiki Moscow :name ( name :op1 Moscow ) ) :quant ( distance-quantity :quant 300 :unit ( kilometer ) ) ) ) ) :mod ( respective ) ) )</s><pad><pad><pad>",
"labels": "<s> It is reported that the Russian troops that withdrew from the three Baltic Sea countries will be stationed respectively in the Russian state of Jalininggele, the state of Simolingsike and Yelinia city which is 300 kilometers away from Moscow.</s>",
"decoder_input_ids": "</s><s> It is reported that the Russian troops that withdrew from the three Baltic Sea countries will be stationed respectively in the Russian state of Jalininggele, the state of Simolingsike and Yelinia city which is 300 kilometers away from Moscow.",
"generated_ids": "</s><s> Russian troops reported to be stationed in the 3 Baltic Sea countries of Jalininggele, Simolingsike and Yelinia 300 kilometers (110 miles) from Moscow.</s><pad><pad><pad><pad><pad><pad><pad>"
When I finetune facebook/bart-large, it does not generate reasonable output:
"input_ids": "<s> ( report :ARG1 ( station :ARG1 ( troop :mod ( country :wiki Russia :name ( name :op1 Russia ) ) :ARG0-of ( withdraw :ARG2 ( country :quant 3 :location ( sea :wiki Baltic_Sea :name ( name :op1 Baltic :op2 Sea ) ) ) ) ) :ARG2 ( and :op1 ( state :wiki - :name ( name :op1 Jalininggele ) :location country ) :op2 ( state :wiki - :name ( name :op1 Simolingsike ) ) :op3 ( city :wiki - :name ( name :op1 Yelinia ) :location ( relative-position :op1 ( city :wiki Moscow :name ( name :op1 Moscow ) ) :quant ( distance-quantity :quant 300 :unit ( kilometer ) ) ) ) ) :mod ( respective ) ) )</s><pad><pad><pad>",
"labels": "<s> It is reported that the Russian troops that withdrew from the three Baltic Sea countries will be stationed respectively in the Russian state of Jalininggele, the state of Simolingsike and Yelinia city which is 300 kilometers away from Moscow.</s>",
"decoder_input_ids": "</s><s> It is reported that the Russian troops that withdrew from the three Baltic Sea countries will be stationed respectively in the Russian state of Jalininggele, the state of Simolingsike and Yelinia city which is 300 kilometers away from Moscow.",
"generated_ids": "</s><s><s><s><s><s><s><s><s><s><s> ... <s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s></s>"
I'm using the same code for both, but only the facebook/bart-base model works. With a previous transformers version both models worked, but not with this one (3.3.1).
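One way to look for differences beyond size might be to diff the two checkpoint configs key by key. Below is a minimal sketch of that idea. The helper name `diff_configs` and the toy dictionaries are assumptions made for illustration, not real config dumps and not part of transformers.

```python
from typing import Any, Dict, Tuple

MISSING = "<missing>"  # marker for a key present in only one config


def diff_configs(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    diff = {}
    for key in sorted(set(a) | set(b)):
        va, vb = a.get(key, MISSING), b.get(key, MISSING)
        if va != vb:
            diff[key] = (va, vb)
    return diff


if __name__ == "__main__":
    # Toy stand-ins for the two configs; "some_flag" is a hypothetical key.
    base_like = {"d_model": 768, "encoder_layers": 6, "vocab_size": 50265}
    large_like = {"d_model": 1024, "encoder_layers": 12, "vocab_size": 50265,
                  "some_flag": True}
    for key, (va, vb) in diff_configs(base_like, large_like).items():
        print(f"{key}: {va!r} vs {vb!r}")
```

With transformers installed, the real dictionaries could be obtained with `AutoConfig.from_pretrained("facebook/bart-base").to_dict()` and the same call for `facebook/bart-large`. Any key in the diff that is not a size parameter would be a candidate explanation for the different generation behavior.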