Differences between facebook/bart-base and facebook/bart-large? #8005

Closed
@leoribeiro

Description

❓ Questions & Help

Are there any differences between facebook/bart-base and facebook/bart-large other than hidden dimensions, attention heads, and number of layers?
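One way to look for differences beyond size is to diff the two model configurations key by key. The sketch below is a minimal, self-contained helper: `diff_configs` is a hypothetical name, and the two sample dicts are an illustrative subset (the size fields mentioned above plus one shared field), not the full configs. With transformers installed, the real dicts could presumably be obtained with `AutoConfig.from_pretrained(name).to_dict()`.

```python
from typing import Any, Dict, Tuple

# Placeholder for keys that exist in only one of the two configs.
_MISSING = "<missing>"


def diff_configs(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    diff = {}
    for key in sorted(set(a) | set(b)):
        va = a.get(key, _MISSING)
        vb = b.get(key, _MISSING)
        if va != vb:
            diff[key] = (va, vb)
    return diff


# Illustrative subset only (assumption: not the complete config files).
base = {"model_type": "bart", "d_model": 768,
        "encoder_layers": 6, "encoder_attention_heads": 12}
large = {"model_type": "bart", "d_model": 1024,
         "encoder_layers": 12, "encoder_attention_heads": 16}

for key, (v_base, v_large) in diff_configs(base, large).items():
    print(f"{key}: base={v_base} large={v_large}")
```

Any key that shows up in the diff and is not a size parameter would be a candidate explanation for the different fine-tuning behavior.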

Who can help

@sshleifer @wisedoge

Environment info

  • transformers version: 3.3.1
  • Python version: 3.6.12
  • PyTorch version (GPU?): 1.4.0 GPU-version

Command:

I'm using the seq2seq/finetune.py script to finetune both BART models.

python finetune.py \
--data_dir=${DATA_DIR} \
--learning_rate=3e-5 \
--num_train_epochs 5 \
--task summarization \
--model_name_or_path=${MODEL} \
--train_batch_size=4 \
--eval_batch_size=4 \
--gpus 1 \
--output_dir=${OUTPUT_DIR} \
--max_source_length=256 \
--max_target_length=256 \
--val_max_target_length=256 \
--test_max_target_length=256 \
--eval_max_gen_length=256 \
--do_train --do_predict \
--eval_beams 5

${MODEL} is either facebook/bart-base or facebook/bart-large.

Details

When I finetune facebook/bart-base, it works well:

"input_ids": " <s> ( report :ARG1 ( station :ARG1 ( troop :mod ( country :wiki Russia :name ( name :op1 Russia ) ) :ARG0-of ( withdraw :ARG2 ( country :quant 3 :location ( sea :wiki Baltic_Sea :name ( name :op1 Baltic :op2 Sea ) ) ) ) ) :ARG2 ( and :op1 ( state :wiki - :name ( name :op1 Jalininggele ) :location country ) :op2 ( state :wiki - :name ( name :op1 Simolingsike ) ) :op3 ( city :wiki - :name ( name :op1 Yelinia ) :location ( relative-position :op1 ( city :wiki Moscow :name ( name :op1 Moscow ) ) :quant ( distance-quantity :quant 300 :unit ( kilometer ) ) ) ) ) :mod ( respective ) ) )</s><pad><pad><pad>",
        "labels": "<s> It is reported that the Russian troops that withdrew from the three Baltic Sea countries will be stationed respectively in the Russian state of Jalininggele, the state of Simolingsike and Yelinia city which is 300 kilometers away from Moscow.</s>",
        "decoder_input_ids": "</s><s> It is reported that the Russian troops that withdrew from the three Baltic Sea countries will be stationed respectively in the Russian state of Jalininggele, the state of Simolingsike and Yelinia city which is 300 kilometers away from Moscow.",
        "generated_ids": "</s><s> Russian troops reported to be stationed in the 3 Baltic Sea countries of Jalininggele, Simolingsike and Yelinia 300 kilometers (110 miles) from Moscow.</s><pad><pad><pad><pad><pad><pad><pad>"

When I finetune facebook/bart-large, it does not generate a reasonable output:

"input_ids": "<s> ( report :ARG1 ( station :ARG1 ( troop :mod ( country :wiki Russia :name ( name :op1 Russia ) ) :ARG0-of ( withdraw :ARG2 ( country :quant 3 :location ( sea :wiki Baltic_Sea :name ( name :op1 Baltic :op2 Sea ) ) ) ) ) :ARG2 ( and :op1 ( state :wiki - :name ( name :op1 Jalininggele ) :location country ) :op2 ( state :wiki - :name ( name :op1 Simolingsike ) ) :op3 ( city :wiki - :name ( name :op1 Yelinia ) :location ( relative-position :op1 ( city :wiki Moscow :name ( name :op1 Moscow ) ) :quant ( distance-quantity :quant 300 :unit ( kilometer ) ) ) ) ) :mod ( respective ) ) )</s><pad><pad><pad>",
        "labels": "<s> It is reported that the Russian troops that withdrew from the three Baltic Sea countries will be stationed respectively in the Russian state of Jalininggele, the state of Simolingsike and Yelinia city which is 300 kilometers away from Moscow.</s>",
        "decoder_input_ids": "</s><s> It is reported that the Russian troops that withdrew from the three Baltic Sea countries will be stationed respectively in the Russian state of Jalininggele, the state of Simolingsike and Yelinia city which is 300 kilometers away from Moscow.",
        "generated_ids": "</s><s><s><s><s><s><s><s><s><s><s> ... <s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s><s></s>"

I'm using the same code for both models, but only facebook/bart-base works. With a previous transformers version both models worked, but with this one (3.3.1) facebook/bart-large does not.
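To spot this failure mode without reading every prediction, a small check on the decoded strings may help. This is only a sketch: `is_degenerate` is a hypothetical helper, and it assumes the special tokens appear in the decoded text as `<s>`, `</s>`, and `<pad>`, as in the dumps above.

```python
import re

# Special-token strings as they appear in the decoded dumps above (assumption).
SPECIAL_TOKEN_RE = re.compile(r"<s>|</s>|<pad>")


def is_degenerate(decoded: str) -> bool:
    """True if the decoded string contains nothing but special tokens and whitespace."""
    return SPECIAL_TOKEN_RE.sub("", decoded).strip() == ""


# Short stand-ins for the two kinds of output shown above.
good = "</s><s> Russian troops reported to be stationed near Moscow.</s><pad><pad>"
bad = "</s><s><s><s><s><s><s></s>"

print(is_degenerate(good), is_degenerate(bad))
```

Counting how many generated sequences are flagged gives a quick signal of whether a given checkpoint or library version is affected.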
