Difficulties to reproduce BART results on CNN/DM by fine-tuning bart-large #5654
Description
Help
I'm trying to fine-tune BART on CNN/DM myself (so, starting from the `facebook/bart-large` checkpoint). However, I can't reproduce the results so far... The BART authors report an R1 score of 44.16 in their paper, but my best checkpoint so far only reaches 42.53.
It's not an issue with the eval script, as I can reproduce the authors' results from the `facebook/bart-large-cnn` checkpoint: I get a score of 44.09 with it.
I tried several sets of hyper-parameters: the ones provided in the examples folder, and also the ones used in the fairseq repo. It doesn't change anything... (see the sketch below for the fairseq-style setup I tried).
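Concretely, the fairseq recipe maps to roughly this in `transformers` (a minimal sketch with data loading elided; the hyper-parameter values are the ones in fairseq's CNN/DM fine-tuning README, and the scheduler/loss below are my own mapping of its polynomial decay and label smoothing, not an official script):

```python
import torch
from torch.optim import AdamW
from transformers import (
    BartForConditionalGeneration,
    BartTokenizer,
    get_polynomial_decay_schedule_with_warmup,
)

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")
model.train()

TOTAL_UPDATES = 20000   # fairseq: TOTAL_NUM_UPDATES
WARMUP_UPDATES = 500    # fairseq: WARMUP_UPDATES

optimizer = AdamW(model.parameters(), lr=3e-5, betas=(0.9, 0.999),
                  eps=1e-8, weight_decay=0.01)
scheduler = get_polynomial_decay_schedule_with_warmup(
    optimizer, num_warmup_steps=WARMUP_UPDATES,
    num_training_steps=TOTAL_UPDATES)

# fairseq uses label smoothing 0.1; labels here are padded with pad_token_id.
criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1,
                                      ignore_index=tokenizer.pad_token_id)

# One illustrative update on a dummy batch (real runs use ~2048 tokens per
# GPU per step with gradient accumulation 4, per the fairseq README).
batch = tokenizer(["a dummy article"], text_target=["a dummy summary"],
                  return_tensors="pt", padding=True)
logits = model(input_ids=batch["input_ids"],
               attention_mask=batch["attention_mask"],
               labels=batch["labels"]).logits
loss = criterion(logits.view(-1, logits.size(-1)), batch["labels"].view(-1))
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), 0.1)  # fairseq clip-norm
optimizer.step()
scheduler.step()
optimizer.zero_grad()
```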
I'm a bit at a loss on how to reproduce these fine-tuning scores...
Has anyone managed to fine-tune BART successfully using the `transformers` repo? If so, could you share your parameters?
Any help would be greatly appreciated!