IndoLEM uses IndoSum for extractive summarization. Our experiments are based on the PreSumm framework of Liu and Lapata (2019), with three BERT models: IndoBERT, MalayBERT, and mBERT.
Tested with the configuration below; newer torch versions are not compatible with PreSumm.
```
python==3.7.6
torch==1.1.0
torchvision==0.8.1
transformers==3.0.0
pyrouge==0.1.3
tensorboardX==2.1
```
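If you are setting up from scratch, one way to create a matching environment (a minimal sketch assuming conda is available; the environment name `indosum-presumm` is illustrative) is:

```
# Create and activate a Python 3.7.6 environment (name is illustrative)
conda create -n indosum-presumm python=3.7.6 -y
conda activate indosum-presumm

# Install the pinned versions listed above
pip install torch==1.1.0 torchvision==0.8.1 transformers==3.0.0 \
    pyrouge==0.1.3 tensorboardX==2.1
```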
- First, download the data here and put them (all folds) in the folder `data/`.
- The original implementation can be found here.
- Run the three scripts for data preprocessing:
```
python make_datafiles_presum_indobert.py
python make_datafiles_presum_malaybert.py
python make_datafiles_presum_mbert.py
```
- Now you can run the experiments using the scripts below (an illustrative expansion of what these scripts wrap follows the list):

**IndoBERT**
```
cd scripts
./train_indobert.sh
./eval_indobert.sh
```

**MalayBERT**
```
cd scripts
./train_malaybert.sh
./eval_malaybert.sh
```

**mBERT**
```
cd scripts
./train_mbert.sh
./eval_mbert.sh
```
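Each `train_*.sh`/`eval_*.sh` script is a thin wrapper around PreSumm's `train.py`. As a rough sketch only (the flags are PreSumm's standard ones, but the paths and values here are assumptions, not this repository's actual settings; check the scripts themselves):

```
# Illustrative PreSumm-style training call (all paths and values are assumptions)
python train.py \
    -task ext -mode train \
    -bert_data_path ../data/fold1/indosum \
    -model_path ../models/indobert_fold1 \
    -visible_gpus 0,1,2 \
    -batch_size 3000 -accum_count 2 \
    -train_steps 50000 -warmup_steps 10000 \
    -log_file ../logs/train_indobert_fold1.log
```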
In `scripts/`, run `chmod +x *` to make the scripts executable. Training requires 3 GPUs (V100, 16GB each); if you have less GPU memory, reduce the batch size (see the sketch below).
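If you must fit on fewer or smaller GPUs, the usual PreSumm-style adjustment (again a sketch; the exact flags your scripts expose may differ) is to shrink `-batch_size` and raise `-accum_count` so the effective batch size stays roughly the same:

```
# Example: single GPU with a reduced per-step batch (values are assumptions);
# -accum_count 6 accumulates gradients to approximate the original batch size.
python train.py -task ext -mode train \
    -visible_gpus 0 \
    -batch_size 1000 -accum_count 6 \
    -bert_data_path ../data/fold1/indosum \
    -model_path ../models/indobert_fold1
```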
Since we use 5-fold cross-validation, the experiments are run 5 times with different folds from the folder `data/`. Please adjust the scripts accordingly; one possible loop is sketched below.
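A hypothetical way to automate the five runs (it assumes you have edited the scripts to take the fold index as their first argument and to read the matching fold from `data/`, which the stock scripts may not do):

```
# Run training and evaluation for each of the 5 folds
for fold in 1 2 3 4 5; do
    ./train_indobert.sh "$fold"
    ./eval_indobert.sh "$fold"
done
```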
Please install pyrouge for evaluating the summaries.
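Note that `pip install pyrouge` alone is not enough: pyrouge wraps the original Perl ROUGE-1.5.5 package, which must be obtained separately and registered. A common setup (the clone URL is one widely used mirror, and the paths are examples):

```
pip install pyrouge

# Fetch a copy of ROUGE-1.5.5 and tell pyrouge where it lives
git clone https://github.com/andersjo/pyrouge.git pyrouge-mirror
pyrouge_set_rouge_path "$(pwd)/pyrouge-mirror/tools/ROUGE-1.5.5"
```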