The sentence retrieval code.
- The fact verification shared task consists of three steps: Document Retrieval, Sentence Retrieval, and Fact Verification.
- We use a BERT-based model for the sentence retrieval part.
- Our paper mainly compares KGAT with ESIM-based sentence retrieval, the same setting as GEAR, because the BERT model overfits on the development set. The data can also be found in the data folder.
- We use a pairwise loss for sentence retrieval, which achieves the best performance; see the loss sketch below.
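The pairwise loss can be read as a margin ranking loss over the scores of a golden and a negative evidence sentence for the same claim. The sketch below is a minimal illustration under that assumption; the margin value and exact formulation in the released code may differ.

```python
import torch
import torch.nn as nn

# Margin ranking loss: the golden evidence sentence should score higher than a
# sampled negative sentence for the same claim by at least `margin`.
# The margin value and batching here are illustrative assumptions.
margin_loss = nn.MarginRankingLoss(margin=1.0)

def pairwise_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor) -> torch.Tensor:
    # target = 1 asks the loss to rank pos_scores above neg_scores.
    target = torch.ones_like(pos_scores)
    return margin_loss(pos_scores, neg_scores, target)

# Scores for four (claim, golden evidence) vs. (claim, negative evidence) pairs.
pos = torch.tensor([0.9, 0.4, 0.7, 0.2])
neg = torch.tensor([0.1, 0.5, 0.3, 0.0])
print(pairwise_loss(pos, neg))  # tensor(0.6750)
```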
- Data processing:
  - Go to the data folder.
  - Run `bash process.sh` to generate pairs for the training and development sets (an illustrative sketch of the pair construction follows this step).
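The exact pair format is defined by `process.sh` itself; the following is only an illustrative sketch, assuming each claim comes with annotated golden evidence sentences and candidate sentences from the retrieved pages. The function and field names are hypothetical.

```python
import random

def build_pairs(claim, golden_sents, candidate_sents, num_neg=1):
    """Pair each golden evidence sentence with sampled negatives (illustrative).

    `golden_sents` are the annotated evidence sentences for `claim`, and
    `candidate_sents` are all sentences from the retrieved pages. The field
    names and the sampling policy are assumptions, not the released format.
    """
    negatives = [s for s in candidate_sents if s not in golden_sents]
    pairs = []
    for pos in golden_sents:
        for neg in random.sample(negatives, min(num_neg, len(negatives))):
            pairs.append({"claim": claim, "pos": pos, "neg": neg})
    return pairs
```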
- Train the retrieval model:
  - Run `bash train.sh` to train the BERT-based sentence retrieval model (a sketch of scoring a claim-evidence pair with BERT follows this step).
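For intuition, a claim-evidence pair can be scored with a BERT cross-encoder roughly as follows. The checkpoint name and the single-logit ranking head are assumptions for illustration, not necessarily the configuration used by `train.sh`.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Illustrative cross-encoder; the model name and one-logit head are assumptions.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=1)
model.eval()

def score(claim: str, sentence: str) -> float:
    # Encode the claim and the candidate evidence sentence as one sequence pair.
    inputs = tokenizer(claim, sentence, return_tensors="pt",
                       truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape (1, 1)
    return logits.item()

print(score("Earth orbits the Sun.", "The Earth revolves around the Sun once a year."))
```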
- Test the retrieval model:
  - Run `bash test.sh` to generate the data for claim verification; the top-5 evidence sentences are kept.
  - `process_data.py` adds the golden evidence to the claim verification data to avoid data bias.
  - Note that if no golden evidence is retrieved for a claim, the prediction should be NOT ENOUGH INFO, which may differ from the golden label. To avoid this scenario, we add the golden evidence to the training and development sets, which removes this label bias (a sketch of the merging step follows).
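A minimal sketch of what merging golden evidence into the retrieved top-5 evidence could look like for the training and development sets; the (page, sentence_id) format, the function name, and the prepend-then-truncate policy are assumptions rather than the exact behavior of `process_data.py`.

```python
def add_golden_evidence(retrieved, golden, max_evidence=5):
    """Keep golden evidence first, then fill up with retrieved evidence (illustrative).

    `retrieved` and `golden` are lists of (page, sentence_id) pairs; this data
    format and the truncation to `max_evidence` items are assumptions.
    """
    merged = list(golden)
    for ev in retrieved:
        if ev not in merged:
            merged.append(ev)
    return merged[:max_evidence]

# Example: two golden sentences plus the remaining retrieved ones, capped at 5.
retrieved = [("Page_A", 3), ("Page_B", 0), ("Page_C", 7), ("Page_A", 1), ("Page_D", 2)]
golden = [("Page_B", 0), ("Page_E", 4)]
print(add_golden_evidence(retrieved, golden))
```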
We have tested the performance of the BERT-based retrieval model with different settings. We did not include these results in the paper because of the page limit, but we hope they can help with further studies. The code for BERT + Prediction is the same as the `pretrain` code.
- Development set:

| Model | Prec@5 | Rec@5 | F1@5 |
|---|---|---|---|
| BERT + Prediction | 27.66 | 95.91 | 42.94 |
| BERT + PairwiseLoss | 27.21 | 93.89 | 42.14 |
| BERT + PairwiseLoss + WikiTitle | 27.29 | 94.37 | 42.34 |
- Test set:

| Model | Prec@5 | Rec@5 | F1@5 |
|---|---|---|---|
| BERT + Prediction | 23.77 | 85.07 | 37.15 |
| BERT + PairwiseLoss | 25.22 | 87.35 | 39.14 |
| BERT + PairwiseLoss + WikiTitle | 25.21 | 87.47 | 39.14 |
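One way to read Prec@5 / Rec@5 / F1@5 is as sentence-level precision, recall, and F1 over the top-5 retrieved evidence, macro-averaged over claims. The sketch below follows that assumed definition; the repository's own evaluation script may compute the official FEVER metrics differently.

```python
def evidence_metrics_at_k(predictions, golds, k=5):
    """Macro-averaged Prec@k / Rec@k / F1@k over claims (assumed definition).

    `predictions[i]` and `golds[i]` are lists of (page, sentence_id) evidence
    for claim i; this per-claim averaging is an assumption, not the official scorer.
    """
    precs, recs, f1s = [], [], []
    for pred, gold in zip(predictions, golds):
        topk = pred[:k]
        hits = len(set(topk) & set(gold))
        p = hits / len(topk) if topk else 0.0
        r = hits / len(gold) if gold else 0.0
        f = 2 * p * r / (p + r) if p + r > 0 else 0.0
        precs.append(p)
        recs.append(r)
        f1s.append(f)
    n = len(precs)
    return sum(precs) / n, sum(recs) / n, sum(f1s) / n
```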