Birch


Document ranking via sentence modeling using BERT

Environment & Data

# Set up environment
pip install virtualenv
virtualenv -p python3.5 birch_env
source birch_env/bin/activate

# Install dependencies
pip install Cython  # jnius dependency
pip install -r requirements.txt

git clone https://github.com/NVIDIA/apex
cd apex && pip install -v --no-cache-dir . && cd ..

# Set up Anserini
git clone https://github.com/castorini/anserini.git
cd anserini && mvn clean package appassembler:assemble
cd eval && tar xvfz trec_eval.9.0.4.tar.gz && cd trec_eval.9.0.4 && make && cd ../../..

# Download data and models
wget https://zenodo.org/record/3241945/files/birch_data.tar.gz
tar -xzvf birch_data.tar.gz
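
As an optional sanity check, list the unpacked directories; this assumes the archive extracts into the data/ and models/ directories referenced in the steps below (e.g. data/predictions and models/saved.*):

# Optional: confirm the unpacked layout (directory names assumed from the later steps)
ls data models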

Training

python src/main.py --mode training --collection mb --qrels_file qrels.microblog.txt --batch_size <batch_size> --eval_steps <eval_steps> --learning_rate <learning_rate> --num_train_epochs <num_train_epochs> --device cuda
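
For example, a concrete training run might look like the following; the hyperparameter values are illustrative placeholders, not the settings used for the released models:

# Illustrative hyperparameters only; adjust to your hardware and validation results
python src/main.py --mode training --collection mb --qrels_file qrels.microblog.txt \
    --batch_size 16 --eval_steps 1000 --learning_rate 1e-5 --num_train_epochs 3 --device cuda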

Inference

python src/main.py --mode inference --experiment <qa_2cv, mb_2cv, qa_5cv, mb_5cv> --collection <robust04_2cv, robust04_5cv> --model_path <models/saved.mb_3, models/saved.qa_2> --load_trained --batch_size <batch_size> --device cuda

Note that this step takes a long time. If you don't want to evaluate the pretrained models, you may skip to the next step and evaluate with our predictions under data/predictions.
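
As an example, one plausible pairing of the options above reranks with the released MB model under the two-fold setting (the batch size is an illustrative placeholder):

# Example pairing of experiment, collection, and model path; batch size is illustrative
python src/main.py --mode inference --experiment mb_2cv --collection robust04_2cv \
    --model_path models/saved.mb_3 --load_trained --batch_size 32 --device cuda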

Evaluation

BM25+RM3:

./eval_scripts/baseline.sh <path/to/anserini> <path/to/index> <2, 5>
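
For instance, with Anserini cloned into ./anserini as in the setup above and a prebuilt Robust04 index (the index path is a placeholder), the two-fold baseline would be:

./eval_scripts/baseline.sh anserini /path/to/index/robust04 2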

Sentence Evidence (example invocations for both steps follow this list):

  • Compute document score

Set the last argument to True if you want to tune the hyperparameters first; to use the default hyperparameters, set it to False.

./eval_scripts/test.sh <qa_2cv, mb_2cv, qa_5cv, mb_5cv> <2, 5> <path/to/anserini> <True, False>

  • Evaluate with trec_eval

./eval_scripts/eval.sh <bm25+rm3_2cv, qa_2cv, mb_2cv, bm25+rm3_5cv, qa_5cv, mb_5cv> <path/to/anserini> qrels.robust2004.txt
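
As a concrete illustration of the two steps above, under the two-fold MB setting with default hyperparameters (the Anserini path assumes the clone from the setup section):

./eval_scripts/test.sh mb_2cv 2 anserini False
./eval_scripts/eval.sh mb_2cv anserini qrels.robust2004.txt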

Results on Robust04

  • "Paper 1" based on two-fold CV:
Model AP P@20
Paper 1 (two fold) 0.2971 0.3948
BM25+RM3 (Anserini) 0.2987 0.3871
1S: BERT(QA) 0.3014 0.3928
2S: BERT(QA) 0.3003 0.3948
3S: BERT(QA) 0.3003 0.3948
1S: BERT(MB) 0.3241 0.4217
2S: BERT(MB) 0.3240 0.4209
3S: BERT(MB) 0.3244 0.4219
  • "Paper 2" based on five-fold CV:
Model AP P@20
Paper 2 (five fold) 0.272 0.386
BM25+RM3 (Anserini) 0.3033 0.3974
1S: BERT(QA) 0.3102 0.4068
2S: BERT(QA) 0.3090 0.4064
3S: BERT(QA) 0.3090 0.4064
1S: BERT(MB) 0.3266 0.4245
2S: BERT(MB) 0.3278 0.4267
3S: BERT(MB) 0.3278 0.4287

See this paper for the exact fold settings.

Replication Log


How do I cite this work?

@article{yang2019simple,
  title={Simple Applications of BERT for Ad Hoc Document Retrieval},
  author={Yang, Wei and Zhang, Haotian and Lin, Jimmy},
  journal={arXiv preprint arXiv:1903.10972},
  year={2019}
}
