DrQA

A PyTorch implementation of the ACL 2017 paper "Reading Wikipedia to Answer Open-Domain Questions" (DrQA). The code is based on Runqi's implementation (https://github.com/hitvoice/DrQA).

Requirements

  • python >=3.5
  • numpy
  • pandas
  • msgpack
  • spacy 1.x

Quick Start

Setup

  • Make sure Python 3 and pip are installed.
  • Install the PyTorch build that matches your OS, Python and CUDA versions.
  • Install the remaining requirements via pip install -r requirements.txt
  • Download the SQuAD data file, GloVe word vectors and spaCy English language models using bash download.sh.
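
After the setup steps, a quick check that the listed packages can be imported may save a failed training run. The following is a minimal sketch, not part of the repository: the helper name missing_packages is hypothetical, and the import names are assumed from the Requirements list (with torch as the import name for PyTorch).

```python
import importlib.util

# Import names assumed from the Requirements list; "torch" is PyTorch's import name.
REQUIRED = ["numpy", "pandas", "msgpack", "spacy", "torch"]


def missing_packages(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only confirms that each package can be found; it does not check versions such as the spacy 1.x requirement.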

Train

# prepare the data
python prepro.py

# make sure the CUDA library path and SRU can be found. If not, try:
export LD_LIBRARY_PATH=/usr/local/cuda/lib64
export PYTHONPATH=/path_to_sru_repo/sru/

# train for 100 epochs with batch size 32
python train.py -e 100 -bs 32
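
If training fails to find CUDA or SRU, it can help to confirm that the directories exported in LD_LIBRARY_PATH and PYTHONPATH actually exist. The sketch below is an illustration under that assumption; the helper missing_path_entries is hypothetical and not part of this repository.

```python
import os


def missing_path_entries(var_name, environ=None):
    """Return entries of a path-list variable that are not existing directories."""
    environ = os.environ if environ is None else environ
    value = environ.get(var_name, "")
    # Split on the platform separator (":" on Linux) and skip empty entries.
    entries = [entry for entry in value.split(os.pathsep) if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]


if __name__ == "__main__":
    for name in ("LD_LIBRARY_PATH", "PYTHONPATH"):
        if not os.environ.get(name):
            print(name, "is not set")
            continue
        bad = missing_path_entries(name)
        print(name, "has missing directories:" if bad else "looks fine", *bad)
```

A variable that is unset yields an empty list from the helper, so the script reports that case separately.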

Results

Model                        EM     Time used in RNN  Total time/epoch
LSTM (original paper)        69.5   ~316s             ~431s
SRU (v1, 6 layer)            ~71.1  ~100s             ~201s
SRU (this version, 6 layer)  ~71.4  ~100s             ~201s

Tested on GeForce GTX 1070.
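
To put the timings in the results table in perspective, the speedup of SRU over LSTM can be worked out directly from the reported numbers. This is only arithmetic on the approximate values in the table; the variable names are illustrative.

```python
# Approximate per-epoch timings in seconds, copied from the results table.
lstm_rnn, lstm_total = 316, 431
sru_rnn, sru_total = 100, 201

# Ratio of LSTM time to SRU time: values above 1 mean SRU is faster.
rnn_speedup = lstm_rnn / sru_rnn
total_speedup = lstm_total / sru_total

# Time spent outside the RNN in each epoch.
lstm_other = lstm_total - lstm_rnn
sru_other = sru_total - sru_rnn

print(f"RNN speedup: {rnn_speedup:.2f}x")      # 3.16x
print(f"Epoch speedup: {total_speedup:.2f}x")  # 2.14x
print(f"Non-RNN time: LSTM {lstm_other}s, SRU {sru_other}s")  # 115s and 101s
```

The RNN itself runs roughly three times faster with SRU, while the whole epoch is roughly twice as fast, because the time spent outside the RNN stays at around 100s or more per epoch.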

Credits

Author of the Document Reader model: Danqi Chen.

Author of the original PyTorch implementation: Runqi Yang.

Most of the PyTorch model code is borrowed from Facebook/ParlAI under a BSD-3 license.