```sh
git clone https://github.com/izuna385/Zero-Shot-Entity-Linking.git
cd Zero-Shot-Entity-Linking
python -m spacy download en_core_web_sm
# Multiprocessing sentence boundary detection takes about 2 hours on an 8-core CPU.
sh preprocessing.sh
```
```sh
python3 ./src/train.py -num_epochs 1
```
To check that the entire script runs more quickly, use the debug flag:

```sh
python3 ./src/train.py -num_epochs 1 -debug True
```
Multi-GPU training is also supported:

```sh
CUDA_VISIBLE_DEVICES=0,1 python3 ./src/train.py -num_epochs 1 -cuda_devices 0,1
```
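The flags above (`-num_epochs`, `-cuda_devices`) follow a single-dash convention, with GPU ids given as a comma-separated string. As a minimal sketch of how such flags can be parsed with `argparse` (this parser is an illustration, not the repository's actual argument handling):

```python
import argparse

def parse_args(argv):
    # Hypothetical parser mirroring the flags used in the commands above.
    parser = argparse.ArgumentParser()
    parser.add_argument("-num_epochs", type=int, default=1)
    # Comma-separated GPU ids, e.g. "0,1"; converted to a list of ints.
    parser.add_argument("-cuda_devices", type=str, default="0")
    args = parser.parse_args(argv)
    args.cuda_devices = [int(d) for d in args.cuda_devices.split(",")]
    return args

print(parse_args(["-cuda_devices", "0,1"]).cuda_devices)  # [0, 1]
```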
- These experiments aim to confirm whether fine-tuning pretrained BERT (more specifically, the mention and entity encoders) is effective even in unseen domains.
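The mention/entity bi-encoder idea can be illustrated schematically: one encoder embeds the mention, another embeds each candidate entity, and candidates are ranked by dot product. The toy hash-based encoder below is a stand-in for the pretrained BERT encoders; only the scoring scheme reflects the setup described here.

```python
from math import sqrt

def toy_encode(text, dim=8):
    # Stand-in for a BERT encoder: a deterministic bag-of-characters
    # vector, L2-normalized. Real experiments fine-tune BERT here.
    vec = [0.0] * dim
    for ch in text.lower():
        vec[ord(ch) % dim] += 1.0
    norm = sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def score(mention_vec, entity_vec):
    # Bi-encoder scoring: dot product between the two embeddings.
    return sum(m * e for m, e in zip(mention_vec, entity_vec))

def link(mention, candidate_entities):
    # Rank candidate entities by similarity to the mention.
    m = toy_encode(mention)
    return max(candidate_entities, key=lambda e: score(m, toy_encode(e)))
```

Because both sides are embedded independently, entity vectors can be precomputed and searched at scale (which is where a library like `faiss` comes in).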
- Following [Logeswaran et al., '19], entities are not shared between the train and dev sets, nor between the train and test sets.
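In other words, the entity sets of the splits must be pairwise disjoint, which is easy to verify with set intersections. The per-split sets below are made-up illustrations, not the repository's actual data:

```python
# Hypothetical per-split entity sets; in [Logeswaran et al., '19] each
# split draws its entities from entirely different Wikia worlds.
splits = {
    "train": {"coronation_street", "muppets"},
    "dev":   {"ice_hockey", "elder_scrolls"},
    "test":  {"forgotten_realms", "lego"},
}

def splits_are_zero_shot(splits):
    # Zero-shot condition: no entity appears in more than one split.
    names = list(splits)
    return all(
        splits[a].isdisjoint(splits[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )
```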
- If you are interested in what this repository does, see the original paper or the unofficial slides.
- `torch`, `allennlp`, `transformers`, and `faiss` are required. See also `requirements.txt`.
- About 3 GB of CPU memory and about 1.1 GB of GPU memory are needed to run the script.
- Run `sh preprocessing.sh` in this directory.
- The datasets are derived from [Logeswaran et al., '19].
- See `./src/experiment_logdir/`.
- Each log directory is named after the time at which the experiment starts.
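Naming each run's log directory after its start time can be sketched as follows; the root path mirrors `./src/experiment_logdir/`, but the exact timestamp format is an assumption, not necessarily what this repository uses:

```python
import os
from datetime import datetime

def make_experiment_logdir(root="./src/experiment_logdir/"):
    # Name the run directory after the experiment's start time,
    # e.g. ./src/experiment_logdir/20240101_120000/
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    logdir = os.path.join(root, stamp)
    os.makedirs(logdir, exist_ok=True)
    return logdir
```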
- Preprocess with stricter sentence boundary detection.
- MIT