LexiconNER

This is the implementation of "Named Entity Recognition using Positive-Unlabeled Learning", published at ACL 2019.

Set up and run

Download glove.6B.100d.txt
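Once the file is downloaded, a quick check can confirm that it really is the 100-dimensional text file before any training starts. The sketch below assumes the usual GloVe text layout (one token per line followed by space-separated floats). `load_glove` is a hypothetical helper for illustration, not a function from this repository, and the directory in which the scripts expect the file is not stated here.

```python
from typing import Dict, List


def load_glove(path: str, dim: int = 100) -> Dict[str, List[float]]:
    """Read GloVe text vectors into a dict of token -> list of floats."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip("\n").split(" ")
            # A valid line has the token plus exactly `dim` numbers.
            if len(parts) != dim + 1:
                continue
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors
```

Calling `load_glove("glove.6B.100d.txt")` returns one entry per token. An empty result suggests the wrong file or the wrong dimension was downloaded.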

Environment

pytorch 1.1.0, python 3.6.4, cuda 8.0
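To compare a local setup against these versions, a small check like the following may help. It is only a sketch: `describe_environment` is a hypothetical helper, and it reports what is installed without enforcing the versions listed above.

```python
import sys


def describe_environment():
    """Collect the Python, PyTorch and CUDA versions visible to this interpreter."""
    info = {"python": "%d.%d.%d" % sys.version_info[:3]}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter.
        info["pytorch"] = None
        info["cuda"] = None
    else:
        info["pytorch"] = torch.__version__
        # None when PyTorch was built without CUDA support.
        info["cuda"] = torch.version.cuda
    return info


print(describe_environment())
```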

Instructions for running code

Phase one <train bnPU model>

Train

To print the parameters, run `python feature_pu_model.py -h`

optional arguments:
  -h, --help            show this help message and exit
  --lr LR               learning rate
  --beta BETA           beta of pu learning (default 0.0)
  --gamma GAMMA         gamma of pu learning (default 1.0)
  --drop_out DROP_OUT   dropout rate
  --m M                 class balance rate
  --flag FLAG           entity type (PER/LOC/ORG/MISC)
  --dataset DATASET     name of the dataset
  --batch_size BATCH_SIZE
                        batch size for training and testing
  --print_time PRINT_TIME
                        epochs for printing result
  --pert PERT           percentage of data use for training
  --type TYPE           pu learning type (bnpu/bpu/upu)
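For orientation on what the PU learning options control, here is a minimal sketch of a generic non-negative PU risk computed on predicted probabilities with a bounded absolute-error loss. It is not taken from `feature_pu_model.py`. The function name and the parameters `prior`, `class_weight`, and `beta` are illustrative, and how they map to the script's options is an assumption that should be checked against the code. `--gamma` is not modelled here.

```python
from statistics import mean


def non_negative_pu_risk(pos_probs, unl_probs, prior, class_weight=1.0, beta=0.0):
    """Non-negative PU risk with an absolute-error loss on probabilities.

    pos_probs: predicted P(entity) for tokens labeled positive by the dictionary
    unl_probs: predicted P(entity) for unlabeled tokens
    prior:     assumed fraction of positive tokens in the data
    """
    # Absolute-error loss: loss(p, y=1) = 1 - p and loss(p, y=0) = p.
    pos_as_pos = mean(1.0 - p for p in pos_probs)
    pos_as_neg = mean(pos_probs)
    unl_as_neg = mean(unl_probs)
    # Estimate of the negative-class risk from unlabeled data. It can drop
    # below zero when the model overfits, hence the max(0, .) below.
    neg_risk = unl_as_neg - prior * pos_as_neg
    risk = class_weight * prior * pos_as_pos + max(0.0, neg_risk)
    # Flag the case where the estimate fell below the tolerance -beta.
    needs_correction = neg_risk < -beta
    return risk, needs_correction
```

With `pos_probs=[0.9, 0.8]`, `unl_probs=[0.2, 0.4]` and `prior=0.3`, the negative-risk estimate stays positive and no correction is flagged. Lowering the unlabeled probabilities to `[0.1, 0.1]` drives the estimate below zero and sets the flag.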

e.g.) Train on PER type of conll2003 dataset: `python feature_pu_model.py --dataset conll2003 --flag PER`

Evaluating

`python feature_pu_model_evl.py --model saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0 --flag PER --dataset conll2003 --output 1`

Replace the model name with the one saved by your training run.
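The saved model name in the command above appears to encode the training settings. The sketch below rebuilds such a name from its parts so it does not have to be typed by hand. The pattern is inferred from that single example, and `saved_model_path` is a hypothetical helper, so whether the training script formats every value the same way is an assumption.

```python
def saved_model_path(pu_type, dataset, flag, lr, prior, beta, gamma, percent,
                     root="saved_model"):
    """Build a model path following the pattern seen in the example command."""
    name = "{}_{}_{}_lr_{}_prior_{}_beta_{}_gamma_{}_percent_{}".format(
        pu_type, dataset, flag, lr, prior, beta, gamma, percent
    )
    return "{}/{}".format(root, name)


# Reproduces the PER example used in this README.
print(saved_model_path("bnpu", "conll2003", "PER", 0.0001, 0.3, 0.0, 1.0, 1.0))
```

Very small values such as `0.00001` are rendered by Python in scientific notation, so passing such values as strings is safer if the real file name uses plain decimals.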

`python final_evl.py`

This reports the final result over all entity types. Remember to revise the filenames in the script so that they match the output files produced by the evaluation step.
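Since the final result needs one evaluation output per entity type, the full sequence of commands can be listed up front. This is a sketch only: `evaluation_plan` is a hypothetical helper, it prints nothing and runs nothing by itself, and it does not edit the filenames inside `final_evl.py`.

```python
import shlex

ENTITY_TYPES = ("PER", "LOC", "ORG", "MISC")


def evaluation_plan(model_paths, dataset="conll2003"):
    """Return shell commands: one evaluation per entity type, then final_evl.py.

    model_paths maps each entity type to the model saved by its training run.
    """
    missing = [flag for flag in ENTITY_TYPES if flag not in model_paths]
    if missing:
        raise ValueError("no saved model given for: " + ", ".join(missing))
    commands = []
    for flag in ENTITY_TYPES:
        commands.append([
            "python", "feature_pu_model_evl.py",
            "--model", model_paths[flag],
            "--flag", flag,
            "--dataset", dataset,
            "--output", "1",
        ])
    commands.append(["python", "final_evl.py"])
    # Quote every argument so the lines can be pasted into a shell.
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]
```

Pass a dictionary with the four saved model names from your own training runs and print the returned lines, or feed them to `subprocess.run` one at a time.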

Phase two <train adaPU model>

Dictionary generation

To print the parameters, run `python ada_dict_generation.py -h`

optional arguments:
  -h, --help            show this help message and exit
  --beta BETA           beta of pu learning (default 0.0)
  --gamma GAMMA         gamma of pu learning (default 1.0)
  --drop_out DROP_OUT   dropout rate
  --m M                 class balance rate
  --flag FLAG           entity type (PER/LOC/ORG/MISC)
  --dataset DATASET     name of the dataset
  --lr LR               learning rate
  --batch_size BATCH_SIZE
                        batch size for training and testing
  --iter ITER           iteration time
  --unlabeled UNLABELED
                        use unlabeled data or not
  --pert PERT           percentage of data use for training
  --model MODEL         saved model name

e.g.) `python ada_dict_generation.py --model saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0 --flag PER --iter 1`

Adaptive training

To print the parameters, run `python adaptive_pu_model.py -h`

optional arguments:
  -h, --help            show this help message and exit
  --beta BETA           beta of pu learning (default 0.0)
  --gamma GAMMA         gamma of pu learning (default 1.0)
  --drop_out DROP_OUT   dropout rate
  --m M                 class balance rate
  --p P                 estimate value of prior
  --flag FLAG           entity type (PER/LOC/ORG/MISC)
  --dataset DATASET     name of the dataset
  --lr LR               learning rate
  --batch_size BATCH_SIZE
                        batch size for training and testing
  --output OUTPUT       write the test result, set 1 for writing result to
                        file
  --model MODEL         saved model name
  --iter ITER           iteration time
e.g.) `python adaptive_pu_model.py --model saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0 --flag PER --iter 1`

Replace the saved model name and the iteration number for each round of adaptive learning. Within one round, dictionary generation and adaptive learning must use the same `--iter` value.
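One way to keep the two steps of a round in sync is to generate both commands from the same arguments. The sketch below does that. `adaptive_round_commands` is a hypothetical helper, and the model name to pass for later rounds is not specified here, so it stays an input.

```python
import shlex


def adaptive_round_commands(model_path, flag, iteration):
    """Return the dictionary-generation and adaptive-training commands for one round."""
    # Both scripts receive identical --model, --flag and --iter values.
    shared = ["--model", model_path, "--flag", flag, "--iter", str(iteration)]
    scripts = ("ada_dict_generation.py", "adaptive_pu_model.py")
    return [
        " ".join(shlex.quote(part) for part in ["python", script] + shared)
        for script in scripts
    ]
```

For the first round with the PER model from phase one, this yields the two example commands shown above, each with `--iter 1`.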
