fix some bugs with import
XINGXIAOYU committed Aug 20, 2019
1 parent a14169f commit af1a597
Showing 21 changed files with 3,394 additions and 13,903 deletions.
9 changes: 9 additions & 0 deletions .gitignore
@@ -0,0 +1,9 @@
# Created by .ignore support plugin (hsz.mobi)
### Example user template template
### Example user template

# IntelliJ project files
.idea
*.iml
out
gen
6 changes: 6 additions & 0 deletions .idea/vcs.xml


78 changes: 78 additions & 0 deletions README.md
@@ -1,2 +1,80 @@
# LexiconNER
This is the implementation of "Named Entity Recognition using Positive-Unlabeled Learning" published at ACL2019.

### Set up and run
Download glove.6B.100d.txt
### Environment
pytorch 1.1.0
python 3.6.4
cuda 8.0
### Instructions for running code
#### Phase one: train bnPU model
**Train**
Print parameters
`python feature_pu_model.py -h`
optional arguments:
-h, --help show this help message and exit
--lr LR learning rate
--beta BETA beta of pu learning (default 0.0)
--gamma GAMMA gamma of pu learning (default 1.0)
--drop_out DROP_OUT dropout rate
--m M class balance rate
--flag FLAG entity type (PER/LOC/ORG/MISC)
--dataset DATASET name of the dataset
--batch_size BATCH_SIZE
batch size for training and testing
--print_time PRINT_TIME
epochs for printing result
--pert PERT percentage of data use for training
--type TYPE pu learning type (bnpu/bpu/upu)
e.g.)
Train on PER type of conll2003 dataset:
`python feature_pu_model.py --dataset conll2003 --flag PER`
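The option list above looks like standard `argparse` output. As a rough illustration of how those flags fit together, here is a minimal sketch of a parser with the same option names. It is a hypothetical reconstruction, not the code in `feature_pu_model.py`: the argument types and the `choices` lists are assumptions, and only the defaults for `--beta` and `--gamma` come from the help text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the repo's code)."""
    parser = argparse.ArgumentParser(description="bnPU training options (sketch)")
    parser.add_argument("--lr", type=float, help="learning rate")
    # Only these two defaults are stated in the help text above.
    parser.add_argument("--beta", type=float, default=0.0, help="beta of pu learning")
    parser.add_argument("--gamma", type=float, default=1.0, help="gamma of pu learning")
    parser.add_argument("--drop_out", type=float, help="dropout rate")
    parser.add_argument("--m", type=float, help="class balance rate")
    # --flag selects the entity type; --type selects the PU learning variant.
    parser.add_argument("--flag", choices=["PER", "LOC", "ORG", "MISC"], help="entity type")
    parser.add_argument("--dataset", help="name of the dataset")
    parser.add_argument("--batch_size", type=int, help="batch size for training and testing")
    parser.add_argument("--print_time", type=int, help="epochs for printing result")
    parser.add_argument("--pert", type=float, help="percentage of data use for training")
    parser.add_argument("--type", choices=["bnpu", "bpu", "upu"], help="pu learning type")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--dataset", "conll2003", "--flag", "PER", "--type", "bnpu"]
)
print(args.dataset, args.flag, args.type, args.beta, args.gamma)
```

Note that the entity type goes in `--flag`, while `--type` picks between `bnpu`, `bpu` and `upu`.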
**Evaluating**
`python feature_pu_model_evl.py --model saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0 --flag PER --dataset conll2003 --output 1`
Replace the model name with the one produced by training.
`python final_evl.py`
This gives the final result over all entity types. Remember to change the filenames to the output files written by the evaluation step.
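The `--model` value used above encodes the training settings in its name. The helper below is a sketch for assembling such a path; the pattern is inferred only from the single example name in this README, and the function `saved_model_path` and its parameter names are hypothetical, not part of the repository.

```python
def saved_model_path(pu_type, dataset, flag, lr, prior, beta, gamma, percent,
                     directory="saved_model"):
    """Build a model path following the pattern seen in the example above.

    Values are taken as strings so the caller controls number formatting
    (a float such as 0.00001 would otherwise print as '1e-05').
    """
    return (
        f"{directory}/{pu_type}_{dataset}_{flag}"
        f"_lr_{lr}_prior_{prior}_beta_{beta}_gamma_{gamma}_percent_{percent}"
    )


path = saved_model_path(
    "bnpu", "conll2003", "PER",
    lr="0.0001", prior="0.3", beta="0.0", gamma="1.0", percent="1.0",
)
print(path)
```

Check the actual file name in `saved_model/` after training, since the training script may format some values differently.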

#### Phase two: train adaPU model
**dictionary generation**
`python ada_dict_generation.py -h`
optional arguments:
-h, --help show this help message and exit
--beta BETA learning rate
--gamma GAMMA gamma of pu learning (default 1.0)
--drop_out DROP_OUT dropout rate
--m M class balance rate
--flag FLAG entity type (PER/LOC/ORG/MISC)
--dataset DATASET name of the dataset
--lr LR learning rate
--batch_size BATCH_SIZE
batch size for training and testing
--iter ITER iteration time
--unlabeled UNLABELED
use unlabeled data or not
--pert PERT percentage of data use for training
--model MODEL saved model name
e.g.)
`python ada_dict_generation.py --model saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0 --flag PER --iter 1`
**adaptive training**
`python adaptive_pu_model.py -h`
optional arguments:
-h, --help show this help message and exit
--beta BETA beta of pu learning (default 0.0)
--gamma GAMMA gamma of pu learning (default 1.0)
--drop_out DROP_OUT dropout rate
--m M class balance rate
--p P estimate value of prior
--flag FLAG entity type (PER/LOC/ORG/MISC)
--dataset DATASET name of the dataset
--lr LR learning rate
--batch_size BATCH_SIZE
batch size for training and testing
--output OUTPUT write the test result, set 1 for writing result to
file
--model MODEL saved model name
--iter ITER iteration time
e.g.)
`python adaptive_pu_model.py --model saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3_beta_0.0_gamma_1.0_percent_1.0 --flag PER --iter 1`
Replace the saved model name and the iteration number for each round of adaptive learning. Within one iteration, dictionary generation and adaptive training must use the same `--iter` value.
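To keep the two steps of one iteration in sync, the commands can be generated from a single set of values. The sketch below only builds and prints the two command lines shown in this section; the helper `iteration_commands` is hypothetical, and nothing here is executed against the repository.

```python
def iteration_commands(model, flag, iteration):
    """Return the dictionary-generation and adaptive-training commands
    for one iteration, sharing the same --model, --flag and --iter."""
    shared = ["--model", model, "--flag", flag, "--iter", str(iteration)]
    return [
        ["python", "ada_dict_generation.py", *shared],
        ["python", "adaptive_pu_model.py", *shared],
    ]


model = ("saved_model/bnpu_conll2003_PER_lr_0.0001_prior_0.3"
         "_beta_0.0_gamma_1.0_percent_1.0")
commands = iteration_commands(model, "PER", 1)
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to run the step. For later iterations, substitute the model saved by the previous round, as noted above.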