wilburOne / AdversarialNameTagger Public

Multi-Level Adversarial for Cross-lingual Name Tagging

Cross-lingual Multi-Level Adversarial Transfer to Enhance Low-Resource Name Tagging

This repository contains the source code for cross-lingual name tagging with multi-level adversarial training.

Requirements

Python 3, PyTorch

Data Format

Label format

The name tagger follows the BIO or BIOES scheme.
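
To illustrate how the two schemes relate, here is a minimal sketch that converts a BIO tag sequence to BIOES, where `E-` marks the last token of a multi-token name and `S-` marks a single-token name. The function name `bio_to_bioes` is hypothetical and not part of this repository; the tagger's own conversion code may differ.

```python
def bio_to_bioes(tags):
    """Convert a list of BIO tags to BIOES tags (illustrative helper, not from this repo)."""
    bioes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bioes.append(tag)
            continue
        prefix, _, label = tag.partition("-")
        # The current name continues only if the next tag is I- with the same label.
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == "I-" + label
        if prefix == "B":
            bioes.append(("B-" if continues else "S-") + label)
        elif prefix == "I":
            bioes.append(("I-" if continues else "E-") + label)
        else:
            raise ValueError("unexpected tag: " + tag)
    return bioes


bio = ["B-PER", "I-PER", "I-PER", "O", "O", "B-GPE", "O", "O"]
print(bio_to_bioes(bio))
# ['B-PER', 'I-PER', 'E-PER', 'O', 'O', 'S-GPE', 'O', 'O']
```
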

Sentence format

Each document is segmented into sentences, and each sentence is tokenized.

In the training file, sentences are separated by an empty line, and each token is on its own line. On each token line, the label must always come last, separated from the token by a space.

Example:
```
George B-PER
W. I-PER
Bush I-PER
went O
to O
Germany B-GPE
yesterday O
. O

New B-ORG
York I-ORG
Times I-ORG
```
A real example of a BIO file: example/data/eng.train.bio
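
As a rough illustration of the format described above, the following sketch reads BIO-formatted lines into a list of sentences. The helper name `read_bio` is hypothetical and is not the loader used by this repository. It assumes the token is the first whitespace-separated field and the label is the last one.

```python
def read_bio(lines):
    """Group BIO-formatted lines into sentences of (token, label) pairs.

    Illustrative helper, not the repository's own loader.
    """
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        if not line:
            # An empty line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split()
        # Assumption: token comes first, label comes last.
        current.append((fields[0], fields[-1]))
    if current:
        sentences.append(current)
    return sentences


sample = """George B-PER
W. I-PER
Bush I-PER
went O

New B-ORG
York I-ORG
Times I-ORG
"""
sentences = read_bio(sample.splitlines())
print(len(sentences))   # 2
print(sentences[1])     # [('New', 'B-ORG'), ('York', 'I-ORG'), ('Times', 'I-ORG')]
```

To read a file instead of a string, an open file handle can be passed directly, for example `read_bio(open("example/data/eng.train.bio", encoding="utf-8"))`.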

Usage

A training example is provided in example/seq_labeling_naacl/.

Citation

[1] Lifu Huang, Heng Ji, Jonathan May. Cross-lingual Multi-Level Adversarial Transfer to Enhance Low-Resource Name Tagging, Proc. NAACL, 2019
