wilburOne/AdversarialNameTagger

Cross-lingual Multi-Level Adversarial Transfer to Enhance Low-Resource Name Tagging

This repository contains the source code for cross-lingual name tagging with multi-level adversarial training.

Requirements

Python 3, PyTorch

Data Format

  • Label format

    The name tagger follows the BIO or BIOES labeling scheme.

  • Sentence format

    Each document is segmented into sentences, and each sentence is tokenized.

    In the training file, sentences are separated by an empty line and tokens by a line break. Each line holds one token followed by its label, which must always come last. The token and label are separated by a space.

    Example:

    George B-PER
    W. I-PER
    Bush I-PER
    went O
    to O
    Germany B-GPE
    yesterday O
    . O
    
    New B-ORG
    York I-ORG
    Times I-ORG
    

    A real example of a bio file: example/data/eng.train.bio
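
The format described above can be read with a few lines of Python. The following is a minimal sketch, not the repository's own loader: the function names `read_bio` and `bio_to_bioes` are hypothetical, and the BIO-to-BIOES conversion assumes the usual convention in which `S-` marks a single-token name and `E-` marks the last token of a multi-token name.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]  # list of (token, label) pairs


def read_bio(lines: Iterable[str]) -> List[Sentence]:
    """Parse token/label lines into sentences (hypothetical helper).

    An empty line ends the current sentence.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        # The label is always the last space-separated field on the line.
        token, label = line.rsplit(" ", 1)
        current.append((token, label))
    if current:
        sentences.append(current)
    return sentences


def bio_to_bioes(labels: List[str]) -> List[str]:
    """Convert BIO labels to BIOES (assumed S-/E- convention)."""
    converted = []
    for i, label in enumerate(labels):
        if label == "O":
            converted.append(label)
            continue
        prefix, entity_type = label.split("-", 1)
        next_label = labels[i + 1] if i + 1 < len(labels) else "O"
        # The name continues only if the next label is I- of the same type.
        continues = next_label == "I-" + entity_type
        if prefix == "B":
            converted.append(("B-" if continues else "S-") + entity_type)
        else:
            converted.append(("I-" if continues else "E-") + entity_type)
    return converted


sample = """George B-PER
W. I-PER
Bush I-PER
went O
to O
Germany B-GPE
yesterday O
. O

New B-ORG
York I-ORG
Times I-ORG
"""

sentences = read_bio(sample.splitlines())
bioes = [bio_to_bioes([label for _, label in s]) for s in sentences]
print(len(sentences), bioes[0])
```

To read a file such as example/data/eng.train.bio, pass the open file object to `read_bio` instead of `sample.splitlines()`.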

Usage

A training example is provided in example/seq_labeling_naacl/.

Citation

[1] Lifu Huang, Heng Ji, Jonathan May. Cross-lingual Multi-Level Adversarial Transfer to Enhance Low-Resource Name Tagging. Proc. NAACL, 2019.
