This repository contains the code necessary to reproduce the main results of the paper:
Natural Language Adversarial Defense through Synonym Encoding (UAI 2021)
Xiaosen Wang, Hao Jin, Yichen Yang and Kun He
For the IGA attack, please refer to IGA.
There are three datasets used in our experiments:
The code was tested with:
- python 3.6.5
- numpy 1.16.4
- tensorflow 1.8.0
- tensorflow-gpu 1.5.0
- pandas 0.23.0
- keras 2.2.0
- scikit-learn 0.19.1
- scipy 1.0.1
- textrnn.py, textcnn.py, textbirnn.py: The models for LSTM, Word-CNN and Bi-LSTM.
- train_orig.py, train_enc.py: Training the models with or without SEM.
- glove_utils.py: Loading the GloVe model and creating the embedding matrix for the word dictionary.
- build_embeddings.py: Generating the embedding matrices for the original word dictionary and the encoded word dictionary.
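The embedding-matrix step described above can be illustrated with a minimal sketch. It assumes GloVe vectors in the usual text format (one word followed by its values per line) and a word dictionary that maps each word to an integer index. The function names `load_glove_vectors` and `build_embedding_matrix` and the toy data are hypothetical and are not taken from glove_utils.py or build_embeddings.py, whose actual interfaces may differ.

```python
import numpy as np


def load_glove_vectors(lines):
    """Parse GloVe-style text lines ("word v1 v2 ...") into a dict of vectors."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector of the word with index i.

    Words without a pretrained vector keep an all-zero row (an assumption
    made for this sketch; other initializations are possible).
    """
    matrix = np.zeros((max(word_index.values()) + 1, dim), dtype="float32")
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None:
            matrix[idx] = vec
    return matrix


# Toy stand-ins for a GloVe file and a word dictionary (hypothetical data).
glove_lines = ["good 0.1 0.2 0.3", "bad -0.1 -0.2 -0.3"]
word_index = {"good": 1, "bad": 2, "unseen": 3}

vectors = load_glove_vectors(glove_lines)
embedding_matrix = build_embedding_matrix(word_index, vectors, dim=3)
print(embedding_matrix.shape)
```

With a real GloVe file, the `lines` argument could be an open file handle, since the function only iterates over lines.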
- Generating the embedding matrix for the original dictionary and the encoded dictionary:
python build_embeddings.py
- Training the models with the original word dictionary:
python train_orig.py --data aclImdb --sn 10 --sigma 0.5 --nn_type textrnn
- Training the models with the encoded word dictionary:
python train_enc.py --data aclImdb --sn 10 --sigma 0.5 --nn_type textrnn
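To give an intuition for what an encoded word dictionary could look like, here is a toy sketch in which words from the same synonym group share one index, so that swapping a word for a synonym leaves the model input unchanged. This is an illustration only and not the code in train_enc.py: the function `build_synonym_encoding`, the hand-written synonym lists, and the assumption that lower indices mean more frequent words are all hypothetical, and the `--sn` and `--sigma` options are not modeled here.

```python
def build_synonym_encoding(word_index, synonyms):
    """Map every word to one shared index per synonym group.

    word_index: dict word -> integer index (assumed: lower = more frequent).
    synonyms:   dict word -> list of synonym words (hypothetical input).
    """
    encoding = {}
    # Visit words from most to least frequent under the assumption above.
    for word in sorted(word_index, key=word_index.get):
        if word in encoding:
            continue
        # Reuse the code of an already encoded synonym if there is one,
        # otherwise the word keeps its own index.
        codes = [encoding[s] for s in synonyms.get(word, []) if s in encoding]
        code = codes[0] if codes else word_index[word]
        encoding[word] = code
        # Propagate the code to synonyms that are not encoded yet.
        for s in synonyms.get(word, []):
            if s in word_index and s not in encoding:
                encoding[s] = code
    return encoding


# Toy dictionary and synonym lists (hypothetical data).
word_index = {"good": 1, "great": 2, "bad": 3, "fine": 4, "awful": 5}
synonyms = {
    "good": ["great", "fine"],
    "great": ["good"],
    "bad": ["awful"],
    "fine": ["good"],
    "awful": ["bad"],
}

encoding = build_synonym_encoding(word_index, synonyms)
original = [encoding[w] for w in ["good", "bad"]]
substituted = [encoding[w] for w in ["great", "awful"]]
print(original == substituted)
```

In this toy setting a synonym substitution maps to the same index sequence, which is the property a synonym-based encoding aims for.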
This repository is under active development. Questions and suggestions can be sent to xswanghuster@gmail.com.