Code for our EMNLP 2023 paper - Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from Large Language Models

Mr.Harm

Official PyTorch implementation for the paper - Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from Large Language Models.

(EMNLP 2023 Findings: The 2023 Conference on Empirical Methods in Natural Language Processing, December 2023, Singapore.) [paper](https://aclanthology.org/2023.findings-emnlp.611)

Install

conda create -n meme python=3.8
conda activate meme
pip install -r requirements.txt

Data

Please refer to Data.

Training

  • Learn from LLMs
export DATA="/path/to/data/folder"
export LOG="/path/to/save/ckpts/name"

rm -rf "$LOG"
mkdir -p "$LOG"

CUDA_VISIBLE_DEVICES=0 python run.py with data_root=$DATA \
    num_gpus=1 num_nodes=1 task_train per_gpu_batchsize=32 batch_size=32 \
    clip32_base224 text_t5_base image_size=224 vit_randaug mode="rationale" \
    log_dir=$LOG precision=32 max_epoch=10 learning_rate=5e-5
  • Learn from Labels
export DATA="/path/to/data/folder"
export LOG="/path/to/save/ckpts/name"

rm -rf "$LOG"
mkdir -p "$LOG"

CUDA_VISIBLE_DEVICES=0 python run.py with data_root=$DATA \
    num_gpus=1 num_nodes=1 task_train per_gpu_batchsize=32 batch_size=32 \
    clip32_base224 text_t5_base image_size=224 vit_randaug mode="label" \
    log_dir=$LOG precision=32 max_epoch=30 learning_rate=5e-5 \
    load_path="/path/to/distill_LLMs.ckpt"

Inference

export DATA="/path/to/data/folder"
export LOG="/path/to/log/folder"

CUDA_VISIBLE_DEVICES=0 python run.py with data_root=$DATA \
    num_gpus=1 num_nodes=1 task_train per_gpu_batchsize=1 batch_size=1 \
    clip32_base224 text_t5_base image_size=224 vit_randaug \
    log_dir=$LOG precision=32 test_only=True \
    load_path="/path/to/label_learn.ckpt" \
    out_path="/path/to/save/label_pred.json"

Then, compare /path/to/save/label_pred.json against the gold labels to compute the evaluation scores.
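The scoring step could be sketched as below. Note the assumptions: the schema of label_pred.json (a JSON object mapping sample id to predicted label) and the choice of metrics (accuracy and macro-F1) are illustrative guesses, not the repository's actual evaluation script; adapt the loading code to whatever format the inference run writes.

```python
# Hypothetical scoring sketch. Assumes label_pred.json and the gold file are
# both JSON objects of the form {"sample_id": label, ...}; adjust as needed.
import json

def accuracy(preds: dict, gold: dict) -> float:
    """Fraction of gold samples whose predicted label matches."""
    return sum(preds.get(k) == v for k, v in gold.items()) / len(gold)

def macro_f1(preds: dict, gold: dict) -> float:
    """Unweighted mean of per-class F1 over every label that occurs."""
    f1s = []
    for c in set(gold.values()) | set(preds.values()):
        tp = sum(1 for k, v in gold.items() if v == c and preds.get(k) == c)
        fp = sum(1 for k, v in preds.items() if v == c and gold.get(k) != c)
        fn = sum(1 for k, v in gold.items() if v == c and preds.get(k) != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

def evaluate(pred_path: str, gold_path: str) -> dict:
    """Load prediction and gold JSON files and report both metrics."""
    with open(pred_path) as f:
        preds = json.load(f)
    with open(gold_path) as f:
        gold = json.load(f)
    return {"accuracy": accuracy(preds, gold),
            "macro_f1": macro_f1(preds, gold)}
```

Macro-F1 is computed here from scratch so the snippet stays dependency-free; with scikit-learn installed, `sklearn.metrics.f1_score(..., average="macro")` gives the same quantity.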

Citation

@inproceedings{lin-etal-2023-beneath,
    title = "Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from Large Language Models",
    author = "Lin, Hongzhan  and
      Luo, Ziyang  and
      Ma, Jing  and
      Chen, Long",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-emnlp.611",
    doi = "10.18653/v1/2023.findings-emnlp.611",
    pages = "9114--9128",
}

Acknowledgements

The code is based on ViLT and METER.
