For Leaderboard submission:
- Zero-Shot Setting:
  - Training data: None
  - Test data: `test_sentences.txt`
- Fine-Tuning Setting:
  - Training data: `data/10k/`
  - Test data: `test_sentences.txt`
All data is in `data/`:
- We prepared the json format to better help users understand the structure of our probe sets, i.e., which perturbation corresponds to each statement, what the logical form is, etc. For the actual testing/training files, we have replaced the A/B placeholders with random entities; those files are in txt format.
- Full 253k data (noisy) in json format: `data/RICA_253k_axiom2set.jsonl`
- Human-verified 10k data in json format: `data/RICA_10k_axiom2set.jsonl`
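For a first look at the json-format probe sets, the jsonl files can be streamed line by line; the sketch below only prints the keys of one record, so no field names are assumed:

```python
import json

# Minimal sketch: peek at one record of the human-verified probe set.
# Print the keys to see the actual schema (perturbations, logical forms, etc.).
with open("data/RICA_10k_axiom2set.jsonl") as f:
    first_record = json.loads(next(f))
print(first_record.keys())
```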
Testing Models
- Besides the following scripts, we also prepared a Jupyter notebook, `Probing_Examples.ipynb`, that uses the most up-to-date Huggingface pipeline for masked word prediction.
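For orientation, here is a minimal fill-mask sketch in the spirit of that notebook; the statement uses made-up novel entities and is only illustrative (see `Probing_Examples.ipynb` for the actual probing setup):

```python
from transformers import pipeline

# Masked word prediction with the Huggingface fill-mask pipeline.
unmasker = pipeline("fill-mask", model="bert-large-uncased")

# Illustrative RICA-style statement with made-up novel entities (not from the probe sets).
statement = "gorps are heavier than blicks, so a gorp is [MASK] likely to sink."
for prediction in unmasker(statement, targets=["more", "less"]):
    print(prediction["token_str"], round(prediction["score"], 4))
```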
For zeroshot
- BERT: `experiments/eval_bert_zeroshot.py`
  - `python eval_bert_zeroshot.py test_dir_name test_name(easy/hard/joint) filename seed`
  - e.g. `python eval_bert_zeroshot.py joint_test_set joint bert-large-42 42`
- RoBERTa: `experiments/eval_roberta_zeroshot.py`
  - `python eval_roberta_zeroshot.py test_dir_name test_name(easy/hard/joint) filename seed`
  - e.g. `python eval_roberta_zeroshot.py joint_test_set joint roberta-large-42 42`
- GPT2: `experiments/eval_gpt2_zeroshot.py`
  - `python eval_gpt2_zeroshot.py`
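The sketch below illustrates the kind of two-way masked-word comparison the zero-shot probes involve; it is an illustration only, not the exact logic of the scripts above, and the statement and entities are made up:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-large-uncased")
model.eval()

def binary_choice(statement, options=("more", "less")):
    """Return whichever option the masked LM scores higher at the [MASK] position."""
    inputs = tokenizer(statement, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    option_ids = tokenizer.convert_tokens_to_ids(list(options))
    return options[int(torch.argmax(logits[option_ids]))]

print(binary_choice("gorps are heavier than blicks, so a gorp is [MASK] likely to sink."))
```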
For finetuned models
- BERT: `experiments/eval_bert_finetuned.py`
  - `python eval_bert_finetuned.py test_data_dir model_dir output_dir #of_novel_entities model_name`
  - e.g. `python eval_bert_finetuned.py human_curated_set 10k 10k_fine_tuned 1 bert-large-42`
- RoBERTa: `experiments/eval_roberta_finetuned.py`
  - `python eval_roberta_finetuned.py test_data_dir model_dir output_dir #of_novel_entities model_name`
  - e.g. `python eval_roberta_finetuned.py human_curated_set 10k 10k_fine_tuned 1 roberta-large-42`
- GPT2: `experiments/run_generative_gpt2_on_easy.py`, `experiments/run_generative_gpt2_on_hard.py`, `experiments/run_generative_gpt2_on_joint.py`
  - `python run_generative_gpt2_on_easy.py #_of_novel_entities`
  - e.g. `python run_generative_gpt2_on_easy.py 5`
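For the generative GPT2 evaluation, scoring candidate statements by language-modeling loss looks roughly like the sketch below (lower loss means the completion is preferred); this is only an illustration, not the exact method of the `run_generative_gpt2_on_*` scripts:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def lm_loss(sentence):
    """Average token-level cross-entropy GPT2 assigns to the sentence."""
    encoding = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        output = model(**encoding, labels=encoding.input_ids)
    return output.loss.item()

# Illustrative statement pair with made-up novel entities.
loss_more = lm_loss("gorps are heavier than blicks, so a gorp is more likely to sink.")
loss_less = lm_loss("gorps are heavier than blicks, so a gorp is less likely to sink.")
print("model prefers:", "more" if loss_more < loss_less else "less")
```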
Finetuning BERT/RoBERTa for MWP
- Finetuning code `train_mlm.py` is in `happy-transformer/examples/` (a rough sketch of this step is shown after this list)
  - `python train_mlm.py training_data_directory #_of_novel_entities output_filename model_name seed_number`
  - e.g. `python train_mlm.py 10k 10 roberta-large-42 roberta-large 42`
- After finetuning, you can get the average binary score using `experiments/test_mlm.py`
  - `python test_mlm.py training_data_directory #_of_novel_entities filename.txt`
  - e.g. `python test_mlm.py 10k 10 10_roberta-large-42.txt`
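As referenced above, here is a rough, plain-Huggingface sketch of what masked-LM fine-tuning on the probe statements involves; `train_mlm.py` remains the supported entry point, and the training-file path, one-statement-per-line format, and hyperparameters below are assumptions rather than the script's actual settings:

```python
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, LineByLineTextDataset,
                          Trainer, TrainingArguments)

model_name = "roberta-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Placeholder path: assumes a plain-text training file with one statement per line.
train_dataset = LineByLineTextDataset(tokenizer=tokenizer,
                                      file_path="data/10k/train.txt",
                                      block_size=128)
# Standard 15% random masking for masked-LM training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="roberta-large-42", num_train_epochs=3, seed=42),
    data_collator=collator,
    train_dataset=train_dataset,
)
trainer.train()
```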
Finetuning GPT2
- You can finetune GPT2 using `experiments/fine_tune_GPT-2.sh`