Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
0 parents
commit f580f9a
Showing 1,769 changed files with 957,508 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
/.DS_Store |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# LLM_unlearning | ||
|
||
## Install | ||
```bash | ||
conda create --name unlearning python=3.9.16 | ||
conda activate unlearning | ||
pip install -r requirements.txt | ||
``` | ||
## Run one experiment (125M FT) | ||
```bash | ||
python main.py --method DI --cd_num_token 1000 --model_name_or_path EleutherAI/gpt-neo-125m --train_batch_size 64 --eval_batch_size 256 --eval_num 5000 --num_epochs_di 10 --lr_di 1e-06 --di_strength 3 --output_folder outputs_new | ||
``` | ||
|
||
## Run one experiment (1.3B LoRA) | ||
```bash | ||
python main.py --method DI --model_name_or_path EleutherAI/gpt-neo-1.3B --train_batch_size 32 --eval_batch_size 64 --eval_num 5000 --lr_di 5e-06 --di_strength 3 --num_epochs_di 100 --gradient_accu 2 --early_stop True --early_stop_criteria 1.03 --peft lora --rank 8 --lora_alpha 16 --warmup_steps 100 --output_folder outputs_new | ||
``` | ||
|
||
## Run full experiments | ||
The Python files under ./exp set up the full sets of experiments for different purposes. | ||
|
||
## Logs and Visualization | ||
Running main.py produces a result file and a generation example file. Use parse_log.py to convert the results to a CSV file. Our earlier results are in the ./output folder, and visualization.ipynb can be used to visualize them. |
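The two single-experiment commands in the README above use different batch settings (64 for the 125M run, 32 with `--gradient_accu 2` for the 1.3B LoRA run). The sketch below works out the arithmetic under two assumptions the README does not state: that `--gradient_accu` is the number of gradient accumulation steps (defaulting to 1 when the flag is absent), and that the LoRA adapter uses the standard `lora_alpha / rank` scaling. The function names are hypothetical and are not part of main.py.

```python
def effective_batch_size(train_batch_size: int, gradient_accu: int = 1) -> int:
    """Examples per optimizer step, ASSUMING --gradient_accu counts accumulation steps."""
    if train_batch_size < 1 or gradient_accu < 1:
        raise ValueError("train_batch_size and gradient_accu must be positive")
    return train_batch_size * gradient_accu


def lora_scaling(lora_alpha: float, rank: int) -> float:
    """Standard LoRA scaling factor alpha / r (assumed, not confirmed by the README)."""
    if rank < 1:
        raise ValueError("rank must be positive")
    return lora_alpha / rank


# 125M FT command: --train_batch_size 64, no --gradient_accu flag (assumed to mean 1).
ft_125m = effective_batch_size(64)
# 1.3B LoRA command: --train_batch_size 32 --gradient_accu 2 --rank 8 --lora_alpha 16.
lora_1_3b = effective_batch_size(32, 2)
scale = lora_scaling(16, 8)

print(f"125M FT effective batch size:   {ft_125m}")
print(f"1.3B LoRA effective batch size: {lora_1_3b}")
print(f"LoRA scaling factor:            {scale}")
```

If those assumptions hold, both commands process 64 examples per optimizer step, so the two runs differ in learning rate, epoch count and adapter settings rather than in effective batch size.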
Binary file not shown.
Empty file added
0
...aswag_default_0.1.0_512a66dd8b1b1643ab4a48aa4f150d04c91680da6a4096498a5e5f799623d5ae.lock
Empty file.
Empty file added
0
...9f3eb60e823d5_0.0.0_2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec.lock
Empty file.
Empty file added
0
...qa_plain_text_1.1.0_6c611c1a9bf220943c4174e117d3b660859665baf1d43156230116185312d011.lock
Empty file.
Empty file added
0
...per_glue_copa_1.0.3_bb9675f958ebfee0d5d6dc5476fafe38c79123727a7258d515c450873dbdbbed.lock
Empty file.
Empty file added
0
..._winogrande_s_1.1.0_a826c3d3506aefe0e9e9390dcb53271070536586bab95849876b2c1743df56e2.lock
Empty file.
Binary file added
BIN
+185 KB
data/downloads/2b8d1b41bc3410e7183fc0ac9512242c0271f4e5556665158155ae695713d6e3
Binary file not shown.
1 change: 1 addition & 0 deletions
1
data/downloads/2b8d1b41bc3410e7183fc0ac9512242c0271f4e5556665158155ae695713d6e3.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://huggingface.co/datasets/allenai/ai2_arc/resolve/210d026faf9955653af8916fad021475a3f00453/ARC-Challenge/train-00000-of-00001.parquet", "etag": null} |
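Each cached download in `data/downloads/` is paired with a small `.json` sidecar like the one above, holding a `url` and an `etag` field. As a minimal sketch, assuming every sidecar has that same two-key shape, the following helper (a hypothetical name, not part of this repository) maps each cached blob back to the URL it was fetched from:

```python
import json
from pathlib import Path


def list_cached_urls(downloads_dir: str) -> dict:
    """Map each cached blob name to the source URL recorded in its .json sidecar."""
    mapping = {}
    for meta_path in sorted(Path(downloads_dir).glob("*.json")):
        with meta_path.open(encoding="utf-8") as fh:
            meta = json.load(fh)
        # The sidecar is named "<blob>.json", so the stem is the blob's file name.
        mapping[meta_path.stem] = meta.get("url")
    return mapping


if __name__ == "__main__":
    for blob, url in list_cached_urls("data/downloads").items():
        print(f"{blob[:12]}...  {url}")
```

This makes it easier to see which benchmark file each hashed name in the diff corresponds to without opening the sidecars one by one.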
Empty file added
0
data/downloads/2b8d1b41bc3410e7183fc0ac9512242c0271f4e5556665158155ae695713d6e3.lock
Empty file.
10,003 changes: 10,003 additions & 0 deletions
10,003
data/downloads/30b6e49bd1e17dbfea4c75c30d8399bf3a92f898e9832ef6ca159f74eabd6754
Large diffs are not rendered by default.
1 change: 1 addition & 0 deletions
1
data/downloads/30b6e49bd1e17dbfea4c75c30d8399bf3a92f898e9832ef6ca159f74eabd6754.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://raw.githubusercontent.com/rowanz/hellaswag/master/data/hellaswag_test.jsonl", "etag": null} |
Empty file added
0
data/downloads/30b6e49bd1e17dbfea4c75c30d8399bf3a92f898e9832ef6ca159f74eabd6754.lock
Empty file.
Binary file added
BIN
+43 KB
data/downloads/53d2f20b2636031aca97f6c04afef6cba49ef933449622025adfc8809de8b032
Binary file not shown.
1 change: 1 addition & 0 deletions
1
data/downloads/53d2f20b2636031aca97f6c04afef6cba49ef933449622025adfc8809de8b032.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://dl.fbaipublicfiles.com/glue/superglue/data/v2/COPA.zip", "etag": null} |
Empty file added
0
data/downloads/53d2f20b2636031aca97f6c04afef6cba49ef933449622025adfc8809de8b032.lock
Empty file.
39,905 changes: 39,905 additions & 0 deletions
39,905
data/downloads/630ed04bd62ee51d06d9ba13f00fe153c1951e84594dd9df8c4c1c9587516f77
Large diffs are not rendered by default.
1 change: 1 addition & 0 deletions
1
data/downloads/630ed04bd62ee51d06d9ba13f00fe153c1951e84594dd9df8c4c1c9587516f77.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://raw.githubusercontent.com/rowanz/hellaswag/master/data/hellaswag_train.jsonl", "etag": null} |
Empty file added
0
data/downloads/630ed04bd62ee51d06d9ba13f00fe153c1951e84594dd9df8c4c1c9587516f77.lock
Empty file.
Binary file added
BIN
+338 KB
data/downloads/63c87df0329762fa4cf5a54b6d1a15173d51b1044fe330490daeafb0b54754a8
Binary file not shown.
1 change: 1 addition & 0 deletions
1
data/downloads/63c87df0329762fa4cf5a54b6d1a15173d51b1044fe330490daeafb0b54754a8.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://huggingface.co/datasets/allenai/ai2_arc/resolve/210d026faf9955653af8916fad021475a3f00453/ARC-Easy/test-00000-of-00001.parquet", "etag": null} |
Empty file added
0
data/downloads/63c87df0329762fa4cf5a54b6d1a15173d51b1044fe330490daeafb0b54754a8.lock
Empty file.
Binary file added
BIN
+323 KB
data/downloads/8c447af4bc8816f3aa2900a1d99f34bafeb1d6ad26dfcfba129dfdccb5120b87
Binary file not shown.
1 change: 1 addition & 0 deletions
1
data/downloads/8c447af4bc8816f3aa2900a1d99f34bafeb1d6ad26dfcfba129dfdccb5120b87.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://huggingface.co/datasets/allenai/ai2_arc/resolve/210d026faf9955653af8916fad021475a3f00453/ARC-Easy/train-00000-of-00001.parquet", "etag": null} |
Empty file added
0
data/downloads/8c447af4bc8816f3aa2900a1d99f34bafeb1d6ad26dfcfba129dfdccb5120b87.lock
Empty file.
10,042 changes: 10,042 additions & 0 deletions
10,042
data/downloads/af9990f4ae181bbfbb2e33863f2dfa12b92eb453b0b0fc524106741d796a2d15
Large diffs are not rendered by default.
1 change: 1 addition & 0 deletions
1
data/downloads/af9990f4ae181bbfbb2e33863f2dfa12b92eb453b0b0fc524106741d796a2d15.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://raw.githubusercontent.com/rowanz/hellaswag/master/data/hellaswag_val.jsonl", "etag": null} |
Empty file added
0
data/downloads/af9990f4ae181bbfbb2e33863f2dfa12b92eb453b0b0fc524106741d796a2d15.lock
Empty file.
Binary file added
BIN
+54.4 KB
data/downloads/ba208327ccb4a2f2b093cdd7eecee1c96cbbe3d92ec67e1be12bdbc972d4eea8
Binary file not shown.
1 change: 1 addition & 0 deletions
1
data/downloads/ba208327ccb4a2f2b093cdd7eecee1c96cbbe3d92ec67e1be12bdbc972d4eea8.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://huggingface.co/datasets/allenai/ai2_arc/resolve/210d026faf9955653af8916fad021475a3f00453/ARC-Challenge/validation-00000-of-00001.parquet", "etag": null} |
Empty file added
0
data/downloads/ba208327ccb4a2f2b093cdd7eecee1c96cbbe3d92ec67e1be12bdbc972d4eea8.lock
Empty file.
Binary file added
BIN
+1.74 MB
data/downloads/d1b38d244e5da498143659669d640dab6fb81dfba94ede666ed9d3f3c3be694a
Binary file not shown.
1 change: 1 addition & 0 deletions
1
data/downloads/d1b38d244e5da498143659669d640dab6fb81dfba94ede666ed9d3f3c3be694a.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://storage.googleapis.com/ai2-mosaic/public/physicaliqa/physicaliqa-train-dev.zip", "etag": null} |
Empty file added
0
data/downloads/d1b38d244e5da498143659669d640dab6fb81dfba94ede666ed9d3f3c3be694a.lock
Empty file.
3,084 changes: 3,084 additions & 0 deletions
3,084
data/downloads/e22289a2fc01bf5d112b3c9c699b8105bcb4d573ca3d8470b7f0c416771f76e1
Large diffs are not rendered by default.
1 change: 1 addition & 0 deletions
1
data/downloads/e22289a2fc01bf5d112b3c9c699b8105bcb4d573ca3d8470b7f0c416771f76e1.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://yonatanbisk.com/piqa/data/tests.jsonl", "etag": null} |
Empty file added
0
data/downloads/e22289a2fc01bf5d112b3c9c699b8105bcb4d573ca3d8470b7f0c416771f76e1.lock
Empty file.
Binary file added
BIN
+3.24 MB
data/downloads/e60860809f7c35bc30394c47748cf246674b6314e8450f4c6a6cf9065ff0ab18
Binary file not shown.
1 change: 1 addition & 0 deletions
1
data/downloads/e60860809f7c35bc30394c47748cf246674b6314e8450f4c6a6cf9065ff0ab18.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://storage.googleapis.com/ai2-mosaic/public/winogrande/winogrande_1.1.zip", "etag": null} |
Empty file added
0
data/downloads/e60860809f7c35bc30394c47748cf246674b6314e8450f4c6a6cf9065ff0ab18.lock
Empty file.
Binary file added
BIN
+199 KB
data/downloads/ec545f4634b4c60d7eba3ff158bce61c6c016554c7fa834b5be8ed09a721b8c3
Binary file not shown.
1 change: 1 addition & 0 deletions
1
data/downloads/ec545f4634b4c60d7eba3ff158bce61c6c016554c7fa834b5be8ed09a721b8c3.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"url": "https://huggingface.co/datasets/allenai/ai2_arc/resolve/210d026faf9955653af8916fad021475a3f00453/ARC-Challenge/test-00000-of-00001.parquet", "etag": null} |
Empty file added
0
data/downloads/ec545f4634b4c60d7eba3ff158bce61c6c016554c7fa834b5be8ed09a721b8c3.lock
Empty file.
Empty file added
0
...downloads/extracted/671ce19bc58f9c835a0af62a4eb9912b85d7b2a346a834088ccefac6435cc018.lock
Empty file.
Binary file added
BIN
+333 Bytes
...f9c835a0af62a4eb9912b85d7b2a346a834088ccefac6435cc018/__MACOSX/winogrande_1.1/._README.md
Binary file not shown.
70 changes: 70 additions & 0 deletions
70
...8f9c835a0af62a4eb9912b85d7b2a346a834088ccefac6435cc018/winogrande_1.1/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
# WinoGrande | ||
|
||
Version 1.1 (Sep 16th, 2020) | ||
|
||
- - - | ||
|
||
## Data | ||
|
||
./data/ | ||
├── train_[xs,s,m,l,xl].jsonl # training sets of different sizes | ||
├── train_[xs,s,m,l,xl]-labels.lst # answer labels for training sets | ||
├── train_debiased.jsonl # debiased training set | ||
├── train_debiased-labels.lst # answer labels for debiased training set | ||
├── dev.jsonl # development set | ||
├── dev-labels.lst # answer labels for development set | ||
├── test.jsonl # test set | ||
├── sample-submissions-labels.lst # example submission file for leaderboard | ||
└── eval.py # evaluation script | ||
|
||
You can use `train_*.jsonl` for training models and `dev` for validation. | ||
Please note that labels are not included in `test.jsonl`. To evaluate your models on the `test` set, make a submission to our [leaderboard](https://winogrande.allenai.org). | ||
|
||
|
||
## Evaluation | ||
|
||
You can use `eval.py` for evaluation on the dev split, which yields `metrics.json`. | ||
|
||
e.g., python eval.py --preds_file ./YOUR_PREDICTIONS.lst --labels_file ./dev-labels.lst | ||
|
||
In the prediction file, each line consists of the predictions (1 or 2) from the 5 training sets (ordered `xs`, `s`, `m`, `l`, `xl` and separated by commas) for one evaluation set question. | ||
|
||
2,1,1,1,1 | ||
1,1,2,2,2 | ||
1,1,1,1,1 | ||
......... | ||
......... | ||
|
||
Namely, the first column holds the predictions of a model trained/finetuned on `train_xs.jsonl`, the second those of a model trained on `train_s.jsonl`, ... , and the last (fifth) column those of a model trained on `train_xl.jsonl`. | ||
Please check out the sample submission file (`sample-submission-labels.lst`) for reference. | ||
|
||
## Submission to Leaderboard | ||
|
||
You can submit your predictions on the `test` set to the [leaderboard](http://winogrande.allenai.org). | ||
The submission file must be named `predictions.lst`. The format is the same as above. | ||
|
||
|
||
## Reference | ||
If you use this dataset, please cite the following paper: | ||
|
||
@article{sakaguchi2019winogrande, | ||
title={WinoGrande: An Adversarial Winograd Schema Challenge at Scale}, | ||
author={Sakaguchi, Keisuke and Bras, Ronan Le and Bhagavatula, Chandra and Choi, Yejin}, | ||
journal={arXiv preprint arXiv:1907.10641}, | ||
year={2019} | ||
} | ||
|
||
|
||
## License | ||
|
||
The WinoGrande dataset is licensed under CC BY 2.0. | ||
|
||
|
||
## Questions? | ||
|
||
You may ask us questions at our [google group](https://groups.google.com/a/allenai.org/forum/#!forum/winogrande). | ||
|
||
|
||
## Contact | ||
|
||
Email: keisukes[at]allenai.org |
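The Evaluation section of the WinoGrande README above describes a five-column prediction file, one column per training-set size. The sketch below is not the bundled `eval.py`; it is a minimal illustration of that format, assuming the labels file holds one label (1 or 2) per line in the same question order. All function names here are hypothetical.

```python
SIZES = ("xs", "s", "m", "l", "xl")  # column order given in the README


def parse_predictions(lines):
    """Parse lines such as '2,1,1,1,1' into lists of five ints (each 1 or 2)."""
    rows = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        fields = line.split(",")
        if len(fields) != len(SIZES):
            raise ValueError(f"line {number}: expected {len(SIZES)} columns, got {len(fields)}")
        preds = [int(field) for field in fields]
        if any(pred not in (1, 2) for pred in preds):
            raise ValueError(f"line {number}: predictions must be 1 or 2")
        rows.append(preds)
    return rows


def accuracy_by_size(pred_rows, labels):
    """Return {size: accuracy} for each of the five prediction columns."""
    if not labels or len(pred_rows) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equally long")
    return {
        size: sum(row[col] == label for row, label in zip(pred_rows, labels)) / len(labels)
        for col, size in enumerate(SIZES)
    }


if __name__ == "__main__":
    # The three example rows from the README, scored against made-up labels.
    rows = parse_predictions(["2,1,1,1,1", "1,1,2,2,2", "1,1,1,1,1"])
    for size, acc in accuracy_by_size(rows, [1, 1, 1]).items():
        print(f"{size}: {acc:.3f}")
```

Validating the column count and the 1/2 value range up front catches a malformed file before it is scored or submitted.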