Regressions and bindings for CIRAL (castorini#2377)

bilet-13 · Feb 16, 2024 · commit 61433f6
1 parent baa57eb
Showing 29 changed files with 1,015 additions and 110 deletions.
diff --git a/README.md b/README.md
@@ -305,7 +305,8 @@ The "Corpus" above should be substituted into the full file name `beir-v1.0.0-${
 + Regressions for [CLEF 2006 Monolingual French](docs/regressions/regressions-clef06-fr.md)
 + Regressions for [TREC 2002 Monolingual Arabic](docs/regressions/regressions-trec02-ar.md)
 + Regressions for FIRE 2012 monolingual baselines: [Bengali](docs/regressions/regressions-fire12-bn.md), [Hindi](docs/regressions/regressions-fire12-hi.md), [English](docs/regressions/regressions-fire12-en.md)
-+ Regressions for CIRAL (v1.0) monolingual baselines on dev set: [Hausa](docs/regressions/regressions-ciral-v1.0-ha.md), [Somali](docs/regressions/regressions-ciral-v1.0-so.md), [Swahili](docs/regressions/regressions-ciral-v1.0-sw.md), [Yoruba](docs/regressions/regressions-ciral-v1.0-yo.md)
++ Regressions for CIRAL (v1.0) BM25 (query translation): [Hausa](docs/regressions/regressions-ciral-v1.0-ha.md), [Somali](docs/regressions/regressions-ciral-v1.0-so.md), [Swahili](docs/regressions/regressions-ciral-v1.0-sw.md), [Yoruba](docs/regressions/regressions-ciral-v1.0-yo.md)
++ Regressions for CIRAL (v1.0) BM25 (document translation): [Hausa](docs/regressions/regressions-ciral-v1.0-ha-en.md), [Somali](docs/regressions/regressions-ciral-v1.0-so-en.md), [Swahili](docs/regressions/regressions-ciral-v1.0-sw-en.md), [Yoruba](docs/regressions/regressions-ciral-v1.0-yo-en.md)
 
 </details>
 <details>

diff --git a/docs/regressions/regressions-ciral-v1.0-ha-en.md b/docs/regressions/regressions-ciral-v1.0-ha-en.md
@@ -0,0 +1,79 @@
+# Anserini Regressions: CIRAL (v1.0) &mdash; Hausa (English Translation)
+
+This page documents BM25 regression experiments for [CIRAL (v1.0) &mdash; Hausa](https://github.com/ciralproject/ciral) with document translations. To be clear, the queries are in English and the corpus is in English (translated with [NLLB 1.3B](https://huggingface.co/facebook/nllb-200-1.3B)).
+
+The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/ciral-v1.0-ha-en.yaml).
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/ciral-v1.0-ha-en.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ciral-v1.0-ha-en
+```
+
+## Indexing
+
+Typical indexing command:
+
+```
+target/appassembler/bin/IndexCollection \
+  -collection MrTyDiCollection \
+  -input /path/to/ciral-hausa-english \
+  -generator DefaultLuceneDocumentGenerator \
+  -index indexes/lucene-index.ciral-v1.0-ha-en/ \
+  -threads 16 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.ciral-hausa-english &
+```
+
+See [this page](https://github.com/ciralproject/ciral) for more details about the CIRAL corpus.
+For additional details, see explanation of [common indexing options](../../docs/common-indexing-options.md).
+
+## Retrieval
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.ciral-v1.0-ha-en/ \
+  -topics tools/topics-and-qrels/topics.ciral-v1.0-ha-test-a.tsv \
+  -topicReader TsvInt \
+  -output runs/run.ciral-hausa-english.bm25-default.topics.ciral-v1.0-ha-test-a.txt \
+  -bm25 -hits 1000 &
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.ciral-v1.0-ha-en/ \
+  -topics tools/topics-and-qrels/topics.ciral-v1.0-ha-test-a.tsv \
+  -topicReader TsvInt \
+  -output runs/run.ciral-hausa-english.bm25-default.topics.ciral-v1.0-ha-test-a.txt \
+  -bm25 -hits 1000 &
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.ciral-v1.0-ha-en/ \
+  -topics tools/topics-and-qrels/topics.ciral-v1.0-ha-test-b.tsv \
+  -topicReader TsvInt \
+  -output runs/run.ciral-hausa-english.bm25-default.topics.ciral-v1.0-ha-test-b.txt \
+  -bm25 -hits 1000 &
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+target/appassembler/bin/trec_eval -c -m ndcg_cut.20 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-a.tsv runs/run.ciral-hausa-english.bm25-default.topics.ciral-v1.0-ha-test-a.txt
+target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-a.tsv runs/run.ciral-hausa-english.bm25-default.topics.ciral-v1.0-ha-test-a.txt
+target/appassembler/bin/trec_eval -c -m ndcg_cut.20 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-a-pools.tsv runs/run.ciral-hausa-english.bm25-default.topics.ciral-v1.0-ha-test-a.txt
+target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-a-pools.tsv runs/run.ciral-hausa-english.bm25-default.topics.ciral-v1.0-ha-test-a.txt
+target/appassembler/bin/trec_eval -c -m ndcg_cut.20 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-b.tsv runs/run.ciral-hausa-english.bm25-default.topics.ciral-v1.0-ha-test-b.txt
+target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-b.tsv runs/run.ciral-hausa-english.bm25-default.topics.ciral-v1.0-ha-test-b.txt
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+| **nDCG@20**                                                                                                  | **BM25 (default)**|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| [CIRAL Hausa: Test Set A (Shallow Judgements)](https://huggingface.co/datasets/CIRAL/ciral)                  | 0.1619    |
+| [CIRAL Hausa: Test Set A (Pools)](https://huggingface.co/datasets/CIRAL/ciral)                               | 0.2142    |
+| [CIRAL Hausa: Test Set B](https://huggingface.co/datasets/CIRAL/ciral)                                       | 0.2124    |
+| **R@100**                                                                                                    | **BM25 (default)**|
+| [CIRAL Hausa: Test Set A (Shallow Judgements)](https://huggingface.co/datasets/CIRAL/ciral)                  | 0.4099    |
+| [CIRAL Hausa: Test Set A (Pools)](https://huggingface.co/datasets/CIRAL/ciral)                               | 0.4039    |
+| [CIRAL Hausa: Test Set B](https://huggingface.co/datasets/CIRAL/ciral)                                       | 0.4394    |
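
Reviewer note: the six `trec_eval` lines in the page above follow a regular pattern, since Test Set A is scored against two qrels files (shallow judgements and pools) while Test Set B has one. This is likely also why the Test Set A `SearchCollection` command is listed twice. The pattern can be sketched in Python as below. This is a minimal illustration only, not how `run_regression.py` builds its commands; the function name `eval_commands` and its parameters are hypothetical.

```python
# Sketch: rebuild the trec_eval command lines shown in the generated page.
# Paths and flags are copied from the page; eval_commands() is a hypothetical helper.

TREC_EVAL = "target/appassembler/bin/trec_eval"
QRELS_DIR = "tools/topics-and-qrels"

# (qrels suffix, topic-set suffix used in the run file name)
QRELS_TO_TOPICS = [
    ("test-a", "test-a"),        # Test Set A, shallow judgements
    ("test-a-pools", "test-a"),  # Test Set A, pools (same run file as above)
    ("test-b", "test-b"),        # Test Set B
]
METRICS = ["ndcg_cut.20", "recall.100"]


def eval_commands(lang, run_tag):
    """Return one trec_eval command per (qrels, metric) pair."""
    commands = []
    for qrels, topics in QRELS_TO_TOPICS:
        for metric in METRICS:
            qrels_path = f"{QRELS_DIR}/qrels.ciral-v1.0-{lang}-{qrels}.tsv"
            run_path = (
                f"runs/run.{run_tag}.bm25-default."
                f"topics.ciral-v1.0-{lang}-{topics}.txt"
            )
            commands.append(f"{TREC_EVAL} -c -m {metric} {qrels_path} {run_path}")
    return commands


if __name__ == "__main__":
    for command in eval_commands("ha", "ciral-hausa-english"):
        print(command)
```

Calling `eval_commands("so", "ciral-somali-english")` would give the corresponding Somali lines under the same assumptions.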
diff --git a/docs/regressions/regressions-ciral-v1.0-ha.md b/docs/regressions/regressions-ciral-v1.0-ha.md
@@ -1,6 +1,6 @@
 # Anserini Regressions: CIRAL (v1.0) &mdash; Hausa
 
-This page documents BM25 monolingual regression experiments on the dev set of [CIRAL (v1.0) &mdash; Hausa](https://github.com/ciralproject/ciral).
+This page documents BM25 regression experiments for [CIRAL (v1.0) &mdash; Hausa](https://github.com/ciralproject/ciral) with query translations. To be clear, the queries are in Hausa (human translations) and the corpus is in Hausa.
 
 The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/ciral-v1.0-ha.yaml).
 Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/ciral-v1.0-ha.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
@@ -35,18 +35,33 @@ After indexing has completed, you should be able to perform retrieval as follows
 ```
 target/appassembler/bin/SearchCollection \
   -index indexes/lucene-index.ciral-v1.0-ha/ \
-  -topics tools/topics-and-qrels/topics.ciral-v1.0-ha-dev-native.tsv \
+  -topics tools/topics-and-qrels/topics.ciral-v1.0-ha-test-a-native.tsv \
   -topicReader TsvInt \
-  -output runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-dev-native.txt \
+  -output runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-test-a-native.txt \
+  -bm25 -hits 1000 -language ha &
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.ciral-v1.0-ha/ \
+  -topics tools/topics-and-qrels/topics.ciral-v1.0-ha-test-a-native.tsv \
+  -topicReader TsvInt \
+  -output runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-test-a-native.txt \
+  -bm25 -hits 1000 -language ha &
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.ciral-v1.0-ha/ \
+  -topics tools/topics-and-qrels/topics.ciral-v1.0-ha-test-b-native.tsv \
+  -topicReader TsvInt \
+  -output runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-test-b-native.txt \
   -bm25 -hits 1000 -language ha &
 ```
 
 Evaluation can be performed using `trec_eval`:
 
 ```
-target/appassembler/bin/trec_eval -c -m ndcg_cut.20 tools/topics-and-qrels/qrels.ciral-v1.0-ha-dev.tsv runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-dev-native.txt
-target/appassembler/bin/trec_eval -c -M 10 -m recip_rank tools/topics-and-qrels/qrels.ciral-v1.0-ha-dev.tsv runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-dev-native.txt
-target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.ciral-v1.0-ha-dev.tsv runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-dev-native.txt
+target/appassembler/bin/trec_eval -c -m ndcg_cut.20 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-a.tsv runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-test-a-native.txt
+target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-a.tsv runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-test-a-native.txt
+target/appassembler/bin/trec_eval -c -m ndcg_cut.20 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-a-pools.tsv runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-test-a-native.txt
+target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-a-pools.tsv runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-test-a-native.txt
+target/appassembler/bin/trec_eval -c -m ndcg_cut.20 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-b.tsv runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-test-b-native.txt
+target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.ciral-v1.0-ha-test-b.tsv runs/run.ciral-hausa.bm25-default.topics.ciral-v1.0-ha-test-b-native.txt
 ```
 
 ## Effectiveness
@@ -55,8 +70,10 @@ With the above commands, you should be able to reproduce the following results:
 
 | **nDCG@20**                                                                                                  | **BM25 (default)**|
 |:-------------------------------------------------------------------------------------------------------------|-----------|
-| [CIRAL Hausa: Dev](https://huggingface.co/datasets/CIRAL/ciral)                                              | 0.2039    |
-| **MRR@10**                                                                                                   | **BM25 (default)**|
-| [CIRAL Hausa: Dev](https://huggingface.co/datasets/CIRAL/ciral)                                              | 0.3153    |
+| [CIRAL Hausa: Test Set A (Shallow Judgements)](https://huggingface.co/datasets/CIRAL/ciral)                  | 0.1656    |
+| [CIRAL Hausa: Test Set A (Pools)](https://huggingface.co/datasets/CIRAL/ciral)                               | 0.1161    |
+| [CIRAL Hausa: Test Set B](https://huggingface.co/datasets/CIRAL/ciral)                                       | 0.2121    |
 | **R@100**                                                                                                    | **BM25 (default)**|
-| [CIRAL Hausa: Dev](https://huggingface.co/datasets/CIRAL/ciral)                                              | 0.2760    |
+| [CIRAL Hausa: Test Set A (Shallow Judgements)](https://huggingface.co/datasets/CIRAL/ciral)                  | 0.2874    |
+| [CIRAL Hausa: Test Set A (Pools)](https://huggingface.co/datasets/CIRAL/ciral)                               | 0.1916    |
+| [CIRAL Hausa: Test Set B](https://huggingface.co/datasets/CIRAL/ciral)                                       | 0.3800    |
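
Reviewer note: with both Hausa tables now in the diff, the document-translation and query-translation conditions can be compared directly. The sketch below only subtracts the scores reported in the two pages; the dictionary layout and the name `deltas` are assumptions made for illustration and are not part of Anserini.

```python
# Sketch: difference between document translation (ha-en) and
# query translation (ha) BM25 scores, using the values in the two tables.

SETS = ["Test Set A (Shallow Judgements)", "Test Set A (Pools)", "Test Set B"]

DOC_TRANSLATION = {  # regressions-ciral-v1.0-ha-en.md
    "nDCG@20": [0.1619, 0.2142, 0.2124],
    "R@100": [0.4099, 0.4039, 0.4394],
}
QUERY_TRANSLATION = {  # regressions-ciral-v1.0-ha.md
    "nDCG@20": [0.1656, 0.1161, 0.2121],
    "R@100": [0.2874, 0.1916, 0.3800],
}


def deltas(metric):
    """Return document-translation minus query-translation, per test set."""
    pairs = zip(DOC_TRANSLATION[metric], QUERY_TRANSLATION[metric])
    return [round(doc - query, 4) for doc, query in pairs]


if __name__ == "__main__":
    for metric in DOC_TRANSLATION:
        for name, delta in zip(SETS, deltas(metric)):
            print(f"{metric:8s} {name:34s} {delta:+.4f}")
```

On these figures, document translation scores higher on R@100 for all three sets and on nDCG@20 for two of the three.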
diff --git a/docs/regressions/regressions-ciral-v1.0-so-en.md b/docs/regressions/regressions-ciral-v1.0-so-en.md
@@ -0,0 +1,79 @@
+# Anserini Regressions: CIRAL (v1.0) &mdash; Somali (English Translation)
+
+This page documents BM25 regression experiments for [CIRAL (v1.0) &mdash; Somali](https://github.com/ciralproject/ciral) with document translations. To be clear, the queries are in English and the corpus is in English (translated with [NLLB 1.3B](https://huggingface.co/facebook/nllb-200-1.3B)).
+
+The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/ciral-v1.0-so-en.yaml).
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/ciral-v1.0-so-en.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression ciral-v1.0-so-en
+```
+
+## Indexing
+
+Typical indexing command:
+
+```
+target/appassembler/bin/IndexCollection \
+  -collection MrTyDiCollection \
+  -input /path/to/ciral-somali-english \
+  -generator DefaultLuceneDocumentGenerator \
+  -index indexes/lucene-index.ciral-v1.0-so-en/ \
+  -threads 16 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.ciral-somali-english &
+```
+
+See [this page](https://github.com/ciralproject/ciral) for more details about the CIRAL corpus.
+For additional details, see explanation of [common indexing options](../../docs/common-indexing-options.md).
+
+## Retrieval
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.ciral-v1.0-so-en/ \
+  -topics tools/topics-and-qrels/topics.ciral-v1.0-so-test-a.tsv \
+  -topicReader TsvInt \
+  -output runs/run.ciral-somali-english.bm25-default.topics.ciral-v1.0-so-test-a.txt \
+  -bm25 -hits 1000 &
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.ciral-v1.0-so-en/ \
+  -topics tools/topics-and-qrels/topics.ciral-v1.0-so-test-a.tsv \
+  -topicReader TsvInt \
+  -output runs/run.ciral-somali-english.bm25-default.topics.ciral-v1.0-so-test-a.txt \
+  -bm25 -hits 1000 &
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.ciral-v1.0-so-en/ \
+  -topics tools/topics-and-qrels/topics.ciral-v1.0-so-test-b.tsv \
+  -topicReader TsvInt \
+  -output runs/run.ciral-somali-english.bm25-default.topics.ciral-v1.0-so-test-b.txt \
+  -bm25 -hits 1000 &
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+target/appassembler/bin/trec_eval -c -m ndcg_cut.20 tools/topics-and-qrels/qrels.ciral-v1.0-so-test-a.tsv runs/run.ciral-somali-english.bm25-default.topics.ciral-v1.0-so-test-a.txt
+target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.ciral-v1.0-so-test-a.tsv runs/run.ciral-somali-english.bm25-default.topics.ciral-v1.0-so-test-a.txt
+target/appassembler/bin/trec_eval -c -m ndcg_cut.20 tools/topics-and-qrels/qrels.ciral-v1.0-so-test-a-pools.tsv runs/run.ciral-somali-english.bm25-default.topics.ciral-v1.0-so-test-a.txt
+target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.ciral-v1.0-so-test-a-pools.tsv runs/run.ciral-somali-english.bm25-default.topics.ciral-v1.0-so-test-a.txt
+target/appassembler/bin/trec_eval -c -m ndcg_cut.20 tools/topics-and-qrels/qrels.ciral-v1.0-so-test-b.tsv runs/run.ciral-somali-english.bm25-default.topics.ciral-v1.0-so-test-b.txt
+target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.ciral-v1.0-so-test-b.tsv runs/run.ciral-somali-english.bm25-default.topics.ciral-v1.0-so-test-b.txt
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+| **nDCG@20**                                                                                                  | **BM25 (default)**|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| [CIRAL Somali: Test Set A (Shallow Judgements)](https://huggingface.co/datasets/CIRAL/ciral)                 | 0.1590    |
+| [CIRAL Somali: Test Set A (Pools)](https://huggingface.co/datasets/CIRAL/ciral)                              | 0.2461    |
+| [CIRAL Somali: Test Set B](https://huggingface.co/datasets/CIRAL/ciral)                                      | 0.2186    |
+| **R@100**                                                                                                    | **BM25 (default)**|
+| [CIRAL Somali: Test Set A (Shallow Judgements)](https://huggingface.co/datasets/CIRAL/ciral)                 | 0.3904    |
+| [CIRAL Somali: Test Set A (Pools)](https://huggingface.co/datasets/CIRAL/ciral)                              | 0.4379    |
+| [CIRAL Somali: Test Set B](https://huggingface.co/datasets/CIRAL/ciral)                                      | 0.4637    |