Add docs for SPLADE++ ED w/ ONNX (castorini#2369)

+ Tweaked scores for ONNX (previously, the scores were simply copied from the pre-encoded version).
+ Completed template docs and links from main README.
ru5h16h · Feb 14, 2024 · 8d4a7f1
1 parent 84788d5
commit 8d4a7f1
Showing 150 changed files with 4,803 additions and 240 deletions.
diff --git a/README.md b/README.md
diff --git a/docs/regressions.md b/docs/regressions.md
@@ -55,6 +55,8 @@ nohup python src/main/python/run_regression.py --index --verify --search --regre
 nohup python src/main/python/run_regression.py --index --verify --search --regression msmarco-passage-cos-dpr-distil-lexlsh >& logs/log.msmarco-passage-cos-dpr-distil-lexlsh &
 nohup python src/main/python/run_regression.py --index --verify --search --regression msmarco-passage-bge-base-en-v1.5-hnsw >& logs/log.msmarco-passage-bge-base-en-v1.5-hnsw &
 nohup python src/main/python/run_regression.py --index --verify --search --regression msmarco-passage-bge-base-en-v1.5-hnsw-int8 >& logs/log.msmarco-passage-bge-base-en-v1.5-hnsw-int8 &
+nohup python src/main/python/run_regression.py --index --verify --search --regression msmarco-passage-cohere-embed-english-v3-hnsw >& logs/log.msmarco-passage-cohere-embed-english-v3-hnsw &
+nohup python src/main/python/run_regression.py --index --verify --search --regression msmarco-passage-cohere-embed-english-v3-hnsw-int8 >& logs/log.msmarco-passage-cohere-embed-english-v3-hnsw-int8 &
 nohup python src/main/python/run_regression.py --index --verify --search --regression msmarco-passage-openai-ada2 >& logs/log.msmarco-passage-openai-ada2 &
 
 nohup python src/main/python/run_regression.py --index --verify --search --regression msmarco-passage-splade-pp-ed-onnx >& logs/log.msmarco-passage-splade-pp-ed-onnx &
@@ -244,6 +246,7 @@ nohup python src/main/python/run_regression.py --index --verify --search --regre
 <summary>BEIR (v1.0.0): SPLADE++ CoCondenser-EnsembleDistil</summary>
 
 ```bash
+# Pre-encoded queries
 nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid-splade-pp-ed >& logs/log.beir-v1.0.0-trec-covid-splade-pp-ed &
 nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq-splade-pp-ed >& logs/log.beir-v1.0.0-bioasq-splade-pp-ed &
 nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus-splade-pp-ed >& logs/log.beir-v1.0.0-nfcorpus-splade-pp-ed &
@@ -273,6 +276,37 @@ nohup python src/main/python/run_regression.py --index --verify --search --regre
 nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever-splade-pp-ed >& logs/log.beir-v1.0.0-fever-splade-pp-ed &
 nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever-splade-pp-ed >& logs/log.beir-v1.0.0-climate-fever-splade-pp-ed &
 nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact-splade-pp-ed >& logs/log.beir-v1.0.0-scifact-splade-pp-ed &
+
+# ONNX
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-covid-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-trec-covid-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-bioasq-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-bioasq-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nfcorpus-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-nfcorpus-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-nq-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-nq-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-hotpotqa-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-hotpotqa-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fiqa-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-fiqa-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-signal1m-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-signal1m-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-trec-news-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-trec-news-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-robust04-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-robust04-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-arguana-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-webis-touche2020-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-webis-touche2020-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-android-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-android-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-english-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-english-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gaming-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-gaming-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-gis-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-gis-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-mathematica-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-mathematica-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-physics-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-physics-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-programmers-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-programmers-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-stats-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-stats-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-tex-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-tex-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-unix-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-unix-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-webmasters-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-webmasters-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-cqadupstack-wordpress-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-cqadupstack-wordpress-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-quora-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-quora-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-dbpedia-entity-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-dbpedia-entity-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scidocs-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-scidocs-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-fever-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-fever-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-climate-fever-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-climate-fever-splade-pp-ed-onnx &
+nohup python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-scifact-splade-pp-ed-onnx >& logs/log.beir-v1.0.0-scifact-splade-pp-ed-onnx &
 ```
 
 </details>
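
The ONNX block added in the hunk above follows one naming pattern per BEIR dataset, so the same commands can be generated rather than maintained by hand. The following is a minimal sketch, not part of the commit: the helper `regression_command` and the `BEIR_DATASETS` list are hypothetical names introduced here, with the dataset identifiers copied from the commands in the diff.

```python
# Hypothetical helper (not in the Anserini repo): rebuild the ONNX regression
# commands shown in the diff from the list of BEIR dataset identifiers.
BEIR_DATASETS = [
    "trec-covid", "bioasq", "nfcorpus", "nq", "hotpotqa", "fiqa", "signal1m",
    "trec-news", "robust04", "arguana", "webis-touche2020",
    "cqadupstack-android", "cqadupstack-english", "cqadupstack-gaming",
    "cqadupstack-gis", "cqadupstack-mathematica", "cqadupstack-physics",
    "cqadupstack-programmers", "cqadupstack-stats", "cqadupstack-tex",
    "cqadupstack-unix", "cqadupstack-webmasters", "cqadupstack-wordpress",
    "quora", "dbpedia-entity", "scidocs", "fever", "climate-fever", "scifact",
]


def regression_command(dataset, suffix="splade-pp-ed-onnx"):
    """Return the shell command line for one BEIR regression."""
    name = f"beir-v1.0.0-{dataset}-{suffix}"
    return (
        "nohup python src/main/python/run_regression.py "
        f"--index --verify --search --regression {name} "
        f">& logs/log.{name} &"
    )


# One command per dataset, in the same order as the diff.
commands = [regression_command(d) for d in BEIR_DATASETS]

if __name__ == "__main__":
    print("\n".join(commands))
```

Passing `suffix="splade-pp-ed"` yields the pre-encoded variants instead. Each command ends in `&`, so pasting the whole output into a shell launches all of the regressions in the background at once; whether a given machine can handle that load is left to the reader.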

diff --git a/docs/regressions/regressions-beir-v1.0.0-arguana-splade-pp-ed-onnx.md b/docs/regressions/regressions-beir-v1.0.0-arguana-splade-pp-ed-onnx.md
@@ -0,0 +1,82 @@
+# Anserini Regressions: BEIR (v1.0.0) &mdash; ArguAna
+
+**Model**: [SPLADE++ (CoCondenser-EnsembleDistil)](https://arxiv.org/abs/2205.04733) (using ONNX for on-the-fly query encoding)
+
+This page describes regression experiments, integrated into Anserini's regression testing framework, using [SPLADE++ (CoCondenser-EnsembleDistil)](https://arxiv.org/abs/2205.04733) on [BEIR (v1.0.0) &mdash; ArguAna](http://beir.ai/).
+The model itself can be downloaded [here](https://huggingface.co/naver/splade-cocondenser-ensembledistil).
+See the [official SPLADE repo](https://github.com/naver/splade) and the following paper for more details:
+
+> Thibault Formal, Carlos Lassance, Benjamin Piwowarski, and Stéphane Clinchant. [From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective.](https://dl.acm.org/doi/10.1145/3477495.3531857) _Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval_, pages 2353–2359.
+
+In these experiments, we are using ONNX to perform query encoding on the fly.
+
+The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana-splade-pp-ed-onnx.yaml).
+Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana-splade-pp-ed-onnx.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.
+
+From one of our Waterloo servers (e.g., `orca`), the following command will perform the complete regression, end to end:
+
+```
+python src/main/python/run_regression.py --index --verify --search --regression beir-v1.0.0-arguana-splade-pp-ed-onnx
+```
+
+All the BEIR corpora, encoded by the SPLADE++ CoCondenser-EnsembleDistil model, are available for download:
+
+```bash
+wget https://rgw.cs.uwaterloo.ca/pyserini/data/beir-v1.0.0-splade-pp-ed.tar -P collections/
+tar xvf collections/beir-v1.0.0-splade-pp-ed.tar -C collections/
+```
+
+The tarball is 42 GB and has MD5 checksum `9c7de5b444a788c9e74c340bf833173b`.
+After downloading and unpacking the corpora, the `run_regression.py` command above should work without any issue.
+
+## Indexing
+
+Sample indexing command:
+
+```
+target/appassembler/bin/IndexCollection \
+  -collection JsonVectorCollection \
+  -input /path/to/beir-v1.0.0-arguana-splade-pp-ed \
+  -generator DefaultLuceneDocumentGenerator \
+  -index indexes/lucene-index.beir-v1.0.0-arguana-splade-pp-ed/ \
+  -threads 16 -impact -pretokenized \
+  >& logs/log.beir-v1.0.0-arguana-splade-pp-ed &
+```
+
+The important indexing options to note here are `-impact -pretokenized`: the first tells Anserini not to encode BM25 doclengths into Lucene's norms (encoding them is the default behavior), and the second tells it not to apply any additional tokenization to the pre-encoded tokens.
+For additional details, see the explanation of [common indexing options](../../docs/common-indexing-options.md).
+
+## Retrieval
+
+Topics and qrels are stored [here](https://github.com/castorini/anserini-tools/tree/master/topics-and-qrels), which is linked to the Anserini repo as a submodule.
+
+After indexing has completed, you should be able to perform retrieval as follows:
+
+```
+target/appassembler/bin/SearchCollection \
+  -index indexes/lucene-index.beir-v1.0.0-arguana-splade-pp-ed/ \
+  -topics tools/topics-and-qrels/topics.beir-v1.0.0-arguana.test.tsv.gz \
+  -topicReader TsvString \
+  -output runs/run.beir-v1.0.0-arguana-splade-pp-ed.splade-pp-ed.topics.beir-v1.0.0-arguana.test.txt \
+  -impact -pretokenized -removeQuery -hits 1000 -encoder SpladePlusPlusEnsembleDistil &
+```
+
+Evaluation can be performed using `trec_eval`:
+
+```
+target/appassembler/bin/trec_eval -c -m ndcg_cut.10 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana-splade-pp-ed.splade-pp-ed.topics.beir-v1.0.0-arguana.test.txt
+target/appassembler/bin/trec_eval -c -m recall.100 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana-splade-pp-ed.splade-pp-ed.topics.beir-v1.0.0-arguana.test.txt
+target/appassembler/bin/trec_eval -c -m recall.1000 tools/topics-and-qrels/qrels.beir-v1.0.0-arguana.test.txt runs/run.beir-v1.0.0-arguana-splade-pp-ed.splade-pp-ed.topics.beir-v1.0.0-arguana.test.txt
+```
+
+## Effectiveness
+
+With the above commands, you should be able to reproduce the following results:
+
+| **nDCG@10**                                                                                                  | **SPLADE++ (CoCondenser-EnsembleDistil)**|
+|:-------------------------------------------------------------------------------------------------------------|-----------|
+| BEIR (v1.0.0): ArguAna                                                                                       | 0.5218    |
+| **R@100**                                                                                                    | **SPLADE++ (CoCondenser-EnsembleDistil)**|
+| BEIR (v1.0.0): ArguAna                                                                                       | 0.9758    |
+| **R@1000**                                                                                                   | **SPLADE++ (CoCondenser-EnsembleDistil)**|
+| BEIR (v1.0.0): ArguAna                                                                                       | 0.9950    |
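
The new page above states an MD5 checksum for the corpus tarball but does not show how to check it. The following is a minimal sketch using Python's standard `hashlib`, not something the commit provides: the helper name `md5_of_file` and the local path in the usage comment are assumptions, while the expected digest is the one quoted in the page.

```python
import hashlib

# Checksum quoted in the regression page for beir-v1.0.0-splade-pp-ed.tar.
EXPECTED_MD5 = "9c7de5b444a788c9e74c340bf833173b"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks.

    Chunked reads keep memory use flat, which matters for a 42 GB tarball.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (assumes the tarball was downloaded to collections/ as in the page):
#   actual = md5_of_file("collections/beir-v1.0.0-splade-pp-ed.tar")
#   if actual != EXPECTED_MD5:
#       raise SystemExit(f"Checksum mismatch: got {actual}")
```

Running the check before `tar xvf` avoids unpacking a truncated or corrupted download.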
diff --git a/docs/regressions/regressions-beir-v1.0.0-arguana-splade-pp-ed.md b/docs/regressions/regressions-beir-v1.0.0-arguana-splade-pp-ed.md
@@ -1,9 +1,14 @@
 # Anserini Regressions: BEIR (v1.0.0) &mdash; ArguAna
 
-**Model**: [SPLADE++ (CoCondenser-EnsembleDistil)](https://arxiv.org/abs/2205.04733)
+**Model**: [SPLADE++ (CoCondenser-EnsembleDistil)](https://arxiv.org/abs/2205.04733) (using pre-encoded queries)
 
 This page describes regression experiments, integrated into Anserini's regression testing framework, using [SPLADE++ (CoCondenser-EnsembleDistil)](https://arxiv.org/abs/2205.04733) on [BEIR (v1.0.0) &mdash; ArguAna](http://beir.ai/).
-See the [official SPLADE repo](https://github.com/naver/splade) for more details; the model itself can be download [here](https://huggingface.co/naver/splade-cocondenser-ensembledistil).
+The model itself can be downloaded [here](https://huggingface.co/naver/splade-cocondenser-ensembledistil).
+See the [official SPLADE repo](https://github.com/naver/splade) and the following paper for more details:
+
+> Thibault Formal, Carlos Lassance, Benjamin Piwowarski, and Stéphane Clinchant. [From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective.](https://dl.acm.org/doi/10.1145/3477495.3531857) _Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval_, pages 2353–2359.
+
+In these experiments, we are using pre-encoded queries (i.e., cached results of query encoding).
 
 The exact configurations for these regressions are stored in [this YAML file](../../src/main/resources/regression/beir-v1.0.0-arguana-splade-pp-ed.yaml).
 Note that this page is automatically generated from [this template](../../src/main/resources/docgen/templates/beir-v1.0.0-arguana-splade-pp-ed.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead and then run `bin/build.sh` to rebuild the documentation.