Update experiment docs to use trec_eval 9.0.8 (castorini#2332)
jasper-xian authored Jan 11, 2024
1 parent 1356944 commit 3bd6e71
Showing 9 changed files with 66 additions and 66 deletions.
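The change is mechanical: every occurrence of the old `tools/eval/trec_eval.9.0.4/trec_eval` path is replaced with the `target/appassembler/bin/trec_eval` binary produced by the build. A sketch of that substitution, assuming bash (the `sed` invocation in the comment is an illustration, not part of this commit):

```shell
#!/usr/bin/env bash
# Sketch of the path substitution this commit applies across the docs.
old='tools/eval/trec_eval.9.0.4/trec_eval'
new='target/appassembler/bin/trec_eval'

# Across the docs tree this could be done with something like:
#   sed -i.bak "s|$old|$new|g" docs/*.md
# (| as the s-command delimiter, since both paths contain slashes).
# Here we rewrite one sample line with bash parameter expansion instead:
line="\$ $old -m map -m P.30 qrels.robust04.txt run.robust04.txt"
updated="${line//$old/$new}"

echo "$updated"
# -> $ target/appassembler/bin/trec_eval -m map -m P.30 qrels.robust04.txt run.robust04.txt
```

The same one-line substitution accounts for all 66 added and 66 deleted lines in the diff below.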
8 changes: 4 additions & 4 deletions docs/elastirini.md
@@ -77,7 +77,7 @@ sh target/appassembler/bin/SearchElastic \
To evaluate effectiveness:

```bash
-$ tools/eval/trec_eval.9.0.4/trec_eval -m map -m P.30 \
+$ target/appassembler/bin/trec_eval -m map -m P.30 \
tools/topics-and-qrels/qrels.robust04.txt \
runs/run.es.robust04.bm25.topics.robust04.txt

@@ -123,7 +123,7 @@ sh target/appassembler/bin/SearchElastic \
Evaluation:

```bash
-$ tools/eval/trec_eval.9.0.4/trec_eval -m map -m P.30 \
+$ target/appassembler/bin/trec_eval -m map -m P.30 \
tools/topics-and-qrels/qrels.core18.txt \
runs/run.es.core18.bm25.topics.core18.txt

@@ -169,7 +169,7 @@ sh target/appassembler/bin/SearchElastic \
Evaluation:

```bash
-$ tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 -m map \
+$ target/appassembler/bin/trec_eval -c -m recall.1000 -m map \
tools/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt \
runs/run.es.msmacro-passage.txt

@@ -217,7 +217,7 @@ This can take potentially longer than `SearchCollection` with Lucene indexes.
Evaluation:

```bash
-$ tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 -m map \
+$ target/appassembler/bin/trec_eval -c -m recall.1000 -m map \
tools/topics-and-qrels/qrels.msmarco-doc.dev.txt \
runs/run.es.msmarco-doc.txt

60 changes: 30 additions & 30 deletions docs/experiments-covid.md
@@ -451,8 +451,8 @@ target/appassembler/bin/SearchCollection -index indexes/lucene-index-cord19-abst
-output runs/anserini.covid-r2.abstract.qdel.bm25.txt -runtag anserini.covid-r2.abstract.qdel.bm25.txt \
-removedups -bm25 -hits 10000
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.abstract.qq.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.abstract.qdel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.abstract.qq.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.abstract.qdel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/anserini.covid-r2.abstract.qq.bm25.txt
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/anserini.covid-r2.abstract.qdel.bm25.txt
@@ -471,8 +471,8 @@ target/appassembler/bin/SearchCollection -index indexes/lucene-index-cord19-full
-output runs/anserini.covid-r2.full-text.qdel.bm25.txt -runtag anserini.covid-r2.full-text.qdel.bm25.txt \
-removedups -bm25 -hits 10000
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.full-text.qq.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.full-text.qdel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.full-text.qq.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.full-text.qdel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/anserini.covid-r2.full-text.qq.bm25.txt
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/anserini.covid-r2.full-text.qdel.bm25.txt
@@ -491,8 +491,8 @@ target/appassembler/bin/SearchCollection -index indexes/lucene-index-cord19-para
-output runs/anserini.covid-r2.paragraph.qdel.bm25.txt -runtag anserini.covid-r2.paragraph.qdel.bm25.txt \
-selectMaxPassage -bm25 -hits 10000
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.paragraph.qq.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.paragraph.qdel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.paragraph.qq.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.paragraph.qdel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/anserini.covid-r2.paragraph.qq.bm25.txt
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/anserini.covid-r2.paragraph.qdel.bm25.txt
@@ -511,8 +511,8 @@ python src/main/python/fusion.py --method RRF --out runs/anserini.covid-r2.fusio
And to evaluate the fusion runs:

```bash
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.fusion1.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.fusion2.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.fusion1.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/anserini.covid-r2.fusion2.txt | egrep '(ndcg_cut_10 |recall_1000 )'

python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/anserini.covid-r2.fusion1.txt
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/anserini.covid-r2.fusion2.txt
@@ -531,8 +531,8 @@ python tools/scripts/filter_run_with_qrels.py --discard --qrels tools/topics-and
Evaluating runs with round 2 judgments:

```bash
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round2.txt runs/anserini.r2.fusion1.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round2.txt runs/anserini.r2.fusion2.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round2.txt runs/anserini.r2.fusion1.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round2.txt runs/anserini.r2.fusion2.txt | egrep '(ndcg_cut_10 |recall_1000 )'

python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round2.txt --cutoffs 10 --run runs/anserini.r2.fusion1.txt
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round2.txt --cutoffs 10 --run runs/anserini.r2.fusion2.txt
@@ -577,12 +577,12 @@ target/appassembler/bin/SearchCollection -index indexes/lucene-index-covid-2020-
Here are the commands to evaluate results on the abstract index:

```bash
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.query.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.query+question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.query+question+narrative.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.query-udel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.query-covid19.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.query.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.query+question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.query+question+narrative.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.query-udel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.abstract.query-covid19.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'

python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/run.covid-r1.abstract.query.bm25.txt
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/run.covid-r1.abstract.question.bm25.txt
@@ -623,12 +623,12 @@ target/appassembler/bin/SearchCollection -index indexes/lucene-index-covid-full-
Here are the commands to evaluate results on the full-text index:

```bash
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.query.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.query+question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.query+question+narrative.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.query-udel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.query-covid19.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.query.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.query+question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.query+question+narrative.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.query-udel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.full-text.query-covid19.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'

python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/run.covid-r1.full-text.query.bm25.txt
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/run.covid-r1.full-text.question.bm25.txt
@@ -669,12 +669,12 @@ target/appassembler/bin/SearchCollection -index indexes/lucene-index-covid-parag
Here are the commands to evaluate results on the paragraph index:

```bash
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.query.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.query+question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.query+question+narrative.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.query-udel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.query-covid19.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.query.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.query+question.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.query+question+narrative.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.query-udel.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.paragraph.query-covid19.bm25.txt | egrep '(ndcg_cut_10 |recall_1000 )'

python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/run.covid-r1.paragraph.query.bm25.txt
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/run.covid-r1.paragraph.question.bm25.txt
@@ -697,8 +697,8 @@ python src/main/python/fusion.py --method RRF --out runs/run.covid-r1.fusion2.tx
And to evaluate the fusion runs:

```bash
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.fusion1.txt | egrep '(ndcg_cut_10 |recall_1000 )'
-tools/eval/trec_eval.9.0.4/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.fusion2.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.fusion1.txt | egrep '(ndcg_cut_10 |recall_1000 )'
+target/appassembler/bin/trec_eval -c -M1000 -m all_trec tools/topics-and-qrels/qrels.covid-round1.txt runs/run.covid-r1.fusion2.txt | egrep '(ndcg_cut_10 |recall_1000 )'

python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/run.covid-r1.fusion1.txt
python tools/eval/measure_judged.py --qrels tools/topics-and-qrels/qrels.covid-round1.txt --cutoffs 10 --run runs/run.covid-r1.fusion2.txt
2 changes: 1 addition & 1 deletion docs/experiments-doc2query.md
@@ -180,7 +180,7 @@ sh target/appassembler/bin/SearchCollection -topicReader Car \
Evaluation is performed with `trec_eval`:

```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank \
+target/appassembler/bin/trec_eval -c -m map -c -m recip_rank \
tools/topics-and-qrels/qrels.car17v2.0.benchmarkY1test.txt \
runs/run.car17v2.0.bm25.expanded-topk10.txt
```
2 changes: 1 addition & 1 deletion docs/experiments-fever.md
@@ -83,7 +83,7 @@ Note that by default, the above uses the BM25 algorithm with parameters `k1=0.9`
Finally, we can evaluate the retrieved documents using the official TREC evaluation tool, `trec_eval`.

```bash
-tools/eval/trec_eval.9.0.4/trec_eval -c -m recall \
+target/appassembler/bin/trec_eval -c -m recall \
collections/fever/qrels.paragraph.dev.txt runs/run.fever-paragraph.dev.txt
```

8 changes: 4 additions & 4 deletions docs/experiments-msmarco-doc.md
@@ -63,7 +63,7 @@ Adjust the parallelism by changing the `-parallelism` argument.
After the run completes, we can evaluate with `trec_eval`:

```bash
-$ tools/eval/trec_eval.9.0.4/trec_eval -c -mmap -mrecall.1000 \
+$ target/appassembler/bin/trec_eval -c -mmap -mrecall.1000 \
tools/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc.dev.bm25.txt
map all 0.2309
recall_1000 all 0.8856
@@ -81,11 +81,11 @@ Then, run `trec_eval` to compare.
Note that to be fair, we restrict evaluation to top 100 hits per topic (which is what Microsoft provides):

```bash
-$ tools/eval/trec_eval.9.0.4/trec_eval -c -mmap -M 100 \
+$ target/appassembler/bin/trec_eval -c -mmap -M 100 \
tools/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/msmarco-docdev-top100
map all 0.2219

-$ tools/eval/trec_eval.9.0.4/trec_eval -c -mmap -M 100 \
+$ target/appassembler/bin/trec_eval -c -mmap -M 100 \
tools/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc.dev.bm25.txt
map all 0.2302
```
@@ -186,7 +186,7 @@ $ target/appassembler/bin/SearchCollection \
-parallelism 4 \
-bm25 -bm25.k1 3.8 -bm25.b 0.87 -hits 1000

-$ tools/eval/trec_eval.9.0.4/trec_eval -c -mmap -mrecall.1000 \
+$ target/appassembler/bin/trec_eval -c -mmap -mrecall.1000 \
tools/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc.dev.opt-mrr.txt
map all 0.2789
recall_1000 all 0.9326
(Diffs for the remaining 4 changed files are not shown.)
