CAR regression refactoring: v1.5 and v2.0 comparison (castorini#642)
+ Both v1.5 and v2.0 use benchmarkY1-test
+ Made naming/docs consistent
+ Removed previous test200 for v1.5
lintool authored May 11, 2019
1 parent 2ba2b95 commit 3eef2fb
Showing 12 changed files with 8,000 additions and 6,625 deletions.
35 changes: 17 additions & 18 deletions docs/experiments-car17v1.5.md
@@ -17,43 +17,42 @@ For additional details, see explanation of [common indexing options](common-inde

## Retrieval

Topics and qrels are stored in `src/main/resources/topics-and-qrels/`, downloaded from NIST:

+ `topics.car17v1.5.test200.txt`: [Topics for the test200 subset (TREC 2017 Complex Answer Retrieval Track)](http://trec-car.cs.unh.edu/datareleases/v1.5/test200-v1.5.tar.xz)
+ `qrels.car17v1.5.test200.txt`: [adhoc qrels (TREC 2017 Complex Answer Retrieval Track)](http://trec-car.cs.unh.edu/datareleases/v1.5/test200-v1.5.tar.xz)
The "benchmarkY1-test" topics and qrels (v1.5) are stored in `src/main/resources/topics-and-qrels/`, downloaded from [the CAR website](http://trec-car.cs.unh.edu/datareleases/):

+ `topics.car17v1.5.benchmarkY1test.txt`
+ `qrels.car17v1.5.benchmarkY1test.txt`

After indexing has completed, you should be able to perform retrieval as follows:

```
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.test200.txt -output run.car17v1.5.bm25.topics.car17v1.5.test200.txt -bm25 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.bm25.topics.car17v1.5.benchmarkY1test.txt -bm25 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.test200.txt -output run.car17v1.5.bm25+rm3.topics.car17v1.5.test200.txt -bm25 -rm3 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.bm25+rm3.topics.car17v1.5.benchmarkY1test.txt -bm25 -rm3 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.test200.txt -output run.car17v1.5.bm25+ax.topics.car17v1.5.test200.txt -bm25 -axiom -rerankCutoff 20 -axiom.deterministic &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.bm25+ax.topics.car17v1.5.benchmarkY1test.txt -bm25 -axiom -rerankCutoff 20 -axiom.deterministic &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.test200.txt -output run.car17v1.5.ql.topics.car17v1.5.test200.txt -ql &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.ql.topics.car17v1.5.benchmarkY1test.txt -ql &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.test200.txt -output run.car17v1.5.ql+rm3.topics.car17v1.5.test200.txt -ql -rm3 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.ql+rm3.topics.car17v1.5.benchmarkY1test.txt -ql -rm3 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.test200.txt -output run.car17v1.5.ql+ax.topics.car17v1.5.test200.txt -ql -axiom -rerankCutoff 20 -axiom.deterministic &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v1.5.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -output run.car17v1.5.ql+ax.topics.car17v1.5.benchmarkY1test.txt -ql -axiom -rerankCutoff 20 -axiom.deterministic &
```
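The six retrieval commands differ only in the ranking flags and the run name. As a rough sketch (the `MODELS` table and the `search_command` helper are hypothetical names introduced here for illustration, not part of Anserini), the commands for the benchmarkY1test topics could be generated like this:

```python
# Hypothetical helper: rebuilds the SearchCollection commands listed above.
# The flag sets are copied from the documented commands; nothing else is assumed.
MODELS = {
    "bm25": ["-bm25"],
    "bm25+rm3": ["-bm25", "-rm3"],
    "bm25+ax": ["-bm25", "-axiom", "-rerankCutoff", "20", "-axiom.deterministic"],
    "ql": ["-ql"],
    "ql+rm3": ["-ql", "-rm3"],
    "ql+ax": ["-ql", "-axiom", "-rerankCutoff", "20", "-axiom.deterministic"],
}


def search_command(version, model):
    """Return the argument list for one retrieval run (version is 'v1.5' or 'v2.0')."""
    topics = f"topics.car17{version}.benchmarkY1test.txt"
    return [
        "target/appassembler/bin/SearchCollection",
        "-topicreader", "Car",
        "-index", f"lucene-index.car17{version}.pos+docvectors+rawdocs",
        "-topics", f"src/main/resources/topics-and-qrels/{topics}",
        "-output", f"run.car17{version}.{model}.{topics}",
        *MODELS[model],
    ]


if __name__ == "__main__":
    # Print one command per model for the v1.5 collection.
    for model in MODELS:
        print(" ".join(search_command("v1.5", model)))
```

Passing `"v2.0"` instead of `"v1.5"` yields the corresponding v2.0 commands, since both versions now share the same naming scheme.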

Evaluation can be performed using `trec_eval`:

```
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.test200.txt run.car17v1.5.bm25.topics.car17v1.5.test200.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.benchmarkY1test.txt run.car17v1.5.bm25.topics.car17v1.5.benchmarkY1test.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.test200.txt run.car17v1.5.bm25+rm3.topics.car17v1.5.test200.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.benchmarkY1test.txt run.car17v1.5.bm25+rm3.topics.car17v1.5.benchmarkY1test.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.test200.txt run.car17v1.5.bm25+ax.topics.car17v1.5.test200.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.benchmarkY1test.txt run.car17v1.5.bm25+ax.topics.car17v1.5.benchmarkY1test.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.test200.txt run.car17v1.5.ql.topics.car17v1.5.test200.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.benchmarkY1test.txt run.car17v1.5.ql.topics.car17v1.5.benchmarkY1test.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.test200.txt run.car17v1.5.ql+rm3.topics.car17v1.5.test200.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.benchmarkY1test.txt run.car17v1.5.ql+rm3.topics.car17v1.5.benchmarkY1test.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.test200.txt run.car17v1.5.ql+ax.topics.car17v1.5.test200.txt
eval/trec_eval.9.0.4/trec_eval -m map -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v1.5.benchmarkY1test.txt run.car17v1.5.ql+ax.topics.car17v1.5.benchmarkY1test.txt
```
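`trec_eval` prints one line per requested measure, with the measure name, the topic id (`all` for the summary row), and the value. A minimal sketch for pulling the summary scores out of that output, assuming this three-column layout (`parse_trec_eval` is a hypothetical helper, not part of Anserini or `trec_eval`):

```python
def parse_trec_eval(output):
    """Map measure name -> summary score from trec_eval's text output.

    Assumes whitespace-separated lines of the form: <measure> <topic> <value>,
    and keeps only the summary rows whose topic column is 'all'.
    """
    scores = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "all":
            scores[parts[0]] = float(parts[2])
    return scores


if __name__ == "__main__":
    # Sample text using the BM25 scores reported for benchmarkY1test (v1.5).
    sample = "map                   \tall\t0.1563\nrecip_rank            \tall\t0.2336\n"
    print(parse_trec_eval(sample))
```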

@@ -63,11 +62,11 @@ With the above commands, you should be able to replicate the following results:

MAP | BM25 | BM25+RM3 | BM25+AX | QL | QL+RM3 | QL+AX |
:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
All Topics | 0.1689 | 0.1287 | 0.1355 | 0.1516 | 0.1173 | 0.1082 |
benchmarkY1test | 0.1563 | 0.1295 | 0.1358 | 0.1386 | 0.1080 | 0.1048 |


RECIP_RANK | BM25 | BM25+RM3 | BM25+AX | QL | QL+RM3 | QL+AX |
:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
All Topics | 0.2321 | 0.1788 | 0.1857 | 0.2085 | 0.1573 | 0.1501 |
benchmarkY1test | 0.2336 | 0.1923 | 0.1949 | 0.2037 | 0.1599 | 0.1524 |
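One way to check a replication against the benchmarkY1test rows is to compare scores at four decimal places, the precision used in the tables. The sketch below is illustrative only: `EXPECTED_V15` simply restates the benchmarkY1test values for v1.5, and `mismatches` is a hypothetical helper rather than Anserini's regression script.

```python
# Expected benchmarkY1test scores for v1.5, copied from the tables above.
EXPECTED_V15 = {
    "bm25": {"map": 0.1563, "recip_rank": 0.2336},
    "bm25+rm3": {"map": 0.1295, "recip_rank": 0.1923},
    "bm25+ax": {"map": 0.1358, "recip_rank": 0.1949},
    "ql": {"map": 0.1386, "recip_rank": 0.2037},
    "ql+rm3": {"map": 0.1080, "recip_rank": 0.1599},
    "ql+ax": {"map": 0.1048, "recip_rank": 0.1524},
}


def mismatches(observed, expected=EXPECTED_V15, places=4):
    """List (model, metric, expected, observed) tuples that disagree.

    A missing model or metric in `observed` is reported with observed=None.
    """
    bad = []
    for model, metrics in expected.items():
        for metric, want in metrics.items():
            got = observed.get(model, {}).get(metric)
            if got is None or round(got, places) != round(want, places):
                bad.append((model, metric, want, got))
    return bad
```

An empty list means every model and metric matched; anything else names the exact cell of the table that differs.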


31 changes: 15 additions & 16 deletions docs/experiments-car17v2.0.md
@@ -17,43 +17,42 @@ For additional details, see explanation of [common indexing options](common-inde

## Retrieval

The "benchmarkY1-test" topics and qrels are stored in `src/main/resources/topics-and-qrels/`, downloaded from [the CAR website](http://trec-car.cs.unh.edu/datareleases/):

+ `topics.car17v2.0.test.pages.cbor-hierarchical.txt`
+ `qrels.car17v2.0.test.pages.cbor-hierarchical.txt`
The "benchmarkY1-test" topics and qrels (v2.0) are stored in `src/main/resources/topics-and-qrels/`, downloaded from [the CAR website](http://trec-car.cs.unh.edu/datareleases/):

+ `topics.car17v2.0.benchmarkY1test.txt`
+ `qrels.car17v2.0.benchmarkY1test.txt`

After indexing has completed, you should be able to perform retrieval as follows:

```
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.test.pages.cbor-hierarchical.txt -output run.car17v2.0.bm25.topics.car17v2.0.test.pages.cbor-hierarchical.txt -bm25 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0.bm25.topics.car17v2.0.benchmarkY1test.txt -bm25 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.test.pages.cbor-hierarchical.txt -output run.car17v2.0.bm25+rm3.topics.car17v2.0.test.pages.cbor-hierarchical.txt -bm25 -rm3 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0.bm25+rm3.topics.car17v2.0.benchmarkY1test.txt -bm25 -rm3 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.test.pages.cbor-hierarchical.txt -output run.car17v2.0.bm25+ax.topics.car17v2.0.test.pages.cbor-hierarchical.txt -bm25 -axiom -rerankCutoff 20 -axiom.deterministic &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0.bm25+ax.topics.car17v2.0.benchmarkY1test.txt -bm25 -axiom -rerankCutoff 20 -axiom.deterministic &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.test.pages.cbor-hierarchical.txt -output run.car17v2.0.ql.topics.car17v2.0.test.pages.cbor-hierarchical.txt -ql &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0.ql.topics.car17v2.0.benchmarkY1test.txt -ql &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.test.pages.cbor-hierarchical.txt -output run.car17v2.0.ql+rm3.topics.car17v2.0.test.pages.cbor-hierarchical.txt -ql -rm3 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0.ql+rm3.topics.car17v2.0.benchmarkY1test.txt -ql -rm3 &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.test.pages.cbor-hierarchical.txt -output run.car17v2.0.ql+ax.topics.car17v2.0.test.pages.cbor-hierarchical.txt -ql -axiom -rerankCutoff 20 -axiom.deterministic &
nohup target/appassembler/bin/SearchCollection -topicreader Car -index lucene-index.car17v2.0.pos+docvectors+rawdocs -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -output run.car17v2.0.ql+ax.topics.car17v2.0.benchmarkY1test.txt -ql -axiom -rerankCutoff 20 -axiom.deterministic &
```

Evaluation can be performed using `trec_eval`:

```
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.test.pages.cbor-hierarchical.txt run.car17v2.0.bm25.topics.car17v2.0.test.pages.cbor-hierarchical.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.benchmarkY1test.txt run.car17v2.0.bm25.topics.car17v2.0.benchmarkY1test.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.test.pages.cbor-hierarchical.txt run.car17v2.0.bm25+rm3.topics.car17v2.0.test.pages.cbor-hierarchical.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.benchmarkY1test.txt run.car17v2.0.bm25+rm3.topics.car17v2.0.benchmarkY1test.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.test.pages.cbor-hierarchical.txt run.car17v2.0.bm25+ax.topics.car17v2.0.test.pages.cbor-hierarchical.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.benchmarkY1test.txt run.car17v2.0.bm25+ax.topics.car17v2.0.benchmarkY1test.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.test.pages.cbor-hierarchical.txt run.car17v2.0.ql.topics.car17v2.0.test.pages.cbor-hierarchical.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.benchmarkY1test.txt run.car17v2.0.ql.topics.car17v2.0.benchmarkY1test.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.test.pages.cbor-hierarchical.txt run.car17v2.0.ql+rm3.topics.car17v2.0.test.pages.cbor-hierarchical.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.benchmarkY1test.txt run.car17v2.0.ql+rm3.topics.car17v2.0.benchmarkY1test.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.test.pages.cbor-hierarchical.txt run.car17v2.0.ql+ax.topics.car17v2.0.test.pages.cbor-hierarchical.txt
eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recip_rank src/main/resources/topics-and-qrels/qrels.car17v2.0.benchmarkY1test.txt run.car17v2.0.ql+ax.topics.car17v2.0.benchmarkY1test.txt
```
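Unlike the v1.5 commands, the v2.0 evaluation commands pass `-c` to `trec_eval`. As we understand that flag, it averages over every topic in the qrels rather than only the topics that also appear in the run, so topics with no retrieved results count as zero. The toy sketch below illustrates the difference; `mean_over_topics` and the topic ids are made up for illustration and are not part of `trec_eval`.

```python
def mean_over_topics(per_topic_scores, qrels_topics, complete=False):
    """Average per-topic scores the way we understand trec_eval to do it.

    complete=False: average only over qrels topics that the run returned results for.
    complete=True:  average over all qrels topics, scoring missing ones as 0
                    (the behaviour requested by -c).
    """
    if complete:
        topics = list(qrels_topics)
    else:
        topics = [t for t in qrels_topics if t in per_topic_scores]
    if not topics:
        return 0.0
    return sum(per_topic_scores.get(t, 0.0) for t in topics) / len(topics)


if __name__ == "__main__":
    scores = {"t1": 0.5, "t2": 0.25}        # the run covers two topics
    qrels = ["t1", "t2", "t3", "t4"]        # the qrels judge four topics
    print(mean_over_topics(scores, qrels))                  # 0.375
    print(mean_over_topics(scores, qrels, complete=True))   # 0.1875
```

Because the two averages can differ whenever a run returns nothing for some topics, scores computed with and without `-c` are not directly comparable.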

7 changes: 3 additions & 4 deletions src/main/resources/docgen/templates/car17v1.5.template
@@ -14,11 +14,10 @@ For additional details, see explanation of [common indexing options](common-inde

## Retrieval

Topics and qrels are stored in `src/main/resources/topics-and-qrels/`, downloaded from NIST:

+ `topics.car17v1.5.test200.txt`: [Topics for the test200 subset (TREC 2017 Complex Answer Retrieval Track)](http://trec-car.cs.unh.edu/datareleases/v1.5/test200-v1.5.tar.xz)
+ `qrels.car17v1.5.test200.txt`: [adhoc qrels (TREC 2017 Complex Answer Retrieval Track)](http://trec-car.cs.unh.edu/datareleases/v1.5/test200-v1.5.tar.xz)
The "benchmarkY1-test" topics and qrels (v1.5) are stored in `src/main/resources/topics-and-qrels/`, downloaded from [the CAR website](http://trec-car.cs.unh.edu/datareleases/):

+ `topics.car17v1.5.benchmarkY1test.txt`
+ `qrels.car17v1.5.benchmarkY1test.txt`

After indexing has completed, you should be able to perform retrieval as follows:

7 changes: 3 additions & 4 deletions src/main/resources/docgen/templates/car17v2.0.template
@@ -14,11 +14,10 @@ For additional details, see explanation of [common indexing options](common-inde

## Retrieval

The "benchmarkY1-test" topics and qrels are stored in `src/main/resources/topics-and-qrels/`, downloaded from [the CAR website](http://trec-car.cs.unh.edu/datareleases/):

+ `topics.car17v2.0.test.pages.cbor-hierarchical.txt`
+ `qrels.car17v2.0.test.pages.cbor-hierarchical.txt`
The "benchmarkY1-test" topics and qrels (v2.0) are stored in `src/main/resources/topics-and-qrels/`, downloaded from [the CAR website](http://trec-car.cs.unh.edu/datareleases/):

+ `topics.car17v2.0.benchmarkY1test.txt`
+ `qrels.car17v2.0.benchmarkY1test.txt`

After indexing has completed, you should be able to perform retrieval as follows:

30 changes: 15 additions & 15 deletions src/main/resources/regression/car17v1.5.yaml
@@ -24,9 +24,9 @@ index_stats:
documents (non-empty): 29674409
total terms: 1257896158
topics:
- name: "All Topics"
path: topics.car17v1.5.test200.txt
qrel: qrels.car17v1.5.test200.txt
- name: "benchmarkY1test"
path: topics.car17v1.5.benchmarkY1test.txt
qrel: qrels.car17v1.5.benchmarkY1test.txt
evals:
- command: eval/trec_eval.9.0.4/trec_eval
params:
@@ -50,18 +50,18 @@ models:
- -bm25
results:
map:
- 0.1689
- 0.1563
recip_rank:
- 0.2321
- 0.2336
- name: bm25+rm3
params:
- -bm25
- -rm3
results:
map:
- 0.1287
- 0.1295
recip_rank:
- 0.1788
- 0.1923
- name: bm25+ax
params:
- -bm25
@@ -70,26 +70,26 @@ models:
- -axiom.deterministic
results:
map:
- 0.1355
- 0.1358
recip_rank:
- 0.1857
- 0.1949
- name: ql
params:
- -ql
results:
map:
- 0.1516
- 0.1386
recip_rank:
- 0.2085
- 0.2037
- name: ql+rm3
params:
- -ql
- -rm3
results:
map:
- 0.1173
- 0.1080
recip_rank:
- 0.1573
- 0.1599
- name: ql+ax
params:
- -ql
@@ -98,6 +98,6 @@ models:
- -axiom.deterministic
results:
map:
- 0.1082
- 0.1048
recip_rank:
- 0.1501
- 0.1524
4 changes: 2 additions & 2 deletions src/main/resources/regression/car17v2.0.yaml
@@ -25,8 +25,8 @@ index_stats:
total terms: 1249740109
topics:
- name: "benchmarkY1test"
path: topics.car17v2.0.test.pages.cbor-hierarchical.txt
qrel: qrels.car17v2.0.test.pages.cbor-hierarchical.txt
path: topics.car17v2.0.benchmarkY1test.txt
qrel: qrels.car17v2.0.benchmarkY1test.txt
evals:
- command: eval/trec_eval.9.0.4/trec_eval
params: