baselines

Baseline run description

The baseline system is a three-step retrieval pipeline. Given a query, (1) the top-n documents are retrieved from the index, (2) each document is chunked into passages using spaCy's SentenceRecognizer pipeline, and (3) the passages are re-ranked using a neural re-ranker.

For (1), we use a simple BM25 function (K1=4.46, b=0.82), and for (3), we use a T5 re-ranker trained on the MS MARCO passage dataset.
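As a rough illustration of step (1), the sketch below scores a toy in-memory collection with BM25 using the k1 and b values quoted above. The IDF variant, the whitespace tokenization, and the function name `bm25_scores` are assumptions made for the example; the README does not say which search toolkit or BM25 variant produced the runs.

```python
import math
from collections import Counter

K1 = 4.46  # parameter values quoted in this README
B = 0.82


def bm25_scores(query_terms, docs, k1=K1, b=B):
    """Return one BM25 score per tokenized document in `docs`.

    Assumption: uses the common log(1 + (N - df + 0.5) / (df + 0.5)) IDF;
    the actual baseline may use a different variant.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization term shared by all query terms for this document.
        norm = k1 * (1 - b + b * len(d) / avg_len)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores


# Toy collection (hypothetical data, whitespace-tokenized).
docs = [
    "the cat sat on the mat".split(),
    "dogs chase cats in the park".split(),
    "neural rankers reorder retrieved passages".split(),
]
scores = bm25_scores(["cat", "mat"], docs)
# Indices of documents ordered from best to worst score.
top_n = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In the baseline, the top-n documents selected this way would then be chunked into passages and passed to the re-ranker.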

org_automatic_results_1000.v1.0.run is the results file generated by retrieving and re-ranking the passages from the top 1000 documents using the automatic rewrites.

org_manual_results_1000.v1.0.run is the results file generated by retrieving and re-ranking the passages from the top 1000 documents using the manual rewrites.

We also provide document-level versions of each run in document_runs, in which passage ids have been converted to document ids and duplicates removed.
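The conversion could look roughly like the sketch below. It assumes the standard six-column TREC run format, passage ids of the form `<document id>-<passage index>`, and input lines already sorted by rank within each query; it keeps the highest-ranked passage per document. None of these details are stated in this README, and `passages_to_documents` is a hypothetical helper name.

```python
def passages_to_documents(run_lines, sep="-"):
    """Collapse a passage-level TREC run into a document-level run.

    Assumptions: each line is "qid Q0 passage_id rank score tag", the passage id
    is "<doc_id><sep><passage_index>", and lines are sorted by rank per query.
    """
    seen = set()      # (qid, doc_id) pairs already emitted
    next_rank = {}    # next document-level rank to assign for each query
    out = []
    for line in run_lines:
        qid, q0, passage_id, _rank, score, tag = line.split()
        # Strip the trailing passage index to recover the document id.
        doc_id = passage_id.rsplit(sep, 1)[0]
        if (qid, doc_id) in seen:
            continue  # a higher-ranked passage from this document was already kept
        seen.add((qid, doc_id))
        rank = next_rank.get(qid, 1)
        next_rank[qid] = rank + 1
        out.append(f"{qid} {q0} {doc_id} {rank} {score} {tag}")
    return out


# Hypothetical passage-level lines for one query.
run = [
    "107_1 Q0 DOC_A-2 1 9.5 baseline",
    "107_1 Q0 DOC_B-1 2 8.1 baseline",
    "107_1 Q0 DOC_A-5 3 7.7 baseline",
]
doc_run = passages_to_documents(run)
```

Here the second passage from `DOC_A` is dropped and the remaining documents are re-ranked consecutively.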

Baseline Rewriter Policy

We use the T5 model trained on the CANARD dataset to generate our baseline automatic rewrites.

For the query at turn n of a topic, the rewrite context consists of the queries from all previous turns (turn 1 to turn n-1) and the passages from the last three turns (turn n-3 to turn n-1).

For example, the automatic rewrite for topic 107, turn 8 (identified as 107-8) was generated by passing the following as context:

query 107-1 ||| query 107-2 ||| query 107-3 ||| query 107-4 ||| query 107-5 ||| passage 107-5 ||| query 107-6 ||| passage 107-6 ||| query 107-7 ||| passage 107-7
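The context policy above can be sketched as a small helper. The function name `build_rewrite_context` and its arguments are hypothetical; the interleaved ordering and the ` ||| ` separator are taken from the example above, and the placeholder strings stand in for the actual query and passage texts.

```python
def build_rewrite_context(prev_queries, prev_passages, n_passage_turns=3, sep=" ||| "):
    """Build the rewriter context for turn n from turns 1..n-1.

    prev_queries[i] and prev_passages[i] both belong to turn i + 1.
    Every previous query is included; passages are included only for the
    last `n_passage_turns` turns, each placed right after its query.
    """
    first_with_passage = max(0, len(prev_queries) - n_passage_turns)
    parts = []
    for i, query in enumerate(prev_queries):
        parts.append(query)
        if i >= first_with_passage:
            parts.append(prev_passages[i])
    return sep.join(parts)


# Placeholders for the texts of turns 1..7 of topic 107 (context for turn 107-8).
queries = [f"query 107-{t}" for t in range(1, 8)]
passages = [f"passage 107-{t}" for t in range(1, 8)]
context = build_rewrite_context(queries, passages)
```

With these placeholders, `context` reproduces the string shown for 107-8.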
