Document ranking via sentence modeling using BERT

castorini/birch

BERT4retrieval

BM25 results

| Top K Sentences | Method | Recall | Number of Docs | MAP of Max Sent |
|---|---|---|---|---|
| 1000 | RM3 | 0.63 | 720.4 | 0.1974 |
| 1500 | RM3 | 0.67 | 1065.4 | 0.1985 |
| 1000 | BM25 | 0.61 | 716.6 | 0.1862 |
| 1500 | BM25 | 0.66 | 1057.1 | 0.1895 |

BERT pretrained with TRECQA+WikiQA

score = Lambda * bm25_rm3 + (1.0-Lambda) * (bert_high_sent_1 + bert_high_sent_2/2)

MAP Lambda
0.2051 0
0.2106 0.05
0.2151 0.1
0.2194 0.15
0.2233 0.2
0.2282 0.25
0.2321 0.3
0.2354 0.35
0.2388 0.4
0.2419 0.45
0.2451 0.5
0.2472 0.55
0.2497 0.6
0.2503 0.65
0.2522 0.7
0.2525 0.75
0.2518 0.8
0.2511 0.85
0.2495 0.9
0.2484 0.95
0.2451 1
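
A table like the one above comes from sweeping Lambda from 0 to 1 in steps of 0.05 and computing MAP at each value. The sketch below shows one way to run such a sweep; it is not the repository's evaluation code. The names `average_precision` and `sweep_lambda` and the `runs`/`qrels` data layout are assumptions made for illustration. In practice a tool such as `trec_eval` would normally compute MAP.

```python
def average_precision(ranked_doc_ids, relevant):
    """Average precision of one ranking against a set of relevant doc ids."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def sweep_lambda(runs, qrels, steps=20):
    """Return [(lambda, MAP), ...] for lambda = 0, 1/steps, ..., 1.

    Hypothetical layout:
      runs  = {topic: {doc_id: (bm25_rm3, bert_sent_1, bert_sent_2)}}
      qrels = {topic: set of relevant doc ids}
    """
    results = []
    for i in range(steps + 1):
        lam = i / steps
        aps = []
        for topic, docs in runs.items():
            # Same interpolation as the formula in this README.
            scored = {
                doc_id: lam * bm25 + (1.0 - lam) * (s1 + s2 / 2.0)
                for doc_id, (bm25, s1, s2) in docs.items()
            }
            ranking = sorted(scored, key=scored.get, reverse=True)
            aps.append(average_precision(ranking, qrels.get(topic, set())))
        results.append((lam, sum(aps) / len(aps) if aps else 0.0))
    return results
```

For example, `max(sweep_lambda(runs, qrels), key=lambda r: r[1])` picks the Lambda with the highest MAP on the given topics.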

BERT pretrained with Tweets

score = Lambda * bm25_rm3 + (1.0-Lambda) * (bert_high_sent_1 + bert_high_sent_2/2)

MAP Lambda
0.2378 0
0.2413 0.05
0.2445 0.1
0.2478 0.15
0.2505 0.2
0.2525 0.25
0.2543 0.3
0.2562 0.35
0.2579 0.4
0.2588 0.45
0.2596 0.5
0.2595 0.55
0.2594 0.6
0.2594 0.65
0.2585 0.7
0.2576 0.75
0.2557 0.8
0.2537 0.85
0.2515 0.9
0.2489 0.95
0.2451 1

BERT pretrained with TRECQA+WikiQA using topic description

MAP Lambda
0.2020 0
0.2076 0.05
0.2130 0.1
0.2189 0.15
0.2237 0.2
0.2301 0.25
0.2347 0.3
0.2399 0.35
0.2437 0.4
0.2466 0.45
0.2493 0.5
0.2520 0.55
0.2534 0.6
0.2548 0.65
0.2552 0.7
0.2560 0.75
0.2561 0.8
0.2545 0.85
0.2520 0.9
0.2494 0.95
0.2451 1

TODO

  • Combine BERT scores from tweet and trec_wiki_qa
  • Extend re-ranking from the top 100 BM25+RM3 documents to the top 1000
