Overhaul of regressions for MS MARCO {passage, doc} and DL {19, 20} #1559
Codecov Report
@@ Coverage Diff @@
## master #1559 +/- ##
=========================================
Coverage 57.31% 57.31%
Complexity 998 998
=========================================
Files 167 167
Lines 9274 9274
Branches 1281 1281
=========================================
Hits 5315 5315
Misses 3522 3522
Partials 437 437
LGTM! Few minor comments.
docs/regressions.md
Outdated
@@ -62,12 +62,18 @@ nohup python src/main/python/run_regression.py --collection msmarco-doc-docTTTTT
nohup python src/main/python/run_regression.py --collection msmarco-doc-docTTTTTquery-per-passage >& logs/log.msmarco-doc-docTTTTTquery-per-passage &

nohup python src/main/python/run_regression.py --collection dl19-passage >& logs/log.dl19-passage &
nohup python src/main/python/run_regression.py --collection dl19-passage-docTTTTTquery >& logs/dl19-passage-docTTTTTquery &
- nohup python src/main/python/run_regression.py --collection dl19-passage-docTTTTTquery >& logs/dl19-passage-docTTTTTquery &
+ nohup python src/main/python/run_regression.py --collection dl19-passage-docTTTTTquery >& logs/log.dl19-passage-docTTTTTquery &
docs/regressions.md
Outdated
nohup python src/main/python/run_regression.py --index --collection msmarco-doc-docTTTTTquery-per-doc >& logs/log.msmarco-doc-docTTTTTquery-per-doc &
nohup python src/main/python/run_regression.py --index --collection msmarco-doc-docTTTTTquery-per-passage >& logs/log.msmarco-doc-docTTTTTquery-per-passage &

nohup python src/main/python/run_regression.py --index --collection dl19-passage >& logs/log.dl19-passage &
nohup python src/main/python/run_regression.py --index --collection dl19-passage-docTTTTTquery >& logs/dl19-passage-docTTTTTquery &
- nohup python src/main/python/run_regression.py --index --collection dl19-passage-docTTTTTquery >& logs/dl19-passage-docTTTTTquery &
+ nohup python src/main/python/run_regression.py --index --collection dl19-passage-docTTTTTquery >& logs/log.dl19-passage-docTTTTTquery &
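Both suggestions above fix the same slip: the log file should follow the `logs/log.<collection>` pattern used by the neighboring commands. One way to avoid this kind of drift is to generate the command lines from the collection name. The following is only a sketch; `regression_command` is a hypothetical helper, not something that exists in the repository, and it simply formats the command string shown in the docs.

```python
def regression_command(collection, index=False):
    """Build the nohup command line for one regression run.

    Hypothetical helper for illustration: the log path is always derived
    from the collection name, so it cannot fall out of sync with it.
    """
    args = ["python", "src/main/python/run_regression.py"]
    if index:
        args.append("--index")
    args += ["--collection", collection]
    return f"nohup {' '.join(args)} >& logs/log.{collection} &"


if __name__ == "__main__":
    for name in ["dl19-passage", "dl19-passage-docTTTTTquery"]:
        print(regression_command(name, index=True))
```

Printing the commands rather than executing them keeps the sketch side-effect free; the output can be pasted into the docs or piped to a shell.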
@@ -6,6 +6,7 @@ Note that there are four different regression conditions for this task, and this
+ **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
+ **Expansion Condition:** doc2query-T5

In the passage indexing condition, we select the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
- In the passage indexing condition, we select the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
+ In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
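To make the distinction in the reworded sentence concrete, here is a minimal Python sketch of MaxP aggregation: each document receives the score of its best passage, and documents are then ranked by that score. This is not Anserini's implementation; the function name `maxp` and the `docid#segment` passage-id format are assumptions made for illustration.

```python
def maxp(passage_hits):
    """Collapse (passage_id, score) hits into a document ranking.

    Assumes passage ids look like "<docid>#<segment>" (an illustrative
    convention). Each document keeps the score of its highest-scoring
    passage.
    """
    best = {}
    for passage_id, score in passage_hits:
        docid = passage_id.split("#", 1)[0]
        if docid not in best or score > best[docid]:
            best[docid] = score
    # Rank documents by their best passage score, highest first.
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


hits = [("D1#0", 7.2), ("D2#3", 9.1), ("D1#4", 8.5), ("D3#1", 6.0), ("D2#0", 4.4)]
ranking = maxp(hits)
print(ranking)
```

Here `D1` is ranked with 8.5 (its best passage), not 7.2, which is exactly the "score of the highest-scoring passage" that the suggested wording spells out.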
Covers