Reproduce MS-MARCO passage and document ranking experiments #1725

Merged

@mikhail-tsir mikhail-tsir commented Jan 9, 2022

Reproduced expected results as described in the passage ranking and document ranking docs.

Environment:
Ubuntu 18.04.5 LTS
Java: openjdk 11.0.13 2021-10-19
Python: version 3.9.7

Issues I ran into:

  • Downloading the documents from the alternative mirror seemed faster, but I kept getting this error:
Read error at byte 3324919808/8501799926 (Success). Retrying.

At that point the download would restart, and eventually I would hit the same error again.
Downloading from the original source worked fine but was slower.

  • Not really an issue, but since I am on Linux I had to use python3 instead of python when running the Python scripts.
  • Ran into a "no space left on device" error while indexing documents (I had about 28G free before indexing). Solved by freeing up disk space.
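
For anyone who hits the same disk space problem, a quick pre-flight check before indexing might look like the sketch below. It only uses `shutil.disk_usage` from the Python standard library. The helper names and the 50 GiB threshold are my own assumptions for illustration, not values from the docs; all I know for certain is that about 28G free was not enough in my case.

```python
import shutil


def free_bytes(path="."):
    """Return the number of free bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


def has_free_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    return free_bytes(path) >= required_bytes


if __name__ == "__main__":
    # Hypothetical threshold: the real requirement depends on the collection
    # and index settings, so adjust this value to your own setup.
    required = 50 * 1024**3
    free_gib = free_bytes(".") / 1024**3
    if has_free_space(".", required):
        print(f"OK: {free_gib:.1f} GiB free")
    else:
        print(f"Only {free_gib:.1f} GiB free; consider freeing up space before indexing")
```

Run it with python3 from the directory where the index will be written, since the check applies to the filesystem that contains the given path.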

@lintool lintool merged commit 7ff99e0 into castorini:master Jan 10, 2022
@mikhail-tsir mikhail-tsir deleted the experiment-reproduction-mikhail-tsirlin branch January 10, 2022 21:17