Commit
add figure to README.md
EC2 Default User committed Nov 15, 2023
1 parent e3f1c37 commit 0d61b71
Showing 3 changed files with 11 additions and 2 deletions.
13 changes: 11 additions & 2 deletions README.md
@@ -1,13 +1,22 @@
# Scalable and Effective Generative Information Retrieval
This repo provides the source code and checkpoints for our paper [Scalable and Effective Generative Information Retrieval]() (RIPOR).
This repo provides the source code and checkpoints for our paper [Scalable and Effective Generative Information Retrieval]() (RIPOR). We propose RIPOR, an optimization framework for generative retrieval. RIPOR is designed around two often-overlooked fundamental design considerations in generative retrieval. To address these issues, we propose a novel prefix-oriented ranking optimization algorithm and relevance-based DocID initialization, which are illustrated in the figure below. The main experiments are conducted on the large-scale information retrieval benchmark MSMARCO-8.8M and evaluated on three evaluation sets: MSMARCO-Dev, TREC'19, and TREC'20. RIPOR surpasses state-of-the-art generative retrieval models by a large margin (e.g., a 30.5% MRR improvement on the MS MARCO Dev set) and performs on par with popular dense retrieval models.

<p align="center">
<img align="center" src="./architecture.png" width="750" />
</p>
<p align="center">
<b>Figure:</b> An overview of the RIPOR framework. The top two sub-figures illustrate the novel components of the RIPOR framework,
detailed in Sections 3.1 and 3.2 of the paper. The bottom sub-figure presents the complete optimization pipeline.
</p>


## Package installation
- pip install -r requirement.txt
- pip install torch==1.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
- conda install -c conda-forge faiss-gpu
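
After running the install steps, a quick sanity check can confirm that the key packages are visible to the interpreter. The snippet below is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper, and it assumes the packages are importable under the module names `torch` and `faiss`. It only checks that the modules can be located; it does not verify versions or GPU support.

```python
import importlib.util


def check_environment(modules=("torch", "faiss")):
    """Return {module_name: bool} telling whether each module can be located.

    Hypothetical helper for illustration; the default module names are an
    assumption based on the install steps (PyTorch and faiss-gpu).
    """
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If a module is reported as missing, re-run the corresponding install command in the same environment before training or inference.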

## Download files
All necessary training files and checkpoints can be downloaded from [Google Disk Ripor_data](https://drive.google.com/drive/u/1/folders/1LLrOoXQq49hGoTMH1b7yyOlUvctmL6Ah). First you should download `RIPOR_data/data/`
All necessary training files and checkpoints can be downloaded from [Ripor_data](https://drive.google.com/drive/u/1/folders/1LLrOoXQq49hGoTMH1b7yyOlUvctmL6Ah). First you should download `RIPOR_data/data/`
- If you only want to do inference, you just need to download the following files:
- `RIPOR_data/experiments-full-t5seq-aq/t5_docid_gen_encoder_1/aq_smtid/docid_to_smtid.json`
- `RIPOR_data/$experiment_dir/t5seq_aq_encoder_seq2seq_1_lng_knp_self_mnt_32_dcy_2/checkpoint`
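
Before running inference, it may help to confirm that the downloaded files sit where the paths above expect them. The following is a rough sketch under stated assumptions: `missing_inference_files` is a hypothetical helper, `data_root` is wherever you placed the `RIPOR_data` folder locally, and `experiment_dir` must be supplied by you because the README leaves `$experiment_dir` as a placeholder. It covers only the two paths listed above.

```python
from pathlib import Path


def missing_inference_files(data_root, experiment_dir):
    """Return the listed inference files that are absent under data_root.

    Hypothetical helper for illustration. `data_root` is the local copy of
    RIPOR_data; `experiment_dir` stands in for the README's $experiment_dir.
    """
    root = Path(data_root)
    required = [
        root / "experiments-full-t5seq-aq" / "t5_docid_gen_encoder_1"
        / "aq_smtid" / "docid_to_smtid.json",
        root / experiment_dir
        / "t5seq_aq_encoder_seq2seq_1_lng_knp_self_mnt_32_dcy_2" / "checkpoint",
    ]
    # Keep only the paths that do not exist on disk.
    return [str(path) for path in required if not path.exists()]
```

An empty list means both listed paths exist; any returned path still needs to be downloaded from the shared folder.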
Binary file added architecture.pdf
Binary file not shown.
Binary file added architecture.png
