Product Quantization

Hi,
My friend and I have been reading the code for a while and are looking for ways to contribute.
@ankane, you mentioned product quantization in #27. Is this still an issue? We would like to work on it if it looks like a useful feature.
I have some questions about it and would like to discuss design choices in this issue as we start implementing this feature. 
Current questions I have:
* Would it make more sense for pgvector to support a new index type (e.g. IVFPQ, as in Faiss) for product quantization, or to just add a new vector type? Doing this without a new index seems much harder, since we have to store the centroids for each subvector, so a new index type for IVF + product quantization might make more sense.
* Do you think the subvector type might help in the internal implementation? I think it could be used to extract part of a vector during product quantization.
* Please let me know if there are specific IVFPQ implementations you find more performant. I have provided a list of resources that I'm reading to understand how people have implemented it in the past.
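
To make the first question more concrete, here is a rough sketch of what I mean by "storing the centroids for each subvector". It is plain Python, not pgvector code, and every name in it (`split`, `encode`, `decode`, `codebooks`) is hypothetical. The toy codebooks are hard-coded for illustration. My assumption is that a real implementation would train them with k-means and keep them alongside the index.

```python
# Rough sketch of product quantization encoding (pure Python, no pgvector APIs).
# All names here are hypothetical and only illustrate the idea.

def split(vec, m):
    """Split vec into m equal-length subvectors."""
    if len(vec) % m != 0:
        raise ValueError("dimension must be divisible by m")
    d = len(vec) // m
    return [vec[i * d:(i + 1) * d] for i in range(m)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(vec, codebooks):
    """Replace each subvector with the index of its nearest centroid."""
    subs = split(vec, len(codebooks))
    return [min(range(len(cb)), key=lambda k: sq_dist(sub, cb[k]))
            for sub, cb in zip(subs, codebooks)]

def decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    out = []
    for code, cb in zip(codes, codebooks):
        out.extend(cb[code])
    return out

# Toy codebooks: m = 2 subspaces, 2 centroids each, for 4-dimensional vectors.
# Assumption: in practice these would come from k-means, one run per subspace.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]

codes = encode([0.9, 1.1, 0.1, 0.8], codebooks)  # one small integer per subspace
approx = decode(codes, codebooks)                # lossy reconstruction
```

The point is that the stored codes are meaningless without the per-subspace codebooks, which is why I think the centroids need a home in an index rather than in a standalone vector type.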

# Some Resources
* [Faiss IVFPQ](https://github.com/facebookresearch/faiss/blob/main/faiss/IndexIVFPQ.h) implementation
* The [PQ paper](https://ieeexplore.ieee.org/document/5432202)
* A [useful blog post](https://sidshome.wordpress.com/2023/12/30/deep-dive-into-faiss-indexivfpq-for-vector-search/) and some IVFPQ optimizations
* [Billion-scale Approximate Nearest Neighbor Search](https://wangzwhu.github.io/home/file/acmmm-t-part3-ann.pdf)
