Add support for open source models based on text-embeddings-inference #66
…ce. Signed-off-by: wileyzhang <bluechanel612@gmail.com>
Signed-off-by: wileyzhang <bluechanel612@gmail.com>
Thanks for the contribution! Out of curiosity, what extra value does OpenSourceRerankFunction offer over CrossEncoderRerankFunction?
OpenSourceRerankFunction is backed by text-embeddings-inference, which provides significantly higher throughput and lower latency than CrossEncoderRerankFunction. For detailed performance metrics, see the official benchmark results: https://huggingface.co/docs/text-embeddings-inference/index#text-embeddings-inference
Could you please draft documentation similar to https://milvus.io/docs/embed-with-sentence-transform.md, with simple examples? That would be really helpful to other users of the milvus model lib.
@property
def dim(self):
    if self._dim is None:
        self._dim = self._call_api(["get dim"])[0].shape[0]
Does this really work? That is, will self._call_api(["get dim"]), i.e. self._session.post(self.api_url, json={"input": ["get dim"]}), return the vector shape? That sounds magical.
This works by sending a dummy message to the API to retrieve the vector dimension, as the original API does not directly provide this information. I'll add a comment here for clarification.
SGTM.
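For readers following the thread, the dummy-message approach can be sketched roughly as below. This is only an illustration, not the code in this PR: `LazyDimEmbedder`, `embed_fn`, and `fake_embed` are hypothetical names, and `embed_fn` stands in for the HTTP call to the embedding server so the sketch runs without a network.

```python
from typing import Callable, List, Optional, Sequence


class LazyDimEmbedder:
    """Hypothetical wrapper that discovers the embedding dimension lazily."""

    def __init__(self, embed_fn: Callable[[Sequence[str]], List[List[float]]]):
        # embed_fn is an assumption: any callable mapping texts to vectors.
        self._embed_fn = embed_fn
        self._dim: Optional[int] = None

    @property
    def dim(self) -> int:
        # Embed one throwaway string and measure the returned vector,
        # because the server is not asked for the dimension directly.
        if self._dim is None:
            vectors = self._embed_fn(["get dim"])
            self._dim = len(vectors[0])
        return self._dim


# Stand-in for the real server: records calls, returns 8-dimensional vectors.
calls: List[List[str]] = []


def fake_embed(texts: Sequence[str]) -> List[List[float]]:
    calls.append(list(texts))
    return [[0.0] * 8 for _ in texts]


embedder = LazyDimEmbedder(fake_embed)
```

Because the result is cached in `_dim`, the extra request is paid only on the first access of `dim`.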
    self.api_url,
    json={
        "query": query,
        "raw_scores": False,
Should these params be configurable?
And what does raw_scores mean? Does it mean the API will not return scores?
When raw_scores is set to false, the returned scores are normalized to a range of 0-1. When set to true, the scores are the raw, unnormalized values. I believe it should default to false to align with models like JinaAI Rerank. Perhaps I should consider removing this configuration entirely.
Got it.
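To make the raw_scores discussion concrete, here is a rough sketch of a configurable request payload and of what 0-1 normalization could look like. Several things here are assumptions rather than facts from this PR: the helper names are hypothetical, the "texts" key is assumed for the documents field, and a logistic sigmoid is assumed as the normalization (the thread only says the scores end up in the 0-1 range).

```python
import math
from typing import Dict, List, Sequence


def build_rerank_payload(
    query: str, documents: Sequence[str], raw_scores: bool = False
) -> Dict[str, object]:
    """Hypothetical helper: exposes raw_scores instead of hard-coding it."""
    return {
        "query": query,
        "texts": list(documents),  # assumed key name for the documents
        "raw_scores": raw_scores,  # False asks for normalized scores
    }


def sigmoid(x: float) -> float:
    # Numerically stable logistic function; avoids overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def normalize_scores(raw: Sequence[float]) -> List[float]:
    """Map unbounded raw scores into [0, 1], preserving their order."""
    return [sigmoid(s) for s in raw]


payload = build_rerank_payload("what is milvus?", ["a vector database", "a bird"])
scores = normalize_scores([-1000.0, -2.0, 0.0, 3.0, 1000.0])
```

Since the sigmoid is monotonic, the ranking is the same either way; only the scale of the reported scores changes.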
Co-authored-by: codingjaguar <codingjaguar@gmail.com>
Signed-off-by: wileyzhang <bluechanel612@gmail.com>
…e' into support_text-embeddings-inference
Signed-off-by: wileyzhang <bluechanel612@gmail.com>
The example documentation has been submitted at milvus-io/milvus-docs#2998.
/lgtm
Add support for broader integration of embedding models. This update leverages the open-source embedding inference project text-embeddings-inference by Hugging Face.
Signed-off-by: wileyzhang bluechanel612@gmail.com