Multimodal Embedding #7866

taowang1993 · 2024-09-01T08:21:05Z

Self Checks

I have searched for existing issues, including closed ones.
I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

Currently, Dify supports only text embedding.

But I need to show users images from my documents, such as diagrams and graphs.

This feature is very useful in the education, medical, legal, and finance domains.

Major vector DBs already support multimodal RAG.

https://weaviate.io/blog/multimodal-models

https://milvus.io/docs/multimodal_rag_with_milvus.md

https://jina.ai/news/jina-clip-v1-a-truly-multimodal-embeddings-model-for-text-and-image/

2. Additional context or comments

No response

3. Can you help us with this feature?

I am interested in contributing to this feature.

friedinando · 2024-09-26T21:11:17Z

+1

monotykamary · 2024-11-08T09:06:47Z

https://docs.voyageai.com/docs/multimodal-embeddings
I think this is the near future for multimodal RAG, especially now that OCR for Open WebUI and Claude's Visual PDFs are seeing heavier use.

dosubot · 2024-12-09T16:07:04Z

Hi, @taowang1993. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.

Issue Summary

You requested the addition of multimodal embedding support in Dify, emphasizing its importance in various fields.
@friedinando expressed support for this feature.
@monotykamary shared documentation on multimodal embeddings and highlighted the growing relevance of multimodal RAG.

Next Steps

Could you confirm if this issue is still relevant to the latest version of the Dify repository? If so, please comment to keep the discussion open.
If there is no further activity, this issue will be automatically closed in 15 days.

Thank you for your understanding and contribution!

dosubot bot added the 👻 feat:rag and 💪 enhancement labels Sep 1, 2024

dosubot bot mentioned this issue Sep 5, 2024

Best Approach to Build a Multimodal RAG Application? #8001

Closed

5 tasks

Yawen-1010 self-assigned this Oct 22, 2024

dosubot bot added the stale label Dec 9, 2024

dosubot bot closed this as not planned Dec 24, 2024

dosubot bot removed the stale label Dec 24, 2024
