Support more CLIP models #1
I think it could be valuable.
I'm wondering where a good place to support all CLIP models would be. Do you think an all-clip package could be good? I can build it from https://github.com/rom1504/clip-retrieval/blob/main/clip_retrieval/clip_inference/mapper.py and https://github.com/rom1504/clip-retrieval/blob/main/clip_retrieval/clip_inference/load_clip.py
I was pointed to https://github.com/dmarx/Multi-Modal-Comparators; looking into it.
Opened ai-forever/ru-clip#20 and rinnakk/japanese-clip#2, as it was pointed out to me that translating class names is not trivial.
Can we use http://compling.hss.ntu.edu.sg/omw/ to automatically get class names for ImageNet in all languages?
For other datasets, maybe https://www.wikidata.org/wiki/Wikidata:Main_Page could work.
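Since ImageNet classes are WordNet synsets (identified by wnids such as `n02084071`), the OMW idea above could be sketched with NLTK's multilingual wordnet interface. This is only an illustration: the helper name is hypothetical, and the downloads are guarded so nothing fetches on import.

```python
# Sketch: translate ImageNet class names via the Open Multilingual Wordnet
# (OMW) as exposed by NLTK. ImageNet wnids encode a WordNet part of speech
# and synset offset, e.g. "n02084071" -> ('n', 2084071).

def wnid_to_pos_offset(wnid: str):
    """Split an ImageNet wnid such as 'n02084071' into (pos, offset)."""
    return wnid[0], int(wnid[1:])

if __name__ == "__main__":
    import nltk
    nltk.download("wordnet")
    nltk.download("omw-1.4")  # Open Multilingual Wordnet data
    from nltk.corpus import wordnet as wn

    pos, offset = wnid_to_pos_offset("n02084071")
    synset = wn.synset_from_pos_and_offset(pos, offset)
    # Lemma names for the same synset in another language, e.g. French:
    print(synset.lemma_names(lang="fra"))
```

Coverage is uneven across languages in OMW, so some classes would still need manual translation, which matches the "not trivial" caveat above.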
Opened dmarx/Multi-Modal-Comparators#26; will wait a little.
That looks very good. I suspected we would need to change the data structure for classnames/templates from plain fixed dicts in the code to something a bit more advanced, because of these kinds of use cases. So maybe we should think about the data structure. We would have multiple languages and multiple datasets. For each (language, dataset) pair, there can be multiple sets of templates (e.g., LiT experimented with different sets of templates and showed their effect on zero-shot accuracy) and possibly multiple sets of classnames. So for each evaluation run we need to select the dataset, the language, which version of the templates to use, and which version of the classnames to use, and we should have a default/recommended value for both templates and classnames.

We will also need to populate this database as we add more datasets and languages, sometimes programmatically, e.g. from the multilingual wordnet link you sent. I don't know if a relational database (e.g. SQLite) is needed; that is possibly too much, and perhaps just a JSON file or a CSV would suffice. Any thoughts on this @rom1504?
I think a JSON file is probably best for small data like this, where we can have a clear schema and we want things to be easy to explore/check.
moar models! dmarx/Multi-Modal-Comparators#2
Can we directly use "open_clip" models from the Hugging Face Hub? https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224
@usuyama Yes, that would be cool; it works fine already, e.g.:
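A minimal sketch of what this looks like with the `open_clip_torch` package, which accepts an `hf-hub:` prefix for Hub repos. The helper function is hypothetical, and the model-loading part is guarded under `__main__` since it downloads weights.

```python
# Sketch: load an open_clip model directly from the Hugging Face Hub
# using the "hf-hub:" model-name prefix (requires open_clip_torch).
HF_HUB_PREFIX = "hf-hub:"

def hub_model_name(repo_id: str) -> str:
    """Build the model name open_clip expects for a Hub repo (hypothetical helper)."""
    return HF_HUB_PREFIX + repo_id

if __name__ == "__main__":
    import open_clip  # pip install open_clip_torch

    name = hub_model_name("microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224")
    model, preprocess = open_clip.create_model_from_pretrained(name)
    tokenizer = open_clip.get_tokenizer(name)
```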
@mehdidc Great! I tried it and am seeing this error; any thoughts? Notebook: https://colab.research.google.com/drive/1UmHOIOGNtOTZeOcc47jL-cZep2iEFx39?usp=sharing
@usuyama Here you would also need to install VTAB's repo:
@usuyama As an alternative, you can use the WDS version, which is much faster to download/prepare:
https://github.com/rinnakk/japanese-clip
https://github.com/jaketae/koclip
https://github.com/ai-forever/ru-clip
https://github.com/FreddeFrallan/Multilingual-CLIP (we're about to release a much better version of this)