support more clip models #1

Open
rom1504 opened this issue May 12, 2022 · 18 comments

@rom1504
Contributor

rom1504 commented May 12, 2022

https://github.com/rinnakk/japanese-clip

https://github.com/jaketae/koclip

https://github.com/ai-forever/ru-clip

https://github.com/FreddeFrallan/Multilingual-CLIP (we're about to release a much better version of this)

@rom1504
Contributor Author

rom1504 commented May 12, 2022

I think it could be valuable

@rom1504
Contributor Author

rom1504 commented May 12, 2022

I'm wondering where a good place to support all clip models would be.

Do you think an all-clip package could be good? I can build it from https://github.com/rom1504/clip-retrieval/blob/main/clip_retrieval/clip_inference/mapper.py and https://github.com/rom1504/clip-retrieval/blob/main/clip_retrieval/clip_inference/load_clip.py

@rom1504
Contributor Author

rom1504 commented May 13, 2022

I was pointed to https://github.com/dmarx/Multi-Modal-Comparators ; looking into it

@rom1504
Contributor Author

rom1504 commented May 13, 2022

Opened ai-forever/ru-clip#20 and rinnakk/japanese-clip#2, as it was pointed out to me that translating class names is not trivial.

@rom1504
Contributor Author

rom1504 commented May 13, 2022

can we use http://compling.hss.ntu.edu.sg/omw/ to automatically get class names for imagenet in all languages?
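
For illustration, a rough sketch of how this could look with NLTK's Open Multilingual Wordnet interface (a hedged sketch: it assumes the wordnet and omw-1.4 corpora are downloaded, coverage varies by language, and it relies on ImageNet class ids being WordNet noun synset offsets like n01440764):

import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet")
nltk.download("omw-1.4")

# ImageNet class ids are WordNet noun synset offsets, e.g. n01440764 = tench
synset = wn.synset_from_pos_and_offset("n", 1440764)
print(synset.lemma_names("eng"))  # ['tench', 'Tinca_tinca']
print(synset.lemma_names("jpn"))  # Japanese lemmas, where OMW has coverage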

@rom1504
Contributor Author

rom1504 commented May 13, 2022

for other datasets, maybe https://www.wikidata.org/wiki/Wikidata:Main_Page could work
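
Similarly, a rough sketch of pulling multilingual labels from the Wikidata SPARQL endpoint (the item Q146, "house cat", is just an example here; mapping dataset classnames to Wikidata items would be the hard part):

import requests

# fetch every language label for one Wikidata item (Q146 = house cat)
query = """
SELECT ?lang ?label WHERE {
  wd:Q146 rdfs:label ?label .
  BIND(LANG(?label) AS ?lang)
}
"""
r = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
    headers={"User-Agent": "classname-lookup-sketch/0.1"},
)
for row in r.json()["results"]["bindings"]:
    print(row["lang"]["value"], row["label"]["value"])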

@rom1504
Contributor Author

rom1504 commented May 13, 2022

opened dmarx/Multi-Modal-Comparators#26

will wait a little

@mehdidc
Collaborator

mehdidc commented May 13, 2022

> can we use http://compling.hss.ntu.edu.sg/omw/ to automatically get class names for imagenet in all languages?

That looks very good. I suspected that we would need to change the data structure of classnames/templates from plain fixed dicts in the code to something a bit more advanced because of these kinds of use cases, so maybe we could think about the data structure now.

We would have multiple languages and multiple datasets. For each (language, dataset) pair there can be multiple sets of templates (e.g., LiT experimented with different sets of templates and showed their effect on zero-shot accuracy) and possibly multiple sets of classnames. So for each evaluation run we need to select the dataset, the language, which version of the templates to use, and which version of the classnames to use, and we should have a default/recommended value for both templates and classnames.

We will also need to populate this database as we add more datasets and languages, sometimes programmatically, e.g. from the multilingual wordnet link you sent. I don't know if a relational database (e.g. SQLite) is needed; that is possibly too much, and perhaps just a JSON file or a CSV would suffice. Any thoughts on this @rom1504?

@rom1504
Contributor Author

rom1504 commented May 13, 2022

I think a JSON file is probably best for small data like this, where we can have a clear schema and we want things to be easy to explore/check.
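
To make that concrete, a minimal sketch of the shape such a JSON file could take, written out as a Python dict (all names here are hypothetical, purely to illustrate the (dataset, language) -> templates/classnames structure discussed above):

# hypothetical schema: per dataset and language, named sets of templates
# and classnames, plus a recommended default for each
zeroshot_metadata = {
    "imagenet1k": {
        "en": {
            "templates": {
                "openai": ["a photo of a {c}.", "a blurry photo of a {c}."],
                "lit": ["{c}", "a photo of {c}."],
            },
            "default_templates": "openai",
            "classnames": {
                "openai": ["tench", "goldfish"],  # ... and so on
            },
            "default_classnames": "openai",
        },
        "ja": {},  # to be populated, e.g. from the multilingual wordnet
    },
}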

@dmarx

dmarx commented May 13, 2022

moar models! dmarx/Multi-Modal-Comparators#2

@rom1504
Contributor Author

rom1504 commented May 16, 2022

@rom1504
Contributor Author

rom1504 commented Nov 18, 2022

@mkshing
Contributor

mkshing commented Dec 9, 2022

@rom1504 @mehdidc Hi, thank you for considering this. For Japanese CLIP, I sent a PR to support our models: #50. I would be happy to have it reviewed. Thanks again.

@usuyama

usuyama commented Apr 26, 2023

can we directly use "open_clip" models from the Hugging Face Hub? https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224

import open_clip

model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms('hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224')
tokenizer = open_clip.get_tokenizer('hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224')
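
If that works, the loaded model should be usable as usual, e.g. (a minimal sketch; the image path and prompts are made up):

import torch
from PIL import Image

image = preprocess_val(Image.open("example.png")).unsqueeze(0)
text = tokenizer(["a photo of a chest X-ray", "a photo of a dog"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # cosine similarities -> zero-shot probabilities
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)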

@mehdidc
Collaborator

mehdidc commented Apr 27, 2023

@usuyama Yes, that would be cool, and it works fine already, e.g.:
clip_benchmark eval --dataset=vtab/pcam --model="hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224" --pretrained=""

@usuyama

usuyama commented Apr 27, 2023

@mehdidc Great!

Tried it and I'm seeing this error - any thoughts?

[screenshot of the error]

notebook: https://colab.research.google.com/drive/1UmHOIOGNtOTZeOcc47jL-cZep2iEFx39?usp=sharing

@mehdidc
Collaborator

mehdidc commented Apr 27, 2023

@usuyama Here you would also need to install VTAB's repo: pip install task_adaptation==0.1

@mehdidc
Collaborator

mehdidc commented Apr 27, 2023

@usuyama As an alternative, you can use the WDS version, which is much faster to download/prepare:
clip_benchmark eval --dataset=wds/vtab/pcam --dataset_root "https://huggingface.co/datasets/clip-benchmark/wds_vtab-pcam/tree/main" --model="hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224" --pretrained=""
