model architectures and pretrained models to support #2
Comments
LiT: they released some models last week: https://github.com/google-research/vision_transformer#lit-models
Audio: besides AudioCLIP there's also wav2clip, which takes a different approach: https://github.com/descriptinc/lyrebird-wav2clip |
SLIP demo: install using dapm and load weights using the old strategy from DD (Disco Diffusion): alembics/disco-diffusion@c509aa1
|
CLOOB demo using dapm
|
https://github.com/rinnakk/japanese-clip (not the same as this) |
This is a very large and seemingly very good Chinese CLIP that @Dango233 has shown me: https://wukong-dataset.github.io/wukong-dataset/benchmark.html One problem though: its pre-trained weights are in MindSpore (Huawei's counterpart to PyTorch), so someone would need to convert them... |
maybe just the fine-tuned model? |
A new (better, it seems) Multilingual CLIP https://github.com/FreddeFrallan/Multilingual-CLIP |
@apolinario indeed, and now it's packaged properly on PyPI as multilingual-clip. It's also available for easy testing at https://rom1504.github.io/clip-retrieval/?useMclip=true&query=%E9%BB%84%E8%89%B2%E3%81%84%E7%8C%AB&back=https%3A%2F%2Fknn5.laion.ai&index=laion5B |
@rom1504 @apolinario the m-clip release gave me a thought: maybe we could host mmc on PyPI with essentially none of the other perceptors installed at all. Simple instructions for "finalizing" the mmc install could live in the README (along with one-liners for specific perceptors as needed), and we could add a warning on import too. Maybe we could also ship an update script or a CLI command. My thinking is that if we ship the core tooling as a bare library, anyone could attach the mocking utilities upstream to quickly make new perceptors drop-in-able if they aren't already, which in turn would make them trivial to add to mmc (since they'd already be hooked into a conformant API one way or another). Actually, it might be cleaner and simpler to isolate a simple mocking wrapper and package that for PyPI? I'm mostly just thinking out loud now. Thoughts? |
I like the idea and its spirit, and I feel that eventually, if MMC gets way too many perceptors, making some of them optional makes a lot of sense. Starting with everything optional, I'm not so sure. Regardless, I think your idea holds; I'm just not sure whether we ship empty or with some basics (e.g. OpenAI + OpenCLIP) and let users install more from there.
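To make the "bare library plus a warning on import" idea concrete, here is a minimal sketch of what such a check might look like. It is not mmc's actual code: the registry, the perceptor ids, and the function names are all hypothetical, and the module names are only illustrative of what the respective packages expose on import.

```python
import importlib.util
import warnings

# Hypothetical registry: perceptor id -> top-level module it depends on.
# These ids and the layout are assumptions for illustration, not mmc's API.
OPTIONAL_PERCEPTORS = {
    "openai-clip": "clip",
    "open-clip": "open_clip",
    "multilingual-clip": "multilingual_clip",
}


def missing_perceptors(registry=OPTIONAL_PERCEPTORS):
    """Return the sorted ids whose backing module cannot be found."""
    return sorted(
        name
        for name, module in registry.items()
        if importlib.util.find_spec(module) is None
    )


def warn_if_incomplete(registry=OPTIONAL_PERCEPTORS):
    """Emit a single warning listing perceptors that still need installing."""
    missing = missing_perceptors(registry)
    if missing:
        warnings.warn(
            "Perceptors not installed: "
            + ", ".join(missing)
            + ". See the README to finalize the install."
        )
    return missing
```

A package could call something like `warn_if_incomplete()` from its `__init__.py`, so a bare install still imports cleanly and simply tells the user what is left to set up. Shipping "some basics" by default would then just mean listing those packages as hard dependencies and keeping the rest in the optional registry.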
(New perceptor: https://github.com/microsoft/UniCL) |
… On Sat, Jun 25, 2022, 20:18 David Marx ***@***.***> wrote:
https://github.com/microsoft/RegionCLIP
|
Turkish CLIP https://github.com/yusufani/TrCLIP |
EVA-CLIP: https://github.com/baaivision/EVA/blob/master/clip/README.md (basically already API-compliant) |
installable
installable with extra effort
Not installable
Not released
CLIP
CLOOB
SLIP
CLIP-JAX
AudioCLIP
CLIPfa (farsi) - https://github.com/sajjjadayobi/CLIPfa
CLIP pretrained on FOOD101 by PASSL? - https://github.com/PaddlePaddle/PASSL/blob/main/docs/Train_CLIP_model.md
SBERT Multilingual CLIP - https://www.sbert.net/docs/pretrained_models.html#image-text-models
References for more variants:
https://paperswithcode.com/paper/learning-transferable-visual-models-from
Potentially in scope, lower priority
Older stuff
VQA is sort of a generalization of vision language co-training... TBD.
MAGMA could be another useful approach to promote multi-lingual support
https://github.com/Aleph-Alpha/magma