model architectures and pretrained models to support #2

dmarx · 2022-04-09T05:10:46Z

installable

installable with extra effort

https://github.com/Sense-GVT/DeCLIP
- installs into a generic "prototype" package that could conflict with other packages using the same "spring" starter template
- dataclasses==0.8 in requirements.txt throws version conflict

Not installable

Not released

References for more variants:

https://paperswithcode.com/paper/learning-transferable-visual-models-from

Potentially in scope, lower priority

https://github.com/j-min/VL-T5
https://github.com/ashkamath/mdetr
https://github.com/sberbank-ai/ru-dolph
https://github.com/drboog/Lafite
https://github.com/pbaylies/Augmented_CLIP
ALIGN (is this even public? or just big?)
- https://ai.googleblog.com/2021/05/align-scaling-up-visual-and-vision.html
- https://arxiv.org/abs/2102.05918
ALBEF
ArtEmis - Affective language for Visual Art
- https://github.com/optas/artemis
https://github.com/CompVis/net2net
LiT (not sure if there are public pre-trained models) - https://arxiv.org/abs/2111.07991
- https://colab.research.google.com/github/google-research/vision_transformer/blob/main/lit.ipynb
NUWA (very unlikely they'll release a pretrained model)
- https://github.com/microsoft/NUWA
- https://github.com/lucidrains/nuwa-pytorch

Older stuff

DAMSM - https://github.com/taoxugit/AttnGAN
https://github.com/sidward14/Style-AttnGAN
https://github.com/luogen1996/MCN
ViLBERT - https://github.com/facebookresearch/vilbert-multi-task
VL-BERT - https://github.com/jackroos/VL-BERT

VQA is sort of a generalization of vision language co-training... TBD.

MAGMA could be another useful approach to promote multi-lingual support

https://github.com/Aleph-Alpha/magma

dmarx · 2022-04-17T08:20:49Z

https://github.com/navervision/KELIP

apolinario · 2022-04-19T08:09:57Z

LiT: they released some models last week https://github.com/google-research/vision_transformer#lit-models

Audio: besides AudioCLIP there's also wav2clip with a different approach: https://github.com/descriptinc/lyrebird-wav2clip

dmarx · 2022-04-19T18:11:25Z

https://github.com/allenai/reclip

dmarx · 2022-04-19T18:16:32Z

https://github.com/facebookresearch/Detic

dmarx · 2022-04-19T23:06:38Z

https://github.com/sallymmx/ActionCLIP

dmarx · 2022-04-19T23:07:21Z

https://github.com/ChenRocks/UNITER

dmarx · 2022-04-20T06:13:52Z

https://github.com/raoyongming/DenseCLIP

dmarx · 2022-04-20T06:17:04Z

https://github.com/ttlmh/Bridge-Prompt

apolinario · 2022-04-20T18:37:37Z

https://github.com/sonoisa/clip-japanese

dmarx · 2022-04-21T22:27:55Z

https://mmf.sh/docs/notes/projects
- older stuff, e.g. ViLBERT
https://github.com/facebookresearch/multimodal

dmarx · 2022-04-29T21:48:11Z

SLIP demo: install using napm and load weights using the old strategy from DD: alembics/disco-diffusion@c509aa1

!wget https://dl.fbaipublicfiles.com/slip/slip_base_100ep.pt
!pip install napm

import napm
url = 'https://github.com/facebookresearch/SLIP'
napm.pseudoinstall_git_repo(url, add_install_dir_to_path=True)

import torch
from SLIP.models import SLIP_VITB16

# Load the checkpoint on CPU; the weights live under the 'state_dict' key.
sd = torch.load('slip_base_100ep.pt', map_location=torch.device('cpu'))
real_sd = {}
for k, v in sd['state_dict'].items():
    new_key = '.'.join(k.split('.')[1:])  # strips "module" prefix. sure, why not.
    real_sd[new_key] = v
del sd

SLIPB16model = SLIP_VITB16(ssl_mlp_dim=4096, ssl_emb_dim=256)
SLIPB16model.load_state_dict(real_sd)
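
The key-renaming loop above drops the first dotted component of every key unconditionally. A slightly more defensive variant only strips the prefix when it is actually present. This is a minimal sketch on plain dicts: `strip_prefix` is a hypothetical helper name (not part of SLIP or PyTorch), and the `module.` prefix is presumably left over from a data-parallel wrapper used during training.

```python
def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }


# Toy stand-in for a checkpoint's 'state_dict' entry (real values would be tensors).
sd = {"module.visual.proj": 1, "module.logit_scale": 2, "already.clean": 3}
cleaned = strip_prefix(sd)
print(sorted(cleaned))
```

With a real checkpoint the call would be `strip_prefix(sd['state_dict'])`, and the result can be passed to `load_state_dict` as before.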

dmarx · 2022-04-29T22:21:40Z

CLOOB demo using napm

!pip install git+https://github.com/openai/CLIP
import napm

url = "https://github.com/crowsonkb/cloob-training"
napm.pseudoinstall_git_repo(url, package_name='cloob')

import cloob
from cloob.cloob_training import model_pt, pretrained

config = pretrained.get_config('cloob_laion_400m_vit_b_16_16_epochs')
model = model_pt.get_pt_model(config)
checkpoint = pretrained.download_checkpoint(config)
model.load_state_dict(model_pt.get_pt_params(config, checkpoint))
#model.eval().requires_grad_(False).to('cuda')

dmarx · 2022-05-05T17:53:26Z

CoCa https://arxiv.org/abs/2205.01917

dmarx · 2022-05-08T19:07:30Z

OTTER https://github.com/facebookresearch/OTTER

dmarx · 2022-05-09T07:22:24Z

ruDOLPH

dmarx · 2022-05-11T00:09:37Z

https://socraticmodels.github.io/

dmarx · 2022-05-14T01:48:52Z

https://github.com/yxuansu/MAGIC

apolinario · 2022-05-15T04:06:22Z

https://github.com/rinnakk/japanese-clip (not the same as this)

apolinario · 2022-05-15T11:09:19Z

This is a very large and seemingly very good CLIP in Chinese that @Dango233 has shown me: https://wukong-dataset.github.io/wukong-dataset/benchmark.html

One problem though: its pre-trained weights are for MindSpore (Huawei's counterpart to PyTorch), so someone would need to convert them...

Dango233 · 2022-05-16T08:55:17Z

https://github.com/mindspore-ai/models/tree/master/research/mm/wukong

dmarx · 2022-05-29T01:53:23Z

maybe just the fine tuned model?

https://github.com/j-min/clip-caption-reward

apolinario · 2022-06-02T23:47:36Z

A new (better, it seems) Multilingual CLIP https://github.com/FreddeFrallan/Multilingual-CLIP

rom1504 · 2022-06-02T23:52:22Z

@apolinario indeed and now it's packaged properly on pypi as multilingual-clip

it's also available for easy testing at https://rom1504.github.io/clip-retrieval/?useMclip=true&query=%E9%BB%84%E8%89%B2%E3%81%84%E7%8C%AB&back=https%3A%2F%2Fknn5.laion.ai&index=laion5B

dmarx · 2022-06-03T05:10:58Z

@rom1504 @apolinario the m-clip release gave me a thought: maybe we could host mmc on pypi with essentially none of the other perceptors installed at all. Simple instructions for "finalizing" the mmc install could live in the README (as well as one-liners for specific perceptors PRN), and we could add a warning on import too. maybe we could ship an update script or a CLI command.
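
One way the "warning on import" idea might look, as a rough sketch only: check which optional perceptor packages are importable and warn about the rest. The registry below is hypothetical and illustrative, not mmc's actual list, and the function names are made up for this example.

```python
import importlib.util
import warnings

# Hypothetical mapping of perceptor name -> top-level importable module.
# Illustrative entries only; not mmc's real registry.
OPTIONAL_PERCEPTORS = {
    "openai-clip": "clip",
    "open-clip": "open_clip",
    "multilingual-clip": "multilingual_clip",
}


def find_missing(perceptors=OPTIONAL_PERCEPTORS):
    """Return the names of perceptors whose backing module is not importable."""
    return sorted(
        name
        for name, module in perceptors.items()
        if importlib.util.find_spec(module) is None
    )


def warn_if_incomplete(perceptors=OPTIONAL_PERCEPTORS):
    """Emit a single warning listing perceptors that still need installing."""
    missing = find_missing(perceptors)
    if missing:
        warnings.warn(
            "installed without these perceptors: "
            + ", ".join(missing)
            + ". See the README for install instructions.",
            stacklevel=2,
        )
    return missing
```

Calling `warn_if_incomplete()` from the package's `__init__` would give the import-time warning; the same `find_missing()` result could drive an update script or CLI command.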

My thinking here is if we ship the core tooling as a bare library, then anyone could attach the mocking utilities upstream to quickly make new perceptors drop-in-able if they aren't already, which conversely would make them trivial to add to mmc (since they'd already be hooked into a conformant API one way or another).

Actually, it might be cleaner and simpler to isolate a simple mocking wrapper and package that for pypi?
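
A very rough sketch of what such a standalone wrapper could be, assuming "conformant API" means OpenAI-CLIP-style `encode_image` / `encode_text` methods (that is an assumption for this example). `ClipLikeWrapper` and `ToyPerceptor` are hypothetical names, not existing mmc classes.

```python
class ClipLikeWrapper:
    """Expose CLIP-style method names on top of an arbitrary perceptor object."""

    def __init__(self, model, image_method, text_method):
        self.model = model
        # Look up the perceptor's own embedding methods once, by name.
        self._image_fn = getattr(model, image_method)
        self._text_fn = getattr(model, text_method)

    def encode_image(self, image):
        return self._image_fn(image)

    def encode_text(self, text):
        return self._text_fn(text)


class ToyPerceptor:
    """Stand-in for an upstream model with non-CLIP method names."""

    def embed_img(self, image):
        return ("img", image)

    def embed_txt(self, text):
        return ("txt", text)


wrapped = ClipLikeWrapper(ToyPerceptor(), "embed_img", "embed_txt")
```

Upstream projects could then ship a one-line wrapper call, and downstream code would only ever touch `encode_image` and `encode_text`.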

I'm mostly just thinking out loud now. Thoughts?

apolinario · 2022-06-05T11:39:35Z

I like the idea and the spirit of it, and if MMC eventually gets too many perceptors, making some of them optional makes a lot of sense. Starting with everything optional, I'm not sure. Regardless, I think your idea holds; I'm just not sure whether we should ship empty or with some basics (e.g. OpenAI + OpenCLIP) and let users install more from there.

apolinario · 2022-06-05T11:39:38Z

(New perceptor: https://github.com/microsoft/UniCL)

apolinario · 2022-06-07T05:35:02Z

https://github.com/goel-shashank/CyCLIP

dmarx · 2022-06-10T23:54:05Z

https://github.com/facebookresearch/omnivore

dmarx · 2022-06-23T04:20:14Z

https://github.com/microsoft/GLIP

dmarx · 2022-06-25T18:18:08Z

https://github.com/microsoft/RegionCLIP

rom1504 · 2022-06-25T18:24:45Z

https://github.com/FacePerceiver/FaRL#use-farl-as-faceclip face clip

dmarx · 2022-06-30T20:06:01Z

https://github.com/Lednik7/CLIP-ONNX/tree/main/clip_onnx

dmarx · 2022-06-30T21:10:20Z

https://github.com/OFA-Sys/OFA

apolinario · 2022-07-04T20:35:24Z

Turkish CLIP https://github.com/yusufani/TrCLIP

dmarx · 2022-09-26T01:01:44Z

https://github.com/salesforce/LAVIS

dmarx · 2022-12-09T03:07:25Z

EVA-CLIP - https://github.com/baaivision/EVA/blob/master/clip/README.md

basically already api compliant

dmarx mentioned this issue May 13, 2022

support more clip models LAION-AI/CLIP_benchmark#1

Open

koke2c95 mentioned this issue Jun 8, 2022

Any benchmark for the latest released model? KichangKim/DeepDanbooru#61

Open
