An open source implementation of CLIP.
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
A concise but complete implementation of CLIP with various experimental improvements from recent papers
A curated list of Visual Question Answering (VQA, including image/video question answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Build high-performance AI models with modular building blocks
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ support visual intelligence development!
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
[CVPR 2024] Official PyTorch Code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"
[ICCV 2023] Implicit Neural Representation for Cooperative Low-light Image Enhancement
[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
The official repository of Achelous and Achelous++
[ICML 2023] Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
Official PyTorch repository for CG-DETR, "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"
A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating Object Detection with Flexible Expressions" (NeurIPS 2023).
[CVPR 2024] Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
A Python tool to perform deep learning experiments on multimodal remote sensing data.
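Several of the entries above are CLIP implementations. As a minimal sketch of how such an implementation is typically used for zero-shot image-text matching, the snippet below assumes the open_clip package; the model name, pretrained tag, captions, and image path are placeholder assumptions, not taken from any listed repository.

```python
# Hypothetical zero-shot image-text matching sketch with open_clip
# (model name, pretrained tag, captions, and file path are placeholder assumptions).
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # placeholder image path
texts = tokenizer(["a diagram", "a dog", "a cat"])          # placeholder captions

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(texts)
    # Normalize embeddings so the dot product is a cosine similarity.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    # Scaled similarities turned into a probability distribution over the captions.
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)
```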