A General Purpose Turkish CLIP Model (TrCLIP) for Image&Text Retrieval and its Application to E-Commerce
A CLIP model with Turkish support
View SPACES Demo · Paper (will be added) · Report Bug · Request Feature
In this paper, we introduce a Turkish adaptation of CLIP (Contrastive Language-Image Pre-Training). Our approach is to train a model that processes Turkish input while sharing the output space of the CLIP text encoder. For this, we collected 2.5M unique English-Turkish data pairs. The resulting model, TrCLIP, achieves zero-shot accuracies of 71% on CIFAR100, 86% on VOC2007, and 47% on FER2013. We also examine its performance on e-commerce data and on a large domain-independent dataset in image and text retrieval tasks. The model works in Turkish without any extra fine-tuning.
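The idea of sharing an output space can be illustrated with a toy training objective. The sketch below assumes a mean-squared-error loss between the Turkish model's embedding and the CLIP text encoder's embedding of the paired English sentence. The loss choice, the function names, and the 3-dimensional vectors are all assumptions made for illustration; the description above does not state which loss TrCLIP actually uses.

```python
# Toy sketch of embedding matching (an assumption, not the authors' training code).

def mse(student_vec, teacher_vec):
    """Mean squared error between two equal-length embedding vectors."""
    if len(student_vec) != len(teacher_vec):
        raise ValueError("embeddings must live in the same output space")
    squared = [(s - t) ** 2 for s, t in zip(student_vec, teacher_vec)]
    return sum(squared) / len(squared)


def batch_loss(student_batch, teacher_batch):
    """Average MSE over a batch of (Turkish, English) embedding pairs."""
    losses = [mse(s, t) for s, t in zip(student_batch, teacher_batch)]
    return sum(losses) / len(losses)


# Hypothetical embeddings for two sentence pairs.
teacher = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # CLIP text encoder, English input
student = [[0.5, 0.0, 0.0], [0.0, 1.0, 0.0]]  # Turkish model, translated input

loss = batch_loss(student, teacher)
print(f"batch loss: {loss:.4f}")
```

Driving a loss of this kind toward zero pulls each Turkish embedding toward the English one, which is what would let the unchanged CLIP image encoder be reused with Turkish text.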
See requirements.txt
! pip install trclip
The model is available on Hugging Face: https://huggingface.co/yusufani/trclip-vitl14-e10
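One possible way to fetch the model files is the `snapshot_download` function from the `huggingface_hub` package, sketched below. This assumes `huggingface_hub` is installed and that network access is available. The helper name `fetch_trclip_files` is hypothetical, and whether `Trclip` expects the returned directory or a specific file inside it as `model_path` is not stated here, so check the repository contents before use.

```python
REPO_ID = "yusufani/trclip-vitl14-e10"  # repository named in the link above


def fetch_trclip_files(repo_id=REPO_ID):
    """Download every file of the model repository and return the local folder."""
    # Imported lazily so that defining this helper does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


# Example (downloads the full repository on first call, then uses the cache):
# local_dir = fetch_trclip_files()
# print(local_dir)
```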
TrCaption dataset: https://drive.google.com/file/d/1-QrfiwPFvzhh8mWW4Bc0uKHRC8OWaRGa/view?usp=sharing
TrCaption pre-calculated features (you can of course create the embeddings from scratch, but it takes time):
TrCaption metadata: https://drive.google.com/file/d/1-LlI104fo3KgKHjnoYZpo51aqiH4dS8f/view?usp=sharing
TrCaption images (trclip-vitl14-e10): https://drive.google.com/file/d/1-JBSLX3OZ5aCSGJEUxJm680BZCuRweFj/view?usp=sharing
TrCaption texts (trclip-vitl14-e10): will be added
from PIL import Image

trclip = Trclip(model_path, clip_model='ViT-L/14', device='cpu')
images = [Image.open(i) for i in image_paths]
texts = ['kedi', 'köpek', 'at']
Mode image_retrieval: computes probabilities for each text. Use it to fetch images for a given text.
Mode text_retrieval: computes probabilities for each image. Use it to fetch texts for a given image.
per_mode_indices, per_mode_probs = trclip.get_results(texts=texts, images=images, mode='image_retrieval')
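The difference between the two modes can be sketched with toy embeddings. The code below is not the library's implementation: `cosine`, `softmax`, `retrieve`, the temperature value, and the 2-dimensional vectors are all hypothetical. It only illustrates the assumption that each mode ranks one modality against the other by similarity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores, scale=100.0):
    """Softmax with a hypothetical temperature; shifted by the max for stability."""
    top = max(scores)
    exps = [math.exp(scale * (s - top)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieve(text_vecs, image_vecs, mode):
    """Rank candidates for every query; the mode decides which side is the query."""
    if mode == "image_retrieval":
        queries, candidates = text_vecs, image_vecs  # one ranking per text
    elif mode == "text_retrieval":
        queries, candidates = image_vecs, text_vecs  # one ranking per image
    else:
        raise ValueError(f"unknown mode: {mode}")
    all_indices, all_probs = [], []
    for query in queries:
        probs = softmax([cosine(query, c) for c in candidates])
        order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        all_indices.append(order)
        all_probs.append([probs[i] for i in order])
    return all_indices, all_probs


# Hypothetical embeddings: two texts and three images.
toy_texts = [[1.0, 0.0], [0.0, 1.0]]
toy_images = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.6]]

img_idx, img_probs = retrieve(toy_texts, toy_images, "image_retrieval")
txt_idx, txt_probs = retrieve(toy_texts, toy_images, "text_retrieval")
print(img_idx)
print(txt_idx)
```

In this sketch, image retrieval yields one ranking of images per text, while text retrieval yields one ranking of texts per image.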
Alternatively, compute the features once and pass them in directly:
image_features = trclip.get_image_features(images)
text_features = trclip.get_text_features(texts)
per_mode_indices, per_mode_probs = trclip.get_results(text_features=text_features, image_features=image_features, mode='image_retrieval')
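Since computing features takes time, it can help to store them on disk and reload them on later runs. The following is a minimal sketch using Python's standard `pickle` module. The helper `load_or_compute` is hypothetical and not part of trclip, and it assumes the objects returned by `get_image_features` and `get_text_features` can be pickled. A stand-in function replaces the real feature extraction so the example runs on its own.

```python
import os
import pickle
import tempfile


def load_or_compute(cache_path, compute_fn):
    """Return cached features if the file exists, else compute and store them."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    features = compute_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(features, f)
    return features


calls = []


def fake_compute():
    """Stand-in for an expensive call such as trclip.get_image_features(images)."""
    calls.append(1)
    return [[0.1, 0.2], [0.3, 0.4]]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "image_features.pkl")
    first = load_or_compute(path, fake_compute)   # computes and writes the cache
    second = load_or_compute(path, fake_compute)  # reads the cache, no recompute

print(len(calls))
```

With the real model, the second argument could be something like `lambda: trclip.get_image_features(images)`, under the pickling assumption stated above.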
from trclip.visualizer import image_retrieval_visualize

image_retrieval_visualize(per_mode_indices, per_mode_probs, texts, image_paths,
                          n_figure_in_column=2,
                          n_images_in_figure=4,
                          n_figure_in_row=1,
                          save_fig=False,
                          show=False,
                          break_on_index=-1)
from trclip.visualizer import text_retrieval_visualize

text_retrieval_visualize(per_mode_indices, per_mode_probs, image_paths, texts,
                         n_figure_in_column=4,
                         n_texts_in_figure=4 if len(texts) > 4 else len(texts),
                         n_figure_in_row=2,
                         save_fig=False,
                         show=False,
                         break_on_index=-1)
Distributed under the MIT License. See LICENSE.txt for more information.
Contact: yusufani8@gmail.com
Project Link: https://github.com/yusufani/TrCLIP