This repository contains the code for our paper Dual-Path Convolutional Image-Text Embedding. Thank you for your kind attention.
The complete code will be uploaded within two weeks. I am adding illustrations and comments to make the code easier to use. You can track my progress below.
- Get word2vec weight
- Data Preparation (Flickr30k)
- Train on Flickr30k
- Test on Flickr30k
- Data Preparation (MSCOCO)
- Train on MSCOCO
- Test on MSCOCO
- Data Preparation (CUHK-PEDES)
- Train on CUHK-PEDES
- Test on CUHK-PEDES
- Run the code on another machine
- Extract word2vec weights. Follow the instructions in ./word2vector_matlab.
- Prepare the dataset. Follow the instructions in ./dataset. You only need to prepare one dataset to run the code. The three datasets need different preprocessing; instructions are provided for Flickr30k, MSCOCO and CUHK-PEDES.
- Download the model pre-trained on ImageNet and put it into './data'.
(bash) wget http://www.vlfeat.org/matconvnet/models/imagenet-resnet-50-dag.mat
Alternatively, you may try VGG16 or VGG19.
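A minimal sketch of the download step, assuming `wget` is installed and the commands are run from the repository root. The `fetch_model` function and the `DATA_DIR` variable are illustrative names and are not part of this repository. The function creates the target directory and skips the download when the file is already there.

```shell
# URL taken from the download step above.
MODEL_URL="http://www.vlfeat.org/matconvnet/models/imagenet-resnet-50-dag.mat"
# Target directory; './data' is where the README asks for the model to be placed.
DATA_DIR="${DATA_DIR:-./data}"

# Hypothetical helper: download the pre-trained model unless it already exists.
fetch_model() {
    mkdir -p "$DATA_DIR"
    target="$DATA_DIR/$(basename "$MODEL_URL")"
    if [ -f "$target" ]; then
        echo "already present: $target"
    else
        # Download to a temporary name first so an interrupted
        # transfer is not mistaken for a complete model file.
        wget -O "$target.part" "$MODEL_URL" && mv "$target.part" "$target"
    fi
}
```

Calling `fetch_model` once is enough; a second call only reports that the file is already present.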
- For Flickr30k, run train_flickr_word2_1_pool.m for Stage I training. Run train_flickr_word_Rankloss_shift_hard for Stage II training.
- For MSCOCO, run train_coco_word2_1_pool.m for Stage I training. Run train_coco_Rankloss_shift_hard.m for Stage II training.
- For CUHK-PEDES, run train_cuhk_word2_1_pool.m for Stage I training. Run train_cuhk_word_Rankloss_shift for Stage II training.
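The two-stage training order above can be sketched as a small shell helper. This is only a sketch: it assumes the `matlab` executable is on your `PATH` and that the training scripts can be started from the directory that contains them. `stage_scripts` and `train_commands` are hypothetical helper names, not part of this repository. The functions only print the commands, so nothing is launched until you run them yourself.

```shell
# Hypothetical helper: map a dataset key to its Stage I and Stage II
# script names (taken from the training steps above, without '.m').
stage_scripts() {
    case "$1" in
        flickr) echo "train_flickr_word2_1_pool train_flickr_word_Rankloss_shift_hard" ;;
        coco)   echo "train_coco_word2_1_pool train_coco_Rankloss_shift_hard" ;;
        cuhk)   echo "train_cuhk_word2_1_pool train_cuhk_word_Rankloss_shift" ;;
        *)      echo "unknown dataset: $1" >&2; return 1 ;;
    esac
}

# Print one non-interactive MATLAB command per stage, Stage I first.
train_commands() {
    scripts="$(stage_scripts "$1")" || return 1
    for s in $scripts; do
        printf 'matlab -nodisplay -nosplash -r "%s; exit"\n' "$s"
    done
}
```

For example, `train_commands flickr` prints the Stage I command followed by the Stage II command. Run them in that order, and start Stage II only after Stage I has finished.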
Select one model and have fun!
- For Flickr30k, run test/extract_pic_feature_word2_plus_52.m to extract the features from the images and text. Note that you need to change the model path in the code.
- For MSCOCO, run test_coco/extract_pic_feature_word2_plus.m to extract the features from the images and text. Note that you need to change the model path in the code.
- For CUHK-PEDES, run test_cuhk/extract_pic_feature_word2_plus_52.m to extract the features from the images and text. Note that you need to change the model path in the code.
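To find the model path that needs changing, one option is to list the lines of an extraction script that mention a `.mat` file. This sketch assumes the model path appears in the script as a string ending in `.mat`, which may not hold for every script. `find_mat_refs` is a hypothetical helper name.

```shell
# Hypothetical helper: print every line (with its line number)
# of a MATLAB script that references a .mat file.
find_mat_refs() {
    grep -n '\.mat' "$1"
}
```

For example, `find_mat_refs test/extract_pic_feature_word2_plus_52.m` lists the candidate lines for the Flickr30k script. Edit the one that loads the trained model.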