Performance Comparison of Cross-modal Retrieval

Catalogue

Performance of Commonly-used Datasets

Performance of Flickr8K

(* indicates ensemble models, ^ indicates questionable authenticity)

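| Method_name | Concise_note | Sentence retrieval R@1 | Sentence retrieval R@5 | Sentence retrieval R@10 | Image retrieval R@1 | Image retrieval R@5 | Image retrieval R@10 |
|---|---|---|---|---|---|---|---|
| DeViSE | RCNN | 4.8 | 16.5 | 27.3 | 5.9 | 20.1 | 29.6 |
| SDT-RNN | AlexNet | 4.5 | 18.0 | 28.6 | 6.1 | 18.5 | 29.0 |
| SDT-RNN | RCNN | 6.0 | 22.7 | 34.0 | 6.6 | 21.6 | 31.7 |
| DeFrag | AlexNet | 5.9 | 19.2 | 27.3 | 5.2 | 17.6 | 26.5 |
| DeFrag | RCNN | 12.6 | 32.9 | 44.0 | 9.7 | 29.6 | 42.5 |
| m-RNN | AlexNet | 14.5 | 37.2 | 48.5 | 11.5 | 31.0 | 42.4 |
| DVSA | DepTree | 14.8 | 37.9 | 50.0 | 11.6 | 31.4 | 43.8 |
| DVSA | RCNN | 16.5 | 40.6 | 54.2 | 11.8 | 32.1 | 44.7 |
| UVSE | AlexNet | 13.5 | 36.2 | 45.7 | 10.4 | 31.0 | 43.7 |
| UVSE | VggNet | 18.0 | 40.9 | 55.0 | 12.5 | 37.0 | 51.5 |
| NIC | GoogleNet | 20 | -- | 61 | 19 | -- | 64 |
| m-CNN* | OverFeat | 14.9 | 35.9 | 49.0 | 11.8 | 34.5 | 48.0 |
| m-CNN* | VggNet | 24.8 | 53.7 | 67.1 | 20.3 | 47.6 | 61.7 |
| HM-LSTM | RCNN | 27.7 | -- | 68.6 | 24.4 | -- | 68.1 |
| SPE | VggNet | 30.1 | 60.4 | 73.7 | 23.0 | 51.3 | 64.8 |
| FV | GMM+HGLMM | 31.0 | 59.3 | 73.7 | 21.2 | 50.0 | 64.8 |
| MFM | VggNet | 35.6 | 67.0 | 78.6 | 28.4 | 58.5 | 72.3 |
| NAA | ResNet | 37.2 | 68.1 | 79.1 | 27.7 | 59.6 | 71.8 |
| ITMeetsAL | MobileNet | 30.9 | 58.6 | 70.8 | -- | -- | -- |
| ITMeetsAL | ResNet | 40.1 | 67.8 | 79.2 | -- | -- | -- |
| 2WayNet | VggNet | 43.4 | 63.2 | -- | 29.3 | 49.7 | -- |
| SCAN* | BUTD | 52.2 | 81.0 | 89.2 | 38.3 | 67.8 | 78.9 |
| IMRAM | BUTD, Image | 48.5 | 78.1 | 85.3 | 32.0 | 61.4 | 73.9 |
| IMRAM | BUTD, Text | 52.1 | 81.5 | 90.1 | 40.2 | 69.0 | 79.2 |
| IMRAM | BUTD, Full | 54.7 | 84.2 | 91.0 | 41.0 | 69.2 | 79.9 |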
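
The R@K columns in these tables report Recall@K: the percentage of queries for which at least one correct item appears among the top K retrieved results. The sketch below shows one way such a number could be computed from a query-by-candidate similarity matrix. The function names `first_match_ranks` and `recall_at_k` and the toy scores are hypothetical and are not taken from any of the listed methods; individual papers may differ in evaluation details such as fold averaging.

```python
def first_match_ranks(similarity, ground_truth):
    """For each query, return the 0-based rank of its best-placed correct candidate.

    similarity:   list of score lists, one row per query (higher = more similar).
    ground_truth: list of sets holding the correct candidate indices per query.
    Both names are illustrative assumptions, not part of any benchmark toolkit.
    """
    ranks = []
    for scores, positives in zip(similarity, ground_truth):
        # Candidate indices sorted from most to least similar.
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        ranks.append(min(order.index(j) for j in positives))
    return ranks


def recall_at_k(ranks, k):
    """Percentage of queries whose first correct match lies in the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    hits = sum(1 for r in ranks if r < k)
    return 100.0 * hits / len(ranks)


# Toy example (made-up scores): 4 queries, 3 candidates each.
similarity = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.1],
    [0.3, 0.2, 0.7],
]
ground_truth = [{0}, {0}, {0}, {2}]

ranks = first_match_ranks(similarity, ground_truth)
print(ranks)                   # [0, 2, 1, 0]
print(recall_at_k(ranks, 1))   # 50.0
print(recall_at_k(ranks, 2))   # 75.0
```

For sentence retrieval each image query typically has several correct captions, which is why the ground truth is modeled as a set and the best-placed match is used.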

Performance of Flickr30K

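| Method_name | Concise_note | Sentence retrieval R@1 | Sentence retrieval R@5 | Sentence retrieval R@10 | Image retrieval R@1 | Image retrieval R@5 | Image retrieval R@10 |
|---|---|---|---|---|---|---|---|
| DeViSE | RCNN | 4.5 | 18.1 | 29.2 | 6.7 | 21.9 | 32.7 |
| SDT-RNN | RCNN | 9.6 | 29.8 | 41.1 | 8.9 | 29.8 | 41.1 |
| DeFrag | RCNN | 14.2 | 37.7 | 51.3 | 10.2 | 30.8 | 44.2 |
| DeFrag | ftRCNN | 16.4 | 40.2 | 54.7 | 10.3 | 31.4 | 44.5 |
| DCCA | AlexNet | 16.7 | 39.3 | 52.9 | 12.6 | 31.0 | 43.0 |
| NIC | GoogleNet | 17 | -- | 56 | 17 | -- | 57 |
| DVSA | DepTree | 20.0 | 46.6 | 59.4 | 15.0 | 36.5 | 48.2 |
| DVSA | RCNN | 22.2 | 48.2 | 61.4 | 15.2 | 37.7 | 50.5 |
| UVSE | AlexNet | 14.8 | 39.2 | 50.9 | 11.8 | 34.0 | 46.3 |
| UVSE | VggNet | 23.0 | 50.7 | 62.9 | 16.8 | 42.0 | 56.5 |
| LRCN | VggNet | 23.6 | 46.6 | 58.3 | 17.5 | 40.3 | 50.8 |
| m-CNN* | OverFeat | 20.1 | 44.2 | 56.3 | 15.9 | 40.3 | 51.9 |
| m-CNN* | VggNet | 33.6 | 64.1 | 74.9 | 26.2 | 56.3 | 69.6 |
| m-RNN | AlexNet | 18.4 | 40.2 | 50.9 | 12.6 | 31.2 | 41.5 |
| m-RNN | VggNet | 35.4 | 63.8 | 73.7 | 22.8 | 50.7 | 63.1 |
| FV | GMM+HGLMM | 35.0 | 62.0 | 73.8 | 25.0 | 52.7 | 66.0 |
| HM-LSTM | RCNN | 38.1 | -- | 76.5 | 27.7 | -- | 68.8 |
| SPE | VggNet | 40.3 | 68.9 | 79.9 | 29.7 | 60.1 | 72.1 |
| sm-LSTM | VggNet | 42.4 | 67.5 | 79.9 | 28.2 | 57.0 | 68.4 |
| sm-LSTM* | VggNet | 42.5 | 71.9 | 81.5 | 30.2 | 60.4 | 72.3 |
| CSE | ResNet | 44.6 | 74.3 | 83.8 | 36.9 | 69.1 | 79.6 |
| MDM | VggNet | 44.9 | 75.4 | 84.4 | 34.4 | 67.0 | 77.7 |
| RRF-Net | ResNet | 47.6 | 77.4 | 87.1 | 35.4 | 68.3 | 79.9 |
| CMPL | MobileNet | 40.3 | 66.9 | 76.7 | 30.4 | 58.2 | 68.5 |
| CMPL | ResNet | 49.6 | 76.8 | 86.1 | 37.3 | 65.7 | 75.5 |
| 2WayNet | VggNet | 49.8 | 67.5 | -- | 36.0 | 55.6 | -- |
| MFM | VggNet | 50.2 | 78.1 | 86.7 | 38.2 | 70.1 | 80.2 |
| VSE++ | VggNet | 41.3 | 69.1 | 77.9 | 31.4 | 60.0 | 71.2 |
| VSE++ | ResNet | 52.9 | 80.5 | 87.2 | 39.6 | 70.1 | 79.5 |
| TIMAM | ResNet, Bert | 53.1 | 78.8 | 87.6 | 42.6 | 71.6 | 81.9 |
| TERN | BUTD, Bert | 53.2 | 79.4 | 86.0 | 41.1 | 71.9 | 81.2 |
| DAN | VggNet | 41.4 | 73.5 | 82.5 | 31.8 | 61.7 | 72.5 |
| DAN | ResNet | 55.0 | 81.8 | 89.0 | 39.4 | 69.2 | 79.1 |
| NAA | ResNet | 55.1 | 80.3 | 89.6 | 39.4 | 68.8 | 79.9 |
| SCO | VggNet | 44.2 | 74.1 | 83.6 | 32.8 | 64.3 | 74.9 |
| SCO | ResNet | 55.5 | 82.0 | 89.3 | 41.1 | 70.5 | 80.1 |
| Dual-Path | VggNet | 47.6 | 77.3 | 87.1 | 35.3 | 66.6 | 78.2 |
| Dual-Path | ResNet | 55.6 | 81.9 | 89.5 | 39.1 | 69.2 | 80.9 |
| ITMeetsAL | VggNet | 38.5 | 66.5 | 76.3 | 30.7 | 59.4 | 70.3 |
| ITMeetsAL | MobileNet | 46.6 | 73.5 | 82.5 | 34.4 | 63.3 | 74.2 |
| ITMeetsAL | ResNet | 56.5 | 82.2 | 89.6 | 43.5 | 71.8 | 80.2 |
| CVSE++ | ResNet | 56.6 | 82.5 | 90.2 | 42.4 | 71.6 | 80.8 |
| GXN | ResNet | 56.8 | -- | 89.6 | 41.5 | -- | 80.1 |
| SMAN | ResNet, Random | 56.9 | 84.8 | 91.9 | 43.2 | 73.3 | 83.5 |
| SMAN | ResNet, Glove | 57.3 | 85.3 | 92.2 | 43.4 | 73.7 | 83.4 |
| M3A | ResNet | 58.1 | 82.8 | 90.1 | 44.7 | 72.4 | 81.1 |
| Align2Ground | BUTD | -- | -- | -- | 49.7 | 74.8 | 83.3 |
| A3VSE | BUTD | 65.0 | 89.2 | 94.5 | 49.5 | 79.5 | 86.6 |
| DXR | ResNet, Bert | 65.1 | 87.3 | 92.6 | 50.6 | 78.8 | 86.7 |
| MTFN | BUTD | 63.1 | 85.8 | 92.4 | 46.3 | 75.3 | 83.6 |
| MTFN | BUTD, RR_no_STT | 65.3 | 88.3 | 93.3 | 46.7 | 75.9 | 83.8 |
| MTFN | BUTD, RR_STT | 65.3 | 88.3 | 93.3 | 52.0 | 80.1 | 86.1 |
| R-SCAN | BUTD, VrR-VG | 66.3 | 90.6 | 96.0 | 51.4 | 77.8 | 84.9 |
| SAVE | ResNet | 67.2 | 88.3 | 94.2 | 49.8 | 78.7 | 86.2 |
| SCAN | BUTD, T2I_AVE | 61.8 | 87.5 | 93.7 | 45.8 | 74.4 | 83.0 |
| SCAN | BUTD, I2T_AVE | 67.9 | 89.0 | 94.4 | 43.9 | 74.2 | 82.8 |
| SCAN* | BUTD, AVE+LSE | 67.4 | 90.3 | 95.8 | 48.6 | 77.7 | 85.2 |
| BFAN | BUTD, prob | 65.5 | 89.4 | -- | 47.9 | 77.6 | -- |
| BFAN | BUTD, equal | 64.5 | 89.7 | -- | 48.8 | 77.3 | -- |
| BFAN* | BUTD | 68.1 | 91.4 | -- | 50.8 | 78.4 | -- |
| CAMP | BUTD | 68.1 | 89.7 | 95.2 | 51.5 | 77.1 | 85.3 |
| RDAN | BUTD | 68.1 | 91.0 | 95.9 | 54.1 | 80.9 | 87.2 |
| GSLS | ResNet, BUTD | 68.2 | 89.1 | 94.5 | 43.4 | 73.5 | 82.5 |
| Personality | ResNeXt, Transformer | 68.4 | 90.6 | 95.3 | -- | -- | -- |
| CASC | ResNet | 68.5 | 90.6 | 95.9 | 50.2 | 78.3 | 86.3 |
| GVSE* | BUTD | 68.5 | 90.9 | 95.5 | 50.6 | 79.8 | 87.6 |
| HAL | SCAN_I2T | 68.6 | 89.9 | 94.7 | 46.0 | 74.0 | 82.3 |
| OAN | BUTD | 68.6 | 93.0 | 96.0 | 53.3 | 80.1 | 87.1 |
| SAEM | BUTD, Bert | 69.1 | 91.0 | 95.1 | 52.4 | 81.1 | 88.1 |
| MPL | SCAN_I2T | 69.4 | 89.9 | 95.4 | 47.5 | 75.5 | 83.1 |
| LIWE | BUTD, CLMR | 64.0 | 88.3 | 93.3 | 46.8 | 76.4 | 84.5 |
| LIWE | BUTD, -Glove | 66.4 | 88.9 | 94.1 | 47.5 | 76.2 | 84.9 |
| LIWE | BUTD, +Glove | 69.6 | 90.3 | 95.6 | 51.2 | 80.4 | 87.2 |
| PFAN | BUTD, T2I | 66.0 | 89.6 | 94.3 | 49.6 | 77.0 | 84.2 |
| PFAN | BUTD, I2T | 67.6 | 90.0 | 93.8 | 45.7 | 74.7 | 83.6 |
| PFAN* | BUTD | 70.0 | 91.8 | 95.0 | 50.4 | 78.7 | 86.1 |
| PFAN++* | BUTD | 70.1 | 91.8 | 96.1 | 52.7 | 79.9 | 87.0 |
| CAAN | BUTD | 70.1 | 91.6 | 97.2 | 52.8 | 79.0 | 87.9 |
| DP-RNN | BUTD | 70.2 | 91.6 | 95.8 | 55.5 | 81.3 | 88.2 |
| TERAN | BUTD, Bert | 70.8 | 90.9 | 95.5 | 56.5 | 81.2 | 88.2 |
| HOAD | BUTD | 70.8 | 92.7 | 96.0 | 59.5 | 85.6 | 91.0 |
| HOAD | BUTD, +Dist | 70.8 | 92.7 | 96.0 | 60.9 | 86.1 | 91.0 |
| GOT | SCAN_I2T | 70.9 | 92.8 | 95.5 | 50.7 | 78.7 | 86.2 |
| LGSGM | BUTD | 71.0 | 91.9 | 96.1 | 57.4 | 84.1 | 90.2 |
| VSRN | BUTD | 70.4 | 89.2 | 93.7 | 53.0 | 77.9 | 85.7 |
| VSRN* | BUTD | 71.3 | 90.6 | 96.0 | 54.7 | 81.8 | 88.2 |
| SCG | VggNet, Prod | 57.2 | 85.1 | 92.1 | 40.1 | 69.5 | 79.5 |
| SCG | VggNet, Gated | 71.8 | 90.8 | 94.8 | 49.3 | 76.4 | 85.6 |
| SGM | BUTD | 71.8 | 91.7 | 95.5 | 53.5 | 79.6 | 86.5 |
| Meta-SPN* | BFAN* | 72.5 | 93.2 | 96.7 | 53.3 | 80.2 | 87.2 |
| CSCC | BUTD, +GloVe | 72.7 | 93.4 | 96.5 | 61.2 | 86.7 | 91.5 |
| ADDR* | BUTD, BFAN | 71.3 | 91.5 | 96.4 | 54.0 | 80.0 | 87.6 |
| ADDR* | BUTD, SCAN | 72.1 | 93.1 | 96.1 | 53.5 | 80.4 | 87.4 |
| ADDR* | BUTD, VSRN | 73.0 | 92.5 | 96.6 | 55.6 | 82.0 | 88.9 |
| AOQ* | BUTD, SCAN | 70.3 | 92.0 | 95.5 | 50.0 | 79.2 | 86.2 |
| AOQ* | BUTD, VSRN | 72.8 | 91.8 | 95.8 | 55.3 | 82.2 | 88.4 |
| AOQ* | BUTD, BFAN | 73.2 | 94.5 | 97.0 | 54.0 | 80.3 | 87.7 |
| CVSE | BUTD | 73.5 | 92.1 | 95.8 | 52.9 | 80.4 | 87.8 |
| SMFEA | BUTD | 73.7 | 92.5 | 96.1 | 54.7 | 82.1 | 88.4 |
| IMRAM | BUTD, Image | 67.0 | 90.5 | 95.6 | 51.2 | 78.2 | 85.5 |
| IMRAM | BUTD, Text | 68.8 | 91.6 | 96.0 | 53.0 | 79.0 | 87.1 |
| IMRAM | BUTD, Full | 74.1 | 93.0 | 96.6 | 53.9 | 79.4 | 87.2 |
| HAN | BUTD | 74.1 | 92.4 | 96.4 | 54.8 | 81.1 | 87.4 |
| MMCA | BUTD, Bert | 74.2 | 92.8 | 96.4 | 54.8 | 81.4 | 87.8 |
| SHAN | BUTD, T2I | 72.5 | 92.3 | 95.8 | 53.6 | 78.6 | 85.5 |
| SHAN | BUTD, I2T | 70.6 | 91.7 | 95.5 | 50.5 | 77.1 | 85.2 |
| SHAN | BUTD, Full | 74.6 | 93.5 | 96.9 | 55.3 | 81.3 | 88.4 |
| WCGL | BUTD | 74.8 | 93.3 | 96.8 | 54.8 | 80.6 | 87.5 |
| CCRS* | BUTD, SCAN | 70.1 | 92.0 | 96.0 | 52.3 | 79.9 | 86.8 |
| CCRS* | BUTD, BFAN | 75.3 | 93.6 | 96.7 | 55.4 | 81.3 | 87.7 |
| SSAMT | BUTD, Bert | 75.4 | 92.6 | 96.4 | 54.8 | 81.5 | 88.0 |
| SAN^ | VggNet | 67.0 | 88.0 | 94.6 | 51.4 | 77.2 | 85.2 |
| SAN^ | ResNet | 75.5 | 92.6 | 96.2 | 60.1 | 84.7 | 90.6 |
| SAM | VSRN | 68.4 | 89.7 | 94.8 | 52.4 | 78.7 | 86.6 |
| SAM^ | CVSE^ | 70.0 | 89.2 | 93.1 | 55.0 | 82.6 | 89.0 |
| SAM | SGR | 75.9 | 92.4 | 96.6 | 57.6 | 83.1 | 89.7 |
| GSMN | BUTD, sparse | 71.4 | 92.0 | 96.1 | 53.9 | 79.7 | 87.1 |
| GSMN | BUTD, dense | 72.6 | 93.5 | 96.8 | 53.7 | 80.0 | 87.0 |
| GSMN* | BUTD | 76.4 | 94.3 | 97.3 | 57.4 | 82.3 | 89.0 |
| ADAPT | BUTD, I2T | 70.2 | 90.8 | 95.8 | 55.5 | 82.7 | 89.8 |
| ADAPT | BUTD, T2I | 73.6 | 93.7 | 96.7 | 57.0 | 83.6 | 90.3 |
| ADAPT* | BUTD, +GloVe | 76.6 | 95.4 | 97.6 | 60.7 | 86.6 | 92.0 |
| SGRAF | BUTD, SAF | 73.7 | 93.3 | 96.3 | 56.1 | 81.5 | 88.0 |
| SGRAF | BUTD, SGR | 75.2 | 93.3 | 96.6 | 56.2 | 81.0 | 86.5 |
| SGRAF* | BUTD | 77.8 | 94.1 | 97.4 | 58.5 | 83.0 | 88.8 |
| DSRAN | BUTD, GRU | 72.6 | 93.6 | 96.3 | 56.3 | 84.0 | 89.8 |
| DSRAN | BUTD, Bert | 75.3 | 94.4 | 97.6 | 57.3 | 84.8 | 90.9 |
| DSRAN* | BUTD, GRU | 74.9 | 94.5 | 97.0 | 58.6 | 85.8 | 91.3 |
| DSRAN* | BUTD, Bert | 77.8 | 95.1 | 97.6 | 59.2 | 86.0 | 91.9 |
| CAMERA | BUTD, Bert | 76.5 | 95.1 | 97.2 | 58.9 | 84.7 | 90.2 |
| CAMERA* | BUTD, Bert | 78.0 | 95.1 | 97.9 | 60.3 | 85.9 | 91.7 |
| CAEMCL | BUTD | 76.3 | 93.2 | 96.5 | 57.0 | 82.1 | 88.5 |
| CAEMCL* | BUTD | 78.7 | 94.5 | 97.9 | 58.2 | 83.6 | 89.6 |
| T-EMDE | BUTD, SAF | 75.2 | 94.2 | 97.1 | 57.1 | 82.2 | 88.3 |
| T-EMDE | BUTD, SGR | 77.5 | 93.1 | 97.2 | 56.9 | 82.0 | 87.5 |
| T-EMDE* | BUTD, SGRAF | 78.8 | 94.4 | 97.5 | 59.6 | 83.6 | 89.2 |
| DIME | BUTD, I2T, Bert | 77.4 | 95.0 | 97.4 | 60.1 | 85.5 | 91.8 |
| DIME | BUTD, T2I, Bert | 77.5 | 93.5 | 97.5 | 59.1 | 85.5 | 91.0 |
| DIME* | BUTD, Bert | 81.0 | 95.9 | 98.4 | 63.6 | 88.1 | 93.0 |
| PG* | BUTD, +3loss | 81.0 | 94.5 | 97.1 | 60.6 | 86.5 | 92.4 |
| PG* | BUTD, +GloVe | 82.8 | 95.9 | 97.9 | 62.2 | 89.3 | 93.8 |
| ACMM | BUTD | 80.0 | 95.5 | 98.2 | 50.2 | 76.8 | 84.7 |
| ACMM* | BUTD | 85.2 | 96.7 | 98.4 | 53.8 | 79.8 | 86.8 |
| GPO | IN, BiGRU | 77.1 | 94.5 | 97.1 | 58.5 | 84.1 | 89.6 |
| GPO* | IN+VG, BiGRU | 80.7 | 96.4 | 98.3 | 60.8 | 86.3 | 92.3 |
| GPO* | IN+VG, Bert | 85.3 | 97.2 | 98.9 | 66.7 | 89.9 | 94.0 |
| GPO* | WSL, Bert | 88.7 | 98.9 | 99.8 | 76.1 | 94.5 | 97.1 |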

Performance of MSCOCO1K

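| Method_name | Concise_note | Sentence retrieval R@1 | Sentence retrieval R@5 | Sentence retrieval R@10 | Image retrieval R@1 | Image retrieval R@5 | Image retrieval R@10 |
|---|---|---|---|---|---|---|---|
| STV | combine-skip | 33.8 | 67.7 | 82.1 | 25.9 | 60.0 | 74.6 |
| DVSA | RCNN | 38.4 | 69.9 | 80.5 | 27.4 | 60.2 | 74.8 |
| FV | GMM+HGLMM | 39.4 | 67.9 | 80.9 | 25.1 | 59.8 | 76.6 |
| m-RNN | VggNet | 41.0 | 73.0 | 83.5 | 29.0 | 42.2 | 77.0 |
| m-CNN* | VggNet | 42.8 | 73.1 | 84.1 | 32.6 | 68.6 | 82.8 |
| UVSE | VggNet | 43.4 | 75.7 | 85.8 | 31.0 | 66.7 | 79.9 |
| HM-LSTM | RCNN | 43.9 | -- | 87.8 | 36.1 | -- | 86.7 |
| Order-emb | VggNet | 46.7 | -- | 88.9 | 37.9 | -- | 85.9 |
| SPE | VggNet | 50.1 | 79.7 | 89.2 | 39.6 | 75.2 | 86.9 |
| SEAM | VggNet | 50.7 | 81.4 | 90.9 | 40.3 | 75.7 | 87.4 |
| sm-LSTM | VggNet | 52.4 | 81.7 | 90.8 | 38.6 | 73.4 | 84.6 |
| sm-LSTM* | VggNet | 53.2 | 83.1 | 91.5 | 40.7 | 75.8 | 87.4 |
| CMPL | MobileNet | 52.9 | 83.8 | 92.1 | 41.3 | 74.6 | 85.9 |
| MDM | VggNet | 54.7 | 84.1 | 91.9 | 44.6 | 79.6 | 90.5 |
| 2WayNet | VggNet | 55.8 | 75.2 | -- | 39.7 | 63.3 | -- |
| CMPM | ResNet | 56.1 | 86.3 | 92.9 | 44.6 | 78.8 | 89.0 |
| CSE | ResNet | 56.3 | 84.4 | 92.2 | 45.7 | 81.2 | 90.6 |
| RRF-Net | ResNet | 56.4 | 85.3 | 91.5 | 43.9 | 78.1 | 88.6 |
| ITMeetsAL | VggNet | 44.2 | 76.1 | 86.3 | 37.1 | 72.7 | 85.1 |
| ITMeetsAL | MobileNet | 54.7 | 84.3 | 91.1 | 41.0 | 76.7 | 88.1 |
| ITMeetsAL | ResNet | 58.5 | 85.3 | 92.1 | 48.3 | 82.0 | 90.6 |
| MFM | VggNet | 58.9 | 86.3 | 92.4 | 47.7 | 81.0 | 90.9 |
| CHAIN-VSE | VggNet | 51.6 | 82.0 | 91.3 | 38.6 | 75.1 | 87.2 |
| CHAIN-VSE | ResNet | 59.4 | 88.0 | 94.2 | 43.5 | 79.8 | 90.2 |
| NAA | ResNet | 61.3 | 87.9 | 95.4 | 47.0 | 80.8 | 90.1 |
| TERN | BUTD, Bert | 63.7 | 90.5 | 96.2 | 51.9 | 85.6 | 93.6 |
| VSE++ | VggNet | 57.2 | 86.0 | 93.3 | 45.9 | 79.4 | 89.1 |
| VSE++ | ResNet | 64.6 | 90.0 | 95.7 | 52.0 | 84.3 | 92.0 |
| Dual-Path | VggNet | 59.4 | 86.2 | 92.9 | 41.6 | 76.3 | 87.5 |
| Dual-Path | ResNet | 65.6 | 89.8 | 95.5 | 47.1 | 79.9 | 90.0 |
| DXR | ResNet, Bert | 67.0 | 93.0 | 97.6 | 56.8 | 88.2 | 94.9 |
| Personality | ResNeXt, Transformer | 67.3 | 91.7 | 96.5 | -- | -- | -- |
| Align2Ground | BUTD | -- | -- | -- | 56.6 | 84.9 | 92.8 |
| SMAN | ResNet, Random | 67.9 | 90.6 | 96.2 | 58.8 | 87.0 | 93.7 |
| SMAN | ResNet, Glove | 68.4 | 91.3 | 96.6 | 58.5 | 87.4 | 93.5 |
| GXN | ResNet | 68.5 | -- | 97.9 | 56.6 | -- | 94.5 |
| GSLS | ResNet, BUTD | 68.9 | 94.1 | 98.0 | 58.6 | 88.2 | 94.9 |
| CVSE++ | ResNet | 69.1 | 92.2 | 96.1 | 55.6 | 86.7 | 93.8 |
| PVSE | ResNet | 69.2 | 91.6 | 96.6 | 55.2 | 86.5 | 93.7 |
| DSVE-Loc | ResNet | 69.8 | 91.9 | 96.6 | 55.9 | 86.9 | 94.0 |
| SCO | VggNet | 66.6 | 91.8 | 96.6 | 55.5 | 86.6 | 93.8 |
| SCO | ResNet | 69.9 | 92.9 | 97.5 | 56.7 | 87.5 | 94.8 |
| R-SCAN | BUTD, VrR-VG | 70.3 | 94.5 | 98.1 | 57.6 | 87.3 | 93.7 |
| M3A | ResNet | 70.4 | 91.7 | 96.8 | 58.4 | 87.1 | 94.0 |
| SAVE | ResNet | 70.8 | 93.2 | 97.6 | 56.9 | 87.6 | 94.4 |
| MPL | SCAN_I2T | 71.1 | 93.7 | 98.2 | 56.8 | 86.7 | 93.0 |
| SAEM | BUTD, Bert | 71.2 | 94.1 | 97.7 | 57.8 | 88.6 | 94.9 |
| SoDeep | DSVE-Loc | 71.5 | 92.8 | 97.1 | 56.2 | 87.0 | 94.3 |
| OAN | BUTD | 71.7 | 96.4 | 99.3 | 60.2 | 88.6 | 94.5 |
| GVSE* | BUTD | 72.2 | 94.1 | 98.1 | 60.5 | 89.4 | 95.8 |
| CAMP | BUTD | 72.3 | 94.8 | 98.3 | 58.5 | 87.9 | 95.0 |
| CASC | ResNet | 72.3 | 96.0 | 99.0 | 58.9 | 89.8 | 96.0 |
| SCAN | BUTD, T2I_AVE | 70.9 | 94.5 | 97.8 | 56.4 | 87.0 | 93.9 |
| SCAN | BUTD, I2T_AVE | 69.2 | 93.2 | 97.5 | 54.4 | 86.0 | 93.6 |
| SCAN* | BUTD, LSE+AVE | 72.7 | 94.8 | 98.4 | 58.8 | 88.4 | 94.8 |
| LIWE | BUTD, -Glove | 69.6 | 93.9 | 98.0 | 55.5 | 87.3 | 94.2 |
| LIWE | BUTD, CLMR | 71.8 | 93.1 | 97.6 | 56.2 | 87.5 | 94.2 |
| LIWE | BUTD, +Glove | 73.2 | 95.5 | 98.2 | 57.9 | 88.3 | 94.5 |
| SGM | BUTD | 73.4 | 93.8 | 97.8 | 57.5 | 87.3 | 94.3 |
| ParNet | BUTD, NP | 72.8 | 94.9 | 97.9 | 57.9 | 87.4 | 94.0 |
| ParNet | BUTD, P | 73.5 | 94.5 | 98.3 | 58.3 | 88.2 | 94.1 |
| MTFN | BUTD | 71.9 | 94.2 | 97.9 | 57.3 | 88.6 | 95.0 |
| MTFN | BUTD, RR_no_STT | 74.3 | 94.9 | 97.9 | 57.5 | 88.8 | 95.0 |
| MTFN | BUTD, RR_STT | 74.3 | 94.9 | 97.9 | 60.1 | 89.1 | 95.0 |
| Meta-SPN | BFAN, equal | 74.4 | 95.0 | 98.3 | 58.6 | 87.6 | 94.3 |
| RDAN | BUTD | 74.6 | 96.2 | 98.7 | 61.6 | 89.2 | 94.7 |
| CVSE | BUTD | 74.8 | 95.1 | 98.3 | 59.9 | 89.4 | 95.2 |
| MMCA | BUTD, Bert | 74.8 | 95.6 | 97.7 | 61.6 | 89.8 | 95.2 |
| BFAN | BUTD, prob | 73.0 | 94.8 | -- | 58.0 | 87.6 | -- |
| BFAN | BUTD, equal | 73.7 | 94.9 | -- | 58.3 | 87.5 | -- |
| BFAN* | BUTD | 74.9 | 95.2 | -- | 59.4 | 88.4 | -- |
| SMFEA | BUTD | 75.1 | 95.4 | 98.3 | 62.5 | 90.1 | 96.2 |
| DP-RNN | BUTD | 75.3 | 95.8 | 98.6 | 62.5 | 89.7 | 95.1 |
| CCRS* | BUTD, SCAN | 70.9 | 94.3 | 98.0 | 57.3 | 87.6 | 94.3 |
| CCRS* | BUTD, BFAN | 75.4 | 95.3 | 98.5 | 60.3 | 88.6 | 94.6 |
| WCGL | BUTD | 75.4 | 95.5 | 98.6 | 60.8 | 89.3 | 95.3 |
| CAAN | BUTD | 75.5 | 95.4 | 98.5 | 61.3 | 89.7 | 95.2 |
| VSRN | BUTD | 74.0 | 94.3 | 97.8 | 60.8 | 88.4 | 94.1 |
| VSRN* | BUTD | 76.2 | 94.8 | 98.2 | 62.8 | 89.7 | 95.1 |
| ADAPT | BUTD, I2T | 74.5 | 94.2 | 97.9 | 62.0 | 90.4 | 95.5 |
| ADAPT | BUTD, T2I | 75.3 | 95.1 | 98.4 | 63.3 | 90.0 | 95.5 |
| ADAPT* | BUTD | 76.5 | 95.6 | 98.9 | 62.2 | 90.5 | 96.0 |
| PFAN | BUTD, T2I | 75.8 | 95.9 | 99.0 | 61.0 | 89.1 | 95.1 |
| PFAN | BUTD, I2T | 70.7 | 94.1 | 97.8 | 53.0 | 84.5 | 92.6 |
| PFAN* | BUTD | 76.5 | 96.3 | 99.0 | 61.6 | 89.6 | 95.2 |
| SCG | VggNet, Prod | 73.4 | 94.8 | 97.6 | 56.3 | 85.6 | 93.5 |
| SCG | VggNet, Gated | 76.6 | 96.3 | 99.2 | 61.4 | 88.9 | 95.1 |
| IMRAM | BUTD, Image | 76.1 | 95.3 | 98.2 | 61.0 | 88.6 | 94.5 |
| IMRAM | BUTD, Text | 74.0 | 95.6 | 98.4 | 60.6 | 88.9 | 94.6 |
| IMRAM | BUTD, Full | 76.7 | 95.6 | 98.5 | 61.7 | 89.1 | 95.0 |
| SHAN | BUTD, T2I | 75.9 | 96.1 | 98.7 | 60.7 | 88.2 | 94.2 |
| SHAN | BUTD, I2T | 73.0 | 95.8 | 97.9 | 58.5 | 87.3 | 94.0 |
| SHAN | BUTD, Full | 76.8 | 96.3 | 98.7 | 62.6 | 89.6 | 95.8 |
| PFAN++* | BUTD | 77.1 | 96.5 | 98.3 | 62.5 | 89.9 | 95.4 |
| ADDR* | BUTD, SCAN | 76.1 | 95.5 | 98.4 | 61.2 | 88.9 | 94.8 |
| ADDR* | BUTD, BFAN | 76.4 | 95.8 | 98.3 | 62.3 | 89.4 | 96.2 |
| ADDR* | BUTD, VSRN | 77.4 | 96.1 | 98.9 | 63.5 | 90.7 | 96.7 |
| CAMERA | BUTD, Bert | 75.9 | 95.5 | 98.6 | 62.3 | 90.1 | 95.2 |
| CAMERA* | BUTD, Bert | 77.5 | 96.3 | 98.8 | 63.4 | 90.9 | 95.8 |
| AOQ* | BUTD, SCAN | 74.1 | 95.2 | 98.5 | 59.8 | 88.6 | 95.0 |
| AOQ* | BUTD, BFAN | 77.3 | 96.0 | 98.5 | 61.2 | 89.2 | 95.0 |
| AOQ* | BUTD, VSRN | 77.5 | 95.5 | 98.6 | 63.5 | 90.5 | 95.8 |
| TERAN | BUTD, Bert | 77.7 | 95.9 | 98.6 | 65.0 | 91.2 | 96.4 |
| HOAD^ | BUTD | 77.0 | 96.1 | 98.7 | 65.1 | 93.1 | 97.9 |
| HOAD^ | BUTD, +Dist | 77.8 | 96.1 | 98.7 | 66.2 | 93.0 | 97.9 |
| TOD-Net | VSE++ | 68.6 | 92.0 | 96.9 | 54.5 | 85.3 | 92.4 |
| TOD-Net | Bert | 75.8 | 95.3 | 98.4 | 61.8 | 89.6 | 95.0 |
| TOD-Net* | Bert | 78.1 | 96.0 | 98.6 | 63.6 | 90.6 | 95.8 |
| SSAMT | BUTD, Bert | 78.2 | 95.6 | 98.0 | 62.7 | 89.6 | 95.3 |
| HAL | SCAN_I2T | 78.3 | 96.3 | 98.5 | 60.1 | 86.7 | 92.8 |
| DSRAN | BUTD, GRU | 76.3 | 94.9 | 98.4 | 62.4 | 89.7 | 95.2 |
| DSRAN | BUTD, Bert | 77.1 | 95.3 | 98.1 | 62.9 | 89.9 | 95.3 |
| DSRAN* | BUTD, GRU | 78.0 | 95.6 | 98.5 | 64.2 | 90.4 | 95.8 |
| DSRAN* | BUTD, Bert | 78.3 | 95.7 | 98.4 | 64.5 | 90.8 | 95.8 |
| GSMN | BUTD, sparse | 76.1 | 95.6 | 98.3 | 60.4 | 88.7 | 95.0 |
| GSMN | BUTD, dense | 74.7 | 95.3 | 98.2 | 60.3 | 88.5 | 94.6 |
| GSMN* | BUTD | 78.4 | 96.4 | 98.6 | 63.3 | 90.1 | 95.7 |
| HAN | BUTD | 78.7 | 96.4 | 98.8 | 65.4 | 90.5 | 95.3 |
| DIME | BUTD, I2T, Bert | 77.9 | 95.9 | 98.3 | 63.0 | 90.5 | 96.2 |
| DIME | BUTD, T2I, Bert | 77.2 | 95.5 | 98.5 | 62.3 | 90.2 | 95.8 |
| DIME* | BUTD, Bert | 78.8 | 96.3 | 98.7 | 64.8 | 91.5 | 96.5 |
| CSCC^ | BUTD, +GloVe | 78.8 | 96.1 | 99.0 | 66.6 | 92.5 | 96.4 |
| CAEMCL | BUTD | 77.6 | 96.4 | 98.8 | 62.2 | 89.8 | 95.8 |
| CAEMCL* | BUTD | 78.9 | 97.5 | 98.8 | 65.7 | 90.2 | 96.6 |
| SGRAF | BUTD, SAF | 76.1 | 95.4 | 98.3 | 61.8 | 89.4 | 95.3 |
| SGRAF | BUTD, SGR | 78.0 | 95.8 | 98.2 | 61.4 | 89.3 | 95.4 |
| SGRAF* | BUTD | 79.6 | 96.2 | 98.5 | 63.2 | 90.7 | 96.1 |
| T-EMDE | BUTD, SAF | 78.3 | 95.7 | 98.5 | 62.3 | 89.7 | 95.2 |
| T-EMDE | BUTD, SGR | 77.1 | 95.9 | 98.5 | 61.6 | 89.5 | 95.1 |
| T-EMDE* | BUTD, SGRAF | 79.6 | 96.3 | 98.7 | 63.5 | 90.4 | 95.6 |
| SAM | VSRN | 74.6 | 93.6 | 97.5 | 61.5 | 89.6 | 94.9 |
| SAM^ | CVSE^ | 79.8 | 95.1 | 97.7 | 67.0 | 93.0 | 97.3 |
| SAM | SGR | 80.7 | 97.2 | 98.6 | 63.8 | 90.5 | 95.9 |
| ACMM | BUTD | 81.9 | 98.0 | 99.3 | 58.2 | 87.3 | 93.9 |
| ACMM* | BUTD | 84.1 | 97.8 | 99.4 | 60.7 | 88.7 | 94.9 |
| PG* | BUTD, +GloVe | 84.0 | 95.8 | 97.8 | 63.9 | 88.9 | 95.6 |
| SAN^ | VggNet | 74.9 | 94.9 | 98.2 | 60.8 | 90.3 | 95.7 |
| SAN^ | ResNet | 85.4 | 97.5 | 99.0 | 69.1 | 93.4 | 97.2 |
| GPO | IN, BiGRU | 76.5 | 95.3 | 98.5 | 62.9 | 90.6 | 95.8 |
| GPO* | IN+VG, BiGRU | 80.0 | 97.0 | 99.0 | 64.8 | 91.6 | 96.5 |
| GPO* | IN+VG, Bert | 82.2 | 97.5 | 99.5 | 68.1 | 92.9 | 97.2 |
| GPO* | WSL, Bert | 85.6 | 98.0 | 99.4 | 73.1 | 94.3 | 97.7 |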

Performance of MSCOCO5K

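| Method_name | Concise_note | Sentence retrieval R@1 | Sentence retrieval R@5 | Sentence retrieval R@10 | Image retrieval R@1 | Image retrieval R@5 | Image retrieval R@10 |
|---|---|---|---|---|---|---|---|
| DVSA | RCNN | 16.5 | 39.2 | 52.0 | 10.7 | 29.6 | 42.2 |
| FV | GMM+HGLMM | 17.3 | 39.0 | 50.2 | 10.8 | 28.3 | 40.1 |
| Order-emb | VggNet | 23.3 | -- | 65.0 | 18.0 | -- | 57.6 |
| CSE | ResNet | 27.9 | 57.1 | 70.4 | 22.2 | 50.2 | 64.4 |
| CMPL | MobileNet | 24.6 | 52.3 | 66.4 | 19.1 | 44.6 | 58.4 |
| CMPM | ResNet | 31.1 | 60.7 | 73.9 | 22.9 | 50.2 | 63.8 |
| TERN | BUTD, Bert | 38.4 | 69.5 | 81.3 | 28.7 | 59.7 | 72.7 |
| Dual-Path | VggNet | 35.5 | 63.2 | 75.6 | 21.0 | 47.5 | 60.9 |
| Dual-Path | ResNet | 41.2 | 70.5 | 81.1 | 25.3 | 53.4 | 66.4 |
| VSE++ | VggNet | 32.9 | 61.7 | 74.7 | 24.1 | 52.8 | 66.2 |
| VSE++ | ResNet | 41.3 | 71.1 | 81.2 | 30.3 | 59.4 | 72.4 |
| GXN | ResNet | 42.0 | -- | 84.7 | 31.7 | -- | 74.6 |
| SCO | VggNet | 40.2 | 70.1 | 81.3 | 31.3 | 61.5 | 73.9 |
| SCO | ResNet | 42.8 | 72.3 | 83.0 | 33.1 | 62.9 | 75.5 |
| CVSE++ | ResNet | 43.2 | 73.5 | 84.1 | 32.4 | 62.2 | 74.6 |
| DXR | ResNet, Bert | 44.9 | 75.2 | 84.7 | 33.9 | 64.9 | 77.4 |
| PVSE | ResNet | 45.2 | 74.3 | 84.5 | 32.4 | 63.0 | 75.0 |
| R-SCAN | BUTD, VrR-VG | 45.4 | 77.9 | 87.9 | 36.2 | 65.5 | 76.7 |
| SAVE | ResNet | 46.7 | 76.3 | 86.1 | 34.0 | 64.8 | 77.0 |
| MPL | SCAN_I2T | 46.9 | 77.7 | 87.6 | 34.4 | 64.2 | 75.9 |
| GVSE* | BUTD | 47.2 | 76.6 | 88.4 | 31.2 | 61.2 | 70.5 |
| CASC | ResNet | 47.2 | 78.3 | 87.4 | 34.7 | 64.8 | 76.8 |
| OAN | BUTD | 47.8 | 81.2 | 90.4 | 37.0 | 66.6 | 78.0 |
| MTFN | BUTD | 44.7 | 76.4 | 87.3 | 33.1 | 64.7 | 76.1 |
| MTFN | BUTD, RR | 48.3 | 77.6 | 87.3 | 35.9 | 66.1 | 76.1 |
| M3A | ResNet | 48.9 | 75.2 | 84.4 | 38.3 | 65.7 | 76.9 |
| A3VSE | BUTD | 49.3 | 81.1 | 90.2 | 39.0 | 68.0 | 80.1 |
| GVSE* | BUTD | 49.9 | 77.4 | 87.6 | 38.4 | 68.5 | 79.7 |
| SGM | BUTD | 50.0 | 79.3 | 87.9 | 35.3 | 64.9 | 76.5 |
| CAMP | BUTD | 50.1 | 82.1 | 89.7 | 39.0 | 68.9 | 80.2 |
| SCAN | BUTD, I2T_LSE | 46.4 | 77.4 | 87.2 | 34.4 | 63.7 | 75.7 |
| SCAN* | BUTD, AVE+LSE | 50.4 | 82.2 | 90.0 | 38.6 | 69.3 | 80.4 |
| GOT | SCAN_I2T | 50.5 | 80.2 | 89.8 | 38.1 | 66.8 | 78.5 |
| PFAN* | BUTD | 50.8 | 83.9 | 89.1 | 39.5 | 69.5 | 80.8 |
| Meta-SPN | BFAN, equal | 51.0 | 81.1 | 89.4 | 37.5 | 66.7 | 77.5 |
| PFAN++* | BUTD | 51.2 | 84.3 | 89.2 | 41.4 | 70.9 | 79.0 |
| HOAD | BUTD | 51.2 | 81.7 | 89.1 | 39.4 | 72.5 | 84.1 |
| HOAD | BUTD, +Dist | 51.4 | 81.8 | 89.1 | 40.5 | 73.5 | 84.1 |
| CAAN | BUTD | 52.5 | 83.3 | 90.9 | 41.2 | 70.3 | 82.9 |
| VSRN* | BUTD | 53.0 | 81.1 | 89.4 | 40.5 | 70.6 | 81.1 |
| CCRS* | BUTD, SCAN | 47.9 | 78.1 | 88.2 | 36.9 | 66.9 | 78.4 |
| CCRS* | BUTD, BFAN | 53.1 | 81.8 | 90.2 | 38.3 | 67.8 | 78.6 |
| IMRAM | BUTD, Image | 53.2 | 82.5 | 90.4 | 38.9 | 68.5 | 79.2 |
| IMRAM | BUTD, Text | 52.0 | 81.8 | 90.1 | 38.6 | 68.1 | 79.1 |
| IMRAM | BUTD, Full | 53.7 | 83.2 | 91.0 | 39.7 | 69.1 | 79.8 |
| MMCA | BUTD, Bert | 54.0 | 82.5 | 90.7 | 38.7 | 69.7 | 80.8 |
| SMFEA | BUTD | 54.2 | -- | 89.9 | 41.9 | -- | 83.7 |
| CAMERA | BUTD, Bert | 53.1 | 81.3 | 89.8 | 39.0 | 70.5 | 81.5 |
| CAMERA* | BUTD, Bert | 55.1 | 82.9 | 91.2 | 40.5 | 71.7 | 82.5 |
| DSRAN | BUTD, GRU | 51.9 | 81.6 | 89.8 | 39.5 | 70.6 | 81.0 |
| DSRAN | BUTD, Bert | 53.7 | 82.1 | 89.9 | 40.3 | 70.9 | 81.3 |
| DSRAN* | BUTD, GRU | 54.4 | 83.5 | 91.3 | 41.5 | 71.9 | 82.1 |
| DSRAN* | BUTD, Bert | 55.3 | 83.5 | 90.9 | 41.7 | 72.7 | 82.8 |
| CSCC | BUTD, +GloVe | 55.6 | 83.6 | 91.2 | 40.8 | 73.2 | 84.3 |
| TERAN | BUTD, Bert | 55.6 | 83.9 | 91.6 | 42.6 | 72.5 | 82.9 |
| SAM | VSRN | 49.1 | 79.0 | 87.4 | 37.5 | 68.1 | 79.5 |
| SAM | SGR | 55.7 | 83.2 | 91.2 | 40.5 | 69.7 | 80.5 |
| SAM^ | CVSE^ | 56.4 | 82.4 | 90.1 | 42.3 | 73.9 | 84.5 |
| SCG | VggNet, Prod | 49.9 | 78.9 | 88.1 | 33.2 | 62.4 | 74.7 |
| SCG | VggNet, Gated | 56.6 | 84.5 | 92.0 | 39.2 | 68.0 | 81.3 |
| AOQ* | BUTD, SCAN | 51.2 | 82.5 | 90.1 | 39.4 | 69.7 | 80.4 |
| AOQ* | BUTD, VSRN | 55.1 | 83.3 | 90.8 | 41.1 | 71.5 | 82.0 |
| AOQ* | BUTD, BFAN | 57.3 | 84.5 | 91.7 | 40.1 | 69.2 | 80.1 |
| ADDR* | BUTD, BFAN | 54.3 | 84.0 | 91.5 | 40.1 | 69.2 | 80.6 |
| ADDR* | BUTD, VSRN | 56.6 | 85.3 | 90.4 | 42.5 | 71.9 | 82.0 |
| ADDR* | BUTD, SCAN | 57.3 | 86.0 | 92.7 | 41.8 | 72.0 | 81.3 |
| SSAMT | BUTD, Bert | 57.7 | 84.2 | 90.8 | 40.8 | 70.5 | 80.5 |
| SGRAF | BUTD, SAF | 53.3 | 82.3 | 90.1 | 39.8 | 69.0 | 80.2 |
| SGRAF | BUTD, SGR | 56.9 | 83.2 | 90.5 | 40.2 | 69.0 | 79.8 |
| SGRAF* | BUTD | 57.8 | 84.9 | 91.6 | 41.9 | 70.7 | 81.3 |
| T-EMDE | BUTD, SAF | 56.7 | -- | 90.7 | 40.3 | -- | 80.4 |
| T-EMDE | BUTD, SGR | 57.0 | -- | 91.0 | 40.0 | -- | 80.1 |
| T-EMDE* | BUTD, SGRAF | 59.1 | -- | 91.8 | 41.8 | -- | 81.7 |
| DIME | BUTD, I2T, Bert | 56.1 | 83.2 | 91.1 | 40.2 | 70.7 | 81.4 |
| DIME | BUTD, T2I, Bert | 55.3 | 82.4 | 90.2 | 39.7 | 70.3 | 81.0 |
| DIME* | BUTD, Bert | 59.3 | 85.4 | 91.9 | 43.1 | 73.0 | 83.1 |
| SAN^ | ResNet | 65.4 | 89.4 | 94.8 | 46.2 | 77.4 | 86.6 |
| ACMM | BUTD | 63.5 | 88.0 | 93.6 | 36.7 | 65.1 | 76.7 |
| ACMM* | BUTD | 66.9 | 89.6 | 94.9 | 39.5 | 69.6 | 81.1 |
| GPO | IN, BiGRU | 55.1 | 81.9 | 89.9 | 40.9 | 70.6 | 81.5 |
| GPO* | IN+VG, BiGRU | 59.8 | 86.1 | 92.8 | 42.7 | 72.8 | 83.3 |
| GPO* | IN+VG, Bert | 62.5 | 87.8 | 94.0 | 46.0 | 75.8 | 85.7 |
| GPO* | WSL, Bert | 68.1 | 90.2 | 95.2 | 52.7 | 80.2 | 88.3 |
| PG* | BUTD, +GloVe | 68.7 | 88.7 | 93.0 | 46.2 | 77.8 | 85.5 |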

Performance of Identity-aware Datasets

Performance of RSTPReid

| Method_name | Concise_note | Text-to-Image R@1 | Text-to-Image R@5 | Text-to-Image R@10 |
|---|---|---|---|---|
| DSSL | ResNet | 32.43 | 55.08 | 63.19 |

Performance of CUHK-PEDES

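| Method_name | Concise_note | Text-to-Image R@1 | Text-to-Image R@5 | Text-to-Image R@10 |
|---|---|---|---|---|
| LSTM-Q+I | VggNet | 17.19 | -- | 57.82 |
| GNA-RNN | VggNet | 19.05 | -- | 53.64 |
| IATV | VggNet | 25.94 | -- | 60.48 |
| PWM-ATH | VggNet | 27.14 | 49.45 | 61.02 |
| GLA | ResNet | 43.58 | 66.93 | 76.26 |
| Dual-Path | VggNet | 32.15 | 54.42 | 64.30 |
| Dual-Path | ResNet | 44.40 | 66.26 | 75.07 |
| CMPM | MobileNet | 44.02 | -- | 77.00 |
| CMPL | MobileNet | 49.37 | -- | 79.27 |
| MCCL | MobileNet, CL | 48.21 | -- | 78.27 |
| MCCL | MobileNet | 50.58 | -- | 79.06 |
| MIA | VggNet | 48.00 | 70.70 | 79.30 |
| MIA | ResNet | 53.10 | 75.00 | 82.90 |
| A-GANet | ResNet | 53.14 | 74.03 | 82.95 |
| PMA | VggNet | 47.02 | 68.54 | 78.06 |
| PMA | ResNet | 53.81 | 73.54 | 81.23 |
| TIMAM | ResNet, Bert | 54.51 | 77.56 | 84.78 |
| CMAAM | MobileNet | 55.13 | 76.14 | 83.77 |
| ITMeetsAL | MobileNet | 51.85 | 73.36 | 81.27 |
| ITMeetsAL | ResNet | 55.72 | 76.15 | 84.26 |
| ViTAA | ResNet | 55.97 | 75.84 | 83.52 |
| FTD | ResNet | 57.84 | 78.33 | 85.43 |
| MGEL | VggNet | 52.68 | 74.37 | 83.11 |
| MGEL | MobileNet | 59.21 | 79.16 | 85.88 |
| MGEL | ResNet | 60.27 | 80.01 | 86.74 |
| SSAN | VggNet | 55.52 | 76.17 | 83.45 |
| SSAN | ResNet | 61.37 | 80.15 | 86.73 |
| NAFS | ResNet, Bert | 59.94 | 79.86 | 86.70 |
| NAFS | +RVN | 61.50 | 81.19 | 87.51 |
| DSSL | ResNet | 59.98 | 80.41 | 87.56 |
| DSSL | +RR | 62.33 | 82.11 | 88.01 |
| LapsCore | CMPL | 53.33 | -- | 83.20 |
| LapsCore | NAFS | 63.40 | -- | 87.80 |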

Performance of ICFG-PEDES

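| Method_name | Concise_note | Text-to-Image R@1 | Text-to-Image R@5 | Text-to-Image R@10 |
|---|---|---|---|---|
| Dual-Path | ResNet | 38.99 | 59.44 | 68.41 |
| CMPL | ResNet | 43.51 | 65.44 | 74.26 |
| MIA | ResNet | 46.49 | 67.14 | 75.18 |
| SCAN | ResNet | 50.05 | 69.65 | 77.21 |
| ViTAA | ResNet | 50.98 | 68.79 | 75.78 |
| SSAN | ResNet | 54.23 | 72.63 | 79.53 |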

Performance of CUB-Flowers

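| Method_name | Concise_note | CUB Image-to-Text R@1 | CUB Text-to-Image AP@50 | Flowers Image-to-Text R@1 | Flowers Text-to-Image AP@50 |
|---|---|---|---|---|---|
| FV | GMM+HGLMM | 36.5 | 35.6 | 54.8 | 52.8 |
| Word2Vec | | 38.6 | 33.5 | 54.2 | 52.1 |
| Word-NN | CNN | 51.0 | 43.3 | 60.7 | 56.3 |
| Word-NN | CNN-RNN | 56.8 | 48.7 | 65.6 | 59.6 |
| IATV | Triplet | 52.5 | 52.4 | 64.3 | 64.9 |
| IATV | VggNet | 61.5 | 57.6 | 68.4 | 70.1 |
| CMPM | MobileNet | 62.1 | 64.6 | 66.1 | 67.7 |
| CMPL | MobileNet | 64.3 | 67.9 | 68.9 | 69.7 |
| TIMAM | ResNet, Bert | 67.7 | 70.3 | 70.6 | 73.7 |
| LapsCore | CMPL | 68.0 | 66.0 | 75.2 | 71.4 |
| LapsCore | CMP_adv | 72.3 | 69.5 | 77.9 | 73.3 |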