DOI: 10.1145/3206025.3206042
Research Article

Collaborative Subspace Graph Hashing for Cross-modal Retrieval

Published: 05 June 2018

Abstract

Current hashing methods for cross-modal retrieval generally learn separate modality-specific transformation matrices to embed multi-modality data into a latent common subspace, and usually ignore the fact that respecting the diversity of multi-modality features in the latent subspace can benefit retrieval. To this end, we propose a collaborative subspace graph hashing method (CSGH), a two-stage collaborative learning framework for cross-modal retrieval. Specifically, CSGH first embeds multi-modality data into separate latent subspaces through individual modality-specific transformation matrices, and then connects these latent subspaces to a common Hamming space through a shared transformation matrix. Within this framework, CSGH captures the modality-specific neighborhood structure and the cross-modal correlation of multi-modality data through Laplacian regularization and a graph-based correlation constraint, respectively. To solve CSGH, we develop an alternating optimization procedure in which each sub-problem has a closed-form analytical solution. Experiments on cross-modal retrieval over the Wiki, NUS-WIDE, Flickr25K and Flickr1M datasets show the effectiveness of CSGH compared with state-of-the-art cross-modal hashing methods.
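To make the two-stage framework concrete, the sketch below shows one plausible shape of such a pipeline in NumPy. It is not the authors' implementation: the variable names (X1, X2, U1, U2, W, B), the dimensions, the ridge-style closed-form updates, and the plain sign binarization are all illustrative assumptions, and the Laplacian regularization and graph-based correlation constraints of the actual CSGH objective are omitted for brevity.

```python
# Illustrative sketch only (not the authors' released code): a generic
# two-stage cross-modal hashing skeleton. All names and hyperparameters
# below are hypothetical stand-ins.
import numpy as np

rng = np.random.default_rng(0)
n, d_img, d_txt, d_lat, n_bits = 500, 128, 64, 32, 16

X1 = rng.standard_normal((n, d_img))   # image features (one row per sample)
X2 = rng.standard_normal((n, d_txt))   # text features, paired row-wise with X1

# Stage 1: modality-specific projections into separate latent subspaces.
U1 = rng.standard_normal((d_img, d_lat))
U2 = rng.standard_normal((d_txt, d_lat))
# Stage 2: a single shared projection from latent subspace to Hamming space.
W = rng.standard_normal((d_lat, n_bits))
B = np.sign(rng.standard_normal((n, n_bits)))   # binary codes in {-1, +1}

alpha = 1e-2   # hypothetical ridge regularizer (stands in for the graph terms)
for _ in range(20):
    # Fix B, W; regress each modality onto latent targets T = B W^T:
    #   U_m = argmin_U ||X_m U - T||_F^2 + alpha ||U||_F^2  (closed form).
    T = B @ W.T
    U1 = np.linalg.solve(X1.T @ X1 + alpha * np.eye(d_img), X1.T @ T)
    U2 = np.linalg.solve(X2.T @ X2 + alpha * np.eye(d_txt), X2.T @ T)
    # Fix U1, U2, B; fit the shared W on the stacked latent representations.
    Z = np.vstack([X1 @ U1, X2 @ U2])
    W = np.linalg.solve(Z.T @ Z + alpha * np.eye(d_lat), Z.T @ np.vstack([B, B]))
    # Fix U1, U2, W; update codes by sign-thresholding the averaged embedding.
    B = np.sign((X1 @ U1 + X2 @ U2) @ W / 2.0)
    B[B == 0] = 1   # avoid zero bits

# Query time: hash a new image through its modality-specific path, then W.
x_new = rng.standard_normal((1, d_img))
code = np.sign(x_new @ U1 @ W)
```

The point of the sketch is the alternation itself: with the other variables held fixed, each update reduces to a regularized least-squares problem with a closed-form solution, which mirrors the abstract's claim that every CSGH sub-problem admits an analytical solution.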


Cited By

  • (2024) Distribution Consistency Guided Hashing for Cross-Modal Retrieval. Proceedings of the 32nd ACM International Conference on Multimedia, 5623-5632. DOI: 10.1145/3664647.3680633. Online publication date: 28-Oct-2024.
  • (2023) Learning Explicit and Implicit Dual Common Subspaces for Audio-visual Cross-modal Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications, 19(2s), 1-23. DOI: 10.1145/3564608. Online publication date: 17-Feb-2023.
  • (2023) Discriminative Geometric-Structure-Based Deep Hashing for Large-Scale Image Retrieval. IEEE Transactions on Cybernetics, 53(10), 6236-6247. DOI: 10.1109/TCYB.2022.3173315. Online publication date: Oct-2023.
  • (2022) Self-Collaborative Unsupervised Hashing for Large-Scale Image Retrieval. IEEE Access, 10, 103588-103597. DOI: 10.1109/ACCESS.2020.3032628. Online publication date: 2022.
  • (2020) Collective Affinity Learning for Partial Cross-Modal Hashing. IEEE Transactions on Image Processing, 29, 1344-1355. DOI: 10.1109/TIP.2019.2941858. Online publication date: 2020.
  • (2020) Deep supervised multimodal semantic autoencoder for cross-modal retrieval. Computer Animation and Virtual Worlds, 31(4-5). DOI: 10.1002/cav.1962. Online publication date: 7-Sep-2020.
  • (2019) Label guided correlation hashing for large-scale cross-modal retrieval. Multimedia Tools and Applications, 78(21), 30895-30922. DOI: 10.1007/s11042-019-7192-5. Online publication date: 6-Feb-2019.


    Published In
    ICMR '18: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval
    June 2018
    550 pages
    ISBN:9781450350464
    DOI:10.1145/3206025

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. cross-modal correlation
    2. cross-modal hashing
    3. cross-modal retrieval
    4. graph hashing
    5. subspace learning

    Qualifiers

    • Research-article

    Conference

    ICMR '18

    Acceptance Rates

    ICMR '18 Paper Acceptance Rate: 44 of 136 submissions, 32%
    Overall Acceptance Rate: 254 of 830 submissions, 31%
