DOI: 10.1145/3206025.3206042
Research Article

Collaborative Subspace Graph Hashing for Cross-modal Retrieval

Published: 05 June 2018

Abstract

Current hashing methods for cross-modal retrieval generally learn separate modality-specific transformation matrices to embed multi-modality data into a latent common subspace, and usually ignore the fact that respecting the diversity of multi-modality features in the latent subspace can benefit retrieval. To this end, we propose a collaborative subspace graph hashing method (CSGH), a two-stage collaborative learning framework for cross-modal retrieval. Specifically, CSGH first embeds multi-modality data into separate latent subspaces through individual modality-specific transformation matrices, and then connects these latent subspaces to a common Hamming space through a shared transformation matrix. Within this framework, CSGH captures the modality-specific neighborhood structure and the cross-modal correlation of multi-modality data through Laplacian regularization and a graph-based correlation constraint, respectively. To solve CSGH, we develop an alternating optimization procedure in which each sub-problem has a closed-form analytical solution. Experiments on cross-modal retrieval over the Wiki, NUS-WIDE, Flickr25K and Flickr1M datasets show the effectiveness of CSGH compared with state-of-the-art cross-modal hashing methods.
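To make the two-stage framework concrete, the sketch below shows one plausible shape of such a pipeline in NumPy. It is not the authors' implementation: the variable names (X1, X2, U1, U2, W, B), the dimensions, the ridge-style closed-form updates, and the plain sign binarization are all illustrative assumptions, and the Laplacian regularization and graph-based correlation constraints of the actual CSGH objective are omitted for brevity.

```python
# Illustrative sketch only (not the authors' released code): a generic
# two-stage cross-modal hashing skeleton. All names and hyperparameters
# below are hypothetical stand-ins.
import numpy as np

rng = np.random.default_rng(0)
n, d_img, d_txt, d_lat, n_bits = 500, 128, 64, 32, 16

X1 = rng.standard_normal((n, d_img))   # image features (one row per sample)
X2 = rng.standard_normal((n, d_txt))   # text features, paired row-wise with X1

# Stage 1: modality-specific projections into separate latent subspaces.
U1 = rng.standard_normal((d_img, d_lat))
U2 = rng.standard_normal((d_txt, d_lat))
# Stage 2: a single shared projection from latent subspace to Hamming space.
W = rng.standard_normal((d_lat, n_bits))
B = np.sign(rng.standard_normal((n, n_bits)))   # binary codes in {-1, +1}

alpha = 1e-2   # hypothetical ridge regularizer (stands in for the graph terms)
for _ in range(20):
    # Fix B, W; regress each modality onto latent targets T = B W^T:
    #   U_m = argmin_U ||X_m U - T||_F^2 + alpha ||U||_F^2  (closed form).
    T = B @ W.T
    U1 = np.linalg.solve(X1.T @ X1 + alpha * np.eye(d_img), X1.T @ T)
    U2 = np.linalg.solve(X2.T @ X2 + alpha * np.eye(d_txt), X2.T @ T)
    # Fix U1, U2, B; fit the shared W on the stacked latent representations.
    Z = np.vstack([X1 @ U1, X2 @ U2])
    W = np.linalg.solve(Z.T @ Z + alpha * np.eye(d_lat), Z.T @ np.vstack([B, B]))
    # Fix U1, U2, W; update codes by sign-thresholding the averaged embedding.
    B = np.sign((X1 @ U1 + X2 @ U2) @ W / 2.0)
    B[B == 0] = 1   # avoid zero bits

# Query time: hash a new image through its modality-specific path, then W.
x_new = rng.standard_normal((1, d_img))
code = np.sign(x_new @ U1 @ W)
```

The point of the sketch is the alternation itself: with the other variables held fixed, each update reduces to a regularized least-squares problem with a closed-form solution, which mirrors the abstract's claim that every CSGH sub-problem admits an analytical solution.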


Cited By

  • (2024) Distribution Consistency Guided Hashing for Cross-Modal Retrieval. Proceedings of the 32nd ACM International Conference on Multimedia, 5623-5632. DOI: 10.1145/3664647.3680633. Online publication date: 28-Oct-2024.
  • (2023) Learning Explicit and Implicit Dual Common Subspaces for Audio-visual Cross-modal Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications, 19(2s), 1-23. DOI: 10.1145/3564608. Online publication date: 17-Feb-2023.
  • (2023) Discriminative Geometric-Structure-Based Deep Hashing for Large-Scale Image Retrieval. IEEE Transactions on Cybernetics, 53(10), 6236-6247. DOI: 10.1109/TCYB.2022.3173315. Online publication date: Oct-2023.
  • (2022) Self-Collaborative Unsupervised Hashing for Large-Scale Image Retrieval. IEEE Access, 10, 103588-103597. DOI: 10.1109/ACCESS.2020.3032628. Online publication date: 2022.
  • (2020) Collective Affinity Learning for Partial Cross-Modal Hashing. IEEE Transactions on Image Processing, 29, 1344-1355. DOI: 10.1109/TIP.2019.2941858. Online publication date: 2020.
  • (2020) Deep supervised multimodal semantic autoencoder for cross-modal retrieval. Computer Animation and Virtual Worlds, 31(4-5). DOI: 10.1002/cav.1962. Online publication date: 7-Sep-2020.
  • (2019) Label guided correlation hashing for large-scale cross-modal retrieval. Multimedia Tools and Applications, 78(21), 30895-30922. DOI: 10.1007/s11042-019-7192-5. Online publication date: 6-Feb-2019.


    Published In
    ICMR '18: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval
    June 2018
    550 pages
    ISBN:9781450350464
    DOI:10.1145/3206025

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. cross-modal correlation
    2. cross-modal hashing
    3. cross-modal retrieval
    4. graph hashing
    5. subspace learning

    Qualifiers

    • Research-article

    Conference

    ICMR '18

    Acceptance Rates

    ICMR '18 Paper Acceptance Rate: 44 of 136 submissions, 32%
    Overall Acceptance Rate: 254 of 830 submissions, 31%
