Index and Retrieve Multimedia Data: Cross-Modal Hashing by Learning Subspace Relation

Liu, Luchen; Yang, Yang; Hu, Mengqiu; Xu, Xing; Shen, Fumin; Xie, Ning; Huang, Zi

doi:10.1007/978-3-319-91458-9_37

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10828)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Hashing methods have been extensively applied to efficient multimedia data indexing and retrieval, driven by the explosive growth of multimedia data. Cross-modal hashing usually learns binary codes by mapping multi-modal data into a common Hamming space. Most supervised methods use relation information, such as class labels, as pairwise similarities between cross-modal data pairs to narrow the intra-modal and inter-modal gaps. In this paper, we propose a novel supervised cross-modal hashing method, dubbed Subspace Relation Learning for Cross-modal Hashing (SRLCH), which exploits the relation information in semantic labels to bring similar data from different modalities closer together in the low-dimensional Hamming subspace. SRLCH preserves the discrete constraints and nonlinear structures while admitting a closed-form solution for the binary codes, which improves training efficiency. An iterative alternating optimization algorithm is developed to learn the hash functions and the unified binary codes simultaneously, so that multimedia data can be indexed efficiently. Evaluations on two cross-modal retrieval tasks over three widely used datasets show that the proposed SRLCH outperforms most cross-modal hashing methods.
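To make the idea of retrieval in a common Hamming space concrete, the following is a minimal sketch of generic cross-modal hashing retrieval, not the SRLCH algorithm itself. It assumes one linear hash function per modality with sign thresholding; the projection matrices `W_img` and `W_txt`, the toy features, and all function names are hypothetical values chosen for illustration, whereas a real method would learn the projections from labeled data.

```python
from typing import List, Sequence


def hash_code(x: Sequence[float], W: Sequence[Sequence[float]]) -> int:
    """Pack sign(W @ x) into an integer, one bit per row of W."""
    code = 0
    for row in W:
        activation = sum(w * v for w, v in zip(row, x))
        code = (code << 1) | (1 if activation >= 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two packed binary codes."""
    return bin(a ^ b).count("1")


def retrieve(query_code: int, db_codes: List[int], k: int) -> List[int]:
    """Indices of the k database items closest to the query in Hamming distance."""
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming(query_code, db_codes[i]))
    return order[:k]


# Hypothetical (not learned) projections mapping each modality to 3 bits.
W_img = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]                  # 2-D image features
W_txt = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, -1.0, 0.0]]   # 3-D text features

# Toy image database, indexed once as binary codes.
image_db = [[0.9, 0.2], [-0.5, 0.7], [-0.3, -0.8]]
db_codes = [hash_code(x, W_img) for x in image_db]

# Text-to-image retrieval: hash the text query, then rank images by Hamming distance.
text_query = [0.8, 0.1, 0.4]
top = retrieve(hash_code(text_query, W_txt), db_codes, k=2)
```

Because both modalities are mapped to codes of the same length, a text query can be compared with image codes using only XOR and bit counting, which is what makes this kind of indexing cheap at query time.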

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Project 61572108, Project 61632007 and Project 61502081.

Author information

Authors and Affiliations

Center for Future Media and School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
Luchen Liu, Yang Yang, Mengqiu Hu, Xing Xu, Fumin Shen & Ning Xie
School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia
Zi Huang

Corresponding author

Correspondence to Yang Yang.

Editor information

Editors and Affiliations

Simon Fraser University, Burnaby, BC, Canada
Jian Pei
Aristotle University of Thessaloniki, Thessaloniki, Greece
Yannis Manolopoulos
University of Queensland, Brisbane, QLD, Australia
Shazia Sadiq
University of Western Australia, Crawley, WA, Australia
Jianxin Li

About this paper

Cite this paper

Liu, L. et al. (2018). Index and Retrieve Multimedia Data: Cross-Modal Hashing by Learning Subspace Relation. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10828. Springer, Cham. https://doi.org/10.1007/978-3-319-91458-9_37

DOI: https://doi.org/10.1007/978-3-319-91458-9_37
Published: 12 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91457-2
Online ISBN: 978-3-319-91458-9
eBook Packages: Computer Science, Computer Science (R0)