Abstract
Hashing methods have been extensively applied to efficient multimedia data indexing and retrieval owing to the explosion of multimedia data. Cross-modal hashing usually learns binary codes by mapping multi-modal data into a common Hamming space. Most supervised methods use relation information such as class labels as pairwise similarities between cross-modal data pairs to narrow the intra-modal and inter-modal gaps. In this paper, we propose a novel supervised cross-modal hashing method dubbed Subspace Relation Learning for Cross-modal Hashing (SRLCH), which exploits relation information in semantic labels to bring similar data from different modalities closer together in the low-dimensional Hamming subspace. SRLCH preserves the discrete constraints and nonlinear structures while admitting a closed-form solution for the binary codes, which effectively improves training efficiency. An iterative alternating optimization algorithm is developed to learn the hash functions and the unified binary codes simultaneously, so that multimedia data can be indexed efficiently. Evaluations on two cross-modal retrieval tasks over three widely used datasets show that the proposed SRLCH outperforms most cross-modal hashing methods.
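To make the idea of retrieval in a common Hamming space concrete, the following is a minimal sketch, not the SRLCH algorithm itself. It assumes that each modality has some hash function that maps features to codes in {-1, +1}; here the hash functions are stand-in random linear projections (`W_img` and `W_txt` are hypothetical and untrained), whereas SRLCH would learn them from semantic labels. The feature dimensions, code length, and all function names are illustrative assumptions.

```python
import numpy as np


def hash_codes(features, projection):
    """Map real-valued features to {-1, +1} codes via the sign of a linear projection."""
    return np.where(features @ projection >= 0, 1, -1).astype(np.int8)


def hamming_distances(query_code, database_codes):
    """Hamming distance between one {-1, +1} code and every row of database_codes.

    For codes of length r in {-1, +1}, distance = (r - inner_product) / 2.
    """
    n_bits = database_codes.shape[1]
    inner = database_codes.astype(np.int32) @ query_code.astype(np.int32)
    return (n_bits - inner) // 2


def retrieve(query_code, database_codes, top_k=3):
    """Return indices of the top_k database items with the smallest Hamming distance."""
    dists = hamming_distances(query_code, database_codes)
    return np.argsort(dists, kind="stable")[:top_k]


# Illustrative data: all shapes and values below are assumptions.
rng = np.random.default_rng(0)
n_bits = 16
W_img = rng.standard_normal((8, n_bits))  # hypothetical image-side projection
W_txt = rng.standard_normal((5, n_bits))  # hypothetical text-side projection
image_feats = rng.standard_normal((100, 8))
text_query = rng.standard_normal(5)

# Both modalities land in the same 16-bit Hamming space,
# so a text query can be ranked against image codes directly.
image_codes = hash_codes(image_feats, W_img)
query_code = hash_codes(text_query, W_txt)
top = retrieve(query_code, image_codes)
```

Because the projections here are random, the ranking carries no semantic meaning; the sketch only shows why a shared binary space makes cross-modal lookup cheap, since each comparison reduces to an integer inner product over short codes.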
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Project 61572108, Project 61632007 and Project 61502081.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Liu, L. et al. (2018). Index and Retrieve Multimedia Data: Cross-Modal Hashing by Learning Subspace Relation. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10828. Springer, Cham. https://doi.org/10.1007/978-3-319-91458-9_37
DOI: https://doi.org/10.1007/978-3-319-91458-9_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91457-2
Online ISBN: 978-3-319-91458-9
eBook Packages: Computer Science, Computer Science (R0)