ABSTRACT
Recently, inspired by discoveries in neurobiology, high-dimensional sparse hashing has attracted increasing attention. In contrast with general hashing, which generates low-dimensional hash codes, high-dimensional sparse hashing maps inputs into a higher-dimensional space and generates sparse hash codes, achieving superior performance. However, sparse hashing has not yet been fully studied in the hashing literature. For example, it remains open how to fully exploit the power of sparse coding in cross-modal retrieval tasks, and how to solve the binary and sparse constraints discretely so as to avoid quantization error. Motivated by these issues, we present an efficient sparse hashing method, High-dimensional Sparse Cross-modal Hashing (HSCH). It takes the high-level semantic similarity of data into consideration and also properly exploits the low-level feature similarity. Specifically, we theoretically design a fine-grained similarity with two critical fusion rules, and then use sparse codes to embed this fine-grained similarity into the to-be-learnt hash codes. Moreover, we propose an efficient discrete optimization algorithm to solve the binary and sparse constraints, reducing the quantization error. As a result, the model becomes much easier to train, and the learnt hash codes are more discriminative. More importantly, retrieval with HSCH is as efficient as with general hashing methods. Extensive experiments on three widely used datasets demonstrate the superior performance of HSCH compared with several state-of-the-art cross-modal hashing approaches.
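To make the idea of "mapping inputs into a higher-dimensional space and generating sparse hash codes" concrete, the following is a minimal sketch of a generic sparse hashing scheme: a sparse random projection followed by a winner-take-all step that keeps only the `k` strongest activations. It is not the HSCH algorithm; HSCH learns its codes with supervision and discrete optimization, whereas this sketch uses a fixed random projection. The function names and the parameters `fan_in`, `code_dim`, and `k` are illustrative assumptions, not taken from the paper.

```python
import random


def make_projection(input_dim, code_dim, fan_in, seed=0):
    """Build a sparse binary random projection (illustrative, not learned).

    Each of the code_dim output units sums fan_in randomly chosen inputs.
    code_dim is assumed to be larger than input_dim (the "lifting" step).
    """
    rng = random.Random(seed)
    return [rng.sample(range(input_dim), fan_in) for _ in range(code_dim)]


def sparse_hash(x, projection, k):
    """Return the sparse binary code of x as the set of its active indices.

    The code has len(projection) bits, of which exactly k are set to 1:
    those with the largest projected activations (winner-take-all).
    """
    activations = [sum(x[i] for i in idx) for idx in projection]
    # Sort indices by descending activation; ties go to the lower index.
    ranked = sorted(range(len(activations)), key=lambda j: (-activations[j], j))
    return frozenset(ranked[:k])


def overlap(code_a, code_b):
    """Similarity between two sparse codes: the number of shared active bits."""
    return len(code_a & code_b)


if __name__ == "__main__":
    # Lift 8-dimensional inputs to 64-bit codes with 4 active bits each.
    proj = make_projection(input_dim=8, code_dim=64, fan_in=3, seed=1)
    query = [0.9, 0.1, 0.4, 0.0, 0.7, 0.2, 0.3, 0.5]
    item = [0.8, 0.2, 0.4, 0.1, 0.6, 0.2, 0.3, 0.5]
    print(overlap(sparse_hash(query, proj, 4), sparse_hash(item, proj, 4)))
```

Storing each code as its set of active indices means that comparing two codes costs time proportional to `k` rather than to the full code length. This is one plausible reason a high-dimensional sparse code need not be slower to search than a short dense one; the abstract does not state which mechanism HSCH itself relies on.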