Online Cross-modal Hashing With Dynamic Prototype

Published: 13 June 2024

Abstract

Online cross-modal hashing has received increasing attention due to its efficiency and effectiveness in retrieving cross-modal streaming data. Despite their promising performance, existing methods mainly follow the supervised learning paradigm, which demands expensive and laborious annotation to obtain clean labeled data. Existing unsupervised online hashing methods mostly struggle to construct instructive semantic correlations among data chunks, so they forget the distribution of the accumulated data. To this end, we propose a Dynamic Prototype-based Online Cross-modal Hashing method, called DPOCH. Based on pre-learned reliable common representations, DPOCH generates prototypes incrementally as sketches of the accumulated data and updates them dynamically to adapt to streaming data. Thereafter, prototype-based semantic embedding and similarity graphs are designed to promote the stability and generalization of the hashing process, thereby obtaining globally adaptive hash codes and hash functions. Experimental results on benchmark datasets demonstrate that the proposed DPOCH outperforms state-of-the-art unsupervised online cross-modal hashing methods.
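
As a rough illustration of the general idea of keeping prototypes as a compact sketch of accumulated data, the following minimal Python example updates a small set of prototypes chunk by chunk. It is not the DPOCH algorithm: the abstract does not specify the update rule, so the nearest-prototype assignment, the running-mean update, the sign-of-projection hash, and the names `PrototypeSketch`, `update`, and `hash_codes` are all assumptions made for illustration.

```python
import numpy as np


class PrototypeSketch:
    """Hypothetical running-mean prototypes summarizing all chunks seen so far."""

    def __init__(self, num_prototypes, dim, seed=0):
        rng = np.random.default_rng(seed)
        # Random initial centers (an assumption); they carry zero weight.
        self.centers = rng.standard_normal((num_prototypes, dim))
        self.counts = np.zeros(num_prototypes)

    def update(self, chunk):
        """Fold one chunk of common-representation vectors into the prototypes."""
        # Squared Euclidean distance from every sample to every prototype.
        dists = ((chunk[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for k in np.unique(assign):
            members = chunk[assign == k]
            total = self.counts[k] + len(members)
            # Running mean: old samples are represented only by center and count,
            # so earlier chunks never need to be revisited.
            self.centers[k] = (
                self.counts[k] * self.centers[k] + members.sum(axis=0)
            ) / total
            self.counts[k] = total
        return assign


def hash_codes(features, projection):
    """Sign-of-linear-projection hash (illustrative, not the paper's hash function)."""
    return np.where(features @ projection >= 0, 1, -1).astype(np.int8)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sketch = PrototypeSketch(num_prototypes=4, dim=8)
    projection = rng.standard_normal((8, 16))  # 16-bit codes, assumed length
    for _ in range(3):  # three streaming chunks of synthetic features
        chunk = rng.standard_normal((32, 8))
        sketch.update(chunk)
    print(hash_codes(sketch.centers, projection).shape)
```

The point of the sketch is the memory profile: each chunk is discarded after `update`, and only the prototype centers and counts persist across chunks.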

Cited By

  • Multi-scale Consistency Deep Lifelong Cross-modal Hashing. ACM Transactions on Multimedia Computing, Communications, and Applications. DOI: 10.1145/3704636

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 8
August 2024
726 pages
EISSN: 1551-6865
DOI: 10.1145/3618074
Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 June 2024
Online AM: 16 May 2024
Accepted: 12 May 2024
Revised: 03 March 2024
Received: 13 December 2023
Published in TOMM Volume 20, Issue 8

Author Tags

  1. Cross-modal retrieval
  2. unsupervised online hashing
  3. common representation learning
  4. dynamic prototype update

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Natural Science Foundation of Shandong Province
  • Major Basic Research Project of Natural Science Foundation of Shandong Province
  • Taishan Scholar Project of Shandong Province
  • Shandong Provincial Natural Science Foundation for Distinguished Young Scholars

Article Metrics

  • Downloads (Last 12 months): 296
  • Downloads (Last 6 weeks): 18
Reflects downloads up to 27 Feb 2025
