research-article

Online Cross-modal Hashing With Dynamic Prototype

Authors:

Yilong Yin

ACM Transactions on Multimedia Computing, Communications and Applications, Volume 20, Issue 8

Article No.: 252, Pages 1 - 18

https://doi.org/10.1145/3665249

Published: 13 June 2024

Abstract

Online cross-modal hashing has received increasing attention due to its efficiency and effectiveness in handling cross-modal streaming data retrieval. Despite their promising performance, existing methods mainly follow the supervised learning paradigm, which demands expensive and laborious work to obtain clean annotated data. Existing unsupervised online hashing methods mostly struggle to construct instructive semantic correlations among data chunks, and consequently forget the accumulated data distribution. To this end, we propose a Dynamic Prototype-based Online Cross-modal Hashing method, called DPOCH. Based on pre-learned reliable common representations, DPOCH generates prototypes incrementally as sketches of the accumulated data and updates them dynamically to adapt to streaming data. Thereafter, prototype-based semantic embedding and similarity graphs are designed to promote the stability and generalization of the hashing process, thereby obtaining globally adaptive hash codes and hash functions. Experimental results on benchmark datasets demonstrate that the proposed DPOCH outperforms state-of-the-art unsupervised online cross-modal hashing methods.
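
The abstract does not give DPOCH's actual update rules, so the following is only an illustrative sketch of the general idea of keeping prototypes as a compact summary of a data stream. It assumes each chunk is already a matrix of common representations, spawns a new prototype when no existing one is similar enough, and otherwise moves the nearest prototype by a running mean. The class name `PrototypeSketch`, the cosine criterion, and the `sim_threshold` parameter are all hypothetical choices made for this example, not the authors' method.

```python
import numpy as np


class PrototypeSketch:
    """Toy running-mean prototypes over a stream; NOT the DPOCH update rule."""

    def __init__(self, sim_threshold=0.5):
        self.sim_threshold = sim_threshold  # hypothetical spawn threshold
        self.prototypes = []                # list of 1-D float arrays
        self.counts = []                    # samples absorbed per prototype

    def _cosine_to_prototypes(self, x):
        protos = np.stack(self.prototypes)
        norms = np.linalg.norm(protos, axis=1) * np.linalg.norm(x)
        return protos @ x / np.maximum(norms, 1e-12)

    def update(self, chunk):
        """Absorb one data chunk of shape (n_samples, dim)."""
        for x in np.asarray(chunk, dtype=float):
            if self.prototypes:
                sims = self._cosine_to_prototypes(x)
                j = int(np.argmax(sims))
                if sims[j] >= self.sim_threshold:
                    # Dynamic update: running mean of the matched prototype.
                    self.counts[j] += 1
                    self.prototypes[j] += (x - self.prototypes[j]) / self.counts[j]
                    continue
            # Incremental generation: no existing prototype is close enough.
            self.prototypes.append(x.copy())
            self.counts.append(1)

    def similarity_to_prototypes(self, chunk):
        """Cosine similarity between each sample and each prototype."""
        chunk = np.asarray(chunk, dtype=float)
        return np.stack([self._cosine_to_prototypes(x) for x in chunk])


# Two small chunks arriving one after the other.
sketch = PrototypeSketch(sim_threshold=0.5)
sketch.update([[1.0, 0.0], [0.9, 0.1]])
sketch.update([[0.0, 1.0], [1.0, 0.0]])
graph = sketch.similarity_to_prototypes([[1.0, 0.0], [0.0, 1.0]])
```

Because only the prototypes and their counts are stored, memory does not grow with the number of samples seen, and the sample-to-prototype similarity matrix can stand in for a graph over the full accumulated data. How DPOCH itself builds its semantic embedding and similarity graphs is described in the full text, not here.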

Index Terms

Online Cross-modal Hashing With Dynamic Prototype
1. Information systems
  1. Information retrieval
    1. Specialized information retrieval
      1. Multimedia and multimodal retrieval

Information & Contributors

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 20, Issue 8

August 2024

726 pages

EISSN:1551-6865

DOI:10.1145/3618074

Editor:
Abdulmotaleb El Saddik
Mohamed Bin Zayed University of Artificial Intelligence, UAE and University of Ottawa, Canada

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 June 2024

Online AM: 16 May 2024

Accepted: 12 May 2024

Revised: 03 March 2024

Received: 13 December 2023

Published in TOMM Volume 20, Issue 8

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
Natural Science Foundation of Shandong Province
Major Basic Research Project of Natural Science Foundation of Shandong Province
Taishan Scholar Project of Shandong Province
Shandong Provincial Natural Science Foundation for Distinguished Young Scholars
