DAP\(^2\)CMH: Deep Adversarial Privacy-Preserving Cross-Modal Hashing

Neural Processing Letters

Abstract

Privacy-preserving cross-modal retrieval is a significant problem in multimedia analysis. As data volumes grow rapidly, cross-modal data analysis and retrieval are often carried out in cloud computing environments. The privacy protection of large-scale cross-modal data has therefore become a problem that cannot be ignored. To further improve the accuracy and efficiency of privacy-preserving search, this paper proposes a novel cross-modal hashing scheme named deep adversarial privacy-preserving cross-modal hashing (DAP\(^2\)CMH). The method consists of a deep cross-modal hashing model, termed DACMH, and a secure index structure called CMH\(^2\)-Tree. The former combines deep hashing with adversarial learning to capture intra-modal and inter-modal correlations. The latter is a hierarchical hashing index structure that organizes data efficiently based on cross-modal hash codes. We conduct comprehensive experiments on three commonly used benchmarks. The results show that the proposed approach, DAP\(^2\)CMH, outperforms state-of-the-art methods.
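
To make the idea of organizing data by cross-modal hash codes more concrete, the following is a minimal sketch of retrieval over binary codes with a simple prefix-based grouping. It is not the CMH\(^2\)-Tree described in the paper, whose construction and security mechanisms are not given in the abstract; the class `PrefixBucketIndex`, its parameters, and the example codes are all hypothetical and chosen only for illustration. The sketch assumes that image and text items have already been mapped into the same Hamming space by some hashing model.

```python
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash codes stored as ints."""
    return bin(a ^ b).count("1")


class PrefixBucketIndex:
    """Toy two-level index (hypothetical): codes are grouped by leading bits."""

    def __init__(self, code_bits: int, prefix_bits: int):
        self.code_bits = code_bits
        self.prefix_bits = prefix_bits
        self.buckets = defaultdict(list)  # prefix -> [(item_id, code)]

    def _prefix(self, code: int) -> int:
        # Keep only the top `prefix_bits` bits of the code.
        return code >> (self.code_bits - self.prefix_bits)

    def insert(self, item_id: str, code: int) -> None:
        self.buckets[self._prefix(code)].append((item_id, code))

    def search(self, query_code: int, k: int, prefix_radius: int = 1):
        """Return up to k item ids, nearest first by Hamming distance.

        Only buckets whose prefix lies within `prefix_radius` bits of the
        query prefix are scanned, so the search is approximate.
        """
        query_prefix = self._prefix(query_code)
        candidates = []
        for prefix, items in self.buckets.items():
            if hamming(prefix, query_prefix) <= prefix_radius:
                candidates.extend(items)
        # Rank by distance; break ties by item id for a stable order.
        candidates.sort(key=lambda it: (hamming(it[1], query_code), it[0]))
        return [item_id for item_id, _ in candidates[:k]]


# Made-up 8-bit codes: three "image" items and one "text" query.
index = PrefixBucketIndex(code_bits=8, prefix_bits=4)
index.insert("img_a", 0b10110010)
index.insert("img_b", 0b10111111)
index.insert("img_c", 0b01001101)
results = index.search(0b10110011, k=2)
```

Grouping by prefix trades some recall for speed: an item whose prefix differs from the query prefix by more than `prefix_radius` bits is never examined, even if its full code is close to the query.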

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (62072166, 61702560, 61472450, 61972203), the Key Research Program of Hunan Province (2016JC2018), and Project (2018JJ3691) of the Science and Technology Plan of Hunan Province.

Author information

Corresponding authors

Correspondence to Chengyuan Zhang or Weiren Yu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhu, L., Song, J., Yang, Z. et al. DAP\(^2\)CMH: Deep Adversarial Privacy-Preserving Cross-Modal Hashing. Neural Process Lett 54, 2549–2569 (2022). https://doi.org/10.1007/s11063-021-10447-4

  • DOI: https://doi.org/10.1007/s11063-021-10447-4
