DOI: 10.1145/3394171.3413882
research-article

Searching Privately by Imperceptible Lying: A Novel Private Hashing Method with Differential Privacy

Published: 12 October 2020

ABSTRACT

In the big data era, with the growing volume of multimedia data, approximate nearest neighbor (ANN) search has become an important but challenging problem. As a widely applied large-scale ANN search method, hashing has made great progress and achieves sub-linear search time with low memory cost. However, these advances depend on the availability of large, representative datasets, which often contain sensitive information, and the privacy of that individually sensitive information is typically compromised. In this paper, we tackle this valuable yet challenging problem and formulate a task termed private hashing, which takes into account both search performance and privacy protection. Specifically, we propose a novel noise mechanism, Random Flipping, and two private hashing algorithms, PHashing and PITQ, with refined analysis within the framework of differential privacy, a well-established technique for measuring the privacy leakage of an algorithm. Random Flipping targets binary scenarios and leverages the "Imperceptible Lying" idea to guarantee ε-differential privacy by flipping each datum of the binary matrix (noise addition). To preserve ε-differential privacy, PHashing uses Random Flipping to perturb the hash codes learned by non-private hashing algorithms. However, the noise added for privacy in PHashing causes severe performance drops. To alleviate this problem, PITQ leverages the power of alternative learning to distribute the noise generated by Random Flipping across iterations while preserving ε-differential privacy. Furthermore, to empirically evaluate our algorithms, we conduct comprehensive experiments on the image search task and demonstrate that the proposed algorithms achieve performance on par with non-private hashing methods.
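The bit-flipping idea behind Random Flipping can be illustrated with a minimal sketch. The abstract does not state the flip probability or the exact privacy accounting, so the sketch below assumes the classic randomized-response rule: each entry of a 0/1 code matrix is flipped independently with probability 1 / (1 + e^ε). Under that assumption, for any single entry the probability of observing a given output bit differs by a factor of at most e^ε between the two possible input bits. The names `flip_probability` and `random_flip`, and the 0/1 encoding of the hash codes, are hypothetical and not taken from the paper.

```python
import math
import random


def flip_probability(epsilon: float) -> float:
    """Flip probability under classic randomized response (an assumption,
    not necessarily the probability used in the paper)."""
    return 1.0 / (1.0 + math.exp(epsilon))


def random_flip(codes, epsilon, rng=None):
    """Return a copy of a 0/1 code matrix with each bit flipped independently.

    codes   -- list of rows, each a list of 0/1 bits (hypothetical encoding)
    epsilon -- per-entry privacy parameter; larger values mean fewer flips
    rng     -- optional random.Random instance for reproducibility
    """
    rng = rng if rng is not None else random.Random()
    p = flip_probability(epsilon)
    # XOR with 1 flips a bit; each entry is perturbed independently.
    return [[b ^ 1 if rng.random() < p else b for b in row] for row in codes]


if __name__ == "__main__":
    # Stand-in for hash codes produced by some non-private hashing method.
    learned_codes = [[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
    print(random_flip(learned_codes, epsilon=1.0, rng=random.Random(0)))
```

Applying `random_flip` once to already-learned codes mirrors the post-hoc perturbation that the abstract attributes to PHashing. At ε = 0 every bit is flipped with probability 1/2, so the output carries no information about the input, which is one way to see why a small privacy budget degrades search quality.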


        Published in

        MM '20: Proceedings of the 28th ACM International Conference on Multimedia
        October 2020
        4889 pages
        ISBN: 9781450379885
        DOI: 10.1145/3394171

        Copyright © 2020 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        Overall Acceptance Rate: 995 of 4,171 submissions, 24%
