Searching Privately by Imperceptible Lying: A Novel Private Hashing Method with Differential Privacy

Authors:
Yimu Wang, Nanjing University, Nanjing, China
Shiyin Lu, Nanjing University, Nanjing, China
Lijun Zhang, Nanjing University, Nanjing, China

MM '20: Proceedings of the 28th ACM International Conference on Multimedia, October 2020, Pages 2700–2709
https://doi.org/10.1145/3394171.3413882

Published: 12 October 2020

ABSTRACT

In the big data era, with the growing volume of multimedia data, approximate nearest neighbor (ANN) search has become an important but challenging problem. Hashing, a widely applied method for large-scale ANN search, has made great progress and achieves sub-linear search time with low memory cost. However, these advances rely on the availability of large, representative datasets, which often contain sensitive information, and the privacy of this individually sensitive information is typically compromised. In this paper, we tackle this valuable yet challenging problem and formulate a task termed private hashing, which takes into account both search performance and privacy protection. Specifically, we propose a novel noise mechanism, Random Flipping, and two private hashing algorithms, PHashing and PITQ, together with a refined analysis within the framework of differential privacy, a well-established technique for measuring the privacy leakage of an algorithm. Random Flipping targets binary scenarios and leverages the "Imperceptible Lying" idea to guarantee ε-differential privacy by flipping each datum of the binary matrix (a form of noise addition). To preserve ε-differential privacy, PHashing uses Random Flipping to perturb the hash codes learned by non-private hashing algorithms. However, the noise added for privacy in PHashing causes severe performance drops. To alleviate this problem, PITQ leverages alternating learning to distribute the noise generated by Random Flipping across iterations while preserving ε-differential privacy. Furthermore, to evaluate our algorithms empirically, we conduct comprehensive experiments on the image search task and demonstrate that the proposed algorithms achieve performance equal to that of non-private hashing methods.
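
The abstract does not state the flip probabilities used by Random Flipping or how PITQ divides the privacy budget, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It uses classic randomized response, where each entry of a {-1, +1} code matrix is flipped independently with probability 1/(1 + e^ε), which gives ε-differential privacy with respect to a change in that single entry. The names `flip_probability`, `random_flip`, and `per_iteration_budget` are hypothetical, and the even split of ε across iterations assumes basic sequential composition.

```python
import numpy as np


def flip_probability(epsilon: float) -> float:
    """Per-entry flip probability of classic randomized response.

    Assumption: this is the textbook choice, not necessarily the
    probability used by the paper's Random Flipping mechanism.
    """
    return float(1.0 / (1.0 + np.exp(epsilon)))


def random_flip(codes: np.ndarray, epsilon: float,
                rng: np.random.Generator) -> np.ndarray:
    """Flip each entry of a {-1, +1} matrix independently.

    An entry is kept with probability e^eps / (1 + e^eps) and negated
    otherwise, so the likelihood ratio for one entry is at most e^eps.
    """
    flips = rng.random(codes.shape) < flip_probability(epsilon)
    return np.where(flips, -codes, codes)


def per_iteration_budget(epsilon: float, iterations: int) -> float:
    """Even split of a total budget over iterations.

    Assumption: basic sequential composition; the paper's refined
    analysis may allocate the budget differently.
    """
    return epsilon / iterations


rng = np.random.default_rng(0)
# Hypothetical 32-bit hash codes for 5 items, entries in {-1, +1}.
codes = rng.choice(np.array([-1, 1]), size=(5, 32))
noisy = random_flip(codes, epsilon=1.0, rng=rng)
print("fraction of flipped bits:", float(np.mean(noisy != codes)))
```

A smaller ε raises the flip probability toward 1/2 and makes the released codes less informative, which matches the performance drop the abstract attributes to noise addition in PHashing.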

Supplemental Material

3394171.3413882.mp4 (mp4, 3.3 MB)

mmfp0970aux.zip (zip, 1.3 MB): proofs of theorems and lemmas, and additional experimental results that are not presented in the main paper due to space limitations.

Index Terms

1. Information systems
  1. Information retrieval
2. Security and privacy

Published in
MM '20: Proceedings of the 28th ACM International Conference on Multimedia
October 2020
4889 pages
ISBN: 9781450379885
DOI: 10.1145/3394171
General Chairs:
Chang Wen Chen, Chinese University of Hong Kong, Shenzhen, China
Rita Cucchiara, UNIMORE, Italy
Xian-Sheng Hua, Alibaba Group, China
Program Chairs:
Guo-Jun Qi, Futurewei Technologies, USA
Elisa Ricci, UNITN & Fondazione Bruno Kessler, Italy
Zhengyou Zhang, Tencent, China
Roger Zimmermann, National University of Singapore, Singapore
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
differential privacy
hashing
large-scale multimedia retrieval
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%