Abstract
Recent progress in information technology has facilitated the collection and storage of large amounts of data that multiple parties access in a distributed manner. Privacy is an important concern when mining sensitive data. In a distributed setting where the data is available only in encrypted form, mining it without sharing the original data among the involved parties is a challenging task. One of the tasks in privacy-preserving data mining is privacy-preserving data classification. In this work, we propose a privacy-preserving \(k\)-NN data classification technique for distributed encrypted databases. Our classification approach uses a private Jaccard similarity measure, which is based on a private equality testing protocol. We also analyze the security of the proposed protocol with respect to various cryptographic attacks.
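For orientation, the following is a minimal plaintext sketch of \(k\)-NN classification with Jaccard similarity, the non-private baseline that the abstract's technique is meant to protect. It is not the proposed protocol: there is no encryption, and the function names (`jaccard`, `knn_classify`), the record layout, and the toy data are all assumptions made for illustration. In the private setting described above, the equality checks that determine the intersection would presumably be carried out through the private equality testing protocol rather than on raw values.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two attribute sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention assumed here: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


def knn_classify(query, records, k=3):
    """Label `query` by majority vote among its k most similar records.

    `records` is a list of (attribute_set, label) pairs; this layout is a
    hypothetical choice for the sketch, not taken from the paper.
    """
    # Highest similarity first; ties keep their original order.
    ranked = sorted(records, key=lambda r: jaccard(query, r[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only.
records = [
    ({"a", "b", "c"}, "yes"),
    ({"a", "b", "d"}, "yes"),
    ({"x", "y", "z"}, "no"),
    ({"x", "y", "c"}, "no"),
]
label = knn_classify({"a", "b", "e"}, records, k=3)
```

Because the similarity depends only on the sizes of the intersection and the union, a private variant needs only a way to count matching elements without revealing them, which is why an equality test is a natural building block.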
Acknowledgements
The authors are thankful to the referees and the guest editor for their valuable comments, which resulted in an improved presentation of the paper. The authors also thank R. Phani Bhushan, Scientist-G, Advanced Data Processing Research Institute (ADRIN), for providing system-specific inputs to carry out this work.
Funding
Nil.
Ethics declarations
Conflict of interest
Nil.
About this article
Cite this article
Saxena, A., Krishna, P.R. A novel cryptographic protocol for privacy preserving classification over distributed encrypted databases. J BANK FINANC TECHNOL 6, 31–41 (2022). https://doi.org/10.1007/s42786-022-00042-z
DOI: https://doi.org/10.1007/s42786-022-00042-z