Abstract
Privacy preservation in a distributed environment is challenging because it requires efficient control strategies for the authentication and integrity preservation of users and applications. In such an environment, data items are typically spread across multiple parties, which may be located in different geographical regions. Most privacy-preserving techniques proposed in the literature consider collaboration between only two parties, which share data items using data perturbation and homomorphic encryption techniques. Data perturbation is less accurate, and homomorphic encryption incurs a high computation cost. To fill these gaps, this paper proposes a Cloud-based Multi-party Privacy Preserving Classification Scheme (ClaMPP), which uses a naive Bayesian classifier together with multi-party random masking and polynomial aggregation to preserve privacy among the parties. ClaMPP consists of four protocols. Protocols I and II securely compute the conditional probabilities between the item vectors using the Privacy Preserving Conditional Probability Protocol. Protocol III computes the prior probabilities for the item vectors using the additive property of Paillier homomorphic encryption. Protocol IV uses the naive Bayesian classification model built from the values computed in Protocols I, II and III to generate item predictions based on the user's interests. ClaMPP is evaluated on three datasets, MovieLens100K, MovieLens20M and Jester, to analyse privacy, accuracy and the overhead costs incurred. The experiments demonstrate that the proposed scheme is secure and that privacy preservation does not affect prediction accuracy. ClaMPP is also compared with related schemes in terms of off-line model computation cost, and the results show that its computation cost is significantly lower than that of other state-of-the-art techniques.
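The additive property of Paillier encryption mentioned for Protocol III can be illustrated with a minimal sketch. This is not the paper's protocol: the function names, the tiny hard-coded primes, and the example counts are all assumptions chosen for illustration, and a real deployment would use large random primes and a vetted cryptographic library. The sketch shows only that multiplying two ciphertexts yields an encryption of the sum of their plaintexts, so an aggregator can combine per-party counts without seeing them.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier key generation; the tiny primes are for illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    # With generator g = n + 1, L(g^lam mod n^2) = lam mod n, so mu = lam^-1 mod n.
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)

def encrypt(n, m):
    """Enc(m) = (n+1)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """Dec(c) = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Additive homomorphism: Enc(a) * Enc(b) mod n^2 decrypts to a + b."""
    return (c1 * c2) % (n * n)

def aggregate_prior(class_counts, total_counts):
    """Hypothetical helper: each party encrypts its local counts, the
    ciphertexts are combined, and only the two sums are decrypted."""
    n, priv = keygen()
    enc_class = encrypt(n, 0)
    enc_total = encrypt(n, 0)
    for k, t in zip(class_counts, total_counts):
        enc_class = add_encrypted(n, enc_class, encrypt(n, k))
        enc_total = add_encrypted(n, enc_total, encrypt(n, t))
    return decrypt(n, priv, enc_class) / decrypt(n, priv, enc_total)
```

For example, `aggregate_prior([40, 25, 35], [100, 50, 50])` combines the made-up counts of three parties and returns a prior estimate from the decrypted sums only. The plaintext sums must stay below `n` for decryption to be correct, which is one reason the toy modulus is unsuitable beyond a demonstration.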
Cite this article
Kaur, H., Kumar, N. & Batra, S. ClaMPP: a cloud-based multi-party privacy preserving classification scheme for distributed applications. J Supercomput 75, 3046–3075 (2019). https://doi.org/10.1007/s11227-018-2691-0
DOI: https://doi.org/10.1007/s11227-018-2691-0