ClaMPP: a cloud-based multi-party privacy preserving classification scheme for distributed applications

The Journal of Supercomputing

Abstract

Privacy preservation in a distributed environment is challenging because it requires efficient control strategies for authenticating users and applications and for preserving their integrity. In such an environment, data items are typically spread among several parties, which may be located in different geographical locations. Most privacy-preserving techniques proposed in the literature consider collaboration between only two parties, who share data items using data perturbation and homomorphic encryption. Data perturbation reduces accuracy, and homomorphic encryption incurs a high computation cost. To fill these gaps, this paper proposes a Cloud-based Multi-party Privacy Preserving Classification Scheme (ClaMPP), which builds a naive Bayesian classifier using multi-party random masking and polynomial aggregation to preserve privacy among the parties. ClaMPP consists of four protocols. Protocols I and II securely calculate the conditional probabilities between item vectors using the Privacy Preserving Conditional Probability Protocol. Protocol III computes the prior probabilities between item vectors using the additive property of the Paillier homomorphic encryption technique. Protocol IV uses the naive Bayesian classification model, constructed from the values calculated in protocols I, II and III, to generate predictions for items based on the user's interest. Three datasets, MovieLens100K, MovieLens20M and Jester, are used to analyse the privacy, accuracy and overhead costs of ClaMPP. The experiments demonstrate that the proposed scheme is secure and that privacy preservation does not affect prediction accuracy. ClaMPP is also compared with related schemes on off-line model computation cost, and the results show that its computation cost is significantly lower than that of other state-of-the-art techniques.
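The additive property of Paillier encryption that the abstract mentions for protocol III can be illustrated with a minimal sketch. This is not the protocol from the paper, whose details are not given here. It only shows the general idea that multiplying Paillier ciphertexts adds the underlying plaintexts, so per-party class counts can be summed without revealing any single party's count. The function names, the "like"/"dislike" classes, the example counts and the tiny primes are all hypothetical choices for illustration, and the key size is far too small for real use.

```python
import math
import secrets

def keygen(p, q):
    """Toy Paillier key generation from two primes (illustrative sizes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                     # common simplification for the generator
    mu = pow(lam, -1, n)          # valid inverse because g = n + 1
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n         # L(u) * mu mod n

def add_encrypted(pub, c1, c2):
    """Additive homomorphism: the product of ciphertexts encrypts the sum."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def aggregate_priors(party_counts, pub, priv):
    """Sum encrypted per-party class counts, then derive class priors.

    Assumption for this sketch: a single key holder decrypts only the totals.
    """
    classes = list(party_counts[0].keys())
    totals = {}
    for cls in classes:
        acc = encrypt(pub, 0)
        for counts in party_counts:
            # Each party would encrypt its own count locally.
            acc = add_encrypted(pub, acc, encrypt(pub, counts[cls]))
        totals[cls] = decrypt(pub, priv, acc)
    grand_total = sum(totals.values())
    return {cls: totals[cls] / grand_total for cls in classes}

# Hypothetical example: three parties with local rating counts per class.
pub, priv = keygen(1009, 1013)   # toy primes, insecure
party_counts = [
    {"like": 120, "dislike": 80},
    {"like": 45, "dislike": 55},
    {"like": 210, "dislike": 90},
]
priors = aggregate_priors(party_counts, pub, priv)
print(priors)
```

In this sketch the aggregator only ever handles ciphertexts, and the decrypted values are the cross-party totals (375 and 225 in the example), from which the priors follow by division.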


Corresponding author

Correspondence to Neeraj Kumar.

Cite this article

Kaur, H., Kumar, N. & Batra, S. ClaMPP: a cloud-based multi-party privacy preserving classification scheme for distributed applications. J Supercomput 75, 3046–3075 (2019). https://doi.org/10.1007/s11227-018-2691-0

  • DOI: https://doi.org/10.1007/s11227-018-2691-0
