A sanitization approach for privacy preserving data mining on social distributed environment

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Data owners worry that private information in their data may be disclosed without authorization in a cloud computing environment. When applying privacy-preserving methods, data owners also try to retain the knowledge contained in the data. One approach to this problem is the distributed database, in which different parties hold horizontal or vertical partitions of the data. Cluster analysis is a frequently used data mining task that partitions a (usually multivariate) data set into groups such that the data objects in one group are more similar to each other than to objects in other groups. With an encryption-based kernel k-means algorithm, large data sets cannot be encrypted in the distributed environment. To extend the privacy concept, a novel method based on Privacy Preserving Distributed Data Mining is proposed. In this method, a sanitization approach is developed to improve the privacy of user data. In the sanitization process, a privacy-based objective function is defined and an optimal key is generated from it. The artificial bee colony algorithm is used for optimal key generation, so that large amounts of data can be encrypted. Once sanitization is complete, the helper user of each cluster sends the sanitized information to the service provider. Finally, experiments are carried out on an existing database to demonstrate the efficiency of the proposed algorithm. The implementation is done in Java using a cloud simulator. Extensive performance evaluation and security analysis demonstrate the validity and efficiency of the proposed technique.
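
The abstract does not state the privacy-based objective function or the key format, so the following is only a minimal sketch of how an artificial bee colony search for a key could look. It is written in Python for brevity, although the paper reports a Java implementation. The names `abc_minimize`, `toy_objective`, `SENSITIVE` and `MIN_SHIFT` are hypothetical, and the toy objective (penalize distortion of non-sensitive attributes, penalize too little perturbation of sensitive ones) is an assumption made for illustration, not the authors' function.

```python
import random

# Hypothetical setup: a key is a vector of additive offsets, one per attribute.
SENSITIVE = {1, 3}   # assumed indices of sensitive attributes
MIN_SHIFT = 5.0      # assumed minimum perturbation for sensitive attributes


def toy_objective(key):
    """Placeholder cost (always >= 0); NOT the objective used in the paper."""
    cost = 0.0
    for j, k in enumerate(key):
        if j in SENSITIVE:
            # Penalize sensitive attributes that are perturbed too little.
            cost += max(0.0, MIN_SHIFT - abs(k)) ** 2
        else:
            # Penalize distortion of non-sensitive attributes.
            cost += k * k
    return cost


def abc_minimize(objective, dim, lower, upper,
                 n_sources=10, limit=20, iterations=100, seed=0):
    """Basic artificial bee colony minimization of a non-negative objective."""
    rng = random.Random(seed)

    def random_source():
        return [rng.uniform(lower, upper) for _ in range(dim)]

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def try_neighbour(i):
        # Candidate: v_ij = x_ij + phi * (x_ij - x_kj), with k != i and one random j.
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        candidate = list(sources[i])
        phi = rng.uniform(-1.0, 1.0)
        candidate[j] += phi * (sources[i][j] - sources[k][j])
        candidate[j] = min(upper, max(lower, candidate[j]))
        cost = objective(candidate)
        if cost < costs[i]:          # greedy selection
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    for _ in range(iterations):
        # Employed bees: one local search step per food source.
        for i in range(n_sources):
            try_neighbour(i)
        # Onlooker bees: prefer sources with higher fitness 1 / (1 + cost).
        weights = [1.0 / (1.0 + c) for c in costs]
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=weights)[0]
            try_neighbour(i)
        # Memorize the best source before scouts may replace it.
        i = min(range(n_sources), key=costs.__getitem__)
        if costs[i] < best_cost:
            best_cost, best = costs[i], list(sources[i])
        # Scout bees: abandon sources that stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0
    return best, best_cost


best_key, best_cost = abc_minimize(toy_objective, dim=4, lower=-10.0, upper=10.0, seed=1)
print(best_key, best_cost)
```

In this sketch the key is a real-valued vector and the search is seeded so that runs are repeatable. A real sanitization key, its encoding, and the way it is applied to each cluster's data would follow the definitions in the full article.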

Author information

Corresponding author

Correspondence to P. L. Lekshmy.

About this article

Cite this article

Lekshmy, P.L., Rahiman, M.A. A sanitization approach for privacy preserving data mining on social distributed environment. J Ambient Intell Human Comput 11, 2761–2777 (2020). https://doi.org/10.1007/s12652-019-01335-w

  • DOI: https://doi.org/10.1007/s12652-019-01335-w
