Privacy preserving data sharing and analysis for edge-based architectures

  • regular contribution
International Journal of Information Security

Abstract

In this paper, we present a framework for privacy-preserving collaborative data analysis among multiple data providers acting as the edge of a cloud environment. The proposed framework computes the best trade-off between privacy and result accuracy, based on the privacy requirements of the data providers and the specific analysis algorithm requested. Although the presented model is general and can be applied to different environments, this work is motivated by the need to share Cyber Threat Information (CTI). The framework is independent of the number of data providers, the data format used, the privacy requirements and the analysis operations. The model is based on the concept of a trade-off score between accuracy and privacy, which also takes into account privacy measures such as differential privacy, l-diversity and k-anonymity. Together with the model, the paper discusses the framework implementation and presents results that show the effectiveness and viability of the proposed approach.
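
To make the idea of a trade-off score more concrete, the following minimal Python sketch may help. It is an illustration only and not the authors' implementation: the weighted-sum score, the function names (`k_anonymity`, `trade_off_score`, `best_level`), the sample records and the privacy and accuracy values are all assumptions introduced here. The sketch measures k-anonymity as the size of the smallest group of records sharing the same quasi-identifier values, then picks the candidate privacy level with the highest score.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same quasi-identifier values (the k for which the data is k-anonymous)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def trade_off_score(privacy, accuracy, weight=0.5):
    """Hypothetical score: a weighted sum of privacy and accuracy,
    both assumed to be normalised to the range [0, 1]."""
    return weight * privacy + (1 - weight) * accuracy


def best_level(candidates, weight=0.5):
    """Return the candidate privacy level with the highest trade-off score."""
    return max(
        candidates,
        key=lambda c: trade_off_score(c["privacy"], c["accuracy"], weight),
    )


# Made-up, already generalised records (illustrative only).
records = [
    {"zip": "561**", "age": "20-30", "label": "benign"},
    {"zip": "561**", "age": "20-30", "label": "malicious"},
    {"zip": "562**", "age": "30-40", "label": "benign"},
    {"zip": "562**", "age": "30-40", "label": "benign"},
    {"zip": "562**", "age": "30-40", "label": "malicious"},
]

# Made-up privacy and accuracy values for four candidate privacy levels.
candidates = [
    {"level": 0, "privacy": 0.0, "accuracy": 0.95},
    {"level": 1, "privacy": 0.4, "accuracy": 0.90},
    {"level": 2, "privacy": 0.7, "accuracy": 0.80},
    {"level": 3, "privacy": 1.0, "accuracy": 0.40},
]

if __name__ == "__main__":
    print("k =", k_anonymity(records, ["zip", "age"]))
    print("best level:", best_level(candidates)["level"])
```

In this sketch the `weight` parameter stands in for how strongly a data provider favours privacy over accuracy. Raising it moves the selected level toward stronger anonymization.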

Notes

  1. https://www.mitre.org/about.

  2. http://docs.oasis-open.org/cti/stix/v2.0/stix-v2.0-part2-stix-objects.html

  3. The other privacy-preserving techniques in this category, such as data swapping, are not common approaches. Therefore, we do not consider them in this study.

  4. The process is repeated four times to fulfill the requirement of five privacy levels.

  5. https://www.oasis-open.org/.

  6. https://www.mitre.org/.

  7. https://stixproject.github.io/.

  8. https://cve.mitre.org/.

Acknowledgements

This work has been partially supported by the H2020 EU-funded projects C3ISP (GA #700294), SIFIS-Home (GA #952652) and the ECSEL project SECREDAS (#783119).

Funding

The authors have received funding only from public international research agencies.

Author information

Corresponding author

Correspondence to Andrea Saracino.

Ethics declarations

Conflict of interest

All authors declare that they do not have any conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sheikhalishahi, M., Saracino, A., Martinelli, F. et al. Privacy preserving data sharing and analysis for edge-based architectures. Int. J. Inf. Secur. 21, 79–101 (2022). https://doi.org/10.1007/s10207-021-00542-x

  • DOI: https://doi.org/10.1007/s10207-021-00542-x
