Privacy Preserving Tree Augmented Naïve Bayesian Multi-party Implementation on Horizontally Partitioned Databases

  • Conference paper
Trust, Privacy and Security in Digital Business (TrustBus 2011)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 6863)

Abstract

The evolution of new technologies and the spread of the Internet have led to the exchange and processing of massive amounts of data. At the same time, intelligent systems that parse data and analyze the patterns within it are gaining popularity. Much of this data contains sensitive information, which raises serious concerns about how it should be managed and used by data mining techniques. Extracting knowledge from statistical databases is an essential step towards deploying intelligent systems that assist in decision making, but it must also preserve the privacy of the parties involved. In this paper, we present a novel privacy-preserving data mining algorithm for horizontally partitioned statistical databases. The novelty lies in the multi-candidate election schema and its ability to serve as the foundation for a privacy-preserving Tree Augmented Naïve Bayesian (TAN) classifier that avoids the disclosure of personal information.
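To make the setting concrete, here is a minimal Python sketch of why horizontal partitioning suits a TAN classifier. It is not the paper's protocol: the election-based secure aggregation is deliberately left out, and the counts are added in the clear. It assumes the standard TAN formulation, in which attribute pairs are scored by conditional mutual information given the class, and it relies on the fact that when every party holds different rows over the same attributes, the required frequency counts are simply sums of per-party counts. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log


def local_counts(rows):
    """Count (x_i, x_j, c) triples in the rows held by one party."""
    return Counter(rows)


def conditional_mutual_information(counts):
    """Estimate I(X_i; X_j | C) from joint counts of (x_i, x_j, c) triples."""
    n = sum(counts.values())
    n_c, n_ic, n_jc = Counter(), Counter(), Counter()
    for (xi, xj, c), k in counts.items():
        n_c[c] += k
        n_ic[(xi, c)] += k
        n_jc[(xj, c)] += k
    cmi = 0.0
    for (xi, xj, c), k in counts.items():
        # p(xi, xj, c) * log( p(xi, xj | c) / (p(xi | c) * p(xj | c)) )
        cmi += (k / n) * log(k * n_c[c] / (n_ic[(xi, c)] * n_jc[(xj, c)]))
    return cmi


# Hypothetical toy data: two parties, same schema (x_i, x_j, class), different rows.
party_a = [(0, 0, "y"), (1, 1, "y")]
party_b = [(0, 0, "n"), (1, 1, "n")]

# Assumption: plain addition stands in for the privacy-preserving tally.
merged = local_counts(party_a) + local_counts(party_b)
pooled = local_counts(party_a + party_b)

cmi_merged = conditional_mutual_information(merged)
cmi_pooled = conditional_mutual_information(pooled)
```

Because the merged counts equal the counts over the pooled rows, the score computed from them matches what a single party holding all the data would obtain. A privacy-preserving scheme only needs to replace the plain addition with a tally that reveals the sums and nothing else.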


Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Skarkala, M.E., Maragoudakis, M., Gritzalis, S., Mitrou, L. (2011). Privacy Preserving Tree Augmented Naïve Bayesian Multi-party Implementation on Horizontally Partitioned Databases. In: Furnell, S., Lambrinoudakis, C., Pernul, G. (eds) Trust, Privacy and Security in Digital Business. TrustBus 2011. Lecture Notes in Computer Science, vol 6863. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22890-2_6

  • DOI: https://doi.org/10.1007/978-3-642-22890-2_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22889-6

  • Online ISBN: 978-3-642-22890-2

  • eBook Packages: Computer Science, Computer Science (R0)
