Abstract
The evolution of new technologies and the spread of the Internet have led to the exchange and processing of massive amounts of data. At the same time, intelligent systems that parse and analyze patterns within data are gaining popularity. Much of this data contains sensitive information, which raises serious concerns about how it should be managed and used by data mining techniques. Extracting knowledge from statistical databases is an essential step towards deploying intelligent systems that assist in decision making, but it must also preserve the privacy of the parties involved. In this paper, we present a novel privacy preserving data mining algorithm for horizontally partitioned statistical databases. The novelty lies in the multi-candidate election scheme and its ability to serve as the foundation for a privacy preserving Tree Augmented Naïve Bayesian (TAN) classifier, so that the disclosure of personal information is avoided.
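To make the idea of election-style counting over horizontally partitioned data more concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes a toy Paillier cryptosystem with deliberately tiny primes, a single key holder instead of threshold decryption, and a ballot encoding in which a record holding value combination j contributes `base ** j`. All function names (`keygen`, `encode_ballot`, `private_tally`, and so on) are hypothetical and chosen for illustration only. The decrypted tallies are the kind of frequency counts from which a TAN classifier could estimate the conditional mutual information between attributes.

```python
import math
import random

def keygen(p=10007, q=10009):
    """Toy Paillier keys (assumption: primes far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1 (Python 3.8+)
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public modulus n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)

def decrypt(n, priv, c):
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def encode_ballot(candidate, base):
    """A 'vote' for candidate j is the integer base**j."""
    return base ** candidate

def decode_tally(total, base, num_candidates):
    """Read per-candidate counts as the base-`base` digits of the sum."""
    counts = []
    for _ in range(num_candidates):
        total, digit = divmod(total, base)
        counts.append(digit)
    return counts

def private_tally(partitions, num_candidates, base=100):
    """Each partition is one party's list of local records, given as
    candidate indices. `base` must exceed the total number of records,
    and base**num_candidates must stay below n."""
    n, priv = keygen()
    aggregate = encrypt(n, 0)
    for records in partitions:
        for candidate in records:
            ballot = encrypt(n, encode_ballot(candidate, base))
            aggregate = add(n, aggregate, ballot)
    # Simplification: one party decrypts; only the totals are revealed.
    return decode_tally(decrypt(n, priv, aggregate), base, num_candidates)

# Two parties hold disjoint rows over the same three value combinations.
print(private_tally([[0, 2, 2], [1, 2, 0, 0]], num_candidates=3))
```

Only the aggregated ciphertext is decrypted in this sketch, so the individual ballots are never opened. A realistic deployment would additionally need properly sized keys, distributed decryption, and proofs that each ballot is well formed.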
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Skarkala, M.E., Maragoudakis, M., Gritzalis, S., Mitrou, L. (2011). Privacy Preserving Tree Augmented Naïve Bayesian Multi-party Implementation on Horizontally Partitioned Databases. In: Furnell, S., Lambrinoudakis, C., Pernul, G. (eds) Trust, Privacy and Security in Digital Business. TrustBus 2011. Lecture Notes in Computer Science, vol 6863. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22890-2_6
DOI: https://doi.org/10.1007/978-3-642-22890-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22889-6
Online ISBN: 978-3-642-22890-2
eBook Packages: Computer Science, Computer Science (R0)