Abstract
This paper presents a scheme for privacy-preserving clustering in a three-party scenario, focusing on cooperative training of multivariate mixture models. Because modern big data are often collected and stored across multiple independent parties, preserving private data during cross-party communication is an important issue when carrying out statistical analyses of the joint data. We consider the situation where the data are horizontally distributed among three parties and each data owner wants to learn the global parameters while the data held by the other parties remain private. The inter-party communications must not expose any information that could disclose details of the private data, including how the data are partitioned across the parties. In addition, unlike most existing methods, the proposed scheme does not require a special trusted party. Clustering plays an important role in statistical learning and is one of the most widely used data mining methods. We illustrate our scheme using a Gaussian mixture model (GMM) based cluster analysis.
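The abstract does not spell out the protocol, so the following is only an illustrative sketch of the general idea and should not be read as the authors' scheme. It relies on the fact that, for horizontally partitioned data, the sufficient statistics of an EM step for a GMM are sums over observations and can therefore be computed locally and then added across parties. As an assumed stand-in for the private communication step, the sketch uses additive secret sharing among honest-but-curious, non-colluding parties, simulated in a single process. For brevity it fits univariate rather than multivariate Gaussians. All names (`PRIME`, `SCALE`, `secure_sum`, `local_statistics`, `em_step`) are hypothetical.

```python
import math
import random

PRIME = 2**61 - 1   # modulus for the additive shares (assumed choice)
SCALE = 10**6       # fixed-point scale for real-valued statistics (assumed choice)

def encode(x):
    # Map a real number to a fixed-point residue modulo PRIME.
    return round(x * SCALE) % PRIME

def decode(v):
    # Map a residue back to a signed real number.
    return (v - PRIME if v > PRIME // 2 else v) / SCALE

def secure_sum(private_values, rng):
    """Sum one private number per party via additive secret shares (simulation)."""
    n = len(private_values)
    shares = []
    for value in private_values:
        row = [rng.randrange(PRIME) for _ in range(n - 1)]
        row.append((encode(value) - sum(row)) % PRIME)  # row sums to the value
        shares.append(row)
    # Party j receives column j and publishes only the sum of that column.
    partial = [sum(shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    return decode(sum(partial) % PRIME)

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def local_statistics(data, weights, means, variances):
    """E-step on one party's own data: per component [sum r, sum r*x, sum r*x^2]."""
    stats = [[0.0, 0.0, 0.0] for _ in weights]
    for x in data:
        dens = [w * normal_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        total = sum(dens)
        for k, d in enumerate(dens):
            r = d / total  # posterior probability of component k for x
            stats[k][0] += r
            stats[k][1] += r * x
            stats[k][2] += r * x * x
    return stats

def em_step(party_data, weights, means, variances, rng):
    """One EM iteration in which only securely summed statistics are pooled."""
    local = [local_statistics(d, weights, means, variances) for d in party_data]
    pooled = [[secure_sum([s[k][j] for s in local], rng) for j in range(3)]
              for k in range(len(weights))]
    n_total = sum(p[0] for p in pooled)
    new_w, new_m, new_v = [], [], []
    for n_k, s_k, q_k in pooled:
        mean = s_k / n_k
        new_w.append(n_k / n_total)
        new_m.append(mean)
        new_v.append(q_k / n_k - mean ** 2)
    return new_w, new_m, new_v

# Toy usage: three parties with different (private) sample sizes.
rng = random.Random(0)
party_data = [[-2.1, -1.9, 2.0], [-2.0, 1.8], [2.2, 2.1, -1.8, 1.9]]
params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
for _ in range(20):
    params = em_step(party_data, *params, rng)
```

In this sketch each party learns the updated global parameters, while the individual observations and the per-party sample sizes enter only through the pooled sums. The fixed-point encoding introduces a small rounding error, so the result matches a pooled EM step only up to roughly the chosen `SCALE` precision.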
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Leemaqz, K.L., Lee, S.X., McLachlan, G.J. (2017). Private Distributed Three-Party Learning of Gaussian Mixture Models. In: Batten, L., Kim, D., Zhang, X., Li, G. (eds) Applications and Techniques in Information Security. ATIS 2017. Communications in Computer and Information Science, vol 719. Springer, Singapore. https://doi.org/10.1007/978-981-10-5421-1_7
DOI: https://doi.org/10.1007/978-981-10-5421-1_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5420-4
Online ISBN: 978-981-10-5421-1
eBook Packages: Computer Science (R0)