Abstract
Data mining has become increasingly important in today's data-driven society. Unfortunately, the privacy of individuals is often neither adequately protected nor deliberately considered. How to make the outputs of data mining models preserve privacy while retaining their analytical capability remains a significant challenge. With advances in big data, a series of big data computing platforms have evolved into widely used paradigms for data mining. However, users' sensitive data are outsourced to the cloud and mined on open-source computing platforms, which poses threats so severe that measures must be taken to protect the privacy of individuals' data. To address this issue, much fruitful work has been done on designing privacy preserving data mining approaches that improve the security of big data computing platforms and the privacy of individuals. In this paper, we systematically investigate a wide array of state-of-the-art privacy preserving data mining (PPDM) techniques from several aspects: threat model, anonymity, secure multiparty computation (SMC), and differential privacy. We focus on improving data privacy in these sensitive areas on big data computing platforms. Our work aims to highlight the urgent need for applying privacy preserving data mining approaches on big data computing platforms. Moreover, a better understanding of this research area may benefit the use of big data and future exploration.
Acknowledgments
We thank the National Natural Science Foundation (funded project 61309008) and the Shaanxi Province Natural Science Funded Project (2014JQ8049) for their support. We would also like to thank our partners at our Research Lab for their generous gifts in support of this research.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Zhiqiang, G., Longjun, Z. (2018). Privacy Preserving Data Mining on Big Data Computing Platform: Trends and Future. In: Barolli, L., Woungang, I., Hussain, O. (eds) Advances in Intelligent Networking and Collaborative Systems. INCoS 2017. Lecture Notes on Data Engineering and Communications Technologies, vol 8. Springer, Cham. https://doi.org/10.1007/978-3-319-65636-6_44
DOI: https://doi.org/10.1007/978-3-319-65636-6_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65635-9
Online ISBN: 978-3-319-65636-6
eBook Packages: Engineering, Engineering (R0)