Abstract
Many companies want to share data for data-mining tasks. However, privacy and security concerns have become a bottleneck for data sharing. Privacy-preserving data mining based on secure multiparty computation (SMC) has emerged as a solution to this problem, but traditional SMC solutions impose a heavy computation cost on the user side. This study introduces an outsourcing method to reduce that cost. To preserve the privacy of the shared data, we propose an outsourced privacy-preserving C4.5 algorithm over horizontally and vertically partitioned data for multiple parties, based on the outsourced privacy-preserving weighted average protocol (OPPWAP) and the outsourced secure set intersection protocol (OSSIP). Our method achieves the same result as the original C4.5 decision tree algorithm while preserving data privacy. We also implement the proposed protocols and algorithms. The implementation shows that a sublinear relationship exists between the computational cost of the user side and the number of participating parties.
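As a rough illustration of why a distributed run can match the original C4.5 result on horizontally partitioned data, the sketch below computes the C4.5 gain ratio from class counts only. It is a plaintext toy, not the proposed algorithm: there is no encryption, no outsourcing, and neither OPPWAP nor OSSIP is implemented. The helper names (`local_counts`, `merge`, `gain_ratio`) and the sample rows are assumptions made for this example.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (bits) of a list of non-negative counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

def local_counts(rows):
    """Per-party table: attribute value -> Counter of class labels."""
    table = {}
    for value, label in rows:
        table.setdefault(value, Counter())[label] += 1
    return table

def merge(tables):
    """Sum the per-party count tables (done in the clear here; the paper
    would do this step under a privacy-preserving protocol)."""
    merged = {}
    for t in tables:
        for value, c in t.items():
            merged.setdefault(value, Counter()).update(c)
    return merged

def gain_ratio(table):
    """C4.5 gain ratio of one attribute, computed from counts alone."""
    class_totals = Counter()
    for c in table.values():
        class_totals.update(c)
    n = sum(class_totals.values())
    sizes = [sum(c.values()) for c in table.values()]
    info = entropy(list(class_totals.values()))
    cond = sum(s / n * entropy(list(c.values()))
               for s, c in zip(sizes, table.values()))
    split = entropy(sizes)
    return (info - cond) / split if split > 0 else 0.0

# Hypothetical rows (attribute value, class label) held by two parties.
party_a = [("sunny", "no"), ("sunny", "no"), ("rain", "yes")]
party_b = [("rain", "yes"), ("sunny", "yes"), ("rain", "no")]

pooled = gain_ratio(local_counts(party_a + party_b))
merged = gain_ratio(merge([local_counts(party_a), local_counts(party_b)]))
print(math.isclose(pooled, merged))
```

Because the gain ratio depends only on summed counts, merging per-party counts gives the same split score as pooling the raw rows. What a privacy-preserving protocol has to add is a way to obtain those sums without revealing any party's local counts.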
Acknowledgements
This work is supported by the National High Technology Research and Development Program of China (No. 2015AA016008), the National Natural Science Foundation of China (Nos. 61402136 and 61472091), the Natural Science Foundation of Guangdong Province, China (No. 2014A030313697), the Natural Science Foundation of Guangdong Province for Distinguished Young Scholars (No. 2014A030306020), the Guangzhou scholars project for universities of Guangzhou (No. 1201561613), the Science and Technology Planning Project of Guangdong Province, China (No. 2015B010129015), and the Guangdong Province Key Laboratory of High Performance Computing (No. [2013]82).
Cite this article
Li, Y., Jiang, Z.L., Yao, L. et al. Outsourced privacy-preserving C4.5 decision tree algorithm over horizontally and vertically partitioned dataset among multiple parties. Cluster Comput 22 (Suppl 1), 1581–1593 (2019). https://doi.org/10.1007/s10586-017-1019-9
DOI: https://doi.org/10.1007/s10586-017-1019-9