Outsourced privacy-preserving C4.5 decision tree algorithm over horizontally and vertically partitioned dataset among multiple parties

Cluster Computing

Abstract

Many companies want to share data for data-mining tasks, but privacy and security concerns have become a bottleneck for data sharing. Privacy-preserving data mining based on secure multiparty computation (SMC) has emerged as a solution to this problem. However, traditional SMC solutions impose a heavy computation cost on the user side. This study introduces an outsourcing method to reduce that cost. We also preserve the privacy of the shared data by proposing an outsourced privacy-preserving C4.5 algorithm over horizontally and vertically partitioned data for multiple parties, based on the outsourced privacy-preserving weighted average protocol (OPPWAP) and the outsourced secure set intersection protocol (OSSIP). We find that our method achieves the same result as the original C4.5 decision tree algorithm while preserving data privacy. We also implement the proposed protocols and algorithms; the implementation shows a sublinear relationship between the user-side computational cost and the number of participating parties.
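
To make the partitioning idea concrete, the following is a minimal plaintext sketch, not the paper's OPPWAP or OSSIP protocols and with no cryptography at all. It assumes toy data and hypothetical helper names (`entropy`, `gain_ratio`, `local_counts`). It illustrates two standard facts: the C4.5 gain ratio depends only on (attribute value, class) counts, which add up across horizontally partitioned parties, and a joint count over vertically partitioned data is the size of an intersection of record-ID sets.

```python
import math
from collections import Counter


def entropy(class_counts):
    """Shannon entropy (in bits) of a mapping label -> count."""
    total = sum(class_counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in class_counts.values() if c > 0)


def gain_ratio(counts):
    """C4.5 gain ratio from counts[(attribute_value, class_label)] -> n."""
    total = sum(counts.values())
    by_class, by_value = Counter(), Counter()
    for (value, label), n in counts.items():
        by_class[label] += n
        by_value[value] += n
    # Expected entropy of the class after splitting on the attribute.
    remainder = 0.0
    for value, n_value in by_value.items():
        subset = {lab: n for (v, lab), n in counts.items() if v == value}
        remainder += (n_value / total) * entropy(subset)
    gain = entropy(by_class) - remainder
    split_info = entropy(by_value)
    return gain / split_info if split_info > 0 else 0.0


def local_counts(rows):
    """Counts a single party can compute on its own (value, label) rows."""
    return Counter(rows)


# Horizontal partitioning: same schema, different records per party (toy data).
party_a = [("sunny", "no"), ("sunny", "no"), ("rain", "yes")]
party_b = [("rain", "yes"), ("sunny", "yes"), ("rain", "no")]

# In the clear, aggregation is just a sum of local counts. A secure protocol
# would aim to produce the same totals without revealing any party's share.
aggregated = local_counts(party_a) + local_counts(party_b)
pooled = local_counts(party_a + party_b)

# Vertical partitioning: different attributes of the same records per party,
# so a joint count is the cardinality of an intersection of record-ID sets.
ids_outlook_sunny = {1, 2, 5}  # hypothetical IDs held by one party
ids_class_no = {1, 2, 6}       # hypothetical IDs held by another party
sunny_and_no = len(ids_outlook_sunny & ids_class_no)
```

Because `aggregated` and `pooled` hold identical counts, `gain_ratio` returns the same split score for both, which is the plaintext analogue of the claim that the outsourced algorithm reproduces the original C4.5 result.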

Acknowledgements

This work is supported by the National High Technology Research and Development Program of China (No. 2015AA016008), the National Natural Science Foundation of China (Nos. 61402136 and 61472091), the Natural Science Foundation of Guangdong Province, China (No. 2014A030313697), the Natural Science Foundation of Guangdong Province for Distinguished Young Scholars (No. 2014A030306020), the Guangzhou scholars project for universities of Guangzhou (No. 1201561613), the Science and Technology Planning Project of Guangdong Province, China (No. 2015B010129015), and the Guangdong Province Key Laboratory of High Performance Computing (No. [2013]82).

Author information

Corresponding authors

Correspondence to Ye Li or Zoe L. Jiang.

About this article

Cite this article

Li, Y., Jiang, Z.L., Yao, L. et al. Outsourced privacy-preserving C4.5 decision tree algorithm over horizontally and vertically partitioned dataset among multiple parties. Cluster Comput 22 (Suppl 1), 1581–1593 (2019). https://doi.org/10.1007/s10586-017-1019-9

  • DOI: https://doi.org/10.1007/s10586-017-1019-9
