Abstract
In the context of big data, privacy protection during data mining has become a prominent issue in the security field. A common approach is to apply differential privacy during the mining process, but an unreasonable allocation of the privacy budget leads to unsatisfactory classification results. To address this problem, we propose RFDPP-weight, an adaptive differential privacy budget allocation algorithm based on random forest. The algorithm first constructs multiple decision trees with balanced features and calculates the balanced error rate (BER) on the out-of-bag data; the BER is used to compute feature weights, from which the feature set is reconstructed. A random forest with higher classification performance is then built on the new feature set. The weight of each decision tree is calculated from the feature weights, and the privacy budget is allocated adaptively according to the decision tree weights. In the final prediction stage, the classification result whose corresponding decision trees carry the largest weight is taken as the output of the random forest, which further improves classification accuracy. Experimental results show that, with a small number of decision trees, the F1 score of our algorithm reaches 0.977 on the Mushroom data set and 0.893 on the Adult data set, indicating that the algorithm allocates the privacy budget reasonably and further improves data availability while protecting data privacy.
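The abstract does not give the formulas behind these steps, so the following Python sketch is illustrative only and is not the authors' implementation. It assumes that the BER is the mean of the per-class error rates, that the total budget is split across trees in direct proportion to their weights (so the per-tree budgets sum to the total, as sequential composition would require), and that the final prediction is the label with the largest summed tree weight. The function names `balanced_error_rate`, `allocate_budget`, and `weighted_vote` are hypothetical.

```python
from collections import defaultdict


def balanced_error_rate(y_true, y_pred):
    """Mean of the per-class error rates over the classes present in y_true."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth != pred:
            errors[truth] += 1
    return sum(errors[c] / totals[c] for c in totals) / len(totals)


def allocate_budget(tree_weights, total_epsilon):
    """Split total_epsilon across trees in proportion to their weights.

    Assumption: proportional allocation; the paper's exact rule may differ.
    """
    weight_sum = sum(tree_weights)
    if weight_sum <= 0:
        raise ValueError("tree weights must have a positive sum")
    return [total_epsilon * w / weight_sum for w in tree_weights]


def weighted_vote(predictions, tree_weights):
    """Return the label whose supporting trees have the largest summed weight.

    Assumption: one reading of the abstract's weighted prediction stage.
    """
    scores = defaultdict(float)
    for label, weight in zip(predictions, tree_weights):
        scores[label] += weight
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Out-of-bag labels and predictions for one hypothetical tree.
    print(balanced_error_rate([0, 0, 0, 1], [0, 0, 1, 0]))
    # Two trees with weights 1 and 3 sharing a total budget of 1.0.
    print(allocate_budget([1.0, 3.0], 1.0))
    # Three trees vote; the single heavier tree outweighs the other two.
    print(weighted_vote(["a", "b", "b"], [0.6, 0.25, 0.25]))
```

Under this proportional rule, trees with larger weights receive a larger share of the budget and therefore need less noise, which is one way to read the abstract's link between tree weight and budget allocation.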
© 2022 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, Cy., Chen, Sy., Li, Xc. (2022). Adaptive Differential Privacy Budget Allocation Algorithm Based on Random Forest. In: Pan, L., Cui, Z., Cai, J., Li, L. (eds) Bio-Inspired Computing: Theories and Applications. BIC-TA 2021. Communications in Computer and Information Science, vol 1565. Springer, Singapore. https://doi.org/10.1007/978-981-19-1256-6_15
DOI: https://doi.org/10.1007/978-981-19-1256-6_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-1255-9
Online ISBN: 978-981-19-1256-6
eBook Packages: Computer Science, Computer Science (R0)