
Correlated Differential Privacy of Multiparty Data Release in Machine Learning

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Differential privacy (DP) is widely employed for private data release in the single-party scenario. Ubiquitous data correlation forces additional noise that can degrade data utility, a problem usually addressed by reducing sensitivity through correlation analysis. However, the growing number of multiparty data release applications presents new challenges for existing methods. In this paper, we propose MP-CRDP, a novel correlated differential privacy method for multiparty data release. It reduces the merged dataset’s dimensionality and correlated sensitivity in two steps to optimize utility. We also propose a multiparty correlation analysis technique. Based on prior knowledge of the multiparty data, it uses a more reasonable and rigorous standard to measure the correlated degree, which reduces correlated sensitivity and thus improves data utility. Moreover, by adding noise to the weights of machine learning algorithms and query noise to the released data, MP-CRDP supports the release of both low-noise private data and private machine learning algorithms. Comprehensive experiments on the Adult and Breast Cancer datasets demonstrate the effectiveness and practicability of the proposed method.
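The link between correlated sensitivity and noise can be illustrated with a generic sketch. It is not the MP-CRDP algorithm, whose details the abstract does not give. It assumes one formulation found in the correlated differential privacy literature, in which the correlated sensitivity is the largest row sum of a correlated-degree matrix weighted by per-record sensitivities, and it calibrates Laplace noise to that value. The names `correlated_sensitivity`, `laplace_release`, `delta`, and `record_sens`, as well as the example numbers, are hypothetical.

```python
import numpy as np


def correlated_sensitivity(delta, record_sens):
    """Largest correlation-weighted sum of per-record sensitivities.

    delta[i][j] is an assumed correlated degree in [-1, 1] between
    records i and j; record_sens[j] is how much record j alone can
    change the query answer.
    """
    delta = np.abs(np.asarray(delta, dtype=float))
    record_sens = np.asarray(record_sens, dtype=float)
    # Row i sums the influence of every record correlated with record i.
    return float(np.max(delta @ record_sens))


def laplace_release(true_answer, sensitivity, epsilon, rng):
    """Standard Laplace mechanism: noise scale = sensitivity / epsilon."""
    scale = sensitivity / epsilon
    return float(true_answer + rng.laplace(loc=0.0, scale=scale))


# Illustrative data: records 0 and 1 are half-correlated, record 2 is independent.
delta = np.array([
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
record_sens = np.ones(3)  # a count query changes by 1 per record

cs = correlated_sensitivity(delta, record_sens)
worst_case = float(record_sens.sum())  # treats all records as fully correlated

rng = np.random.default_rng(0)
noisy_count = laplace_release(true_answer=3.0, sensitivity=cs, epsilon=1.0, rng=rng)
```

In this toy setting the correlated sensitivity (1.5) is half the fully correlated worst case (3.0), so the Laplace noise scale at the same privacy budget is halved. This is the sense in which a tighter measure of the correlated degree can improve utility.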

Author information

Corresponding author

Correspondence to Xing-Wei Wang.

About this article

Cite this article

Zhao, JZ., Wang, XW., Mao, KM. et al. Correlated Differential Privacy of Multiparty Data Release in Machine Learning. J. Comput. Sci. Technol. 37, 231–251 (2022). https://doi.org/10.1007/s11390-021-1754-5

  • DOI: https://doi.org/10.1007/s11390-021-1754-5
