Abstract
Data published by grouping-based differentially private histogram algorithms often has low usability because of large approximation error and Laplace error. To address this problem, we propose a histogram publishing algorithm based on roulette sampling sort and greedy partitioning. The algorithm combines the exponential mechanism with roulette sampling sort: guided by a utility function and a restriction on the number of sampled entities, it places similar histogram bins next to each other with high probability. A greedy clustering algorithm then partitions the sorted bins into groups, and the error among the bins in each group is reduced by optimizing the lower bound of the grouping error. Extensive experiments show that the proposed algorithm effectively improves the usability of the published data while satisfying differential privacy.
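The pipeline summarized in the abstract (sample-based sorting, greedy grouping, noisy group means) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not state the utility function, the sampling restriction, or the grouping criterion, so everything below is an assumption made for illustration. In particular, the utility `-count` with sensitivity 1, the even split of the budget across picks, the `tolerance` threshold, and the function names `roulette_pick`, `sample_sort`, `greedy_partition`, and `publish` are all hypothetical. The partition step here reads the true counts directly and is shown only to convey the greedy idea; a complete private algorithm would need to account for it in the privacy budget.

```python
import math
import random


def roulette_pick(scores, epsilon, sensitivity, rng):
    """Exponential mechanism via roulette-wheel selection.

    Index i is chosen with probability proportional to
    exp(epsilon * scores[i] / (2 * sensitivity)).
    """
    top = max(scores)  # shift scores so every exponent is <= 0 (no overflow)
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    spin = rng.random() * sum(weights)
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if spin < running:
            return i
    return len(weights) - 1  # guard against floating-point round-off


def sample_sort(counts, epsilon, rng):
    """Order bin indices so that small counts tend to come first.

    Assumed utility: -count (sensitivity 1). The budget is split evenly
    over the picks; the final pick has only one candidate left.
    """
    remaining = list(range(len(counts)))
    eps_step = epsilon / max(len(remaining) - 1, 1)
    order = []
    while remaining:
        scores = [-counts[i] for i in remaining]
        pos = roulette_pick(scores, eps_step, 1.0, rng)
        order.append(remaining.pop(pos))
    return order


def greedy_partition(counts, order, tolerance):
    """Greedily extend the current group while the next bin stays
    within `tolerance` of the group's running mean (assumed criterion)."""
    groups = [[order[0]]]
    for idx in order[1:]:
        current = groups[-1]
        mean = sum(counts[i] for i in current) / len(current)
        if abs(counts[idx] - mean) <= tolerance:
            current.append(idx)
        else:
            groups.append([idx])
    return groups


def publish(counts, groups, epsilon, rng):
    """Release each group's mean plus Laplace noise.

    A group mean over k bins has sensitivity 1/k, so the noise scale
    is 1 / (epsilon * k). Laplace noise is drawn as the difference of
    two exponential variates with the same rate.
    """
    released = [0.0] * len(counts)
    for group in groups:
        mean = sum(counts[i] for i in group) / len(group)
        rate = epsilon * len(group)  # rate = 1 / scale
        noisy = mean + rng.expovariate(rate) - rng.expovariate(rate)
        for i in group:
            released[i] = noisy
    return released


if __name__ == "__main__":
    rng = random.Random(0)
    counts = [10, 52, 11, 50, 9, 51]
    order = sample_sort(counts, epsilon=0.5, rng=rng)
    groups = greedy_partition(counts, order, tolerance=5)
    print(publish(counts, groups, epsilon=0.5, rng=rng))
```

Averaging within a group trades approximation error (bins in a group are replaced by their mean) against Laplace error (the noise scale shrinks with the group size), which is why placing similar bins next to each other before grouping matters.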
Acknowledgment
This work was supported in part by the Guangxi Natural Science Foundation (Nos. 2018GXNSFAA294036, 2018GXNSFAA138116), the Guangxi Key Laboratory of Cryptography and Information Security of China (No. GCIS201705), and the Innovation Project of Guangxi Graduate Education (No. YCSW2018138).
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wu, X., Tong, N., Ye, Z., Wang, Y. (2020). Histogram Publishing Algorithm Based on Sampling Sorting and Greedy Clustering. In: Zheng, Z., Dai, HN., Tang, M., Chen, X. (eds) Blockchain and Trustworthy Systems. BlockSys 2019. Communications in Computer and Information Science, vol 1156. Springer, Singapore. https://doi.org/10.1007/978-981-15-2777-7_7
DOI: https://doi.org/10.1007/978-981-15-2777-7_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-2776-0
Online ISBN: 978-981-15-2777-7
eBook Packages: Computer Science, Computer Science (R0)