Abstract
Governments hold large volumes of many types of individual data, and each governmental department manages the data for its own area. Developing smart government requires that these data be shared among departments. However, preventing potential attackers from obtaining private information during data sharing is a challenging problem. To protect private information while sharing statistics among government departments, an improved approach based on local differential privacy (LDP) is proposed. The approach combines the data binning technique with the count mean sketch (CMS) algorithm. Equi-width binning is adopted to divide the data records into smaller data domains, which addresses the large statistical errors that current privacy protection algorithms incur when the data domain is large and the amount of data is small. The proposed algorithm is then compared with the CMS and HCMS algorithms with respect to frequency estimation, data size, privacy budget, and data domain size. Experimental results show that the proposed algorithm effectively reduces statistical errors and improves the utility of the privacy-protected data across various distributions and data domain sizes.
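The combination of equi-width binning and CMS described in the abstract can be illustrated with a minimal Python sketch. It assumes that each record is first mapped to its bin index and that the bin index is then reported through a standard CMS client and server; the abstract does not spell out how the two steps are combined, so the paper's actual construction may differ. All function names, the SHA-256 based hash family, and the parameter values (`EPS`, `K`, `M`, `BINS`) are hypothetical choices made for illustration only.

```python
import hashlib
import math
import random


def equi_width_bin(value, low, high, num_bins):
    """Map a numeric value in [low, high] to an equi-width bin index."""
    width = (high - low) / num_bins
    index = int((value - low) / width)
    # Clamp so that value == high falls into the last bin.
    return min(max(index, 0), num_bins - 1)


def hash_to_cell(item, row, m):
    """Hypothetical stand-in for the row-th hash function of the sketch."""
    digest = hashlib.sha256(f"{row}:{item}".encode()).hexdigest()
    return int(digest, 16) % m


def cms_client(item, epsilon, k, m, rng):
    """Privatize one item: a {-1, +1} one-hot vector with randomly flipped entries."""
    row = rng.randrange(k)                      # pick one of k hash functions
    vec = [-1] * m
    vec[hash_to_cell(item, row, m)] = 1
    flip_prob = 1.0 / (math.exp(epsilon / 2) + 1)
    noisy = [-v if rng.random() < flip_prob else v for v in vec]
    return noisy, row


def cms_server(reports, epsilon, k, m):
    """Debias each report and accumulate it into a k x m sketch matrix."""
    c = (math.exp(epsilon / 2) + 1) / (math.exp(epsilon / 2) - 1)
    sketch = [[0.0] * m for _ in range(k)]
    for noisy, row in reports:
        for cell, v in enumerate(noisy):
            sketch[row][cell] += k * (c / 2 * v + 0.5)
    return sketch


def cms_estimate(sketch, item, n, k, m):
    """Estimate the count of an item, correcting for expected hash collisions."""
    mean = sum(sketch[row][hash_to_cell(item, row, m)] for row in range(k)) / k
    return (m / (m - 1)) * (mean - n / m)


# Synthetic data (assumed, not from the paper): 3000 values in bin 1, 2000 in bin 7.
rng = random.Random(0)
values = [rng.uniform(11, 19) for _ in range(3000)]
values += [rng.uniform(71, 79) for _ in range(2000)]

EPS, K, M, BINS = 4.0, 8, 64, 10
reports = [cms_client(equi_width_bin(v, 0, 100, BINS), EPS, K, M, rng) for v in values]
sketch = cms_server(reports, EPS, K, M)
estimates = [cms_estimate(sketch, b, len(values), K, M) for b in range(BINS)]
print([round(e) for e in estimates])
```

Binning shrinks the domain that the sketch has to cover from all distinct values to `BINS` indices, which is the intuition behind the reduced statistical error claimed for large domains and small data sets; the price is that frequencies are only recovered at bin granularity.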
















Acknowledgements
This work was supported by the National Natural Science Foundation of China (71701091), the Chinese Ministry of Education Project of Humanities and Social Science (17YJC870020), and the Graduate Innovation Foundation of Hebei Province (CXZZSS2020071).
About this article
Cite this article
Piao, C., Hao, Y., Yan, J. et al. Privacy protection in government data sharing: an improved LDP-based approach. SOCA 15, 309–322 (2021). https://doi.org/10.1007/s11761-021-00315-3
DOI: https://doi.org/10.1007/s11761-021-00315-3