Abstract
Many application areas now produce data streams that contain privacy-sensitive information. Although sharing and releasing these data has great commercial value, releasing them directly would disclose private user information. How to continuously generate publishable histograms that meet privacy protection requirements over sliding windows of a data stream has therefore become a critical issue, especially when the data are sent to an untrusted third party. Existing histogram publication methods are unsatisfactory in terms of time and storage costs because they must cache all elements in the current sliding window (SW). Our work addresses this drawback by designing an efficient online histogram publication (EOHP) method for data streams under local differential privacy. Specifically, in the EOHP method, the data collector first builds a histogram of the current SW using an approximate counting method. Second, the data collector reduces privacy budget consumption by using an optimized budget absorption mechanism and adds appropriate noise to the approximate histogram, so that the histogram can be published while retaining satisfactory data utility. Extensive experiments on two real datasets show that the EOHP algorithm significantly reduces time and storage costs and improves data utility compared with existing algorithms.
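Abstract (translated from Chinese)
Real-time data streams containing sensitive user information are now generated in many fields. Although sharing and publishing these data has great commercial value, publishing them directly would leak the private user information they contain. How to continuously generate publishable histograms that meet privacy protection requirements over sliding windows of a data stream has therefore become a key problem, especially when the data are sent to an untrusted third party. Existing histogram publication methods perform unsatisfactorily in terms of time and storage costs because they must cache all elements in the current sliding window (SW). To solve this problem, we propose an efficient online histogram publication (EOHP) algorithm for data streams under local differential privacy. Specifically, in the EOHP algorithm, the data collector first uses an approximate counting method for data streams to process the data online and obtain a preliminary histogram. Second, an optimized privacy budget allocation strategy is proposed to reduce privacy budget consumption, and appropriate noise is added to the approximate histogram so that it can be published while maintaining good data utility. Extensive experiments on two real datasets show that, compared with existing algorithms, the EOHP algorithm significantly reduces time and storage costs and improves data utility.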
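The two steps named in the abstract (an approximate sliding-window histogram, then noise addition before release) can be illustrated with a minimal sketch. This is not the EOHP algorithm: the abstract does not specify its approximate counting method or its optimized budget absorption mechanism, so the sketch assumes fixed-size block summaries for the first and plain Laplace noise with a fixed per-release budget for the second. The names `BlockedWindowHistogram`, `laplace`, and `publish` are hypothetical, and a sensitivity of 1 per bin is assumed.

```python
import random
from collections import Counter, deque


class BlockedWindowHistogram:
    """Approximate histogram over roughly the last `window` stream elements.

    Elements are summarized in blocks of `block` items, so only about
    window/block small counters are stored instead of every element.
    The result may cover up to block - 1 elements more than the window.
    """

    def __init__(self, window, block):
        self.window = window
        self.block = block
        self.blocks = deque()     # closed blocks, oldest first
        self.current = Counter()  # block still being filled
        self.filled = 0           # number of items in the current block

    def add(self, bin_id):
        self.current[bin_id] += 1
        self.filled += 1
        if self.filled == self.block:
            self.blocks.append(self.current)
            self.current = Counter()
            self.filled = 0
            # Drop whole blocks once the closed blocks exceed the window.
            while len(self.blocks) * self.block > self.window:
                self.blocks.popleft()

    def histogram(self):
        total = Counter(self.current)
        for b in self.blocks:
            total.update(b)
        return total


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with rate 1/scale
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def publish(hist, bins, epsilon, rng=random):
    # Assumption: each bin count has sensitivity 1, so the noise scale
    # is 1/epsilon. A real method would also track the budget spent
    # across successive releases.
    return {b: hist[b] + laplace(1 / epsilon, rng) for b in bins}


if __name__ == "__main__":
    h = BlockedWindowHistogram(window=4, block=2)
    for item in ["a", "a", "b", "b", "c", "c"]:
        h.add(item)
    print(publish(h.histogram(), ["a", "b", "c"], epsilon=1.0))
```

The block size trades accuracy for storage: larger blocks mean fewer counters but a coarser window boundary.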
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Author information
Contributions
Xiujun WANG and Tao TAO designed the research. Funan ZHANG processed the data. Tao TAO, Funan ZHANG, and Xiujun WANG drafted the paper. Xiao ZHENG and Xin ZHAO helped organize the paper, process the images, and verify the results in Sections 3 and 4. Funan ZHANG and Xiujun WANG revised and finalized the paper.
Ethics declarations
All the authors declare that they have no conflict of interest.
Additional information
Project supported by the Anhui Provincial Natural Science Foundation, China (Nos. 2108085MF218 and 2022AH040052), the University Synergy Innovation Program of Anhui Province, China (No. GXXT-2023-021), the Key Program of the Natural Science Foundation of the Educational Commission of Anhui Province of China (No. 2022AH050319), and the National Natural Science Foundation of China (Nos. 62172003 and 61402008).
About this article
Cite this article
Tao, T., Zhang, F., Wang, X. et al. An efficient online histogram publication method for data streams with local differential privacy. Front Inform Technol Electron Eng 25, 1096–1109 (2024). https://doi.org/10.1631/FITEE.2300368
DOI: https://doi.org/10.1631/FITEE.2300368