An efficient online histogram publication method for data streams with local differential privacy

An efficient online histogram publication algorithm for data streams based on local differential privacy

  • Research Article
Frontiers of Information Technology & Electronic Engineering

Abstract

Many application areas now generate data streams that contain privacy-sensitive information. Although sharing and releasing these data has great commercial value, releasing them directly would disclose the private user information they contain. Therefore, how to continuously generate publishable histograms that meet privacy protection requirements over sliding data stream windows has become a critical issue, especially when the data are sent to an untrusted third party. Existing histogram publication methods are unsatisfactory in terms of time and storage costs because they must cache all elements in the current sliding window (SW). Our work addresses this drawback by designing an efficient online histogram publication (EOHP) method for data streams with local differential privacy. Specifically, in the EOHP method, the data collector first builds a histogram of the current SW using an approximate counting method. Second, the data collector reduces privacy budget consumption by using an optimized budget absorption mechanism and adds appropriate noise to the approximate histogram, so that the histogram can be published while retaining satisfactory data utility. Extensive experiments on two different real datasets show that the EOHP algorithm significantly reduces time and storage costs and improves data utility compared with other existing algorithms.

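Abstract (translated from the Chinese)

Real-time data streams containing sensitive user information are now being generated in many fields. Although sharing and publishing these data has great commercial value, publishing them directly would leak the private user information they contain. Therefore, how to continuously generate publishable histograms that meet privacy protection requirements over sliding data stream windows has become a key problem, especially when the data are sent to an untrusted third party. Existing histogram publication methods perform unsatisfactorily in terms of time and storage costs because they must cache all elements in the current sliding window (SW). To solve this problem, we propose an efficient online histogram publication (EOHP) algorithm for data streams with local differential privacy. Specifically, in the EOHP algorithm, the data collector first uses an approximate counting method for data streams to process the data online and obtain a preliminary histogram. Second, an optimized privacy budget allocation strategy is proposed to reduce privacy budget consumption, and appropriate noise is added to the approximate histogram so that it can be published while retaining good data utility. Extensive experiments on two different real datasets show that, compared with other existing algorithms, the EOHP algorithm significantly reduces time and storage costs and improves data utility.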
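
The two steps named in the abstract (approximate counting over the sliding window, then noise addition) can be illustrated with a minimal sketch. This is not the EOHP algorithm itself, whose details are not given on this page. It assumes a simple block-based approximation (the window is split into fixed-size blocks and only per-block counts are stored), plain Laplace noise with an assumed sensitivity of 1, and no budget absorption. All names (`BlockedWindowHistogram`, `noisy_histogram`, `block_size`) are hypothetical.

```python
import random
from collections import Counter, deque


class BlockedWindowHistogram:
    """Approximate sliding-window histogram (illustrative assumption, not EOHP).

    Stores one Counter per block instead of caching every element, so the
    histogram covers between window_size and window_size + block_size - 1
    of the most recent items.
    """

    def __init__(self, window_size, block_size):
        if window_size % block_size != 0:
            raise ValueError("window_size must be a multiple of block_size")
        self.block_size = block_size
        self.max_blocks = window_size // block_size
        self.closed = deque()     # counters of completed blocks, oldest first
        self.current = Counter()  # counter of the block being filled
        self.filled = 0           # number of items in the current block

    def add(self, item):
        self.current[item] += 1
        self.filled += 1
        if self.filled == self.block_size:
            # Close the block and evict the oldest one if the window is full.
            self.closed.append(self.current)
            self.current = Counter()
            self.filled = 0
            if len(self.closed) > self.max_blocks:
                self.closed.popleft()

    def histogram(self):
        total = Counter(self.current)
        for block in self.closed:
            total.update(block)
        return total


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(hist, bins, epsilon, rng):
    """Add Laplace noise to each bin (assumed sensitivity 1, budget epsilon)."""
    scale = 1.0 / epsilon
    return {b: hist.get(b, 0) + laplace(scale, rng) for b in bins}


if __name__ == "__main__":
    rng = random.Random(0)
    window = BlockedWindowHistogram(window_size=4, block_size=2)
    for item in ["a", "b", "a", "c", "a"]:
        window.add(item)
    print(noisy_histogram(window.histogram(), ["a", "b", "c"], 1.0, rng))
```

Storing per-block counters trades a bounded over-coverage of at most `block_size - 1` stale items for memory proportional to the number of blocks times the number of bins, rather than to the window length. A budget absorption step would sit between `histogram()` and `noisy_histogram()` and decide how much of the budget each publication may spend.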

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Author information

Contributions

Xiujun WANG and Tao TAO designed the research. Funan ZHANG processed the data. Tao TAO, Funan ZHANG, and Xiujun WANG drafted the paper. Xiao ZHENG and Xin ZHAO helped organize the paper, process the images, and verify the results in Sections 3 and 4. Funan ZHANG and Xiujun WANG revised and finalized the paper.

Corresponding author

Correspondence to Xiujun Wang (王修君).

Ethics declarations

All the authors declare that they have no conflict of interest.

Additional information

Project supported by the Anhui Provincial Natural Science Foundation, China (Nos. 2108085MF218 and 2022AH040052), the University Synergy Innovation Program of Anhui Province, China (No. GXXT-2023-021), the Key Program of the Natural Science Foundation of the Educational Commission of Anhui Province of China (No. 2022AH050319), and the National Natural Science Foundation of China (Nos. 62172003 and 61402008).

About this article

Cite this article

Tao, T., Zhang, F., Wang, X. et al. An efficient online histogram publication method for data streams with local differential privacy. Front Inform Technol Electron Eng 25, 1096–1109 (2024). https://doi.org/10.1631/FITEE.2300368

  • DOI: https://doi.org/10.1631/FITEE.2300368
