Impact of data correlation on privacy budget allocation in continuous publication of location statistics

Peer-to-Peer Networking and Applications

Abstract

Continuous publication of statistics collected from location-based applications may compromise users’ privacy, because the statistics are derived from users’ private data. Differential Privacy (DP) is a privacy notion that offers a strong privacy guarantee to all users who contribute to the statistics. However, the existing DP mechanism for continuous publication of location statistics provides its guarantee under the assumption that the data points in a user’s stream at consecutive timestamps are independent. In reality, a user’s data points may be temporally correlated, which results in more privacy leakage because too little privacy budget is allocated to the timestamps at which the data points are correlated. In this paper, we present a reformulated differential privacy definition to quantify the impact of temporal correlation on privacy leakage. We then introduce a privacy budget allocation method that allocates an adequate amount of privacy budget to each successive timestamp under the protection of differential privacy. Our solution adopts w-event privacy for continuously releasing statistics over infinite streams. The main idea is to check the dissimilarity between the statistics at each timestamp and decide whether to publish the current statistics or the last released statistics. Finally, we evaluate the data utility of the proposed method with experimental results on real and synthetic data sets.
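
The publish-or-approximate decision described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's mechanism: the function names, the mean absolute difference as dissimilarity measure, and the fixed `threshold` are all assumptions made for illustration, and the sketch omits the noise that a differentially private mechanism would add to both the dissimilarity and the published statistics.

```python
from typing import List, Tuple


def mean_abs_diff(a: List[float], b: List[float]) -> float:
    """Hypothetical dissimilarity measure: mean absolute difference per location."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def choose_release(current: List[float],
                   last_release: List[float],
                   threshold: float) -> Tuple[str, List[float]]:
    """Noise-free illustration of the decision step.

    If the current statistics differ enough from the last release, publish
    them; otherwise approximate them with the last release, which would
    spend no publication budget at this timestamp.
    """
    if mean_abs_diff(current, last_release) > threshold:
        return "publish", list(current)
    return "approximate", list(last_release)


# Counts at three locations changed only slightly, so the last release is reused.
decision, released = choose_release([10, 4, 7], [9, 4, 7], threshold=1.0)
print(decision, released)
```

Reusing the last release when the statistics have barely changed is what lets such a scheme save privacy budget for timestamps where the data actually shifts.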

Author information

Corresponding author

Correspondence to D. Hemkumar.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Privacy-Preserving Computing

Guest Editors: Kaiping Xue, Zhe Liu, Haojin Zhu, Miao Pan and David S.L. Wei

Appendix:

Theorem 3

Prove that

$$\sum_{j=k-w+1}^{k}\left(\epsilon/4 - \left[\sum_{j=k-w+1}^{k-1}\epsilon_{j}\right]\right)/2 \leq \epsilon/4.$$

Proof

Sub-mechanism \({\mathscr{M}}^{2}\) allocates the privacy budget to timestamps in an exponentially decreasing fashion, i.e., (𝜖/8, 𝜖/16, 𝜖/32, ...). If n timestamps in the window receive budget, the LHS is

$$ \sum_{j=k-w+1}^{k}\left(\epsilon/4 - \left[\sum_{j=k-w+1}^{k-1}\epsilon_{j}\right]\right)/2 = \frac{\epsilon}{8} + \frac{\epsilon}{16} + \cdots + \frac{\epsilon}{2^{n+2}} = \sum_{i=3}^{n+2}\frac{\epsilon}{2^{i}} $$

so it suffices to show that this sum is at most 𝜖/4. Multiplying both sides by 4, we can rewrite

$$ \sum_{i=3}^{n+2}\frac{\epsilon}{2^{i}} \leq \epsilon/4 \quad \text{into} \quad \sum_{i=1}^{n}\frac{\epsilon}{2^{i}} \leq \epsilon $$

Both sides scale linearly in 𝜖, so we may assume that 𝜖 = 1:

$$ \sum_{i=1}^{n}\frac{1}{2^{i}} = \sum_{i=1}^{n}(\frac{1}{2})(\frac{1}{2})^{i-1} = \frac{1}{2}\frac{(1-(\frac{1}{2})^{n})}{(1-(\frac{1}{2}))} $$

We show by mathematical induction that this closed form holds and is at most 1.

Basis: n = 1

$$ \sum_{i=1}^{1}(\frac{1}{2})(\frac{1}{2})^{i-1} = \frac{1}{2}\frac{(1-(\frac{1}{2})^{1})}{(1-(\frac{1}{2}))} = \frac{1}{2} \leq 1 $$

Inductive step: Assume true for n = p

$$ \sum_{i=1}^{p}(\frac{1}{2})(\frac{1}{2})^{i-1} = (\frac{1}{2})(\frac{1}{2})^{0}+(\frac{1}{2})(\frac{1}{2})^{1}+\cdots+(\frac{1}{2})(\frac{1}{2})^{p-1} $$
$$ = \frac{1}{2}\frac{(1-(\frac{1}{2})^{p})}{(1-(\frac{1}{2}))} \leq 1 $$

Prove true for n = p + 1

$$ \sum_{i=1}^{p+1}(\frac{1}{2})(\frac{1}{2})^{i-1} = (\frac{1}{2})(\frac{1}{2})^{0}+(\frac{1}{2})(\frac{1}{2})^{1}+\cdots+(\frac{1}{2})(\frac{1}{2})^{p-1}+ (\frac{1}{2})(\frac{1}{2})^{p} $$
$$ = \frac{1}{2}\frac{(1-(\frac{1}{2})^{p})}{(1-(\frac{1}{2}))}+(\frac{1}{2})(\frac{1}{2})^{p} $$
$$ = \frac{1}{2}\frac{(1-(\frac{1}{2})^{p}+(\frac{1}{2})^{p}-(\frac{1}{2})^{p+1})}{(1-(\frac{1}{2}))} $$
$$ = \frac{1}{2}\frac{(1-(\frac{1}{2})^{p+1})}{(1-(\frac{1}{2}))} = 1-(\frac{1}{2})^{p+1} \leq 1 $$

Hence the inequality holds for every n when 𝜖 = 1, and by scaling for every 𝜖 > 0, which gives the bound of 𝜖/4 on the LHS.
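
The bound in Theorem 3 can also be checked numerically. The Python sketch below is an illustration under assumptions rather than the paper's implementation: the function names are hypothetical, and it assumes the worst case in which every timestamp publishes and therefore receives half of the budget remaining from 𝜖/4 after subtracting what the previous w − 1 timestamps spent. Exact fractions are used so that the comparison against 𝜖/4 is free of floating-point error.

```python
from fractions import Fraction
from typing import List


def allocate_budgets(epsilon, w: int, steps: int) -> List[Fraction]:
    """Assumed allocation rule: each timestamp gets half of what remains of
    epsilon/4 after the budgets of the previous w-1 timestamps are subtracted."""
    cap = Fraction(epsilon) / 4
    budgets: List[Fraction] = []
    for k in range(steps):
        spent = sum(budgets[max(0, k - w + 1):k], Fraction(0))
        budgets.append((cap - spent) / 2)
    return budgets


def max_window_spend(budgets: List[Fraction], w: int) -> Fraction:
    """Largest total budget spent inside any sliding window of w timestamps."""
    return max(
        sum(budgets[max(0, k - w + 1):k + 1], Fraction(0))
        for k in range(len(budgets))
    )


budgets = allocate_budgets(epsilon=1, w=3, steps=50)
print(budgets[:3])                   # the first window halves each time
print(max_window_spend(budgets, 3))  # stays below epsilon/4
```

With 𝜖 = 1 the first three budgets are 1/8, 1/16 and 1/32, matching the sequence used in the proof, and no window of w timestamps spends more than 1/4.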

About this article

Cite this article

Hemkumar, D., Ravichandra, S. & Somayajulu, D.V.L.N. Impact of data correlation on privacy budget allocation in continuous publication of location statistics. Peer-to-Peer Netw. Appl. 14, 1650–1665 (2021). https://doi.org/10.1007/s12083-021-01078-6
