Reviewing the Methods of Estimating the Density Function Based on Masked Data

  • Conference paper
Privacy in Statistical Databases (PSD 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11126)

Abstract

Data privacy is an issue of increasing importance for big data mining, especially for micro-level data. A popular approach to protecting such data is perturbation. Therefore, techniques used to recover the statistical information of the original data from the perturbed data become indispensable in data mining.

This paper reviews and examines the existing techniques for estimating (alternatively, reconstructing) the density function of the original data based on data perturbed by additive or multiplicative noise. Our studies show that the techniques developed for noise-added data cannot replace the techniques for noise-multiplied data, even though the two types of masked data can be converted into each other through data transformation. This conclusion should be of interest to data providers.
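The two masking schemes named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `mask_additive` and `mask_multiplicative`, the normal and uniform noise distributions, the noise parameters, and the log-normal toy data. The sketch also checks the transformation that links the two schemes: for positive data and positive noise, y = x·c implies log y = log x + log c, so noise-multiplied data become noise-added data on the log scale.

```python
import math
import random

def mask_additive(x, noise_sd, rng):
    """Noise-added data: y_i = x_i + e_i, with e_i ~ Normal(0, noise_sd)."""
    return [xi + rng.gauss(0.0, noise_sd) for xi in x]

def mask_multiplicative(x, low, high, rng):
    """Noise-multiplied data: y_i = x_i * c_i, with c_i ~ Uniform(low, high).

    Returns the masked values and the noise factors; the factors are
    returned only so that the log identity below can be checked.
    """
    c = [rng.uniform(low, high) for _ in x]
    return [xi * ci for xi, ci in zip(x, c)], c

rng = random.Random(0)

# Hypothetical positive-valued original data (illustrative only).
x = [rng.lognormvariate(3.0, 0.5) for _ in range(1000)]

y_add = mask_additive(x, noise_sd=2.0, rng=rng)
y_mult, c = mask_multiplicative(x, low=0.8, high=1.2, rng=rng)

# On the log scale, multiplicative masking is additive masking:
# log(y) = log(x) + log(c), valid because x > 0 and c > 0.
log_gap = max(
    abs(math.log(yi) - (math.log(xi) + math.log(ci)))
    for xi, yi, ci in zip(x, y_mult, c)
)
print(f"max |log y - (log x + log c)| = {log_gap:.2e}")
```

The identity holds only up to floating-point rounding and only when both the data and the noise are strictly positive. It shows that the conversion is possible; it says nothing about whether a density estimator built for noise-added data performs well after the conversion, which is the question the paper examines.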

Notes

  1. Some literature uses the term reconstructing in place of estimating. We use the two terms interchangeably in this paper.

  2. See the discussion of the KEtal2003 Approach.

  3. Other multiplicative noise distributions might be considered. Identifying the best multiplicative noise for masking the underlying data, in terms of minimising both the value disclosure risk and the loss of utility of the original data, is a subject for future work.

References

  1. Agrawal, R., Srikant, R.: Privacy-preserving data mining. ACM SIGMOD Rec. 29, 439–450 (2000)

  2. Agrawal, D., Aggarwal, C.C.: On the design and quantification of privacy preserving data mining algorithms. In: Proceedings of the Twentieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 247–255. ACM (2001)

  3. Kargupta, H., Datta, S., Wang, Q., Sivakumar, K.: On the privacy preserving properties of random data perturbation techniques. In: 2003 Third IEEE International Conference on Data Mining, ICDM 2003, pp. 99–106. IEEE (2003)

  4. Lin, Y.-X.: Density approximant based on noise multiplied data. In: Domingo-Ferrer, J. (ed.) PSD 2014. LNCS, vol. 8744, pp. 89–104. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11257-2_8

  5. Lin, Y.X., Fielding, M.J.: MaskDensity14: an R package for the density approximant of a univariate based on noise multiplied data. SoftwareX 3, 37–43 (2015)

  6. Lin, Y.X.: Mining the statistical information of confidential data from noise-multiplied data. In: Proceedings of the 3rd IEEE International Conference on Big Data Intelligence and Computing (2017)

  7. Domingo-Ferrer, J., Sebé, F., Castellà-Roca, J.: On the security of noise addition for privacy in statistical databases. In: Domingo-Ferrer, J., Torra, V. (eds.) PSD 2004. LNCS, vol. 3050, pp. 149–161. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25955-8_12

  8. Lin, Y.X., Mazur, L., Sarathy, R., Muralidhar, K.: Statistical information recovery from multivariate noise-multiplied data, a computational approach. Trans. Data Priv. 11, 23–45 (2018)

  9. Kim, J.J.: A method for limiting disclosure in microdata based on random noise and transformation. In: Proceedings of the Section on Survey Research Methods, pp. 303–308. American Statistical Association (1986)

  10. Kim, J., Winkler, W.: Multiplicative noise for masking continuous data. Statistics 2003-01 (2003)

  11. Mivule, K.: Utilizing noise addition for data privacy, an overview. In: Proceedings of the International Conference on Information and Knowledge Engineering (IKE), The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp), p. 1 (2012)

  12. Torra, V.: Data Privacy: Foundations, New Developments and the Big Data Challenge. SBD, vol. 28. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-57358-8

  13. Nayak, T.K., Sinha, B., Zayatz, L.: Statistical properties of multiplicative noise masking for confidentiality protection. J. Off. Stat. 27(3), 527–544 (2011)

  14. Muralidhar, K., Batra, D., Kirs, P.J.: Accessibility, security, and accuracy in statistical databases: the case for the multiplicative fixed data perturbation approach. Manag. Sci. 41(9), 1549–1564 (1995)

  15. Provost, S.B.: Moment-based density approximants. Math. J. 9(4), 727–756 (2005)

  16. Lin, Y.X.: A computational Bayesian approach for estimating density functions based on noise-multiplied data. Int. J. Big Data Intell. (2018). (in press)

  17. Ma, Y., Lin, Y.X., Sarathy, R.: The vulnerability of multiplicative noise protection to correlational attacks on continuous microdata. Technical report, National Institute for Applied Statistics Research Australia, School of Mathematics and Applied Statistics, University of Wollongong, Australia (2017)

  18. United States Census Bureau: United states census dataset (2000). Accessed 27 July 2000


Acknowledgements

Part of the R code for implementing the AS2000 Approach was developed by Miss A. Fernando, supported by the Winter Project Scholarship 2016, School of Mathematics and Applied Statistics, UoW.

Author information

Corresponding author

Correspondence to Yan-Xia Lin.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Lin, Y.-X., Krivitsky, P.N. (2018). Reviewing the Methods of Estimating the Density Function Based on Masked Data. In: Domingo-Ferrer, J., Montes, F. (eds) Privacy in Statistical Databases. PSD 2018. Lecture Notes in Computer Science, vol 11126. Springer, Cham. https://doi.org/10.1007/978-3-319-99771-1_16

  • DOI: https://doi.org/10.1007/978-3-319-99771-1_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99770-4

  • Online ISBN: 978-3-319-99771-1

  • eBook Packages: Computer Science, Computer Science (R0)
