Efficiency and Sample Size Determination of Protected Data

  • Conference paper
Privacy in Statistical Databases (PSD 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11126)

Abstract

This paper assesses the usefulness of a proposed multiplicative perturbation method by contrasting the statistical efficiency achieved in point hypothesis testing of simple proportions with that of the differentially private aggregated Laplace mechanism. This efficiency is evaluated by obtaining an analytical expression that determines the sample size required for protected data to retain a given significance level and power.
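As a rough illustration of what a sample size determination for protected data involves, the sketch below finds the smallest sample size at which a one-sided test of a simple proportion keeps a given significance level and power, with and without Laplace noise. It is not the paper's analytical expression. It assumes a normal approximation, a one-sided test of \(p_0\) against \(p_1\), and Laplace noise of scale \(1/\varepsilon\) added to the aggregated count (sensitivity 1). The function names `sd_proportion` and `required_n` are hypothetical.

```python
import math
from statistics import NormalDist


def sd_proportion(p, n, epsilon=None):
    """Standard deviation of a sample proportion from n Bernoulli(p) draws.

    If epsilon is given, Laplace noise with scale 1/epsilon is assumed to be
    added to the count before dividing by n (an assumption of this sketch).
    """
    var = p * (1.0 - p) / n
    if epsilon is not None:
        # Laplace(b) has variance 2*b^2; here b = 1/epsilon on the count,
        # so the released proportion gains variance 2 / (n*epsilon)^2.
        var += 2.0 / (n * epsilon) ** 2
    return math.sqrt(var)


def required_n(p0, p1, alpha=0.05, power=0.8, epsilon=None, n_max=10**8):
    """Smallest n meeting level alpha and the target power (normal approx.)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    gap = abs(p1 - p0)

    def enough(n):
        # Critical value under H0 plus the power margin under H1 must fit
        # inside the gap between the two hypothesised proportions.
        return (z_alpha * sd_proportion(p0, n, epsilon)
                + z_beta * sd_proportion(p1, n, epsilon)) <= gap

    if not enough(n_max):
        raise ValueError("no sample size up to n_max meets the requirements")

    # Both standard deviations shrink as n grows, so binary search is valid.
    lo, hi = 1, n_max
    while lo < hi:
        mid = (lo + hi) // 2
        if enough(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


n_plain = required_n(0.5, 0.6)                 # unprotected data
n_noisy = required_n(0.5, 0.6, epsilon=0.1)    # Laplace-protected count
print(n_plain, n_noisy, n_noisy / n_plain)
```

The ratio printed at the end is one simple way to read efficiency: how many more observations the protected data need to match the unprotected test. Because Laplace noise is not Gaussian, the normal approximation here is only indicative for small \(n\) or small \(\varepsilon\).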

Notes

  1. When \(N_p\) is large, the difference between \(1/(N_p-1)\) and \(1/N_p\) is negligible. To simplify the calculation, we use \(1/N_p\) instead of \(1/(N_p-1)\).
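
    The size of this approximation can be made explicit. The identity below is elementary algebra rather than a result quoted from the paper:

    \[
    \frac{1}{N_p-1}-\frac{1}{N_p}=\frac{1}{N_p(N_p-1)},
    \qquad
    \frac{1/(N_p-1)-1/N_p}{1/(N_p-1)}=\frac{1}{N_p}.
    \]

    So replacing \(1/(N_p-1)\) with \(1/N_p\) understates the quantity by a relative amount of \(1/N_p\); for example, \(N_p = 1000\) gives a relative error of 0.1%.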

Acknowledgement

This research has been conducted with the support of the Australian Government Research Training Program Scholarship.

Author information

Corresponding author

Correspondence to Bradley Wakefield.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Wakefield, B., Lin, Y.-X. (2018). Efficiency and Sample Size Determination of Protected Data. In: Domingo-Ferrer, J., Montes, F. (eds) Privacy in Statistical Databases. PSD 2018. Lecture Notes in Computer Science, vol 11126. Springer, Cham. https://doi.org/10.1007/978-3-319-99771-1_18

  • DOI: https://doi.org/10.1007/978-3-319-99771-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99770-4

  • Online ISBN: 978-3-319-99771-1

  • eBook Packages: Computer Science, Computer Science (R0)
