DOI: 10.1145/1755688.1755694

Restoring compromised privacy in micro-data disclosure

Published: 13 April 2010

Abstract

This paper studies the problem of restoring compromised privacy in micro-data disclosure with multiple disclosed views. It proposes the property of γ-privacy, which requires that the probability of an individual being associated with a sensitive value be bounded by γ in a possible table selected at random from the set of tables that would yield the same disclosed answers. For the restricted case of a single disclosed view, γ-privacy is shown to be equivalent to recursive ([EQUATION], 2)-Diversity, a notion that is not defined for multiple disclosed views. Deciding γ-privacy for a set of disclosed views is proven to be #P-complete. To mitigate this high computational complexity, γ-privacy is relaxed to hold with (ε, θ) confidence, i.e., the probability of disclosing a sensitive value of an individual must be bounded by γ + ε with statistical confidence θ. A Monte Carlo-based algorithm is proposed that checks the relaxed property in O((λλ')^4) time for constant ε and θ, where λ is the number of tuples in the original table and λ' is the number of distinct sensitive values in the original table. The paper then studies how to restore compromised privacy by disclosing additional views, and proposes heuristic polynomial-time algorithms based on enumerating and checking additional disclosed views. A preliminary experimental study on real-life medical data demonstrates that the proposed polynomial-time algorithms restore privacy in up to 60% of compromised disclosures.
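
To make the relaxed (ε, θ) check concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy setting with a single disclosed view, in which one group of individuals is published together with the multiset of their sensitive values, so every permutation of those values within the group is an equally likely possible table. The sample size comes from the two-sided Hoeffding bound, which is an assumption made here for illustration; the abstract does not state how the paper derives its sample size. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def hoeffding_samples(eps, theta):
    """Number of samples so that |estimate - true probability| <= eps
    holds with probability at least theta (two-sided Hoeffding bound)."""
    return math.ceil(math.log(2.0 / (1.0 - theta)) / (2.0 * eps * eps))


def exact_max_probability(group_values):
    """Exact single-view probability of the most frequent sensitive value:
    its share of the group's sensitive values."""
    counts = Counter(group_values)
    return max(counts.values()) / len(group_values)


def estimate_probability(group_values, target, n_samples, rng):
    """Monte Carlo estimate of P(first individual in the group has `target`),
    sampling possible tables as uniform random permutations of the values."""
    values = list(group_values)
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(values)  # one random possible table for this group
        if values[0] == target:
            hits += 1
    return hits / n_samples


def relaxed_gamma_private(group_values, gamma, eps, theta, seed=0):
    """Accept when every estimate is at most gamma. By the Hoeffding bound,
    each accepted (individual, value) pair then has true probability at most
    gamma + eps with confidence at least theta (per pair, not jointly)."""
    rng = random.Random(seed)
    n = hoeffding_samples(eps, theta)
    # By symmetry within the group, the first individual is representative.
    return all(
        estimate_probability(group_values, value, n, rng) <= gamma
        for value in sorted(set(group_values))
    )


group = ["flu", "flu", "flu", "cold"]
print(hoeffding_samples(0.05, 0.95))
print(exact_max_probability(group))
print(relaxed_gamma_private(group, gamma=0.5, eps=0.05, theta=0.95))
print(relaxed_gamma_private(group, gamma=0.9, eps=0.05, theta=0.95))
```

In this toy group the most frequent value accounts for three of four tuples, so the exact probability is 0.75 and the sampled check is expected to reject γ = 0.5 and accept γ = 0.9. With multiple disclosed views, sampling a possible table uniformly is itself the hard part, and this sketch does not attempt it.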

Published In

ASIACCS '10: Proceedings of the 5th ACM Symposium on Information, Computer and Communications Security
April 2010
363 pages
ISBN: 9781605589367
DOI: 10.1145/1755688
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data privacy
  2. micro-data disclosure

Qualifiers

  • Research-article

Conference

ASIA CCS '10

Acceptance Rates

ASIACCS '10 Paper Acceptance Rate 25 of 166 submissions, 15%;
Overall Acceptance Rate 418 of 2,322 submissions, 18%
