DOI: 10.1145/3460120.3484751
research-article

Side-Channel Attacks on Query-Based Data Anonymization

Published: 13 November 2021

Abstract

A longstanding problem in computer privacy is that of data anonymization. One common approach is to present a query interface to analysts, and anonymize on a query-by-query basis. In practice, this approach often uses a standard database back end, and presents the query semantics of the database to the analyst.
This paper presents a class of novel side-channel attacks that work against any query-based anonymization system that uses a standard database back end. The attacks exploit the implicit conditional logic of database runtime optimizations. They manipulate this logic to trigger timing and exception-throwing side-channels based on the contents of the data.
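The exception-throwing channel can be pictured with a toy sketch. It is not one of the paper's attack queries: it assumes an in-memory SQLite database, a hypothetical `people` table, and a hypothetical user-defined function `boom` that stands in for any expression the engine evaluates lazily and that raises an error. The analyst in this sketch never sees a query result, only whether the query failed.

```python
import sqlite3


def make_db():
    """Build a toy in-memory database; the table and values are illustrative only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, salary INTEGER)")
    conn.executemany(
        "INSERT INTO people VALUES (?, ?)",
        [(1, 52000), (2, 87000), (3, 31000)],
    )

    def boom(_value):
        # Hypothetical stand-in for an expression that raises at run time.
        raise ValueError("side-channel trigger")

    conn.create_function("boom", 1, boom)
    return conn


def query_fails(conn, person_id, threshold):
    """Return True if the query errors, which happens when salary >= threshold.

    SQLite evaluates only the CASE branch that is selected, so boom() runs
    (and raises) only when the condition on the private value holds.
    """
    sql = (
        "SELECT CASE WHEN salary >= ? THEN boom(salary) ELSE 0 END "
        "FROM people WHERE id = ?"
    )
    try:
        conn.execute(sql, (threshold, person_id)).fetchall()
        return False
    except sqlite3.Error:
        return True


def recover_salary(conn, person_id, low=0, high=1_000_000):
    """Binary search using only the error/no-error signal.

    Assumes low <= salary < high.
    """
    while high - low > 1:
        mid = (low + high) // 2
        if query_fails(conn, person_id, mid):
            low = mid   # the query failed, so salary >= mid
        else:
            high = mid  # the query succeeded, so salary < mid
    return low


if __name__ == "__main__":
    db = make_db()
    print([recover_salary(db, pid) for pid in (1, 2, 3)])
```

With the assumed range of one million, each value is recovered exactly in about 20 queries. No noisy answer is ever read, which is why noise added to query results does not by itself close this kind of channel.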
We demonstrate the attacks on the implementation of the CHORUS Differential Privacy system released by Uber as an open source project. We obtain perfect reconstruction of millions of data values even with a Differential Privacy budget smaller than epsilon = 1.0 and no prior knowledge.
The paper also presents the design of a general defense to the runtime-optimization attacks, and a concrete implementation of the defense in the latest version of Diffix. The defense works without modifications to the back end database, and operates by modifying SQL to eliminate the runtime optimization or disable the side-channels.
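One way to picture the "disable the side-channels" half of this defense is a rewrite that swaps an error-raising expression for a total one. The sketch below is an illustration under stated assumptions, not the Diffix implementation: `checked_div` and `safe_div` are hypothetical user-defined functions registered on an in-memory SQLite database, and `harden` is a naive textual rewrite where a real system would presumably rewrite a parsed query.

```python
import sqlite3


def checked_div(a, b):
    # Hypothetical operator that raises when the divisor is zero.
    return a // b


def safe_div(a, b):
    # Total variant: yields NULL (None) instead of raising.
    return None if b == 0 else a // b


def make_db():
    """Toy database with one private value; names and data are illustrative."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, salary INTEGER)")
    conn.execute("INSERT INTO people VALUES (1, 52000)")
    conn.create_function("checked_div", 2, checked_div)
    conn.create_function("safe_div", 2, safe_div)
    return conn


def harden(sql):
    """Naive rewrite: replace the raising function with its total variant."""
    return sql.replace("checked_div(", "safe_div(")


def query_fails(conn, sql, params):
    """Return True if executing the query raises a database error."""
    try:
        conn.execute(sql, params).fetchall()
        return False
    except sqlite3.Error:
        return True


# The error fires only when the data-dependent condition holds.
ATTACK_SQL = (
    "SELECT CASE WHEN salary >= ? THEN checked_div(salary, 0) ELSE 0 END "
    "FROM people WHERE id = ?"
)

if __name__ == "__main__":
    db = make_db()
    for threshold in (50000, 60000):
        before = query_fails(db, ATTACK_SQL, (threshold, 1))
        after = query_fails(db, harden(ATTACK_SQL), (threshold, 1))
        print(threshold, before, after)
```

After the rewrite, the query completes for both thresholds, so whether it fails no longer depends on the stored value. The sketch does not address the timing channel, and the database itself is left unmodified, as the paragraph above requires.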
In addition, two other attacks that exploit specific flaws in Diffix and CHORUS are reported. These have been fixed in the respective implementations.

Published In

CCS '21: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security
November 2021
3558 pages
ISBN:9781450384544
DOI:10.1145/3460120
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. anonymization
  2. databases
  3. privacy
  4. side-channels

Qualifiers

  • Research-article

Conference

CCS '21
CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security
November 15 - 19, 2021
Virtual Event, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

Cited By

  • (2024) Membrane - Safe and Performant Data Access Controls in Apache Spark in the Presence of Imperative Code. Proceedings of the VLDB Endowment 17:12 (3813-3826). DOI: 10.14778/3685800.3685808. Online publication date: 8-Nov-2024
  • (2024) On Vulnerability of Access Control Restrictions to Timing Attacks in a Database Management System. Proceedings of the 36th International Conference on Scientific and Statistical Database Management (1-4). DOI: 10.1145/3676288.3676306. Online publication date: 10-Jul-2024
  • (2024) QueryCheetah: Fast Automated Discovery of Attribute Inference Attacks Against Query-Based Systems. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (3451-3465). DOI: 10.1145/3658644.3690272. Online publication date: 2-Dec-2024
  • (2024) From Principle to Practice: Vertical Data Minimization for Machine Learning. 2024 IEEE Symposium on Security and Privacy (SP) (4733-4752). DOI: 10.1109/SP54263.2024.00089. Online publication date: 19-May-2024
  • (2023) Reversible Database Watermarking Based on Order-preserving Encryption for Data Sharing. ACM Transactions on Database Systems 48:2 (1-25). DOI: 10.1145/3589761. Online publication date: 13-May-2023
