Article

Privacy preserving database application testing

Authors:

Yuliang Zheng

WPES '03: Proceedings of the 2003 ACM workshop on Privacy in the electronic society

Pages 118 - 128

https://doi.org/10.1145/1005140.1005159

Published: 30 October 2003

Abstract

Traditionally, application software developers carry out their tests on their own local development databases. However, such local databases usually hold only a small amount of sample data and hence cannot satisfactorily simulate a live environment, especially for performance and scalability testing. On the other hand, testing applications over live production databases is increasingly problematic in most situations, primarily because it can expose sensitive data to an unauthorized tester and can incorrectly update information in the underlying database. In this paper, we investigate techniques to generate mock databases for application software testing without revealing any confidential information from the live production databases. Specifically, we will design mechanisms to create the deterministic rule set R, the non-deterministic rule set NR, and the statistical data set S for a live production database. We will then build a security Analyzer which will process the triplet <R, NR, S> together with the security requirements (security policy) and output a new triplet <R', NR', S'>. The security Analyzer will guarantee that no confidential information can be inferred from the new triplet <R', NR', S'>. The mock database generated from this new triplet can simulate the live environment for testing purposes while maintaining the privacy of the data in the original database.
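The pipeline described in the abstract (extract a triplet, pass it through an analyzer with a security policy, generate a mock database from the result) can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `Triplet` and `Policy` classes, the functions `extract_triplet`, `analyze`, and `generate_mock`, the choice of per-column mean, standard deviation, minimum, and maximum as the statistics, and the Gaussian sampler. In particular, the toy `analyze` step only substitutes public statistics and drops named rules; it does not provide the inference guarantee the abstract attributes to the real security Analyzer.

```python
import random
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, Dict, List, Set, Tuple

Row = Dict[str, float]
Rule = Callable[[Row], bool]


@dataclass
class Triplet:
    """Hypothetical container for the <R, NR, S> triplet."""
    R: Dict[str, Rule]                 # deterministic rules: hold for every row
    NR: Dict[str, Tuple[Rule, float]]  # non-deterministic rules: (predicate, observed fraction)
    S: Dict[str, Dict[str, float]]     # per-column summary statistics


@dataclass
class Policy:
    """Hypothetical security policy (an assumption of this sketch)."""
    public_stats: Dict[str, Dict[str, float]]  # replacement statistics for sensitive columns
    hidden_rules: Set[str]                     # names of NR rules that must not be released


def extract_triplet(rows: List[Row], R: Dict[str, Rule],
                    nr_predicates: Dict[str, Rule]) -> Triplet:
    """Build <R, NR, S> from production rows (assumes a non-empty table)."""
    S = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        S[col] = {"mean": mean(values), "std": pstdev(values),
                  "min": min(values), "max": max(values)}
    # Record how often each non-deterministic rule holds in the real data.
    NR = {name: (p, sum(p(r) for r in rows) / len(rows))
          for name, p in nr_predicates.items()}
    return Triplet(dict(R), NR, S)


def analyze(t: Triplet, policy: Policy) -> Triplet:
    """Toy stand-in for the security Analyzer: returns <R', NR', S'>."""
    S2 = {col: dict(policy.public_stats.get(col, stats))
          for col, stats in t.S.items()}
    NR2 = {name: v for name, v in t.NR.items()
           if name not in policy.hidden_rules}
    return Triplet(dict(t.R), NR2, S2)


def generate_mock(t: Triplet, n: int, seed: int = 0) -> List[Row]:
    """Sample n mock rows from S' and keep only rows that satisfy every rule in R'.

    NR' is carried along but not enforced here. The loop does not
    terminate if the rules in R' cannot be satisfied under S'.
    """
    rng = random.Random(seed)
    rows: List[Row] = []
    while len(rows) < n:
        row = {}
        for col, s in t.S.items():
            value = rng.gauss(s["mean"], s["std"])
            row[col] = min(max(value, s["min"]), s["max"])  # clip to the released range
        if all(rule(row) for rule in t.R.values()):
            rows.append(row)
    return rows


if __name__ == "__main__":
    production = [{"age": 30, "salary": 52000},
                  {"age": 45, "salary": 81000},
                  {"age": 23, "salary": 39000}]
    original = extract_triplet(
        production,
        R={"adult": lambda r: r["age"] >= 18},
        nr_predicates={"older_earn_more": lambda r: r["age"] < 40 or r["salary"] > 60000},
    )
    policy = Policy(
        public_stats={"salary": {"mean": 50000.0, "std": 10000.0,
                                 "min": 0.0, "max": 100000.0}},
        hidden_rules={"older_earn_more"},
    )
    released = analyze(original, policy)
    mock_db = generate_mock(released, n=1000, seed=42)
    print(len(mock_db), sorted(released.S))
```

The point of the sketch is the separation of concerns: the generator only ever sees the released triplet, so whatever the analyzer withholds cannot reach the tester through the mock data. Honoring the confidences recorded in NR' would require a constraint-aware sampler, which this sketch leaves out.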

Information & Contributors

Information

Published In

WPES '03: Proceedings of the 2003 ACM workshop on Privacy in the electronic society

October 2003

135 pages

ISBN:1581137761

DOI:10.1145/1005140

General Chair: Sushil Jajodia, George Mason University

Program Chairs: Pierangela Samarati, University of Milan, Italy; Paul Syverson, Naval Research Lab

Copyright © 2003 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

CCS03: Tenth ACM Conference on Computer and Communications Security 2003

Sponsor: SIGSAC

30 October 2003

Washington, DC

Acceptance Rates

Overall Acceptance Rate 106 of 355 submissions, 30%

Cited By

McLachlan S, Dube K, Gallagher T, Simmonds J, Fenton N (2019). Realistic Synthetic Data Generation: The ATEN Framework. Biomedical Engineering Systems and Technologies, pp. 497-523. Online publication date: 13-Aug-2019.
https://doi.org/10.1007/978-3-030-29196-9_25
Chandrasekaran J, Feng H, Lei Y, Kuhn D, Kacker R (2017). Applying Combinatorial Testing to Data Mining Algorithms. 2017 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pp. 253-261. Online publication date: Mar-2017.
https://doi.org/10.1109/ICSTW.2017.46
Venkataramanan N, Shriram A (2016). Synthetic Data Generation. Data Privacy, pp. 141-154. Online publication date: 10-Oct-2016.
https://doi.org/10.1201/9781315370910-8
Shmueli E, Zrihen T, Yahalom R, Tassa T (2014). Constrained obfuscation of relational databases. Information Sciences: an International Journal, 286, pp. 35-62. Online publication date: 1-Dec-2014.
https://dl.acm.org/doi/10.1016/j.ins.2014.07.009
Schanes C, Fankhauser F, Grechenig T, Schafferer M, Behning K, Hovemeyer D (2009). Problem Space and Special Characteristics of Security Testing in Live and Operational Environments of Large Systems Exemplified by a Nationwide IT Infrastructure. Proceedings of the 2009 First International Conference on Advances in System Testing and Validation Lifecycle, pp. 161-166. Online publication date: 20-Sep-2009.
https://dl.acm.org/doi/10.1109/VALID.2009.24
Taylor C, Gittens M, Miranskyy A, Giakoumakis L, Kossmann D (2008). A case study in database reliability. Proceedings of the 1st international workshop on Testing database systems, pp. 1-6. Online publication date: 13-Jun-2008.
https://dl.acm.org/doi/10.1145/1385269.1385283
Wu X, Wang Y, Guo S, Zheng Y (2007). Privacy Preserving Database Generation for Database Application Testing. Fundamenta Informaticae, 78:4, pp. 595-612. Online publication date: 1-Dec-2007.
https://dl.acm.org/doi/10.5555/2366516.2366525
Wu X, Wang Y, Guo S, Zheng Y (2007). Privacy Preserving Database Generation for Database Application Testing. Fundamenta Informaticae, 78:4, pp. 595-612. Online publication date: 1-Dec-2007.
https://dl.acm.org/doi/10.5555/1366038.1366047
Han-Yuen Ong, Ali Miri (2007). Privacy preserving database access through dynamic privacy filters with stable data randomization. 2007 IEEE International Conference on Systems, Man and Cybernetics, pp. 3333-3338. Online publication date: Oct-2007.
https://doi.org/10.1109/ICSMC.2007.4414178
Sengupta B, Chandra S, Sinha V, Osterweil L, Rombach D, Soffa M (2006). A research agenda for distributed software development. Proceedings of the 28th international conference on Software engineering, pp. 731-740. Online publication date: 28-May-2006.
https://dl.acm.org/doi/10.1145/1134285.1134402
