Abstract
Data masking protects data from unauthorized access by third parties. In this paper, we propose a case-based assistance system for data masking that reuses experience in substituting (pseudonymising) the values of database fields. Data masking experts use rules that preserve task-oriented properties of the data values, such as the environmental hazard risk class of a residential area when masking the address data of insurance customers. The rules transform operational data into masked data sets that are hard to trace back to the original values and that can be used, for instance, in software test management in the insurance sector. We introduce a case representation for masking a database column. Its problem part contains descriptors of the structural properties and value properties of the column, and its solution part is the data masking rule. We describe the similarity functions and the implementation of the approach with myCBR. Finally, we report on an experimental evaluation with a case base of more than 600 cases and 31 queries, which compares the results of case-based retrieval with the solutions recommended by a data masking expert.
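The case structure and retrieval step outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' myCBR model: the attribute names (`data_type`, `max_length`, `distinct_ratio`), the weights, the value ranges, and the rule names are all hypothetical and chosen only to show how a weighted sum of local similarities ranks stored cases for a new column.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ColumnCase:
    # Problem descriptors (hypothetical attributes, not taken from the paper)
    data_type: str         # structural property, e.g. "VARCHAR"
    max_length: int        # structural property
    distinct_ratio: float  # value property: distinct values / rows, in [0, 1]
    # Solution part: name of the masking rule (None for a query)
    masking_rule: Optional[str] = None


def sim_symbolic(a: str, b: str) -> float:
    """Local similarity for symbolic attributes: exact match or nothing."""
    return 1.0 if a == b else 0.0


def sim_numeric(a: float, b: float, value_range: float) -> float:
    """Local similarity for numeric attributes: linear in the distance."""
    return max(0.0, 1.0 - abs(a - b) / value_range)


# Assumed attribute weights; a real system would tune these with experts.
WEIGHTS = {"data_type": 0.5, "max_length": 0.2, "distinct_ratio": 0.3}


def global_similarity(query: ColumnCase, case: ColumnCase) -> float:
    """Weighted average of the local similarities."""
    local = {
        "data_type": sim_symbolic(query.data_type, case.data_type),
        "max_length": sim_numeric(query.max_length, case.max_length, 100.0),
        "distinct_ratio": sim_numeric(
            query.distinct_ratio, case.distinct_ratio, 1.0
        ),
    }
    return sum(WEIGHTS[k] * local[k] for k in WEIGHTS) / sum(WEIGHTS.values())


def retrieve(query: ColumnCase, case_base: List[ColumnCase], k: int = 1):
    """Return the k most similar cases, best first."""
    ranked = sorted(
        case_base, key=lambda c: global_similarity(query, c), reverse=True
    )
    return ranked[:k]


# Toy case base with invented rule names.
CASE_BASE = [
    ColumnCase("VARCHAR", 50, 0.90, "substitute_street_keep_risk_zone"),
    ColumnCase("DATE", 10, 0.40, "shift_date_within_year"),
    ColumnCase("VARCHAR", 30, 0.95, "substitute_surname_from_list"),
]

query = ColumnCase("VARCHAR", 48, 0.88)
best = retrieve(query, CASE_BASE, k=1)[0]
print(best.masking_rule, round(global_similarity(query, best), 3))
```

The masking rule of the best-matching case serves as the recommendation for the query column. In an evaluation like the one described, that recommendation would be compared with the rule an expert chooses for the same column.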
Notes
1. ETL stands for Extract, Transform, Load.
2. ZUERS is a zoning system that classifies locations by the potential risk of flooding or a similar environmental hazard. The ZUERS zone is an important criterion for calculating the insurance rate, e.g. of a residence insurance.
Acknowledgements
The authors would like to thank the data masking experts of R + V who contributed to this work with their rule recommendations. By providing the gold standard for the evaluation, they were vital to demonstrating the feasibility of the approach. We highly appreciate their time and effort.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Minor, M., Herborn, A., Jordan, D. (2018). Case-Based Data Masking for Software Test Management. In: Cox, M., Funk, P., Begum, S. (eds) Case-Based Reasoning Research and Development. ICCBR 2018. Lecture Notes in Computer Science, vol 11156. Springer, Cham. https://doi.org/10.1007/978-3-030-01081-2_19
DOI: https://doi.org/10.1007/978-3-030-01081-2_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01080-5
Online ISBN: 978-3-030-01081-2
eBook Packages: Computer Science (R0)