Abstract
This work aims to maximise the utility of published data in the partition-based anonymisation of transactional data. We observe that optimising the clustering, i.e. the horizontal partitioning, can significantly improve the utility of the published data without affecting the privacy guarantees. We present a new clustering method with a specially designed distance function that accounts for the effect of sensitive terms in the privacy goal as part of the clustering process. As a result, when the clustering minimises the total intra-cluster distance of the partition, the utility loss is also minimised. We present two algorithms, DocClust and DetK, for clustering transactions and for determining the best number of clusters, respectively.
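To make the objective concrete, the following minimal sketch computes the total intra-cluster distance of a horizontal partition of transactions. It is an illustration under stated assumptions, not the method of this paper: the abstract does not define the distance function, so plain Jaccard distance is used as a stand-in that does not model sensitive terms, and the function names and the toy transactions are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Stand-in distance between two transactions (sets of terms).

    Assumption: the paper's own distance also accounts for sensitive
    terms; this placeholder does not.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def total_intra_cluster_distance(partition):
    """Sum of pairwise distances inside each cluster of the partition."""
    return sum(
        jaccard_distance(a, b)
        for cluster in partition
        for a, b in combinations(cluster, 2)
    )


# Hypothetical toy transactions, each a set of terms.
t0 = frozenset({"milk", "bread"})
t1 = frozenset({"milk", "bread", "eggs"})
t2 = frozenset({"laptop", "mouse"})
t3 = frozenset({"laptop", "mouse", "keyboard"})

# Two candidate partitions with the same cluster sizes.
similar_grouping = [[t0, t1], [t2, t3]]
mixed_grouping = [[t0, t2], [t1, t3]]

similar_cost = total_intra_cluster_distance(similar_grouping)
mixed_cost = total_intra_cluster_distance(mixed_grouping)
print(similar_cost < mixed_cost)
```

Both partitions have clusters of the same size, so under this stand-in they differ only in how similar the grouped transactions are; the partition that groups similar transactions has the smaller total distance, which is the quantity the abstract ties to utility loss.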
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Bewong, M., Liu, J., Liu, L., Li, J. (2017). Utility Aware Clustering for Publishing Transactional Data. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science(), vol 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_38
DOI: https://doi.org/10.1007/978-3-319-57529-2_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57528-5
Online ISBN: 978-3-319-57529-2
eBook Packages: Computer Science (R0)