Utility Aware Clustering for Publishing Transactional Data

Bewong, Michael; Liu, Jixue; Liu, Lin; Li, Jiuyong

doi:10.1007/978-3-319-57529-2_38

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10235)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

This work aims to maximise the utility of published data in partition-based anonymisation of transactional data. We observe that optimising the clustering, i.e. the horizontal partitioning, can significantly improve the utility of the published data without affecting the privacy guarantees. We present a new clustering method with a specially designed distance function that accounts for the effect of sensitive terms in the privacy goal as part of the clustering process. As a result, when the clustering minimises the total intra-cluster distance of the partition, the utility loss is also minimised. We present two algorithms, DocClust and DetK, for clustering transactions and for determining the best number of clusters, respectively.
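The abstract does not give the distance function or the steps of DocClust and DetK, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Jaccard-style distance in which sensitive terms carry a hypothetical extra weight (`penalty`), assigns each transaction to its nearest medoid, and computes the total intra-cluster distance that a clustering of this kind would try to minimise. All function names, the weighting scheme, and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def weighted_jaccard_distance(t1, t2, sensitive, penalty=2.0):
    """Jaccard-style distance between two transactions (sets of terms).

    Assumption: sensitive terms count `penalty` times as much as ordinary
    terms. This weighting is hypothetical, not the paper's distance function.
    """
    def weight(term):
        return penalty if term in sensitive else 1.0

    union = t1 | t2
    if not union:
        return 0.0
    shared = t1 & t2
    return 1.0 - sum(weight(x) for x in shared) / sum(weight(x) for x in union)


def assign_to_medoids(transactions, medoids, sensitive):
    """Horizontal partitioning: put each transaction in the cluster of its nearest medoid."""
    clusters = [[] for _ in medoids]
    for t in transactions:
        nearest = min(
            range(len(medoids)),
            key=lambda i: weighted_jaccard_distance(t, medoids[i], sensitive),
        )
        clusters[nearest].append(t)
    return clusters


def total_intra_cluster_distance(clusters, sensitive):
    """Sum of pairwise distances inside each cluster (the quantity to minimise)."""
    return sum(
        weighted_jaccard_distance(a, b, sensitive)
        for cluster in clusters
        for a, b in combinations(cluster, 2)
    )


# Toy data (hypothetical): four transactions, one sensitive term.
sensitive = {"hiv_test"}
transactions = [
    frozenset({"milk", "bread"}),
    frozenset({"milk", "bread", "hiv_test"}),
    frozenset({"beer", "chips"}),
    frozenset({"beer", "chips", "nuts"}),
]

good = assign_to_medoids(transactions, [transactions[0], transactions[2]], sensitive)
bad = [[transactions[0], transactions[2]], [transactions[1], transactions[3]]]
good_cost = total_intra_cluster_distance(good, sensitive)
bad_cost = total_intra_cluster_distance(bad, sensitive)
```

In this toy example the medoid-based partition groups transactions that share terms, so `good_cost` is lower than `bad_cost`. Choosing the number of clusters, which the abstract attributes to DetK, is not covered by this sketch.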

Author information

Authors and Affiliations

ITMS, University of South Australia, Adelaide, SA, 5095, Australia
Michael Bewong, Jixue Liu, Lin Liu & Jiuyong Li

Corresponding author

Correspondence to Michael Bewong.

Editor information

Editors and Affiliations

Kangwon National University, Chuncheon, Korea (Republic of)
Jinho Kim
Seoul National University, Seoul, Korea (Republic of)
Kyuseok Shim
University of Technology Sydney, Sydney, New South Wales, Australia
Longbing Cao
KAIST, Daejeon, Korea (Republic of)
Jae-Gil Lee
University of New South Wales, Sydney, New South Wales, Australia
Xuemin Lin
Kangwon National University, Chuncheon, Korea (Republic of)
Yang-Sae Moon

About this paper

Cite this paper

Bewong, M., Liu, J., Liu, L., Li, J. (2017). Utility Aware Clustering for Publishing Transactional Data. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_38

DOI: https://doi.org/10.1007/978-3-319-57529-2_38
Published: 23 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57528-5
Online ISBN: 978-3-319-57529-2
eBook Packages: Computer Science, Computer Science (R0)
