DOI: 10.1145/1557019.1557157
research-article

Anonymizing healthcare data: a case study on the blood transfusion service

Published: 28 June 2009

Abstract

Sharing healthcare data has become a vital requirement in healthcare system management; however, inappropriate sharing and use of healthcare data could threaten patients' privacy. In this paper, we study the privacy concerns of the blood transfusion information-sharing system between the Hong Kong Red Cross Blood Transfusion Service (BTS) and public hospitals, and identify the major challenges that make traditional data anonymization methods inapplicable. We then propose a new privacy model called LKC-privacy, together with an anonymization algorithm, to meet the privacy and information requirements of this BTS case. Experiments on real-life data demonstrate that our anonymization algorithm can effectively retain the essential information in the anonymized data for data analysis and scales to large datasets.
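
As an illustration only, the sketch below checks a small table against one common statement of LKC-privacy: for every combination of at most L quasi-identifier values, the matching group must contain at least K records, and no sensitive value may account for more than a fraction C of that group. The abstract does not spell out this definition, so the formulation, the function name `satisfies_lkc`, the attribute names, and the toy records are all assumptions made for the example. It is a brute-force check, not the paper's anonymization algorithm.

```python
from collections import Counter, defaultdict
from itertools import combinations


def satisfies_lkc(records, qid_attrs, sensitive_attr, sensitive_values, L, K, C):
    """Brute-force LKC-privacy check (hypothetical helper, assumed definition).

    For every subset of at most L quasi-identifier attributes, each group of
    records sharing the same values must have at least K members, and each
    sensitive value must make up at most a fraction C of the group.
    """
    for size in range(1, min(L, len(qid_attrs)) + 1):
        for attrs in combinations(qid_attrs, size):
            # Group the sensitive values by the chosen quasi-identifier values.
            groups = defaultdict(list)
            for record in records:
                key = tuple(record[a] for a in attrs)
                groups[key].append(record[sensitive_attr])
            for values in groups.values():
                if len(values) < K:
                    return False  # identity linkage: group is too small
                counts = Counter(values)
                if any(counts[s] / len(values) > C for s in sensitive_values):
                    return False  # attribute linkage: inference too confident
    return True


# Toy records; attribute names and values are made up for this example.
rows = [
    {"job": "A", "sex": "M", "dx": "flu"},
    {"job": "A", "sex": "M", "dx": "hiv"},
    {"job": "A", "sex": "F", "dx": "flu"},
    {"job": "A", "sex": "F", "dx": "flu"},
]

print(satisfies_lkc(rows, ["job", "sex"], "dx", {"hiv"}, L=2, K=2, C=0.5))  # True
print(satisfies_lkc(rows, ["job", "sex"], "dx", {"hiv"}, L=2, K=2, C=0.4))  # False
print(satisfies_lkc(rows, ["job", "sex"], "dx", {"hiv"}, L=2, K=3, C=0.5))  # False
```

The second call fails because half of the records with sex "M" carry the sensitive value, which exceeds C = 0.4; the third fails because that same group has only two records, fewer than K = 3. Enumerating every attribute subset is exponential in L, so a check like this is only practical for small L or small tables.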

Published In

KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
June 2009
1426 pages
ISBN: 9781605584959
DOI: 10.1145/1557019
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. anonymity
  2. classification
  3. healthcare
  4. privacy

Qualifiers

  • Research-article

Conference

KDD09

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Downloads (last 12 months): 42
  • Downloads (last 6 weeks): 5
Reflects downloads up to 27 Feb 2025.

Cited By

  • (2024) Advancements on IoT and AI applied to Pneumology. Microprocessors and Microsystems 108 (105062). DOI: 10.1016/j.micpro.2024.105062. Online publication date: Jul-2024
  • (2024) Algorithm to satisfy l-diversity by combining dummy records and grouping. Security and Privacy 7:3. DOI: 10.1002/spy2.373. Online publication date: 7-Feb-2024
  • (2023) Privacy-preserving analysis of time-to-event data under nested case-control sampling. Statistical Methods in Medical Research 33:1 (96-111). DOI: 10.1177/09622802231215804. Online publication date: 13-Dec-2023
  • (2022) A Survey on Privacy-Preserving Data Publishing Models for Big Data. Handbook of Research on Cyber Law, Data Protection, and Privacy (250-276). DOI: 10.4018/978-1-7998-8641-9.ch015. Online publication date: 2022
  • (2022) Exploring the Utility of Anonymized EHR Datasets in Machine Learning Experiments in the Context of the MODELHealth Project. Applied Sciences 12:12 (5942). DOI: 10.3390/app12125942. Online publication date: 10-Jun-2022
  • (2022) Artificial Intelligence-Enabled IoT-Based Smart Blood Banking System. Proceedings of 2nd International Conference on Artificial Intelligence: Advances and Applications (119-130). DOI: 10.1007/978-981-16-6332-1_12. Online publication date: 14-Feb-2022
  • (2022) Dynamic distributed KCi-slice data publishing model with multiple sensitive attributes. Concurrency and Computation: Practice and Experience 34:21. DOI: 10.1002/cpe.7064. Online publication date: 27-May-2022
  • (2021) Personalized trajectory privacy-preserving method based on sensitive attribute generalization and location perturbation. Intelligent Data Analysis 25:5 (1247-1271). DOI: 10.3233/IDA-205306. Online publication date: 15-Sep-2021
  • (2021) Images in Space and Time. ACM Computing Surveys 54:6 (1-38). DOI: 10.1145/3453657. Online publication date: 13-Jul-2021
  • (2021) Scalable Privacy-Preserving Distributed Extremely Randomized Trees for Structured Data With Multiple Colluding Parties. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2655-2659). DOI: 10.1109/ICASSP39728.2021.9413632. Online publication date: 6-Jun-2021
