Abstract
High-quality medical treatment is unattainable without protecting patients’ medical records and other sensitive information. With the widespread digitization and networking of medical systems, patient privacy has become one of the most critical challenges in the medical industry. “Health data” covers a wide range of information about individuals, including their medical records, treatment records, genetic data, and demographic information. In this paper, we review existing methods for keeping patients’ health records private and compare their advantages and limitations. We then analyze a public medical dataset from the perspective of privacy protection using the k-anonymity and l-diversity models, and compare the impact of different quasi-identifier attributes on privacy protection. Furthermore, we conduct experiments to investigate the trade-off between privacy and utility. Based on the analysis results, this paper gives data owners a guide to choosing which attributes to publish and which privacy-preserving techniques to apply when publishing medical data.
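To make the two models named in the abstract concrete, the following is a minimal sketch of how the k-anonymity and (distinct) l-diversity levels of a table can be measured for a chosen set of quasi-identifiers. It is not the authors' implementation: the function names, the column names (`age`, `zip`, `diagnosis`), and the toy records are all hypothetical, and the records are assumed to be already generalized.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in records:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row)
    return groups


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest equivalence class; the table is k-anonymous for this k."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len(group) for group in groups.values())


def l_diversity(records, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values in any equivalence class."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len({row[sensitive] for row in group}) for group in groups.values())


# Hypothetical, already-generalized toy records (not taken from the paper).
records = [
    {"age": "30-39", "zip": "476**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "476**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "479**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "479**", "diagnosis": "diabetes"},
]

k = k_anonymity(records, ["age", "zip"])
l = l_diversity(records, ["age", "zip"], "diagnosis")
print(f"k = {k}, l = {l}")
```

In this toy table every quasi-identifier combination occurs twice, so k is 2, but the second group contains a single diagnosis, so l is 1. This illustrates why k-anonymity alone may not prevent attribute disclosure, and why the choice of quasi-identifiers affects both measures.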
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jahan, S., Ge, YF., Kabir, E., Wang, H. (2023). Analysis and Protection of Public Medical Dataset: From Privacy Perspective. In: Li, Y., Huang, Z., Sharma, M., Chen, L., Zhou, R. (eds) Health Information Science. HIS 2023. Lecture Notes in Computer Science, vol 14305. Springer, Singapore. https://doi.org/10.1007/978-981-99-7108-4_7
DOI: https://doi.org/10.1007/978-981-99-7108-4_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7107-7
Online ISBN: 978-981-99-7108-4
eBook Packages: Computer Science (R0)