Paper

Authors: Takayuki Miura¹; Eizen Kimura²; Atsunori Ichikawa¹; Masanobu Kii¹ and Juko Yamamoto¹

Affiliations: ¹ NTT Social Informatics Laboratories, Tokyo, Japan; ² Dept. Medical Informatics, Medical School of Ehime Univ., Ehime, Japan

Keyword(s): Synthetic Data Generation, Differential Privacy, Real-World Data.

Abstract: Anticipation surrounds the use of real-world data for data analysis in medicine and healthcare, yet handling sensitive data demands ethical review and safety management, presenting bottlenecks in the swift progression of research. Consequently, numerous techniques have emerged for generating synthetic data, which preserves the features of the original data. Nonetheless, the quality of such synthetic data, particularly in the context of real-world data, has yet to be sufficiently examined. In this paper, we conduct experiments with a Diagnosis Procedure Combination (DPC) dataset to evaluate the quality of synthetic data generated by statistics-based, graphical-model-based, and deep-neural-network-based methods. Further, we implement differential privacy for theoretical privacy protection and assess the resultant degradation of data quality. The findings indicate that a statistics-based method called Gaussian Copula and a graphical-model-based method called AIM yield high-quality synthetic data regarding statistical similarity and machine learning model performance. The paper also summarizes issues pertinent to the practical application of synthetic data derived from the experimental results.

CC BY-NC-ND 4.0

Paper citation in several formats:
Miura, T.; Kimura, E.; Ichikawa, A.; Kii, M. and Yamamoto, J. (2024). Evaluating Synthetic Data Generation Techniques for Medical Dataset. In Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - HEALTHINF; ISBN 978-989-758-688-0; ISSN 2184-4305, SciTePress, pages 315-322. DOI: 10.5220/0012314500003657

@conference{healthinf24,
author={Takayuki Miura and Eizen Kimura and Atsunori Ichikawa and Masanobu Kii and Juko Yamamoto},
title={Evaluating Synthetic Data Generation Techniques for Medical Dataset},
booktitle={Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - HEALTHINF},
year={2024},
pages={315-322},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012314500003657},
isbn={978-989-758-688-0},
issn={2184-4305},
}

TY - CONF

JO - Proceedings of the 17th International Joint Conference on Biomedical Engineering Systems and Technologies - HEALTHINF
TI - Evaluating Synthetic Data Generation Techniques for Medical Dataset
SN - 978-989-758-688-0
IS - 2184-4305
AU - Miura, T.
AU - Kimura, E.
AU - Ichikawa, A.
AU - Kii, M.
AU - Yamamoto, J.
PY - 2024
SP - 315
EP - 322
DO - 10.5220/0012314500003657
PB - SciTePress