DOI: 10.1145/3534678.3539020
research-article

Generative Adversarial Networks Enhanced Pre-training for Insufficient Electronic Health Records Modeling

Published: 14 August 2022

Abstract

In recent years, automatic computational systems based on deep learning have been widely used in medical fields for tasks such as automatic diagnosis and disease prediction. Most of these systems are designed for scenarios with sufficient data. However, because of disease rarity or privacy constraints, medical data are often insufficient. Applying data-hungry deep learning models to insufficient data is likely to cause over-fitting and serious performance problems. Many data augmentation methods have been proposed to address data insufficiency, such as using GANs (Generative Adversarial Networks) to generate training data. However, the augmented data usually contain substantial noise, and training sensitive medical models directly on them rarely achieves satisfactory results.
To overcome this problem, we propose a novel deep model learning method for modeling insufficient EHR (Electronic Health Record) data, named GRACE, which stands for GeneRative Adversarial networks enhanCed prE-training. In this method, we propose an item-relation-aware GAN that captures changing trends and correlations among data in order to generate high-quality EHR records. Furthermore, we design a pre-training mechanism, consisting of a masked records prediction task and a real-fake contrastive learning task, to learn representations of EHR data from both generated and real data. After pre-training, only the representations of real data are used to train the final prediction model. In this way, we can fully exploit the useful information in generated data through pre-training while avoiding the problems caused by training the final prediction model directly on noisy generated data. We evaluate the proposed method with extensive experiments on three real-world healthcare datasets and also deploy it in a maternal and child health care hospital for an online test. Both offline and online experimental results demonstrate its effectiveness. We believe doctors and patients can benefit from our learning method in various healthcare-related applications.
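The data flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `MASK` placeholder, and the 0.15 masking rate are all assumptions made for illustration, and the GAN, the encoder, and the contrastive loss are omitted. The sketch only shows how records might be hidden for the masked records prediction task, how real and generated sequences could be labeled for the real-fake task, and how generated sequences are dropped before the final prediction model is trained.

```python
import random

MASK = None  # hypothetical placeholder standing in for a hidden record


def mask_records(sequence, mask_rate=0.15, rng=None):
    """Hide a random subset of records in one patient sequence.

    Returns the masked sequence and a dict mapping each masked position
    to the original record, which a model would be trained to predict.
    The 0.15 default is an assumed value, not taken from the paper.
    """
    rng = rng or random.Random()
    masked, targets = [], {}
    for i, record in enumerate(sequence):
        if rng.random() < mask_rate:
            masked.append(MASK)
            targets[i] = record
        else:
            masked.append(record)
    return masked, targets


def build_pretraining_set(real, generated):
    """Label each sequence 1 (real) or 0 (generated) for the real-fake task."""
    return [(seq, 1) for seq in real] + [(seq, 0) for seq in generated]


def build_finetuning_set(pretraining_set):
    """Keep only real sequences for training the final prediction model."""
    return [seq for seq, is_real in pretraining_set if is_real == 1]


if __name__ == "__main__":
    # Toy sequences: each record is a list of two numeric items.
    real = [[[1.0, 2.0], [1.5, 2.5], [2.0, 3.0]]]
    generated = [[[0.9, 2.1], [1.4, 2.4], [2.2, 3.1]]]
    pretraining_set = build_pretraining_set(real, generated)
    masked, targets = mask_records(real[0], mask_rate=0.5, rng=random.Random(0))
    print(masked, targets)
    print(build_finetuning_set(pretraining_set))
```

Keeping the real-or-generated label next to each sequence makes the final filtering step explicit: generated data contributes only to pre-training, which is the separation the abstract describes.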


    Published In

    KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2022
    5033 pages
    ISBN:9781450393850
    DOI:10.1145/3534678
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. healthcare informatics
    2. pre-training
    3. representation learning

    Qualifiers

    • Research-article

    Funding Sources

    • The Fundamental Research Funds for the Central Universities
    • The National Key R&D Program of China
    • The National Natural Science Foundation of China

    Conference

    KDD '22
    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


    Cited By

    • (2024) Fuzzy Multiview Graph Learning on Sparse Electronic Health Records. IEEE Transactions on Fuzzy Systems 32:10 (5520-5532). DOI: 10.1109/TFUZZ.2024.3415730. Online publication date: Oct-2024
    • (2024) Multi-Channel Hypergraph Network for Sequential Diagnosis Prediction in Healthcare. 2024 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD) (2937-2942). DOI: 10.1109/CSCWD61410.2024.10580191. Online publication date: 8-May-2024
    • (2024) LMKG. Knowledge-Based Systems 284:C. DOI: 10.1016/j.knosys.2023.111323. Online publication date: 17-Apr-2024
    • (2023) Applications of Artificial Intelligence for Health Informatics: A Systematic Review. Journal of Artificial Intelligence for Medical Sciences 4:2 (19-46). DOI: 10.55578/joaims.230920.001. Online publication date: 2023
    • (2023) Continuous trajectory generation based on two-stage GAN. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence (4374-4382). DOI: 10.1609/aaai.v37i4.25557. Online publication date: 7-Feb-2023
    • (2023) Fusion of Dynamic Hypergraph and Clinical Event for Sequential Diagnosis Prediction. 2023 IEEE 29th International Conference on Parallel and Distributed Systems (ICPADS) (1620-1627). DOI: 10.1109/ICPADS60453.2023.00227. Online publication date: 17-Dec-2023
    • (2022) RSD. Proceedings of the 31st ACM International Conference on Information & Knowledge Management (1675-1684). DOI: 10.1145/3511808.3557440. Online publication date: 17-Oct-2022
