GAGIN: generative adversarial guider imputation network for missing data

Wang, Wei; Chai, Yimeng; Li, Yue

doi:10.1007/s00521-021-06862-2

Original Article
Published: 08 January 2022

Volume 34, pages 7597–7610, (2022)
Neural Computing and Applications

Wei Wang^1,2,
Yimeng Chai^1,3,4 &
Yue Li^1,2,3

Abstract

Missing data imputation aims to accurately fill the unobserved regions of real-world data with plausible values. Although many current methods have made remarkable advances, two issues remain the most challenging: handling local homogeneous regions, especially at boundaries, and ensuring the rationality of the imputed data. To address these issues, we propose a novel Generative Adversarial Guider Imputation Network (GAGIN), based on the generative adversarial network (GAN), for unsupervised imputation. It is composed of a Global-Impute-Net (GIN), a Local-Impute-Net (LIN), and an Impute Guider Model (IGM). The GIN looks at the entire missing regions to generate and impute data as a whole. To account for the rationality of the GIN results, the IGM captures coherent information between the global and local views and guides the LIN to look only at a small area centered on the focused missing regions. After these three modules have run, the local imputed results are concatenated with the global imputed results, which yields rational imputed values and refines the local details from rough to accurate. Comprehensive experiments demonstrate that the proposed method is significantly superior to three other state-of-the-art approaches and seven traditional methods; its best RMSE surpasses that of the second-best method on both the numeric datasets (by 17.3%) and the image dataset (by 24.1%). In addition, an extensive ablation study validates the method's superior performance on missing data imputation.
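The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the data flow it describes, not the authors' code. It assumes that a global estimate and a local estimate have already been produced by some models, that a boolean mask marks observed entries, and that a second boolean array marks the small local region to refine. The function names (merge_imputations, rmse_on_missing, relative_improvement) and the toy numbers are hypothetical, and the relative-improvement formula is one plausible reading of how a figure such as 17.3% could be computed.

```python
import numpy as np


def merge_imputations(x, mask, global_est, local_est, local_region):
    """Keep observed values; fill missing entries from the local estimate
    inside local_region and from the global estimate elsewhere.

    mask is True where x is observed (assumed convention).
    """
    filled = np.where(local_region, local_est, global_est)
    return np.where(mask, x, filled)


def rmse_on_missing(x_true, x_imputed, mask):
    """RMSE computed only over the entries that were missing."""
    missing = ~mask
    diff = x_true[missing] - x_imputed[missing]
    return float(np.sqrt(np.mean(diff ** 2)))


def relative_improvement(rmse_best, rmse_second):
    """Fraction by which the best RMSE undercuts the second-best RMSE."""
    return (rmse_second - rmse_best) / rmse_second


# Toy data (hypothetical): two of the four entries are missing.
x_true = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.array([[True, False], [False, True]])
x_obs = np.where(mask, x_true, 0.0)

global_est = np.full((2, 2), 2.5)                    # coarse, whole-sample estimate
local_est = np.array([[1.0, 2.2], [3.0, 4.0]])       # refined estimate
local_region = np.array([[False, True], [False, False]])

imputed = merge_imputations(x_obs, mask, global_est, local_est, local_region)
score = rmse_on_missing(x_true, imputed, mask)
```

In this sketch the observed entries always pass through unchanged, so the error metric only reflects the imputed positions; that choice follows common practice in GAN-based imputation rather than anything stated in the abstract.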

Acknowledgements

This work was supported by the Qian Xuesen Laboratory of Space Technology, CAST (GZZKFJJ2020002), and the National Key Research and Development Program of China under grant number 2018hjyzkfkt-002.

Author information

Authors and Affiliations

College of Computer Science, Nankai University, Tianjin, 300350, People’s Republic of China
Wei Wang, Yimeng Chai & Yue Li
Key Laboratory for Medical Data Analysis and Statistical Research of Tianjin (KLMDASR), Tianjin, 300350, People’s Republic of China
Wei Wang & Yue Li
Trusted AI System Laboratory, College of Cyber Science, Nankai University, Tianjin, 300350, People’s Republic of China
Yimeng Chai & Yue Li
Tianjin Key Laboratory of Network and Data Security Technology, Tianjin, 300350, People’s Republic of China
Yimeng Chai

Authors

Wei Wang
Yimeng Chai
Yue Li

Contributions

Wei Wang was involved in supervision and project administration. Yimeng Chai was involved in methodology, software, and writing—original draft. Yue Li was involved in conceptualization, methodology, and writing—review & editing.

Corresponding author

Correspondence to Yue Li.

Ethics declarations

Conflict of Interest

We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work, and there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, W., Chai, Y. & Li, Y. GAGIN: generative adversarial guider imputation network for missing data. Neural Comput & Applic 34, 7597–7610 (2022). https://doi.org/10.1007/s00521-021-06862-2

Received: 23 April 2021
Accepted: 12 December 2021
Published: 08 January 2022
Issue Date: May 2022
DOI: https://doi.org/10.1007/s00521-021-06862-2
