
Complete autoencoders for classification with missing values

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

It has been demonstrated that modified stacked denoising autoencoders (MSDAEs) can implement high-performance missing-value imputation schemes. On the other hand, complete MSDAE (CMSDAE) classifiers, which extend their inputs with target estimates produced by an auxiliary classifier and are trained layer by layer to recover both the observation and the target estimates, offer better classification results than MSDAEs. Consequently, investigating whether CMSDAEs can also improve MSDAE-based imputation has obvious practical importance. In this correspondence, two types of imputation mechanisms with CMSDAEs are considered. The first is a direct procedure in which the CMSDAE output is just the target. The second, suggested by the presence of the targets in the vectors to be autoencoded, applies well-known multitask learning (MTL) ideas, treating the observations as a secondary task. Experimental results show that these CMSDAE structures increase the quality of the missing-value imputations, the MTL versions in particular: they give the best result in five out of six missing-value problems.
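To make the underlying idea concrete, the following is a minimal sketch of denoising-autoencoder imputation in general, not of the authors' MSDAE/CMSDAE architectures: a single-hidden-layer autoencoder is trained on synthetic data to recover clean observations from zero-masked inputs, and the trained network then fills in missing entries. All names, dimensions, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples of 4 correlated features driven by one latent factor
# (a hypothetical stand-in for a real data set with missing values).
n, d = 200, 4
z = rng.normal(size=(n, 1))
coef = np.array([[1.0, -0.8, 0.6, 1.2]])        # assumed factor loadings
X = z @ coef + 0.1 * rng.normal(size=(n, d))

# One-hidden-layer autoencoder parameters.
h, lr = 3, 0.05
W1 = 0.1 * rng.normal(size=(d, h)); b1 = np.zeros(h)
W2 = 0.1 * rng.normal(size=(h, d)); b2 = np.zeros(d)

def forward(Xc):
    """Encode a (possibly corrupted) input and decode a reconstruction."""
    H = np.tanh(Xc @ W1 + b1)
    return H, H @ W2 + b2

# Denoising training: zero-mask about 20 % of the entries and ask the
# network to recover the clean observation (squared error, full-batch GD).
for _ in range(2000):
    keep = rng.random(X.shape) > 0.2
    Xc = X * keep
    H, Xhat = forward(Xc)
    err = Xhat - X                               # dLoss/dXhat (up to a constant)
    gW2, gb2 = H.T @ err / n, err.mean(0)
    dH = (err @ W2.T) * (1.0 - H ** 2)           # backprop through tanh
    gW1, gb1 = Xc.T @ dH / n, dH.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

def impute(x_obs, miss):
    """Zero-fill missing entries, reconstruct, and keep the network's
    output only where values were actually missing."""
    _, xhat = forward(np.where(miss, 0.0, x_obs))
    return np.where(miss, xhat, x_obs)
```

Masking one feature across the whole data set and comparing the mean absolute imputation error against the zero-fill baseline shows the gain from the learned reconstruction. A CMSDAE, as described in the abstract, would additionally append target estimates from an auxiliary classifier to each input vector and recover them alongside the observation.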



Acknowledgements

This work has been partially supported by Network of Excellence MAPAS (TIN2017-90567-REDT, Mº Ciencia, Inn. y Univ.) and Grant 2-BARBAS (BBVA Foundation).

Author information


Corresponding author

Correspondence to Adrián Sánchez-Morales.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Sánchez-Morales, A., Sancho-Gómez, JL. & Figueiras-Vidal, A.R. Complete autoencoders for classification with missing values. Neural Comput & Applic 33, 1951–1957 (2021). https://doi.org/10.1007/s00521-020-05066-4

