Auto-encoder-based generative models for data augmentation on regression problems

Abstract

Auto-encoder-based generative models have recently been used widely and successfully for image processing, but few studies have addressed the realization of continuous input–output mappings for regression problems. A lack of sufficient training data plagues regression problems; it is a notable problem in machine learning and limits its application in materials science. We address the issue of small data size in regression problems by using variational auto-encoders (VAEs), popular and powerful auto-encoder-based generative models that use multilayer neural networks to generate sample data, for data augmentation. In this study, we demonstrate the effectiveness of multi-task learning (joint auto-encoding and regression tasks) for regression problems. We conducted experiments on seven benchmark datasets and on one ionic conductivity dataset as an application in materials science. The results show that multi-task learning for VAEs improved the generalization performance of a multivariable linear regression model trained with the augmented data.
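A minimal sketch of this approach is given below, assuming a PyTorch and scikit-learn implementation (the paper does not specify a framework): a VAE whose encoder, decoder, and auxiliary regression head are trained jointly on the auto-encoding and regression tasks, after which synthetic samples decoded from the prior augment the training set for a linear regression model. The layer sizes, the loss weight alpha, and the use of the regression head to label synthetic inputs are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.linear_model import LinearRegression

class MultiTaskVAE(nn.Module):
    """VAE with an auxiliary regression head on the latent code."""
    def __init__(self, x_dim, z_dim=4, h_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)      # posterior mean
        self.logvar = nn.Linear(h_dim, z_dim)  # posterior log-variance
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))
        self.reg = nn.Linear(z_dim, 1)         # regression-task head

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(z), self.reg(z), mu, logvar

def multitask_loss(x, y, x_hat, y_hat, mu, logvar, alpha=1.0):
    recon = F.mse_loss(x_hat, x, reduction='sum')                  # auto-encoding task
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # KL regularizer
    regress = F.mse_loss(y_hat.squeeze(-1), y, reduction='sum')    # regression task
    return recon + kld + alpha * regress

def augment_and_fit(X, y, n_synthetic=200, epochs=500):
    """Train the multi-task VAE on (X, y), generate synthetic pairs by
    decoding z ~ N(0, I), and fit a linear model on real + synthetic data."""
    X_t = torch.as_tensor(X, dtype=torch.float32)
    y_t = torch.as_tensor(y, dtype=torch.float32)
    model = MultiTaskVAE(X_t.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        x_hat, y_hat, mu, logvar = model(X_t)
        multitask_loss(X_t, y_t, x_hat, y_hat, mu, logvar).backward()
        opt.step()
    with torch.no_grad():
        z = torch.randn(n_synthetic, model.reg.in_features)  # sample the prior
        X_syn, y_syn = model.dec(z), model.reg(z).squeeze(-1)
    X_aug = torch.cat([X_t, X_syn]).numpy()
    y_aug = torch.cat([y_t, y_syn]).numpy()
    return LinearRegression().fit(X_aug, y_aug)
```

The point of the joint objective is that the regression loss shapes the latent space alongside the reconstruction and KL terms, so decoding samples of z yields synthetic input–output pairs rather than inputs alone.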


Acknowledgements

The author would like to thank Dr. Nobuko Ohba for preparing the ionic conductivity data, and the anonymous reviewers for their constructive comments on the manuscript.

Author information

Corresponding author

Correspondence to Hiroshi Ohno.

Ethics declarations

Conflict of interest

The author declares that there is no conflict of interest regarding the publication of this article.

Human and animal rights

This article does not contain any studies with human participants or animals performed by the author.

Additional information

Communicated by Mu-Yen Chen.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ohno, H. Auto-encoder-based generative models for data augmentation on regression problems. Soft Comput 24, 7999–8009 (2020). https://doi.org/10.1007/s00500-019-04094-0
