Abstract
In this work we propose an ℓp-norm data-fidelity term for training the autoencoder. Usually the Euclidean distance is used for this purpose; we generalize the ℓ2-norm to the ℓp-norm, where smaller values of p make the problem robust to outliers. The ensuing optimization problem is solved using the Augmented Lagrangian approach. The proposed ℓp-norm autoencoder has been tested on benchmark deep learning datasets: MNIST, CIFAR-10 and SVHN. The proposed robust autoencoder yields better results than the standard (ℓ2-norm) autoencoder and a deep belief network on all of these problems.
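The abstract's central claim — that smaller p makes the data-fidelity term robust to outliers — can be checked numerically. The sketch below is ours, not the authors' code (the function name `lp_fidelity` and the toy data are our own assumptions); it shows that a single corrupted coordinate inflates the ℓ2 loss far more than the ℓ1 loss.

```python
import numpy as np

def lp_fidelity(x, x_hat, p=2.0, eps=1e-8):
    """l_p-norm data-fidelity term: sum_i |x_i - x_hat_i|^p.
    Smaller p down-weights large residuals, so outliers count less.
    eps avoids numerical issues at 0 when p < 1."""
    r = np.abs(x - x_hat) + eps
    return np.sum(r ** p)

# Target signal and two reconstructions: a uniformly close one,
# and one that is identical except for a single gross outlier.
x = np.zeros(10)
clean = np.full(10, 0.1)
corrupt = clean.copy()
corrupt[0] = 5.0  # one corrupted coordinate

for p in (2.0, 1.0):
    ratio = lp_fidelity(x, corrupt, p) / lp_fidelity(x, clean, p)
    print(f"p={p}: outlier inflates loss by factor {ratio:.1f}")
```

With p = 2 the outlier inflates the loss by roughly two orders of magnitude, while with p = 1 the inflation is only a few-fold, which is why gradient updates driven by an ℓp loss with small p are pulled around less by corrupted samples.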
© 2016 Springer International Publishing AG
Cite this paper
Mehta, J., Gupta, K., Gogna, A., Majumdar, A., Anand, S. (2016). Stacked Robust Autoencoder for Classification. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science(), vol 9949. Springer, Cham. https://doi.org/10.1007/978-3-319-46675-0_66
Print ISBN: 978-3-319-46674-3
Online ISBN: 978-3-319-46675-0