Elsevier

Neurocomputing

Volume 266, 29 November 2017, Pages 325-335

Deep learning algorithms for discriminant autoencoding

https://doi.org/10.1016/j.neucom.2017.05.042

Abstract

In this paper, a new family of Autoencoders (AE) is proposed for dimensionality reduction as well as class discrimination. Various class-separating methods translate the reconstructed data so that the classes become better separated. The result of this combination is a new type of Discriminant Autoencoder, in which the reconstruction targets are shifted in space in a discriminative fashion. The proposed Discriminant AE is experimentally compared to the standard Denoising AE on the challenging classification tasks of handwritten digit recognition and facial expression recognition, as well as on the CIFAR10 dataset.

Introduction

Classification is one of the most important problems in the field of Machine Learning, and many kinds of algorithms have been developed for this purpose, from Linear Classifiers [1] and shallow or deep Neural Networks [2], [3] to Support Vector Machines [4], [5], among others.

Many data preprocessing techniques have been proposed to improve the classification process. A common such technique is Dimensionality Reduction, i.e., the process of reducing the number of random variables, or features, used to represent the data. Perhaps the best-known unsupervised method used for this purpose is Principal Component Analysis (PCA), a linear dimensionality reduction method. There are, however, non-linear methods used for the same purpose, such as kernel-based versions of the well-known linear dimensionality reduction methods (e.g., kPCA [6]), or neural network based methods such as Autoencoders [7], [8]. In addition to the reduction in dimension, Autoencoders, as Deep Learning algorithms, attempt to capture and unveil their input’s most robust features.
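To make the PCA baseline mentioned above concrete, the following is a minimal NumPy sketch of linear dimensionality reduction via the SVD of the centered data matrix (a textbook illustration, not code from the paper; the toy data and dimensions are our own):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project data onto its top principal components."""
    Xc = X - X.mean(axis=0)                      # PCA assumes centered data
    # Rows of Vt are the principal axes, ordered by singular value
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # low-dimensional representation

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                    # toy data: 100 samples, 5 features
Z = pca_reduce(X, 2)                             # reduced to 2 dimensions
print(Z.shape)                                   # (100, 2)
```

Because the singular values are returned in descending order, the first retained component carries at least as much variance as the second.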

Another class of dimensionality reduction methods proposed to improve classification is the supervised one, e.g., Linear Discriminant Analysis (LDA) [9], which is related to data separation. Such methods attempt to move samples of the same class closer to each other while simultaneously moving them away from samples of rival classes. Usually, LDA is performed along with PCA, to achieve lower dimensions and better-separated data at the same time. Intuitively, if we manage to move samples away from rival neighboring samples and towards samples of the same class, then we have achieved data separation.

Many other techniques have been proposed for the same purpose [10]. In this paper, we propose a new family of Autoencoders as Deep Learning Algorithms and as tools for dimensionality reduction, combined with various methods which achieve data separation. Ultimately, we attempt to combine these techniques under a common framework, in order to achieve dimensionality reduction and data separation using Neural Networks. The proposed learning machines are called Discriminant Autoencoders.

The manuscript is organized as follows. The deep autoencoders and the proposed family of Discriminant Autoencoders are described in Sections 2 and 3, respectively. In Section 4 we describe the experiments conducted to measure the effectiveness of the Discriminant Autoencoder and present the obtained results. Finally, our conclusions are summarized in Section 5.

Section snippets

Autoencoders

Autoencoders are unsupervised Artificial Neural Networks trained to reconstruct their input through an intermediate representation, typically of lower dimension. A very simple Autoencoder has a structure similar to a Multilayer Perceptron (MLP) [11], [12], where the output targets are the input itself. The network can then be trained effectively using a variation of the Backpropagation algorithm [12], so that it reconstructs its input with great precision, or in other words, its…
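The training loop described above can be sketched in a few lines of NumPy: a single bottleneck layer trained by gradient descent on the mean squared reconstruction error. This is an illustrative toy (our own data, sizes, and learning rate; the paper's architectures and hyperparameters differ):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                  # toy input data
W1 = rng.normal(scale=0.1, size=(8, 3))        # encoder: 8 -> 3 (bottleneck)
W2 = rng.normal(scale=0.1, size=(3, 8))        # decoder: 3 -> 8

def recon_loss(W1, W2):
    H = np.tanh(X @ W1)                        # hidden representation
    return ((H @ W2 - X) ** 2).mean()          # mean squared reconstruction error

init_loss = recon_loss(W1, W2)
lr = 0.05
for _ in range(500):
    H = np.tanh(X @ W1)
    E = H @ W2 - X                             # reconstruction error
    # Backpropagation: error through the decoder, then the tanh encoder
    gW2 = H.T @ E / len(X)
    gW1 = X.T @ ((E @ W2.T) * (1 - H ** 2)) / len(X)
    W2 -= lr * gW2
    W1 -= lr * gW1

final_loss = recon_loss(W1, W2)
print(final_loss < init_loss)                  # training reduces the error
```

Since the targets are the inputs themselves, no labels are needed, which is what makes the basic Autoencoder unsupervised.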

Discriminant autoencoders

The proposed Discriminant Autoencoders try to increase the intra-class compactness as well as the inter-class separability. To this end, we set new targets for the reconstructed samples that improve specific compactness and separability criteria usually used in supervised dimensionality reduction. That is, we employ discriminant analysis criteria, described in Section 3.2, under the Graph Embedding framework [14], in the iterative optimization of the designed autoencoders.
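The paper derives its reconstruction targets from graph-embedding criteria (Section 3.2). As a purely illustrative simplification (our own stand-in, not the authors' formula), target shifting can be sketched as pulling each sample toward its own class mean and pushing it away from the nearest rival class mean; the `alpha` and `beta` weights below are hypothetical:

```python
import numpy as np

def shifted_targets(X, y, alpha=0.3, beta=0.3):
    """Illustrative discriminant targets: pull toward the own-class mean,
    push away from the nearest rival-class mean."""
    classes = np.unique(y)
    means = {c: X[y == c].mean(axis=0) for c in classes}
    T = np.empty_like(X)
    for i, (x, c) in enumerate(zip(X, y)):
        # Nearest rival class mean for this sample
        rival = min((r for r in classes if r != c),
                    key=lambda r: np.linalg.norm(x - means[r]))
        T[i] = x + alpha * (means[c] - x) - beta * (means[rival] - x)
    return T

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(3, 1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
T = shifted_targets(X, y)
# The shifted targets are better separated than the original samples
d_before = np.linalg.norm(X[y == 0].mean(0) - X[y == 1].mean(0))
d_after = np.linalg.norm(T[y == 0].mean(0) - T[y == 1].mean(0))
print(d_after > d_before)                      # True
```

Training an autoencoder to reconstruct such shifted targets, rather than the inputs themselves, is what turns the unsupervised AE into a supervised, discriminant one.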

Experimental results

In this section, we evaluate the proposed discriminant autoencoding framework on various datasets. The first set of experiments uses the well-known MNIST dataset [34]. The second set of experiments is applied to three different facial expression datasets: BU [35], JAFFE [36] and KANADE [37], [38], while the last set is performed on the CIFAR10 dataset [39]. Some samples from the MNIST dataset are shown in Fig. 11, from CIFAR10 in Fig. 12, from the JAFFE dataset in Fig. 13 and from the…

Conclusion

In this paper, a novel family of Autoencoders that take into account class information as well as the local geometry of the data has been proposed. The class information is used to define new discriminant reconstruction targets, and the resulting learning machines are called Discriminant Autoencoders. The proposed dAEs are able to encode information for reconstruction as well as for class separability, leading to improved representations in the hidden layer. By combining the…

Paraskevi Nousi obtained her B.Sc. in Informatics in 2014 from Aristotle University of Thessaloniki, Greece. She is currently pursuing her Ph.D. studies in the Artificial Intelligence and Information Analysis Laboratory in the Department of Informatics at the University of Thessaloniki. Her research interests include Machine Learning and Computational Intelligence.

References (40)

  • M.-F.F. Balcan et al.

    Distributed k-means and k-median clustering on general topologies

    Proceedings of the Advances in Neural Information Processing Systems

    (2013)
  • G.-X. Yuan et al.

    Recent advances of large-scale linear classification

    Proc. IEEE

    (2012)
  • F. Rosenblatt

    Principles of Neurodynamics

    (1962)
  • M.L. Minsky et al.

    Perceptrons, Expanded Edition: An Introduction to Computational Geometry

    (1987)
  • C. Cortes et al.

    Support-vector networks

    Mach. Learn.

    (1995)
  • B. Schölkopf et al.

    Advances in Kernel Methods: Support Vector Learning

    (1999)
  • B. Schölkopf et al.

    Nonlinear component analysis as a kernel eigenvalue problem

    Neural Comput.

    (1998)
  • Y. Bengio

    Learning deep architectures for AI

    Found. Trends Mach. Learn.

    (2009)
  • G.E. Hinton et al.

    Reducing the dimensionality of data with neural networks

    Science

    (2006)
  • R.A. Fisher

    The use of multiple measurements in taxonomic problems

    Ann. Eugen.

    (1936)
  • Y. Bengio et al.

    Representation learning: a review and new perspectives

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2013)
  • R.P. Lippmann

    An introduction to computing with neural nets

    ASSP Mag. IEEE

    (1987)
  • D.E. Rumelhart et al.

    Learning internal representations by error propagation

    Technical Report

    (1985)
  • P. Vincent et al.

    Extracting and composing robust features with denoising autoencoders

    Proceedings of the 25th International Conference on Machine Learning

    (2008)
  • S. Yan et al.

    Graph embedding and extensions: a general framework for dimensionality reduction

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2007)
  • F. Schroff et al.

    Facenet: a unified embedding for face recognition and clustering

    Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

    (2015)
  • M. Kan et al.

    Stacked progressive auto-encoders (SPAE) for face recognition across poses

    Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

    (2014)
  • J. Snoek et al.

    On nonparametric guidance for learning autoencoder representations

    Proceedings of the International Conference on Artificial Intelligence and Statistics AISTATS

    (2012)
  • S. Rifai et al.

    Disentangling factors of variation for facial expression recognition

    Computer Vision–ECCV 2012

    (2012)
  • J.T. Rolfe et al.

    Discriminative recurrent sparse auto-encoders

    International Conference on Learning Representations (ICLR)

    (2013)
Cited by (24)

    • Autoencoder-driven spiral representation learning for gravitational wave surrogate modelling

      2022, Neurocomputing
      Citation Excerpt:

      And secondly, the addition of the learnable spiral module to the neural networks, which, as demonstrated in Section 4, leads to surrogate waveforms with better mismatches to the ground truth waveforms, while being faster than equivalent architectures, which do not use the module. Due to their ability to extract semantically meaningful representations without the use of labels, AEs have been widely studied for a variety of tasks, including clustering [51,44], classification [42,52], and image retrieval [53,54]. Recently, artificial neural networks have been gaining momentum in the field of gravitational wave astronomy, and specifically in surrogate modelling of fiducial waveform models.

    • Representation learning and retrieval

      2022, Deep Learning for Robot Perception and Cognition
    • Deep autoencoders for attribute preserving face de-identification

      2020, Signal Processing: Image Communication
      Citation Excerpt:

      In contrast to the above methods, in this work Autoencoders [22] are used, and it is shown experimentally that they are tools capable of achieving good de-identification performance, while producing high quality reconstructions. Deep autoencoders have recently been proposed to solve tasks such as one-shot face recognition [23], unconstrained face recognition [24,25], face alignment [26] and face hallucination [27] among others. The features extracted by deep AEs have been shown time and time again to be significantly useful to such tasks.

    • Discriminative clustering using regularized subspace learning

      2019, Pattern Recognition
      Citation Excerpt:

      The ability of the proposed method to provide useful representations for other unsupervised tasks, e.g., information retrieval [4,72], or data visualization [55], can be examined. Also, the optimization objective used in this paper can be also combined with other deep clustering architectures, e.g., [73,74], to allow for learning deep regularized models for clustering tasks. Finally, the regularization parameters αintra and αinter can be adaptively chosen based on the confidence/probability for each sample to belong to each cluster, possibly further improving the clustering performance.

    • Manifold regularized stacked denoising autoencoders with feature selection

      2019, Neurocomputing
      Citation Excerpt:

      Feng et al. [36] proposed a new method for graph and autoencoder-based feature selection (GAFS). Nousi and Tefas [37] developed a new type of discriminant autoencoder, in which the targets are shifted in space in a discriminative fashion. Zhang et al. [38] proposed a new feature learning method called local deep-feature alignment (LDFA) for dimension reduction and feature learning.

    • Fast Deep Convolutional Face Detection in the Wild Exploiting Hard Sample Mining

      2018, Big Data Research
      Citation Excerpt:

      The model consists of three convolutional layers each followed by a max pooling layer and the last pooling layer is followed by a fully connected layer, whose output comprises the input of the three aforementioned branches. Amongst other deep learning strategies, Denoising Autoencoders [49] are comprised of fully connected layers and they are typically used for unsupervised pretraining as well as dimensionality reduction, leading to more compact and robust representations of data. Face detection is a supervised visual learning task where convolutional deep approaches dominate in terms of performance and speed.


    Anastasios Tefas received the B.Sc. in informatics in 1997 and the Ph.D. degree in informatics in 2002, both from the Aristotle University of Thessaloniki, Greece. Since 2017 he has been an Associate Professor at the Department of Informatics, Aristotle University of Thessaloniki. From 2008 to 2017, he was a Lecturer, Assistant Professor at the same University. From 2006 to 2008, he was an Assistant Professor at the Department of Information Management, Technological Institute of Kavala. From 2003 to 2004, he was a temporary lecturer in the Department of Informatics, University of Thessaloniki. From 1997 to 2002, he was a researcher and teaching assistant in the Department of Informatics, University of Thessaloniki. Dr. Tefas participated in 15 research projects financed by national and European funds. He has co-authored 69 journal papers, 177 papers in international conferences and contributed 8 chapters to edited books in his area of expertise. Over 3730 citations have been recorded to his publications and his H-index is 32 according to Google scholar. His current research interests include computational intelligence, pattern recognition, statistical machine learning, digital signal and image analysis and retrieval and computer vision.
