Abstract
Clustering techniques aim to find meaningful groups of data samples which exhibit similarity with regard to a set of characteristics, typically measured in terms of pairwise distances. Due to the so-called curse of dimensionality, i.e., the observation that pairwise distances lose their discriminative power in high-dimensional spaces, distance-based clustering techniques such as the classic k-means algorithm fail to uncover meaningful clusters in high-dimensional data. Dimensionality reduction techniques can thus greatly improve the performance of such clustering methods. In this work, we study autoencoders as deep learning tools for dimensionality reduction and combine them with k-means clustering to learn, in a self-supervised manner, low-dimensional representations which improve clustering performance by enhancing intra-cluster relationships and suppressing inter-cluster ones. In the supervised paradigm, distance-based classifiers may also benefit greatly from robust dimensionality reduction techniques. The proposed method is evaluated in multiple experiments on datasets of handwritten digits, various objects and faces, and is shown to improve external cluster quality criteria. A fully supervised counterpart is also evaluated on two face recognition datasets and is shown to improve the performance of various lightweight classifiers, enabling their use in real-time applications on devices with limited computational resources, such as Unmanned Aerial Vehicles (UAVs).
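The baseline pipeline the abstract describes can be illustrated with a minimal sketch: train an autoencoder on a reconstruction objective, then run k-means on the learned latent codes instead of the raw high-dimensional inputs. The architecture sizes, optimizer settings, and full-batch training below are illustrative assumptions, not the paper's exact configuration, and the sketch omits the paper's self-supervised refinement, in which cluster assignments are fed back to reshape the latent space.

```python
# Minimal sketch: autoencoder for dimensionality reduction + k-means on
# the latent codes. Layer widths and hyperparameters are assumptions.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class Autoencoder(nn.Module):
    def __init__(self, in_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim), nn.Sigmoid())

    def forward(self, x):
        z = self.encoder(x)          # low-dimensional representation
        return self.decoder(z), z    # reconstruction and latent code

def cluster_with_autoencoder(X, n_clusters=10, epochs=50):
    """X: (n_samples, in_dim) float tensor scaled to [0, 1]."""
    model = Autoencoder(in_dim=X.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):          # reconstruction training (full batch)
        opt.zero_grad()
        recon, _ = model(X)
        loss_fn(recon, X).backward()
        opt.step()
    with torch.no_grad():            # k-means in the latent space
        _, z = model(X)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(z.numpy())
```

Clustering in the latent space rather than the input space is what sidesteps the curse of dimensionality: the pairwise distances that k-means relies on are computed in a far lower-dimensional space where they remain informative.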
Acknowledgements
This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 731667 (MULTIDRONE). This publication reflects the authors' views only. The European Commission is not responsible for any use that may be made of the information it contains.
Cite this article
Nousi, P., Tefas, A. Self-supervised autoencoders for clustering and classification. Evolving Systems 11, 453–466 (2020). https://doi.org/10.1007/s12530-018-9235-y