Abstract:
Clustering is one of the most fundamental unsupervised tasks in machine learning and is essential for exploring high-volume data. Recent works propose using deep neural networks for clustering, owing to their ability to learn powerful representations of the data. In this work, we present a novel clustering approach using deep neural networks that simultaneously learns feature representations and embeddings suitable for clustering by encouraging separation of natural clusters in the embedding space. More specifically, an autoencoder is employed to learn representations of the data. A mapping from the autoencoder's representation space to an embedding space is then learned using a deep neural network that we call the Representation Network. This network promotes separation between natural clusters by minimizing the cross-entropy between two probability distributions that encode pairwise similarities, one defined in the autoencoder's latent space and the other in the Representation Network's embedding space. The resulting optimization problem can be solved effectively by jointly training the autoencoder and the Representation Network with minibatch stochastic gradient descent and backpropagation. Ultimately, we obtain a K-Means-friendly embedding space. Experimental results show that, despite being a simple model, the proposed approach outperforms a broad range of recent approaches on the Reuters dataset, outperforms other autoencoder-based models on the MNIST dataset, and produces consistently good results that are competitive with more complex and hybrid models.
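To make the training objective concrete, below is a minimal PyTorch sketch of the kind of joint loss the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the network sizes, the MSE reconstruction term, the choice of a softmax over negative squared pairwise distances as the similarity distribution, the detaching of the latent-space distribution as the target, and the toy data_loader are all assumptions introduced here for clarity.

# Hypothetical sketch of the joint objective described in the abstract.
# Similarity kernel, network sizes, and loss weighting are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=784, latent_dim=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 500), nn.ReLU(),
                                 nn.Linear(500, latent_dim))
        self.dec = nn.Sequential(nn.Linear(latent_dim, 500), nn.ReLU(),
                                 nn.Linear(500, in_dim))

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

class RepresentationNetwork(nn.Module):
    """Maps autoencoder latent codes to the clustering-friendly embedding space."""
    def __init__(self, latent_dim=10, embed_dim=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 500), nn.ReLU(),
                                 nn.Linear(500, embed_dim))

    def forward(self, z):
        return self.net(z)

def pairwise_softmax(h):
    """Row-wise pairwise-similarity distribution within a minibatch,
    built here (an assumption) as a softmax over negative squared distances."""
    d2 = (h.unsqueeze(1) - h.unsqueeze(0)).pow(2).sum(dim=-1)
    mask = torch.eye(h.size(0), dtype=torch.bool, device=h.device)
    return F.softmax((-d2).masked_fill(mask, float('-inf')), dim=1)

# Toy data in place of a real dataset (e.g. flattened 784-d MNIST images).
data_loader = torch.utils.data.DataLoader(torch.rand(256, 784), batch_size=64)

ae, repnet = AutoEncoder(), RepresentationNetwork()
opt = torch.optim.Adam(list(ae.parameters()) + list(repnet.parameters()), lr=1e-3)

for x in data_loader:                   # minibatch SGD, as in the abstract
    z, x_hat = ae(x)                    # autoencoder representation
    e = repnet(z)                       # embedding for clustering
    P = pairwise_softmax(z).detach()    # similarities in AE latent space (treated as target)
    Q = pairwise_softmax(e)             # similarities in embedding space
    cross_entropy = -(P * torch.log(Q + 1e-12)).sum(dim=1).mean()
    loss = F.mse_loss(x_hat, x) + cross_entropy
    opt.zero_grad(); loss.backward(); opt.step()

# After training, K-Means is run on the learned embeddings to obtain cluster assignments.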
Date of Conference: 10-13 December 2018
Date Added to IEEE Xplore: 24 January 2019