Active Learning with Clustering and Unsupervised Feature Learning

Berardo, Saul; Favero, Eloi; Neto, Nelson

doi:10.1007/978-3-319-18356-5_25

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9091)

Included in the following conference series:

Canadian Conference on Artificial Intelligence

Abstract

Active learning is a type of semi-supervised learning in which the training algorithm can obtain the labels of a small portion of the unlabeled dataset by interacting with an external source (e.g., a human annotator). One strategy employed in active learning explores the cluster structure of the data, using the labels of a few representative samples to classify the remaining points. In this paper we show that unsupervised feature learning can improve the "purity" of the clusters found, and how this can be combined with a simple but effective active learning strategy. The proposed method shows state-of-the-art performance on MNIST digit recognition in the semi-supervised setting.
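The cluster-based strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain Lloyd's k-means with random initialization, one queried label per cluster (the point nearest the centroid), and a hypothetical `oracle` callable standing in for the human annotator. The input `X` is assumed to hold features that have already been extracted; the unsupervised feature learning step is not shown. The function names `kmeans`, `cluster_then_label`, and `purity` are illustrative only.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points (an assumption;
    # other seeding schemes are possible).
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d.argmin(axis=1)

def cluster_then_label(X, oracle, k, seed=0):
    """Query one label per cluster and propagate it to the cluster's members."""
    centroids, assign = kmeans(X, k, seed=seed)
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = np.empty(len(X), dtype=int)
    queried = []
    for j in range(k):
        members = np.flatnonzero(assign == j)
        if len(members) == 0:
            continue
        # Representative sample: the member closest to centroid j.
        rep = members[d[members, j].argmin()]
        queried.append(rep)
        # Hypothetical annotator call: oracle(index) -> integer label.
        pred[members] = oracle(rep)
    return pred, assign, queried

def purity(assign, y):
    """Fraction of points that share their cluster's majority label."""
    total = 0
    for j in np.unique(assign):
        total += np.bincount(y[assign == j]).max()
    return total / len(y)
```

With true labels `y` available for evaluation, `purity(assign, y)` gives an upper bound on the accuracy this one-label-per-cluster scheme can reach, which is why a feature representation that yields purer clusters helps directly.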

Author information

Authors and Affiliations

Instituto de Ciências Exatas e Naturais, Universidade Federal do Pará, Belém, PA, 66075-900, Brazil
Saul Berardo, Eloi Favero & Nelson Neto

Corresponding author

Correspondence to Saul Berardo.

Editor information

Editors and Affiliations

University of Alberta, Edmonton, Canada
Denilson Barbosa
Dalhousie University, Halifax, Canada
Evangelos Milios

About this paper

Cite this paper

Berardo, S., Favero, E., Neto, N. (2015). Active Learning with Clustering and Unsupervised Feature Learning. In: Barbosa, D., Milios, E. (eds) Advances in Artificial Intelligence. Canadian AI 2015. Lecture Notes in Computer Science, vol 9091. Springer, Cham. https://doi.org/10.1007/978-3-319-18356-5_25

DOI: https://doi.org/10.1007/978-3-319-18356-5_25
Published: 29 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18355-8
Online ISBN: 978-3-319-18356-5
eBook Packages: Computer Science, Computer Science (R0)