Active Learning with Clustering and Unsupervised Feature Learning

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9091)

Abstract

Active learning is a type of semi-supervised learning in which the training algorithm can obtain the labels of a small portion of the unlabeled dataset by interacting with an external source (e.g., a human annotator). One active learning strategy is based on exploring the cluster structure of the data, using the labels of a few representative samples to classify the remaining points. In this paper we show that unsupervised feature learning can improve the "purity" of the clusters found, and how this can be combined with a simple but effective active learning strategy. The proposed method shows state-of-the-art performance on MNIST digit recognition in the semi-supervised setting.
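
The cluster-based strategy described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain k-means on synthetic 2-D points, skips the unsupervised feature learning step entirely (the raw coordinates stand in for learned features), and uses hypothetical names such as `label_by_representatives` and `oracle`. It queries one label per cluster, propagates that label to the rest of the cluster, and reports cluster purity alongside the resulting accuracy.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, assign

def cluster_purity(assign, labels, k):
    """Fraction of points that carry the majority label of their cluster."""
    hits = 0
    for c in range(k):
        members = [l for a, l in zip(assign, labels) if a == c]
        if members:
            hits += Counter(members).most_common(1)[0][1]
    return hits / len(labels)

def label_by_representatives(points, assign, centroids, oracle):
    """Query the oracle once per non-empty cluster (for the member closest
    to the centroid) and propagate that label to the whole cluster."""
    cluster_label = {}
    for c in range(len(centroids)):
        members = [i for i, a in enumerate(assign) if a == c]
        if members:
            rep = min(members, key=lambda i: sq_dist(points[i], centroids[c]))
            cluster_label[c] = oracle(rep)
    return [cluster_label[a] for a in assign], len(cluster_label)

# Synthetic data (an assumption for illustration): three Gaussian blobs.
rng = random.Random(42)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points, labels = [], []
for cls, (cx, cy) in enumerate(centers):
    for _ in range(40):
        points.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
        labels.append(cls)

k = 6  # more clusters than classes, so each cluster is more likely to be pure
centroids, assign = kmeans(points, k)
purity = cluster_purity(assign, labels, k)

# The oracle stands in for the human annotator: it reveals one true label.
predicted, n_queries = label_by_representatives(
    points, assign, centroids, oracle=lambda i: labels[i])
accuracy = sum(p == t for p, t in zip(predicted, labels)) / len(labels)

print(f"queries={n_queries} purity={purity:.3f} accuracy={accuracy:.3f}")
```

In this sketch the accuracy of the propagated labels can never exceed the purity, because a cluster contributes at most its majority count to the number of correct predictions. That bound is why improving cluster purity, which the paper attributes to unsupervised feature learning, matters for this kind of strategy.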

Author information

Corresponding author

Correspondence to Saul Berardo.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Berardo, S., Favero, E., Neto, N. (2015). Active Learning with Clustering and Unsupervised Feature Learning. In: Barbosa, D., Milios, E. (eds) Advances in Artificial Intelligence. Canadian AI 2015. Lecture Notes in Computer Science, vol 9091. Springer, Cham. https://doi.org/10.1007/978-3-319-18356-5_25

  • DOI: https://doi.org/10.1007/978-3-319-18356-5_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18355-8

  • Online ISBN: 978-3-319-18356-5

  • eBook Packages: Computer Science, Computer Science (R0)
