Active Image Clustering with Pairwise Constraints from Humans

Biswas, Arijit; Jacobs, David

doi:10.1007/s11263-013-0680-6

Published: 10 December 2013

International Journal of Computer Vision, Volume 108, pages 133–147 (2014)

Abstract

We propose a method of clustering images that combines algorithmic and human input. An algorithm provides us with pairwise image similarities. We then actively obtain selected, more accurate pairwise similarities from humans. A novel method is developed to choose the most useful pairs to show a person, obtaining constraints that improve clustering. In a clustering assignment, elements in each data pair are either in the same cluster or in different clusters. We simulate inverting these pairwise relations and see how that affects the overall clustering. We choose a pair that maximizes the expected change in the clustering. The proposed algorithm has high time complexity, so we also propose a version of this algorithm that is much faster and exactly replicates our original algorithm. We further improve run-time by adding two heuristics, and show that these do not significantly impact the effectiveness of our method. We have run experiments in three different domains, namely leaf, face and scene images, and show that the proposed method improves clustering performance significantly.
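
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the clustering routine (connected components over thresholded similarities), the 0.5 threshold, the use of the machine similarity as the probability that a human would call a pair "same", and the names `cluster`, `pair_disagreements`, and `best_query` are all assumptions made for illustration. A simulated human answer simply overwrites the similarity of that one pair, so this toy does not propagate cannot-link constraints to other pairs. The sketch only shows the idea of inverting one pairwise relation, re-clustering, and scoring the pair by the expected change.

```python
from itertools import combinations


def cluster(sim, overrides, threshold=0.5):
    """Toy clustering (an assumption, not the paper's method): connected
    components of the graph whose edges are pairs with similarity above
    `threshold`. `overrides` maps (i, j) with i < j to a human-given value."""
    n = len(sim)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        s = overrides.get((i, j), sim[i][j])
        if s > threshold:
            parent[find(i)] = find(j)
    return [find(x) for x in range(n)]


def pair_disagreements(a, b):
    """Count pairs that are together in one labeling but apart in the other."""
    n = len(a)
    return sum(
        (a[i] == a[j]) != (b[i] == b[j]) for i, j in combinations(range(n), 2)
    )


def best_query(sim, overrides):
    """Return the unqueried pair with the largest expected clustering change,
    together with that expected change."""
    current = cluster(sim, overrides)
    best_pair, best_score = None, -1.0
    for i, j in combinations(range(len(sim)), 2):
        if (i, j) in overrides:
            continue  # already answered by a human
        same = current[i] == current[j]
        # Assumed probability that the human contradicts the current relation.
        p_flip = 1.0 - sim[i][j] if same else sim[i][j]
        # Simulate inverting the relation for this pair and re-cluster.
        flipped = dict(overrides)
        flipped[(i, j)] = 0.0 if same else 1.0
        change = pair_disagreements(current, cluster(sim, flipped))
        score = p_flip * change
        if score > best_score:
            best_pair, best_score = (i, j), score
    return best_pair, best_score


# Hypothetical similarities: two tight pairs joined by one weak link (1, 2).
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.55, 0.1],
    [0.1, 0.55, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
pair, expected_change = best_query(sim, {})
print(pair, expected_change)
```

On this hypothetical matrix the sketch selects the pair (1, 2): it is the uncertain link whose inversion would split the single cluster into two, so it has the largest expected effect. Re-clustering once per candidate pair is expensive even in this toy, which matches the abstract's remark that the basic formulation has high time complexity and motivates the faster variants.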

Author information

Authors and Affiliations

Computer Science Department, University of Maryland, College Park, MD 20742, USA
Arijit Biswas & David Jacobs

Corresponding author

Correspondence to Arijit Biswas.

About this article

Cite this article

Biswas, A., Jacobs, D. Active Image Clustering with Pairwise Constraints from Humans. Int J Comput Vis 108, 133–147 (2014). https://doi.org/10.1007/s11263-013-0680-6

Received: 25 February 2013
Accepted: 23 November 2013
Published: 10 December 2013
Issue Date: May 2014
DOI: https://doi.org/10.1007/s11263-013-0680-6
