Where Next in Object Recognition and how much Supervision Do We Need?

Ebert, Sandra; Schiele, Bernt

doi:10.1007/978-1-4471-5520-1_2

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

Object class recognition is an active topic in computer vision that still presents many challenges. Most approaches address the task with supervised learning algorithms, which need a large number of labels to perform well. This leads either to small datasets (<10,000 images) that are labeled through a controlled and verified procedure but capture only a subset of the real-world class distribution, or to large datasets that are more representative but also contain more label noise. Semi-supervised learning has therefore been established as a promising direction for object recognition: it requires only a few labels while making use of the vast number of images available today. In this chapter, we outline the main challenges of semi-supervised object recognition, review existing approaches, and emphasize open issues that should be addressed next to advance this research topic.

Author information

Authors and Affiliations

Max Planck Institute for Informatics, Saarbrücken, Germany
Sandra Ebert & Bernt Schiele

Corresponding author

Correspondence to Sandra Ebert.

Editor information

Editors and Affiliations

Dipartimento di Matematica e Informatica, Università di Catania, Catania, Italy
Giovanni Maria Farinella
Dipartimento di Matematica e Informatica, Università di Catania, Catania, Italy
Sebastiano Battiato
Department of Engineering, University of Cambridge, Cambridge, United Kingdom
Roberto Cipolla

About this chapter

Cite this chapter

Ebert, S., Schiele, B. (2013). Where Next in Object Recognition and how much Supervision Do We Need? In: Farinella, G., Battiato, S., Cipolla, R. (eds) Advanced Topics in Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-5520-1_2

DOI: https://doi.org/10.1007/978-1-4471-5520-1_2
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5519-5
Online ISBN: 978-1-4471-5520-1
eBook Packages: Computer Science, Computer Science (R0)
