Abstract
Object class recognition is an active topic in computer vision that still presents many challenges. Most approaches address this task with supervised learning algorithms that need a large quantity of labels to perform well. This leads either to small datasets (<10,000 images) that are labeled through a controlled and verified procedure but capture only a subset of the real-world class distribution, or to large datasets that are more representative but also contain more label noise. Semi-supervised learning has therefore been established as a promising direction for object recognition: it requires only a few labels while making use of the vast number of images available today. In this chapter, we outline the main challenges of semi-supervised object recognition, review existing approaches, and emphasize open issues that should be addressed next to advance this research topic.
© 2013 Springer-Verlag London
About this chapter
Cite this chapter
Ebert, S., Schiele, B. (2013). Where Next in Object Recognition and how much Supervision Do We Need?. In: Farinella, G., Battiato, S., Cipolla, R. (eds) Advanced Topics in Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-5520-1_2
DOI: https://doi.org/10.1007/978-1-4471-5520-1_2
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5519-5
Online ISBN: 978-1-4471-5520-1
eBook Packages: Computer Science (R0)