Abstract
We present an integrated vision architecture that incrementally learns several visual categories from natural hand-held objects. We also focus on interactive learning, which requires real-time image processing methods and a fast learning algorithm. The overall system consists of a figure-ground segregation stage, several feature extraction methods, and a life-long learning approach that combines incremental learning with category-specific feature selection. In contrast to most visual categorization approaches, which typically assign each view to a single category, we allow a view to be labeled with an arbitrary number of shape and color categories. We also impose no restrictions on the viewing angle of the presented objects, relaxing the common restriction to canonical views.
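The multi-label idea can be illustrated with a minimal sketch. It is not the architecture described in the article: it swaps in a plain nearest-prototype memory, assumes one independent detector per category, and uses a hand-picked feature subset per category in place of learned feature selection. The feature layout (three color values followed by two shape values), the category names, and the classes `CategoryDetector` and `MultiCategoryLearner` are all hypothetical.

```python
import numpy as np


class CategoryDetector:
    """Yes/no detector for one category (hypothetical design, not the authors')."""

    def __init__(self, feature_idx):
        # Category-specific feature subset; hand-picked here, not learned.
        self.feature_idx = np.asarray(feature_idx)
        self.protos = []  # stored feature vectors
        self.labels = []  # True if the category was present for that vector

    def _nearest(self, x):
        distances = [np.linalg.norm(p - x) for p in self.protos]
        return int(np.argmin(distances))

    def learn(self, features, present):
        x = np.asarray(features, dtype=float)[self.feature_idx]
        # Incremental step: store a new prototype only when the current
        # memory would misclassify this view.
        if not self.protos or self.labels[self._nearest(x)] != present:
            self.protos.append(x)
            self.labels.append(present)

    def predict(self, features):
        if not self.protos:
            return False
        x = np.asarray(features, dtype=float)[self.feature_idx]
        return self.labels[self._nearest(x)]


class MultiCategoryLearner:
    """Runs all detectors on each view, so a view may carry any number of labels."""

    def __init__(self, feature_subsets):
        self.detectors = {
            name: CategoryDetector(idx) for name, idx in feature_subsets.items()
        }

    def learn(self, features, active_labels):
        for name, detector in self.detectors.items():
            detector.learn(features, name in active_labels)

    def predict(self, features):
        return {n for n, d in self.detectors.items() if d.predict(features)}


# Assumed toy feature layout: [r, g, b, elongation, roundness].
learner = MultiCategoryLearner({
    "red": [0, 1, 2],
    "green": [0, 1, 2],
    "round": [3, 4],
    "box": [3, 4],
})

# Views arrive one at a time, each with a set of color and shape labels.
learner.learn([1, 0, 0, 0, 1], {"red", "round"})
learner.learn([0, 1, 0, 0, 1], {"green", "round"})
learner.learn([1, 0, 0, 1, 0], {"red", "box"})

# A color/shape combination that was never shown together.
print(learner.predict([0, 1, 0, 1, 0]))
```

Because each detector looks only at its own feature subset, the toy learner labels the unseen combination with both a color and a shape category. That is the property the abstract contrasts with single-label categorization.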
Cite this article
Kirstein, S., Denecke, A., Hasler, S. et al. A vision architecture for unconstrained and incremental learning of multiple categories. Memetic Comp. 1, 291–304 (2009). https://doi.org/10.1007/s12293-009-0023-x
DOI: https://doi.org/10.1007/s12293-009-0023-x