Abstract
Natural scenes constitute a highly heterogeneous stimulus class, and each semantic category contains exemplars of varying typicality. It is therefore an interesting question whether humans can consistently categorize natural scenes into a relatively small number of categories, such as coasts, rivers/lakes, forests, plains, and mountains. This question is particularly important for applications such as image-retrieval systems: a general image-retrieval system makes sense only if typicality is perceived consistently across individuals. In this study, we use psychophysics and computational modeling to gain a deeper understanding of scene typicality. In the first psychophysical experiment, we used a forced-choice categorization task in which each of 250 natural scenes had to be classified into one of five categories: coasts, rivers/lakes, forests, plains, and mountains. In the second experiment, the typicality of each scene had to be rated on a 50-point scale for each of the five categories. The psychophysical results show high consistency between participants, not only in the categorization of natural scenes but also in the typicality ratings. To model human perception, we then employ a computational approach with an intermediate semantic-modeling step that extracts local semantic concepts, such as rock, water, and sand. Based on the human typicality ratings, we learn a psychophysically plausible distance measure that leads to a high correlation between the computational and the human ranking of natural scenes. Interestingly, comparison models without a semantic-modeling step correlated much less with human performance, suggesting that our model is psychophysically plausible.
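The comparison between the computational and the human ranking can be illustrated with a rank correlation. The following is a minimal sketch, assuming Spearman's rank correlation as the agreement measure; the abstract does not state which coefficient was used, and the function names and the example numbers are hypothetical, not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example: mean human typicality ratings (50-point scale) of five
# scenes for one category, and a model's distance of each scene to that
# category. A smaller distance means "more typical", so it is negated.
human_ratings = [48, 35, 12, 41, 5]
model_distances = [0.10, 0.35, 0.80, 0.40, 0.95]
rho = spearman(human_ratings, [-d for d in model_distances])
print(f"Spearman rho = {rho:.2f}")
```

A rank-based measure is a natural fit here because typicality ratings and model distances live on different scales; only the ordering of the scenes is compared.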
Index Terms
- A psychophysically plausible model for typicality ranking of natural scenes