Abstract
This paper demonstrates a new application of computer vision to digital libraries: the use of texture for annotation, the description of content. Vision-based annotation assists the user in attaching descriptions to large sets of images and video. If a user labels a piece of an image as water, a texture model can be used to propagate this label to other “visually similar” regions. However, a serious problem is that no single model has been found that is good enough to reliably match human perception of similarity in pictures. Rather than using one model, the system described here knows several texture models and is equipped with the ability to choose the one that “best explains” the regions selected by the user for annotating. If none of these models suffices, it creates new explanations by combining models. Examples of annotations propagated by the system on natural scenes are given. The system provides an average gain of four to one in label prediction for a set of 98 images.
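The idea of choosing the model that “best explains” a user's selection and then propagating the label can be illustrated with a toy sketch. This is not the system's actual algorithm, which the abstract does not specify. It assumes each texture model maps every image patch to a feature vector, scores a model by how tightly the labeled patches cluster relative to the remaining patches, and propagates the label to patches within a fixed distance of a labeled one. The model names, patch identifiers, feature values, scoring rule, and `radius` threshold are all hypothetical, and the step of combining models is omitted.

```python
from math import dist
from statistics import mean

def compactness(features, labeled):
    """Mean distance among labeled patches divided by mean distance from
    labeled patches to all other patches. Lower means the model groups
    the user's selection more tightly. (Illustrative scoring rule only.)"""
    inside = [dist(features[a], features[b])
              for i, a in enumerate(labeled) for b in labeled[i + 1:]]
    others = [p for p in features if p not in labeled]
    outside = [dist(features[a], features[p]) for a in labeled for p in others]
    return mean(inside) / mean(outside)

def choose_model(models, labeled):
    """Return the name of the model with the lowest compactness score."""
    return min(models, key=lambda name: compactness(models[name], labeled))

def propagate(features, labeled, radius):
    """Return unlabeled patches lying within `radius` of any labeled patch."""
    return sorted(
        p for p in features
        if p not in labeled
        and min(dist(features[p], features[a]) for a in labeled) <= radius
    )

# Hypothetical one-dimensional features for five patches under two models.
models = {
    "coarseness": {"p0": (0.10,), "p1": (0.12,), "p2": (0.11,),
                   "p3": (0.90,), "p4": (0.85,)},
    "orientation": {"p0": (0.1,), "p1": (0.9,), "p2": (0.5,),
                    "p3": (0.2,), "p4": (0.8,)},
}
labeled = ["p0", "p1"]  # patches the user marked, e.g. as "water"

best = choose_model(models, labeled)
print(best)                                    # coarseness
print(propagate(models[best], labeled, 0.05))  # ['p2']
```

In this toy data the two labeled patches sit close together under the hypothetical "coarseness" model but far apart under "orientation", so the former is selected and the label spreads only to the one nearby patch.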
Picard, R.W., Minka, T.P. Vision texture for annotation. Multimedia Systems 3, 3–14 (1995). https://doi.org/10.1007/BF01236575
DOI: https://doi.org/10.1007/BF01236575