Abstract
The rapid expansion of multimedia digital collections highlights the need to classify not only text documents but also their embedded non-textual parts. We propose a model that bases the classification of multimedia on broad, non-topical features, and show how targeted pieces of nearby text can be used to classify photographs effectively on a first such feature, the distinction between indoor and outdoor images. We examine several variations of a TF*IDF-based approach for this task, empirically analyze their effects, and evaluate our system on a large collection of images from current news newsgroups. In addition, we investigate alternative classification and evaluation methods, and the effect that a secondary feature can have on indoor/outdoor classification. We obtain a classification accuracy of 82%, which clearly exceeds baseline estimates and the accuracy of competing image-based approaches, and comes close to the accuracy of humans who perform the same task with access to comparable information.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sable, C.L., Hatzivassiloglou, V. (1999). Text-Based Approaches for the Categorization of Images. In: Abiteboul, S., Vercoustre, AM. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1999. Lecture Notes in Computer Science, vol 1696. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48155-9_4
DOI: https://doi.org/10.1007/3-540-48155-9_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66558-8
Online ISBN: 978-3-540-48155-3
eBook Packages: Springer Book Archive