Abstract
A major bottleneck for the efficient management of personal photographic collections is the large gap between low-level image features and the high-level semantic content of images. This paper proposes and evaluates two methodologies that (re)use natural language photographic annotations to extract references to people, locations and objects, and to propagate any location references encountered to previously unannotated images. The evaluation identifies the strengths of each approach and shows promising accuracy for both extraction and propagation.
Notes
Flickr http://www.flickr.com.
The Exchangeable Image File Format (EXIF) was created by the Japan Electronic Industries Development Association (JEIDA). Version 2.1 (the first public release) was released in June 1998 and later updated to version 2.2 in April 2002.
See Section 2.4.2 for wider usage of such features.
Internal communications with Kodak Limited.
A token is a categorized block of text. In the context of this work, a token is a word separated from other tokens by whitespace.
Acknowledgements
This work was sponsored by Kodak Limited. We would also like to thank the 391 online photo sharing users who donated their photographs and respective metadata.
Cite this article
Carvalho, R.F., Chapman, S. & Ciravegna, F. Attributing semantics to personal photographs. Multimed Tools Appl 42, 73–96 (2009). https://doi.org/10.1007/s11042-008-0249-5
DOI: https://doi.org/10.1007/s11042-008-0249-5