Attributing semantics to personal photographs

Multimedia Tools and Applications

Abstract

A major bottleneck for the efficient management of personal photographic collections is the large gap between the low-level features of images and their high-level semantic content. This paper proposes and evaluates two methodologies that (re)use natural language photographic annotations to extract references to people, locations and objects, and to propagate any location references encountered to previously unannotated images. The evaluation identifies the strengths of each approach and reports extraction and propagation results with promising accuracy.

Notes

  1. Flickr http://www.flickr.com.

  2. Exchangeable Image File Format (EXIF), created by the Japan Electronic Industries Development Association (JEIDA). Version 2.1 (the first public release) was released in June 1998 and later updated to version 2.2 in April 2002.

  3. See 2.4.2 for wider usage of such features.

  4. Internal communications with Kodak Limited.

  5. A token is a categorized block of text. In the context of this work, a token is a word in a text separated by white spaces from other tokens.

  6. http://www.sourceforge.net/projects/t-rex

  7. Saxon—http://nlp.shef.ac.uk/wig/tools/saxon/index.html Runes—http://runes.sourceforge.net/

Acknowledgements

This work was sponsored by Kodak Limited. We would also like to thank the 391 online photo sharing users who donated their photographs and respective metadata.

Author information

Correspondence to Rodrigo F. Carvalho.

About this article

Cite this article

Carvalho, R.F., Chapman, S. & Ciravegna, F. Attributing semantics to personal photographs. Multimed Tools Appl 42, 73–96 (2009). https://doi.org/10.1007/s11042-008-0249-5

  • DOI: https://doi.org/10.1007/s11042-008-0249-5
