poster

Toward content-aware multimodal tagging of personal photo collections

Authors:

Paulo Barthelmess,

Edward Kaiser,

David R. McGee

ICMI '07: Proceedings of the 9th international conference on Multimodal interfaces

Pages 122 - 125

https://doi.org/10.1145/1322192.1322215

Published: 12 November 2007

Abstract

A growing number of tools are becoming available that make use of existing tags to help organize and retrieve photos, facilitating the management and use of photo sets. The tagging on which these techniques rely remains a time-consuming, labor-intensive task that discourages many users. To address this problem, we aim to leverage the multimodal content of naturally occurring photo discussions among friends and families to automatically extract tags from a combination of conversational speech, handwriting, and photo content analysis. While naturally occurring discussions are rich sources of information about photos, methods need to be developed to reliably extract a set of discriminative tags from this noisy, unconstrained group discourse. To this end, this paper contributes an analysis of pilot data identifying robust multimodal features examining the interplay between photo content and other modalities such as speech and handwriting. Our analysis is motivated by a search for design implications leading to the effective incorporation of automated location and person identification (e.g., based on GPS and facial recognition technologies) into a system able to extract tags from natural multimodal conversations.

Index Terms

Toward content-aware multimodal tagging of personal photo collections
1. Human-centered computing
  1. Collaborative and social computing
    1. Collaborative and social computing systems and tools
      1. Synchronous editors
  2. Human computer interaction (HCI)
    1. Interaction devices
      1. Touch screens
    2. Interaction paradigms
      1. Natural language interfaces
2. Information systems
  1. Information systems applications
    1. Collaborative and social computing systems and tools
      1. Synchronous editors

Published In

ICMI '07: Proceedings of the 9th international conference on Multimodal interfaces

November 2007

402 pages

ISBN:9781595938176

DOI:10.1145/1322192

General Chairs: Kenji Mase (Nagoya University, Japan), Dominic Massaro (UC Santa Cruz, USA)

Program Chairs: Kazuya Takeda (Nagoya University, Japan), Deb Roy (MIT, USA), Alexandros Potamianos (Technical University of Crete, Greece)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICMI07: International Conference on Multimodal Interfaces

November 12 - 15, 2007

Nagoya, Aichi, Japan

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%

Bibliometrics

Total Citations: 8

Total Downloads: 353

Downloads (Last 12 months): 3

Downloads (Last 6 weeks): 0

Reflects downloads up to 05 Mar 2025

Cited By

Monaghan F, Handschuh S, O’Sullivan D (2013). ACRONYM. Semantic Web, pages 201-234. Online publication date: 2013.
https://doi.org/10.4018/978-1-4666-3610-1.ch009
Satish A, Jain R, Prabhakaran B, Worring M, Smith J, Chua T (2013). CueNet. Proceedings of the 3rd ACM conference on International conference on multimedia retrieval, pages 341-344. Online publication date: 16-Apr-2013.
https://dl.acm.org/doi/10.1145/2461466.2461533
Hasan K, Mukta M, Nayeem M, Hasan Z (2012). Event-based content management by spontaneous metadata generation and diffusion. 2012 IEEE 13th International Symposium on Computational Intelligence and Informatics (CINTI), pages 97-102. Online publication date: Nov-2012.
https://doi.org/10.1109/CINTI.2012.6496740
Monaghan F, Handschuh S, O'Sullivan D (2011). ACRONYM. International Journal on Semantic Web & Information Systems 7:4, pages 1-35. Online publication date: 1-Oct-2011.
https://dl.acm.org/doi/10.4018/jswis.2011100101
Vennelakanti R, Dey P, Shekhawat A, Pisupati P, Bourlard H, Huang T, Vidal E, Gatica-Perez D, Morency L, Sebe N (2011). The picture says it all! Proceedings of the 13th international conference on multimodal interfaces, pages 89-96. Online publication date: 14-Nov-2011.
https://dl.acm.org/doi/10.1145/2070481.2070499
Bian L, Holtzman H, Landay J, Shi Y, Patterson D, Rogers Y, Xie X (2011). Qooqle. Proceedings of the 13th international conference on Ubiquitous computing, pages 541-542. Online publication date: 17-Sep-2011.
https://dl.acm.org/doi/10.1145/2030112.2030203
Bharath A, Madhvanath S (2009). Online Handwriting Recognition for Indic Scripts. Guide to OCR for Indic Scripts, pages 209-234. Online publication date: 28-Aug-2009.
https://doi.org/10.1007/978-1-84800-330-9_11
Kaiser E, Barthelmess P, Kaiser E (2007). Cross-domain matching for automatic tag extraction across redundant handwriting and speech events. Proceedings of the 2007 workshop on Tagging, mining and retrieval of human related activity information, pages 55-62. Online publication date: 15-Nov-2007.
https://dl.acm.org/doi/10.1145/1330588.1330597
