A Method for Photograph Indexing Using Speech Annotation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2195)

Abstract

We explore the feasibility of using speech input to index a large volume of digital photographs. As a natural medium for image communication, speech can complement existing content-based techniques, thereby improving the reliability and usability of image retrieval systems. We introduce a methodology for image indexing using a speech annotation technique. Speech recognition tools, such as Dragon NaturallySpeaking, can be adapted to perform the main role of speech-to-text transcription. The use of structured speech, as opposed to free-form speech, in a limited system can further boost transcription accuracy. We also introduce the idea of using N-best lists from the speech recognition output to improve recognition performance. The transcribed text is used to populate the metadata of the corresponding photograph. A photo query strategy is implemented to confirm the performance of the proposed technique for photo indexing and retrieval.
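
The pipeline the abstract outlines (structured speech, N-best lists, metadata population, photo query) can be illustrated with a minimal sketch. This is not the authors' implementation: the field names, the closed vocabularies, the rule of taking the highest-ranked N-best hypothesis that appears in the field's vocabulary, and the exact-match query are all assumptions made for illustration. The N-best lists are passed in as plain Python lists, ordered best-first, rather than obtained from any real recognizer API.

```python
from typing import Dict, List, Optional, Set

# Hypothetical closed vocabularies for a structured annotation.
# These fields and words are illustrative assumptions, not from the paper.
FIELD_VOCAB: Dict[str, Set[str]] = {
    "who": {"alice", "bob"},
    "where": {"paris", "singapore"},
    "event": {"birthday", "wedding"},
}


def pick_from_nbest(nbest: List[str], vocab: Set[str]) -> Optional[str]:
    """Return the highest-ranked hypothesis that is in the field vocabulary."""
    for hypothesis in nbest:  # nbest is assumed to be ordered best-first
        word = hypothesis.strip().lower()
        if word in vocab:
            return word
    return None  # no hypothesis matched; leave the field empty


def index_photo(nbest_by_field: Dict[str, List[str]]) -> Dict[str, str]:
    """Build photo metadata from one N-best list per structured field."""
    metadata: Dict[str, str] = {}
    for field, nbest in nbest_by_field.items():
        choice = pick_from_nbest(nbest, FIELD_VOCAB.get(field, set()))
        if choice is not None:
            metadata[field] = choice
    return metadata


def query(index: Dict[str, Dict[str, str]], **criteria: str) -> List[str]:
    """Return sorted photo ids whose metadata matches every criterion exactly."""
    return sorted(
        photo_id
        for photo_id, meta in index.items()
        if all(meta.get(field) == value.lower() for field, value in criteria.items())
    )


# Usage: the top hypothesis for "where" is a misrecognition ("parish"),
# but the second entry in the N-best list is in the vocabulary.
photo_index = {
    "img_001.jpg": index_photo(
        {"who": ["alex", "alice"], "where": ["parish", "paris"], "event": ["bath day"]}
    ),
    "img_002.jpg": index_photo({"who": ["bob"], "where": ["singapore"]}),
}
matches = query(photo_index, where="Paris")
```

In this sketch, restricting each field to a small vocabulary stands in for the "structured speech in a limited system" idea, and scanning past the top hypothesis stands in for the use of N-best lists. A real system would presumably also use recognizer confidence scores, which are omitted here.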

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, J., Tan, T., Mulhem, P. (2001). A Method for Photograph Indexing Using Speech Annotation. In: Shum, HY., Liao, M., Chang, SF. (eds) Advances in Multimedia Information Processing — PCM 2001. PCM 2001. Lecture Notes in Computer Science, vol 2195. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45453-5_113

  • DOI: https://doi.org/10.1007/3-540-45453-5_113

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42680-6

  • Online ISBN: 978-3-540-45453-3

  • eBook Packages: Springer Book Archive
