A Spoken Dialog System Speech Interface Based on a Microphone Array

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2008)

Abstract

In this paper we present a Spoken Dialog System (SDS) with a Microphone Array (MA). Our goal is to create a hands-free home automation system with a speech interface to control home devices. The MA interface enables ubiquitous speech acquisition for the SDS. The implemented system allows any user, in any position in a room, to establish a dialog with a virtual butler that is able to control a wide range of home appliances (room lights, air conditioner, window shades and hi-fi features). This virtual butler has a 3D animated face that, while the dialog is engaged, is able to steer toward the user’s position and respond to his or her commands with synthesized speech. The presented results show that the MA, as a distant-talk interface, performs quite well and is a step towards a more realistic human-machine interaction.

Editor information

António Teixeira, Vera Lúcia Strube de Lima, Luís Caldas de Oliveira, Paulo Quaresma

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Coelho, G.E., Serralheiro, A.J., Neto, J.P. (2008). A Spoken Dialog System Speech Interface Based on a Microphone Array. In: Teixeira, A., de Lima, V.L.S., de Oliveira, L.C., Quaresma, P. (eds) Computational Processing of the Portuguese Language. PROPOR 2008. Lecture Notes in Computer Science, vol 5190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85980-2_3

  • DOI: https://doi.org/10.1007/978-3-540-85980-2_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85979-6

  • Online ISBN: 978-3-540-85980-2

  • eBook Packages: Computer Science, Computer Science (R0)
