The MoveOn database: motorcycle environment speech and noise database for command and control applications

Abstract

The MoveOn speech and noise database was purpose-built in support of research on spoken dialogue interaction in a motorcycle environment. The distinctiveness of the MoveOn database stems from the requirements of the application domain, an information support and operational command and control system for the two-wheel police force, and from the specifics of the adverse open-air acoustic environment. In this article, we first outline the target application, which motivates the design and purpose of the database, and then report on the implementation details. We discuss the main challenges related to the choice of equipment and the organization of the recording sessions, together with difficulties experienced during this effort. Finally, we give a detailed account of the database statistics and the suggested splits of the data into subsets, and discuss results from automatic speech recognition experiments that illustrate the complexity of the operational environment.

Notes

  1. http://showcase.m0ve0n.net/.

  2. http://www.zoom.co.jp.

  3. http://www.akg.com.

  4. http://www.torkworld.com/tork_max.html.

  5. http://www.alan-electronics.de.

  6. http://www.cs.cmu.edu/afs/cs.cmu.edu/user/lenzo/html/areas/t2p/.

  7. In earlier work (Winkler et al. 2008), published before the speech annotations were completed, we estimated the amount of speech based on the speaker tier, i.e., including the pauses at the beginning and end of each utterance, which leads to a higher number of hours than the more precise figure reported here.

  8. http://htk.eng.cam.ac.uk/.

  9. http://www.elra.info/.

References

  • Athanaselis, T., Bakamidis, S., Dologlou, I., Cowie, R., Douglas-Cowie, E., & Cox, C. (2005). ASR for emotional speech: Clarifying the issues and enhancing performance. Neural Networks, 18(4), 437–444.

  • Boersma, P. (2001). Praat, a system for doing phonetics by computer. Glot International, 5(9/10), 341–345.

  • Bohus, D., Raux, A., Harris, T. K., Eskenazi, M., & Rudnicky, A. I. (2007). Olympus: An open-source framework for conversational spoken language interface research. In: Bridging the Gap: Academic and Industrial Research in Dialog Technology workshop at HLT/NAACL.

  • Bohus, D., & Rudnicky, A. I. (2003). RavenClaw: Dialog management using hierarchical task decomposition and an expectation agenda. In: Proceedings Eurospeech 2003 (pp. 597–600).

  • Gong, Y. (1995). Speech recognition in noisy environments: A survey. Speech Communication, 16(3), 261–291.

  • Junqua, J. C., Fincke, S., & Field, K. (1999). The Lombard effect: A reflex to better communicate with others in noise. In: Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing (pp. 2083–2086).

  • Kaiser, M., Mögele, H., & Schiel, F. (2006). Bikers accessing the web: The SmartWeb Motorbike Corpus. In: Proceedings LREC 2006 (pp. 1628–1631).

  • Kalapanidas, E., Davarakis, C., Nani, M., Winkler, T., Ganchev, T., Kocsis, O., et al. (2008). MoveON: A multimodal information management application for police motorcyclists. In: Proceedings System Demonstrations of the 18th European Conference on Artificial Intelligence.

  • Kawaguchi, N., Matsubara, S., Kajita, H., Iwa, S., Takeda, K., Itakura, F., & Inagaki, Y. (2000). Construction of speech corpus in moving car environment. In: Proceedings ICSLP 2000 (pp. 362–365).

  • Lee, B., Hasegawa-Johnson, M., Goudeseune, C., Kamdar, S., Borys, S., Liu, M., & Huang, T. (2004a). AVICAR: Audio-visual speech corpus in a car environment. In: Proceedings ICSLP 2004 (pp. 2489–2492).

  • Lee, Y. J., Kim, B. W., Kim, Y. I., Choi, D. L., Lee, K. H., & Um, Y. (2004b). Creation and assessment of Korean speech and noise DB in car environment. In: Proceedings LREC 2004 (pp. 1403–1406).

  • Moreno, A., Lindberg, B., Draxler, C., Richard, G., Choukri, K., Euler, S., & Allen, J. (2000). SPEECHDAT-CAR: A large speech database for automotive environments. In: Proceedings LREC 2000.

  • Schiel, F., & Draxler, C. (2003). Production and validation of speech corpora. Munich: Bastard Verlag.

  • Van den Heuvel, H. (1999). Validation criteria, INCO-COP-977017.

  • Van den Heuvel, H. (2000). SLR validation: Evaluation of the SpeechDat approach. In: Proceedings LREC 2000 Satellite workshop XLDB—Very large Telephone Speech Databases.

  • Van den Heuvel, H. (2001). The art of validation. ELRA Newsletter, 5(4), 4–6.

  • Van den Heuvel, H., Boves, L., Moreno, A., Omologo, M., Richard, G., & Sanders, E. (2001). Annotation in the speechdat projects. International Journal of Speech Technology, 4, 127–143.

  • Wakao, A., Takeda, K., & Itakura, F. (1996). Variability of Lombard effects under different noise conditions. In: Proceedings of the Fourth International Conference on Spoken Language Processing (pp. 2009–2012).

  • Wells, J. (1997). Standards, assessment, and methods: Phonetic alphabets. London: University College.

  • Wheatley, S. J., & Ascham, S. R. (1998). SpeechDat English database for the fixed telephone network, Technical Report.

  • Whissell, C. (1989). The dictionary of affect in language. In R. Plutchik & H. Kellerman (Eds.), Emotion: Theory, research and experience (Vol. 4). New York: Academic Press.

  • Winkler, T., Kostoulas, T., Adderley, R., Bonkowski, C., Ganchev, T., Köhler, J., & Fakotakis, N. (2008). The MoveOn Motorcycle Speech Corpus. In: Proceedings LREC 2008 (pp. 2201–2205).

Acknowledgments

This work was supported by the FP6 MoveOn project (IST-2005-034753), co-funded by the European Commission. The authors would like to acknowledge the significant effort that Dr. Rick Adderley from A E Solutions (BI) invested in the recruitment of professional police officers and in the supervision of the data recording campaign. Furthermore, the authors would like to thank Mr. Patrick Seidler and Mr. Ali Khan from the University of Reading, as well as Mr. Christian Bonkowski from the Fraunhofer Institute for Intelligent Analysis and Information Systems, who performed major parts of the annotation of the speech and noise tiers of the database. Sincere thanks also go to the University of Reading, Systema Technologies S.A. and the whole MoveOn project team for supporting the development of the database through detailed definitions and discussions of the project requirements, and to all other colleagues who directly or indirectly contributed to the successful implementation of the MoveOn speech and noise database.

Author information

Corresponding author

Correspondence to Theodoros Kostoulas.

About this article

Cite this article

Kostoulas, T., Winkler, T., Ganchev, T. et al. The MoveOn database: motorcycle environment speech and noise database for command and control applications. Lang Resources & Evaluation 47, 539–563 (2013). https://doi.org/10.1007/s10579-013-9222-7
