Applying Term Rewriting to Speech Recognition of Numbers

  • Conference paper
Formal Methods and Software Engineering (ICFEM 2012)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 7635)

Abstract

In typical speech recognition applications, the designer of the application must supply the recognition engine with context-free grammars defining the set of allowable utterances for each recognition.

In the case of the recognition of numeric quantities such as phone and room numbers, such grammars can be fairly tricky to formulate. The natural pronunciations of a number may have many variations and special cases depending on the language and country. Consider, for example, the pronunciations of the four-digit phone extension “2200” in the United Kingdom. One could pronounce this “two-two-zero-zero”, of course, but one will also hear “twenty-two hundred”, “double-two double-naught”, and many other combinations. In North America, on the other hand, one would almost never hear the use of “double”, “triple”, or “naught”. Conventions for pronouncing numbers also depend heavily on the type of quantity being recognized. The year 1987, for example, could be pronounced “nineteen eighty-seven” or, in literary contexts, “nineteen hundred and eighty-seven”, but never “one-nine eighty-seven” or “one-thousand-nine-hundred and eighty-seven”.

The richness and idiosyncrasies of pronunciation possibilities, together with the combinatorics of dealing with multi-digit numbers of varying lengths, thus make the practical task of manually designing sufficiently complete number grammars laborious and error prone. This is especially true for applications that must be localized to countries having differing telephone dial plans and conventions for natural pronunciation. Ideally, one would like to be able to generate such grammars automatically, or at least semi-automatically.

Doing so requires an intuitive specification technique that allows one to easily encode pronunciation rules from which a grammar can be automatically derived. In this talk, we will show how to define certain formal term rewriting systems – which we call Number Generating Term Rewriting Systems (NGTRS) – that work extremely well for this purpose. We will show that, given an NGTRS that generates pronunciations for digit strings representing numeric quantities, we can mechanically generate an equivalent context-free grammar. We will give some examples in a number of different natural languages, and explain how classical term rewriting proof techniques can be used to verify the consistency and completeness of the construction.
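
To make the idea of rule-driven pronunciation generation concrete, here is a minimal Python sketch. It is not the NGTRS formalism itself, which this abstract does not define, and it does not attempt the derivation of a context-free grammar. It is only a toy rewriting scheme under stated assumptions: each rule rewrites a prefix of the remaining digit string into a spoken phrase, and all names (`rewrites`, `pronunciations`, the `uk` flag) and the three rules chosen are hypothetical, picked to reproduce the “2200” examples above.

```python
from typing import List, Optional, Tuple

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]
TENS_WORDS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
              6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety"}


def digit_names(d: str, uk: bool) -> List[str]:
    """Spoken names for one digit; 'naught' is an assumed UK-only variant."""
    names = [DIGIT_WORDS[int(d)]]
    if uk and d == "0":
        names.append("naught")
    return names


def two_digit_name(s: str) -> Optional[str]:
    """Name a two-digit string in the range 20..99, e.g. '22' -> 'twenty-two'."""
    tens, ones = int(s[0]), int(s[1])
    if tens < 2:
        return None  # teens and leading zeros are left out of this toy rule
    return TENS_WORDS[tens] + ("" if ones == 0 else "-" + DIGIT_WORDS[ones])


def rewrites(s: str, uk: bool) -> List[Tuple[int, str]]:
    """All (digits consumed, phrase) rewrite steps applicable at the front of s."""
    steps: List[Tuple[int, str]] = []
    if not s:
        return steps
    # Rule 1: read a single digit.
    for name in digit_names(s[0], uk):
        steps.append((1, name))
    # Rule 2 (assumed UK-only): a repeated digit may be read as "double-<digit>".
    if uk and len(s) >= 2 and s[0] == s[1]:
        for name in digit_names(s[0], uk):
            steps.append((2, "double-" + name))
    # Rule 3: exactly four digits "XY00" may be read as "<XY> hundred".
    if len(s) == 4 and s[2:] == "00":
        name = two_digit_name(s[:2])
        if name is not None:
            steps.append((4, name + " hundred"))
    return steps


def _expand(s: str, uk: bool) -> List[List[str]]:
    """Apply rewrite steps left to right until no digits remain."""
    if not s:
        return [[]]
    results: List[List[str]] = []
    for consumed, phrase in rewrites(s, uk):
        for rest in _expand(s[consumed:], uk):
            results.append([phrase] + rest)
    return results


def pronunciations(s: str, uk: bool = False) -> List[str]:
    """Every pronunciation of digit string s that the toy rules can produce."""
    return [" ".join(words) for words in _expand(s, uk)]


for phrase in pronunciations("2200", uk=True):
    print(phrase)
```

With `uk=True` the output includes "twenty-two hundred" and "double-two double-naught"; with `uk=False` the "double" and "naught" forms do not appear. Phrases are joined with spaces rather than the hyphens used in the prose examples.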

The method we describe was put to real-world use in Vocera’s speech-controlled communication system. This is a commercial product that consists of tiny, Star Trek-like wearable communicator badges operating against an enterprise-class server on a Wi-Fi network. The system is currently used by nearly a half-million nurses and doctors every day in hospitals in several countries.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shostak, R.E. (2012). Applying Term Rewriting to Speech Recognition of Numbers. In: Aoki, T., Taguchi, K. (eds) Formal Methods and Software Engineering. ICFEM 2012. Lecture Notes in Computer Science, vol 7635. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34281-3_3

  • DOI: https://doi.org/10.1007/978-3-642-34281-3_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34280-6

  • Online ISBN: 978-3-642-34281-3

  • eBook Packages: Computer Science, Computer Science (R0)
