Learning pronunciation with the Visual ear

Neural Computing & Applications

Abstract

We recently reported the use of Kohonen's feature map as the hidden layer of an RBF network for the recognition of spoken letters [1] and for the analysis of sleep EEG [2]. The feature map was shown to act as an aid to visualization during the initial period of unsupervised learning in the hidden layer. In this paper, we again explore the topology-preserving properties of Kohonen's feature map, this time for the visual interpretation of speech. It is shown that speech sounds, such as words or phonemes, may be displayed as moving trajectories on a computer screen and enhanced for ease of interpretation. A system known as the Visual Ear is introduced, in which speech from a normal speaker is displayed alongside that of a pupil learning pronunciation, enabling a visual comparison to be made between the two. The application of the Visual Ear to accelerated learning of foreign languages, or as a general speech therapy tool, is then discussed, and the limitations of the present system are highlighted.
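The idea of displaying speech as a trajectory on a feature map can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`train_som`, `best_unit`, `trajectory`), the grid size, the learning-rate and neighbourhood schedules, and the synthetic two-dimensional frames are all assumptions chosen for illustration. In a real system each frame would be an acoustic feature vector from whatever front end is used.

```python
import math
import random


def best_unit(weights, x):
    """Return the grid position whose weight vector is closest to frame x."""
    return min(
        weights,
        key=lambda pos: sum((wi - xi) ** 2 for wi, xi in zip(weights[pos], x)),
    )


def train_som(frames, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small Kohonen feature map (hypothetical schedules, for illustration)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    # One weight vector per unit of a rows x cols grid, randomly initialised.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(frames)
    step = 0
    for _ in range(epochs):
        for x in frames:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood shrinks over time
            br, bc = best_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood around the winning unit on the grid.
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def trajectory(weights, frames):
    """Map a sequence of frames to the sequence of winning grid positions."""
    return [best_unit(weights, x) for x in frames]


if __name__ == "__main__":
    # Synthetic stand-in for an utterance: ten frames drifting through feature space.
    frames = [[0.1 * i, 1.0 - 0.1 * i] for i in range(10)]
    som = train_som(frames, rows=4, cols=4)
    print(trajectory(som, frames))
```

Plotting the returned grid positions in order gives a moving trace of the kind the abstract describes. Passing a teacher's frames and a pupil's frames through the same trained map would give two traces that can be compared by eye.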

References

  1. Reynolds J, Tarassenko L. Spoken letter recognition with neural networks. Int J Neural Syst 1992; 3(3): 219–235

  2. Roberts S, Tarassenko L. Analysis of the sleep EEG using a multilayer network with spatial organisation. IEE Proc F 1992; 139(6): 420–425

  3. Hardcastle WJ, Gibbon FE, Jones W. Visual display of tongue palate contact: Electropalatography in the assessment and remediation of speech disorders. Br J Disorders of Commun 1991; 26: 41–74

  4. Kohonen T. Self-Organisation and Associative Memory, Springer-Verlag, Berlin, 1988

  5. Moody J, Darken C. Fast learning in networks of locally-tuned processing units. Neural Comput 1989; 1: 281–294

  6. Lippmann R. An introduction to computing with neural nets. IEEE ASSP Mag 1987; 4(2): 4–22

  7. Davis S, Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans Acoust, Speech, and Signal Process 1980; 28(4): 357–366

  8. Tattersall G, Linford P, Linggard R. Neural arrays for speech recognition. Br Telecom Tech J 1988; 6(2): 140–163

Cite this article

Reynolds, J., Tarassenko, L. Learning pronunciation with the Visual ear. Neural Comput & Applic 1, 169–175 (1993). https://doi.org/10.1007/BF01414942
