Learning pronunciation with the Visual ear

Reynolds, Jake; Tarassenko, Lionel

doi:10.1007/BF01414942

Published: September 1993

Volume 1, pages 169–175, (1993)

Neural Computing & Applications

Jake Reynolds¹ &
Lionel Tarassenko¹


Abstract

We recently reported the use of Kohonen's feature map as the hidden layer of an RBF network for the recognition of spoken letters [1] and for the analysis of sleep EEG [2]. The feature map was shown to act as an aid to visualization during the initial period of unsupervised learning in the hidden layer. In this paper, we again explore the topology-preserving properties of Kohonen's feature map, this time for the visual interpretation of speech. It is shown that speech sounds, such as words or phonemes, may be displayed as moving trajectories on a computer screen and enhanced for ease of interpretation. A system known as the Visual Ear is introduced, in which speech from a normal speaker is displayed alongside that of a pupil learning pronunciation, enabling a visual comparison to be made between the two. The application of the Visual Ear to accelerated learning of foreign languages, or as a general speech therapy tool, is then discussed, and the limitations of the present system are highlighted.
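The idea of displaying speech as a trajectory on a feature map can be illustrated with a minimal sketch. It is not the authors' implementation: the map size, learning-rate and neighbourhood schedules, and the function names `train_som`, `best_matching_unit` and `trajectory` are all assumptions made for illustration, and the input frames are placeholder random vectors rather than real speech features. The sketch trains a small Kohonen feature map and then maps each frame of an utterance to its best-matching node, giving a sequence of grid positions that could be drawn as a moving path.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], x)),
    )


def train_som(frames, rows=6, cols=6, epochs=20, lr0=0.5, seed=0):
    """Train a small Kohonen feature map on a list of feature vectors.

    All hyperparameters here are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(frames[0])
    # One weight vector per grid node, initialised from randomly chosen frames.
    weights = {
        (r, c): list(rng.choice(frames)) for r in range(rows) for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(frames)
    step = 0
    for _ in range(epochs):
        order = frames[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 0.5       # neighbourhood shrinks over time
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                d2 = (r - br) ** 2 + (c - bc) ** 2    # squared grid distance to winner
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])    # pull node towards the input
            step += 1
    return weights


def trajectory(weights, frames):
    """Map each frame of an utterance to its winning node on the grid."""
    return [best_matching_unit(weights, x) for x in frames]


if __name__ == "__main__":
    rng = random.Random(1)
    # Placeholder data standing in for per-frame speech features.
    frames = [[rng.random() for _ in range(8)] for _ in range(50)]
    som = train_som(frames)
    print(trajectory(som, frames[:10]))
```

Under these assumptions, two utterances of the same word would be passed through `trajectory` with the same trained map, and the two resulting paths plotted side by side for comparison, which is the kind of teacher and pupil display the abstract describes.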


References

1. Reynolds J, Tarassenko L. Spoken letter recognition with neural networks. Int J Neural Syst 1992; 3(3): 219–235
2. Roberts S, Tarassenko L. Analysis of the sleep EEG using a multilayer network with spatial organisation. IEE Proc F 1992; 139(6): 420–425
3. Hardcastle WJ, Gibbon FE, Jones W. Visual display of tongue palate contact: Electropalatography in the assessment and remediation of speech disorders. Br J Disorders of Commun 1991; 26: 41–74
4. Kohonen T. Self-Organisation and Associative Memory, Springer-Verlag, Berlin, 1988
5. Moody J, Darken C. Fast learning in networks of locally-tuned processing units. Neural Comput 1989; 1: 281–294
6. Lippmann R. An introduction to computing with neural nets. IEEE ASSP Mag 1987; 4(2): 4–22
7. Davis S, Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans Acoust, Speech, and Signal Process 1980; 28(4): 357–366
8. Tattersall G, Linford P, Linggard R. Neural arrays for speech recognition. Br Telecom Tech J 1988; 6(2): 140–163


Author information

Authors and Affiliations

Department of Engineering Science, University of Oxford, OX1 3PJ, Oxford, UK
Jake Reynolds & Lionel Tarassenko



About this article

Cite this article

Reynolds, J., Tarassenko, L. Learning pronunciation with the Visual ear. Neural Comput & Applic 1, 169–175 (1993). https://doi.org/10.1007/BF01414942


Received: 04 May 1993
Issue Date: September 1993
DOI: https://doi.org/10.1007/BF01414942
