Spatio-Temporal Mask Learning: Application to Speech Recognition

Durand, Stéphane; Alexandre, Frédéric

doi:10.1007/978-3-7091-7535-4_36

Abstract

In this paper, we describe the “spatio-temporal” map, an original algorithm for learning and recognizing dynamic patterns represented as sequences. The work is oriented toward an internal, explicit representation of time, which appears to be neurobiologically relevant. The map is built from units with three kinds of links: feed-forward connections, intra-map connections and inter-map connections. This architecture can learn sequences from an input stream in a way that is robust to noise. Learning is self-organized for the feed-forward links and “pseudo” self-organized for the intra-map links. An application to the recognition of spoken French digits is presented.

Author information

Authors and Affiliations

CRIN-CNRS INRIA Lorraine, BP 239, F-54506, Vandoeuvre-lès-Nancy, France
Stéphane Durand & Frédéric Alexandre

About this paper

Cite this paper

Durand, S., Alexandre, F. (1995). Spatio-Temporal Mask Learning: Application to Speech Recognition. In: Artificial Neural Nets and Genetic Algorithms. Springer, Vienna. https://doi.org/10.1007/978-3-7091-7535-4_36

DOI: https://doi.org/10.1007/978-3-7091-7535-4_36
Publisher Name: Springer, Vienna
Print ISBN: 978-3-211-82692-8
Online ISBN: 978-3-7091-7535-4
eBook Packages: Springer Book Archive
