A template-based approach for recognition of intermittent sounds

Pinkowski, Ben

doi:10.1007/BFb0038472

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 507))

Included in the following conference series:

Great Lakes CS Conference on New Research Results in Computer Science

Abstract

Automatic speech and sound recognition typically involves some measure of distance between training and (possibly time-warped) test samples. Special problems arise when the spectral samples of interest are intermittent and contain temporal patterns of alternating periods of sounds and pauses that are significant for recognition. In such cases a recognizer must be capable of distinguishing between the end-points and the pauses of digitized samples and economically searching the segmented sounds for the occurrence of significant spectral patterns. The usual distance metrics based on conventional dynamic time warping algorithms may be inappropriate because time-warping often corrupts the temporal structure of the sound. The problem can be solved by first searching a test sample for distinctive temporal patterns and, if more than one match is obtained, using a spectral distance measure to classify the sample with its nearest neighbor among these. Computational advantages can be obtained if both the temporal and spectral templates are maintained in a binary format reflecting the important sound components.
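The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the energy threshold used to separate sound from pause, the use of Hamming distance for both the temporal and the spectral comparison, and all names (`segment`, `trim_endpoints`, `temporal_matches`, `classify`, the `templates` layout) are assumptions made only for illustration. Each hypothetical template holds a binary temporal pattern (1 = sound frame, 0 = pause frame) and a binary spectral pattern (1 = significant frequency band).

```python
from typing import Dict, List, Optional, Tuple

Bits = List[int]
# Hypothetical layout: class name -> (binary temporal template, binary spectral template)
Templates = Dict[str, Tuple[Bits, Bits]]


def segment(energy: List[float], threshold: float) -> Bits:
    """Binarize per-frame energies: 1 = sound, 0 = pause (assumed fixed threshold)."""
    return [1 if e > threshold else 0 for e in energy]


def trim_endpoints(bits: Bits) -> Bits:
    """Drop leading and trailing silence but keep interior pauses intact."""
    if 1 not in bits:
        return []
    first = bits.index(1)
    last = len(bits) - 1 - bits[::-1].index(1)
    return bits[first:last + 1]


def hamming(a: Bits, b: Bits) -> int:
    """Number of positions at which two equal-length binary patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def temporal_matches(pattern: Bits, templates: Templates, tolerance: int) -> List[str]:
    """Names whose temporal template occurs somewhere in the pattern.

    The template is slid over the pattern without any time warping, so the
    relative durations of sounds and pauses are preserved.
    """
    hits = []
    for name, (temporal, _) in templates.items():
        n = len(temporal)
        for start in range(len(pattern) - n + 1):
            if hamming(pattern[start:start + n], temporal) <= tolerance:
                hits.append(name)
                break
    return hits


def classify(energy: List[float], spectrum: Bits, templates: Templates,
             threshold: float = 0.5, tolerance: int = 0) -> Optional[str]:
    """Stage 1: temporal search. Stage 2: spectral nearest neighbor among the hits."""
    pattern = trim_endpoints(segment(energy, threshold))
    hits = temporal_matches(pattern, templates, tolerance)
    if len(hits) <= 1:
        return hits[0] if hits else None
    # More than one temporal match: fall back on the spectral distance.
    return min(hits, key=lambda name: hamming(spectrum, templates[name][1]))


templates = {
    "A": ([1, 1, 0, 1, 1], [1, 0, 0, 1]),
    "B": ([1, 1, 0, 1, 1], [0, 1, 1, 0]),
    "C": ([1, 0, 0, 0, 1], [1, 0, 0, 1]),
}
energy = [0.1, 0.9, 0.8, 0.2, 0.7, 0.9, 0.1]
print(classify(energy, [1, 0, 1, 1], templates))
```

In this toy data, classes A and B share a temporal pattern, so the spectral stage decides between them. Keeping both templates binary reduces each comparison to counting mismatched bits, which is one way to read the computational advantage the abstract mentions.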

Author information

Authors and Affiliations

Computer Science Department, Western Michigan University, Kalamazoo, MI 49008
Ben Pinkowski

Editor information

Naveed A. Sherwani, Elise de Doncker, John A. Kapenga

About this paper

Cite this paper

Pinkowski, B. (1991). A template-based approach for recognition of intermittent sounds. In: Sherwani, N.A., de Doncker, E., Kapenga, J.A. (eds) Computing in the 90's. Great Lakes CS 1989. Lecture Notes in Computer Science, vol 507. Springer, New York, NY. https://doi.org/10.1007/BFb0038472

DOI: https://doi.org/10.1007/BFb0038472
Published: 14 June 2005
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-97628-0
Online ISBN: 978-0-387-34815-5
eBook Packages: Springer Book Archive
