
A template-based approach for recognition of intermittent sounds

  • Track 2: Artificial Intelligence
  • Conference paper
Computing in the 90's (Great Lakes CS 1989)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 507)


Abstract

Automatic speech and sound recognition typically involves some measure of distance between training and (possibly time-warped) test samples. Special problems arise when the spectral samples of interest are intermittent and contain temporal patterns of alternating periods of sounds and pauses that are significant for recognition. In such cases a recognizer must be capable of distinguishing between the end-points and the pauses of digitized samples and economically searching the segmented sounds for the occurrence of significant spectral patterns. The usual distance metrics based on conventional dynamic time warping algorithms may be inappropriate because time-warping often corrupts the temporal structure of the sound. The problem can be solved by first searching a test sample for distinctive temporal patterns and, if more than one match is obtained, using a spectral distance measure to classify the sample with its nearest neighbor among these. Computational advantages can be obtained if both the temporal and spectral templates are maintained in a binary format reflecting the important sound components.
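The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the thresholding step, the use of Hamming distance for both stages, the fixed frame alignment (no time warping), the mismatch tolerance, and all names (`binarize`, `hamming`, `classify`, the class labels) are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# A template pairs a binary temporal pattern (sound = 1, pause = 0 per frame)
# with a binary spectral pattern (significant band = 1, otherwise 0).
# This layout is a hypothetical choice, not taken from the paper.
Template = Tuple[List[int], List[int]]


def binarize(values: Sequence[float], threshold: float) -> List[int]:
    """Map each value to 1 if it exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count positions where two equal-length binary patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have equal length")
    return sum(x != y for x, y in zip(a, b))


def classify(
    temporal: Sequence[int],
    spectral: Sequence[int],
    templates: Dict[str, Template],
    max_temporal_mismatch: int = 1,
) -> Optional[str]:
    """Stage 1: filter by temporal pattern. Stage 2: nearest spectral neighbor."""
    # Stage 1: keep only classes whose temporal template is close enough.
    candidates = [
        name
        for name, (t_pattern, _) in templates.items()
        if hamming(temporal, t_pattern) <= max_temporal_mismatch
    ]
    if not candidates:
        return None  # no temporal match at all
    if len(candidates) == 1:
        return candidates[0]  # unique temporal match, no spectral step needed
    # Stage 2: several temporal matches, so pick the nearest spectral neighbor.
    return min(candidates, key=lambda name: hamming(spectral, templates[name][1]))


# Hypothetical templates: A and B share a temporal pattern but differ spectrally.
templates: Dict[str, Template] = {
    "A": ([1, 1, 0, 0, 1, 1, 0, 0], [0, 1, 1, 0]),
    "B": ([1, 1, 0, 0, 1, 1, 0, 0], [1, 0, 0, 1]),
    "C": ([1, 0, 1, 0, 1, 0, 1, 0], [0, 1, 1, 0]),
}

# Made-up per-frame energies and per-band magnitudes for one test sample.
test_temporal = binarize([0.9, 0.8, 0.1, 0.05, 0.7, 0.9, 0.2, 0.1], threshold=0.5)
test_spectral = binarize([0.1, 0.9, 0.6, 0.2], threshold=0.5)

label = classify(test_temporal, test_spectral, templates)
```

In this toy data, the temporal stage narrows the candidates to A and B, and the spectral stage then selects between them. Keeping both patterns binary means each comparison reduces to counting differing bits, which is one plausible reading of the computational advantage the abstract mentions.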




Editor information

Naveed A. Sherwani, Elise de Doncker, John A. Kapenga


Copyright information

© 1991 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pinkowski, B. (1991). A template-based approach for recognition of intermittent sounds. In: Sherwani, N.A., de Doncker, E., Kapenga, J.A. (eds) Computing in the 90's. Great Lakes CS 1989. Lecture Notes in Computer Science, vol 507. Springer, New York, NY. https://doi.org/10.1007/BFb0038472


  • DOI: https://doi.org/10.1007/BFb0038472


  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-97628-0

  • Online ISBN: 978-0-387-34815-5

  • eBook Packages: Springer Book Archive
