Estimation of Boundaries between Speech Units Using Bayesian Changepoint Detectors

Čmejla, Roman; Sovka, Pavel

doi:10.1007/3-540-44805-5_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2166)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This contribution addresses the application of Bayesian changepoint detectors (BCD) to estimating the location of boundaries between speech units. It proposes a novel segmentation approach based on a family of Bayesian detectors that operate on the instantaneous envelope and instantaneous frequency of speech rather than on the waveform itself. The approach does not rely on phonetic models, so no supervised training is needed. It requires no a priori information about the speech and therefore belongs to the class of blind segmentation methods. Because the error in locating signal changepoints is small, the method can also be used to fine-tune boundaries between phonetic categories estimated by other segmentation methods. The average bias between the exact boundary location and its estimate is up to 7 ms for real speech.
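The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' detector, whose details are not given here. It assumes a textbook Bayesian detector for a single step change in the mean of a sequence with Gaussian noise (flat priors on the two means, Jeffreys prior on the noise level), applied to the instantaneous envelope and instantaneous frequency obtained from an FFT-based analytic signal. All function names, the synthetic test signal, and its parameters (8 kHz sampling, 500 Hz and 1500 Hz tones) are hypothetical and chosen only for illustration.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal of a real sequence: zero the negative-frequency FFT bins."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def envelope_and_frequency(x, fs):
    """Instantaneous envelope and instantaneous frequency (Hz) of x."""
    z = analytic_signal(x)
    envelope = np.abs(z)
    phase = np.unwrap(np.angle(z))
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)  # one sample shorter than x
    return envelope, inst_freq


def step_changepoint_log_posterior(d):
    """Unnormalised log posterior of the changepoint m for a step in the mean.

    Assumed model: d[i] = mu1 + noise for i < m, mu2 + noise for i >= m.
    After integrating out mu1, mu2 and the noise level:
        p(m | d) ~ [m (N - m)]^(-1/2) * R(m)^(-(N - 2) / 2),
    where R(m) is the residual sum of squares around the two segment means.
    """
    d = np.asarray(d, dtype=float)
    n = len(d)
    m = np.arange(1, n)                      # samples in the first segment
    left = np.cumsum(d)[:-1]                 # sum of the first m samples
    right = d.sum() - left                   # sum of the remaining samples
    resid = np.sum(d ** 2) - left ** 2 / m - right ** 2 / (n - m)
    resid = np.maximum(resid, 1e-12)         # guard against log(0)
    return -0.5 * np.log(m * (n - m)) - 0.5 * (n - 2) * np.log(resid)


def estimate_boundary(d):
    """Most probable changepoint: number of samples before the change."""
    return int(np.argmax(step_changepoint_log_posterior(d))) + 1


def synthetic_two_segment_signal(fs=8000, n=400, boundary=200, seed=0):
    """Hypothetical test signal: a tone whose amplitude and frequency both change."""
    idx = np.arange(n)
    freq = np.where(idx < boundary, 500.0, 1500.0)
    amp = np.where(idx < boundary, 1.0, 0.4)
    phase = 2.0 * np.pi * np.cumsum(freq) / fs   # continuous phase
    noise = 0.01 * np.random.default_rng(seed).standard_normal(n)
    return amp * np.cos(phase) + noise, fs, boundary


if __name__ == "__main__":
    x, fs, true_boundary = synthetic_two_segment_signal()
    env, freq = envelope_and_frequency(x, fs)
    print("true boundary (samples):", true_boundary)
    print("estimate from envelope: ", estimate_boundary(env))
    print("estimate from frequency:", estimate_boundary(freq))
```

Running the detector on the two derived features rather than on the raw waveform follows the abstract's description: a change in loudness or pitch becomes a simple shift in level, which a step model can capture. Real speech would need a sliding window and a more flexible signal model than this single-step sketch.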

Author information

Authors and Affiliations

Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, 16627, Prague 6, Czech Republic
Roman Čmejla & Pavel Sovka

Editor information

Editors and Affiliations

Dept. of Computer Science and Engineering, University of West Bohemia in Plzeň, Faculty of Applied Sciences, Univerzitní 22, 306-14, Plzeň, Czech Republic
Václav Matoušek, Pavel Mautner, Roman Mouček & Karel Taušer

About this paper

Cite this paper

Čmejla, R., Sovka, P. (2001). Estimation of Boundaries between Speech Units Using Bayesian Changepoint Detectors. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_39

DOI: https://doi.org/10.1007/3-540-44805-5_39
Published: 24 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42557-1
Online ISBN: 978-3-540-44805-1
eBook Packages: Springer Book Archive
