Automatic Prosodic Events Detection Using a Two-Stage SVM/CRF Sequence Classifier with Acoustic Features

Chen, Mingming; Liu, Wenju; Yang, Zhanlei; Hu, Pengfei

doi:10.1007/978-3-642-33506-8_70

Part of the book series: Communications in Computer and Information Science (CCIS, volume 321)

Included in the following conference series:

Chinese Conference on Pattern Recognition

Abstract

To benefit both from the maximum-margin nature of support vector machines (SVMs) and from the ability of conditional random fields (CRFs) to model correlations between neighboring features, this paper uses a two-stage SVM/CRF sequence classifier to detect prosodic events from acoustic prosodic cues. In the first stage, SVMs are trained to predict the label of each sequence element; in the second stage, a CRF is trained to predict the output label sequence, taking the labels output by the previously trained SVMs as its input. In our experiments on the Boston University Radio News Corpus, the training and test data are selected so that they have distinct transcriptions. The results show that the two-stage classifier achieves competitive performance and that the second-stage CRF improves the performance of the overall classifier.
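
The two-stage scheme described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names (`train_linear_svm`, `train_crf`, `two_stage_predict`, and so on) are hypothetical, labels are binary (1 = prosodic event), stage 1 is a linear SVM fitted by plain batch subgradient descent on the hinge loss, and stage 2 is a linear-chain CRF whose only inputs are the stage-1 labels. To keep the code short, the CRF normalises by enumerating every label sequence, which is feasible only for toy-length sequences; practical toolkits use forward-backward and Viterbi instead. Reusing SVM predictions on the training utterances to train the CRF is a further simplification made here.

```python
import math
from itertools import product

LABELS = (0, 1)  # assumption: 0 = no prosodic event, 1 = prosodic event


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Stage 1: linear SVM fitted by batch subgradient descent on the
    L2-regularised hinge loss. X: feature vectors, y: labels in {0, 1}."""
    signs = [1.0 if label == 1 else -1.0 for label in y]
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for x, s in zip(X, signs):
            if s * (sum(wj * xj for wj, xj in zip(w, x)) + b) < 1.0:
                # Margin violated: the hinge term contributes -s * x.
                grad_w = [g - s * xj / len(X) for g, xj in zip(grad_w, x)]
                grad_b -= s / len(X)
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_predict(model, X):
    w, b = model
    return [1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0 for x in X]


def feature_counts(obs, labels):
    """Count emission (label, stage-1 label) and transition (prev, cur) features."""
    counts = {}
    for t, (o, lab) in enumerate(zip(obs, labels)):
        keys = [("emit", lab, o)]
        if t > 0:
            keys.append(("trans", labels[t - 1], lab))
        for k in keys:
            counts[k] = counts.get(k, 0.0) + 1.0
    return counts


def sequence_distribution(w, obs):
    """Exact p(labels | obs), enumerating all label sequences (toy sizes only)."""
    cands = list(product(LABELS, repeat=len(obs)))
    feats = [feature_counts(obs, c) for c in cands]
    scores = [sum(w.get(k, 0.0) * v for k, v in f.items()) for f in feats]
    top = max(scores)
    unnorm = [math.exp(s - top) for s in scores]
    z = sum(unnorm)
    return cands, feats, [u / z for u in unnorm]


def log_likelihood(w, data):
    """Conditional log-likelihood of the gold label sequences."""
    total = 0.0
    for obs, gold in data:
        cands, _, probs = sequence_distribution(w, obs)
        total += math.log(probs[cands.index(tuple(gold))])
    return total


def train_crf(data, epochs=100, lr=0.02):
    """Stage 2: gradient ascent on the conditional log-likelihood.
    data: list of (stage-1 label sequence, gold label sequence) pairs."""
    w = {}
    for _ in range(epochs):
        grad = {}
        for obs, gold in data:
            for k, v in feature_counts(obs, gold).items():
                grad[k] = grad.get(k, 0.0) + v          # observed counts
            _, feats, probs = sequence_distribution(w, obs)
            for f, p in zip(feats, probs):
                for k, v in f.items():
                    grad[k] = grad.get(k, 0.0) - p * v  # expected counts
        for k, g in grad.items():
            w[k] = w.get(k, 0.0) + lr * g
    return w


def crf_decode(w, obs):
    """Most probable label sequence given the stage-1 labels."""
    cands, _, probs = sequence_distribution(w, obs)
    return list(cands[probs.index(max(probs))])


def train_two_stage(utterances):
    """utterances: list of (feature vectors, gold labels), one pair per utterance."""
    X = [x for feats, _ in utterances for x in feats]
    y = [lab for _, labs in utterances for lab in labs]
    svm = train_linear_svm(X, y)
    crf = train_crf([(svm_predict(svm, feats), list(labs))
                     for feats, labs in utterances])
    return svm, crf


def two_stage_predict(svm, crf, feats):
    """Label one utterance: per-element SVM decisions, then CRF relabelling."""
    return crf_decode(crf, svm_predict(svm, feats))
```

Calling `train_two_stage` on a list of `(features, labels)` utterances returns both models, and `two_stage_predict(svm, crf, features)` labels a new utterance. Because the CRF weighs each stage-1 decision against its neighbors through the transition weights, it can overrule an isolated SVM decision that disagrees with its context.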

Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Mingming Chen, Wenju Liu, Zhanlei Yang & Pengfei Hu

Editor information

Editors and Affiliations

Institute of Automation, National Laboratory of Pattern Recognition, Chinese Academy of Sciences, No.95, Zhongguancun East Road, 100190, Beijing, China
Cheng-Lin Liu
Department of Automation, Tsinghua University, Haidian District, 100084, Beijing, China
Changshui Zhang
Institute of Automation, National Laboratory of Pattern Recognition, Chinese Academy of Sciences, 100190, Beijing, China
Liang Wang

About this paper

Cite this paper

Chen, M., Liu, W., Yang, Z., Hu, P. (2012). Automatic Prosodic Events Detection Using a Two-Stage SVM/CRF Sequence Classifier with Acoustic Features. In: Liu, CL., Zhang, C., Wang, L. (eds) Pattern Recognition. CCPR 2012. Communications in Computer and Information Science, vol 321. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33506-8_70

DOI: https://doi.org/10.1007/978-3-642-33506-8_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33505-1
Online ISBN: 978-3-642-33506-8
eBook Packages: Computer Science, Computer Science (R0)
