Automatic Prosodic Events Detection Using a Two-Stage SVM/CRF Sequence Classifier with Acoustic Features

  • Conference paper
Pattern Recognition (CCPR 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 321)

Abstract

This paper uses a two-stage sequence classifier that combines support vector machines (SVMs) and conditional random fields (CRFs) to detect prosodic events from acoustic prosodic cues. The design aims to benefit both from the maximum-margin nature of SVMs and from the ability of CRFs to model correlations between neighboring features. In the first stage, SVMs are trained to predict the label of each sequence element; in the second stage, a CRF is trained to predict the output label sequence, taking the labels produced by the trained SVMs as its input. In our experiments on the Boston University Radio News Corpus, the training and test data are selected so that their transcriptions are distinct. The results show that the two-stage classifier achieves competitive performance and that the second-stage CRF improves the performance of the overall classifier.
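The second-stage idea, re-labeling a sequence from the first stage's per-element labels, can be illustrated with a minimal sketch. This is not the paper's implementation: no SVM or CRF is trained here. The first-stage labels are given as toy data, and a count-based linear-chain scorer with Viterbi decoding stands in for the CRF. The label names, the toy sequences, and the helper names `log_table`, `fit`, and `viterbi` are all assumptions made for illustration.

```python
import math
from collections import Counter

LABELS = ("accent", "none")  # hypothetical per-syllable label set


def log_table(counts, rows, cols, alpha=1.0):
    """Turn raw counts into add-alpha smoothed log-probabilities per row."""
    table = {}
    for r in rows:
        total = sum(counts[(r, c)] for c in cols) + alpha * len(cols)
        for c in cols:
            table[(r, c)] = math.log((counts[(r, c)] + alpha) / total)
    return table


def fit(gold_seqs, stage1_seqs, labels=LABELS):
    """Count start, transition and (gold label, first-stage label) statistics."""
    start, trans, emit = Counter(), Counter(), Counter()
    for gold, pred in zip(gold_seqs, stage1_seqs):
        start[(None, gold[0])] += 1
        for g, p in zip(gold, pred):
            emit[(g, p)] += 1
        for a, b in zip(gold, gold[1:]):
            trans[(a, b)] += 1
    return (log_table(start, [None], labels),
            log_table(trans, labels, labels),
            log_table(emit, labels, labels))


def viterbi(stage1, start, trans, emit, labels=LABELS):
    """Return the highest-scoring label sequence given first-stage labels."""
    score = {y: start[(None, y)] + emit[(y, stage1[0])] for y in labels}
    back = []
    for obs in stage1[1:]:
        new_score, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: score[p] + trans[(p, y)])
            new_score[y] = score[prev] + trans[(prev, y)] + emit[(y, obs)]
            pointers[y] = prev
        score = new_score
        back.append(pointers)
    path = [max(labels, key=lambda y: score[y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy data (invented): gold labels and the labels a first-stage classifier
# might have produced for the same syllables.
gold_seqs = [["none", "accent", "none", "none"],
             ["accent", "none", "accent", "none"]]
stage1_seqs = [["none", "accent", "accent", "none"],
               ["accent", "none", "accent", "none"]]

start, trans, emit = fit(gold_seqs, stage1_seqs)
decoded = viterbi(["accent", "accent", "none"], start, trans, emit)
print(decoded)
```

On this toy data the gold sequences never contain two adjacent accents, so the decoder relabels the first-stage output `accent, accent, none` as `none, accent, none`. A real second stage would instead train a CRF on the SVM outputs, as the abstract describes.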

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, M., Liu, W., Yang, Z., Hu, P. (2012). Automatic Prosodic Events Detection Using a Two-Stage SVM/CRF Sequence Classifier with Acoustic Features. In: Liu, CL., Zhang, C., Wang, L. (eds) Pattern Recognition. CCPR 2012. Communications in Computer and Information Science, vol 321. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33506-8_70

  • DOI: https://doi.org/10.1007/978-3-642-33506-8_70

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33505-1

  • Online ISBN: 978-3-642-33506-8

  • eBook Packages: Computer Science, Computer Science (R0)
