Automatic Prosodic Break Detection and Feature Analysis

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Automatic prosodic break detection and annotation are important for both speech understanding and natural speech synthesis. In this paper, we discuss automatic prosodic break detection and feature analysis. The paper makes two contributions. First, we use a classifier combination method to detect Mandarin and English prosodic breaks using acoustic, lexical, and syntactic evidence. The proposed method outperforms both the baseline system and results reported by other researchers on the Mandarin prosodic annotation corpus (the Annotated Speech Corpus of Chinese Discourse) and the English prosodic annotation corpus (the Boston University Radio News Corpus). Second, we analyze features for prosodic break detection: the roles of different features, such as duration, pitch, energy, and intensity, are analyzed and compared in Mandarin and English prosodic break detection. Based on the feature analysis, we also verify some linguistic conclusions.
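
The abstract does not say how the classifiers are combined, so the following is only a minimal sketch of one common combination scheme, not the authors' system. It assumes three hypothetical per-stream classifiers (acoustic, lexical, syntactic) that each output a break probability for a word boundary, and fuses them with a weighted average. The function names, weights, threshold, and scores are all invented for illustration.

```python
from typing import Dict


def combine_break_scores(scores: Dict[str, float],
                         weights: Dict[str, float]) -> float:
    """Return the weighted average of per-stream break probabilities.

    `scores` maps a stream name to that stream's break probability;
    `weights` maps the same names to non-negative weights (hypothetical).
    """
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * p for name, p in scores.items()) / total


def is_break(scores: Dict[str, float],
             weights: Dict[str, float],
             threshold: float = 0.5) -> bool:
    """Label a boundary as a prosodic break if the fused score reaches the threshold."""
    return combine_break_scores(scores, weights) >= threshold


# Illustrative weights and scores only; none of these values come from the paper.
WEIGHTS = {"acoustic": 0.5, "lexical": 0.25, "syntactic": 0.25}
boundary = {"acoustic": 0.9, "lexical": 0.6, "syntactic": 0.7}
non_boundary = {"acoustic": 0.2, "lexical": 0.4, "syntactic": 0.3}

print(is_break(boundary, WEIGHTS))
print(is_break(non_boundary, WEIGHTS))
```

In practice the weights would be tuned on held-out data, and other fusion schemes (voting, stacking, boosting) fit the same interface.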

Author information

Corresponding author

Correspondence to Chong-Jia Ni.

Additional information

Supported by the National Natural Science Foundation of China under Grant Nos. 90820303 and 90820011, and by the Natural Science Foundation of Shandong Province of China under Grant No. ZR2011FQ024.

About this article

Cite this article

Ni, CJ., Zhang, AY., Liu, WJ. et al. Automatic Prosodic Break Detection and Feature Analysis. J. Comput. Sci. Technol. 27, 1184–1196 (2012). https://doi.org/10.1007/s11390-012-1295-z

  • DOI: https://doi.org/10.1007/s11390-012-1295-z
