Abstract
Previous research has indicated that syntactic information can improve the performance of automatic prosodic boundary prediction. However, few studies have examined in detail how useful various syntactic features are for predicting different phrase boundaries, and verification on a large-scale corpus is especially scarce. This paper investigates the effect of different syntactic feature combinations on phrase boundary prediction using a large-scale Mandarin corpus. Both syntactic phrase structure and dependency relations are introduced and compared in the experiments. All features are also evaluated over different time spans, and feature importance is analyzed. The experimental results show that the prediction models for prosodic words and prosodic phrases achieve their best performance with both syntactic phrase and dependency features, while the models with dependency features outperform the other models when predicting intonational phrases. Furthermore, intonational phrase prediction is clearly improved when syntactic global features are taken into account. The experiments also show that syntactic local features are more closely related to the lower-level prosodic boundaries, namely prosodic phrases and prosodic words.
Acknowledgments
The authors thank the anonymous reviewers for their valuable comments and corrections on an earlier version of the manuscript, which significantly improved the quality of this article.
This work is supported jointly by the National Natural Science Foundation of China (NSFC) (No. 61273288, No. 61233009, No. 61203258, No. 61305003, No. 61332017, and No. 61375027) and the Major Program for the National Social Science Fund of China (13&ZD189).
Additional information
Hao Che and Ya Li contributed equally to this work.
About this article
Cite this article
Che, H., Li, Y., Tao, J. et al. Investigating Effect of Rich Syntactic Features on Mandarin Prosodic Boundaries Prediction. J Sign Process Syst 82, 263–271 (2016). https://doi.org/10.1007/s11265-015-1013-5
DOI: https://doi.org/10.1007/s11265-015-1013-5