Abstract
Hierarchical prosody structure generation is a key component of a speech synthesis system. One major feature of the prosody of Mandarin Chinese speech flow is prosodic phrase grouping. In this paper we propose an approach for predicting Chinese prosodic phrase boundaries from a limited amount of labeled training examples and additional unlabeled data, using conditional random fields. Useful unlabeled examples are selected based on the labels assigned by the currently learned model and on its prediction probabilities, and are then used to improve learning. Experiments show that the approach improves overall performance, with gains in both precision and recall.
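The selection step described in the abstract can be sketched as a generic self-training loop. This is a minimal illustration, not the authors' implementation: the abstract does not state the exact selection criterion, so the probability threshold, the round limit, and every name here (`self_train`, `train_fn`, `threshold`, `max_rounds`) are assumptions. `train_fn` stands in for any sequence-model trainer, such as a CRF, that returns a predictor giving a label sequence and its probability.

```python
from typing import Callable, List, Sequence, Tuple

Sentence = List[str]              # tokens of one sentence
Labels = List[str]                # one boundary label per token, e.g. "B" or "N"
Example = Tuple[Sentence, Labels]
# A predictor returns the best label sequence and its probability under the model.
PredictFn = Callable[[Sentence], Tuple[Labels, float]]
# A trainer fits a model on labeled examples and returns its predictor.
TrainFn = Callable[[Sequence[Example]], PredictFn]


def self_train(
    labeled: Sequence[Example],
    unlabeled: Sequence[Sentence],
    train_fn: TrainFn,
    threshold: float = 0.9,   # assumed confidence cutoff, not from the paper
    max_rounds: int = 5,      # assumed stopping rule, not from the paper
) -> Tuple[PredictFn, List[Example], List[Sentence]]:
    """Grow the training set with confidently self-labeled sentences."""
    train_set = list(labeled)
    pool = list(unlabeled)
    predict = train_fn(train_set)

    for _ in range(max_rounds):
        confident: List[Example] = []
        rest: List[Sentence] = []
        for sentence in pool:
            labels, prob = predict(sentence)
            if prob >= threshold:
                # Keep the model's own labels as if they were gold labels.
                confident.append((sentence, labels))
            else:
                rest.append(sentence)

        if not confident:
            break  # nothing new to learn from; stop early

        train_set.extend(confident)
        pool = rest
        predict = train_fn(train_set)  # retrain on the enlarged set

    return predict, train_set, pool
```

A higher `threshold` adds fewer but cleaner self-labeled sentences, while a lower one adds more data at the risk of reinforcing the model's own errors.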
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhao, Z., Ma, X., Pei, W. (2011). Semi Supervised Learning for Prediction of Prosodic Phrase Boundaries in Chinese TTS Using Conditional Random Fields. In: Liu, D., Zhang, H., Polycarpou, M., Alippi, C., He, H. (eds) Advances in Neural Networks – ISNN 2011. ISNN 2011. Lecture Notes in Computer Science, vol 6676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21090-7_56
DOI: https://doi.org/10.1007/978-3-642-21090-7_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21089-1
Online ISBN: 978-3-642-21090-7
eBook Packages: Computer Science (R0)