Fusing Phonetic Features and Chinese Character Representation for Sentiment Analysis

Peng, Haiyun; Poria, Soujanya; Li, Yang; Cambria, Erik

doi:10.1007/978-3-031-24340-0_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13452)

Included in the following conference series:

International Conference on Computational Linguistics and Intelligent Text Processing

Abstract

The Chinese pronunciation system offers two characteristics that distinguish it from other languages: deep phonemic orthography and intonation variations. We are the first to argue that these two important properties can play a major role in Chinese sentiment analysis. Hence, we learn phonetic features of Chinese characters and fuse them with their textual and visual features in order to mimic the way humans read and understand Chinese text. Experimental results on five different Chinese sentiment analysis datasets show that the inclusion of phonetic features significantly and consistently improves the performance of textual and visual representations.
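The fusion idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' method: the toy embedding tables, the function names, the one-hot tone encoding, and the choice of plain concatenation as the fusion step. Visual features are left out for brevity, and the neutral tone is mapped to an all-zero vector, in line with note 1.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lookup tables; a real system would learn these embeddings.
TEXT_EMB: Dict[str, List[float]] = {"好": [0.2, 0.7], "坏": [-0.6, 0.1]}
PINYIN_EMB: Dict[str, List[float]] = {"hao": [0.5, 0.1], "huai": [-0.3, 0.4]}


def split_tone(pinyin: str) -> Tuple[str, int]:
    """Split tone-numbered pinyin such as 'hao3' into ('hao', 3)."""
    if pinyin and pinyin[-1].isdigit():
        return pinyin[:-1], int(pinyin[-1])
    return pinyin, 0  # no trailing digit: treat as neutral tone


def tone_one_hot(tone: int) -> List[float]:
    """One-hot vector over tones 1-4; any other value gives all zeros."""
    return [1.0 if tone == t else 0.0 for t in (1, 2, 3, 4)]


def fuse(char: str, pinyin: str) -> List[float]:
    """Concatenate textual, phonetic, and tone features for one character."""
    syllable, tone = split_tone(pinyin)
    return TEXT_EMB[char] + PINYIN_EMB[syllable] + tone_one_hot(tone)


if __name__ == "__main__":
    print(fuse("好", "hao3"))
```

The tone-numbered strings are assumed to come from a pinyin converter such as the python-pinyin package listed in the notes. Concatenation is only one possible fusion strategy.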

Notes

1. Neutral tone, in addition to the four variations, is neglected for the moment, due to its lack of connection with sentiment.
2. https://github.com/fxsjy/jieba.
3. https://en.wikipedia.org/wiki/Phonemic_orthography.
4. https://chinese.yabla.com/.
5. https://github.com/mozillazg/python-pinyin.
6. Both the datasets and codes in this paper are available for public download upon acceptance.
7. https://github.com/mozillazg/python-pinyin.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
Haiyun Peng, Soujanya Poria, Yang Li & Erik Cambria

Corresponding author

Correspondence to Erik Cambria.

Editor information

Editors and Affiliations

Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Peng, H., Poria, S., Li, Y., Cambria, E. (2023). Fusing Phonetic Features and Chinese Character Representation for Sentiment Analysis. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_12

DOI: https://doi.org/10.1007/978-3-031-24340-0_12
Published: 26 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24339-4
Online ISBN: 978-3-031-24340-0
eBook Packages: Computer Science, Computer Science (R0)
