Abstract
Conversational emotion recognition has recently attracted significant attention in data mining and text mining. Most existing methods treat the utterances in a conversation merely as a temporal sequence, ignoring the fine-grained emotional clues implied in their compositional structure, such as non-negligible semantic transitions and tone enhancement. Consequently, these models can hardly capture accurate semantic features of each utterance, which leads to the accumulation of incorrect emotional features in the memory bank. To address this problem, we propose a novel framework, Tree-based Attention Networks with Transformer Pre-training (TANTP), which incorporates contextual representations and a recursive constituency-tree structure into the model architecture. Rather than modeling utterances only in temporal order, TANTP effectively captures the compositional emotion semantics of utterance features for the memory bank, revealing complex semantic transitions and emotional progression that conventional sequential methods struggle to expose. Experimental results on two public benchmark datasets demonstrate that TANTP achieves superior performance compared with other state-of-the-art models.
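The compositional, tree-based modeling the abstract describes can be illustrated with a minimal sketch of bottom-up composition over a binarized constituency tree. This is an illustrative assumption, not the paper's actual architecture: the composition function `compose`, the parameters `W` and `b`, and the random leaf vectors (stand-ins for pretrained Transformer contextual embeddings) are all hypothetical.

```python
import numpy as np

# Hypothetical sketch: each utterance is parsed into a binarized
# constituency tree; phrase representations are built bottom-up, so
# semantic transitions inside the utterance can influence the final
# utterance vector (unlike a purely sequential encoder).

DIM = 8
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(DIM, 2 * DIM))  # composes [left; right]
b = np.zeros(DIM)

def compose(left, right):
    """Merge two child representations into one phrase representation."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

def encode(tree, embed):
    """Bottom-up encoding of a binarized constituency tree.

    `tree` is either a token (str) or a (left, right) pair of subtrees;
    `embed` maps tokens to their (stand-in) contextual embeddings.
    """
    if isinstance(tree, str):
        return embed[tree]
    left, right = tree
    return compose(encode(left, embed), encode(right, embed))

# Toy utterance "(I (love it))" with random stand-in embeddings.
embed = {w: rng.normal(size=DIM) for w in ["I", "love", "it"]}
utterance_vec = encode(("I", ("love", "it")), embed)
print(utterance_vec.shape)  # (8,)
```

In a full model, the root vector of each utterance would then be written into the memory bank in place of a sequentially pooled representation.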
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Liu, H., Lin, H., Chen, G. (2021). TANTP: Conversational Emotion Recognition Using Tree-Based Attention Networks with Transformer Pre-training. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol. 12713. Springer, Cham. https://doi.org/10.1007/978-3-030-75765-6_58
Print ISBN: 978-3-030-75764-9
Online ISBN: 978-3-030-75765-6