
Differential Dataset Cartography: Explainable Artificial Intelligence in Comparative Personalized Sentiment Analysis

  • Conference paper
Computational Science – ICCS 2023 (ICCS 2023)

Abstract

Data Maps is a method for the graphical representation of datasets that makes it possible to observe a model’s behaviour on individual instances during the learning process (training dynamics). The method groups the elements of a dataset into easy-to-learn, ambiguous, and hard-to-learn. In this article, we present an extension of this method, Differential Data Maps, which makes it possible to visually compare different models trained on the same dataset or to analyse the effect of selected features on model behaviour. We show an example application of this visualization method to explain the differences between three personalized deep neural model architectures from the literature and the HumAnn model we developed. The advantage of the proposed HumAnn is that it requires no further training when a new user appears in the system, in contrast to known personalized methods that rely on user embeddings. All models were tested on the sentiment analysis task. Three datasets that differ in the type of human context were used: user-annotator, user-author, and user-author-annotator. Our results show that the new explainable AI method makes it possible to pose new hypotheses explaining differences in model performance, both at the level of features in the datasets and at the level of model architectures.
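
As a rough illustration of the idea described in the abstract, the sketch below computes the two coordinates commonly used in Dataset Cartography (confidence, the mean probability assigned to the gold label across epochs, and variability, its standard deviation) and then subtracts them between two models. Treating the differential map as a plain per-instance difference of these coordinates is an assumption made here for illustration, since the abstract does not give the exact formulation; the function names and the toy probabilities are hypothetical.

```python
import numpy as np


def data_map_coordinates(gold_probs):
    """Return (confidence, variability) for each instance.

    gold_probs has shape (epochs, instances): the probability the model
    assigned to the gold label of each instance after each epoch.
    """
    gold_probs = np.asarray(gold_probs, dtype=float)
    confidence = gold_probs.mean(axis=0)   # high and stable: easy-to-learn
    variability = gold_probs.std(axis=0)   # high: ambiguous
    return confidence, variability


def differential_data_map(gold_probs_a, gold_probs_b):
    """Per-instance change in the coordinates when moving from model A to B.

    Assumption: the differential map is the simple difference B minus A.
    """
    conf_a, var_a = data_map_coordinates(gold_probs_a)
    conf_b, var_b = data_map_coordinates(gold_probs_b)
    return conf_b - conf_a, var_b - var_a


# Toy training dynamics: 3 epochs (rows) x 3 instances (columns).
model_a = [[0.9, 0.2, 0.1],
           [0.9, 0.8, 0.1],
           [0.9, 0.2, 0.1]]
model_b = [[0.9, 0.6, 0.3],
           [0.9, 0.6, 0.5],
           [0.9, 0.6, 0.7]]

delta_confidence, delta_variability = differential_data_map(model_a, model_b)
print(delta_confidence)   # positive: model B is more confident on that instance
print(delta_variability)  # negative: model B is more stable on that instance
```

In this toy case the first instance does not move between the two maps, the second becomes more confident and more stable under model B, and the third moves from the low-confidence (hard-to-learn) region towards the ambiguous one. Plotting these differences per instance is one plausible way to obtain the kind of visual comparison the abstract describes.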

This work was financed by (1) the National Science Centre, Poland, 2021/41/B/ST6/04471 (PK); (2) the Polish Ministry of Education and Science, CLARIN-PL; (3) the European Regional Development Fund as a part of the 2014-2020 Smart Growth Operational Programme, POIR.01.01.01-00-0288/22 (JK-S140), POIR.01.01.01-00-0615/21 (JK-MHS), POIR.04.02.00-00C002/19 (JB, KK); (4) the statutory funds of the Department of Artificial Intelligence, Wroclaw University of Science and Technology; (5) the European Union under the Horizon Europe, grant no. 101086321 (OMINO).



Author information

Corresponding author

Correspondence to Jan Kocoń.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kocoń, J., Baran, J., Kanclerz, K., Kajstura, M., Kazienko, P. (2023). Differential Dataset Cartography: Explainable Artificial Intelligence in Comparative Personalized Sentiment Analysis. In: Mikyška, J., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14073. Springer, Cham. https://doi.org/10.1007/978-3-031-35995-8_11


  • DOI: https://doi.org/10.1007/978-3-031-35995-8_11


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-35994-1

  • Online ISBN: 978-3-031-35995-8

  • eBook Packages: Computer Science, Computer Science (R0)
