Abstract
Data Maps is a method for the graphical representation of datasets that shows how a model behaves on individual instances during training (training dynamics). It groups the elements of a dataset into easy-to-learn, ambiguous, and hard-to-learn instances. In this article, we present an extension of this method, Differential Data Maps, which makes it possible to visually compare different models trained on the same dataset or to analyse the effect of selected features on model behaviour. We apply this visualization method to explain the differences between three personalized deep neural model architectures from the literature and the HumAnn model we developed. The advantage of HumAnn is that it requires no further training when a new user joins the system, in contrast to known personalized methods that rely on user embeddings. All models were tested on the sentiment analysis task. We used three datasets that differ in the type of human context: user-annotator, user-author, and user-author-annotator. Our results show that the new explainable AI method makes it possible to pose new hypotheses explaining differences in model performance, both at the level of dataset features and at the level of model architectures.
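As a rough illustration of the idea, the sketch below computes the two coordinates usually associated with Data Maps for a single instance: confidence (the mean probability assigned to the true label across training epochs) and variability (the standard deviation of that probability). It then takes the difference of these coordinates between two models, which is one plausible reading of a differential map. The function names, the region thresholds, and the exact form of the difference are assumptions made for this example and are not taken from the chapter.

```python
from statistics import mean, pstdev


def data_map_coordinates(gold_probs):
    """Return (confidence, variability) for one instance.

    gold_probs: per-epoch probabilities that the model assigned
    to the true label of this instance.
    """
    return mean(gold_probs), pstdev(gold_probs)


def region(confidence, variability, conf_hi=0.75, conf_lo=0.25, var_hi=0.2):
    """Assign a Data Map region; all thresholds are illustrative assumptions."""
    if variability >= var_hi:
        return "ambiguous"
    if confidence >= conf_hi:
        return "easy-to-learn"
    if confidence <= conf_lo:
        return "hard-to-learn"
    return "unassigned"


def differential_coordinates(gold_probs_a, gold_probs_b):
    """Change in (confidence, variability) from model A to model B
    for the same instance (hypothetical formulation)."""
    conf_a, var_a = data_map_coordinates(gold_probs_a)
    conf_b, var_b = data_map_coordinates(gold_probs_b)
    return conf_b - conf_a, var_b - var_a


if __name__ == "__main__":
    baseline = [0.30, 0.35, 0.40, 0.45]      # made-up training dynamics
    personalized = [0.70, 0.80, 0.85, 0.90]  # made-up training dynamics
    print(region(*data_map_coordinates(baseline)))
    print(region(*data_map_coordinates(personalized)))
    print(differential_coordinates(baseline, personalized))
```

Under this reading, a positive confidence difference means the second model found the instance easier to learn than the first. Plotting these differences for every instance gives a picture of where two architectures diverge.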
This work was financed by (1) the National Science Centre, Poland, 2021/41/B/ST6/04471 (PK); (2) the Polish Ministry of Education and Science, CLARIN-PL; (3) the European Regional Development Fund as a part of the 2014-2020 Smart Growth Operational Programme, POIR.01.01.01-00-0288/22 (JK-S140), POIR.01.01.01-00-0615/21 (JK-MHS), POIR.04.02.00-00C002/19 (JB, KK); (4) the statutory funds of the Department of Artificial Intelligence, Wroclaw University of Science and Technology; (5) the European Union under the Horizon Europe, grant no. 101086321 (OMINO).
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kocoń, J., Baran, J., Kanclerz, K., Kajstura, M., Kazienko, P. (2023). Differential Dataset Cartography: Explainable Artificial Intelligence in Comparative Personalized Sentiment Analysis. In: Mikyška, J., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2023. ICCS 2023. Lecture Notes in Computer Science, vol 14073. Springer, Cham. https://doi.org/10.1007/978-3-031-35995-8_11
DOI: https://doi.org/10.1007/978-3-031-35995-8_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35994-1
Online ISBN: 978-3-031-35995-8
eBook Packages: Computer Science (R0)