Abstract
Dialogue systems built on neural networks trained on large-scale corpora have a wide range of practical applications today. However, using uncensored training corpora carries risks, such as potential social bias. Meanwhile, manually reviewing these corpora for socially biased content is costly, so it is necessary to design a recognition model that automatically detects social bias in dialogue systems. NLPCC 2022 Shared Task 7, Fine-Grained Dialogue Social Bias Measurement, aims to measure social bias in dialogue systems and provides a well-annotated Chinese social bias dialogue dataset, the CDAIL-BIAS DATASET. Building on this dataset, this paper proposes a strong classifier, CDAIL-BIAS MEASURER. Specifically, we adopt a model ensemble approach that combines five different pre-trained language models, and we use adversarial training and a regularization strategy to enhance the robustness of the model. Finally, labels are obtained by a novel label-based weighted voting method. The classifier achieves a macro F1 score of 0.580 on social bias measurement in dialogue systems, ranking third in the task and demonstrating the effectiveness and superiority of our model.
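The abstract does not spell out how the label-based weighted voting combines the five models' predictions. As an illustrative sketch only (function and weight names are hypothetical; it assumes each model emits a softmax distribution and that per-label weights are tuned on validation data), one hard-vote variant might look like:

```python
import numpy as np

def label_weighted_vote(prob_matrix, label_weights):
    """Combine per-model predictions with per-label weights (illustrative sketch).

    prob_matrix:   (n_models, n_labels) array of softmax outputs for one sample.
    label_weights: (n_labels,) array scaling the vote count of each label,
                   e.g. to boost rare bias categories.
    Returns the index of the winning label.
    """
    votes = np.zeros(prob_matrix.shape[1])
    for probs in prob_matrix:
        votes[np.argmax(probs)] += 1.0   # each model casts one hard vote
    weighted = votes * label_weights      # rescale votes label by label
    return int(np.argmax(weighted))

# Five hypothetical models scoring one dialogue over three labels:
probs = np.array([
    [0.60, 0.30, 0.10],
    [0.20, 0.70, 0.10],
    [0.55, 0.40, 0.05],
    [0.30, 0.60, 0.10],
    [0.50, 0.45, 0.05],
])
# Unweighted majority picks label 0 (3 votes vs. 2);
# up-weighting label 1 can flip the decision.
```

With uniform weights this reduces to plain majority voting; the per-label weights are what let the ensemble compensate for class imbalance in the bias categories.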
Acknowledgments
This work was supported by the Beijing Natural Science Foundation of China (4192057) and the Science Foundation of Beijing Language and Culture University (supported by "the Fundamental Research Funds for the Central Universities") (21YJ040005).
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Zhao, J., Zhu, S., Liu, Y., Liu, P. (2022). CDAIL-BIAS MEASURER: A Model Ensemble Approach for Dialogue Social Bias Measurement. In: Lu, W., Huang, S., Hong, Y., Zhou, X. (eds) Natural Language Processing and Chinese Computing. NLPCC 2022. Lecture Notes in Computer Science(), vol 13552. Springer, Cham. https://doi.org/10.1007/978-3-031-17189-5_17
DOI: https://doi.org/10.1007/978-3-031-17189-5_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17188-8
Online ISBN: 978-3-031-17189-5
eBook Packages: Computer Science; Computer Science (R0)