
Quantifying controversy from stance, sentiment, offensiveness and sarcasm: a fine-grained controversy intensity measurement framework on a Chinese dataset

World Wide Web

Abstract

Measuring controversy on social media plays an important part in understanding public opinion. Topics are frequently and hotly debated on social media platforms such as Twitter and Sina Weibo. People sometimes use offensive or sarcastic language to convey their opinions about a topic or a source post, which can spark heated discussion and controversy around related topics. Recent research treats controversy detection as a binary classification problem with two labels: controversial or non-controversial. One possible reason is the lack of a comprehensive understanding of why controversy arises and of a concrete picture of how the results will be used in downstream tasks. However, we believe that the degree of controversy provoked by posts or topics in real scenarios varies, and that fine-grained measurement of controversy will benefit public sentiment identification, influence assessment and other social network analysis tasks. We also observe that existing controversy detection benchmarks are not suitable for fine-grained, topic-level controversy measurement. In this paper, we present ProsCons, a large-scale, comprehensive Chinese dataset that includes 245 topics and 32,667 posts with pro, con or neutral stances. Based on it, we design a framework for measuring the controversy intensity that topics spark. The framework considers the degree of antagonism in terms of stance and sentiment, as well as the degree of irrationality (offensiveness or sarcasm) of a post, to compute a controversy intensity. ProsCons provides a new benchmark for Chinese stance detection, offensive language detection and sarcasm detection, and supports multi-task learning across them. We conduct extensive experiments on ProsCons and provide baselines for these tasks. The experimental results highlight the challenges these tasks pose on ProsCons.
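
To make the idea of a topic-level intensity score concrete, the following is a minimal illustrative sketch in Python. It is not the formula used in the paper, which this abstract does not state. The `Post` fields, the three component functions, the equal weights and the linear combination are all assumptions chosen for illustration: stance antagonism is approximated by how evenly pro and con posts are balanced, sentiment antagonism by the spread of sentiment scores, and irrationality by the share of posts flagged as offensive or sarcastic.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import List


@dataclass
class Post:
    """Hypothetical per-post annotation; field names are assumptions."""
    stance: str        # "pro", "con" or "neutral"
    sentiment: float   # assumed to lie in [-1.0, 1.0]
    offensive: bool
    sarcastic: bool


def stance_antagonism(posts: List[Post]) -> float:
    """Balance of pro vs. con posts: 1.0 when evenly split, 0.0 when one-sided."""
    pro = sum(p.stance == "pro" for p in posts)
    con = sum(p.stance == "con" for p in posts)
    if pro + con == 0:
        return 0.0
    return 2 * min(pro, con) / (pro + con)


def sentiment_antagonism(posts: List[Post]) -> float:
    """Population standard deviation of sentiment; at most 1.0 for scores in [-1, 1]."""
    if not posts:
        return 0.0
    return pstdev(p.sentiment for p in posts)


def irrationality(posts: List[Post]) -> float:
    """Share of posts flagged as offensive or sarcastic."""
    if not posts:
        return 0.0
    return sum(p.offensive or p.sarcastic for p in posts) / len(posts)


def controversy_intensity(posts: List[Post],
                          weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Assumed linear combination of the three components (weights are illustrative)."""
    w_stance, w_sent, w_irr = weights
    return (w_stance * stance_antagonism(posts)
            + w_sent * sentiment_antagonism(posts)
            + w_irr * irrationality(posts))


if __name__ == "__main__":
    topic = [
        Post("pro", 1.0, False, False),
        Post("pro", 1.0, True, False),
        Post("con", -1.0, False, False),
        Post("con", -1.0, False, False),
    ]
    print(round(controversy_intensity(topic), 3))
```

Under these assumptions, a topic whose posts are all on one side with identical sentiment and no flagged language scores 0, while an evenly split, strongly polarized topic scores close to 1. Any real use would need component definitions and weights validated against annotated data.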


Availability of data and materials

The ProsCons dataset is available on request from wanghaiyang19@nudt.edu.cn. X-stance can be accessed at http://doi.org/10.5281/zenodo.3831317. COVID-19-Stance can be accessed at https://github.com/kglandt/stance-detection-in-covid-19-tweets.

Notes

  1. Please request the ProsCons dataset at wanghaiyang19@nudt.edu.cn.

  2. https://github.com/hankcs/pyhanlp.

  3. https://ai.baidu.com/easydata/

  4. The analysis in this section is limited, and the heuristic inference is based on the ProsCons dataset only.

  5. https://huggingface.co/Helsinki-NLP.

  6. https://github.com/pysentimiento/pysentimiento.


Funding

This work was supported by the National Natural Science Foundation of China (62172428).

Author information

Contributions

Bin Zhou identified the need for this work and provided funding. Haiyang Wang was responsible for data collection, annotation and experiments, and completed the initial version of the manuscript. Ye Wang revised the manuscript and enriched the experiments. Xin Song, Xuechen Zhao and Xie Feng analysed the data and experimental results.

Corresponding author

Correspondence to Ye Wang.

Ethics declarations

Ethical Approval

Not applicable.

Competing Interests

No potential conflict of interest was reported by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, H., Wang, Y., Song, X. et al. Quantifying controversy from stance, sentiment, offensiveness and sarcasm: a fine-grained controversy intensity measurement framework on a Chinese dataset. World Wide Web 26, 3607–3632 (2023). https://doi.org/10.1007/s11280-023-01191-x


  • DOI: https://doi.org/10.1007/s11280-023-01191-x
