Abstract
Cyberbullying has become a serious problem with the spread of personal computers, smartphones, and social networking services (SNS). In this paper, to detect cyberbullying on Twitter automatically, we construct a bullying expression dictionary that registers bullying words together with their bullying degrees. The registered words are those that appear in collected bullying-related tweets, and the bullying degree of each word is calculated using SO-PMI. We also construct models that automatically classify tweets as bullying or non-bullying by extracting multiple features, including those derived from the bullying expression dictionary, and combining them with multiple machine learning algorithms. We evaluate the classification performance of the constructed models. The experimental results show that the bullying expression dictionary contributes to cyberbullying detection for most of the machine learning algorithms and that the best model obtains an evaluation score of over 90%.
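To make the idea of a bullying degree concrete, the following is a minimal sketch of an SO-PMI style score, not the exact formulation used in this paper. It assumes tweet-level co-occurrence over pre-tokenized tweets, add-one smoothing, and tiny hypothetical English seed lists; the function name `so_pmi` and the toy data are illustrative assumptions only.

```python
import math
from typing import Sequence

def so_pmi(word: str,
           tweets: Sequence[Sequence[str]],
           bully_seeds: Sequence[str],
           neutral_seeds: Sequence[str],
           smoothing: float = 1.0) -> float:
    """Sum of PMI(word, bullying seed) minus sum of PMI(word, non-bullying seed).

    Co-occurrence means "appears in the same tweet" (an assumption of this sketch).
    """
    docs = [set(t) for t in tweets]
    n = len(docs)

    def count(*terms: str) -> int:
        # Number of tweets that contain every given term.
        return sum(1 for d in docs if all(t in d for t in terms))

    def pmi(a: str, b: str) -> float:
        # Smoothed probabilities avoid log(0) for unseen pairs.
        joint = (count(a, b) + smoothing) / (n + smoothing)
        p_a = (count(a) + smoothing) / (n + smoothing)
        p_b = (count(b) + smoothing) / (n + smoothing)
        return math.log2(joint / (p_a * p_b))

    return (sum(pmi(word, s) for s in bully_seeds)
            - sum(pmi(word, s) for s in neutral_seeds))

# Hypothetical toy corpus and seed words, for illustration only.
TWEETS = [
    ["you", "idiot", "loser"],
    ["idiot", "loser"],
    ["nice", "day", "friend"],
    ["friend", "thanks"],
    ["loser", "idiot", "ugly"],
]
BULLY_SEEDS = ["idiot"]
NEUTRAL_SEEDS = ["friend"]

loser_score = so_pmi("loser", TWEETS, BULLY_SEEDS, NEUTRAL_SEEDS)
thanks_score = so_pmi("thanks", TWEETS, BULLY_SEEDS, NEUTRAL_SEEDS)
```

In this toy corpus, a word that tends to share tweets with the bullying seed receives a positive score, and a word that tends to share tweets with the non-bullying seed receives a negative one. A dictionary built this way would store each word with its score, which could then serve as one feature among several for a classifier.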
Acknowledgments
This research was supported by JSPS 19K12230.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, J., Otomo, T., Li, L., Nakajima, S. (2021). Automatic Cyberbullying Detection on Twitter Using Bullying Expression Dictionary. In: Nguyen, N.T., Chittayasothorn, S., Niyato, D., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2021. Lecture Notes in Computer Science, vol 12672. Springer, Cham. https://doi.org/10.1007/978-3-030-73280-6_25
DOI: https://doi.org/10.1007/978-3-030-73280-6_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73279-0
Online ISBN: 978-3-030-73280-6
eBook Packages: Computer Science, Computer Science (R0)