Estimating Aggressiveness of Russian Texts by Means of Machine Learning

  • Conference paper
Speech and Computer (SPECOM 2019)

Abstract

This paper considers the emotional assessment of Russian-language texts using machine learning, taking aggression detection as an example. It summarizes related work, methods, models and datasets, describes current problems, and proposes a text processing pipeline and a software system for training neural networks on heterogeneous datasets. The experiments show that neural networks trained on annotated corpora in both Russian and English can determine whether a text item in Russian contains an aggressive message. The authors thoroughly compare different assessment methods, in particular corpus-based approaches, machine learning solutions and hybrid variants. The results can be used to estimate the probability that a text is aggressive, for example, to rank messages for subsequent manual verification. They also enable feasibility studies on detecting a particular type of emotion in a text using corpora in other languages. The paper highlights further research directions in which Python toolkits such as NLTK and Keras could be used to improve model performance.
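The ranking use case mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline or model: the scorer below is a toy logistic function over a hypothetical hand-written lexicon (`TOY_WEIGHTS`, `BIAS`), standing in for the probability a trained neural network would output. The function names and the 0.5 threshold are likewise assumptions made for illustration.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical toy lexicon and bias: a stand-in for a trained model,
# not weights or vocabulary taken from the paper.
TOY_WEIGHTS = {"hate": 2.0, "stupid": 1.5, "kill": 2.5, "thanks": -1.5}
BIAS = -1.0


def toy_aggression_probability(text: str) -> float:
    """Return a pseudo-probability in (0, 1) from a logistic score."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    z = BIAS + sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-z))


def rank_for_review(
    messages: List[str],
    score: Callable[[str], float] = toy_aggression_probability,
    threshold: float = 0.5,
) -> List[Tuple[float, str]]:
    """Keep messages scoring at or above the threshold, most aggressive first."""
    scored = [(score(m), m) for m in messages]
    flagged = [(p, m) for p, m in scored if p >= threshold]
    return sorted(flagged, key=lambda pm: pm[0], reverse=True)


if __name__ == "__main__":
    queue = rank_for_review([
        "Thanks for the help!",
        "I hate you, stupid!",
        "I will kill you",
        "See you tomorrow",
    ])
    for probability, message in queue:
        print(f"{probability:.2f}  {message}")
```

In practice the `score` argument would be replaced by a call to a trained classifier. The ranking and thresholding logic stays the same, so a moderator reviews the highest-probability messages first.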

Acknowledgment

This research is supported by the Russian Foundation for Basic Research (project No. 18-29-22061_MK).

Author information

Corresponding author

Correspondence to Dmitriy Levonevskiy.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Levonevskiy, D., Malov, D., Vatamaniuk, I. (2019). Estimating Aggressiveness of Russian Texts by Means of Machine Learning. In: Salah, A., Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2019. Lecture Notes in Computer Science, vol 11658. Springer, Cham. https://doi.org/10.1007/978-3-030-26061-3_28

  • DOI: https://doi.org/10.1007/978-3-030-26061-3_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26060-6

  • Online ISBN: 978-3-030-26061-3

  • eBook Packages: Computer Science, Computer Science (R0)
