HiSAT: Hierarchical Framework for Sentiment Analysis on Twitter Data

  • Conference paper
Intelligent Systems and Applications (IntelliSys 2022)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 542)

Abstract

Social media websites such as Twitter have become so indispensable that people use them almost daily to share their emotions, opinions, suggestions and thoughts. Motivated by such behavioral tendencies, this study defines an approach to automatically classify tweets into two main classes, namely hate speech and non-hate speech. Such classification provides a valuable source of information for analyzing and understanding target audiences and spotting marketing trends. We thus propose HiSAT, a Hierarchical framework for Sentiment Analysis on Twitter data. Sentiments and opinions in tweets are highly unstructured and do not follow a well-defined sequence. They constitute heterogeneous data from many sources in different formats, and express positive, negative or neutral sentiment. Hence, in HiSAT we conduct Natural Language Processing encompassing tokenization, stemming and lemmatization techniques that convert text to tokens, as well as Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) techniques that convert text sentences into numeric vectors. These vectors are then fed as inputs to machine learning algorithms within the HiSAT framework; more specifically, Random Forest, Logistic Regression and Naïve Bayes are used as binary text classifiers to detect hate speech and non-hate speech in the tweets. Experiments with the HiSAT framework show that Random Forest outperforms the other classifiers in predicting the correct labels (with accuracy above 95%). We present the HiSAT approach, its implementation and experiments, along with related work and ongoing research.
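The vectorization and classification steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the toy tweets, the label convention (1 = hate speech, 0 = non-hate speech), the simple regex tokenizer (no stemming or lemmatization), the `log(N / df)` TF-IDF variant, and all function names are assumptions made for illustration. Of the three classifiers mentioned, only a multinomial Naïve Bayes with add-one smoothing is sketched here.

```python
import math
import re

def tokenize(text):
    """Lowercase and split into word tokens (assumed tokenizer; no stemming)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_vocab(token_lists):
    """Map every distinct token in the corpus to a column index."""
    vocab = sorted({tok for doc in token_lists for tok in doc})
    return {tok: i for i, tok in enumerate(vocab)}

def bow_vector(tokens, index):
    """Bag-of-Words: count occurrences of each vocabulary term; skip unknown tokens."""
    vec = [0] * len(index)
    for tok in tokens:
        if tok in index:
            vec[index[tok]] += 1
    return vec

def tf_idf(count_vectors):
    """Weight raw counts by idf = log(N / df), one common TF-IDF variant."""
    n_docs, n_terms = len(count_vectors), len(count_vectors[0])
    df = [sum(1 for v in count_vectors if v[j] > 0) for j in range(n_terms)]
    return [[v[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
            for v in count_vectors]

def train_naive_bayes(count_vectors, labels):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    n_terms = len(count_vectors[0])
    model = {}
    for c in sorted(set(labels)):
        rows = [v for v, y in zip(count_vectors, labels) if y == c]
        totals = [sum(r[j] for r in rows) for j in range(n_terms)]
        denom = sum(totals) + n_terms
        log_prior = math.log(len(rows) / len(labels))
        log_lik = [math.log((t + 1) / denom) for t in totals]
        model[c] = (log_prior, log_lik)
    return model

def predict(model, vec):
    """Return the class with the highest log-posterior score."""
    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(x * ll for x, ll in zip(vec, log_lik))
    return max(model, key=score)

# Hypothetical toy corpus; 1 = hate speech, 0 = non-hate speech (assumed labels).
tweets = [
    "I hate those people they are awful",
    "what a lovely sunny day",
    "those people are awful and stupid",
    "lovely day with lovely friends",
]
labels = [1, 0, 1, 0]

token_lists = [tokenize(t) for t in tweets]
index = build_vocab(token_lists)
counts = [bow_vector(toks, index) for toks in token_lists]
weights = tf_idf(counts)  # alternative numeric representation of the same tweets
model = train_naive_bayes(counts, labels)

for text in ["those awful people", "sunny lovely day"]:
    print(text, "->", predict(model, bow_vector(tokenize(text), index)))
```

Here the classifier is trained on raw counts, which suits the multinomial model; the TF-IDF weights are computed only to show the second representation. In practice a library such as scikit-learn would typically supply the vectorizers and all three classifiers.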

Acknowledgments and Disclaimer

Dr. Jiayin Wang and Dr. Aparna Varde acknowledge a grant from the US National Science Foundation, NSF MRI: Acquisition of a High-Performance GPU Cluster for Research and Education, Award Number 2018575. Dr. Aparna Varde is a visiting researcher at the Max Planck Institute for Informatics, Saarbrücken, Germany, in the research group of Dr. Gerhard Weikum, during the academic year 2021–2022, including a sabbatical visit. The authors acknowledge the CSAM Dean’s Office Travel Grant from Montclair State University, which supported attending this conference. The authors would like to make the disclaimer that the opinions expressed, analyzed and presented in this work are obtained from knowledge discovery by mining the concerned data only. These do not reflect the personal or professional views of the authors.

Author information

Corresponding author

Correspondence to Aparna S. Varde.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kommu, A., Patel, S., Derosa, S., Wang, J., Varde, A.S. (2023). HiSAT: Hierarchical Framework for Sentiment Analysis on Twitter Data. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2022. Lecture Notes in Networks and Systems, vol 542. Springer, Cham. https://doi.org/10.1007/978-3-031-16072-1_28
