HiSAT: Hierarchical Framework for Sentiment Analysis on Twitter Data

Kommu, Amrutha; Patel, Snehal; Derosa, Sebastian; Wang, Jiayin; Varde, Aparna S.

doi:10.1007/978-3-031-16072-1_28


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 542)

Included in the following conference series:

Proceedings of SAI Intelligent Systems Conference


Abstract

Social media websites such as Twitter have become so indispensable that people use them almost daily to share their emotions, opinions, suggestions and thoughts. Motivated by this behavior, this study defines an approach to automatically classify tweets into two main classes, namely, hate speech and non-hate speech. Such classification provides a valuable source of information for analyzing and understanding target audiences and spotting marketing trends. We thus propose HiSAT, a Hierarchical framework for Sentiment Analysis on Twitter data. Sentiments and opinions in tweets are highly unstructured and do not follow a well-defined sequence. They constitute heterogeneous data from many sources in different formats, and express positive, negative or neutral sentiment. Hence, in HiSAT we conduct Natural Language Processing encompassing tokenization, stemming and lemmatization techniques that convert text to tokens, as well as Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) techniques that convert text sentences into numeric vectors. These vectors are then fed as inputs to machine learning algorithms within the HiSAT framework; more specifically, Random Forest, Logistic Regression and Naïve Bayes are used as binary text classifiers to detect hate speech and non-hate speech in the tweets. Results of experiments performed with the HiSAT framework show that Random Forest outperforms the others in predicting the correct labels (with accuracy above 95%). We present the HiSAT approach, its implementation and experiments, along with related work and ongoing research.
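The pipeline described in the abstract (tokens, then BoW or TF-IDF vectors, then a binary classifier) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the toy tweets, the labels (1 = hate speech, 0 = non-hate speech), and all function and class names are hypothetical, stemming and lemmatization are omitted, and only Naïve Bayes is shown because it is the simplest of the three named classifiers to write without external libraries.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a tweet, drop @handles and URLs, and split into word tokens."""
    text = re.sub(r"(@\w+|https?://\S+)", " ", text.lower())
    return re.findall(r"[a-z']+", text)

def bag_of_words(tokens, vocab):
    """Bag-of-Words: raw term counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

def tf_idf(docs_tokens, vocab):
    """TF-IDF with smoothed IDF, log((1 + N) / (1 + df)) + 1 (one common variant)."""
    n = len(docs_tokens)
    df = {w: sum(1 for d in docs_tokens if w in d) for w in vocab}
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}
    vectors = []
    for d in docs_tokens:
        counts = Counter(d)
        total = len(d) or 1  # guard against empty documents
        vectors.append([counts[w] / total * idf[w] for w in vocab])
    return vectors

class MultinomialNB:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.log_prior, self.log_lik = {}, {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            totals = [sum(col) + 1 for col in zip(*rows)]  # add-one per term
            denom = sum(totals)
            self.log_lik[c] = [math.log(t / denom) for t in totals]
        return self

    def predict(self, X):
        def score(x, c):
            return self.log_prior[c] + sum(v * l for v, l in zip(x, self.log_lik[c]))
        return [max(self.classes, key=lambda c: score(x, c)) for x in X]

# Hypothetical toy data, invented for illustration only.
tweets = [
    "I hate those people, they are awful",
    "those awful people should be banned",
    "what a lovely sunny day",
    "lovely dinner with friends today",
]
labels = [1, 1, 0, 0]

docs = [tokenize(t) for t in tweets]
vocab = sorted({w for d in docs for w in d})
X_bow = [bag_of_words(d, vocab) for d in docs]
X_tfidf = tf_idf(docs, vocab)

model = MultinomialNB().fit(X_bow, labels)
new_tweets = ["such awful people", "a lovely day"]
predictions = model.predict([bag_of_words(tokenize(t), vocab) for t in new_tweets])
print(predictions)
```

Words unseen during training (such as "such" above) fall outside the vocabulary and are simply ignored by the count vector. A fuller experiment would hold out a test split and compare several classifiers on the same vectors.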



Acknowledgments and Disclaimer

Dr. Jiayin Wang and Dr. Aparna Varde acknowledge a grant from the US National Science Foundation, NSF MRI: Acquisition of a High-Performance GPU Cluster for Research and Education, Award Number 2018575. Dr. Aparna Varde is a visiting researcher at the Max Planck Institute for Informatics, Saarbrücken, Germany, in the research group of Dr. Gerhard Weikum, during the academic year 2021–2022, including a sabbatical visit. The authors acknowledge the CSAM Dean’s Office Travel Grant from Montclair State University to support attending this conference. The authors state that the opinions expressed, analyzed and presented in this work are obtained solely from knowledge discovery by mining the data concerned. They do not reflect the personal or professional views of the authors.

Author information

Authors and Affiliations

Montclair State University, Montclair, NJ, 07043, USA
Amrutha Kommu, Snehal Patel, Sebastian Derosa, Jiayin Wang & Aparna S. Varde


Corresponding author

Correspondence to Aparna S. Varde.

Editor information

Editors and Affiliations

Saga University, Saga, Japan
Kohei Arai


About this paper

Cite this paper

Kommu, A., Patel, S., Derosa, S., Wang, J., Varde, A.S. (2023). HiSAT: Hierarchical Framework for Sentiment Analysis on Twitter Data. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2022. Lecture Notes in Networks and Systems, vol 542. Springer, Cham. https://doi.org/10.1007/978-3-031-16072-1_28

DOI: https://doi.org/10.1007/978-3-031-16072-1_28
Published: 31 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16071-4
Online ISBN: 978-3-031-16072-1
eBook Packages: Intelligent Technologies and Robotics (R0)
