Sarcasm Detection in Newspaper Headlines

Chilpuri, Vishnu Sai Reddy; Nadeem, Saaman; Mehmood, Tahir; Yaqoob, Muhammad

doi:10.1007/978-981-97-0293-0_18

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 191)

Included in the following conference series:

The International Conference on Data Science and Emerging Technologies

Abstract

Language is an essential medium for human communication: it allows us to convey information, express ideas, and give instructions to others. The rise of sarcasm can be attributed to the increasing number of negative comments and expressions posted on social networks such as Twitter and Facebook and published in newspapers. Because sarcastic comments use positive vocabulary, sarcasm is hard to detect in news reports, where it is used intentionally to grab readers’ attention. Unfortunately, many people find it hard to identify the ironic tone of such headlines and may pass on incorrect information. This work focuses on detecting sarcasm in newspaper headlines and investigates the performance of four machine learning algorithms (Logistic Regression, Naive Bayes, Decision Tree, and Random Forest) and one deep learning model, BiLSTM (Bi-directional Long Short-Term Memory). We demonstrate that, regardless of the machine learning model, the choice of vectorization technique, i.e. BoW (Bag of Words) or TF–IDF (Term Frequency–Inverse Document Frequency), has minimal influence on the ability to detect sarcasm in news headlines. We also show that the performance of three of the machine learning algorithms (Logistic Regression, Random Forest, and Decision Tree) remains stable across the two tokenization techniques (unigram and bigram); the exception is Naive Bayes, which achieved higher precision with unigram analysis. We further found that BiLSTM is the preferred model, among those evaluated, for sarcasm detection in news headlines.
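The two vectorization techniques and two tokenization schemes named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the three headlines are invented toy examples, the helper names (`ngrams`, `bag_of_words`, `tf_idf`) are hypothetical, and the smoothed IDF formula is only one common TF–IDF variant, assumed here for illustration.

```python
import math
from collections import Counter

# Invented toy headlines (not taken from the paper's dataset).
headlines = [
    "area man wins argument with himself",
    "city council approves new budget",
    "area man approves of himself",
]

def ngrams(text, n):
    """Lowercase, split on whitespace, and return the word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_words(docs, n=1):
    """Return (sorted vocabulary, one count vector per document)."""
    grams = [ngrams(doc, n) for doc in docs]
    vocab = sorted({g for doc in grams for g in doc})
    vectors = []
    for doc in grams:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

def tf_idf(docs, n=1):
    """Weight raw counts by a smoothed inverse document frequency.

    Assumed variant: idf = ln((1 + N) / (1 + df)) + 1.
    """
    vocab, counts = bag_of_words(docs, n)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1.0 for d in df]
    weighted = [[c * w for c, w in zip(row, idf)] for row in counts]
    return vocab, weighted

uni_vocab, uni_bow = bag_of_words(headlines, n=1)   # unigram BoW
bi_vocab, bi_bow = bag_of_words(headlines, n=2)     # bigram BoW
tfidf_vocab, tfidf_vecs = tf_idf(headlines, n=1)    # unigram TF-IDF
```

Each resulting vector could then be passed to any of the classifiers listed above. In this sketch, BoW and TF–IDF share the same vocabulary and differ only in how each term is weighted, so rarer terms receive larger weights under TF–IDF.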

Acknowledgements

The authors thank UNITAR International University for supporting the publication of this paper.

Author information

Authors and Affiliations

Department of Computer Science, University of Hertfordshire, Hatfield, UK
Vishnu Sai Reddy Chilpuri & Muhammad Yaqoob
Department of Computer Science, University of Management and Technology, Lahore, Pakistan
Saaman Nadeem
School of Information Technology, UNITAR International University, Selangor, Malaysia
Tahir Mehmood

Corresponding authors

Correspondence to Tahir Mehmood or Muhammad Yaqoob.

Editor information

Editors and Affiliations

UNITAR Graduate School, UNITAR International University, Petaling Jaya, Malaysia
Yap Bee Wah
Faculty of Engineering and Technology, Liverpool John Moores University, Liverpool, UK
Dhiya Al-Jumeily OBE
University of Tennessee, Knoxville, TN, USA
Michael W. Berry

About this paper

Cite this paper

Chilpuri, V.S.R., Nadeem, S., Mehmood, T., Yaqoob, M. (2024). Sarcasm Detection in Newspaper Headlines. In: Bee Wah, Y., Al-Jumeily OBE, D., Berry, M.W. (eds) Data Science and Emerging Technologies. DaSET 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 191. Springer, Singapore. https://doi.org/10.1007/978-981-97-0293-0_18

DOI: https://doi.org/10.1007/978-981-97-0293-0_18
Published: 27 April 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0292-3
Online ISBN: 978-981-97-0293-0
eBook Packages: Intelligent Technologies and Robotics (R0)
