Abstract
In recent years, social networking platforms have become a new medium for news sharing and information diffusion, and a part of daily life. As a result, social media advertising budgets have expanded rapidly worldwide over the past few years. Because of the large commercial interest, clickbait is commonly observed: attractive headlines and sensationalized text that bait users into visiting websites. Clickbait mainly exploits the users' curiosity gap, using intriguing headlines to entice readers to click an accompanying link to articles that often have poor content. Clickbait is bothersome to both social media users and platform owners. In this paper, we propose an approach called Ontology-based LSTM Model (OLSTM) to detect clickbait. Compared with existing solutions for clickbait detection, our approach is characterized by three components: a word embedding model, Recurrent Neural Networks (RNN), and word ontology information. Our observation is that preserving semantic relationships is an important factor in detecting clickbait. Therefore, we propose to capture semantic relationships between words with word embedding models. In addition, we adopt an RNN as our classification model so that the order of words in a sentence is taken into account. Furthermore, we use word ontology relations as another feature set for clickbait classification, as clickbait often uses words with generalized concepts to induce curiosity. We conduct experiments with real data from Twitter and news websites to validate the effectiveness of the proposed approach. The results demonstrate that the proposed method improves clickbait detection accuracy from 80% to 90% compared with existing solutions.
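The three components named in the abstract can be pictured with a minimal sketch. It is not the authors' OLSTM implementation: the abstract does not give the architecture, so everything below is an assumption made for illustration. The embeddings and weights are random and untrained, `GENERAL_CONCEPTS` is a hypothetical word list standing in for a real ontology, and the names `lstm_encode`, `ontology_feature`, and `clickbait_score` are invented here. The sketch only shows how an order-sensitive LSTM encoding of word embeddings might be concatenated with an ontology-derived feature before a final sigmoid score.

```python
import math
import random

random.seed(0)

EMB_DIM, HID_DIM = 4, 3

# Toy stand-ins (assumptions): a real system would load pretrained embeddings
# and derive concept generality from an ontology or taxonomy.
VOCAB = ["you", "won't", "believe", "this", "thing", "senate", "passes", "budget"]
EMBED = {w: [random.uniform(-1, 1) for _ in range(EMB_DIM)] for w in VOCAB}
GENERAL_CONCEPTS = {"this", "thing", "something", "someone"}  # hypothetical list


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


# One untrained weight matrix per LSTM gate (input, forget, output, candidate),
# each applied to the concatenation [x_t ; h_{t-1}]. Biases are omitted.
GATES = {g: rand_matrix(HID_DIM, EMB_DIM + HID_DIM) for g in "ifoc"}
OUT_W = [random.uniform(-0.5, 0.5) for _ in range(HID_DIM + 1)]


def lstm_encode(tokens):
    """Run an LSTM cell over the tokens in order; return the last hidden state."""
    h = [0.0] * HID_DIM
    c = [0.0] * HID_DIM
    for tok in tokens:
        x = EMBED.get(tok, [0.0] * EMB_DIM)  # unknown words map to zeros
        z = x + h  # list concatenation: [x_t ; h_{t-1}]
        i = [sigmoid(v) for v in matvec(GATES["i"], z)]
        f = [sigmoid(v) for v in matvec(GATES["f"], z)]
        o = [sigmoid(v) for v in matvec(GATES["o"], z)]
        g = [math.tanh(v) for v in matvec(GATES["c"], z)]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


def ontology_feature(tokens):
    """Fraction of tokens that name a generalized concept."""
    if not tokens:
        return 0.0
    return sum(t in GENERAL_CONCEPTS for t in tokens) / len(tokens)


def clickbait_score(headline):
    """Combine the sequence encoding with the ontology feature into a score in (0, 1)."""
    tokens = headline.lower().split()
    features = lstm_encode(tokens) + [ontology_feature(tokens)]
    return sigmoid(sum(w * f for w, f in zip(OUT_W, features)))
```

Calling `clickbait_score("You won't believe this thing")` returns a value between 0 and 1, but because the weights are untrained the number carries no meaning. In a real setting the gate and output weights would be learned from labeled headlines, and the ontology feature would come from actual hypernym relations instead of a fixed word set.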
Acknowledgement
This research was supported by the Ministry of Science and Technology, Taiwan, R.O.C., under grant number 106-2221-E-005-082-, and was also partially supported by Project H367B83300, conducted by ITRI under the sponsorship of the Ministry of Economic Affairs, Taiwan, R.O.C.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Vorakitphan, V., Leu, FY., Fan, YC. (2019). Clickbait Detection Based on Word Embedding Models. In: Barolli, L., Xhafa, F., Javaid, N., Enokido, T. (eds) Innovative Mobile and Internet Services in Ubiquitous Computing. IMIS 2018. Advances in Intelligent Systems and Computing, vol 773. Springer, Cham. https://doi.org/10.1007/978-3-319-93554-6_54
DOI: https://doi.org/10.1007/978-3-319-93554-6_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93553-9
Online ISBN: 978-3-319-93554-6
eBook Packages: Intelligent Technologies and Robotics (R0)