Shallow Reading with Deep Learning: Predicting Popularity of Online Content Using only Its Title

Stokowiec, Wojciech; Trzciński, Tomasz; Wołk, Krzysztof; Marasek, Krzysztof; Rokita, Przemysław

doi:10.1007/978-3-319-60438-1_14


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10352)

Included in the following conference series:

International Symposium on Methodologies for Intelligent Systems


Abstract

With the ever-decreasing attention span of contemporary Internet users, the title of online content (such as a news article or video) can be a major factor in determining its popularity. To take advantage of this phenomenon, we propose a new method based on a bidirectional Long Short-Term Memory (LSTM) neural network designed to predict the popularity of online content using only its title. We evaluate the proposed architecture on two distinct datasets of news articles and news videos distributed on social media, which together contain over 40,000 samples. On those datasets, our approach improves performance over traditional shallow approaches by a margin of 15%. Additionally, we show that using pre-trained word vectors in the embedding layer improves the results of LSTM models, especially when the training set is small. To our knowledge, this is the first attempt at predicting popularity using only textual information from the title.
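To make the data flow in the abstract concrete, here is a minimal sketch of a bidirectional LSTM that maps a title to a popularity score. It is not the authors' implementation. The class and method names (`TitleBiLSTM`, `predict_proba`), the binary popular/unpopular output, the whitespace tokenizer, and the choice to concatenate the final forward and backward hidden states are all assumptions made for illustration. The weights are random and untrained, so the output only shows the shape of the computation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_vector(n, rng, scale=0.1):
    return [rng.uniform(-scale, scale) for _ in range(n)]


class LSTMCell:
    """Standard LSTM cell with input, forget, and output gates."""
    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to [x; h_prev].
        self.W = {name: [rand_vector(input_dim + hidden_dim, rng)
                         for _ in range(hidden_dim)] for name in self.GATES}
        self.b = {name: [0.0] * hidden_dim for name in self.GATES}

    def step(self, x, h, c):
        z = x + h  # list concatenation: [x; h_prev]
        pre = {name: [a + b for a, b in zip(matvec(self.W[name], z), self.b[name])]
               for name in self.GATES}
        i = [sigmoid(v) for v in pre["input"]]
        f = [sigmoid(v) for v in pre["forget"]]
        o = [sigmoid(v) for v in pre["output"]]
        g = [math.tanh(v) for v in pre["candidate"]]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def run(self, xs):
        """Process a sequence of vectors and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in xs:
            h, c = self.step(x, h, c)
        return h


class TitleBiLSTM:
    """Hypothetical title-only popularity scorer (illustrative, untrained)."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=6, pretrained=None, seed=0):
        rng = random.Random(seed)
        self.vocab = {word: idx for idx, word in enumerate(vocab)}
        self.embed_dim = embed_dim
        # Embedding layer: copy a pre-trained vector when one is supplied,
        # otherwise fall back to small random values.
        self.embeddings = []
        for word in vocab:
            if pretrained and word in pretrained:
                vec = list(pretrained[word])
                if len(vec) != embed_dim:
                    raise ValueError("pre-trained vector has wrong dimension")
                self.embeddings.append(vec)
            else:
                self.embeddings.append(rand_vector(embed_dim, rng))
        self.fwd = LSTMCell(embed_dim, hidden_dim, rng)
        self.bwd = LSTMCell(embed_dim, hidden_dim, rng)
        self.out_w = rand_vector(2 * hidden_dim, rng)
        self.out_b = 0.0

    def predict_proba(self, title):
        tokens = [t for t in title.lower().split() if t in self.vocab]
        xs = [self.embeddings[self.vocab[t]] for t in tokens]
        if not xs:  # no known words: feed a single zero vector
            xs = [[0.0] * self.embed_dim]
        # Read the title left-to-right and right-to-left, then concatenate.
        h = self.fwd.run(xs) + self.bwd.run(xs[::-1])
        logit = sum(w * v for w, v in zip(self.out_w, h)) + self.out_b
        return sigmoid(logit)


vocab = ["cat", "video", "breaks", "records"]
model = TitleBiLSTM(vocab, pretrained={"cat": [0.5] * 8})
print(model.predict_proba("Cat video breaks records"))
```

In practice the `pretrained` dictionary would be filled from a published set of word vectors, and all weights would be learned from labeled titles. The sketch omits training entirely.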



Acknowledgment

The authors would like to thank NowThisMedia Inc. for enabling this research by providing access to data and hardware.

Author information

Authors and Affiliations

Polish-Japanese Academy of Information Technology, Warsaw, Poland
Wojciech Stokowiec, Krzysztof Wołk & Krzysztof Marasek
Warsaw University of Technology, Warsaw, Poland
Tomasz Trzciński & Przemysław Rokita
Tooploox, Wrocław, Poland
Wojciech Stokowiec & Tomasz Trzciński


Corresponding author

Correspondence to Wojciech Stokowiec.

Editor information

Editors and Affiliations

Warsaw University of Technology, Warsaw, Poland
Marzena Kryszkiewicz
University of Bari Aldo Moro, Bari, Italy
Annalisa Appice
Institute of Informatics, University of Warsaw, Warsaw, Poland
Dominik Ślęzak
Faculty of Electronics & Information, Warsaw University of Technology, Warsaw, Poland
Henryk Rybinski
Institute of Mathematics, Warsaw University, Warsaw, Poland
Andrzej Skowron
Department of Computer Science, University of North Carolina at Charlotte, North Carolina, USA
Zbigniew W. Raś


About this paper

Cite this paper

Stokowiec, W., Trzciński, T., Wołk, K., Marasek, K., Rokita, P. (2017). Shallow Reading with Deep Learning: Predicting Popularity of Online Content Using only Its Title. In: Kryszkiewicz, M., Appice, A., Ślęzak, D., Rybinski, H., Skowron, A., Raś, Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham. https://doi.org/10.1007/978-3-319-60438-1_14


DOI: https://doi.org/10.1007/978-3-319-60438-1_14
Published: 14 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60437-4
Online ISBN: 978-3-319-60438-1
eBook Packages: Computer Science, Computer Science (R0)
