MultiLayerET: A Unified Representation of Entities and Topics Using Multilayer Graphs

Alshehri, Jumanah; Stanojevic, Marija; Khan, Parisa; Rapp, Benjamin; Dragut, Eduard; Obradovic, Zoran

doi:10.1007/978-3-031-26390-3_39

Conference paper
First Online: 17 March 2023

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13714)

Abstract

Many online news outlets, forums, and blogs provide a rich stream of publications and user comments. This rich body of data is a valuable source of information for researchers, journalists, and policymakers. However, the ever-increasing production and user engagement rate make it difficult to analyze this data without automated tools. This work presents MultiLayerET, a method to unify the representation of entities and topics in articles and comments. In MultiLayerET, articles’ content and associated comments are parsed into a multilayer graph consisting of heterogeneous nodes representing named entities and news topics. The nodes within this graph have attributed edges denoting weight, i.e., the strength of the connection between the two nodes, time, i.e., the co-occurrence contemporaneity of two nodes, and sentiment, i.e., the opinion (in aggregate) of an entity toward a topic. Such information helps in analyzing articles and their comments. We infer the edges connecting two nodes using information mined from the textual data. The multilayer representation gives an advantage over a single-layer representation since it integrates articles and comments via shared topics and entities, providing richer signal points about emerging events. MultiLayerET can be applied to different downstream tasks, such as detecting media bias and misinformation. To explore the efficacy of the proposed method, we apply MultiLayerET to a body of data gathered from six representative online news outlets. We show that with MultiLayerET, the classification F1 score of a media bias prediction model improves by 36%, and that of a state-of-the-art fake news detection model improves by 4%.
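The structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class and method names (`Edge`, `MultilayerGraph`, `add_document`, `shared_nodes`) are hypothetical, the edge weight is assumed here to be a simple co-occurrence count, and sentiment scores and timestamps are assumed to be supplied by the caller rather than mined from text. The sketch only shows how two layers (articles and comments) over entity and topic nodes can carry weight, time, and sentiment on their edges and be linked through shared nodes.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Edge:
    """Attributes of one edge; all three are illustrative assumptions."""
    weight: int = 0                                   # assumed: co-occurrence count
    times: list = field(default_factory=list)         # timestamps of co-occurrences
    sentiments: list = field(default_factory=list)    # caller-supplied scores

    @property
    def mean_sentiment(self) -> float:
        # Aggregate sentiment as a plain average (an assumption).
        if not self.sentiments:
            return 0.0
        return sum(self.sentiments) / len(self.sentiments)


class MultilayerGraph:
    LAYERS = ("articles", "comments")

    def __init__(self):
        # layer name -> {frozenset({u, v}) -> Edge}
        self.layers = {name: defaultdict(Edge) for name in self.LAYERS}

    def add_document(self, layer, nodes, timestamp, sentiment):
        """Connect every pair of nodes that co-occur in one document."""
        for u, v in combinations(sorted(set(nodes)), 2):
            edge = self.layers[layer][frozenset((u, v))]
            edge.weight += 1
            edge.times.append(timestamp)
            edge.sentiments.append(sentiment)

    def nodes(self, layer):
        return {n for pair in self.layers[layer] for n in pair}

    def shared_nodes(self):
        """Nodes present in both layers; these tie articles to comments."""
        return self.nodes("articles") & self.nodes("comments")


# Nodes are (kind, label) tuples so entities and topics stay distinguishable.
senate = ("entity", "Senate")
health = ("topic", "healthcare")
agency = ("entity", "FDA")

g = MultilayerGraph()
g.add_document("articles", [senate, health], "2021-03-01", 0.1)
g.add_document("comments", [senate, health, agency], "2021-03-02", -0.4)
g.add_document("comments", [senate, health], "2021-03-03", 0.2)

comment_edge = g.layers["comments"][frozenset((senate, health))]
print(comment_edge.weight, comment_edge.times, g.shared_nodes())
```

Keeping each layer as its own edge map means the same entity-topic pair can have different weight, timing, and sentiment in articles and in comments, which is the kind of contrast a downstream classifier could draw on.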

Notes

1. https://www.wikidata.org/
2. Full article: https://wapo.st/3yOMYdO
3. https://textblob.readthedocs.io/en/dev/
4. https://www.allsides.com/media-bias/media-bias-ratings

Acknowledgements

This research was supported in part by the U.S. NSF awards 2026513 and 1838145, the ARL subaward 555080-78055 under Prime Contract No. W911NF2220001, and the Temple University Office of the Vice President for Research 2022 Catalytic Collaborative Research Initiative Program, AI & ML Focus Area. In addition, this research includes calculations carried out on HPC resources supported in part by the U.S. NSF through major research instrumentation grant number 1625061 and by the U.S. Army Research Laboratory under contract number W911NF-16-2-0189.

Author information

Authors and Affiliations

Center for Data Analytics and Biomedical Informatics, Temple University, Philadelphia, PA, USA
Jumanah Alshehri, Marija Stanojevic, Parisa Khan, Benjamin Rapp, Eduard Dragut & Zoran Obradovic

Corresponding author

Correspondence to Jumanah Alshehri.

Editor information

Editors and Affiliations

Grenoble Alpes University, Saint Martin d'Hères, France
Massih-Reza Amini
INSA Rouen Normandy, Saint Etienne du Rouvray, France
Stéphane Canu
Ruhr-Universität Bochum, Bochum, Germany
Asja Fischer
KU Leuven, Leuven, Belgium
Tias Guns
Central European University, Vienna, Austria
Petra Kralj Novak
Aristotle University of Thessaloniki, Thessaloniki, Greece
Grigorios Tsoumakas

About this paper

Cite this paper

Alshehri, J., Stanojevic, M., Khan, P., Rapp, B., Dragut, E., Obradovic, Z. (2023). MultiLayerET: A Unified Representation of Entities and Topics Using Multilayer Graphs. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13714. Springer, Cham. https://doi.org/10.1007/978-3-031-26390-3_39

DOI: https://doi.org/10.1007/978-3-031-26390-3_39
Published: 17 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26389-7
Online ISBN: 978-3-031-26390-3
eBook Packages: Computer Science, Computer Science (R0)
