StoryTracker: A Semantic-Oriented Tool for Automatic Tracking Events by Web Documents

Santos, Welton; Fazzion, Elverton; Tuler, Elisa; Dias, Diego; Guimarães, Marcelo; Rocha, Leonardo

doi:10.1007/978-3-030-86970-0_10

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12951)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

Media vehicles play an essential role in investigating events and keeping the public informed. Indirectly, the daily logs of events kept by newspapers and magazines have built rich collections of data that can be used by many professionals, such as economists, historians, and political scientists. However, exploring these logs with traditional search engines has become impractical for more demanding users. In this paper, we propose StoryTracker, a temporal exploration tool that helps users query news collections. We focus our efforts on (i) allowing users to make queries by adding information from documents represented by word embeddings and (ii) developing a strategy for retrieving temporal information to generate timelines and present them using a suitable interface for temporal exploration. We evaluated our solution using a real database of articles from a large Brazilian newspaper and showed that our tool can trace different timelines, covering different subtopics of the same theme.
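The two goals above can be illustrated with a minimal sketch. It is not the StoryTracker implementation, which this abstract does not detail. It assumes that each article already has a precomputed embedding vector, that a query is expanded by linearly blending it with a reference document's vector, and that a timeline keeps the most similar article per calendar month. The names `Doc`, `expand_query`, `build_timeline`, the weight `alpha`, and the toy two-dimensional vectors are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import sqrt


@dataclass
class Doc:
    title: str
    published: date
    vec: list[float]  # hypothetical, precomputed document embedding


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def expand_query(query_vec, doc_vec, alpha=0.5):
    """Blend the query vector with a reference document's vector.

    Assumption: a simple linear interpolation stands in for whatever
    expansion strategy the paper actually uses.
    """
    return [alpha * q + (1 - alpha) * d for q, d in zip(query_vec, doc_vec)]


def build_timeline(docs, query_vec):
    """Keep the most similar document for each (year, month) bucket."""
    best = {}
    for doc in docs:
        key = (doc.published.year, doc.published.month)
        score = cosine(query_vec, doc.vec)
        if key not in best or score > best[key][0]:
            best[key] = (score, doc)
    # Return the selected documents in chronological order.
    return [best[key][1] for key in sorted(best)]


# Toy collection: axis 0 loosely means "politics", axis 1 "sports".
docs = [
    Doc("Election announced", date(2018, 1, 5), [0.9, 0.1]),
    Doc("Football final", date(2018, 1, 20), [0.1, 0.9]),
    Doc("Candidates debate", date(2018, 2, 11), [0.8, 0.3]),
    Doc("Transfer window closes", date(2018, 2, 15), [0.0, 1.0]),
]

# Expand a "politics" query with the first article, then trace a timeline.
query = expand_query([1.0, 0.0], docs[0].vec)
timeline = build_timeline(docs, query)
titles = [d.title for d in timeline]
```

In this toy collection the timeline keeps one politics article per month and skips the sports articles. A real system would obtain the vectors from a trained embedding model and would likely use a finer or adaptive time interval.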

Supported by CAPES, CNPq, Finep, Fapesp and Fapemig.

Acknowledgments

This project was partially supported by Huawei do Brasil Telecomunicações Ltda (Fundunesp Process # 3123/2020), FAPEMIG, and CAPES.

Author information

Authors and Affiliations

Universidade Federal de São João del-Rei, São João del-Rei, Brazil
Welton Santos, Elverton Fazzion, Elisa Tuler, Diego Dias & Leonardo Rocha
Universidade Federal de São Paulo/UNIFACCAMP, São Paulo, Brazil
Marcelo Guimarães

Corresponding author

Correspondence to Diego Dias.

Editor information

Editors and Affiliations

University of Perugia, Perugia, Italy
Osvaldo Gervasi
University of Basilicata, Potenza, Potenza, Italy
Beniamino Murgante
Covenant University, Ota, Nigeria
Sanjay Misra
University of Cagliari, Cagliari, Italy
Chiara Garau
University of Cagliari, Cagliari, Italy
Ivan Blečić
Monash University, Clayton, VIC, Australia
David Taniar
Kyushu Sangyo University, Fukuoka, Japan
Bernady O. Apduhan
University of Minho, Braga, Portugal
Ana Maria A. C. Rocha
Polytechnic University of Bari, Bari, Italy
Eufemia Tarantino
Polytechnic University of Bari, Bari, Italy
Carmelo Maria Torre

About this paper

Cite this paper

Santos, W., Fazzion, E., Tuler, E., Dias, D., Guimarães, M., Rocha, L. (2021). StoryTracker: A Semantic-Oriented Tool for Automatic Tracking Events by Web Documents. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2021. ICCSA 2021. Lecture Notes in Computer Science, vol 12951. Springer, Cham. https://doi.org/10.1007/978-3-030-86970-0_10

DOI: https://doi.org/10.1007/978-3-030-86970-0_10
Published: 11 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86969-4
Online ISBN: 978-3-030-86970-0
eBook Packages: Computer Science, Computer Science (R0)
