ABSTRACT
Exploring large news collections created by media outlets with traditional search engines is impractical for demanding users. We propose a temporal exploration tool that aims to facilitate the consultation of news collections. We concentrated our efforts on two fronts: (i) allowing users to enrich their queries with information from documents represented by word embeddings; and (ii) retrieving temporal information to generate timelines presented through an appropriate interface. We evaluated our solution on a collection from a Brazilian newspaper, demonstrating that it can draw different timelines covering different subtopics of the same theme.
Index Terms
- Semantically Time Tracking of Events from Web Documents