ABSTRACT
In the past couple of years, there have been significant advances in temporal information retrieval (TIR) and geographic information retrieval (GIR), which focus on extracting and utilizing temporal and geographic information, respectively, from documents for search and exploration tasks. Interestingly, little work combines models, techniques, and applications from these two areas to support scenarios in which temporal and geographic information together provide meaningful nuggets for document exploration tasks, such as visualizing a chronological sequence of events with their locations.
In this paper, we present an approach that combines TIR and GIR. Using temporal and geographic information extracted from documents and recorded in temporal and geographic document profiles, we show how co-occurrences of such information are determined and how spatio-temporal document profiles are computed. Such profiles then provide the basis for a variety of document search and exploration tasks, such as visualizing sequences of events on a map. We present a prototype implementation of our system and demonstrate the effectiveness of combining GIR and TIR in the context of document exploration tasks.
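To make the idea of combining the two profiles concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each profile entry records the index of the sentence it was extracted from, and that two entries co-occur when they share a sentence; the abstract does not specify the actual co-occurrence criterion. The names `TemporalEntry`, `GeoEntry`, and `spatio_temporal_profile`, as well as the sample data, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalEntry:
    sentence: int  # index of the sentence containing the expression (assumed field)
    value: str     # normalized date as an ISO 8601 string, e.g. "1969-07-20"


@dataclass(frozen=True)
class GeoEntry:
    sentence: int  # index of the sentence containing the place name (assumed field)
    name: str
    lat: float
    lon: float


def spatio_temporal_profile(temporal, geographic):
    """Pair temporal and geographic entries that share a sentence.

    Same-sentence co-occurrence is an assumption made for this sketch.
    Returns (date, place, lat, lon) tuples sorted by date string, which
    is chronological for ISO 8601 dates of the same granularity.
    """
    places_by_sentence = defaultdict(list)
    for geo in geographic:
        places_by_sentence[geo.sentence].append(geo)

    pairs = [
        (t.value, g.name, g.lat, g.lon)
        for t in temporal
        for g in places_by_sentence.get(t.sentence, [])
    ]
    return sorted(pairs)


# Illustrative data only, not taken from the paper.
temporal_profile = [
    TemporalEntry(sentence=2, value="1969-07-24"),
    TemporalEntry(sentence=0, value="1969-07-16"),
    TemporalEntry(sentence=1, value="1969-07-20"),
]
geographic_profile = [
    GeoEntry(sentence=0, name="Kennedy Space Center", lat=28.57, lon=-80.65),
    GeoEntry(sentence=2, name="Pacific Ocean", lat=13.32, lon=-169.15),
    GeoEntry(sentence=3, name="Houston", lat=29.76, lon=-95.37),
]

profile = spatio_temporal_profile(temporal_profile, geographic_profile)
for date, place, lat, lon in profile:
    print(date, place, lat, lon)
```

The resulting list of dated, geocoded pairs is the kind of structure that could be plotted as a chronological sequence of points on a map. A real system might use a wider co-occurrence window or weight pairs by distance in the text.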
Index Terms
- Extraction and exploration of spatio-temporal information in documents