DOI: 10.1145/1722080.1722101
research-article

Extraction and exploration of spatio-temporal information in documents

Published: 18 February 2010

ABSTRACT

In the past few years, there have been significant advances in temporal information retrieval (TIR) and geographic information retrieval (GIR), which focus on extracting and using temporal and geographic information, respectively, from documents for search and exploration tasks. However, little work combines models, techniques, and applications from these two areas to support scenarios in which temporal and geographic information together yield meaningful insights during document exploration, such as visualizing a chronological sequence of events with their locations.

In this paper, we present an approach that combines TIR and GIR. Using temporal and geographic information extracted from documents and recorded in temporal and geographic document profiles, we show how co-occurrences of such information are determined and how spatio-temporal document profiles are computed. Such profiles then provide the basis for a variety of document search and exploration tasks, such as visualizing sequences of events on a map. We present a prototype implementation of our system and demonstrate the effectiveness of combining GIR and TIR in the context of document exploration tasks.
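
As a rough illustration of the idea of deriving a spatio-temporal profile from co-occurrences, the following minimal Python sketch pairs entries of a temporal profile with entries of a geographic profile and orders the result chronologically. The abstract does not specify how co-occurrence is determined, so the sentence-distance `window`, the class names `TemporalEntry` and `GeoEntry`, the function `spatio_temporal_profile`, and the toy data are all assumptions made for this example, not the authors' method.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TemporalEntry:
    sentence: int   # index of the sentence containing the temporal expression
    value: date     # normalized date (hypothetical representation)


@dataclass(frozen=True)
class GeoEntry:
    sentence: int   # index of the sentence containing the place name
    name: str
    lat: float
    lon: float


def spatio_temporal_profile(temporal, geographic, window=0):
    """Pair each date with each place at most `window` sentences away.

    Sentence distance as the co-occurrence criterion is an assumption
    of this sketch; other criteria are possible.
    """
    pairs = [
        (t.value, g.name, g.lat, g.lon)
        for t in temporal
        for g in geographic
        if abs(t.sentence - g.sentence) <= window
    ]
    # Chronological order, e.g. for drawing a sequence of events on a map.
    return sorted(pairs, key=lambda p: p[0])


# Toy profiles (made-up sentence positions, approximate coordinates).
temporal_profile = [
    TemporalEntry(sentence=0, value=date(1805, 10, 21)),
    TemporalEntry(sentence=2, value=date(1798, 8, 1)),
]
geographic_profile = [
    GeoEntry(0, "Cape Trafalgar", 36.18, -6.03),
    GeoEntry(1, "London", 51.51, -0.13),
    GeoEntry(2, "Aboukir Bay", 31.33, 30.08),
]

profile = spatio_temporal_profile(temporal_profile, geographic_profile)
for when, name, lat, lon in profile:
    print(when.isoformat(), name, lat, lon)
```

With `window=0`, only dates and places in the same sentence are paired; increasing `window` trades precision for recall in this toy setting.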

Published in

GIR '10: Proceedings of the 6th Workshop on Geographic Information Retrieval
February 2010
130 pages
ISBN: 9781605588261
DOI: 10.1145/1722080

Copyright © 2010 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 46 of 61 submissions, 75%
