Abstract
Information Extraction, the process of eliciting data from natural language documents, usually relies on parsing the document and then detecting the meaning of its sentences by exploiting the syntactic structures encountered. In previous papers, we discussed an application that takes an alternative approach to extracting information from short (e-mail and text) messages. The application is lightweight and uses pattern matching rather than parsing, since parsing is not feasible for messages in which both the syntax and the spelling are unreliable. It works in the context of a high-level database schema and identifies sentences that make statements about data describable by this schema, matching sentences against templates to identify metadata terms and the data values associated with them. However, the initial prototype could only manage simple, time-independent assertions about the data, such as "Jane Austen is the author." This paper describes an extension to the application that can extract temporal data, both time instants and time periods. It also manages time stamps: temporal information that partitions the values of time-varying attributes, such as the monarch of a country. To achieve this, the original data model has been extended with a temporal component, and a set of sentence templates has been constructed to recognise statements in this model. The paper describes the temporal model and the extensions to the application, concluding with a worked example.
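The template-matching idea summarised in the abstract can be illustrated with a minimal sketch. It is not the application described in the paper: the schema attributes, the `Assertion` record, the two regular-expression templates and the `extract` function are all hypothetical names chosen for illustration, and the real system's templates and temporal model are presumably richer. The sketch only shows how a sentence might be matched against templates to recover a metadata term, a data value and, optionally, a time period.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical high-level schema: the attribute names that sentences may refer to.
SCHEMA_ATTRIBUTES = {"author", "monarch"}


@dataclass
class Assertion:
    """One extracted statement; start/end stay None for time-independent facts."""
    attribute: str
    value: str
    start: Optional[int] = None
    end: Optional[int] = None


# Illustrative templates, tried in order (most specific first).
TEMPLATES = [
    # "<value> was the <attribute> from <year> to <year>"
    re.compile(
        r"^(?P<value>.+?) (?:is|was) the (?P<attribute>\w+)"
        r" from (?P<start>\d{4}) to (?P<end>\d{4})\.?$",
        re.IGNORECASE,
    ),
    # "<value> is the <attribute>"
    re.compile(
        r"^(?P<value>.+?) (?:is|was) the (?P<attribute>\w+)\.?$",
        re.IGNORECASE,
    ),
]


def extract(sentence: str) -> Optional[Assertion]:
    """Return the first template match that names a schema attribute, else None."""
    text = sentence.strip()
    for template in TEMPLATES:
        match = template.match(text)
        if match is None:
            continue
        attribute = match.group("attribute").lower()
        if attribute not in SCHEMA_ATTRIBUTES:
            continue  # the sentence is not about data describable by the schema
        groups = match.groupdict()
        start, end = groups.get("start"), groups.get("end")
        return Assertion(
            attribute=attribute,
            value=match.group("value"),
            start=int(start) if start else None,
            end=int(end) if end else None,
        )
    return None
```

Under these assumptions, `extract("Jane Austen is the author.")` yields a time-independent assertion, while `extract("Victoria was the monarch from 1837 to 1901")` also carries a period that could serve as a time stamp for the time-varying `monarch` attribute. Sentences that match no template, or that name an attribute outside the schema, are simply ignored.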
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cooper, R., Manson, S. (2007). Extracting Temporal Information from Short Messages. In: Cooper, R., Kennedy, J. (eds) Data Management. Data, Data Everywhere. BNCOD 2007. Lecture Notes in Computer Science, vol 4587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73390-4_25
DOI: https://doi.org/10.1007/978-3-540-73390-4_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73389-8
Online ISBN: 978-3-540-73390-4
eBook Packages: Computer Science (R0)