Abstract
This paper addresses a central sub-task of timeline creation from historical Wikipedia articles: learning from text which of the person names in a textual article should appear in a timeline on the same topic. We first process hundreds of timelines written by human experts and related Wikipedia articles to construct a corpus that can be used to evaluate systems that create history timelines from text documents. We then use a set of features to train a classifier that predicts the most important person names, resulting in a clear improvement over a competitive baseline.
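The task described above can be sketched as scoring candidate person names from an article and checking the top-ranked names against those in an expert-written timeline. The sketch below is illustrative only: the two features (`mentions`, `first_pos`), the frequency-based ranking, and the toy article are assumptions for the example, not the feature set, baseline, or data used in the paper. It also assumes the candidate names have already been extracted, for instance by a named-entity recognizer.

```python
def name_features(article, candidates):
    """Compute two illustrative features per candidate name.

    These features are assumptions for this sketch, not the paper's feature set.
    """
    text = article.lower()
    length = max(len(text), 1)
    features = {}
    for name in candidates:
        key = name.lower()
        first = text.find(key)
        features[name] = {
            # Number of (non-overlapping) occurrences of the name in the article.
            "mentions": float(text.count(key)),
            # Relative position of the first mention; 1.0 if never mentioned.
            "first_pos": first / length if first >= 0 else 1.0,
        }
    return features


def frequency_ranking(features, k):
    """Hypothetical baseline: rank by mention count, ties broken by earlier first mention."""
    ranked = sorted(
        features,
        key=lambda n: (-features[n]["mentions"], features[n]["first_pos"]),
    )
    return ranked[:k]


def precision_at_k(predicted, gold):
    """Fraction of predicted names that also appear in the reference timeline."""
    if not predicted:
        return 0.0
    return sum(1 for n in predicted if n in gold) / len(predicted)


# Toy data, invented for illustration only.
article = (
    "Napoleon rose to power in France. "
    "Napoleon defeated the coalition at Austerlitz. "
    "Wellington and Blucher defeated Napoleon at Waterloo. "
    "Wellington later became prime minister."
)
candidates = ["Napoleon", "Wellington", "Blucher"]
gold_timeline_names = {"Napoleon", "Wellington"}

feats = name_features(article, candidates)
top2 = frequency_ranking(feats, k=2)
score = precision_at_k(top2, gold_timeline_names)
```

In a learned setting, the per-name feature dictionaries would become input vectors for a classifier, with the timeline names supplying the positive labels; the ranking function here only stands in for a simple point of comparison.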
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Bauer, S., Clark, S., Graepel, T. (2015). Learning to Identify Historical Figures for Timeline Creation from Wikipedia Articles. In: Aiello, L., McFarland, D. (eds) Social Informatics. SocInfo 2014. Lecture Notes in Computer Science, vol 8852. Springer, Cham. https://doi.org/10.1007/978-3-319-15168-7_30
DOI: https://doi.org/10.1007/978-3-319-15168-7_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-15167-0
Online ISBN: 978-3-319-15168-7
eBook Packages: Computer Science, Computer Science (R0)