Tackling representation, annotation and classification challenges for temporal knowledge base population

  • Regular Paper
Knowledge and Information Systems

Abstract

Temporal Information Extraction (TIE) plays an important role in many natural language processing and database applications. Temporal slot filling (TSF) is a new and ambitious TIE task prepared for the Knowledge Base Population (KBP2011) track of the NIST Text Analysis Conference. TSF requires systems to discover temporally bound facts about entities and their attributes in order to populate a structured knowledge base. In this paper, we provide an overview of the unique challenges of this new task and our novel approaches to addressing them. We present the challenges from three perspectives: (1) Temporal information representation: We review the relevant linguistic semantic theories of temporal information and their limitations, motivating the need to develop a new (4-tuple) representation framework for the task. (2) Annotation acquisition: The lack of substantial labeled training data for supervised learning is a limiting factor in the design of TSF systems. Our work examines the use of multi-class logistic regression methods to improve the labeling quality of training data obtained by distant supervision. (3) Temporal information classification: Another key challenge lies in capturing relations between salient text elements separated by a long context. We develop two approaches for temporal classification and combine them through cross-document aggregation: a flat approach that uses lexical context and shallow dependency features, and a structured approach that captures long syntactic contexts by using a dependency path kernel tailored for this task. Experimental results demonstrated that our annotation enhancement approach dramatically increased the speed of the training procedure (by almost 100 times), and that the flat and structured classification approaches were complementary, together yielding a state-of-the-art TSF system.
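To make the 4-tuple representation and the cross-document aggregation step more concrete, the following is a minimal Python sketch rather than the authors' implementation. It assumes that a 4-tuple (t1, t2, t3, t4) bounds a fact so that t1 ≤ start ≤ t2 and t3 ≤ end ≤ t4, that per-mention classifier labels are named `START`, `END`, `HOLDS` and `NONE` (only "NONE" appears in the notes below; the other names and the mapping in `label_to_tuple` are hypothetical), and that aggregation simply keeps the tightest bound for each position.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class TemporalTuple:
    """Assumed 4-tuple: t1 <= start <= t2 and t3 <= end <= t4 (None = unbounded)."""
    t1: Optional[date] = None  # earliest possible start
    t2: Optional[date] = None  # latest possible start
    t3: Optional[date] = None  # earliest possible end
    t4: Optional[date] = None  # latest possible end


def _later(a: Optional[date], b: Optional[date]) -> Optional[date]:
    """Tightest lower bound: the later of two dates, ignoring missing ones."""
    if a is None or b is None:
        return a if b is None else b
    return max(a, b)


def _earlier(a: Optional[date], b: Optional[date]) -> Optional[date]:
    """Tightest upper bound: the earlier of two dates, ignoring missing ones."""
    if a is None or b is None:
        return a if b is None else b
    return min(a, b)


def label_to_tuple(label: str, t: date) -> TemporalTuple:
    """Hypothetical mapping from a per-mention label and time t to a 4-tuple."""
    if label == "START":   # the fact begins at t
        return TemporalTuple(t1=t, t2=t, t3=t)
    if label == "END":     # the fact ends at t
        return TemporalTuple(t2=t, t3=t, t4=t)
    if label == "HOLDS":   # the fact is true at t
        return TemporalTuple(t2=t, t3=t)
    return TemporalTuple()  # "NONE": contributes no constraint


def aggregate(tuples: Iterable[TemporalTuple]) -> TemporalTuple:
    """Combine per-mention tuples by keeping the tightest bound per position."""
    result = TemporalTuple()
    for tp in tuples:
        result = TemporalTuple(
            t1=_later(result.t1, tp.t1),
            t2=_earlier(result.t2, tp.t2),
            t3=_later(result.t3, tp.t3),
            t4=_earlier(result.t4, tp.t4),
        )
    return result
```

For example, aggregating a `START` mention, a later `HOLDS` mention and a still later `END` mention of the same fact narrows the start bounds to the first date and the end bounds to the last date. A real system would also need to handle conflicting mentions, which this sketch does not attempt.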

Notes

  1. We refer to events and states collectively as eventualities, following [5].

  2. We assume throughout that eventualities as concepts correspond with situations in the physical world.

  3. We consider the usage of a predicate of eventualities to be a mention of the eventuality for which it returns a positive truth value.

  4. Vendler alluded to the fact that not only verbs but also adjectives and nouns may be used to express eventualities. Dölling gave a formal account of aspectual coercion, in which the canonical conceptualization of an event (or of an event modifier) may be adjusted based on factors such as which modifier is applied to the event (or, respectively, which event the modifier is applied to), as well as world knowledge.

  5. In fact, “NONE” is ambiguous among four cases: (1) the query and slot fill are in relation slot_type, but in this context the relation is not explicitly related to time \(T\); (2) the query and slot fill are not in relation slot_type or in any other relation; (3) the query and slot fill are in relation slot_type*, which is not explicitly related to \(T\); and (4) the query and slot fill are in relation slot_type*, which is explicitly related to \(T\), but we still label \(\langle\) slot_type(query, slot fill), \(T\rangle\) as “NONE”.

  6. http://projects.ldc.upenn.edu/ace/.

  7. http://www.csie.ntu.edu.tw/~cjlin/libsvm/.

References

  1. Ahn D, Adafre SF, de Rijke M (2005) Extracting temporal information from open domain text: a comparative exploration. Digit Inform Manag 3:3–10

  2. Allen JF (1983) Maintaining knowledge about temporal intervals. Commun ACM 26:832–843

  3. Amigo E, Artiles J, Li Q, Ji H (2011) An evaluation framework for aggregated temporal information extraction. In: Proceedings of SIGIR 2011 workshop on entity-oriented search

  4. Aseervatham S, Antoniadis A, Gaussier E, Burlet M, Denneulin Y (2011) A sparse version of the ridge logistic regression for large-scale text categorization. Pattern Recognit Lett 32:101–106

  5. Bach E (1986) The algebra of events. Linguist Philos 9:5–16

  6. Baral C, Gelfond G, Gelfond M, Scherl RB (2005) Textual inference by combining multiple logic programming paradigms. In: Proceedings of AAAI 2005 workshop on inference for textual question answering

  7. Bell A (1999) News stories and narratives. Oxford University Press, Oxford, pp 236–251

  8. Bethard S, Martin JH (2007) CU-TMP: temporal relation classification using syntactic and semantic features. In: SemEval-2007: 4th international workshop on semantic evaluations

  9. Bethard S, Martin JH (2008) Learning semantic links from a corpus of parallel temporal and causal relations. In: Proceedings of annual meeting of the association for computational linguistics: human language technologies (ACL-HLT) vol 1(4)

  10. Bethard S, Martin JH, Klingenstein S (2007) Finding temporal structure in text: machine learning of syntactic temporal relations. Int J Semant Comput (IJSC) 1(4):441–457

  11. Bollacker K, Cook R, Tufts P (2008) Freebase: a shared database of structured general human knowledge. In: Proceedings of national conference on artificial intelligence

  12. Boguraev B, Ando RK (2005) TimeBank-driven TimeML analysis. In: Proceedings of annotating, extracting and reasoning about time and events

  13. Bramsen P, Deshpande P, Lee YK, Barzilay R (2006) Inducing temporal graphs. In: Proceedings of conference on empirical methods in natural language processing

  14. Bunescu RC, Mooney RJ (2005) A shortest path dependency kernel for relation extraction. In: Proceedings of the HLT and EMNLP, pp 724–731

  15. Chambers N, Wang S, Jurafsky D (2007) Classifying temporal relations between events. In: Annual meeting of the association for computational linguistics (ACL), pp 173–176

  16. Chambers N, Jurafsky D (2009) Unsupervised learning of narrative schemas and their participants. In: Proceedings of the 47th annual meeting of the association for computational linguistics and the 4th international joint conference on natural language processing of the Asian federation of natural language processing (ACL-IJCNLP 2009). pp 173–176

  17. Chambers N, Jurafsky D (2008) Jointly combining implicit constraints improves temporal ordering. In: Proceedings of empirical methods in natural language processing

  18. Chang C-C, Lin C-J (2001) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3). doi:10.1145/1961189.1961199

  19. Chen Z, Tamang S, Lee A, Li X, Lin W, Snover M, Artiles J, Passantino M, Ji H (2010) CUNY-BLENDER TAC-KBP 2010 entity linking and slot filling system description. In: Proceedings of the 2010 text analysis conference

  20. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297

  21. Davidson D (1967) The logical form of action sentences. In: Rescher N (ed) The logic of decision and action. University of Pittsburgh Press, Pittsburgh

  22. De Marneffe M-C, MacCartney B, Manning CD (2006) Generating typed dependency parses from phrase structure parses. In: LREC 2006

  23. De Marneffe M-C, Manning CD (2006) Stanford typed dependencies manual. Technical report. Department of Computer Science, Stanford University

  24. Denis P, Muller P (2011) Predicting globally-coherent temporal structures from texts via endpoint inference and graph decomposition. In: IJCAI. pp 1788–1793

  25. Dölling J (2013) Aspectual coercion and eventuality structure. In: Robering K (ed) Aspects, phases and arguments: topics in the semantics of verbs. John Benjamins, Amsterdam, pp 113–146

  26. Do Q, Lu W, Roth D (2012) Joint inference for event timeline construction. In: Proceedings of empirical methods for natural language processing (EMNLP2012)

  27. Dowty D (1986) The effects of aspectual class on the temporal structure of discourse: semantics or pragmatics? Linguist Philos 9:37–61

  28. Elhadad N, Barzilay R, McKeown K (2002) Inferring strategies for sentence ordering in multidocument summarization. JAIR 17:35–55

  29. Fellbaum C (1998) WordNet: an electronic lexical database. The MIT Press, Cambridge

  30. Finkel JR, Grenager T, Manning CD (2005) Incorporating non-local information into information extraction systems by Gibbs sampling. In: ACL

  31. Gupta P, Ji H (2009) Predicting unknown time arguments based on cross-event propagation. In: Proceedings of ACL-IJCNLP 2009

  32. Hinrichs E (1986) Temporal anaphora in discourses of English. Linguist Philos 9:63–82

  33. Hitzeman J, Moens M, Grover C (1995) Algorithms for analysing the temporal structure of discourse. In: Proceedings of the seventh conference on European chapter of the association for computational linguistics, EACL’95. pp 253–260

  34. Ji H, Grishman R (2008) Refining event extraction through unsupervised cross-document inference. In: Proceedings of the annual meeting of the association for computational linguistics

  35. Ji H, Grishman R, Chen Z, Gupta P (2009) Cross-document event extraction and tracking: task, evaluation, techniques and challenges. In: Proceedings of recent advances in natural language processing

  36. Ji H, Grishman R, Dang HT (2011) An overview of the TAC2011 knowledge base population track. In: Proceedings of text analysis conference (TAC)

  37. Kamp H (1981) A theory of truth and semantic representation. In: Portner P, Partee BH (eds) Formal semantics: the essential readings. Blackwell, Oxford, pp 189–222

  38. Katz G (2000) Anti neo-Davidsonianism. In: Events as grammatical objects. CSLI Publications, pp 393–416

  39. Kingsbury P, Palmer M (2002) From TreeBank to PropBank. In: Proceedings of the 3rd international conference on language resources and evaluation (LREC)

  40. Lapata M, Lascarides A (2006) Learning sentence-internal temporal relations. J AI Res pp 85–117

  41. Lascarides A, Asher N (1993) Temporal interpretation, discourse relations, and common sense entailment. Linguist Philos 16:437–493

  42. Li Q, Anzaroot S, Lin W, Li X, Ji H (2011) Joint inference for cross-document information extraction. In: Proceedings of 20th ACM conference on information and knowledge management (CIKM2011)

  43. Ling X, Weld D (2010) Temporal information extraction. In: Proceedings of the twenty fifth national conference on artificial intelligence

  44. Li F, Yang Y, Xing EP (2005) From Lasso regression to feature vector machine. In: NIPS2005

  45. Lodhi H, Saunders C, Shawe-Taylor J, Cristianini N, Watkins C (2002) Text classification using string kernels. J Mach Learn Res 2:419–444

  46. Mani I, Verhagen M, Wellner B, Lee CM, Pustejovsky J (2006) Machine learning of temporal relations. In: Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the association for computational linguistics, ACL-44. pp 753–760

  47. Mani I, Wellner B, Verhagen M, Pustejovsky J (2007) Three approaches to learning TLINKs in TimeML. Technical Report CS-07-268, Department of Computer Science, Brandeis University, Waltham, USA

  48. McClosky D, Charniak E, Johnson M (2006) Effective self-training for parsing. In: Proceedings of N. American ACL (NAACL). pp 152–159

  49. McClosky D, Manning CD (2012) Learning constraints for consistent timeline extraction. In: Proceedings of EMNLP

  50. Mintz M, Bills S, Snow R, Jurafsky D (2009) Distant supervision for relation extraction without labeled data. In: ACL/AFNLP. pp 1003–1011

  51. Moens M, Steedman M (1988) Temporal ontology and temporal reference. Comput Linguist 14:15–28

  52. Ng AY (2004) Feature selection, L1 vs. L2 regularization, and rotational invariance. In: ICML

  53. Parsons T (1990) Events in the semantics of English: a study in subatomic semantics. The MIT Press, Cambridge Massachusetts

  54. Partee B (1973) Some structural analogies between tenses and pronouns in English. J Philos 70:601–609

  55. Partee B (1984) Nominal and temporal anaphora. Linguist Philos 7:243–286

  56. Pustejovsky J, Castano J, Ingria R, Sauri R, Gaizauskas R, Setzer A, Katz G (2003) TimeML: robust specification of event and temporal expression in text. In: Fifth international workshop on computational semantics, IWCS-5

  57. Pustejovsky J, Hanks P, Sauri R, See A, Gaizauskas R, Setzer A, Radev D, Sundheim B, Day D, Ferro L, Lazo M (2003) The TIMEBANK corpus. In: Proceedings of corpus linguistics 2003, Lancaster. pp 647–656

  58. Pustejovsky J, Verhagen M (2010) SemEval-2010 task 13: evaluating events, time expressions, and temporal relations (TempEval-2). In: Proceedings of the workshop on semantic evaluations: recent achievements and future directions. pp 112–116

  59. Pustejovsky J, Verhagen M (2010) SemEval-2010 Task 13: evaluating events, time expressions, and temporal relations (TempEval-2). In: SemEval-2010: 5th international workshop on semantic evaluations

  60. Richardson M, Domingos P (2006) Markov logic networks. Mach Learn 62:107–136

  61. Riedel S, Yao L, McCallum A (2010) Modeling relations and their mentions without labeled text. In: ECML-PKDD

  62. Reichenbach H (1947) Elements of symbolic logic. Macmillan, New York

  63. Schlaefer N, Ko J, Betteridge J, Sautter G, Pathak M, Nyberg E (2007) Semantic extensions of the Ephyra QA system for TREC. In: Proceedings of TREC 2007

  64. Schockaert S, Cock MD, Ahn D, Kerre E (2006) Supporting temporal question answering: strategies for offline data collection. In: Proceedings of 5th international workshop on inference in computational semantics (ICoS-5)

  65. Snodgrass R (1998) Of duplicates and septuplets. Database Program Des 11:46–49

  66. Surdeanu M, Tibshirani J, Nallapati R, Manning CD (2012) Multi-instance multi-label learning for relation extraction. In: Proceedings of EMNLP

  67. Takamatsu S, Sato I, Nakagawa H (2012) Reducing wrong labels in distant supervision for relation extraction. In: Proceedings of ACL

  68. Talukdar PP, Wijaya D, Mitchell T (2012) Acquiring temporal constraints between relations. In: Proceedings of CIKM

  69. Talukdar PP, Wijaya D, Mitchell T (2012) Coupled temporal scoping of relational facts. In: Proceedings of WSDM

  70. Tamang S, Ji H (2011) Adding smarter systems instead of human annotators: Re-ranking for slot filling system combination. In: Proceedings of CIKM2011 workshop on search and mining entity-relationship data

  71. Tatu M, Srikanth M (2008) Experiments with reasoning for temporal relations between events. In: COLING. pp 857–864

  72. Taylor B (1977) Tense and continuity. Linguist Philos 1:199–220

  73. Tenny C, Pustejovsky J (2000) A history of events in linguistic theory. In: Events as grammatical objects. CSLI Publications, pp 3–38

  74. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B 58(1):267–288

  75. Tibshirani R (2011) Regression shrinkage and selection via the lasso: a retrospective. J R Stat Soc B 73(3):273–282

  76. Trautwein M (2011) The time window of language, the interaction between linguistic and non-linguistic knowledge in the temporal interpretation of German and English texts. In: Language, Context, and Cognition (2). Walter de Gruyter, Berlin

  77. Vendler Z (1967) Linguistics in philosophy. Cornell University Press, Ithaca, New York, USA

  78. Verhagen M (2004) Times between the lines. Brandeis University, Massachusetts

  79. Verhagen M (2005) Temporal closure in an annotation environment. Lang Resour Eval 39(2–3):211–241

  80. Verhagen M, Gaizauskas R, Schilder F, Katz G, Pustejovsky J (2007) SemEval-2007 task 15: TempEval temporal relation identification. In: SemEval-2007: 4th international workshop on semantic evaluations

  81. Verhagen M, Sauri R, Caselli T, Pustejovsky J (2010) SemEval-2010 task 13: TempEval-2. In: Proceedings of international workshop on semantic evaluations (SemEval 2010)

  82. Wang Y, Yang B, Qu L, Spaniol M, Weikum G (2011) Harvesting facts from textual web sources by constrained label propagation. In: CIKM2011

  83. Yoshikawa K, Riedel S, Asahara M, Matsumoto Y (2009) Jointly identifying temporal relations with Markov logic. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP. pp 405–413

Acknowledgments

This work was supported by the U.S. Army Research Laboratory under Cooperative Agreement No. W911NF-09-2-0053 (NS-CTA), the U.S. NSF CAREER Award under Grant IIS-0953149, the U.S. NSF EAGER Award under Grant No. IIS-1144111, the U.S. DARPA Deep Exploration and Filtering of Text (DEFT) Program (FA8750-13-2-0041), and the CUNY Junior Faculty Award. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Author information

Corresponding author

Correspondence to Heng Ji.

About this article

Cite this article

Ji, H., Cassidy, T., Li, Q. et al. Tackling representation, annotation and classification challenges for temporal knowledge base population. Knowl Inf Syst 41, 611–646 (2014). https://doi.org/10.1007/s10115-013-0675-1

  • DOI: https://doi.org/10.1007/s10115-013-0675-1
