Abstract
A timeline is one of the most effective ways to visualize the important historical facts that occurred over a period of time, presenting insights that may not be apparent from reading the equivalent information in textual form. We introduce a two-stage system for event timeline generation from multiple (historical) text documents that leverages generative adversarial learning for important-sentence classification and assimilates knowledge-based tags to improve the performance of event coreference resolution. We demonstrate our results on two manually annotated historical text documents. Our results can help historians advance research in history and understand the socio-political landscape of a country as reflected in the writings of famous personas. The dataset and the code are available at https://github.com/sayantan11995/Event-Timeline-Generation-from-Documents.
Notes
- 5. Such sentences would typically consist of participants and locations.
- 12. We consider the root verb as the action for a sentence.
- 14. Statistical significance was assessed using the Mann-Whitney U test [23].
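The Mann-Whitney U test mentioned in the notes compares two independent samples by rank. The following is a minimal sketch of how the U statistic itself can be computed, assuming two plain lists of scores; the function name `mann_whitney_u` and its inputs are illustrative and are not taken from the paper or its code. The p-value step (exact distribution or normal approximation) is omitted, and the notes do not say which variant or library the authors used.

```python
from itertools import chain


def mann_whitney_u(xs, ys):
    """Return (U_x, U_y) for two independent samples.

    Tied values receive the average of the ranks they span (midranks).
    Illustrative helper only; not the authors' implementation.
    """
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted(chain(((v, 0) for v in xs), ((v, 1) for v in ys)))
    n = len(pooled)

    rank_sum_x = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Ranks i+1 .. j are shared, so each tied value gets their mean.
        midrank = (i + 1 + j) / 2.0
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    n_x, n_y = len(xs), len(ys)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2.0
    u_y = n_x * n_y - u_x
    return u_x, u_y
```

Here `U_x` counts the pairs in which a value from `xs` exceeds a value from `ys`, with each tie counted as one half, so `U_x + U_y` always equals `len(xs) * len(ys)`. In practice one would typically pass the per-run scores of two systems as `xs` and `ys` and hand the smaller U to a significance table or a library routine.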
References
Abadi, M., Agarwal, A., et al.: TensorFlow: large-scale machine learning on heterogeneous systems (2015). https://www.tensorflow.org/
Adak, S., et al.: Gandhipedia: A one-stop AI-enabled portal for browsing Gandhian literature, life-events and his social network. In: JCDL, pp. 539–540, New York, NY, USA (2020)
Aprosio, A., Tonelli, S.: Recognizing biographical sections in Wikipedia, pp. 811–816, January 2015
Bagga, A., Baldwin, B.: Entity-based cross-document coreferencing using the vector space model. In: Coling, vol. 1, p. 79 (2000)
Bamman, D., Smith, N.A.: Unsupervised discovery of biographical structure from text. Trans. Assoc. Comput. Linguist. 2, 363–376 (2014)
Barhom, S., Shwartz, V., Eirew, A., Bugert, M., Reimers, N., Dagan, I.: Revisiting joint modeling of cross-document entity and event coreference resolution (2019)
Bedi, H., Patil, S., Hingmire, S., Palshikar, G.: Event timeline generation from history textbooks. In: Proceedings of the 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017), pp. 69–77. Asian Federation of Natural Language Processing, Taipei, Taiwan, December 2017
Born, L., Bacher, M., Markert, K.: Dataset reproducibility and IR methods in timeline summarization. In: LREC 2020 (2020)
Chen, Z., Ji, H., Haralick, R.: A pairwise event coreference model, feature impact and evaluation for event coreference resolution. In: Proceedings of the Workshop on Events in Emerging Text Types, pp. 17–22. Association for Computational Linguistics, Borovets, Bulgaria, September 2009
Choubey, P.K., Huang, R.: Event coreference resolution by iteratively unfolding inter-dependencies among events. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2124–2133. Association for Computational Linguistics, Copenhagen, Denmark, September 2017
Croce, D., Castellucci, G., Basili, R.: GAN-BERT: generative adversarial learning for robust text classification with a bunch of labeled examples. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 2114–2119. Association for Computational Linguistics, July 2020
Cybulska, A., Vossen, P.: Using a sledgehammer to crack a nut? Lexical diversity and event coreference resolution. In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014), pp. 4545–4552. European Language Resources Association (ELRA), Reykjavik, Iceland, May 2014
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding (2019)
Ghaddar, A., Langlais, P.: WikiCoref: an English coreference-annotated corpus of Wikipedia articles. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). European Language Resources Association (ELRA), Paris, France, May 2016
Gholipour Ghalandari, D., Ifrim, G.: Examining the state-of-the-art in news timeline summarization. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 1322–1334. Association for Computational Linguistics, July 2020
Hearst, M.A.: Support vector machines. IEEE Intell. Syst. 13(4), 18–28 (1998)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Kenyon-Dean, K., Cheung, J.C.K., Precup, D.: Resolving event coreference with supervised representation learning and clustering-oriented regularization (2018)
Kibriya, A.M., Frank, E., Pfahringer, B., Holmes, G.: Multinomial naive Bayes for text categorization revisited. In: Webb, G.I., Yu, X. (eds.) AI 2004. LNCS (LNAI), vol. 3339, pp. 488–499. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30549-1_43
La Quatra, M., Cagliero, L., Baralis, E., Messina, A., Montagnuolo, M.: Summarize dates first: a paradigm shift in timeline summarization, pp. 418–427. Association for Computing Machinery, New York, NY, USA (2021)
Lu, Y., Lin, H., Tang, J., Han, X., Sun, L.: End-to-end neural event coreference resolution. Artif. Intell. 303, 103632 (2020)
Luo, X.: On coreference resolution performance metrics, January 2005
Mann, H.B., Whitney, D.R.: On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 18(1), 50–60 (1947)
Martschat, S., Markert, K.: Improving ROUGE for timeline summarization. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol. 2, Short Papers, pp. 285–290. Association for Computational Linguistics, Valencia, Spain, April 2017
Martschat, S., Markert, K.: A temporally sensitive submodularity framework for timeline summarization. In: Proceedings of the 22nd Conference on Computational Natural Language Learning, pp. 230–240. Association for Computational Linguistics, Brussels, Belgium, October 2018
Miller, D.: Leveraging BERT for extractive text summarization on lectures (2019)
Miyato, T., Dai, A.M., Goodfellow, I.: Adversarial training methods for semi-supervised text classification (2017)
Moosavi, N.S., Strube, M.: Which coreference evaluation metric do you trust? A proposal for a link-based entity aware metric. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, vol. 1, Long Papers, pp. 632–642. Association for Computational Linguistics, Berlin, Germany, August 2016
Palshikar, G., Pawar, S., Patil, et al.: Extraction of message sequence charts from narrative history text. In: Proceedings of the First Workshop on Narrative Understanding, pp. 28–36. Association for Computational Linguistics, Minneapolis, Minnesota, June 2019
Paszke, A., Gross, S., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019)
Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543 (2014)
Preservation, S.A., Trust, M.: The Collected Works of Mahatma Gandhi (2013). https://www.gandhiheritageportal.org/the-collected-works-of-mahatma-gandhi. Accessed 22 Feb 2020
Pustejovsky, J., et al.: TimeML: robust specification of event and temporal expressions in text, pp. 28–34, January 2003
Recasens, M., Hovy, E.: BLANC: implementing the Rand index for coreference evaluation. Nat. Lang. Eng. 17, 485–510 (2011)
Reimers, N., Gurevych, I.: Sentence-BERT: sentence embeddings using Siamese BERT-networks (2019)
Strötgen, J., Gertz, M.: HeidelTime: high quality rule-based extraction and normalization of temporal expressions. In: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 321–324. Association for Computational Linguistics, Uppsala, Sweden, July 2010
Vilain, M., Burger, J., Aberdeen, J., Connolly, D., Hirschman, L.: A model-theoretic coreference scoring scheme, pp. 45–52, January 1995
Zhang, W., Chen, Q., Chen, Y.: Deep learning based robust text classification method via virtual adversarial training. IEEE Access 8, 61174–61182 (2020)
Zhang, Y., Wallace, B.: A sensitivity analysis of (and practitioners’ guide to) convolutional neural networks for sentence classification (2016)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Adak, S., Ahmad, A., Basu, A., Mukherjee, A. (2023). Placing (Historical) Facts on a Timeline: A Classification Cum Coref Resolution Approach. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science(), vol 13718. Springer, Cham. https://doi.org/10.1007/978-3-031-26422-1_21
DOI: https://doi.org/10.1007/978-3-031-26422-1_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26421-4
Online ISBN: 978-3-031-26422-1
eBook Packages: Computer Science (R0)