
Temporal Natural Language Inference: Evidence-Based Evaluation of Temporal Text Validity

  • Conference paper

Advances in Information Retrieval (ECIR 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13980)

Abstract

Learning whether text information remains valid is important for various applications, including story comprehension, information retrieval, and user state tracking on microblogs or in chatbot conversations; it also supports a deeper understanding of narratives. However, this kind of inference is still difficult for computers, as it requires temporal commonsense. We propose a novel task, Temporal Natural Language Inference, inspired by traditional natural language inference, whose goal is to determine the temporal validity of text content. The task is to infer and judge whether an action expressed in a sentence is still ongoing or already completed, and hence whether the sentence remains valid, given supplementary content. We first construct a dataset for this task and train several machine learning models on it. We then propose an effective method for learning from an external knowledge base that provides hints on temporal commonsense. Using the prepared dataset, we introduce a new machine learning model that incorporates the knowledge-base information and demonstrate that it outperforms state-of-the-art approaches on the proposed task.
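To make the input/output format of the task concrete, the following is a minimal sketch of scoring a sentence pair with a generic BERT-style pair classifier. It illustrates only the task format: the checkpoint name, the example sentences, and the binary label convention are assumptions, and this is not the knowledge-enhanced model proposed in the paper.

```python
# Illustrative sketch only: a plain sentence-pair classifier for the TNLI format.
# Checkpoint, example sentences, and label convention are assumptions; the paper's
# model additionally injects temporal commonsense from an external knowledge base.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()

s1 = "I'm waiting in line to board my flight."       # target sentence whose validity is judged
s2 = "The plane just took off and I found my seat."  # supplementary (later) evidence

inputs = tokenizer(s1, s2, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Assumed label convention: index 0 = "still valid", index 1 = "no longer valid".
# An untrained classification head yields arbitrary output; fine-tuning on the
# TNLI dataset would be required before the prediction is meaningful.
print(["still valid", "no longer valid"][logits.argmax(dim=-1).item()])
```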

Notes

  1. Note that it is not always easy to determine the correct answer, as the context or necessary details might be missing; in such cases, humans seem to rely on probabilistic reasoning in addition to commonsense knowledge.

  2. The dataset will be made freely available after paper publication.

  3. Note that \(s_1\) and \(s_2\) may have a temporal order: \(t_{s_1} \le t_{s_2}\), where \(t_{s_{id}}\) \((id=1,2)\) is the creation time (or reading order) of sentence \(s_{id}\). This may be the case, for example, when receiving microblog posts issued by a user (or when reading consecutive sentences of a story or novel); a minimal data-record sketch of such a pair follows these notes.

  4. The SNLI dataset is licensed under CC BY-SA 4.0.

  5. https://www.mturk.com/.
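As referenced in note 3, here is a minimal data-record sketch of a single sentence pair with optional creation times and a gold label. The field names and label strings are illustrative assumptions, not the released dataset's actual schema.

```python
# Illustrative sketch of one TNLI instance; field names and label strings are
# assumptions, not the dataset's actual schema.
from dataclasses import dataclass
from typing import Optional


@dataclass
class TNLIExample:
    s1: str                       # target sentence whose temporal validity is judged
    s2: str                       # supplementary (follow-up) sentence providing evidence
    t_s1: Optional[float] = None  # creation/reading time of s1, if known
    t_s2: Optional[float] = None  # creation/reading time of s2; t_s1 <= t_s2 when both known
    label: Optional[str] = None   # e.g. "valid" or "invalid" (gold annotation)


example = TNLIExample(
    s1="I'm waiting for my coffee at the counter.",
    s2="This latte tastes great.",
    label="invalid",  # the waiting action appears to be completed
)
```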


Author information

Corresponding author

Correspondence to Adam Jatowt.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Hosokawa, T., Jatowt, A., Sugiyama, K. (2023). Temporal Natural Language Inference: Evidence-Based Evaluation of Temporal Text Validity. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13980. Springer, Cham. https://doi.org/10.1007/978-3-031-28244-7_28


  • DOI: https://doi.org/10.1007/978-3-031-28244-7_28


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-28243-0

  • Online ISBN: 978-3-031-28244-7

  • eBook Packages: Computer Science; Computer Science (R0)
