Skip to main content

ReLink: Open Information Extraction by Linking Phrases and Its Applications

  • Conference paper
  • First Online:
Distributed Computing and Internet Technology (ICDCIT 2021)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 12582))

  • 825 Accesses

Abstract

Recently, many Open IE systems have been developed based on using deep linguistic features such as dependency-parse features to overcome the limitations presented in older Open IE systems which use only shallow information like part-of-speech or chunking. Even though these newer systems have some clear advantages in their extractions, they also possess several issues which do not exist in old systems. In this paper, we analyze the outputs from several popular Open IE systems to find out their strength and weaknesses. Then we introduce ReLink, a novel Open IE system for extracting binary relations from open-domain text. Its working model is based on identifying correct phrases and linking them in the most proper way to reflect their relationship in a sentence. After establishing connections, it can easily extract relations by using several pre-defined patterns. Despite using only shallow linguistic features for extraction, it does not have the same weakness that existed in older systems, and it can also avoid many similar issues arising in recent Open IE systems. Our experiments show that ReLink achieves larger Area Under Precision-Recall Curve compared with ReVerb and Ollie, two well-known Open IE systems.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    https://gramatica.usc.es/pln/tools/deppattern.html.

  2. 2.

    https://opennlp.apache.org/.

  3. 3.

    https://edition.cnn.com.

  4. 4.

    https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/software/clausie/.

  5. 5.

    https://github.com/linktorepository.

References

  1. Banko, M., Cafarella, M.J., Soderland, S., Broadhead, M., Etzioni, O.: Open information extraction from the web. In: IJCAI, vol. 7, pp. 2670–2676 (2007)

    Google Scholar 

  2. Berant, J., Dagan, I., Goldberger, J.: Global learning of typed entailment rules. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 610–619. Association for Computational Linguistics (2011)

    Google Scholar 

  3. Christensen, J., Mausam, S.S., Soderland, S., Etzioni, O.: Towards coherent multi-document summarization. In: HLT-NAACL, pp. 1163–1173. Citeseer (2013)

    Google Scholar 

  4. Del Corro, L., Gemulla, R.: Clausie: Clause-based open information extraction. In: Proceedings of the 22nd International Conference on World Wide Web, pp. 355–366. ACM (2013)

    Google Scholar 

  5. Etzioni, O., et al.: Unsupervised named-entity extraction from the web: an experimental study. Artif. Intell. 165(1), 91–134 (2005)

    Article  MathSciNet  Google Scholar 

  6. Fader, A., Soderland, S., Etzioni, O.: Identifying relations for open information extraction. In: Proceedings of the Conference of Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, Scotland, UK, 27–31 July 2011 (2011)

    Google Scholar 

  7. Fader, A., Zettlemoyer, L., Etzioni, O.: Open question answering over curated and extracted knowledge bases. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1156–1165. ACM (2014)

    Google Scholar 

  8. Gabor Angeli, M.J.P., Manning, C.D.: Leveraging linguistic structure for open domain information extraction. In: Proceedings of the Association of Computational Linguistics (ACL) (2015)

    Google Scholar 

  9. Gamallo, P., Garcia, M., Fernández-Lanza, S.: Dependency-based open information extraction. In: Proceedings of the Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP, pp. 10–18. Association for Computational Linguistics (2012)

    Google Scholar 

  10. Kim, J.T., Moldovan, D.I.: Acquisition of semantic patterns for information extraction from corpora. In: Proceedings of Ninth Conference on Artificial Intelligence for Applications, pp. 171–176. IEEE (1993)

    Google Scholar 

  11. Li, P., Cai, W., Huang, H.: Weakly supervised natural language processing framework for abstractive multi-document summarization: weakly supervised abstractive multi-document summarization. In: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pp. 1401–1410. ACM (2015)

    Google Scholar 

  12. Li, X., Roth, D.: Exploring evidence for shallow parsing. In: Proceedings of the 2001 Workshop on Computational Natural Language Learning, vol. 7, p. 6. Association for Computational Linguistics (2001)

    Google Scholar 

  13. Manning, C.D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S.J., McClosky, D.: The Stanford CoreNLP natural language processing toolkit. In: Association for Computational Linguistics (ACL) System Demonstrations, pp. 55–60 (2014). http://www.aclweb.org/anthology/P/P14/P14-5010

  14. Mausam, Schmitz, M., Bart, R., Soderland, S., Etzioni, O.: Open language learning for information extraction. In: Proceedings of Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CONLL) (2012)

    Google Scholar 

  15. Reddy, S., Lapata, M., Steedman, M.: Large-scale semantic parsing without question-answer pairs. Trans. Assoc. Comput. Linguist. 2, 377–392 (2014)

    Article  Google Scholar 

  16. Ruppert, E.: Unsupervised conceptualization and semantic text indexing for information extraction. In: Sack, H., Blomqvist, E., d’Aquin, M., Ghidini, C., Ponzetto, S.P., Lange, C. (eds.) ESWC 2016. LNCS, vol. 9678, pp. 853–862. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-34129-3_54

    Chapter  Google Scholar 

  17. Schoenmackers, S., Etzioni, O., Weld, D.S., Davis, J.: Learning first-order horn clauses from web text. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp. 1088–1098. Association for Computational Linguistics (2010)

    Google Scholar 

  18. Soderland, J.C.S., Mausam, G.B.: Hierarchical summarization: scaling up multi-document summarization. In: Proceedings of the 52nd Annual Meeting of the Association for Computlational Linguistics, pp. 902–912 (2014)

    Google Scholar 

  19. Stanovsky, G., Mausam, I.D.: Open IE as an intermediate structure for semantic tasks (2015)

    Google Scholar 

  20. Surdeanu, M.: Overview of the TAC2013 knowledge base population evaluation: English slot filling and temporal slot filling. In: Sixth Text Analysis Conference (2013)

    Google Scholar 

  21. Vo, D.T., Bagheri, E.: Open information extraction. arXiv preprint arXiv:1607.02784 (2016)

  22. Wu, F., Weld, D.S.: Open information extraction using Wikipedia. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 118–127. Association for Computational Linguistics (2010)

    Google Scholar 

Download references

Acknowledgments

This work was supported by JST CREST Grant Number JPMJCR1513 and in part by the Asian Office of Aerospace R&D (AOARD), Air Force Office of Scientific Research (Grant no. FA2386-19-1-4041).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Le-Minh Nguyen .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Tran, XC., Nguyen, LM. (2021). ReLink: Open Information Extraction by Linking Phrases and Its Applications. In: Goswami, D., Hoang, T.A. (eds) Distributed Computing and Internet Technology. ICDCIT 2021. Lecture Notes in Computer Science(), vol 12582. Springer, Cham. https://doi.org/10.1007/978-3-030-65621-8_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-65621-8_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-65620-1

  • Online ISBN: 978-3-030-65621-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics