
On Extracting Relations Using Distributional Semantics and a Tree Generalization

  • Conference paper
Knowledge Engineering and Knowledge Management (EKAW 2018)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 11313))


Abstract

Extracting relations out of unstructured text is essential for a wide range of applications. Minimal human effort, scalability and high precision are desirable characteristics. We introduce a distant supervised closed relation extraction approach based on distributional semantics and a tree generalization. Our approach uses training data obtained from a reference knowledge base to derive dependency parse trees that might express a relation. It then uses a novel generalization algorithm to construct dependency tree patterns for the relation. Distributional semantics are used to eliminate false candidate patterns. We evaluate the performance in experiments on a large corpus using ninety target relations. Our evaluation results suggest that our approach achieves a higher precision than two state-of-the-art systems. Moreover, our results also underpin the scalability of our approach. Our open source implementation can be found at https://github.com/dice-group/Ocelot.
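The filtering step mentioned in the abstract, where distributional semantics eliminate false candidate patterns, can be illustrated with a minimal sketch. The abstract does not say how the filter is computed, so everything below is an assumption made for illustration: the cosine-similarity criterion, the threshold, the hand-made three-dimensional vectors, and the names `cosine`, `filter_patterns` and `TOY_VECTORS` are all hypothetical and are not taken from the Ocelot implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hand-made toy vectors for illustration only; a real system would load
# trained embeddings instead.
TOY_VECTORS = {
    "spouse": [0.9, 0.1, 0.0],
    "married": [0.8, 0.2, 0.1],
    "wife": [0.85, 0.15, 0.05],
    "visited": [0.1, 0.9, 0.2],
}


def filter_patterns(candidates, relation_label, vectors, threshold=0.8):
    """Keep candidate lemmas whose vector is close to the relation label's vector.

    Candidates without a vector are dropped (an assumption of this sketch).
    """
    target = vectors[relation_label]
    kept = []
    for lemma in candidates:
        vec = vectors.get(lemma)
        if vec is not None and cosine(vec, target) >= threshold:
            kept.append(lemma)
    return kept


if __name__ == "__main__":
    # "visited" is far from "spouse" in the toy space and is filtered out.
    print(filter_patterns(["married", "wife", "visited"], "spouse", TOY_VECTORS))
```

In this toy setting a candidate is represented by a single lemma; the patterns described in the abstract are dependency trees, so an actual filter would have to decide which part of a tree to compare against the relation.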


Notes

  1. \(A:= \{ label, lemma, pos, ner, domain, range, general\}\) are the vertex attributes used throughout this paper.

  2. The shortest sentence with a relation has at least two tokens for the named entity arguments, one token for the relation mention and one for the end punctuation.

  3. Seven types are applied (Place, Person, Organization, Money, Percent, Date, Time).

  4. https://code.google.com/archive/p/word2vec.

  5. https://www.wikidata.org.

  6. https://www.oxforddictionaries.com.

  7. https://www.wordnik.com.

  8. https://wordnet.princeton.edu.

  9. In our approach we utilize Organization, Person and Place.

  10. We provided an example of an RDF serialisation of the framework in Listing 1.1.
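Notes 1 and 2 can be made concrete with a small sketch. It assumes, purely for illustration, that a vertex is stored as a dictionary keyed by the attribute names of Note 1, and that tokens are obtained with a simple regular expression that splits off punctuation. The names `make_vertex`, `tokenize` and `can_express_relation` are hypothetical and do not come from the paper or its implementation.

```python
import re

# Vertex attribute names as listed in Note 1.
VERTEX_ATTRIBUTES = ("label", "lemma", "pos", "ner", "domain", "range", "general")

# Note 2: two entity tokens + one relation token + one end punctuation token.
MIN_TOKENS = 4


def make_vertex(**attributes):
    """Build a vertex as a dict; attributes that are not given default to None."""
    unknown = set(attributes) - set(VERTEX_ATTRIBUTES)
    if unknown:
        raise ValueError(f"unknown vertex attributes: {sorted(unknown)}")
    return {name: attributes.get(name) for name in VERTEX_ATTRIBUTES}


def tokenize(sentence):
    """Naive tokenizer (an assumption): runs of word characters or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def can_express_relation(sentence):
    """Necessary (not sufficient) length check derived from Note 2."""
    return len(tokenize(sentence)) >= MIN_TOKENS


if __name__ == "__main__":
    print(make_vertex(lemma="marry", pos="VBD"))
    print(can_express_relation("Alice married Bob."))  # four tokens
    print(can_express_relation("Alice slept."))        # three tokens
```

The length check only rules sentences out; a sentence with four or more tokens still needs two recognized entity arguments and a relation mention before it can yield a candidate.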


Acknowledgement

This work has been supported by the H2020 project HOBBIT (no. 688227), the BMWI projects GEISER (no. 01MD16014E) and OPAL (no. 19F2028A), and the EuroStars projects DIESEL (no. 01QE1512C) and QAMEL (no. 01QE1549C).

Author information


Corresponding authors

Correspondence to René Speck or Axel-Cyrille Ngomo Ngonga.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Speck, R., Ngomo Ngonga, AC. (2018). On Extracting Relations Using Distributional Semantics and a Tree Generalization. In: Faron Zucker, C., Ghidini, C., Napoli, A., Toussaint, Y. (eds) Knowledge Engineering and Knowledge Management. EKAW 2018. Lecture Notes in Computer Science, vol 11313. Springer, Cham. https://doi.org/10.1007/978-3-030-03667-6_27


  • DOI: https://doi.org/10.1007/978-3-030-03667-6_27


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03666-9

  • Online ISBN: 978-3-030-03667-6

  • eBook Packages: Computer Science, Computer Science (R0)
