On Extracting Relations Using Distributional Semantics and a Tree Generalization

Speck, René; Ngomo Ngonga, Axel-Cyrille

doi:10.1007/978-3-030-03667-6_27

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11313)

Included in the following conference series:

European Knowledge Acquisition Workshop

Abstract

Extracting relations from unstructured text is essential for a wide range of applications. Minimal human effort, scalability and high precision are desirable characteristics of an extraction approach. We introduce a distantly supervised closed relation extraction approach based on distributional semantics and a tree generalization. Our approach uses training data obtained from a reference knowledge base to derive dependency parse trees that might express a relation. It then uses a novel generalization algorithm to construct dependency tree patterns for the relation. Distributional semantics are used to eliminate false candidate patterns. We evaluate the performance in experiments on a large corpus using ninety target relations. Our evaluation results suggest that our approach achieves a higher precision than two state-of-the-art systems. Moreover, our results support the scalability of our approach. Our open source implementation can be found at https://github.com/dice-group/Ocelot.
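
The abstract states only that distributional semantics are used to eliminate false candidate patterns; it does not give the scoring function. The following is a minimal sketch of one way such a filter could look, assuming that a candidate pattern is reduced to its content words and compared by cosine similarity against words describing the target relation. The toy vectors, the function names (`cosine`, `pattern_score`, `filter_patterns`) and the threshold of 0.8 are all hypothetical and are not taken from the paper or from the Ocelot code base.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real setup would load pretrained embeddings instead.
TOY_VECTORS = {
    "spouse":  [0.90, 0.10, 0.00],
    "married": [0.80, 0.20, 0.10],
    "wife":    [0.85, 0.15, 0.05],
    "born":    [0.10, 0.90, 0.00],
    "visited": [0.00, 0.20, 0.90],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pattern_score(pattern_words, label_words, vectors):
    """Best similarity between any pattern word and any relation label word.

    Words without a vector are skipped; the score is 0.0 if nothing matches.
    """
    scores = [
        cosine(vectors[p], vectors[l])
        for p in pattern_words
        for l in label_words
        if p in vectors and l in vectors
    ]
    return max(scores, default=0.0)


def filter_patterns(candidates, label_words, vectors, threshold=0.8):
    """Keep only candidates whose score reaches the (hypothetical) threshold."""
    return [
        words for words in candidates
        if pattern_score(words, label_words, vectors) >= threshold
    ]


# Candidate patterns for a hypothetical "spouse" relation, as content words.
candidates = [["married"], ["wife"], ["visited"], ["born"]]
kept = filter_patterns(candidates, ["spouse"], TOY_VECTORS)
print(kept)
```

With these toy vectors, the patterns built around "married" and "wife" are kept for the "spouse" relation, while "visited" and "born" fall below the threshold and are discarded as false candidates.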

Notes

1. \(A := \{label, lemma, pos, ner, domain, range, general\}\) are the vertex attributes used throughout this paper.
2. The shortest sentence with a relation has at least two tokens for the named entity arguments, one token for the relation mention and one for the end punctuation.
3. Seven types are applied (Place, Person, Organization, Money, Percent, Date, Time).
4. https://code.google.com/archive/p/word2vec.
5. https://www.wikidata.org.
6. https://www.oxforddictionaries.com.
7. https://www.wordnik.com.
8. https://wordnet.princeton.edu.
9. In our approach we utilize Organization, Person and Place.
10. We provided an example of an RDF serialisation of the framework in Listing 1.1.

Acknowledgement

This work has been supported by the H2020 project HOBBIT (no. 688227), the BMWI projects GEISER (no. 01MD16014E) and OPAL (no. 19F2028A), and the EuroStars projects DIESEL (no. 01QE1512C) and QAMEL (no. 01QE1549C).

Author information

Authors and Affiliations

Leipzig University, AKSW Group, Hainstraße 11, 04109, Leipzig, Germany
René Speck
Paderborn University, DICE Group, Warburger Straße 100, 33098, Paderborn, Germany
Axel-Cyrille Ngomo Ngonga

Corresponding authors

Correspondence to René Speck or Axel-Cyrille Ngomo Ngonga.

Editor information

Editors and Affiliations

Université Côte d’Azur, CNRS, Inria, I3S, Sophia Antipolis, France
Catherine Faron Zucker
Fondazione Bruno Kessler, Trento, Italy
Chiara Ghidini
University of Lorraine, CNRS, Inria, LORIA, Nancy, France
Amedeo Napoli
University of Lorraine, CNRS, Inria, LORIA, Nancy, France
Yannick Toussaint

About this paper

Cite this paper

Speck, R., Ngomo Ngonga, AC. (2018). On Extracting Relations Using Distributional Semantics and a Tree Generalization. In: Faron Zucker, C., Ghidini, C., Napoli, A., Toussaint, Y. (eds) Knowledge Engineering and Knowledge Management. EKAW 2018. Lecture Notes in Computer Science, vol 11313. Springer, Cham. https://doi.org/10.1007/978-3-030-03667-6_27

DOI: https://doi.org/10.1007/978-3-030-03667-6_27
Published: 31 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03666-9
Online ISBN: 978-3-030-03667-6
eBook Packages: Computer Science, Computer Science (R0)
