A Benchmark for Relation Extraction Kernels

Pereira, João L. M.; Galhardas, Helena; Martins, Bruno

doi:10.1007/978-3-319-23135-8_13


Conference paper
First Online: 01 January 2015


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9282)

Abstract

Relation extraction from textual documents is an important task in information extraction. It aims at identifying relations between pairs of named entities and assigning each relation a type. Relation extraction is often approached as a supervised classification problem, involving pre-processing steps such as text segmentation, entity recognition, and morphological and syntactic annotation. Previous studies differ in how they pre-process the data, which makes comparisons of classification techniques for relation extraction unfair and inconclusive. Some of these classification techniques use kernels, which enable the comparison of complex structures. We propose a benchmark for comparing different kernels for relation extraction. Specifically, we propose applying a common pre-processing stage, together with an online learning algorithm that trains Support Vector Machines with kernels designed for classifying candidate pairs of related entities. We also report the results of a systematic experimental validation on well-known datasets in the area.
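To make the idea of training a kernel SVM online more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Pegasos-style kernelized subgradient update (the abstract does not name the algorithm), a binary rather than multi-type setting, and a toy token-overlap kernel standing in for the relation extraction kernels that the benchmark compares. All function names, parameter values, and example strings are hypothetical; each candidate pair is represented only by the tokens found between its two entities.

```python
import random


def token_overlap_kernel(a, b):
    """Toy kernel (an assumption): number of distinct tokens shared by two strings."""
    return float(len(set(a.split()) & set(b.split())))


def train_kernel_pegasos(examples, labels, kernel, lam=0.1, steps=200, seed=0):
    """Pegasos-style kernelized training; returns one integer count per example."""
    rng = random.Random(seed)
    n = len(examples)
    alpha = [0] * n  # alpha[j] counts margin violations observed on example j
    for t in range(1, steps + 1):
        i = rng.randrange(n)  # sample one training example per step
        score = sum(
            alpha[j] * labels[j] * kernel(examples[j], examples[i])
            for j in range(n) if alpha[j]
        ) / (lam * t)
        if labels[i] * score < 1:  # margin violated: give example i more weight
            alpha[i] += 1
    return alpha


def predict(x, examples, labels, alpha, kernel):
    """Sign of the kernel expansion; the positive 1/(lam*T) factor is omitted."""
    score = sum(a * y * kernel(z, x) for a, y, z in zip(alpha, labels, examples) if a)
    return 1 if score > 0 else -1


# Hypothetical candidates: text between a person and a location mention.
examples = ["was born in", ", born in", "visited", "flew to"]
labels = [1, 1, -1, -1]  # +1: "born-in" relation holds, -1: it does not

alpha = train_kernel_pegasos(examples, labels, token_overlap_kernel)
print(predict("born in", examples, labels, alpha, token_overlap_kernel))
print(predict("flew over", examples, labels, alpha, token_overlap_kernel))
```

Because the learner touches the data only through `kernel`, swapping in a different kernel function leaves the training loop unchanged, which is what lets a benchmark of this kind hold pre-processing and the learning algorithm fixed while varying the kernel.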



Acknowledgements

We would like to thank Gonçalo Simões for the fruitful discussions, and for advice on preliminary versions of this paper.

This work was supported by Fundação para a Ciência e a Tecnologia, under Project UID/CEC/50021/2013, and under Project DataStorm (ref. EXCL/EEI-ESS/0257/2012).

Author information

Authors and Affiliations

INESC-ID and Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
João L. M. Pereira, Helena Galhardas & Bruno Martins


Corresponding author

Correspondence to João L. M. Pereira.

Editor information

Editors and Affiliations

Poznan University of Technology, Poznań, Poland
Tadeusz Morzy
INRIA, Montpellier, France
Patrick Valduriez
Teleport 2, LIAS/ISAE-ENSMA, Poitiers, France
Ladjel Bellatreche


About this paper

Cite this paper

Pereira, J.L.M., Galhardas, H., Martins, B. (2015). A Benchmark for Relation Extraction Kernels. In: Morzy, T., Valduriez, P., Bellatreche, L. (eds) Advances in Databases and Information Systems. ADBIS 2015. Lecture Notes in Computer Science, vol 9282. Springer, Cham. https://doi.org/10.1007/978-3-319-23135-8_13


DOI: https://doi.org/10.1007/978-3-319-23135-8_13
Published: 15 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23134-1
Online ISBN: 978-3-319-23135-8
eBook Packages: Computer Science, Computer Science (R0)
