A Benchmark for Relation Extraction Kernels

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9282)

Abstract

Relation extraction from textual documents is an important task in information extraction. It aims at identifying relations between pairs of named entities and assigning each relation a type. Relation extraction is often approached as a supervised classification problem, involving pre-processing steps such as text segmentation, entity recognition, and morphological and syntactic annotation. Previous studies pre-process the data in different ways, which makes comparisons of classification techniques for relation extraction unfair and inconclusive. Some of these techniques use kernels, which enable the comparison of complex structures. We propose a benchmark for comparing different kernels for relation extraction. Specifically, we propose applying a common pre-processing stage, together with an online learning algorithm that trains Support Vector Machines with kernels designed for classifying candidate pairs of related entities. We also report the results of a systematic experimental validation on well-known datasets in the area.
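
To make the idea of training a kernel SVM online more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Pegasos-style kernelized update (the abstract does not name the algorithm), visits the examples in shuffled epochs rather than sampling them uniformly at random, and uses a toy bag-of-words kernel over the tokens between two entity mentions as a stand-in for the structured kernels the benchmark compares. All names (`token_overlap_kernel`, `train_kernel_pegasos`, `predict`) and the four training examples are hypothetical.

```python
import random
from collections import Counter


def token_overlap_kernel(a, b):
    """Toy kernel (assumption): dot product of bag-of-words counts."""
    ca, cb = Counter(a), Counter(b)
    return sum(ca[w] * cb[w] for w in ca)


def train_kernel_pegasos(xs, ys, kernel, lam=0.1, epochs=5, seed=0):
    """Pegasos-style kernelized SVM training, visiting examples in shuffled epochs.

    alpha[j] counts how often example j violated the margin; the implicit
    weight vector at step t is (1 / (lam * t)) * sum_j alpha[j] * ys[j] * phi(xs[j]).
    """
    rng = random.Random(seed)
    alpha = [0] * len(xs)
    t = 0
    for _ in range(epochs):
        order = list(range(len(xs)))
        rng.shuffle(order)
        for i in order:
            t += 1
            score = sum(
                alpha[j] * ys[j] * kernel(xs[i], xs[j])
                for j in range(len(xs))
                if alpha[j]
            )
            # Update only when example i falls inside the margin.
            if ys[i] * score / (lam * t) < 1:
                alpha[i] += 1
    return alpha


def predict(x, xs, ys, alpha, kernel):
    """Return +1 (related) or -1 (not related) for a candidate pair."""
    score = sum(a * y * kernel(x, xj) for a, y, xj in zip(alpha, ys, xs))
    return 1 if score > 0 else -1


# Hypothetical candidate pairs, each reduced to the tokens between the
# two entity mentions; +1 means the entities are related.
train_x = [["interacts", "with"], ["binds", "to"], ["and"], ["or", "also"]]
train_y = [1, 1, -1, -1]
alpha = train_kernel_pegasos(train_x, train_y, token_overlap_kernel)

print(predict(["binds", "with"], train_x, train_y, alpha, token_overlap_kernel))
print(predict(["and", "also"], train_x, train_y, alpha, token_overlap_kernel))
```

Because the learner touches the data only through `kernel`, swapping in a different kernel leaves the training loop unchanged. This is the property that lets a benchmark hold pre-processing and learning fixed while varying only the kernel.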

Notes

  1. http://reel.cs.columbia.edu/

  2. ftp://ftp.cs.utexas.edu/pub/mooney/bio-data/interactions.tar.gz

  3. http://semeval2.fbk.eu/semeval2.php?location=data

  4. http://opennlp.apache.org/

  5. http://snowball.tartarus.org/

  6. http://web.tecnico.ulisboa.pt/joaoplmpereira/Capitalization.html

  7. http://nlp.stanford.edu/software/corenlp.shtml

  8. http://opennlp.sourceforge.net/models-1.5/en-sent.bin

  9. http://opennlp.sourceforge.net/models-1.5/en-token.bin

  10. http://opennlp.sourceforge.net/models-1.5/en-pos-maxent.bin

  11. http://web.tecnico.ulisboa.pt/joaoplmpereira/OnlineLearning.html

Acknowledgements

We would like to thank Gonçalo Simões for the fruitful discussions, and for advice on preliminary versions of this paper.

This work was supported by Fundação para a Ciência e a Tecnologia, under Project UID/CEC/50021/2013, and under Project DataStorm (ref. EXCL/EEI-ESS/0257/2012).

Author information

Corresponding author

Correspondence to João L. M. Pereira.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Pereira, J.L.M., Galhardas, H., Martins, B. (2015). A Benchmark for Relation Extraction Kernels. In: Tadeusz, M., Valduriez, P., Bellatreche, L. (eds) Advances in Databases and Information Systems. ADBIS 2015. Lecture Notes in Computer Science, vol 9282. Springer, Cham. https://doi.org/10.1007/978-3-319-23135-8_13

  • DOI: https://doi.org/10.1007/978-3-319-23135-8_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23134-1

  • Online ISBN: 978-3-319-23135-8

  • eBook Packages: Computer Science, Computer Science (R0)
