Relation extraction based on two-step classification with distant supervision

The Journal of Supercomputing

Abstract

Supervised machine learning methods are widely used in relation extraction to identify the relation between two named entities in a sentence. However, they have two disadvantages: constructing the training data set is costly and time-consuming, and the resulting system depends on the specific domain of the training data. To overcome these disadvantages, we propose a two-step relation extraction model with distant supervision. The model consists of a one-class model and a multi-class model: the one-class model selects positive sentences from the input sentences, and the multi-class model classifies those positive sentences into specific classes. In the experiments, the proposed model achieved F1-measures of 62.9% on the auto-labeled test data and 63.8% on the gold-labeled test data, although it does not use any human-labeled training data.
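The two-step idea can be illustrated with a minimal sketch. The abstract does not specify the features or classifiers, so everything below is an assumption made for illustration: the knowledge base `KB`, the bag-of-words features, the cosine-to-centroid scoring, and the `threshold` value are hypothetical stand-ins, not the authors' method. Training labels come from the usual distant-supervision heuristic of matching knowledge-base entity pairs against sentences, so no human labeling is involved.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base (entity1, entity2) -> relation; illustrative only.
KB = {
    ("Paris", "France"): "capital_of",
    ("Berlin", "Germany"): "capital_of",
    ("Tolstoy", "Anna Karenina"): "author_of",
    ("Orwell", "1984"): "author_of",
}

def distant_label(sentences, kb=KB):
    """Auto-label a sentence with a relation when it mentions both entities of a KB pair."""
    labeled = []
    for sentence in sentences:
        for (e1, e2), relation in kb.items():
            if e1 in sentence and e2 in sentence:
                labeled.append((sentence, relation))
    return labeled

def bow(sentence):
    """Bag-of-words vector (assumed feature set, chosen for brevity)."""
    return Counter(re.findall(r"\w+", sentence.lower()))

def norm(vec):
    return math.sqrt(sum(c * c for c in vec.values()))

def cosine(a, b):
    dot = sum(count * b[term] for term, count in a.items())
    denom = norm(a) * norm(b)
    return dot / denom if denom else 0.0

def centroid(vectors):
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return total

class TwoStepExtractor:
    """Step 1: one-class filter for positive sentences. Step 2: multi-class labeling."""

    def __init__(self, threshold=0.3):  # threshold is an arbitrary assumption
        self.threshold = threshold

    def fit(self, labeled):
        vecs = [(bow(s), rel) for s, rel in labeled]
        # One-class model: a single centroid built from positive examples only.
        self.positive = centroid(v for v, _ in vecs)
        # Multi-class model: one centroid per relation.
        relations = {rel for _, rel in vecs}
        self.classes = {
            rel: centroid(v for v, r in vecs if r == rel) for rel in relations
        }
        return self

    def predict(self, sentence):
        vec = bow(sentence)
        if cosine(vec, self.positive) < self.threshold:
            return None  # rejected in step 1: no relation expressed
        return max(self.classes, key=lambda rel: cosine(vec, self.classes[rel]))

corpus = [
    "Paris is the capital city of France",
    "Berlin is the capital city of Germany",
    "Tolstoy wrote the novel Anna Karenina",
    "Orwell wrote the novel 1984",
]
model = TwoStepExtractor().fit(distant_label(corpus))
print(model.predict("Madrid is the capital city of Spain"))  # capital_of
print(model.predict("Hemingway wrote the novel Fiesta"))     # author_of
print(model.predict("Stocks fell sharply on Monday"))        # None
```

Separating the filter from the classifier means the first step needs only positive examples, which is what distant supervision naturally provides. A real system would likely replace the centroid scoring with stronger one-class and multi-class learners and richer features.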

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2013R1A1A4A01005074). This research was also supported by LG Electronics.

Author information

Corresponding author

Correspondence to Harksoo Kim.

About this article

Cite this article

Choi, M., Lee, Hg. & Kim, H. Relation extraction based on two-step classification with distant supervision. J Supercomput 72, 2609–2622 (2016). https://doi.org/10.1007/s11227-015-1535-4

  • DOI: https://doi.org/10.1007/s11227-015-1535-4
