Distant Supervision for Relation Extraction via Group Selection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9490)

Abstract

Distant supervision (DS) aligns relations between named entities from a knowledge base (KB) with free text and automatically annotates the training corpus with relation mentions. One major challenge of DS is that the heuristically generated relation labels tend to be noisy when a pair of entities has multiple and/or incomplete relations in a KB. This paper proposes two ranking-based methods to reduce noise and select effective training data for multi-instance multi-label learning (MIML), one of the most popular learning paradigms for distantly supervised relation extraction. With the proposed methods, low-quality training groups are excluded from the training data according to different ranking strategies. Experimental evaluation on the KBP dataset using state-of-the-art MIML algorithms demonstrated that the proposed methods improved performance significantly.
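
The abstract does not spell out the two ranking strategies, so the following is only a minimal sketch of the general idea of rank-then-filter group selection, not the authors' method. The `Group` structure, the `select_groups` helper, the `keep_ratio` parameter, and the mention-count score are all hypothetical stand-ins chosen for illustration; a real system would plug in its own quality score.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Group:
    """One MIML training group: an entity pair, its sentences, and its KB labels."""
    entity_pair: Tuple[str, str]
    mentions: List[str]
    labels: List[str]


def select_groups(
    groups: List[Group],
    score_fn: Callable[[Group], float],
    keep_ratio: float = 0.8,
) -> List[Group]:
    """Rank groups by score (highest first) and keep the top fraction.

    score_fn is a placeholder for whatever ranking strategy is used;
    keep_ratio is an assumed hyperparameter, not a value from the paper.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    ranked = sorted(groups, key=score_fn, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_ratio)) if ranked else 0
    return ranked[:n_keep]


def mention_count_score(group: Group) -> float:
    """Hypothetical quality proxy: groups with more mentions score higher."""
    return float(len(group.mentions))


# Toy data, invented purely for illustration.
groups = [
    Group(("A", "B"), ["s1", "s2", "s3"], ["born_in"]),
    Group(("C", "D"), ["s4"], ["works_for"]),
    Group(("E", "F"), ["s5", "s6"], ["born_in", "lives_in"]),
    Group(("G", "H"), ["s7", "s8", "s9", "s10"], ["founder_of"]),
]

kept = select_groups(groups, mention_count_score, keep_ratio=0.5)
kept_pairs = [g.entity_pair for g in kept]
```

The filtered list `kept` would then be passed to an MIML learner in place of the full training set. Keeping the score function as a parameter makes it easy to compare different ranking strategies on the same groups.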

Acknowledgement

This work was supported in part by the National 863 Program of China (2015AA015405) and the National Natural Science Foundation of China (NSFC) (61402128, 61473101, 61173075 and 61272383).

Author information

Corresponding author

Correspondence to Yang Xiang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Xiang, Y., Wang, X., Zhang, Y., Qin, Y., Fan, S. (2015). Distant Supervision for Relation Extraction via Group Selection. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9490. Springer, Cham. https://doi.org/10.1007/978-3-319-26535-3_29

  • DOI: https://doi.org/10.1007/978-3-319-26535-3_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26534-6

  • Online ISBN: 978-3-319-26535-3

  • eBook Packages: Computer Science, Computer Science (R0)
