DKGBuilder: An Architecture for Building a Domain Knowledge Graph from Scratch

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10178)

Abstract

General knowledge graph construction has advanced considerably in recent years. For a specific domain, however, harvesting precise and fine-grained knowledge remains difficult because entities and relations follow a long-tail distribution and high-quality, wide-coverage data sources are scarce. This paper presents DKGBuilder, a system for constructing domain knowledge graphs. It uses a template-based approach to extract seed knowledge from semi-structured data. A word-embedding-based projection model is proposed to extract relations from text under the framework of distant supervision. An is-a relation classifier is further employed to learn a domain taxonomy using a bottom-up strategy. For demonstration, we construct a Chinese entertainment knowledge graph from Wikipedia that contains over 0.7M facts with 93.1% accuracy and supports several knowledge service functionalities.
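To make the first step of the abstract concrete, the following is a minimal sketch of template-based seed extraction from semi-structured data. It is an illustration under stated assumptions, not the DKGBuilder implementation: the infobox line format (`| key = value`), the `TEMPLATES` mapping, the relation names, and the function `extract_seed_triples` are all hypothetical.

```python
import re
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical templates: infobox attribute name -> relation name.
# A real system would curate these per domain; these are assumptions.
TEMPLATES: Dict[str, str] = {
    "director": "directed_by",
    "starring": "has_actor",
    "released": "release_date",
}

# Matches infobox-style lines such as "| director = Ang Lee".
FIELD = re.compile(r"^\|\s*(?P<key>[^=|]+?)\s*=\s*(?P<value>.+?)\s*$")


def extract_seed_triples(entity: str, infobox: str) -> List[Triple]:
    """Turn the attributes covered by a template into seed triples."""
    triples: List[Triple] = []
    for line in infobox.splitlines():
        match = FIELD.match(line.strip())
        if match is None:
            continue  # not an attribute line
        relation = TEMPLATES.get(match.group("key").lower())
        if relation is None:
            continue  # no template covers this attribute
        # One attribute may list several values separated by commas.
        for value in match.group("value").split(","):
            value = value.strip()
            if value:
                triples.append((entity, relation, value))
    return triples


SAMPLE = """
| director = Ang Lee
| starring = Chow Yun-fat, Michelle Yeoh
| budget = unknown
"""

if __name__ == "__main__":
    for triple in extract_seed_triples("Crouching Tiger, Hidden Dragon", SAMPLE):
        print(triple)
```

Triples obtained this way could then serve as the seed facts that distant supervision aligns with free text. Attributes without a template (here `budget`) are skipped, which trades recall for precision.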

Notes

  1. http://baike.baidu.com/

Acknowledgements

This work is partially supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000904 and NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization under Grant No. U1509219.

Author information

Corresponding author

Correspondence to Xiaofeng He.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Fan, Y., Wang, C., Zhou, G., He, X. (2017). DKGBuilder: An Architecture for Building a Domain Knowledge Graph from Scratch. In: Candan, S., Chen, L., Pedersen, T., Chang, L., Hua, W. (eds) Database Systems for Advanced Applications. DASFAA 2017. Lecture Notes in Computer Science, vol 10178. Springer, Cham. https://doi.org/10.1007/978-3-319-55699-4_42

  • DOI: https://doi.org/10.1007/978-3-319-55699-4_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-55698-7

  • Online ISBN: 978-3-319-55699-4

  • eBook Packages: Computer Science, Computer Science (R0)
