Building a Large Scale Knowledge Base from Chinese Wiki Encyclopedia

  • Conference paper
The Semantic Web (JIST 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7185)

Abstract

DBpedia has proved to be a successful structured knowledge base, and large-scale Semantic Web data has been built using DBpedia as the central interlinking hub of the Web of Data in English. In Chinese, however, because the Chinese Wikipedia is far smaller than the English one (no more than one tenth of its size), little Chinese linked data has been published and linked to DBpedia, which hinders structured knowledge sharing both within Chinese resources and across languages. This paper aims to build a large-scale Chinese structured knowledge base from Hudong, one of the largest Chinese wiki encyclopedia websites. First, an upper-level ontology schema in Chinese is learned from the category system and Infobox information in Hudong. In total, 19542 concepts are inferred and organized in a hierarchy of at most 20 levels, and 2381 properties with domain and range information are learned from the attributes in the Hudong Infoboxes. Then, 802593 instances are extracted and described using the concepts and properties of the learned ontology. These instances cover a wide range of things, including persons, organizations, and places. Of these instances, 62679 are linked to identical instances in DBpedia. The paper also provides access to the established Chinese knowledge base through an RDF dump or SPARQL. The general upper-level ontology and wide coverage make the knowledge base a valuable Chinese semantic resource. It can be used for building Chinese linked data, which is the fundamental work for building a multilingual knowledge base across heterogeneous resources in different languages, and it can also greatly facilitate applications of large-scale knowledge bases such as question answering and semantic search.
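As a rough illustration of the kind of data the abstract describes, the sketch below serializes one extracted instance as N-Triples: a type assertion against a learned concept, plus a link to the identical DBpedia instance. The `http://example.org/hudong/` namespace, the `describe_instance` helper, and the use of `owl:sameAs` for the DBpedia links are assumptions made for this example; the abstract does not state the actual URIs or linking predicate used in the knowledge base.

```python
# Hypothetical namespace: the abstract does not give the real URIs.
KB = "http://example.org/hudong/"
DBPEDIA = "http://dbpedia.org/resource/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumption: identity links to DBpedia are expressed with owl:sameAs.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def triple(subject, predicate, obj):
    """Serialize one triple whose three terms are URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def describe_instance(name, concept, dbpedia_name=None):
    """Return N-Triples lines for one instance (hypothetical helper).

    The instance is typed with a concept from the learned ontology and,
    if a matching DBpedia resource is known, linked to it.
    """
    subject = KB + "instance/" + name
    lines = [triple(subject, RDF_TYPE, KB + "concept/" + concept)]
    if dbpedia_name is not None:
        lines.append(triple(subject, OWL_SAME_AS, DBPEDIA + dbpedia_name))
    return lines


if __name__ == "__main__":
    for line in describe_instance("Beijing", "City", "Beijing"):
        print(line)
```

Instances without a DBpedia counterpart, which are the majority according to the figures above, would simply omit the link triple.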

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, Z., Wang, Z., Li, J., Pan, J.Z. (2012). Building a Large Scale Knowledge Base from Chinese Wiki Encyclopedia. In: Pan, J.Z., et al. The Semantic Web. JIST 2011. Lecture Notes in Computer Science, vol 7185. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29923-0_6

  • DOI: https://doi.org/10.1007/978-3-642-29923-0_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29922-3

  • Online ISBN: 978-3-642-29923-0

  • eBook Packages: Computer Science, Computer Science (R0)
