Learning Chinese Entity Attributes from Online Encyclopedia

  • Conference paper
Web Technologies and Applications (APWeb 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7234)

Abstract

Automatically constructing knowledge bases from free online encyclopedias is considered a crucial step in many Internet-related areas. However, current research focuses mainly on extracting knowledge facts from English resources, and there is less work on other languages. In this paper, we describe an approach to extracting entity attributes from a free Chinese online encyclopedia, HudongBaike. We first identified attribute-value pairs from HudongBaike pages that feature InfoBoxes; these pairs were in turn used to learn which attributes to look for in different HudongBaike entries. We then adopted a keyword matching approach to identify candidate sentences for each attribute in a plain HudongBaike article. Finally, we trained a CRF model to extract the corresponding values from these candidate sentences. Our approach is simple but effective, and our experiments show that it is possible to produce a large number of <S,P,O> triples from free online encyclopedias, which can then be used to construct Chinese knowledge bases with less human supervision.
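
The first two steps of the pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy infobox data, the `KEYWORDS` trigger table, the `min_count` threshold, and all function names are assumptions made for illustration, and the final CRF step (value extraction from the candidate sentences) is not shown.

```python
import re
from collections import Counter, defaultdict

# Toy, illustrative infobox data (hypothetical): entry -> (category, {attribute: value}).
INFOBOXES = {
    "李白": ("person", {"出生地": "碎叶城", "职业": "诗人"}),
    "杜甫": ("person", {"出生地": "巩县", "职业": "诗人"}),
    "苏轼": ("person", {"出生地": "眉山"}),
}

# Hypothetical trigger keywords for each attribute.
KEYWORDS = {"出生地": ["出生于", "生于"], "职业": ["诗人", "作家"]}


def learn_attributes(infoboxes, min_count=2):
    """Step 1: keep the attributes that appear often enough in each category."""
    counts = defaultdict(Counter)
    for category, pairs in infoboxes.values():
        counts[category].update(pairs.keys())
    return {
        category: [attr for attr, n in counter.most_common() if n >= min_count]
        for category, counter in counts.items()
    }


def split_sentences(text):
    """Split Chinese text into sentences on the full stop, '！' and '？'."""
    return re.findall(r"[^。！？]+[。！？]?", text)


def candidate_sentences(text, attributes, keywords):
    """Step 2: for each attribute, collect sentences containing a trigger keyword."""
    sentences = split_sentences(text)
    result = {}
    for attr in attributes:
        triggers = keywords.get(attr, [])
        hits = [s for s in sentences if any(k in s for k in triggers)]
        if hits:
            result[attr] = hits
    return result


if __name__ == "__main__":
    attrs = learn_attributes(INFOBOXES)
    article = "王维，唐代诗人。他生于蒲州。晚年隐居。"
    for attr, sents in candidate_sentences(article, attrs["person"], KEYWORDS).items():
        print(attr, sents)
```

In a full system, each candidate sentence would then be passed to a trained sequence labeler that marks the value span, yielding one <S,P,O> triple per entry, attribute, and extracted value.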



Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, Y., Chen, L., Xu, K. (2012). Learning Chinese Entity Attributes from Online Encyclopedia. In: Wang, H., et al. Web Technologies and Applications. APWeb 2012. Lecture Notes in Computer Science, vol 7234. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29426-6_22

  • DOI: https://doi.org/10.1007/978-3-642-29426-6_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29425-9

  • Online ISBN: 978-3-642-29426-6

  • eBook Packages: Computer Science, Computer Science (R0)
