Taxonomy Induction from Chinese Encyclopedias by Combinatorial Optimization

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9362)

Abstract

A taxonomy is an important component of a knowledge base, and constructing a Chinese taxonomy is an urgent, meaningful, but challenging task. In this paper, we propose an approach that induces a taxonomy from a Chinese encyclopedia using combinatorial optimization. First, subclass-of relations are derived by validating the relation between pairs of categories. Then, integer programming is applied to identify instance-of relations from encyclopedia articles, taking into account the constraints among categories. The experimental results show that our approach can construct a practical taxonomy from Chinese encyclopedias.

Author information

Corresponding author

Correspondence to Weiming Lu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Lu, W., Lou, R., Dai, H., Zhang, Z., Yang, S., Wei, B. (2015). Taxonomy Induction from Chinese Encyclopedias by Combinatorial Optimization. In: Li, J., Ji, H., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2015. Lecture Notes in Computer Science, vol 9362. Springer, Cham. https://doi.org/10.1007/978-3-319-25207-0_25

  • DOI: https://doi.org/10.1007/978-3-319-25207-0_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25206-3

  • Online ISBN: 978-3-319-25207-0

  • eBook Packages: Computer Science, Computer Science (R0)
