Cross-Lingual Type Inference

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9642)

Abstract

Entity typing is an essential task in constructing a knowledge base. However, many non-English knowledge bases fail to type their entities because they lack a reasonable local hierarchical taxonomy. Since constructing a widely accepted taxonomy is a hard problem, we propose to type these non-English entities with widely accepted English taxonomies, such as those of DBpedia, YAGO and Freebase. We define this problem as cross-lingual type inference. In this paper, we present CUTE, which types Chinese entities with DBpedia types. First, we exploit cross-lingual entity linking between Chinese and English entities to construct the training data. Then, we propose a multi-label hierarchical classification algorithm to type these Chinese entities. Experimental results show the effectiveness and efficiency of our method.
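The two steps named in the abstract can be illustrated with a small sketch. This is not the CUTE implementation: the hierarchy fragment, the entity links, the scores, the threshold, and the function names `with_ancestors`, `build_training_data`, and `consistent_types` are all hypothetical and chosen only for illustration. It assumes that a Chinese entity linked to an English entity inherits that entity's types plus their ancestors, and that a predicted type is kept only when every ancestor is also predicted, which is one common way to keep multi-label output consistent with a hierarchy.

```python
# Toy child -> parent edges (illustrative only); a real run would load
# the full type hierarchy from the DBpedia ontology instead.
PARENT = {
    "Agent": "Thing",
    "Person": "Agent",
    "Athlete": "Person",
    "Place": "Thing",
    "City": "Place",
}


def with_ancestors(types):
    """Return the given types plus all of their ancestors, excluding the root."""
    closed = set()
    for t in types:
        while t is not None and t != "Thing":
            closed.add(t)
            t = PARENT.get(t)
    return closed


def build_training_data(zh_to_en, en_types):
    """Label each linked Chinese entity with the types of its English counterpart."""
    labels = {}
    for zh, en in zh_to_en.items():
        if en in en_types:  # entities without a typed English match are skipped
            labels[zh] = with_ancestors(en_types[en])
    return labels


def consistent_types(scores, threshold=0.5):
    """Keep a type only if it and every ancestor score at or above the threshold."""
    kept = set()
    for t in scores:
        chain = with_ancestors([t])
        if all(scores.get(a, 0.0) >= threshold for a in chain):
            kept.add(t)
    return kept


# Hypothetical cross-lingual links and English-side types.
zh_to_en = {"姚明": "Yao_Ming", "上海": "Shanghai", "未链接实体": "Unlinked"}
en_types = {"Yao_Ming": {"Athlete"}, "Shanghai": {"City"}}

training = build_training_data(zh_to_en, en_types)
print(sorted(training["姚明"]))  # ['Agent', 'Athlete', 'Person']

# Hypothetical per-type classifier scores for one unseen entity.
scores = {"Agent": 0.9, "Person": 0.8, "Athlete": 0.3, "City": 0.7, "Place": 0.2}
print(sorted(consistent_types(scores)))  # ['Agent', 'Person']
```

In the second call, `City` is dropped despite its high score because its parent `Place` falls below the threshold, so the output never contains a type without its ancestors.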

Y. Xiao: This paper was supported by the National NSFC (Nos. 61472085, 61171132, 61033010, U1509213), by the National Key Basic Research Program of China under No. 2015CB358800, and by the Shanghai Municipal Science and Technology Commission foundation key project under No. 15JC1400900. Seung-won Hwang was supported by Microsoft.

Notes

  1. In this paper, we strictly differentiate “category” from “type”. “Category” always refers to the part of a knowledge base, such as Baidu Baike or Wikipedia, that is named “category”. Most of these categories are actually only tags of an entity. In contrast, “type” always refers to a class that an entity can be classified into.

  2. http://oldwiki.dbpedia.org/Downloads2014

  3. http://oldwiki.dbpedia.org/Downloads2014#dbpedia-ontology

  4. http://oldwiki.dbpedia.org/Downloads2014#mapping-based-types

  5. http://oldwiki.dbpedia.org/Downloads2014#articles-categories

  6. http://baike.baidu.com/

  7. http://data.dws.informatik.uni-mannheim.de/dbpedia/2014/zh/labels_en_uris_zh.nt.bz2

Author information

Corresponding author

Correspondence to Yanghua Xiao.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Xu, B., Zhang, Y., Liang, J., Xiao, Y., Hwang, S.-w., Wang, W. (2016). Cross-Lingual Type Inference. In: Navathe, S., Wu, W., Shekhar, S., Du, X., Wang, X., Xiong, H. (eds) Database Systems for Advanced Applications. DASFAA 2016. Lecture Notes in Computer Science, vol 9642. Springer, Cham. https://doi.org/10.1007/978-3-319-32025-0_28

  • DOI: https://doi.org/10.1007/978-3-319-32025-0_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-32024-3

  • Online ISBN: 978-3-319-32025-0

  • eBook Packages: Computer Science, Computer Science (R0)
