Using Crowdsourcing for Fine-Grained Entity Type Completion in Knowledge Bases

Dong, Zhaoan; Fan, Ju; Lu, Jiaheng; Du, Xiaoyong; Ling, Tok Wang

doi:10.1007/978-3-319-96893-3_19

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10988)

Included in the following conference series:

Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data

Abstract

Recent years have witnessed the proliferation of large-scale Knowledge Bases (KBs). However, many entities in KBs have incomplete type information, and some are entirely untyped. Even worse, fine-grained types (e.g., BasketballPlayer), which carry rich semantic meaning, are more likely to be incomplete because they are harder to obtain. Existing machine-based algorithms use the predicates (e.g., birthPlace) of entities to infer their missing types, but these predicates may be insufficient for inferring fine-grained types. In this paper, we use crowdsourcing to solve the problem and address the challenge of controlling crowdsourcing cost. To this end, we propose a hybrid machine-crowdsourcing approach for fine-grained entity type completion. It first determines the types of some “representative” entities via crowdsourcing and then infers the types of the remaining entities from the crowdsourcing results. To support this approach, we first propose an embedding-based influence for type inference, which considers not only the distance between entity embeddings but also the distances between entity and type embeddings. Second, we propose a new difficulty model for entity selection that better captures the uncertainty of the machine algorithm when identifying entity types. We demonstrate the effectiveness of our approach through experiments on real crowdsourcing platforms. The results show that our method outperforms the state-of-the-art algorithms by improving the effectiveness of fine-grained type completion at affordable crowdsourcing cost.
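
The abstract names two components, an embedding-based influence and a difficulty model, without giving their formulas. The sketch below is only an illustration of the general idea and should not be read as the paper's definitions. It assumes that influence is a weighted sum of two exponentially decayed Euclidean distances (entity to entity, and entity to type) and that difficulty is the Shannon entropy of the resulting type distribution. The function names, the weight alpha, and the toy two-dimensional embeddings are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def influence(labeled_vec, target_vec, type_vec, alpha=0.5):
    """Assumed influence of one crowd-labeled entity on a target entity.

    Mixes entity-to-entity closeness with target-to-type closeness.
    alpha is a hypothetical mixing weight, not a value from the paper.
    """
    entity_sim = math.exp(-euclidean(labeled_vec, target_vec))
    type_sim = math.exp(-euclidean(target_vec, type_vec))
    return alpha * entity_sim + (1 - alpha) * type_sim


def type_scores(target_vec, labeled, type_vecs, alpha=0.5):
    """Normalized per-type scores for a target entity.

    labeled: list of (entity_vector, type_name) pairs from crowd answers.
    type_vecs: dict mapping type_name to its embedding vector.
    """
    scores = {t: 0.0 for t in type_vecs}
    for vec, t in labeled:
        scores[t] += influence(vec, target_vec, type_vecs[t], alpha)
    total = sum(scores.values())
    if total == 0:  # no crowd labels yet: fall back to a uniform guess
        return {t: 1.0 / len(type_vecs) for t in type_vecs}
    return {t: s / total for t, s in scores.items()}


def difficulty(probs):
    """Shannon entropy of the type distribution, used here as a
    stand-in uncertainty measure for choosing entities to crowdsource."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)


# Toy data: two types and two crowd-labeled "representative" entities.
type_vecs = {"BasketballPlayer": [1.0, 0.0], "SoccerPlayer": [0.0, 1.0]}
labeled = [([0.9, 0.1], "BasketballPlayer"), ([0.1, 0.9], "SoccerPlayer")]

near = type_scores([0.8, 0.2], labeled, type_vecs)       # close to one label
ambiguous = type_scores([0.5, 0.5], labeled, type_vecs)  # between both labels
print(near, difficulty(near))
print(ambiguous, difficulty(ambiguous))
```

Under these assumptions, the entity that sits between the two labeled examples receives a near-uniform distribution and therefore a higher difficulty score, which would make it a more useful candidate to send to the crowd than the entity that already sits close to a labeled one.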

This work is partially supported by the National Natural Science Foundation of China (No. 61602488, No. 61632016 and No. 61472427) and the Academy of Finland (No. 310321).

Author information

Authors and Affiliations

DEKE, MOE and School of Information, Renmin University of China, Beijing, China
Zhaoan Dong, Ju Fan, Jiaheng Lu & Xiaoyong Du
Department of Computer Science, University of Helsinki, Helsinki, Finland
Jiaheng Lu
School of Computing, National University of Singapore, Singapore, Singapore
Tok Wang Ling

Corresponding author

Correspondence to Ju Fan.

Editor information

Editors and Affiliations

South China University of Technology, Guangzhou, China
Yi Cai
Nagoya University, Nagoya, Japan
Yoshiharu Ishikawa
Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
Jianliang Xu

About this paper

Cite this paper

Dong, Z., Fan, J., Lu, J., Du, X., Ling, T.W. (2018). Using Crowdsourcing for Fine-Grained Entity Type Completion in Knowledge Bases. In: Cai, Y., Ishikawa, Y., Xu, J. (eds) Web and Big Data. APWeb-WAIM 2018. Lecture Notes in Computer Science, vol 10988. Springer, Cham. https://doi.org/10.1007/978-3-319-96893-3_19

DOI: https://doi.org/10.1007/978-3-319-96893-3_19
Published: 19 July 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96892-6
Online ISBN: 978-3-319-96893-3
eBook Packages: Computer Science, Computer Science (R0)
