Attention-Based Multimodal Entity Linking with High-Quality Images

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12682)

Abstract

Multimodal entity linking (MEL) is an emerging research field that uses both textual and visual information to map an ambiguous mention to an entity in a knowledge base (KB). However, images do not always help, and they may even backfire when they are irrelevant to the textual content. Moreover, existing efforts mainly focus on learning representations of mentions and entities from their textual and visual contexts, without considering the negative impact of noisy, irrelevant images, which occur frequently in social media posts. In this paper, we propose a novel MEL model that not only removes the negative impact of noisy images but also uses multiple attention mechanisms to better capture the connection between a mention representation and its corresponding entity representation. Our empirical study on a large real-world data collection demonstrates the effectiveness of our approach.
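
The abstract does not spell out how noisy images are removed or how the attention is computed, so the following is only a toy illustration of the general idea, not the paper's model. It assumes a hypothetical relevance rule (keep the image only if its cosine similarity to the text vector reaches a threshold), simple averaging as the fusion step, and a single softmax attention over candidate entities. All function names, vectors, and the threshold value are invented for this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mention_vector(text_vec, image_vec, threshold=0.5):
    """Fuse text and image features, dropping the image when it looks irrelevant.

    Hypothetical rule (an assumption, not from the paper): the image is kept
    only if its cosine similarity to the text vector reaches `threshold`;
    otherwise the text vector is used alone.
    """
    if image_vec is None or cosine(text_vec, image_vec) < threshold:
        return list(text_vec)
    return [(t + i) / 2 for t, i in zip(text_vec, image_vec)]

def link(mention_vec, candidates):
    """Score candidates by dot product and return (best_entity, attention_weights)."""
    names = list(candidates)
    scores = [sum(m * e for m, e in zip(mention_vec, candidates[n])) for n in names]
    weights = softmax(scores)
    best = max(zip(names, weights), key=lambda pair: pair[1])[0]
    return best, dict(zip(names, weights))

# Toy data: the mention "apple" in a post about phones, with an unrelated image.
text_vec = [1.0, 0.0, 0.2]
noisy_image = [0.0, 2.0, 0.0]  # orthogonal to the text vector
candidates = {
    "Apple Inc.": [0.9, 0.1, 0.3],
    "Apple (fruit)": [0.1, 0.9, 0.1],
}

filtered = mention_vector(text_vec, noisy_image)                    # image dropped
best_filtered, weights = link(filtered, candidates)

unfiltered = mention_vector(text_vec, noisy_image, threshold=-1.0)  # filter disabled
best_unfiltered, _ = link(unfiltered, candidates)
```

In this toy setup, the filtered mention links to "Apple Inc.", while disabling the filter lets the irrelevant image pull the fused vector toward "Apple (fruit)", which mirrors the backfire effect the abstract describes.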

Acknowledgments

This research is supported by the National Key R&D Program of China (No. 2018-AAA0101900), the Priority Academic Program Development of Jiangsu Higher Education Institutions, the National Natural Science Foundation of China (Grant Nos. 62072323, 61632016), the Natural Science Foundation of Jiangsu Province (No. BK20191420), and the Suda-Toycloud Data Intelligence Joint Laboratory.

Corresponding author

Correspondence to Zhixu Li.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, L., Li, Z., Yang, Q. (2021). Attention-Based Multimodal Entity Linking with High-Quality Images. In: Jensen, C.S., et al. Database Systems for Advanced Applications. DASFAA 2021. Lecture Notes in Computer Science, vol 12682. Springer, Cham. https://doi.org/10.1007/978-3-030-73197-7_35

  • DOI: https://doi.org/10.1007/978-3-030-73197-7_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-73196-0

  • Online ISBN: 978-3-030-73197-7

  • eBook Packages: Computer Science, Computer Science (R0)
