research-article

Open Access

Entity Disambiguation with Extreme Multi-label Ranking

Authors:
Jyun-Yu Jiang

Amazon Search, Palo Alto, USA

Amazon Search, Palo Alto, USA

0000-0002-1753-8099
View Profile

,
Wei-Cheng Chang

Amazon Search, Palo Alto, USA

Amazon Search, Palo Alto, USA

0000-0002-5646-9356
View Profile

,
Jiong Zhang

Amazon Search, Palo Alto, USA

Amazon Search, Palo Alto, USA

0000-0003-3192-3281
View Profile

,
Cho-Jui Hsieh

University of California, Los Angeles & Google, Los Angeles, USA

University of California, Los Angeles & Google, Los Angeles, USA

0000-0002-3520-9627
View Profile

,
Hsiang-Fu Yu

Amazon Search, Palo Alto, USA

Amazon Search, Palo Alto, USA

0000-0001-5235-2962
View Profile

Authors Info & Claims

WWW '24: Proceedings of the ACM on Web Conference 2024May 2024Pages 4172–4180https://doi.org/10.1145/3589334.3645498

Published:13 May 2024Publication History

WWW '24: Proceedings of the ACM on Web Conference 2024

Pages 4172–4180

ABSTRACT

Entity disambiguation is one of the most important natural language tasks to identify entities behind ambiguous surface mentions within a knowledge base. Although many recent studies apply deep learning to achieve decent results, they need exhausting pre-training and mediocre recall in the retrieval stage. In this paper, we propose a novel framework, eXtreme Multi-label Ranking for Entity Disambiguation (XMRED), to address this challenge. An efficient zero-shot entity retriever with auxiliary data is first pre-trained to recall relevant entities based on linear models. Specifically, the retrieval process can be considered as an extreme multi-label ranking (XMR) task. Entities are first clustered at different scales to form a label tree, thereby learning multi-scale entity retrievers over the label tree with high recall. Moreover, XMRED applies deep cross-encoder as a re-ranker to achieve high precision based on high-quality candidates. Extensive experimental results based on the AIDA-CoNLL benchmark and five zero-shot testing datasets demonstrate that XMRED obtains 98% and over 95% recall scores for in-domain and zero-shot datasets with top-10 retrieved entities. With a deep cross-encoder as the re-ranker, XMRED further outperforms the previous state-of-the-art by 1.74% in In-KB micro-F1 scores on average with a significant improvement on the training efficiency from days to 3.48 hours. In addition, XMRED also beats the state-of-the-art for page-level document retrieval by 2.38% in accuracy and 1.90% in recall@5.

Supplemental Material

rfp1141.mp4

Supplemental video

mp4

6.3 MB

Download

References

Tom Ayoola, Shubhi Tyagi, Joseph Fisher, Christos Christodoulopoulos, and Andrea Pierleoni. 2022. ReFinED: An Efficient Zero-shot-capable Approach to End-to-End Entity Linking. In NAACL. Association for Computational Linguistics, Hybrid: Seattle, Washington Online, 209--220. https://doi.org/10.18653/v1/2022.naacl-industry.24Google ScholarCross Ref
Edoardo Barba, Luigi Procopio, and Roberto Navigli. 2022. ExtEnD: Extractive Entity Disambiguation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2478--2488.Google ScholarCross Ref
Jiangui Chen, Ruqing Zhang, Jiafeng Guo, Yiqun Liu, Yixing Fan, and Xueqi Cheng. 2022. CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management. 191--200.Google ScholarDigital Library
Kunal Dahiya, Deepak Saini, Anshul Mittal, Ankush Shaw, Kushal Dave, Akshay Soni, Himanshu Jain, Sumeet Agarwal, and Manik Varma. 2021. Deepxml: A deep extreme multi-label learning framework applied to short text documents. In WSDM. 31--39.Google Scholar
Nicola De Cao, Gautier Izacard, Sebastian Riedel, and Fabio Petroni. 2021. Autoregressive Entity Retrieval. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3--7, 2021. OpenReview.net. https://openreview.net/forum?id=5k8F6UU39VGoogle Scholar
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-t raining of Deep Bidirectional Transformers for Language Understanding. In NAACL. 4171--4186.Google Scholar
Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A library for large linear classification. the Journal of machine Learning research , Vol. 9 (2008), 1871--1874.Google Scholar
Zheng Fang, Yanan Cao, Qian Li, Dongjie Zhang, Zhenyu Zhang, and Yanbing Liu. 2019. Joint entity linking with deep reinforcement learning. In The world wide web conference. 438--447.Google Scholar
Evgeniy Gabrilovich, Michael Ringgaard, and Amarnag Subramanya. 2013. FACC1: Freebase annotation of ClueWeb corpora, Version 1 (Release date 2013-06--26, Format version 1, Correction level 0). Technical Report. The Lemur Project.Google Scholar
Octavian-Eugen Ganea and Thomas Hofmann. 2017. Deep Joint Entity Disambiguation with Local Neural Attention. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2619--2629.Google ScholarCross Ref
Zhaochen Guo and Denilson Barbosa. 2014. Robust entity linking via random walks. In CIKM. 499--508.Google Scholar
Zhaochen Guo and Denilson Barbosa. 2018. Robust named entity disambiguation with random walks. Semantic Web, Vol. 9, 4 (2018), 459--479.Google ScholarDigital Library
Geoffrey E Hinton and Ruslan R Salakhutdinov. 2006. Reducing the dimensionality of data with neural networks. science, Vol. 313, 5786 (2006), 504--507.Google Scholar
Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011. Robust disambiguation of named entities in text. In Proceedings of the 2011 conference on empirical methods in natural language processing. 782--792.Google ScholarDigital Library
Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, and Jason Weston. 2020. Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring. In International Conference on Learning Representations.Google Scholar
Jyun-Yu Jiang, Jing Liu, Chin-Yew Lin, and Pu-Jen Cheng. 2015. Improving ranking consistency for web search by leveraging a knowledge base and search logs. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. 1441--1450.Google ScholarDigital Library
Siddhant Kharbanda, Atmadeep Banerjee, Erik Schultheis, and Rohit Babbar. 2022. CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification. In Conference on Neural Information Processing Systems.Google Scholar
Alex M Lamb, Anirudh Goyal ALIAS PARTH GOYAL, Ying Zhang, Saizheng Zhang, Aaron C Courville, and Yoshua Bengio. 2016. Professor forcing: A new algorithm for training recurrent networks. Advances in neural information processing systems , Vol. 29 (2016).Google Scholar
Phong Le and Ivan Titov. 2018. Improving Entity Linking by Modeling Latent Relations between Mentions. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1595--1604.Google ScholarCross Ref
Phong Le and Ivan Titov. 2019. Boosting Entity Linking Performance by Leveraging Unlabeled Documents. In ACL. 1935--1945.Google Scholar
Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. 2020a. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 7871--7880.Google ScholarCross Ref
Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rockt"aschel, et al. 2020b. Retrieval-augmented generation for knowledge-intensive nlp tasks. Advances in Neural Information Processing Systems , Vol. 33 (2020), 9459--9474.Google Scholar
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019).Google Scholar
Ilya Loshchilov and Frank Hutter. 2018. Decoupled Weight Decay Regularization. In International Conference on Learning Representations.Google Scholar
Christopher D Manning. 2008. Introduction to information retrieval. Syngress Publishing,.Google Scholar
Makoto Miwa and Mohit Bansal. 2016. End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures. In ACL. 1105--1116.Google Scholar
Andrea Moro, Alessandro Raganato, and Roberto Navigli. 2014. Entity linking meets word sense disambiguation: a unified approach. Transactions of the Association for Computational Linguistics , Vol. 2 (2014), 231--244.Google ScholarCross Ref
Laurel J Orr, Megan Leszczynski, Neel Guha, Sen Wu, Simran Arora, Xiao Ling, and Christopher Ré. 2021. Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation. In CIDR.Google Scholar
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems , Vol. 32 (2019).Google Scholar
Maria Pershina, Yifan He, and Ralph Grishman. 2015. Personalized page rank for named entity disambiguation. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 238--243.Google ScholarCross Ref
Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep Contextualized Word Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics, New Orleans, Louisiana, 2227--2237. https://doi.org/10.18653/v1/N18--1202Google ScholarCross Ref
Fabio Petroni, Aleksandra Piktus, Angela Fan, Patrick Lewis, Majid Yazdani, Nicola De Cao, James Thorne, Yacine Jernite, Vladimir Karpukhin, Jean Maillard, Vassilis Plachouras, Tim Rockt"aschel, and Sebastian Riedel. 2021. KILT: a Benchmark for Knowledge Intensive Language Tasks. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Online, 2523--2544. https://doi.org/10.18653/v1/2021.naacl-main.200Google ScholarCross Ref
Yashoteja Prabhu, Anil Kag, Shrutendra Harsola, Rahul Agrawal, and Manik Varma. 2018. Parabel: Partitioned label trees for extreme classification with application to dynamic search advertising. In Proceedings of the 2018 World Wide Web Conference. 993--1002.Google ScholarDigital Library
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J Liu, et al. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. , Vol. 21, 140 (2020), 1--67.Google Scholar
Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 3982--3992.Google ScholarCross Ref
Hamed Shahbazi, Xiaoli Z Fern, Reza Ghaeini, Rasha Obeidat, and Prasad Tadepalli. 2019. Entity-aware ELMo: Learning contextual entity representation for entity disambiguation. arXiv preprint arXiv:1908.05762 (2019).Google Scholar
Ronald J Williams and David Zipser. 1989. A learning algorithm for continually running fully recurrent neural networks. Neural computation, Vol. 1, 2 (1989), 270--280.Google Scholar
Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, et al. 2019. Huggingface's transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019).Google Scholar
Ledell Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, and Luke Zettlemoyer. 2020. Scalable Zero-shot Entity Linking with Dense Entity Retrieval. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 6397--6407.Google ScholarCross Ref
Ikuya Yamada, Hiroyuki Shindo, Hideaki Takeda, and Yoshiyasu Takefuji. 2016. Joint learning of the embedding of words and entities for named entity disambiguation. arXiv preprint arXiv:1601.01343 (2016).Google Scholar
Ikuya Yamada, Koki Washio, Hiroyuki Shindo, and Yuji Matsumoto. 2022. Global entity disambiguation with BERT. In NAACL. 3264--3271.Google Scholar
Xiyuan Yang, Xiaotao Gu, Sheng Lin, Siliang Tang, Yueting Zhuang, Fei Wu, Zhigang Chen, Guoping Hu, and Xiang Ren. 2019. Learning Dynamic Context Augmentation for Global Entity Linking. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 271--281.Google ScholarCross Ref
Yi Yang, Ozan. Irsoy, and Kazi Shefaet Rahman. 2018. Collective Entity Disambiguation with Structured Gradient Tree Boosting. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 777--786.Google ScholarCross Ref
Hsiang-Fu Yu, Kai Zhong, Jiong Zhang, Wei-Cheng Chang, and Inderjit S Dhillon. 2022. PECOS: Prediction for Enormous and Correlated Output Spaces. Journal of Machine Learning Research (2022).Google ScholarDigital Library
Fangwei Zhu, Jifan Yu, Hailong Jin, Lei Hou, Juanzi Li, and Zhifang Sui. 2023. Learn to Not Link: Exploring NIL Prediction in Entity Linking. In Findings of the Association for Computational Linguistics: ACL 2023. Association for Computational Linguistics, 10846--10860. ioGoogle ScholarCross Ref

Index Terms

Entity Disambiguation with Extreme Multi-label Ranking
1. Computing methodologies
  1. Machine learning
2. Information systems
  1. Information retrieval

Recommendations

Re-ranking for joint named-entity recognition and linking
CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management

Recognizing names and linking them to structured data is a fundamental task in text analysis. Existing approaches typically perform these two steps using a pipeline architecture: they use a Named-Entity Recognition (NER) system to find the boundaries of ...
Read More
Generalized Zero-Shot Extreme Multi-label Learning
KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

Extreme Multi-label Learning (XML) involves assigning the subset of most relevant labels to a data point from millions of label choices. A hitherto unaddressed challenge in XML is that of predicting unseen labels with no training points. These form a ...
Read More
Entity Disambiguation with Linkless Knowledge Bases
WWW '16: Proceedings of the 25th International Conference on World Wide Web

Named Entity Disambiguation is the task of disambiguating named entity mentions in natural language text and link them to their corresponding entries in a reference knowledge base (e.g. Wikipedia). Such disambiguation can help add semantics to plain ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
WWW '24: Proceedings of the ACM on Web Conference 2024
May 2024
4826 pages
ISBN:9798400701719
DOI:10.1145/3589334
General Chairs:
Tat-Seng Chua
National University of Singapore
,
Chong-Wah Ngo
Singapore Management University
,
Proceedings Chair:
Roy Ka-Wei Lee
Singapore University of Technology and Design
,
Program Chairs:
Ravi Kumar
Google
,
Hady W. Lauw
Singapore Management University
Copyright © 2024 Owner/Author
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives International 4.0 License.
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 13 May 2024
Check for updates
Author Tags
entity disambiguation
entity retriever
extreme multi-label classification
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate1,899of8,196submissions,23%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 0
  Total Citations
  View Citations
- 33
  Total Downloads
- Downloads (Last 12 months)33
- Downloads (Last 6 weeks)33
Other Metrics
View Author Metrics
Cited By
This publication has not been cited yet

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Entity Disambiguation with Extreme Multi-label Ranking

WWW '24: Proceedings of the ACM on Web Conference 2024

ABSTRACT

Supplemental Material

References

Cited By

Index Terms

Recommendations

Re-ranking for joint named-entity recognition and linking

Generalized Zero-Shot Extreme Multi-label Learning

Entity Disambiguation with Linkless Knowledge Bases

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Entity Disambiguation with Extreme Multi-label Ranking

WWW '24: Proceedings of the ACM on Web Conference 2024

ABSTRACT

Supplemental Material

References

Cited By

Index Terms

Recommendations

Re-ranking for joint named-entity recognition and linking

Generalized Zero-Shot Extreme Multi-label Learning

Entity Disambiguation with Linkless Knowledge Bases

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media