DOI: 10.1145/3477495.3531986
Research article · Open access

H-ERNIE: A Multi-Granularity Pre-Trained Language Model for Web Search

Published: 07 July 2022

Abstract

Pre-trained language models (PLMs), such as BERT and ERNIE, have achieved outstanding performance on many natural language understanding tasks. Recently, PLM-based information retrieval models, such as MORES, PROP, and ColBERT, have also been investigated and have demonstrated state-of-the-art effectiveness. However, most PLM-based rankers focus on relevance matching at only a single level of granularity (e.g., the character level) while ignoring information at other granularities (e.g., words and phrases), which can easily lead to ambiguous query understanding and inaccurate matching in web search.
In this paper, we aim to improve the state-of-the-art PLM ERNIE for web search by modeling multi-granularity context information with awareness of word importance in queries and documents. In particular, we propose a novel H-ERNIE framework, which consists of a query-document analysis component and a hierarchical ranking component. The query-document analysis component contains several individual modules that generate the necessary variables, such as word segmentation, word importance analysis, and word tightness analysis. Based on these variables, importance-aware multi-level correspondences are sent to the ranking model. The hierarchical ranking model includes a multi-layer transformer module that learns character-level representations, a word-level matching module, and a phrase-level matching module with word importance. Each of these modules models query-document matching from a different perspective, and the levels communicate with one another to achieve accurate overall matching. We discuss the time complexity of the proposed framework and show that it can be implemented efficiently in real applications. Offline and online experiments on both public datasets and a commercial search engine demonstrate the effectiveness of the proposed H-ERNIE framework.
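The abstract describes the hierarchical ranking model only at a high level. As a rough illustration of the general idea of importance-weighted, multi-granularity matching — a sketch under assumptions, not the authors' actual implementation; every function and variable name here is hypothetical:

```python
def dot(u, v):
    # Inner product of two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))

def importance_weighted_match(q_vecs, d_vecs, q_weights):
    # Late-interaction matching at one granularity level: each query unit
    # (a character, word, or phrase embedding) takes its best similarity
    # over all document units, and the per-unit maxima are combined with
    # normalized importance weights.
    total = sum(q_weights)
    return sum(
        (w / total) * max(dot(qv, dv) for dv in d_vecs)
        for qv, w in zip(q_vecs, q_weights)
    )

def multi_granularity_score(level_scores, level_weights):
    # Fuse the character-, word-, and phrase-level scores with fixed
    # interpolation weights; the paper's model would learn this fusion
    # jointly rather than use a post-hoc weighted sum.
    return sum(a * s for a, s in zip(level_weights, level_scores))

# Toy example with 2-d one-hot "embeddings": the second query word has no
# counterpart in the document, so the word-level score drops accordingly.
word_score = importance_weighted_match(
    [[1, 0], [0, 1]],  # query word embeddings
    [[1, 0]],          # document word embeddings
    [2, 1],            # query word importance weights
)
print(round(word_score, 4))  # 0.6667
```

Here the importance weights and fusion weights are fixed constants for illustration; in H-ERNIE the word importance and tightness signals come from the query-document analysis component, and the granularity levels interact inside the model.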

Supplementary Material

MP4 File (SIGIR22-fp0563.mp4)
Pre-trained language model (PLM)-based information retrieval models have recently been investigated and have demonstrated state-of-the-art effectiveness. However, most PLM-based rankers focus on relevance matching at only a single level of granularity (e.g., the character level) while ignoring information at other granularities (e.g., words and phrases), which can easily lead to ambiguous query understanding and inaccurate matching in web search. This video introduces a hierarchical ERNIE-based model, H-ERNIE, which studies multi-granularity context relevance in queries and documents. Offline and online experiments on both public datasets and a commercial search engine illustrate the effectiveness of the proposed model.

References

[1]
James Bergstra, Daniel Yamins, and David D. Cox. 2013. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. In Proceedings of the 30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, 16--21 June 2013.
[2]
Sebastian Bruch. 2021. An Alternative Cross Entropy Loss for Learning-to-Rank. In WWW '21: The Web Conference 2021, Virtual Event / Ljubljana, Slovenia, April 19-23, 2021. 118--126.
[3]
Christopher J. C. Burges. 2010. From RankNet to LambdaRank to LambdaMART: An Overview. Learning (2010).
[4]
Zhe Cao, Tao Qin, T. Liu, Ming-Feng Tsai, and H. Li. 2007. Learning to rank: from pairwise approach to listwise approach. In Machine Learning, Proceedings of the Twenty-Fourth International Conference (ICML 2007), Corvallis, Oregon, USA, June 20-24, 2007.
[5]
Olivier Chapelle, Thorsten Joachims, Filip Radlinski, and Yisong Yue. 2012. Large-scale validation and analysis of interleaved search evaluation. TOIS (2012).
[6]
Jing Chen, Qingcai Chen, Xin Liu, Haijun Yang, Daohe Lu, and Buzhou Tang. 2018. The BQ Corpus: A Large-scale Domain-specific Chinese Corpus For Sentence Semantic Equivalence Identification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018. 4946--4951.
[7]
William S Cooper, Fredric C Gey, and Daniel P Dabney. 1992. Probabilistic retrieval based on staged logistic regression. In Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. Copenhagen, Denmark, June 21-24, 1992. 198--210.
[8]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers). 4171--4186.
[9]
Hui Fang, Tao Tao, and ChengXiang Zhai. 2004. A formal study of information retrieval heuristics. In SIGIR 2004: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Sheffield, UK, July 25-29, 2004. 49--56.
[10]
Yoav Freund, Raj Iyer, Robert E Schapire, and Yoram Singer. 2003. An Efficient Boosting Algorithm for Combining Preferences. JMLR (2003).
[11]
Luyu Gao, Zhuyun Dai, and Jamie Callan. 2020. Modularized Transformer-based Ranking Framework. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020. 4180--4190.
[12]
Luyu Gao, Zhuyun Dai, and Jamie Callan. 2021. Rethink Training of BERT Rerankers in Multi-stage Retrieval Pipeline. In Advances in Information Retrieval - 43rd European Conference on IR Research, ECIR 2021, Virtual Event, March 28 - April 1, 2021, Proceedings, Part II. 280--286.
[13]
Jiafeng Guo, Yixing Fan, Qingyao Ai, and W Bruce Croft. 2016. A deep relevance matching model for ad-hoc retrieval. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management, CIKM 2016, Indianapolis, IN, USA, October 24-28, 2016. 55--64.
[14]
Djoerd Hiemstra. 2002. Term-specific smoothing for the language modeling approach to information retrieval: the importance of a query term. In SIGIR 2002: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, August 11--15, 2002, Tampere, Finland. 35--41.
[15]
Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry Heck. 2013. Learning deep structured semantic models for web search using clickthrough data. In 22nd ACM International Conference on Information and Knowledge Management, CIKM'13, San Francisco, CA, USA, October 27 - November 1, 2013. 2333--2338.
[16]
Kalervo Järvelin and Jaana Kekäläinen. 2017. IR evaluation methods for retrieving highly relevant documents. SIGIR Forum, Vol. 51, 2 (2017), 243--250.
[17]
Yiling Jia, Huazheng Wang, Stephen Guo, and Hongning Wang. 2021. PairRank: Online Pairwise Learning to Rank by Divide-and-Conquer. In WWW '21: The Web Conference 2021, Virtual Event / Ljubljana, Slovenia, April 19-23, 2021.
[18]
Thorsten Joachims. 2002. Optimizing search engines using clickthrough data. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 23--26, 2002, Edmonton, Alberta, Canada. 133--142.
[19]
Omar Khattab and Matei Zaharia. 2020. Colbert: Efficient and effective passage search via contextualized late interaction over bert. In Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25-30, 2020. 39--48.
[20]
Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. CoRR, Vol. abs/1412.6980 (2015).
[21]
John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28 - July 1, 2001. 282--289.
[22]
Ping Li, Qiang Wu, and Christopher Burges. 2007. McRank: Learning to Rank Using Multiple Classification and Gradient Boosting. NIPS'07 (2007).
[23]
Jimmy Lin, Rodrigo Nogueira, and A. Yates. 2020. Pretrained Transformers for Text Ranking: BERT and Beyond. arXiv:2010.06467 (2020).
[24]
Xin Liu, Qingcai Chen, Chong Deng, Huajun Zeng, Jing Chen, Dongfang Li, and Buzhou Tang. 2018. LCQMC: A Large-scale Chinese Question Matching Corpus. In Proceedings of the 27th International Conference on Computational Linguistics, COLING 2018, Santa Fe, New Mexico, USA, August 20-26, 2018. 1952--1962.
[25]
Yiding Liu, Weixue Lu, Suqi Cheng, Daiting Shi, Shuaiqiang Wang, Zhicong Cheng, and Dawei Yin. 2021. Pre-trained Language Model for Web-scale Retrieval in Baidu Search. In KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, Singapore, August 14-18, 2021. 3365--3375.
[26]
Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Xiang Ji, and Xueqi Cheng. 2021. PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval. In WSDM '21, The Fourteenth ACM International Conference on Web Search and Data Mining, Virtual Event, Israel, March 8-12, 2021. 283--291.
[27]
Xinyu Ma, Jiafeng Guo, Ruqing Zhang, Yixing Fan, Yingyan Li, and Xueqi Cheng. 2021. B-PROP: Bootstrapped Pre-training with Representative Words Prediction for Ad-hoc Retrieval. In SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11-15, 2021. 1318--1327.
[28]
Yoshitomo Matsubara, Thuy Vu, and Alessandro Moschitti. 2020. Reranking for Efficient Transformer-based Answer Selection. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25-30, 2020. 1577--1580.
[29]
Ryan McDonald, George Brokos, and Ion Androutsopoulos. 2018. Deep Relevance Ranking Using Enhanced Document-Query Interactions. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018. 1849--1860.
[30]
Jonas Mueller and Aditya Thyagarajan. 2016. Siamese Recurrent Architectures for Learning Sentence Similarity. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA. 2786--2792.
[31]
Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage Re-ranking with BERT. arXiv:1901.04085 (2019).
[32]
Rodrigo Nogueira, W. Yang, Kyunghyun Cho, and Jimmy Lin. 2019. Multi-Stage Document Ranking with BERT. arXiv:1910.14424 (2019).
[33]
Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Jingfang Xu, and Xueqi Cheng. 2017. Deeprank: A new deep architecture for relevance ranking in information retrieval. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM 2017, Singapore, November 06 - 10, 2017. ACM, 257--266.
[34]
Stephen Robertson, Hugo Zaragoza, and Michael Taylor. 2004. Simple BM25 extension to multiple weighted fields. In Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management, Washington, DC, USA, November 8-13, 2004. 42--49.
[35]
Stephen E. Robertson and Steve Walker. 1994. Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval. In Proceedings of the 17th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, Dublin, Ireland, July 3-6, 1994 (Special Issue of the SIGIR Forum). 232--241.
[36]
Lorenzo Rosasco, Ernesto De Vito, Andrea Caponnetto, Michele Piana, and Alessandro Verri. 2004. Are loss functions all the same? Neural computation (2004).
[37]
Yelong Shen, Xiaodong He, Jianfeng Gao, Li Deng, and Grégoire Mesnil. 2014. A latent semantic model with convolutional-pooling structure for information retrieval. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, CIKM 2014, Shanghai, China, November 3-7, 2014. 101--110.
[38]
Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, and Hua Wu. 2019. ERNIE: Enhanced Representation through Knowledge Integration. arXiv preprint arXiv:1904.09223 (2019).
[39]
Yu Sun, Shuohuan Wang, Yu-Kun Li, Shikun Feng, Hao Tian, Hua Wu, and Haifeng Wang. 2020. ERNIE 2.0: A Continual Pre-Training Framework for Language Understanding. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, New York, NY, USA, February 7-12, 2020. 8968--8975.
[40]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, L. Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA. 5998--6008.
[41]
Lidan Wang, Jimmy J. Lin, and Donald Metzler. 2010. Learning to efficiently rank. In Proceeding of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2010, Geneva, Switzerland, July 19-23, 2010. 138--145.
[42]
Chenyan Xiong, Zhuyun Dai, Jamie Callan, Zhiyuan Liu, and Russell Power. 2017. End-to-end neural ad-hoc ranking with kernel pooling. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Tokyo, Japan, August 7-11, 2017. 55--64.
[43]
Yinfei Yang, Yuan Zhang, Chris Tar, and Jason Baldridge. 2019. PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3-7, 2019. 3685--3690.
[44]
Dawei Yin, Yuening Hu, Jiliang Tang, Tim Daly, Mianwei Zhou, Hua Ouyang, Jianhui Chen, Changsung Kang, Hongbo Deng, Chikashi Nobata, et al. 2016. Ranking relevance in Yahoo search. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016. 323--332.
[45]
Denghui Zhang, Zixuan Yuan, Yanchi Liu, Zuohui Fu, Fuzhen Zhuang, Pengyang Wang, Haifeng Chen, and Hui Xiong. 2020. E-BERT: A phrase and product knowledge enhanced language model for e-commerce. arXiv preprint arXiv:2009.02835 (2020).
[46]
Shiqi Zhao, H. Wang, Chao Li, T. Liu, and Y. Guan. 2011. Automatically Generating Questions from Queries for Community-based Question Answering. In Fifth International Joint Conference on Natural Language Processing, IJCNLP 2011, Chiang Mai, Thailand, November 8-13, 2011. 929--937.
[47]
Xiangyu Zhao, Long Xia, Lixin Zou, Hui Liu, Dawei Yin, and Jiliang Tang. 2020. Whole-Chain Recommendations. In CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, October 19-23, 2020. 1883--1891.
[48]
Zhaohui Zheng, Keke Chen, Gordon Sun, and Hongyuan Zha. 2007. A regression framework for learning ranking functions using relative relevance judgments. In SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Amsterdam, The Netherlands, July 23-27, 2007. 287--294.
[49]
Lixin Zou, Long Xia, Zhuoye Ding, Jiaxing Song, Weidong Liu, and Dawei Yin. 2019. Reinforcement learning to optimize long-term user engagement in recommender systems. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4-8, 2019. 2810--2818.
[50]
Lixin Zou, Shengqiang Zhang, Hengyi Cai, Dehong Ma, Suqi Cheng, Daiting Shi, Shuaiqiang Wang, Zhicong Cheng, and Dawei Yin. 2021. Pre-trained Language Model based Ranking in Baidu Search. In KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, Singapore, August 14-18, 2021. 4014--4022.



    Published In

    SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2022
    3569 pages
    ISBN:9781450387323
    DOI:10.1145/3477495

Publisher

Association for Computing Machinery, New York, NY, United States



    Author Tags

    1. information retrieval
    2. pre-trained language models
    3. web search



    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%


    Cited By

    • (2024) A bias study and an unbiased deep neural network for recommender systems. Web Intelligence 22(1), 15-29. DOI: 10.3233/WEB-230036. Online publication date: 26-Mar-2024.
    • (2024) Towards Robustness Analysis of E-Commerce Ranking System. Companion Proceedings of the ACM Web Conference 2024, 364-373. DOI: 10.1145/3589335.3648335. Online publication date: 13-May-2024.
    • (2023) SelfLRE: Self-refining Representation Learning for Low-resource Relation Extraction. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2364-2368. DOI: 10.1145/3539618.3592058. Online publication date: 19-Jul-2023.
    • (2023) Model-based Unbiased Learning to Rank. Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 895-903. DOI: 10.1145/3539597.3570395. Online publication date: 27-Feb-2023.
    • (2023) COLTR: Semi-Supervised Learning to Rank With Co-Training and Over-Parameterization for Web Search. IEEE Transactions on Knowledge and Data Engineering 35(12), 12542-12555. DOI: 10.1109/TKDE.2023.3270750. Online publication date: 26-Apr-2023.
    • (2022) Pre-trained Language Model-based Retrieval and Ranking for Web Search. ACM Transactions on the Web 17(1), 1-36. DOI: 10.1145/3568681. Online publication date: 20-Dec-2022.
    • (2022) Approximated Doubly Robust Search Relevance Estimation. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 3756-3765. DOI: 10.1145/3511808.3557145. Online publication date: 17-Oct-2022.
