Understanding the role of human-inspired heuristics for retrieval models

  • Research Article
Frontiers of Computer Science

Abstract

Relevance estimation is one of the core concerns of information retrieval (IR) research. Although existing retrieval models have achieved considerable success, both in deepening our understanding of information seeking behavior and in building effective retrieval systems, they work quite differently from how humans make relevance judgments. Users’ information seeking behavior involves complex cognitive processes, yet most of these behavior patterns are not considered in existing retrieval models. To bridge the gap between actual user behavior and retrieval models, it is essential to systematically investigate users’ cognitive behavior during relevance judgment and to incorporate the resulting heuristics into retrieval models. In this paper, we formally define a set of basic user reading heuristics for relevance judgment and investigate corresponding modeling strategies for retrieval models. We then conduct experiments to evaluate how effectively each reading heuristic improves ranking performance. Based on a large-scale Web search dataset, we find that most reading heuristics improve the performance of retrieval models, and we establish guidelines for designing retrieval models with human-inspired heuristics. Our study sheds light on building retrieval models from the perspective of cognitive behavior.

References

  1. Li X S, Mao J X, Wang C, Liu Y Q, Zhang M, Ma S P. Teach machine how to read: reading behavior inspired relevance estimation. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019, 795–804

  2. Pang L, Lan Y Y, Guo J F, Xu J, Cheng X Q. DeepRank: a new deep architecture for relevance ranking in information retrieval. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. 2017

  3. Hu B T, Lu Z D, Li H, Chen Q C. Convolutional neural network architectures for matching natural language sentences. In: Proceedings of the 27th International Conference on Neural Information Processing Systems. 2014, 2042–2050

  4. Huang P S, He X D, Gao J F, Deng L, Acero A, Heck L. Learning deep structured semantic models for Web search using clickthrough data. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge Management. 2013, 2333–2338

  5. Li X S, Liu Y Q, Mao J X, He Z X, Zhang M, Ma S P. Understanding reading attention distribution during relevance judgement. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. 2018, 733–742

  6. Fan Y X, Guo J F, Lan Y Y, Xu J, Zhai C X, Cheng X Q. Modeling diverse relevance patterns in ad-hoc retrieval. In: Proceedings of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval. 2018, 375–384

  7. Robertson S E, Walker S. Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In: Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 1994, 232–241

  8. Guo J F, Fan Y X, Ai Q Y, Croft W B. A deep relevance matching model for ad-hoc retrieval. In: Proceedings of the 25th ACM International Conference on Information and Knowledge Management. 2016, 55–64

  9. Xiong C, Dai Z, Callan J, Liu Z, Power R. End-to-end neural ad-hoc ranking with kernel pooling. In: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2017

  10. Ding M, Zhou C, Chen Q B, Yang H X, Tang J. Cognitive graph for multi-hop reading comprehension at scale. 2019, arXiv preprint arXiv:1905.05460

  11. Yu A W, Lee H, Le Q V. Learning to skim text. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 2017, 1880–1890

  12. Fu T J, Ma W Y. Speed reading: learning to read forbackward via shuttle. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018, 4439–4448

  13. Pang L, Lan Y Y, Guo J F, Xu J, Cheng X Q. A deep investigation of deep IR models. 2017, arXiv preprint arXiv:1707.07700

  14. Hui K, Yates A, Berberich K, de Melo G. PACRR: a position-aware neural IR model for relevance matching. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017, 1049–1058

  15. Fang H, Tao T, Zhai C X. Diagnostic evaluation of information retrieval models. ACM Transactions on Information Systems, 2011, 29(2): 7

  16. Fang H, Tao T, Zhai C X. A formal study of information retrieval heuristics. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2004, 49–56

  17. Tao T, Zhai C X. An exploration of proximity measures in information retrieval. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2007, 295–302

  18. Hahn M, Keller F. Modeling human reading with neural attention. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016, 85–95

  19. Liu X G, Mou L L, Cui H T, Lu Z D, Song S. Jumper: learning when to make classification decisions in reading. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence. 2018, 4237–4243

  20. Yu K Y, Liu Y, Schwing A G, Peng J. Fast accurate text classification: skimming, rereading and early stopping. In: Proceedings of the 6th International Conference on Learning Representations. 2018

  21. Wason P C, Evans J S B. Dual Processes in Reasoning? Cognition. Elsevier, 1974, 141–154

  22. Gehring J, Auli M, Grangier D, Yarats D, Dauphin Y N. Convolutional sequence to sequence learning. In: Proceedings of the 34th International Conference on Machine Learning. 2017, 1243–1252

  23. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser Ł, Polosukhin I. Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 5998–6008

  24. Pang L, Lan Y Y, Guo J F, Xu J, Wan S X, Cheng X Q. Text matching as image recognition. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence. 2016

  25. Nie Y F, Li Y, Nie J Y. Empirical study of multi-level convolution models for IR based on representations and interactions. In: Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval. 2018, 59–66

  26. Graves A, Schmidhuber J. Offline handwriting recognition with multidimensional recurrent neural networks. In: Proceedings of the 21st International Conference on Neural Information Processing Systems. 2009, 545–552

  27. Wu H C, Luk R W P, Wong K F, Kwok K L. A retrospective study of a hybrid document-context based retrieval model. Information Processing and Management, 2007, 43(5): 1308–1331

  28. Sutton R S, McAllester D A, Singh S P, Mansour Y. Policy gradient methods for reinforcement learning with function approximation. In: Proceedings of the 12th International Conference on Neural Information Processing Systems. 2000, 1057–1063

  29. Zheng Y K, Fan Z, Liu Y Q, Luo C, Zhang M, Ma S P. Sogou-QCL: a new dataset with click relevance label. In: Proceedings of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval. 2018, 1117–1120

  30. Dehghani M, Zamani H, Severyn A, Kamps J, Croft W B. Neural ranking models with weak supervision. In: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2017, 65–74

  31. Wang C, Liu Y Q, Wang M, Zhou K, Nie J Y, Ma S P. Incorporating non-sequential behavior into click models. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2015, 283–292

  32. Barrett M, Bingel J, Hollenstein N, Rei M, Søgaard A. Sequence classification with human attention. In: Proceedings of the 22nd Conference on Computational Natural Language Learning. 2018, 302–312

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2018YFC0831700), the National Natural Science Foundation of China (Grant Nos. 61732008, 61532011), the Beijing Academy of Artificial Intelligence (BAAI), and the Tsinghua University Guoqiang Research Institute.

Author information

Corresponding author

Correspondence to Yiqun Liu.

Additional information

This article is an extension of Li et al. [1]. Compared with the previous conference version, it systematically introduces reading heuristics for retrieval models. It also includes an extensive study of modeling strategies and experimental results that evaluate the different reading heuristics.

Xiangsheng Li is a PhD student at the Department of Computer Science and Technology in Tsinghua University, China. His research interests include Web search, user behavior analysis and search ranking modeling. He is mainly working on using human-inspired heuristics (e.g., behavior patterns and knowledge information) to design and improve retrieval models.

Yiqun Liu is a professor and co-chair of the Department of Computer Science and Technology at Tsinghua University, China. His major research interests include Web search, user behavior analysis, and Web data mining. He is a senior member of ACM and IEEE, a distinguished member of CCF, a visiting professor at the National Institute of Informatics (NII) in Japan, and a visiting research professor at the National University of Singapore (NUS), Singapore.

Jiaxin Mao works as a postdoctoral research fellow at the Department of Computer Science and Technology at Tsinghua University, China. His research interests are information retrieval, Web mining, and applied machine learning. He has published more than 30 papers in top-tier conferences and journals including SIGIR, TOIS, WWW, KDD, IJCAI, WSDM, and CIKM. He is the recipient of the SIGIR 2018 Best Short Paper Honourable Mention Award, ICTIR 2019 Best Short Paper Honorable Mention Award, and the Chinese Journal of Computers Best Paper Award (2014–2018). He also serves as the ACM SIGIR Student Affairs Co-Chair.

Cite this article

Li, X., Liu, Y. & Mao, J. Understanding the role of human-inspired heuristics for retrieval models. Front. Comput. Sci. 16, 161305 (2022). https://doi.org/10.1007/s11704-020-0016-y

  • DOI: https://doi.org/10.1007/s11704-020-0016-y
