Information retrieval: a view from the Chinese IR community

  • Review Article
  • Frontiers of Computer Science

Abstract

During a two-day strategic workshop in February 2018, 22 information retrieval researchers met to discuss the future challenges and opportunities within the field. The outcome is a list of potential research directions, project ideas, and challenges. This report describes the major conclusions reached during the workshop. A key result is that we need to open our minds and embrace a broader IR field by rethinking the definitions of information, retrieval, user, system, and evaluation in IR. By providing detailed discussions on these topics, this report is expected to inspire IR researchers in both academia and industry, and to help the future growth of the IR research community.

Acknowledgements

We would like to thank the Chinese Information Processing Society of China; the CAS Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences; and the ACM SIGIR Beijing Chapter for supporting the strategic workshop. We also thank Professor Bo Zhang (Tsinghua University) and Ming Zhou (Microsoft Research Asia) for contributing the keynotes and valuable discussions in the workshop.

Author information

Corresponding authors

Correspondence to Jiafeng Guo, Yanyan Lan or Yiqun Liu.

Additional information

Zhumin Chen is an associate professor in the School of Computer Science and Technology, Shandong University, China. His research interests include information retrieval and natural language processing. His research is supported by the Natural Science Fund of China, the Key Science and Technology Innovation Project of Shandong Province, etc.

Xueqi Cheng is a full professor and vice director of the Institute of Computing Technology, Chinese Academy of Sciences (CAS), and the director of the CAS Key Laboratory of Network Data Science and Technology, China. His research areas include Web search and data mining, data science, and social media analytics. He is the general secretary of the CCF Committee on Big Data, the vice-chair of the CIPS Committee on Information Retrieval, and the general co-chair of SIGIR’20 and WSDM’15. He is the principal investigator of more than 10 major research projects funded by NSFC and MOST. He was awarded the NSFC Distinguished Youth Scientist (2014), the National Prize for Progress in Science and Technology (2012), and the China Youth Science and Technology Award (2011).

Shoubin Dong received the PhD degree in electronic engineering from the University of Science and Technology of China (USTC), China in 1994. She was a visiting scholar at the School of Computer Science and Language Technology Institute, Carnegie Mellon University (CMU), Pittsburgh, USA from 2001 to 2002. She is now a professor with the School of Computer Science and Engineering, South China University of Technology (SCUT), China. Her research interests include information retrieval, natural language processing, high-performance computing, etc.

Zhicheng Dou is currently a professor at the School of Information, Renmin University of China, China. He received his PhD and BS degrees in computer science and technology from Nankai University, China in 2008 and 2003, respectively. He worked at Microsoft Research Asia from July 2008 to September 2014. His current research interests are information retrieval, natural language processing, and big data analytics.

Jiafeng Guo is a professor at the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences, China. He has worked on a number of topics related to web search and data mining. His current research is focused on representation learning and neural models for information retrieval and filtering. He has published more than 80 papers in several top conferences and journals. His work on IR has received the Best Paper Award at ACM CIKM (2011), the Best Student Paper Award at ACM SIGIR (2012), and the Best Full Paper Runner-up Award at ACM CIKM (2017). Moreover, he has served as a PC member for prestigious conferences including SIGIR, WWW, KDD, WSDM, and ACL, and as an associate editor of TOIS.

Xuanjing Huang is a professor in the School of Computer Science, Fudan University, China. Her research interests include natural language processing, information retrieval, artificial intelligence, deep learning, and data-intensive computing. She has published more than 100 papers in major conferences including ACL, SIGIR, IJCAI, AAAI, NIPS, ICML, CIKM, EMNLP, WSDM, and COLING. In the research community, she has served as the PC Co-Chair of CCL 2019, NLPCC 2017, CCL 2016, SMP 2015, and SMP 2014, the organizer of WSDM 2015, the competition chair of CIKM 2014, the tutorial chair of COLING 2010, and an SPC or PC member of past WSDM, SIGIR, WWW, CIKM, ACL, IJCAI, KDD, EMNLP, COLING, and many other conferences.

Yanyan Lan is a professor at the Institute of Computing Technology, Chinese Academy of Sciences, China. She leads a research group working on Big Data and Machine Learning. Her current research interests include machine learning, information retrieval, and natural language processing. From April 2018 to March 2019, she was a visiting scholar in the Department of Statistics, UC Berkeley. She has published over 70 papers at top conferences including ICML, NIPS, SIGIR, WWW, etc. Her paper entitled “Top-k Learning to Rank: Labeling, Ranking, and Evaluation” won the Best Student Paper Award of SIGIR 2012, and the paper entitled “Learning Visual Features from Snapshots for Web Search” won the Best Paper Runner-up Award of CIKM 2017.

Chenliang Li received his PhD from Nanyang Technological University, Singapore in 2013. Currently, he is an associate professor at the School of Cyber Science and Engineering, Wuhan University, China. His research interests include information retrieval, text/web mining, data mining, and natural language processing. He is a co-recipient of the Best Student Paper Award Honorable Mention at ACM SIGIR 2016, and serves as an editorial board member of JASIST and IPM.

Ru Li is a professor and PhD supervisor. She is the vice dean of the School of Computer and Information Technology and the School of Big Data of Shanxi University, a standing council member of the Chinese Information Processing Society (CIPS), and a member of the CIPS committees on Computational Linguistics, Information Retrieval, and Language and Knowledge Computing. Her research interests include Chinese information processing and information retrieval. She has published more than 70 papers in important international and national academic journals and conferences, including the IEEE Transactions on Knowledge and Data Engineering, the Annual Meeting of the Association for Computational Linguistics, and the International Conference on Computational Linguistics. She has won three Second Prizes for scientific and technological progress in Shanxi.

Tie-Yan Liu is an assistant managing director of Microsoft Research Asia, a fellow of the IEEE, and a distinguished member of the ACM. He is an adjunct professor at Carnegie Mellon University (CMU) and Tsinghua University. His research interests include learning to rank, deep learning, reinforcement learning, and distributed learning. He has served as general chair, program committee chair, local chair, or area chair for a dozen international conferences including WWW/WebConf, SIGIR, KDD, ICML, NIPS, IJCAI, AAAI, ACL, and ICTIR, as well as associate editor of ACM Transactions on Information Systems, ACM Transactions on the Web, and Neurocomputing.

Yiqun Liu is a professor and department co-chair at the Department of Computer Science and Technology, Tsinghua University, China. His major research interests are in Web search, user behavior analysis, and natural language processing. He is also a visiting research professor at the National University of Singapore, a visiting professor at the National Institute of Informatics (NII) of Japan, and a member of the Tiangong AI Research Center, founded by Tsinghua and Sogou Inc.

Jun Ma received his BSc, MSc, and PhD degrees from Shandong University in China, and Ibaraki University and Kyushu University in Japan, respectively. He is a full professor at Shandong University. He was a senior researcher in the Department of Computer Science at Ibaraki University in 1994, and at the German GMD and Fraunhofer from 1999 to 2003. His research interests include information retrieval, Web data mining, recommendation systems, and machine learning. He has published more than 150 papers in international journals and conferences, including SIGIR, WWW, MM, TOIS, and TKDE. He is a member of the ACM and IEEE.

Bing Qin is a professor and doctoral supervisor in the School of Computer Science and Technology at Harbin Institute of Technology, China. Her main research directions are natural language processing, information extraction, text mining, sentiment analysis, etc. She has published more than 80 papers in top international journals and conferences such as ACL, COLING, EMNLP, IEEE TKDE, IEEE TASLP, etc. She has led several projects of the National Natural Science Foundation of China, as well as key research and development projects of the National Science and Technology Ministry. She was awarded the first prize of the Qian Weichang Chinese Information Processing Science and Technology Award by the Chinese Information Processing Society and the second prize of the Heilongjiang Province Technical Invention award.

Mingwen Wang is currently a professor at Jiangxi Normal University, China. He received his BS (1985) and MS (1988) degrees in mathematics from Jiangxi Normal University, China, and his PhD (2000) degree in computer science from Shanghai Jiaotong University, China. His research interests include information retrieval, natural language processing, and machine learning.

Jirong Wen is a professor and the dean of the School of Information, Renmin University of China (RUC), China. He is also the Executive Dean of the Gaoling School of Artificial Intelligence and Director of the Beijing Key Laboratory of Big Data Research. He received his PhD degree in 1999 from the Institute of Computing Technology, Chinese Academy of Sciences, China. His main research interests include information retrieval, data mining, and machine learning. He worked at Microsoft Research Asia (MSRA) for 14 years, where he was the group manager of the Web Search and Mining Group.

Jun Xu is a professor with the School of Information, Renmin University of China, China. His research interests include learning to rank and semantic matching in search. He has published more than 50 papers in international conferences (e.g., SIGIR, WSDM) and journals (e.g., ACM TOIS, IEEE TKDE). He has won the Best Paper Award in AIRS (2010), Best Paper Runner-up in CIKM (2017), and Test of Time Award Honorable Mention in SIGIR (2019). He is serving as SPC for SIGIR, WWW, AAAI, and ACML, editorial board member for JASIST, and associate editor for ACM TIST.

Min Zhang is a tenured associate professor in the Department of Computer Science & Technology, Tsinghua University, China, who specializes in Web search, recommendation, and user modeling. She is the vice director of the Intelligent Technology & Systems lab at the CS Dept., and vice director of Intelligent Information Acquisition, AI Institute, Tsinghua University. She also serves as an ACM SIGIR Executive Committee Member, Associate Editor for the ACM Transactions on Information Systems (TOIS), Web Mining and Content Analysis Track Chair of TheWebConf 2020, Short Paper Chair of SIGIR 2018, Program Chair of WSDM 2017, etc. She also holds 12 patents.

Peng Zhang is an associate professor and Vice Dean of the School of Computer Science and Technology, College of Computing and Intelligence, Tianjin University, China. He obtained his PhD at Robert Gordon University, United Kingdom in 2013. His research is focused on tensor space language models, explainable neural network design, and quantum theory inspired language models. He has published papers in refereed journals such as IEEE TNN, IEEE TKDE, ACM TIST, ACM TALIP, JASIST, and IP&M, and at top conferences such as NeurIPS, SIGIR, ACL, AAAI, IJCAI, CIKM, WWW, and EMNLP. He won the ECIR 2011 Best Poster Award and the SIGIR 2017 Best Paper Award Honorable Mention.

Qi Zhang received his PhD degree in computer science from Fudan University, China. He is a professor of computer science at Fudan University, China. His research interests include natural language processing and information retrieval.

Cite this article

Chen, Z., Cheng, X., Dong, S. et al. Information retrieval: a view from the Chinese IR community. Front. Comput. Sci. 15, 151601 (2021). https://doi.org/10.1007/s11704-020-9159-0
