Modern Theoretical Tools for Designing Information Retrieval System

Published: 14 August 2022

Abstract

In the past decade, deep learning has significantly reshaped the landscape of information retrieval (IR). The community has recently begun to notice the potential dangers of overusing poorly understood mechanisms and oversimplified assumptions to learn patterns and make decisions. In particular, there are growing concerns about the interpretability, reliability, social impact, and long-term utility of real-world IR systems. It has therefore become pressing to give the IR community comprehensive and systematic tools for understanding empirical domain solutions and motivating principled design ideas. We focus on the three pillars of modern IR systems: pattern recognition with deep learning, causal inference analysis, and online decision making (with bandits and reinforcement learning). Our objectives are as follows.
For pattern recognition, we introduce theoretical tools that address expressivity, optimization, generalization, and model diagnostics for widespread domain practices, including models from unsupervised learning, (semi-)supervised learning, meta-learning, and online learning.
For causal inference analysis, we emphasize both learning from observational studies and optimizing online experiment design, leveraging recent theoretical advances from various domains.
Finally, for online decision making (with bandits and reinforcement learning), we aim to resolve both the conceptual and the practical challenges of learning, evaluation, and deployment by introducing powerful tools from robust optimization and optimal control.
Our tutorial is inclusive: we not only cover a broad range of hot topics but, more importantly, substantiate our discussion with production examples from Walmart and Instacart, so that audiences with different backgrounds can learn to apply the tools as instructed. Our tutorial can serve as a guideline for practitioners seeking justifications and principled design ideas, a playbook for researchers bringing their innovations into production IR systems, and an introductory course for those interested in learning the advanced topics and tools of IR.

    Published In

    KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2022
    5033 pages
    ISBN:9781450393850
    DOI:10.1145/3534678
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. information retrieval
    2. theory

    Qualifiers

    • Abstract

    Conference

    KDD '22

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
