DOI:10.1145/3578337.3605123
research-article
Open Access

Clarifying Questions in Math Information Retrieval

Published: 09 August 2023

ABSTRACT

One of the challenges of math information retrieval is the inherent ambiguity of mathematical notation. The use of various notations, symbols, and conventions can lead to ambiguities in math search queries, potentially causing confusion and errors. Therefore, asking clarifying questions in math search can help remove these ambiguities. Despite advances in incorporating clarifying questions for search, little is currently understood about the characteristics of these questions in math. This paper investigates math clarifying questions asked on the MathStackExchange community question answering platform, analyzing a total of 495,431 clarifying questions and their usefulness. The results of the analysis uncover specific patterns in useful clarifying questions that provide insight into the design considerations for future conversational math search systems. The formulae used in clarifying questions are closely related to those in the initial queries and are accompanied by common phrases that seek the missing information related to the formulae. Additionally, experiments utilizing clarifying questions for math search demonstrate the potential benefits of incorporating them alongside the original query.

    • Published in

      ICTIR '23: Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval
      August 2023
      300 pages
      ISBN:9798400700736
      DOI:10.1145/3578337

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      ICTIR '23 Paper Acceptance Rate: 30 of 73 submissions, 41%
      Overall Acceptance Rate: 209 of 482 submissions, 43%

    • Article Metrics

      • Downloads (Last 12 months): 103
      • Downloads (Last 6 weeks): 18
