ABSTRACT
One of the challenges of math information retrieval is the inherent ambiguity of mathematical notation. The variety of notations, symbols, and conventions can make math search queries ambiguous, potentially causing confusion and errors. Asking clarifying questions in math search can help remove these ambiguities. Despite advances in incorporating clarifying questions into search, little is currently understood about the characteristics of these questions in math. This paper investigates math clarifying questions asked on the MathStackExchange community question answering platform, analyzing a total of 495,431 clarifying questions and their usefulness. The analysis uncovers specific patterns in useful clarifying questions that provide insight into design considerations for future conversational math search systems. The formulae used in clarifying questions are closely related to those in the initial queries and are accompanied by common phrases that seek missing information about those formulae. Additionally, experiments that use clarifying questions for math search demonstrate the potential benefits of incorporating them alongside the original query.
Index Terms
- Clarifying Questions in Math Information Retrieval