Overview of ARQMath 2020: CLEF Lab on Answer Retrieval for Questions on Math

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12260)

Abstract

The ARQMath Lab at CLEF considers finding answers to new mathematical questions among posted answers on a community question answering site (Math Stack Exchange). Queries are question posts held out from the searched collection, each containing both text and at least one formula. This is a challenging task, as both math and text may be needed to find relevant answer posts. ARQMath also includes a formula retrieval sub-task: individual formulas from question posts are used to locate formulae in earlier question and answer posts, with relevance determined considering the context of the post from which a query formula is taken, and the posts in which retrieved formulae appear.


Notes

  1. https://math.stackexchange.com.

  2. https://www.cs.rit.edu/~dprl/ARQMath.

  3. https://www.w3.org/Math.

  4. https://dlmf.nist.gov.

  5. https://archive.org/download/stackexchange.

  6. https://dlmf.nist.gov/LaTeXML.

  7. https://drive.google.com/drive/folders/1ZPKIWDnhMGRaPNVLi1reQxZWTfH2R4u3.

  8. https://github.com/ARQMath/ARQMathCode.

  9. Note that participating systems did not have access to this information.

  10. https://github.com/hltcoe/turkle.

  11. H+M binarization corresponds to the definition of relevance usually used in the Text Retrieval Conference (TREC). The TREC definition is “If you were writing a report on the subject of the topic and would use the information contained in the document in the report, then the document is relevant. Only binary judgments (“relevant” or “not relevant”) are made, and a document is judged relevant if any piece of it is relevant (regardless of how small the piece is in relation to the rest of the document).” (source: https://trec.nist.gov/data/reljudge_eng.html). A minimal binarization sketch appears after these notes.

  12. Pooling to at least depth 20 ensures that there are no unjudged posts above rank 10 for any primary or secondary submission, and for four of the five baselines. Note, however, that P@10 cannot achieve a value of 1, because some topics have fewer than 10 relevant posts; a short worked example follows these notes.

  13. https://github.com/usnistgov/trec_eval (an example invocation appears after these notes).

  14. One team submitted incorrect post IDs for retrieved formulae; those post IDs were not used for pooling.

  15. See, for example, MathDeck [16], in which candidate formulae are suggested to users during formula editing.
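To make note 11 concrete, here is a minimal Python sketch of H+M binarization, assuming a 0–3 graded assessment scale (3 = High, 2 = Medium, 1 = Low, 0 = Not relevant); the function name and data layout are illustrative, not part of the official evaluation tooling.

    # Hedged sketch of H+M binarization (see note 11).
    # Assumes a 0-3 graded scale: 3 = High, 2 = Medium, 1 = Low, 0 = Not relevant.
    def binarize_hm(grade: int) -> int:
        """High (3) and Medium (2) map to relevant (1); all else to 0."""
        return 1 if grade >= 2 else 0

    # Illustrative (topic, post) judgments, not the official qrels format.
    graded = {("A.1", "post42"): 3, ("A.1", "post7"): 1}
    binary = {key: binarize_hm(g) for key, g in graded.items()}
    print(binary)  # {('A.1', 'post42'): 1, ('A.1', 'post7'): 0}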
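The P@10 ceiling described in note 12 can be seen in a short worked example; this is an illustrative computation, not the official scoring code (trec_eval was used for that, see note 13).

    # Precision@10 under binary (H+M binarized) judgments.
    def precision_at_10(ranked_ids, relevant_ids):
        """Fraction of the top-10 retrieved posts that are judged relevant."""
        return sum(1 for pid in ranked_ids[:10] if pid in relevant_ids) / 10

    # A topic with only 7 relevant posts caps at P@10 = 0.7,
    # even when all 7 are ranked first:
    relevant = {"r1", "r2", "r3", "r4", "r5", "r6", "r7"}
    perfect_run = ["r1", "r2", "r3", "r4", "r5", "r6", "r7", "x1", "x2", "x3"]
    print(precision_at_10(perfect_run, relevant))  # 0.7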
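For note 13, a typical trec_eval invocation for the measures reported here (MAP and P@10) might look as follows; "qrels.txt" and "run.txt" are placeholder file names in the standard TREC qrels and run formats, and trec_eval is assumed to be built and on the PATH.

    # Hedged sketch: calling trec_eval from Python for MAP and P@10.
    import subprocess

    result = subprocess.run(
        ["trec_eval", "-m", "map", "-m", "P.10", "qrels.txt", "run.txt"],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout)  # lines such as "map  all  0.1234" and "P_10  all  0.2000"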

References

  1. Aizawa, A., Kohlhase, M., Ounis, I.: NTCIR-10 math pilot task overview. In: NTCIR (2013)


  2. Aizawa, A., Kohlhase, M., Ounis, I., Schubotz, M.: NTCIR-11 Math-2 task overview. In: NTCIR, vol. 11, pp. 88–98 (2014)


  3. Borlund, P.: The IIR evaluation model: a framework for evaluation of interactive information retrieval systems. Inf. Res. 8(3) (2003)


  4. Buckley, C., Voorhees, E.M.: Retrieval evaluation with incomplete information. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 25–32 (2004)


  5. Davila, K., Zanibbi, R.: Layout and semantics: combining representations for mathematical formula search. In: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1165–1168 (2017)


  6. Guidi, F., Sacerdoti Coen, C.: A survey on retrieval of mathematical knowledge. In: Kerber, M., Carette, J., Kaliszyk, C., Rabe, F., Sorge, V. (eds.) CICM 2015. LNCS (LNAI), vol. 9150, pp. 296–315. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-20615-8_20


  7. Hopkins, M., Le Bras, R., Petrescu-Prahova, C., Stanovsky, G., Hajishirzi, H., Koncel-Kedziorski, R.: SemEval-2019 task 10: math question answering. In: Proceedings of the 13th International Workshop on Semantic Evaluation (2019)


  8. Kaliszyk, C., Brady, E., Kohlhase, A., Sacerdoti Coen, C. (eds.): CICM 2019. LNCS (LNAI), vol. 11617. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-23250-4


  9. Kincaid, J.P., Fishburne Jr., R.P., Rogers, R.L., Chissom, B.S.: Derivation of new readability formulas (automated readability index, fog count and Flesch reading ease formula) for Navy enlisted personnel. Technical report, Naval Technical Training Command Millington TN Research Branch (1975)


  10. Kushman, N., Artzi, Y., Zettlemoyer, L., Barzilay, R.: Learning to automatically solve algebra word problems. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (2014)


  11. Ling, W., Yogatama, D., Dyer, C., Blunsom, P.: Program induction by rationale generation: learning to solve and explain algebraic word problems. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (2017)


  12. Mansouri, B., Agarwal, A., Oard, D., Zanibbi, R.: Finding old answers to new math questions: the ARQMath lab at CLEF 2020. In: Jose, J.M., et al. (eds.) ECIR 2020. LNCS, vol. 12036, pp. 564–571. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-45442-5_73


  13. Mansouri, B., Rohatgi, S., Oard, D.W., Wu, J., Giles, C.L., Zanibbi, R.: Tangent-CFT: an embedding model for mathematical formulas. In: Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval (ICTIR), pp. 11–18 (2019)


  14. Mansouri, B., Zanibbi, R., Oard, D.W.: Characterizing searches for mathematical concepts. In: Joint Conference on Digital Libraries (2019)


  15. Newell, A., Simon, H.: The logic theory machine-a complex information processing system. IRE Trans. Inf. Theory 2, 61–79 (1956)


  16. Nishizawa, G., Liu, J., Diaz, Y., Dmello, A., Zhong, W., Zanibbi, R.: MathSeer: a math-aware search interface with intuitive formula editing, reuse, and lookup. In: Jose, J.M., et al. (eds.) ECIR 2020. LNCS, vol. 12036, pp. 470–475. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-45442-5_60


  17. Ounis, I., Amati, G., Plachouras, V., He, B., Macdonald, C., Johnson, D.: Terrier information retrieval platform. In: Losada, D.E., Fernández-Luna, J.M. (eds.) ECIR 2005. LNCS, vol. 3408, pp. 517–519. Springer, Heidelberg (2005). https://doi.org/10.1007/978-3-540-31865-1_37


  18. Sakai, T., Kando, N.: On information retrieval metrics designed for evaluation with incomplete relevance assessments. Inf. Retrieval 11(5), 447–470 (2008). https://doi.org/10.1007/s10791-008-9059-7


  19. Schubotz, M., Youssef, A., Markl, V., Cohl, H.S.: Challenges of mathematical information retrieval in the NTCIR-11 Math Wikipedia Task. In: SIGIR, pp. 951–954. ACM (2015)


  20. Zanibbi, R., Aizawa, A., Kohlhase, M., Ounis, I., Topic, G., Davila, K.: NTCIR-12 MathIR task overview. In: NTCIR (2016)


  21. Zanibbi, R., Blostein, D.: Recognition and retrieval of mathematical expressions. Int. J. Doc. Anal. Recognit. (IJDAR) 15(4), 331–357 (2012). https://doi.org/10.1007/s10032-011-0174-4


  22. Zhong, W., Rohatgi, S., Wu, J., Giles, C.L., Zanibbi, R.: Accelerating substructure similarity search for formula retrieval. In: Jose, J.M., et al. (eds.) ECIR 2020. LNCS, vol. 12035, pp. 714–727. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-45439-5_47


  23. Zhong, W., Zanibbi, R.: Structural similarity search for formulas using leaf-root paths in operator subtrees. In: Azzopardi, L., Stein, B., Fuhr, N., Mayr, P., Hauff, C., Hiemstra, D. (eds.) ECIR 2019. LNCS, vol. 11437, pp. 116–129. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-15712-8_8



Acknowledgements

Wei Zhong suggested using Math Stack Exchange for benchmarking, made Approach0 available for participants, and provided helpful feedback. Kenny Davila helped with the Tangent-S formula search results. We also thank our student assessors from RIT: Josh Anglum, Wiley Dole, Kiera Gross, Justin Haverlick, Riley Kieffer, Minyao Li, Ken Shultes, and Gabriella Wolf. This material is based upon work supported by the National Science Foundation (USA) under Grant No. IIS-1717997 and the Alfred P. Sloan Foundation under Grant No. G-2017-9827.

Author information

Corresponding author

Correspondence to Douglas W. Oard.


A Appendix: Evaluation Results

Table 4. Task 1 (CQA) results, averaged over 77 topics. P indicates a primary run and M a manual run; baselines were pooled at the primary run depth. For Precision@10 and MAP, H+M binarization was used. The best baseline results are shown in parentheses. * indicates that one baseline did not contribute to judgment pools.
Table 5. Task 2 (Formula Retrieval) results, averaged over 45 topics and computed over deduplicated ranked lists of visually distinct formulae. P indicates a primary run; the baseline was pooled at the primary run depth. For MAP and P@10, relevance was thresholded using H+M binarization. All runs were automatic. Baseline results are shown in parentheses.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Zanibbi, R., Oard, D.W., Agarwal, A., Mansouri, B. (2020). Overview of ARQMath 2020: CLEF Lab on Answer Retrieval for Questions on Math. In: Arampatzis, A., et al. (eds.) Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2020. Lecture Notes in Computer Science, vol. 12260. Springer, Cham. https://doi.org/10.1007/978-3-030-58219-7_15


  • DOI: https://doi.org/10.1007/978-3-030-58219-7_15


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58218-0

  • Online ISBN: 978-3-030-58219-7

  • eBook Packages: Computer Science (R0)
