A Fine-tuning Retrieval System for Mathematical Information

  • Conference paper
Proceedings of the Seventh International Conference on Mathematics and Computing

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 1412))

Abstract

The web is a rich repository of mathematical information, but finding relevant documents in such a collection is laborious. Although multiple approaches have been proposed to retrieve relevant documents for a queried formula, their poor scores on evaluation measures expose the limitations of existing systems. To improve the performance of these systems, this paper proposes a novel approach to formula indexing that employs formula embedding and generalization techniques. The formula embedding and generalization modules of the proposed system transform formulas into fixed-size vectors by counting the occurrences of different entities in the formulas. Subsequently, the formula vectors are indexed by an indexer. Documents retrieved by both modules are given higher priority than those retrieved by only one of them. The obtained results have been compared with state-of-the-art approaches, and the comparison shows that the proposed approach gives better retrieval accuracy, with P_5 = 0.522, P_10 = 0.478, P_15 = 0.352, and P_20 = 0.289.
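
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the entity vocabulary, the toy tokenizer, the placeholder scheme used for generalization (VAR, NUM), the use of cosine similarity, and every function name below are assumptions made for illustration. P_k is read here as precision at rank cutoff k, which the abstract does not spell out.

```python
import math
import re
from collections import Counter

# Hypothetical entity vocabulary (not taken from the paper): operators,
# two generalization placeholders, and a couple of concrete symbols.
VOCAB = ["+", "-", "*", "/", "^", "=", "(", ")", "VAR", "NUM", "x", "y"]

# Toy tokenizer: numbers, single letters, and a handful of operators.
TOKEN_RE = re.compile(r"\d+(?:\.\d+)?|[A-Za-z]|[-+*/^=()]")


def tokenize(formula):
    return TOKEN_RE.findall(formula)


def generalize(tokens):
    """Replace concrete variables and numbers with placeholders (assumed scheme)."""
    out = []
    for tok in tokens:
        if tok[0].isdigit():
            out.append("NUM")
        elif tok.isalpha():
            out.append("VAR")
        else:
            out.append(tok)
    return out


def count_vector(tokens, vocab=VOCAB):
    """Fixed-size vector: one occurrence count per vocabulary entity."""
    counts = Counter(tokens)
    return [counts[entity] for entity in vocab]


def embed_exact(formula):
    return count_vector(tokenize(formula))


def embed_general(formula):
    return count_vector(generalize(tokenize(formula)))


def cosine(u, v):
    # Similarity measure chosen for this sketch; the abstract names none.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query, corpus, embed, top_k=10):
    """Rank document ids by similarity between query and formula vectors."""
    q = embed(query)
    ranked = sorted(corpus.items(),
                    key=lambda item: cosine(q, embed(item[1])),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]


def merge(emb_hits, gen_hits):
    """Documents found by both modules come first, then the remaining hits."""
    in_both = [d for d in emb_hits if d in gen_hits]
    return list(dict.fromkeys(in_both + emb_hits + gen_hits))


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k
```

With a toy corpus such as `{"d1": "x^2+y", "d2": "a^2+b", "d3": "x-y"}`, calling `retrieve` once with `embed_exact` and once with `embed_general`, then passing both result lists to `merge`, yields a single ranking in which documents returned by both modules lead; `precision_at_k` can then score that ranking against a set of relevant ids.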

Acknowledgements

The authors would like to express gratitude to the Department of Computer Science and Engineering and Center for Natural Language Processing, National Institute of Technology, Silchar, India, for providing infrastructural facilities and support.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Dadure, P., Pakray, P., Bandyopadhyay, S. (2022). A Fine-tuning Retrieval System for Mathematical Information. In: Giri, D., Raymond Choo, KK., Ponnusamy, S., Meng, W., Akleylek, S., Prasad Maity, S. (eds) Proceedings of the Seventh International Conference on Mathematics and Computing. Advances in Intelligent Systems and Computing, vol 1412. Springer, Singapore. https://doi.org/10.1007/978-981-16-6890-6_81