DOI: 10.1145/3591106.3592259
research-article

Symbol Location-Aware Network for Improving Handwritten Mathematical Expression Recognition

Published: 12 June 2023

ABSTRACT

Most recent handwritten mathematical expression recognition (HMER) methods adopt the attention-based encoder-decoder framework, which generates LaTeX sequences from input images. However, the accuracy of the attention mechanism limits the performance of HMER models, and the lack of global context information during decoding is a further challenge. Some methods adopt symbol-level counting to localize symbols and improve model performance, but these methods do not work well. In this paper, we propose SLAN, short for Symbol Location-Aware Network, to address the HMER problem. Specifically, we propose an advanced relation-level counting method to detect symbols in the image. We address the lack of global context with a new global context-aware decoder. To improve the accuracy of attention, we design a novel attention alignment loss function using a dynamic programming algorithm, which can learn attention alignment directly without pixel-level labels. We conducted extensive experiments on the CROHME dataset to demonstrate the effectiveness of each part of SLAN and achieved state-of-the-art performance.
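As a rough illustration of the attention step that the encoder-decoder framework relies on, the sketch below computes attention weights over encoder feature positions for a single decoding step and forms the resulting context vector. It is not SLAN's implementation: the abstract does not specify the scoring function, so plain dot-product scoring is assumed here, and the name `attend` and the toy feature values are hypothetical.

```python
import math


def attend(query, features):
    """One attention step (illustrative; dot-product scoring is an assumption).

    query:    list of d floats, standing in for the decoder state.
    features: list of feature vectors (d floats each), standing in for
              the positions of a flattened encoder feature map.
    Returns (weights, context).
    """
    # Score each position by its dot product with the query.
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]

    # Numerically stable softmax over positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the feature vectors.
    dim = len(query)
    context = [
        sum(w * feat[i] for w, feat in zip(weights, features))
        for i in range(dim)
    ]
    return weights, context


# Toy input: three positions with two-dimensional features (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([2.0, 0.0], features)
```

If the weights concentrate on the wrong positions, the context vector passed to the decoder describes the wrong part of the image, which is the kind of attention inaccuracy the abstract identifies as a limit on HMER performance.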


Published in

ICMR '23: Proceedings of the 2023 ACM International Conference on Multimedia Retrieval
June 2023
694 pages
ISBN: 9798400701788
DOI: 10.1145/3591106

Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers

• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 254 of 830 submissions, 31%
