A Bottom-Up OCR System for Mathematical Formulas Recognition

  • Conference paper
Intelligent Computing (ICIC 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4113)

Abstract

An OCR system is presented to understand mathematical formulas in binary printed document images. The system uses a novel component-labeling algorithm to extract local maximum components from the image, and uses these components to locate the mathematical formulas. A character recognition algorithm based on neural networks is then applied. To segment merged characters in the image, a novel segmentation algorithm based on a modified SOM neural network is introduced into the system. By employing an LL(1) grammar, the system converts the recognition results into a LaTeX file.
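
The last stage, turning recognized symbols into LaTeX with an LL(1) grammar, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy grammar, the `tokenize` helper standing in for the recognizer's output, and the `to_latex` entry point are all hypothetical. The actual system presumably works on two-dimensional symbol layouts, whereas this sketch parses a linear token stream with one token of lookahead.

```python
import re


def tokenize(text):
    # Hypothetical stand-in for the recognizer's output: a flat symbol list.
    # Characters outside this small alphabet are silently ignored.
    return re.findall(r"\d+|[A-Za-z]|[-+/^()]", text)


class Parser:
    """Predictive parser for an assumed toy grammar (not the paper's):

        expr   -> term (('+' | '-') term)*
        term   -> factor ('/' factor)*
        factor -> atom ('^' atom)?
        atom   -> NUMBER | LETTER | '(' expr ')'

    Every decision is made from a single lookahead token, as in LL(1).
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def eat(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def expr(self):
        out = self.term()
        while self.peek() in ("+", "-"):
            op = self.eat()
            out = f"{out} {op} {self.term()}"
        return out

    def term(self):
        out = self.factor()
        while self.peek() == "/":
            self.eat("/")
            # A division is emitted as a LaTeX fraction.
            out = f"\\frac{{{out}}}{{{self.factor()}}}"
        return out

    def factor(self):
        base = self.atom()
        if self.peek() == "^":
            self.eat("^")
            return f"{base}^{{{self.atom()}}}"
        return base

    def atom(self):
        tok = self.peek()
        if tok == "(":
            self.eat("(")
            inner = self.expr()
            self.eat(")")
            return f"({inner})"
        if tok is not None and tok.isalnum():
            return self.eat()
        raise SyntaxError(f"unexpected token {tok!r}")


def to_latex(text):
    parser = Parser(tokenize(text))
    result = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing token {parser.peek()!r}")
    return result
```

Under these assumptions, `to_latex("a/b + x^2")` returns `\frac{a}{b} + x^{2}`, and an incomplete input such as `"a +"` raises `SyntaxError`.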

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, W., Li, F., Kong, J., Hou, L., Zhu, B. (2006). A Bottom-Up OCR System for Mathematical Formulas Recognition. In: Huang, DS., Li, K., Irwin, G.W. (eds) Intelligent Computing. ICIC 2006. Lecture Notes in Computer Science, vol 4113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11816157_27

  • DOI: https://doi.org/10.1007/11816157_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-37271-4

  • Online ISBN: 978-3-540-37273-8

  • eBook Packages: Computer Science, Computer Science (R0)
