Abstract
An OCR system is presented for recognizing mathematical formulas in binary images of printed documents. The system uses a novel component-labeling algorithm to extract local maximum components from the image, and uses these components to locate the mathematical formulas. Characters are then recognized by an algorithm based on neural networks. Merged characters in the image are separated by a novel segmentation algorithm based on a modified self-organizing map (SOM) neural network. Finally, using an LL(1) grammar, the system converts the recognition results into a LaTeX file.
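The last stage, turning recognized symbols into LaTeX with an LL(1) grammar, can be illustrated with a minimal sketch. The grammar, the token names (`frac`, `{`, `}`, `^`, `_`) and the function `to_latex` below are assumptions made for illustration; the abstract does not give the grammar the authors actually use. The sketch assumes an earlier layout stage has already flattened the two-dimensional formula into a token sequence, and it parses that sequence by recursive descent with one token of lookahead.

```python
# Hypothetical LL(1) grammar (not taken from the paper):
#   expr   -> term { ("+" | "-" | "=") term }
#   term   -> factor { factor }              # implicit multiplication
#   factor -> atom { ("^" | "_") atom }      # superscripts and subscripts
#   atom   -> SYMBOL | "(" expr ")" | "{" expr "}" | "frac" atom atom
RESERVED = {"+", "-", "=", "^", "_", "(", ")", "{", "}", "frac"}
END = "$"  # end-of-input marker appended by the parser


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + [END]
        self.pos = 0

    def peek(self):
        # The single lookahead token that drives every parsing decision.
        return self.tokens[self.pos]

    def take(self, expected=None):
        tok = self.tokens[self.pos]
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    @staticmethod
    def starts_atom(tok):
        # FIRST(atom): an opening bracket, "frac", or any plain symbol.
        return tok in ("(", "{", "frac") or (tok not in RESERVED and tok != END)

    def expr(self):
        out = self.term()
        while self.peek() in ("+", "-", "="):
            op = self.take()
            out += f" {op} " + self.term()
        return out

    def term(self):
        out = self.factor()
        while self.starts_atom(self.peek()):
            out += " " + self.factor()
        return out

    def factor(self):
        out = self.atom()
        while self.peek() in ("^", "_"):
            op = self.take()
            out += op + "{" + self.atom() + "}"
        return out

    def atom(self):
        tok = self.peek()
        if tok == "(":
            self.take("(")
            inner = self.expr()
            self.take(")")
            return "(" + inner + ")"
        if tok == "{":
            # Invisible grouping emitted by the (assumed) layout stage.
            self.take("{")
            inner = self.expr()
            self.take("}")
            return inner
        if tok == "frac":
            self.take("frac")
            numerator = self.atom()
            denominator = self.atom()
            return "\\frac{" + numerator + "}{" + denominator + "}"
        if tok in RESERVED or tok == END:
            raise SyntaxError(f"unexpected token {tok!r}")
        return self.take()


def to_latex(tokens):
    """Parse a flat token sequence and return a LaTeX string."""
    parser = Parser(tokens)
    out = parser.expr()
    parser.take(END)  # reject trailing, unparsed tokens
    return out


print(to_latex(["x", "^", "2", "+", "frac", "{", "a", "+", "b", "}", "c"]))
```

Because the set of tokens that can start each alternative is disjoint, one token of lookahead is enough to choose a production, which is the property that makes a grammar LL(1) and lets the parser run without backtracking.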
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, W., Li, F., Kong, J., Hou, L., Zhu, B. (2006). A Bottom-Up OCR System for Mathematical Formulas Recognition. In: Huang, DS., Li, K., Irwin, G.W. (eds) Intelligent Computing. ICIC 2006. Lecture Notes in Computer Science, vol 4113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11816157_27
DOI: https://doi.org/10.1007/11816157_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37271-4
Online ISBN: 978-3-540-37273-8
eBook Packages: Computer Science, Computer Science (R0)