Quantitative analysis of mathematical documents

Uchida, S.; Nomura, A.; Suzuki, M.

doi:10.1007/s10032-005-0142-y

S. Uchida¹,
A. Nomura² &
M. Suzuki²

Abstract.

Mathematical documents are analyzed from several viewpoints for the development of practical OCR for mathematical and other scientific documents. Specifically, four viewpoints are quantified using a large-scale database of mathematical documents, containing 690,000 manually ground-truthed characters: (i) the number of character categories, (ii) abnormal characters (e.g., touching characters), (iii) character size variation, and (iv) the complexity of the mathematical expressions. The results of these analyses clarify the difficulties of recognizing mathematical documents and suggest several promising directions for overcoming them.

Author information

Authors and Affiliations

Department of Intelligent Systems, Kyushu University, 6-10-1, Hakozaki, Higashi-ku, Fukuoka-shi, Japan
S. Uchida
Department of Mathematics, Kyushu University, 6-10-1, Hakozaki, Higashi-ku, Fukuoka-shi, Japan
A. Nomura & M. Suzuki

Corresponding author

Correspondence to S. Uchida.

Additional information

Received: 3 March 2004, Accepted: 5 January 2005, Published online: 29 June 2005

About this article

Cite this article

Uchida, S., Nomura, A. & Suzuki, M. Quantitative analysis of mathematical documents. IJDAR 7, 211–218 (2005). https://doi.org/10.1007/s10032-005-0142-y

Issue Date: September 2005
DOI: https://doi.org/10.1007/s10032-005-0142-y
