Processing Mathematical Notation

  • Reference work entry

Abstract

Automated recognition of mathematical notation is required for convenient document search and editing. The recognition problem varies depending on whether the input is a document image, vector graphics such as PDF, or handwritten tablet input. This chapter describes the state of the art in recognition of math notation, discussing the four component problems of expression detection, symbol recognition, layout analysis, and mathematical content interpretation.

Acknowledgements

Financial support from the National Science Foundation, USA (Grant No. IIS-1016815), and the Natural Sciences and Engineering Research Council of Canada is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Dorothea Blostein.

Copyright information

© 2014 Springer-Verlag London

Cite this entry

Blostein, D., Zanibbi, R. (2014). Processing Mathematical Notation. In: Doermann, D., Tombre, K. (eds) Handbook of Document Image Processing and Recognition. Springer, London. https://doi.org/10.1007/978-0-85729-859-1_21
