Multi-lingual Offline Handwriting Recognition Using Hidden Markov Models: A Script-Independent Approach

  • Conference paper
Arabic and Chinese Handwriting Recognition (SACH 2006)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4768)

Abstract

This paper introduces a script-independent methodology for multi-lingual offline handwriting recognition (OHR) based on Hidden Markov Models (HMMs). The OHR methodology extends our script-independent approach for OCR of machine-printed text images. The feature extraction, training, and recognition components of the system are all designed to be script independent. The HMM training and recognition components are based on our Byblos continuous speech recognition system. The HMM parameters are estimated automatically from the training data, without the need for laborious hand-crafted rules. The system does not require pre-segmentation of the data at either the word level or the character level. Thus, the system can handle languages with cursive handwritten scripts in a straightforward manner. The script independence of the system is demonstrated with experimental results in three scripts that exhibit significant differences in glyph characteristics: English, Chinese, and Arabic. Results from an initial set of experiments are presented to demonstrate the viability of the proposed methodology.
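
As a rough illustration of why an HMM recognizer needs no pre-segmentation, the sketch below runs Viterbi decoding over a toy model and reads the character boundaries off the best state path. It is a minimal sketch under stated assumptions, not the Byblos-based system described in the paper: the two single-state "characters", the discrete observation symbols, all probabilities, and the names `viterbi` and `segments` are hypothetical and chosen only for illustration. A real system would use multi-state character models over continuous feature vectors.

```python
import math

def viterbi(obs, states, log_init, log_trans, log_emit):
    """Return the most likely state sequence for the observation sequence."""
    # delta[s]: best log score of any path that ends in state s at this frame
    delta = {s: log_init[s] + log_emit[s][obs[0]] for s in states}
    backptrs = []
    for o in obs[1:]:
        new_delta, ptr = {}, {}
        for s in states:
            # Best predecessor of s, scored by path score plus transition
            best_prev = max(states, key=lambda p: delta[p] + log_trans[p][s])
            new_delta[s] = delta[best_prev] + log_trans[best_prev][s] + log_emit[s][o]
            ptr[s] = best_prev
        backptrs.append(ptr)
        delta = new_delta
    # Trace back from the best final state
    path = [max(states, key=lambda s: delta[s])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

def segments(path):
    """Collapse a state path into (label, start_frame, end_frame) runs."""
    out, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            out.append((path[start], start, t))
            start = t
    return out

# Hypothetical toy model: one state per character, two discrete frame symbols.
STATES = ["a", "b"]
LOG_INIT = {"a": math.log(0.5), "b": math.log(0.5)}
LOG_TRANS = {
    "a": {"a": math.log(0.8), "b": math.log(0.2)},
    "b": {"a": math.log(0.2), "b": math.log(0.8)},
}
LOG_EMIT = {
    "a": {0: math.log(0.9), 1: math.log(0.1)},
    "b": {0: math.log(0.1), 1: math.log(0.9)},
}
OBS = [0, 0, 0, 1, 1, 0, 0]  # unsegmented frame sequence

print(segments(viterbi(OBS, STATES, LOG_INIT, LOG_TRANS, LOG_EMIT)))
# prints [('a', 0, 3), ('b', 3, 5), ('a', 5, 7)]
```

The frame sequence is never split into characters beforehand. The boundaries at frames 3 and 5 come out of the decoding itself, which is the property that makes this family of models convenient for cursive scripts.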

Editor information

David Doermann, Stefan Jaeger

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Natarajan, P., Saleem, S., Prasad, R., MacRostie, E., Subramanian, K. (2008). Multi-lingual Offline Handwriting Recognition Using Hidden Markov Models: A Script-Independent Approach. In: Doermann, D., Jaeger, S. (eds) Arabic and Chinese Handwriting Recognition. SACH 2006. Lecture Notes in Computer Science, vol 4768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78199-8_14

  • DOI: https://doi.org/10.1007/978-3-540-78199-8_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78198-1

  • Online ISBN: 978-3-540-78199-8

  • eBook Packages: Computer Science, Computer Science (R0)
