NMF-Based Approach to Font Classification of Printed English Alphabets for Document Image Understanding

  • Conference paper
Modeling Decisions for Artificial Intelligence (MDAI 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3558)

Abstract

This paper proposes an approach to font classification for document image understanding using non-negative matrix factorization (NMF). The basic idea is that the characteristics of each font derive from parts of its individual characters rather than from holistic textures. Spatially localized parts of the font images are extracted automatically using NMF and used as features representing each font. The experiments investigate the distribution of these features and how appropriate they are for characterizing each font. In addition, the proposed method is compared with a method based on principal component analysis (PCA), and various distance metrics are tested in the feature space. The proposed method is expected to improve the performance of optical character recognition (OCR) systems or document indexing and retrieval systems that adopt the proposed font classifier as a preprocessor.
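As a rough illustration of the parts-based idea, the sketch below factors a non-negative matrix of vectorized character images as V ≈ WH using the standard multiplicative updates for the Euclidean objective, then encodes an image against the learned basis. It is a minimal sketch and not the authors' implementation: the function names `nmf` and `encode`, the rank, the iteration count, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (pixels x images) as V ~ W @ H.

    Columns of W act as non-negative "parts"; columns of H are the
    per-image feature vectors. Uses multiplicative updates for the
    squared Euclidean error (hypothetical helper, not from the paper).
    """
    rng = np.random.default_rng(seed)
    n_pixels, n_images = V.shape
    W = rng.random((n_pixels, rank)) + eps
    H = rng.random((rank, n_images)) + eps
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio,
        # so W and H stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def encode(W, v, n_iter=200, eps=1e-9):
    """Non-negative coefficients of one image v against a fixed basis W."""
    h = np.full(W.shape[1], 1.0 / W.shape[1])
    WtW = W.T @ W
    Wtv = W.T @ v
    for _ in range(n_iter):
        h *= Wtv / (WtW @ h + eps)
    return h


# Synthetic stand-in for vectorized 8x8 character images (assumption:
# real input would be binarized or grayscale glyphs, one per column).
rng = np.random.default_rng(1)
V = rng.random((64, 3)) @ rng.random((3, 40))

W, H = nmf(V, rank=3)
h_new = encode(W, V[:, 0])
print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```

A classifier could then compare `h_new` with the columns of `H` under a chosen distance metric; which metric works best is the kind of question the paper's experiments address, so none is fixed here.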




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, C.W., Jung, K. (2005). NMF-Based Approach to Font Classification of Printed English Alphabets for Document Image Understanding. In: Torra, V., Narukawa, Y., Miyamoto, S. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2005. Lecture Notes in Computer Science, vol 3558. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11526018_35

  • DOI: https://doi.org/10.1007/11526018_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27871-9

  • Online ISBN: 978-3-540-31883-5

  • eBook Packages: Computer Science, Computer Science (R0)
