Using typography in document image analysis

  • Part I: RIDT'98
  • Conference paper
Electronic Publishing, Artistic Imaging, and Digital Typography (RIDT 1998)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1375))

Abstract

Although font usage plays an important role in Document Image Analysis (DIA), recognition systems generally treat font management in a weaker sense than the document production cycle does. From the point of view of the document recognition community, we show how typographic information (character bitmaps, metrics, etc.) can improve existing analysis methods. After a brief survey of font recognition issues, we present the advantages of font software support in the design of recognition systems. Concrete algorithms are proposed for a posteriori font recognition, monofont Optical Character Recognition (OCR), and word segmentation. The reported experiments and results indicate that substantial benefits can still be expected from the design of typography-aware analyzers.

Editor information

Roger D. Hersch, Jacques André, Heather Brown

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bapst, F., Ingold, R. (1998). Using typography in document image analysis. In: Hersch, R.D., André, J., Brown, H. (eds) Electronic Publishing, Artistic Imaging, and Digital Typography. RIDT 1998. Lecture Notes in Computer Science, vol 1375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0053274

  • DOI: https://doi.org/10.1007/BFb0053274

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64298-5

  • Online ISBN: 978-3-540-69718-3

  • eBook Packages: Springer Book Archive
