Abstract
Although font usage plays an important role in Document Image Analysis (DIA), recognition systems generally treat font management in a weaker sense than the document production cycle does. From the point of view of the document recognition community, we show how typographic information (character bitmaps, metrics, etc.) can improve existing analysis methods. After a brief survey of font recognition issues, we present the advantages of font software support in the design of recognition systems. Concrete algorithms are proposed for a posteriori font recognition, monofont Optical Character Recognition (OCR), and word segmentation. The reported experiments and results indicate that substantial benefits can still be expected from the design of typography-aware analyzers.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bapst, F., Ingold, R. (1998). Using typography in document image analysis. In: Hersch, R.D., André, J., Brown, H. (eds) Electronic Publishing, Artistic Imaging, and Digital Typography. RIDT 1998. Lecture Notes in Computer Science, vol 1375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0053274
DOI: https://doi.org/10.1007/BFb0053274
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64298-5
Online ISBN: 978-3-540-69718-3
eBook Packages: Springer Book Archive