
ICDAR 2003 robust reading competitions: entries, results, and future directions

International Journal of Document Analysis and Recognition (IJDAR)

Abstract.

This paper describes the robust reading competitions for ICDAR 2003. With the rapid growth in research over the last few years on recognizing text in natural scenes, there is an urgent need to establish some common benchmark datasets and gain a clear understanding of the current state of the art. We use the term ‘robust reading’ to refer to text images that are beyond the capabilities of current commercial OCR packages. We chose to break down the robust reading problem into three subproblems and run competitions for each stage, and also a competition for the best overall system. The subproblems we chose were text locating, character recognition and word recognition. By breaking down the problem in this way, we hoped to gain a better understanding of the state of the art in each of the subproblems. Furthermore, our methodology involved storing detailed results of applying each algorithm to each image in the datasets, allowing researchers to study in depth the strengths and weaknesses of each algorithm. The text-locating contest was the only one to have any entries. We give a brief description of each entry and present the results of this contest, showing cases where the leading entries succeed and fail. We also describe an algorithm for combining the outputs of the individual text locators and show how the combination scheme improves on any of the individual systems.




Published online: 21 June 2005


Cite this article

Lucas, S.M., Panaretos, A., Sosa, L. et al. ICDAR 2003 robust reading competitions: entries, results, and future directions. IJDAR 7, 105–122 (2005). https://doi.org/10.1007/s10032-004-0134-3


  • DOI: https://doi.org/10.1007/s10032-004-0134-3
