Bayesian super-resolution of text in video with a text-specific bimodal prior

International Journal of Document Analysis and Recognition (IJDAR)

Abstract.

To increase the range of sizes of video scene text recognizable by optical character recognition (OCR), we developed a Bayesian super-resolution algorithm that uses a text-specific bimodal prior. We evaluated the effectiveness of the bimodal prior, both in comparison with a piecewise smoothness prior and in conjunction with it, visually and by measuring OCR accuracy on the variously super-resolved images. The bimodal prior improved the readability of 4- to 7-pixel-high scene text significantly more than bicubic interpolation did, and it increased OCR accuracy more than the piecewise smoothness prior did.
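
The abstract does not give the form of the bimodal prior, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a one-dimensional signal, a double-well penalty with modes at an assumed background level of 0 and text level of 1, and an assumed imaging model (circular shift followed by box averaging). All function names, the weight `lam`, the step size, and the iteration count are hypothetical choices made for illustration.

```python
def bimodal_energy(x, mu_bg=0.0, mu_fg=1.0):
    """Double-well penalty: zero at either mode, largest midway between them."""
    return sum((v - mu_bg) ** 2 * (v - mu_fg) ** 2 for v in x)

def bimodal_grad(x, mu_bg=0.0, mu_fg=1.0):
    """Derivative of bimodal_energy with respect to each pixel."""
    return [2.0 * (v - mu_bg) * (v - mu_fg) * ((v - mu_bg) + (v - mu_fg))
            for v in x]

def observe(x, shift, factor):
    """Assumed imaging model: circular shift, then box-average each block."""
    n = len(x)
    s = [x[(i - shift) % n] for i in range(n)]
    return [sum(s[j:j + factor]) / factor for j in range(0, n, factor)]

def observe_adjoint(r, shift, factor):
    """Adjoint of observe: spread each residual over its block, undo the shift."""
    n = len(r) * factor
    up = [r[i // factor] / factor for i in range(n)]
    return [up[(i + shift) % n] for i in range(n)]

def objective(x, frames, factor, lam):
    """Squared data error over all frames plus the weighted bimodal penalty."""
    data = 0.0
    for shift, y in frames:
        data += sum((a - b) ** 2 for a, b in zip(observe(x, shift, factor), y))
    return data + lam * bimodal_energy(x)

def map_super_resolve(frames, factor, lam=0.5, step=0.1, iters=300):
    """Gradient descent on the MAP objective, started from averaged replication."""
    n = len(frames[0][1]) * factor
    x = [0.0] * n
    for shift, y in frames:
        rep = observe_adjoint(y, shift, factor)
        x = [xi + factor * ri / len(frames) for xi, ri in zip(x, rep)]
    for _ in range(iters):
        grad = [lam * g for g in bimodal_grad(x)]
        for shift, y in frames:
            res = [a - b for a, b in zip(observe(x, shift, factor), y)]
            back = observe_adjoint(res, shift, factor)
            grad = [g + 2.0 * b for g, b in zip(grad, back)]
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x

# Two low-resolution frames of a binary "stroke", offset by one fine pixel.
truth = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
frames = [(s, observe(truth, s, 2)) for s in (0, 1)]
estimate = map_super_resolve(frames, factor=2)
```

In this toy setting the two frames alone leave an alternating pattern undetermined, and the penalty is what pushes each pixel toward one of the two intensity levels. A larger `lam` trusts the two-level assumption more and the observed frames less.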

About this article

Cite this article

Donaldson, K., Myers, G.K. Bayesian super-resolution of text in video with a text-specific bimodal prior. IJDAR 7, 159–167 (2005). https://doi.org/10.1007/s10032-004-0139-y

  • DOI: https://doi.org/10.1007/s10032-004-0139-y
