Abstract.
To extend the range of text sizes in video scenes that optical character recognition (OCR) can recognize, we developed a Bayesian super-resolution algorithm that uses a text-specific bimodal prior. We evaluated the bimodal prior, both against and in combination with a piecewise smoothness prior, visually and by measuring OCR accuracy on the resulting super-resolved images. The bimodal prior improved the readability of 4- to 7-pixel-high scene text significantly more than bicubic interpolation did, and it increased OCR accuracy more than the piecewise smoothness prior did.
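The abstract does not give the form of the bimodal prior, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a quartic penalty with minima at two hypothetical intensity modes (background 0.0, text 1.0), a single low-resolution frame, and a box-average observation model; the function names, the weight lam, and the step size are all assumptions made for this example. A maximum a posteriori (MAP) estimate is then approximated by projected gradient descent on the sum of the data term and the prior term.

```python
import numpy as np

def bimodal_penalty(x, mu_bg=0.0, mu_fg=1.0):
    """Assumed prior energy: zero at the two modes, positive elsewhere."""
    return (x - mu_bg) ** 2 * (x - mu_fg) ** 2

def bimodal_gradient(x, mu_bg=0.0, mu_fg=1.0):
    """Derivative of the quartic penalty with respect to x."""
    return 2.0 * (x - mu_bg) * (x - mu_fg) * (2.0 * x - mu_bg - mu_fg)

def downsample(x, factor):
    """Assumed observation model: mean over factor-by-factor blocks."""
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def downsample_adjoint(r, factor):
    """Adjoint of block averaging: spread each value evenly over its block."""
    return np.kron(r, np.ones((factor, factor))) / factor ** 2

def map_super_resolve(y, factor=2, lam=0.5, step=0.5, iters=200):
    """Minimise ||D x - y||^2 + lam * sum(bimodal_penalty(x)) over x in [0, 1]."""
    x = np.kron(y, np.ones((factor, factor)))  # pixel-replication start
    for _ in range(iters):
        residual = downsample(x, factor) - y
        grad = 2.0 * downsample_adjoint(residual, factor) + lam * bimodal_gradient(x)
        x = np.clip(x - step * grad, 0.0, 1.0)  # keep intensities in range
    return x

# Toy low-resolution patch with intensities between the two modes.
y = np.array([[0.3, 0.7],
              [0.8, 0.2]])
x_hat = map_super_resolve(y)
```

With only one frame and no sub-pixel motion, this sketch mainly pushes gray values toward the two modes while the data term holds block averages near the observations. A multi-frame video setting would add a warp per frame to the observation model, which is omitted here.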
Cite this article
Donaldson, K., Myers, G.K. Bayesian super-resolution of text in video with a text-specific bimodal prior. IJDAR 7, 159–167 (2005). https://doi.org/10.1007/s10032-004-0139-y