Border Noise Removal and Clean Up Based on Retinex Theory

Wagdy, Marian; Faye, Ibrahima; Rohaya, Dayang

doi:10.1007/978-981-4585-18-7_39

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 285)

Abstract

Converting a grayscale or color document image into a binary image is the main step in most Optical Character Recognition (OCR) and document analysis systems. After digitization, document images often suffer from poor contrast, noise, non-uniform lighting, and shadows. In addition, when a book page is digitized with a scanner or a camera, border noise may appear: unwanted text from the adjacent page. In this paper we present a simple and efficient method for cleaning up document images that combines border noise removal with enhancement based on retinex theory and a global threshold. The proposed method produces higher-quality results than previous work.
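
The abstract names two ingredients, retinex-based enhancement and a global threshold, without giving details in this preview. The sketch below is only an illustration of how such a combination can work, not the authors' algorithm. It assumes a single-scale retinex with a plain local mean as the illumination estimate (a Gaussian surround is more common), and it assumes Otsu's method as the global threshold. The function names and the `radius` parameter are hypothetical, and border noise removal is not covered.

```python
import math

def local_mean(img, radius):
    """Illumination estimate: mean over a (2r+1)^2 window, truncated at the image edges."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[yy][xx]
                      for yy in range(max(0, y - radius), min(h, y + radius + 1))
                      for xx in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(window) / len(window)
    return out

def single_scale_retinex(img, radius=2):
    """Reflectance = log(image) - log(illumination), rescaled to integers in 0..255."""
    illum = local_mean(img, radius)
    refl = [[math.log(p + 1.0) - math.log(l + 1.0) for p, l in zip(row, irow)]
            for row, irow in zip(img, illum)]
    lo = min(min(row) for row in refl)
    hi = max(max(row) for row in refl)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat image
    return [[round(255 * (v - lo) / span) for v in row] for row in refl]

def otsu_threshold(img):
    """Global threshold maximising between-class variance (assumed choice: Otsu's method)."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def binarize(img, radius=2):
    """Retinex enhancement followed by one global threshold: 0 = ink, 1 = paper."""
    enhanced = single_scale_retinex(img, radius)
    t = otsu_threshold(enhanced)
    return [[1 if v > t else 0 for v in row] for row in enhanced]

# Synthetic page: brightness rises left to right, with two half-brightness "ink" pixels.
page = [[60 + 16 * x for x in range(12)] for _ in range(5)]
page[2][2] //= 2
page[2][9] //= 2
binary = binarize(page)
```

In the raw synthetic page the ink pixel on the bright side (102) is lighter than the paper on the dark side (60), so no single threshold on the raw values separates ink from paper. Dividing out the illumination estimate in the log domain removes most of that gradient, which is why one global threshold becomes workable afterwards.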

Author information

Authors and Affiliations

Centre of Intelligent Signal and Imaging Research (CISIR), Universiti Teknologi Petronas, Seri Iskandar, Malaysia
Marian Wagdy, Ibrahima Faye & Dayang Rohaya
Department of Computer and Information Sciences, Universiti Teknologi Petronas, Seri Iskandar, Malaysia
Marian Wagdy & Dayang Rohaya
Department of Fundamental and Applied Sciences, Universiti Teknologi Petronas, Seri Iskandar, Malaysia
Ibrahima Faye

Corresponding author

Correspondence to Marian Wagdy.

Editor information

Editors and Affiliations

Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Tutut Herawan
Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Malaysia
Mustafa Mat Deris
School of Information Technology, Deakin University, Burwood, Victoria, Australia
Jemal Abawajy

Cite this paper

Wagdy, M., Faye, I., Rohaya, D. (2014). Border Noise Removal and Clean Up Based on Retinex Theory. In: Herawan, T., Deris, M., Abawajy, J. (eds) Proceedings of the First International Conference on Advanced Data and Information Engineering (DaEng-2013). Lecture Notes in Electrical Engineering, vol 285. Springer, Singapore. https://doi.org/10.1007/978-981-4585-18-7_39

DOI: https://doi.org/10.1007/978-981-4585-18-7_39
Published: 15 December 2013
Publisher Name: Springer, Singapore
Print ISBN: 978-981-4585-17-0
Online ISBN: 978-981-4585-18-7
eBook Packages: Engineering, Engineering (R0)