Abstract
In this paper, we present a sparse-based denoising algorithm for scanned documents. The method can be applied to any kind of scanned document with satisfactory results. Unlike other approaches, it encodes noisy documents through sparse representation and visual dictionary learning, without assuming any prior noise model. Moreover, we propose an estimator for the precision parameter. Experiments on several datasets demonstrate the robustness of the proposed approach compared with state-of-the-art document denoising methods.
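To make the idea of sparse-representation denoising concrete, the following is a minimal sketch and not the algorithm proposed in this paper. It codes one vectorized patch over a dictionary with a generic orthogonal matching pursuit and stops once the residual norm falls below a tolerance. The function name `omp`, the random (not learned) dictionary, and the assumption that the tolerance `eps` plays the role of a precision parameter are all illustrative choices that the abstract does not specify.

```python
import numpy as np

def omp(D, y, eps, max_atoms=None):
    """Generic orthogonal matching pursuit (illustrative, not the paper's method).

    Greedily selects columns (atoms) of D to approximate y and stops when the
    residual norm is at most eps or max_atoms atoms have been selected.
    """
    if max_atoms is None:
        max_atoms = min(D.shape)
    support = []                      # indices of the selected atoms
    coef = np.zeros(0)
    residual = y.astype(float)
    while np.linalg.norm(residual) > eps and len(support) < max_atoms:
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:              # no new atom helps; stop early
            break
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    if support:
        x[support] = coef
    return x

# Hypothetical demo: a random unit-norm dictionary stands in for a learned one.
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)
clean = 2.0 * D[:, 3] - 1.5 * D[:, 17]          # patch built from two atoms
noisy = clean + 0.05 * rng.standard_normal(16)
code = omp(D, noisy, eps=0.05 * np.sqrt(16))    # tolerance set from the noise level
denoised = D @ code                              # reconstruction from the sparse code
print(np.count_nonzero(code), np.linalg.norm(denoised - clean))
```

In this sketch the tolerance is set from a known noise level. A method that assumes no prior noise model has to estimate that value from the data instead.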
Acknowledgements
This work was partially supported by the European project SCANPLAN (A0806017L), the Spanish ConCORDIA Project (TIN2015-70924-C2-2-R) and the Vietnam National University, Hanoi (VNU) under project number QG.18.04.
Cite this article
Do, T.H., Ramos Terrades, O. & Tabbone, S. DSD: document sparse-based denoising algorithm. Pattern Anal Applic 22, 177–186 (2019). https://doi.org/10.1007/s10044-018-0714-3
DOI: https://doi.org/10.1007/s10044-018-0714-3