Abstract
People often capture document images with smartphone cameras, and the resulting images suffer from several types of distortion. Warping is one of the major problems in such images. Most existing dewarping techniques rely on the text lines present in a document. Comic documents, however, contain few text lines, so existing methods fail to dewarp them accurately. Here, we propose a novel dewarping technique for warped comic document images. First, we propose a simple mathematical model of how warping is generated in comic documents and show that the warping depends on a set of factors. We estimate those factors from the boundaries of the panels present in a comic document image. Finally, based on those factors, we dewarp the document image. Unlike the existing methods, the proposed approach can rectify warping in both the horizontal and vertical directions. Moreover, it can dewarp document images that have multiple folds. We also evaluate the proposed approach, and the results are quite encouraging.
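The idea of estimating warping from panel boundaries can be illustrated with a minimal sketch. It is not the model proposed in the paper: it assumes, purely for illustration, that one panel boundary that should be a horizontal straight line has been sampled as points, that a low-degree polynomial describes its curve, and that shifting each image column vertically is enough to straighten it. The function names `fit_boundary` and `dewarp_columns` and the parameter `target_row` are hypothetical.

```python
import numpy as np

def fit_boundary(xs, ys, degree=2):
    """Fit a polynomial y = f(x) to sampled points of a warped panel boundary.

    Assumption: a low-degree polynomial is an adequate stand-in for the curve.
    """
    return np.poly1d(np.polyfit(xs, ys, degree))

def dewarp_columns(image, boundary, target_row):
    """Shift every column vertically so the fitted boundary lands on target_row.

    This corrects vertical displacement only; it is an illustrative
    simplification, not the paper's dewarping procedure.
    """
    height, width = image.shape[:2]
    out = np.zeros_like(image)
    for x in range(width):
        # Positive shift: the boundary sits below target_row in this column.
        shift = int(round(boundary(x) - target_row))
        src = np.arange(height) + shift          # source row for each output row
        valid = (src >= 0) & (src < height)      # ignore rows that fall outside
        out[valid, x] = image[src[valid], x]
    return out

# Toy example: a curved boundary drawn on a blank 20 x 6 image.
xs = np.arange(6)
ys = 3 + xs * (xs - 1) // 2                      # rows 3, 3, 4, 6, 9, 13
img = np.zeros((20, 6), dtype=np.uint8)
img[ys, xs] = 1
flat = dewarp_columns(img, fit_boundary(xs, ys), target_row=3)
```

In the toy example the curved boundary ends up on a single row of `flat`. A real comic page would need the boundaries to be detected first and, as the abstract notes, correction in the horizontal direction as well.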
Ethics declarations
This work was not funded by any funding agency. To the best of our knowledge, this work does not have any financial or non-financial conflict of interest.
About this article
Cite this article
Garai, A., Dutta, A. & Biswas, S. Automatic dewarping of camera-captured comic document images. Multimed Tools Appl 82, 1537–1552 (2023). https://doi.org/10.1007/s11042-022-13234-y
DOI: https://doi.org/10.1007/s11042-022-13234-y