Automatic dewarping of camera-captured comic document images

Garai, Arpan; Dutta, Arpita; Biswas, Samit

doi:10.1007/s11042-022-13234-y

Published: 11 June 2022

Volume 82, pages 1537–1552, (2023)

Multimedia Tools and Applications

Abstract

People often capture document images with the cameras built into smartphones. As a result, different types of distorted images are generated. Warping is one of the major problems found in those distorted document images. Most existing dewarping techniques rely on the text lines present in the documents. Comic documents, on the other hand, contain fewer text lines, so existing dewarping methods do not achieve high accuracy on comic document images. Here, we propose a novel dewarping technique for warped comic document images. First, a simple mathematical model is proposed for warping generation in comic documents. With this model, we show that warping depends on certain factors. We estimate those factors from the boundaries of the panels present in a comic document image. Finally, based on those factors, we dewarp the document image. Unlike the existing methods, the proposed approach can rectify warping in both the horizontal and vertical directions. Moreover, the proposed approach can dewarp document images having multiple folds. We also evaluate the proposed approach, and the results are quite encouraging.
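The abstract does not state the warping model or the factors it uses, so the following is only an illustrative sketch of the general idea of estimating a distortion from a panel boundary and undoing it. It assumes, purely for illustration, that the warp is a per-column vertical displacement that can be approximated by a low-degree polynomial fitted to the top border of one panel. The function names `estimate_vertical_shift` and `dewarp_columns`, the polynomial degree, and the synthetic test image are all hypothetical and are not taken from the paper.

```python
import numpy as np

def estimate_vertical_shift(boundary_y, degree=2):
    """Fit a polynomial to a panel's top-border row (one value per column)
    and return an integer per-column offset from the curve's lowest value.
    ASSUMPTION: a polynomial displacement model, not the authors' model."""
    x = np.arange(len(boundary_y))
    coeffs = np.polyfit(x, boundary_y, degree)
    fitted = np.polyval(coeffs, x)
    return np.rint(fitted - fitted.min()).astype(int)

def dewarp_columns(image, shift, fill=255):
    """Move every column up by its estimated offset so that the fitted
    border becomes (approximately) a straight horizontal line."""
    out = np.full_like(image, fill)
    height = image.shape[0]
    for col, s in enumerate(shift):
        if s < height:
            out[:height - s, col] = image[s:, col]
    return out

# Synthetic example: a white page whose single dark border is bent
# into a parabola (a stand-in for a warped panel edge).
height, width = 40, 60
cols = np.arange(width)
border_rows = 5 + np.rint(0.01 * (cols - 30) ** 2).astype(int)
warped = np.full((height, width), 255, dtype=np.uint8)
warped[border_rows, cols] = 0

# Locate the border as the first dark pixel in each column.
rows_before = np.argmax(warped == 0, axis=0)
shift = estimate_vertical_shift(rows_before)
flat = dewarp_columns(warped, shift)
rows_after = np.argmax(flat == 0, axis=0)

print("border spread before:", rows_before.max() - rows_before.min())
print("border spread after: ", rows_after.max() - rows_after.min())
```

This sketch handles displacement in the vertical direction only and uses a single boundary. Correcting both directions or multiple folds, as the abstract describes, would need more than this.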

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology, Delhi, New Delhi, 110016, Delhi, India
Arpan Garai
Department of Computer Science and Technology, Indian Institute of Engineering Science and Technology, Shibpur, Howrah, 711103, West Bengal, India
Arpita Dutta & Samit Biswas

Corresponding author

Correspondence to Arpan Garai.

Ethics declarations

This work is not funded by any funding agency. To the best of our knowledge, this work does not have any financial and/or non-financial conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Garai, A., Dutta, A. & Biswas, S. Automatic dewarping of camera-captured comic document images. Multimed Tools Appl 82, 1537–1552 (2023). https://doi.org/10.1007/s11042-022-13234-y

Received: 27 October 2021
Revised: 14 January 2022
Accepted: 15 May 2022
Published: 11 June 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s11042-022-13234-y
