Automatic Detection of Comic Characters: An Analysis of Model Robustness Across Domains

Lucas, Javier; Gallego, Antonio Javier; Calvo-Zaragoza, Jorge; Martinez-Sevilla, Juan Carlos

doi:10.1007/978-3-031-41498-5_11

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14193)

Included in the following conference series:

International Conference on Document Analysis and Recognition


Abstract

The popularity of comics has increased in the digital era, leading to the development of several applications and platforms. These advancements have opened up new opportunities for creating and distributing comics and for experimenting with new forms of visual storytelling. One of the most promising research areas in this field is the use of deep learning techniques to process comic book images. However, a main challenge in using these models is adapting them to different domains, because comics vary greatly in style, subject matter, and design. In this paper, we present a study on the problem of generalization across different domains for the automatic detection of characters in comics. We evaluate the performance of state-of-the-art models trained on different domains and analyze the difficulties and challenges associated with generalization. Our study provides insights into the development of more robust deep learning models for processing comic characters and improving their generalization to new domains.



Author information

Authors and Affiliations

Department of Software and Computing Systems, University of Alicante, Alicante, Spain
Javier Lucas, Antonio Javier Gallego, Jorge Calvo-Zaragoza & Juan Carlos Martinez-Sevilla


Corresponding author

Correspondence to Antonio Javier Gallego.

Editor information

Editors and Affiliations

University of La Rochelle, La Rochelle, France
Mickael Coustaty
Autonomous University of Barcelona, Bellaterra, Spain
Alicia Fornés


About this paper

Cite this paper

Lucas, J., Gallego, A.J., Calvo-Zaragoza, J., Martinez-Sevilla, J.C. (2023). Automatic Detection of Comic Characters: An Analysis of Model Robustness Across Domains. In: Coustaty, M., Fornés, A. (eds) Document Analysis and Recognition – ICDAR 2023 Workshops. ICDAR 2023. Lecture Notes in Computer Science, vol 14193. Springer, Cham. https://doi.org/10.1007/978-3-031-41498-5_11


DOI: https://doi.org/10.1007/978-3-031-41498-5_11
Published: 15 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41497-8
Online ISBN: 978-3-031-41498-5
eBook Packages: Computer Science, Computer Science (R0)
