Automatic Detection of Comic Characters: An Analysis of Model Robustness Across Domains

  • Conference paper
Document Analysis and Recognition – ICDAR 2023 Workshops (ICDAR 2023)

Abstract

The popularity of comics has increased in the digital era, leading to the development of several applications and platforms. These advancements have opened up new opportunities for creating and distributing comics and for experimenting with new forms of visual storytelling. One of the most promising research areas in this field is the use of deep learning techniques to process comic book images. However, a main challenge in using these models is adapting them to different domains, because comics vary greatly in style, subject matter, and design. In this paper, we study the problem of generalization across domains for the automatic detection of characters in comics. We evaluate the performance of state-of-the-art models trained on different domains and analyze the difficulties associated with generalization. Our study provides insights for developing more robust deep learning models that process comic characters and generalize better to new domains.

Author information

Corresponding author

Correspondence to Antonio Javier Gallego.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lucas, J., Gallego, A.J., Calvo-Zaragoza, J., Martinez-Sevilla, J.C. (2023). Automatic Detection of Comic Characters: An Analysis of Model Robustness Across Domains. In: Coustaty, M., Fornés, A. (eds) Document Analysis and Recognition – ICDAR 2023 Workshops. ICDAR 2023. Lecture Notes in Computer Science, vol 14193. Springer, Cham. https://doi.org/10.1007/978-3-031-41498-5_11

  • DOI: https://doi.org/10.1007/978-3-031-41498-5_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41497-8

  • Online ISBN: 978-3-031-41498-5

  • eBook Packages: Computer Science, Computer Science (R0)
