Analysis of Partial Semantic Segmentation for Images of Four-Scene Comics

  • Conference paper
Distributed Computing and Artificial Intelligence, 17th International Conference (DCAI 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1237)

Abstract

Methods for understanding human creations with the help of artificial intelligence (AI) have multiplied; however, this remains one of the most difficult tasks for AI. Our research challenge is to find ways to understand four-scene comics through AI. To this end, we used a novel dataset called the “Four-scene Comics Story Dataset”, which is the first dataset created jointly by researchers and comic artists for AI development. In this paper, we focus on the partial semantic segmentation of features such as eyes, mouths, and speech balloons. Semantic segmentation of comics has been difficult because annotated comic datasets are lacking. To solve this problem, we exploited the features of our dataset, which allowed us to create an annotated dataset easily. For the semantic segmentation method, we used a model called DeepLabv3+. The effectiveness of our approach is confirmed by computer simulations showing the segmentation results for test images from four-scene comics.

Acknowledgment

We thank the comic artists and Spoma Inc. for cooperating with this research. A part of this work was supported by JST ACT-I (Grant Number JPMJPR17U4) and by JSPS KAKENHI Grants-in-Aid for Scientific Research (C) 26330282 and (B) 19H04184.

Author information

Corresponding author

Correspondence to Akira Terauchi.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Terauchi, A., Mori, N., Ueno, M. (2021). Analysis of Partial Semantic Segmentation for Images of Four-Scene Comics. In: Dong, Y., Herrera-Viedma, E., Matsui, K., Omatsu, S., González Briones, A., Rodríguez González, S. (eds) Distributed Computing and Artificial Intelligence, 17th International Conference. DCAI 2020. Advances in Intelligent Systems and Computing, vol 1237. Springer, Cham. https://doi.org/10.1007/978-3-030-53036-5_6
