Abstract
Artificial intelligence (AI) is increasingly used to understand human creative works, but this remains one of the most difficult tasks in the field. Our research challenge is to find ways for AI to understand four-scene comics. To this end, we used a novel dataset called the “Four-scene Comics Story Dataset”, which is the first dataset made by researchers and comic artists to develop AI creations. In this paper, we focus on partial semantic segmentation of features such as eyes, mouths, and speech balloons. Semantic segmentation of comics has been difficult because annotated comic datasets are scarce. To solve this problem, we exploited the features of our dataset to create an annotated dataset easily. For semantic segmentation, we used the DeepLabv3+ model. The effectiveness of our approach is confirmed by computer simulations showing the segmentation results for test images from four-scene comics.
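The abstract does not say how segmentation quality is measured. As an illustration only, the sketch below computes per-class intersection over union (IoU), a common score for semantic segmentation, on two small label maps. The class ids, the class names, and the toy maps are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter

# Hypothetical label ids; the paper's actual class list and encoding are not given here.
CLASSES = {0: "background", 1: "eye", 2: "mouth", 3: "speech_balloon"}


def per_class_iou(pred, truth, class_ids):
    """Return {class_id: IoU} for two equally sized 2-D label maps (lists of rows).

    A class absent from both maps gets None, since its IoU is undefined.
    """
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("label maps must have the same shape")

    intersection = Counter()  # pixels where prediction and annotation agree, per class
    pred_area = Counter()     # pixels predicted as each class
    truth_area = Counter()    # pixels annotated as each class
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            pred_area[p] += 1
            truth_area[t] += 1
            if p == t:
                intersection[p] += 1

    scores = {}
    for c in class_ids:
        union = pred_area[c] + truth_area[c] - intersection[c]
        scores[c] = intersection[c] / union if union else None
    return scores


# Toy 3x4 label maps standing in for a predicted and an annotated comic panel.
truth = [
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [3, 3, 2, 2],
]
pred = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [3, 3, 3, 2],
]
scores = per_class_iou(pred, truth, CLASSES)
for class_id, name in CLASSES.items():
    print(name, scores[class_id])
```

Reporting one score per class, rather than a single pixel accuracy, keeps small regions such as eyes and mouths from being swamped by the background, which usually covers most of a panel.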
References
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015
Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, pp. 234–241. Springer, Cham (2015)
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: CVPR (2017)
Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: ECCV (2018)
Grauman, K., Darrell, T.: The pyramid match kernel: discriminative classification with sets of image features. In: Tenth IEEE International Conference on Computer Vision (ICCV 2005) Volume 1, vol. 2, pp. 1458–1465, October 2005
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), vol. 2, pp. 2169–2178, June 2006
Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions, November 2015
Narita, R., Ogawa, T., Matsui, Y., Yamasaki, T., Aizawa, K.: Sketch-based manga retrieval using deep features. In: The 31st Annual Conference of the Japanese Society for Artificial Intelligence, JSAI2017, p. 3H1OS04a2 (2017)
Matsui, Y., Ito, K., Aramaki, Y., Fujimoto, A., Ogawa, T., Yamasaki, T., Aizawa, K.: Sketch-based manga retrieval using Manga109 dataset. Multimed. Tools Appl. 76(20), 21811–21838 (2017)
Ogawa, T., Otsubo, A., Narita, R., Matsui, Y., Yamasaki, T., Aizawa, K.: Object detection for comics using Manga109 annotations. CoRR, abs/1803.08670 (2018)
Guérin, C., Rigaud, C., Mercier, A., Ammar-Boudjelal, F., Bertet, K., Bouju, A., Burie, J.-C., Louis, G., Ogier, J.-M., Revel, A.: eBDtheque: a representative database of comics. In: Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 1145–1149 (2013)
Fujino, S., Mori, N., Matsumoto, K.: Recognizing the order of four-scene comics by evolutionary deep learning. In: 15th International Conference on Distributed Computing and Artificial Intelligence, DCAI 2018, Toledo, Spain, June 20–22 2018, pp. 136–144 (2018)
Ueno, M.: Creators and artificial intelligence: enabling collaboration with creative processes and meta-data for four-scene comic story dataset. In: The 32nd Annual Conference of the Japanese Society for Artificial Intelligence, JSAI2018, p. 4Pin116 (2018)
Acknowledgment
We thank the comic artists and Spoma Inc. for cooperating with this research. Part of this work was supported by ACT-I, JST (Grant Number JPMJPR17U4) and by JSPS KAKENHI Grants-in-Aid for Scientific Research (C) 26330282 and (B) 19H04184.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Terauchi, A., Mori, N., Ueno, M. (2021). Analysis of Partial Semantic Segmentation for Images of Four-Scene Comics. In: Dong, Y., Herrera-Viedma, E., Matsui, K., Omatsu, S., González Briones, A., Rodríguez González, S. (eds) Distributed Computing and Artificial Intelligence, 17th International Conference. DCAI 2020. Advances in Intelligent Systems and Computing, vol 1237. Springer, Cham. https://doi.org/10.1007/978-3-030-53036-5_6
DOI: https://doi.org/10.1007/978-3-030-53036-5_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-53035-8
Online ISBN: 978-3-030-53036-5
eBook Packages: Intelligent Technologies and Robotics (R0)