research-article

Estimation of structure of four-scene comics by convolutional neural networks

Authors: Toshinori Suenaga, Hitoshi Isahara

MANPU '16: Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding

Article No.: 9, Pages 1 - 6

https://doi.org/10.1145/3011549.3011558

Published: 04 December 2016

Abstract

The computational interpretation of comics is an important topic in artificial intelligence and image recognition. Interpreting comics involves many challenging tasks, e.g., recognizing objects in grayscale drawings, extracting the emotional content of scenes, and modeling sequences of scenes by considering the structure of comics. In this paper, we focus on four-scene comics and the transitions between their scenes. Four-scene comics have a structure that originated in the four-part form of Chinese poetry, so creators clearly draw the semantic distance between scenes. This distance is very important for expressing the interesting and lyrical aspects of comics. To detect scene transitions, we constructed convolutional neural networks (CNNs) and carried out computer experiments. The results suggest that CNNs can detect scene transitions and that the features of each scene are quite different.
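As a rough illustration of the kind of classifier the abstract describes, the sketch below runs the forward pass of a tiny CNN (convolution, ReLU, max pooling, then a softmax over four scene positions) in plain Python. It is not the authors' network. Framing the task as four-way scene-position classification, the 8x8 input, the layer sizes, the random untrained weights, and every function name are assumptions made for illustration only.

```python
import math
import random


def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a grayscale image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; leftover rows and columns are dropped."""
    h, w = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(w)]
            for i in range(h)]


def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_scene_position(image, kernels, weights, biases):
    """Hypothetical helper: probabilities that a panel is scene 1, 2, 3, or 4."""
    features = []
    for kernel in kernels:
        pooled = max_pool(relu(conv2d(image, kernel)))
        features.extend(v for row in pooled for v in row)
    scores = [sum(w * f for w, f in zip(ws, features)) + b
              for ws, b in zip(weights, biases)]
    return softmax(scores)


# Toy input and untrained random parameters (assumptions, not data from the paper).
rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(2)]
# 8x8 input -> 3x3 conv -> 6x6 -> 2x2 pool -> 3x3, so 9 features per kernel.
n_features = len(kernels) * 3 * 3
weights = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(4)]
biases = [0.0] * 4

probs = predict_scene_position(image, kernels, weights, biases)
print(probs)  # four probabilities, one per scene position
```

With trained weights, large differences between the per-scene probabilities would be one way to read the claim that the features of each scene are quite different. The untrained weights here only demonstrate the data flow.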

Index Terms

Estimation of structure of four-scene comics by convolutional neural networks
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations
      2. Computer vision tasks
        Scene understanding


Published In

MANPU '16: Proceedings of the 1st International Workshop on coMics ANalysis, Processing and Understanding

December 2016

78 pages

ISBN:9781450347846

DOI:10.1145/3011549

General Chairs:
Jean-Marc Ogier, University of La Rochelle, France
Kiyoharu Aizawa, The University of Tokyo, Japan
Koichi Kise, Osaka Prefecture University, Japan

Program Chairs:
Jean-Christophe Burie, University of La Rochelle, France
Toshihiko Yamasaki, The University of Tokyo, Japan
Motoi Iwata, Osaka Prefecture University, Japan

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MANPU '16

MANPU '16: First International Workshop on coMics ANalysis, Processing and Understanding

December 4, 2016

Cancun, Mexico

Acceptance Rates

MANPU '16 Paper Acceptance Rate 12 of 17 submissions, 71%;

Overall Acceptance Rate 12 of 17 submissions, 71%

Article Metrics

Total Citations: 8
Total Downloads: 145
Downloads (Last 12 months): 10
Downloads (Last 6 weeks): 0

Reflects downloads up to 20 Jan 2025

Cited By

Rishu, Kukreja V (2024). Comic exploration and Insights: Recent trends in LDA-Based recognition studies. Expert Systems with Applications 255 (124732). Online publication date: Dec-2024.
https://doi.org/10.1016/j.eswa.2024.124732
Rishu, Kukreja V (2024). Decoding comics: a systematic literature review on recognition, segmentation, and classification techniques with emphasis on computer vision and non-computer vision. Multimedia Tools and Applications. Online publication date: 1-Oct-2024.
https://doi.org/10.1007/s11042-024-20214-x
Ueno M, Fukuda K, Mori N (2019). Can Computers Understand Picture Books and Comics? Post-Narratology Through Computational and Cognitive Approaches (318-350). Online publication date: 2019.
https://doi.org/10.4018/978-1-5225-7979-3.ch008
Cohn N, Magliano J (2019). Editors’ Introduction and Review: Visual Narrative Research: An Emerging Field in Cognitive Science. Topics in Cognitive Science 12:1 (197-223). Online publication date: 22-Dec-2019.
https://doi.org/10.1111/tops.12473
Fujino S, Mori N, Matsumoto K (2019). Recognizing the Order of Four-Scene Comics by Evolutionary Deep Learning. Distributed Computing and Artificial Intelligence, 15th International Conference (136-144). Online publication date: 2019.
https://doi.org/10.1007/978-3-319-94649-8_17
Ueno M (2018). Structure Analysis on Common Plot in Four-Scene Comic Story Dataset. MultiMedia Modeling (625-636). Online publication date: 11-Dec-2018.
https://doi.org/10.1007/978-3-030-05716-9_56
Ueno M, Isahara H (2017). Story Pattern Analysis Based on Scene Order Information in Four-Scene Comics. 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) (78-83). Online publication date: Nov-2017.
https://doi.org/10.1109/ICDAR.2017.296
Nguyen N, Rigaud C, Burie J (2017). Comic Characters Detection Using Deep Learning. 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) (41-46). Online publication date: Nov-2017.
https://doi.org/10.1109/ICDAR.2017.290
