Abstract
Panels are the fundamental elements of manga pages, and hence their detection serves as the basis of high-level manga content understanding. Existing panel detection methods can be categorized into heuristic-based methods and CNN-based (Convolutional Neural Network-based) ones. Although the former can accurately localize panels, they cannot handle elaborate panels well and require considerable effort to hand-craft rules for every new hard case. In contrast, the detection results of CNN-based methods can be rough and inaccurate. We utilize CNN object detectors to propose coarse guide panels, then use heuristics to propose panel candidates, and finally optimize an energy function to select the most plausible candidates. The CNN ensures roughly localized detection of almost all kinds of panels, while the follow-up procedure refines the detection results and minimizes the margin between detected panels and the ground truth with the help of heuristics and energy minimization. Experimental results show that the proposed method surpasses previous methods in panel detection F1-score and page accuracy.
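The selection step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's formulation: the energy terms (one minus IoU against the guide panel, plus a penalty for overlap between chosen panels), the `overlap_weight` parameter, the function names, and the brute-force search, which stands in for whatever optimizer the authors actually use. Boxes are axis-aligned `(x1, y1, x2, y2)` tuples.

```python
from itertools import product


def area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2); 0 if degenerate."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection(a, b):
    """Area of the overlap between two boxes."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return area(overlap)


def iou(a, b):
    """Intersection over union of two boxes."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def energy(guides, chosen, overlap_weight=1.0):
    """Hypothetical energy: disagreement with guides plus panel overlap."""
    # Unary term: a chosen candidate should agree with its coarse guide panel.
    unary = sum(1.0 - iou(g, c) for g, c in zip(guides, chosen))
    # Pairwise term: chosen panels should not overlap each other.
    pairwise = 0.0
    for i in range(len(chosen)):
        for j in range(i + 1, len(chosen)):
            smaller = min(area(chosen[i]), area(chosen[j]))
            if smaller > 0:
                pairwise += intersection(chosen[i], chosen[j]) / smaller
    return unary + overlap_weight * pairwise


def select_panels(guides, candidates_per_guide, overlap_weight=1.0):
    """Pick one candidate per guide by exhaustive energy minimization."""
    best = min(
        product(*candidates_per_guide),
        key=lambda chosen: energy(guides, chosen, overlap_weight),
    )
    return list(best)


# Two coarse guide panels, as a CNN detector might propose them (made-up values).
guides = [(0, 0, 48, 100), (52, 0, 100, 100)]
# Heuristic candidates for each guide (made-up values).
candidates = [
    [(0, 0, 50, 100), (0, 0, 60, 100)],
    [(50, 0, 100, 100), (40, 0, 100, 100)],
]
selected = select_panels(guides, candidates)
print(selected)
```

Exhaustive search is exponential in the number of panels, so it only suits toy inputs like this one; a page with many panels and candidates would need a proper discrete optimizer.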
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant 61673029. This work is also a research achievement of the Key Laboratory of Science, Technology and Standard in Press Industry (Key Laboratory of Intelligent Press Media Technology). The authors gratefully acknowledge financial support from the China Scholarship Council.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, Y., Wang, Y., He, Z., Tang, Z., Suen, C.Y. (2020). Towards Accurate Panel Detection in Manga: A Combined Effort of CNN and Heuristics. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11961. Springer, Cham. https://doi.org/10.1007/978-3-030-37731-1_18
DOI: https://doi.org/10.1007/978-3-030-37731-1_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37730-4
Online ISBN: 978-3-030-37731-1
eBook Packages: Computer Science, Computer Science (R0)