BCBId: first Bangla comic dataset and its applications

Dutta, Arpita; Biswas, Samit; Das, Amit Kumar

doi:10.1007/s10032-022-00412-9

Special Issue Paper
Published: 15 September 2022

Volume 25, pages 265–279 (2022)

International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Comic document image analysis is now an active field of research in both academia and industry. However, research in this area is hampered by the inherent complexity of comic pages and by the limited availability of public benchmark datasets. This paper describes the creation of the first comic dataset in any Indian language, the Bangla Comic Book Image dataset (BCBId) (https://sites.google.com/view/banglacomicbookdataset), which is publicly available to researchers. BCBId consists of 3327 images taken from 64 Bangla comic stories written by 8 writers. Bangla is the sixth most widely spoken language in the world, used by 265 million people (https://en.wikipedia.org/wiki/Languages_of_India), and has a century-old heritage of comic strips (in newspapers) and comic books. BCBId provides ground truth for extracting the main visual components of comic book images: panels, characters, speech balloons, and text lines. It also includes metadata for every image, encoded in XML, that describes the underlying structure, semantics, and other features of the documents to support research on understanding stories and dialogues. A tool was designed specifically to make ground-truth generation faster and more accurate. As an application of the dataset, we carry out sentiment analysis of comic stories, the first such attempt on comic book images. We also elaborate on a couple of further applications of BCBId in comic research. In addition, we estimate the errors made by annotators during the annotation process and describe evaluation parameters for testing the efficacy of comic document image analysis algorithms.
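To make the annotation structure described in the abstract more concrete, the following minimal Python sketch reads bounding boxes for panels, characters, speech balloons, and text lines from an XML page description and compares two boxes by intersection over union (IoU). The XML element and attribute names (`page`, `panel`, `character`, `balloon`, `textline`, `x`, `y`, `w`, `h`), the sample file name, and the helper functions `load_boxes` and `iou` are all assumptions made for illustration; they are not the published BCBId schema. IoU is likewise only a commonly used overlap measure for detection tasks and is not claimed to be one of the evaluation parameters defined in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for a single page. Tag and attribute names are
# assumptions for illustration only, not the actual BCBId XML schema.
SAMPLE_XML = """
<page image="story01_page003.jpg" width="1200" height="1700">
  <panel id="p1" x="40" y="50" w="520" h="400">
    <character id="c1" x="80" y="120" w="150" h="300"/>
    <balloon id="b1" x="260" y="70" w="200" h="120">
      <textline id="t1" x="275" y="90" w="170" h="30"/>
      <textline id="t2" x="275" y="125" w="170" h="30"/>
    </balloon>
  </panel>
</page>
"""

# The four visual components named in the abstract, under assumed tag names.
COMPONENT_TAGS = ("panel", "character", "balloon", "textline")


def load_boxes(xml_text):
    """Return {tag: [(x1, y1, x2, y2), ...]} for every annotated component."""
    root = ET.fromstring(xml_text.strip())
    boxes = {tag: [] for tag in COMPONENT_TAGS}
    for elem in root.iter():
        if elem.tag in boxes:
            x, y, w, h = (int(elem.get(key)) for key in ("x", "y", "w", "h"))
            # Convert (x, y, width, height) to corner coordinates.
            boxes[elem.tag].append((x, y, x + w, y + h))
    return boxes


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


if __name__ == "__main__":
    ground_truth = load_boxes(SAMPLE_XML)
    for tag, rects in ground_truth.items():
        print(tag, rects)
    # Compare the annotated panel with a made-up detector output.
    predicted_panel = (50, 60, 560, 450)
    print("panel IoU:", iou(ground_truth["panel"][0], predicted_panel))
```

A detection would typically be counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; the threshold itself is a design choice that this sketch leaves open.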

Author information

Authors and Affiliations

Department of Computer Science And Technology, Indian Institute of Engineering Science and Technology, Shibpur, India
Arpita Dutta, Samit Biswas & Amit Kumar Das

Corresponding author

Correspondence to Arpita Dutta.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dutta, A., Biswas, S. & Das, A.K. BCBId: first Bangla comic dataset and its applications. IJDAR 25, 265–279 (2022). https://doi.org/10.1007/s10032-022-00412-9

Received: 15 March 2022
Accepted: 22 August 2022
Published: 15 September 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s10032-022-00412-9
