research-article

Creating Segments and Effects on Comics by Clustering Gaze Data

Authors:

Ishwarya Thirunarayanan,

Khimya Khetarpal,

Sanjeev Koppal,

Olivier Le Meur,

Eakta Jain

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 13, Issue 3

Article No.: 24, Pages 1 - 23

https://doi.org/10.1145/3078836

Published: 31 May 2017

Abstract

Traditional comics are increasingly being augmented with digital effects, such as recoloring, stereoscopy, and animation. An open question in this endeavor is where in a comic panel these effects should be placed. We propose a fast, semi-automatic technique that identifies effects-worthy segments in a comic panel by using gaze locations as a proxy for the importance of a region. We take advantage of the fact that comic artists direct viewer gaze towards narratively important regions. By capturing gaze locations from multiple viewers, we can identify important regions and direct a computer vision segmentation algorithm to extract these segments. The challenge is that gaze data are noisy and difficult to process. Our key contribution is to leverage a theoretical breakthrough from the computer networks community to achieve robust and meaningful clustering of gaze locations into semantic regions, without requiring the user to specify the number of clusters. We present a method, based on the concept of relative eigen quality, that takes a scanned comic image and a set of gaze points and produces an image segmentation. We demonstrate a variety of effects, such as defocus, recoloring, stereoscopy, and animation. We also investigate the use of artificially generated gaze locations from saliency models in place of actual gaze locations.
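
The central idea in the abstract, grouping noisy gaze points into regions without fixing the number of clusters in advance, can be illustrated with a much simpler stand-in. The sketch below uses flat-kernel mean shift, which is not the relative eigen quality method the article describes. The function name `mean_shift`, the `bandwidth` value, and the sample gaze coordinates are all assumptions made for illustration.

```python
import math


def mean_shift(points, bandwidth, tol=1e-3, max_iter=200):
    """Group 2D gaze points with flat-kernel mean shift.

    Illustrative stand-in only: this is NOT the relative eigen quality
    method from the article. `bandwidth` (in pixels) is an assumed
    tuning parameter; the number of clusters is not specified up front.
    Returns (modes, labels): one mode per discovered group and, for each
    input point, the index of the mode it converged to.
    """
    modes, labels = [], []
    for p in points:
        for _ in range(max_iter):
            # Neighbours of the current estimate within the bandwidth.
            near = [q for q in points if math.dist(p, q) <= bandwidth]
            if not near:  # defensive guard against rounding at the boundary
                break
            nxt = (sum(q[0] for q in near) / len(near),
                   sum(q[1] for q in near) / len(near))
            moved = math.dist(p, nxt)
            p = nxt
            if moved < tol:
                break
        # Reuse an existing mode if this point converged close to one.
        for i, m in enumerate(modes):
            if math.dist(p, m) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            modes.append(p)
            labels.append(len(modes) - 1)
    return modes, labels


# Hypothetical gaze samples (pixel coordinates) around three panel regions.
centers = [(100, 100), (300, 120), (200, 320)]
offsets = [(-8, -5), (0, 0), (6, 4), (-3, 7), (5, -6)]
gaze = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

modes, labels = mean_shift(gaze, bandwidth=40)
print(len(modes), labels)
```

On these well-separated samples the sketch recovers three groups. The `bandwidth` acts as a spatial scale and would likely need tuning for real, noisy recordings from multiple viewers.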

Supplementary Material

thirunarayanan.zip (23.75 MB): supplemental movie, appendix, image, and software files for "Creating Segments and Effects on Comics by Clustering Gaze Data"

Index Terms

Creating Segments and Effects on Comics by Clustering Gaze Data
1. Computing methodologies
2. Information systems
  1. Information systems applications
    1. Data mining
      1. Clustering

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 13, Issue 3

August 2017

233 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3104033

Editor:
Alberto Del Bimbo
University of Firenze, Italy

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 31 May 2017

Accepted: 01 March 2017

Revised: 01 March 2017

Received: 01 July 2016

Published in TOMM Volume 13, Issue 3

Qualifiers

Research-article
Research
Refereed

Cited By

Bisogni C, Nappi M, Tortora G, Del Bimbo A (2024). Gaze analysis. Image and Vision Computing 144:C. Online publication date: 1-Apr-2024.
https://dl.acm.org/doi/10.1016/j.imavis.2024.104961
Sharma R, Kukreja V (2023). CPD: Faster RCNN-based DragonBall Comic Panel Detection. 2023 IEEE 12th International Conference on Communication Systems and Network Technologies (CSNT), 786-790. Online publication date: 8-Apr-2023.
https://doi.org/10.1109/CSNT57126.2023.10134577
Ikuta H, Wöhler L, Aizawa K (2023). Statistical characteristics of comic panel viewing times. Scientific Reports 13:1. Online publication date: 20-Nov-2023.
https://doi.org/10.1038/s41598-023-47120-w
He D, Xie C (2022). Semantic image segmentation algorithm in a deep learning computer network. Multimedia Systems 28:6, 2065-2077. Online publication date: 1-Dec-2022.
https://dl.acm.org/doi/10.1007/s00530-020-00678-1
Yang R (2021). Vocational education reform based on improved convolutional neural network and speech recognition. Personal and Ubiquitous Computing. Online publication date: 11-Aug-2021.
https://doi.org/10.1007/s00779-021-01614-4
Bannier K, Jain E, Meur O, Sharif B, Krejtz K (2018). Deepcomics. Proceedings of the 2018 ACM Symposium on Eye Tracking Research & Applications, 1-5. Online publication date: 14-Jun-2018.
https://dl.acm.org/doi/10.1145/3204493.3204560
