ABSTRACT
Subtitles play an important role in communicating the spoken content of multimedia. For example, subtitles in the viewer's language are often preferred to expensive audio translation of foreign movies. Traditionally, subtitle text is displayed centered at the bottom of the screen. This layout can create large distances between the text and relevant image content, causing eye strain and even making viewers miss visual content. As a recent alternative, speaker-following subtitles place the text in speech bubbles close to the current speaker. We conducted a controlled eye-tracking laboratory study (n = 40) comparing the regular approach (center-bottom subtitles) with content-sensitive, speaker-following subtitles across several dialog-heavy video clips. Our results show that speaker-following subtitles lead to higher fixation counts on relevant image regions and reduce saccade length, an important factor in eye strain.
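The two metrics named above can be illustrated with a minimal sketch. The helper functions, the rectangular area-of-interest convention, and the fixation data below are hypothetical and not taken from the study; saccade length is approximated here as the Euclidean distance between consecutive fixation centroids, and fixation count is tallied within a rectangular region of interest.

```python
import math

def saccade_lengths(fixations):
    """Approximate saccade amplitudes as Euclidean distances (in pixels)
    between consecutive fixation centroids."""
    return [math.dist(a, b) for a, b in zip(fixations, fixations[1:])]

def fixations_in_aoi(fixations, aoi):
    """Count fixations falling inside a rectangular area of interest,
    given as (x, y, width, height) in screen coordinates."""
    x, y, w, h = aoi
    return sum(1 for fx, fy in fixations if x <= fx <= x + w and y <= fy <= y + h)

# Hypothetical fixation sequence for one viewer (pixel coordinates):
# two fixations near a speaker's face, then a jump to a bottom subtitle area.
fixations = [(400, 300), (420, 310), (800, 900), (810, 905)]

speaker_region = (350, 250, 150, 100)  # assumed AOI around the speaker
print(saccade_lengths(fixations))
print(fixations_in_aoi(fixations, speaker_region))
```

Under this simplification, center-bottom subtitles would show up as long saccades between the image region and the bottom of the screen, whereas speaker-following subtitles would shorten those distances.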
Close to the Action: Eye-Tracking Evaluation of Speaker-Following Subtitles