ABSTRACT
Subtitles play an important role in communicating the spoken content of multimedia. For example, subtitles in the viewer's language are often preferred to expensive audio translation of foreign movies. Traditionally, subtitle text is displayed centered at the bottom of the screen. This layout can create large distances between the text and relevant image content, causing eye strain and even making viewers miss visual content. As a recent alternative, speaker-following subtitles place the text in speech bubbles close to the current speaker. We conducted a controlled eye-tracking laboratory study (n = 40) comparing the regular approach (center-bottom subtitles) with content-sensitive, speaker-following subtitles across several dialog-heavy video clips. Our results show that speaker-following subtitles lead to higher fixation counts on relevant image regions and reduce saccade length, an important factor in eye strain.
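The two metrics named above can be illustrated with a minimal sketch. The helper functions, the rectangular area-of-interest convention, and the fixation data below are hypothetical and not taken from the study; saccade length is approximated here as the Euclidean distance between consecutive fixation centroids, and fixation count is tallied within a rectangular region of interest.

```python
import math

def saccade_lengths(fixations):
    """Approximate saccade amplitudes as Euclidean distances (in pixels)
    between consecutive fixation centroids."""
    return [math.dist(a, b) for a, b in zip(fixations, fixations[1:])]

def fixations_in_aoi(fixations, aoi):
    """Count fixations falling inside a rectangular area of interest,
    given as (x, y, width, height) in screen coordinates."""
    x, y, w, h = aoi
    return sum(1 for fx, fy in fixations if x <= fx <= x + w and y <= fy <= y + h)

# Hypothetical fixation sequence for one viewer (pixel coordinates):
# two fixations near a speaker's face, then a jump to a bottom subtitle area.
fixations = [(400, 300), (420, 310), (800, 900), (810, 905)]

speaker_region = (350, 250, 150, 100)  # assumed AOI around the speaker
print(saccade_lengths(fixations))
print(fixations_in_aoi(fixations, speaker_region))
```

Under this simplification, center-bottom subtitles would show up as long saccades between the image region and the bottom of the screen, whereas speaker-following subtitles would shorten those distances.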
Close to the Action: Eye-Tracking Evaluation of Speaker-Following Subtitles