Towards efficient context-specific video coding based on gaze-tracking analysis

Published: 12 December 2007

Abstract

This article discusses a framework for model-based, context-dependent video coding that exploits characteristics of the human visual system. The system uses variable-quality coding driven by priority maps, which are created using mostly context-dependent rules. The technique is demonstrated through two case studies of specific video contexts, namely open signed content and football sequences. Eye-tracking analysis is employed to identify the characteristics of each context, which are subsequently exploited for coding purposes, either directly or through a gaze prediction model. The framework is shown to achieve a considerable improvement in coding efficiency.
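As a rough illustration of what priority-map-driven variable-quality coding can look like, the sketch below assigns each 16x16 macroblock a priority that decays with distance from a single gaze point, then maps priority to a quantisation parameter (QP) in the H.264 range of 0 to 51. This is not the article's method: the Gaussian falloff, the function names `priority_map` and `qp_map`, and the parameters `sigma_px`, `base_qp`, and `max_offset` are all assumptions chosen for the example, and the article's context-dependent rules and gaze prediction model are not reproduced here.

```python
import math

MB_SIZE = 16  # macroblock size in pixels, as used by H.264


def priority_map(width, height, gaze_x, gaze_y, sigma_px=64.0):
    """Return one priority in (0, 1] per macroblock (rows x cols).

    Assumption for this sketch: priority is a Gaussian function of the
    distance between the macroblock centre and a single gaze point.
    """
    cols = math.ceil(width / MB_SIZE)
    rows = math.ceil(height / MB_SIZE)
    pmap = []
    for r in range(rows):
        row = []
        for c in range(cols):
            cx = c * MB_SIZE + MB_SIZE / 2
            cy = r * MB_SIZE + MB_SIZE / 2
            d2 = (cx - gaze_x) ** 2 + (cy - gaze_y) ** 2
            row.append(math.exp(-d2 / (2 * sigma_px ** 2)))
        pmap.append(row)
    return pmap


def qp_map(pmap, base_qp=26, max_offset=10, qp_max=51):
    """Map priorities to per-macroblock QP values.

    Priority 1.0 keeps base_qp (finest quantisation in this sketch);
    priority near 0 approaches base_qp + max_offset (coarser quantisation).
    Values are clamped to qp_max.
    """
    return [
        [min(qp_max, base_qp + round((1.0 - p) * max_offset)) for p in row]
        for row in pmap
    ]


if __name__ == "__main__":
    # Hypothetical 64x32 frame with the viewer looking at the top-left block.
    for row in qp_map(priority_map(64, 32, gaze_x=8, gaze_y=8)):
        print(row)
```

In a real encoder the resulting QP values would be passed to the rate-control or per-macroblock quantisation interface; how that is done depends on the encoder and is outside the scope of this sketch.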

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 3, Issue 4
    December 2007
    147 pages
    ISSN:1551-6857
    EISSN:1551-6865
    DOI:10.1145/1314303

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 12 December 2007
    Accepted: 01 August 2007
    Received: 01 August 2007
    Published in TOMM Volume 3, Issue 4

    Author Tags

    1. Eye tracking
    2. applications
    3. context-based video coding
    4. multimedia perceptual quality
    5. subjective video quality
    6. transformation of eye movements into useful knowledge

    Qualifiers

    • Research-article
    • Research
    • Refereed
