Abstract
Visual comfort assessment (VCA) for stereoscopic three-dimensional (S3D) images is a challenging problem in the 3D quality of experience (3D-QoE) community. The goal of VCA is to automatically predict the degree of perceived visual discomfort in line with subjective judgment. The challenges of VCA lie mainly in two aspects: 1) formulating effective visual comfort-aware features, and 2) finding an appropriate way to pool them into an overall visual comfort score. In this paper, a novel two-stage framework is proposed to address these problems. In the first stage, a primary predictive feature (PPF) and an advanced predictive feature (APF) are extracted separately and then integrated to reflect the visual discomfort perceived during 3D viewing. Specifically, we compute S3D visual attention-weighted disparity statistics to construct the PPF, and the neural activities of the middle temporal (MT) area of the human brain to construct the APF. In the second stage, the integrated visual comfort-aware features are fused into a single visual comfort score using random forest (RF) regression, which maps the high-dimensional feature space into a low-dimensional quality (visual comfort) space. Comparisons with five relevant state-of-the-art models on a standard benchmark database confirm the superior performance of the proposed method.
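The attention-weighting idea behind the PPF can be illustrated with a minimal sketch. The abstract does not list the exact statistics used, so the three quantities below (saliency-weighted mean, standard deviation, and mean magnitude of disparity) are assumptions chosen for illustration, and the function name `attention_weighted_disparity_stats` is hypothetical. The sketch also assumes that a disparity map and a non-negative S3D saliency map of the same size are already available.

```python
from math import sqrt
from typing import Sequence, Tuple


def attention_weighted_disparity_stats(
    disparity: Sequence[Sequence[float]],
    saliency: Sequence[Sequence[float]],
) -> Tuple[float, float, float]:
    """Return saliency-weighted (mean, std, mean magnitude) of a disparity map.

    Both inputs are 2-D row-major grids of identical shape. Saliency values
    act as per-pixel weights; they must be non-negative and not all zero.
    These three statistics are illustrative assumptions, not the paper's
    confirmed feature set.
    """
    if len(disparity) != len(saliency):
        raise ValueError("disparity and saliency must have the same shape")

    # Flatten both maps into (disparity, weight) pairs, validating as we go.
    pairs = []
    for d_row, s_row in zip(disparity, saliency):
        if len(d_row) != len(s_row):
            raise ValueError("disparity and saliency must have the same shape")
        for d, w in zip(d_row, s_row):
            if w < 0:
                raise ValueError("saliency weights must be non-negative")
            pairs.append((d, w))

    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("saliency map must contain a non-zero weight")

    # Weighted first moment, then weighted spread around it.
    mean = sum(w * d for d, w in pairs) / total_weight
    variance = sum(w * (d - mean) ** 2 for d, w in pairs) / total_weight
    mean_magnitude = sum(w * abs(d) for d, w in pairs) / total_weight
    return mean, sqrt(variance), mean_magnitude


# Toy 2x2 example: attention falls only on the bottom row.
toy_disparity = [[-2.0, 0.0], [2.0, 4.0]]
toy_saliency = [[0.0, 0.0], [1.0, 1.0]]
toy_stats = attention_weighted_disparity_stats(toy_disparity, toy_saliency)
```

With the toy saliency map, only the attended bottom-row disparities contribute, so the statistics describe the region a viewer is assumed to fixate rather than the whole image. A feature vector built this way could then be passed to any regressor, such as the RF regression named in the abstract.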
Acknowledgments
The authors would like to thank the editor and all of the reviewers for their valuable comments and suggestions, which have improved the quality and presentation of this paper. This work was supported in part by the Natural Science Foundation of China (grants 61271021 and U1301257) and in part by the Scientific Research Foundation of Graduate School of Ningbo University. It was also sponsored by the K.C. Wong Magna Fund in Ningbo University.
Cite this article
Jiang, Q., Shao, F., Jiang, G. et al. Leveraging visual attention and neural activity for stereoscopic 3D visual comfort assessment. Multimed Tools Appl 76, 9405–9425 (2017). https://doi.org/10.1007/s11042-016-3548-2
DOI: https://doi.org/10.1007/s11042-016-3548-2