Stereoscopic Video Quality Prediction Based on End-to-End Dual Stream Deep Neural Networks

Zhou, Wei; Chen, Zhibo; Li, Weiping

doi:10.1007/978-3-030-00764-5_44

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11166)

Included in the following conference series:

Pacific Rim Conference on Multimedia

Abstract

In this paper, we propose a no-reference stereoscopic video quality assessment (NR-SVQA) method based on an end-to-end dual stream deep neural network (DNN) that incorporates left-view and right-view sub-networks. The network takes image patch pairs from pivotal frames of the left and right views as input and evaluates the perceptual quality of each image patch pair. By combining multiple convolution, max-pooling, and fully-connected layers with regression, the framework learns distortion-related features end-to-end in a purely data-driven manner. A spatiotemporal pooling strategy is then applied across these image patch pairs to estimate the quality of the entire stereoscopic video. The proposed network architecture, which we name End-to-end Dual stream deep Neural network (EDN), is trained and tested on a well-known stereoscopic video dataset, with the training and test sets divided by reference video. Experimental results demonstrate that the proposed method outperforms state-of-the-art algorithms.
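The patch-pair and pooling steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the patch size, the stride, how pivotal frames are chosen, or the exact pooling rule, so everything here is an assumption: patches are co-located in the two views, and pooling is a plain average over patches within a frame followed by an average over frames. The names `crop`, `extract_patch_pairs`, `pool_video_quality`, and `toy_score` are hypothetical, and `toy_score` is only a placeholder for the trained dual stream network.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Frame = List[List[float]]  # one grayscale frame, stored as rows of pixel values
Patch = List[List[float]]


def crop(frame: Frame, top: int, left: int, size: int) -> Patch:
    """Return the size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def extract_patch_pairs(left_frame: Frame, right_frame: Frame,
                        size: int, stride: int) -> List[Tuple[Patch, Patch]]:
    """Cut co-located patches from a left/right frame pair (assumed layout)."""
    if (len(left_frame) != len(right_frame)
            or len(left_frame[0]) != len(right_frame[0])):
        raise ValueError("left and right frames must have the same shape")
    height, width = len(left_frame), len(left_frame[0])
    return [
        (crop(left_frame, top, col, size), crop(right_frame, top, col, size))
        for top in range(0, height - size + 1, stride)
        for col in range(0, width - size + 1, stride)
    ]


def pool_video_quality(frame_pairs: Sequence[Tuple[Frame, Frame]],
                       score_patch_pair: Callable[[Patch, Patch], float],
                       size: int, stride: int) -> float:
    """Average patch-pair scores per frame, then average over frames.

    Plain averaging is an assumption; the paper's pooling may differ.
    """
    frame_scores = []
    for left_frame, right_frame in frame_pairs:
        pairs = extract_patch_pairs(left_frame, right_frame, size, stride)
        if not pairs:
            raise ValueError("frame is smaller than the patch size")
        # Spatial pooling: one score per frame pair.
        frame_scores.append(mean(score_patch_pair(l, r) for l, r in pairs))
    if not frame_scores:
        raise ValueError("at least one frame pair is required")
    # Temporal pooling: one score for the whole video.
    return mean(frame_scores)


def toy_score(left_patch: Patch, right_patch: Patch) -> float:
    """Hypothetical placeholder for the trained network: mean pixel value."""
    pixels = [p for patch in (left_patch, right_patch)
              for row in patch for p in row]
    return mean(pixels)


# Two identical 4x4 frame pairs stand in for the pivotal frames of a video.
left = [[1.0] * 4 for _ in range(4)]
right = [[3.0] * 4 for _ in range(4)]
video_score = pool_video_quality([(left, right), (left, right)],
                                 toy_score, size=2, stride=2)
print(video_score)
```

In a real pipeline, `score_patch_pair` would be replaced by the forward pass of the trained network, and the frame list would contain only the selected pivotal frames.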

Acknowledgement

This work was supported in part by the National Key Research and Development Program of China under Grant No. 2016YFC0801001, the National Program on Key Basic Research Projects (973 Program) under Grant 2015CB351803, the NSFC under Grants 61571413, 61632001, and 61390514, and Intel ICRI MNC.

Author information

Authors and Affiliations

CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, 230027, China
Wei Zhou, Zhibo Chen & Weiping Li

Corresponding author

Correspondence to Zhibo Chen.

Editor information

Editors and Affiliations

Hefei University of Technology, Hefei, China
Richang Hong
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
University of Tokyo, Tokyo, Japan
Toshihiko Yamasaki
Hefei University of Technology, Hefei, China
Meng Wang
City University of Hong Kong, Hong Kong, Hong Kong
Chong-Wah Ngo

About this paper

Cite this paper

Zhou, W., Chen, Z., Li, W. (2018). Stereoscopic Video Quality Prediction Based on End-to-End Dual Stream Deep Neural Networks. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11166. Springer, Cham. https://doi.org/10.1007/978-3-030-00764-5_44

DOI: https://doi.org/10.1007/978-3-030-00764-5_44
Published: 18 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00763-8
Online ISBN: 978-3-030-00764-5
eBook Packages: Computer Science, Computer Science (R0)
