Abstract
Omnidirectional video quality assessment (OVQA) evaluates viewers' visual experience and promotes the development of omnidirectional video. The perceived quality of an omnidirectional video is affected not only by the video content and distortion, but also by the viewers' preferred viewing directions. Several quality assessment methods for omnidirectional video exist, but most of them are full-reference (FR). Compared with FR methods, no-reference (NR) methods are more difficult to design because the reference video is unavailable. In this paper, an NR OVQA method based on generative adversarial networks (GANs) is proposed, composed of a reference video generator and a quality score predictor. In general, a reference image/video may be degraded by several distortion types, each with several distortion levels. Some existing NR methods use a GAN to generate reference images/videos for quality assessment: given distorted images/videos of the same distortion type but at different distortion levels, the GAN produces the corresponding references. For accurate quality assessment, the references generated from these differently distorted inputs should be as similar in quality to one another as possible. However, the GAN treats the distorted inputs independently and therefore generates slightly different references for them, an issue that existing GAN-based methods do not consider. To address this issue, we introduce a level loss into OVQA. For the quality score predictor, as a further contribution of this paper, the viewing direction of the omnidirectional video is incorporated to guide the quality and weight regression. The proposed method is evaluated on a publicly available dataset, and the experimental results demonstrate its effectiveness.
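To make the two ideas concrete, the PyTorch sketch below illustrates (a) a pairwise "level loss" that pushes references generated from differently distorted versions of the same content toward one another, and (b) viewing-direction-guided weighted pooling of local quality scores. This is a minimal illustration under stated assumptions, not the authors' implementation: the pairwise MSE form of the level loss, the softmax weight normalization, and all names (`level_loss`, `direction_weighted_score`) are our own.

```python
import torch
import torch.nn.functional as F

def level_loss(gen_refs: torch.Tensor) -> torch.Tensor:
    """Hypothetical 'level loss'. gen_refs holds reference frames that the
    generator produced from the SAME content distorted at L different levels,
    shape (L, C, H, W). Averaged pairwise MSE penalizes disagreement, so the
    generator is encouraged to output (nearly) the same reference regardless
    of the distortion level of its input."""
    L = gen_refs.shape[0]
    loss = gen_refs.new_zeros(())
    pairs = 0
    for i in range(L):
        for j in range(i + 1, L):
            loss = loss + F.mse_loss(gen_refs[i], gen_refs[j])
            pairs += 1
    return loss / max(pairs, 1)

def direction_weighted_score(patch_scores: torch.Tensor,
                             patch_weights: torch.Tensor) -> torch.Tensor:
    """Viewing-direction-guided pooling: per-patch quality scores are combined
    with predicted weights (e.g., larger near preferred viewing directions)
    into a single video-level score."""
    w = torch.softmax(patch_weights, dim=-1)  # normalize weights to sum to 1
    return (w * patch_scores).sum(dim=-1)     # weighted average score

# Toy usage: 3 distortion levels of the same content, 4x4 "frames".
refs = torch.randn(3, 3, 4, 4)
print(level_loss(refs).item())

scores = torch.tensor([3.2, 4.1, 2.8])   # hypothetical per-patch scores
weights = torch.tensor([0.5, 2.0, 0.1])  # hypothetical direction weights
print(direction_weighted_score(scores, weights).item())
```

In this sketch the level loss would be added to the generator's usual adversarial and reconstruction objectives during training; the exact weighting between the terms is a design choice the abstract does not specify.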
Acknowledgements
This work was supported by the Natural Science Foundation of Fujian Province of China under Grant 2019J01046.
Cite this article
Guo, J., Luo, Y. No-reference omnidirectional video quality assessment based on generative adversarial networks. Multimed Tools Appl 80, 27531–27552 (2021). https://doi.org/10.1007/s11042-021-10862-8