Objective Quality Assessment of Stereoscopic Video Using Inflated 3D Features

  • Original Research
  • SN Computer Science

Abstract

Convolutional Neural Networks (CNNs) have received growing research attention for Stereoscopic Video Quality Assessment (SVQA) in recent years. Researchers have used 3D CNNs to extract spatial and temporal features from stereo videos and to detect quality degradation in them. To the best of our knowledge, transfer learning (TL) has not been well examined in SVQA. Pretraining and fine-tuning are approaches used in deep neural networks to transfer knowledge learned from other, more general domains. Previous methods that utilized TL relied on very heavy 3D ResNet architectures with many layers and are therefore very time-consuming. In this paper, we develop a new model for SVQA that uses the Inflated 3-Dimensional ConvNet (I3D) network as its backbone feature extractor. We first feed the left and right videos to I3D models to extract their features. Then, we apply 3D CNNs to learn quality-aware features from the stereo videos. We evaluate the proposed method on the LFOVIAS3DPh2 and NAMA3DS1-COSPAD1 SVQA datasets. Extensive experiments on the two datasets show that the proposed method correlates well with the subjective results. The Root-Mean-Square Error (RMSE) on the NAMA3DS1-COSPAD1 dataset is 0.2454, and the high Linear Correlation Coefficient (LCC) and Spearman Rank Order Correlation Coefficient (SROCC) values (0.895 and 0.901, respectively) on the LFOVIAS3DPh2 dataset show that the results are consistent with the human visual system (HVS). Despite having a lighter architecture than the best-performing method, the proposed method outperforms most existing methods and is overall the second-best-performing method available.
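
The three agreement measures reported above (LCC, SROCC, and RMSE) can be illustrated with a minimal sketch in plain Python. This is not the authors' evaluation code: the function names are hypothetical, and the sketch assumes the predicted scores and the subjective scores are already on the same scale. It also omits the nonlinear (logistic) mapping that quality-assessment studies commonly apply before computing LCC and RMSE.

```python
import math


def pearson(x, y):
    """Linear Correlation Coefficient (LCC) between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """SROCC: the Pearson correlation computed on the ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(x, y):
    """Root-Mean-Square Error between predictions x and targets y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    # Made-up illustrative scores, not data from the paper.
    predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
    subjective = [1.1, 2.0, 2.9, 4.2, 5.1]
    print("LCC:  ", pearson(predicted, subjective))
    print("SROCC:", spearman(predicted, subjective))
    print("RMSE: ", rmse(predicted, subjective))
```

LCC and RMSE measure how closely the predicted values track the subjective scores, while SROCC depends only on their ordering, which is why the three are usually reported together.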

Availability of data and materials

The source code for this work is available upon request to the corresponding author.

Funding

This work is partially supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under the 2232 Outstanding Researchers program, Project No. 118C301. Research and its contents are solely the authors’ responsibility and do not necessarily represent the official view of the funding organizations. The funders had no role in study design, data analysis, algorithmic design, the decision to publish, or the preparation of the manuscript.

Author information

Corresponding author

Correspondence to Hassan Imani.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethics approval and consent

Not applicable.

Consent for publication

All authors read and approved the final manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Imani, H., Islam, M.B. Objective Quality Assessment of Stereoscopic Video Using Inflated 3D Features. SN COMPUT. SCI. 5, 799 (2024). https://doi.org/10.1007/s42979-024-03184-7

  • DOI: https://doi.org/10.1007/s42979-024-03184-7
