DOI: 10.1145/3474085.3475183

PUGCQ: A Large Scale Dataset for Quality Assessment of Professional User-Generated Content

Published: 17 October 2021

Abstract

Recent years have witnessed a surge of video services based on professional user-generated content (PUGC), coinciding with the rapid proliferation of video acquisition devices such as mobile phones, wearable cameras, and unmanned aerial vehicles. Unlike traditional UGC videos shot impromptu, PUGC videos are carefully designed and edited by professional users, and they typically enjoy high popularity and large view counts. In this paper, we conduct a systematic, comprehensive study of the perceptual quality of PUGC videos and introduce a database of 10,000 PUGC videos with subjective ratings. In particular, during subjective testing we collect human opinions not only as mean opinion scores (MOS), but also on attributes that potentially influence visual quality, including face, noise, blur, brightness, and color. We analyze this large-scale PUGC database with a series of video quality assessment (VQA) algorithms, and we further present a dedicated baseline model built on a pretrained deep neural network. Cross-dataset experiments reveal a large domain gap between PUGC and traditional user-generated videos, which is critical for learning-based VQA. These results shed light on developing next-generation PUGC quality assessment algorithms with desirable properties, including strong generalization capability, high accuracy, and effectiveness in perceptual optimization. The dataset and code are released at https://github.com/wlkdb/pugcq_create.
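The abstract does not spell out how the VQA algorithms are scored against the subjective ratings, but benchmarks of this kind conventionally report rank correlation (SROCC) and linear correlation (PLCC) between predicted quality and MOS. A minimal sketch of that evaluation step, using synthetic placeholder scores rather than actual PUGCQ data:

```python
# Hedged sketch: scoring a VQA model against subjective MOS with the
# correlation metrics standard in VQA studies (SROCC and PLCC).
# The score lists below are synthetic placeholders, not PUGCQ data.
from scipy.stats import spearmanr, pearsonr

def evaluate_vqa(predicted, mos):
    """Return (SROCC, PLCC) between predicted quality scores and MOS."""
    srocc, _ = spearmanr(predicted, mos)  # rank (monotonicity) agreement
    plcc, _ = pearsonr(predicted, mos)    # linear agreement
    return srocc, plcc

# Example: a model whose scores rank the videos exactly as MOS does.
mos = [2.1, 3.4, 4.0, 1.5, 3.8]
predicted = [2.0, 3.5, 4.2, 1.2, 3.6]
srocc, plcc = evaluate_vqa(predicted, mos)
print(f"SROCC={srocc:.3f}, PLCC={plcc:.3f}")  # SROCC is 1.000 here: ranks agree exactly
```

In practice, published VQA results often also apply a nonlinear (e.g. logistic) mapping of predictions before computing PLCC; the sketch omits that step for brevity.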

Supplementary Material

ZIP File (mfp0147aux.zip)
A PDF file with more details about the PUGCQ dataset.


Cited By

  • (2025) "A Multifaceted Vision of the Human-AI Collaboration: A Comprehensive Review". IEEE Access, 13, 29375-29405. DOI: 10.1109/ACCESS.2025.3536095
  • (2023) "Learning Spatiotemporal Interactions for User-Generated Video Quality Assessment". IEEE Transactions on Circuits and Systems for Video Technology, 33(3), 1031-1042. DOI: 10.1109/TCSVT.2022.3207148

    Published In

    MM '21: Proceedings of the 29th ACM International Conference on Multimedia
    October 2021
    5796 pages
    ISBN:9781450386517
    DOI:10.1145/3474085

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. no-reference video quality assessment
    2. professional user-generated content
    3. video quality assessment

    Qualifiers

    • Research-article

    Conference

MM '21: ACM Multimedia Conference
October 20-24, 2021
Virtual Event, China

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
