DOI: 10.1145/2502081.2508119

Towards a comprehensive computational model for aesthetic assessment of videos

Published: 21 October 2013

Abstract

In this paper we propose a novel aesthetic model that emphasizes psycho-visual statistics extracted at multiple levels, in contrast to earlier approaches that rely only on descriptors suited to image recognition or derived from photographic principles. At the lowest level, we compute dark-channel, sharpness, and eye-sensitivity statistics over rectangular cells within a frame. At the next level, we extract Sentibank features (1,200 pre-trained visual classifiers) on a given frame; these classifiers respond to sentiment-evoking concepts such as "colorful clouds" or "smiling face", and their responses are collected as frame-level statistics. At the topmost level, we extract trajectories from video shots; using viewers' fixation priors, the trajectories are labeled as foreground or background/camera, and statistics are computed on each group. Additionally, spatio-temporal local binary patterns are computed to capture texture variation within a shot. Classifiers are trained independently on each feature representation. After a thorough evaluation of 9 feature types, we select the best feature from each level -- dark channel, affect, and camera motion statistics -- and integrate the corresponding classifier scores in a low-rank late fusion framework to improve the final prediction. Our approach shows strong correlation with human assessments on 1,000 broadcast-quality videos released by NHK as an aesthetic evaluation dataset.
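As a concrete illustration of the lowest-level features, the following is a minimal sketch of computing dark-channel statistics over rectangular cells of a single frame, assuming the standard min-over-channels, local-minimum-filter definition of the dark channel; the patch size, the cell grid, and the choice of the cell mean as the statistic are illustrative assumptions, not values taken from the paper.

import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel_cell_stats(frame, patch=15, grid=(8, 8)):
    """Per-cell dark-channel statistics for one RGB frame.

    frame : H x W x 3 float array with values in [0, 1].
    Returns a grid[0] * grid[1] feature vector of cell means.
    Patch size and grid resolution are illustrative, not from the paper.
    """
    # Dark channel: minimum over color channels, then a local minimum filter.
    dark = minimum_filter(frame.min(axis=2), size=patch)

    # Partition the frame into rectangular cells and take the mean per cell.
    h, w = dark.shape
    ys = np.linspace(0, h, grid[0] + 1, dtype=int)
    xs = np.linspace(0, w, grid[1] + 1, dtype=int)
    feats = [dark[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
             for i in range(grid[0]) for j in range(grid[1])]
    return np.asarray(feats)

Per the abstract, the sharpness and eye-sensitivity statistics are likewise pooled over rectangular cells; pooling them over the same grid would be the analogous construction.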

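The final step combines classifier scores from the selected features (dark channel, affect, and camera motion) with low-rank late fusion. The snippet below is only a simplified low-rank consensus sketch over a stacked score matrix -- a truncated SVD that keeps the component the individual classifiers agree on -- and does not reproduce the rank-minimization formulation the paper builds on; the matrix layout, the per-column normalization, and the rank-1 choice are assumptions.

import numpy as np

def low_rank_fuse(scores, rank=1):
    """Fuse per-classifier scores by low-rank consensus.

    scores : N x K array; column k holds classifier k's scores for N videos
             (assumed to be normalized per column, e.g. z-scored).
    Returns one fused score per video: the row mean of a rank-`rank`
    reconstruction, which keeps the shared component of the score columns
    and suppresses classifier-specific deviations.
    """
    U, s, Vt = np.linalg.svd(scores, full_matrices=False)
    low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    return low_rank.mean(axis=1)

In this simplified view, the low-rank part plays the role of the consensus across the feature-specific classifiers, while the discarded residual corresponds to the classifier-specific noise that the fusion is meant to suppress.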



    Published In

    MM '13: Proceedings of the 21st ACM international conference on Multimedia
    October 2013
    1166 pages
ISBN: 9781450324045
DOI: 10.1145/2502081
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. affect features
    2. camera motion features
    3. cinematography
    4. low rank late fusion
    5. video aesthetics

    Qualifiers

    • Research-article

    Conference

MM '13: ACM Multimedia Conference
October 21-25, 2013
Barcelona, Spain

    Acceptance Rates

MM '13 paper acceptance rate: 47 of 235 submissions (20%)
Overall acceptance rate: 2,145 of 8,556 submissions (25%)

    Article Metrics

    • Downloads (last 12 months): 15
    • Downloads (last 6 weeks): 6
    Reflects downloads up to 20 Feb 2025

    Citations

    Cited By

    • (2022) Image Aesthetic Assessment: A Comparative Study of Hand-Crafted & Deep Learning Models. IEEE Access, 10:101770-101789. DOI: 10.1109/ACCESS.2022.3209196. Online publication date: 2022.
    • (2022) MVVA-Net: a Video Aesthetic Quality Assessment Network with Cognitive Fusion of Multi-type Feature–Based Strong Generalization. Cognitive Computation, 14(4):1435-1445. DOI: 10.1007/s12559-021-09947-1. Online publication date: 12-Mar-2022.
    • (2021) Advances and Challenges in Computational Image Aesthetics. In Human Perception of Visual Information, pages 133-181. DOI: 10.1007/978-3-030-81465-6_6. Online publication date: 22-Jul-2021.
    • (2020) Exploring Personal Memories and Video Content as Context for Facial Behavior in Predictions of Video-Induced Emotions. In Proceedings of the 2020 International Conference on Multimodal Interaction, pages 153-162. DOI: 10.1145/3382507.3418814. Online publication date: 21-Oct-2020.
    • (2020) Objectivity and Subjectivity in Aesthetic Quality Assessment of Digital Photographs. IEEE Transactions on Affective Computing, 11(3):493-506. DOI: 10.1109/TAFFC.2018.2809752. Online publication date: 1-Jul-2020.
    • (2020) Predicting the popularity of micro-videos via a feature-discrimination transductive model. Multimedia Systems, 26(5):519-534. DOI: 10.1007/s00530-020-00660-x. Online publication date: 14-Jun-2020.
    • (2020) References. In Computational Models for Cognitive Vision, pages 187-213. DOI: 10.1002/9781119527886.refs. Online publication date: 6-Jul-2020.
    • (2019) Analysis and Importance of Deep Learning for Video Aesthetic Assessments. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, pages 546-554. DOI: 10.32628/CSEIT1951100. Online publication date: 20-Feb-2019.
    • (2019) Multimodal Learning toward Micro-Video Understanding. Synthesis Lectures on Image, Video, and Multimedia Processing, 9(4):1-186. DOI: 10.2200/S00938ED1V01Y201907IVM020. Online publication date: 17-Sep-2019.
    • (2019) Detecting Mood of Aesthetically Pleasing Videos Using Deep Learning. In 2019 Global Conference for Advancement in Technology (GCAT), pages 1-5. DOI: 10.1109/GCAT47503.2019.8978327. Online publication date: Oct-2019.
