Abstract:
Image Aesthetic Assessment (IAA) is a challenging task that is closely tied to human aesthetic experience. In this paper, inspired by top-down guidance and the visual attention mechanism, we propose a Top-Down Guidance based ViT-CNN network that incorporates theme information. Because global information guides the interpretation of local information, we construct a two-stream network consisting of a Vision Transformer (ViT) stream and a Convolutional Neural Network (CNN) stream. A global and local feature attention guidance module (GLFAGM) is proposed to better realize the guidance from the global features of the ViT stream down to the local features of the CNN stream. In addition, given the importance of theme information, the proposed network uses more comprehensive theme information as auxiliary input for aesthetic assessment. To make better use of this information, an attention-based theme feature fusion module (ATFFM) is proposed to integrate theme features with the visual features from the CNN stream. Experimental results show that the proposed method outperforms several state-of-the-art methods.
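The abstract does not specify how GLFAGM computes its guidance, so the following is only a minimal sketch of the general idea of letting a global feature reweight local features. It assumes a single global vector (standing in for a ViT output) and a short list of local vectors (standing in for CNN spatial features), and it uses scaled dot-product similarity followed by a softmax. The function names `softmax` and `guide_local_with_global` and the toy values are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def guide_local_with_global(global_feat, local_feats):
    """Reweight local features by their similarity to one global feature.

    Hypothetical illustration of top-down (global-to-local) guidance;
    this is an assumption, not the paper's GLFAGM.
    """
    d = len(global_feat)
    # Scaled dot-product similarity between the global vector and each local vector.
    scores = [
        sum(g * x for g, x in zip(global_feat, loc)) / math.sqrt(d)
        for loc in local_feats
    ]
    # Attention weights over local positions; they sum to 1.
    weights = softmax(scores)
    # Each local vector is scaled by its attention weight.
    guided = [[w * x for x in loc] for w, loc in zip(weights, local_feats)]
    return weights, guided


# Toy inputs (assumed values): one 2-D global feature, three 2-D local features.
global_feat = [1.0, 0.0]
local_feats = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
weights, guided = guide_local_with_global(global_feat, local_feats)
```

In this toy setup, the local feature most aligned with the global feature receives the largest weight. A learned module would normally apply trainable projections before the similarity step, which this sketch omits.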
Date of Conference: 15-19 July 2024
Date Added to IEEE Xplore: 30 September 2024