Abstract:
Assessing image aesthetics requires a multi-level aesthetics representation and a comprehensive aesthetics vision. Providing both fine-grained and multi-scale information is therefore of great significance in aesthetics assessment. Using multi-scale information to process image aesthetics features, with multiple patches as input, has become a common approach in image aesthetics assessment (IAA). In this paper, we propose a multi-scale and multi-patch aggregation network based on dual-column vision fusion (MMANet) for IAA. First, instead of random cropping, we propose visual saliency guided cropping (VS cropping) to obtain multiple patches. Second, to effectively capture the characteristics between patches, we propose a multi-scale aesthetics patch fusion attention (MAPA) module based on the human visual stereoscopic imaging mechanism. In addition, we propose a multi-patch feature aggregation (MFA) module that further fuses the features of the multi-patch information for IAA. Extensive experiments on three public IAA databases demonstrate the superiority of the proposed MMANet model over state-of-the-art methods.
Date of Conference: 15-19 July 2024
Date Added to IEEE Xplore: 30 September 2024