28 December 2021 Graph attention mechanism with global contextual information for multi-label image recognition
Xiaoxiao Ban, Peihua Li, Qilong Wang, Shoujun Zhou, Shijie Guo, Yuanquan Wang
Abstract

Recent work has shown that multi-label image recognition remains a challenging task in computer vision because of the complexity and diversity of multi-label images. However, existing methods ignore the co-occurrence correlation and the global contextual information between the image space and objects. We present a model that addresses these problems. On the one hand, we devise a graph attention mechanism to compute the hidden representations of the different categories in multi-label images. It assigns different weights to different neighboring objects and models the label dependency well. On the other hand, we iterate the global contextual information through second-order covariance pooling to enhance the nonlinear modeling capability, and we use a basic residual network to extract features. The proposed model is thoroughly evaluated on the PASCAL VOC 2007 and MS-COCO datasets. Compared with the classical ML-GCN, the model combines image features and label embeddings more effectively. Experiments also show that it outperforms state-of-the-art methods such as the residual multi-layer perceptron, EfficientNet, and the vision transformer.
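The abstract names two building blocks, graph attention over label nodes and second-order covariance pooling of image features, without giving their exact form. The following is a minimal NumPy sketch of both under stated assumptions, not the authors' implementation: the attention layer follows the common single-head GAT formulation, the adjacency matrix `A` is assumed to be a binary label co-occurrence graph with self-loops, and `covariance_pooling` computes only the plain sample covariance (the iteration mentioned in the abstract is not specified there and is omitted). All function names, shapes, and the random inputs are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def graph_attention(H, A, W, a, slope=0.2):
    """Single-head graph attention layer (assumed GAT-style form).

    H: (C, F) label embeddings, one row per category.
    A: (C, C) binary adjacency with self-loops (assumed co-occurrence graph).
    W: (F, F_out) shared linear projection.
    a: (2 * F_out,) attention vector.
    Returns the updated node features (C, F_out) and the attention matrix (C, C).
    """
    Z = H @ W
    f_out = Z.shape[1]
    # e_ij = LeakyReLU(a_src . z_i + a_dst . z_j), computed for all pairs at once.
    e = (Z @ a[:f_out])[:, None] + (Z @ a[f_out:])[None, :]
    e = np.where(e > 0, e, slope * e)
    # Mask non-neighbors so each node attends only to its neighbors.
    e = np.where(A > 0, e, -np.inf)
    alpha = softmax(e, axis=1)
    return alpha @ Z, alpha


def covariance_pooling(X):
    """Second-order pooling: covariance of D channels over N spatial positions.

    X: (N, D) feature map flattened over space. Returns a (D, D) matrix.
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    return Xc.T @ Xc / X.shape[0]


# Toy usage with random data (illustrative values only).
rng = np.random.default_rng(0)
C, F, F_OUT = 4, 8, 5
H = rng.normal(size=(C, F))
A = np.eye(C)
A[0, 1] = A[1, 0] = 1.0  # categories 0 and 1 co-occur
A[2, 3] = A[3, 2] = 1.0  # categories 2 and 3 co-occur
W = rng.normal(size=(F, F_OUT))
a = rng.normal(size=2 * F_OUT)

H_new, alpha = graph_attention(H, A, W, a)
S = covariance_pooling(rng.normal(size=(49, 16)))  # e.g. a 7x7 map, 16 channels
```

In this sketch each row of `alpha` sums to one and is zero outside a node's neighborhood, which is the sense in which attention gives different weights to different neighbor objects. How the model combines the label representations with the pooled image statistics is not described in the abstract, so that step is left out here.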

© 2021 SPIE and IS&T 1017-9909/2021/$28.00
Xiaoxiao Ban, Peihua Li, Qilong Wang, Shoujun Zhou, Shijie Guo, and Yuanquan Wang "Graph attention mechanism with global contextual information for multi-label image recognition," Journal of Electronic Imaging 30(6), 063031 (28 December 2021). https://doi.org/10.1117/1.JEI.30.6.063031
Received: 4 August 2021; Accepted: 8 December 2021; Published: 28 December 2021
KEYWORDS
Performance modeling
Visual process modeling
Chromium
Convolution
Data modeling
Feature extraction
Detection and tracking algorithms
