Abstract:
In this paper, each image is viewed as a bag of local regions, as well as it is investigated globally. A novel method is developed for achieving multi-label multi-instanc...Show MoreMetadata
Abstract:
In this paper, each image is viewed as a bag of local regions, as well as it is investigated globally. A novel method is developed for achieving multi-label multi-instance image annotation, where image-level (bag-level) labels and region-level (instance-level) labels are both obtained. The associations between semantic concepts and visual features are mined both at the image level and at the region level. Inter-label correlations are captured by a co-occurence matrix of concept pairs. The cross-level label coherence encodes the consistency between the labels at the image level and the labels at the region level. The associations between visual features and semantic concepts, the correlations among the multiple labels, and the cross-level label coherence are sufficiently leveraged to improve annotation performance. Structural max-margin technique is used to formulate the proposed model and multiple interrelated classifiers are learned jointly. To leverage the available image-level labeled samples for the model training, the region-level label identification on the training set is firstly accomplished by building the correspondences between the multiple bag-level labels and the image regions. JEC distance based kernels are employed to measure the similarities both between images and between regions. Experimental results on real image datasets MSRC and Corel demonstrate the effectiveness of our method.
Published in: 2011 International Conference on Computer Vision
Date of Conference: 06-13 November 2011
Date Added to IEEE Xplore: 12 January 2012
ISBN Information: