Abstract:
This paper presents a new semantic sparse recoding method to generate more descriptive and robust representation of visual content for image applications. Although the vi...Show MoreMetadata
Abstract:
This paper presents a new semantic sparse recoding method to generate more descriptive and robust representation of visual content for image applications. Although the visual bag-of-words (BOW) representation has been reported to achieve promising results in different image applications, its visual codebook is completely learnt from low-level visual features using quantization techniques and thus the so-called semantic gap remains unbridgeable. To handle such challenging issue, we utilize the annotations (predicted by algorithms or shared by users) of all the images to improve the original visual BOW representation. This is further formulated as a sparse coding problem so that the noise issue induced by the inaccurate quantization of visual features can also be handled to some extent. By developing an efficient sparse coding algorithm, we successfully generate a new visual BOW representation for image applications. Since such sparse coding has actually incorporated the high-level semantic information into the original visual codebook, we thus consider it as semantic sparse recoding of the visual content. Finally, we apply our semantic sparse recoding method to automatic image annotation and social image classification. The experimental results on several benchmark datasets show the promising performance of our semantic sparse recoding method in these two image applications.
Published in: IEEE Transactions on Image Processing ( Volume: 24, Issue: 1, January 2015)