Matryoshka Peek: Toward Learning Fine-Grained, Robust, Discriminative Features for Product Search


Abstract:

In sharp contrast to traditional category/subcategory-level image retrieval, product image search aims to find images containing the exact same product. This is a challenging problem because, in addition to being robust under different imaging conditions such as varying viewpoints and illumination changes, the features must also distinguish the specific product among many similar products. Consequently, it is important to use a large dataset containing many product classes to learn a strongly discriminative representation. Building such a dataset requires laborious manual annotation. Toward learning fine-grained, robust, discriminative features for product image search, we present a novel paradigm that can construct the required dataset without any human annotation. Unlike other fine-grained recognition works, which rely on high-quality annotated datasets and are narrowly focused on a specific object category, our method handles multiple object classes and requires minimal human effort. First, an ImageNet pretrained model is used to generate product clusters. Because the original ImageNet features are not discriminative, the clusters generated by this unsupervised procedure contain considerable noise. We mitigate this by explicitly modeling the noise distribution and automatically detecting errors during learning. The proposed paradigm is general, requires minimal human effort, and is applicable to any deep learning task where fine-grained discriminative features are desired. Extensive experiments on the ALISC dataset demonstrate that our approach is sound and effective, surpassing the baseline GoogLeNet model by 15.09%.
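The cluster-generation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm is used, so the greedy cosine-similarity grouping below is an assumption chosen for brevity, not the authors' procedure. The function names (`cosine`, `pseudo_label`), the `threshold` value, and the toy feature vectors are all hypothetical; in practice the vectors would be embeddings from a pretrained network.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def pseudo_label(features, threshold=0.9):
    """Greedily group feature vectors; returns one cluster id per vector.

    Assumption: each vector joins the first existing cluster whose seed
    (its first member) is at least `threshold` similar; otherwise it
    starts a new cluster. This stands in for whatever clustering the
    paper actually uses.
    """
    seeds = []   # one representative vector per cluster
    labels = []  # cluster id assigned to each input vector
    for f in features:
        for cid, seed in enumerate(seeds):
            if cosine(f, seed) >= threshold:
                labels.append(cid)
                break
        else:
            # No sufficiently similar cluster found: open a new one.
            seeds.append(f)
            labels.append(len(seeds) - 1)
    return labels

# Toy stand-ins for pretrained-network embeddings (hypothetical values).
features = [
    [1.0, 0.0, 0.1],
    [0.9, 0.1, 0.1],
    [0.0, 1.0, 0.0],
    [0.1, 0.9, 0.0],
]
labels = pseudo_label(features)
```

The resulting cluster ids could serve as noisy training labels; because generic features may merge visually similar but distinct products, such labels would still need the kind of noise handling the abstract mentions.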
Published in: IEEE Transactions on Multimedia (Volume: 19, Issue: 6, June 2017)
Page(s): 1272 - 1284
Date of Publication: 18 January 2017

