ABSTRACT
We propose a beauty product image retrieval method based on multi-feature fusion and feature aggregation. The key idea is to represent each image by a feature vector obtained through multi-feature fusion and feature aggregation. VGG16 and ResNet50 are used to extract image features, and CroW (cross-dimensional weighting) is adopted to aggregate the deep features. Drawing on transfer learning, we fine-tune VGG16 on the Perfect-500K data set to improve retrieval performance. The proposed method won third prize in the Perfect Corp. Challenge 2018, with a best result of 0.270676 mAP. We released our code on GitHub: https://github.com/wangqi12332155/ACMMM-beauty-AI-challenge.
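The aggregation and fusion steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' released code: the function names `crow_aggregate` and `fuse`, the fusion by concatenation, the random stand-in feature maps, and the omission of PCA whitening are all assumptions made for illustration. The weighting follows the usual description of CroW, with spatial weights taken from the channel-summed activation map and channel weights taken from per-channel sparsity.

```python
import numpy as np


def l2_normalize(v, eps=1e-12):
    """Scale a vector to (approximately) unit Euclidean length."""
    return v / (np.linalg.norm(v) + eps)


def crow_aggregate(fmap, eps=1e-8):
    """CroW-style pooling of one conv feature map (sketch, no PCA whitening).

    fmap: array of shape (C, H, W) holding non-negative (post-ReLU) activations.
    Returns an L2-normalized descriptor of length C.
    """
    channels = fmap.shape[0]

    # Spatial weights: channel-summed response, L2-normalized, then square-rooted.
    s = fmap.sum(axis=0)
    s = np.sqrt(s / (np.linalg.norm(s) + eps))

    # Channel weights: channels that fire rarely get a larger weight (IDF-like).
    q = (fmap > 0).reshape(channels, -1).mean(axis=1)
    w = np.log((channels * eps + q.sum()) / (eps + q))

    # Spatially weighted sum-pooling, then per-channel reweighting.
    pooled = (fmap * s[None, :, :]).sum(axis=(1, 2))
    return l2_normalize(w * pooled)


def fuse(descriptors):
    """Assumed fusion rule: concatenate unit-length descriptors and renormalize."""
    return l2_normalize(np.concatenate([l2_normalize(d) for d in descriptors]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-ins for last-conv activations of VGG16 (512 channels)
    # and ResNet50 (2048 channels) on one image; real use would take them from the networks.
    vgg_map = np.maximum(rng.standard_normal((512, 7, 7)), 0.0)
    res_map = np.maximum(rng.standard_normal((2048, 7, 7)), 0.0)

    query = fuse([crow_aggregate(vgg_map), crow_aggregate(res_map)])
    print(query.shape)  # descriptor length is 512 + 2048
```

Because the fused descriptors have unit length, a dot product between a query descriptor and a database descriptor gives their cosine similarity, which is one common way to rank retrieval candidates.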