research-article

Beauty Product Retrieval Based on Regional Maximum Activation of Convolutions with Generalized Attention

Authors:

Lingyun Yu

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

Pages 2553 - 2557

https://doi.org/10.1145/3343031.3356065

Published: 15 October 2019

Abstract

Beauty and personal care product retrieval has attracted growing research attention because of its practical value. The task remains challenging, however, because of data variation and complex backgrounds. In this paper, we propose a novel Generalized-attention Regional Maximal Activation of Convolutions (GRMAC) descriptor for generating image features for retrieval. The method introduces an attention mechanism that reduces the influence of cluttered backgrounds and highlights the target, which makes the features more effective and improves retrieval performance. Unlike other attention-based methods, our method supports adjusting the attention mask with a hyperparameter p, which makes it more flexible and accurate in real applications. To demonstrate its effectiveness, we conduct experiments on Perfect-500K, a dataset containing more than half a million personal care products, and obtain remarkable results. We further fuse features from multiple models for additional improvement. Our team (USTC_NELSLIP) ranked 1st in the Grand Challenge of AI Meets Beauty at ACM Multimedia 2019 with a MAP score of 0.408614. Our code is available at: https://github.com/gniknoil/Perfect500K-Beauty-and-Personal-Care-Products-Retrieval-Challenge
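
The abstract does not say how the hyperparameter p acts on the attention mask, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the mask is the channel-summed activation map, min-max normalized to [0, 1] and raised to the power p, and it replaces the overlapping multi-scale regions of R-MAC with a plain grid. The names `attention_mask`, `grid_regions`, and `grmac_sketch` are hypothetical.

```python
import numpy as np


def attention_mask(fmap, p):
    """Assumed mask: channel-summed activations, scaled to [0, 1], raised to p.

    fmap has shape (C, H, W). With p = 0 the mask is all ones (no attention);
    larger p suppresses weakly activated locations more strongly.
    """
    saliency = fmap.sum(axis=0)
    saliency = saliency - saliency.min()
    peak = saliency.max()
    if peak > 0:
        saliency = saliency / peak
    return saliency ** p


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    return v / (np.linalg.norm(v) + eps)


def grid_regions(h, w, levels):
    """Simplified region sampling: level l splits the map into l x l cells.

    Assumes h >= levels and w >= levels so that no cell is empty.
    """
    boxes = []
    for level in range(1, levels + 1):
        ys = np.linspace(0, h, level + 1).astype(int)
        xs = np.linspace(0, w, level + 1).astype(int)
        for i in range(level):
            for j in range(level):
                boxes.append((ys[i], ys[i + 1], xs[j], xs[j + 1]))
    return boxes


def grmac_sketch(fmap, p=1.0, levels=2):
    """Attention-weighted regional max pooling over a (C, H, W) feature map."""
    channels, h, w = fmap.shape
    weighted = fmap * attention_mask(fmap, p)[None, :, :]
    descriptor = np.zeros(channels)
    for y0, y1, x0, x1 in grid_regions(h, w, levels):
        # Max-pool each channel inside the region, then normalize and sum.
        region_vec = weighted[:, y0:y1, x0:x1].max(axis=(1, 2))
        descriptor += l2_normalize(region_vec)
    return l2_normalize(descriptor)
```

Under these assumptions, calling `grmac_sketch(fmap, p=2.0)` on the last convolutional feature map of a backbone returns a unit-length descriptor with one value per channel, and setting `p=0` falls back to unweighted regional max pooling.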

Index Terms

Beauty Product Retrieval Based on Regional Maximum Activation of Convolutions with Generalized Attention
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

October 2019

2794 pages

ISBN:9781450368896

DOI:10.1145/3343031

General Chairs:
Laurent Amsaleg (CNRS-IRISA, France), Benoit Huet (EURECOM, France), Martha Larson (Radboud University and TU Delft, Netherlands)

Program Chairs:
Guillaume Gravier (CNRS-IRISA, France), Hayley Hung (Delft University of Technology, Netherlands), Chong-Wah Ngo (City University of Hong Kong, Hong Kong), Wei Tsang Ooi (National University of Singapore, Singapore)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
Anhui Provincial Natural Science Foundation
Fundamental Research Funds for the Central Universities

Conference

MM '19

Sponsor: SIGMM

MM '19: The 27th ACM International Conference on Multimedia

October 21 - 25, 2019

Nice, France

Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Cited By

Hou J, Lin W, Yue G, Liu W, Zhao B. (2023). Interaction-Matrix Based Personalized Image Aesthetics Assessment. IEEE Transactions on Multimedia 25 (5263-5278). Online publication date: 1-Jan-2023.
https://dl.acm.org/doi/10.1109/TMM.2022.3189276

Yang Z, Liu H, Yuan Y, Hou Y, Fu B, Wang R, Zhang L, Jiang T. (2023). Research on image retrieval based on artificial intelligence technology. 2023 International Conference on Image Processing, Computer Vision and Machine Learning (ICICML) (1067-1071). Online publication date: 3-Nov-2023.
https://doi.org/10.1109/ICICML60161.2023.10424901

Cools A, Belarbi M, Mahmoudi S. (2022). A Comparative Study of Reduction Methods Applied on a Convolutional Neural Network. Electronics 11:9 (1422). Online publication date: 28-Apr-2022.
https://doi.org/10.3390/electronics11091422

Hou J, Ding H, Lin W, Liu W, Fang Y. (2022). Distilling Knowledge From Object Classification to Aesthetics Assessment. IEEE Transactions on Circuits and Systems for Video Technology 32:11 (7386-7402). Online publication date: Nov-2022.
https://doi.org/10.1109/TCSVT.2022.3186307

Dhingra N, Jain S. (2022). Language-Attention Modular-Network for Relational Referring Expression Comprehension in Videos. 2022 26th International Conference on Pattern Recognition (ICPR) (4103-4110). Online publication date: 21-Aug-2022.
https://doi.org/10.1109/ICPR56361.2022.9956465

Wang X, Ma L, Fu Y, Xue X, Cheng W, Kankanhalli M, Wang M, Chu W, Liu J, Worring M. (2021). Neural Symbolic Representation Learning for Image Captioning. Proceedings of the 2021 International Conference on Multimedia Retrieval (312-321). Online publication date: 24-Aug-2021.
https://dl.acm.org/doi/10.1145/3460426.3463637

Xu K, Liu Y, Feng M, Zhao J, Wang H, Cai H, Wen Chen C, Cucchiara R, Hua X, Qi G, Ricci E, Zhang Z, Zimmermann R. (2020). Multi-Scale Generalized Attention-Based Regional Maximum Activation of Convolutions for Beauty Product Retrieval. Proceedings of the 28th ACM International Conference on Multimedia (4733-4737). Online publication date: 12-Oct-2020.
https://dl.acm.org/doi/10.1145/3394171.3416293

Yu J, Xie G, Li M, Xie H, Hao X, Gao F, Shuang F, Wen Chen C, Cucchiara R, Hua X, Qi G, Ricci E, Zhang Z, Zimmermann R. (2020). Attention Based Beauty Product Retrieval Using Global and Local Descriptors. Proceedings of the 28th ACM International Conference on Multimedia (4708-4712). Online publication date: 12-Oct-2020.
https://dl.acm.org/doi/10.1145/3394171.3416289

Vu T, Dang A, Wang J, Wen Chen C, Cucchiara R, Hua X, Qi G, Ricci E, Zhang Z, Zimmermann R. (2020). Learning to Remember Beauty Products. Proceedings of the 28th ACM International Conference on Multimedia (4728-4732). Online publication date: 12-Oct-2020.
https://dl.acm.org/doi/10.1145/3394171.3416281

Hou J, Ji S, Wang A, Wen Chen C, Cucchiara R, Hua X, Qi G, Ricci E, Zhang Z, Zimmermann R. (2020). Attention-driven Unsupervised Image Retrieval for Beauty Products with Visual and Textual Clues. Proceedings of the 28th ACM International Conference on Multimedia (4718-4722). Online publication date: 12-Oct-2020.
https://dl.acm.org/doi/10.1145/3394171.3416271
