DOI: 10.1145/3343031.3356065

Beauty Product Retrieval Based on Regional Maximum Activation of Convolutions with Generalized Attention

Published: 15 October 2019

Abstract

Beauty and personal care product retrieval has attracted growing research attention because of its practical value. The task remains challenging, however, because of data variations and complex backgrounds. In this paper, we propose a novel Generalized-attention Regional Maximal Activation of Convolutions (GRMAC) descriptor for generating image features for retrieval. The method introduces an attention mechanism that reduces the influence of cluttered backgrounds and highlights the target, which makes the features more effective and improves retrieval performance. Unlike other attention-based methods, our method supports adjusting the attention mask with a hyperparameter p, which makes it more flexible and accurate in real applications. To demonstrate its effectiveness, we conduct experiments on a dataset containing more than half a million personal care products (Perfect-500K) and obtain remarkable results. Furthermore, we fuse multiple features from different models for further improvement. Finally, our team (USTC_NELSLIP) ranked 1st in the Grand Challenge of AI Meets Beauty at ACM Multimedia 2019 with a MAP score of 0.408614. Our code is available at: https://github.com/gniknoil/Perfect500K-Beauty-and-Personal-Care-Products-Retrieval-Challenge
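The abstract does not give the GRMAC formula, so the following is only a minimal NumPy sketch of the general idea: weight a convolutional feature map by an attention mask whose sharpness is set by p, max-pool over multi-scale regions, and fuse descriptors from several models. The mask definition (channel-summed activations scaled to [0, 1] and raised to the power p), the simplified region grid, the concatenation-based fusion, and all function names are assumptions made for illustration. They are not taken from the paper or its repository.

```python
import numpy as np


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    return v / (np.linalg.norm(v) + eps)


def attention_mask(fmap, p=1.0):
    """Assumed mask: channel-summed activations, scaled to [0, 1], to the power p.

    fmap is a (C, H, W) array of non-negative activations. p = 0 yields an
    all-ones mask (no attention); larger p suppresses weak locations more.
    """
    s = fmap.sum(axis=0)
    s = s - s.min()
    peak = s.max()
    if peak <= 0:  # flat map: nothing to emphasize
        return np.ones_like(s)
    return (s / peak) ** p


def square_regions(h, w, levels=3):
    """Simplified multi-scale grid of square regions as (y, x, size) tuples."""
    regions = []
    for level in range(1, levels + 1):
        size = int(np.ceil(2 * min(h, w) / (level + 1)))
        ys = np.unique(np.linspace(0, h - size, level).round().astype(int))
        xs = np.unique(np.linspace(0, w - size, level).round().astype(int))
        regions.extend((y, x, size) for y in ys for x in xs)
    return regions


def attention_rmac(fmap, p=1.0, levels=3):
    """Attention-weighted regional max pooling (illustrative, not the paper's exact method)."""
    weighted = fmap * attention_mask(fmap, p)[None, :, :]
    agg = np.zeros(fmap.shape[0])
    for y, x, size in square_regions(fmap.shape[1], fmap.shape[2], levels):
        # Max activation per channel inside the region, normalized before summing.
        agg += l2_normalize(weighted[:, y:y + size, x:x + size].max(axis=(1, 2)))
    return l2_normalize(agg)


def fuse(descriptors):
    """Assumed fusion: concatenate unit-normalized descriptors, then renormalize."""
    return l2_normalize(np.concatenate([l2_normalize(d) for d in descriptors]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap_a = rng.random((512, 16, 12))   # stand-in for one model's feature map
    fmap_b = rng.random((256, 8, 6))     # stand-in for a second model
    fused = fuse([attention_rmac(fmap_a, p=2.0), attention_rmac(fmap_b, p=2.0)])
    print(fused.shape)
```

In this sketch, p = 0 reduces the descriptor to plain regional max pooling, while larger values of p concentrate the pooling on the most strongly activated locations. That is one way to read the abstract's claim that the mask is adjustable.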




Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia
October 2019
2794 pages
ISBN: 9781450368896
DOI: 10.1145/3343031

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. attention mechanism
  2. feature fusion
  3. image retrieval

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Anhui Provincial Natural Science Foundation
  • Fundamental Research Funds for the Central Universities

Conference

MM '19
Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%


Cited By

  • (2023) Interaction-Matrix Based Personalized Image Aesthetics Assessment. IEEE Transactions on Multimedia 25 (5263-5278). DOI: 10.1109/TMM.2022.3189276. Online publication date: 1-Jan-2023
  • (2023) Research on image retrieval based on artificial intelligence technology. 2023 International Conference on Image Processing, Computer Vision and Machine Learning (ICICML) (1067-1071). DOI: 10.1109/ICICML60161.2023.10424901. Online publication date: 3-Nov-2023
  • (2022) A Comparative Study of Reduction Methods Applied on a Convolutional Neural Network. Electronics 11:9 (1422). DOI: 10.3390/electronics11091422. Online publication date: 28-Apr-2022
  • (2022) Distilling Knowledge From Object Classification to Aesthetics Assessment. IEEE Transactions on Circuits and Systems for Video Technology 32:11 (7386-7402). DOI: 10.1109/TCSVT.2022.3186307. Online publication date: Nov-2022
  • (2022) Language-Attention Modular-Network for Relational Referring Expression Comprehension in Videos. 2022 26th International Conference on Pattern Recognition (ICPR) (4103-4110). DOI: 10.1109/ICPR56361.2022.9956465. Online publication date: 21-Aug-2022
  • (2021) Neural Symbolic Representation Learning for Image Captioning. Proceedings of the 2021 International Conference on Multimedia Retrieval (312-321). DOI: 10.1145/3460426.3463637. Online publication date: 24-Aug-2021
  • (2020) Multi-Scale Generalized Attention-Based Regional Maximum Activation of Convolutions for Beauty Product Retrieval. Proceedings of the 28th ACM International Conference on Multimedia (4733-4737). DOI: 10.1145/3394171.3416293. Online publication date: 12-Oct-2020
  • (2020) Attention Based Beauty Product Retrieval Using Global and Local Descriptors. Proceedings of the 28th ACM International Conference on Multimedia (4708-4712). DOI: 10.1145/3394171.3416289. Online publication date: 12-Oct-2020
  • (2020) Learning to Remember Beauty Products. Proceedings of the 28th ACM International Conference on Multimedia (4728-4732). DOI: 10.1145/3394171.3416281. Online publication date: 12-Oct-2020
  • (2020) Attention-driven Unsupervised Image Retrieval for Beauty Products with Visual and Textual Clues. Proceedings of the 28th ACM International Conference on Multimedia (4718-4722). DOI: 10.1145/3394171.3416271. Online publication date: 12-Oct-2020
