short-paper

A Gradient Balancing Approach for Robust Logo Detection

Author:
Fuxing Leng

Huazhong University of Science and Technology & ByteDance, Wuhan, China

MM '21: Proceedings of the 29th ACM International Conference on Multimedia, October 2021, Pages 4765–4769
https://doi.org/10.1145/3474085.3479201

Published: 17 October 2021

ABSTRACT

This paper presents the first-place solution to the ACM MM2021 Robust Logo Detection Grand Challenge. We build our end-to-end solution on Cascade R-CNN with a Res2Net101 backbone. During training, we observe that model performance is limited by imbalanced gradients across the classes of the long-tailed dataset. To address this, we adopt a gradient balancing approach that reweights the gradients of each class to steer training toward a balance among all classes. In addition, we design a series of data augmentation policies and propose a progressive data augmentation strategy so that the model can handle adversarial samples. We demonstrate the accuracy and robustness of our method by achieving 70.2448 mAP on leaderboard A and 63.8793 mAP on leaderboard B, which contains adversarial images.
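The abstract does not spell out the weighting rule, so the following is only a minimal sketch of per-class gradient reweighting, not the authors' implementation. It assumes sigmoid-based classification, tracks the accumulated positive and negative gradient magnitude for each class, and maps their ratio to weights through a sigmoid, in the spirit of Equalization Loss v2. The class name `GradientBalancer` and the hyperparameters `gamma`, `mu`, and `alpha` are hypothetical choices made for illustration.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class GradientBalancer:
    """Hypothetical helper that reweights per-class classification gradients."""

    def __init__(self, num_classes, gamma=12.0, mu=0.8, alpha=4.0):
        # gamma, mu, alpha are illustrative assumptions, not values from the paper.
        self.gamma, self.mu, self.alpha = gamma, mu, alpha
        self.pos_grad = [0.0] * num_classes  # accumulated |gradient| from positives
        self.neg_grad = [0.0] * num_classes  # accumulated |gradient| from negatives

    def weights(self):
        pos_w, neg_w = [], []
        for p, n in zip(self.pos_grad, self.neg_grad):
            if p == 0.0 and n == 0.0:
                # No statistics yet: leave the gradients untouched.
                pos_w.append(1.0)
                neg_w.append(1.0)
                continue
            ratio = p / (n + 1e-10)
            q = sigmoid(self.gamma * (ratio - self.mu))
            neg_w.append(q)  # suppress negatives for classes with a low ratio
            pos_w.append(1.0 + self.alpha * (1.0 - q))  # boost their positives
        return pos_w, neg_w

    def step(self, logits, labels):
        """Return reweighted d(loss)/d(logit) and update the accumulators."""
        pos_w, neg_w = self.weights()
        out = []
        for z_row, y_row in zip(logits, labels):
            row = []
            for j, (z, y) in enumerate(zip(z_row, y_row)):
                g = sigmoid(z) - y  # binary cross-entropy gradient w.r.t. the logit
                g *= pos_w[j] if y == 1 else neg_w[j]
                if y == 1:
                    self.pos_grad[j] += abs(g)
                else:
                    self.neg_grad[j] += abs(g)
                row.append(g)
            out.append(row)
        return out


# Toy batch: class 0 is frequent (three positives), class 1 never appears.
balancer = GradientBalancer(num_classes=2)
logits = [[0.0, 0.0]] * 4
labels = [[1, 0], [1, 0], [1, 0], [0, 0]]
first = balancer.step(logits, labels)   # neutral weights on the first call
second = balancer.step(logits, labels)  # class 1 negatives are now damped
pos_w, neg_w = balancer.weights()
```

In this toy run the class that only ever receives negative gradients has those gradients scaled down sharply on the second call, while the frequent class is left almost unchanged. In a real detector the same weights would multiply the classification loss of the detection head rather than hand-computed gradients.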

Index Terms

Computing methodologies → Artificial intelligence → Computer vision → Computer vision problems → Object detection

Published in
MM '21: Proceedings of the 29th ACM International Conference on Multimedia
October 2021
5796 pages
ISBN:9781450386517
DOI:10.1145/3474085
General Chairs:
Heng Tao Shen, University of Electronic Science and Technology of China, China
Yueting Zhuang, Zhejiang University, China
John R. Smith, IBM, USA
Program Chairs:
Yang Yang, University of Electronic Science and Technology of China, China
Pablo Cesar, CWI & TU Delft, The Netherlands
Florian Metze, FACEBOOK, Inc., USA
Balakrishnan Prabhakaran, University of Texas at Dallas, USA
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
adversarial
long-tail
robust logo detection
tiny instances
Qualifiers
- short-paper
Conference

Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%