research-article

A Decoupled Cross-layer Fusion Network with Bidirectional Guidance for Detecting Small Logos

Authors:

Baisong Zhang

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

Article No.: 37, Pages 1 - 8

https://doi.org/10.1145/3595916.3626409

Published: 01 January 2024

Abstract

Logo detection uses machine learning algorithms to recognize and locate logos in images and videos, with applications in a wide range of industries, including e-commerce, advertising, and entertainment. However, detecting small logos remains challenging: they cover few pixels and show unclear details, which leaves insufficient feature information for detection. As a result, small logos are easily confused with complex backgrounds and tolerate less bounding-box perturbation, making them harder to detect than medium- and large-scale logos. To address this problem, we propose a Decoupled Cross-layer Fusion Network (DCFNet) that enhances the feature representation of small logo objects and thereby improves detection performance. Specifically, DCFNet first adopts a bidirectional cross-layer connection mechanism to capture complementary information between different layers. Next, a two-phase feature averaging and enhancement strategy further strengthens the features. In the detection phase, DCFNet decouples the classification and bounding-box regression branches into two identical Fully Connected (FC) heads, improving the accuracy of small logo classification and localization by avoiding mutual interference between the branches. Extensive experiments on three publicly available logo datasets demonstrate that DCFNet achieves state-of-the-art performance in detecting small logos.
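Two points in the abstract can be made concrete with a small sketch. The first part computes how much overlap (IoU) a box loses under the same pixel shift at two sizes, which is one way to read the remark about perturbation tolerance. The second part shows what decoupled heads might look like: two fully connected heads with the same layout but separate weights, both fed the same pooled feature. This is not the authors' implementation. The names `FCHead`, `shifted_iou`, `FEAT_DIM`, `HIDDEN_DIM`, and `NUM_CLASSES`, the layer sizes, and the random weights are all assumptions chosen for illustration.

```python
import random


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def shifted_iou(size, shift):
    """IoU of a size x size box with a copy moved `shift` px along x and y."""
    return iou((0, 0, size, size), (shift, shift, size + shift, size + shift))


# Part 1: the same 4 px localization error costs a small box far more overlap.
small_iou = shifted_iou(16, 4)    # 144 / 368
large_iou = shifted_iou(128, 4)   # 15376 / 17392
print(f"16 px box:  IoU = {small_iou:.3f}")   # 0.391
print(f"128 px box: IoU = {large_iou:.3f}")   # 0.884


class FCHead:
    """Two fully connected layers with a ReLU in between (illustrative only)."""

    def __init__(self, in_dim, hidden_dim, out_dim, rng):
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden_dim)] for _ in range(out_dim)]

    def __call__(self, x):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.w1]
        return [sum(w * v for w, v in zip(row, hidden)) for row in self.w2]


# Part 2: decoupled heads. Sizes below are hypothetical, not from the paper.
FEAT_DIM, HIDDEN_DIM, NUM_CLASSES = 32, 16, 5
rng = random.Random(0)
cls_head = FCHead(FEAT_DIM, HIDDEN_DIM, NUM_CLASSES, rng)  # class scores
reg_head = FCHead(FEAT_DIM, HIDDEN_DIM, 4, rng)            # box offsets

roi_feature = [rng.random() for _ in range(FEAT_DIM)]  # stand-in for a pooled RoI feature
cls_scores = cls_head(roi_feature)
box_deltas = reg_head(roi_feature)
print(len(cls_scores), len(box_deltas))  # 5 4
```

Under the commonly used IoU threshold of 0.5, the shifted 16 px box would no longer count as a match while the shifted 128 px box still would. Because the two heads hold separate weights, an update driven by the classification loss would not touch the regression parameters, which is the kind of interference the decoupling is meant to avoid.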

Index Terms

Computing methodologies → Artificial intelligence → Computer vision → Computer vision problems → Object detection

Published In

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

December 2023

745 pages

ISBN:9798400702051

DOI:10.1145/3595916

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China

Conference

MMAsia '23

Sponsor:

SIGMM

MMAsia '23: ACM Multimedia Asia

December 6 - 8, 2023

Tainan, Taiwan

Acceptance Rates

Overall Acceptance Rate 59 of 204 submissions, 29%
