Adapting Object Detection to Fisheye Cameras: A Knowledge Distillation with Semi-Pseudo-Label Approach

Authors:

Chih-Chung Hsu,

Wei-Hao Huang

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

Article No.: 111, Pages 1 - 6

https://doi.org/10.1145/3595916.3628350

Published: 01 January 2024

Abstract

In this paper, we introduce a lightweight object detection system designed for fisheye cameras and optimized for quick deployment on embedded systems. Given the constraint of training solely on standard images, our method centers on effective knowledge transfer to improve object detection in fisheye scenarios. Integrating the Parallel Residual Bi-Fusion (PRB) Feature Pyramid Network (FPN) with the state-of-the-art YOLOv7 backbone addresses the challenge of detecting the tiny objects that often appear in fisheye images.

Our two-phase training strategy operates as follows. First, a comprehensive Teacher Model is trained on standard images. Second, its knowledge is distilled into a more compact Student Model, with fisheye images serving as pseudo-information so that the model adapts to fisheye environments. By combining knowledge distillation with semi-pseudo-label semi-supervised learning, this strategy aims for strong performance while keeping the design lightweight enough for real-time applications on constrained devices. In summary, our contributions are a specialized object detection framework for fisheye cameras, a novel two-phase training strategy, and the combined use of PRB with YOLOv7. Empirical results support the efficacy of our approach: the model retains a compact footprint without compromising performance, performing well on comparable tasks and offering swift inference.
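
As an illustration of the second phase, the following minimal Python sketch shows one common way to pair a distillation loss with confidence-filtered pseudo-labels. It is not the authors' implementation: the temperature-softened KL loss follows the general formulation of Hinton et al. (2015), and the function names, the temperature of 2.0, the 0.5 confidence threshold, and the toy logits and detections are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened class distributions.

    The T^2 factor is the usual scaling that keeps gradient magnitudes
    comparable across temperatures (an assumption here, not a detail
    taken from the paper).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def select_pseudo_labels(teacher_detections, conf_threshold=0.5):
    """Keep teacher detections on unlabeled fisheye images whose score
    meets the (hypothetical) confidence threshold."""
    return [d for d in teacher_detections if d["score"] >= conf_threshold]


# Toy inputs: class logits for one predicted box, and teacher detections
# on one unlabeled fisheye image. All values are made up.
teacher_logits = [4.0, 1.0, 0.5]
student_logits = [2.5, 1.5, 0.5]
loss = distillation_loss(student_logits, teacher_logits)

detections = [
    {"box": (10, 20, 50, 80), "label": "car", "score": 0.91},
    {"box": (5, 5, 12, 14), "label": "person", "score": 0.32},
]
pseudo = select_pseudo_labels(detections)

print(f"distillation loss: {loss:.4f}")
print(f"kept {len(pseudo)} of {len(detections)} detections as pseudo-labels")
```

In a real detector the distillation term would be computed per prediction and added to the detection losses on the pseudo-labeled fisheye images; the abstract does not state how the paper weights these terms.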

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding
  2. Machine learning

Index terms have been assigned to the content through auto-classification.

Published In

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

December 2023

745 pages

ISBN: 9798400702051

DOI: 10.1145/3595916

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Science and Technology Council

Conference

MMAsia '23

Sponsor:

SIGMM

MMAsia '23: ACM Multimedia Asia

December 6 - 8, 2023

Tainan, Taiwan

Acceptance Rates

Overall Acceptance Rate 59 of 204 submissions, 29%
