DOI: 10.1145/3637684.3637685

Detecting Cars and Their States Utilizing Object Detection

Published: 29 April 2024

Abstract

This study focuses on detecting the cars around a driver's vehicle together with their states, such as "going forward" or "stopping", as inferred from their brake lights. Because brake lights are visual cues, we believe they are amenable to image recognition. The objective of this study is therefore to detect cars together with their states using object detection, which is one image recognition method. Using the Swin Transformer as the detection method, we succeed in detecting a car and its state from an image. In addition, we perform pre-training and network optimization to achieve higher detection accuracy.
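One way to read "detecting cars including their states" is that each state becomes its own detection class, so a single detector pass yields both the box and the state. The sketch below illustrates only that post-processing idea, and it is an assumption rather than the authors' implementation: the class names, the `Detection` type, the `decode_states` helper, the score threshold, and the sample values are all hypothetical, and no detector is actually run.

```python
from dataclasses import dataclass

# Hypothetical label set (an assumption): each car state is its own detection class.
CLASS_NAMES = {0: "car_going_forward", 1: "car_stopping"}


@dataclass
class Detection:
    box: tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
    class_id: int                           # key into CLASS_NAMES
    score: float                            # detector confidence in [0, 1]


def decode_states(detections, score_threshold=0.5):
    """Keep confident detections and split each class name into (object, state)."""
    results = []
    for det in detections:
        if det.score < score_threshold:
            continue  # discard low-confidence boxes
        obj, state = CLASS_NAMES[det.class_id].split("_", 1)
        results.append(
            {"object": obj, "state": state, "box": det.box, "score": det.score}
        )
    return results


# Made-up detector output, for illustration only.
example = [
    Detection((100.0, 200.0, 300.0, 380.0), class_id=1, score=0.91),
    Detection((420.0, 210.0, 520.0, 300.0), class_id=0, score=0.34),
]

if __name__ == "__main__":
    for item in decode_states(example):
        print(item["object"], item["state"], item["box"])
```

Folding the state into the class label keeps the pipeline a standard object detection problem; a separate state classifier applied to each cropped box would be an alternative design that this sketch does not cover.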

Published In

DMIP '23: Proceedings of the 2023 6th International Conference on Digital Medicine and Image Processing
November 2023
142 pages
ISBN:9798400709425
DOI:10.1145/3637684
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Automatic driving
  2. Car
  3. Object detection
  4. Transformer

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

DMIP 2023
