
Detecting Cars and Their States Utilizing Object Detection

Authors:

Tomio Goto

DMIP '23: Proceedings of the 2023 6th International Conference on Digital Medicine and Image Processing

Pages 1 - 6

https://doi.org/10.1145/3637684.3637685

Published: 29 April 2024

Abstract

This study focuses on detecting the cars around a driver's vehicle together with their states, such as "going forward" or "stopping", as inferred from their brake lights. Because brake lights are visual information, we believe they are amenable to image recognition; the objective of this study is therefore to detect cars and their states using object detection, one family of image recognition methods. Using the Swin Transformer as the detection method, we succeed in detecting a car and its state from an image. In addition, pre-training and network optimization were performed to achieve higher detection accuracy.
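One way to read "detecting a car including its state" is as ordinary multi-class object detection in which each class pairs the object with a state. The sketch below illustrates that reading only; it is not taken from the paper. The class names, the `Detection` type, the `is_correct` helper, and the 0.5 IoU threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical label set. The abstract names the states "going forward" and
# "stopping"; the exact class names used in the paper are not given.
STATE_CLASSES = ("car_going_forward", "car_stopping")


@dataclass(frozen=True)
class Detection:
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels
    label: str    # one of STATE_CLASSES
    score: float  # detector confidence in [0, 1]


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct(pred, truth, iou_threshold=0.5):
    """A prediction counts only if both the state label and the box agree.

    The 0.5 threshold is a common convention, assumed here.
    """
    return pred.label == truth.label and iou(pred.box, truth.box) >= iou_threshold
```

Under this encoding, a well-localized box with the wrong state (for example, a braking car labeled as going forward) counts as an error, which is what distinguishes the task from plain car detection.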


Published In


DMIP '23: Proceedings of the 2023 6th International Conference on Digital Medicine and Image Processing

November 2023

142 pages

ISBN:9798400709425

DOI:10.1145/3637684

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed limited

Conference

DMIP 2023

DMIP 2023: 2023 6th International Conference on Digital Medicine and Image Processing

November 9 - 12, 2023

Kyoto, Japan
