YOLOv3-mobile for Real-time Pedestrian Detection on Embedded GPU

Authors:
Mohammad Alkhaleefah, National Taipei University of Technology, Taiwan
Narendra Babu Tatini, National Taipei University of Technology, Taiwan
Hung-Te Lee, National Taipei University of Technology, Taiwan
Tan-Hsu Tan, National Taipei University of Technology, Taiwan
Shang-Chih Ma, National Taipei University of Technology, Taiwan
Yang-Lang Chang, National Taipei University of Technology, Taiwan

ICGSP '21: Proceedings of the 5th International Conference on Graphics and Signal Processing, June 2021, Pages 27–31. https://doi.org/10.1145/3474906.3474915

Published: 06 October 2021

ABSTRACT

Pedestrian detection is one of the challenging tasks in autonomous driving. Recently, the You Only Look Once (YOLO) family of object detection networks, especially YOLOv3 and YOLOv3-tiny, has demonstrated a high level of pedestrian detection performance on powerful GPU cards such as the Pascal Titan X. However, deploying YOLOv3 and YOLOv3-tiny on embedded GPU systems remains challenging because of their large network size. In this paper, we present a lightweight YOLOv3-mobile network, obtained by refining the architecture of YOLOv3-tiny, to improve pedestrian detection efficiency on embedded GPUs such as the Nvidia Jetson TX1. The experimental results show that the proposed framework raises the frame rate from 18 frames per second (FPS) to 37 FPS while maintaining comparable mean average precision (mAP).
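To put the reported frame rates in perspective, the following minimal Python sketch converts them into a throughput speed-up and an average per-frame latency. It uses only the two FPS figures quoted in the abstract. The function names are illustrative and not taken from the paper, and the abstract does not state which model or measurement setup the 18 FPS baseline corresponds to.

```python
def frame_latency_ms(fps: float) -> float:
    """Average time spent per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def speedup(baseline_fps: float, improved_fps: float) -> float:
    """Ratio of the improved throughput to the baseline throughput."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return improved_fps / baseline_fps


# Frame rates quoted in the abstract; everything else here is arithmetic.
BASELINE_FPS = 18.0
PROPOSED_FPS = 37.0

if __name__ == "__main__":
    print(f"speed-up: {speedup(BASELINE_FPS, PROPOSED_FPS):.2f}x")            # 2.06x
    print(f"baseline latency: {frame_latency_ms(BASELINE_FPS):.1f} ms/frame")  # 55.6
    print(f"proposed latency: {frame_latency_ms(PROPOSED_FPS):.1f} ms/frame")  # 27.0
```

In other words, the reported change roughly doubles throughput and halves the average time available per frame, from about 55.6 ms to about 27.0 ms.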

Published in

ICGSP '21: Proceedings of the 5th International Conference on Graphics and Signal Processing
June 2021
95 pages
ISBN:9781450389419
DOI:10.1145/3474906

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
CUDA
Embedded GPU
Real-time Pedestrian Detection
YOLOv3-mobile
Qualifiers
- research-article
- Research
- Refereed limited