DOI: 10.1145/2897073.2897081
research-article

Accelerating a computer vision algorithm on a mobile SoC using CPU-GPU co-processing: a case study on face detection

Published: 14 May 2016

Abstract

Mobile devices now ship with sophisticated hardware such as heterogeneous multi-core SoCs that combine a CPU, GPU, and DSP. This makes it possible to run computationally intensive computer vision applications using General Purpose GPU (GPGPU) programming tools such as Open Graphics Library for Embedded Systems (OpenGL ES) and Open Computing Language (OpenCL). As a case study, this research aimed to accelerate the Viola-Jones face detection algorithm, which is computationally expensive and sees limited use on mobile devices because irregular memory access and imbalanced workloads lead to long processing times. To address these challenges, the proposed method applies CPU-GPU task parallelism, sliding window parallelism, scale image parallelism, dynamic allocation of threads, and local memory optimization to reduce the computational time. The experimental results show that the proposed method runs 3.3 to 6.29 times faster than the well-optimized OpenCV implementation on a CPU. The proposed method can be adapted to other applications using mobile GPUs and CPUs.
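As a rough illustration of why sliding window parallelism and scale image parallelism combine well, the sketch below flattens every window position at every image scale into one list of independent work items and hands them to a worker pool. It is not the paper's OpenCL implementation: the 24-pixel window, the scale factor, the stride, the thread pool, and all function names are assumptions chosen for illustration, and `classify` is a placeholder rather than a real cascade classifier.

```python
from concurrent.futures import ThreadPoolExecutor

WINDOW = 24  # assumed detector window size in pixels (illustrative)


def pyramid_sizes(width, height, scale_factor=1.25, window=WINDOW):
    """Return the (w, h) of each downscaled image that still fits one window."""
    sizes = []
    scale = 1.0
    while True:
        w, h = int(width / scale), int(height / scale)
        if w < window or h < window:
            break
        sizes.append((w, h))
        scale *= scale_factor
    return sizes


def work_items(width, height, stride=4, scale_factor=1.25, window=WINDOW):
    """Flatten every (level, x, y) window position into independent tasks."""
    items = []
    sizes = pyramid_sizes(width, height, scale_factor, window)
    for level, (w, h) in enumerate(sizes):
        for y in range(0, h - window + 1, stride):
            for x in range(0, w - window + 1, stride):
                items.append((level, x, y))
    return items


def classify(item):
    """Placeholder for the cascade classifier; always rejects in this sketch."""
    return False


def detect(width, height, workers=4):
    """Evaluate all windows of all scales concurrently and keep the hits."""
    items = work_items(width, height)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(classify, items))
    return [item for item, hit in zip(items, flags) if hit]
```

Because no work item depends on another, the same flat index space could in principle be mapped onto GPU threads. In a real cascade, windows are rejected at different stages, so per-window cost varies widely, which is the workload imbalance the abstract refers to.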

    Published In

    MOBILESoft '16: Proceedings of the International Conference on Mobile Software Engineering and Systems
    May 2016
    326 pages
    ISBN:9781450341783
    DOI:10.1145/2897073
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. CPU-GPU co-processing
    2. OpenCL
    3. OpenGL ES 2.0
    4. computer vision
    5. mobile GPGPU

    Qualifiers

    • Research-article

    Funding Sources

    • Ministry of Trade, Industry & Energy (MOTIE, Korea)

    Conference

    ICSE '16

    Cited By

    • (2022) GPGPU-Based Parallel Computing of Viola and Jones Eyes Detection Algorithm to Drive an Intelligent Wheelchair. Journal of Signal Processing Systems 94:12 (1365-1379). DOI:10.1007/s11265-022-01783-2. Online publication date: 1-Jul-2022
    • (2021) Power-Efficient Layer Mapping for CNNs on Integrated CPU and GPU Platforms. Proceedings of the 26th Asia and South Pacific Design Automation Conference (627-632). DOI:10.1145/3394885.3431423. Online publication date: 18-Jan-2021
    • (2021) A comparative study on SoC embedded low power GPUs for real‐time edge‐based automated traffic surveillance. Concurrency and Computation: Practice and Experience 34:10. DOI:10.1002/cpe.6736. Online publication date: 3-Dec-2021
    • (2021) A survey on parallel computing for traditional computer vision. Concurrency and Computation: Practice and Experience 34:4. DOI:10.1002/cpe.6638. Online publication date: 28-Sep-2021
    • (2020) Accelerating Computer Vision Algorithms on Heterogeneous Edge Computing Platforms. 2020 IEEE Workshop on Signal Processing Systems (SiPS) (1-6). DOI:10.1109/SiPS50750.2020.9195221. Online publication date: Oct-2020
    • (2020) A Systematic Training Procedure for Viola-Jones Face Detector in Heterogeneous Computing Architecture. Journal of Grid Computing. DOI:10.1007/s10723-020-09517-z. Online publication date: 15-Apr-2020
