
EVE: A Flexible SIMD Coprocessor for Embedded Vision Applications

Published in: Journal of Signal Processing Systems

Abstract

In this paper we introduce EVE (embedded vision/vector engine), with a FlexSIMD (flexible SIMD) architecture highly optimized for embedded vision. We show how EVE can be used to meet the growing requirements of embedded vision applications in a power- and area-efficient manner. EVE’s SIMD features allow it to accelerate low-level vision functions (such as image filtering, color-space conversion, pyramids, and gradients). With the added flexibility of its data accesses, EVE can also accelerate many mid-level vision tasks (such as connected components, integral image, histogram, and Hough transform). Our experiments with a silicon implementation of EVE show that it performs many low- and mid-level vision functions with a 3–12× speed advantage over a C64x+ DSP, while consuming less power and area. EVE also achieves code-size savings of 4–6× over a C64x+ DSP for regular loops. Thanks to its flexibility and programmability, we were able to implement two end-to-end vision applications on EVE and achieve more than a 5× application-level speedup over a C64x+. With EVE as a coprocessor next to a DSP or a general-purpose processor, algorithm developers have the option of accelerating low- and mid-level vision functions on EVE, giving them more room to innovate and to use the DSP for new, more complex, high-level vision algorithms.

Figures 1–3 (not reproduced in this preview)


Notes

  1. T. S. Huang, in the introduction to [11].

  2. EVE can load 16 8-bit or 16 16-bit values in one cycle. With 32-bit data, EVE can load 16 32-bit values in 1.5 cycles on average, using two regular vector loads and intelligent load buffering. EVE has load and store pre-fetch units, which exploit data re-use and allow us to deal with memory contention.

  3. For cases in which the algorithm can guarantee that the offsets do indeed point to different memory banks, we can use p_scatter (parallel scatter) and perform the writes at a rate of 8 values per cycle, as opposed to the general case of sequential scatter.

References

  1. Bertozzi, M., et al. (2002). Artificial vision in road vehicles. Proceedings of the IEEE, 90(7), 1258–1271.

  2. Chai, S. (2008). Mobile challenges for embedded computer vision. In B. Kisačanin, S.S. Bhattacharyya, S. Chai (Eds.), Embedded computer vision. London: Springer.

  3. Chiricescu, S., et al. (2005). RSVP II: a next generation automotive processor. In Proceedings of the intelligent vehicles symposium.

  4. Crnojevic, V.S., Schubert, P.J., Kisačanin, B. (2006). Method of developing a classifier using adaboost-over-genetic programming. US Patent Application 20080126275.

  5. Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In Proceedings of the IEEE CVPR.

  6. Gagvani, N. (2008). Challenges in video analytics. In B. Kisačanin, S.S. Bhattacharyya, S. Chai (Eds.), Embedded computer vision. London: Springer.

  7. Goodacre, J., & Sloss, A.N. (2005). Parallelism and the ARM instruction set architecture. Computer, 38(7), 42–50.

  8. He, B., et al. (2007). Efficient gather and scatter operations on graphics processors. In Proceedings of SC07.

  9. Horn, B.K.P. (2003). Determining constant optical flow. http://people.csail.mit.edu/bkph/articles/Fixed_Flow.pdf (retrieved 20 January 2009).

  10. Iwata, N., Kagami, S., Hashimoto, K. (2007). A dynamically reconfigurable architecture combining pixel-level SIMD and operation-pipeline modes for high frame rate visual processing. In Proceedings of the ICFPT.

  11. Kisačanin, B., Pavlović, V., Huang, T.S. (Eds.) (2005). Real-time computer vision for human-computer interaction. New York: Springer.

  12. Kisačanin, B., & Nikolić, Z. (2010). Algorithmic and software techniques for embedded vision on programmable processors. Signal Processing: Image Communication, 25(5), 352–362.

  13. Kisačanin, B. (2011). Automotive vision for advanced driver assistance systems. In Proceedings of international symposium on VLSI design, automation and test (VLSI-DAT).

  14. Komuro, T., Kagami, S., Ishikawa, M. (2004). A dynamically reconfigurable SIMD processor for a vision chip. IEEE Journal of Solid-State Circuits, 39(1), 265–268.

  15. Kyo, S., & Okazaki, S. (2008). In-vehicle vision processors for driver assistance systems. In Proceedings of the design automation conference.

  16. Lee, V.W., et al. (2010). Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU. In Proceedings of ISCA.

  17. Lipton, A.J. (2008). We can watch it for you wholesale. In B. Kisačanin, S.S. Bhattacharyya, S. Chai (Eds.), Embedded computer vision. London: Springer.

  18. Owens, J.D., et al. (2008). GPU computing. Proceedings of the IEEE.

  19. Porikli, F. (2005). Integral histogram: a fast way to extract histograms in Cartesian spaces. In Proceedings of the IEEE CVPR.

  20. Shotton, J., et al. (2011). Real-time human pose recognition in parts from single depth images. In Proceedings of the IEEE CVPR.

  21. Simar, R., & Tatge, R. (2009). How TI adopted VLIW in digital signal processors. IEEE Solid-State Circuits Magazine.

  22. Stein, G.P., et al. (2005). A computer vision system on a chip: a case study from the automotive domain. In Proceedings of the IEEE workshop on embedded computer vision.

  23. Trivedi, M.M., Gandhi, T., McCall, J. (2007). Looking-in and looking-out of a vehicle: computer-vision-based enhanced vehicle safety. IEEE Transactions on Intelligent Transportation Systems, 8(1), 108–120.

  24. Van Der Wal, G.S. (2010). Technical overview of Sarnoff Acadia II vision processor. In Proceedings of SPIE 7710, multisensor, multisource information fusion: architectures, algorithms, and applications.

Acknowledgments

The authors would like to thank Jeremiah Golston and Peter Barnum of Texas Instruments for their valuable comments on an early draft of this paper. We are also grateful to the anonymous reviewers of this paper for their constructive suggestions and comments. We would also like to express our gratitude to our management and colleagues for their support throughout this project.

Author information

Corresponding author

Correspondence to Branislav Kisačanin.


Cite this article

Sankaran, J., Hung, C.-Y., & Kisačanin, B. EVE: A Flexible SIMD Coprocessor for Embedded Vision Applications. J Sign Process Syst 75, 95–107 (2014). https://doi.org/10.1007/s11265-013-0770-2
