Implementation of a CNN-based retinomorphic model on a high performance reconfigurable computer

doi:10.1016/j.neucom.2010.07.025

Neurocomputing

Volume 74, Issue 8, 15 March 2011, Pages 1290-1297

https://doi.org/10.1016/j.neucom.2010.07.025

Abstract

The complexity of hardware design methodologies represents a significant difficulty for non-hardware focused scientists working on accelerating the simulation of complex bio-inspired applications. An emerging generation of electronic system level (ESL) design tools is being developed, which allow software–hardware codesign and partitioning of complex algorithms from high level language (HLL) descriptions. These tools, together with high performance reconfigurable computer (HPRC) systems consisting of standard microprocessors coupled with application specific FPGA chips, provide a new approach for rapid emulation and acceleration of highly parallelizable algorithms. In this article CoDeveloper, an ESL IDE from Impulse Accelerated Technologies, is analyzed. A model for the first synapse of the retina, based on a discrete-time sequential CNN architecture suitable for FPGA implementation proposed by the authors in a previous paper, is implemented using CoDeveloper tools and the DS1002 HPRC platform from DRC Computers. Results showed that, with a minimum development time, a 10× acceleration, when compared to the software emulation, can be obtained.

Introduction

Bioengineering combines fundamental knowledge from medicine and biology with other more technological domains, such as electronics engineering, computing, optics, etc. In this context, the design of bio-inspired systems has been one of the research areas that has received most attention in recent decades. In particular, the development of retinomorphic models could help to understand how the retina processes data, and support future research both in medicine and engineering applications. In recent years, several retina models have been proposed. Some of them were based on OPL filters [1], others used a CNN-based approach [2]. Although some of them were intended to be implemented as electronic circuits, these models were frequently too complex for real-time execution, and were used just for simulation purposes. The main challenge to be addressed is the fact that biological systems are complex, dynamic and unpredictable environments, which depend on many unknowns that are difficult to uncover. Making realistic models of these systems is complex because it requires a deep and accurate knowledge of the real system, techniques to extract the relevant data and tools to solve highly complex calculations. Fortunately, the objective in many cases is not to obtain an ideal model of the biological system, but to imitate the way it processes data to develop more or less simple systems that reproduce a given phenomenon of interest.

Once an adequate model has been extracted, the next issue is to provide the large computing resources required, since very often even the simplest model handles data quantities that exceed the capacity of current computer systems. Approximation, simplification and linearization techniques can make the implementation feasible, although they also introduce errors and uncertainties that cause the models to be less realistic and accurate.

Traditionally, computing systems used to emulate bio-inspired models have used software programs running on standard microprocessors. Despite their flexibility and ease of development, the sequential nature of these computers restricts their performance when dealing with the massively parallel systems that these bio-inspired models usually require. Moreover, performance gains can only be achieved by increasing clock speed, which is limited by the microelectronic technology. The use of High Performance Computers (HPC), consisting of multiple processors and/or processing nodes, has helped to improve the performance of simulators by exploiting an application's coarse-grain parallelism while preserving the flexibility of conventional platforms. However, when the software approach cannot meet specifications, the use of specific hardware is an alternative that can further improve performance by exploiting fine-grain parallelism, thus mimicking the inherent parallelism of biological systems.

During the last decade, interest in reconfigurable hardware (FPGAs) has grown steadily. These devices improve precision and design flexibility, while simultaneously reducing cost and development time with respect to ASICs due to the nature of the devices and the development tools provided. With clock frequencies an order of magnitude lower than that of typical microprocessors, FPGAs can provide greater performance in application domains like real-time video or image processing algorithms as they take advantage of their fine-grained parallelism. This has also led to FPGAs being commonly used as platforms for massively parallel systems, such as CNN emulation and acceleration [3], [4], [5], [6]. However, the design process with these devices is not free of difficulties, as traditional methodologies based on hardware description languages (VHDL, Verilog, etc.) still require deep hardware skills from the designer.

Recently, a new generation of tools for highly complex circuit design is being developed. This new methodology, known as ESL (Electronic System Level), aims to target the problem of hardware–software co-design from system level, untimed descriptions, using different flavors of high level programming languages, such as C, C++ or Matlab. Also, a new generation of hybrid supercomputers, called HPRCs (High Performance Reconfigurable Computers), is being developed to take full advantage of the new co-design tools. These HPRC systems provide the standard microprocessor nodes, plus new closely coupled reconfigurable nodes based on FPGA chips. Taking advantage of both coarse and fine grain parallelism, HPRCs are able to combine the benefits of hardware- and software-only implementations.

In this paper, a model for the first synapse of the retina, based on a discrete sequential CNN architecture, is implemented on an HPRC platform from DRC Computers [7].

The rest of the paper is organized as follows. Section 2 summarizes some of the main ESL tools and their features, focusing on the CoDeveloper™ design flow, an ESL IDE from Impulse Accelerated Technologies, Inc. [8] that was used for hardware–software co-design. Section 3 describes a hardware architecture for a CNN and the mapping of the model of the first synapse of the retina onto this paradigm. Section 4 describes the implementation of the retinomorphic model on an HPRC. Section 5 analyzes the results and performance gains obtained. Section 6 summarizes some of the key points to be observed when using the new system level design methodologies, evaluates their suitability for the non-hardware specialist scientist, and provides some keys to getting better results with this kind of tool according to our experience. Finally, the main conclusions and future research are presented in Section 7.

Section snippets

Electronic system level design tools

To meet the challenges posed by chip complexity, new languages, tools and methodologies with greater modeling capabilities are being developed. This new ESL methodology aims to target the problem of hardware–software co-design from system-level untimed descriptions, using different flavors of High Level Languages (HLLs), such as C, C++ or Matlab. These tools allow developers with little or no prior hardware design skills to implement complex systems composed of mixed software and

Retinomorphic model

The retina is a complex biological system which mainly contains five different kinds of cells: receptors (rods and cones), bipolar cells, ganglion cells, horizontal cells and amacrine cells [10], [11]. These cells are grouped together to constitute different neural circuits, which are distributed over different retina regions: the fovea, the foveola, the parafovea, the perifovea and several peripheral areas, in order from lower to higher connectivity. In a previous paper, a model for the first

Implementation on HPRC

Traditional platforms based on commodity FPGA boards that communicate with a host workstation through high speed interfaces (like USB, PCI or Ethernet) are the preferred solution for standalone or loosely coupled applications. However, when using FPGAs as coprocessors to accelerate an algorithm, the main bottleneck usually comes from the communication between the software and hardware stages of the algorithm.
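The effect of this communication bottleneck can be illustrated with a minimal sketch. It assumes a simple model, not taken from the paper, in which the accelerated run time is the sum of the hardware compute time and the host-to-FPGA transfer time. The function name `effective_speedup` and all timing values are hypothetical and chosen only for illustration.

```python
def effective_speedup(t_software, t_hw_compute, t_transfer):
    """Ratio of software-only time to accelerated time.

    Assumed model: accelerated time = hardware compute time + transfer time,
    with no overlap between computation and communication.
    """
    t_accelerated = t_hw_compute + t_transfer
    if t_software <= 0 or t_accelerated <= 0:
        raise ValueError("times must be positive")
    return t_software / t_accelerated


# Hypothetical timings in milliseconds (not measurements from the paper).
T_SOFTWARE = 100.0
T_HW_COMPUTE = 5.0

for t_transfer in (0.0, 15.0, 45.0):
    speedup = effective_speedup(T_SOFTWARE, T_HW_COMPUTE, t_transfer)
    print(f"transfer = {t_transfer:5.1f} ms -> speedup = {speedup:.1f}x")
```

With these illustrative numbers the speedup drops from 20x with no transfer cost to 5x and 2x as the transfer time grows, even though the hardware compute time is unchanged.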

HPRCs address this problem by providing tightly coupled standard microprocessors

Performance and results

The PE parameters have been configured to work with 640×480-pixel images. Each column path in Fig. 4 mimics a neural circuit composed of a bipolar cell and a horizontal cell. This means that together the six paths emulate a neural circuit of 3,686,400 biological cells. The 26 PEs that form the model are divided into two classes: 6 of them perform two convolutions, while the other 20 perform just one convolution, as one of their two templates is null. Table 2 summarizes the hardware resources
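The figures quoted in this paragraph can be checked with a short calculation. The sketch below assumes that each of the six paths holds one bipolar and one horizontal cell per pixel, which is an interpretation consistent with the stated total rather than a detail confirmed by the text. The function names are hypothetical.

```python
# Image size and path count stated in the text.
WIDTH, HEIGHT = 640, 480
PATHS = 6
# Assumption: one bipolar cell and one horizontal cell per pixel in each path.
CELLS_PER_PIXEL_PER_PATH = 2


def emulated_cells(width, height, paths, cells_per_pixel):
    """Total number of emulated cells across all paths."""
    return width * height * paths * cells_per_pixel


def total_convolutions(pes_with_two_templates, pes_with_one_template):
    """Convolutions performed when every PE runs once."""
    return 2 * pes_with_two_templates + pes_with_one_template


print(emulated_cells(WIDTH, HEIGHT, PATHS, CELLS_PER_PIXEL_PER_PATH))  # 3686400
print(total_convolutions(6, 20))  # 32
```

Under this assumption the product 640 × 480 × 6 × 2 reproduces the 3,686,400 cells given in the text, and the 26 PEs account for 32 convolutions when each PE runs once.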

Discussion

The results obtained in our first experiments with different CNN architectures show that this kind of algorithm can benefit from custom hardware coprocessors for accelerating execution, as well as from C-to-hardware compilers for rapid prototyping. However, to obtain any advantage, the algorithm must fulfill some key requirements:

• The algorithm should make an intensive use of data in several concurrent processing flows, to compensate for time spent in the transfer to/from the accelerator.
• The

Conclusions

HPRC systems are showing greater performance than other HPC approaches, particularly taking into account that they also provide increments of several orders of magnitude in the GFlops/euro and GFlops/watt ratios.

Our first experiments have demonstrated the viability of applying HPRC platforms and ESL tools to rapid prototyping of DTCNN-based models, provided that some requirements are met. Our first results for a model of the first synapse of the retina, still under refinement, have

Acknowledgements

This work has been partially supported by the Fundación Séneca de la Región de Murcia through the research projects 08801/PI/08 and 08788/PI/08, and by the Spanish Government through project TIN2008-06893-C03.

J. Javier Martínez received B.Sc. and M.Sc. degrees in Electrical and Electronic Engineering from the Universidad Politécnica de Cartagena, Spain, in 1997 and 2001, respectively. He is currently pursuing his Ph.D. in Computer Science at the same university. Since 2001 he has been an Assistant Lecturer with the Departamento de Electrónica y Tecnología de Computadoras, Universidad Politécnica de Cartagena. His research interests range from signal and image processing, reconfigurable computing to

References (30)

J. Herault, Model of colour processing in the retina of vertebrates: from photoreceptors to colour opposition and colour constancy phenomena, Neurocomputing (1996)
J.J. Martínez et al., Study of the contrast processing in the early visual system using a neuromorphic retinal architecture, Neurocomputing (2009)
N. Drasdo et al., The length of Henle fibers in the human retina and a model of ganglion receptive field density in the visual field, Vision Research (2007)
D. Dacey et al., Center surround receptive field structure of cone bipolar cells in primate retina, Vision Research (2000)
D. Balya et al., A CNN framework for modeling parallel processing in a mammalian retina I, Journal on Circuit Theory and Applications (2002)
Z. Nagy et al., Configurable multilayer CNN-UM emulator on FPGA, IEEE Transactions on Circuits and Systems I (2003)
M. Perko, I. Fajfar, T. Tuma, J.J. Puhan, Low-cost, high-performance CNN simulator implemented in FPGA 2000, in: IEEE...
S. Malki, L. Spaanenburg, CNN image processing on a Xilinx Virtex-II 6000, in: Proceedings ECCTD, 2003, pp....
J. Martínez, F. Garrigós, F. Toledo, J. Ferrández, High performance implementation of an FPGA-based sequential DT-CNN,...
DRC Computers, web page 〈http://www.drccomputer.com〉,...
Impulse Accelerated Technologies Inc., web page 〈http://www.impulsec.com〉,...
D. Densmore et al., A platform-based taxonomy for ESL design, IEEE Design & Test of Computers (2006)
H. Kolb, How the retina works, American Scientist (2003)
E. Fernández, H. Kolb, R. Nelson, Webvision, web page 〈http://webvision.med.utah.edu〉,...
H. Harrer et al., Discrete-time cellular neural networks, International Journal of Circuit Theory and Applications (1992)


Javier Garrigós received B.Sc. and M.Sc. degrees in Electrical Engineering in 1992 and 1995, respectively, both from the University of Murcia, Spain. He received the Ph.D. degree in 2002 from the Polytechnic University of Cartagena, Spain. From 1995 to 1996 he was with the University of Zaragoza as an Invited Researcher. From 1997 to 2002 he was a Teaching Assistant at the Departamento de Electrónica y Tecnología de Computadoras, Universidad de Murcia, and later at the Universidad Politécnica de Cartagena, where he is currently an Assistant Professor at the Departamento de Electrónica y Tecnología de Computadoras. His research interests are in the fields of parallel and application-specific processing architectures, mainly when oriented to soft-computing (fuzzy, artificial neural networks and evolutionary computation) and bio-inspired systems.

Javier Toledo received B.Sc. and M.Sc. degrees in Electrical and Electronic Engineering from the Universidad de Murcia, Spain, in 1995 and 1999, respectively. He is currently working toward the Ph.D. degree in Computer Science at the Universidad Politécnica de Cartagena, Spain. Since 2001 he has been an Assistant Lecturer with the Departamento de Electrónica y Tecnología de Computadoras, Universidad Politécnica de Cartagena. His research interests include signal and image processing, reconfigurable computing and augmented reality.

Eduardo Fernández received the M.D. degree from the University of Alicante, Spain, in 1986 and the Ph.D. degree in Neurosciences in 1990. He is currently Associate Professor at the University Miguel Hernández, Spain, and Director of the Artificial Vision Laboratory at the Bioengineering Institute of the University Miguel Hernández, Spain. In recent years he has been using histological as well as electrophysiological techniques to understand how mammalian retinal cells and the circuitry within the retina can manage and code visual information. He is actively working on the development of a visual neuroprosthesis for the profoundly blind.

J. Manuel Ferrández received the B.Sc. degree in Computer Science in 1995 and the Ph.D. degree in 1998, both from the Universidad Politécnica de Madrid, Spain. He is currently Associate Professor at the Department of Electronics, Computer Technology and Projects at the Universidad Politécnica de Cartagena and Director of the Electronic Design and Signal Processing Research Group at the same University. His research interests include bio-inspired processing, neuromorphic engineering and augmented reality systems. He is actively working on the development of a visual prosthesis for visually impaired people.
