High-level design space exploration of locally linear neuro-fuzzy models for embedded systems
Introduction
Recent years have witnessed a remarkable expansion in the use of intelligent and bio-inspired systems in practical applications. Artificial Neural Networks (ANNs), a prominent example of bio-inspired algorithms, have found widespread successful application in fields ranging from engineering to economics. Alongside the rapid development of ANNs, Fuzzy Inference Systems (FISs) have also been used in a wide range of problem domains such as process control, image processing, and pattern recognition. Neuro-Fuzzy (NF) systems [1], [2], [3], [4] are connectionist structures that combine the natural-language description of fuzzy inference systems with the learning properties of neural networks [5]. The successes and capabilities of bio-inspired and soft-computing algorithms such as NF models motivate researchers to pursue embedded realizations via reconfigurable and dedicated hardware for industrial applications [6], [7].
An embedded system is a special-purpose computer that usually executes a single program repeatedly. It reacts to changes in its surrounding environment and must respond in real time. The design goals of such systems differ from one application to another, but they generally include low unit cost, low Non-Recurring Engineering (NRE) cost, low power, short execution time, small size, and high flexibility. Designers use three well-known approaches to realize an embedded system: (i) the software (SW) approach, (ii) the hardware (HW) approach, and (iii) the HW/SW co-design approach. The main advantage of SW implementation [8], [9], [10], [11], [12], [13] is its high flexibility, since the connections between neurons are not physical (in contrast to hardware implementations, which require wired interconnections). On the other hand, a fully software implementation is usually slower than a hardware one, for two reasons. The first is processor resource restriction: the Arithmetic Logic Units (ALUs) of processors are the bottleneck, as they are designed to perform only general, basic arithmetic operations. The second is that software programs are usually executed sequentially, which conflicts with the intrinsically parallel architecture of ANNs and NF systems.
Like other algorithms, ANN and NF systems can be implemented as hardware cores [8], [14], [15], [16], [17], [18], [19], [20]. In this approach, the connections between neurons are hard-wired and typically permanent. Therefore, ANN and NF hardware structures are not as flexible as the SW approach; however, they are fast and consume less power than software implementations. Researchers have further decreased power consumption and improved area and timing criteria by introducing fidelity slack (error) into bio-inspired and soft-computing algorithms such as ANN and NF [21], [22], [23]. With respect to embedded-system demands, introducing fidelity as an additional design criterion endangers NRE cost and time-to-market. One way to manage the resulting design complexity is to define the design procedure at a higher abstraction level, which has the greatest effect on implementation quality. The advantages of defining a design at higher levels of abstraction include [24]: (1) the ability to use high-level design environments such as SystemVerilog, SystemC/TLM, C/C++, and MATLAB/Simulink; (2) the potential for better exploration of design criteria, which leads to more desirable designs; (3) shorter time-to-market; and (4) lower design cost.
Both software and hardware design paradigms have their own advantages and disadvantages. Recently, designers have combined the two to alleviate their drawbacks and reinforce their benefits. HW/SW co-design [8], [25], [26] is the procedure of partitioning an algorithm between hardware and software parts and then describing each in an appropriate language, such as VHDL [27] or Verilog [28] for the hardware part and C/C++ for the software part. This paper proposes a hardware solution (HW partition) that can be used as an isolated core or as an agent alongside a central processing unit (SW partition) for the purpose of online training.
Given embedded-system demands and the computational complexity of NF models, a direct, naive hardware implementation of ANN and NF models is impractical. The parameterized hardware description model and framework presented here for high-level design space exploration of Locally Linear Neuro-Fuzzy Models (LLNFM) help designers analyze the effects of the final NF structure on design goals such as power consumption, execution time, area occupation, and accuracy of estimated results (error). Since the proposed framework can generate Pareto-optimal design alternatives, design trade-offs are analyzed before physical implementation. The framework's high-level design exploration capability significantly increases the probability of finding better design solutions while decreasing NRE cost and time-to-market. To the best of the authors' knowledge, no prior attempt has been made to present a framework for high-level design space exploration of embedded LLNFM implementations. The adjustable hardware model can also be configured during the design procedure to satisfy restrictions imposed by application demands; it is a hardware module that can be modified along several dimensions, namely bit resolution and number of neurons. The main contributions of this paper are summarized as follows:
- We propose a scalable LLNFM hardware core that can be tightly coupled with a central processing unit and other Intellectual Property (IP) cores as a System-on-Chip (SoC) in a field-programmable gate array (FPGA). The proposed hardware core realizes different numbers of neurons through sequential operations. It is also possible to instantiate a variable number of hardware cores to increase parallelism and improve performance.
- We also present a framework that adjusts NF hardware core parameters to obtain an efficient hardware solution with respect to embedded-system demands. The hardware presented in this paper is used to examine and demonstrate the capabilities of our proposed framework for high-level design space exploration of LLNFM.
- Most importantly, the proposed framework is a high-level design space explorer that can also be applied to other soft-computing paradigms such as artificial neural networks and fuzzy inference systems.
The rest of this paper is organized as follows: Section 2 introduces LLNFM theory, and the proposed parameterized hardware model is presented in Section 3. Section 4 describes the framework for high-level design space exploration. Section 5 provides simulations and experimental results for three different scenarios, in which the LLNFM is trained to efficiently approximate a complex square-root function under different imposed design restrictions. Finally, the work's impact and future directions are outlined in Section 6.
Locally linear neuro-fuzzy models: Theory
To explain the challenges addressed in this work, this section describes the principal characteristics of LLNFM.
LLNFM falls into the class of Takagi–Sugeno (TS) [1] models, whose fuzzy consequents are functions of the inputs. The network structure of LLNFM is depicted in Fig. 1. For the sake of simplicity, it is assumed that the model receives a P-dimensional input vector and that M is the number of neurons. As shown in Fig. 1, the network structure consists of one hidden layer and an adder in the
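As a concrete illustration of the structure described above, the following sketch implements the standard forward pass of a locally linear (Takagi–Sugeno-type) neuro-fuzzy model: each of the M neurons holds a local linear model whose output is weighted by a normalized Gaussian validity function and summed by the output adder. The function and parameter names are illustrative, not the paper's notation.

```python
import numpy as np

def llnf_predict(u, centers, sigmas, weights):
    """Forward pass of a locally linear neuro-fuzzy model (illustrative sketch).

    u       : (P,)     input vector
    centers : (M, P)   Gaussian validity-function centers, one row per neuron
    sigmas  : (M, P)   Gaussian validity-function widths
    weights : (M, P+1) local linear parameters [w_i0, w_i1, ..., w_iP] per neuron
    """
    # Unnormalized Gaussian validity of each local model
    mu = np.exp(-0.5 * np.sum(((u - centers) / sigmas) ** 2, axis=1))  # (M,)
    phi = mu / mu.sum()                           # normalized validity functions
    local = weights[:, 0] + weights[:, 1:] @ u    # local linear model outputs (M,)
    return float(phi @ local)                     # weighted sum (the output adder)
```

With a single neuron (M = 1) the validity function normalizes to 1 and the model reduces to the plain linear consequent, which is a useful sanity check.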
Locally linear neuro-fuzzy models: Hardware structure
This section proposes an IP core hardware architecture for LLNFM and discusses its hardware complexity. The proposed IP is based on the mathematical operations discussed in the previous section. In the following subsections, the inner parts of the hardware model and its specifications are studied in detail.
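One of the hardware model's key parameters is word length (bit resolution). A minimal sketch of the underlying trade-off, assuming a generic signed fixed-point format with saturation (the paper's exact number format is not reproduced here), shows how quantization error shrinks as fractional word length grows:

```python
def to_fixed(x, frac_bits, total_bits=16):
    """Quantize x to a signed fixed-point value with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = max(lo, min(hi, round(x * scale)))  # round, then saturate on overflow
    return q / scale  # back to a real value for error analysis

# Quantization error shrinks as the fractional word length grows,
# at the cost of wider datapaths (area, power, delay).
err8  = abs(to_fixed(0.3, 8)  - 0.3)
err12 = abs(to_fixed(0.3, 12) - 0.3)
```

This is exactly the kind of accuracy-versus-resources knob the framework explores when it sweeps word length.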
A framework for high-level design space exploration of the proposed LLNFM IP core
In this section, a framework for optimizing multi-objective hardware realization of LLNFM is presented. The main modules of the framework are library generation, design space generation, and the Pareto solver. The block diagram of the proposed framework is illustrated in Fig. 10. As seen, the inputs of the framework (shaded boxes) are Functional Units, IP Core Architecture, Word Length, Circuit Frequency Intervals, Learning, and Test data. Functional units are basic arithmetic operations (i.e., adders,
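The core operation of a Pareto solver such as the one named above can be sketched as a non-dominated filter over candidate designs. The tuple layout (power, delay, area, error) is an illustrative assumption; the solver keeps every design that no other design beats in all objectives at once:

```python
def pareto_front(designs):
    """Return the non-dominated designs, minimizing every objective.

    designs: list of distinct objective tuples, e.g. (power, delay, area, error).
    A design is dominated if some other design is no worse in every
    objective and differs from it (hence strictly better in at least one).
    """
    front = []
    for d in designs:
        dominated = any(
            other != d and all(o <= v for o, v in zip(other, d))
            for other in designs
        )
        if not dominated:
            front.append(d)
    return front
```

For example, with two objectives, the design (4, 4) is dominated by (2, 2) and would be dropped, while (1, 5), (2, 2), and (3, 1) all survive as incomparable trade-offs.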
Experimental results
In this section, the LLNFM is trained to simulate the complex square root defined as follows: where the input complex operand lies in a specified range. This function is a fundamental operation in digital signal processing applications such as (1) Orthogonal Frequency Division Multiplexing (OFDM) and (2) matrix decomposition in multi-antenna (MIMO) systems [41]. The framework and proposed IP are put into action through three different scenarios. Each
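Training data for such a benchmark can be generated as input/target pairs on the real and imaginary parts of the operand and of its principal square root. The operand range below is a placeholder assumption, as the paper's exact interval is not reproduced in this excerpt:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical operand range for illustration only.
z = rng.uniform(-1.0, 1.0, 200) + 1j * rng.uniform(-1.0, 1.0, 200)
w = np.sqrt(z)  # principal complex square root (target outputs)

# Training pairs for the neuro-fuzzy approximator:
# inputs are (Re z, Im z), targets are (Re sqrt(z), Im sqrt(z)).
X = np.column_stack([z.real, z.imag])
Y = np.column_stack([w.real, w.imag])
```

Squaring the targets recovers the operands exactly, which provides a convenient sanity check on the generated data before training.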
Conclusion and future work
In this work, we presented an LLNFM IP core for embedded systems. We also presented a framework that adjusts the design variables (IP core parameters) so that an optimal LLNFM hardware configuration, with respect to application demands (imposed restrictions), can be selected from a very large design space. Through three scenarios, it is shown that the proposed framework is capable of effectively and efficiently adjusting design parameters with the aim of exploring the design space at a high level. In the first scenario, the
Acknowledgements
Unfortunately, we lost our great scientist, Prof. Caro Lucas. He was a prominent researcher, a dedicated instructor and a person who was admired by all who knew him. He encouraged us to study the LLNFM implementation for embedded applications.
References (42)
- A multilayered neuro-fuzzy classifier with self-organizing properties, Fuzzy Sets Syst. (2008)
- Emotion on FPGA: model driven approach, Expert Syst. Appl. (2009)
- Neuro-fuzzy system with high-speed low-power analog blocks, Fuzzy Sets Syst. (2006)
- Optimised PWL recursive approximation and its application to neuro-fuzzy systems, Math. Comput. Model. (2002)
- Fuzzy identification of systems and its application to modeling and control, IEEE Trans. Syst. Man Cybern. (1985)
- ANFIS: adaptive-network-based fuzzy inference system, IEEE Trans. Syst. Man Cybern. (1993)
- DENFIS: dynamic evolving neural-fuzzy inference system and its application for time-series prediction, IEEE Trans. Fuzzy Syst. (2002)
- Flexible neuro-fuzzy systems, IEEE Trans. Neural Netw. (2003)
- Real-time emotional controller, Neural Comput. Appl. (2010)
- Implementation issues of neuro-fuzzy hardware: going towards HW/SW codesign, IEEE Trans. Neural Netw. (2003)
- The cat is out of the bag: cortical simulations with 10⁹ neurons, 10¹³ synapses
- A robust learning model for dealing with missing values in many-core architectures
- GPUMLib: a new library to combine machine learning algorithms with Graphics Processing Units
- Simulating biological-inspired spiking neural networks with OpenCL
- Fast artificial neural network library
- Realization of the conscience mechanism in CMOS implementation of winner-takes-all self-organizing neural networks, IEEE Trans. Neural Netw.
- Analog design of a new neural network for optical character recognition, IEEE Trans. Neural Netw.
- Analog neural non-derivative optimizers, IEEE Trans. Neural Netw.
- A resistor/transconductor network for linear fitting, IEEE Trans. Circuits Syst. II
- Analog VLSI and Neural Systems
- A CMOS K-winners-take-all circuit with O(n) complexity, IEEE Trans. Circuits Syst. II
Cited by (7)
- A hierarchical local-model tree for predicting roof displacement in longwall tailgates, Neural Computing and Applications (2021)
- Real-Time Deep Learning at the Edge for Scalable Reliability Modeling of Si-MOSFET Power Electronics Converters, IEEE Internet of Things Journal (2019)
- Real-time person re-identification at the edge: a mixed precision approach, Lecture Notes in Computer Science (2019)
- Optimization of the Production of Inactivated Clostridium novyi Type B Vaccine Using Computational Intelligence Techniques, Applied Biochemistry and Biotechnology (2016)