High-level design space exploration of locally linear neuro-fuzzy models for embedded systems
Introduction
Recent years have witnessed a remarkable expansion in the use of intelligent and bio-inspired systems in practical applications. Artificial Neural Networks (ANNs), a prominent example of bio-inspired algorithms, have found widespread successful application in fields ranging from engineering to economics. Alongside the rapid development of ANNs, Fuzzy Inference Systems (FISs) have also been used in a wide range of problem domains such as process control, image processing, and pattern recognition. Neuro-Fuzzy (NF) systems [1], [2], [3], [4] are connectionist structures that combine the natural-language description of fuzzy inference systems with the learning properties of neural networks [5]. The successes and capabilities of bio-inspired and soft-computing algorithms such as NF models motivate researchers to pursue embedded realizations via reconfigurable and dedicated hardware for industrial applications [6], [7].
An embedded system is a special-purpose computer that usually executes a single program repeatedly. It reacts to changes in its surrounding environment and must respond in real time. The design goals of such systems differ from one application to another, but they generally include low unit cost, low Non-Recurring Engineering (NRE) cost, low power, short execution time, small size, and high flexibility. Designers use three well-known approaches to realize an embedded system: (i) the software (SW) approach, (ii) the hardware (HW) approach, and (iii) the HW/SW co-design approach. The main advantage of SW implementation [8], [9], [10], [11], [12], [13] is its high flexibility, since the connections between neurons are not physical (in contrast to hardware implementations, which require wired interconnections). On the other hand, a fully software implementation is usually slower than a hardware one, for two reasons. The first is processor resource restriction: the Arithmetic Logic Units (ALUs) of processors are the bottleneck, as they are designed to perform only general, basic arithmetic operations. The second is that software programs are usually executed sequentially, which conflicts with the intrinsically parallel architecture of ANNs and NF systems.
Like other algorithms, ANN and NF systems can be implemented as hardware cores [8], [14], [15], [16], [17], [18], [19], [20]. In this approach, the connections between neurons are hard-wired and typically permanent. Therefore, ANN and NF hardware structures are not as flexible as the SW approach; however, they are fast and consume less power than software implementations. Researchers have further decreased power consumption and improved area and timing criteria by introducing fidelity slack (error) into bio-inspired and soft-computing algorithms such as ANN and NF [21], [22], [23]. With respect to embedded-system demands, introducing fidelity as an additional design criterion endangers NRE cost and time-to-market. One way to manage the resulting design complexity is to define the design procedure at a higher abstraction level, which has the greatest effect on implementation quality. The advantages of defining a design at higher levels of abstraction include [24]: (1) the ability to use high-level design environments such as SystemVerilog, SystemC/TLM, C/C++, and MATLAB/Simulink; (2) the potential for better exploration of design criteria, which leads to more desirable designs; (3) shorter time-to-market; and (4) lower design cost.
Both software and hardware design paradigms have their own advantages and disadvantages. Recently, designers have combined the two to alleviate their drawbacks and reinforce their benefits. HW/SW co-design [8], [25], [26] is the procedure of partitioning an algorithm between hardware and software parts and then describing each in an appropriate language, such as VHDL [27] or Verilog [28] for the hardware part and C/C++ for the software part. This paper proposes a hardware solution (HW partition) that can be used as an isolated core or as an agent alongside a central processing unit (SW partition) for the purpose of online training.
Given embedded-system demands and the computational complexity of NF models, a direct, naive hardware implementation of ANN and NF models is impractical. The parameterized hardware description model and framework presented here for high-level design space exploration of Locally Linear Neuro-Fuzzy Models (LLNFM) help designers analyze the effects of the final NF structure on design goals such as power consumption, execution time, area occupation, and accuracy of estimated results (error). Since the proposed framework can generate Pareto-optimal design alternatives, design trade-offs are analyzed before physical implementation. The framework's high-level design exploration capability significantly increases the probability of finding better design solutions while decreasing NRE cost and time-to-market. To the best of the authors' knowledge, no prior attempt has been made to present a framework for high-level design space exploration of embedded LLNFM implementations. The adjustable hardware model can also be configured during the design procedure to satisfy restrictions imposed by application demands; it is a hardware module that can be modified along several dimensions, namely bit resolution and number of neurons. The main contributions of this paper are summarized as follows:
- We propose a scalable LLNFM hardware core that can be tightly coupled with a central processing unit and other Intellectual Property (IP) cores as a System-on-Chip (SoC) in a field-programmable gate array (FPGA). The proposed hardware core realizes different numbers of neurons through sequential operations. It is also possible to instantiate a variable number of hardware cores to increase parallelism and improve performance.
- We also present a framework that adjusts NF hardware core parameters to obtain an efficient hardware solution with respect to embedded-system demands. The hardware presented in this paper is used to examine and demonstrate the capabilities of our proposed framework for high-level design space exploration of LLNFM.
- Most importantly, the proposed framework is a high-level design space explorer that can also be applied to other soft-computing paradigms such as artificial neural networks and fuzzy inference systems.
The rest of this paper is organized as follows: Section 2 introduces LLNFM theory, and the proposed parameterized hardware model is presented in Section 3. Section 4 describes the framework for high-level design space exploration. Section 5 provides simulations and experimental results for three different scenarios, in which the LLNFM is trained to efficiently approximate a complex square-root function under different imposed design restrictions. Finally, the work's impact and future directions are outlined in Section 6.
Locally linear neuro-fuzzy models: Theory
To explain the challenges addressed in this work, this section describes the principal characteristics of LLNFM.
LLNFM falls into the class of Takagi–Sugeno (TS) [1] models, whose fuzzy consequents are functions of the inputs. The network structure of LLNFM is depicted in Fig. 1. For the sake of simplicity, it is assumed that the model receives a P-dimensional input vector and that M is the number of neurons. As shown in Fig. 1, the network structure consists of one hidden layer and an adder in the
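As a concrete illustration of the structure described above, the following sketch implements the standard forward pass of a locally linear (Takagi–Sugeno-type) neuro-fuzzy model: each of the M neurons holds a local linear model whose output is weighted by a normalized Gaussian validity function and summed by the output adder. The function and parameter names are illustrative, not the paper's notation.

```python
import numpy as np

def llnf_predict(u, centers, sigmas, weights):
    """Forward pass of a locally linear neuro-fuzzy model (illustrative sketch).

    u       : (P,)     input vector
    centers : (M, P)   Gaussian validity-function centers, one row per neuron
    sigmas  : (M, P)   Gaussian validity-function widths
    weights : (M, P+1) local linear parameters [w_i0, w_i1, ..., w_iP] per neuron
    """
    # Unnormalized Gaussian validity of each local model
    mu = np.exp(-0.5 * np.sum(((u - centers) / sigmas) ** 2, axis=1))  # (M,)
    phi = mu / mu.sum()                           # normalized validity functions
    local = weights[:, 0] + weights[:, 1:] @ u    # local linear model outputs (M,)
    return float(phi @ local)                     # weighted sum (the output adder)
```

With a single neuron (M = 1) the validity function normalizes to 1 and the model reduces to the plain linear consequent, which is a useful sanity check.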
Locally linear neuro-fuzzy models: Hardware structure
This section proposes an IP core hardware architecture for LLNFM and discusses its hardware complexity. The proposed IP is based on the mathematical operations discussed in the previous section. In the following subsections, the inner parts of the hardware model and its specifications are studied in detail.
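One of the hardware model's key parameters is word length (bit resolution). A minimal sketch of the underlying trade-off, assuming a generic signed fixed-point format with saturation (the paper's exact number format is not reproduced here), shows how quantization error shrinks as fractional word length grows:

```python
def to_fixed(x, frac_bits, total_bits=16):
    """Quantize x to a signed fixed-point value with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = max(lo, min(hi, round(x * scale)))  # round, then saturate on overflow
    return q / scale  # back to a real value for error analysis

# Quantization error shrinks as the fractional word length grows,
# at the cost of wider datapaths (area, power, delay).
err8  = abs(to_fixed(0.3, 8)  - 0.3)
err12 = abs(to_fixed(0.3, 12) - 0.3)
```

This is exactly the kind of accuracy-versus-resources knob the framework explores when it sweeps word length.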
A framework for high-level design space exploration of the proposed LLNFM IP core
In this section, a framework for optimizing multi-objective hardware realization of LLNFM is presented. The main modules of the framework are library generation, design space generation, and the Pareto solver. The block diagram of the proposed framework is illustrated in Fig. 10. As seen, the inputs of the framework (shaded boxes) are Functional Units, IP Core Architecture, Word Length, Circuit Frequency Intervals, Learning, and Test data. Functional units are basic arithmetic operations (i.e., adders,
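The core operation of a Pareto solver such as the one named above can be sketched as a non-dominated filter over candidate designs. The tuple layout (power, delay, area, error) is an illustrative assumption; the solver keeps every design that no other design beats in all objectives at once:

```python
def pareto_front(designs):
    """Return the non-dominated designs, minimizing every objective.

    designs: list of distinct objective tuples, e.g. (power, delay, area, error).
    A design is dominated if some other design is no worse in every
    objective and differs from it (hence strictly better in at least one).
    """
    front = []
    for d in designs:
        dominated = any(
            other != d and all(o <= v for o, v in zip(other, d))
            for other in designs
        )
        if not dominated:
            front.append(d)
    return front
```

For example, with two objectives, the design (4, 4) is dominated by (2, 2) and would be dropped, while (1, 5), (2, 2), and (3, 1) all survive as incomparable trade-offs.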
Experimental results
In this section, the LLNFM is trained to simulate the complex square root defined as follows: where the input complex operand lies in a specified range. This function is a fundamental operation in digital signal processing applications such as (1) Orthogonal Frequency Division Multiplexing (OFDM) and (2) matrix decomposition in multi-antenna (MIMO) systems [41]. The framework and proposed IP are put into action through three different scenarios. Each
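Training data for such a benchmark can be generated as input/target pairs on the real and imaginary parts of the operand and of its principal square root. The operand range below is a placeholder assumption, as the paper's exact interval is not reproduced in this excerpt:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical operand range for illustration only.
z = rng.uniform(-1.0, 1.0, 200) + 1j * rng.uniform(-1.0, 1.0, 200)
w = np.sqrt(z)  # principal complex square root (target outputs)

# Training pairs for the neuro-fuzzy approximator:
# inputs are (Re z, Im z), targets are (Re sqrt(z), Im sqrt(z)).
X = np.column_stack([z.real, z.imag])
Y = np.column_stack([w.real, w.imag])
```

Squaring the targets recovers the operands exactly, which provides a convenient sanity check on the generated data before training.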
Conclusion and future work
In this work, we presented an LLNFM IP core for embedded systems. We also presented a framework that adjusts the design variables (IP core parameters) so that an optimal LLNFM hardware configuration, with respect to application demands (imposed restrictions), can be selected from a very large design space. Through three scenarios, it is shown that the proposed framework is capable of effectively and efficiently adjusting design parameters with the aim of exploring the design space at a high level. In the first scenario, the
Acknowledgements
Unfortunately, we lost our great scientist, Prof. Caro Lucas. He was a prominent researcher, a dedicated instructor and a person who was admired by all who knew him. He encouraged us to study the LLNFM implementation for embedded applications.
References (42)
- A multilayered neuro-fuzzy classifier with self-organizing properties, Fuzzy Sets Syst. (2008)
- Emotion on FPGA: model driven approach, Expert Syst. Appl. (2009)
- Neuro-fuzzy system with high-speed low-power analog blocks, Fuzzy Sets Syst. (2006)
- Optimised PWL recursive approximation and its application to neuro-fuzzy systems, Math. Comput. Model. (2002)
- Fuzzy identification of systems and its application to modeling and control, IEEE Trans. Syst. Man Cybern. (1985)
- ANFIS: adaptive-network-based fuzzy inference system, IEEE Trans. Syst. Man Cybern. (1993)
- DENFIS: dynamic evolving neural-fuzzy inference system and its application for time-series prediction, IEEE Trans. Fuzzy Syst. (2002)
- Flexible neuro-fuzzy systems, IEEE Trans. Neural Netw. (2003)
- Real-time emotional controller, Neural Comput. Appl. (2010)
- Implementation issues of neuro-fuzzy hardware: going towards HW/SW codesign, IEEE Trans. Neural Netw. (2003)
- The cat is out of the bag: cortical simulations with 10⁹ neurons, 10¹³ synapses
- A robust learning model for dealing with missing values in many-core architectures
- GPUMLib: a new library to combine machine learning algorithms with Graphics Processing Units
- Simulating biological-inspired spiking neural networks with OpenCL
- Fast artificial neural network library
- Realization of the conscience mechanism in CMOS implementation of winner-takes-all self-organizing neural networks, IEEE Trans. Neural Netw.
- Analog design of a new neural network for optical character recognition, IEEE Trans. Neural Netw.
- Analog neural non-derivative optimizers, IEEE Trans. Neural Netw.
- A resistor/transconductor network for linear fitting, IEEE Trans. Circuits Syst. II
- Analog VLSI and Neural Systems
- A CMOS K-winners-take-all circuit with O(n) complexity, IEEE Trans. Circuits Syst. II
Cited by (7)
- A hierarchical local-model tree for predicting roof displacement in longwall tailgates, Neural Computing and Applications (2021)
- Real-Time Deep Learning at the Edge for Scalable Reliability Modeling of Si-MOSFET Power Electronics Converters, IEEE Internet of Things Journal (2019)
- Real-time person re-identification at the edge: a mixed precision approach, Lecture Notes in Computer Science (2019)
- Optimization of the Production of Inactivated Clostridium novyi Type B Vaccine Using Computational Intelligence Techniques, Applied Biochemistry and Biotechnology (2016)